[PATCH] Improve combining of conditionals
This patch fixes the problem with substituting expressions into IF_THEN_ELSE during combining. Without this patch combining of conditionals inside IF_THEN_ELSE is seriously inhibited. The problem this patch fixes is that combine_simplify_rtx() prefers to return an expression (say, xor (a) (b)) even when a comparison is prefered (say, neq (xor (a) (b)) 0). Expressions are not recognized as valid conditions of if_then_else for most targets, so combiner misses a potential optimization. This patch makes combine_simplify_rtx() aware of the context it was invoked in, and, when appropriate, does not discourage it from returning a conditional. The motivating example for this fix was crcu8() routine from CoreMark. Compiling this routine for MIPS32R2 at -O3 produces there are several instances of sequence andi$2,$2,0x1 xori$2,$2,0x1 movn$3,$5,$2 ; $2 dies here which can be optimized into andi$2,$2,0x1 movz$3,$5,$2 ; $2 dies here . The patch was successfully tested on {i686, arm, mips}-linux, both GCC testsuites and SPEC2000 runs. For all targets there was no observable code difference in SPEC2000 benchmarks, so the example does not trigger very often. Still, it speeds up CoreMark by about 1%. OK for trunk? -- Maxim Kuvyrkov Mentor Graphics / CodeSourcery gcc-combine-if_then_else.ChangeLog Description: Binary data gcc-combine-if_then_else.patch Description: Binary data
Re: [PATCH v3] Re: avoid useless if-before-free tests
Janne Blomqvist wrote: On Thu, Mar 24, 2011 at 18:51, Jim Meyering j...@meyering.net wrote: Janne Blomqvist wrote: On Tue, Mar 8, 2011 at 19:53, Jim Meyering j...@meyering.net wrote: Relative to v2, I've added libgo/ to the list of exempt directories and added this recently discussed gfc_free patch, at the request of Tobias Burnus. Also, I corrected an error in fortran's ChangeLog and removed all whitespace changes from all ChangeLog files. The libgfortran changes are Ok for 4.7. For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd - Replace all calls to gfc_free (x) with free (x). - Remove the gfc_free() function and prototype. - Remove the free() macro which currently prevents calling free() directly. Following up, I've refreshed the series but hit a minor snag while converting new uses of gfc_free, removing new tests-before-free and merging/reordering changes. Applying this fix first makes my problem go away: [snip] So, what's the plan here? Do you plan to get a GCC account, do you already have one, or what? Now that 4.7 is open for development, it's perhaps the right time to poke the maintainers to get this patch in. If you don't have a GCC account, as one of the Fortran maintainers I can commit the Fortran and libgfortran parts, but someone else will have to do the rest (were they ever approved, BTW?) as I have only write after approval privileges for the rest of GCC. Can someone add me to the gcc group? That would help. I already have ssh access to sourceware.org. I've rebased the series a few times, but it's been a week or so. More convertible uses are being added regularly. Plus I have to reorder/split things a little to avoid a new conflict.
Re: [PATCH][RFC] Make called function type explicit, make function pointer type conversions useless
2011-04-14 Eric Botcazou ebotca...@adacore.com * cfgexpand.c (expand_call_stmt): Rematerialize the original function type if this is not a builtin function. Hum, using fold_convert seems to be more appropriate here. Bootstrapped/regtested on i586-suse-linux, applied as obvious. 2011-04-15 Eric Botcazou ebotca...@adacore.com * cfgexpand.c (expand_call_stmt): Simply convert the function type. -- Eric Botcazou Index: cfgexpand.c === --- cfgexpand.c (revision 172469) +++ cfgexpand.c (working copy) @@ -1851,8 +1851,8 @@ expand_call_stmt (gimple stmt) call is made may be different from the type of the function. */ if (!builtin_p) CALL_EXPR_FN (exp) - = fold_build1 (NOP_EXPR, build_pointer_type (gimple_call_fntype (stmt)), - CALL_EXPR_FN (exp)); + = fold_convert (build_pointer_type (gimple_call_fntype (stmt)), + CALL_EXPR_FN (exp)); TREE_TYPE (exp) = gimple_call_return_type (stmt); CALL_EXPR_STATIC_CHAIN (exp) = gimple_call_chain (stmt);
[PATCH] doubled words
Signed-off-by: Jim Meyering meyer...@redhat.com --- While most of these are in comments, the corrections in gcc/tree-cfg.c and gcc/config/sh/constraints.md are in strings. The former at least is marked for translation, and hence appears in every .po file. gcc/config/alpha/vms-unwind.h |4 ++-- gcc/config/arm/unwind-arm.h|4 ++-- gcc/config/microblaze/microblaze.c |2 +- gcc/config/sh/constraints.md |4 ++-- gcc/cp/pt.c|2 +- gcc/java/jcf-parse.c |4 ++-- gcc/tree-cfg.c |4 ++-- 7 files changed, 12 insertions(+), 12 deletions(-) diff --git a/gcc/config/alpha/vms-unwind.h b/gcc/config/alpha/vms-unwind.h index ea2c3a3..71cb7b8 100644 --- a/gcc/config/alpha/vms-unwind.h +++ b/gcc/config/alpha/vms-unwind.h @@ -1,5 +1,5 @@ /* Fallback frame unwinding for Alpha/VMS. - Copyright (C) 1996, 1997, 1998, 2000, 2001, 2002, 2003, 2009, 2010 + Copyright (C) 1996, 1997, 1998, 2000, 2001, 2002, 2003, 2009, 2010, 2011 Free Software Foundation, Inc. This file is part of GCC. @@ -229,7 +229,7 @@ alpha_vms_fallback_frame_state (struct _Unwind_Context *context, /* If PV designates an exception dispatcher, we have to adjust the return address column to get at the signal occurrence point, and account for - for what the CHF context contains. */ + what the CHF context contains. */ if (DENOTES_EXC_DISPATCHER (pv)) { diff --git a/gcc/config/arm/unwind-arm.h b/gcc/config/arm/unwind-arm.h index a9ba126..1a51d8d 100644 --- a/gcc/config/arm/unwind-arm.h +++ b/gcc/config/arm/unwind-arm.h @@ -1,5 +1,5 @@ /* Header file for the ARM EABI unwinder - Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009 + Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2011 Free Software Foundation, Inc. Contributed by Paul Brook @@ -65,7 +65,7 @@ extern C { } _Unwind_State; - /* Provided only for for compatibility with existing code. */ + /* Provided only for compatibility with existing code. */ typedef int _Unwind_Action; #define _UA_SEARCH_PHASE 1 #define _UA_CLEANUP_PHASE 2 diff --git a/gcc/config/microblaze/microblaze.c b/gcc/config/microblaze/microblaze.c index 6ea5fa2..5796bc7 100644 --- a/gcc/config/microblaze/microblaze.c +++ b/gcc/config/microblaze/microblaze.c @@ -919,7 +919,7 @@ microblaze_rtx_costs (rtx x, int code, int outer_code ATTRIBUTE_UNUSED, int *tot { /* Add 1 to make shift slightly more expensive than add. */ *total = COSTS_N_INSNS (INTVAL (XEXP (x, 1))) + 1; - /* Reduce shift costs for for special circumstances. */ + /* Reduce shift costs for special circumstances. */ if (optimize_size INTVAL (XEXP (x, 1)) 5) *total -= 2; if (!optimize_size INTVAL (XEXP (x, 1)) 17) diff --git a/gcc/config/sh/constraints.md b/gcc/config/sh/constraints.md index 6b0e5d2..40d0d0b 100644 --- a/gcc/config/sh/constraints.md +++ b/gcc/config/sh/constraints.md @@ -1,5 +1,5 @@ ;; Constraint definitions for Renesas / SuperH SH. -;; Copyright (C) 2007, 2008 Free Software Foundation, Inc. +;; Copyright (C) 2007, 2008, 2011 Free Software Foundation, Inc. ;; ;; This file is part of GCC. ;; @@ -99,7 +99,7 @@ (match_test ival = -128 ival = 127))) (define_constraint I10 - A signed 10-bit constant, as used in in SHmedia andi, ori. + A signed 10-bit constant, as used in SHmedia andi, ori. (and (match_code const_int) (match_test ival = -512 ival = 511))) diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c index 3356e75..fc69a0c 100644 --- a/gcc/cp/pt.c +++ b/gcc/cp/pt.c @@ -13980,7 +13980,7 @@ type_unification_real (tree tparms, gcc_assert (ntparms 0); /* Reset the number of non-defaulted template arguments contained - in in TARGS. */ + in TARGS. */ NON_DEFAULT_TEMPLATE_ARGS_COUNT (targs) = NULL_TREE; switch (strict) diff --git a/gcc/java/jcf-parse.c b/gcc/java/jcf-parse.c index feeddad..a56c1b7 100644 --- a/gcc/java/jcf-parse.c +++ b/gcc/java/jcf-parse.c @@ -1,6 +1,6 @@ /* Parser for Java(TM) .class files. Copyright (C) 1996, 1998, 1999, 2000, 2001, 2002, 2003, 2004, - 2005, 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc. + 2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc. This file is part of GCC. @@ -368,7 +368,7 @@ set_source_filename (JCF *jcf, int index) from the input class file into the output file. We don't decode the data at all, merely rewriting constant indexes whenever we come across them: this is necessary because the constant pool in the - output file isn't the same as the constant pool in in the input. + output file isn't the same as the constant pool in the input. The main advantage of this technique is that the resulting annotation data is pointer-free, so it doesn't have to be relocated diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c index
Re: [PATCH v3] Re: avoid useless if-before-free tests
On Fri, Apr 15, 2011 at 10:54, Jim Meyering j...@meyering.net wrote: Janne Blomqvist wrote: On Thu, Mar 24, 2011 at 18:51, Jim Meyering j...@meyering.net wrote: Janne Blomqvist wrote: On Tue, Mar 8, 2011 at 19:53, Jim Meyering j...@meyering.net wrote: Relative to v2, I've added libgo/ to the list of exempt directories and added this recently discussed gfc_free patch, at the request of Tobias Burnus. Also, I corrected an error in fortran's ChangeLog and removed all whitespace changes from all ChangeLog files. The libgfortran changes are Ok for 4.7. For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd - Replace all calls to gfc_free (x) with free (x). - Remove the gfc_free() function and prototype. - Remove the free() macro which currently prevents calling free() directly. Following up, I've refreshed the series but hit a minor snag while converting new uses of gfc_free, removing new tests-before-free and merging/reordering changes. Applying this fix first makes my problem go away: [snip] So, what's the plan here? Do you plan to get a GCC account, do you already have one, or what? Now that 4.7 is open for development, it's perhaps the right time to poke the maintainers to get this patch in. If you don't have a GCC account, as one of the Fortran maintainers I can commit the Fortran and libgfortran parts, but someone else will have to do the rest (were they ever approved, BTW?) as I have only write after approval privileges for the rest of GCC. Can someone add me to the gcc group? That would help. I already have ssh access to sourceware.org. According to http://gcc.gnu.org/svnwrite.html , you should send an email to overseers(at)gcc.gnu.org . It says that you need a sponsor and The steering committee or a well-established GCC maintainer (including reviewers) can approve for write access any person with GNU copyright assignment papers in place and known to submit good patches.. I'm not sure if I'm considered to be well-established enough, so could someone help Jim out here, please? Or, if nobody pipes up within a few days, just mention my name as your sponsor and we'll see if the overseers consider me well-established or not.. ;) -- Janne Blomqvist
Re: [PATCH v3] Re: avoid useless if-before-free tests
Janne Blomqvist wrote: On Fri, Apr 15, 2011 at 10:54, Jim Meyering j...@meyering.net wrote: Janne Blomqvist wrote: On Thu, Mar 24, 2011 at 18:51, Jim Meyering j...@meyering.net wrote: Janne Blomqvist wrote: On Tue, Mar 8, 2011 at 19:53, Jim Meyering j...@meyering.net wrote: Relative to v2, I've added libgo/ to the list of exempt directories and added this recently discussed gfc_free patch, at the request of Tobias Burnus. Also, I corrected an error in fortran's ChangeLog and removed all whitespace changes from all ChangeLog files. The libgfortran changes are Ok for 4.7. For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd - Replace all calls to gfc_free (x) with free (x). - Remove the gfc_free() function and prototype. - Remove the free() macro which currently prevents calling free() directly. Following up, I've refreshed the series but hit a minor snag while converting new uses of gfc_free, removing new tests-before-free and merging/reordering changes. Applying this fix first makes my problem go away: [snip] So, what's the plan here? Do you plan to get a GCC account, do you already have one, or what? Now that 4.7 is open for development, it's perhaps the right time to poke the maintainers to get this patch in. If you don't have a GCC account, as one of the Fortran maintainers I can commit the Fortran and libgfortran parts, but someone else will have to do the rest (were they ever approved, BTW?) as I have only write after approval privileges for the rest of GCC. Can someone add me to the gcc group? That would help. I already have ssh access to sourceware.org. According to http://gcc.gnu.org/svnwrite.html , you should send an email to overseers(at)gcc.gnu.org . It says that you need a sponsor and The steering committee or a well-established GCC maintainer (including reviewers) can approve for write access any person with GNU copyright assignment papers in place and known to submit good patches.. I'm not sure if I'm considered to be well-established enough, so could someone help Jim out here, please? Or, if nobody pipes up within a few days, just mention my name as your sponsor and we'll see if the overseers consider me well-established or not.. ;) Thanks. FYI, I have an ANY assignment on file, so no need to wait for paperwork.
Fix minor issues in gimplify.c
Long lines, superfluous 's' and missing head comment. No functional changes. There is one function left without head comment: gimplify_adjust_omp_clauses. Jakub, when you have a few minutes, would you mind adding one? TIA. Tested on i586-suse-linux, applied on the mainline as obvious. 2011-04-15 Eric Botcazou ebotca...@adacore.com * gimplify.c: Fix issues in comments throughout. (voidify_wrapper_expr): Fix long line. (build_stack_save_restore): Likewise. (gimplify_loop_expr): Likewise. (gimplify_compound_lval): Likewise. (gimplify_init_ctor_eval): Likewise. (gimplify_modify_expr_rhs): Likewise. (omp_notice_threadprivate_variable): Likewise. -- Eric Botcazou Index: gimplify.c === --- gimplify.c (revision 172469) +++ gimplify.c (working copy) @@ -91,7 +91,7 @@ static struct gimplify_ctx *gimplify_ctx static struct gimplify_omp_ctx *gimplify_omp_ctxp; -/* Formal (expression) temporary table handling: Multiple occurrences of +/* Formal (expression) temporary table handling: multiple occurrences of the same scalar expression are evaluated into the same temporary. */ typedef struct gimple_temp_hash_elt @@ -100,11 +100,12 @@ typedef struct gimple_temp_hash_elt tree temp; /* Value */ } elt_t; -/* Forward declarations. */ +/* Forward declaration. */ static enum gimplify_status gimplify_compound_expr (tree *, gimple_seq *, bool); /* Mark X addressable. Unlike the langhook we expect X to be in gimple form and we don't do any syntax checking. */ + void mark_addressable (tree x) { @@ -232,6 +233,8 @@ pop_gimplify_context (gimple body) htab_delete (c-temp_htab); } +/* Push a GIMPLE_BIND tuple onto the stack of bindings. */ + static void gimple_push_bind_expr (gimple gimple_bind) { @@ -240,19 +243,23 @@ gimple_push_bind_expr (gimple gimple_bin VEC_safe_push (gimple, heap, gimplify_ctxp-bind_expr_stack, gimple_bind); } +/* Pop the first element off the stack of bindings. */ + static void gimple_pop_bind_expr (void) { VEC_pop (gimple, gimplify_ctxp-bind_expr_stack); } +/* Return the first element of the stack of bindings. */ + gimple gimple_current_bind_expr (void) { return VEC_last (gimple, gimplify_ctxp-bind_expr_stack); } -/* Return the stack GIMPLE_BINDs created during gimplification. */ +/* Return the stack of bindings created during gimplification. */ VEC(gimple, heap) * gimple_bind_expr_stack (void) @@ -260,7 +267,7 @@ gimple_bind_expr_stack (void) return gimplify_ctxp-bind_expr_stack; } -/* Returns true iff there is a COND_EXPR between us and the innermost +/* Return true iff there is a COND_EXPR between us and the innermost CLEANUP_POINT_EXPR. This info is used by gimple_push_cleanup. */ static bool @@ -392,7 +399,7 @@ remove_suffix (char *name, int len) } } -/* Create a new temporary name with PREFIX. Returns an identifier. */ +/* Create a new temporary name with PREFIX. Return an identifier. */ static GTY(()) unsigned int tmp_var_id_num; @@ -413,9 +420,8 @@ create_tmp_var_name (const char *prefix) return get_identifier (tmp_name); } - /* Create a new temporary variable declaration of type TYPE. - Does NOT push it into the current binding. */ + Do NOT push it into the current binding. */ tree create_tmp_var_raw (tree type, const char *prefix) @@ -446,7 +452,7 @@ create_tmp_var_raw (tree type, const cha return tmp_var; } -/* Create a new temporary variable declaration of type TYPE. DOES push the +/* Create a new temporary variable declaration of type TYPE. DO push the variable into the current binding. Further, assume that this is called only from gimplification or optimization, at which point the creation of certain types are bugs. */ @@ -537,7 +543,6 @@ lookup_tmp_var (tree val, bool is_formal return ret; } - /* Return true if T is a CALL_EXPR or an expression that can be assigned to a temporary. Note that this predicate should only be used during gimplification. See the rationale for this in @@ -605,7 +610,7 @@ internal_get_tmp_var (tree val, gimple_s return t; } -/* Returns a formal temporary variable initialized with VAL. PRE_P is as +/* Return a formal temporary variable initialized with VAL. PRE_P is as in gimplify_expr. Only use this function if: 1) The value of the unfactored expression represented by VAL will not @@ -623,7 +628,7 @@ get_formal_tmp_var (tree val, gimple_seq return internal_get_tmp_var (val, pre_p, NULL, true); } -/* Returns a temporary variable initialized with VAL. PRE_P and POST_P +/* Return a temporary variable initialized with VAL. PRE_P and POST_P are as in gimplify_expr. */ tree @@ -632,8 +637,8 @@ get_initialized_tmp_var (tree val, gimpl return internal_get_tmp_var (val, pre_p, post_p, false); } -/* Declares all the variables in VARS in
Re: [lto/pph] Do not pack more bits than requested (issue4415044)
On Thu, 14 Apr 2011, Diego Novillo wrote: On Thu, Apr 14, 2011 at 15:28, Jakub Jelinek ja...@redhat.com wrote: If bitpack_word_t has BITS_PER_BITPACK_WORD bits, then for nbits = BITS_PER_BITPACK_WORD this will be undefined. Use say mask = ((bitpack_word_t) 2 (nbits - 1)) - 1; or something similar (assertion ensures that nbits isn't 0). Quite right, thanks. In the meantime, I've changed my mind with this. I think it's safer if we just assert that the value we are about to pack fit in the number of bits the caller specified. The only problematic user is pack_ts_type_value_fields when it tries to pack a -1 for the type's alias set. I think we should just stream that as an integer and not go through the bitpacking overhead. For now, I'm applying this to the pph branch. Tested on x86_64. No LTO failures. See below Diego. * lto-streamer-out.c (pack_ts_type_value_fields): Pack all bits of -1 value. * lto-streamer.h (bitpack_create): Assert that the value to pack does not overflow NBITS. * lto-streamer-in.c (unpack_ts_type_value_fields): Unpack BITS_PER_BITPACK_WORD bits for TYPE_ALIAS_SET. diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c index 97b86ce..f04e031 100644 --- a/gcc/lto-streamer-in.c +++ b/gcc/lto-streamer-in.c @@ -1734,7 +1734,7 @@ unpack_ts_type_value_fields (struct bitpack_d *bp, tree expr) TYPE_USER_ALIGN (expr) = (unsigned) bp_unpack_value (bp, 1); TYPE_READONLY (expr) = (unsigned) bp_unpack_value (bp, 1); TYPE_ALIGN (expr) = (unsigned) bp_unpack_value (bp, HOST_BITS_PER_INT); - TYPE_ALIAS_SET (expr) = bp_unpack_value (bp, HOST_BITS_PER_INT); + TYPE_ALIAS_SET (expr) = bp_unpack_value (bp, BITS_PER_BITPACK_WORD); } diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c index 3ccad8b..89ad9c5 100644 --- a/gcc/lto-streamer-out.c +++ b/gcc/lto-streamer-out.c @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree expr) bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1); bp_pack_value (bp, TYPE_READONLY (expr), 1); bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT); - bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, HOST_BITS_PER_INT); + bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, +BITS_PER_BITPACK_WORD); As we only want to stream alias-set zeros just change it to a single bit, like bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1); and on the reader side restore either a zero or -1. } diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h index 0d49430..73afd46 100644 --- a/gcc/lto-streamer.h +++ b/gcc/lto-streamer.h @@ -1190,18 +1190,15 @@ bitpack_create (struct lto_output_stream *s) static inline void bp_pack_value (struct bitpack_d *bp, bitpack_word_t val, unsigned nbits) { - bitpack_word_t mask, word; + bitpack_word_t word = bp-word; int pos = bp-pos; - word = bp-word; - + /* We shouldn't try to pack more bits than can fit in a bitpack word. */ gcc_assert (nbits 0 nbits = BITS_PER_BITPACK_WORD); Asserts will break merging the operations and make them slow again. Please no asserts in these core routines. Look at the .optimized dump of a series of bp_pack_value, they should be basically optimized to a series of ORs. As for the -1 case, it's simply broken use of the interface. Richard. - /* Make sure that VAL only has the lower NBITS set. Generate a - mask with the lower NBITS set and use it to filter the upper - bits from VAL. */ - mask = ((bitpack_word_t) 1 nbits) - 1; - val = val mask; + /* The value to pack should not overflow NBITS. */ + gcc_assert (nbits == BITS_PER_BITPACK_WORD + || val = ((bitpack_word_t) 1 nbits)); /* If val does not fit into the current bitpack word switch to the next one. */
Re: [PATCH] Improve combining of conditionals
The patch was successfully tested on {i686, arm, mips}-linux, both GCC testsuites and SPEC2000 runs. For all targets there was no observable code difference in SPEC2000 benchmarks, so the example does not trigger very often. Still, it speeds up CoreMark by about 1%. OK for trunk? Yes, modulo the following nits: @@ -4938,11 +4938,13 @@ find_split_point (rtx *loc, rtx insn, bool set_src) IN_DEST is nonzero if we are processing the SET_DEST of a SET. + IN_COND is nonzero if we are on top level of the condition. ...we are at the top level of a condition. @@ -5221,10 +5225,12 @@ subst (rtx x, rtx from, rtx to, int in_dest, int unique_copy) expression. OP0_MODE is the original mode of XEXP (x, 0). IN_DEST is nonzero - if we are inside a SET_DEST. */ + if we are inside a SET_DEST. IN_COND is nonzero if we are on the top level + of a condition. */ Likewise. @@ -5717,7 +5723,16 @@ combine_simplify_rtx (rtx x, enum machine_mode op0_mode, int in_dest) ZERO_EXTRACT is indeed appropriate, it will be placed back by the call to make_compound_operation in the SET case. */ - if (STORE_FLAG_VALUE == 1 + if (in_cond) + /* Don't apply below optimizations if the caller would + prefer a comparison rather than a value. + E.g., for the condition in an IF_THEN_ELSE most targets need + an explicit comparison. */ + { + ; + } Remove the superfluous parentheses and move the comment to a new paragraph of the big comment just above. No need to retest, just make sure this still compiles, thanks in advance. -- Eric Botcazou
Re: [PATCH][RFC] Make called function type explicit, make function pointer type conversions useless
On Thu, 14 Apr 2011, Eric Botcazou wrote: I have the following, the gimple_call_chain check is to prevent Ada bootstrap from failing. On x86? Yes, though maybe because I had get_callee_fndecl patched to not strip conversions. I suppose your patch is ok, maybe with simply not wrapping the NOP_EXPR around builtins. I can try that indeed... Thanks for fixing this! Richard.
Re: [Patch,testsuite,avr]: add -finline-limit=0 to pr41885.c options
On Thu, Apr 14, 2011 at 8:14 PM, Denis Chertykov cherty...@gmail.com wrote: 2011/4/14 Georg-Johann Lay a...@gjlay.de: This patchlet adds -finline-limit=0 to dg-options in testsuite/gcc.target/avr/torture/pr41885.c because otherwise optimizers will fold all tests and actually no test function is called when optimization is on. The test case still passes all tests. testsuite/ 2011-04-14 Georg-Johann Lay a...@gjlay.de * gcc.target/avr/torture/pr41885.c (dg-options): Add -finline-limit=0 Please use -fno-inline instead. Richard. Index: testsuite/gcc.target/avr/torture/pr41885.c === --- testsuite/gcc.target/avr/torture/pr41885.c (Revision 172431) +++ testsuite/gcc.target/avr/torture/pr41885.c (Arbeitskopie) @@ -1,4 +1,4 @@ -/* { dg-options -w -std=c99 } */ +/* { dg-options -w -std=c99 -finline-limit=0 } */ /* { dg-do run } */ #include limits.h Approved. Denis.
Re: More of ipa-inline housekeeping
I fixed this on the and added sanity check that the fields are initialized. This has shown problem with early inliner iteration fixed thusly and fact that early inliner is attempting to compute overall growth at a time the inline parameters are not computed for functions not visited by early optimizations yet. We previously agreed that early inliner should not try to do that (as this leads to early inliner inlining functions called once that should be deferred for later consieration). I just hope it won't cause benchmarks to regress too much ;) Yeah, we agreed to that. And I forgot about it as it wasn't part of the early inliner reorg (which was supposed to be a 1:1 transform). Today C++ results shows some regressions, but nothing earthshaking. So I think it is good idea to drop this feature of early inliner since it is not really systematic. There is also great improvement on LTO SPEC2000, but I tend to hope it is unrelated change. Perhaps your aliasing? Honza
Re: More of ipa-inline housekeeping
2011/4/15 Jan Hubicka hubi...@ucw.cz: I fixed this on the and added sanity check that the fields are initialized. This has shown problem with early inliner iteration fixed thusly and fact that early inliner is attempting to compute overall growth at a time the inline parameters are not computed for functions not visited by early optimizations yet. We previously agreed that early inliner should not try to do that (as this leads to early inliner inlining functions called once that should be deferred for later consieration). I just hope it won't cause benchmarks to regress too much ;) Yeah, we agreed to that. And I forgot about it as it wasn't part of the early inliner reorg (which was supposed to be a 1:1 transform). Today C++ results shows some regressions, but nothing earthshaking. So I think it is good idea to drop this feature of early inliner since it is not really systematic. There is also great improvement on LTO SPEC2000, but I tend to hope it is unrelated change. Perhaps your aliasing? I doubt SPEC2k uses VLAs or alloca, does it? Might be the DSE improvements, but I'm not sure. Richard. Honza
[committed] Fix pr46084.c testcase (PR target/48614)
Hi! This testcase doesn't use avx-check.h and is dg-do run testcase, as it is compiled with -mavx whenever it emits any AVX instructions, it won't be able to run on any older CPUs. Fixed thusly, committed to trunk/4.6. 2011-04-15 Jakub Jelinek ja...@redhat.com PR target/48614 * gcc.target/i386/pr46084.c: Require avx_runtime instead of just avx. --- gcc/testsuite/gcc.target/i386/pr46084.c.jj 2011-04-15 08:11:00.493464611 +0200 +++ gcc/testsuite/gcc.target/i386/pr46084.c 2011-04-15 08:08:50.282146043 +0200 @@ -1,7 +1,7 @@ /* This test needs to use setrlimit to set the stack size, so it can only run on Unix. */ /* { dg-do run { target *-*-linux* *-*-solaris* *-*-darwin* } } */ -/* { dg-require-effective-target avx } */ +/* { dg-require-effective-target avx_runtime } */ /* { dg-require-effective-target split_stack } */ /* { dg-options -fsplit-stack -O2 -mavx } */ Jakub
Re: [PATCH] Improve combining of conditionals
And make IN_COND a bool instead of an int? I had the same initial reaction but then came to the same conclusion as Maxim. -- Eric Botcazou
Re: [patch][ARM] Fix 16-bit - 64-bit multiply and accumulate
Ping. On 25/03/11 16:19, Andrew Stubbs wrote: On 28/01/11 15:20, Richard Earnshaw wrote: On Fri, 2011-01-28 at 15:13 +, Andrew Stubbs wrote: On 28/01/11 14:12, Richard Earnshaw wrote: So what happens to a variation of your testcase: long long foolong (long long x, short *a, short *b) { return x + (long long)*a * (long long)*b; } With your patch? This should generate identical code to your original test-case. The patch has no effect on that testcase - it's broken in some other way, I think, and the same with and without my patch: ldrsh r3, [r3, #0] ldrsh r2, [r2, #0] push {r4, r5} asrs r4, r3, #31 asrs r5, r2, #31 mul r4, r2, r4 mla r4, r3, r5, r4 umull r2, r3, r2, r3 adds r3, r4, r3 adds r0, r0, r2 adc r1, r1, r3 pop {r4, r5} bx lr Hmmm, that probably doesn't add anything useful to the discussion. :( I'll add that one to the todo list ... Andrew Ouch! I though that used to work :-( I looked at this one again, but on a second inspection I'm not sure there's much wrong with it? When I wrote the above I thought that there was a 64-bit multiply instruction, but now I look more closely I see there isn't, hence the above. It does two 16-bit loads, sign-extends the inputs to 64-bit, does a 64-bit - 64-bit multiply, and then adds 'x'. Can the umull/add/add/adc be optimized using umlal? It's too late on a Friday to workout what's going on with the carries Also, I don't fully understand why the compiler can't just disregard the casts and use maddhidi4? Isn't that mathematically equivalent in this case? When you say it used to work, what did it use to look like? Thanks Andrew
[ping] 3 unreviewed patches
Fix annoying gcov filename handling: http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01380.html (rs6000) Fix thinko in output_profile_hook: http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01624.html Introduce -Wstack-usage: http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01992.html Thanks in advance. -- Eric Botcazou
Re: [Patch,testsuite,avr]: add -finline-limit=0 to pr41885.c options
Richard Guenther schrieb: On Thu, Apr 14, 2011 at 8:14 PM, Denis Chertykov cherty...@gmail.com wrote: 2011/4/14 Georg-Johann Lay a...@gjlay.de: This patchlet adds -finline-limit=0 to dg-options in testsuite/gcc.target/avr/torture/pr41885.c because otherwise optimizers will fold all tests and actually no test function is called when optimization is on. The test case still passes all tests. testsuite/ 2011-04-14 Georg-Johann Lay a...@gjlay.de * gcc.target/avr/torture/pr41885.c (dg-options): Add -finline-limit=0 Please use -fno-inline instead. Richard. Ok, changed it: http://gcc.gnu.org/viewcvs?view=revisionrevision=172487 Johann
Re: Implement stack arrays even for unknown sizes
Hi, On Thu, 14 Apr 2011, Dominique Dhumieres wrote: I have forgotten to mentionned that I have a variant of fatigue in which I have done the inlining manually along with few other optimizations and the timing for it is [macbook] lin/test% gfc -Ofast fatigue_v8.f90 [macbook] lin/test% time a.out /dev/null 2.793u 0.002s 0:02.79 100.0% 0+0k 0+1io 0pf+0w [macbook] lin/test% gfc -Ofast -fwhole-program fatigue_v8.f90 [macbook] lin/test% time a.out /dev/null 2.680u 0.003s 0:02.68 100.0% 0+0k 0+2io 0pf+0w [macbook] lin/test% gfc -Ofast -fwhole-program -flto fatigue_v8.f90 [macbook] lin/test% time a.out /dev/null 2.671u 0.002s 0:02.67 100.0% 0+0k 0+2io 0pf+0w [macbook] lin/test% gfc -Ofast -fwhole-program -fstack-arrays fatigue_v8.f90 [macbook] lin/test% time a.out /dev/null 2.680u 0.003s 0:02.68 100.0% 0+0k 0+2io 0pf+0w [macbook] lin/test% gfc -Ofast -fwhole-program -flto -fstack-arrays fatigue_v8.f90 [macbook] lin/test% time a.out /dev/null 2.677u 0.003s 0:02.68 99.6% 0+0k 0+0io 0pf+0w So the timing of the original code with -Ofast -finline-limit=600 -fwhole-program -flto -fstack-arrays is quite close to this lower bound. I have also looked at the failure for gfortran.dg/elemental_dependency_1.f90 and it seems due to a spurious integer(kind=4) A.37[4]; (and friends) in Yes, this is due to the DECL_EXPR statement which is rendered by the dumper just the same as a normal decl. The testcase looks for exactly one such decl, but with -fstack-arrays there are exactly two for each such array. integer(kind=4) A.37[4]; So, this is the normal decl. atmp.36.dim[0].ubound = 3; integer(kind=4) A.37[4]; And this is the DECL_EXPR statement, which then actually is transformed into the stack_save/alloca/stack_restore sequence. Ciao, Michael.
Re: [Patch, libfortran] Replace sprintf() with snprintf()
On 04/14/2011 11:53 PM, Janne Blomqvist wrote: Hi, as is well known, sprintf() is prone to buffer overflow, hence snprintf(). libgfortran uses snprintf() in some places, but not everywhere. Rather than analyzing every sprintf() call for a potential overflow, the attached patch takes the dogmatic but simple approach of replacing all the remaining sprintf() usage with snprintf(). For targets without snprintf(), io/list_read.c contained a fallback macro that uses sprintf(); this is moved to libgfortran.h so that it's available everywhere. readelf -s libgfortran.so|grep sprintf confirms that there is no remaining usage of sprintf(). Regtested on x86_64-unknown-linux-gnu, Ok for trunk? OK, thanks. Jerry
Re: [PATCH] Improve combining of conditionals
On Apr 15, 2011, at 3:34 PM, Eric Botcazou wrote: The problem this patch fixes is that combine_simplify_rtx() prefers to return an expression (say, xor (a) (b)) even when a comparison is prefered (say, neq (xor (a) (b)) 0). Expressions are not recognized as valid conditions of if_then_else for most targets, so combiner misses a potential optimization. This patch makes combine_simplify_rtx() aware of the context it was invoked in, and, when appropriate, does not discourage it from returning a conditional. Btw, this is very likely also valid for targets with STORE_FLAG_VALUE == -1 so the same IN_COND short-circuit would need to be added a few lines below in combine_simplify_rtx. But this would need to be tested. Do you happen to have access to such a target, e.g. m68k? Hm, I didn't notice that one, thanks! I have access to m68k (ColdFire, tbp) and will test this change there before committing. -- Maxim Kuvyrkov Mentor Graphics / CodeSourcery
Re: [7/9] Testsuite: remove vect_{extract_even_odd,strided}_wide
On Tue, Apr 12, 2011 at 4:14 PM, Richard Sandiford richard.sandif...@linaro.org wrote: We have separate vect_extract_even_odd and vect_extract_even_odd_wide target selectors, and separate vect_strided and vect_strided_wide selectors. The comment suggests that wide is for 32+ bits, but we often use the non-wide forms for 32-bit tests. We also have tests that combine 16-bit and 32-bit strided accesses without checking for both widths. I'm about to split vect_strided into vect_stridedN (for each stride factor N). One option was to preserve the wide distinction and have vect_stridedN_wide as well. However, given the current usage, and given that the two selectors are the same, I think it makes sense to combine them until we know what distinction we need to make. Tested on x86_64-linux-gnu and arm-linux-gnueabi. OK to install? Ok. Thanks, Richard. Richard gcc/testsuite/ * lib/target-supports.exp (check_effective_target_vect_extract_even_odd_wide): Delete. (check_effective_target_vect_strided_wide): Likewise. * gcc.dg/vect/O3-pr39675-2.c: Use the non-wide versions instead. * gcc.dg/vect/fast-math-pr35982.c: Likewise. * gcc.dg/vect/fast-math-vect-complex-3.c: Likewise. * gcc.dg/vect/pr37539.c: Likewise. * gcc.dg/vect/slp-11.c: Likewise. * gcc.dg/vect/slp-12a.c: Likewise. * gcc.dg/vect/slp-12b.c: Likewise. * gcc.dg/vect/slp-19.c: Likewise. * gcc.dg/vect/slp-23.c: Likewise. * gcc.dg/vect/vect-1.c: Likewise. * gcc.dg/vect/vect-98.c: Likewise. * gcc.dg/vect/vect-107.c: Likewise. * gcc.dg/vect/vect-strided-float.c: Likewise. Index: gcc/testsuite/lib/target-supports.exp === --- gcc/testsuite/lib/target-supports.exp 2011-04-12 11:53:54.0 +0100 +++ gcc/testsuite/lib/target-supports.exp 2011-04-12 11:55:11.0 +0100 @@ -3121,29 +3121,6 @@ proc check_effective_target_vect_extract return $et_vect_extract_even_odd_saved } -# Return 1 if the target supports vector even/odd elements extraction of -# vectors with SImode elements or larger, 0 otherwise. - -proc check_effective_target_vect_extract_even_odd_wide { } { - global et_vect_extract_even_odd_wide_saved - - if [info exists et_vect_extract_even_odd_wide_saved] { - verbose check_effective_target_vect_extract_even_odd_wide: using cached result 2 - } else { - set et_vect_extract_even_odd_wide_saved 0 - if { [istarget powerpc*-*-*] - || [istarget i?86-*-*] - || [istarget x86_64-*-*] - || [istarget ia64-*-*] - || [istarget spu-*-*] } { - set et_vect_extract_even_odd_wide_saved 1 - } - } - - verbose check_effective_target_vect_extract_even_wide_odd: returning $et_vect_extract_even_odd_wide_saved 2 - return $et_vect_extract_even_odd_wide_saved -} - # Return 1 if the target supports vector interleaving, 0 otherwise. proc check_effective_target_vect_interleave { } { @@ -3184,25 +3161,6 @@ proc check_effective_target_vect_strided return $et_vect_strided_saved } -# Return 1 if the target supports vector interleaving and extract even/odd -# for wide element types, 0 otherwise. -proc check_effective_target_vect_strided_wide { } { - global et_vect_strided_wide_saved - - if [info exists et_vect_strided_wide_saved] { - verbose check_effective_target_vect_strided_wide: using cached result 2 - } else { - set et_vect_strided_wide_saved 0 - if { [check_effective_target_vect_interleave] - [check_effective_target_vect_extract_even_odd_wide] } { - set et_vect_strided_wide_saved 1 - } - } - - verbose check_effective_target_vect_strided_wide: returning $et_vect_strided_wide_saved 2 - return $et_vect_strided_wide_saved -} - # Return 1 if the target supports section-anchors proc check_effective_target_section_anchors { } { Index: gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c === --- gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c 2011-04-12 11:53:54.0 +0100 +++ gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c 2011-04-12 11:55:11.0 +0100 @@ -26,7 +26,7 @@ foo () } } -/* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect { target vect_strided_wide } } } */ -/* { dg-final { scan-tree-dump-times vectorizing stmts using SLP 1 vect { target vect_strided_wide } } } */ +/* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect { target vect_strided } } } */ +/* { dg-final { scan-tree-dump-times vectorizing stmts using SLP 1 vect { target vect_strided } } } */ /* { dg-final { cleanup-tree-dump vect } } */ Index: gcc/testsuite/gcc.dg/vect/fast-math-pr35982.c
Re: [8/9] Testsuite: split tests for strided accesses
On Tue, Apr 12, 2011 at 4:19 PM, Richard Sandiford richard.sandif...@linaro.org wrote: The next patch introduces separate vect_stridedN target selectors for each tested stride factor N. At the moment, some tests contain several independent loops that have different stride factors. It's easier to make the next change if we put these loops into separate tests. Tested on x86_64-linux-gnu and arm-linux-gnueabi. OK to install? Ok. Thanks, Richard. Richard gcc/testsuite/ * gcc.dg/vect/slp-11.c: Split into... * gcc.dg/vect/slp-11a.c, gcc.dg/vect/slp-11b.c, gcc.dg/vect/slp-11c.c: ...these tests. * gcc.dg/vect/slp-12a.c: Split 4-stride loop into... * gcc.dg/vect/slp-12c.c: ...this new test. * gcc.dg/vect/slp-19.c: Split into... * gcc.dg/vect/slp-19a.c, gcc.dg/vect/slp-19b.c, gcc.dg/vect/slp-19c.c: ...these new tests. Index: gcc/testsuite/gcc.dg/vect/slp-11.c === --- gcc/testsuite/gcc.dg/vect/slp-11.c 2011-04-12 15:18:24.0 +0100 +++ /dev/null 2011-03-23 08:42:11.268792848 + @@ -1,113 +0,0 @@ -/* { dg-require-effective-target vect_int } */ - -#include stdarg.h -#include tree-vect.h - -#define N 8 - -int -main1 () -{ - int i; - unsigned int out[N*8], a0, a1, a2, a3, a4, a5, a6, a7, b1, b0, b2, b3, b4, b5, b6, b7; - unsigned int in[N*8] = {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63}; - float out2[N*8]; - - /* Different operations - not SLPable. */ - for (i = 0; i N; i++) - { - a0 = in[i*8] + 5; - a1 = in[i*8 + 1] * 6; - a2 = in[i*8 + 2] + 7; - a3 = in[i*8 + 3] + 8; - a4 = in[i*8 + 4] + 9; - a5 = in[i*8 + 5] + 10; - a6 = in[i*8 + 6] + 11; - a7 = in[i*8 + 7] + 12; - - b0 = a0 * 3; - b1 = a1 * 2; - b2 = a2 * 12; - b3 = a3 * 5; - b4 = a4 * 8; - b5 = a5 * 4; - b6 = a6 * 3; - b7 = a7 * 2; - - out[i*8] = b0 - 2; - out[i*8 + 1] = b1 - 3; - out[i*8 + 2] = b2 - 2; - out[i*8 + 3] = b3 - 1; - out[i*8 + 4] = b4 - 8; - out[i*8 + 5] = b5 - 7; - out[i*8 + 6] = b6 - 3; - out[i*8 + 7] = b7 - 7; - } - - /* check results: */ - for (i = 0; i N; i++) - { - if (out[i*8] != (in[i*8] + 5) * 3 - 2 - || out[i*8 + 1] != (in[i*8 + 1] * 6) * 2 - 3 - || out[i*8 + 2] != (in[i*8 + 2] + 7) * 12 - 2 - || out[i*8 + 3] != (in[i*8 + 3] + 8) * 5 - 1 - || out[i*8 + 4] != (in[i*8 + 4] + 9) * 8 - 8 - || out[i*8 + 5] != (in[i*8 + 5] + 10) * 4 - 7 - || out[i*8 + 6] != (in[i*8 + 6] + 11) * 3 - 3 - || out[i*8 + 7] != (in[i*8 + 7] + 12) * 2 - 7) - abort (); - } - - /* Requires permutation - not SLPable. */ - for (i = 0; i N*2; i++) - { - out[i*4] = (in[i*4] + 2) * 3; - out[i*4 + 1] = (in[i*4 + 2] + 2) * 7; - out[i*4 + 2] = (in[i*4 + 1] + 7) * 3; - out[i*4 + 3] = (in[i*4 + 3] + 3) * 4; - } - - /* check results: */ - for (i = 0; i N*2; i++) - { - if (out[i*4] != (in[i*4] + 2) * 3 - || out[i*4 + 1] != (in[i*4 + 2] + 2) * 7 - || out[i*4 + 2] != (in[i*4 + 1] + 7) * 3 - || out[i*4 + 3] != (in[i*4 + 3] + 3) * 4) - abort (); - } - - /* Different operations - not SLPable. */ - for (i = 0; i N*4; i++) - { - out2[i*2] = ((float) in[i*2] * 2 + 6) ; - out2[i*2 + 1] = (float) (in[i*2 + 1] * 3 + 7); - } - - /* check results: */ - for (i = 0; i N*4; i++) - { - if (out2[i*2] != ((float) in[i*2] * 2 + 6) - || out2[i*2 + 1] != (float) (in[i*2 + 1] * 3 + 7)) - abort (); - } - - - return 0; -} - -int main (void) -{ - check_vect (); - - main1 (); - - return 0; -} - -/* { dg-final { scan-tree-dump-times vectorized 3 loops 1 vect { target { { vect_uintfloat_cvt vect_strided } vect_int_mult } } } } */ -/* { dg-final { scan-tree-dump-times vectorized 2 loops 1 vect { target { { { ! vect_uintfloat_cvt } vect_strided } vect_int_mult } } } } */ -/* { dg-final { scan-tree-dump-times vectorized 0 loops 1 vect {target { ! { vect_int_mult vect_strided } } } } } */ -/* { dg-final { scan-tree-dump-times vectorizing stmts using SLP 0 vect } } */ -/* { dg-final { cleanup-tree-dump vect } } */ - Index: gcc/testsuite/gcc.dg/vect/slp-11a.c === --- /dev/null 2011-03-23 08:42:11.268792848 + +++ gcc/testsuite/gcc.dg/vect/slp-11a.c 2011-04-12 15:18:25.0 +0100 @@ -0,0 +1,75 @@ +/* { dg-require-effective-target vect_int } */ + +#include stdarg.h +#include tree-vect.h + +#define N 8 + +int +main1 () +{ + int i; + unsigned int out[N*8], a0, a1,
Re: [10/9] Add tests for stride-3 accesses
On Tue, Apr 12, 2011 at 4:34 PM, Richard Sandiford richard.sandif...@linaro.org wrote: This patch adds a test for stride-3 accesses. I didn't add any particularly complicated cases because I think the testsuite already covers the interaction between the strided loads stores and other operations pretty well. Let me know if there's something I should add though. Tested on x86_64-linux-gnu and arm-linux-gnueabi. OK to install? Ok. Thanks, Richard. Richard gcc/testsuite/ * gcc.dg/vect/vect-strided-u16-i3.c: New test. Index: gcc/testsuite/gcc.dg/vect/vect-strided-u16-i3.c === --- /dev/null 2011-03-23 08:42:11.268792848 + +++ gcc/testsuite/gcc.dg/vect/vect-strided-u16-i3.c 2011-04-12 11:55:17.0 +0100 @@ -0,0 +1,112 @@ +#include stdarg.h +#include tree-vect.h + +#define N 128 + +typedef struct { + unsigned short a; + unsigned short b; + unsigned short c; +} s; + +#define A(I) (I) +#define B(I) ((I) * 2) +#define C(I) ((unsigned short) ~((I) ^ 0x18)) + +void __attribute__ ((noinline)) +check1 (s *res) +{ + int i; + + for (i = 0; i N; i++) + if (res[i].a != C (i) + || res[i].b != A (i) + || res[i].c != B (i)) + abort (); +} + +void __attribute__ ((noinline)) +check2 (unsigned short *res) +{ + int i; + + for (i = 0; i N; i++) + if (res[i] != (unsigned short) (A (i) + B (i) + C (i))) + abort (); +} + +void __attribute__ ((noinline)) +check3 (s *res) +{ + int i; + + for (i = 0; i N; i++) + if (res[i].a != i + || res[i].b != i + || res[i].c != i) + abort (); +} + +void __attribute__ ((noinline)) +check4 (unsigned short *res) +{ + int i; + + for (i = 0; i N; i++) + if (res[i] != (unsigned short) (A (i) + B (i))) + abort (); +} + +void __attribute__ ((noinline)) +main1 (s *arr) +{ + int i; + s *ptr = arr; + s res1[N]; + unsigned short res2[N]; + + for (i = 0; i N; i++) + { + res1[i].a = arr[i].c; + res1[i].b = arr[i].a; + res1[i].c = arr[i].b; + } + check1 (res1); + + for (i = 0; i N; i++) + res2[i] = arr[i].a + arr[i].b + arr[i].c; + check2 (res2); + + for (i = 0; i N; i++) + { + res1[i].a = i; + res1[i].b = i; + res1[i].c = i; + } + check3 (res1); + + for (i = 0; i N; i++) + res2[i] = arr[i].a + arr[i].b; + check4 (res2); +} + +int main (void) +{ + int i; + s arr[N]; + + check_vect (); + + for (i = 0; i N; i++) + { + arr[i].a = A (i); + arr[i].b = B (i); + arr[i].c = C (i); + } + main1 (arr); + + return 0; +} + +/* { dg-final { scan-tree-dump-times vectorized 4 loops 1 vect { target vect_strided3 } } } */ +/* { dg-final { cleanup-tree-dump vect } } */
Re: [lto/pph] Do not pack more bits than requested (issue4415044)
On Fri, Apr 15, 2011 at 04:56, Richard Guenther rguent...@suse.de wrote: @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree expr) bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1); bp_pack_value (bp, TYPE_READONLY (expr), 1); bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT); - bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, HOST_BITS_PER_INT); + bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, + BITS_PER_BITPACK_WORD); As we only want to stream alias-set zeros just change it to a single bit, like bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1); and on the reader side restore either a zero or -1. Ah, yes. Much better. As for the -1 case, it's simply broken use of the interface. Which would've been caught by the assertion. How about this, we keep the asserts with #ifdef ENABLE_CHECKING. This would've saved me some ugly debugging. Diego.
Re: [lto/pph] Do not pack more bits than requested (issue4415044)
On Fri, 15 Apr 2011, Diego Novillo wrote: On Fri, Apr 15, 2011 at 04:56, Richard Guenther rguent...@suse.de wrote: @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree expr) bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1); bp_pack_value (bp, TYPE_READONLY (expr), 1); bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT); - bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, HOST_BITS_PER_INT); + bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, + BITS_PER_BITPACK_WORD); As we only want to stream alias-set zeros just change it to a single bit, like bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1); and on the reader side restore either a zero or -1. Ah, yes. Much better. As for the -1 case, it's simply broken use of the interface. Which would've been caught by the assertion. How about this, we keep the asserts with #ifdef ENABLE_CHECKING. This would've saved me some ugly debugging. I think we should rather add the masking. The assert would only trigger for particular values, not for bogus use of the interface (you get sign-extension for signed arguments). Richard.
[PATCH 1/4] Remove usesess and wrong code from ipa_analyze_virtual_call_uses
Hi, ipa_analyze_virtual_call_uses contains code that was meant to deal with situation where OBJ_TYPE_REF_OBJECT is a (number of) COMPONENT_REFs on top of a dereferenced default definition SSA_NAME of a parameter. The code is useless because that never happens in the IL, if an ancestor object of a parameter is being used for a virtual call, the object in the expression is always an SSA_NAME which is assigned the proper value in a previous statement. Moreover, if it ever triggered, it might lead to wrong code because in cases like this it is necessary also to store the offset within the parameter into the indirect call graph edge information (like we do in indirect inlining). The above is done in the next patch in the series. I've split this part from it because I would like to commit it also to the 4.6 branch. I have bootstrapped and tested this on x86-64-linux without any problems. OK for trunk and the 4.6 branch? Thanks, Martin 2011-04-08 Martin Jambor mjam...@suse.cz * ipa-prop.c (ipa_analyze_virtual_call_uses): Remove handling of ADR_EXPRs. Index: src/gcc/ipa-prop.c === --- src.orig/gcc/ipa-prop.c +++ src/gcc/ipa-prop.c @@ -1383,18 +1383,6 @@ ipa_analyze_virtual_call_uses (struct cg if (!flag_devirtualize) return; - if (TREE_CODE (obj) == ADDR_EXPR) -{ - do - { - obj = TREE_OPERAND (obj, 0); - } - while (TREE_CODE (obj) == COMPONENT_REF); - if (TREE_CODE (obj) != MEM_REF) - return; - obj = TREE_OPERAND (obj, 0); -} - if (TREE_CODE (obj) != SSA_NAME || !SSA_NAME_IS_DEFAULT_DEF (obj)) return;
[PATCH 3/4] Simple relaxation of dynamic type change detection routine
Hi, in order to speed up astar, I had to persuade the function that decides whether a statement potentially modifies the dynamic type of an object by storing a new value to the VMT pointer to consider the following statement harmless (all types are integers of some sort): MEM[(i32 *)b2arp_3(D) + 8B] = 0; I'd like to experiment with this routine a bit more once I have some other IPA-CP infrastructure in place but at the moment I opted for a simple solution: All scalar non-pointer stores are deemed safe. VMT pointer is a compiler generated field which is a pointer so legal user code is not able to store stuff there through some fancy type casts and compiler generated code should have no reason whatsoever to that either. Therefore I believe this change is safe and useful. I have bootstrapped and tested the patch on x886_64-linux. OK for trunk? Thanks, Martin 2011-04-14 Martin Jambor mjam...@suse.cz * ipa-prop.c (stmt_may_be_vtbl_ptr_store): Return false for scalar non-pointer assignments. Index: src/gcc/ipa-prop.c === --- src.orig/gcc/ipa-prop.c +++ src/gcc/ipa-prop.c @@ -405,13 +405,18 @@ stmt_may_be_vtbl_ptr_store (gimple stmt) { tree lhs = gimple_assign_lhs (stmt); - if (TREE_CODE (lhs) == COMPONENT_REF - !DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1)) - !AGGREGATE_TYPE_P (TREE_TYPE (lhs))) + if (!AGGREGATE_TYPE_P (TREE_TYPE (lhs))) + { + if (!POINTER_TYPE_P (TREE_TYPE (lhs))) return false; - /* In the future we might want to use get_base_ref_and_offset to find -if there is a field corresponding to the offset and if so, proceed -almost like if it was a component ref. */ + + if (TREE_CODE (lhs) == COMPONENT_REF + !DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1))) + return false; + /* In the future we might want to use get_base_ref_and_offset to find +if there is a field corresponding to the offset and if so, proceed +almost like if it was a component ref. */ + } } return true; }
[PATCH 4/4] Devirtualization based on global objects
Hi, this is the patch that actually speeds up astar (together with the previous one). It implements devirtualization based on global objects. In the past we have refrained from doing this because in general it is difficult to say whether the object is currently being constructed and so it might have a dynamic type of one of its ancestors. However, if the object's class does not have any ancestors that obviously cannot happen. Devirtualizing in such conditions is enough to change a virtual call to regwayobj::isaddtobound in 473.astar to a direct one which can and should be inlined. That seemed a good justification to implement this and so the patch below does so and brings about 3.1% speedup for that benchmark with LTO. I acknowledge that instead of discarding all classes with ancestors it would be better to check that the called virtual method has the same implementation in all ancestors instead. That is perhaps something for later. It took me surprisingly long to realize that this technique can be used for folding virtual calls based on local automatically allocated objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that regressed in 4.6. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2011-04-15 Martin Jambor mjam...@suse.cz * ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize also according to actual contants. * gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function. (gimple_fold_obj_type_ref_call): New function. (gimple_fold_call): Call gimple_fold_obj_type_ref_call on OBJ_TYPE_REFs. * gimple.h (gimple_extract_devirt_binfo_from_cst): Declare. * testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL. * testsuite/g++.dg/opt/devirt2.C: New test. * testsuite/g++.dg/ipa/devirt-g-1.C: Likewise. Index: src/gcc/ipa-cp.c === --- src.orig/gcc/ipa-cp.c +++ src/gcc/ipa-cp.c @@ -1245,51 +1245,71 @@ ipcp_process_devirtualization_opportunit for (ie = node-indirect_calls; ie; ie = next_ie) { - int param_index, types_count, j; + int param_index; HOST_WIDE_INT token, anc_offset; tree target, delta, otr_type; + struct ipcp_lattice *lat; next_ie = ie-next_callee; if (!ie-indirect_info-polymorphic) continue; param_index = ie-indirect_info-param_index; - if (param_index == -1 - || ipa_param_cannot_devirtualize_p (info, param_index) - || ipa_param_types_vec_empty (info, param_index)) + if (param_index == -1) continue; + lat = ipcp_get_lattice (info, param_index); token = ie-indirect_info-otr_token; anc_offset = ie-indirect_info-anc_offset; otr_type = ie-indirect_info-otr_type; target = NULL_TREE; - types_count = VEC_length (tree, info-params[param_index].types); - for (j = 0; j types_count; j++) + if (lat-type == IPA_CONST_VALUE) { - tree binfo = VEC_index (tree, info-params[param_index].types, j); - tree d, t; - + tree binfo = gimple_extract_devirt_binfo_from_cst (lat-constant); + if (!binfo) + continue; binfo = get_binfo_at_offset (binfo, anc_offset, otr_type); if (!binfo) - { - target = NULL_TREE; - break; - } + continue; + target = gimple_get_virt_mehtod_for_binfo (token, binfo, delta, +false); + } + else + { + int types_count, j; - t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true); - if (!t) - { - target = NULL_TREE; - break; - } - else if (!target) - { - target = t; - delta = d; - } - else if (target != t || !tree_int_cst_equal (delta, d)) + if (ipa_param_cannot_devirtualize_p (info, param_index) + || ipa_param_types_vec_empty (info, param_index)) + continue; + + types_count = VEC_length (tree, info-params[param_index].types); + for (j = 0; j types_count; j++) { - target = NULL_TREE; - break; + tree binfo = VEC_index (tree, info-params[param_index].types, j); + tree d, t; + + binfo = get_binfo_at_offset (binfo, anc_offset, otr_type); + if (!binfo) + { + target = NULL_TREE; + break; + } + + t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true); + if (!t) + { + target = NULL_TREE; + break; + } + else if (!target) + { + target = t; + delta = d; + } +
[PATCH 2/4] Handle calls to ancestor objects in IPA-CP devirtualization
Hi, early inlining can create virtual calls based on the part of an object that represents an ancestor. This patch makes ipa-prop analysis able to recognize such calls and store the required offset along with such calls (the field is already there for similar purposes of indirect inlining). The constant propagation is then made aware of the offset field and takes it into account when looking up the proper BINFO. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2011-04-13 Martin Jambor mjam...@suse.cz * ipa-cp.c (ipcp_process_devirtualization_opportunities): Take into account anc_offset and otr_type from the indirect edge info. * ipa-prop.c (get_ancestor_addr_info): New function. (compute_complex_ancestor_jump_func): Assignment analysis moved to get_ancestor_addr_info, call it. (ipa_note_param_call): Do not initialize information about polymorphic calls, return the indirect call graph edge. Remove the last parameter, adjust all callers. (ipa_analyze_virtual_call_uses): Process also calls to ancestors of parameters. Initialize polymorphic information in the indirect edge. * testsuite/g++.dg/ipa/devirt-7.C: New test. Index: src/gcc/ipa-cp.c === --- src.orig/gcc/ipa-cp.c +++ src/gcc/ipa-cp.c @@ -1246,8 +1246,8 @@ ipcp_process_devirtualization_opportunit for (ie = node-indirect_calls; ie; ie = next_ie) { int param_index, types_count, j; - HOST_WIDE_INT token; - tree target, delta; + HOST_WIDE_INT token, anc_offset; + tree target, delta, otr_type; next_ie = ie-next_callee; if (!ie-indirect_info-polymorphic) @@ -1259,14 +1259,23 @@ ipcp_process_devirtualization_opportunit continue; token = ie-indirect_info-otr_token; + anc_offset = ie-indirect_info-anc_offset; + otr_type = ie-indirect_info-otr_type; target = NULL_TREE; types_count = VEC_length (tree, info-params[param_index].types); for (j = 0; j types_count; j++) { tree binfo = VEC_index (tree, info-params[param_index].types, j); - tree d; - tree t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true); + tree d, t; + binfo = get_binfo_at_offset (binfo, anc_offset, otr_type); + if (!binfo) + { + target = NULL_TREE; + break; + } + + t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true); if (!t) { target = NULL_TREE; Index: src/gcc/ipa-prop.c === --- src.orig/gcc/ipa-prop.c +++ src/gcc/ipa-prop.c @@ -576,6 +576,49 @@ compute_complex_assign_jump_func (struct } } +/* Extract the base, offset and MEM_REF expression from a statement ASSIGN if + it looks like: + + iftmp.1_3 = obj_2(D)-D.1762; + + The base of the MEM_REF must be a default definition SSA NAME of a + parameter. Return NULL_TREE if it looks otherwise. If case of success, the + whole MEM_REF expression is returned and the offset calculated from any + handled components and the MEM_REF itself is stored into *OFFSET. The whole + RHS stripped off the ADDR_EXPR is stored into *OBJ_P. */ + +static tree +get_ancestor_addr_info (gimple assign, tree *obj_p, HOST_WIDE_INT *offset) +{ + HOST_WIDE_INT size, max_size; + tree expr, parm, obj; + + if (!gimple_assign_single_p (assign)) +return NULL_TREE; + expr = gimple_assign_rhs1 (assign); + + if (TREE_CODE (expr) != ADDR_EXPR) +return NULL_TREE; + expr = TREE_OPERAND (expr, 0); + obj = expr; + expr = get_ref_base_and_extent (expr, offset, size, max_size); + + if (TREE_CODE (expr) != MEM_REF + /* If this is a varying address, punt. */ + || max_size == -1 + || max_size != size) +return NULL_TREE; + parm = TREE_OPERAND (expr, 0); + if (TREE_CODE (parm) != SSA_NAME + || !SSA_NAME_IS_DEFAULT_DEF (parm) + || *offset 0) +return NULL_TREE; + + *offset += mem_ref_offset (expr).low * BITS_PER_UNIT; + *obj_p = obj; + return expr; +} + /* Given that an actual argument is an SSA_NAME that is a result of a phi statement PHI, try to find out whether NAME is in fact a @@ -603,7 +646,7 @@ compute_complex_ancestor_jump_func (stru struct ipa_jump_func *jfunc, gimple call, gimple phi) { - HOST_WIDE_INT offset, size, max_size; + HOST_WIDE_INT offset; gimple assign, cond; basic_block phi_bb, assign_bb, cond_bb; tree tmp, parm, expr, obj; @@ -626,29 +669,12 @@ compute_complex_ancestor_jump_func (stru assign = SSA_NAME_DEF_STMT (tmp); assign_bb = gimple_bb (assign); - if (!single_pred_p (assign_bb) - || !gimple_assign_single_p (assign)) + if (!single_pred_p (assign_bb)) return; - expr =
Re: [lto/pph] Do not pack more bits than requested (issue4415044)
On Fri, Apr 15, 2011 at 08:49, Richard Guenther rguent...@suse.de wrote: On Fri, 15 Apr 2011, Diego Novillo wrote: On Fri, Apr 15, 2011 at 04:56, Richard Guenther rguent...@suse.de wrote: @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree expr) bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1); bp_pack_value (bp, TYPE_READONLY (expr), 1); bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT); - bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, HOST_BITS_PER_INT); + bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, + BITS_PER_BITPACK_WORD); As we only want to stream alias-set zeros just change it to a single bit, like bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1); and on the reader side restore either a zero or -1. Ah, yes. Much better. As for the -1 case, it's simply broken use of the interface. Which would've been caught by the assertion. How about this, we keep the asserts with #ifdef ENABLE_CHECKING. This would've saved me some ugly debugging. I think we should rather add the masking. The assert would only trigger for particular values, not for bogus use of the interface (you get sign-extension for signed arguments). OK, that works too. I'll prepare a patch. Diego.
Re: [6/9] NEON vec_load_lanes and vec_store_lanes patterns
On Tue, 2011-04-12 at 15:01 +0100, Richard Sandiford wrote: This patch adds vec_load_lanes and vec_store_lanes patterns for NEON. They feed directly into the corresponding intrinsic patterns. Tested on x86_64-linux-gnu and arm-linux-gnueabi. OK to install? Richard gcc/ * config/arm/neon.md (vec_load_lanesmodemode): New expanders, (vec_store_lanesmodemode): Likewise. OK. R.
Re: [PATCH v3] Re: avoid useless if-before-free tests
Janne == Janne Blomqvist blomqvist.ja...@gmail.com writes: Jim Can someone add me to the gcc group? That would help. Jim I already have ssh access to sourceware.org. Janne I'm not sure if I'm considered to be well-established Janne enough, so could someone help Jim out here, please? I added Jim to the gcc group. Tom
[PATCH] refactor gimple asm memory clobber checking
There are a couple places that check GIMPLE_ASMs for clobbering memory; this patch centralizes the logic in gimple.c. Tested on x86_64-unknown-linux-gnu. OK to commit? -Nathan * gimple.h (gimple_asm_clobbers_memory_p): Declare. * gimple.c (gimple_asm_clobbers_memory_p): Define. * ipa-pure-const.c (check_stmt): Call it. * tree-ssa-operands.c (get_asm_expr_operands): Likewise. diff --git a/gcc/gimple.c b/gcc/gimple.c index 090fc94..5dc62ea 100644 --- a/gcc/gimple.c +++ b/gcc/gimple.c @@ -5142,4 +5142,21 @@ gimple_call_builtin_p (gimple stmt, enum built_in_function code) DECL_FUNCTION_CODE (fndecl) == code); } +/* Return true if STMT clobbers memory. STMT is required to be a + GIMPLE_ASM. */ + +bool +gimple_asm_clobbers_memory_p (const_gimple stmt) +{ + unsigned i; + + for (i = 0; i gimple_asm_nclobbers (stmt); i++) +{ + tree op = gimple_asm_clobber_op (stmt, i); + if (strcmp (TREE_STRING_POINTER (TREE_VALUE (op)), memory) == 0) + return true; +} + + return false; +} #include gt-gimple.h diff --git a/gcc/gimple.h b/gcc/gimple.h index 572cabc..840e149 100644 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -973,6 +973,7 @@ extern bool walk_stmt_load_store_ops (gimple, void *, bool (*)(gimple, tree, void *)); extern bool gimple_ior_addresses_taken (bitmap, gimple); extern bool gimple_call_builtin_p (gimple, enum built_in_function); +extern bool gimple_asm_clobbers_memory_p (const_gimple); /* In gimplify.c */ extern tree create_tmp_var_raw (tree, const char *); diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c index b7deb57..eb5b0f6 100644 --- a/gcc/ipa-pure-const.c +++ b/gcc/ipa-pure-const.c @@ -639,7 +639,6 @@ static void check_stmt (gimple_stmt_iterator *gsip, funct_state local, bool ipa) { gimple stmt = gsi_stmt (*gsip); - unsigned int i = 0; if (is_gimple_debug (stmt)) return; @@ -693,16 +692,12 @@ check_stmt (gimple_stmt_iterator *gsip, funct_state local, bool ipa) } break; case GIMPLE_ASM: - for (i = 0; i gimple_asm_nclobbers (stmt); i++) + if (gimple_asm_clobbers_memory_p (stmt)) { - tree op = gimple_asm_clobber_op (stmt, i); - if (strcmp (TREE_STRING_POINTER (TREE_VALUE (op)), memory) == 0) - { - if (dump_file) -fprintf (dump_file, memory asm clobber is not const/pure); - /* Abandon all hope, ye who enter here. */ - local-pure_const_state = IPA_NEITHER; - } + if (dump_file) + fprintf (dump_file, memory asm clobber is not const/pure); + /* Abandon all hope, ye who enter here. */ + local-pure_const_state = IPA_NEITHER; } if (gimple_asm_volatile_p (stmt)) { diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c index 57f443f..7f76cbf 100644 --- a/gcc/tree-ssa-operands.c +++ b/gcc/tree-ssa-operands.c @@ -832,15 +832,8 @@ get_asm_expr_operands (gimple stmt) } /* Clobber all memory and addressable symbols for asm ( : : : memory); */ - for (i = 0; i gimple_asm_nclobbers (stmt); i++) -{ - tree link = gimple_asm_clobber_op (stmt, i); - if (strcmp (TREE_STRING_POINTER (TREE_VALUE (link)), memory) == 0) - { - add_virtual_operand (stmt, opf_def); - break; - } -} + if (gimple_asm_clobbers_memory_p (stmt)) +add_virtual_operand (stmt, opf_def); }
Re: [PATCH, ARM] PR47855 Compute attr length for some thumb2 insns, 2/3
On Thu, 2011-04-14 at 21:19 +0800, Carrot Wei wrote: On Fri, Apr 8, 2011 at 6:51 PM, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote: On 08/04/11 10:57, Carrot Wei wrote: Hi This is the second part of the fixing for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855 This patch contains the length computation for insn patterns *arm_movqi_insn and *arm_addsi3. Since the alternatives and encodings are much more complex, the attribute length is computed in separate C functions. Sorry, no. This is potentially a maintenance pain. It hardcodes alternatives from a pattern elsewhere in the C file. I don't like doing this unless we have to with the sync primitives or with push_multi. In this case I'm not convinced we need such functions in the .c file. Why can't we use the enabled attribute here with appropriate constraints for everything other than the memory cases (even there we might be able to invent some new constraints) ? Also a note about programming style. There are the helper macros like REG_P, CONST_INT_P and MEM_P which remove the necessity for checks like GET_CODE (x) == y where y E { REG, CONST_INT, MEM} Hi Ramana As you suggested I created several new constraints, and use the enabled attribute to split the current alternatives in this new patch. It has been tested on arm qemu without regression. thanks Carrot Sorry, I don't think this approach can work. Certainly not with the way the compiler currently works, and especially for mov and add insns. These instructions are only 2 bytes long if either: 1) They clobber the condition code register or 2) They occur inside an IT block. We can't tell either of these from the pattern, so you're underestimating the length of the instruction in some circumstances by claiming that they are only 2 bytes long. That /will/ lead to broken code someday. We can't add potential clobbers to mov and add patterns because that will break reload which relies on these patterns being simple-set insns with no added baggage. It *might* be possible to add clobbers to other operations, but that will then most-likely upset instruction scheduling (I think the scheduler treats two insns that clobber the same hard reg as being strongly ordered). Putting in the clobber too early will certainly affect cond-exec generation. In short, I'm not aware of a simple way to address this problem so that we get accurate length information, but minimal impact on other passes in the compiler. R.
Re: Implement stack arrays even for unknown sizes
Michael, Yes, this is due to the DECL_EXPR statement which is rendered by the dumper just the same as a normal decl. The testcase looks for exactly one such decl, but with -fstack-arrays there are exactly two for each such array. The testsuite is run without -fstack-arrays, so I dont' understand why the DECL_EXPR statement appears. Dominique
[PATCH] Fix PR48286
This makes us avoid 323. Tested on x86_64-unknown-linux-gnu/-m32, committed. Richard. 2011-04-15 Richard Guenther rguent...@suse.de PR testsuite/48286 * gfortran.dg/cray_pointers_8.f90: Use -ffloat-store. gcc/testsuite/gfortran.dg/cray_pointers_8.f90 Index: gcc/testsuite/gfortran.dg/cray_pointers_8.f90 === --- gcc/testsuite/gfortran.dg/cray_pointers_8.f90 (revision 172493) +++ gcc/testsuite/gfortran.dg/cray_pointers_8.f90 (working copy) @@ -1,5 +1,5 @@ ! { dg-do run } -! { dg-options -fcray-pointer } +! { dg-options -fcray-pointer -ffloat-store } ! ! Test the fix for PR36528 in which the Cray pointer was not passed ! correctly to 'euler' so that an undefined reference to fcn was
Re: [PATCH] factor asm op chaining out from stmt.c:expand_asm_stmt
On Fri, Apr 15, 2011 at 3:19 PM, Nathan Froyd froy...@codesourcery.com wrote: There are several cut-and-pasted loops in expand_asm_stmt that could be parameterized by the functions used to access the particular operands. The patch below does that. Tested on x86_64-unknown-linux-gnu. OK to commit? Hmm, it doesn't make it easier to follow though. Just my 2cents, Richard. -Nathan * stmt.c (chain_asm_ops): New function. (expand_asm_stmt): Call it. diff --git a/gcc/stmt.c b/gcc/stmt.c index 1a9f9e5..1fc09e9 100644 --- a/gcc/stmt.c +++ b/gcc/stmt.c @@ -1114,13 +1114,34 @@ expand_asm_operands (tree string, tree outputs, tree inputs, free_temp_slots (); } +/* Return the operands of STMT, a GIMPLE_ASM, as described by OP_FN and + N_OPS connected via TREE_CHAIN. */ + +static tree +chain_asm_ops (const_gimple stmt, unsigned (*n_ops) (const_gimple), + tree (*op_fn) (const_gimple, unsigned)) +{ + unsigned i, n; + tree ret = NULL_TREE, t; + + n = (*n_ops) (stmt); + if (n 0) + { + t = ret = (*op_fn) (stmt, 0); + for (i = 1; i n; i++) + t = TREE_CHAIN (t) = (*op_fn) (stmt, i); + } + + return ret; +} + void expand_asm_stmt (gimple stmt) { int noutputs; - tree outputs, tail, t; + tree outputs, tail; tree *o; - size_t i, n; + size_t i; const char *s; tree str, out, in, cl, labels; location_t locus = gimple_location (stmt); @@ -1128,41 +1149,10 @@ expand_asm_stmt (gimple stmt) /* Meh... convert the gimple asm operands into real tree lists. Eventually we should make all routines work on the vectors instead of relying on TREE_CHAIN. */ - out = NULL_TREE; - n = gimple_asm_noutputs (stmt); - if (n 0) - { - t = out = gimple_asm_output_op (stmt, 0); - for (i = 1; i n; i++) - t = TREE_CHAIN (t) = gimple_asm_output_op (stmt, i); - } - - in = NULL_TREE; - n = gimple_asm_ninputs (stmt); - if (n 0) - { - t = in = gimple_asm_input_op (stmt, 0); - for (i = 1; i n; i++) - t = TREE_CHAIN (t) = gimple_asm_input_op (stmt, i); - } - - cl = NULL_TREE; - n = gimple_asm_nclobbers (stmt); - if (n 0) - { - t = cl = gimple_asm_clobber_op (stmt, 0); - for (i = 1; i n; i++) - t = TREE_CHAIN (t) = gimple_asm_clobber_op (stmt, i); - } - - labels = NULL_TREE; - n = gimple_asm_nlabels (stmt); - if (n 0) - { - t = labels = gimple_asm_label_op (stmt, 0); - for (i = 1; i n; i++) - t = TREE_CHAIN (t) = gimple_asm_label_op (stmt, i); - } + out = chain_asm_ops (stmt, gimple_asm_noutputs, gimple_asm_output_op); + in = chain_asm_ops (stmt, gimple_asm_ninputs, gimple_asm_input_op); + cl = chain_asm_ops (stmt, gimple_asm_nclobbers, gimple_asm_clobber_op); + labels = chain_asm_ops (stmt, gimple_asm_nlabels, gimple_asm_label_op); s = gimple_asm_string (stmt); str = build_string (strlen (s), s);
patch pings
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01060.html http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02247.html -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iQEcBAEBAgAGBQJNqFOfAAoJEBRtltQi2kC7ijYH/ibLT2HNaFXjdE9SKT5Ih1JV dTEWPnY7QBP5Xe6FVwZ09ibPCOxJsK7yGdQBYqy5gRQor8ebifI4kenwcBcdET/m NogdM8DPWbuhgGda7cETNkru7FifSe9mRsKQGhNzVQl48oEKWmcGkm/l3a7gndfD cX08lFzqIH1wcPWUqQPf6gcUMRE4W/0j92E4PEoIbigMoSIRFcduVouBlld8NLBV aicTihAC+MFMVKSkpXGjMLCbn/HkNOyV9s9T+Or1/XuCIZy9zlB1JSfR/EJqVmPm mNmg0bUrtzXN2uImL6ohIV32i732KKcNhNm/FupZTHfbI50nDc1h8SnCgEHc2t0= =i7zA -END PGP SIGNATURE-
Re: Implement stack arrays even for unknown sizes
Hi, On Fri, 15 Apr 2011, Dominique Dhumieres wrote: Michael, Yes, this is due to the DECL_EXPR statement which is rendered by the dumper just the same as a normal decl. The testcase looks for exactly one such decl, but with -fstack-arrays there are exactly two for each such array. The testsuite is run without -fstack-arrays, so I dont' understand why the DECL_EXPR statement appears. Bummer, you're right. I unconditionally emit a DECL_EXPR for arrays even when they don't have a variable length. It's harmless, but makes the testcase fail (I wasn't seeing the fail because I've changed the testcase already to make it not fail with -fstack-arrays). I'll make the DECL_EXPR conditional on the size being variable. As Tobias already okayed the patch I'm planning to check in the slightly modified variant as below, after a new round of testing. Ciao, Michael. * trans-array.c (toplevel): Include gimple.h. (gfc_trans_allocate_array_storage): Check flag_stack_arrays, properly expand variable length arrays. (gfc_trans_auto_array_allocation): If flag_stack_arrays create variable length decls and associate them with their scope. * gfortran.h (gfc_option_t): Add flag_stack_arrays member. * options.c (gfc_init_options): Handle -fstack_arrays option. * lang.opt (fstack-arrays): Add option. * invoke.texi (Code Gen Options): Document it. * Make-lang.in (trans-array.o): Depend on GIMPLE_H. Index: fortran/trans-array.c === *** fortran/trans-array.c (revision 172431) --- fortran/trans-array.c (working copy) *** along with GCC; see the file COPYING3. *** 81,86 --- 81,87 #include system.h #include coretypes.h #include tree.h + #include gimple.h #include diagnostic-core.h /* For internal_error/fatal_error. */ #include flags.h #include gfortran.h *** gfc_trans_allocate_array_storage (stmtbl *** 630,647 { /* Allocate the temporary. */ onstack = !dynamic initial == NULL_TREE ! gfc_can_put_var_on_stack (size); if (onstack) { /* Make a temporary variable to hold the data. */ tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (nelem), nelem, gfc_index_one_node); tmp = build_range_type (gfc_array_index_type, gfc_index_zero_node, tmp); tmp = build_array_type (gfc_get_element_type (TREE_TYPE (desc)), tmp); tmp = gfc_create_var (tmp, A); tmp = gfc_build_addr_expr (NULL_TREE, tmp); gfc_conv_descriptor_data_set (pre, desc, tmp); } --- 631,657 { /* Allocate the temporary. */ onstack = !dynamic initial == NULL_TREE ! (gfc_option.flag_stack_arrays !|| gfc_can_put_var_on_stack (size)); if (onstack) { /* Make a temporary variable to hold the data. */ tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (nelem), nelem, gfc_index_one_node); + tmp = gfc_evaluate_now (tmp, pre); tmp = build_range_type (gfc_array_index_type, gfc_index_zero_node, tmp); tmp = build_array_type (gfc_get_element_type (TREE_TYPE (desc)), tmp); tmp = gfc_create_var (tmp, A); + /* If we're here only because of -fstack-arrays we have to +emit a DECL_EXPR to make the gimplifier emit alloca calls. */ + if (!gfc_can_put_var_on_stack (size)) + gfc_add_expr_to_block (pre, + fold_build1_loc (input_location, + DECL_EXPR, TREE_TYPE (tmp), + tmp)); tmp = gfc_build_addr_expr (NULL_TREE, tmp); gfc_conv_descriptor_data_set (pre, desc, tmp); } *** gfc_trans_auto_array_allocation (tree de *** 4759,4767 { stmtblock_t init; tree type; ! tree tmp; tree size; tree offset; bool onstack; gcc_assert (!(sym-attr.pointer || sym-attr.allocatable)); --- 4769,4779 { stmtblock_t init; tree type; ! tree tmp = NULL_TREE; tree size; tree offset; + tree space; + tree inittree; bool onstack; gcc_assert (!(sym-attr.pointer || sym-attr.allocatable)); *** gfc_trans_auto_array_allocation (tree de *** 4818,4832 return; } ! /* The size is the number of elements in the array, so multiply by the ! size of an element to get the total size. */ ! tmp = TYPE_SIZE_UNIT (gfc_get_element_type (type)); ! size = fold_build2_loc (input_location,
Re: [PATCH 1/4] Remove usesess and wrong code from ipa_analyze_virtual_call_uses
On Fri, 15 Apr 2011, Martin Jambor wrote: Hi, ipa_analyze_virtual_call_uses contains code that was meant to deal with situation where OBJ_TYPE_REF_OBJECT is a (number of) COMPONENT_REFs on top of a dereferenced default definition SSA_NAME of a parameter. The code is useless because that never happens in the IL, if an ancestor object of a parameter is being used for a virtual call, the object in the expression is always an SSA_NAME which is assigned the proper value in a previous statement. Moreover, if it ever triggered, it might lead to wrong code because in cases like this it is necessary also to store the offset within the parameter into the indirect call graph edge information (like we do in indirect inlining). The above is done in the next patch in the series. I've split this part from it because I would like to commit it also to the 4.6 branch. I have bootstrapped and tested this on x86-64-linux without any problems. OK for trunk and the 4.6 branch? Ok for both. Thanks, Richard. Thanks, Martin 2011-04-08 Martin Jambor mjam...@suse.cz * ipa-prop.c (ipa_analyze_virtual_call_uses): Remove handling of ADR_EXPRs. Index: src/gcc/ipa-prop.c === --- src.orig/gcc/ipa-prop.c +++ src/gcc/ipa-prop.c @@ -1383,18 +1383,6 @@ ipa_analyze_virtual_call_uses (struct cg if (!flag_devirtualize) return; - if (TREE_CODE (obj) == ADDR_EXPR) -{ - do - { - obj = TREE_OPERAND (obj, 0); - } - while (TREE_CODE (obj) == COMPONENT_REF); - if (TREE_CODE (obj) != MEM_REF) - return; - obj = TREE_OPERAND (obj, 0); -} - if (TREE_CODE (obj) != SSA_NAME || !SSA_NAME_IS_DEFAULT_DEF (obj)) return; -- Richard Guenther rguent...@suse.de Novell / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex
Re: Implement stack arrays even for unknown sizes
On 04/15/2011 07:28 AM, Michael Matz wrote: --- snip --- I'll make the DECL_EXPR conditional on the size being variable. As Tobias already okayed the patch I'm planning to check in the slightly modified variant as below, after a new round of testing. Thats A-OK Thanks, Jerry
Re: More of ipa-inline housekeeping
I have forgotten to say that the 125 and 516 thresholds are for x86_64-apple-darwin10. On powerpc-apple-darwin9 they are repsectively 172 and 638. Dominique
[PATCH, SMS] Avoid unfreed memory when SMS fails
Hello, This patch fixes the scenario where SMS fails to schedule a loop and continue to the next one without freeing data structures allocated while scheduling the first loop. Bootstrap and regtested on ppc64-redhat-linux. OK for mainline? Thanks, Revital Changelog: * modulo-sched.c (sms_schedule): Avoid unfreed memory when SMS fails. Index: modulo-sched.c === --- modulo-sched.c (revision 170464) +++ modulo-sched.c (working copy) @@ -1177,7 +1177,6 @@ sms_schedule (void) fprintf (dump_file, HOST_WIDEST_INT_PRINT_DEC, trip_count); fprintf (dump_file, )\n); } - continue; } else {
[PATCH, SMS] Free sccs field
Hello, The attached patch adds missing free operation for storage allocated while calculating SCCs. Bootstrap and regtested on ppc64-redhat-linux. OK for mainline? Thanks, Revital Changelog: * ddg.c (free_ddg_all_sccs): Free sccs field in struct ddg_all_sccs. Index: ddg.c === --- ddg.c (revision 171573) +++ ddg.c (working copy) @@ -1011,6 +1082,8 @@ free_ddg_all_sccs (ddg_all_sccs_ptr all_ for (i = 0; i all_sccs-num_sccs; i++) free_scc (all_sccs-sccs[i]); + if (all_sccs-sccs) +free (all_sccs-sccs); free (all_sccs); }
Re: [PATCH 2/4] Handle calls to ancestor objects in IPA-CP devirtualization
On Fri, 15 Apr 2011, Martin Jambor wrote: Hi, early inlining can create virtual calls based on the part of an object that represents an ancestor. This patch makes ipa-prop analysis able to recognize such calls and store the required offset along with such calls (the field is already there for similar purposes of indirect inlining). The constant propagation is then made aware of the offset field and takes it into account when looking up the proper BINFO. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2011-04-13 Martin Jambor mjam...@suse.cz * ipa-cp.c (ipcp_process_devirtualization_opportunities): Take into account anc_offset and otr_type from the indirect edge info. * ipa-prop.c (get_ancestor_addr_info): New function. (compute_complex_ancestor_jump_func): Assignment analysis moved to get_ancestor_addr_info, call it. (ipa_note_param_call): Do not initialize information about polymorphic calls, return the indirect call graph edge. Remove the last parameter, adjust all callers. (ipa_analyze_virtual_call_uses): Process also calls to ancestors of parameters. Initialize polymorphic information in the indirect edge. * testsuite/g++.dg/ipa/devirt-7.C: New test. Index: src/gcc/ipa-cp.c === --- src.orig/gcc/ipa-cp.c +++ src/gcc/ipa-cp.c @@ -1246,8 +1246,8 @@ ipcp_process_devirtualization_opportunit for (ie = node-indirect_calls; ie; ie = next_ie) { int param_index, types_count, j; - HOST_WIDE_INT token; - tree target, delta; + HOST_WIDE_INT token, anc_offset; + tree target, delta, otr_type; next_ie = ie-next_callee; if (!ie-indirect_info-polymorphic) @@ -1259,14 +1259,23 @@ ipcp_process_devirtualization_opportunit continue; token = ie-indirect_info-otr_token; + anc_offset = ie-indirect_info-anc_offset; + otr_type = ie-indirect_info-otr_type; target = NULL_TREE; types_count = VEC_length (tree, info-params[param_index].types); for (j = 0; j types_count; j++) { tree binfo = VEC_index (tree, info-params[param_index].types, j); - tree d; - tree t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true); + tree d, t; + binfo = get_binfo_at_offset (binfo, anc_offset, otr_type); + if (!binfo) + { + target = NULL_TREE; + break; + } + + t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true); if (!t) { target = NULL_TREE; Index: src/gcc/ipa-prop.c === --- src.orig/gcc/ipa-prop.c +++ src/gcc/ipa-prop.c @@ -576,6 +576,49 @@ compute_complex_assign_jump_func (struct } } +/* Extract the base, offset and MEM_REF expression from a statement ASSIGN if + it looks like: + + iftmp.1_3 = obj_2(D)-D.1762; + + The base of the MEM_REF must be a default definition SSA NAME of a + parameter. Return NULL_TREE if it looks otherwise. If case of success, the + whole MEM_REF expression is returned and the offset calculated from any + handled components and the MEM_REF itself is stored into *OFFSET. The whole + RHS stripped off the ADDR_EXPR is stored into *OBJ_P. */ + +static tree +get_ancestor_addr_info (gimple assign, tree *obj_p, HOST_WIDE_INT *offset) +{ + HOST_WIDE_INT size, max_size; + tree expr, parm, obj; + + if (!gimple_assign_single_p (assign)) +return NULL_TREE; + expr = gimple_assign_rhs1 (assign); + + if (TREE_CODE (expr) != ADDR_EXPR) +return NULL_TREE; + expr = TREE_OPERAND (expr, 0); + obj = expr; + expr = get_ref_base_and_extent (expr, offset, size, max_size); + + if (TREE_CODE (expr) != MEM_REF + /* If this is a varying address, punt. */ + || max_size == -1 + || max_size != size) +return NULL_TREE; + parm = TREE_OPERAND (expr, 0); + if (TREE_CODE (parm) != SSA_NAME + || !SSA_NAME_IS_DEFAULT_DEF (parm) Might be an uninitialized variable, so also check TREE_CODE (SSA_NAME_VAR (parm)) == PARM_DECL? + || *offset 0) Check this above where you check max_size/size. +return NULL_TREE; + + *offset += mem_ref_offset (expr).low * BITS_PER_UNIT; At some point it might be worth switching to get_addr_base_and_unit_offsets and not use bit but unit offsets throughout the code. + *obj_p = obj; + return expr; +} + /* Given that an actual argument is an SSA_NAME that is a result of a phi statement PHI, try to find out whether NAME is in fact a @@ -603,7 +646,7 @@ compute_complex_ancestor_jump_func (stru struct ipa_jump_func *jfunc, gimple call, gimple phi) { - HOST_WIDE_INT offset, size, max_size; +
Re: [PATCH 3/4] Simple relaxation of dynamic type change detection routine
On Fri, 15 Apr 2011, Martin Jambor wrote: Hi, in order to speed up astar, I had to persuade the function that decides whether a statement potentially modifies the dynamic type of an object by storing a new value to the VMT pointer to consider the following statement harmless (all types are integers of some sort): MEM[(i32 *)b2arp_3(D) + 8B] = 0; I'd like to experiment with this routine a bit more once I have some other IPA-CP infrastructure in place but at the moment I opted for a simple solution: All scalar non-pointer stores are deemed safe. VMT pointer is a compiler generated field which is a pointer so legal user code is not able to store stuff there through some fancy type casts and compiler generated code should have no reason whatsoever to that either. Therefore I believe this change is safe and useful. I have bootstrapped and tested the patch on x886_64-linux. OK for trunk? I think this should be only done for -fstrict-aliasing. Ok with that change. Richard. Thanks, Martin 2011-04-14 Martin Jambor mjam...@suse.cz * ipa-prop.c (stmt_may_be_vtbl_ptr_store): Return false for scalar non-pointer assignments. Index: src/gcc/ipa-prop.c === --- src.orig/gcc/ipa-prop.c +++ src/gcc/ipa-prop.c @@ -405,13 +405,18 @@ stmt_may_be_vtbl_ptr_store (gimple stmt) { tree lhs = gimple_assign_lhs (stmt); - if (TREE_CODE (lhs) == COMPONENT_REF -!DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1)) -!AGGREGATE_TYPE_P (TREE_TYPE (lhs))) + if (!AGGREGATE_TYPE_P (TREE_TYPE (lhs))) + { + if (!POINTER_TYPE_P (TREE_TYPE (lhs))) return false; - /* In the future we might want to use get_base_ref_and_offset to find - if there is a field corresponding to the offset and if so, proceed - almost like if it was a component ref. */ + + if (TREE_CODE (lhs) == COMPONENT_REF +!DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1))) + return false; + /* In the future we might want to use get_base_ref_and_offset to find + if there is a field corresponding to the offset and if so, proceed + almost like if it was a component ref. */ + } } return true; } -- Richard Guenther rguent...@suse.de Novell / SUSE Labs SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex
Re: [PATCH 4/4] Devirtualization based on global objects
On Fri, 15 Apr 2011, Martin Jambor wrote: Hi, this is the patch that actually speeds up astar (together with the previous one). It implements devirtualization based on global objects. In the past we have refrained from doing this because in general it is difficult to say whether the object is currently being constructed and so it might have a dynamic type of one of its ancestors. However, if the object's class does not have any ancestors that obviously cannot happen. Devirtualizing in such conditions is enough to change a virtual call to regwayobj::isaddtobound in 473.astar to a direct one which can and should be inlined. That seemed a good justification to implement this and so the patch below does so and brings about 3.1% speedup for that benchmark with LTO. I acknowledge that instead of discarding all classes with ancestors it would be better to check that the called virtual method has the same implementation in all ancestors instead. That is perhaps something for later. It took me surprisingly long to realize that this technique can be used for folding virtual calls based on local automatically allocated objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that regressed in 4.6. Bootstrapped and tested on x86_64-linux. OK for trunk? Thanks, Martin 2011-04-15 Martin Jambor mjam...@suse.cz * ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize also according to actual contants. * gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function. (gimple_fold_obj_type_ref_call): New function. (gimple_fold_call): Call gimple_fold_obj_type_ref_call on OBJ_TYPE_REFs. * gimple.h (gimple_extract_devirt_binfo_from_cst): Declare. * testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL. * testsuite/g++.dg/opt/devirt2.C: New test. * testsuite/g++.dg/ipa/devirt-g-1.C: Likewise. Index: src/gcc/ipa-cp.c === --- src.orig/gcc/ipa-cp.c +++ src/gcc/ipa-cp.c @@ -1245,51 +1245,71 @@ ipcp_process_devirtualization_opportunit for (ie = node-indirect_calls; ie; ie = next_ie) { - int param_index, types_count, j; + int param_index; HOST_WIDE_INT token, anc_offset; tree target, delta, otr_type; + struct ipcp_lattice *lat; next_ie = ie-next_callee; if (!ie-indirect_info-polymorphic) continue; param_index = ie-indirect_info-param_index; - if (param_index == -1 - || ipa_param_cannot_devirtualize_p (info, param_index) - || ipa_param_types_vec_empty (info, param_index)) + if (param_index == -1) continue; + lat = ipcp_get_lattice (info, param_index); token = ie-indirect_info-otr_token; anc_offset = ie-indirect_info-anc_offset; otr_type = ie-indirect_info-otr_type; target = NULL_TREE; - types_count = VEC_length (tree, info-params[param_index].types); - for (j = 0; j types_count; j++) + if (lat-type == IPA_CONST_VALUE) { - tree binfo = VEC_index (tree, info-params[param_index].types, j); - tree d, t; - + tree binfo = gimple_extract_devirt_binfo_from_cst (lat-constant); + if (!binfo) + continue; binfo = get_binfo_at_offset (binfo, anc_offset, otr_type); if (!binfo) - { - target = NULL_TREE; - break; - } + continue; + target = gimple_get_virt_mehtod_for_binfo (token, binfo, delta, + false); + } + else + { + int types_count, j; - t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true); - if (!t) - { - target = NULL_TREE; - break; - } - else if (!target) - { - target = t; - delta = d; - } - else if (target != t || !tree_int_cst_equal (delta, d)) + if (ipa_param_cannot_devirtualize_p (info, param_index) + || ipa_param_types_vec_empty (info, param_index)) + continue; + + types_count = VEC_length (tree, info-params[param_index].types); + for (j = 0; j types_count; j++) { - target = NULL_TREE; - break; + tree binfo = VEC_index (tree, info-params[param_index].types, j); + tree d, t; + + binfo = get_binfo_at_offset (binfo, anc_offset, otr_type); + if (!binfo) + { + target = NULL_TREE; + break; + } + + t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true); + if (!t) + { + target = NULL_TREE; + break; + } + else if (!target) + { + target = t; +
Re: [PATCH, SMS] Avoid unfreed memory when SMS fails
On Fri, Apr 15, 2011 at 5:26 PM, Revital Eres revital.e...@linaro.org wrote: Hello, This patch fixes the scenario where SMS fails to schedule a loop and continue to the next one without freeing data structures allocated while scheduling the first loop. Bootstrap and regtested on ppc64-redhat-linux. OK for mainline? Ok. Thanks, Richard. Thanks, Revital Changelog: * modulo-sched.c (sms_schedule): Avoid unfreed memory when SMS fails. Index: modulo-sched.c === --- modulo-sched.c (revision 170464) +++ modulo-sched.c (working copy) @@ -1177,7 +1177,6 @@ sms_schedule (void) fprintf (dump_file, HOST_WIDEST_INT_PRINT_DEC, trip_count); fprintf (dump_file, )\n); } - continue; } else {
Re: [PATCH, SMS] Free sccs field
On Fri, Apr 15, 2011 at 5:27 PM, Revital Eres revital.e...@linaro.org wrote: Hello, The attached patch adds missing free operation for storage allocated while calculating SCCs. Bootstrap and regtested on ppc64-redhat-linux. OK for mainline? Ok. Thanks, Richard. Thanks, Revital Changelog: * ddg.c (free_ddg_all_sccs): Free sccs field in struct ddg_all_sccs. Index: ddg.c === --- ddg.c (revision 171573) +++ ddg.c (working copy) @@ -1011,6 +1082,8 @@ free_ddg_all_sccs (ddg_all_sccs_ptr all_ for (i = 0; i all_sccs-num_sccs; i++) free_scc (all_sccs-sccs[i]); + if (all_sccs-sccs) + free (all_sccs-sccs); free (all_sccs); }
Re: [PATCH 3/6] Allow jumps in epilogues
On 04/13/2011 02:38 PM, Bernd Schmidt wrote: This still requires the i386 output_set_got which I think I can cope with [...] Patch below, to be applied on top of all the others. Only lightly tested so far beyond standard (fairly useless) regression tests, by comparing generated assembly before/after, for -fpic -march=pentium and core2, for i686-linux and i686-apple-darwin10 (for TARGET_MACHO). I've not found or managed to create a testcase for making MI thunks generate a call to output_set_got. I think it should work, but it's not tested. Bernd * config/i386/i386.c (output_set_got): Don't call dwarf2out_flush_queued_reg_saves. (ix86_reorg): Split set_got patterns. (i386_dwarf_handle_frame_unspec, i386_dwarf_flush_queued_register_saves): New static functions. (TARGET_DWARF_HANDLE_FRAME_UNSPEC, TARGET_DWARF_FLUSH_QUEUED_REGISTER_SAVES): New macros. * config/i386/i386.md (set_got_call, set_got_pop, set_got_add): New patterns. * dwarf2out.c (scan_until_barrier): Use the target's dwarf_flush_queued_register_saves hook. * target.def (dwarf_flush_queued_register_saves): New hook. * doc/tm.texi: Regenerate. Index: gcc/config/i386/i386.c === --- gcc.orig/config/i386/i386.c +++ gcc/config/i386/i386.c @@ -8953,6 +8953,8 @@ output_set_got (rtx dest, rtx label ATTR output_asm_insn (mov%z0\t{%2, %0|%0, %2}, xops); else { + /* For normal functions, this pattern is split, but we can still +get here for thunks. */ output_asm_insn (call\t%a2, xops); #ifdef DWARF2_UNWIND_INFO /* The call to next label acts as a push. */ @@ -9010,12 +9012,6 @@ output_set_got (rtx dest, rtx label ATTR get_pc_thunk_name (name, REGNO (dest)); pic_labels_used |= 1 REGNO (dest); -#ifdef DWARF2_UNWIND_INFO - /* Ensure all queued register saves are flushed before the -call. */ - if (dwarf2out_do_frame ()) - dwarf2out_flush_queued_reg_saves (); -#endif xops[2] = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name)); xops[2] = gen_rtx_MEM (QImode, xops[2]); output_asm_insn (call\t%X2, xops); @@ -30458,6 +30454,56 @@ ix86_reorg (void) with old MDEP_REORGS that are not CFG based. Recompute it now. */ compute_bb_for_insn (); + /* Split any set_got patterns so that we interact correctly with + dwarf2out. */ + if (!TARGET_64BIT !TARGET_VXWORKS_RTP !TARGET_DEEP_BRANCH_PREDICTION + flag_pic) +{ + rtx insn, next; + for (insn = get_insns (); insn; insn = next) + { + rtx pat, label, dest, cst, gotsym, new_insn; + int icode; + + next = NEXT_INSN (insn); + if (!NONDEBUG_INSN_P (insn)) + continue; + + icode = recog_memoized (insn); + if (icode != CODE_FOR_set_got icode != CODE_FOR_set_got_labelled) + continue; + + extract_insn (insn); + if (icode == CODE_FOR_set_got) + { + label = gen_label_rtx (); + cst = const0_rtx; + } + else + { + label = recog_data.operand[1]; + cst = const1_rtx; + } + + dest = recog_data.operand[0]; + pat = gen_set_got_call (label, cst); + new_insn = emit_insn_before (pat, insn); + RTX_FRAME_RELATED_P (new_insn) = 1; + RTX_FRAME_RELATED_P (XVECEXP (PATTERN (new_insn), 0, 1)) = 1; + gotsym = gen_rtx_SYMBOL_REF (Pmode, GOT_SYMBOL_NAME); + pat = gen_set_got_pop (dest, label); + new_insn = emit_insn_before (pat, insn); + RTX_FRAME_RELATED_P (new_insn) = 1; + RTX_FRAME_RELATED_P (XVECEXP (PATTERN (new_insn), 0, 1)) = 1; + if (!TARGET_MACHO) + { + pat = gen_set_got_add (dest, gotsym, label); + new_insn = emit_insn_before (pat, insn); + } + delete_insn (insn); + } +} + if (optimize optimize_function_for_speed_p (cfun)) { if (TARGET_PAD_SHORT_FUNCTION) @@ -30475,6 +30521,30 @@ ix86_reorg (void) move_or_delete_vzeroupper (); } +/* Handle the TARGET_DWARF_HANDLE_FRAME_UNSPEC hook. + This is called from dwarf2out.c to emit call frame instructions + for frame-related insns containing UNSPECs and UNSPEC_VOLATILEs. */ +static void +i386_dwarf_handle_frame_unspec (rtx pattern ATTRIBUTE_UNUSED, + int index ATTRIBUTE_UNUSED) +{ + gcc_assert (index == UNSPEC_SET_GOT); +} + +/* Handle the TARGET_DWARF_FLUSH_QUEUED_REGISTER_SAVES hook. + This is called from dwarf2out.c to decide whether all queued + register saves should be emitted before INSN. */ +static bool +i386_dwarf_flush_queued_register_saves (rtx insn) +{ + if (!TARGET_VXWORKS_RTP || !flag_pic) +{ + int icode = recog_memoized (insn); + return (icode ==
[patch, libgfortran] PR48589 Invalid G0/G0.d editing for NaN/infinity
Hi, I plan to commit the following simple and obvious patch. Regression tested on x86-64-linux-gnu. I will apply the test case in the PR to the testsuite. Regards, Jerry 2011-04-15 Jerry DeLisle jvdeli...@gcc.gnu.org PR libgfortran/48589 * io/write_float.def (write_infnan): Set width properly for G0. Index: write_float.def === --- write_float.def (revision 172467) +++ write_float.def (working copy) @@ -654,11 +654,11 @@ write_infnan (st_parameter_dt *dtp, const fnode *f mark = (sign == S_PLUS || sign == S_MINUS) ? 8 : 7; nb = f-u.real.w; - + /* If the field width is zero, the processor must select a width not zero. 4 is chosen to allow output of '-Inf' or '+Inf' */ - if (nb == 0) + if ((nb == 0) || dtp-u.p.g0_no_blanks) { if (isnan_flag) nb = 3;
Re: patch pings
On 04/15/2011 04:18 PM, Jeff Law wrote: http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02247.html I don't know. I sympathize with the goal, but I'm not too happy about the structure of this patch. Doesn't this do the scan once for every reload in an insn? It seems to me like the loop (or rather, a function call to it) should at least be placed at the top of choose_reload_regs rather than in allocate_reload_reg. In my tests (i686-linux, compiling my standard set of testcases) the patch appears to have, in general, very little effect on code quality. It might on average be slightly better, but I've also seen several cases where we do worse. If we go to so much effort to do scanning of subsequent insns it ought to be possible to do better. AFAICT the patch ignores whether the pseudo that's being reloaded will be reloaded again in the current ebb - if not, it should get a bad spill register, where bad in this case only includes hard regs that don't currently hold a spilled pseudo. Likewise, once we've reloaded the last occurrence of a pseudo in an ebb, we should forget the reg_reloaded_contents to make sure we consider the spill reg good again. However, I'm not sure whether changes such as these alone will reduce the number of cases where the patch regresses. Bernd
Re: FDO usability patch -- cfg and lineno checksum
Resent in plain text mode .. David On Fri, Apr 15, 2011 at 9:28 AM, Xinliang David Li davi...@google.com wrote: Honza, do you have a chance to take a look at it? Thanks, David On Wed, Apr 13, 2011 at 3:25 PM, Xinliang David Li davi...@google.com wrote: Hi, in current FDO implementation, the source file version used in profile-generate needs to strictly match the version used in profile-use -- simple formating changes will invalidate the profile data (use of it will lead to compiler error or ICE). There are two main problems that lead to the weakness: 1) the function checksum is computed based on line number and number of basic blocks -- this means any line number change will invalidate the profile data. In fact, line number should play very minimal role in profile matching decision. Another problem is that number of basic blocks is also not a good indicator whether CFG matches or not. 2) cgraph's pid is used as the key to find the matching profile data for a function. If there is any new function added, or if the order of the functions changes, the profile data is invalidated. The attached is a patch from google that improves this. 1) function checksum is split into two parts: lineno checksum and cfg checksum 2) function assembler name is used in profile hashing. Bootstrapped and regression tested on x86_64/linux. Ok for trunk? Thanks, David 2011-04-13 Xinliang David Li davi...@google.com * tree.c (crc32_byte): New function. * tree.h (crc32_byte): New function. * gcov.c (function_info): Add new fields. (read_graph_file): Handle new fields. (read_count_file): Handle name. * gcov-io.c (gcov_string_length): New function. (gcov_write_words): Bug fix. (gcov_read_words): Bug fix. * gcov-io.h: New interfaces. * profile.c (get_exec_counts): New parameter. (compute_branch_probablilities): New parameter. (compute_value_histogram): New parameter. (branch_prob): Pass cfg_checksum. * coverage.c (get_const_string_type): New function. (hash_counts_entry_hash): Use string hash. (hash_counts_entry_eq): Use string compare. (htab_counts_entry_del): Delete name. (read_count_file): Add handling of cfg checksum. (get_coverage_counts): New parameter. (xstrdup_mask_random): New function. (coverage_compute_lineno_checksum): New function. (coverage_compute_cfg_checksum): New function. (coverage_begin_output): New parameter. (coverage_end_function): New parameter. (build_fn_info_type): Build new fields. (build_fn_info_value): Build new field values. * coverage.h: Interface changes. * libgcov.c (gcov_exit): Dump new fields. * gcov_dump.c (tag_function): Print new fields.
Re: [PATCH] Improve combining of conditionals
On Apr 15, 2011, at 3:44 AM, Eric Botcazou ebotca...@adacore.com wrote: And make IN_COND a bool instead of an int? I had the same initial reaction but then came to the same conclusion as Maxim. If it can be bool, it should be bool.
Re: [PATCH, SMS] Free sccs field
Hello, On 15 April 2011 18:53, Nathan Froyd froy...@codesourcery.com wrote: On Fri, Apr 15, 2011 at 06:27:05PM +0300, Revital Eres wrote: + if (all_sccs-sccs) + free (all_sccs-sccs); No need to check for non-NULL prior to free'ing. OK, I'll commit the patch without the check then. (after re-testing) Thanks, Revital -Nathan
Re: [build] Allow building libobjc_gc on Tru64 UNIX, Darwin
Nicola, Ok for mainline if both pass? Yes. [and by the way, I think you're fixing PR libobjc/32037 in the process. :-)] indeed, thanks for checking. Testing revealed that I'd been lazy with quoting. I'm including the patch I've installed, which encloses OBJC_BOEHM_GC in configure.ac in single quotes. Btw., it would be considerably easier if --enable-libobjc-gc could be enabled automatically if boehm-gc is configured. Yes. I've filed libobjc/48626 --enable-objc-gc should be automatic Besides, it seems that libobjc_gc isn't tested anywhere. Yes. and libobjc/48627 libobjc_gc should be tested just in case. Rainer 2011-04-13 Rainer Orth r...@cebitec.uni-bielefeld.de PR libobjc/32037 * Makefile.in (OBJC_GCFLAGS): Move ... * configure.ac (enable_objc_gc): ... here. Add $(libsuffix) to OBJC_BOEHM_GC. * configure: Regenerate. diff --git a/libobjc/Makefile.in b/libobjc/Makefile.in --- a/libobjc/Makefile.in +++ b/libobjc/Makefile.in @@ -93,7 +93,7 @@ LIBTOOL_INSTALL = $(LIBTOOL) --mode=inst LIBTOOL_CLEAN = $(LIBTOOL) --mode=clean #LIBTOOL_UNINSTALL = $(LIBTOOL) --mode=uninstall -OBJC_GCFLAGS=-DOBJC_WITH_GC=1 +OBJC_GCFLAGS=@OBJC_GCFLAGS@ OBJC_BOEHM_GC=@OBJC_BOEHM_GC@ OBJC_BOEHM_GC_INCLUDES=@OBJC_BOEHM_GC_INCLUDES@ OBJC_BOEHM_GC_LIBS=../boehm-gc/libgcjgc_convenience.la $(thread_libs_and_flags) diff --git a/libobjc/configure.ac b/libobjc/configure.ac --- a/libobjc/configure.ac +++ b/libobjc/configure.ac @@ -1,6 +1,6 @@ # Process this file with autoconf to produce a configure script. # Copyright 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002, 2003, 2004 -# 2005, 2006, 2009 Free Software Foundation, Inc. +# 2005, 2006, 2009, 2011 Free Software Foundation, Inc. # Originally contributed by Dave Love (d.l...@dl.ac.uk). # #This file is part of GCC. @@ -63,15 +63,25 @@ AC_ARG_ENABLE(objc-gc, the GNU Objective-C runtime.], [case $enable_objc_gc in no) +OBJC_GCFLAGS='' OBJC_BOEHM_GC='' OBJC_BOEHM_GC_INCLUDES='' ;; *) -OBJC_BOEHM_GC=libobjc_gc.la +OBJC_GCFLAGS='-DOBJC_WITH_GC=1' +OBJC_BOEHM_GC='libobjc_gc$(libsuffix).la' OBJC_BOEHM_GC_INCLUDES='-I$(top_srcdir)/../boehm-gc/include -I../boehm-gc/include' +case ${host} in + alpha*-dec-osf*) +# boehm-gc headers include pthread.h, which needs to be compiled + # with -pthread on Tru64 UNIX. +OBJC_GCFLAGS=${OBJC_GCFLAGS} -pthread + ;; +esac ;; esac], -[OBJC_BOEHM_GC=''; OBJC_BOEHM_GC_INCLUDES='']) +[OBJC_GCFLAGS=''; OBJC_BOEHM_GC=''; OBJC_BOEHM_GC_INCLUDES='']) +AC_SUBST(OBJC_GCFLAGS) AC_SUBST(OBJC_BOEHM_GC) AC_SUBST(OBJC_BOEHM_GC_INCLUDES) -- - Rainer Orth, Center for Biotechnology, Bielefeld University
[v3] Handle NOTY in extract_symvers.pl
While testing the close-to-final version of my COMDAT-group-with-Sun as patch [build, c++, lto] Support COMDAT group with Sun as (PR target/40483) http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01365.html http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00600.html which only awaits the release of a linker with the required fixes, I noticed that extract_symvers.pl doesn't handly NOTY/NOTYPE entries, but chokes instead, whereas the readelf based version handles them just fine. Although the final patch won't create them anymore, I'm installing the following patch that makes the script more robust. It was tested while putting the finishing touches on the as-comdat patch. Installed on mainline. Rainer 2011-04-04 Rainer Orth r...@cebitec.uni-bielefeld.de * scripts/extract_symvers.pl: Handle NOTY. diff --git a/libstdc++-v3/scripts/extract_symvers.pl b/libstdc++-v3/scripts/extract_symvers.pl --- a/libstdc++-v3/scripts/extract_symvers.pl +++ b/libstdc++-v3/scripts/extract_symvers.pl @@ -1,6 +1,6 @@ #!/usr/bin/perl -w -# Copyright (C) 2010 Free Software Foundation, Inc. +# Copyright (C) 2010, 2011 Free Software Foundation, Inc. # # This file is part of the GNU ISO C++ Library. This library is free # software; you can redistribute it and/or modify it under the @@ -108,6 +108,7 @@ while (ELFDUMP) { die unhandled symbol:\n$_ if ($bind !~ /^(GLOB|WEAK)/ or $oth !~ /[DP]/); # Adapt to readelf type naming convention. +$type = NOTYPE if ($type eq NOTY); $type = OBJECT if ($type eq OBJT); # Use correct symbol type. @@ -116,7 +117,7 @@ while (ELFDUMP) { close ELFDUMP or die elfdump error; foreach $symbol (keys %type) { -if ($type{$symbol} eq FUNC) { +if ($type{$symbol} eq FUNC || $type{$symbol} eq NOTYPE) { push @lines, $type{$symbol}:$symbol\@\@$version{$symbol}\n; } elsif ($type{$symbol} eq OBJECT and $size{$symbol} == 0) { push @lines, $type{$symbol}:$size{$symbol}:$version{$symbol}\n; -- - Rainer Orth, Center for Biotechnology, Bielefeld University
ObjC: rewritten checks for duplicate instance variables (for improved speed and improved error messages)
This patch rewrites the check for duplicate instance variables done by the Objective-C (and Objective-C++) compiler. The new code has many advantages: * it avoids copying all the instance variables into a temporary tree chain in the way that the all code was doing (according to some notes from benchmarks and profiling I did a few months ago, this should save about 5k allocations per compilation when compiling a typical ObjC GNUstep file, giving an approx 0.25% order of magnitude speedup with -fsyntax-only); * it provides better error messages by providing the informative message previous declaration of ... when a duplicate declaration is found; * it avoids producing duplicate errors if a class with duplicate instance variables is subclassed. The existing code always built a copy of the full list of instance variables for the class (including superclass ones) and then would look for duplicates. This means if you have a class with a duplicate, and subclass it 10 times, you'll get the same error message 11 times! The new code avoids this mistake, which also reduces the number of checks required (for the standard case without a hashtable, I'll explain in a minute). * it removes the special, ad-hoc field duplicate check implementation for ObjC++ and has it use the same code as the ObjC codebase. The new code does the checks for duplicates directly, by just iterating and comparing, unless the number of checks to do is very large, in which case it falls back to a hashtable. This is similar to what the C code does for structs, but here the check is more specific as we are only checking the instance variables in the current class against the other instance variables in the class and the superclass, so it's all more specific and optimized for the situation. Updated existing testcases (to check for the new previous declaration of ... messages), and added more testcases. Ok to commit ? Thanks Index: c-family/c-objc.h === --- c-family/c-objc.h (revision 172444) +++ c-family/c-objc.h (working copy) @@ -62,7 +62,7 @@ extern tree objc_build_string_object (tree); extern tree objc_get_protocol_qualified_type (tree, tree); extern tree objc_get_class_reference (tree); extern tree objc_get_class_ivars (tree); -extern tree objc_get_interface_ivars (tree); +extern bool objc_detect_field_duplicates (bool); extern void objc_start_class_interface (tree, tree, tree, tree); extern void objc_start_category_interface (tree, tree, tree, tree); extern void objc_start_protocol (tree, tree, tree); Index: c-family/ChangeLog === --- c-family/ChangeLog (revision 172444) +++ c-family/ChangeLog (working copy) @@ -1,3 +1,9 @@ +2011-04-15 Nicola Pero nicola.p...@meta-innovation.com + + * c-objc.h (objc_get_interface_ivars): Removed. + (objc_detect_field_duplicates): New. + * stub-objc.c: Likewise. + 2011-04-14 Nicola Pero nicola.p...@meta-innovation.com * stub-objc.c (objc_declare_protocols): Renamed to Index: c-family/stub-objc.c === --- c-family/stub-objc.c(revision 172444) +++ c-family/stub-objc.c(working copy) @@ -275,10 +275,10 @@ objc_get_class_reference (tree ARG_UNUSED (name)) return 0; } -tree -objc_get_interface_ivars (tree ARG_UNUSED (fieldlist)) +bool +objc_detect_field_duplicates (bool ARG_UNUSED (check_superclasses_only)) { - return 0; + return false; } tree Index: objc/ChangeLog === --- objc/ChangeLog (revision 172444) +++ objc/ChangeLog (working copy) @@ -1,3 +1,10 @@ +2011-04-15 Nicola Pero nicola.p...@meta-innovation.com + + * objc-act.c (objc_get_interface_ivars): Removed. + (objc_detect_field_duplicates): New. + (hash_instance_variable): New. + (eq_instance_variable): New. + 2011-04-14 Nicola Pero nicola.p...@meta-innovation.com * objc-act.c (objc_declare_protocols): Renamed to Index: objc/objc-act.c === --- objc/objc-act.c (revision 172444) +++ objc/objc-act.c (working copy) @@ -3813,6 +3813,8 @@ lookup_interface (tree ident) } } + + /* Implement @defs (classname) within struct bodies. */ tree @@ -3829,19 +3831,242 @@ objc_get_class_ivars (tree class_name) return error_mark_node; } + +/* Functions used by the hashtable for field duplicates in + objc_detect_field_duplicates(). Ideally, we'd use a standard + key-value dictionary hashtable , and store as keys the field names, + and as values the actual declarations (used to print nice error + messages with the locations). But, the hashtable we are using only + allows us to store keys in the hashtable, without values (it looks +
[PATCH, PR 48601] cgraph_get_create_node in emultls.c
Hi, lower_emutls_function_body in emultls.c should use cgraph_get_create_node to create call graph nodes if need be because it is introducing TLS builtin decls into IL. I have bootstrapped and tested this on x86_64-linux (where the bug does not happen) and in bugzilla it has been confirmed it fixes the issue. It is rather straight-forward and was pre-approved on IRC by Honza so I am about to commit it now. Testcases are already in the testsuite. Thanks, Martin 2011-04-15 Martin Jambor mjam...@suse.cz PR middle-end/48601 * tree-emutls.c (lower_emutls_function_body): Call cgraph_get_create_node instead of cgraph_get_node. Do not assert the result is non-NULL. Index: src/gcc/tree-emutls.c === --- src.orig/gcc/tree-emutls.c +++ src/gcc/tree-emutls.c @@ -619,8 +619,9 @@ lower_emutls_function_body (struct cgrap d.cfun_node = node; d.builtin_decl = built_in_decls[BUILT_IN_EMUTLS_GET_ADDRESS]; - d.builtin_node = cgraph_get_node (d.builtin_decl); - gcc_checking_assert (d.builtin_node); + /* This is where we introduce the declaration to the IL and so we have to + create a node for it. */ + d.builtin_node = cgraph_get_create_node (d.builtin_decl); FOR_EACH_BB (d.bb) {
Re: [PATCH] Improve combining of conditionals
If it can be bool, it should be bool. Rather basic principle IMO. Consistency is of much greater value. -- Eric Botcazou
Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.
Denis Chertykov schrieb: 2011/4/14 Georg-Johann Lay a...@gjlay.de: Denis Chertykov schrieb: 2011/4/14 Georg-Johann Lay a...@gjlay.de: The rotlmode3 expanders (mode \in {HI,SI,DI}) violates synopsis by using 4 operands instead of 3. This runs in ICE in top of optabs.c:maybe_gen_insn. The right way to do this is to use match_dup, not match_operand. So the fix is obvious. Regenerated avr-gcc and run against avr testsuite: gcc.target/avr/torture/pr41885.c generates these patterns Johann 2011-04-14 Georg-Johann Lay a...@gjlay.de * config/avr/avr.md (rotlhi3, rotlsi3, rotldi3): Fix expanders operand 3 from match_operand to match_dup. May be better to use (clobber (match_scratch ...)) here and in `*rotwmode' and `*rotbmode'. These are splitters that might split before reload, producing strange, non-matching insns like (set (scratch:QI) ... or crashing already in avr_rotate_bytes becase the split condition reads (reload_completed || MODEmode == DImode) Generally I'm agree with you, change match_operand to match_dup is easy. I already submitted the easy patch to avoid the ICE. But IMHO the right way is: - the splitters and avr_rotate_bytes are wrong and must be fixed too. - operands[3] is a scratch register and right way is to use match_scratch. I can approve the patch. Denis. Ok, here is the right-way patch. The expander generates a SCRATCH instead of a pseudo, and in case of a pre-reload split the SCRATCH is replaced with a pseudo because it is not known if or if not the scratch will actually be needed. avr_rotate_bytes now skips operations on non-REG 'scratch'. Furthermore, I added an assertion that ensures that 'scratch' is actually a REG when it is needed. Besides that, I fixed indentation. I know it is not optimal to fix indentation alongside with functionality... did it anyway :-) Finally, I exposed alternative #3 of the insns to the register allocator, because it is not possible to distinguish between overlapping or non-overlapping regs, and #3 does not need a scratch. Ran C-testsuite with no regressions. Johann 2011-04-15 Georg-Johann Lay a...@gjlay.de * config/avr/avr.md (rotlmode3): Use SCRATCH instead of REG in operand 3. Fix indentation. Unquote C snippet. (*rotwmode,*rotbwmode): Ditto. Replace scratch with pseudo for pre-reload splits. Use const_int_operand for operand 2. Expose alternative 3 to register allocator. * config/avr/avr.c (avr_rotate_bytes): Handle SCRATCH in operands[3]. Use GNU indentation style. Index: config/avr/avr.md === --- config/avr/avr.md (Revision 172493) +++ config/avr/avr.md (Arbeitskopie) @@ -1519,22 +1519,22 @@ (define_mode_attr rotsmode [(DI QI) (S (define_expand rotlmode3 [(parallel [(set (match_operand:HIDI 0 register_operand ) - (rotate:HIDI (match_operand:HIDI 1 register_operand ) -(match_operand:VOID 2 const_int_operand ))) - (clobber (match_dup 3))])] + (rotate:HIDI (match_operand:HIDI 1 register_operand ) +(match_operand:VOID 2 const_int_operand ))) + (clobber (match_dup 3))])] - -{ - if (CONST_INT_P (operands[2]) 0 == (INTVAL (operands[2]) % 8)) { - if (AVR_HAVE_MOVW 0 == INTVAL (operands[2]) % 16) -operands[3] = gen_reg_rtx (rotsmodemode); - else -operands[3] = gen_reg_rtx (QImode); - } - else -FAIL; -}) +if (CONST_INT_P (operands[2]) + 0 == INTVAL (operands[2]) % 8) + { +if (AVR_HAVE_MOVW 0 == INTVAL (operands[2]) % 16) + operands[3] = gen_rtx_SCRATCH (rotsmodemode); +else + operands[3] = gen_rtx_SCRATCH (QImode); + } +else + FAIL; + }) ;; Overlapping non-HImode registers often (but not always) need a scratch. @@ -1545,35 +1545,49 @@ (define_expand rotlmode3 ; Split word aligned rotates using scratch that is mode dependent. (define_insn_and_split *rotwmode - [(set (match_operand:HIDI 0 register_operand =r,r,#r) - (rotate:HIDI (match_operand:HIDI 1 register_operand 0,r,r) - (match_operand 2 immediate_operand n,n,n))) - (clobber (match_operand:rotsmode 3 register_operand =rotx ))] - (CONST_INT_P (operands[2]) - (0 == (INTVAL (operands[2]) % 16) AVR_HAVE_MOVW)) + [(set (match_operand:HIDI 0 register_operand =r,r,r) +(rotate:HIDI (match_operand:HIDI 1 register_operand 0,r,r) + (match_operand 2 const_int_operand n,n,n))) + (clobber (match_scratch:rotsmode 3 =rotx))] + AVR_HAVE_MOVW +0 == INTVAL (operands[2]) % 16 # (reload_completed || MODEmode == DImode) [(const_int 0)] - avr_rotate_bytes (operands); - DONE; -) + { +/* Split happens in .split1: Cook up a pseudo */ +if (SCRATCH == GET_CODE (operands[3]) + !reload_completed) + { +operands[3] = gen_reg_rtx (GET_MODE
[wwwdocs] Shuffle entries in the footer and remove margin at the bottom
This concludes this mini project. :-) Committed and the pages on gcc.gnu.org regenerated; www.gnu.org will follow over night. Gerald Index: style.mhtml === RCS file: /cvs/gcc/wwwdocs/htdocs/style.mhtml,v retrieving revision 1.115 diff -u -r1.115 style.mhtml --- style.mhtml 12 Apr 2011 21:11:23 - 1.115 +++ style.mhtml 15 Apr 2011 17:22:30 - @@ -232,14 +232,10 @@ div class=copyright -p style=margin-top:0;These pages are -a href=http://gcc.gnu.org/about.html;maintained by the GCC team/a. -Last modified date::format-time -MM-DD!-- IGNORE DIFF ---a href=http://validator.w3.org/check/referer;./a/p - -addressFor questions related to the use of GCC, please consult these web -pages and the a href=http://gcc.gnu.org/onlinedocs/;GCC manuals/a. If -that fails, the a href=mailto:gcc-h...@gcc.gnu.org;gcc-h...@gcc.gnu.org/a +address style=margin-top:0;For questions related to the use of GCC, +please consult these web pages and the +a href=http://gcc.gnu.org/onlinedocs/;GCC manuals/a. If that fails, +the a href=mailto:gcc-h...@gcc.gnu.org;gcc-h...@gcc.gnu.org/a mailing list might help. Comments on these web pages and the development of GCC are welcome on our developer list at a href=mailto:g...@gcc.gnu.org;g...@gcc.gnu.org/a. @@ -252,6 +248,11 @@ Verbatim copying and distribution of this entire article is permitted in any medium, provided this notice is preserved./p +p style=margin-bottom:0;These pages are +a href=http://gcc.gnu.org/about.html;maintained by the GCC team/a. +Last modified date::format-time -MM-DD!-- IGNORE DIFF +--a href=http://validator.w3.org/check/referer;./a/p + /div !-- --
[pph] Clean positive tests. (issue4423044)
This patch cleans up positive tests. First, stop on first failure so as to avoid littering the logfile. Second, change the extensions of the generated assembly files to .s-pph and .s+pph to be clear on the provenence of each file. Index: gcc/testsuite/ChangeLog.pph 2011-04-15 Lawrence Crowl cr...@google.com * lib/dg-pph.exp (dg-pph-pos): Stop on first failure. Change names of assembly files to be clear on which is with and without PPH. Index: gcc/testsuite/lib/dg-pph.exp === --- gcc/testsuite/lib/dg-pph.exp(revision 172457) +++ gcc/testsuite/lib/dg-pph.exp(working copy) @@ -74,13 +74,13 @@ proc dg-pph-pos { subdir test options ma if { $have_errs } { verbose -log regular compilation failed fail $nshort $options, regular compilation failed -return + return } if { ! [ file_on_host exists $bname.s ] } { - verbose -log assembly file '$bname.s' missing - fail $nshort $options, assembly comparison -return + verbose -log regular assembly file '$bname.s' missing + fail $nshort $options, regular assembly missing + return } # Rename the .s file into .s-pph to compare it after the second build. @@ -89,19 +89,33 @@ proc dg-pph-pos { subdir test options ma # Compile a second time using the pph files. dg-test -keep-output $test $options $mapflag -I. -remote_upload host $bname.s + +if { $have_errs } { + verbose -log PPH compilation failed + fail $nshort $options, PPH compilation failed + return +} + +if { ! [ file_on_host exists $bname.s ] } { + verbose -log PPH assembly file '$bname.s' missing + fail $nshort $options, PPH assembly missing + return +} + +# Rename the .s file into .s+pph to compare it. +remote_upload host $bname.s $bname.s+pph +remote_download host $bname.s+pph # Compare the two assembly files. They should be identical. -set tmp [ diff $bname.s $bname.s-pph ] +set tmp [ diff $bname.s-pph $bname.s+pph ] if { $tmp == 0 } { -verbose -log assembly file '$bname.s', '$bname.s-pph' comparison error -fail $nshort $options assembly comparison + verbose -log assembly file '$bname.s-pph', '$bname.s+pph' comparison error + fail $nshort $options assembly comparison } elseif { $tmp == 1 } { -pass $nshort $options assembly comparison + pass $nshort $options assembly comparison + file_on_host delete $bname.s-pph + file_on_host delete $bname.s+pph } else { -fail $nshort $options assembly comparison + fail $nshort $options assembly comparison } -file_on_host delete $bname.s -file_on_host delete $bname.s-pph - } -- This patch is available for review at http://codereview.appspot.com/4423044
Re: [PATCH] sel-sched: Fix erroneous re-use of pointer to last insn in a BB (PR 48235)
Vladimir and Dmitry, The new test you added, gcc.dg/pr48235.c, is failing on IA64 because it gets this excess message: cc1: note: -freorder-blocks-and-partition does not work on this architecture Could you modify the test to either filter out this message or not use the -freorder-blocks-and-partition option in IA64 or not run the test at all on IA64. I notice that gcc.dg/tree-prof/pr45354.c looks like this test but also has: /* { dg-require-effective-target freorder } */ That test does not fail, or run, on IA64 due to this dg option. Steve Ellcey s...@cup.hp.com
ObjC: get rid of another unnecessary, temporary copy of instance variables
This patch for the Objective-C compiler gets rid of another case where we'd create a temporary tree chain with a copy of all the instance variables of a class just to do a simple check on them. The check is the one to check the scope of the instance variable that is being accessed; without this patch, the check requires copying all the instance variables for the class on a temporary tree chain. With this patch, the check is done (as you'd expect) on the instance variables themselves, without copying them. This is obviously faster; on the other hand, the typical GNUstep-based ObjC file has classes with few instance variables (maybe 10 or 20) and only accesses instance variables up to a few hundred times per file, so the speedup is only of the order of 0.1% to 0.2% even with -fsyntax-only. Of course, in unusual cases (ie, classes with many instance variables, and a big file accessing instance variables lots of times) the speedup may be bigger. Ok to commit ? Thanks Index: ChangeLog === --- ChangeLog (revision 172511) +++ ChangeLog (working copy) @@ -1,5 +1,10 @@ 2011-04-15 Nicola Pero nicola.p...@meta-innovation.com + * objc-act.c (ivar_of_class): New. + (objc_is_public): Use ivar_of_class. + +2011-04-15 Nicola Pero nicola.p...@meta-innovation.com + * objc-act.c (objc_get_interface_ivars): Removed. (objc_detect_field_duplicates): New. (hash_instance_variable): New. Index: objc-act.c === --- objc-act.c (revision 172511) +++ objc-act.c (working copy) @@ -6367,6 +6367,35 @@ is_private (tree decl) DECL_NAME (decl))); } +/* Searches all the instance variables of 'klass' and of its + superclasses for an instance variable whose name (identifier) is + 'ivar_name_ident'. Return the declaration (DECL) of the instance + variable, if found, or NULL_TREE, if not found. */ +static inline tree +ivar_of_class (tree klass, tree ivar_name_ident) +{ + /* First, look up the ivar in CLASS_RAW_IVARS. */ + tree decl_chain = CLASS_RAW_IVARS (klass); + + for ( ; decl_chain; decl_chain = DECL_CHAIN (decl_chain)) +if (DECL_NAME (decl_chain) == ivar_name_ident) + return decl_chain; + + /* If not found, search up the class hierarchy. */ + while (CLASS_SUPER_NAME (klass)) +{ + klass = lookup_interface (CLASS_SUPER_NAME (klass)); + + decl_chain = CLASS_RAW_IVARS (klass); + + for ( ; decl_chain; decl_chain = DECL_CHAIN (decl_chain)) + if (DECL_NAME (decl_chain) == ivar_name_ident) + return decl_chain; +} + + return NULL_TREE; +} + /* We have an instance variable reference;, check to see if it is public. */ int @@ -6397,7 +6426,7 @@ objc_is_public (tree expr, tree identifier) return 0; } - if ((decl = is_ivar (get_class_ivars (klass, true), identifier))) + if ((decl = ivar_of_class (klass, identifier))) { if (TREE_PUBLIC (decl)) return 1;
FDO usability trivial patch to fix div by zero error with insane profile
This is a trivial patch to deal with bad profile data (from multi-threaded programs) that leads to divide by zero error. Ok for trunk? Thanks, David 2011-04-15 Xinliang David Li davi...@google.com * ipa-inline.c (cgraph_decide_recursive_inlining): Fix div by zero error with insane profile. Index: ipa-inline.c === --- ipa-inline.c (revision 172510) +++ ipa-inline.c (working copy) @@ -779,7 +779,7 @@ cgraph_decide_recursive_inlining (struct fprintf (dump_file,Not inlining cold call\n); continue; } - if (curr-count * 100 / node-count probability) + if (node-count == 0 || curr-count * 100 / node-count probability) { if (dump_file) fprintf (dump_file,
Re: FDO usability trivial patch to fix div by zero error with insane profile
On Fri, Apr 15, 2011 at 15:20, Xinliang David Li davi...@google.com wrote: This is a trivial patch to deal with bad profile data (from multi-threaded programs) that leads to divide by zero error. Ok for trunk? Thanks, David 2011-04-15 Xinliang David Li davi...@google.com * ipa-inline.c (cgraph_decide_recursive_inlining): Fix div by zero error with insane profile. Looks fine to me, but you really want Honza to review this. Any chance we can get a test case for this? Diego.
[PATCH] Prefer simplify_gen_* over gen_* in expand_debug_expr
Hi! On IRC richi mentioned yesterday that when gcc.dg/guality/asm-1.c testcase is compiled with g++ instead of gcc, it fails. Apparently that is because of slightly different ARRAY_REF in the code, which for g++ leads into (sign_extend:DI (zero_extend:SI (something:HI))) in DEBUG_INSN (as opposed to (zero_extend:DI (something:HI)) with C FE) and var-tracking doesn't happen to match it in some places. The above is something simplify-rtx.c can fix up though. This patch fixes it by using simplify_gen_* instead of gen_* in many more places than it did until now in expand_debug_expr. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2011-04-15 Jakub Jelinek ja...@redhat.com * cfgexpand.c (expand_debug_expr): Use simplify_gen_{unary,binary,ternary} instead of gen_rtx_*. --- gcc/cfgexpand.c.jj 2011-04-09 19:29:19.081421040 +0200 +++ gcc/cfgexpand.c 2011-04-14 18:52:51.354402038 +0200 @@ -2364,6 +2364,7 @@ expand_debug_expr (tree exp) { rtx op0 = NULL_RTX, op1 = NULL_RTX, op2 = NULL_RTX; enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp)); + enum machine_mode inner_mode = VOIDmode; int unsignedp = TYPE_UNSIGNED (TREE_TYPE (exp)); addr_space_t as; @@ -2410,6 +2411,7 @@ expand_debug_expr (tree exp) unary: case tcc_unary: + inner_mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0))); op0 = expand_debug_expr (TREE_OPERAND (exp, 0)); if (!op0) return NULL_RTX; @@ -2513,7 +2515,7 @@ expand_debug_expr (tree exp) case NOP_EXPR: case CONVERT_EXPR: { - enum machine_mode inner_mode = GET_MODE (op0); + inner_mode = GET_MODE (op0); if (mode == inner_mode) return op0; @@ -2560,9 +2562,9 @@ expand_debug_expr (tree exp) else if (TREE_CODE_CLASS (TREE_CODE (exp)) == tcc_unary ? TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, 0))) : unsignedp) - op0 = gen_rtx_ZERO_EXTEND (mode, op0); + op0 = simplify_gen_unary (ZERO_EXTEND, mode, op0, inner_mode); else - op0 = gen_rtx_SIGN_EXTEND (mode, op0); + op0 = simplify_gen_unary (SIGN_EXTEND, mode, op0, inner_mode); return op0; } @@ -2697,7 +2699,8 @@ expand_debug_expr (tree exp) /* Don't use offset_address here, we don't need a recognizable address, and we don't want to generate code. */ - op0 = gen_rtx_MEM (mode, gen_rtx_PLUS (addrmode, op0, op1)); + op0 = gen_rtx_MEM (mode, simplify_gen_binary (PLUS, addrmode, + op0, op1)); } if (MEM_P (op0)) @@ -2770,25 +2773,23 @@ expand_debug_expr (tree exp) } case ABS_EXPR: - return gen_rtx_ABS (mode, op0); + return simplify_gen_unary (ABS, mode, op0, mode); case NEGATE_EXPR: - return gen_rtx_NEG (mode, op0); + return simplify_gen_unary (NEG, mode, op0, mode); case BIT_NOT_EXPR: - return gen_rtx_NOT (mode, op0); + return simplify_gen_unary (NOT, mode, op0, mode); case FLOAT_EXPR: - if (unsignedp) - return gen_rtx_UNSIGNED_FLOAT (mode, op0); - else - return gen_rtx_FLOAT (mode, op0); + return simplify_gen_unary (TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, +0))) +? UNSIGNED_FLOAT : FLOAT, mode, op0, +inner_mode); case FIX_TRUNC_EXPR: - if (unsignedp) - return gen_rtx_UNSIGNED_FIX (mode, op0); - else - return gen_rtx_FIX (mode, op0); + return simplify_gen_unary (unsignedp ? UNSIGNED_FIX : FIX, mode, op0, +inner_mode); case POINTER_PLUS_EXPR: /* For the rare target where pointers are not the same size as @@ -2799,161 +2800,164 @@ expand_debug_expr (tree exp) GET_MODE (op0) != GET_MODE (op1)) { if (GET_MODE_BITSIZE (GET_MODE (op0)) GET_MODE_BITSIZE (GET_MODE (op1))) - op1 = gen_rtx_TRUNCATE (GET_MODE (op0), op1); + op1 = simplify_gen_unary (TRUNCATE, GET_MODE (op0), op1, + GET_MODE (op1)); else /* We always sign-extend, regardless of the signedness of the operand, because the operand is always unsigned here even if the original C expression is signed. */ - op1 = gen_rtx_SIGN_EXTEND (GET_MODE (op0), op1); + op1 = simplify_gen_unary (SIGN_EXTEND, GET_MODE (op0), op1, + GET_MODE (op1)); } /* Fall through. */ case PLUS_EXPR: - return gen_rtx_PLUS (mode, op0, op1); + return simplify_gen_binary (PLUS, mode, op0, op1); case MINUS_EXPR: - return gen_rtx_MINUS (mode, op0, op1); + return simplify_gen_binary (MINUS, mode, op0, op1);
Re: FDO usability trivial patch to fix div by zero error with insane profile
There is no way to get a test case that can produce the problem in a deterministic way. There is really not much for Honza to review though :) David On Fri, Apr 15, 2011 at 12:52 PM, Diego Novillo dnovi...@google.com wrote: On Fri, Apr 15, 2011 at 15:20, Xinliang David Li davi...@google.com wrote: This is a trivial patch to deal with bad profile data (from multi-threaded programs) that leads to divide by zero error. Ok for trunk? Thanks, David 2011-04-15 Xinliang David Li davi...@google.com * ipa-inline.c (cgraph_decide_recursive_inlining): Fix div by zero error with insane profile. Looks fine to me, but you really want Honza to review this. Any chance we can get a test case for this? Diego.
Re: ObjC: get rid of another unnecessary, temporary copy of instance variables
On Apr 15, 2011, at 11:52 AM, Nicola Pero wrote: This patch for the Objective-C compiler gets rid of another case where we'd create a temporary tree chain with a copy of all the instance variables of a class just to do a simple check on them. Ok to commit ? Ok.
Re: FDO usability trivial patch to fix div by zero error with insane profile
On Fri, Apr 15, 2011 at 16:21, Xinliang David Li davi...@google.com wrote: There is no way to get a test case that can produce the problem in a deterministic way. There is really not much for Honza to review though :) The check may be needed somewhere else, you may be papering over the real issue. I don't think you are but I don't know the code well enough to say for sure. Diego.
Re: [PATCH, rs6000 committed] Fix PowerPC bootstrap
On 04/12/2011 08:22 PM, Alan Modra wrote: On Tue, Apr 12, 2011 at 04:00:45PM -0500, Pat Haugen wrote: --- gcc/config/rs6000/rs6000.c (revision 172327) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -7976,7 +7976,7 @@ call_ABI_of_interest (tree fndecl) return true; /* Interesting functions that we are emitting in this object file. */ - c_node = cgraph_node (fndecl); + c_node = cgraph_get_create_node (fndecl); return !cgraph_only_called_directly_p (c_node); } return false; I think we should use cgraph_get_node here. OK, fixed with following. 2011-04-15 Pat Haugen pthau...@us.ibm.com * config/rs6000/rs6000.c (call_ABI_of_interest): Call cgraph_get_node instead of cgraph_get_create_node. Index: gcc/config/rs6000/rs6000.c === --- gcc/config/rs6000/rs6000.c (revision 172498) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -7976,7 +7976,7 @@ call_ABI_of_interest (tree fndecl) return true; /* Interesting functions that we are emitting in this object file. */ - c_node = cgraph_get_create_node (fndecl); + c_node = cgraph_get_node (fndecl); return !cgraph_only_called_directly_p (c_node); } return false;
go patch committed: Add compound_statement to backend interface
This patch to the Go frontends adds a convenience function compound_statement to build a statement list from two statements. Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu. Committed to mainline. Ian 2011-04-15 Ian Lance Taylor i...@google.com * go-gcc.cc (Gcc_backend::compound_statement): New function. (Gcc_backend::assignment_statement): Use error_statement. (Gcc_backend::return_statement): Likewise. (Gcc_backend::if_statement): Likewise. (Gcc_backend::switch_statement): Likewise. (Gcc_backend::statement_list): Likewise. Index: gcc/go/go-gcc.cc === --- gcc/go/go-gcc.cc (revision 172468) +++ gcc/go/go-gcc.cc (working copy) @@ -191,6 +191,9 @@ class Gcc_backend : public Backend source_location); Bstatement* + compound_statement(Bstatement*, Bstatement*); + + Bstatement* statement_list(const std::vectorBstatement*); // Labels. @@ -244,7 +247,7 @@ Gcc_backend::assignment_statement(Bexpre tree lhs_tree = lhs-get_tree(); tree rhs_tree = rhs-get_tree(); if (lhs_tree == error_mark_node || rhs_tree == error_mark_node) -return this-make_statement(error_mark_node); +return this-error_statement(); return this-make_statement(fold_build2_loc(location, MODIFY_EXPR, void_type_node, lhs_tree, rhs_tree)); @@ -259,10 +262,10 @@ Gcc_backend::return_statement(Bfunction* { tree fntree = bfunction-get_tree(); if (fntree == error_mark_node) -return this-make_statement(error_mark_node); +return this-error_statement(); tree result = DECL_RESULT(fntree); if (result == error_mark_node) -return this-make_statement(error_mark_node); +return this-error_statement(); tree ret; if (vals.empty()) ret = fold_build1_loc(location, RETURN_EXPR, void_type_node, NULL_TREE); @@ -270,7 +273,7 @@ Gcc_backend::return_statement(Bfunction* { tree val = vals.front()-get_tree(); if (val == error_mark_node) - return this-make_statement(error_mark_node); + return this-error_statement(); tree set = fold_build2_loc(location, MODIFY_EXPR, void_type_node, result, vals.front()-get_tree()); ret = fold_build1_loc(location, RETURN_EXPR, void_type_node, set); @@ -294,7 +297,7 @@ Gcc_backend::return_statement(Bfunction* rettmp, field, NULL_TREE); tree val = (*p)-get_tree(); if (val == error_mark_node) - return this-make_statement(error_mark_node); + return this-error_statement(); tree set = fold_build2_loc(location, MODIFY_EXPR, void_type_node, ref, (*p)-get_tree()); append_to_statement_list(set, stmt_list); @@ -322,7 +325,7 @@ Gcc_backend::if_statement(Bexpression* c if (cond_tree == error_mark_node || then_tree == error_mark_node || else_tree == error_mark_node) -return this-make_statement(error_mark_node); +return this-error_statement(); tree ret = build3_loc(location, COND_EXPR, void_type_node, cond_tree, then_tree, else_tree); return this-make_statement(ret); @@ -363,7 +366,7 @@ Gcc_backend::switch_statement( { tree t = (*pcv)-get_tree(); if (t == error_mark_node) - return this-make_statement(error_mark_node); + return this-error_statement(); source_location loc = EXPR_LOCATION(t); tree label = create_artificial_label(loc); tree c = build3_loc(loc, CASE_LABEL_EXPR, void_type_node, @@ -376,19 +379,36 @@ Gcc_backend::switch_statement( { tree t = (*ps)-get_tree(); if (t == error_mark_node) - return this-make_statement(error_mark_node); + return this-error_statement(); append_to_statement_list(t, stmt_list); } } tree tv = value-get_tree(); if (tv == error_mark_node) -return this-make_statement(error_mark_node); +return this-error_statement(); tree t = build3_loc(switch_location, SWITCH_EXPR, void_type_node, tv, stmt_list, NULL_TREE); return this-make_statement(t); } +// Pair of statements. + +Bstatement* +Gcc_backend::compound_statement(Bstatement* s1, Bstatement* s2) +{ + tree stmt_list = NULL_TREE; + tree t = s1-get_tree(); + if (t == error_mark_node) +return this-error_statement(); + append_to_statement_list(t, stmt_list); + t = s2-get_tree(); + if (t == error_mark_node) +return this-error_statement(); + append_to_statement_list(t, stmt_list); + return this-make_statement(stmt_list); +} + // List of statements. Bstatement* @@ -401,7 +421,7 @@ Gcc_backend::statement_list(const std::v { tree t = (*p)-get_tree(); if (t == error_mark_node) - return this-make_statement(error_mark_node); + return this-error_statement(); append_to_statement_list(t, stmt_list); } return this-make_statement(stmt_list); Index: gcc/go/gofrontend/statements.cc === --- gcc/go/gofrontend/statements.cc (revision
[Patch, Fortran] PR 48624 - fix DECL for external procedures with proc arguments
Hi all, the following patch fixes an issue with a multiple decl for procedures with procedure dummy arguments. Without the patch, multiple declarations are generated, which causes link failures (euler is optimized away) with -fwhole-program. (Thanks goes to Richard for spotting the problem with gfortran.dg/cray_pointers_8.f90.) I have no idea why there should be a problem with inlining (cf. deleted comment), but removing that part did not give any test suite failure (check-gfortran + libgomp's check). Additionally, I have build and run the Polyhedron testsuite with -march=native -ffast-math -funroll-loops -O3 -fno-protect-parens -finline-limit=400 -fwhole-program. Build and tested on x86-64-linux. OK for the trunk? Tobias PS: I have not created a test case; you think it should be added, one can copy gfortran.dg/cray_pointers_8.f90 and add a version with dg-options -fwhole-file. 2011-05-15 Tobias Burnus bur...@net-b.de PR fortran/48624 * trans-decl.c (gfc_get_extern_function_decl): Fix decl for external procedures with proc arguments. diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c index 784dfc8..866720f 100644 --- a/gcc/fortran/trans-decl.c +++ b/gcc/fortran/trans-decl.c @@ -1527,7 +1527,6 @@ gfc_get_extern_function_decl (gfc_symbol * sym) tree name; tree mangled_name; gfc_gsymbol *gsym; - bool proc_formal_arg; if (sym-backend_decl) return sym-backend_decl; @@ -1544,27 +1543,10 @@ gfc_get_extern_function_decl (gfc_symbol * sym) return the backend_decl. */ gsym = gfc_find_gsymbol (gfc_gsym_root, sym-name); - /* Do not use procedures that have a procedure argument because this - can result in problems of multiple decls during inlining. */ - proc_formal_arg = false; - if (gsym gsym-ns gsym-ns-proc_name) -{ - gfc_formal_arglist *formal = gsym-ns-proc_name-formal; - for (; formal; formal = formal-next) - { - if (formal-sym formal-sym-attr.flavor == FL_PROCEDURE) - { - proc_formal_arg = true; - break; - } - } -} - if (gfc_option.flag_whole_file (!sym-attr.use_assoc || sym-attr.if_source != IFSRC_DECL) !sym-backend_decl gsym gsym-ns - !proc_formal_arg ((gsym-type == GSYM_SUBROUTINE) || (gsym-type == GSYM_FUNCTION)) (gsym-ns-proc_name-backend_decl || !sym-attr.intrinsic)) {
Re: PATCH: PR middle-end/48440: [4.7 Regression] FAIL: gcc.c-torture/compile/labels-3.c
I was curious if anyone was still looking at this problem? I see this on IA64 HP-UX in 32 bit mode where ptr_mode(SImode) != Pmode(DImode). As H.J. points out, expand_expr_real_2 is calling simplify_gen_subreg (expr.c, line 7366) and at that point op0 is (label_ref/v:DI 32) and innermode is SImode. Because innermode doesn't match the mode of op0, we abort in simplify_subreg. So it seems we either need to change how we calculate innermode (so that it is DImode) or change expand_expr so that it returns a SImode RTX expression for the tree treeop0. I am not sure which of these is the right thing to do. The tree (treeop0) that op0 is generated from is below and when we call expr_expr the modifier is EXPAND_INITIALIZER. Should expand_expr return an SImode expression for this tree (ptr_mode) or a DImode expression (Pmode) in this case? addr_expr 65866560 type pointer_type 657d8960 type void_type 657d8900 void VOID align 8 symtab 0 alias set -1 canonical type 657d8900 pointer_to_this pointer_type 657d8960 sizes-gimplified public unsigned SI size integer_cst 657d1920 constant 32 unit size integer_cst 657d16c0 constant 4 align 32 symtab 0 alias set -1 canonical type 657d8960 pointer_to_this pointer_type 657f1720 reference_to_this reference_type 657f14e0 constant arg 0 label_decl 657e21b8 l2 type void_type 657d8900 void side-effects VOID file labels-3.c line 16 col 2 align 1 context function_decl 65858180 foo initial error_mark 657e6040 (code_label/s 32 0 0 4 4 (l2) [2 uses]) chain var_decl 6585e3c0 p type pointer_type 657d8960 used unsigned SI file labels-3.c line 12 col 9 size integer_cst 657d1920 32 unit size integer_cst 657d16c0 4 align 32 context function_decl 65858180 foo (mem/f/c/i:SI (reg/f:DI 328 sfp) [0 p+0 S4 A32]) labels-3.c:11:44 Steve Ellcey s...@cup.hp.com
Re: [Patch, Fortran] PR 48624 - fix DECL for external procedures with proc arguments
On Sat, Apr 16, 2011 at 12:30:01AM +0200, Tobias Burnus wrote: I have no idea why there should be a problem with inlining (cf. deleted comment), but removing that part did not give any test suite failure (check-gfortran + libgomp's check). Additionally, I have build and run the Polyhedron testsuite with -march=native -ffast-math -funroll-loops -O3 -fno-protect-parens -finline-limit=400 -fwhole-program. The comment comes from r170414, which suggest there is a problem if a procedure is in an argument list, and that procedure gets in-lined. laptop:kargl[205] svn log -r 170414 trans-decl.c |more r170414 | pault | 2011-02-22 12:33:45 -0800 (Tue, 22 Feb 2011) | 12 lines 2011-02-22 Paul Thomas pa...@gcc.gnu.org PR fortran/45743 * trans-decl.c (gfc_get_extern_function_decl): Don't use the gsymbol backend_decl if the procedure has a formal argument that is a procedure. 2011-02-22 Paul Thomas pa...@gcc.gnu.org PR fortran/45743 * gfortran.dg/whole_file_32.f90 : New test. Perhaps, looking at the -fdump-tree-original on whole_file_32.f90 may shed light on the situation. -- Steve
Re: [C++0x] Range-based for statements and ADL
Applied (with a few formatting tweaks). Thanks, Jason
[v3] libstdc++/48631
Hi, tested x86_64-linux, committed mainline and 4_6-branch. Paolo. 2011-04-15 Daniel Krugler daniel.krueg...@googlemail.com Paolo Carlini paolo.carl...@oracle.com PR libstdc++/48631 * include/bits/unique_ptr.h (default_delete_Tp[]): Add deleted function call operator. * testsuite/20_util/default_delete/48631_neg.cc: New. * testsuite/20_util/weak_ptr/comparison/cmp_neg.cc: Adjust dg-error line numbers. Index: include/bits/unique_ptr.h === --- include/bits/unique_ptr.h (revision 172533) +++ include/bits/unique_ptr.h (working copy) @@ -79,6 +79,8 @@ can't delete pointer to incomplete type); delete [] __ptr; } + + templatetypename _Up void operator()(_Up*) const = delete; }; /// 20.7.12.2 unique_ptr for single objects. Index: testsuite/20_util/default_delete/48631_neg.cc === --- testsuite/20_util/default_delete/48631_neg.cc (revision 0) +++ testsuite/20_util/default_delete/48631_neg.cc (revision 0) @@ -0,0 +1,30 @@ +// { dg-options -std=gnu++0x } +// { dg-do compile } + +// Copyright (C) 2011 Free Software Foundation +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// http://www.gnu.org/licenses/. + +#include memory + +struct B { }; +struct D : B { }; + +// libstdc++/48631 +D d; +std::default_deleteB[] db; +typedef decltype(db(d)) type; // { dg-error use of deleted function } +// { dg-error declared here { target *-*-* } 83 } Index: testsuite/20_util/weak_ptr/comparison/cmp_neg.cc === --- testsuite/20_util/weak_ptr/comparison/cmp_neg.cc(revision 172533) +++ testsuite/20_util/weak_ptr/comparison/cmp_neg.cc(working copy) @@ -48,9 +48,9 @@ // { dg-warning note { target *-*-* } 1099 } // { dg-warning note { target *-*-* } 1094 } // { dg-warning note { target *-*-* } 1086 } -// { dg-warning note { target *-*-* } 483 } -// { dg-warning note { target *-*-* } 477 } -// { dg-warning note { target *-*-* } 467 } +// { dg-warning note { target *-*-* } 485 } +// { dg-warning note { target *-*-* } 479 } +// { dg-warning note { target *-*-* } 469 } // { dg-warning note { target *-*-* } 587 } // { dg-warning note { target *-*-* } 1056 } // { dg-warning note { target *-*-* } 1050 }
Re: [PATCH, SMS] New flag to apply SMS when SC equals 1
Hello, If it's for debugging, can you use a --parm instead (like modulo-sched-min-sc or similar)? I think I can use --param for debugging purposes in this case. (I might add modulo-sched-max-sc as well) Thanks, Revital Thanks, Richard. Thanks, Revital Changelog: * common.opt (fmodulo-sched-allow-sc-one): New flag. * modulo-sched.c (sms_schedule): Allow SMS when stage count equals one and -fmodulo-sched-allow-sc-one flag is set.
Re: [PATCH, ARM] PR47855 Compute attr length for some thumb2 insns, 2/3
Hi Richard Thank you for the detailed explanation. It sounds like an inherent difficulty of rtl passes. Then the only opportunity is ldrb/strb instructions because they never affect cc registers. thanks Carrot On Fri, Apr 15, 2011 at 9:34 PM, Richard Earnshaw rearn...@arm.com wrote: On Thu, 2011-04-14 at 21:19 +0800, Carrot Wei wrote: On Fri, Apr 8, 2011 at 6:51 PM, Ramana Radhakrishnan ramana.radhakrish...@linaro.org wrote: On 08/04/11 10:57, Carrot Wei wrote: Hi This is the second part of the fixing for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855 This patch contains the length computation for insn patterns *arm_movqi_insn and *arm_addsi3. Since the alternatives and encodings are much more complex, the attribute length is computed in separate C functions. Sorry, no. This is potentially a maintenance pain. It hardcodes alternatives from a pattern elsewhere in the C file. I don't like doing this unless we have to with the sync primitives or with push_multi. In this case I'm not convinced we need such functions in the .c file. Why can't we use the enabled attribute here with appropriate constraints for everything other than the memory cases (even there we might be able to invent some new constraints) ? Also a note about programming style. There are the helper macros like REG_P, CONST_INT_P and MEM_P which remove the necessity for checks like GET_CODE (x) == y where y E { REG, CONST_INT, MEM} Hi Ramana As you suggested I created several new constraints, and use the enabled attribute to split the current alternatives in this new patch. It has been tested on arm qemu without regression. thanks Carrot Sorry, I don't think this approach can work. Certainly not with the way the compiler currently works, and especially for mov and add insns. These instructions are only 2 bytes long if either: 1) They clobber the condition code register or 2) They occur inside an IT block. We can't tell either of these from the pattern, so you're underestimating the length of the instruction in some circumstances by claiming that they are only 2 bytes long. That /will/ lead to broken code someday. We can't add potential clobbers to mov and add patterns because that will break reload which relies on these patterns being simple-set insns with no added baggage. It *might* be possible to add clobbers to other operations, but that will then most-likely upset instruction scheduling (I think the scheduler treats two insns that clobber the same hard reg as being strongly ordered). Putting in the clobber too early will certainly affect cond-exec generation. In short, I'm not aware of a simple way to address this problem so that we get accurate length information, but minimal impact on other passes in the compiler. R.