[PATCH] Improve combining of conditionals

2011-04-15 Thread Maxim Kuvyrkov
This patch fixes the problem with substituting expressions into IF_THEN_ELSE 
during combining.  Without this patch combining of conditionals inside 
IF_THEN_ELSE is seriously inhibited.

The problem this patch fixes is that combine_simplify_rtx() prefers to return 
an expression (say, xor (a) (b)) even when a comparison is prefered (say, 
neq (xor (a) (b)) 0).  Expressions are not recognized as valid conditions of 
if_then_else for most targets, so combiner misses a potential optimization.  
This patch makes combine_simplify_rtx() aware of the context it was invoked in, 
and, when appropriate, does not discourage it from returning a conditional.

The motivating example for this fix was crcu8() routine from CoreMark.  
Compiling this routine for MIPS32R2 at -O3 produces there are several instances 
of sequence

andi$2,$2,0x1
xori$2,$2,0x1
movn$3,$5,$2  ; $2 dies here

which can be optimized into

andi$2,$2,0x1
movz$3,$5,$2  ; $2 dies here
.

The patch was successfully tested on {i686, arm, mips}-linux, both GCC 
testsuites and SPEC2000 runs.  For all targets there was no observable code 
difference in SPEC2000 benchmarks, so the example does not trigger very often.  
Still, it speeds up CoreMark by about 1%.

OK for trunk?

--
Maxim Kuvyrkov
Mentor Graphics / CodeSourcery




gcc-combine-if_then_else.ChangeLog
Description: Binary data


gcc-combine-if_then_else.patch
Description: Binary data


Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Jim Meyering
Janne Blomqvist wrote:

 On Thu, Mar 24, 2011 at 18:51, Jim Meyering j...@meyering.net wrote:
 Janne Blomqvist wrote:
 On Tue, Mar 8, 2011 at 19:53, Jim Meyering j...@meyering.net wrote:
 Relative to v2, I've added libgo/ to the list of exempt directories and 
 added
 this recently discussed gfc_free patch, at the request of Tobias Burnus.
 Also, I corrected an error in fortran's ChangeLog and removed all
 whitespace changes from all ChangeLog files.

 The libgfortran changes are Ok for 4.7.

 For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd

 - Replace all calls to gfc_free (x) with free (x).
 - Remove the gfc_free() function and prototype.
 - Remove the free() macro which currently prevents calling free() directly.

 Following up, I've refreshed the series but hit a minor snag
 while converting new uses of gfc_free, removing new tests-before-free
 and merging/reordering changes.

 Applying this fix first makes my problem go away:

 [snip]

 So, what's the plan here? Do you plan to get a GCC account, do you
 already have one, or what? Now that 4.7 is open for development, it's
 perhaps the right time to poke the maintainers to get this patch in.

 If you don't have a GCC account, as one of the Fortran maintainers I
 can commit the Fortran and libgfortran parts, but someone else will
 have to do the rest (were they ever approved, BTW?) as I have only
 write after approval privileges for the rest of GCC.

Can someone add me to the gcc group?  That would help.
I already have ssh access to sourceware.org.

I've rebased the series a few times, but it's been a week or so.
More convertible uses are being added regularly.
Plus I have to reorder/split things a little to avoid a new conflict.


Re: [PATCH][RFC] Make called function type explicit, make function pointer type conversions useless

2011-04-15 Thread Eric Botcazou
 2011-04-14  Eric Botcazou  ebotca...@adacore.com

   * cfgexpand.c (expand_call_stmt): Rematerialize the original function
   type if this is not a builtin function.

Hum, using fold_convert seems to be more appropriate here.

Bootstrapped/regtested on i586-suse-linux, applied as obvious.


2011-04-15  Eric Botcazou  ebotca...@adacore.com

* cfgexpand.c (expand_call_stmt): Simply convert the function type.


-- 
Eric Botcazou
Index: cfgexpand.c
===
--- cfgexpand.c	(revision 172469)
+++ cfgexpand.c	(working copy)
@@ -1851,8 +1851,8 @@ expand_call_stmt (gimple stmt)
  call is made may be different from the type of the function.  */
   if (!builtin_p)
 CALL_EXPR_FN (exp)
-  = fold_build1 (NOP_EXPR, build_pointer_type (gimple_call_fntype (stmt)),
-		 CALL_EXPR_FN (exp));
+  = fold_convert (build_pointer_type (gimple_call_fntype (stmt)),
+		  CALL_EXPR_FN (exp));
 
   TREE_TYPE (exp) = gimple_call_return_type (stmt);
   CALL_EXPR_STATIC_CHAIN (exp) = gimple_call_chain (stmt);


[PATCH] doubled words

2011-04-15 Thread Jim Meyering

Signed-off-by: Jim Meyering meyer...@redhat.com
---
While most of these are in comments, the corrections
in gcc/tree-cfg.c and gcc/config/sh/constraints.md are in strings.
The former at least is marked for translation, and hence appears
in every .po file.

 gcc/config/alpha/vms-unwind.h  |4 ++--
 gcc/config/arm/unwind-arm.h|4 ++--
 gcc/config/microblaze/microblaze.c |2 +-
 gcc/config/sh/constraints.md   |4 ++--
 gcc/cp/pt.c|2 +-
 gcc/java/jcf-parse.c   |4 ++--
 gcc/tree-cfg.c |4 ++--
 7 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/config/alpha/vms-unwind.h b/gcc/config/alpha/vms-unwind.h
index ea2c3a3..71cb7b8 100644
--- a/gcc/config/alpha/vms-unwind.h
+++ b/gcc/config/alpha/vms-unwind.h
@@ -1,5 +1,5 @@
 /* Fallback frame unwinding for Alpha/VMS.
-   Copyright (C) 1996, 1997, 1998, 2000, 2001, 2002, 2003, 2009, 2010
+   Copyright (C) 1996, 1997, 1998, 2000, 2001, 2002, 2003, 2009, 2010, 2011
Free Software Foundation, Inc.

This file is part of GCC.
@@ -229,7 +229,7 @@ alpha_vms_fallback_frame_state (struct _Unwind_Context 
*context,

   /* If PV designates an exception dispatcher, we have to adjust the return
  address column to get at the signal occurrence point, and account for
- for what the CHF context contains.  */
+ what the CHF context contains.  */

   if (DENOTES_EXC_DISPATCHER (pv))
 {
diff --git a/gcc/config/arm/unwind-arm.h b/gcc/config/arm/unwind-arm.h
index a9ba126..1a51d8d 100644
--- a/gcc/config/arm/unwind-arm.h
+++ b/gcc/config/arm/unwind-arm.h
@@ -1,5 +1,5 @@
 /* Header file for the ARM EABI unwinder
-   Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009
+   Copyright (C) 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2011
Free Software Foundation, Inc.
Contributed by Paul Brook

@@ -65,7 +65,7 @@ extern C {
 }
   _Unwind_State;

-  /* Provided only for for compatibility with existing code.  */
+  /* Provided only for compatibility with existing code.  */
   typedef int _Unwind_Action;
 #define _UA_SEARCH_PHASE   1
 #define _UA_CLEANUP_PHASE  2
diff --git a/gcc/config/microblaze/microblaze.c 
b/gcc/config/microblaze/microblaze.c
index 6ea5fa2..5796bc7 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -919,7 +919,7 @@ microblaze_rtx_costs (rtx x, int code, int outer_code 
ATTRIBUTE_UNUSED, int *tot
  {
/* Add 1 to make shift slightly more expensive than add.  */
*total = COSTS_N_INSNS (INTVAL (XEXP (x, 1))) + 1;
-   /* Reduce shift costs for for special circumstances.  */
+   /* Reduce shift costs for special circumstances.  */
if (optimize_size  INTVAL (XEXP (x, 1))  5)
  *total -= 2;
if (!optimize_size  INTVAL (XEXP (x, 1))  17)
diff --git a/gcc/config/sh/constraints.md b/gcc/config/sh/constraints.md
index 6b0e5d2..40d0d0b 100644
--- a/gcc/config/sh/constraints.md
+++ b/gcc/config/sh/constraints.md
@@ -1,5 +1,5 @@
 ;; Constraint definitions for Renesas / SuperH SH.
-;; Copyright (C) 2007, 2008 Free Software Foundation, Inc.
+;; Copyright (C) 2007, 2008, 2011 Free Software Foundation, Inc.
 ;;
 ;; This file is part of GCC.
 ;;
@@ -99,7 +99,7 @@
(match_test ival = -128  ival = 127)))

 (define_constraint I10
-  A signed 10-bit constant, as used in in SHmedia andi, ori.
+  A signed 10-bit constant, as used in SHmedia andi, ori.
   (and (match_code const_int)
(match_test ival = -512  ival = 511)))

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 3356e75..fc69a0c 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13980,7 +13980,7 @@ type_unification_real (tree tparms,
   gcc_assert (ntparms  0);

   /* Reset the number of non-defaulted template arguments contained
- in in TARGS.  */
+ in TARGS.  */
   NON_DEFAULT_TEMPLATE_ARGS_COUNT (targs) = NULL_TREE;

   switch (strict)
diff --git a/gcc/java/jcf-parse.c b/gcc/java/jcf-parse.c
index feeddad..a56c1b7 100644
--- a/gcc/java/jcf-parse.c
+++ b/gcc/java/jcf-parse.c
@@ -1,6 +1,6 @@
 /* Parser for Java(TM) .class files.
Copyright (C) 1996, 1998, 1999, 2000, 2001, 2002, 2003, 2004,
-   2005, 2006, 2007, 2008, 2009, 2010 Free Software Foundation, Inc.
+   2005, 2006, 2007, 2008, 2009, 2010, 2011 Free Software Foundation, Inc.

 This file is part of GCC.

@@ -368,7 +368,7 @@ set_source_filename (JCF *jcf, int index)
from the input class file into the output file.  We don't decode the
data at all, merely rewriting constant indexes whenever we come
across them: this is necessary because the constant pool in the
-   output file isn't the same as the constant pool in in the input.
+   output file isn't the same as the constant pool in the input.

The main advantage of this technique is that the resulting
annotation data is pointer-free, so it doesn't have to be relocated
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 

Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Janne Blomqvist
On Fri, Apr 15, 2011 at 10:54, Jim Meyering j...@meyering.net wrote:
 Janne Blomqvist wrote:

 On Thu, Mar 24, 2011 at 18:51, Jim Meyering j...@meyering.net wrote:
 Janne Blomqvist wrote:
 On Tue, Mar 8, 2011 at 19:53, Jim Meyering j...@meyering.net wrote:
 Relative to v2, I've added libgo/ to the list of exempt directories and 
 added
 this recently discussed gfc_free patch, at the request of Tobias Burnus.
 Also, I corrected an error in fortran's ChangeLog and removed all
 whitespace changes from all ChangeLog files.

 The libgfortran changes are Ok for 4.7.

 For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd

 - Replace all calls to gfc_free (x) with free (x).
 - Remove the gfc_free() function and prototype.
 - Remove the free() macro which currently prevents calling free() directly.

 Following up, I've refreshed the series but hit a minor snag
 while converting new uses of gfc_free, removing new tests-before-free
 and merging/reordering changes.

 Applying this fix first makes my problem go away:

 [snip]

 So, what's the plan here? Do you plan to get a GCC account, do you
 already have one, or what? Now that 4.7 is open for development, it's
 perhaps the right time to poke the maintainers to get this patch in.

 If you don't have a GCC account, as one of the Fortran maintainers I
 can commit the Fortran and libgfortran parts, but someone else will
 have to do the rest (were they ever approved, BTW?) as I have only
 write after approval privileges for the rest of GCC.

 Can someone add me to the gcc group?  That would help.
 I already have ssh access to sourceware.org.

According to http://gcc.gnu.org/svnwrite.html , you should send an
email to overseers(at)gcc.gnu.org . It says that you need a sponsor
and The steering committee or a well-established GCC maintainer
(including reviewers) can approve for write access any person with GNU
copyright assignment papers in place and known to submit good
patches.. I'm not sure if I'm considered to be well-established
enough, so could someone help Jim out here, please?

Or, if nobody pipes up within a few days, just mention my name as your
sponsor and we'll see if the overseers consider me well-established or
not.. ;)


-- 
Janne Blomqvist


Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Jim Meyering
Janne Blomqvist wrote:
 On Fri, Apr 15, 2011 at 10:54, Jim Meyering j...@meyering.net wrote:
 Janne Blomqvist wrote:

 On Thu, Mar 24, 2011 at 18:51, Jim Meyering j...@meyering.net wrote:
 Janne Blomqvist wrote:
 On Tue, Mar 8, 2011 at 19:53, Jim Meyering j...@meyering.net wrote:
 Relative to v2, I've added libgo/ to the list of exempt
 directories and added
 this recently discussed gfc_free patch, at the request of Tobias Burnus.
 Also, I corrected an error in fortran's ChangeLog and removed all
 whitespace changes from all ChangeLog files.

 The libgfortran changes are Ok for 4.7.

 For the gfortran frontend (gcc/fortran/*) I'd prefer if you'd

 - Replace all calls to gfc_free (x) with free (x).
 - Remove the gfc_free() function and prototype.
 - Remove the free() macro which currently prevents calling free() 
 directly.

 Following up, I've refreshed the series but hit a minor snag
 while converting new uses of gfc_free, removing new tests-before-free
 and merging/reordering changes.

 Applying this fix first makes my problem go away:

 [snip]

 So, what's the plan here? Do you plan to get a GCC account, do you
 already have one, or what? Now that 4.7 is open for development, it's
 perhaps the right time to poke the maintainers to get this patch in.

 If you don't have a GCC account, as one of the Fortran maintainers I
 can commit the Fortran and libgfortran parts, but someone else will
 have to do the rest (were they ever approved, BTW?) as I have only
 write after approval privileges for the rest of GCC.

 Can someone add me to the gcc group?  That would help.
 I already have ssh access to sourceware.org.

 According to http://gcc.gnu.org/svnwrite.html , you should send an
 email to overseers(at)gcc.gnu.org . It says that you need a sponsor
 and The steering committee or a well-established GCC maintainer
 (including reviewers) can approve for write access any person with GNU
 copyright assignment papers in place and known to submit good
 patches.. I'm not sure if I'm considered to be well-established
 enough, so could someone help Jim out here, please?

 Or, if nobody pipes up within a few days, just mention my name as your
 sponsor and we'll see if the overseers consider me well-established or
 not.. ;)

Thanks.
FYI, I have an ANY assignment on file, so no need to wait for paperwork.


Fix minor issues in gimplify.c

2011-04-15 Thread Eric Botcazou
Long lines, superfluous 's' and missing head comment.  No functional changes.

There is one function left without head comment: gimplify_adjust_omp_clauses.
Jakub, when you have a few minutes, would you mind adding one?  TIA.

Tested on i586-suse-linux, applied on the mainline as obvious.


2011-04-15  Eric Botcazou  ebotca...@adacore.com

* gimplify.c: Fix issues in comments throughout.
(voidify_wrapper_expr): Fix long line.
(build_stack_save_restore): Likewise.
(gimplify_loop_expr): Likewise.
(gimplify_compound_lval): Likewise.
(gimplify_init_ctor_eval): Likewise.
(gimplify_modify_expr_rhs): Likewise.
(omp_notice_threadprivate_variable): Likewise.


-- 
Eric Botcazou
Index: gimplify.c
===
--- gimplify.c	(revision 172469)
+++ gimplify.c	(working copy)
@@ -91,7 +91,7 @@ static struct gimplify_ctx *gimplify_ctx
 static struct gimplify_omp_ctx *gimplify_omp_ctxp;
 
 
-/* Formal (expression) temporary table handling: Multiple occurrences of
+/* Formal (expression) temporary table handling: multiple occurrences of
the same scalar expression are evaluated into the same temporary.  */
 
 typedef struct gimple_temp_hash_elt
@@ -100,11 +100,12 @@ typedef struct gimple_temp_hash_elt
   tree temp;  /* Value */
 } elt_t;
 
-/* Forward declarations.  */
+/* Forward declaration.  */
 static enum gimplify_status gimplify_compound_expr (tree *, gimple_seq *, bool);
 
 /* Mark X addressable.  Unlike the langhook we expect X to be in gimple
form and we don't do any syntax checking.  */
+
 void
 mark_addressable (tree x)
 {
@@ -232,6 +233,8 @@ pop_gimplify_context (gimple body)
 htab_delete (c-temp_htab);
 }
 
+/* Push a GIMPLE_BIND tuple onto the stack of bindings.  */
+
 static void
 gimple_push_bind_expr (gimple gimple_bind)
 {
@@ -240,19 +243,23 @@ gimple_push_bind_expr (gimple gimple_bin
   VEC_safe_push (gimple, heap, gimplify_ctxp-bind_expr_stack, gimple_bind);
 }
 
+/* Pop the first element off the stack of bindings.  */
+
 static void
 gimple_pop_bind_expr (void)
 {
   VEC_pop (gimple, gimplify_ctxp-bind_expr_stack);
 }
 
+/* Return the first element of the stack of bindings.  */
+
 gimple
 gimple_current_bind_expr (void)
 {
   return VEC_last (gimple, gimplify_ctxp-bind_expr_stack);
 }
 
-/* Return the stack GIMPLE_BINDs created during gimplification.  */
+/* Return the stack of bindings created during gimplification.  */
 
 VEC(gimple, heap) *
 gimple_bind_expr_stack (void)
@@ -260,7 +267,7 @@ gimple_bind_expr_stack (void)
   return gimplify_ctxp-bind_expr_stack;
 }
 
-/* Returns true iff there is a COND_EXPR between us and the innermost
+/* Return true iff there is a COND_EXPR between us and the innermost
CLEANUP_POINT_EXPR.  This info is used by gimple_push_cleanup.  */
 
 static bool
@@ -392,7 +399,7 @@ remove_suffix (char *name, int len)
 }
 }
 
-/* Create a new temporary name with PREFIX.  Returns an identifier.  */
+/* Create a new temporary name with PREFIX.  Return an identifier.  */
 
 static GTY(()) unsigned int tmp_var_id_num;
 
@@ -413,9 +420,8 @@ create_tmp_var_name (const char *prefix)
   return get_identifier (tmp_name);
 }
 
-
 /* Create a new temporary variable declaration of type TYPE.
-   Does NOT push it into the current binding.  */
+   Do NOT push it into the current binding.  */
 
 tree
 create_tmp_var_raw (tree type, const char *prefix)
@@ -446,7 +452,7 @@ create_tmp_var_raw (tree type, const cha
   return tmp_var;
 }
 
-/* Create a new temporary variable declaration of type TYPE.  DOES push the
+/* Create a new temporary variable declaration of type TYPE.  DO push the
variable into the current binding.  Further, assume that this is called
only from gimplification or optimization, at which point the creation of
certain types are bugs.  */
@@ -537,7 +543,6 @@ lookup_tmp_var (tree val, bool is_formal
   return ret;
 }
 
-
 /* Return true if T is a CALL_EXPR or an expression that can be
assigned to a temporary.  Note that this predicate should only be
used during gimplification.  See the rationale for this in
@@ -605,7 +610,7 @@ internal_get_tmp_var (tree val, gimple_s
   return t;
 }
 
-/* Returns a formal temporary variable initialized with VAL.  PRE_P is as
+/* Return a formal temporary variable initialized with VAL.  PRE_P is as
in gimplify_expr.  Only use this function if:
 
1) The value of the unfactored expression represented by VAL will not
@@ -623,7 +628,7 @@ get_formal_tmp_var (tree val, gimple_seq
   return internal_get_tmp_var (val, pre_p, NULL, true);
 }
 
-/* Returns a temporary variable initialized with VAL.  PRE_P and POST_P
+/* Return a temporary variable initialized with VAL.  PRE_P and POST_P
are as in gimplify_expr.  */
 
 tree
@@ -632,8 +637,8 @@ get_initialized_tmp_var (tree val, gimpl
   return internal_get_tmp_var (val, pre_p, post_p, false);
 }
 
-/* Declares all the variables in VARS in 

Re: [lto/pph] Do not pack more bits than requested (issue4415044)

2011-04-15 Thread Richard Guenther
On Thu, 14 Apr 2011, Diego Novillo wrote:

 On Thu, Apr 14, 2011 at 15:28, Jakub Jelinek ja...@redhat.com wrote:
 
  If bitpack_word_t has BITS_PER_BITPACK_WORD bits, then for
  nbits = BITS_PER_BITPACK_WORD this will be undefined.
  Use say
  mask = ((bitpack_word_t) 2  (nbits - 1)) - 1;
  or something similar (assertion ensures that nbits isn't 0).
 
 Quite right, thanks.  In the meantime, I've changed my mind with this.
  I think it's safer if we just assert that the value we are about to
 pack fit in the number of bits the caller specified.
 
 The only problematic user is pack_ts_type_value_fields when it tries
 to pack a -1 for the type's alias set.  I think we should just stream
 that as an integer and not go through the bitpacking overhead.
 
 For now, I'm applying this to the pph branch.  Tested on x86_64.  No
 LTO failures.

See below
 
 
 Diego.
 
 * lto-streamer-out.c (pack_ts_type_value_fields): Pack all bits
 of -1 value.
 * lto-streamer.h (bitpack_create): Assert that the value to
 pack does not overflow NBITS.
 * lto-streamer-in.c (unpack_ts_type_value_fields): Unpack
 BITS_PER_BITPACK_WORD bits for TYPE_ALIAS_SET.
 
 diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
 index 97b86ce..f04e031 100644
 --- a/gcc/lto-streamer-in.c
 +++ b/gcc/lto-streamer-in.c
 @@ -1734,7 +1734,7 @@ unpack_ts_type_value_fields (struct bitpack_d
 *bp, tree expr)
TYPE_USER_ALIGN (expr) = (unsigned) bp_unpack_value (bp, 1);
TYPE_READONLY (expr) = (unsigned) bp_unpack_value (bp, 1);
TYPE_ALIGN (expr) = (unsigned) bp_unpack_value (bp, HOST_BITS_PER_INT);
 -  TYPE_ALIAS_SET (expr) = bp_unpack_value (bp, HOST_BITS_PER_INT);
 +  TYPE_ALIAS_SET (expr) = bp_unpack_value (bp, BITS_PER_BITPACK_WORD);
  }
 
 
 diff --git a/gcc/lto-streamer-out.c b/gcc/lto-streamer-out.c
 index 3ccad8b..89ad9c5 100644
 --- a/gcc/lto-streamer-out.c
 +++ b/gcc/lto-streamer-out.c
 @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree 
 expr)
bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
bp_pack_value (bp, TYPE_READONLY (expr), 1);
bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT);
 -  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, HOST_BITS_PER_INT);
 +  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1,
 +BITS_PER_BITPACK_WORD);

As we only want to stream alias-set zeros just change it to a single bit,
like

 bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1);

and on the reader side restore either a zero or -1.

  }
 
 
 diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
 index 0d49430..73afd46 100644
 --- a/gcc/lto-streamer.h
 +++ b/gcc/lto-streamer.h
 @@ -1190,18 +1190,15 @@ bitpack_create (struct lto_output_stream *s)
  static inline void
  bp_pack_value (struct bitpack_d *bp, bitpack_word_t val, unsigned nbits)
  {
 -  bitpack_word_t mask, word;
 +  bitpack_word_t word = bp-word;
int pos = bp-pos;
 
 -  word = bp-word;
 -
 +  /* We shouldn't try to pack more bits than can fit in a bitpack word.  */
gcc_assert (nbits  0  nbits = BITS_PER_BITPACK_WORD);

Asserts will break merging the operations and make them slow again.
Please no asserts in these core routines.  Look at the .optimized
dump of a series of bp_pack_value, they should be basically optimized to
a series of ORs.

As for the -1 case, it's simply broken use of the interface.

Richard.

 -  /* Make sure that VAL only has the lower NBITS set.  Generate a
 - mask with the lower NBITS set and use it to filter the upper
 - bits from VAL.  */
 -  mask = ((bitpack_word_t) 1  nbits) - 1;
 -  val = val  mask;
 +  /* The value to pack should not overflow NBITS.  */
 +  gcc_assert (nbits == BITS_PER_BITPACK_WORD
 + || val = ((bitpack_word_t) 1  nbits));
 
/* If val does not fit into the current bitpack word switch to the
   next one.  */


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Eric Botcazou
 The patch was successfully tested on {i686, arm, mips}-linux, both GCC
 testsuites and SPEC2000 runs.  For all targets there was no observable code
 difference in SPEC2000 benchmarks, so the example does not trigger very
 often.  Still, it speeds up CoreMark by about 1%.

 OK for trunk?

Yes, modulo the following nits:

@@ -4938,11 +4938,13 @@ find_split_point (rtx *loc, rtx insn, bool set_src)
 
IN_DEST is nonzero if we are processing the SET_DEST of a SET.
 
+   IN_COND is nonzero if we are on top level of the condition.

...we are at the top level of a condition.


@@ -5221,10 +5225,12 @@ subst (rtx x, rtx from, rtx to, int in_dest, int 
unique_copy)
expression.
 
OP0_MODE is the original mode of XEXP (x, 0).  IN_DEST is nonzero
-   if we are inside a SET_DEST.  */
+   if we are inside a SET_DEST.  IN_COND is nonzero if we are on the top level
+   of a condition.  */

Likewise.


@@ -5717,7 +5723,16 @@ combine_simplify_rtx (rtx x, enum machine_mode op0_mode, 
int in_dest)
 ZERO_EXTRACT is indeed appropriate, it will be placed back by
 the call to make_compound_operation in the SET case.  */
 
- if (STORE_FLAG_VALUE == 1
+ if (in_cond)
+   /* Don't apply below optimizations if the caller would
+  prefer a comparison rather than a value.
+  E.g., for the condition in an IF_THEN_ELSE most targets need
+  an explicit comparison.  */
+   {
+ ;
+   }

Remove the superfluous parentheses and move the comment to a new paragraph of 
the big comment just above.

No need to retest, just make sure this still compiles, thanks in advance.

-- 
Eric Botcazou


Re: [PATCH][RFC] Make called function type explicit, make function pointer type conversions useless

2011-04-15 Thread Richard Guenther
On Thu, 14 Apr 2011, Eric Botcazou wrote:

  I have the following, the gimple_call_chain check is to prevent
  Ada bootstrap from failing.
 
 On x86?

Yes, though maybe because I had get_callee_fndecl patched to not
strip conversions.

  I suppose your patch is ok, maybe with simply not wrapping the NOP_EXPR
  around builtins.
 
 I can try that indeed...

Thanks for fixing this!

Richard.


Re: [Patch,testsuite,avr]: add -finline-limit=0 to pr41885.c options

2011-04-15 Thread Richard Guenther
On Thu, Apr 14, 2011 at 8:14 PM, Denis Chertykov cherty...@gmail.com wrote:
 2011/4/14 Georg-Johann Lay a...@gjlay.de:
 This patchlet adds -finline-limit=0 to dg-options in

 testsuite/gcc.target/avr/torture/pr41885.c

 because otherwise optimizers will fold all tests and actually no test
 function is called when optimization is on. The test case still passes
 all tests.


 testsuite/
 2011-04-14  Georg-Johann Lay  a...@gjlay.de

        * gcc.target/avr/torture/pr41885.c (dg-options): Add
        -finline-limit=0

Please use -fno-inline instead.

Richard.

 Index: testsuite/gcc.target/avr/torture/pr41885.c
 ===
 --- testsuite/gcc.target/avr/torture/pr41885.c  (Revision 172431)
 +++ testsuite/gcc.target/avr/torture/pr41885.c  (Arbeitskopie)
 @@ -1,4 +1,4 @@
 -/* { dg-options -w -std=c99 } */
 +/* { dg-options -w -std=c99 -finline-limit=0 } */
  /* { dg-do run } */

  #include limits.h


 Approved.

 Denis.



Re: More of ipa-inline housekeeping

2011-04-15 Thread Jan Hubicka
 
  I fixed this on the and added sanity check that the fields are initialized.
  This has shown problem with early inliner iteration fixed thusly and fact 
  that
  early inliner is attempting to compute overall growth at a time the inline
  parameters are not computed for functions not visited by early optimizations
  yet. We previously agreed that early inliner should not try to do that (as 
  this
  leads to early inliner inlining functions called once that should be 
  deferred
  for later consieration).  I just hope it won't cause benchmarks to
  regress too much ;)
 
 Yeah, we agreed to that.  And I forgot about it as it wasn't part of the
 early inliner reorg (which was supposed to be a 1:1 transform).

Today C++ results shows some regressions, but nothing earthshaking.  So I think 
it is good
idea to drop this feature of early inliner since it is not really systematic.
There is also great improvement on LTO SPEC2000, but I tend to hope it is 
unrelated change.
Perhaps your aliasing?

Honza


Re: More of ipa-inline housekeeping

2011-04-15 Thread Richard Guenther
2011/4/15 Jan Hubicka hubi...@ucw.cz:
 
  I fixed this on the and added sanity check that the fields are initialized.
  This has shown problem with early inliner iteration fixed thusly and fact 
  that
  early inliner is attempting to compute overall growth at a time the inline
  parameters are not computed for functions not visited by early 
  optimizations
  yet. We previously agreed that early inliner should not try to do that (as 
  this
  leads to early inliner inlining functions called once that should be 
  deferred
  for later consieration).  I just hope it won't cause benchmarks to
  regress too much ;)

 Yeah, we agreed to that.  And I forgot about it as it wasn't part of the
 early inliner reorg (which was supposed to be a 1:1 transform).

 Today C++ results shows some regressions, but nothing earthshaking.  So I 
 think it is good
 idea to drop this feature of early inliner since it is not really systematic.
 There is also great improvement on LTO SPEC2000, but I tend to hope it is 
 unrelated change.
 Perhaps your aliasing?

I doubt SPEC2k uses VLAs or alloca, does it?  Might be the DSE
improvements, but I'm not sure.

Richard.

 Honza



[committed] Fix pr46084.c testcase (PR target/48614)

2011-04-15 Thread Jakub Jelinek
Hi!

This testcase doesn't use avx-check.h and is dg-do run testcase,
as it is compiled with -mavx whenever it emits any AVX instructions,
it won't be able to run on any older CPUs.

Fixed thusly, committed to trunk/4.6.

2011-04-15  Jakub Jelinek  ja...@redhat.com

PR target/48614
* gcc.target/i386/pr46084.c: Require avx_runtime instead of
just avx.

--- gcc/testsuite/gcc.target/i386/pr46084.c.jj  2011-04-15 08:11:00.493464611 
+0200
+++ gcc/testsuite/gcc.target/i386/pr46084.c 2011-04-15 08:08:50.282146043 
+0200
@@ -1,7 +1,7 @@
 /* This test needs to use setrlimit to set the stack size, so it can
only run on Unix.  */
 /* { dg-do run { target *-*-linux* *-*-solaris* *-*-darwin* } } */
-/* { dg-require-effective-target avx } */
+/* { dg-require-effective-target avx_runtime } */
 /* { dg-require-effective-target split_stack } */
 /* { dg-options -fsplit-stack -O2 -mavx } */
 

Jakub


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Eric Botcazou
 And make IN_COND a bool instead of an int?

I had the same initial reaction but then came to the same conclusion as Maxim.

-- 
Eric Botcazou


Re: [patch][ARM] Fix 16-bit - 64-bit multiply and accumulate

2011-04-15 Thread Andrew Stubbs

Ping.

On 25/03/11 16:19, Andrew Stubbs wrote:

On 28/01/11 15:20, Richard Earnshaw wrote:


On Fri, 2011-01-28 at 15:13 +, Andrew Stubbs wrote:

On 28/01/11 14:12, Richard Earnshaw wrote:

So what happens to a variation of your testcase:

long long foolong (long long x, short *a, short *b)
{
return x + (long long)*a * (long long)*b;
}

With your patch? This should generate identical code to your original
test-case.



The patch has no effect on that testcase - it's broken in some other
way, I think, and the same with and without my patch:

ldrsh r3, [r3, #0]
ldrsh r2, [r2, #0]
push {r4, r5}
asrs r4, r3, #31
asrs r5, r2, #31
mul r4, r2, r4
mla r4, r3, r5, r4
umull r2, r3, r2, r3
adds r3, r4, r3
adds r0, r0, r2
adc r1, r1, r3
pop {r4, r5}
bx lr

Hmmm, that probably doesn't add anything useful to the discussion. :(

I'll add that one to the todo list ...

Andrew



Ouch! I though that used to work :-(



I looked at this one again, but on a second inspection I'm not sure
there's much wrong with it?

When I wrote the above I thought that there was a 64-bit multiply
instruction, but now I look more closely I see there isn't, hence the
above. It does two 16-bit loads, sign-extends the inputs to 64-bit, does
a 64-bit - 64-bit multiply, and then adds 'x'.

Can the umull/add/add/adc be optimized using umlal? It's too late on a
Friday to workout what's going on with the carries 

Also, I don't fully understand why the compiler can't just disregard the
casts and use maddhidi4? Isn't that mathematically equivalent in this case?

When you say it used to work, what did it use to look like?

Thanks

Andrew





[ping] 3 unreviewed patches

2011-04-15 Thread Eric Botcazou
Fix annoying gcov filename handling:
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01380.html

(rs6000) Fix thinko in output_profile_hook:
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01624.html

Introduce -Wstack-usage:
  http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01992.html

Thanks in advance.

-- 
Eric Botcazou


Re: [Patch,testsuite,avr]: add -finline-limit=0 to pr41885.c options

2011-04-15 Thread Georg-Johann Lay
Richard Guenther schrieb:
 On Thu, Apr 14, 2011 at 8:14 PM, Denis Chertykov cherty...@gmail.com wrote:
 2011/4/14 Georg-Johann Lay a...@gjlay.de:
 This patchlet adds -finline-limit=0 to dg-options in

 testsuite/gcc.target/avr/torture/pr41885.c

 because otherwise optimizers will fold all tests and actually no test
 function is called when optimization is on. The test case still passes
 all tests.


 testsuite/
 2011-04-14  Georg-Johann Lay  a...@gjlay.de

* gcc.target/avr/torture/pr41885.c (dg-options): Add
-finline-limit=0
 
 Please use -fno-inline instead.
 
 Richard.

Ok, changed it:

http://gcc.gnu.org/viewcvs?view=revisionrevision=172487

Johann


Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Michael Matz
Hi,

On Thu, 14 Apr 2011, Dominique Dhumieres wrote:

 I have forgotten to mentionned that I have a variant of fatigue
 in which I have done the inlining manually along with few other
 optimizations and the timing for it is
 
 [macbook] lin/test% gfc -Ofast fatigue_v8.f90
 [macbook] lin/test% time a.out  /dev/null
 2.793u 0.002s 0:02.79 100.0%  0+0k 0+1io 0pf+0w
 [macbook] lin/test% gfc -Ofast -fwhole-program fatigue_v8.f90
 [macbook] lin/test% time a.out  /dev/null
 2.680u 0.003s 0:02.68 100.0%  0+0k 0+2io 0pf+0w
 [macbook] lin/test% gfc -Ofast -fwhole-program -flto fatigue_v8.f90
 [macbook] lin/test% time a.out  /dev/null
 2.671u 0.002s 0:02.67 100.0%  0+0k 0+2io 0pf+0w
 [macbook] lin/test% gfc -Ofast -fwhole-program -fstack-arrays fatigue_v8.f90
 [macbook] lin/test% time a.out  /dev/null
 2.680u 0.003s 0:02.68 100.0%  0+0k 0+2io 0pf+0w
 [macbook] lin/test% gfc -Ofast -fwhole-program -flto -fstack-arrays 
 fatigue_v8.f90
 [macbook] lin/test% time a.out  /dev/null
 2.677u 0.003s 0:02.68 99.6%   0+0k 0+0io 0pf+0w
 
 So the timing of the original code with 
 -Ofast -finline-limit=600 -fwhole-program -flto -fstack-arrays
 is quite close to this lower bound.
 
 I have also looked at the failure for gfortran.dg/elemental_dependency_1.f90
 and it seems due to a spurious integer(kind=4) A.37[4]; (and friends) in

Yes, this is due to the DECL_EXPR statement which is rendered by the 
dumper just the same as a normal decl.  The testcase looks for exactly one 
such decl, but with -fstack-arrays there are exactly two for each such 
array.

 integer(kind=4) A.37[4];

So, this is the normal decl.

 atmp.36.dim[0].ubound = 3;
 integer(kind=4) A.37[4];

And this is the DECL_EXPR statement, which then actually is transformed 
into the stack_save/alloca/stack_restore sequence.


Ciao,
Michael.


Re: [Patch, libfortran] Replace sprintf() with snprintf()

2011-04-15 Thread Jerry DeLisle

On 04/14/2011 11:53 PM, Janne Blomqvist wrote:

Hi,

as is well known, sprintf() is prone to buffer overflow, hence
snprintf(). libgfortran uses snprintf() in some places, but not
everywhere. Rather than analyzing every sprintf() call for a potential
overflow, the attached patch takes the dogmatic but simple approach of
replacing all the remaining sprintf() usage with snprintf().

For targets without snprintf(), io/list_read.c contained a fallback
macro that uses sprintf(); this is moved to libgfortran.h so that it's
available everywhere.

readelf -s libgfortran.so|grep sprintf

confirms that there is no remaining usage of sprintf().

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?



OK, thanks.

Jerry


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Maxim Kuvyrkov
On Apr 15, 2011, at 3:34 PM, Eric Botcazou wrote:

 The problem this patch fixes is that combine_simplify_rtx() prefers to
 return an expression (say, xor (a) (b)) even when a comparison is
 prefered (say, neq (xor (a) (b)) 0).  Expressions are not recognized as
 valid conditions of if_then_else for most targets, so combiner misses a
 potential optimization.  This patch makes combine_simplify_rtx() aware of
 the context it was invoked in, and, when appropriate, does not discourage
 it from returning a conditional.
 
 Btw, this is very likely also valid for targets with STORE_FLAG_VALUE == -1 
 so 
 the same IN_COND short-circuit would need to be added a few lines below in 
 combine_simplify_rtx.  But this would need to be tested.  Do you happen to 
 have access to such a target, e.g. m68k?

Hm, I didn't notice that one, thanks!  I have access to m68k (ColdFire, tbp) 
and will test this change there before committing.

--
Maxim Kuvyrkov
Mentor Graphics / CodeSourcery



Re: [7/9] Testsuite: remove vect_{extract_even_odd,strided}_wide

2011-04-15 Thread Richard Guenther
On Tue, Apr 12, 2011 at 4:14 PM, Richard Sandiford
richard.sandif...@linaro.org wrote:
 We have separate vect_extract_even_odd and vect_extract_even_odd_wide
 target selectors, and separate vect_strided and vect_strided_wide
 selectors.  The comment suggests that wide is for 32+ bits,
 but we often use the non-wide forms for 32-bit tests.  We also have
 tests that combine 16-bit and 32-bit strided accesses without checking
 for both widths.

 I'm about to split vect_strided into vect_stridedN (for each stride
 factor N).  One option was to preserve the wide distinction and have
 vect_stridedN_wide as well.  However, given the current usage,
 and given that the two selectors are the same, I think it makes sense
 to combine them until we know what distinction we need to make.

 Tested on x86_64-linux-gnu and arm-linux-gnueabi.  OK to install?

Ok.

Thanks,
Richard.

 Richard


 gcc/testsuite/
        * lib/target-supports.exp
        (check_effective_target_vect_extract_even_odd_wide): Delete.
        (check_effective_target_vect_strided_wide): Likewise.
        * gcc.dg/vect/O3-pr39675-2.c: Use the non-wide versions instead.
        * gcc.dg/vect/fast-math-pr35982.c: Likewise.
        * gcc.dg/vect/fast-math-vect-complex-3.c: Likewise.
        * gcc.dg/vect/pr37539.c: Likewise.
        * gcc.dg/vect/slp-11.c: Likewise.
        * gcc.dg/vect/slp-12a.c: Likewise.
        * gcc.dg/vect/slp-12b.c: Likewise.
        * gcc.dg/vect/slp-19.c: Likewise.
        * gcc.dg/vect/slp-23.c: Likewise.
        * gcc.dg/vect/vect-1.c: Likewise.
        * gcc.dg/vect/vect-98.c: Likewise.
        * gcc.dg/vect/vect-107.c: Likewise.
        * gcc.dg/vect/vect-strided-float.c: Likewise.

 Index: gcc/testsuite/lib/target-supports.exp
 ===
 --- gcc/testsuite/lib/target-supports.exp       2011-04-12 11:53:54.0 
 +0100
 +++ gcc/testsuite/lib/target-supports.exp       2011-04-12 11:55:11.0 
 +0100
 @@ -3121,29 +3121,6 @@ proc check_effective_target_vect_extract
     return $et_vect_extract_even_odd_saved
  }

 -# Return 1 if the target supports vector even/odd elements extraction of
 -# vectors with SImode elements or larger, 0 otherwise.
 -
 -proc check_effective_target_vect_extract_even_odd_wide { } {
 -    global et_vect_extract_even_odd_wide_saved
 -
 -    if [info exists et_vect_extract_even_odd_wide_saved] {
 -        verbose check_effective_target_vect_extract_even_odd_wide: using 
 cached result 2
 -    } else {
 -        set et_vect_extract_even_odd_wide_saved 0
 -        if { [istarget powerpc*-*-*]
 -             || [istarget i?86-*-*]
 -             || [istarget x86_64-*-*]
 -             || [istarget ia64-*-*]
 -             || [istarget spu-*-*] } {
 -           set et_vect_extract_even_odd_wide_saved 1
 -        }
 -    }
 -
 -    verbose check_effective_target_vect_extract_even_wide_odd: returning 
 $et_vect_extract_even_odd_wide_saved 2
 -    return $et_vect_extract_even_odd_wide_saved
 -}
 -
  # Return 1 if the target supports vector interleaving, 0 otherwise.

  proc check_effective_target_vect_interleave { } {
 @@ -3184,25 +3161,6 @@ proc check_effective_target_vect_strided
     return $et_vect_strided_saved
  }

 -# Return 1 if the target supports vector interleaving and extract even/odd
 -# for wide element types, 0 otherwise.
 -proc check_effective_target_vect_strided_wide { } {
 -    global et_vect_strided_wide_saved
 -
 -    if [info exists et_vect_strided_wide_saved] {
 -        verbose check_effective_target_vect_strided_wide: using cached 
 result 2
 -    } else {
 -        set et_vect_strided_wide_saved 0
 -        if { [check_effective_target_vect_interleave]
 -              [check_effective_target_vect_extract_even_odd_wide] } {
 -           set et_vect_strided_wide_saved 1
 -        }
 -    }
 -
 -    verbose check_effective_target_vect_strided_wide: returning 
 $et_vect_strided_wide_saved 2
 -    return $et_vect_strided_wide_saved
 -}
 -
  # Return 1 if the target supports section-anchors

  proc check_effective_target_section_anchors { } {
 Index: gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c
 ===
 --- gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c    2011-04-12 11:53:54.0 
 +0100
 +++ gcc/testsuite/gcc.dg/vect/O3-pr39675-2.c    2011-04-12 11:55:11.0 
 +0100
 @@ -26,7 +26,7 @@ foo ()
     }
  }

 -/* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect  { target 
 vect_strided_wide } } } */
 -/* { dg-final { scan-tree-dump-times vectorizing stmts using SLP 1 vect 
 { target vect_strided_wide } } } */
 +/* { dg-final { scan-tree-dump-times vectorized 1 loops 1 vect  { target 
 vect_strided } } } */
 +/* { dg-final { scan-tree-dump-times vectorizing stmts using SLP 1 vect 
 { target vect_strided } } } */
  /* { dg-final { cleanup-tree-dump vect } } */

 Index: gcc/testsuite/gcc.dg/vect/fast-math-pr35982.c
 

Re: [8/9] Testsuite: split tests for strided accesses

2011-04-15 Thread Richard Guenther
On Tue, Apr 12, 2011 at 4:19 PM, Richard Sandiford
richard.sandif...@linaro.org wrote:
 The next patch introduces separate vect_stridedN target selectors
 for each tested stride factor N.  At the moment, some tests contain
 several independent loops that have different stride factors.
 It's easier to make the next change if we put these loops into
 separate tests.

 Tested on x86_64-linux-gnu and arm-linux-gnueabi.  OK to install?

Ok.

Thanks,
Richard.

 Richard


 gcc/testsuite/
        * gcc.dg/vect/slp-11.c: Split into...
        * gcc.dg/vect/slp-11a.c, gcc.dg/vect/slp-11b.c,
        gcc.dg/vect/slp-11c.c: ...these tests.
        * gcc.dg/vect/slp-12a.c: Split 4-stride loop into...
        * gcc.dg/vect/slp-12c.c: ...this new test.
        * gcc.dg/vect/slp-19.c: Split into...
        * gcc.dg/vect/slp-19a.c, gcc.dg/vect/slp-19b.c,
        gcc.dg/vect/slp-19c.c: ...these new tests.

 Index: gcc/testsuite/gcc.dg/vect/slp-11.c
 ===
 --- gcc/testsuite/gcc.dg/vect/slp-11.c  2011-04-12 15:18:24.0 +0100
 +++ /dev/null   2011-03-23 08:42:11.268792848 +
 @@ -1,113 +0,0 @@
 -/* { dg-require-effective-target vect_int } */
 -
 -#include stdarg.h
 -#include tree-vect.h
 -
 -#define N 8
 -
 -int
 -main1 ()
 -{
 -  int i;
 -  unsigned int out[N*8], a0, a1, a2, a3, a4, a5, a6, a7, b1, b0, b2, b3, b4, 
 b5, b6, b7;
 -  unsigned int in[N*8] = 
 {0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63};
 -  float out2[N*8];
 -
 -  /* Different operations - not SLPable.  */
 -  for (i = 0; i  N; i++)
 -    {
 -      a0 = in[i*8] + 5;
 -      a1 = in[i*8 + 1] * 6;
 -      a2 = in[i*8 + 2] + 7;
 -      a3 = in[i*8 + 3] + 8;
 -      a4 = in[i*8 + 4] + 9;
 -      a5 = in[i*8 + 5] + 10;
 -      a6 = in[i*8 + 6] + 11;
 -      a7 = in[i*8 + 7] + 12;
 -
 -      b0 = a0 * 3;
 -      b1 = a1 * 2;
 -      b2 = a2 * 12;
 -      b3 = a3 * 5;
 -      b4 = a4 * 8;
 -      b5 = a5 * 4;
 -      b6 = a6 * 3;
 -      b7 = a7 * 2;
 -
 -      out[i*8] = b0 - 2;
 -      out[i*8 + 1] = b1 - 3;
 -      out[i*8 + 2] = b2 - 2;
 -      out[i*8 + 3] = b3 - 1;
 -      out[i*8 + 4] = b4 - 8;
 -      out[i*8 + 5] = b5 - 7;
 -      out[i*8 + 6] = b6 - 3;
 -      out[i*8 + 7] = b7 - 7;
 -    }
 -
 -  /* check results:  */
 -  for (i = 0; i  N; i++)
 -    {
 -      if (out[i*8] !=  (in[i*8] + 5) * 3 - 2
 -         || out[i*8 + 1] != (in[i*8 + 1] * 6) * 2 - 3
 -         || out[i*8 + 2] != (in[i*8 + 2] + 7) * 12 - 2
 -         || out[i*8 + 3] != (in[i*8 + 3] + 8) * 5 - 1
 -         || out[i*8 + 4] != (in[i*8 + 4] + 9) * 8 - 8
 -         || out[i*8 + 5] != (in[i*8 + 5] + 10) * 4 - 7
 -         || out[i*8 + 6] != (in[i*8 + 6] + 11) * 3 - 3
 -         || out[i*8 + 7] != (in[i*8 + 7] + 12) * 2 - 7)
 -       abort ();
 -    }
 -
 -  /* Requires permutation - not SLPable.  */
 -  for (i = 0; i  N*2; i++)
 -    {
 -      out[i*4] = (in[i*4] + 2) * 3;
 -      out[i*4 + 1] = (in[i*4 + 2] + 2) * 7;
 -      out[i*4 + 2] = (in[i*4 + 1] + 7) * 3;
 -      out[i*4 + 3] = (in[i*4 + 3] + 3) * 4;
 -    }
 -
 -  /* check results:  */
 -  for (i = 0; i  N*2; i++)
 -    {
 -      if (out[i*4] !=  (in[i*4] + 2) * 3
 -         || out[i*4 + 1] != (in[i*4 + 2] + 2) * 7
 -         || out[i*4 + 2] != (in[i*4 + 1] + 7) * 3
 -         || out[i*4 + 3] != (in[i*4 + 3] + 3) * 4)
 -        abort ();
 -    }
 -
 -  /* Different operations - not SLPable.  */
 -  for (i = 0; i  N*4; i++)
 -    {
 -      out2[i*2] = ((float) in[i*2] * 2 + 6) ;
 -      out2[i*2 + 1] = (float) (in[i*2 + 1] * 3 + 7);
 -    }
 -
 -  /* check results:  */
 -  for (i = 0; i  N*4; i++)
 -    {
 -      if (out2[i*2] !=  ((float) in[i*2] * 2 + 6)
 -         || out2[i*2 + 1] != (float) (in[i*2 + 1] * 3 + 7))
 -        abort ();
 -    }
 -
 -
 -  return 0;
 -}
 -
 -int main (void)
 -{
 -  check_vect ();
 -
 -  main1 ();
 -
 -  return 0;
 -}
 -
 -/* { dg-final { scan-tree-dump-times vectorized 3 loops 1 vect  { target 
 { { vect_uintfloat_cvt  vect_strided }   vect_int_mult } } } } */
 -/* { dg-final { scan-tree-dump-times vectorized 2 loops 1 vect  { target 
 { { { ! vect_uintfloat_cvt }  vect_strided }   vect_int_mult } } } } */
 -/* { dg-final { scan-tree-dump-times vectorized 0 loops 1 vect  {target  
 { ! { vect_int_mult  vect_strided } } } } }  */
 -/* { dg-final { scan-tree-dump-times vectorizing stmts using SLP 0  vect 
  } } */
 -/* { dg-final { cleanup-tree-dump vect } } */
 -
 Index: gcc/testsuite/gcc.dg/vect/slp-11a.c
 ===
 --- /dev/null   2011-03-23 08:42:11.268792848 +
 +++ gcc/testsuite/gcc.dg/vect/slp-11a.c 2011-04-12 15:18:25.0 +0100
 @@ -0,0 +1,75 @@
 +/* { dg-require-effective-target vect_int } */
 +
 +#include stdarg.h
 +#include tree-vect.h
 +
 +#define N 8
 +
 +int
 +main1 ()
 +{
 +  int i;
 +  unsigned int out[N*8], a0, a1, 

Re: [10/9] Add tests for stride-3 accesses

2011-04-15 Thread Richard Guenther
On Tue, Apr 12, 2011 at 4:34 PM, Richard Sandiford
richard.sandif...@linaro.org wrote:
 This patch adds a test for stride-3 accesses.  I didn't add any
 particularly complicated cases because I think the testsuite already
 covers the interaction between the strided loads  stores and other
 operations pretty well.  Let me know if there's something I should
 add though.

 Tested on x86_64-linux-gnu and arm-linux-gnueabi.  OK to install?

Ok.

Thanks,
Richard.

 Richard


 gcc/testsuite/
        * gcc.dg/vect/vect-strided-u16-i3.c: New test.

 Index: gcc/testsuite/gcc.dg/vect/vect-strided-u16-i3.c
 ===
 --- /dev/null   2011-03-23 08:42:11.268792848 +
 +++ gcc/testsuite/gcc.dg/vect/vect-strided-u16-i3.c     2011-04-12 
 11:55:17.0 +0100
 @@ -0,0 +1,112 @@
 +#include stdarg.h
 +#include tree-vect.h
 +
 +#define N 128
 +
 +typedef struct {
 +   unsigned short a;
 +   unsigned short b;
 +   unsigned short c;
 +} s;
 +
 +#define A(I) (I)
 +#define B(I) ((I) * 2)
 +#define C(I) ((unsigned short) ~((I) ^ 0x18))
 +
 +void __attribute__ ((noinline))
 +check1 (s *res)
 +{
 +  int i;
 +
 +  for (i = 0; i  N; i++)
 +    if (res[i].a != C (i)
 +       || res[i].b != A (i)
 +       || res[i].c != B (i))
 +      abort ();
 +}
 +
 +void __attribute__ ((noinline))
 +check2 (unsigned short *res)
 +{
 +  int i;
 +
 +  for (i = 0; i  N; i++)
 +    if (res[i] != (unsigned short) (A (i) + B (i) + C (i)))
 +      abort ();
 +}
 +
 +void __attribute__ ((noinline))
 +check3 (s *res)
 +{
 +  int i;
 +
 +  for (i = 0; i  N; i++)
 +    if (res[i].a != i
 +       || res[i].b != i
 +       || res[i].c != i)
 +      abort ();
 +}
 +
 +void __attribute__ ((noinline))
 +check4 (unsigned short *res)
 +{
 +  int i;
 +
 +  for (i = 0; i  N; i++)
 +    if (res[i] != (unsigned short) (A (i) + B (i)))
 +      abort ();
 +}
 +
 +void __attribute__ ((noinline))
 +main1 (s *arr)
 +{
 +  int i;
 +  s *ptr = arr;
 +  s res1[N];
 +  unsigned short res2[N];
 +
 +  for (i = 0; i  N; i++)
 +    {
 +      res1[i].a = arr[i].c;
 +      res1[i].b = arr[i].a;
 +      res1[i].c = arr[i].b;
 +    }
 +  check1 (res1);
 +
 +  for (i = 0; i  N; i++)
 +    res2[i] = arr[i].a + arr[i].b + arr[i].c;
 +  check2 (res2);
 +
 +  for (i = 0; i  N; i++)
 +    {
 +      res1[i].a = i;
 +      res1[i].b = i;
 +      res1[i].c = i;
 +    }
 +  check3 (res1);
 +
 +  for (i = 0; i  N; i++)
 +    res2[i] = arr[i].a + arr[i].b;
 +  check4 (res2);
 +}
 +
 +int main (void)
 +{
 +  int i;
 +  s arr[N];
 +
 +  check_vect ();
 +
 +  for (i = 0; i  N; i++)
 +    {
 +      arr[i].a = A (i);
 +      arr[i].b = B (i);
 +      arr[i].c = C (i);
 +    }
 +  main1 (arr);
 +
 +  return 0;
 +}
 +
 +/* { dg-final { scan-tree-dump-times vectorized 4 loops 1 vect  { target 
 vect_strided3 } } } */
 +/* { dg-final { cleanup-tree-dump vect } } */



Re: [lto/pph] Do not pack more bits than requested (issue4415044)

2011-04-15 Thread Diego Novillo
On Fri, Apr 15, 2011 at 04:56, Richard Guenther rguent...@suse.de wrote:

 @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree 
 expr)
    bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
    bp_pack_value (bp, TYPE_READONLY (expr), 1);
    bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT);
 -  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, 
 HOST_BITS_PER_INT);
 +  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1,
 +                BITS_PER_BITPACK_WORD);

 As we only want to stream alias-set zeros just change it to a single bit,
 like

     bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1);

 and on the reader side restore either a zero or -1.

Ah, yes.  Much better.

 As for the -1 case, it's simply broken use of the interface.

Which would've been caught by the assertion.  How about this, we keep
the asserts with #ifdef ENABLE_CHECKING. This would've saved me some
ugly debugging.


Diego.


Re: [lto/pph] Do not pack more bits than requested (issue4415044)

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Diego Novillo wrote:

 On Fri, Apr 15, 2011 at 04:56, Richard Guenther rguent...@suse.de wrote:
 
  @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree 
  expr)
     bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
     bp_pack_value (bp, TYPE_READONLY (expr), 1);
     bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT);
  -  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, 
  HOST_BITS_PER_INT);
  +  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1,
  +                BITS_PER_BITPACK_WORD);
 
  As we only want to stream alias-set zeros just change it to a single bit,
  like
 
      bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1);
 
  and on the reader side restore either a zero or -1.
 
 Ah, yes.  Much better.
 
  As for the -1 case, it's simply broken use of the interface.
 
 Which would've been caught by the assertion.  How about this, we keep
 the asserts with #ifdef ENABLE_CHECKING. This would've saved me some
 ugly debugging.

I think we should rather add the masking.  The assert would only
trigger for particular values, not for bogus use of the interface
(you get sign-extension for signed arguments).

Richard.

[PATCH 1/4] Remove usesess and wrong code from ipa_analyze_virtual_call_uses

2011-04-15 Thread Martin Jambor
Hi,

ipa_analyze_virtual_call_uses contains code that was meant to deal
with situation where OBJ_TYPE_REF_OBJECT is a (number of)
COMPONENT_REFs on top of a dereferenced default definition SSA_NAME of
a parameter.

The code is useless because that never happens in the IL, if an
ancestor object of a parameter is being used for a virtual call, the
object in the expression is always an SSA_NAME which is assigned the
proper value in a previous statement.

Moreover, if it ever triggered, it might lead to wrong code because in
cases like this it is necessary also to store the offset within the
parameter into the indirect call graph edge information (like we do in
indirect inlining).

The above is done in the next patch in the series.  I've split this
part from it because I would like to commit it also to the 4.6 branch.
I have bootstrapped and tested this on x86-64-linux without any
problems.  OK for trunk and the 4.6 branch?

Thanks,

Martin


2011-04-08  Martin Jambor  mjam...@suse.cz

* ipa-prop.c (ipa_analyze_virtual_call_uses): Remove handling
of ADR_EXPRs.


Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -1383,18 +1383,6 @@ ipa_analyze_virtual_call_uses (struct cg
   if (!flag_devirtualize)
 return;
 
-  if (TREE_CODE (obj) == ADDR_EXPR)
-{
-  do
-   {
- obj = TREE_OPERAND (obj, 0);
-   }
-  while (TREE_CODE (obj) == COMPONENT_REF);
-  if (TREE_CODE (obj) != MEM_REF)
-   return;
-  obj = TREE_OPERAND (obj, 0);
-}
-
   if (TREE_CODE (obj) != SSA_NAME
   || !SSA_NAME_IS_DEFAULT_DEF (obj))
 return;



[PATCH 3/4] Simple relaxation of dynamic type change detection routine

2011-04-15 Thread Martin Jambor
Hi,

in order to speed up astar, I had to persuade the function that
decides whether a statement potentially modifies the dynamic type of
an object by storing a new value to the VMT pointer to consider the
following statement harmless (all types are integers of some sort):

MEM[(i32 *)b2arp_3(D) + 8B] = 0;

I'd like to experiment with this routine a bit more once I have some
other IPA-CP infrastructure in place but at the moment I opted for a
simple solution:  All scalar non-pointer stores are deemed safe.

VMT pointer is a compiler generated field which is a pointer so legal
user code is not able to store stuff there through some fancy type
casts and compiler generated code should have no reason whatsoever to
that either.  Therefore I believe this change is safe and useful.

I have bootstrapped and tested the patch on x886_64-linux.  OK for
trunk?

Thanks,

Martin



2011-04-14  Martin Jambor  mjam...@suse.cz

* ipa-prop.c (stmt_may_be_vtbl_ptr_store): Return false for scalar
non-pointer assignments.

Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -405,13 +405,18 @@ stmt_may_be_vtbl_ptr_store (gimple stmt)
 {
   tree lhs = gimple_assign_lhs (stmt);
 
-  if (TREE_CODE (lhs) == COMPONENT_REF
-  !DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1))
-  !AGGREGATE_TYPE_P (TREE_TYPE (lhs)))
+  if (!AGGREGATE_TYPE_P (TREE_TYPE (lhs)))
+   {
+ if (!POINTER_TYPE_P (TREE_TYPE (lhs)))
return false;
-  /* In the future we might want to use get_base_ref_and_offset to find
-if there is a field corresponding to the offset and if so, proceed
-almost like if it was a component ref.  */
+
+ if (TREE_CODE (lhs) == COMPONENT_REF
+  !DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1)))
+   return false;
+ /* In the future we might want to use get_base_ref_and_offset to find
+if there is a field corresponding to the offset and if so, proceed
+almost like if it was a component ref.  */
+   }
 }
   return true;
 }



[PATCH 4/4] Devirtualization based on global objects

2011-04-15 Thread Martin Jambor
Hi,

this is the patch that actually speeds up astar (together with the
previous one).  It implements devirtualization based on global
objects.  In the past we have refrained from doing this because in
general it is difficult to say whether the object is currently being
constructed and so it might have a dynamic type of one of its
ancestors.  However, if the object's class does not have any ancestors
that obviously cannot happen.

Devirtualizing in such conditions is enough to change a virtual call
to regwayobj::isaddtobound in 473.astar to a direct one which can and
should be inlined.  That seemed a good justification to implement this
and so the patch below does so and brings about 3.1% speedup for that
benchmark with LTO.

I acknowledge that instead of discarding all classes with ancestors it
would be better to check that the called virtual method has the same
implementation in all ancestors instead.  That is perhaps something
for later.

It took me surprisingly long to realize that this technique can be
used for folding virtual calls based on local automatically allocated
objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that
regressed in 4.6.

Bootstrapped and tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2011-04-15  Martin Jambor  mjam...@suse.cz

* ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize
also according to actual contants.
* gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function.
(gimple_fold_obj_type_ref_call): New function.
(gimple_fold_call): Call gimple_fold_obj_type_ref_call on
OBJ_TYPE_REFs.
* gimple.h (gimple_extract_devirt_binfo_from_cst): Declare.

* testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL.
* testsuite/g++.dg/opt/devirt2.C: New test.
* testsuite/g++.dg/ipa/devirt-g-1.C: Likewise.


Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -1245,51 +1245,71 @@ ipcp_process_devirtualization_opportunit
 
   for (ie = node-indirect_calls; ie; ie = next_ie)
 {
-  int param_index, types_count, j;
+  int param_index;
   HOST_WIDE_INT token, anc_offset;
   tree target, delta, otr_type;
+  struct ipcp_lattice *lat;
 
   next_ie = ie-next_callee;
   if (!ie-indirect_info-polymorphic)
continue;
   param_index = ie-indirect_info-param_index;
-  if (param_index == -1
- || ipa_param_cannot_devirtualize_p (info, param_index)
- || ipa_param_types_vec_empty (info, param_index))
+  if (param_index == -1)
continue;
 
+  lat = ipcp_get_lattice (info, param_index);
   token = ie-indirect_info-otr_token;
   anc_offset = ie-indirect_info-anc_offset;
   otr_type = ie-indirect_info-otr_type;
   target = NULL_TREE;
-  types_count = VEC_length (tree, info-params[param_index].types);
-  for (j = 0; j  types_count; j++)
+  if (lat-type == IPA_CONST_VALUE)
{
- tree binfo = VEC_index (tree, info-params[param_index].types, j);
- tree d, t;
-
+ tree binfo = gimple_extract_devirt_binfo_from_cst (lat-constant);
+ if (!binfo)
+   continue;
  binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
  if (!binfo)
-   {
- target = NULL_TREE;
- break;
-   }
+   continue;
+ target = gimple_get_virt_mehtod_for_binfo (token, binfo, delta,
+false);
+   }
+  else
+   {
+ int  types_count, j;
 
- t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
- if (!t)
-   {
- target = NULL_TREE;
- break;
-   }
- else if (!target)
-   {
- target = t;
- delta = d;
-   }
- else if (target != t || !tree_int_cst_equal (delta, d))
+ if (ipa_param_cannot_devirtualize_p (info, param_index)
+ || ipa_param_types_vec_empty (info, param_index))
+   continue;
+
+ types_count = VEC_length (tree, info-params[param_index].types);
+ for (j = 0; j  types_count; j++)
{
- target = NULL_TREE;
- break;
+ tree binfo = VEC_index (tree, info-params[param_index].types, j);
+ tree d, t;
+
+ binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
+ if (!binfo)
+   {
+ target = NULL_TREE;
+ break;
+   }
+
+ t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
+ if (!t)
+   {
+ target = NULL_TREE;
+ break;
+   }
+ else if (!target)
+   {
+ target = t;
+ delta = d;
+   }
+

[PATCH 2/4] Handle calls to ancestor objects in IPA-CP devirtualization

2011-04-15 Thread Martin Jambor
Hi,

early inlining can create virtual calls based on the part of an object
that represents an ancestor.  This patch makes ipa-prop analysis able
to recognize such calls and store the required offset along with such
calls (the field is already there for similar purposes of indirect
inlining).  The constant propagation is then made aware of the offset
field and takes it into account when looking up the proper BINFO.

Bootstrapped and tested on x86_64-linux. OK for trunk?

Thanks,

Martin



2011-04-13  Martin Jambor  mjam...@suse.cz

* ipa-cp.c (ipcp_process_devirtualization_opportunities): Take into
account anc_offset and otr_type from the indirect edge info.
* ipa-prop.c (get_ancestor_addr_info): New function.
(compute_complex_ancestor_jump_func): Assignment analysis moved to
get_ancestor_addr_info, call it.
(ipa_note_param_call): Do not initialize information about polymorphic
calls, return the indirect call graph edge.  Remove the last
parameter, adjust all callers.
(ipa_analyze_virtual_call_uses): Process also calls to ancestors of
parameters.  Initialize polymorphic information in the indirect edge.

* testsuite/g++.dg/ipa/devirt-7.C: New test.


Index: src/gcc/ipa-cp.c
===
--- src.orig/gcc/ipa-cp.c
+++ src/gcc/ipa-cp.c
@@ -1246,8 +1246,8 @@ ipcp_process_devirtualization_opportunit
   for (ie = node-indirect_calls; ie; ie = next_ie)
 {
   int param_index, types_count, j;
-  HOST_WIDE_INT token;
-  tree target, delta;
+  HOST_WIDE_INT token, anc_offset;
+  tree target, delta, otr_type;
 
   next_ie = ie-next_callee;
   if (!ie-indirect_info-polymorphic)
@@ -1259,14 +1259,23 @@ ipcp_process_devirtualization_opportunit
continue;
 
   token = ie-indirect_info-otr_token;
+  anc_offset = ie-indirect_info-anc_offset;
+  otr_type = ie-indirect_info-otr_type;
   target = NULL_TREE;
   types_count = VEC_length (tree, info-params[param_index].types);
   for (j = 0; j  types_count; j++)
{
  tree binfo = VEC_index (tree, info-params[param_index].types, j);
- tree d;
- tree t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
+ tree d, t;
 
+ binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
+ if (!binfo)
+   {
+ target = NULL_TREE;
+ break;
+   }
+
+ t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
  if (!t)
{
  target = NULL_TREE;
Index: src/gcc/ipa-prop.c
===
--- src.orig/gcc/ipa-prop.c
+++ src/gcc/ipa-prop.c
@@ -576,6 +576,49 @@ compute_complex_assign_jump_func (struct
 }
 }
 
+/* Extract the base, offset and MEM_REF expression from a statement ASSIGN if
+   it looks like:
+
+   iftmp.1_3 = obj_2(D)-D.1762;
+
+   The base of the MEM_REF must be a default definition SSA NAME of a
+   parameter.  Return NULL_TREE if it looks otherwise.  If case of success, the
+   whole MEM_REF expression is returned and the offset calculated from any
+   handled components and the MEM_REF itself is stored into *OFFSET.  The whole
+   RHS stripped off the ADDR_EXPR is stored into *OBJ_P.  */
+
+static tree
+get_ancestor_addr_info (gimple assign, tree *obj_p, HOST_WIDE_INT *offset)
+{
+  HOST_WIDE_INT size, max_size;
+  tree expr, parm, obj;
+
+  if (!gimple_assign_single_p (assign))
+return NULL_TREE;
+  expr = gimple_assign_rhs1 (assign);
+
+  if (TREE_CODE (expr) != ADDR_EXPR)
+return NULL_TREE;
+  expr = TREE_OPERAND (expr, 0);
+  obj = expr;
+  expr = get_ref_base_and_extent (expr, offset, size, max_size);
+
+  if (TREE_CODE (expr) != MEM_REF
+  /* If this is a varying address, punt.  */
+  || max_size == -1
+  || max_size != size)
+return NULL_TREE;
+  parm = TREE_OPERAND (expr, 0);
+  if (TREE_CODE (parm) != SSA_NAME
+  || !SSA_NAME_IS_DEFAULT_DEF (parm)
+  || *offset  0)
+return NULL_TREE;
+
+  *offset += mem_ref_offset (expr).low * BITS_PER_UNIT;
+  *obj_p = obj;
+  return expr;
+}
+
 
 /* Given that an actual argument is an SSA_NAME that is a result of a phi
statement PHI, try to find out whether NAME is in fact a
@@ -603,7 +646,7 @@ compute_complex_ancestor_jump_func (stru
struct ipa_jump_func *jfunc,
gimple call, gimple phi)
 {
-  HOST_WIDE_INT offset, size, max_size;
+  HOST_WIDE_INT offset;
   gimple assign, cond;
   basic_block phi_bb, assign_bb, cond_bb;
   tree tmp, parm, expr, obj;
@@ -626,29 +669,12 @@ compute_complex_ancestor_jump_func (stru
 
   assign = SSA_NAME_DEF_STMT (tmp);
   assign_bb = gimple_bb (assign);
-  if (!single_pred_p (assign_bb)
-  || !gimple_assign_single_p (assign))
+  if (!single_pred_p (assign_bb))
 return;
-  expr = 

Re: [lto/pph] Do not pack more bits than requested (issue4415044)

2011-04-15 Thread Diego Novillo
On Fri, Apr 15, 2011 at 08:49, Richard Guenther rguent...@suse.de wrote:
 On Fri, 15 Apr 2011, Diego Novillo wrote:

 On Fri, Apr 15, 2011 at 04:56, Richard Guenther rguent...@suse.de wrote:

  @@ -518,7 +518,8 @@ pack_ts_type_value_fields (struct bitpack_d *bp, tree 
  expr)
     bp_pack_value (bp, TYPE_USER_ALIGN (expr), 1);
     bp_pack_value (bp, TYPE_READONLY (expr), 1);
     bp_pack_value (bp, TYPE_ALIGN (expr), HOST_BITS_PER_INT);
  -  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1, 
  HOST_BITS_PER_INT);
  +  bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0 ? 0 : -1,
  +                BITS_PER_BITPACK_WORD);
 
  As we only want to stream alias-set zeros just change it to a single bit,
  like
 
      bp_pack_value (bp, TYPE_ALIAS_SET (expr) == 0, 1);
 
  and on the reader side restore either a zero or -1.

 Ah, yes.  Much better.

  As for the -1 case, it's simply broken use of the interface.

 Which would've been caught by the assertion.  How about this, we keep
 the asserts with #ifdef ENABLE_CHECKING. This would've saved me some
 ugly debugging.

 I think we should rather add the masking.  The assert would only
 trigger for particular values, not for bogus use of the interface
 (you get sign-extension for signed arguments).

OK, that works too.  I'll prepare a patch.


Diego.


Re: [6/9] NEON vec_load_lanes and vec_store_lanes patterns

2011-04-15 Thread Richard Earnshaw

On Tue, 2011-04-12 at 15:01 +0100, Richard Sandiford wrote:
 This patch adds vec_load_lanes and vec_store_lanes patterns for NEON.
 They feed directly into the corresponding intrinsic patterns.
 
 Tested on x86_64-linux-gnu and arm-linux-gnueabi.  OK to install?
 
 Richard
 
 
 gcc/
   * config/arm/neon.md (vec_load_lanesmodemode): New expanders,
   (vec_store_lanesmodemode): Likewise.
 

OK.

R.





Re: [PATCH v3] Re: avoid useless if-before-free tests

2011-04-15 Thread Tom Tromey
 Janne == Janne Blomqvist blomqvist.ja...@gmail.com writes:

Jim Can someone add me to the gcc group?  That would help.
Jim I already have ssh access to sourceware.org.

Janne I'm not sure if I'm considered to be well-established
Janne enough, so could someone help Jim out here, please?

I added Jim to the gcc group.

Tom


[PATCH] refactor gimple asm memory clobber checking

2011-04-15 Thread Nathan Froyd
There are a couple places that check GIMPLE_ASMs for clobbering memory;
this patch centralizes the logic in gimple.c.

Tested on x86_64-unknown-linux-gnu.  OK to commit?

-Nathan

* gimple.h (gimple_asm_clobbers_memory_p): Declare.
* gimple.c (gimple_asm_clobbers_memory_p): Define.
* ipa-pure-const.c (check_stmt): Call it.
* tree-ssa-operands.c (get_asm_expr_operands): Likewise.

diff --git a/gcc/gimple.c b/gcc/gimple.c
index 090fc94..5dc62ea 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -5142,4 +5142,21 @@ gimple_call_builtin_p (gimple stmt, enum 
built_in_function code)
   DECL_FUNCTION_CODE (fndecl) == code);
 }
 
+/* Return true if STMT clobbers memory.  STMT is required to be a
+   GIMPLE_ASM.  */
+
+bool
+gimple_asm_clobbers_memory_p (const_gimple stmt)
+{
+  unsigned i;
+
+  for (i = 0; i  gimple_asm_nclobbers (stmt); i++)
+{
+  tree op = gimple_asm_clobber_op (stmt, i);
+  if (strcmp (TREE_STRING_POINTER (TREE_VALUE (op)), memory) == 0)
+   return true;
+}
+
+  return false;
+}
 #include gt-gimple.h
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 572cabc..840e149 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -973,6 +973,7 @@ extern bool walk_stmt_load_store_ops (gimple, void *,
  bool (*)(gimple, tree, void *));
 extern bool gimple_ior_addresses_taken (bitmap, gimple);
 extern bool gimple_call_builtin_p (gimple, enum built_in_function);
+extern bool gimple_asm_clobbers_memory_p (const_gimple);
 
 /* In gimplify.c  */
 extern tree create_tmp_var_raw (tree, const char *);
diff --git a/gcc/ipa-pure-const.c b/gcc/ipa-pure-const.c
index b7deb57..eb5b0f6 100644
--- a/gcc/ipa-pure-const.c
+++ b/gcc/ipa-pure-const.c
@@ -639,7 +639,6 @@ static void
 check_stmt (gimple_stmt_iterator *gsip, funct_state local, bool ipa)
 {
   gimple stmt = gsi_stmt (*gsip);
-  unsigned int i = 0;
 
   if (is_gimple_debug (stmt))
 return;
@@ -693,16 +692,12 @@ check_stmt (gimple_stmt_iterator *gsip, funct_state 
local, bool ipa)
}
   break;
 case GIMPLE_ASM:
-  for (i = 0; i  gimple_asm_nclobbers (stmt); i++)
+  if (gimple_asm_clobbers_memory_p (stmt))
{
- tree op = gimple_asm_clobber_op (stmt, i);
- if (strcmp (TREE_STRING_POINTER (TREE_VALUE (op)), memory) == 0)
-   {
-  if (dump_file)
-fprintf (dump_file, memory asm clobber is not 
const/pure);
- /* Abandon all hope, ye who enter here. */
- local-pure_const_state = IPA_NEITHER;
-   }
+ if (dump_file)
+   fprintf (dump_file, memory asm clobber is not const/pure);
+ /* Abandon all hope, ye who enter here. */
+ local-pure_const_state = IPA_NEITHER;
}
   if (gimple_asm_volatile_p (stmt))
{
diff --git a/gcc/tree-ssa-operands.c b/gcc/tree-ssa-operands.c
index 57f443f..7f76cbf 100644
--- a/gcc/tree-ssa-operands.c
+++ b/gcc/tree-ssa-operands.c
@@ -832,15 +832,8 @@ get_asm_expr_operands (gimple stmt)
 }
 
   /* Clobber all memory and addressable symbols for asm ( : : : memory);  
*/
-  for (i = 0; i  gimple_asm_nclobbers (stmt); i++)
-{
-  tree link = gimple_asm_clobber_op (stmt, i);
-  if (strcmp (TREE_STRING_POINTER (TREE_VALUE (link)), memory) == 0)
-   {
- add_virtual_operand (stmt, opf_def);
- break;
-   }
-}
+  if (gimple_asm_clobbers_memory_p (stmt))
+add_virtual_operand (stmt, opf_def);
 }
 
 


Re: [PATCH, ARM] PR47855 Compute attr length for some thumb2 insns, 2/3

2011-04-15 Thread Richard Earnshaw

On Thu, 2011-04-14 at 21:19 +0800, Carrot Wei wrote:
 On Fri, Apr 8, 2011 at 6:51 PM, Ramana Radhakrishnan
 ramana.radhakrish...@linaro.org wrote:
  On 08/04/11 10:57, Carrot Wei wrote:
 
  Hi
 
  This is the second part of the fixing for
 
  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
 
  This patch contains the length computation for insn patterns
  *arm_movqi_insn
  and *arm_addsi3. Since the alternatives and encodings are much more
  complex,
  the attribute length is computed in separate C functions.
 
  Sorry, no. This is potentially a maintenance pain. It hardcodes alternatives
  from a pattern elsewhere in the C file. I don't like doing this unless we
  have to with the sync primitives or with push_multi. In this case I'm not
  convinced we need such functions in the .c file.
 
  Why can't we use the enabled attribute here with appropriate constraints
  for everything other than the memory cases (even there we might be able to
  invent some new constraints) ?
 
  Also a note about programming style. There are the helper macros like REG_P,
  CONST_INT_P and MEM_P which remove the necessity for checks like
 
  GET_CODE (x) == y where y E { REG, CONST_INT, MEM}
 
 Hi Ramana
 
 As you suggested I created several new constraints, and use the
 enabled attribute to split the current alternatives in this new
 patch. It has been tested on arm qemu without regression.
 
 thanks
 Carrot


Sorry, I don't think this approach can work.  Certainly not with the way
the compiler currently works, and especially for mov and add insns. 

These instructions are only 2 bytes long if either:
1) They clobber the condition code register or
2) They occur inside an IT block.

We can't tell either of these from the pattern, so you're
underestimating the length of the instruction in some circumstances by
claiming that they are only 2 bytes long.  That /will/ lead to broken
code someday.

We can't add potential clobbers to mov and add patterns because that
will break reload which relies on these patterns being simple-set insns
with no added baggage.  It *might* be possible to add clobbers to other
operations, but that will then most-likely upset instruction scheduling
(I think the scheduler treats two insns that clobber the same hard reg
as being strongly ordered).  Putting in the clobber too early will
certainly affect cond-exec generation.

In short, I'm not aware of a simple way to address this problem so that
we get accurate length information, but minimal impact on other passes
in the compiler.

R.




Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Dominique Dhumieres
Michael,

 Yes, this is due to the DECL_EXPR statement which is rendered by the 
 dumper just the same as a normal decl.  The testcase looks for exactly one 
 such decl, but with -fstack-arrays there are exactly two for each such 
 array.

The testsuite is run without -fstack-arrays, so I dont' understand why
the DECL_EXPR statement appears.

Dominique


[PATCH] Fix PR48286

2011-04-15 Thread Richard Guenther

This makes us avoid 323.

Tested on x86_64-unknown-linux-gnu/-m32, committed.

Richard.

2011-04-15  Richard Guenther  rguent...@suse.de

PR testsuite/48286
* gfortran.dg/cray_pointers_8.f90: Use -ffloat-store.

gcc/testsuite/gfortran.dg/cray_pointers_8.f90
Index: gcc/testsuite/gfortran.dg/cray_pointers_8.f90
===
--- gcc/testsuite/gfortran.dg/cray_pointers_8.f90   (revision 172493)
+++ gcc/testsuite/gfortran.dg/cray_pointers_8.f90   (working copy)
@@ -1,5 +1,5 @@
 ! { dg-do run }
-! { dg-options -fcray-pointer }
+! { dg-options -fcray-pointer -ffloat-store }
 !
 ! Test the fix for PR36528 in which the Cray pointer was not passed
 ! correctly to 'euler' so that an undefined reference to fcn was


Re: [PATCH] factor asm op chaining out from stmt.c:expand_asm_stmt

2011-04-15 Thread Richard Guenther
On Fri, Apr 15, 2011 at 3:19 PM, Nathan Froyd froy...@codesourcery.com wrote:
 There are several cut-and-pasted loops in expand_asm_stmt that could be
 parameterized by the functions used to access the particular operands.
 The patch below does that.

 Tested on x86_64-unknown-linux-gnu.  OK to commit?

Hmm, it doesn't make it easier to follow though.

Just my 2cents,

Richard.

 -Nathan

         * stmt.c (chain_asm_ops): New function.
        (expand_asm_stmt): Call it.

 diff --git a/gcc/stmt.c b/gcc/stmt.c
 index 1a9f9e5..1fc09e9 100644
 --- a/gcc/stmt.c
 +++ b/gcc/stmt.c
 @@ -1114,13 +1114,34 @@ expand_asm_operands (tree string, tree outputs, tree 
 inputs,
   free_temp_slots ();
  }

 +/* Return the operands of STMT, a GIMPLE_ASM, as described by OP_FN and
 +   N_OPS connected via TREE_CHAIN.  */
 +
 +static tree
 +chain_asm_ops (const_gimple stmt, unsigned (*n_ops) (const_gimple),
 +              tree (*op_fn) (const_gimple, unsigned))
 +{
 +  unsigned i, n;
 +  tree ret = NULL_TREE, t;
 +
 +  n = (*n_ops) (stmt);
 +  if (n  0)
 +    {
 +      t = ret = (*op_fn) (stmt, 0);
 +      for (i = 1; i  n; i++)
 +       t = TREE_CHAIN (t) = (*op_fn) (stmt, i);
 +    }
 +
 +  return ret;
 +}
 +
  void
  expand_asm_stmt (gimple stmt)
  {
   int noutputs;
 -  tree outputs, tail, t;
 +  tree outputs, tail;
   tree *o;
 -  size_t i, n;
 +  size_t i;
   const char *s;
   tree str, out, in, cl, labels;
   location_t locus = gimple_location (stmt);
 @@ -1128,41 +1149,10 @@ expand_asm_stmt (gimple stmt)
   /* Meh... convert the gimple asm operands into real tree lists.
      Eventually we should make all routines work on the vectors instead
      of relying on TREE_CHAIN.  */
 -  out = NULL_TREE;
 -  n = gimple_asm_noutputs (stmt);
 -  if (n  0)
 -    {
 -      t = out = gimple_asm_output_op (stmt, 0);
 -      for (i = 1; i  n; i++)
 -       t = TREE_CHAIN (t) = gimple_asm_output_op (stmt, i);
 -    }
 -
 -  in = NULL_TREE;
 -  n = gimple_asm_ninputs (stmt);
 -  if (n  0)
 -    {
 -      t = in = gimple_asm_input_op (stmt, 0);
 -      for (i = 1; i  n; i++)
 -       t = TREE_CHAIN (t) = gimple_asm_input_op (stmt, i);
 -    }
 -
 -  cl = NULL_TREE;
 -  n = gimple_asm_nclobbers (stmt);
 -  if (n  0)
 -    {
 -      t = cl = gimple_asm_clobber_op (stmt, 0);
 -      for (i = 1; i  n; i++)
 -       t = TREE_CHAIN (t) = gimple_asm_clobber_op (stmt, i);
 -    }
 -
 -  labels = NULL_TREE;
 -  n = gimple_asm_nlabels (stmt);
 -  if (n  0)
 -    {
 -      t = labels = gimple_asm_label_op (stmt, 0);
 -      for (i = 1; i  n; i++)
 -       t = TREE_CHAIN (t) = gimple_asm_label_op (stmt, i);
 -    }
 +  out = chain_asm_ops (stmt, gimple_asm_noutputs, gimple_asm_output_op);
 +  in = chain_asm_ops (stmt, gimple_asm_ninputs, gimple_asm_input_op);
 +  cl = chain_asm_ops (stmt, gimple_asm_nclobbers, gimple_asm_clobber_op);
 +  labels = chain_asm_ops (stmt, gimple_asm_nlabels, gimple_asm_label_op);

   s = gimple_asm_string (stmt);
   str = build_string (strlen (s), s);



patch pings

2011-04-15 Thread Jeff Law
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01060.html
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02247.html
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNqFOfAAoJEBRtltQi2kC7ijYH/ibLT2HNaFXjdE9SKT5Ih1JV
dTEWPnY7QBP5Xe6FVwZ09ibPCOxJsK7yGdQBYqy5gRQor8ebifI4kenwcBcdET/m
NogdM8DPWbuhgGda7cETNkru7FifSe9mRsKQGhNzVQl48oEKWmcGkm/l3a7gndfD
cX08lFzqIH1wcPWUqQPf6gcUMRE4W/0j92E4PEoIbigMoSIRFcduVouBlld8NLBV
aicTihAC+MFMVKSkpXGjMLCbn/HkNOyV9s9T+Or1/XuCIZy9zlB1JSfR/EJqVmPm
mNmg0bUrtzXN2uImL6ohIV32i732KKcNhNm/FupZTHfbI50nDc1h8SnCgEHc2t0=
=i7zA
-END PGP SIGNATURE-


Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Michael Matz
Hi,

On Fri, 15 Apr 2011, Dominique Dhumieres wrote:

 Michael,
 
  Yes, this is due to the DECL_EXPR statement which is rendered by the 
  dumper just the same as a normal decl.  The testcase looks for exactly one 
  such decl, but with -fstack-arrays there are exactly two for each such 
  array.
 
 The testsuite is run without -fstack-arrays, so I dont' understand why
 the DECL_EXPR statement appears.

Bummer, you're right.  I unconditionally emit a DECL_EXPR for arrays even 
when they don't have a variable length.  It's harmless, but makes the 
testcase fail (I wasn't seeing the fail because I've changed the testcase 
already to make it not fail with -fstack-arrays).

I'll make the DECL_EXPR conditional on the size being variable.  As Tobias 
already okayed the patch I'm planning to check in the slightly modified 
variant as below, after a new round of testing.


Ciao,
Michael.

* trans-array.c (toplevel): Include gimple.h.
(gfc_trans_allocate_array_storage): Check flag_stack_arrays,
properly expand variable length arrays.
(gfc_trans_auto_array_allocation): If flag_stack_arrays create
variable length decls and associate them with their scope.
* gfortran.h (gfc_option_t): Add flag_stack_arrays member.
* options.c (gfc_init_options): Handle -fstack_arrays option.
* lang.opt (fstack-arrays): Add option.
* invoke.texi (Code Gen Options): Document it.
* Make-lang.in (trans-array.o): Depend on GIMPLE_H.

Index: fortran/trans-array.c
===
*** fortran/trans-array.c   (revision 172431)
--- fortran/trans-array.c   (working copy)
*** along with GCC; see the file COPYING3.
*** 81,86 
--- 81,87 
  #include system.h
  #include coretypes.h
  #include tree.h
+ #include gimple.h
  #include diagnostic-core.h  /* For internal_error/fatal_error.  */
  #include flags.h
  #include gfortran.h
*** gfc_trans_allocate_array_storage (stmtbl
*** 630,647 
  {
/* Allocate the temporary.  */
onstack = !dynamic  initial == NULL_TREE
! gfc_can_put_var_on_stack (size);
  
if (onstack)
{
  /* Make a temporary variable to hold the data.  */
  tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (nelem),
 nelem, gfc_index_one_node);
  tmp = build_range_type (gfc_array_index_type, gfc_index_zero_node,
  tmp);
  tmp = build_array_type (gfc_get_element_type (TREE_TYPE (desc)),
  tmp);
  tmp = gfc_create_var (tmp, A);
  tmp = gfc_build_addr_expr (NULL_TREE, tmp);
  gfc_conv_descriptor_data_set (pre, desc, tmp);
}
--- 631,657 
  {
/* Allocate the temporary.  */
onstack = !dynamic  initial == NULL_TREE
! (gfc_option.flag_stack_arrays
!|| gfc_can_put_var_on_stack (size));
  
if (onstack)
{
  /* Make a temporary variable to hold the data.  */
  tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (nelem),
 nelem, gfc_index_one_node);
+ tmp = gfc_evaluate_now (tmp, pre);
  tmp = build_range_type (gfc_array_index_type, gfc_index_zero_node,
  tmp);
  tmp = build_array_type (gfc_get_element_type (TREE_TYPE (desc)),
  tmp);
  tmp = gfc_create_var (tmp, A);
+ /* If we're here only because of -fstack-arrays we have to
+emit a DECL_EXPR to make the gimplifier emit alloca calls.  */
+ if (!gfc_can_put_var_on_stack (size))
+   gfc_add_expr_to_block (pre,
+  fold_build1_loc (input_location,
+   DECL_EXPR, TREE_TYPE (tmp),
+   tmp));
  tmp = gfc_build_addr_expr (NULL_TREE, tmp);
  gfc_conv_descriptor_data_set (pre, desc, tmp);
}
*** gfc_trans_auto_array_allocation (tree de
*** 4759,4767 
  {
stmtblock_t init;
tree type;
!   tree tmp;
tree size;
tree offset;
bool onstack;
  
gcc_assert (!(sym-attr.pointer || sym-attr.allocatable));
--- 4769,4779 
  {
stmtblock_t init;
tree type;
!   tree tmp = NULL_TREE;
tree size;
tree offset;
+   tree space;
+   tree inittree;
bool onstack;
  
gcc_assert (!(sym-attr.pointer || sym-attr.allocatable));
*** gfc_trans_auto_array_allocation (tree de
*** 4818,4832 
return;
  }
  
!   /* The size is the number of elements in the array, so multiply by the
!  size of an element to get the total size.  */
!   tmp = TYPE_SIZE_UNIT (gfc_get_element_type (type));
!   size = fold_build2_loc (input_location, 

Re: [PATCH 1/4] Remove usesess and wrong code from ipa_analyze_virtual_call_uses

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Martin Jambor wrote:

 Hi,
 
 ipa_analyze_virtual_call_uses contains code that was meant to deal
 with situation where OBJ_TYPE_REF_OBJECT is a (number of)
 COMPONENT_REFs on top of a dereferenced default definition SSA_NAME of
 a parameter.
 
 The code is useless because that never happens in the IL, if an
 ancestor object of a parameter is being used for a virtual call, the
 object in the expression is always an SSA_NAME which is assigned the
 proper value in a previous statement.
 
 Moreover, if it ever triggered, it might lead to wrong code because in
 cases like this it is necessary also to store the offset within the
 parameter into the indirect call graph edge information (like we do in
 indirect inlining).
 
 The above is done in the next patch in the series.  I've split this
 part from it because I would like to commit it also to the 4.6 branch.
 I have bootstrapped and tested this on x86-64-linux without any
 problems.  OK for trunk and the 4.6 branch?

Ok for both.

Thanks,
Richard.

 Thanks,
 
 Martin
 
 
 2011-04-08  Martin Jambor  mjam...@suse.cz
 
   * ipa-prop.c (ipa_analyze_virtual_call_uses): Remove handling
   of ADR_EXPRs.
 
 
 Index: src/gcc/ipa-prop.c
 ===
 --- src.orig/gcc/ipa-prop.c
 +++ src/gcc/ipa-prop.c
 @@ -1383,18 +1383,6 @@ ipa_analyze_virtual_call_uses (struct cg
if (!flag_devirtualize)
  return;
  
 -  if (TREE_CODE (obj) == ADDR_EXPR)
 -{
 -  do
 - {
 -   obj = TREE_OPERAND (obj, 0);
 - }
 -  while (TREE_CODE (obj) == COMPONENT_REF);
 -  if (TREE_CODE (obj) != MEM_REF)
 - return;
 -  obj = TREE_OPERAND (obj, 0);
 -}
 -
if (TREE_CODE (obj) != SSA_NAME
|| !SSA_NAME_IS_DEFAULT_DEF (obj))
  return;
 
 

-- 
Richard Guenther rguent...@suse.de
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex


Re: Implement stack arrays even for unknown sizes

2011-04-15 Thread Jerry DeLisle

On 04/15/2011 07:28 AM, Michael Matz wrote:
--- snip ---


I'll make the DECL_EXPR conditional on the size being variable.  As Tobias
already okayed the patch I'm planning to check in the slightly modified
variant as below, after a new round of testing.



Thats A-OK

Thanks,

Jerry


Re: More of ipa-inline housekeeping

2011-04-15 Thread Dominique Dhumieres
I have forgotten to say that the 125 and 516 thresholds are for
x86_64-apple-darwin10. On powerpc-apple-darwin9 they are
repsectively 172 and 638.

Dominique


[PATCH, SMS] Avoid unfreed memory when SMS fails

2011-04-15 Thread Revital Eres
Hello,

This patch fixes the scenario where SMS fails to
schedule a loop and continue to the next one without
freeing data structures allocated while scheduling
the first loop.

Bootstrap and regtested on ppc64-redhat-linux.

OK for mainline?

Thanks,
Revital

Changelog:

* modulo-sched.c (sms_schedule): Avoid unfreed memory when SMS fails.

Index: modulo-sched.c
===
--- modulo-sched.c  (revision 170464)
+++ modulo-sched.c  (working copy)
@@ -1177,7 +1177,6 @@ sms_schedule (void)
  fprintf (dump_file, HOST_WIDEST_INT_PRINT_DEC, trip_count);
  fprintf (dump_file, )\n);
}
- continue;
}
   else
{


[PATCH, SMS] Free sccs field

2011-04-15 Thread Revital Eres
Hello,

The attached patch adds missing free operation for storage
allocated while calculating SCCs.

Bootstrap and regtested on ppc64-redhat-linux.

OK for mainline?

Thanks,
Revital

Changelog:

* ddg.c (free_ddg_all_sccs): Free sccs field in struct ddg_all_sccs.

Index: ddg.c
===
--- ddg.c   (revision 171573)
+++ ddg.c   (working copy)
@@ -1011,6 +1082,8 @@ free_ddg_all_sccs (ddg_all_sccs_ptr all_
   for (i = 0; i  all_sccs-num_sccs; i++)
 free_scc (all_sccs-sccs[i]);

+  if (all_sccs-sccs)
+free (all_sccs-sccs);
   free (all_sccs);
 }


Re: [PATCH 2/4] Handle calls to ancestor objects in IPA-CP devirtualization

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Martin Jambor wrote:

 Hi,
 
 early inlining can create virtual calls based on the part of an object
 that represents an ancestor.  This patch makes ipa-prop analysis able
 to recognize such calls and store the required offset along with such
 calls (the field is already there for similar purposes of indirect
 inlining).  The constant propagation is then made aware of the offset
 field and takes it into account when looking up the proper BINFO.
 
 Bootstrapped and tested on x86_64-linux. OK for trunk?
 
 Thanks,
 
 Martin
 
 
 
 2011-04-13  Martin Jambor  mjam...@suse.cz
 
   * ipa-cp.c (ipcp_process_devirtualization_opportunities): Take into
   account anc_offset and otr_type from the indirect edge info.
   * ipa-prop.c (get_ancestor_addr_info): New function.
   (compute_complex_ancestor_jump_func): Assignment analysis moved to
   get_ancestor_addr_info, call it.
   (ipa_note_param_call): Do not initialize information about polymorphic
   calls, return the indirect call graph edge.  Remove the last
   parameter, adjust all callers.
   (ipa_analyze_virtual_call_uses): Process also calls to ancestors of
   parameters.  Initialize polymorphic information in the indirect edge.
 
   * testsuite/g++.dg/ipa/devirt-7.C: New test.
 
 
 Index: src/gcc/ipa-cp.c
 ===
 --- src.orig/gcc/ipa-cp.c
 +++ src/gcc/ipa-cp.c
 @@ -1246,8 +1246,8 @@ ipcp_process_devirtualization_opportunit
for (ie = node-indirect_calls; ie; ie = next_ie)
  {
int param_index, types_count, j;
 -  HOST_WIDE_INT token;
 -  tree target, delta;
 +  HOST_WIDE_INT token, anc_offset;
 +  tree target, delta, otr_type;
  
next_ie = ie-next_callee;
if (!ie-indirect_info-polymorphic)
 @@ -1259,14 +1259,23 @@ ipcp_process_devirtualization_opportunit
   continue;
  
token = ie-indirect_info-otr_token;
 +  anc_offset = ie-indirect_info-anc_offset;
 +  otr_type = ie-indirect_info-otr_type;
target = NULL_TREE;
types_count = VEC_length (tree, info-params[param_index].types);
for (j = 0; j  types_count; j++)
   {
 tree binfo = VEC_index (tree, info-params[param_index].types, j);
 -   tree d;
 -   tree t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
 +   tree d, t;
  
 +   binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
 +   if (!binfo)
 + {
 +   target = NULL_TREE;
 +   break;
 + }
 +
 +   t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
 if (!t)
   {
 target = NULL_TREE;
 Index: src/gcc/ipa-prop.c
 ===
 --- src.orig/gcc/ipa-prop.c
 +++ src/gcc/ipa-prop.c
 @@ -576,6 +576,49 @@ compute_complex_assign_jump_func (struct
  }
  }
  
 +/* Extract the base, offset and MEM_REF expression from a statement ASSIGN if
 +   it looks like:
 +
 +   iftmp.1_3 = obj_2(D)-D.1762;
 +
 +   The base of the MEM_REF must be a default definition SSA NAME of a
 +   parameter.  Return NULL_TREE if it looks otherwise.  If case of success, 
 the
 +   whole MEM_REF expression is returned and the offset calculated from any
 +   handled components and the MEM_REF itself is stored into *OFFSET.  The 
 whole
 +   RHS stripped off the ADDR_EXPR is stored into *OBJ_P.  */
 +
 +static tree
 +get_ancestor_addr_info (gimple assign, tree *obj_p, HOST_WIDE_INT *offset)
 +{
 +  HOST_WIDE_INT size, max_size;
 +  tree expr, parm, obj;
 +
 +  if (!gimple_assign_single_p (assign))
 +return NULL_TREE;
 +  expr = gimple_assign_rhs1 (assign);
 +
 +  if (TREE_CODE (expr) != ADDR_EXPR)
 +return NULL_TREE;
 +  expr = TREE_OPERAND (expr, 0);
 +  obj = expr;
 +  expr = get_ref_base_and_extent (expr, offset, size, max_size);
 +
 +  if (TREE_CODE (expr) != MEM_REF
 +  /* If this is a varying address, punt.  */
 +  || max_size == -1
 +  || max_size != size)
 +return NULL_TREE;
 +  parm = TREE_OPERAND (expr, 0);
 +  if (TREE_CODE (parm) != SSA_NAME
 +  || !SSA_NAME_IS_DEFAULT_DEF (parm)

Might be an uninitialized variable, so also check
TREE_CODE (SSA_NAME_VAR (parm)) == PARM_DECL?

 +  || *offset  0)

Check this above where you check max_size/size.

 +return NULL_TREE;
 +
 +  *offset += mem_ref_offset (expr).low * BITS_PER_UNIT;

At some point it might be worth switching to
get_addr_base_and_unit_offsets and not use bit but unit offsets
throughout the code.

 +  *obj_p = obj;
 +  return expr;
 +}
 +
  
  /* Given that an actual argument is an SSA_NAME that is a result of a phi
 statement PHI, try to find out whether NAME is in fact a
 @@ -603,7 +646,7 @@ compute_complex_ancestor_jump_func (stru
   struct ipa_jump_func *jfunc,
   gimple call, gimple phi)
  {
 -  HOST_WIDE_INT offset, size, max_size;
 +  

Re: [PATCH 3/4] Simple relaxation of dynamic type change detection routine

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Martin Jambor wrote:

 Hi,
 
 in order to speed up astar, I had to persuade the function that
 decides whether a statement potentially modifies the dynamic type of
 an object by storing a new value to the VMT pointer to consider the
 following statement harmless (all types are integers of some sort):
 
 MEM[(i32 *)b2arp_3(D) + 8B] = 0;
 
 I'd like to experiment with this routine a bit more once I have some
 other IPA-CP infrastructure in place but at the moment I opted for a
 simple solution:  All scalar non-pointer stores are deemed safe.
 
 VMT pointer is a compiler generated field which is a pointer so legal
 user code is not able to store stuff there through some fancy type
 casts and compiler generated code should have no reason whatsoever to
 that either.  Therefore I believe this change is safe and useful.
 
 I have bootstrapped and tested the patch on x886_64-linux.  OK for
 trunk?

I think this should be only done for -fstrict-aliasing.

Ok with that change.
Richard.

 Thanks,
 
 Martin
 
 
 
 2011-04-14  Martin Jambor  mjam...@suse.cz
 
   * ipa-prop.c (stmt_may_be_vtbl_ptr_store): Return false for scalar
   non-pointer assignments.
 
 Index: src/gcc/ipa-prop.c
 ===
 --- src.orig/gcc/ipa-prop.c
 +++ src/gcc/ipa-prop.c
 @@ -405,13 +405,18 @@ stmt_may_be_vtbl_ptr_store (gimple stmt)
  {
tree lhs = gimple_assign_lhs (stmt);
  
 -  if (TREE_CODE (lhs) == COMPONENT_REF
 -!DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1))
 -!AGGREGATE_TYPE_P (TREE_TYPE (lhs)))
 +  if (!AGGREGATE_TYPE_P (TREE_TYPE (lhs)))
 + {
 +   if (!POINTER_TYPE_P (TREE_TYPE (lhs)))
   return false;
 -  /* In the future we might want to use get_base_ref_and_offset to find
 -  if there is a field corresponding to the offset and if so, proceed
 -  almost like if it was a component ref.  */
 +
 +   if (TREE_CODE (lhs) == COMPONENT_REF
 +!DECL_VIRTUAL_P (TREE_OPERAND (lhs, 1)))
 + return false;
 +   /* In the future we might want to use get_base_ref_and_offset to find
 +  if there is a field corresponding to the offset and if so, proceed
 +  almost like if it was a component ref.  */
 + }
  }
return true;
  }
 
 

-- 
Richard Guenther rguent...@suse.de
Novell / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746 - GF: Markus Rex


Re: [PATCH 4/4] Devirtualization based on global objects

2011-04-15 Thread Richard Guenther
On Fri, 15 Apr 2011, Martin Jambor wrote:

 Hi,
 
 this is the patch that actually speeds up astar (together with the
 previous one).  It implements devirtualization based on global
 objects.  In the past we have refrained from doing this because in
 general it is difficult to say whether the object is currently being
 constructed and so it might have a dynamic type of one of its
 ancestors.  However, if the object's class does not have any ancestors
 that obviously cannot happen.
 
 Devirtualizing in such conditions is enough to change a virtual call
 to regwayobj::isaddtobound in 473.astar to a direct one which can and
 should be inlined.  That seemed a good justification to implement this
 and so the patch below does so and brings about 3.1% speedup for that
 benchmark with LTO.
 
 I acknowledge that instead of discarding all classes with ancestors it
 would be better to check that the called virtual method has the same
 implementation in all ancestors instead.  That is perhaps something
 for later.
 
 It took me surprisingly long to realize that this technique can be
 used for folding virtual calls based on local automatically allocated
 objexts too and so can be used to un-XFAIL g++.dg/opt/devirt1.c that
 regressed in 4.6.
 
 Bootstrapped and tested on x86_64-linux.  OK for trunk?
 
 Thanks,
 
 Martin
 
 
 2011-04-15  Martin Jambor  mjam...@suse.cz
 
   * ipa-cp.c (ipcp_process_devirtualization_opportunities): Devirtualize
   also according to actual contants.
   * gimple-fold.c (gimple_extract_devirt_binfo_from_cst): New function.
   (gimple_fold_obj_type_ref_call): New function.
   (gimple_fold_call): Call gimple_fold_obj_type_ref_call on
   OBJ_TYPE_REFs.
   * gimple.h (gimple_extract_devirt_binfo_from_cst): Declare.
 
   * testsuite/g++.dg/opt/devirt1.C: Bump to -O2, remove XFAIL.
   * testsuite/g++.dg/opt/devirt2.C: New test.
   * testsuite/g++.dg/ipa/devirt-g-1.C: Likewise.
 
 
 Index: src/gcc/ipa-cp.c
 ===
 --- src.orig/gcc/ipa-cp.c
 +++ src/gcc/ipa-cp.c
 @@ -1245,51 +1245,71 @@ ipcp_process_devirtualization_opportunit
  
for (ie = node-indirect_calls; ie; ie = next_ie)
  {
 -  int param_index, types_count, j;
 +  int param_index;
HOST_WIDE_INT token, anc_offset;
tree target, delta, otr_type;
 +  struct ipcp_lattice *lat;
  
next_ie = ie-next_callee;
if (!ie-indirect_info-polymorphic)
   continue;
param_index = ie-indirect_info-param_index;
 -  if (param_index == -1
 -   || ipa_param_cannot_devirtualize_p (info, param_index)
 -   || ipa_param_types_vec_empty (info, param_index))
 +  if (param_index == -1)
   continue;
  
 +  lat = ipcp_get_lattice (info, param_index);
token = ie-indirect_info-otr_token;
anc_offset = ie-indirect_info-anc_offset;
otr_type = ie-indirect_info-otr_type;
target = NULL_TREE;
 -  types_count = VEC_length (tree, info-params[param_index].types);
 -  for (j = 0; j  types_count; j++)
 +  if (lat-type == IPA_CONST_VALUE)
   {
 -   tree binfo = VEC_index (tree, info-params[param_index].types, j);
 -   tree d, t;
 -
 +   tree binfo = gimple_extract_devirt_binfo_from_cst (lat-constant);
 +   if (!binfo)
 + continue;
 binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
 if (!binfo)
 - {
 -   target = NULL_TREE;
 -   break;
 - }
 + continue;
 +   target = gimple_get_virt_mehtod_for_binfo (token, binfo, delta,
 +  false);
 + }
 +  else
 + {
 +   int  types_count, j;
  
 -   t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
 -   if (!t)
 - {
 -   target = NULL_TREE;
 -   break;
 - }
 -   else if (!target)
 - {
 -   target = t;
 -   delta = d;
 - }
 -   else if (target != t || !tree_int_cst_equal (delta, d))
 +   if (ipa_param_cannot_devirtualize_p (info, param_index)
 +   || ipa_param_types_vec_empty (info, param_index))
 + continue;
 +
 +   types_count = VEC_length (tree, info-params[param_index].types);
 +   for (j = 0; j  types_count; j++)
   {
 -   target = NULL_TREE;
 -   break;
 +   tree binfo = VEC_index (tree, info-params[param_index].types, j);
 +   tree d, t;
 +
 +   binfo = get_binfo_at_offset (binfo, anc_offset, otr_type);
 +   if (!binfo)
 + {
 +   target = NULL_TREE;
 +   break;
 + }
 +
 +   t = gimple_get_virt_mehtod_for_binfo (token, binfo, d, true);
 +   if (!t)
 + {
 +   target = NULL_TREE;
 +   break;
 + }
 +   else if (!target)
 + {
 +   target = t;
 +

Re: [PATCH, SMS] Avoid unfreed memory when SMS fails

2011-04-15 Thread Richard Guenther
On Fri, Apr 15, 2011 at 5:26 PM, Revital Eres revital.e...@linaro.org wrote:
 Hello,

 This patch fixes the scenario where SMS fails to
 schedule a loop and continue to the next one without
 freeing data structures allocated while scheduling
 the first loop.

 Bootstrap and regtested on ppc64-redhat-linux.

 OK for mainline?

Ok.

Thanks,
Richard.

 Thanks,
 Revital

 Changelog:

 * modulo-sched.c (sms_schedule): Avoid unfreed memory when SMS fails.

 Index: modulo-sched.c
 ===
 --- modulo-sched.c      (revision 170464)
 +++ modulo-sched.c      (working copy)
 @@ -1177,7 +1177,6 @@ sms_schedule (void)
              fprintf (dump_file, HOST_WIDEST_INT_PRINT_DEC, trip_count);
              fprintf (dump_file, )\n);
            }
 -         continue;
        }
       else
        {



Re: [PATCH, SMS] Free sccs field

2011-04-15 Thread Richard Guenther
On Fri, Apr 15, 2011 at 5:27 PM, Revital Eres revital.e...@linaro.org wrote:
 Hello,

 The attached patch adds missing free operation for storage
 allocated while calculating SCCs.

 Bootstrap and regtested on ppc64-redhat-linux.

 OK for mainline?

Ok.

Thanks,
Richard.

 Thanks,
 Revital

 Changelog:

        * ddg.c (free_ddg_all_sccs): Free sccs field in struct ddg_all_sccs.

 Index: ddg.c
 ===
 --- ddg.c       (revision 171573)
 +++ ddg.c       (working copy)
 @@ -1011,6 +1082,8 @@ free_ddg_all_sccs (ddg_all_sccs_ptr all_
   for (i = 0; i  all_sccs-num_sccs; i++)
     free_scc (all_sccs-sccs[i]);

 +  if (all_sccs-sccs)
 +    free (all_sccs-sccs);
   free (all_sccs);
  }



Re: [PATCH 3/6] Allow jumps in epilogues

2011-04-15 Thread Bernd Schmidt
On 04/13/2011 02:38 PM, Bernd Schmidt wrote:
 This still requires the i386 output_set_got which I think I can cope
 with [...]

Patch below, to be applied on top of all the others. Only lightly tested
so far beyond standard (fairly useless) regression tests, by comparing
generated assembly before/after, for -fpic -march=pentium and core2, for
i686-linux and i686-apple-darwin10 (for TARGET_MACHO).

I've not found or managed to create a testcase for making MI thunks
generate a call to output_set_got. I think it should work, but it's not
tested.


Bernd
* config/i386/i386.c (output_set_got): Don't call
dwarf2out_flush_queued_reg_saves.
(ix86_reorg): Split set_got patterns.
(i386_dwarf_handle_frame_unspec,
i386_dwarf_flush_queued_register_saves): New static functions.
(TARGET_DWARF_HANDLE_FRAME_UNSPEC,
TARGET_DWARF_FLUSH_QUEUED_REGISTER_SAVES): New macros.
* config/i386/i386.md (set_got_call, set_got_pop, set_got_add):
New patterns.
* dwarf2out.c (scan_until_barrier): Use the target's
dwarf_flush_queued_register_saves hook.
* target.def (dwarf_flush_queued_register_saves): New hook.
* doc/tm.texi: Regenerate.
Index: gcc/config/i386/i386.c
===
--- gcc.orig/config/i386/i386.c
+++ gcc/config/i386/i386.c
@@ -8953,6 +8953,8 @@ output_set_got (rtx dest, rtx label ATTR
output_asm_insn (mov%z0\t{%2, %0|%0, %2}, xops);
   else
{
+ /* For normal functions, this pattern is split, but we can still
+get here for thunks.  */
  output_asm_insn (call\t%a2, xops);
 #ifdef DWARF2_UNWIND_INFO
  /* The call to next label acts as a push.  */
@@ -9010,12 +9012,6 @@ output_set_got (rtx dest, rtx label ATTR
   get_pc_thunk_name (name, REGNO (dest));
   pic_labels_used |= 1  REGNO (dest);
 
-#ifdef DWARF2_UNWIND_INFO
-  /* Ensure all queued register saves are flushed before the
-call.  */
-  if (dwarf2out_do_frame ())
-   dwarf2out_flush_queued_reg_saves ();
-#endif
   xops[2] = gen_rtx_SYMBOL_REF (Pmode, ggc_strdup (name));
   xops[2] = gen_rtx_MEM (QImode, xops[2]);
   output_asm_insn (call\t%X2, xops);
@@ -30458,6 +30454,56 @@ ix86_reorg (void)
  with old MDEP_REORGS that are not CFG based.  Recompute it now.  */
   compute_bb_for_insn ();
 
+  /* Split any set_got patterns so that we interact correctly with
+ dwarf2out.  */
+  if (!TARGET_64BIT  !TARGET_VXWORKS_RTP  !TARGET_DEEP_BRANCH_PREDICTION
+   flag_pic)
+{
+  rtx insn, next;
+  for (insn = get_insns (); insn; insn = next)
+   {
+ rtx pat, label, dest, cst, gotsym, new_insn;
+ int icode;
+
+ next = NEXT_INSN (insn);
+ if (!NONDEBUG_INSN_P (insn))
+   continue;
+
+ icode = recog_memoized (insn);
+ if (icode != CODE_FOR_set_got  icode != CODE_FOR_set_got_labelled)
+   continue;
+
+ extract_insn (insn);
+ if (icode == CODE_FOR_set_got)
+   {
+ label = gen_label_rtx ();
+ cst = const0_rtx;
+   }
+ else
+   {
+ label = recog_data.operand[1];
+ cst = const1_rtx;
+   }
+
+ dest = recog_data.operand[0];
+ pat = gen_set_got_call (label, cst);
+ new_insn = emit_insn_before (pat, insn);
+ RTX_FRAME_RELATED_P (new_insn) = 1;
+ RTX_FRAME_RELATED_P (XVECEXP (PATTERN (new_insn), 0, 1)) = 1;
+ gotsym = gen_rtx_SYMBOL_REF (Pmode, GOT_SYMBOL_NAME);
+ pat = gen_set_got_pop (dest, label);
+ new_insn = emit_insn_before (pat, insn);
+ RTX_FRAME_RELATED_P (new_insn) = 1;
+ RTX_FRAME_RELATED_P (XVECEXP (PATTERN (new_insn), 0, 1)) = 1;
+ if (!TARGET_MACHO)
+   {
+ pat = gen_set_got_add (dest, gotsym, label);
+ new_insn = emit_insn_before (pat, insn);
+   }
+ delete_insn (insn);
+   }
+}
+
   if (optimize  optimize_function_for_speed_p (cfun))
 {
   if (TARGET_PAD_SHORT_FUNCTION)
@@ -30475,6 +30521,30 @@ ix86_reorg (void)
 move_or_delete_vzeroupper ();
 }
 
+/* Handle the TARGET_DWARF_HANDLE_FRAME_UNSPEC hook.
+   This is called from dwarf2out.c to emit call frame instructions
+   for frame-related insns containing UNSPECs and UNSPEC_VOLATILEs.  */
+static void
+i386_dwarf_handle_frame_unspec (rtx pattern ATTRIBUTE_UNUSED,
+   int index ATTRIBUTE_UNUSED)
+{
+  gcc_assert (index == UNSPEC_SET_GOT);
+}
+
+/* Handle the TARGET_DWARF_FLUSH_QUEUED_REGISTER_SAVES hook.
+   This is called from dwarf2out.c to decide whether all queued
+   register saves should be emitted before INSN.  */
+static bool
+i386_dwarf_flush_queued_register_saves (rtx insn)
+{
+  if (!TARGET_VXWORKS_RTP || !flag_pic)
+{
+  int icode = recog_memoized (insn);
+  return (icode == 

[patch, libgfortran] PR48589 Invalid G0/G0.d editing for NaN/infinity

2011-04-15 Thread Jerry DeLisle

Hi,

I plan to commit the following simple and obvious patch.

Regression tested on x86-64-linux-gnu.

I will apply the test case in the PR to the testsuite.

Regards,

Jerry

2011-04-15  Jerry DeLisle  jvdeli...@gcc.gnu.org

PR libgfortran/48589
* io/write_float.def (write_infnan): Set width properly for G0.

Index: write_float.def
===
--- write_float.def (revision 172467)
+++ write_float.def (working copy)
@@ -654,11 +654,11 @@ write_infnan (st_parameter_dt *dtp, const fnode *f
   mark = (sign == S_PLUS || sign == S_MINUS) ? 8 : 7;

   nb =  f-u.real.w;
-
+
   /* If the field width is zero, the processor must select a width
 not zero.  4 is chosen to allow output of '-Inf' or '+Inf' */

-  if (nb == 0)
+  if ((nb == 0) || dtp-u.p.g0_no_blanks)
{
  if (isnan_flag)
nb = 3;



Re: patch pings

2011-04-15 Thread Bernd Schmidt
On 04/15/2011 04:18 PM, Jeff Law wrote:

 http://gcc.gnu.org/ml/gcc-patches/2011-03/msg02247.html

I don't know. I sympathize with the goal, but I'm not too happy about
the structure of this patch. Doesn't this do the scan once for every
reload in an insn? It seems to me like the loop (or rather, a function
call to it) should at least be placed at the top of choose_reload_regs
rather than in allocate_reload_reg.

In my tests (i686-linux, compiling my standard set of testcases) the
patch appears to have, in general, very little effect on code quality.
It might on average be slightly better, but I've also seen several cases
where we do worse. If we go to so much effort to do scanning of
subsequent insns it ought to be possible to do better.

AFAICT the patch ignores whether the pseudo that's being reloaded will
be reloaded again in the current ebb - if not, it should get a bad
spill register, where bad in this case only includes hard regs that
don't currently hold a spilled pseudo. Likewise, once we've reloaded the
last occurrence of a pseudo in an ebb, we should forget the
reg_reloaded_contents to make sure we consider the spill reg good again.
However, I'm not sure whether changes such as these alone will reduce
the number of cases where the patch regresses.


Bernd


Re: FDO usability patch -- cfg and lineno checksum

2011-04-15 Thread Xinliang David Li
Resent in plain text mode ..

David

On Fri, Apr 15, 2011 at 9:28 AM, Xinliang David Li davi...@google.com wrote:

 Honza, do you have a chance to take a look at it?
 Thanks,

 David

 On Wed, Apr 13, 2011 at 3:25 PM, Xinliang David Li davi...@google.com wrote:

 Hi,  in current FDO implementation, the source file version used in
 profile-generate needs to strictly match the version used in
 profile-use -- simple formating changes will invalidate the profile
 data (use of it will lead to compiler error or ICE). There are two
 main problems that lead to the weakness:

 1) the function checksum is computed based on line number and number
 of basic blocks -- this means any line number change will invalidate
 the profile data. In fact, line number should play very minimal role
 in profile matching decision. Another problem is that number of basic
 blocks is also not a good indicator whether CFG matches or not.
 2) cgraph's pid is used as the key to find the matching profile data
 for a function. If there is any new function added, or if the order of
 the functions changes, the profile data is invalidated.

 The attached is a patch from google that improves this.
 1) function checksum is split into two parts: lineno checksum and cfg 
 checksum
 2) function assembler name is used in profile hashing.

 Bootstrapped and regression tested on x86_64/linux.

 Ok for trunk?

 Thanks,

 David

  2011-04-13  Xinliang David Li  davi...@google.com

        * tree.c (crc32_byte): New function.
        * tree.h (crc32_byte): New function.
        * gcov.c (function_info): Add new fields.
        (read_graph_file): Handle new fields.
        (read_count_file): Handle name.
        * gcov-io.c (gcov_string_length): New function.
        (gcov_write_words): Bug fix.
        (gcov_read_words): Bug fix.
        * gcov-io.h: New interfaces.
        * profile.c (get_exec_counts): New parameter.
        (compute_branch_probablilities): New parameter.
        (compute_value_histogram): New parameter.
        (branch_prob): Pass cfg_checksum.
        * coverage.c (get_const_string_type): New function.
        (hash_counts_entry_hash): Use string hash.
        (hash_counts_entry_eq): Use string compare.
        (htab_counts_entry_del): Delete name.
        (read_count_file): Add handling of cfg checksum.
        (get_coverage_counts): New parameter.
        (xstrdup_mask_random): New function.
        (coverage_compute_lineno_checksum): New function.
        (coverage_compute_cfg_checksum): New function.
        (coverage_begin_output): New parameter.
        (coverage_end_function): New parameter.
        (build_fn_info_type): Build new fields.
        (build_fn_info_value): Build new field values.
        * coverage.h: Interface changes.
        * libgcov.c (gcov_exit): Dump new fields.
        * gcov_dump.c (tag_function): Print new fields.



Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Mike Stump
On Apr 15, 2011, at 3:44 AM, Eric Botcazou ebotca...@adacore.com wrote:
 And make IN_COND a bool instead of an int?
 
 I had the same initial reaction but then came to the same conclusion as Maxim.

If it can be bool, it should be bool.


Re: [PATCH, SMS] Free sccs field

2011-04-15 Thread Revital Eres
Hello,

On 15 April 2011 18:53, Nathan Froyd froy...@codesourcery.com wrote:
 On Fri, Apr 15, 2011 at 06:27:05PM +0300, Revital Eres wrote:
 +  if (all_sccs-sccs)
 +    free (all_sccs-sccs);

 No need to check for non-NULL prior to free'ing.

OK, I'll commit the patch without the check then.
(after re-testing)

Thanks,
Revital



 -Nathan




Re: [build] Allow building libobjc_gc on Tru64 UNIX, Darwin

2011-04-15 Thread Rainer Orth
Nicola,

 Ok for mainline if both pass?

 Yes.

 [and by the way, I think you're fixing PR libobjc/32037 in the process. :-)]

indeed, thanks for checking.  Testing revealed that I'd been lazy with
quoting.  I'm including the patch I've installed, which encloses
OBJC_BOEHM_GC in configure.ac in single quotes.

 Btw., it would be considerably easier if --enable-libobjc-gc could be
 enabled automatically if boehm-gc is configured.

 Yes.

I've filed

libobjc/48626   --enable-objc-gc should be automatic

 Besides, it seems that libobjc_gc isn't tested anywhere.

 Yes.

and

libobjc/48627   libobjc_gc should be tested

just in case.

Rainer


2011-04-13  Rainer Orth  r...@cebitec.uni-bielefeld.de

PR libobjc/32037
* Makefile.in (OBJC_GCFLAGS): Move ...
* configure.ac (enable_objc_gc): ... here.
Add $(libsuffix) to OBJC_BOEHM_GC.
* configure: Regenerate.

diff --git a/libobjc/Makefile.in b/libobjc/Makefile.in
--- a/libobjc/Makefile.in
+++ b/libobjc/Makefile.in
@@ -93,7 +93,7 @@ LIBTOOL_INSTALL = $(LIBTOOL) --mode=inst
 LIBTOOL_CLEAN   = $(LIBTOOL) --mode=clean
 #LIBTOOL_UNINSTALL = $(LIBTOOL) --mode=uninstall
 
-OBJC_GCFLAGS=-DOBJC_WITH_GC=1
+OBJC_GCFLAGS=@OBJC_GCFLAGS@
 OBJC_BOEHM_GC=@OBJC_BOEHM_GC@
 OBJC_BOEHM_GC_INCLUDES=@OBJC_BOEHM_GC_INCLUDES@
 OBJC_BOEHM_GC_LIBS=../boehm-gc/libgcjgc_convenience.la $(thread_libs_and_flags)
diff --git a/libobjc/configure.ac b/libobjc/configure.ac
--- a/libobjc/configure.ac
+++ b/libobjc/configure.ac
@@ -1,6 +1,6 @@
 # Process this file with autoconf to produce a configure script.
 #   Copyright 1994, 1995, 1996, 1997, 1998, 1999, 2001, 2002, 2003, 2004
-#   2005, 2006, 2009 Free Software Foundation, Inc.
+#   2005, 2006, 2009, 2011 Free Software Foundation, Inc.
 #   Originally contributed by Dave Love (d.l...@dl.ac.uk).
 #
 #This file is part of GCC.
@@ -63,15 +63,25 @@ AC_ARG_ENABLE(objc-gc,
   the GNU Objective-C runtime.],
 [case $enable_objc_gc in
   no)
+OBJC_GCFLAGS=''
 OBJC_BOEHM_GC=''
 OBJC_BOEHM_GC_INCLUDES=''
 ;;
   *)
-OBJC_BOEHM_GC=libobjc_gc.la
+OBJC_GCFLAGS='-DOBJC_WITH_GC=1'
+OBJC_BOEHM_GC='libobjc_gc$(libsuffix).la'
 OBJC_BOEHM_GC_INCLUDES='-I$(top_srcdir)/../boehm-gc/include 
-I../boehm-gc/include'
+case ${host} in
+  alpha*-dec-osf*)
+# boehm-gc headers include pthread.h, which needs to be compiled
+   # with -pthread on Tru64 UNIX.
+OBJC_GCFLAGS=${OBJC_GCFLAGS} -pthread
+   ;;
+esac
 ;;
 esac],
-[OBJC_BOEHM_GC=''; OBJC_BOEHM_GC_INCLUDES=''])
+[OBJC_GCFLAGS=''; OBJC_BOEHM_GC=''; OBJC_BOEHM_GC_INCLUDES=''])
+AC_SUBST(OBJC_GCFLAGS)
 AC_SUBST(OBJC_BOEHM_GC)
 AC_SUBST(OBJC_BOEHM_GC_INCLUDES)
 

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[v3] Handle NOTY in extract_symvers.pl

2011-04-15 Thread Rainer Orth
While testing the close-to-final version of my COMDAT-group-with-Sun as
patch

[build, c++, lto] Support COMDAT group with Sun as (PR target/40483)
http://gcc.gnu.org/ml/gcc-patches/2010-05/msg01365.html
http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00600.html

which only awaits the release of a linker with the required fixes, I
noticed that extract_symvers.pl doesn't handly NOTY/NOTYPE entries, but
chokes instead, whereas the readelf based version handles them just
fine.  Although the final patch won't create them anymore, I'm
installing the following patch that makes the script more robust.  It
was tested while putting the finishing touches on the as-comdat patch.

Installed on mainline.

Rainer


2011-04-04  Rainer Orth  r...@cebitec.uni-bielefeld.de

* scripts/extract_symvers.pl: Handle NOTY.

diff --git a/libstdc++-v3/scripts/extract_symvers.pl 
b/libstdc++-v3/scripts/extract_symvers.pl
--- a/libstdc++-v3/scripts/extract_symvers.pl
+++ b/libstdc++-v3/scripts/extract_symvers.pl
@@ -1,6 +1,6 @@
 #!/usr/bin/perl -w
 
-# Copyright (C) 2010 Free Software Foundation, Inc.
+# Copyright (C) 2010, 2011 Free Software Foundation, Inc.
 #
 # This file is part of the GNU ISO C++ Library.  This library is free
 # software; you can redistribute it and/or modify it under the
@@ -108,6 +108,7 @@ while (ELFDUMP) {
 die unhandled symbol:\n$_ if ($bind !~ /^(GLOB|WEAK)/ or $oth !~ /[DP]/);
 
 # Adapt to readelf type naming convention.
+$type = NOTYPE if ($type eq NOTY);
 $type = OBJECT if ($type eq OBJT);
 
 # Use correct symbol type.
@@ -116,7 +117,7 @@ while (ELFDUMP) {
 close ELFDUMP or die elfdump error;
 
 foreach $symbol (keys %type) {
-if ($type{$symbol} eq FUNC) {
+if ($type{$symbol} eq FUNC || $type{$symbol} eq NOTYPE) {
push @lines, $type{$symbol}:$symbol\@\@$version{$symbol}\n;
 } elsif ($type{$symbol} eq OBJECT and $size{$symbol} == 0) {
push @lines, $type{$symbol}:$size{$symbol}:$version{$symbol}\n;


-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


ObjC: rewritten checks for duplicate instance variables (for improved speed and improved error messages)

2011-04-15 Thread Nicola Pero
This patch rewrites the check for duplicate instance variables
done by the Objective-C (and Objective-C++) compiler.

The new code has many advantages:

 * it avoids copying all the instance variables into a temporary
   tree chain in the way that the all code was doing (according
   to some notes from benchmarks and profiling I did a few months
   ago, this should save about 5k allocations per compilation
   when compiling a typical ObjC GNUstep file, giving an approx 0.25%
   order of magnitude speedup with -fsyntax-only);

 * it provides better error messages by providing the informative
   message previous declaration of ... when a duplicate declaration
   is found;

 * it avoids producing duplicate errors if a class with duplicate
   instance variables is subclassed.  The existing code always built
   a copy of the full list of instance variables for the class (including
   superclass ones) and then would look for duplicates.  This means if
   you have a class with a duplicate, and subclass it 10 times, you'll get
   the same error message 11 times!  The new code avoids this mistake,
   which also reduces the number of checks required (for the standard case
   without a hashtable, I'll explain in a minute).

 * it removes the special, ad-hoc field duplicate check implementation for
   ObjC++ and has it use the same code as the ObjC codebase.

The new code does the checks for duplicates directly, by just iterating
and comparing, unless the number of checks to do is very large, in which
case it falls back to a hashtable.  This is similar to what the C code
does for structs, but here the check is more specific as we are only
checking the instance variables in the current class against the other
instance variables in the class and the superclass, so it's all more
specific and optimized for the situation.

Updated existing testcases (to check for the new previous declaration of ...
messages), and added more testcases.

Ok to commit ?

Thanks

Index: c-family/c-objc.h
===
--- c-family/c-objc.h   (revision 172444)
+++ c-family/c-objc.h   (working copy)
@@ -62,7 +62,7 @@ extern tree objc_build_string_object (tree);
 extern tree objc_get_protocol_qualified_type (tree, tree);
 extern tree objc_get_class_reference (tree);
 extern tree objc_get_class_ivars (tree);
-extern tree objc_get_interface_ivars (tree);
+extern bool objc_detect_field_duplicates (bool);
 extern void objc_start_class_interface (tree, tree, tree, tree);
 extern void objc_start_category_interface (tree, tree, tree, tree);
 extern void objc_start_protocol (tree, tree, tree);
Index: c-family/ChangeLog
===
--- c-family/ChangeLog  (revision 172444)
+++ c-family/ChangeLog  (working copy)
@@ -1,3 +1,9 @@
+2011-04-15  Nicola Pero  nicola.p...@meta-innovation.com
+
+   * c-objc.h (objc_get_interface_ivars): Removed.
+   (objc_detect_field_duplicates): New.
+   * stub-objc.c: Likewise.
+   
 2011-04-14  Nicola Pero  nicola.p...@meta-innovation.com
 
* stub-objc.c (objc_declare_protocols): Renamed to
Index: c-family/stub-objc.c
===
--- c-family/stub-objc.c(revision 172444)
+++ c-family/stub-objc.c(working copy)
@@ -275,10 +275,10 @@ objc_get_class_reference (tree ARG_UNUSED (name))
   return 0;
 }
 
-tree
-objc_get_interface_ivars (tree ARG_UNUSED (fieldlist))
+bool
+objc_detect_field_duplicates (bool ARG_UNUSED (check_superclasses_only))
 {
-  return 0;
+  return false;
 }
 
 tree
Index: objc/ChangeLog
===
--- objc/ChangeLog  (revision 172444)
+++ objc/ChangeLog  (working copy)
@@ -1,3 +1,10 @@
+2011-04-15  Nicola Pero  nicola.p...@meta-innovation.com
+
+   * objc-act.c (objc_get_interface_ivars): Removed.
+   (objc_detect_field_duplicates): New.
+   (hash_instance_variable): New.
+   (eq_instance_variable): New.
+   
 2011-04-14  Nicola Pero  nicola.p...@meta-innovation.com
 
* objc-act.c (objc_declare_protocols): Renamed to
Index: objc/objc-act.c
===
--- objc/objc-act.c (revision 172444)
+++ objc/objc-act.c (working copy)
@@ -3813,6 +3813,8 @@ lookup_interface (tree ident)
   }
 }
 
+
+
 /* Implement @defs (classname) within struct bodies.  */
 
 tree
@@ -3829,19 +3831,242 @@ objc_get_class_ivars (tree class_name)
   return error_mark_node;
 }
 
+
+/* Functions used by the hashtable for field duplicates in
+   objc_detect_field_duplicates().  Ideally, we'd use a standard
+   key-value dictionary hashtable , and store as keys the field names,
+   and as values the actual declarations (used to print nice error
+   messages with the locations).  But, the hashtable we are using only
+   allows us to store keys in the hashtable, without values (it looks
+ 

[PATCH, PR 48601] cgraph_get_create_node in emultls.c

2011-04-15 Thread Martin Jambor
Hi,

lower_emutls_function_body in emultls.c should use
cgraph_get_create_node to create call graph nodes if need be because
it is introducing TLS builtin decls into IL.

I have bootstrapped and tested this on x86_64-linux (where the bug
does not happen) and in bugzilla it has been confirmed it fixes the
issue.  It is rather straight-forward and was pre-approved on IRC by
Honza so I am about to commit it now.  Testcases are already in the
testsuite.

Thanks,

Martin



2011-04-15  Martin Jambor  mjam...@suse.cz

PR middle-end/48601
* tree-emutls.c (lower_emutls_function_body): Call
cgraph_get_create_node instead of cgraph_get_node.  Do not assert the
result is non-NULL.

Index: src/gcc/tree-emutls.c
===
--- src.orig/gcc/tree-emutls.c
+++ src/gcc/tree-emutls.c
@@ -619,8 +619,9 @@ lower_emutls_function_body (struct cgrap
 
   d.cfun_node = node;
   d.builtin_decl = built_in_decls[BUILT_IN_EMUTLS_GET_ADDRESS];
-  d.builtin_node = cgraph_get_node (d.builtin_decl);
-  gcc_checking_assert (d.builtin_node);
+  /* This is where we introduce the declaration to the IL and so we have to
+ create a node for it.  */
+  d.builtin_node = cgraph_get_create_node (d.builtin_decl);
 
   FOR_EACH_BB (d.bb)
 {


Re: [PATCH] Improve combining of conditionals

2011-04-15 Thread Eric Botcazou
 If it can be bool, it should be bool.

Rather basic principle IMO.  Consistency is of much greater value.

-- 
Eric Botcazou


Re: [Patch,AVR]: FIX ICE in optabs due to bad rotate expander.

2011-04-15 Thread Georg-Johann Lay
Denis Chertykov schrieb:
 2011/4/14 Georg-Johann Lay a...@gjlay.de:
 Denis Chertykov schrieb:
 2011/4/14 Georg-Johann Lay a...@gjlay.de:
 The rotlmode3 expanders (mode \in {HI,SI,DI}) violates synopsis by
 using 4 operands instead of 3. This runs in ICE in top of
 optabs.c:maybe_gen_insn.

 The right way to do this is to use match_dup, not match_operand. So
 the fix is obvious.

 Regenerated avr-gcc and run against avr testsuite:

 gcc.target/avr/torture/pr41885.c generates these patterns

 Johann

 2011-04-14  Georg-Johann Lay  a...@gjlay.de

* config/avr/avr.md (rotlhi3, rotlsi3, rotldi3): Fix
expanders operand 3 from match_operand to match_dup.
 May be better to use (clobber (match_scratch ...)) here and in
 `*rotwmode' and `*rotbmode'.
 These are splitters that might split before reload, producing strange,
 non-matching insns like
  (set (scratch:QI) ...
 or crashing already in avr_rotate_bytes becase the split condition reads
  (reload_completed || MODEmode == DImode)
 
 Generally I'm agree with you, change match_operand to match_dup is easy.

I already submitted the easy patch to avoid the ICE.

 But IMHO the right way is:
  - the splitters and avr_rotate_bytes are wrong and must be fixed too.
  - operands[3] is a scratch register and right way is to use match_scratch.
 
 I can approve the patch.
 
 Denis.

Ok, here is the right-way patch.

The expander generates a SCRATCH instead of a pseudo, and in case of a
pre-reload split the SCRATCH is replaced with a pseudo because it is
not known if or if not the scratch will actually be needed.

avr_rotate_bytes now skips operations on non-REG 'scratch'.
Furthermore, I added an assertion that ensures that 'scratch' is
actually a REG when it is needed.

Besides that, I fixed indentation. I know it is not optimal to fix
indentation alongside with functionality... did it anyway :-)

Finally, I exposed alternative #3 of the insns to the register
allocator, because it is not possible to distinguish between
overlapping or non-overlapping regs, and #3 does not need a scratch.

Ran C-testsuite with no regressions.

Johann

2011-04-15  Georg-Johann Lay  a...@gjlay.de

* config/avr/avr.md (rotlmode3): Use SCRATCH instead
of REG in operand 3. Fix indentation. Unquote C snippet.
(*rotwmode,*rotbwmode): Ditto. Replace scratch
with pseudo for pre-reload splits. Use const_int_operand for
operand 2. Expose alternative 3 to register allocator.
* config/avr/avr.c (avr_rotate_bytes): Handle SCRATCH in
operands[3]. Use GNU indentation style.

Index: config/avr/avr.md
===
--- config/avr/avr.md	(Revision 172493)
+++ config/avr/avr.md	(Arbeitskopie)
@@ -1519,22 +1519,22 @@ (define_mode_attr rotsmode [(DI QI) (S
 
 (define_expand rotlmode3
   [(parallel [(set (match_operand:HIDI 0 register_operand )
-		   (rotate:HIDI (match_operand:HIDI 1 register_operand )
-(match_operand:VOID 2 const_int_operand )))
-		(clobber (match_dup 3))])]
+   (rotate:HIDI (match_operand:HIDI 1 register_operand )
+(match_operand:VOID 2 const_int_operand )))
+  (clobber (match_dup 3))])]
   
-  
-{
-  if (CONST_INT_P (operands[2])  0 == (INTVAL (operands[2]) % 8))
   {
-  if (AVR_HAVE_MOVW  0 == INTVAL (operands[2]) % 16)
-operands[3] = gen_reg_rtx (rotsmodemode);
-  else
-operands[3] = gen_reg_rtx (QImode);
-  }
-  else
-FAIL;
-})
+if (CONST_INT_P (operands[2])
+ 0 == INTVAL (operands[2]) % 8)
+  {
+if (AVR_HAVE_MOVW  0 == INTVAL (operands[2]) % 16)
+  operands[3] = gen_rtx_SCRATCH (rotsmodemode);
+else
+  operands[3] = gen_rtx_SCRATCH (QImode);
+  }
+else
+  FAIL;
+  })
 
 
 ;; Overlapping non-HImode registers often (but not always) need a scratch.
@@ -1545,35 +1545,49 @@ (define_expand rotlmode3
 
 ; Split word aligned rotates using scratch that is mode dependent.
 (define_insn_and_split *rotwmode
-  [(set (match_operand:HIDI 0 register_operand =r,r,#r)
-	(rotate:HIDI (match_operand:HIDI 1 register_operand 0,r,r)
-		 (match_operand 2 immediate_operand n,n,n)))
-   (clobber (match_operand:rotsmode 3 register_operand  =rotx ))]
-  (CONST_INT_P (operands[2]) 
- (0 == (INTVAL (operands[2]) % 16)  AVR_HAVE_MOVW))
+  [(set (match_operand:HIDI 0 register_operand  =r,r,r)
+(rotate:HIDI (match_operand:HIDI 1 register_operand  0,r,r)
+ (match_operand 2 const_int_operand  n,n,n)))
+   (clobber (match_scratch:rotsmode 3   =rotx))]
+  AVR_HAVE_MOVW
+0 == INTVAL (operands[2]) % 16
   #
(reload_completed || MODEmode == DImode)
   [(const_int 0)]
-  avr_rotate_bytes (operands);
-  DONE;
-)
+  {
+/* Split happens in .split1: Cook up a pseudo */
+if (SCRATCH == GET_CODE (operands[3])
+ !reload_completed)
+  {
+operands[3] = gen_reg_rtx (GET_MODE 

[wwwdocs] Shuffle entries in the footer and remove margin at the bottom

2011-04-15 Thread Gerald Pfeifer
This concludes this mini project. :-)

Committed and the pages on gcc.gnu.org regenerated; www.gnu.org will
follow over night.

Gerald


Index: style.mhtml
===
RCS file: /cvs/gcc/wwwdocs/htdocs/style.mhtml,v
retrieving revision 1.115
diff -u -r1.115 style.mhtml
--- style.mhtml 12 Apr 2011 21:11:23 -  1.115
+++ style.mhtml 15 Apr 2011 17:22:30 -
@@ -232,14 +232,10 @@
 
 div class=copyright
 
-p style=margin-top:0;These pages are
-a href=http://gcc.gnu.org/about.html;maintained by the GCC team/a.
-Last modified date::format-time -MM-DD!-- IGNORE DIFF
---a href=http://validator.w3.org/check/referer;./a/p
-
-addressFor questions related to the use of GCC, please consult these web
-pages and the a href=http://gcc.gnu.org/onlinedocs/;GCC manuals/a. If
-that fails, the a href=mailto:gcc-h...@gcc.gnu.org;gcc-h...@gcc.gnu.org/a
+address style=margin-top:0;For questions related to the use of GCC,
+please consult these web pages and the
+a href=http://gcc.gnu.org/onlinedocs/;GCC manuals/a. If that fails,
+the a href=mailto:gcc-h...@gcc.gnu.org;gcc-h...@gcc.gnu.org/a
 mailing list might help.
 Comments on these web pages and the development of GCC are welcome on our
 developer list at a href=mailto:g...@gcc.gnu.org;g...@gcc.gnu.org/a.
@@ -252,6 +248,11 @@
 Verbatim copying and distribution of this entire article is
 permitted in any medium, provided this notice is preserved./p
 
+p style=margin-bottom:0;These pages are
+a href=http://gcc.gnu.org/about.html;maintained by the GCC team/a.
+Last modified date::format-time -MM-DD!-- IGNORE DIFF
+--a href=http://validator.w3.org/check/referer;./a/p
+
 /div
 
 !--  --



[pph] Clean positive tests. (issue4423044)

2011-04-15 Thread Lawrence Crowl

This patch cleans up positive tests.  First, stop on first failure so
as to avoid littering the logfile.  Second, change the extensions of
the generated assembly files to .s-pph and .s+pph to be clear on the
provenence of each file.


Index: gcc/testsuite/ChangeLog.pph

2011-04-15  Lawrence Crowl  cr...@google.com

* lib/dg-pph.exp (dg-pph-pos): Stop on first failure.  Change names of
assembly files to be clear on which is with and without PPH.


Index: gcc/testsuite/lib/dg-pph.exp
===
--- gcc/testsuite/lib/dg-pph.exp(revision 172457)
+++ gcc/testsuite/lib/dg-pph.exp(working copy)
@@ -74,13 +74,13 @@ proc dg-pph-pos { subdir test options ma
 if { $have_errs } {
verbose -log regular compilation failed
fail $nshort $options, regular compilation failed
-return
+   return
 }
 
 if { ! [ file_on_host exists $bname.s ] } {
-   verbose -log assembly file '$bname.s' missing
-   fail $nshort $options, assembly comparison
-return
+   verbose -log regular assembly file '$bname.s' missing
+   fail $nshort $options, regular assembly missing
+   return
 }
 
 # Rename the .s file into .s-pph to compare it after the second build.
@@ -89,19 +89,33 @@ proc dg-pph-pos { subdir test options ma
 
 # Compile a second time using the pph files.
 dg-test -keep-output $test $options $mapflag -I. 
-remote_upload host $bname.s
+
+if { $have_errs } {
+   verbose -log PPH compilation failed
+   fail $nshort $options, PPH compilation failed
+   return
+}
+
+if { ! [ file_on_host exists $bname.s ] } {
+   verbose -log PPH assembly file '$bname.s' missing
+   fail $nshort $options, PPH assembly missing
+   return
+}
+
+# Rename the .s file into .s+pph to compare it.
+remote_upload host $bname.s $bname.s+pph
+remote_download host $bname.s+pph
 
 # Compare the two assembly files.  They should be identical.
-set tmp [ diff $bname.s $bname.s-pph ]
+set tmp [ diff $bname.s-pph $bname.s+pph ]
 if { $tmp == 0 } {
-verbose -log assembly file '$bname.s', '$bname.s-pph' comparison 
error
-fail $nshort $options assembly comparison
+   verbose -log assembly file '$bname.s-pph', '$bname.s+pph' comparison 
error
+   fail $nshort $options assembly comparison
 } elseif { $tmp == 1 } {
-pass $nshort $options assembly comparison
+   pass $nshort $options assembly comparison
+   file_on_host delete $bname.s-pph
+   file_on_host delete $bname.s+pph
 } else {
-fail $nshort $options assembly comparison
+   fail $nshort $options assembly comparison
 }
-file_on_host delete $bname.s
-file_on_host delete $bname.s-pph
-
 }

--
This patch is available for review at http://codereview.appspot.com/4423044


Re: [PATCH] sel-sched: Fix erroneous re-use of pointer to last insn in a BB (PR 48235)

2011-04-15 Thread Steve Ellcey
Vladimir and Dmitry,

The new test you added, gcc.dg/pr48235.c, is failing on IA64 because
it gets this excess message:

cc1: note: -freorder-blocks-and-partition does not work on this architecture

Could you modify the test to either filter out this message or not 
use the -freorder-blocks-and-partition option in IA64 or not run the
test at all on IA64.  I notice that gcc.dg/tree-prof/pr45354.c looks
like this test but also has:

/* { dg-require-effective-target freorder } */

That test does not fail, or run, on IA64 due to this dg option.

Steve Ellcey
s...@cup.hp.com


ObjC: get rid of another unnecessary, temporary copy of instance variables

2011-04-15 Thread Nicola Pero
This patch for the Objective-C compiler gets rid of another case where
we'd create a temporary tree chain with a copy of all the instance variables
of a class just to do a simple check on them.

The check is the one to check the scope of the instance variable that is
being accessed; without this patch, the check requires copying all the
instance variables for the class on a temporary tree chain.  With this
patch, the check is done (as you'd expect) on the instance variables
themselves, without copying them.

This is obviously faster; on the other hand, the typical GNUstep-based ObjC file
has classes with few instance variables (maybe 10 or 20) and only accesses 
instance
variables up to a few hundred times per file, so the speedup is only of the 
order
of 0.1% to 0.2% even with -fsyntax-only.  Of course, in unusual cases (ie, 
classes
with many instance variables, and a big file accessing instance variables lots 
of
times) the speedup may be bigger.

Ok to commit ?

Thanks

Index: ChangeLog
===
--- ChangeLog   (revision 172511)
+++ ChangeLog   (working copy)
@@ -1,5 +1,10 @@
 2011-04-15  Nicola Pero  nicola.p...@meta-innovation.com
 
+   * objc-act.c (ivar_of_class): New.
+   (objc_is_public): Use ivar_of_class.
+
+2011-04-15  Nicola Pero  nicola.p...@meta-innovation.com
+
* objc-act.c (objc_get_interface_ivars): Removed.
(objc_detect_field_duplicates): New.
(hash_instance_variable): New.
Index: objc-act.c
===
--- objc-act.c  (revision 172511)
+++ objc-act.c  (working copy)
@@ -6367,6 +6367,35 @@ is_private (tree decl)
DECL_NAME (decl)));
 }
 
+/* Searches all the instance variables of 'klass' and of its
+   superclasses for an instance variable whose name (identifier) is
+   'ivar_name_ident'.  Return the declaration (DECL) of the instance
+   variable, if found, or NULL_TREE, if not found.  */
+static inline tree
+ivar_of_class (tree klass, tree ivar_name_ident)
+{
+  /* First, look up the ivar in CLASS_RAW_IVARS.  */
+  tree decl_chain = CLASS_RAW_IVARS (klass);
+
+  for ( ; decl_chain; decl_chain = DECL_CHAIN (decl_chain))
+if (DECL_NAME (decl_chain) == ivar_name_ident)
+  return decl_chain;
+
+  /* If not found, search up the class hierarchy.  */
+  while (CLASS_SUPER_NAME (klass))
+{
+  klass = lookup_interface (CLASS_SUPER_NAME (klass));
+  
+  decl_chain = CLASS_RAW_IVARS (klass);
+  
+  for ( ; decl_chain; decl_chain = DECL_CHAIN (decl_chain))
+   if (DECL_NAME (decl_chain) == ivar_name_ident)
+ return decl_chain;
+}
+
+  return NULL_TREE;
+}
+
 /* We have an instance variable reference;, check to see if it is public.  */
 
 int
@@ -6397,7 +6426,7 @@ objc_is_public (tree expr, tree identifier)
  return 0;
}
 
- if ((decl = is_ivar (get_class_ivars (klass, true), identifier)))
+ if ((decl = ivar_of_class (klass, identifier)))
{
  if (TREE_PUBLIC (decl))
return 1;




FDO usability trivial patch to fix div by zero error with insane profile

2011-04-15 Thread Xinliang David Li
This is a trivial patch to deal with bad profile data (from
multi-threaded programs) that leads to divide by zero error.

Ok for trunk?

Thanks,

David



2011-04-15  Xinliang David Li  davi...@google.com

* ipa-inline.c (cgraph_decide_recursive_inlining): Fix
div by zero error with insane profile.
Index: ipa-inline.c
===
--- ipa-inline.c	(revision 172510)
+++ ipa-inline.c	(working copy)
@@ -779,7 +779,7 @@ cgraph_decide_recursive_inlining (struct
 		fprintf (dump_file,Not inlining cold call\n);
 	  continue;
 	}
-  if (curr-count * 100 / node-count  probability)
+  if (node-count == 0 || curr-count * 100 / node-count  probability)
 	{
 	  if (dump_file)
 		fprintf (dump_file,


Re: FDO usability trivial patch to fix div by zero error with insane profile

2011-04-15 Thread Diego Novillo
On Fri, Apr 15, 2011 at 15:20, Xinliang David Li davi...@google.com wrote:
 This is a trivial patch to deal with bad profile data (from
 multi-threaded programs) that leads to divide by zero error.

 Ok for trunk?

 Thanks,

 David



 2011-04-15  Xinliang David Li  davi...@google.com

        * ipa-inline.c (cgraph_decide_recursive_inlining): Fix
        div by zero error with insane profile.

Looks fine to me, but you really want Honza to review this.  Any
chance we can get a test case for this?


Diego.


[PATCH] Prefer simplify_gen_* over gen_* in expand_debug_expr

2011-04-15 Thread Jakub Jelinek
Hi!

On IRC richi mentioned yesterday that when gcc.dg/guality/asm-1.c
testcase is compiled with g++ instead of gcc, it fails.
Apparently that is because of slightly different ARRAY_REF
in the code, which for g++ leads into
(sign_extend:DI (zero_extend:SI (something:HI)))
in DEBUG_INSN (as opposed to
(zero_extend:DI (something:HI))
with C FE) and var-tracking doesn't happen to match it in some
places.  The above is something simplify-rtx.c can fix up though.

This patch fixes it by using simplify_gen_* instead of gen_* in many
more places than it did until now in expand_debug_expr.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2011-04-15  Jakub Jelinek  ja...@redhat.com

* cfgexpand.c (expand_debug_expr): Use
simplify_gen_{unary,binary,ternary} instead of gen_rtx_*.

--- gcc/cfgexpand.c.jj  2011-04-09 19:29:19.081421040 +0200
+++ gcc/cfgexpand.c 2011-04-14 18:52:51.354402038 +0200
@@ -2364,6 +2364,7 @@ expand_debug_expr (tree exp)
 {
   rtx op0 = NULL_RTX, op1 = NULL_RTX, op2 = NULL_RTX;
   enum machine_mode mode = TYPE_MODE (TREE_TYPE (exp));
+  enum machine_mode inner_mode = VOIDmode;
   int unsignedp = TYPE_UNSIGNED (TREE_TYPE (exp));
   addr_space_t as;
 
@@ -2410,6 +2411,7 @@ expand_debug_expr (tree exp)
 
 unary:
 case tcc_unary:
+  inner_mode = TYPE_MODE (TREE_TYPE (TREE_OPERAND (exp, 0)));
   op0 = expand_debug_expr (TREE_OPERAND (exp, 0));
   if (!op0)
return NULL_RTX;
@@ -2513,7 +2515,7 @@ expand_debug_expr (tree exp)
 case NOP_EXPR:
 case CONVERT_EXPR:
   {
-   enum machine_mode inner_mode = GET_MODE (op0);
+   inner_mode = GET_MODE (op0);
 
if (mode == inner_mode)
  return op0;
@@ -2560,9 +2562,9 @@ expand_debug_expr (tree exp)
else if (TREE_CODE_CLASS (TREE_CODE (exp)) == tcc_unary
 ? TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp, 0)))
 : unsignedp)
- op0 = gen_rtx_ZERO_EXTEND (mode, op0);
+ op0 = simplify_gen_unary (ZERO_EXTEND, mode, op0, inner_mode);
else
- op0 = gen_rtx_SIGN_EXTEND (mode, op0);
+ op0 = simplify_gen_unary (SIGN_EXTEND, mode, op0, inner_mode);
 
return op0;
   }
@@ -2697,7 +2699,8 @@ expand_debug_expr (tree exp)
/* Don't use offset_address here, we don't need a
   recognizable address, and we don't want to generate
   code.  */
-   op0 = gen_rtx_MEM (mode, gen_rtx_PLUS (addrmode, op0, op1));
+   op0 = gen_rtx_MEM (mode, simplify_gen_binary (PLUS, addrmode,
+ op0, op1));
  }
 
if (MEM_P (op0))
@@ -2770,25 +2773,23 @@ expand_debug_expr (tree exp)
   }
 
 case ABS_EXPR:
-  return gen_rtx_ABS (mode, op0);
+  return simplify_gen_unary (ABS, mode, op0, mode);
 
 case NEGATE_EXPR:
-  return gen_rtx_NEG (mode, op0);
+  return simplify_gen_unary (NEG, mode, op0, mode);
 
 case BIT_NOT_EXPR:
-  return gen_rtx_NOT (mode, op0);
+  return simplify_gen_unary (NOT, mode, op0, mode);
 
 case FLOAT_EXPR:
-  if (unsignedp)
-   return gen_rtx_UNSIGNED_FLOAT (mode, op0);
-  else
-   return gen_rtx_FLOAT (mode, op0);
+  return simplify_gen_unary (TYPE_UNSIGNED (TREE_TYPE (TREE_OPERAND (exp,
+0)))
+? UNSIGNED_FLOAT : FLOAT, mode, op0,
+inner_mode);
 
 case FIX_TRUNC_EXPR:
-  if (unsignedp)
-   return gen_rtx_UNSIGNED_FIX (mode, op0);
-  else
-   return gen_rtx_FIX (mode, op0);
+  return simplify_gen_unary (unsignedp ? UNSIGNED_FIX : FIX, mode, op0,
+inner_mode);
 
 case POINTER_PLUS_EXPR:
   /* For the rare target where pointers are not the same size as
@@ -2799,161 +2800,164 @@ expand_debug_expr (tree exp)
   GET_MODE (op0) != GET_MODE (op1))
{
  if (GET_MODE_BITSIZE (GET_MODE (op0))  GET_MODE_BITSIZE (GET_MODE 
(op1)))
-   op1 = gen_rtx_TRUNCATE (GET_MODE (op0), op1);
+   op1 = simplify_gen_unary (TRUNCATE, GET_MODE (op0), op1,
+ GET_MODE (op1));
  else
/* We always sign-extend, regardless of the signedness of
   the operand, because the operand is always unsigned
   here even if the original C expression is signed.  */
-   op1 = gen_rtx_SIGN_EXTEND (GET_MODE (op0), op1);
+   op1 = simplify_gen_unary (SIGN_EXTEND, GET_MODE (op0), op1,
+ GET_MODE (op1));
}
   /* Fall through.  */
 case PLUS_EXPR:
-  return gen_rtx_PLUS (mode, op0, op1);
+  return simplify_gen_binary (PLUS, mode, op0, op1);
 
 case MINUS_EXPR:
-  return gen_rtx_MINUS (mode, op0, op1);
+  return simplify_gen_binary (MINUS, mode, op0, op1);
 
 

Re: FDO usability trivial patch to fix div by zero error with insane profile

2011-04-15 Thread Xinliang David Li
There is no way to get a test case that can produce the problem in a
deterministic way.  There is really not much for Honza to review
though :)

David

On Fri, Apr 15, 2011 at 12:52 PM, Diego Novillo dnovi...@google.com wrote:
 On Fri, Apr 15, 2011 at 15:20, Xinliang David Li davi...@google.com wrote:
 This is a trivial patch to deal with bad profile data (from
 multi-threaded programs) that leads to divide by zero error.

 Ok for trunk?

 Thanks,

 David



 2011-04-15  Xinliang David Li  davi...@google.com

        * ipa-inline.c (cgraph_decide_recursive_inlining): Fix
        div by zero error with insane profile.

 Looks fine to me, but you really want Honza to review this.  Any
 chance we can get a test case for this?


 Diego.



Re: ObjC: get rid of another unnecessary, temporary copy of instance variables

2011-04-15 Thread Mike Stump
On Apr 15, 2011, at 11:52 AM, Nicola Pero wrote:
 This patch for the Objective-C compiler gets rid of another case where
 we'd create a temporary tree chain with a copy of all the instance variables
 of a class just to do a simple check on them.

 Ok to commit ?

Ok.


Re: FDO usability trivial patch to fix div by zero error with insane profile

2011-04-15 Thread Diego Novillo
On Fri, Apr 15, 2011 at 16:21, Xinliang David Li davi...@google.com wrote:
 There is no way to get a test case that can produce the problem in a
 deterministic way.  There is really not much for Honza to review
 though :)

The check may be needed somewhere else, you may be papering over the
real issue.  I don't think you are but I don't know the code well
enough to say for sure.


Diego.


Re: [PATCH, rs6000 committed] Fix PowerPC bootstrap

2011-04-15 Thread Pat Haugen

On 04/12/2011 08:22 PM, Alan Modra wrote:

On Tue, Apr 12, 2011 at 04:00:45PM -0500, Pat Haugen wrote:

  --- gcc/config/rs6000/rs6000.c  (revision 172327)
  +++ gcc/config/rs6000/rs6000.c  (working copy)
  @@ -7976,7 +7976,7 @@ call_ABI_of_interest (tree fndecl)
   return true;

  /* Interesting functions that we are emitting in this object file.  
*/
  -  c_node = cgraph_node (fndecl);
  +  c_node = cgraph_get_create_node (fndecl);
  return !cgraph_only_called_directly_p (c_node);
}
  return false;

I think we should use cgraph_get_node here.


OK, fixed with following.


2011-04-15  Pat Haugen pthau...@us.ibm.com

* config/rs6000/rs6000.c (call_ABI_of_interest): Call
cgraph_get_node instead of cgraph_get_create_node.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 172498)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -7976,7 +7976,7 @@ call_ABI_of_interest (tree fndecl)
return true;

   /* Interesting functions that we are emitting in this object file.  */
-  c_node = cgraph_get_create_node (fndecl);
+  c_node = cgraph_get_node (fndecl);
   return !cgraph_only_called_directly_p (c_node);
 }
   return false;


go patch committed: Add compound_statement to backend interface

2011-04-15 Thread Ian Lance Taylor
This patch to the Go frontends adds a convenience function
compound_statement to build a statement list from two statements.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian


2011-04-15  Ian Lance Taylor  i...@google.com

* go-gcc.cc (Gcc_backend::compound_statement): New function.
(Gcc_backend::assignment_statement): Use error_statement.
(Gcc_backend::return_statement): Likewise.
(Gcc_backend::if_statement): Likewise.
(Gcc_backend::switch_statement): Likewise.
(Gcc_backend::statement_list): Likewise.


Index: gcc/go/go-gcc.cc
===
--- gcc/go/go-gcc.cc	(revision 172468)
+++ gcc/go/go-gcc.cc	(working copy)
@@ -191,6 +191,9 @@ class Gcc_backend : public Backend
 		   source_location);
 
   Bstatement*
+  compound_statement(Bstatement*, Bstatement*);
+
+  Bstatement*
   statement_list(const std::vectorBstatement*);
 
   // Labels.
@@ -244,7 +247,7 @@ Gcc_backend::assignment_statement(Bexpre
   tree lhs_tree = lhs-get_tree();
   tree rhs_tree = rhs-get_tree();
   if (lhs_tree == error_mark_node || rhs_tree == error_mark_node)
-return this-make_statement(error_mark_node);
+return this-error_statement();
   return this-make_statement(fold_build2_loc(location, MODIFY_EXPR,
 	  void_type_node,
 	  lhs_tree, rhs_tree));
@@ -259,10 +262,10 @@ Gcc_backend::return_statement(Bfunction*
 {
   tree fntree = bfunction-get_tree();
   if (fntree == error_mark_node)
-return this-make_statement(error_mark_node);
+return this-error_statement();
   tree result = DECL_RESULT(fntree);
   if (result == error_mark_node)
-return this-make_statement(error_mark_node);
+return this-error_statement();
   tree ret;
   if (vals.empty())
 ret = fold_build1_loc(location, RETURN_EXPR, void_type_node, NULL_TREE);
@@ -270,7 +273,7 @@ Gcc_backend::return_statement(Bfunction*
 {
   tree val = vals.front()-get_tree();
   if (val == error_mark_node)
-	return this-make_statement(error_mark_node);
+	return this-error_statement();
   tree set = fold_build2_loc(location, MODIFY_EXPR, void_type_node,
  result, vals.front()-get_tree());
   ret = fold_build1_loc(location, RETURN_EXPR, void_type_node, set);
@@ -294,7 +297,7 @@ Gcc_backend::return_statement(Bfunction*
  rettmp, field, NULL_TREE);
 	  tree val = (*p)-get_tree();
 	  if (val == error_mark_node)
-	return this-make_statement(error_mark_node);
+	return this-error_statement();
 	  tree set = fold_build2_loc(location, MODIFY_EXPR, void_type_node,
  ref, (*p)-get_tree());
 	  append_to_statement_list(set, stmt_list);
@@ -322,7 +325,7 @@ Gcc_backend::if_statement(Bexpression* c
   if (cond_tree == error_mark_node
   || then_tree == error_mark_node
   || else_tree == error_mark_node)
-return this-make_statement(error_mark_node);
+return this-error_statement();
   tree ret = build3_loc(location, COND_EXPR, void_type_node, cond_tree,
 			then_tree, else_tree);
   return this-make_statement(ret);
@@ -363,7 +366,7 @@ Gcc_backend::switch_statement(
 	{
 	  tree t = (*pcv)-get_tree();
 	  if (t == error_mark_node)
-		return this-make_statement(error_mark_node);
+		return this-error_statement();
 	  source_location loc = EXPR_LOCATION(t);
 	  tree label = create_artificial_label(loc);
 	  tree c = build3_loc(loc, CASE_LABEL_EXPR, void_type_node,
@@ -376,19 +379,36 @@ Gcc_backend::switch_statement(
 	{
 	  tree t = (*ps)-get_tree();
 	  if (t == error_mark_node)
-	return this-make_statement(error_mark_node);
+	return this-error_statement();
 	  append_to_statement_list(t, stmt_list);
 	}
 }
 
   tree tv = value-get_tree();
   if (tv == error_mark_node)
-return this-make_statement(error_mark_node);
+return this-error_statement();
   tree t = build3_loc(switch_location, SWITCH_EXPR, void_type_node,
 		  tv, stmt_list, NULL_TREE);
   return this-make_statement(t);
 }
 
+// Pair of statements.
+
+Bstatement*
+Gcc_backend::compound_statement(Bstatement* s1, Bstatement* s2)
+{
+  tree stmt_list = NULL_TREE;
+  tree t = s1-get_tree();
+  if (t == error_mark_node)
+return this-error_statement();
+  append_to_statement_list(t, stmt_list);
+  t = s2-get_tree();
+  if (t == error_mark_node)
+return this-error_statement();
+  append_to_statement_list(t, stmt_list);
+  return this-make_statement(stmt_list);
+}
+
 // List of statements.
 
 Bstatement*
@@ -401,7 +421,7 @@ Gcc_backend::statement_list(const std::v
 {
   tree t = (*p)-get_tree();
   if (t == error_mark_node)
-	return this-make_statement(error_mark_node);
+	return this-error_statement();
   append_to_statement_list(t, stmt_list);
 }
   return this-make_statement(stmt_list);
Index: gcc/go/gofrontend/statements.cc
===
--- gcc/go/gofrontend/statements.cc	(revision 

[Patch, Fortran] PR 48624 - fix DECL for external procedures with proc arguments

2011-04-15 Thread Tobias Burnus

Hi all,

the following patch fixes an issue with a multiple decl for procedures 
with procedure dummy arguments. Without the patch, multiple declarations 
are generated, which causes link failures (euler is optimized away) 
with -fwhole-program.  (Thanks goes to Richard for spotting the problem 
with gfortran.dg/cray_pointers_8.f90.)


I have no idea why there should be a problem with inlining (cf. deleted 
comment), but removing that part did not give any test suite failure 
(check-gfortran + libgomp's check). Additionally, I have build and run 
the Polyhedron testsuite with -march=native -ffast-math -funroll-loops 
-O3 -fno-protect-parens -finline-limit=400 -fwhole-program.


Build and tested on x86-64-linux.
OK for the trunk?

Tobias

PS: I have not created a test case; you think it should be added, one 
can copy gfortran.dg/cray_pointers_8.f90 and add a version with 
dg-options -fwhole-file.
2011-05-15  Tobias Burnus  bur...@net-b.de

	PR fortran/48624
	* trans-decl.c (gfc_get_extern_function_decl): Fix decl
	for external procedures with proc arguments.

diff --git a/gcc/fortran/trans-decl.c b/gcc/fortran/trans-decl.c
index 784dfc8..866720f 100644
--- a/gcc/fortran/trans-decl.c
+++ b/gcc/fortran/trans-decl.c
@@ -1527,7 +1527,6 @@ gfc_get_extern_function_decl (gfc_symbol * sym)
   tree name;
   tree mangled_name;
   gfc_gsymbol *gsym;
-  bool proc_formal_arg;
 
   if (sym-backend_decl)
 return sym-backend_decl;
@@ -1544,27 +1543,10 @@ gfc_get_extern_function_decl (gfc_symbol * sym)
  return the backend_decl.  */
   gsym =  gfc_find_gsymbol (gfc_gsym_root, sym-name);
 
-  /* Do not use procedures that have a procedure argument because this
- can result in problems of multiple decls during inlining.  */
-  proc_formal_arg = false;
-  if (gsym  gsym-ns  gsym-ns-proc_name)
-{
-  gfc_formal_arglist *formal = gsym-ns-proc_name-formal;
-  for (; formal; formal = formal-next)
-	{
-	  if (formal-sym  formal-sym-attr.flavor == FL_PROCEDURE)
-	{
-	  proc_formal_arg = true;
-	  break;
-	}
-	}
-}
-
   if (gfc_option.flag_whole_file
 	 (!sym-attr.use_assoc || sym-attr.if_source != IFSRC_DECL)
 	 !sym-backend_decl
 	 gsym  gsym-ns
-	 !proc_formal_arg
 	 ((gsym-type == GSYM_SUBROUTINE) || (gsym-type == GSYM_FUNCTION))
 	 (gsym-ns-proc_name-backend_decl || !sym-attr.intrinsic))
 {


Re: PATCH: PR middle-end/48440: [4.7 Regression] FAIL: gcc.c-torture/compile/labels-3.c

2011-04-15 Thread Steve Ellcey
I was curious if anyone was still looking at this problem?  I see this
on IA64 HP-UX in 32 bit mode where ptr_mode(SImode) != Pmode(DImode).
As H.J. points out, expand_expr_real_2 is calling simplify_gen_subreg
(expr.c, line 7366) and at that point op0 is (label_ref/v:DI 32) and
innermode is SImode.  Because innermode doesn't match the mode of op0,
we abort in simplify_subreg.  So it seems we either need to change how
we calculate innermode (so that it is DImode) or change expand_expr so
that it returns a SImode RTX expression for the tree treeop0.  I am not
sure which of these is the right thing to do.  The tree (treeop0) that
op0 is generated from is below and when we call expr_expr the modifier
is EXPAND_INITIALIZER.  Should expand_expr return an SImode expression
for this tree (ptr_mode) or a DImode expression (Pmode) in this case?

 addr_expr 65866560
type pointer_type 657d8960
type void_type 657d8900 void VOID
align 8 symtab 0 alias set -1 canonical type 657d8900
pointer_to_this pointer_type 657d8960
sizes-gimplified public unsigned SI
size integer_cst 657d1920 constant 32
unit size integer_cst 657d16c0 constant 4
align 32 symtab 0 alias set -1 canonical type 657d8960
pointer_to_this pointer_type 657f1720 reference_to_this 
reference_type 657f14e0
constant
arg 0 label_decl 657e21b8 l2 type void_type 657d8900 void
side-effects VOID file labels-3.c line 16 col 2
align 1 context function_decl 65858180 foo initial error_mark 
657e6040
(code_label/s 32 0 0 4 4 (l2) [2 uses])

chain var_decl 6585e3c0 p type pointer_type 657d8960
used unsigned SI file labels-3.c line 12 col 9 size integer_cst 
657d1920 32 unit size integer_cst 657d16c0 4
align 32 context function_decl 65858180 foo
(mem/f/c/i:SI (reg/f:DI 328 sfp) [0 p+0 S4 A32])
labels-3.c:11:44



Steve Ellcey
s...@cup.hp.com


Re: [Patch, Fortran] PR 48624 - fix DECL for external procedures with proc arguments

2011-04-15 Thread Steve Kargl
On Sat, Apr 16, 2011 at 12:30:01AM +0200, Tobias Burnus wrote:
 
 I have no idea why there should be a problem with inlining (cf. deleted 
 comment), but removing that part did not give any test suite failure 
 (check-gfortran + libgomp's check). Additionally, I have build and run 
 the Polyhedron testsuite with -march=native -ffast-math -funroll-loops 
 -O3 -fno-protect-parens -finline-limit=400 -fwhole-program.

The comment comes from r170414, which suggest there is
a problem if a procedure is in an argument list, and 
that procedure gets in-lined.

laptop:kargl[205] svn log -r 170414 trans-decl.c |more

r170414 | pault | 2011-02-22 12:33:45 -0800 (Tue, 22 Feb 2011) | 12 lines

2011-02-22  Paul Thomas  pa...@gcc.gnu.org

PR fortran/45743
* trans-decl.c (gfc_get_extern_function_decl): Don't use the
gsymbol backend_decl if the procedure has a formal argument
that is a procedure.

2011-02-22  Paul Thomas  pa...@gcc.gnu.org

PR fortran/45743
* gfortran.dg/whole_file_32.f90 : New test.

Perhaps, looking at the -fdump-tree-original on whole_file_32.f90
may shed light on the situation.


-- 
Steve


Re: [C++0x] Range-based for statements and ADL

2011-04-15 Thread Jason Merrill

Applied (with a few formatting tweaks).

Thanks,
Jason



[v3] libstdc++/48631

2011-04-15 Thread Paolo Carlini

Hi,

tested x86_64-linux, committed mainline and 4_6-branch.

Paolo.


2011-04-15  Daniel Krugler  daniel.krueg...@googlemail.com
Paolo Carlini  paolo.carl...@oracle.com

PR libstdc++/48631
* include/bits/unique_ptr.h (default_delete_Tp[]): Add deleted
function call operator.
* testsuite/20_util/default_delete/48631_neg.cc: New.
* testsuite/20_util/weak_ptr/comparison/cmp_neg.cc: Adjust dg-error
line numbers.
Index: include/bits/unique_ptr.h
===
--- include/bits/unique_ptr.h   (revision 172533)
+++ include/bits/unique_ptr.h   (working copy)
@@ -79,6 +79,8 @@
  can't delete pointer to incomplete type);
delete [] __ptr;
   }
+
+  templatetypename _Up void operator()(_Up*) const = delete;
 };
 
   /// 20.7.12.2 unique_ptr for single objects.
Index: testsuite/20_util/default_delete/48631_neg.cc
===
--- testsuite/20_util/default_delete/48631_neg.cc   (revision 0)
+++ testsuite/20_util/default_delete/48631_neg.cc   (revision 0)
@@ -0,0 +1,30 @@
+// { dg-options -std=gnu++0x  }
+// { dg-do compile }
+
+// Copyright (C) 2011 Free Software Foundation
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// http://www.gnu.org/licenses/.
+
+#include memory
+
+struct B { };
+struct D : B { };
+
+// libstdc++/48631
+D d;
+std::default_deleteB[] db;
+typedef decltype(db(d)) type; // { dg-error use of deleted function }
+// { dg-error declared here  { target *-*-* } 83 }
Index: testsuite/20_util/weak_ptr/comparison/cmp_neg.cc
===
--- testsuite/20_util/weak_ptr/comparison/cmp_neg.cc(revision 172533)
+++ testsuite/20_util/weak_ptr/comparison/cmp_neg.cc(working copy)
@@ -48,9 +48,9 @@
 // { dg-warning note  { target *-*-* } 1099 }
 // { dg-warning note  { target *-*-* } 1094 }
 // { dg-warning note  { target *-*-* } 1086 }
-// { dg-warning note  { target *-*-* } 483 }
-// { dg-warning note  { target *-*-* } 477 }
-// { dg-warning note  { target *-*-* } 467 }
+// { dg-warning note  { target *-*-* } 485 }
+// { dg-warning note  { target *-*-* } 479 }
+// { dg-warning note  { target *-*-* } 469 }
 // { dg-warning note  { target *-*-* } 587 }
 // { dg-warning note  { target *-*-* } 1056 }
 // { dg-warning note  { target *-*-* } 1050 }


Re: [PATCH, SMS] New flag to apply SMS when SC equals 1

2011-04-15 Thread Revital Eres
Hello,

 If it's for debugging, can you use a --parm instead (like
 modulo-sched-min-sc or similar)?


I think I can use --param for debugging purposes in this case.
(I might add modulo-sched-max-sc as well)

Thanks,
Revital

 Thanks,
 Richard.

 Thanks,
 Revital

 Changelog:

        * common.opt (fmodulo-sched-allow-sc-one): New flag.
        * modulo-sched.c (sms_schedule): Allow SMS when stage count
        equals one and -fmodulo-sched-allow-sc-one flag is set.




Re: [PATCH, ARM] PR47855 Compute attr length for some thumb2 insns, 2/3

2011-04-15 Thread Carrot Wei
Hi Richard

Thank you for the detailed explanation. It sounds like an inherent
difficulty of rtl passes. Then the only opportunity is ldrb/strb
instructions because they never affect cc registers.

thanks
Carrot

On Fri, Apr 15, 2011 at 9:34 PM, Richard Earnshaw rearn...@arm.com wrote:

 On Thu, 2011-04-14 at 21:19 +0800, Carrot Wei wrote:
 On Fri, Apr 8, 2011 at 6:51 PM, Ramana Radhakrishnan
 ramana.radhakrish...@linaro.org wrote:
  On 08/04/11 10:57, Carrot Wei wrote:
 
  Hi
 
  This is the second part of the fixing for
 
  http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
 
  This patch contains the length computation for insn patterns
  *arm_movqi_insn
  and *arm_addsi3. Since the alternatives and encodings are much more
  complex,
  the attribute length is computed in separate C functions.

  Sorry, no. This is potentially a maintenance pain. It hardcodes 
  alternatives
  from a pattern elsewhere in the C file. I don't like doing this unless we
  have to with the sync primitives or with push_multi. In this case I'm not
  convinced we need such functions in the .c file.
 
  Why can't we use the enabled attribute here with appropriate constraints
  for everything other than the memory cases (even there we might be able to
  invent some new constraints) ?
 
  Also a note about programming style. There are the helper macros like 
  REG_P,
  CONST_INT_P and MEM_P which remove the necessity for checks like
 
  GET_CODE (x) == y where y E { REG, CONST_INT, MEM}

 Hi Ramana

 As you suggested I created several new constraints, and use the
 enabled attribute to split the current alternatives in this new
 patch. It has been tested on arm qemu without regression.

 thanks
 Carrot


 Sorry, I don't think this approach can work.  Certainly not with the way
 the compiler currently works, and especially for mov and add insns.

 These instructions are only 2 bytes long if either:
 1) They clobber the condition code register or
 2) They occur inside an IT block.

 We can't tell either of these from the pattern, so you're
 underestimating the length of the instruction in some circumstances by
 claiming that they are only 2 bytes long.  That /will/ lead to broken
 code someday.

 We can't add potential clobbers to mov and add patterns because that
 will break reload which relies on these patterns being simple-set insns
 with no added baggage.  It *might* be possible to add clobbers to other
 operations, but that will then most-likely upset instruction scheduling
 (I think the scheduler treats two insns that clobber the same hard reg
 as being strongly ordered).  Putting in the clobber too early will
 certainly affect cond-exec generation.

 In short, I'm not aware of a simple way to address this problem so that
 we get accurate length information, but minimal impact on other passes
 in the compiler.

 R.