date:20130529

Re: [patch, mips] Update multilibs for mips-mti-* targets

2013-05-29 Thread Richard Sandiford

Steve Ellcey  sell...@imgtec.com writes:
 2013-05-28  Steve Ellcey  sell...@imgtec.com

   * config/mips/mti-linux.h (SYSROOT_SUFFIX_SPEC): Add micromips
   and mips16 directories.
   * config/mips/t-mti-linux (MULTILIB_OPTIONS): Add micromips and
   mips16.
   (MULTILIB_DIRNAMES): Ditto.
   (MULTILIB_EXCEPTIONS): Add new exceptions.
   * config/mips/t-mti-elf (MULTILIB_OPTIONS): Add micromips.
   (MULTILIB_DIRNAMES): Ditto.
   (MULTILIB_EXCEPTIONS): Add new exceptions.

OK, thanks.  Just one minor suggestion:

 @@ -32,4 +32,11 @@ MULTILIB_EXCEPTIONS += *mips32*/*mabi=64*
  # or mips64r2 but does specify mabi=64 is not allowed because that
  # would be defaulting to the mips32r2 architecture.
  MULTILIB_EXCEPTIONS += mabi=64*
 -MULTILIB_EXCEPTIONS += mips16/mabi=64*
 +MULTILIB_EXCEPTIONS += *mips16/mabi=64*
 +MULTILIB_EXCEPTIONS += *mmicromips/mabi=64*

The new lines here and:

  # The 64 bit ABI is not supported on the mips32r2 architecture.
  # Because mips32r2 is the default we can't use that flag to trigger
 -# the exception so we check for mabi=64 with no specific mips flag
 -# instead.
 +# the exception so we check for mabi=64 with no specific mips
 +# architecture flag instead.
  MULTILIB_EXCEPTIONS += mabi=64*
 +MULTILIB_EXCEPTIONS += *mips16/mabi=64*
 +MULTILIB_EXCEPTIONS += *mmicromips/mabi=64*

here aren't really doing what the comment says.  The * at the beginning
allows any intervening flags, including -mips64 and -mips64r2.  The lines
are correct for a different reason: we're excluding all 64-bit MIPS16
and microMIPS multilibs, even when a correct -mips option is present.
I think it would be clearer if you put these lines in the restrict
MIPS16 to 32-bit targets and restrict microMIPS to mips32r2 blocks
instead.  Changing the order like that shouldn't need a full retest.

Richard

Re: [PATCH][2/3] final try: Re-organize -fvect-cost-model, enable vectorization at -O2

2013-05-29 Thread Richard Biener

On Tue, 28 May 2013, David Edelsohn wrote:

 Richi,
 
 I would greatly appreciate if you would bootstrap and test this on
 ppc64-linux as well.

It's a no-op patch unless you specify -fvect-cost-model=cheap.
I'll do a bootstrap and regtest on ppc64-linux for your convenience
anyway.

I'll record an otherwise positive review for the change from you.

Thanks,
Richard.

Re: [PATCH 1/2] handwritten part of patch

2013-05-29 Thread Richard Biener

On Tue, May 28, 2013 at 8:04 PM, David Malcolm dmalc...@redhat.com wrote:
 On Mon, 2013-05-27 at 15:38 +0200, Richard Biener wrote:
 On Sat, May 25, 2013 at 3:02 PM, David Malcolm dmalc...@redhat.com wrote:
  Eliminate all direct references to cfun from macros in
  basic-block.h and introduce access methods to control_flow_graph
 
  * basic-block.h (control_flow_graph::get_basic_block_by_idx): New
  accessor methods.
  (control_flow_graph::get_entry_block): New method.
  (control_flow_graph::get_exit_block): New method.
  (control_flow_graph::get_basic_block_info): New methods.
  (control_flow_graph::get_n_basic_blocks): New methods.
  (control_flow_graph::get_n_edges): New methods.
  (control_flow_graph::get_last_basic_block): New methods.
  (control_flow_graph::get_label_to_block_map): New methods.
  (control_flow_graph::get_profile_status): New method.
  (control_flow_graph::set_profile_status): New method.
  (ENTRY_BLOCK_PTR): Eliminate this macro.
  (EXIT_BLOCK_PTR): Likewise.
  (basic_block_info): Likewise.
  (n_basic_blocks): Likewise.
  (n_edges): Likewise.
  (last_basic_block): Likewise.
  (label_to_block_map): Likewise.
  (profile_status): Likewise.
  (BASIC_BLOCK): Likewise.
  (SET_BASIC_BLOCK): Likewise.
  (FOR_EACH_BB_FN): Rewrite in terms of...
  (FOR_EACH_BB_CFG): New macro
  (FOR_EACH_BB): Eliminate this macro
  (FOR_EACH_BB_REVERSE_FN): Rewrite in terms of...
  (FOR_EACH_BB_REVERSE_FN_CFG): New macro
  (FOR_EACH_BB_REVERSE): Eliminate this macro
  (FOR_ALL_BB): Likewise.
  (FOR_ALL_BB_CFG): New macro

 I don't like the mix of _CFG / _FN.  It's obvious we are talking about
 the CFG of a function in 'FOR_EACH_BB' and friends.

 The current status quo is a pair of macros:

   #define FOO_FN(FN) ...do something on (FN)-cfg
   #define FOO() FOO_FN(cfun)

 My patch changed these to be:

   #define FOO_CFG(CFG) ...do something on CFG
   #define FOO_FN(FN) FOO_CFG((FN)-cfg)

 and to get rid of the FOO() with its cfun usage.

 So would your preference be for the FOO() to mean the cfg:

   #define FOO(CFG) ...do something on cfg
   #define FOO_FN(FN) FOO((FN)-cfg)

 ?

 e.g.

 #define FOR_EACH_BB(BB, CFG) \
   FOR_BB_BETWEEN (BB, (CFG)-x_entry_block_ptr-next_bb, 
 (CFG)-x_exit_block_ptr, next_bb)

 #define FOR_EACH_BB_FN(BB, FN) FOR_EACH_BB_CFG (BB, (FN)-cfg)

I suppose the CFG vs. FN distinction is because there is a
control-flow-graph sub-structure in a function (OTOH there is at most
a single CFG in a FN, so specifying FN is unambiguous).

Yes, the un-suffixed variant (FOO) should be the one with the natural
interface (taking a CFG).  People already know about the _FN variant
and if you change all existing uses of FOO (without cfg/fn argument)
then that would be the best change.

Ideally FN or CFG would be handled transparently via overloading
for example (or giving both struct function and struct cfg the same
set of member functions), so

inline basic_block entry_block_ptr (struct function *fn)
{
  return fn-cfg-x_entry_block_ptr;
}
inline basic_block entry_block_ptr (struct cfg *)
{
  return cfg-x_entry_block_ptr;
}

#define FOR_EACH_BB (BB, CONTEXT) FOR_BB_BETWEEN (BB, entry_block_ptr
(CONTEXT)-next_bb, exit_block_ptr (CONTEXT), next_bb)

and only retain a single macro/function form for context kinds that can
be unambiguously interchanged.

Or do people think that would be too confusing?  (I suppose defining an
automatic conversion from struct function * to struct cfg * is also techincally
possible, but that only works with the non-member-function way and would
avoid the overloading).

That said, I have no strong preference to remove the FOO_FN variants together
with this patch.  Using FOO with cfg arguments and FOO_FN with function
arguments is fine for me (eventually as intermediate step).

  ---
  diff --git a/gcc/basic-block.h b/gcc/basic-block.h
  index eed320c..3949417 100644
  --- a/gcc/basic-block.h
  +++ b/gcc/basic-block.h
  @@ -276,6 +276,57 @@ enum profile_status_d
  fields of this struct are interpreted as the defines for backward
  source compatibility following the definition of this struct.  */
   struct GTY(()) control_flow_graph {
  +public:
  +  basic_block get_basic_block_by_idx (int idx) const
  +  {
  +return (*x_basic_block_info)[idx];
  +  }

 get_basic_block_by_idx is rather long, any reason to not shorten
 it to get_block () or get_bb?

 Will do.   get_bb () then

  +  void set_basic_block_by_idx (int idx, basic_block bb)

 set_block () or set_bb()

 Will do.

  +  {
  +(*x_basic_block_info)[idx] = bb;
  +  }
  +
  +  basic_block get_entry_block () const { return x_entry_block_ptr; }
  +
  +  basic_block get_exit_block () const { return x_exit_block_ptr; }
  +
  +  vecbasic_block, va_gc *get_basic_block_info

Re: Fix PR 57442

2013-05-29 Thread Richard Biener

On Wed, May 29, 2013 at 3:35 AM, Easwaran Raman era...@google.com wrote:
 I made a thinko when I asserted gcc_unreachable outside the main loop
 of appears_later_in_bb in my previous fix to PR 57337. It is quite
 possible for STMT1 to be followed by a one or more statements with the
 same UID till the end of the BB without encountering STMT2. Right fix
 is to return STMT1 outside the loop. Ok for trunk?

Ok.

Thanks,
Richard.

 Thanks,
 Easwaran

 2013-05-28  Easwaran Raman  era...@google.com

 PR tree-optimization/57442
 * tree-ssa-reassoc.c (appears_later_in_bb): Return correct value
 when control exits the main loop.

 2013-05-28  Easwaran Raman  era...@google.com

 PR tree-optimization/57442
 * gcc.dg/tree-ssa/reassoc-30.c: New testcase.

 Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-30.c
 ===
 --- gcc/testsuite/gcc.dg/tree-ssa/reassoc-30.c (revision 0)
 +++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-30.c (revision 0)
 @@ -0,0 +1,13 @@
 +/* PR tree-optimization/57442 */
 +/* { dg-do compile } */
 +/* { dg-options -O1 } */
 +short a;
 +unsigned b;
 +long c;
 +int d;
 +
 +void f(void)
 +{
 +b = a ? : (a = b) - c + (d - (b + b));
 +}
 +
 Index: gcc/tree-ssa-reassoc.c
 ===
 --- gcc/tree-ssa-reassoc.c (revision 199385)
 +++ gcc/tree-ssa-reassoc.c (working copy)
 @@ -2888,7 +2888,7 @@ appears_later_in_bb (gimple stmt1, gimple stmt2)
else if (stmt == stmt2)
  return stmt2;
  }
 -  gcc_unreachable ();
 +  return stmt1;
  }

  /* Find the statement after which STMT must be moved so that the

[Patch, Fortran] Enable FINALization/poly dealloc for allocatables

2013-05-29 Thread Tobias Burnus


Dear all,

this patch enables finalization (and polymorphic deallocation) for 
allocatables for: end of scope, DEALLOCATE and intent(out).


As a side effect, an allocatable is no longer deallocated at the end of 
the main program. (Variables declared in the main program have 
automatically SAVE attribute; before finalization, it made no difference 
but with finalization it is detectable. And only finalizing 
nonfinalizable allocatables seems to be too much effort for too little 
gain.)


Note: This patch requires the following patch, which is still pending 
review:


* Enable wrapper generation, 
http://gcc.gnu.org/ml/fortran/2013-05/msg00073.html 
http://gcc.gnu.org/ml/fortran/2013-05/msg00093.html


Build and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias

PS: Fortran requires additional cases where finalization has to happen; 
those will be added in follow-up patches.
2013-05-29  Tobias Burnus  bur...@net-b.de

	PR fortran/37336
	* trans-array.c (gfc_trans_dealloc_allocated): Support finalization.
	(structure_alloc_comps): Update caller.
	(gfc_trans_deferred_array): Call finalizer.
	* trans-array.h (gfc_trans_dealloc_allocated): Update prototype.
	* trans-decl.c (gfc_trans_deferred_vars): Don't deallocate/finalize
	variables of the main program.
	* trans-expr.c (gfc_conv_procedure_call): Support finalization.
	* trans-openmp.c (gfc_omp_clause_dtor,
	gfc_trans_omp_array_reduction): Update calls.
	* trans-stmt.c (gfc_trans_deallocate): Avoid double deallocation
	of alloc components.
	* trans.c (gfc_add_finalizer_call): New function.
	(gfc_deallocate_with_status,
	gfc_deallocate_scalar_with_status): Call it
	(gfc_build_final_call): Fix handling of scalar coarrays.

2013-05-29  Tobias Burnus  bur...@net-b.de

	PR fortran/37336
	* gfortran.dg/finalize_12.f90: New.
	* gfortran.dg/alloc_comp_basics_1.f90: Add BLOCK for
	end of scope finalization.
	* gfortran.dg/alloc_comp_constructor_1.f90: Ditto.
	* gfortran.dg/allocatable_scalar_9.f90: Ditto.
	* gfortran.dg/auto_dealloc_2.f90: Ditto.
	* gfortran.dg/class_19.f03: Ditto.
	* gfortran.dg/coarray_lib_alloc_1.f90: Ditto.
	* gfortran.dg/coarray_lib_alloc_2.f90: Ditto.
	* gfortran.dg/extends_14.f03: Ditto.
	* gfortran.dg/move_alloc_4.f90: Ditto.
	* gfortran.dg/typebound_proc_27.f03: Ditto.

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index be3a5a0..8160fcd 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -7243,7 +7243,7 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77,
 /* Generate code to deallocate an array, if it is allocated.  */
 
 tree
-gfc_trans_dealloc_allocated (tree descriptor, bool coarray)
+gfc_trans_dealloc_allocated (tree descriptor, bool coarray, gfc_expr *expr)
 {
   tree tmp;
   tree var;
@@ -7259,7 +7259,7 @@ gfc_trans_dealloc_allocated (tree descriptor, bool coarray)
  are already deallocated are ignored.  */
   tmp = gfc_deallocate_with_status (coarray ? descriptor : var, NULL_TREE,
 NULL_TREE, NULL_TREE, NULL_TREE, true,
-NULL, coarray);
+expr, coarray);
   gfc_add_expr_to_block (block, tmp);
 
   /* Zero the data pointer.  */
@@ -7548,7 +7548,7 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
 	{
 	  comp = fold_build3_loc (input_location, COMPONENT_REF, ctype,
   decl, cdecl, NULL_TREE);
-	  tmp = gfc_trans_dealloc_allocated (comp, c-attr.codimension);
+	  tmp = gfc_trans_dealloc_allocated (comp, c-attr.codimension, NULL);
 	  gfc_add_expr_to_block (tmpblock, tmp);
 	}
 	  else if (c-attr.allocatable)
@@ -7580,7 +7580,7 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
 
 	  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (comp)))
 	tmp = gfc_trans_dealloc_allocated (comp,
-	CLASS_DATA (c)-attr.codimension);
+	CLASS_DATA (c)-attr.codimension, NULL);
 	  else
 		{
 		  tmp = gfc_deallocate_scalar_with_status (comp, NULL_TREE, true, NULL,
@@ -8292,7 +8292,7 @@ gfc_trans_deferred_array (gfc_symbol * sym, gfc_wrapped_block * block)
   stmtblock_t cleanup;
   locus loc;
   int rank;
-  bool sym_has_alloc_comp;
+  bool sym_has_alloc_comp, has_finalizer;
 
   sym_has_alloc_comp = (sym-ts.type == BT_DERIVED
 			|| sym-ts.type == BT_CLASS)
@@ -8379,8 +8379,12 @@ gfc_trans_deferred_array (gfc_symbol * sym, gfc_wrapped_block * block)
 
   /* Allocatable arrays need to be freed when they go out of scope.
  The allocatable components of pointers must not be touched.  */
-  if (sym_has_alloc_comp  !(sym-attr.function || sym-attr.result)
-   !sym-attr.pointer  !sym-attr.save)
+  has_finalizer = sym-ts.type == BT_CLASS || sym-ts.type == BT_DERIVED
+		   ? gfc_is_finalizable (sym-ts.u.derived, NULL) : false;
+  if ((!sym-attr.allocatable || !has_finalizer)
+   sym_has_alloc_comp  !(sym-attr.function || sym-attr.result)
+   !sym-attr.pointer  !sym-attr.save
+   !sym-ns-proc_name-attr.is_main_program)
 {
   int rank;
   rank = sym-as ? sym-as-rank : 0;
@@

[IA-64] Improve FP comparisons with -fno-trapping-math

2013-05-29 Thread Eric Botcazou

Hi,

on most platforms the Ada compiler doesn't enable trap-on-FP-exceptions, so 
it's appropriate to set -fno-trapping-math there, which has been done in
  http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01461.html

However doing so can have an adverse effect if the architecture doesn't have a 
sufficiently rich set of FP comparison instructions because -fno-trapping-math 
allows the middle-end and the optimizers to turn the negation of signaling FP 
comparison operators into quiet FP unordered operators, e.g. not() into != 
and the quiet unordered operators aren't supported universally.  The effect is 
that you need 2 comparisons instead of just 1 to achieve the desired result.

The first (surprising) example is the IA-64, which has ORDERED and UNORDERED 
but not UNLT/UNLE/UNGT/UNGE because the unordered comparison instructions are 
signaling (since they are the negation of the signaling ordered comparisons).
Therefore the fix is to make it possible to use these unordered comparison 
instructions, which are signaling, for the quiet unordered operators when the 
flag -fno-trapping-math is in effect.  As a matter of fact, the support was 
already there (e.g. in ia64_print_operand) but it was rightfully disabled.

Tested on IA-64/Linux and IA-64/HP-UX, OK for the mainline?


2013-05-29  Eric Botcazou  ebotca...@adacore.com

* config/ia64/predicates.md (ia64_cbranch_operator): Accept unordered
comparison operators when -fno-trapping-math is in effect.
* config/ia64/ia64.c (ia64_expand_compare): Add support for unordered
comparison operators in TFmode and assert that unsupported operators
cannot reach here.
(ia64_print_operand): Likewise.


-- 
Eric BotcazouIndex: config/ia64/predicates.md
===
--- config/ia64/predicates.md	(revision 199343)
+++ config/ia64/predicates.md	(working copy)
@@ -568,9 +568,15 @@ (define_predicate fr_reg_or_0_operand
 	(match_test op == CONST0_RTX (GET_MODE (op))
 
 ;; Return 1 if OP is a valid comparison operator for cbranch instructions.
+;; If we're assuming that FP operations cannot generate user-visible traps,
+;; then we can use the FP unordered-signaling instructions to implement the
+;; FP unordered-quiet comparison predicates.
 (define_predicate ia64_cbranch_operator
-  (ior (match_operand 0 ordered_comparison_operator)
-   (match_code ordered,unordered)))
+  (if_then_else (match_test flag_trapping_math)
+		(ior (match_operand 0 ordered_comparison_operator)
+		  (match_code ordered,unordered))
+		(and (match_operand 0 comparison_operator)
+		  (not (match_code uneq,ltgt)
 
 ;; True if this is a comparison operator, which accepts a normal 8-bit
 ;; signed immediate operand.
Index: config/ia64/ia64.c
===
--- config/ia64/ia64.c	(revision 199343)
+++ config/ia64/ia64.c	(working copy)
@@ -1756,7 +1756,7 @@ ia64_expand_compare (rtx *expr, rtx *op0
   else if (TARGET_HPUX  GET_MODE (*op0) == TFmode)
 {
   enum qfcmp_magic {
-	QCMP_INV = 1,	/* Raise FP_INVALID on SNaN as a side effect.  */
+	QCMP_INV = 1,	/* Raise FP_INVALID on NaNs as a side effect.  */
 	QCMP_UNORD = 2,
 	QCMP_EQ = 4,
 	QCMP_LT = 8,
@@ -1770,21 +1770,27 @@ ia64_expand_compare (rtx *expr, rtx *op0
   switch (code)
 	{
 	  /* 1 = equal, 0 = not equal.  Equality operators do
-	 not raise FP_INVALID when given an SNaN operand.  */
+	 not raise FP_INVALID when given a NaN operand.  */
 	case EQ:magic = QCMP_EQ;  ncode = NE; break;
 	case NE:magic = QCMP_EQ;  ncode = EQ; break;
 	  /* isunordered() from C99.  */
 	case UNORDERED: magic = QCMP_UNORD;   ncode = NE; break;
 	case ORDERED:   magic = QCMP_UNORD;   ncode = EQ; break;
 	  /* Relational operators raise FP_INVALID when given
-	 an SNaN operand.  */
+	 a NaN operand.  */
 	case LT:magic = QCMP_LT|QCMP_INV; ncode = NE; break;
 	case LE:magic = QCMP_LT|QCMP_EQ|QCMP_INV; ncode = NE; break;
 	case GT:magic = QCMP_GT|QCMP_INV; ncode = NE; break;
 	case GE:magic = QCMP_GT|QCMP_EQ|QCMP_INV; ncode = NE; break;
-	  /* FUTURE: Implement UNEQ, UNLT, UNLE, UNGT, UNGE, LTGT.
-	 Expanders for buneq etc. weuld have to be added to ia64.md
-	 for this to be useful.  */
+  /* Unordered relational operators do not raise FP_INVALID
+	 when given a NaN operand.  */
+	case UNLT:magic = QCMP_LT|QCMP_UNORD; ncode = NE; break;
+	case UNLE:magic = QCMP_LT|QCMP_EQ|QCMP_UNORD; ncode = NE; break;
+	case UNGT:magic = QCMP_GT|QCMP_UNORD; ncode = NE; break;
+	case UNGE:magic = QCMP_GT|QCMP_EQ|QCMP_UNORD; ncode = NE; break;
+	  /* Not supported.  */
+	case UNEQ:
+	case LTGT:
 	default: gcc_unreachable ();
 	}
 
@@ -5279,6 +5285,9 @@ ia64_print_operand (FILE * file, rtx x,
 	case UNGE:
 	  str = nlt;

[rs6000] Improve FP comparisons with -fno-trapping-math

2013-05-29 Thread Eric Botcazou

Hi,

on most platforms the Ada compiler doesn't enable trap-on-FP-exceptions, so 
it's appropriate to set -fno-trapping-math there, which has been done in
  http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01461.html

However doing so can have an adverse effect if the architecture doesn't have a 
sufficiently rich set of FP comparison instructions because -fno-trapping-math 
allows the middle-end and the optimizers to turn the negation of signaling FP 
comparison operators into quiet FP unordered operators, e.g. not() into != 
and the quiet unordered operators aren't supported universally.  The effect is 
that you need 2 comparisons instead of just 1 to achieve the desired result.

The second (less surprising) example is the e500 architecture which features 
FP instructions operating on GP registers.  It has neither ORDERED/UNORDERED 
nor the UNLT/UNLE/UNGT/UNGE operators so, in this case, you get a new call to 
the libgcc routine implementing UNORDERED in addition to the comparison.  The 
fix is to make it possible to use the (negated form of the) ordered comparison 
instructions, which are signaling, for the quiet unordered operators when the 
flag -fno-trapping-math is in effect.  As a matter of fact, a partial support 
was already there (e.g. in rs6000_generate_compare) but it was disabled.

Tested on e500v2/Linux and PowerPC/Linux, OK for the mainline?


2013-05-29  Eric Botcazou  ebotca...@adacore.com

* config/rs6000/predicates.md (rs6000_cbranch_operator): Accept some
unordered comparison operators when -fno-trapping-math is in effect
on the e500.
* config/rs6000/rs6000.c (rs6000_generate_compare): Remove dead code
and implement unordered comparison operators properly on the e500.


-- 
Eric BotcazouIndex: config/rs6000/predicates.md
===
--- config/rs6000/predicates.md	(revision 199343)
+++ config/rs6000/predicates.md	(working copy)
@@ -1121,9 +1121,16 @@ (define_predicate branch_comparison_ope
 		   GET_MODE (XEXP (op, 0))),
 			  1
 
+;; Return 1 if OP is a valid comparison operator for cbranch instructions.
+;; If we're assuming that FP operations cannot generate user-visible traps,
+;; then on e500 we can use the ordered-signaling instructions to implement
+;; the unordered-quiet FP comparison predicates modulo a reversal.
 (define_predicate rs6000_cbranch_operator
   (if_then_else (match_test TARGET_HARD_FLOAT  !TARGET_FPRS)
-		(match_operand 0 ordered_comparison_operator)
+		(if_then_else (match_test flag_trapping_math)
+			  (match_operand 0 ordered_comparison_operator)
+			  (ior (match_operand 0 ordered_comparison_operator)
+   (match_code (unlt,unle,ungt,unge
 		(match_operand 0 comparison_operator)))
 
 ;; Return 1 if OP is a comparison operation that is valid for an SCC insn --
Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c	(revision 199343)
+++ config/rs6000/rs6000.c	(working copy)
@@ -16086,16 +16086,41 @@ rs6000_generate_compare (rtx cmp, enum m
 {
   rtx cmp, or_result, compare_result2;
   enum machine_mode op_mode = GET_MODE (op0);
+  bool reverse_p;
 
   if (op_mode == VOIDmode)
 	op_mode = GET_MODE (op1);
 
+  /* First reverse the condition codes that aren't directly supported.  */
+  switch (code)
+	{
+	  case NE:
+	  case UNLT:
+	  case UNLE:
+	  case UNGT:
+	  case UNGE:
+	code = reverse_condition_maybe_unordered (code);
+	reverse_p = true;
+	break;
+
+	  case EQ:
+	  case LT:
+	  case LE:
+	  case GT:
+	  case GE:
+	reverse_p = false;
+	break;
+
+	  default:
+	gcc_unreachable ();
+	}
+
   /* The E500 FP compare instructions toggle the GT bit (CR bit 1) only.
 	 This explains the following mess.  */
 
   switch (code)
 	{
-	case EQ: case UNEQ: case NE: case LTGT:
+	case EQ:
 	  switch (op_mode)
 	{
 	case SFmode:
@@ -16121,7 +16146,8 @@ rs6000_generate_compare (rtx cmp, enum m
 	}
 	  break;
 
-	case GT: case GTU: case UNGT: case UNGE: case GE: case GEU:
+	case GT:
+	case GE:
 	  switch (op_mode)
 	{
 	case SFmode:
@@ -16147,7 +16173,8 @@ rs6000_generate_compare (rtx cmp, enum m
 	}
 	  break;
 
-	case LT: case LTU: case UNLT: case UNLE: case LE: case LEU:
+	case LT: 
+	case LE:
 	  switch (op_mode)
 	{
 	case SFmode:
@@ -16172,24 +16199,16 @@ rs6000_generate_compare (rtx cmp, enum m
 	  gcc_unreachable ();
 	}
 	  break;
+
 default:
   gcc_unreachable ();
 	}
 
   /* Synthesize LE and GE from LT/GT || EQ.  */
-  if (code == LE || code == GE || code == LEU || code == GEU)
+  if (code == LE || code == GE)
 	{
 	  emit_insn (cmp);
 
-	  switch (code)
-	{
-	case LE: code = LT; break;
-	case GE: code = GT; break;
-	case LEU: code = LT; break;
-	case GEU: code = GT; break;
-	default: gcc_unreachable ();
-	}
-
 	  compare_result2

Re: [IA-64] Improve FP comparisons with -fno-trapping-math

2013-05-29 Thread Richard Biener

On Wed, May 29, 2013 at 10:33 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Hi,

 on most platforms the Ada compiler doesn't enable trap-on-FP-exceptions, so
 it's appropriate to set -fno-trapping-math there, which has been done in
   http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01461.html

 However doing so can have an adverse effect if the architecture doesn't have a
 sufficiently rich set of FP comparison instructions because -fno-trapping-math
 allows the middle-end and the optimizers to turn the negation of signaling FP
 comparison operators into quiet FP unordered operators, e.g. not() into !=
 and the quiet unordered operators aren't supported universally.  The effect is
 that you need 2 comparisons instead of just 1 to achieve the desired result.

 The first (surprising) example is the IA-64, which has ORDERED and UNORDERED
 but not UNLT/UNLE/UNGT/UNGE because the unordered comparison instructions are
 signaling (since they are the negation of the signaling ordered comparisons).
 Therefore the fix is to make it possible to use these unordered comparison
 instructions, which are signaling, for the quiet unordered operators when the
 flag -fno-trapping-math is in effect.  As a matter of fact, the support was
 already there (e.g. in ia64_print_operand) but it was rightfully disabled.

 Tested on IA-64/Linux and IA-64/HP-UX, OK for the mainline?

Did you check that this doesn't cause traps on SPEC CPU 2000 / 2006
when compiled with -ffast-math (something we generally want to support,
even if it is on the border of validity)?

Thanks,
Richard.

 2013-05-29  Eric Botcazou  ebotca...@adacore.com

 * config/ia64/predicates.md (ia64_cbranch_operator): Accept unordered
 comparison operators when -fno-trapping-math is in effect.
 * config/ia64/ia64.c (ia64_expand_compare): Add support for unordered
 comparison operators in TFmode and assert that unsupported operators
 cannot reach here.
 (ia64_print_operand): Likewise.


 --
 Eric Botcazou

[PATCH] Fix vect_bb_slp_scalar_cost change

2013-05-29 Thread Richard Biener


My commit crossed the Cilk+ changes and only its testcases expose
that we do not set stmt UIDs on blocks we currently vectorize and
thus other vinfo_for_stmt calls have to be properly guarded.

Tested on x86_64-unknown-linux-gnu, applied as obvious.

Richard.

2013-05-29  Richard Biener  rguent...@suse.de

* tree-vect-slp.c (vect_bb_slp_scalar_cost): Guard vinfo
access on whether the use is in the BB we currently try to
vectorize.
(vect_bb_vectorization_profitable_p): Pass the BB we currently
vectorize to vect_bb_slp_scalar_cost.

Index: gcc/tree-vect-slp.c
===
*** gcc/tree-vect-slp.c (revision 199402)
--- gcc/tree-vect-slp.c (working copy)
*** vect_slp_analyze_operations (bb_vec_info
*** 1904,1910 
 update LIFE according to uses of NODE.  */
  
  static unsigned
! vect_bb_slp_scalar_cost (slp_tree node, vecbool, va_stack life)
  {
unsigned scalar_cost = 0;
unsigned i;
--- 1904,1911 
 update LIFE according to uses of NODE.  */
  
  static unsigned
! vect_bb_slp_scalar_cost (basic_block bb,
!slp_tree node, vecbool, va_stack life)
  {
unsigned scalar_cost = 0;
unsigned i;
*** vect_bb_slp_scalar_cost (slp_tree node,
*** 1931,1937 
  imm_use_iterator use_iter;
  gimple use_stmt;
  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR (def_p))
!   if (!vinfo_for_stmt (use_stmt)
|| !STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (use_stmt)))
  {
life[i] = true;
--- 1932,1938 
  imm_use_iterator use_iter;
  gimple use_stmt;
  FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR (def_p))
!   if (gimple_bb (use_stmt) != bb
|| !STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (use_stmt)))
  {
life[i] = true;
*** vect_bb_slp_scalar_cost (slp_tree node,
*** 1956,1962 
  }
  
FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
! scalar_cost += vect_bb_slp_scalar_cost (child, life);
  
return scalar_cost;
  }
--- 1957,1963 
  }
  
FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
! scalar_cost += vect_bb_slp_scalar_cost (bb, child, life);
  
return scalar_cost;
  }
*** vect_bb_vectorization_profitable_p (bb_v
*** 1995,2001 
vecbool, va_stack life;
vec_stack_alloc (bool, life, SLP_INSTANCE_GROUP_SIZE (instance));
life.quick_grow_cleared (SLP_INSTANCE_GROUP_SIZE (instance));
!   scalar_cost += vect_bb_slp_scalar_cost (SLP_INSTANCE_TREE (instance),
  life);
life.release ();
  }
--- 1996,2003 
vecbool, va_stack life;
vec_stack_alloc (bool, life, SLP_INSTANCE_GROUP_SIZE (instance));
life.quick_grow_cleared (SLP_INSTANCE_GROUP_SIZE (instance));
!   scalar_cost += vect_bb_slp_scalar_cost (BB_VINFO_BB (bb_vinfo),
! SLP_INSTANCE_TREE (instance),
  life);
life.release ();
  }

Re: [IA-64] Improve FP comparisons with -fno-trapping-math

2013-05-29 Thread Eric Botcazou

 Did you check that this doesn't cause traps on SPEC CPU 2000 / 2006
 when compiled with -ffast-math (something we generally want to support,
 even if it is on the border of validity)?

Do you mean SPECfp because some benchmarks might enable trap-on-FP-exceptions 
and you nevertheless compile them with -ffast-math?

-- 
Eric Botcazou

Re: [PATCH] Fix incorrect discriminator assignment.

2013-05-29 Thread Rainer Orth

Mike Stump m...@mrs.kithrup.com writes:

 On May 28, 2013, at 3:56 PM, Dehao Chen de...@google.com wrote:
 The attached patch restricts the test only run on linux-gnu targets. Is it 
 ok?

 Ok.  You can put HAVE_AS_DWARF2_DEBUG_LINE in a comment above it to help
 explain why the target restriction applies.  The idea is, if enough people
 care, or some x86 target (say, darwin), want to reliably find them all, one
 can then search for that and fix them all in one go.

Right, this does work e.g. on Solaris/x86 with gas, but fails with Sun
as or Darwin as.

Btw., why is this even x86 specific?  At least on Solaris/SPARC with
gas, HAVE_AS_DWARF2_DEBUG_LINE *is* defined.

And why do you add -gdwarf-2 in dg-options?  AFAICS all tests in
gcc.dg/debug/dwarf2 are built with -gdwarf-2 anyway.  Perhaps you need
just { dg-additional-options -O0 } instead?

Also, please mention PR testsuite/57413 in the ChangeLog and move the
addition of the testcase from gcc/ChangeLog to gcc/testsuite/ChangeLog,
removing the testsuite/ prefix in the pathname.

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [IA-64] Improve FP comparisons with -fno-trapping-math

2013-05-29 Thread Richard Biener

On Wed, May 29, 2013 at 11:19 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Did you check that this doesn't cause traps on SPEC CPU 2000 / 2006
 when compiled with -ffast-math (something we generally want to support,
 even if it is on the border of validity)?

 Do you mean SPECfp because some benchmarks might enable trap-on-FP-exceptions
 and you nevertheless compile them with -ffast-math?

Yes, SPECfp.  And no, I don't think they enable trap-on-FP-exceptions but they
may cause exceptional behavior even if -fno-trapping-math is specified (not sure
if IA64 masks exceptions on the comparisons?)

Richard.

 --
 Eric Botcazou

Re: [x86, PATCH 1/2] Enabling of the new Intel microarchitecture Silvermont

2013-05-29 Thread Uros Bizjak

On Tue, May 28, 2013 at 11:12 AM, Zamyatin, Igor
igor.zamya...@intel.com wrote:

 Several weeks ago Intel announced new microarchitecture named Silvermont (see 
 for example 
 http://newsroom.intel.com/community/intel_newsroom/blog/2013/05/06/intel-launches-low-power-high-performance-silvermont-microarchitecture
  ).

 This patch is the first and it enables Silvermont architecture in the GCC 
 compiler (second patch will add some performance related features).

 Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?

 Thanks,
 Igor

 2013-05-27  Yuri Rumyantsev  yuri.s.rumyant...@intel.com
 Igor Zamyatin  igor.zamya...@intel.com

 Silvermont (SLM) architecture pipeline model, tuning and
 insn selection.
 * config.gcc: Add slm config options and target.

 * config/i386/slm.md: New.

 * config/i386/driver-i386.c (host_detect_local_cpu): Check movbe.

 * gcc/config/i386/i386-c.c (ix86_target_macros_internal):  New case
 PROCESSOR_SLM.
 (ix86_target_macros_internal): Likewise.

 * gcc/config/i386/i386.c (slm_cost): New cost.
 (m_SLM): New macro flag.
 (initial_ix86_tune_features): Set m_SLM.
 (x86_accumulate_outgoing_args): Likewise.
 (x86_arch_always_fancy_math_387): Likewise.
 (processor_target_table): Add slm cost.
 (cpu_names): Add slm cpu name.
 (x86_option_override_internal): Set SLM ISA.
 (ix86_issue_rate): New case PROCESSOR_SLM.
 (ia32_multipass_dfa_lookahead): Likewise.
 (fold_builtin_cpu): Add slm.

 * config/i386/i386.h (TARGET_SLM): New target macro.
 (target_cpu_default): Add TARGET_CPU_DEFAULT_slm.
 (processor_type): Add PROCESSOR_SLM.

 * config/i386/i386.md (cpu): Add new value slm.
 (slm.md): Include slm.md.

 * libgcc/config/i386/cpuinfo.c (INTEL_SLM): New enum value.


This is OK for mainline.

Thanks,
Uros.

Re: [IA-64] Improve FP comparisons with -fno-trapping-math

2013-05-29 Thread Eric Botcazou

 Yes, SPECfp.  And no, I don't think they enable trap-on-FP-exceptions but
 they may cause exceptional behavior even if -fno-trapping-math is specified
 (not sure if IA64 masks exceptions on the comparisons?)

The manual says that QNaNs signal Invalid for the signaling FP comparison 
operators and that you need to enable the Invalid Operation trap to get a 
fault from Invalid.  The IEEE standard also says that comparisons by way of 
unordered signaling predicates raise the invalid operation exception, just 
like the regular operations.  That being said, I can double-check.

-- 
Eric Botcazou

Re: [PATCH] Do not allow non-top-level BIT_FIELD_REFs, IMAGPART_EXPRs or REALPART_EXPRs

2013-05-29 Thread Martin Jambor

Hi,

On Tue, May 28, 2013 at 03:40:25PM +0200, Richard Biener wrote:
 On Tue, 28 May 2013, Martin Jambor wrote:
  I've committed it s revision 199379, thanks.  As far as the
  non-top-levelness is concerned, the following (on top of the previous
  patch) also survives bootstrap and testsuite on x86_64 (all languages
  including Ada and Obj-C++).  Do you think it would be acceptable as
  well?
 
 With the following minor adjustment:
 
  Thanks,
  
  Martin
  
   
  2013-05-27  Martin Jambor  mjam...@suse.cz
  
  * tree-cfg.c (verify_expr): Verify that BIT_FIELD_REF, REALPART_EXPR
  and IMAGPART_EXPR do not occur within other handled_components.
  
  Index: src/gcc/tree-cfg.c
  ===
  --- src.orig/gcc/tree-cfg.c
  +++ src/gcc/tree-cfg.c
  @@ -2675,6 +2675,33 @@ verify_expr (tree *tp, int *walk_subtree
return t;
  }
   
  +  if (TREE_CODE (t) == BIT_FIELD_REF)
  +   {
  + if (!host_integerp (TREE_OPERAND (t, 1), 1)
  + || !host_integerp (TREE_OPERAND (t, 2), 1))
  +   {
  + error (invalid position or size operand to BIT_FIELD_REF);
  + return t;
  +   }
  + if (INTEGRAL_TYPE_P (TREE_TYPE (t))
  +  (TYPE_PRECISION (TREE_TYPE (t))
  + != TREE_INT_CST_LOW (TREE_OPERAND (t, 1
  +   {
  + error (integral result type precision does not match 
  +field size of BIT_FIELD_REF);
  + return t;
  +   }
  + else if (!INTEGRAL_TYPE_P (TREE_TYPE (t))
  +   TYPE_MODE (TREE_TYPE (t)) != BLKmode
  +   (GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (t)))
  +  != TREE_INT_CST_LOW (TREE_OPERAND (t, 1
  +   {
  + error (mode precision of non-integral result does not 
  +match field size of BIT_FIELD_REF);
  + return t;
  +   }
  +   }
  +
 
   t = TREE_OPERAND (t, 0);
 
 here instead of ...
 
 /* Fall-through.  */
   case COMPONENT_REF:
   case ARRAY_REF:
  @@ -2697,35 +2724,16 @@ verify_expr (tree *tp, int *walk_subtree
if (TREE_OPERAND (t, 3))
  CHECK_OP (3, invalid array stride);
  }
  - else if (TREE_CODE (t) == BIT_FIELD_REF)
  -   {
  - if (!host_integerp (TREE_OPERAND (t, 1), 1)
  - || !host_integerp (TREE_OPERAND (t, 2), 1))
  -   {
  - error (invalid position or size operand to BIT_FIELD_REF);
  - return t;
  -   }
  - if (INTEGRAL_TYPE_P (TREE_TYPE (t))
  -  (TYPE_PRECISION (TREE_TYPE (t))
  - != TREE_INT_CST_LOW (TREE_OPERAND (t, 1
  -   {
  - error (integral result type precision does not match 
  -field size of BIT_FIELD_REF);
  - return t;
  -   }
  - else if (!INTEGRAL_TYPE_P (TREE_TYPE (t))
  -   !AGGREGATE_TYPE_P (TREE_TYPE (t))
  -   TYPE_MODE (TREE_TYPE (t)) != BLKmode
  -   (GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (t)))
  -  != TREE_INT_CST_LOW (TREE_OPERAND (t, 1
  -   {
  - error (mode precision of non-integral result does not 
  -match field size of BIT_FIELD_REF);
  - return t;
  -   }
  -   }
   
t = TREE_OPERAND (t, 0);
  + if (TREE_CODE (t) == BIT_FIELD_REF
  + || TREE_CODE (t) == REALPART_EXPR
  + || TREE_CODE (t) == IMAGPART_EXPR)
  +   {
  + error (non-top-level BIT_FIELD_REF, IMAGPART_EXPR or 
  +REALPART_EXPR);
  + return t;
  +   }
 
 ... doing this after t = TREE_OPERAND (t, 0) (so, do it before).
 

All right, I have committed the following as revision 199405.

Thanks,

Martin


2013-05-29  Martin Jambor  mjam...@suse.cz

* tree-cfg.c (verify_expr): Verify that BIT_FIELD_REF, REALPART_EXPR
and IMAGPART_EXPR do not occur within other handled_components.

Index: src/gcc/tree-cfg.c
===
--- src.orig/gcc/tree-cfg.c
+++ src/gcc/tree-cfg.c
@@ -2675,6 +2675,34 @@ verify_expr (tree *tp, int *walk_subtree
  return t;
}
 
+  if (TREE_CODE (t) == BIT_FIELD_REF)
+   {
+ if (!host_integerp (TREE_OPERAND (t, 1), 1)
+ || !host_integerp (TREE_OPERAND (t, 2), 1))
+   {
+ error (invalid position or size operand to BIT_FIELD_REF);
+ return t;
+   }
+ if (INTEGRAL_TYPE_P (TREE_TYPE (t))
+  (TYPE_PRECISION (TREE_TYPE (t))
+ != TREE_INT_CST_LOW (TREE_OPERAND (t, 1
+   {
+ error (integral result type precision does not match 
+field size of BIT_FIELD_REF);
+ return t;
+   }
+ else if (!INTEGRAL_TYPE_P (TREE_TYPE (t))
+   TYPE_MODE (TREE_TYPE

Re: [ada, build] host/target configuration

2013-05-29 Thread Thomas Schwinge

Hi!

On Tue, 28 May 2013 09:28:28 +0200, I wrote:
 On Wed, 08 May 2013 11:27:09 +0200, Rainer Orth 
 r...@cebitec.uni-bielefeld.de wrote:
  As described in [PR ada/57188], amd64-pc-solaris2.1[01] Ada bootstrap was 
  failing
  for some time.  It has turned out that this patch is the culprit:
  
  2013-04-23  Eric Botcazou  ebotca...@adacore.com
  Pascal Obryo...@adacore.com
  
  * gcc-interface/Makefile.in (targ): Fix target name check.
  
  diff --git a/gcc/ada/gcc-interface/Makefile.in 
  b/gcc/ada/gcc-interface/Makefile.in
  --- a/gcc/ada/gcc-interface/Makefile.in
  +++ b/gcc/ada/gcc-interface/Makefile.in
  @@ -259,7 +259,7 @@ TOOLS_LIBS = targext.o link.o ../../ggc-
   # manufacturer, and operating system and assign each of those to its own
   # variable.
   host:=$(subst -, ,$(host_canonical))
  -targ:=$(subst -, ,$(target))
  +targ:=$(subst -, ,$(subst -gnu, ,$(target_alias)))
   arch:=$(word 1,$(targ))
   ifeq ($(words $(targ)),2)
 manu:=
 |osys:=$(word 2,$(targ))
 |  else
 |manu:=$(word 2,$(targ))
 |osys:=$(word 3,$(targ))
 |  endif
 
  I couldn't find the gcc-patches posting for this patch, thus I'm missing
  the rationale for it.  It seems rather counterintuitive and fragile to
  me, replacing the canonical $target by the far more varied $target_alias.
 
 I concur, and this has now caused confusion for the (pending upstream
 re-submission) x86 GNU/Hurd port, too, for which, upon removing -gnu from
 the target of i686-pc-gnu0.3, now a mere 0.3 remains for osys...
 
  If there's really a good reason to keep that patch nonetheless, [...]
 
 How about we use something like the following [...] patch?  In
 essence, replace the manual parsing in gcc/ada/gcc-interface/Makefile.in
 by using the values the configure script already computed for us.  This
 is largely straightforward (I would hope); please especially review the
 more involved cases such as the one where I'm now using linux%eabi -- for
 example, does that do the right thing or interfere with the Android
 configurations?
 
 The target_cpu_canonical substitution has been added in commit
 369e542b3ad1c0acfa9bfaeb72b338d8db5ba2ef (2009-02-27, r144463, schwab)
 but unused ever since, thus removed.
 
 I'll now be testing for x86 GNU/Linux and GNU/Hurd; further testing
 appreciated.

For these two configurations, I have now successfully tested the patch I
posted.  Further review/testing appreciated.  (Adding »build machinery
(*.in)« maintainers.)


 Other than rebuilding from scratch, how do I rebuild only the affected
 Ada/GNAT bits after regenerating gcc/ada/gcc-interface/Makefile?  It
 doesn't just work, and »make clean-target-libada clean-gnattools« doesn't
 help either?


gcc/ada/
* gcc-interface/Makefile.in (target_alias, host_canonical)
(target_cpu_default): Don't substitute.
(target_cpu, target_vendor, target_os, host_cpu, host_vendor)
(host_os): Substitute.
(host, targ, arch, manu, osys): Don't set, and replace their
usage with the newly substituted target_* and host_* variables.

 diff --git gcc/ada/gcc-interface/Makefile.in gcc/ada/gcc-interface/Makefile.in
 index eeb8c7f..c07722b 100644
 --- gcc/ada/gcc-interface/Makefile.in
 +++ gcc/ada/gcc-interface/Makefile.in
 @@ -151,12 +151,15 @@ GCC_CFLAGS = $(INTERNAL_CFLAGS) $(T_CFLAGS) $(CFLAGS)
  program_transform_name =
  objdir = .
  
 -target_alias=@target_alias@
  target=@target@
 +target_cpu=@target_cpu@
 +target_vendor=@target_vendor@
 +target_os=@target_os@
 +host_cpu=@host_cpu@
 +host_vendor=@host_vendor@
 +host_os=@host_os@
  xmake_file = @xmake_file@
  tmake_file = @tmake_file@
 -host_canonical=@host@
 -target_cpu_default=@target_cpu_default@
  #version=`sed -e 's/.*\\([^ \]*\)[ \].*/\1/'  $(srcdir)/version.c`
  #mainversion=`sed -e 's/.*\\([0-9]*\.[0-9]*\).*/\1/'  $(srcdir)/version.c`
  
 @@ -255,20 +258,6 @@ TOOLS_LIBS = targext.o link.o ../../ggc-none.o 
 ../../libcommon-target.a \
../../../libbacktrace/.libs/libbacktrace.a ../../../libiberty/libiberty.a \
$(SYSLIBS) $(TGT_LIB)
  
 -# Convert the target variable into a space separated list of architecture,
 -# manufacturer, and operating system and assign each of those to its own
 -# variable.
 -host:=$(subst -, ,$(host_canonical))
 -targ:=$(subst -, ,$(subst -gnu, ,$(target_alias)))
 -arch:=$(word 1,$(targ))
 -ifeq ($(words $(targ)),2)
 -  manu:=
 -  osys:=$(word 2,$(targ))
 -else
 -  manu:=$(word 2,$(targ))
 -  osys:=$(word 3,$(targ))
 -endif
 -
  # Specify the directories to be searched for header files.
  # Both . and srcdir are used, in that order,
  # so that tm.h and config.h will be found in the compilation
 @@ -280,7 +269,7 @@ ADA_INCLUDES = -I- -I. -I$(srcdir)/ada
  # Likewise, but valid for subdirectories of the current dir.
  # FIXME: for VxWorks, we cannot add $(fsrcdir) because the regs.h file in
  # that directory conflicts with a system header file.
 -ifneq ($(findstring vxworks,$(osys)),)
 +ifneq ($(findstring

RE: [PATCH,i386] FP Reassociation for AMD bdver1 and bdver2

2013-05-29 Thread Gopalasubramanian, Ganesh

Thanks Uros! Committed at r199405.

-Ganesh

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Thursday, May 23, 2013 4:47 PM
To: Gopalasubramanian, Ganesh
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH,i386] FP Reassociation for AMD bdver1 and bdver2

On Thu, May 23, 2013 at 1:11 PM, Gopalasubramanian, Ganesh 
ganesh.gopalasubraman...@amd.com wrote:

 The patch enables FP Reassociation pass AMD bdver1 and bdver2 architectures.
 We note a performance uplift of around ~8% on calculix.

 make -k check passes.

 Is it OK for upstream?

OK.

Thanks,
Uros.

RE: [PATCH,i386] Default alignment for AMD BD and BT

2013-05-29 Thread Gopalasubramanian, Ganesh

Hi

We want this to be backported to GCC48 branch.
Please approve.

Regards
Ganesh

-Original Message-
From: Uros Bizjak [mailto:ubiz...@gmail.com] 
Sent: Tuesday, May 07, 2013 6:22 PM
To: Gopalasubramanian, Ganesh
Cc: gcc-patches@gcc.gnu.org
Subject: Re: [PATCH,i386] Default alignment for AMD BD and BT

On Tue, May 7, 2013 at 9:16 AM, Gopalasubramanian, Ganesh 
ganesh.gopalasubraman...@amd.com wrote:

 The patch updates the alignment values for AMD BD and BT architectures.

 make -k check passes.

 Is it OK for upstream?

 2013-05-07  Ganesh Gopalasubramanian  
 ganesh.gopalasubraman...@amd.com

 * config/i386/i386.c (processor_target_table): Modified default
 alignment values for AMD BD and BT architectures.

The value 11 indeed looks a bit weird, but it means: align to 16 byte boundary 
only if this can be done by skipping 10 bytes or less.

Tha patch is OK for mainline.

Thanks,
Uros.

[PING 2] [PATCH RX] Added target specific macros for RX100, RX200, and RX600

2013-05-29 Thread Sandeep Kumar Singh

Hi,

Below patch added target specific macros for RX100, RX200, and RX600

http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00050.html

Please review the patch and let me know if there should be any
modifications in it?


Regards,
Sandeep Kumar Singh,
KPIT Cummins InfoSystems Ltd.
Pune, India

Re: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-29 Thread Rainer Orth

Iyer, Balaji V balaji.v.i...@intel.com writes:

 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Richard Henderson
 Sent: Tuesday, May 28, 2013 2:52 PM
 To: Iyer, Balaji V
 Cc: Jakub Jelinek; Aldy Hernandez; Jeff Law; 'Joseph S. Myers'; 'gcc-patches'
 Subject: Re: [PING]RE: [patch] cilkplus: Array notation for C patch

 On 05/28/2013 11:44 AM, Iyer, Balaji V wrote:
  i Richard, Jakub et al..
 I think I have fixed everything requested by RTH
 (http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01400.html).  I think I have
 also moved the tests in the correct place Jakub requested. It is passing
 all the
 correct regression tests and not affecting others.

  Is this patch OK for trunk?

 Yes, it's ok.

 This patch is committed to trunk at revision 199389.

... and immediately broke Solaris bootstrap, cf. PR bootstrap/57450.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [Patch, Fortran] Enable the generation of the FINALization wrapper function

2013-05-29 Thread Janus Weil

Hi Tobias,

 Small re-diff - but essentially unchanged.

 (I made a thinko when adding a _final call to
 gfc_trans_class_array_init_assign: Not in all contexts the _final should be
 called, only for INTENT(OUT). Thus, I remove the _final call and deferred it
 to the actual finalization call. [That also matches the scalar handling,
 which only does a memcpy and no dealloc.])


 Build and regtested on x86-64-gnu-linux.
 OK for the trunk?

I think this patch is ok. Just one nit:

@@ -5571,7 +5569,7 @@ gfc_dump_module (const char *name, int dump_flag)
  FIXME: For backwards compatibility with the old uncompressed
  module format, write an extra empty line. When the module version
  is bumped, this can be removed.  */
-  gzprintf (module_fp, GFORTRAN module version '%s' created from %s\n\n,
+  gzprintf (module_fp, GFORTRAN module version '%s' created from %s\n,
 MOD_VERSION, gfc_source_file);


Here you should remove the FIXME.

Thanks for the patch,
Janus



 Tobias Burnus wrote:

 Pre-remark: This patch does *not* enable finalization or polymorphic
 deallocation.

 * * *

 Dear all,

 The attached patch is a bit boring and invasive, but it paves the way to
 FINAL support.

 Changes of technical kind:

 * Changed ABI for CLASS's virtual table (due to _final) - and, hence, it
 bumps the .mod version
 * The finalization wrapper is now generated (this should not but might
 lead to ICEs)
 * It also causes that the virtual table is now more often generated

 New feature:

 _copy no longer deallocates the dst argument. Doing so lead to bogus
 finalization with ALLOCATE (exposed with the pending FINAL patch). As a
 sideeffect, memset could be removed and CALLOC could be replased by MALLOC
 (minute performance advantage). In order to keep the deallocation in
 gfc_trans_class_array_init_assign, there is now a call to the finalization
 wrapper.

 Next steps:
 * Add end-of-scope/intent(out) deallocation for polymorphic arrays
 * Enable FINAL parsing
 * Stepwise enabling for polymorphic deallocation/finalization
 * Fix issues with ELEMENTAL(+optional) with intent(out)
 * Fix some issues related to intrinsic assignment
 * Fix fallout of any of those items

 Build and regtested on x86-64-gnu-linux.
 OK for the trunk?

 Tobias

Fix PR57268

2013-05-29 Thread Dinar Temirbulatov

Hi,
I noticed that the scheduler created long dependence list about ~9000
elements long and slowed compilation time for about an hour. Attached
patch flushes the dependence list is case of it is longer than
MAX_PENDING_LIST_LENGTH. Tested with gcc testsite on x86_64-linux-gnu
with c and c++ enabled. Ok for trunk?
   thanks, Dinar.

2013-05-28 Dinar Temirbulatov di...@kugelworks.com

PR rtl-optimization/57268
* sched-deps.c (sched_analyze_2): Flush dependence list
then it is longer than MAX_PENDING_LIST_LENGTH.


fix.patch
Description: Binary data

Re: Fix PR57268

2013-05-29 Thread Steven Bosscher

Hello,

On Wed, May 29, 2013 at 2:01 PM, Dinar Temirbulatov wrote:
 Hi,
 I noticed that the scheduler created long dependence list about ~9000
 elements long and slowed compilation time for about an hour. Attached
 patch flushes the dependence list is case of it is longer than
 MAX_PENDING_LIST_LENGTH. Tested with gcc testsite on x86_64-linux-gnu
 with c and c++ enabled. Ok for trunk?
thanks, Dinar.

 2013-05-28 Dinar Temirbulatov dinar at kugelworks dot com

 PR rtl-optimization/57268
 * sched-deps.c (sched_analyze_2): Flush dependence list
 then it is longer than MAX_PENDING_LIST_LENGTH.


* sched-deps.c (sched_analyze_2): Flush dependence lists if
the sum of the read and write lists exceeds MAX_PENDING_LIST_LENGTH.



  if (!deps-readonly)
 -  add_insn_mem_dependence (deps, true, insn, x);
 +  {
 + if ((deps-pending_read_list_length + 
 deps-pending_write_list_length)
 +MAX_PENDING_LIST_LENGTH)
 + flush_pending_lists (deps, insn, true, true);
 +add_insn_mem_dependence (deps, true, insn, x);
 +}

The flush_pending_lists, add_insn_mem_dependence and } lines are
not indented correctly. The if (...+...) line is too long (max. 80
characters per line). The GCC style would be

if (!deps-readonly)
  {
if ((deps-pending_read_list_length
 + deps-pending_write_list_length)
 MAX_PENDING_LIST_LENGTH)
  flush_pending_lists (deps, insn, true, true);
add_insn_mem_dependence (deps, true, insn, x);
  }

(The aesthetics of GCC code style is a matter for debate, but not here
and now ;-)

Ciao!
Steven

[PATCH, AArch64] Re-organize aarch64_classify_symbol

2013-05-29 Thread Marcus Shawcroft

This patch re-organizes the implementation of aarch64_classify_symbol in 
preparation for add tiny absolute memory model support.


Regressed aarch64-none-elf, applied.

/Marcus

2013-05-29  Chris Schlumberger-Socha chris.schlumberger-so...@arm.com
Marcus Shawcroft  marcus.shawcr...@arm.com

* config/aarch64/aarch64.c (aarch64_classify_symbol): Remove comment.
Refactor if/switch.  Replace gcc_assert with if.

Re: [PING 2] [PATCH RX] Added target specific macros for RX100, RX200, and RX600

2013-05-29 Thread nick clifton


Hi Sandeep,


Below patch added target specific macros for RX100, RX200, and RX600

http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00050.html

Please review the patch and let me know if there should be any
modifications in it?


Sorry - I missed your original posting.

The patch is fine, please apply.

Cheers
  Nick

[PATCH, AArch64] Support --mcmodel=tiny.

2013-05-29 Thread Marcus Shawcroft


Hi, This patch adds support for the tiny absolute memory model.

Regressed for aarch64-none-elf with each of
  -mcmodel=tiny
  -mcmodel=small
  -mcmodel=small -fPIC

Applied.

/Marcus

2012-05-29  Chris Schlumberger-Socha chris.schlumberger-so...@arm.com
Marcus Shawcroft  marcus.shawcr...@arm.com

* config/aarch64/aarch64-protos.h (aarch64_symbol_type): Define
SYMBOL_TINY_ABSOLUTE.
* config/aarch64/aarch64.c (aarch64_load_symref_appropriately): 
Handle

SYMBOL_TINY_ABSOLUTE.
(aarch64_expand_mov_immediate): Likewise.
(aarch64_classify_symbol): Likewise.
(aarch64_mov_operand_p): Remove ATTRIBUTE_UNUSED.
Permit SYMBOL_TINY_ABSOLUTE.
* config/aarch64/predicates.md (aarch64_mov_operand): Permit CONST.

Re: [PATCH, AArch64] Support --mcmodel=tiny.

2013-05-29 Thread Marcus Shawcroft


On 29/05/13 14:08, Marcus Shawcroft wrote:

Hi, This patch adds support for the tiny absolute memory model.

Regressed for aarch64-none-elf with each of
   -mcmodel=tiny
   -mcmodel=small
   -mcmodel=small -fPIC

Applied.


This time with patch attached, oops, sorry.

/Marcus



/Marcus

2012-05-29  Chris Schlumberger-Socha chris.schlumberger-so...@arm.com
 Marcus Shawcroft  marcus.shawcr...@arm.com

 * config/aarch64/aarch64-protos.h (aarch64_symbol_type): Define
 SYMBOL_TINY_ABSOLUTE.
 * config/aarch64/aarch64.c (aarch64_load_symref_appropriately):
Handle
 SYMBOL_TINY_ABSOLUTE.
 (aarch64_expand_mov_immediate): Likewise.
 (aarch64_classify_symbol): Likewise.
 (aarch64_mov_operand_p): Remove ATTRIBUTE_UNUSED.
 Permit SYMBOL_TINY_ABSOLUTE.
 * config/aarch64/predicates.md (aarch64_mov_operand): Permit
CONST.
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 91fcde8..bdb6b04 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -81,6 +81,7 @@ enum aarch64_symbol_type
   SYMBOL_SMALL_TLSDESC,
   SYMBOL_SMALL_GOTTPREL,
   SYMBOL_SMALL_TPREL,
+  SYMBOL_TINY_ABSOLUTE,
   SYMBOL_FORCE_TO_MEM
 };
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cbe7847..074eb1c 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -524,6 +524,10 @@ aarch64_load_symref_appropriately (rtx dest, rtx imm,
 	return;
   }
 
+case SYMBOL_TINY_ABSOLUTE:
+  emit_insn (gen_rtx_SET (Pmode, dest, imm));
+  return;
+
 case SYMBOL_SMALL_GOT:
   {
 	rtx tmp_reg = dest;
@@ -826,6 +830,7 @@ aarch64_expand_mov_immediate (rtx dest, rtx imm)
 
 case SYMBOL_SMALL_TPREL:
 	case SYMBOL_SMALL_ABSOLUTE:
+	case SYMBOL_TINY_ABSOLUTE:
 	  aarch64_load_symref_appropriately (dest, imm, sty);
 	  return;
 
@@ -5030,6 +5035,8 @@ aarch64_classify_symbol (rtx x,
 
 	case AARCH64_CMODEL_TINY_PIC:
 	case AARCH64_CMODEL_TINY:
+	  return SYMBOL_TINY_ABSOLUTE;
+
 	case AARCH64_CMODEL_SMALL_PIC:
 	case AARCH64_CMODEL_SMALL:
 	  return SYMBOL_SMALL_ABSOLUTE;
@@ -5051,6 +5058,10 @@ aarch64_classify_symbol (rtx x,
   switch (aarch64_cmodel)
 	{
 	case AARCH64_CMODEL_TINY:
+	  if (SYMBOL_REF_WEAK (x))
+	return SYMBOL_FORCE_TO_MEM;
+	  return SYMBOL_TINY_ABSOLUTE;
+
 	case AARCH64_CMODEL_SMALL:
 	  if (SYMBOL_REF_WEAK (x))
 	return SYMBOL_FORCE_TO_MEM;
@@ -6444,10 +6455,9 @@ aarch64_simd_imm_scalar_p (rtx x, enum machine_mode mode ATTRIBUTE_UNUSED)
 
 bool
 aarch64_mov_operand_p (rtx x,
-		   enum aarch64_symbol_context context ATTRIBUTE_UNUSED,
+		   enum aarch64_symbol_context context,
 		   enum machine_mode mode)
 {
-
   if (GET_CODE (x) == HIGH
aarch64_valid_symref (XEXP (x, 0), GET_MODE (XEXP (x, 0
 return true;
@@ -6458,7 +6468,8 @@ aarch64_mov_operand_p (rtx x,
   if (GET_CODE (x) == SYMBOL_REF  mode == DImode  CONSTANT_ADDRESS_P (x))
 return true;
 
-  return false;
+  return aarch64_classify_symbolic_expression (x, context)
+== SYMBOL_TINY_ABSOLUTE;
 }
 
 /* Return a const_int vector of VAL.  */
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index 16c4385..3248f61 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -166,7 +166,7 @@
 })
 
 (define_predicate aarch64_mov_operand
-  (and (match_code reg,subreg,mem,const_int,symbol_ref,high)
+  (and (match_code reg,subreg,mem,const,const_int,symbol_ref,label_ref,high)
(ior (match_operand 0 register_operand)
 	(ior (match_operand 0 memory_operand)
 		 (match_test aarch64_mov_operand_p (op, SYMBOL_CONTEXT_ADR, mode))

Re: [Patch, Fortran] Enable the generation of the FINALization wrapper function

2013-05-29 Thread Tobias Burnus


Janus Weil wrote:

Build and regtested on x86-64-gnu-linux.
OK for the trunk?

I think this patch is ok. Just one nit:

@@ -5571,7 +5569,7 @@ gfc_dump_module (const char *name, int dump_flag)
   FIXME: For backwards compatibility with the old uncompressed
   module format, write an extra empty line. When the module version
   is bumped, this can be removed.  */
-  gzprintf (module_fp, GFORTRAN module version '%s' created from %s\n\n,
+  gzprintf (module_fp, GFORTRAN module version '%s' created from %s\n,
  MOD_VERSION, gfc_source_file);

Here you should remove the FIXME.


I thought I had done this - but seemingly I missed that one. DONE.


Dominique was that brave and tested the posted patches. Doing so, he 
found that the finalization wrapper causes an ICE with -m32. The reason 
is that I use gfc_convert_type to convert the kind value of result 
variable of the RANK intrinsic.


It turned out that calling that for a no op conversion leads to an ICE 
(internal error). (The conversion is supposed to convert the 
default-kind result value to the gfc_array_index_kind, otherwise one 
runs into an ICE at tree level.) Well, the solution is simple: Only 
convert it when needed. Hence, I included the following patch in the 
committal (Rev. 199409).


--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -1644 +1644,2 @@ generate_finalization_wrapper (gfc_symbol *derived, 
gfc_namespace *ns,

-  gfc_convert_type (rank, idx-ts, 2);
+  if (rank-ts.kind != idx-ts.kind)
+gfc_convert_type_warn (rank, idx-ts, 2, 0);


Thanks to both of you for the review and the testing.

 * * *

Now the important ingredients for finalization are all in, the next task 
is to actually do finalization. The first patch has been posted, but it 
will take a while until all finalization calls will be available.


Tobias

RE: [PING 2] [PATCH RX] Added target specific macros for RX100, RX200, and RX600

2013-05-29 Thread Sandeep Kumar Singh

Hi Nick,

 The patch is fine, please apply.
Thanks for the review.
I do not have write approvals to gcc-svn.
Could you please commit this for me?

Regards,
Sandeep Kumar Singh,
KPIT Cummins InfoSystems Ltd.
Pune, India


 -Original Message-
 From: nick clifton [mailto:ni...@redhat.com]
 Sent: Wednesday, May 29, 2013 6:38 PM
 To: Sandeep Kumar Singh; gcc-patches@gcc.gnu.org
 Cc: Kaushik Phatak; Cecilia Rodrigues
 Subject: Re: [PING 2] [PATCH RX] Added target specific macros for RX100,
 RX200, and RX600
 
 Hi Sandeep,
 
  Below patch added target specific macros for RX100, RX200, and RX600
 
  http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00050.html
 
  Please review the patch and let me know if there should be any
  modifications in it?
 
 Sorry - I missed your original posting.
 
 The patch is fine, please apply.
 
 Cheers
Nick

Re: [ada, build] host/target configuration

2013-05-29 Thread Paolo Bonzini

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Il 29/05/2013 12:50, Thomas Schwinge ha scritto:
 How about we use something like the following [...] patch?  In 
 essence, replace the manual parsing in
 gcc/ada/gcc-interface/Makefile.in by using the values the
 configure script already computed for us.  This is largely
 straightforward (I would hope); please especially review the 
 more involved cases such as the one where I'm now using
 linux%eabi -- for example, does that do the right thing or
 interfere with the Android configurations?
 
 The target_cpu_canonical substitution has been added in commit 
 369e542b3ad1c0acfa9bfaeb72b338d8db5ba2ef (2009-02-27, r144463,
 schwab) but unused ever since, thus removed.
 
 I'll now be testing for x86 GNU/Linux and GNU/Hurd; further
 testing appreciated.
 For these two configurations, I have now successfully tested the
 patch I posted.  Further review/testing appreciated.  (Adding
 »build machinery (*.in)« maintainers.)
 
 

I agree that using $(target_os) and linux% matches is a better way to
strip out the -gnus.

The patch is long, but I think I reviewed it carefully enough.  It's
okay for mainline.

Regarding the android change:

 -ifeq ($(strip $(filter-out arm%-linux,$(arch)-$(osys)) $(if
 $(findstring eabi,$(word 4,$(targ))),,$(word 4,$(targ,) +ifeq
 ($(strip $(filter-out arm% linux%eabi,$(target_cpu)
 $(target_os))),)

This is okay, it will match arm-none-linux-androideabi both before and
after (perhaps linux-%eabi would be more readable).

However, it seems that the first androideabi snippet was dead code.
Can you delete it in a follow-up?

Paolo
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.19 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJRpg7xAAoJEBvWZb6bTYbyEKgP/RI1fU8fEdzH1UK+3BG8sqgF
DSoK0mo25PBIGHYT8p4Qe2a9tE2fZys93NHufPmQnVQ1M/YqmB5JyspCoe0mqke0
fQUMI8DpV0f+g9MquIZFDYrceR4qGEaFZ9KlnHGcpErA5zs1Wb7Shj6SoWt4I3b3
2dQy61hD7DcVe1xctG349xU6Hc2j34VCk8axUcpFCm9iD6UBB8GAUmZCk6egkIJr
BWs9anasAhmh1E4aEOjGFvhWQcB23pmxSkIIk5wSkKBHZpHURIwXqOUmK1I0ebme
j7ee2+9SzSwk0emY4lBdDg14byVSvRMKnMDpFdIfHTJvYEqm6ha8jeNxBKqM/xvk
pz7AX1jwfsX22QSWiZuqcqqYBR0y7qVA0/5rjyFitLB85PY+yIfe5BNH4d5KdeuF
yDuss+n6vZrYCnyRpk0P4VbLQ5gnrtqDhxTKPABf3eMNFI6iyyadRh0qLQVrnyX5
qmUciuZTMouC4rAx1H/oAAHZbPmeHWS6cH7ggO0WlS5Us/Ws/s3OdFUePygXhQpL
xpOgqjSaDhE6s4PQAZkX79EOqvtEJynX/gGi3NTyZey/Fsp9n5JFperjAUfPG/Bt
2Cu7tDmcXLKlNKeaL7IuLppVPmVicfXTUAnYemWja78ijbbKLeD2W5OGzsfxTMuV
QoH7oqV2NqSIayTAmYg6
=Sh4D
-END PGP SIGNATURE-

Re: Fix PR57268

2013-05-29 Thread Jeff Law


On 05/29/2013 06:52 AM, Steven Bosscher wrote:

Hello,

On Wed, May 29, 2013 at 2:01 PM, Dinar Temirbulatov wrote:

Hi,
I noticed that the scheduler created long dependence list about ~9000
elements long and slowed compilation time for about an hour. Attached
patch flushes the dependence list is case of it is longer than
MAX_PENDING_LIST_LENGTH. Tested with gcc testsite on x86_64-linux-gnu
with c and c++ enabled. Ok for trunk?
thanks, Dinar.

2013-05-28 Dinar Temirbulatov dinar at kugelworks dot com

 PR rtl-optimization/57268
 * sched-deps.c (sched_analyze_2): Flush dependence list
then it is longer than MAX_PENDING_LIST_LENGTH.



 * sched-deps.c (sched_analyze_2): Flush dependence lists if
 the sum of the read and write lists exceeds MAX_PENDING_LIST_LENGTH.




  if (!deps-readonly)
-  add_insn_mem_dependence (deps, true, insn, x);
+  {
+   if ((deps-pending_read_list_length + 
deps-pending_write_list_length)
+MAX_PENDING_LIST_LENGTH)
+ flush_pending_lists (deps, insn, true, true);
+add_insn_mem_dependence (deps, true, insn, x);
+}


The flush_pending_lists, add_insn_mem_dependence and } lines are
not indented correctly. The if (...+...) line is too long (max. 80
characters per line). The GCC style would be

 if (!deps-readonly)
   {
 if ((deps-pending_read_list_length
  + deps-pending_write_list_length)
  MAX_PENDING_LIST_LENGTH)
   flush_pending_lists (deps, insn, true, true);
 add_insn_mem_dependence (deps, true, insn, x);
   }

(The aesthetics of GCC code style is a matter for debate, but not here
and now ;-)

And just to be clear, with Steven's suggested changes, this patch is OK.

jeff

[PATCH] Fix PR57441 (memory error in SLSR)

2013-05-29 Thread Bill Schmidt

Hi,

Prior to adding conditional candidates to SLSR, the size of the
increment vector was bounded by the number of related candidates, and
the vector was malloc'd to be that size.  With conditional candidates,
there can be an additional increment for each related PHI operand,
causing potential overflow and heap corruption.  This patch removes the
no longer valid assumption and allocates a fixed-size vector.

Bootstrap and regtest in progress on powerpc64-linux-gnu.  Ok for trunk
if everything checks out?

Thanks,
Bill


2013-05-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

* gimple-ssa-strength-reduction.c (analyze_candidates_and_replace):
Don't limit size of incr_vec to number of candidates.


Index: gcc/gimple-ssa-strength-reduction.c
===
--- gcc/gimple-ssa-strength-reduction.c (revision 199406)
+++ gcc/gimple-ssa-strength-reduction.c (working copy)
@@ -3361,7 +3361,6 @@ analyze_candidates_and_replace (void)
 less expensive to calculate than the replaced statements.  */
   else
{
- int length;
  enum machine_mode mode;
  bool speed;
 
@@ -3372,14 +3371,11 @@ analyze_candidates_and_replace (void)
 
  /* If all candidates have already been replaced under other
 interpretations, nothing remains to be done.  */
- length = count_candidates (c);
- if (!length)
+ if (!count_candidates (c))
continue;
- if (length  MAX_INCR_VEC_LEN)
-   length = MAX_INCR_VEC_LEN;
 
  /* Construct an array of increments for this candidate chain.  */
- incr_vec = XNEWVEC (incr_info, length);
+ incr_vec = XNEWVEC (incr_info, MAX_INCR_VEC_LEN);
  incr_vec_len = 0;
  record_increments (c);

Re: Fix PR 53743 and other -freorder-blocks-and-partition failures

2013-05-29 Thread Teresa Johnson

On Thu, May 23, 2013 at 6:18 AM, Teresa Johnson tejohn...@google.com wrote:
 On Wed, May 22, 2013 at 2:05 PM, Teresa Johnson tejohn...@google.com wrote:
 Revised patch included below. The spacing of my pasted in patch text
 looks funky again, let me know if you want the patch as an attachment
 instead.

 I addressed all of Steven's comments, except for the suggestion to use
 gcc_assert
 instead of error() in verify_hot_cold_block_grouping() to keep this 
 consistent
 with the rest of the verify_flow_info subroutines (let me know if this is 
 ok).

 I fixed this issue too, which was actually in
 insert_section_boundary_note(), so that it gcc_asserts more
 efficiently as suggested. Retested, latest patch below.

 Honza, would you be able to review the patch?

Ping. Still needs a global maintainer to review and approve.

Also, I submitted a PR for the debug range issue:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57451

Thanks!
Teresa


 Thanks!
 Teresa


 The other main changes:
 (1) Added several test cases (cloned from the torture subdirectories,
 where I manually
 built/ran with FDO and -freorder-blocks-and-partition with both the
 current trunk and
 my fixed trunk compiler, and was able to expose some failures I fixed.
 (2) Changed existing tree-prof tests that used
 -freorder-blocks-and-partition to be
 built with -O2 instead of -O, so that partitioning actually kicks in.
 (3) Fixed a couple of failures in the new
 verify_hot_cold_block_grouping() checks
 exposed by the torture tests I ran manually with splitting (2 of the
 tests cloned
 to tree-prof in this patch). One was in computed goto where we were
 too aggressive
 about cloning crossing edges, and the other was in rtl_split_edge
 called from the stack
 pass which was not correctly inserting the new bb in the correct partition 
 since
 bb layout is complete at that point.

 Re-tested on x86_64-unknown-linux-gnu with bootstrap and profiledbootstrap
 builds and regression testing. Re-built/ran cpu2006int with profile
 feedback and -freorder-blocks-and-partition enabled.

 Ok for trunk?

 Thanks!
 Teresa

 2013-05-23  Teresa Johnson  tejohn...@google.com

 * ifcvt.c (find_if_case_1): Replace BB_COPY_PARTITION with assert
 as this is now done by redirect_edge_and_branch_force.
 * function.c (thread_prologue_and_epilogue_insns): Insert new bb after
 barriers, and fix interaction with splitting.
 * emit-rtl.c (try_split): Copy REG_CROSSING_JUMP notes.
 * cfgcleanup.c (try_forward_edges): Fix early return value to properly
 reflect changes made in the routine.
 * bb-reorder.c (emit_barrier_after_bb): Move to cfgrtl.c.
 (fix_up_fall_thru_edges): Remove incorrect check for bb layout order
 since this is called in cfglayout mode, and replace partition fixup
 with assert as that is now done by force_nonfallthru_and_redirect.
 (add_reg_crossing_jump_notes): Handle the fact that some jumps may
 already be marked with region crossing note.
 (insert_section_boundary_note): Make non-static, gate on flag
 has_bb_partition, rewrite to also check for multiple partitions.
 (rest_of_handle_reorder_blocks): Remove call to
 insert_section_boundary_note, now done later during free_cfg.
 (duplicate_computed_gotos): Don't duplicate partition crossing edge.
 * bb-reorder.h (insert_section_boundary_note): Declare.
 * Makefile.in (cfgrtl.o): Depend on bb-reorder.h
 * cfgrtl.c (rest_of_pass_free_cfg): If partitions exist
 invoke insert_section_boundary_note.
 (try_redirect_by_replacing_jump): Remove unnecessary
 check for region crossing note.
 (fixup_partition_crossing): New function.
 (rtl_redirect_edge_and_branch): Fixup partition boundaries.
 (emit_barrier_after_bb): Move here from bb-reorder.c, handle insertion
 in non-cfglayout mode.
 (force_nonfallthru_and_redirect): Fixup partition boundaries,
 remove old code that tried to do this. Emit barrier correctly
 when we are in cfglayout mode.
 (last_bb_in_partition): New function.
 (rtl_split_edge): Correctly fixup partition boundaries.
 (commit_one_edge_insertion): Remove old code that tried to
 fixup region crossing edge since this is now handled in
 split_block, and set up insertion point correctly since
 block may now end in a jump.
 (verify_hot_cold_block_grouping): Guard against checking when not in
 linearized RTL mode.
 (rtl_verify_edges): Add checks for incorrect/missing REG_CROSSING_JUMP
 notes.
 (rtl_verify_flow_info_1): Move verify_hot_cold_block_grouping to
 rtl_verify_flow_info, so not called in cfglayout mode.
 (rtl_verify_flow_info): Move verify_hot_cold_block_grouping here.
 (fixup_reorder_chain): Remove old code that attempted to fixup region
 crossing note as this is now handled in 
 force_nonfallthru_and_redirect.
 (duplicate_insn_chain): Don't duplicate switch section notes.
 (rtl_can_remove_branch_p):

Re: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-29 Thread Rainer Orth

Rainer Orth r...@cebitec.uni-bielefeld.de writes:

 Iyer, Balaji V balaji.v.i...@intel.com writes:
[...]
 This patch is committed to trunk at revision 199389.

 ... and immediately broke Solaris bootstrap, cf. PR bootstrap/57450.

Fixed implementing Richard's suggestion from the PR.  Bootstrapped
without regressions on i386-pc-solaris2.10 and x86_64-unknown-linux-gnu,
installed as obvious.

Rainer


2013-05-29  Rainer Orth  r...@cebitec.uni-bielefeld.de

PR bootstrap/57450
* c-array-notation.c (length_mismatch_in_expr_p): Use absu_hwi.
(build_array_notation_expr): Likewise.

# HG changeset patch
# Parent 22d5dfeeb2006caf8be6803106b454a19cfe6513
Fix c/c-array-notation.c compilation failure (PR bootstrap/57450)

diff --git a/gcc/c/c-array-notation.c b/gcc/c/c-array-notation.c
--- a/gcc/c/c-array-notation.c
+++ b/gcc/c/c-array-notation.c
@@ -116,7 +116,7 @@ length_mismatch_in_expr_p (location_t lo
 		{
 		  l_node = int_cst_value (list[ii][jj]);
 		  l_start = int_cst_value (start);
-		  if (abs (l_start) != abs (l_node))
+		  if (absu_hwi (l_start) != absu_hwi (l_node))
 		{
 		  error_at (loc, length mismatch in expression);
 		  return true;
@@ -1561,7 +1561,7 @@ build_array_notation_expr (location_t lo
   HOST_WIDE_INT r_length = int_cst_value (rhs_length[0][0]);
   /* Length can be negative or positive.  As long as the magnitude is OK,
 	 then the array notation is valid.  */
-  if (abs (l_length) != abs (r_length))
+  if (absu_hwi (l_length) != absu_hwi (r_length))
 	{
 	  error_at (location, length mismatch between LHS and RHS);
 	  pop_stmt_list (an_init);

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] Fix PR57441 (memory error in SLSR)

2013-05-29 Thread Richard Biener

On Wed, May 29, 2013 at 4:50 PM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 Prior to adding conditional candidates to SLSR, the size of the
 increment vector was bounded by the number of related candidates, and
 the vector was malloc'd to be that size.  With conditional candidates,
 there can be an additional increment for each related PHI operand,
 causing potential overflow and heap corruption.  This patch removes the
 no longer valid assumption and allocates a fixed-size vector.

 Bootstrap and regtest in progress on powerpc64-linux-gnu.  Ok for trunk
 if everything checks out?

Ok with referencing the PR in the changelog and adding the testcase.

Thanks,
Richard.

 Thanks,
 Bill


 2013-05-29  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * gimple-ssa-strength-reduction.c (analyze_candidates_and_replace):
 Don't limit size of incr_vec to number of candidates.


 Index: gcc/gimple-ssa-strength-reduction.c
 ===
 --- gcc/gimple-ssa-strength-reduction.c (revision 199406)
 +++ gcc/gimple-ssa-strength-reduction.c (working copy)
 @@ -3361,7 +3361,6 @@ analyze_candidates_and_replace (void)
  less expensive to calculate than the replaced statements.  */
else
 {
 - int length;
   enum machine_mode mode;
   bool speed;

 @@ -3372,14 +3371,11 @@ analyze_candidates_and_replace (void)

   /* If all candidates have already been replaced under other
  interpretations, nothing remains to be done.  */
 - length = count_candidates (c);
 - if (!length)
 + if (!count_candidates (c))
 continue;
 - if (length  MAX_INCR_VEC_LEN)
 -   length = MAX_INCR_VEC_LEN;

   /* Construct an array of increments for this candidate chain.  */
 - incr_vec = XNEWVEC (incr_info, length);
 + incr_vec = XNEWVEC (incr_info, MAX_INCR_VEC_LEN);
   incr_vec_len = 0;
   record_increments (c);

RE: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-29 Thread David Edelsohn

Balaji,

Thanks for this new feature and I am relieved that so much of it works
successfully on PowerLinux and AIX.

I know that you have received a deluge of reports of issues with the
cilkplus support that you slowly are working through.  I am seeing the
following new testsuite failures on AIX:

/nasfarm/dje/src/src/gcc/testsuite/c-c++-common/cilk-plus/AN/sec_implicit2.c:
In function 'main2':
/nasfarm/dje/src/src/gcc/testsuite/c-c++-common/cilk-plus/AN/sec_implicit2.c:23:15:
error: __sec_implicit_index parameter must be an integer constant
expression

/nasfarm/dje/src/src/gcc/testsuite/c-c++-common/cilk-plus/AN/sec_reduce_max_min_ind.c:
In function 'main2':
/nasfarm/dje/src/src/gcc/testsuite/c-c++-common/cilk-plus/AN/sec_reduce_max_min_ind.c:23:28:
error: __sec_reduce_min_ind or __sec_reduce_max_ind cannot have arrays
with dimension greater than 1
/nasfarm/dje/src/src/gcc/testsuite/c-c++-common/cilk-plus/AN/sec_reduce_max_min_ind.c:27:28:
error: __sec_reduce_min_ind or __sec_reduce_max_ind cannot have arrays
with dimension greater than 1

I hope that you can help resolve these errors.

Thanks, David

Re: [PATCH] Fix incorrect discriminator assignment.

2013-05-29 Thread Dehao Chen

Patch updated and committed as r199412.

Thanks,
Dehao

Re: [rs6000] Improve FP comparisons with -fno-trapping-math

2013-05-29 Thread David Edelsohn

* config/rs6000/predicates.md (rs6000_cbranch_operator): Accept some
unordered comparison operators when -fno-trapping-math is in effect
on the e500.
* config/rs6000/rs6000.c (rs6000_generate_compare): Remove dead code
and implement unordered comparison operators properly on the e500.

I'm okay with most of the change, but I have a question: What happened
to the unsigned comparisons and LTGT?

I'm not going to insist, but this probably deserves an e500-specific
testcase that it's generating the correct results and not calling
libgcc for unordered comparisons.

Thanks, David

Re: [PING]RE: [patch] cilkplus: Array notation for C patch

2013-05-29 Thread H.J. Lu

On Tue, May 28, 2013 at 5:35 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, May 28, 2013 at 1:02 PM, Iyer, Balaji V balaji.v.i...@intel.com 
 wrote:


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of Richard Henderson
 Sent: Tuesday, May 28, 2013 2:52 PM
 To: Iyer, Balaji V
 Cc: Jakub Jelinek; Aldy Hernandez; Jeff Law; 'Joseph S. Myers'; 
 'gcc-patches'
 Subject: Re: [PING]RE: [patch] cilkplus: Array notation for C patch

 On 05/28/2013 11:44 AM, Iyer, Balaji V wrote:
  i Richard, Jakub et al..
  I think I have fixed everything requested by RTH
 (http://gcc.gnu.org/ml/gcc-patches/2013-05/msg01400.html).  I think I have
 also moved the tests in the correct place Jakub requested. It is passing 
 all the
 correct regression tests and not affecting others.
 
  Is this patch OK for trunk?

 Yes, it's ok.

 This patch is committed to trunk at revision 199389.

 Thanks,


 On Linux/x32, I got

 FAIL: c-c++-common/cilk-plus/AN/if_test.c  -O0 -fcilkplus execution test
 FAIL: c-c++-common/cilk-plus/AN/if_test.c  -fcilkplus -O0 -std=c99
 execution test
 FAIL: c-c++-common/cilk-plus/AN/if_test.c  -fcilkplus -g -O0 -std=c99
 execution test
 FAIL: c-c++-common/cilk-plus/AN/if_test.c  -fcilkplus -g -std=c99 execution 
 test
 FAIL: c-c++-common/cilk-plus/AN/if_test.c  -fcilkplus -std=c99 execution test
 FAIL: c-c++-common/cilk-plus/AN/if_test.c  -fcilkplus execution test
 FAIL: c-c++-common/cilk-plus/AN/if_test.c  -g -O0 -fcilkplus execution test
 FAIL: c-c++-common/cilk-plus/AN/if_test.c  -g -fcilkplus execution test


 [x32@gnu-35 gcc]$ /export/gnu/import/git/gcc-test-x32/bld/gcc/xgcc
 -B/export/gnu/import/git/gcc-test-x32/bld/gcc/
 /export/gnu/import/git/gcc-test-x32/src-trunk/gcc/testsuite/c-c++-common/cilk-plus/AN/if_test.c
  -fno-diagnostics-show-caret -fdiagnostics-color=never   -fcilkplus -g
 -O0 -std=c99 -fcilkplus  -lm   -m32 -o ./if_test.exe
 [x32@gnu-35 gcc]$ /export/gnu/import/git/gcc-test-x32/bld/gcc/xgcc
 -B/export/gnu/import/git/gcc-test-x32/bld/gcc/
 /export/gnu/import/git/gcc-test-x32/src-trunk/gcc/testsuite/c-c++-common/cilk-plus/AN/if_test.c
  -fno-diagnostics-show-caret -fdiagnostics-color=never   -fcilkplus -g
 -O0 -std=c99 -fcilkplus  -lm   -mx32 -o ./if_test.exe
 [x32@gnu-35 gcc]$ ./if_test.exe
 [x32@gnu-35 gcc]$ echo $?
 5
 [x32@gnu-35 gcc]$

 (gdb) r
 Starting program:
 /export/gnu/import/git/gcc-test-x32/bld/gcc/testsuite/gcc/if_test.exe

 Breakpoint 1, main2 (argc=3, argv=0xd240)
 at 
 /export/gnu/import/git/gcc-test-x32/src-trunk/gcc/testsuite/c-c++-common/cilk-plus/AN/if_test.c:131
 131  return 5;
 Missing separate debuginfos, use: debuginfo-install glibc-2.16-30.1.fc18.x32
 (gdb) p ii
 $1 = 0
 (gdb) p array2_check
 $2 = {5, 5, 5, 5, 5, 5, 5, 5, 5, 5}
 (gdb) p array2
 $3 = {10, 10, 10, 10, 10, 10, 10, 10, 10, 10}
 (gdb)

 Does cilkplus assume ptr_mode == word_mode?  On x32, ptr_mode == SImode
 and word_mode == DImode.


I opened:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57452

There is a bug in the testcase.  The problem is the second
element in array notation is not the size of array and GCC
doesn't detect the problem.

--
H.J.

Re: Fix PR57268

2013-05-29 Thread Dinar Temirbulatov

* sched-deps.c (sched_analyze_2): Flush dependence lists if
the sum of the read and write lists exceeds MAX_PENDING_LIST_LENGTH.

On Wed, May 29, 2013 at 6:49 PM, Jeff Law l...@redhat.com wrote:
 On 05/29/2013 06:52 AM, Steven Bosscher wrote:

 Hello,

 On Wed, May 29, 2013 at 2:01 PM, Dinar Temirbulatov wrote:

 Hi,
 I noticed that the scheduler created long dependence list about ~9000
 elements long and slowed compilation time for about an hour. Attached
 patch flushes the dependence list is case of it is longer than
 MAX_PENDING_LIST_LENGTH. Tested with gcc testsite on x86_64-linux-gnu
 with c and c++ enabled. Ok for trunk?
 thanks, Dinar.

 2013-05-28 Dinar Temirbulatov dinar at kugelworks dot com

  PR rtl-optimization/57268
  * sched-deps.c (sched_analyze_2): Flush dependence list
 then it is longer than MAX_PENDING_LIST_LENGTH.



  * sched-deps.c (sched_analyze_2): Flush dependence lists if
  the sum of the read and write lists exceeds
 MAX_PENDING_LIST_LENGTH.



   if (!deps-readonly)
 -  add_insn_mem_dependence (deps, true, insn, x);
 +  {
 +   if ((deps-pending_read_list_length +
 deps-pending_write_list_length)
 +MAX_PENDING_LIST_LENGTH)
 + flush_pending_lists (deps, insn, true, true);
 +add_insn_mem_dependence (deps, true, insn, x);
 +}


 The flush_pending_lists, add_insn_mem_dependence and } lines are
 not indented correctly. The if (...+...) line is too long (max. 80
 characters per line). The GCC style would be

  if (!deps-readonly)
{
  if ((deps-pending_read_list_length
   + deps-pending_write_list_length)
   MAX_PENDING_LIST_LENGTH)
flush_pending_lists (deps, insn, true, true);
  add_insn_mem_dependence (deps, true, insn, x);
}

 (The aesthetics of GCC code style is a matter for debate, but not here
 and now ;-)

 And just to be clear, with Steven's suggested changes, this patch is OK.

 jeff


fix1.patch
Description: Binary data

Re: [PATCH] Fix BB vectorization cost model wrt scalar uses

2013-05-29 Thread Tobias Burnus

The patch breaks compiling Open MPI with default compilation flags (for 
that file: -O3 -fno-strict-aliasing):


../../../../../ompi/mca/btl/tcp/btl_tcp_frag.c:100:6: internal compiler 
error: in operator[], at vec.h:815

 bool mca_btl_tcp_frag_send(mca_btl_tcp_frag_t* frag, int sd)


See PR 57453 for a reduced test case.

Tobias


Richard Biener wrote:

This properly accounts for remaining unvectorized scalar uses of
stmts we vectorize during BB vectorization.  Previously we
happily retained all of the scalar loads and computations, just
optimizing the final stores, alongside the vectorized code.
To account for scalar uses properly the following patch simply
makes the scalar cost cheaper when stmts have to be kept live.

Support for vectorizing those by using vector extracts is
currently missing.  It happens quite frequently that CSE
looks through the final scalar global store, re-using the stored
value and thus keeping the whole computation sequence live.

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Richard.

2013-05-28  Richard Biener  rguent...@suse.de

* tree-vect-slp.c (vect_bb_slp_scalar_cost): New function
computing scalar cost offsetted by stmts that are kept live
by scalar uses.
(vect_bb_vectorization_profitable_p): Use vect_bb_slp_scalar_cost
for computation of scalar cost.

* gcc.dg/vect/bb-slp-32.c: New testcase.

Index: gcc/tree-vect-slp.c
===
*** gcc/tree-vect-slp.c (revision 199375)
--- gcc/tree-vect-slp.c (working copy)
*** vect_slp_analyze_operations (bb_vec_info
*** 1898,1903 
--- 1898,1966 
 return true;
   }

+
+ /* Compute the scalar cost of the SLP node NODE and its children
+and return it.  Do not account defs that are marked in LIFE and
+update LIFE according to uses of NODE.  */
+
+ static unsigned
+ vect_bb_slp_scalar_cost (slp_tree node, vecbool, va_stack life)
+ {
+   unsigned scalar_cost = 0;
+   unsigned i;
+   gimple stmt;
+   slp_tree child;
+
+   FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, stmt)
+ {
+   unsigned stmt_cost;
+   ssa_op_iter op_iter;
+   def_operand_p def_p;
+   stmt_vec_info stmt_info;
+
+   if (life[i])
+   continue;
+
+   /* If there is a non-vectorized use of the defs then the scalar
+  stmt is kept live in which case we do not account it or any
+required defs in the SLP children in the scalar cost.  This
+way we make the vectorization more costly when compared to
+the scalar cost.  */
+   FOR_EACH_SSA_DEF_OPERAND (def_p, stmt, op_iter, SSA_OP_DEF)
+   {
+ imm_use_iterator use_iter;
+ gimple use_stmt;
+ FOR_EACH_IMM_USE_STMT (use_stmt, use_iter, DEF_FROM_PTR (def_p))
+   if (!vinfo_for_stmt (use_stmt)
+   || !STMT_VINFO_VECTORIZABLE (vinfo_for_stmt (use_stmt)))
+ {
+   life[i] = true;
+   BREAK_FROM_IMM_USE_STMT (use_iter);
+ }
+   }
+   if (life[i])
+   continue;
+
+   stmt_info = vinfo_for_stmt (stmt);
+   if (STMT_VINFO_DATA_REF (stmt_info))
+ {
+   if (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info)))
+ stmt_cost = vect_get_stmt_cost (scalar_load);
+   else
+ stmt_cost = vect_get_stmt_cost (scalar_store);
+ }
+   else
+ stmt_cost = vect_get_stmt_cost (scalar_stmt);
+
+   scalar_cost += stmt_cost;
+ }
+
+   FOR_EACH_VEC_ELT (SLP_TREE_CHILDREN (node), i, child)
+ scalar_cost += vect_bb_slp_scalar_cost (child, life);
+
+   return scalar_cost;
+ }
+
   /* Check if vectorization of the basic block is profitable.  */

   static bool
*** vect_bb_vectorization_profitable_p (bb_v
*** 1931,1956 
   }

 /* Calculate scalar cost.  */
!   for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (si))
   {
!   stmt = gsi_stmt (si);
!   stmt_info = vinfo_for_stmt (stmt);
!
!   if (!stmt_info || !STMT_VINFO_VECTORIZABLE (stmt_info)
!   || !PURE_SLP_STMT (stmt_info))
! continue;
!
!   if (STMT_VINFO_DATA_REF (stmt_info))
! {
!   if (DR_IS_READ (STMT_VINFO_DATA_REF (stmt_info)))
! stmt_cost = vect_get_stmt_cost (scalar_load);
!   else
! stmt_cost = vect_get_stmt_cost (scalar_store);
! }
!   else
! stmt_cost = vect_get_stmt_cost (scalar_stmt);
!
!   scalar_cost += stmt_cost;
   }

 /* Complete the target-specific cost calculation.  */
--- 1994,2007 
   }

 /* Calculate scalar cost.  */
!   FOR_EACH_VEC_ELT (slp_instances, i, instance)
   {
!   vecbool, va_stack life;
!   vec_stack_alloc (bool, life, SLP_INSTANCE_GROUP_SIZE (instance));
!   life.quick_grow_cleared (SLP_INSTANCE_GROUP_SIZE (instance));
!   scalar_cost += vect_bb_slp_scalar_cost (SLP_INSTANCE_TREE (instance),
!

Re: Fix PR57268

2013-05-29 Thread Dinar Temirbulatov

Here is the corrected version of change. Also, I think, I need
write-after-approval access to commit the change.
thanks, Dinar,

PR rtl-optimization/57268
* sched-deps.c (sched_analyze_2): Flush dependence lists if
the sum of the read and write lists exceeds MAX_PENDING_LIST_LENGTH.

On Wed, May 29, 2013 at 6:49 PM, Jeff Law l...@redhat.com wrote:
 On 05/29/2013 06:52 AM, Steven Bosscher wrote:

 Hello,

 On Wed, May 29, 2013 at 2:01 PM, Dinar Temirbulatov wrote:

 Hi,
 I noticed that the scheduler created long dependence list about ~9000
 elements long and slowed compilation time for about an hour. Attached
 patch flushes the dependence list is case of it is longer than
 MAX_PENDING_LIST_LENGTH. Tested with gcc testsite on x86_64-linux-gnu
 with c and c++ enabled. Ok for trunk?
 thanks, Dinar.

 2013-05-28 Dinar Temirbulatov dinar at kugelworks dot com

  PR rtl-optimization/57268
  * sched-deps.c (sched_analyze_2): Flush dependence list
 then it is longer than MAX_PENDING_LIST_LENGTH.



  * sched-deps.c (sched_analyze_2): Flush dependence lists if
  the sum of the read and write lists exceeds
 MAX_PENDING_LIST_LENGTH.



   if (!deps-readonly)
 -  add_insn_mem_dependence (deps, true, insn, x);
 +  {
 +   if ((deps-pending_read_list_length +
 deps-pending_write_list_length)
 +MAX_PENDING_LIST_LENGTH)
 + flush_pending_lists (deps, insn, true, true);
 +add_insn_mem_dependence (deps, true, insn, x);
 +}


 The flush_pending_lists, add_insn_mem_dependence and } lines are
 not indented correctly. The if (...+...) line is too long (max. 80
 characters per line). The GCC style would be

  if (!deps-readonly)
{
  if ((deps-pending_read_list_length
   + deps-pending_write_list_length)
   MAX_PENDING_LIST_LENGTH)
flush_pending_lists (deps, insn, true, true);
  add_insn_mem_dependence (deps, true, insn, x);
}

 (The aesthetics of GCC code style is a matter for debate, but not here
 and now ;-)

 And just to be clear, with Steven's suggested changes, this patch is OK.

 jeff

[GOOGLE] Unrestrict early inline restrictions for AutoFDO

2013-05-29 Thread Dehao Chen

In gcc4-8, the max einline iterations are restricted to 1. For
AutoFDO, this is bad because early inline is not size restricted. This
patch allows einline to do multiple iterations in AutoFDO. It also
enables tracer optimization in AutoFDO.

Bootstrapped and passed regression test.

OK for googel-4_8?

Thanks,
Dehao

Index: gcc/ipa-inline.c
===
--- gcc/ipa-inline.c (revision 199416)
+++ gcc/ipa-inline.c (working copy)
@@ -2161,7 +2161,8 @@ early_inliner (void)
 {
   /* We iterate incremental inlining to get trivial cases of indirect
  inlining.  */
-  while (iterations  PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
+  while ((flag_auto_profile
+  || iterations  PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
   early_inline_small_functions (node))
  {
   timevar_push (TV_INTEGRATION);
Index: gcc/opts.c
===
--- gcc/opts.c (revision 199416)
+++ gcc/opts.c (working copy)
@@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
  opts-x_flag_peel_loops = value;
   if (!opts_set-x_flag_value_profile_transformations)
  opts-x_flag_value_profile_transformations = value;
+  if (!opts_set-x_flag_tracer)
+ opts-x_flag_tracer = value;
   if (!opts_set-x_flag_inline_functions)
  opts-x_flag_inline_functions = value;
   if (!opts_set-x_flag_ipa_cp)

Re: [PING 2] [PATCH RX] Added target specific macros for RX100, RX200, and RX600

2013-05-29 Thread nick clifton


Hi Sandeep,

 The patch is fine, please apply.

Thanks for the review.
I do not have write approvals to gcc-svn.
Could you please commit this for me?


Done.

Cheers
  Nick

Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO

2013-05-29 Thread Xinliang David Li

The early inlining part is ok. The tracer optimization should be
revisited -- we should have more fine grain control on it (for
instance, based on FDO summary -- but that should be common to
FDO/LIPO).

David

On Wed, May 29, 2013 at 9:39 AM, Dehao Chen de...@google.com wrote:
 In gcc4-8, the max einline iterations are restricted to 1. For
 AutoFDO, this is bad because early inline is not size restricted. This
 patch allows einline to do multiple iterations in AutoFDO. It also
 enables tracer optimization in AutoFDO.

 Bootstrapped and passed regression test.

 OK for googel-4_8?

 Thanks,
 Dehao

 Index: gcc/ipa-inline.c
 ===
 --- gcc/ipa-inline.c (revision 199416)
 +++ gcc/ipa-inline.c (working copy)
 @@ -2161,7 +2161,8 @@ early_inliner (void)
  {
/* We iterate incremental inlining to get trivial cases of indirect
   inlining.  */
 -  while (iterations  PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
 +  while ((flag_auto_profile
 +  || iterations  PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
early_inline_small_functions (node))
   {
timevar_push (TV_INTEGRATION);
 Index: gcc/opts.c
 ===
 --- gcc/opts.c (revision 199416)
 +++ gcc/opts.c (working copy)
 @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
   opts-x_flag_peel_loops = value;
if (!opts_set-x_flag_value_profile_transformations)
   opts-x_flag_value_profile_transformations = value;
 +  if (!opts_set-x_flag_tracer)
 + opts-x_flag_tracer = value;
if (!opts_set-x_flag_inline_functions)
   opts-x_flag_inline_functions = value;
if (!opts_set-x_flag_ipa_cp)

[PATCH] ARM: Don't clobber CC reg when it is live after the peephole window

2013-05-29 Thread Meador Inge

Hi All,

This patch fixes a bug in one of the ARM peephole2 optimizations.  The
peephole2 optimization in question was changed to use the CC-updating
form for all of the instructions produced by the peephole so that the
encoding will be smaller when compiling for thumb [1].  However, I don't
think that is always safe.

For example, the CC register might be used by something *after* the
peephole window.  The current peephole will transform:


(insn:TI 7 49 18 2 (set (reg:CC 24 cc)
(compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
(const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn}
 (nil))

(insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (reg:SI 3 r3 [140])
(const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp}
 (expr_list:REG_EQUIV (const_int 0 [0])
(nil)))

(insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc)
(const_int 0 [0]))
(set (reg:SI 3 r3 [138])
(const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp}
 (expr_list:REG_EQUIV (const_int 1 [0x1])
(nil)))

(insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
(reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
 (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
(nil)))

(insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
(reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
 (nil))

(insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 
*endname_1(D)+0 S1 A8])
(reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
 (expr_list:REG_DEAD (reg:CC 24 cc)
(expr_list:REG_DEAD (reg:QI 3 r3 [140])
(expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
(nil)

into the following:


(insn 59 49 60 2 (parallel [
(set (reg:CC 24 cc)
(compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
(const_int 0 [0])))
(set (reg:SI 1 r1)
(minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
(const_int 0 [0])))
]) repro.c:6 -1
 (nil))

(insn 60 59 61 2 (parallel [
(set (reg:CC 24 cc)
(compare:CC (const_int 0 [0])
(reg:SI 1 r1)))
(set (reg:SI 3 r3 [140])
(minus:SI (const_int 0 [0])
(reg:SI 1 r1)))
]) repro.c:6 -1
 (nil))

(insn 61 60 19 2 (parallel [
(set (reg:SI 3 r3 [140])
(plus:SI (plus:SI (reg:SI 3 r3 [140])
(reg:SI 1 r1))
(geu:SI (reg:CC 24 cc)
(const_int 0 [0]
(clobber (reg:CC 24 cc))
]) repro.c:6 -1
 (nil))

(insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
(reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
 (nil))

(insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
(reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
 (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
(nil)))

(insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
(const_int 0 [0]))
(set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 
*endname_1(D)+0 S1 A8])
(reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
 (expr_list:REG_DEAD (reg:CC 24 cc)
(expr_list:REG_DEAD (reg:QI 3 r3 [140])
(expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
(nil)


This gets compiled into the incorrect sequence:

   
ldrbr3, [r0, #0]
ldr r2, .L4
subsr1, r3, #0
rsbsr3, r1, #0
adcsr3, r3, r1
strne   r3, [r2, #0]
streq   r3, [r2, #0]
strneb r3, [r0, #0]


The conditional stores are now dealing with an incorrect condition state.

This patch fixes the problem by ensuring that the CC reg is dead after the
peephole window for the current peephole definition and falls back on the
original pre-PR46975 peephole when it is live.  Unfortunately I had trouble
coming up with a reproduction case against mainline.  I only noticed the bug
while working with some local changes that

ping: [patch] Support .eh_frame in crt1 x86_64 glibc (PR libgcc/57280, libc/15407)

2013-05-29 Thread Jan Kratochvil

Hi,

[patch update] Support .eh_frame in crt1 x86_64 glibc (PR libgcc/57280, 
libc/15407)
http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00775.html
Message-ID: 20130514191244.ga12...@host2.jankratochvil.net


Thanks,
Jan

Re: Fix PR57268

2013-05-29 Thread Maxim Kuvyrkov

On 30/05/2013, at 4:36 AM, Dinar Temirbulatov wrote:

 Here is the corrected version of change. Also, I think, I need
 write-after-approval access to commit the change.
thanks, Dinar,
 
PR rtl-optimization/57268
* sched-deps.c (sched_analyze_2): Flush dependence lists if
the sum of the read and write lists exceeds MAX_PENDING_LIST_LENGTH.


Jeff,

Could you approve write-after-approval privileges for Dinar and start the 
procedure to create a gcc/sourceware account for him?

Dinar has been working on the GNU toolchain since 2004 and now he will be more 
active in upstream development.

Thank you,

--
Maxim Kuvyrkov
KugelWorks

[Patch, Fortran] PR57456 - Handle ALLOCATE with typespec for CLASS

2013-05-29 Thread Tobias Burnus


Currently, ALLOCATE ignores the typespec for arrays. Such that:
   ALLOCATE (t2 :: var(5))
will allocate as much memory as the base type requires instead of using 
as much as t2 does.



I explicitly exclude characters as it otherwise will fail for 
allocate_with_typespec_1.f90, which uses:

 allocate(character :: c1(1))
The problem is that gfc_typenode_for_spec will return an array type and 
not an element type, hence TYPE_SIZE_UNIT won't work. The current 
version is fine, except for deferred-length strings. To properly handle 
it, one has to do it as gfortran currently does for scalars. (Best by 
consolidating the support. See PR.)


As I want to work on other things first, I would like to get this in as 
band aid - until someone has the time to do it properly. (I found it 
when trying to write a test case for the already submitted final patch.)


Build and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
2013-05-29  Tobias Burnus  bur...@net-b.de

	PR fortran/37336
	* trans-array.c (gfc_trans_dealloc_allocated): Support finalization.
	(structure_alloc_comps): Update caller.
	(gfc_trans_deferred_array): Call finalizer.
	* trans-array.h (gfc_trans_dealloc_allocated): Update prototype.
	* trans-decl.c (gfc_trans_deferred_vars): Don't deallocate/finalize
	variables of the main program.
	* trans-expr.c (gfc_conv_procedure_call): Support finalization.
	* trans-openmp.c (gfc_omp_clause_dtor,
	gfc_trans_omp_array_reduction): Update calls.
	* trans-stmt.c (gfc_trans_deallocate): Avoid double deallocation
	of alloc components.
	* trans.c (gfc_add_finalizer_call): New function.
	(gfc_deallocate_with_status,
	gfc_deallocate_scalar_with_status): Call it
	(gfc_build_final_call): Fix handling of scalar coarrays.

2013-05-29  Tobias Burnus  bur...@net-b.de

	PR fortran/37336
	* gfortran.dg/finalize_12.f90: New.
	* gfortran.dg/alloc_comp_basics_1.f90: Add BLOCK for
	end of scope finalization.
	* gfortran.dg/alloc_comp_constructor_1.f90: Ditto.
	* gfortran.dg/allocatable_scalar_9.f90: Ditto.
	* gfortran.dg/auto_dealloc_2.f90: Ditto.
	* gfortran.dg/class_19.f03: Ditto.
	* gfortran.dg/coarray_lib_alloc_1.f90: Ditto.
	* gfortran.dg/coarray_lib_alloc_2.f90: Ditto.
	* gfortran.dg/extends_14.f03: Ditto.
	* gfortran.dg/move_alloc_4.f90: Ditto.
	* gfortran.dg/typebound_proc_27.f03: Ditto.

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index be3a5a0..8160fcd 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -7243,7 +7243,7 @@ gfc_conv_array_parameter (gfc_se * se, gfc_expr * expr, bool g77,
 /* Generate code to deallocate an array, if it is allocated.  */
 
 tree
-gfc_trans_dealloc_allocated (tree descriptor, bool coarray)
+gfc_trans_dealloc_allocated (tree descriptor, bool coarray, gfc_expr *expr)
 {
   tree tmp;
   tree var;
@@ -7259,7 +7259,7 @@ gfc_trans_dealloc_allocated (tree descriptor, bool coarray)
  are already deallocated are ignored.  */
   tmp = gfc_deallocate_with_status (coarray ? descriptor : var, NULL_TREE,
 NULL_TREE, NULL_TREE, NULL_TREE, true,
-NULL, coarray);
+expr, coarray);
   gfc_add_expr_to_block (block, tmp);
 
   /* Zero the data pointer.  */
@@ -7548,7 +7548,7 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
 	{
 	  comp = fold_build3_loc (input_location, COMPONENT_REF, ctype,
   decl, cdecl, NULL_TREE);
-	  tmp = gfc_trans_dealloc_allocated (comp, c-attr.codimension);
+	  tmp = gfc_trans_dealloc_allocated (comp, c-attr.codimension, NULL);
 	  gfc_add_expr_to_block (tmpblock, tmp);
 	}
 	  else if (c-attr.allocatable)
@@ -7580,7 +7580,7 @@ structure_alloc_comps (gfc_symbol * der_type, tree decl,
 
 	  if (GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (comp)))
 	tmp = gfc_trans_dealloc_allocated (comp,
-	CLASS_DATA (c)-attr.codimension);
+	CLASS_DATA (c)-attr.codimension, NULL);
 	  else
 		{
 		  tmp = gfc_deallocate_scalar_with_status (comp, NULL_TREE, true, NULL,
@@ -8292,7 +8292,7 @@ gfc_trans_deferred_array (gfc_symbol * sym, gfc_wrapped_block * block)
   stmtblock_t cleanup;
   locus loc;
   int rank;
-  bool sym_has_alloc_comp;
+  bool sym_has_alloc_comp, has_finalizer;
 
   sym_has_alloc_comp = (sym-ts.type == BT_DERIVED
 			|| sym-ts.type == BT_CLASS)
@@ -8379,8 +8379,12 @@ gfc_trans_deferred_array (gfc_symbol * sym, gfc_wrapped_block * block)
 
   /* Allocatable arrays need to be freed when they go out of scope.
  The allocatable components of pointers must not be touched.  */
-  if (sym_has_alloc_comp  !(sym-attr.function || sym-attr.result)
-   !sym-attr.pointer  !sym-attr.save)
+  has_finalizer = sym-ts.type == BT_CLASS || sym-ts.type == BT_DERIVED
+		   ? gfc_is_finalizable (sym-ts.u.derived, NULL) : false;
+  if ((!sym-attr.allocatable || !has_finalizer)
+   sym_has_alloc_comp  !(sym-attr.function || sym-attr.result)
+   !sym-attr.pointer  !sym-attr.save
+   !sym-ns-proc_name-attr.is_main_program)
 {
   int rank;

Re: [Patch, Fortran] PR57456 - Handle ALLOCATE with typespec for CLASS

2013-05-29 Thread Tobias Burnus


Now as bonus with the proper patch.

Tobias

PS: I really wonder why Thunderbird's attach file dialog shows an 
outdated directory content, unless one hits F5, if one opens the dialog 
again :-(


Tobias Burnus wrote:

Currently, ALLOCATE ignores the typespec for arrays. Such that:
   ALLOCATE (t2 :: var(5))
will allocate as much memory as the base type requires instead of 
using as much as t2 does.



I explicitly exclude characters as it otherwise will fail for 
allocate_with_typespec_1.f90, which uses:

 allocate(character :: c1(1))
The problem is that gfc_typenode_for_spec will return an array type 
and not an element type, hence TYPE_SIZE_UNIT won't work. The current 
version is fine, except for deferred-length strings. To properly 
handle it, one has to do it as gfortran currently does for scalars. 
(Best by consolidating the support. See PR.)


As I want to work on other things first, I would like to get this in 
as band aid - until someone has the time to do it properly. (I found 
it when trying to write a test case for the already submitted final 
patch.)


Build and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias


2013-05-29  Tobias Burnus  bur...@net-b.de

	PR fortran/57456
	* trans-array.c (gfc_array_init_size): Use passed type spec,
	when available.
	(gfc_array_allocate): Pass typespec on.
	* trans-array.h (gfc_array_allocate): Update prototype.
	* trans-stmt.c (gfc_trans_allocate): Pass typespec on.

2013-05-29  Tobias Burnus  bur...@net-b.de

	PR fortran/57456
	* gfortran.dg/class_array_17.f90: New.

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index be3a5a0..b0748b7 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -4834,7 +4834,8 @@ static tree
 gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
 		 gfc_expr ** lower, gfc_expr ** upper, stmtblock_t * pblock,
 		 stmtblock_t * descriptor_block, tree * overflow,
-		 tree expr3_elem_size, tree *nelems, gfc_expr *expr3)
+		 tree expr3_elem_size, tree *nelems, gfc_expr *expr3,
+		 gfc_typespec *ts)
 {
   tree type;
   tree tmp;
@@ -4834,7 +4834,8 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
 	  tmp = TYPE_SIZE_UNIT (tmp);
 	}
 }
+  else if (ts-type != BT_UNKNOWN  ts-type != BT_CHARACTER)
+/* FIXME: Properly handle characters.  See PR 57456.  */
+tmp = TYPE_SIZE_UNIT (gfc_typenode_for_spec (ts));
   else
 tmp = TYPE_SIZE_UNIT (gfc_get_element_type (type));
 
@@ -5081,7 +5084,7 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
 bool
 gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
 		tree errlen, tree label_finish, tree expr3_elem_size,
-		tree *nelems, gfc_expr *expr3)
+		tree *nelems, gfc_expr *expr3, gfc_typespec *ts)
 {
   tree tmp;
   tree pointer;
@@ -5166,7 +5169,7 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
   size = gfc_array_init_size (se-expr, ref-u.ar.as-rank,
 			  ref-u.ar.as-corank, offset, lower, upper,
 			  se-pre, set_descriptor_block, overflow,
-			  expr3_elem_size, nelems, expr3);
+			  expr3_elem_size, nelems, expr3, ts);
 
   if (dimension)
 {
diff --git a/gcc/fortran/trans-array.h b/gcc/fortran/trans-array.h
index 6f44d79..d00e156 100644
--- a/gcc/fortran/trans-array.h
+++ b/gcc/fortran/trans-array.h
@@ -24,7 +24,7 @@ tree gfc_array_deallocate (tree, tree, tree, tree, tree, gfc_expr*);
 /* Generate code to initialize and allocate an array.  Statements are added to
se, which should contain an expression for the array descriptor.  */
 bool gfc_array_allocate (gfc_se *, gfc_expr *, tree, tree, tree, tree,
-			 tree, tree *, gfc_expr *);
+			 tree, tree *, gfc_expr *, gfc_typespec *);
 
 /* Allow the bounds of a loop to be set from a callee's array spec.  */
 void gfc_set_loop_bounds_from_array_spec (gfc_interface_mapping *,
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 7812934..7759b86 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -4925,7 +4925,7 @@ gfc_trans_allocate (gfc_code * code)
 
   nelems = NULL_TREE;
   if (!gfc_array_allocate (se, expr, stat, errmsg, errlen, label_finish,
-			   memsz, nelems, code-expr3))
+			   memsz, nelems, code-expr3, code-ext.alloc.ts))
 	{
 	  bool unlimited_char;
 
--- /dev/null	2013-05-29 07:55:34.977108520 +0200
+++ gcc/gcc/testsuite/gfortran.dg/class_array_17.f90	2013-05-29 19:36:00.239941803 +0200
@@ -0,0 +1,34 @@
+! { dg-do compile }
+! { dg-options -fdump-tree-original }
+!
+! PR fortran/57456
+!
+module m
+  implicit none
+  type t
+integer :: i
+   end type t
+  type, extends(t) :: t2
+integer :: j
+   end type t2
+end module m
+
+program test
+  use m
+  implicit none
+  integer :: i
+  class(t), save, allocatable :: y(:)
+
+  allocate (t :: y(5))
+  select type(y)
+  type is (t2)
+do i = 1, 5
+  y(i)%i = i
+

Re: Symtab cleanups 1/17

2013-05-29 Thread Jan Hubicka

 Honza,
 
 I would greatly appreciate if you would bootstrap GCC and run the
 testsuite with your symtab cleanup patches on ppc64-linux.

Testing on gcc110.fsffrance.org passed fine.  I will re-test x86_64-linux
and go ahead with the patch and will mind to continue testing the patches
on both platforms.

Honza

 
 Thanks, David

Re: Fix PR57268

2013-05-29 Thread Jeff Law


On 05/29/2013 10:36 AM, Dinar Temirbulatov wrote:

Here is the corrected version of change. Also, I think, I need
write-after-approval access to commit the change.
 thanks, Dinar,

 PR rtl-optimization/57268
 * sched-deps.c (sched_analyze_2): Flush dependence lists if
 the sum of the read and write lists exceeds MAX_PENDING_LIST_LENGTH.
Approved for the trunk.  I've separately sent instructions to get write 
access set up.


jeff

Re: [PATCH, rs6000] power8 patches, patch #6, direct move basic quad load/store

2013-05-29 Thread David Edelsohn

+  if (TARGET_DIRECT_MOVE)
+{
+  if (TARGET_POWERPC64)
+{
+  reload_gpr_vsx[TImode]= CODE_FOR_reload_gpr_from_vsxti;
+  reload_gpr_vsx[V2DFmode]  = CODE_FOR_reload_gpr_from_vsxv2df;
+  reload_gpr_vsx[V2DImode]  = CODE_FOR_reload_gpr_from_vsxv2di;
+  reload_gpr_vsx[V4SFmode]  = CODE_FOR_reload_gpr_from_vsxv4sf;
+  reload_gpr_vsx[V4SImode]  = CODE_FOR_reload_gpr_from_vsxv4si;
+  reload_gpr_vsx[V8HImode]  = CODE_FOR_reload_gpr_from_vsxv8hi;
+  reload_gpr_vsx[V16QImode] = CODE_FOR_reload_gpr_from_vsxv16qi;
+  reload_gpr_vsx[SFmode]= CODE_FOR_reload_gpr_from_vsxsf;
+
+  reload_vsx_gpr[TImode]= CODE_FOR_reload_vsx_from_gprti;
+  reload_vsx_gpr[V2DFmode]  = CODE_FOR_reload_vsx_from_gprv2df;
+  reload_vsx_gpr[V2DImode]  = CODE_FOR_reload_vsx_from_gprv2di;
+  reload_vsx_gpr[V4SFmode]  = CODE_FOR_reload_vsx_from_gprv4sf;
+  reload_vsx_gpr[V4SImode]  = CODE_FOR_reload_vsx_from_gprv4si;
+  reload_vsx_gpr[V8HImode]  = CODE_FOR_reload_vsx_from_gprv8hi;
+  reload_vsx_gpr[V16QImode] = CODE_FOR_reload_vsx_from_gprv16qi;
+  reload_vsx_gpr[SFmode]= CODE_FOR_reload_vsx_from_gprsf;
+}
+  else
+{
+  reload_fpr_gpr[DImode] = CODE_FOR_reload_fpr_from_gprdi;
+  reload_fpr_gpr[DDmode] = CODE_FOR_reload_fpr_from_gprdd;
+  reload_fpr_gpr[DFmode] = CODE_FOR_reload_fpr_from_gprdf;
+}
+}

Why do the VSX reload functions depend on TARGET_POWERPC64? That seems
like the wrong test.

+/* Return true if this is a move direct operation between GPR registers and
+   floating point/VSX registers.  */
+
+bool
+direct_move_p (rtx op0, rtx op1)

Why isn't this function symmetric?  It at least needs an explanation
in the comment about assumptions for the operands.

+/* Return true if this is a load or store quad operation.  */
+
+bool
+quad_load_store_p (rtx op0, rtx op1)

Same for this function.

+/* Helper function for rs6000_secondary_reload to return true if a move to a
+   different register classe is really a simple move.  */
+
+static bool
+rs6000_secondary_reload_simple_move (enum rs6000_reg_type to_type,
+ enum rs6000_reg_type from_type,
+ enum machine_mode mode)
+{
+  int size;
+
+  /* Add support for various direct moves available.  In this function, we only
+ look at cases where we don't need any extra registers, and one or more
+ simple move insns are issued.  At present, 32-bit integers are not allowed
+ in FPR/VSX registers.  Single precision binary floating is not a simple
+ move because we need to convert to the single precision memory layout.
+ The 4-byte SDmode can be moved.  */

The second comment should be merged into the first -- it explains the
purpose and implementation of the function.

+/* Return whether a move between two register classes can be done either
+   directly (simple move) or via a pattern that uses a single extra temporary
+   (using power8's direct move in this case.  */
+
+static bool
+rs6000_secondary_reload_move (enum rs6000_reg_type to_type,
+  enum rs6000_reg_type from_type,
+  enum machine_mode mode,
+  secondary_reload_info *sri,
+  bool altivec_p)

Missing a close parenthesis in the comment.

(define_insn *vsx_movmode
-  [(set (match_operand:VSX_M 0 nonimmediate_operand
=Z,VSr,VSr,?Z,?wa,?wa,*Y,*r,*r,VSr,?wa,*r,v,wZ,v)
-(match_operand:VSX_M 1 input_operand
VSr,Z,VSr,wa,Z,wa,r,Y,r,j,j,j,W,v,wZ))]
+  [(set (match_operand:VSX_M 0 nonimmediate_operand
=Z,VSr,VSr,?Z,?wa,?wa,wQ,?r,??Y,??r,??r,VSr,?wa,*r,v,wZ, v)
+(match_operand:VSX_M 1 input_operand
VSr,Z,VSr,wa,Z,wa,r,wQ,r,Y,r,j,j,j,W,v,wZ))]

Why do you need to change the modifiers? Why should vector operands in
GPRs matter for register preferences (removing `*' from r
constraints)?

Thanks, David

[PATCH] Fix -fdump-passes

2013-05-29 Thread Teresa Johnson

This patch re-enables -fdump-passes. It had stopped working because
dump_passes was changed to use the FOR_EACH_DEFINED_FUNCTION iterator,
however, functions are not marked as defined until after dump_passes
is called, in cgraph_analyze_functions. Fixed by iterating over all
functions.

Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?

Thanks,
Teresa

2013-05-29  Teresa Johnson  tejohn...@google.com

* passes.c (dump_passes): Use FOR_EACH_FUNCTION since
functions are not yet marked as defined.

Index: passes.c
===
--- passes.c (revision 199199)
+++ passes.c (working copy)
@@ -718,7 +718,7 @@ dump_passes (void)

   create_pass_tab();

-  FOR_EACH_DEFINED_FUNCTION (n)
+  FOR_EACH_FUNCTION (n)
 if (DECL_STRUCT_FUNCTION (n-symbol.decl))
   {
  node = n;


--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [gomp4/cilk-in-gomp/cilkplus] C++ parsing for Cilk plus #pragma simd

2013-05-29 Thread Aldy Hernandez


On 05/28/13 16:20, Iyer, Balaji V wrote:


Regarding the tests, I will merge trunk-cilk-in-gomp so we can reuse the c-c++-
common infrastructure you wrote.  There is no sense having multiple tests for
vectorlength, linear, etc.


Can you please add them under c-c++-common/cilk-plus/PS directory?

AN = Array Notation
PS = Pragma SIMD
CK = Cilk Keywords
EF = Elemental function.

This way, we don't have to monkey with the existing 
gcc.dg/cilk-plus/cilk-plus.exp script.

Thanks,
Balaji V. Iyer.



Aldy


On second thought, it looks like you'll be busy working on the fallout 
from the array notation work.  I can take over this entire patch, fix 
things, and re-post the final version before committing to the branch.


Thanks for contributing this!

Aldy

Re: [PATCH, rs6000] power8 patches, patch #7, quad/byte/half-word atomic instructions

2013-05-29 Thread David Edelsohn

-  if (mode == QImode || mode == HImode)
+  /* On power8, we want to use SImode for the operation.  On previoius systems,
+ use the operation in a subword and shift/mask to get the proper byte or
+ halfword.  */
+  if (TARGET_SYNC_HI_QI  (mode == QImode || mode == HImode))
+{
+  val = convert_modes (SImode, mode, val, 1);
+
+  /* Prepare to adjust the return value.  */
+  before = gen_reg_rtx (SImode);
+  if (after)
+after = gen_reg_rtx (SImode);
+  mode = SImode;
+}
+  else if (mode == QImode || mode == HImode)

Spelling: previoius.

This logic is redundant. Why not

if (mode == QImode || mode == HImode)
  {
if (TARGET_SYNC_HI_QI)
  {
 new code
  }
else
  {
 original code
  }

The rest of this patch is okay.

Thanks, David

Re: [PATCH] Fix -fdump-passes

2013-05-29 Thread Jeff Law


On 05/29/2013 02:11 PM, Teresa Johnson wrote:

This patch re-enables -fdump-passes. It had stopped working because
dump_passes was changed to use the FOR_EACH_DEFINED_FUNCTION iterator,
however, functions are not marked as defined until after dump_passes
is called, in cgraph_analyze_functions. Fixed by iterating over all
functions.

Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?

Thanks,
Teresa

2013-05-29  Teresa Johnson  tejohn...@google.com

* passes.c (dump_passes): Use FOR_EACH_FUNCTION since
 functions are not yet marked as defined.

OK, Assuming you fix the whitespace in the ChangeLog entry.

jeff

Re: [PATCH, rs6000] power8 patches, patch #6, direct move basic quad load/store

2013-05-29 Thread Michael Meissner

On Wed, May 29, 2013 at 03:53:43PM -0400, David Edelsohn wrote:
 +  if (TARGET_DIRECT_MOVE)
 +{
 +  if (TARGET_POWERPC64)
 +{
 +  reload_gpr_vsx[TImode]= CODE_FOR_reload_gpr_from_vsxti;
 +  reload_gpr_vsx[V2DFmode]  = CODE_FOR_reload_gpr_from_vsxv2df;
 +  reload_gpr_vsx[V2DImode]  = CODE_FOR_reload_gpr_from_vsxv2di;
 +  reload_gpr_vsx[V4SFmode]  = CODE_FOR_reload_gpr_from_vsxv4sf;
 +  reload_gpr_vsx[V4SImode]  = CODE_FOR_reload_gpr_from_vsxv4si;
 +  reload_gpr_vsx[V8HImode]  = CODE_FOR_reload_gpr_from_vsxv8hi;
 +  reload_gpr_vsx[V16QImode] = CODE_FOR_reload_gpr_from_vsxv16qi;
 +  reload_gpr_vsx[SFmode]= CODE_FOR_reload_gpr_from_vsxsf;
 +
 +  reload_vsx_gpr[TImode]= CODE_FOR_reload_vsx_from_gprti;
 +  reload_vsx_gpr[V2DFmode]  = CODE_FOR_reload_vsx_from_gprv2df;
 +  reload_vsx_gpr[V2DImode]  = CODE_FOR_reload_vsx_from_gprv2di;
 +  reload_vsx_gpr[V4SFmode]  = CODE_FOR_reload_vsx_from_gprv4sf;
 +  reload_vsx_gpr[V4SImode]  = CODE_FOR_reload_vsx_from_gprv4si;
 +  reload_vsx_gpr[V8HImode]  = CODE_FOR_reload_vsx_from_gprv8hi;
 +  reload_vsx_gpr[V16QImode] = CODE_FOR_reload_vsx_from_gprv16qi;
 +  reload_vsx_gpr[SFmode]= CODE_FOR_reload_vsx_from_gprsf;
 +}
 +  else
 +{
 +  reload_fpr_gpr[DImode] = CODE_FOR_reload_fpr_from_gprdi;
 +  reload_fpr_gpr[DDmode] = CODE_FOR_reload_fpr_from_gprdd;
 +  reload_fpr_gpr[DFmode] = CODE_FOR_reload_fpr_from_gprdf;
 +}
 +}
 
 Why do the VSX reload functions depend on TARGET_POWERPC64? That seems
 like the wrong test.

Because at present we do not do direct move between VSX and GPR registers for
128-bit in 32-bit mode.  For 32-bit mode, we only transfer 64-bit types.

Due to issues with secondary reload where you only get one temporary register,
and that temporary register might/might not overlap with the output register,
we might need a type that takes 3 traditional FPR registers, and we don't have
one at present (to move from GPR to VSX we would need to do 2 direct moves, a
FMRGOW for the first 64-bits, and 2 direct moves, and another FMRGOW for the
second 64-bits, and then an XXPERMDI to glue the two 64-bits together).  I've
seen places where reload does not honor the  constraint for these secondary
reload cases, so the output can't be one of the temporary registers, hence
needing a type that spans 3 registers.

In theory it can be done, it just hasn't been done.

 +/* Return true if this is a move direct operation between GPR registers and
 +   floating point/VSX registers.  */
 +
 +bool
 +direct_move_p (rtx op0, rtx op1)
 
 Why isn't this function symmetric?  It at least needs an explanation
 in the comment about assumptions for the operands.

I probably should have named the operands to and from, since that is how the
callers call it.  I'm not sure I understand the comment about it being
symetric, since you need different strategies to move in different directions,
and you might or might not have implemented all of the cases.

 +/* Return true if this is a load or store quad operation.  */
 +
 +bool
 +quad_load_store_p (rtx op0, rtx op1)
 
 Same for this function.

Again it should have been to/from.

 +/* Helper function for rs6000_secondary_reload to return true if a move to a
 +   different register classe is really a simple move.  */
 +
 +static bool
 +rs6000_secondary_reload_simple_move (enum rs6000_reg_type to_type,
 + enum rs6000_reg_type from_type,
 + enum machine_mode mode)
 +{
 +  int size;
 +
 +  /* Add support for various direct moves available.  In this function, we 
 only
 + look at cases where we don't need any extra registers, and one or more
 + simple move insns are issued.  At present, 32-bit integers are not 
 allowed
 + in FPR/VSX registers.  Single precision binary floating is not a simple
 + move because we need to convert to the single precision memory layout.
 + The 4-byte SDmode can be moved.  */
 
 The second comment should be merged into the first -- it explains the
 purpose and implementation of the function.

Ok.

 +/* Return whether a move between two register classes can be done either
 +   directly (simple move) or via a pattern that uses a single extra temporary
 +   (using power8's direct move in this case.  */
 +
 +static bool
 +rs6000_secondary_reload_move (enum rs6000_reg_type to_type,
 +  enum rs6000_reg_type from_type,
 +  enum machine_mode mode,
 +  secondary_reload_info *sri,
 +  bool altivec_p)
 
 Missing a close parenthesis in the comment.

Yep, thanks.

 (define_insn *vsx_movmode
 -  [(set (match_operand:VSX_M 0 nonimmediate_operand
 =Z,VSr,VSr,?Z,?wa,?wa,*Y,*r,*r,VSr,?wa,*r,v,wZ,v)
 -(match_operand:VSX_M 1 input_operand
 VSr,Z,VSr,wa,Z,wa,r,Y,r,j,j,j,W,v,wZ))]
 +  [(set

Re: [PATCH, rs6000] power8 patches, patch #7, quad/byte/half-word atomic instructions

2013-05-29 Thread Michael Meissner

On Wed, May 29, 2013 at 04:29:07PM -0400, David Edelsohn wrote:
 -  if (mode == QImode || mode == HImode)
 +  /* On power8, we want to use SImode for the operation.  On previoius 
 systems,
 + use the operation in a subword and shift/mask to get the proper byte or
 + halfword.  */
 +  if (TARGET_SYNC_HI_QI  (mode == QImode || mode == HImode))
 +{
 +  val = convert_modes (SImode, mode, val, 1);
 +
 +  /* Prepare to adjust the return value.  */
 +  before = gen_reg_rtx (SImode);
 +  if (after)
 +after = gen_reg_rtx (SImode);
 +  mode = SImode;
 +}
 +  else if (mode == QImode || mode == HImode)
 
 Spelling: previoius.

Thanks.

 This logic is redundant. Why not
 
 if (mode == QImode || mode == HImode)
   {
 if (TARGET_SYNC_HI_QI)
   {
  new code
   }
 else
   {
  original code
   }
 
 The rest of this patch is okay.

I can do that.  I think earlier versions did it that way.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [PATCH] Fix -fdump-passes

2013-05-29 Thread Teresa Johnson

On Wed, May 29, 2013 at 1:32 PM, Jeff Law l...@redhat.com wrote:
 On 05/29/2013 02:11 PM, Teresa Johnson wrote:

 This patch re-enables -fdump-passes. It had stopped working because
 dump_passes was changed to use the FOR_EACH_DEFINED_FUNCTION iterator,
 however, functions are not marked as defined until after dump_passes
 is called, in cgraph_analyze_functions. Fixed by iterating over all
 functions.

 Bootstrapped and tested on x86-64-unknown-linux-gnu. Ok for trunk?

 Thanks,
 Teresa

 2013-05-29  Teresa Johnson  tejohn...@google.com

 * passes.c (dump_passes): Use FOR_EACH_FUNCTION since
  functions are not yet marked as defined.

 OK, Assuming you fix the whitespace in the ChangeLog entry.

Fixed and committed at r199424. Thanks, Teresa


 jeff




--
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413

Re: [ada, build] host/target configuration

2013-05-29 Thread Thomas Schwinge

Hi!

On Wed, 29 May 2013 16:21:38 +0200, Paolo Bonzini bonz...@gnu.org wrote:
 Il 29/05/2013 12:50, Thomas Schwinge ha scritto:
  How about we use something like the following [...] patch?  In 
  essence, replace the manual parsing in
  gcc/ada/gcc-interface/Makefile.in by using the values the
  configure script already computed for us.  This is largely
  straightforward (I would hope); please especially review the 
  more involved cases such as the one where I'm now using
  linux%eabi -- for example, does that do the right thing or
  interfere with the Android configurations?

 I agree that using $(target_os) and linux% matches is a better way to
 strip out the -gnus.
 
 The patch is long, but I think I reviewed it carefully enough.  It's
 okay for mainline.

Thanks for the review.  Before I commit however:

 Regarding the android change:
 
  -ifeq ($(strip $(filter-out arm%-linux,$(arch)-$(osys)) $(if $(findstring 
  eabi,$(word 4,$(targ))),,$(word 4,$(targ,)
  +ifeq ($(strip $(filter-out arm% linux%eabi,$(target_cpu) $(target_os))),)
 
 This is okay, it will match arm-none-linux-androideabi both before and
 after

Well, upon more carful inspection I came to realize that the old/still
current code accepts arm-*-linux-*eabi but also plain arm-*-linux, which
my new code no longer accepts.  Is the old/still current behavior
intentional here, or a bug?  It has been introduced in commit
f49eb158fa5618dbd1130e47281e0bae13798ec6 (r192475):

gcc/ada/
2012-10-15  Matthias Klose  d...@ubuntu.com

* gcc-interface/Makefile.in: Match arm*-*-linux-*eabi* for
ARM Linux/GNU.

-ifeq ($(strip $(filter-out arm% linux-gnueabi,$(arch) $(osys)-$(word 
4,$(targ,)
+ifeq ($(strip $(filter-out arm%-linux,$(arch)-$(osys)) $(if $(findstring 
eabi,$(word 4,$(targ))),,$(word 4,$(targ,)

From the description, I understand that change to mean any ARM,
Linux-kernel, EABI configuration with any kind of user-land, so my new
interpretation seems correct, and the current code wrong to also accept
plain arm-*-linux.  That change just cite should just have replaced
linux-gnueabi with linux-%eabi?

 (perhaps linux-%eabi would be more readable).

As soon as we've clarified the previous question, I'll do that change.


 However, it seems that the first androideabi snippet was dead code.
 Can you delete it in a follow-up?

That has been added in commit 7a5ee35f18bc945ec8abbcd1377c446352504e6e
(r193238):

2012-11-06  Arnaud Charlet  char...@adacore.com

[...]
* gcc-interface/Makefile.in: Add runtime pairs for Android.
Rework handling of s-oscons.ads.
* s-osinte-android.ads, s-osinte-android.adb: New files.

+ifeq ($(strip $(filter-out arm% linux-androideabi,$(arch) $(osys)-$(word 
4,$(targ,)
+  LIBGNAT_TARGET_PAIRS = \
+  [...]
+  s-osinte.adbs-osinte-android.adb \
+  s-osinte.adss-osinte-android.ads \
+  [...]

Indeed I agree that the second snippet in the Makefile.in (as changed by
Matthias on 2012-10-15) triggers anytime the first one (as added by
Arnaud on 2012-11-06) triggered, so Arnaud's LIBGNAT_TARGET_PAIRS setting
will always be overwritten by Matthias', so to speak, and thus the new
s-osinte-android.* files never be used in the current code.  Arnaud,
Matthias, please clarify the intended behavior?  Of course I would assume
that for *-android* configurations the new s-osinte-android.* files be
used, but as Arnaud's change has gone in after Matthias', so this can
never have worked (unless I'm confused)?


Grüße,
 Thomas


pgpMQriqgNpfH.pgp
Description: PGP signature

Re: [patch, mips] Fix for PR target/56942

2013-05-29 Thread Steven Bosscher

On Tue, May 28, 2013 at 10:39 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Tue, May 28, 2013 at 9:36 PM, Richard Sandiford
 rdsandif...@googlemail.com wrote:
 Hi Steven,

 Steven Bosscher stevenb@gmail.com writes:
 Imho the active-insn idiom is the best solution for the moment. I
 will fix this mess properly asap, probably next week.

 Just wondering, how are things going with this?  (I assume fixing it
 properly means getting rid of the FIXME in next_active_insn?)

 Yes, it involves getting this to work:


And I can't get it to fail. I also can't find any place where the
label and jump table might get separated. I was expecting some trouble
with cross-jumping but it seems that it takes care of updating the
label reference, and skip_insns_after_block already expects the label
and table to be adjacent.

So let's just see if/what breaks...

Bootstrappedtested on powerpc64-unknown-linux-gnu (unix{,-m32}).
Bootstrappedtested on x86_64-unknown-linux-gnu (unix{,-m32}).
Buildtested powerpc64 X mips-elf.

OK for trunk?

If it causes any trouble, please file a PR and assign it to me.

And when the dust has settled, I can clean up the FIXME for
JUMP_TABLE_DATA in next_active_insn, and fix up the back ends.

Ciao!
Steven


  * rtlanal.c (tablejump_p): Expect table and label to be adjacent.

Index: rtlanal.c
===
--- rtlanal.c   (revision 199324)
+++ rtlanal.c   (working copy)
@@ -2711,6 +2711,7 @@ tablejump_p (const_rtx insn, rtx *labelp, rtx *tab
(table = next_active_insn (label)) != NULL_RTX
JUMP_TABLE_DATA_P (table))
 {
+  gcc_assert (table == NEXT_INSN (label));
   if (labelp)
*labelp = label;
   if (tablep)

Re: [GOOGLE] Unrestrict early inline restrictions for AutoFDO

2013-05-29 Thread Dehao Chen

OK, I'll commit the early inline part.

Dehao

On Wed, May 29, 2013 at 10:00 AM, Xinliang David Li davi...@google.com wrote:
 The early inlining part is ok. The tracer optimization should be
 revisited -- we should have more fine grain control on it (for
 instance, based on FDO summary -- but that should be common to
 FDO/LIPO).

 David

 On Wed, May 29, 2013 at 9:39 AM, Dehao Chen de...@google.com wrote:
 In gcc4-8, the max einline iterations are restricted to 1. For
 AutoFDO, this is bad because early inline is not size restricted. This
 patch allows einline to do multiple iterations in AutoFDO. It also
 enables tracer optimization in AutoFDO.

 Bootstrapped and passed regression test.

 OK for googel-4_8?

 Thanks,
 Dehao

 Index: gcc/ipa-inline.c
 ===
 --- gcc/ipa-inline.c (revision 199416)
 +++ gcc/ipa-inline.c (working copy)
 @@ -2161,7 +2161,8 @@ early_inliner (void)
  {
/* We iterate incremental inlining to get trivial cases of indirect
   inlining.  */
 -  while (iterations  PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS)
 +  while ((flag_auto_profile
 +  || iterations  PARAM_VALUE (PARAM_EARLY_INLINER_MAX_ITERATIONS))
early_inline_small_functions (node))
   {
timevar_push (TV_INTEGRATION);
 Index: gcc/opts.c
 ===
 --- gcc/opts.c (revision 199416)
 +++ gcc/opts.c (working copy)
 @@ -1644,6 +1644,8 @@ common_handle_option (struct gcc_options *opts,
   opts-x_flag_peel_loops = value;
if (!opts_set-x_flag_value_profile_transformations)
   opts-x_flag_value_profile_transformations = value;
 +  if (!opts_set-x_flag_tracer)
 + opts-x_flag_tracer = value;
if (!opts_set-x_flag_inline_functions)
   opts-x_flag_inline_functions = value;
if (!opts_set-x_flag_ipa_cp)

[GOOGLE] Remove records in powerpc64-grtev3-linux-gnu.xfail

2013-05-29 Thread Carrot Wei

Hi

Since b/8397853 has been fixed, the related tests now passed, so we can remove
them from powerpc64-grtev3-linux-gnu.xfail now.

Tested with ./buildit --run_tests.

OK for google 4.7 branch?

thanks
Carrot


2013-05-29  Guozhi Wei  car...@google.com

* powerpc64-grtev3-linux-gnu.xfail (*** g++): Remove passed records.



Index: powerpc64-grtev3-linux-gnu.xfail
===
--- powerpc64-grtev3-linux-gnu.xfail (revision 199411)
+++ powerpc64-grtev3-linux-gnu.xfail (working copy)
@@ -131,19 +131,8 @@
 FAIL: g++.dg/ext/cleanup-9.C -std=gnu++11 execution test
 FAIL: g++.dg/warn/Wself-assign-2.C -std=gnu++11  (test for warnings, line 12)
 FAIL: g++.dg/tree-prof/lipo/vcall1_0.C scan-ipa-dump-times profile
Indirect call - direct call 2
-# b/8397853, a LIPO bug, causing compilation to fail.
-FAIL: g++.dg/tree-prof/mversn15.C compilation,  -fprofile-generate
-UNRESOLVED: g++.dg/tree-prof/mversn15.C execution,-fprofile-generate
-#FAIL: g++.dg/tree-prof/mversn15.C execution,-fprofile-generate
-UNRESOLVED: g++.dg/tree-prof/mversn15.C compilation,  -fprofile-use
-UNRESOLVED: g++.dg/tree-prof/mversn15.C execution,-fprofile-use
 FAIL: g++.dg/tree-prof/mversn15.C scan-tree-dump optimized return 0
 FAIL: g++.dg/tree-prof/mversn15.C scan-tree-dump optimized main_clone
-FAIL: g++.dg/tree-prof/mversn15a.C compilation,  -fprofile-generate
-UNRESOLVED: g++.dg/tree-prof/mversn15a.C execution,-fprofile-generate
-#FAIL: g++.dg/tree-prof/mversn15a.C execution,-fprofile-generate
-UNRESOLVED: g++.dg/tree-prof/mversn15a.C compilation,  -fprofile-use
-UNRESOLVED: g++.dg/tree-prof/mversn15a.C execution,-fprofile-use

 # Fortran failures are not important to us so far.
 *** gfortran:

Re: new port: msp430-elf, revision 2

2013-05-29 Thread DJ Delorie


Are there any further comments on this?  Do I need to post a final
patch?  Can it be committed yet?

Re: [GOOGLE] Remove records in powerpc64-grtev3-linux-gnu.xfail

2013-05-29 Thread Xinliang David Li

ok.

David

On Wed, May 29, 2013 at 5:18 PM, Carrot Wei car...@google.com wrote:
 Hi

 Since b/8397853 has been fixed, the related tests now passed, so we can remove
 them from powerpc64-grtev3-linux-gnu.xfail now.

 Tested with ./buildit --run_tests.

 OK for google 4.7 branch?

 thanks
 Carrot


 2013-05-29  Guozhi Wei  car...@google.com

 * powerpc64-grtev3-linux-gnu.xfail (*** g++): Remove passed records.



 Index: powerpc64-grtev3-linux-gnu.xfail
 ===
 --- powerpc64-grtev3-linux-gnu.xfail (revision 199411)
 +++ powerpc64-grtev3-linux-gnu.xfail (working copy)
 @@ -131,19 +131,8 @@
  FAIL: g++.dg/ext/cleanup-9.C -std=gnu++11 execution test
  FAIL: g++.dg/warn/Wself-assign-2.C -std=gnu++11  (test for warnings, line 12)
  FAIL: g++.dg/tree-prof/lipo/vcall1_0.C scan-ipa-dump-times profile
 Indirect call - direct call 2
 -# b/8397853, a LIPO bug, causing compilation to fail.
 -FAIL: g++.dg/tree-prof/mversn15.C compilation,  -fprofile-generate
 -UNRESOLVED: g++.dg/tree-prof/mversn15.C execution,-fprofile-generate
 -#FAIL: g++.dg/tree-prof/mversn15.C execution,-fprofile-generate
 -UNRESOLVED: g++.dg/tree-prof/mversn15.C compilation,  -fprofile-use
 -UNRESOLVED: g++.dg/tree-prof/mversn15.C execution,-fprofile-use
  FAIL: g++.dg/tree-prof/mversn15.C scan-tree-dump optimized return 0
  FAIL: g++.dg/tree-prof/mversn15.C scan-tree-dump optimized main_clone
 -FAIL: g++.dg/tree-prof/mversn15a.C compilation,  -fprofile-generate
 -UNRESOLVED: g++.dg/tree-prof/mversn15a.C execution,-fprofile-generate
 -#FAIL: g++.dg/tree-prof/mversn15a.C execution,-fprofile-generate
 -UNRESOLVED: g++.dg/tree-prof/mversn15a.C compilation,  -fprofile-use
 -UNRESOLVED: g++.dg/tree-prof/mversn15a.C execution,-fprofile-use

  # Fortran failures are not important to us so far.
  *** gfortran:

Re: [patch] Hash table changes from cxx-conversion branch - config part

2013-05-29 Thread Lawrence Crowl

On 5/21/13, Diego Novillo dnovi...@google.com wrote:
 On May 13, 2013 Lawrence Crowl cr...@googlers.com wrote:
  I still have not heard from i386 or ia64 folks.  Anyone?

 The i386 bits look fine to me as well.  Please wait 48 hours to
 give the i386 maintainers a chance to object.

Committed.

-- 
Lawrence Crowl

67 matches

Mail list logo