Re: [RFA][rtl-optimization/52714] Do not allow combine to create invalid RTL

2014-02-11 Thread Eric Botcazou
 I pondered changing the condition for swapping the insn order, but it
 didn't seem worth it.  I doubt we see many 3-2 combinations where I3 is
 a JUMP_INSN, the result turns into two simple sets and the insn swapping
 code you wrote decides it needs to swap the insns.

I didn't actually write it, just enhanced it recently, that's why I suggested 
to do the same here.  It's one line of code and we have an example of valid 
simplification at hand so I think we ought to do it.

 It seems to me that as long as we're re-using the existing insns to
 contain the simple sets that we have to ensure that they're INSNs, not
 CALL_INSNs or JUMP_INSNs.

I disagree, nullifying JUMP_INSNs by changing them to (set (pc) (pc)) is a 
standard method in the combiner.

-- 
Eric Botcazou


[committed] Add pr59927 testcase

2014-02-11 Thread Jakub Jelinek
Hi!

On Tue, Feb 11, 2014 at 01:13:09AM +, rth at gcc dot gnu.org wrote:
 --- Comment #9 from Richard Henderson rth at gcc dot gnu.org ---
 Author: rth
 Date: Tue Feb 11 01:12:38 2014
 New Revision: 207677
 
 URL: http://gcc.gnu.org/viewcvs?rev=207677&root=gcc&view=rev
 Log:
 PR target/59927
 
 * calls.c (expand_call): Don't double-push for reg_parm_stack_space.
 * config/i386/i386.c (init_cumulative_args): Remove sorry for 64-bit
 ms-abi vs -mno-accumulate-outgoing-args.
 (ix86_expand_prologue): Unconditionally call ix86_eax_live_at_start_p.
 * config/i386/i386.h (ACCUMULATE_OUTGOING_ARGS): Fix comment with
 respect to ms-abi.
 
 Modified:
 trunk/gcc/ChangeLog
 trunk/gcc/calls.c
 trunk/gcc/config/i386/i386.c
 trunk/gcc/config/i386/i386.h

I've committed the following testcase for a PR Richard has fixed:

2014-02-11  Jakub Jelinek  ja...@redhat.com

PR target/59927
* gcc.target/i386/pr59927.c: New test.

--- gcc/testsuite/gcc.target/i386/pr59927.c.jj  2014-02-11 10:04:10.430559305 
+0100
+++ gcc/testsuite/gcc.target/i386/pr59927.c 2014-02-11 10:03:54.0 
+0100
@@ -0,0 +1,17 @@
+/* PR target/59927 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g" } */
+
+extern void baz (int) __attribute__ ((__ms_abi__));
+
+void
+foo (void (__attribute__ ((ms_abi)) *fn) (int))
+{
+  fn (0);
+}
+
+void
+bar (void)
+{
+  baz (0);
+}

Jakub


[PATCH][ARM] Fix PR 55426

2014-02-11 Thread Kyrill Tkachov

Hi all,

In this PR the 128-bit load-duplicate intrinsics in neon.exp ICE on big-endian 
with an unrecognisable insn error:


neon-vld1_dupQ.c:24:1: error: unrecognizable insn:
(insn 94 93 31 (set (subreg:DI (reg:V2DI 95 d16 [orig:137 D.14400 ] [137]) 0)
(subreg:DI (reg:V2DI 95 d16 [orig:137 D.14400 ] [137]) 8))

The problem seems to be that the neon_vld1_dupv2di splitter generates subregs 
after reload with gen_lowpart and gen_highpart. Since that splitter always 
matches after reload, we already know the hard register numbers, so we can just 
manipulate those directly to extract the two doubleword parts of a quadword reg.


While we're at it, we might as well use a more general move instruction when the 
alignment is natural to potentially take advantage of more complex addressing 
modes. We're allowed to do that because the vld1Q_dup*64 intrinsics describe a 
behaviour and do not guarantee that a particular instruction will be used.
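As a reminder of what those intrinsics promise, here is an illustrative use of one of them (not taken from the testsuite); the run-time tests described below check the replicated lane values rather than the exact instruction chosen for the aligned case:

#include <arm_neon.h>

/* vld1q_dup_u64 loads one 64-bit value and replicates it into both lanes
   of a quadword vector; which load instruction implements this is up to
   the compiler.  */
uint64x2_t
dup_load (const uint64_t *p)
{
  return vld1q_dup_u64 (p);	/* both lanes end up equal to *p */
}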


Therefore the vld1Q_dup*64 tests are updated to be run-time tests instead to 
test the functionality. New *_misaligned tests are added, however, to make sure 
that we still generate vld1.64 when the address is explicitly unaligned, since 
vld1.64 is the only instruction that can handle that.


Did an armeb-none-linux-gnueabihf build.
The vld1Q_dup*64* tests now pass on big and little endian.
arm-none-linux-gnueabihf bootstrap on Chromebook successful.


This is a regression since 4.7. I've tested this on trunk. Will test this on the 
4.8 and 4.7 branches.


Ok for those branches if no regressions?

Thanks,
Kyrill


2014-02-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

PR target/55426
* config/arm/neon.md (neon_vld1_dupv2di): Do not generate
low and high part subregs, use hard reg numbers.
* config/arm/arm.c (arm_mem_aligned_p): New function.
(arm_init_neon_builtins): Allow for memory operands
in load operations.
* config/arm/arm-protos.h (arm_mem_aligned_p): Declare
extern.
* config/arm/constraints.md (Uo): New constraint.

2014-02-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

PR target/55426
* gcc.target/arm/neon/vld1Q_dupp64.c: Change to run-time test.
* gcc.target/arm/neon/vld1Q_dups64.c: Likewise.
* gcc.target/arm/neon/vld1Q_dupu64.c: Likewise.
* gcc.target/arm/neon/vld1Q_dupp64_misaligned.c: New test.
* gcc.target/arm/neon/vld1Q_dups64_misaligned.c: Likewise.
* gcc.target/arm/neon/vld1Q_dupu64_misaligned.c: Likewise.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 13874ee..56f46e3 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -95,6 +95,7 @@ extern enum reg_class coproc_secondary_reload_class (enum machine_mode, rtx,
 extern bool arm_tls_referenced_p (rtx);
 
 extern int arm_coproc_mem_operand (rtx, bool);
+extern bool arm_mem_aligned_p (rtx, unsigned int);
 extern int neon_vector_mem_operand (rtx, int, bool);
 extern int neon_struct_mem_operand (rtx);
 
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index fc81bf6..33c829d 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -12499,6 +12499,14 @@ arm_coproc_mem_operand (rtx op, bool wb)
   return FALSE;
 }
 
+/* Return true if the MEM RTX x has the given alignment.  */
+bool
+arm_mem_aligned_p (rtx x, unsigned int alignment)
+{
+  gcc_assert (MEM_P (x));
+  return MEM_ALIGN (x) == alignment;
+}
+
 /* Return TRUE if OP is a memory operand which we can load or store a vector
to/from. TYPE is one of the following values:
 0 - Vector load/stor (vldr)
@@ -23644,7 +23652,9 @@ arm_init_neon_builtins (void)
 		/* Neon load patterns always have the memory
 		   operand in the operand 1 position.  */
 		gcc_assert (insn_data[d->code].operand[k].predicate
-			    == neon_struct_operand);
+			    == neon_struct_operand
+			    || insn_data[d->code].operand[k].predicate
+			       == memory_operand);
 
 		switch (d->mode)
 		  {
diff --git a/gcc/config/arm/constraints.md b/gcc/config/arm/constraints.md
index 85dd116..86947dd 100644
--- a/gcc/config/arm/constraints.md
+++ b/gcc/config/arm/constraints.md
@@ -381,6 +381,14 @@
 (and (match_code "mem")
      (match_test "TARGET_32BIT && neon_vector_mem_operand (op, 2, true)")))
 
+(define_memory_constraint "Uo"
+ "@internal
+  In ARM/Thumb-2 state a valid address for Neon element and structure
+  load/store instructions or normal load on doubleword alignment."
+ (and (match_code "mem")
+      (match_test "TARGET_32BIT && (arm_mem_aligned_p (op, DOUBLEWORD_ALIGNMENT)
+		   || neon_vector_mem_operand (op, 2, true))")))
+
(define_memory_constraint "Us"
 "@internal
  In ARM/Thumb-2 state a valid address for non-offset loads/stores of
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 2f06e42..e4490ba 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -4406,19 +4406,28 @@
 
 (define_insn_and_split neon_vld1_dupv2di
[(set (match_operand:V2DI 0 s_register_operand 

Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Richard Sandiford
Richard Biener richard.guent...@gmail.com writes:
 On Sun, Feb 9, 2014 at 9:30 PM, Robert Dewar de...@adacore.com wrote:
 On 2/9/2014 3:23 PM, Richard Sandiford wrote:
 can't we just reword the one warning where there is an ambiguity to
 avoid the confusion, rather than creating such an earthquake, which
 as Arno says, really has zero advantages to Ada programmers, and clear
 disadvantages .. to me [enabled by default] is already awfully long!


 Well, since the Ada part has been rejected I think we just need to
 consider this from the non-Ada perspective.  And IMO there's zero
 chance that each new warning will be audited for whether the
 [enabled by default] will be unambiguous.  The fact that this
 particular warning caused confusion and someone actually reported
 it doesn't mean that there are no other warnings like that.  E.g.:

-fprefetch-loop-arrays is not supported with -Os [enabled by default]

 could also be misunderstood, especially if working on an existing codebase
 with an existing makefile.  And the effect for:

pragma simd ignored because -fcilkplus is not enabled [enabled by
 default]

 is a bit unfortunate.  Those were just two examples -- I'm sure I could
 pick more.


 Indeed, worrisome examples,

 a shorter substitute would be [default warning]

 ???

 Or print nothing at all?  After all [...] was supposed to tell people how
 to disable the warning!  If there isn't a way to do that ... maybe instead
 print [-w]?  hmm, all existing [...] are positive so we'd have to print
 -no-w which doesn't exist.  Bah.  So there isn't a way to negate -w
 on the commandline to only get default warnings enabled again.

OK, this version drops the [enabled by default] altogether.
Tested as before.  OK to install?

Thanks,
Richard


gcc/
* opts.c (option_name): Remove enabled by default rider.

gcc/testsuite/
* gcc.dg/gomp/simd-clones-5.c: Update comment for new warning message.

Index: gcc/opts.c
===
--- gcc/opts.c  2014-02-10 20:36:32.380197329 +
+++ gcc/opts.c  2014-02-10 20:58:45.894502379 +
@@ -2216,14 +2216,10 @@ option_name (diagnostic_context *context
return xstrdup (cl_options[option_index].opt_text);
 }
   /* A warning without option classified as an error.  */
-  else if (orig_diag_kind == DK_WARNING || orig_diag_kind == DK_PEDWARN
-	   || diag_kind == DK_WARNING)
-    {
-      if (context->warning_as_error_requested)
-	return xstrdup (cl_options[OPT_Werror].opt_text);
-      else
-	return xstrdup (_("enabled by default"));
-    }
+  else if ((orig_diag_kind == DK_WARNING || orig_diag_kind == DK_PEDWARN
+	    || diag_kind == DK_WARNING)
+	   && context->warning_as_error_requested)
+    return xstrdup (cl_options[OPT_Werror].opt_text);
   else
 return NULL;
 }
Index: gcc/testsuite/gcc.dg/gomp/simd-clones-5.c
===
--- gcc/testsuite/gcc.dg/gomp/simd-clones-5.c   2014-02-10 20:36:32.380197329 
+
+++ gcc/testsuite/gcc.dg/gomp/simd-clones-5.c   2014-02-10 21:00:32.549412313 
+
@@ -3,7 +3,7 @@
 
 /* ?? The -w above is to inhibit the following warning for now:
a.c:2:6: warning: AVX vector argument without AVX enabled changes
-   the ABI [enabled by default].  */
+   the ABI.  */
 
 #pragma omp declare simd notinbranch simdlen(4)
 void foo (int *a)



[patch] Fix wrong code with VCE to bit-field type at -O

2014-02-11 Thread Eric Botcazou
Hi,

this is an interesting regression from GCC 4.5 present on all active branches 
and triggered by recent SRA improvements.  For the attached testcase, we have 
an unchecked conversion of a 3-byte slice of an array of 4 bytes to a record 
type containing a 3-byte bit-field; an unchecked conversion in this case 
directly translates into a VIEW_CONVERT_EXPR.  Then SRA scalarizes the bit-
field and turns the original VCE into a VCE of the 3-byte slice to the bit-
field type, which is a 32-bit integer with precision 24.

But the expansion of VCE isn't prepared for this and generates a full 32-bit 
read, which thus reads 1 additional byte and doesn't mask it afterwards, thus 
resulting in a wrong value for the scalarized bit-field.
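As a plain-C model of the required semantics (illustration only, not GCC code, and assuming little-endian byte order): only the three valid bytes may participate in the 24-bit value, which is exactly what the oversized full-word load gets wrong.

#include <stdint.h>

/* Read a 3-byte object as a 24-bit value without touching the byte
   beyond it; the buggy expansion instead performed one 32-bit load and
   never masked off the extra byte.  */
static uint32_t
read_u24 (const unsigned char *p)
{
  return (uint32_t) p[0]
	 | ((uint32_t) p[1] << 8)
	 | ((uint32_t) p[2] << 16);	/* result already fits in 24 bits */
}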

Proposed fix attached, tested on x86-64/Linux, OK for the mainline?


2014-02-11  Eric Botcazou  ebotca...@adacore.com

* expr.c (REDUCE_BIT_FIELD): Extend the scope of the macro.
(expand_expr_real_1) case VIEW_CONVERT_EXPR: Deal with bit-field
destination types.


2014-02-11  Eric Botcazou  ebotca...@adacore.com

* gnat.dg/opt31.adb: New test.


-- 
Eric Botcazou

Index: expr.c
===
--- expr.c	(revision 207685)
+++ expr.c	(working copy)
@@ -9221,7 +9221,6 @@ expand_expr_real_2 (sepops ops, rtx targ
 return temp;
   return REDUCE_BIT_FIELD (temp);
 }
-#undef REDUCE_BIT_FIELD
 
 
 /* Return TRUE if expression STMT is suitable for replacement.  
@@ -10444,15 +10443,21 @@ expand_expr_real_1 (tree exp, rtx target
 	  op0 = target;
 	}
 
-  /* At this point, OP0 is in the correct mode.  If the output type is
+  /* If it's a MEM with a different mode and we need to reduce it to a
+	 bit-field type, do an extraction.  Otherwise, we can read all bits
+	 of MODE but need to deal with the alignment.  If the output type is
 	 such that the operand is known to be aligned, indicate that it is.
-	 Otherwise, we need only be concerned about alignment for non-BLKmode
-	 results.  */
+	 Otherwise, we actually need only be concerned about alignment for
+	 non-BLKmode results.  */
   if (MEM_P (op0))
 	{
 	  enum insn_code icode;
 
-	  if (TYPE_ALIGN_OK (type))
+	  if (reduce_bit_field && GET_MODE (op0) != mode)
+	return extract_bit_field (op0, TYPE_PRECISION (type), 0,
+  TYPE_UNSIGNED (type), NULL_RTX,
+  mode, mode);
+	  else if (TYPE_ALIGN_OK (type))
 	{
 	  /* ??? Copying the MEM without substantially changing it might
 		 run afoul of the code handling volatile memory references in
@@ -10483,7 +10488,7 @@ expand_expr_real_1 (tree exp, rtx target
 		  /* Nor can the insn generator.  */
 		  insn = GEN_FCN (icode) (reg, op0);
 		  emit_insn (insn);
-		  return reg;
+		  return REDUCE_BIT_FIELD (reg);
 		}
 	  else if (STRICT_ALIGNMENT)
 		{
@@ -10513,7 +10518,7 @@ expand_expr_real_1 (tree exp, rtx target
 	  op0 = adjust_address (op0, mode, 0);
 	}
 
-  return op0;
+  return REDUCE_BIT_FIELD (op0);
 
 case MODIFY_EXPR:
   {
@@ -10613,6 +10618,7 @@ expand_expr_real_1 (tree exp, rtx target
   return expand_expr_real_2 (ops, target, tmode, modifier);
 }
 }
+#undef REDUCE_BIT_FIELD
 
 /* Subroutine of above: reduce EXP to the precision of TYPE (in the
signedness of TYPE), possibly returning the result in TARGET.  */

-- { dg-do run }
-- { dg-options "-O" }

with Interfaces; use Interfaces;
with Unchecked_Conversion;

procedure Opt31 is

  type Unsigned_24 is new Unsigned_32 range 0 .. 2**24 - 1;
  subtype Time_T is Unsigned_24 range 0 .. 24 * 60 * 60 * 128 - 1;

  type Messages_T is array (Positive range <>) of Unsigned_8;
  subtype T_3Bytes is Messages_T (1 .. 3);

  type Rec1 is record
F : Time_T;
  end record;
  for Rec1 use record
F at 0 range 0 .. 23;
  end record;
  for Rec1'Size use 24;

  type Rec2 is record
I1,I2,I3,I4 : Integer;
R1 : Rec1;
  end record;

  function Conv is new Unchecked_Conversion (T_3Bytes, Rec1);

  procedure Decode (M : Messages_T) is
My_Rec2 : Rec2;
  begin
My_Rec2.R1 := Conv (M (1 .. 3));
if not My_Rec2.R1.F'Valid then
  raise Program_Error;
end if;
  end;

  Message : Messages_T (1 .. 4) := (16#18#, 16#0C#, 16#0C#, 16#18#);

begin
  Decode (Message);
end;

Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Florian Weimer

On 02/09/2014 09:00 PM, Richard Sandiford wrote:


+   return xstrdup (_("warning enabled by default"));


I think this is still wrong because this message really means, this 
warning cannot be controlled with a warning flag, but it can likely be 
switched off by other means.  I don't think it's helpful.


In my opinion, it is better to make this message obsolete by introducing 
the missing warning flags.


--
Florian Weimer / Red Hat Product Security Team


Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Robert Dewar

On 2/11/2014 4:45 AM, Richard Sandiford wrote:


OK, this version drops the [enabled by default] altogether.
Tested as before.  OK to install?


Still a huge earthquake in terms of affecting test suites and
baselines of many users. is it really worth it? In the case of
GNAT we have only recently started tagging messages in this
way, so changes would not be so disruptive, and we can debate
following whatever gcc does, but I think it is important to
understand that any change in this area is a big one in terms
of impact on users.


Thanks,
Richard


gcc/
* opts.c (option_name): Remove enabled by default rider.

gcc/testsuite/
* gcc.dg/gomp/simd-clones-5.c: Update comment for new warning message.

Index: gcc/opts.c
===
--- gcc/opts.c  2014-02-10 20:36:32.380197329 +
+++ gcc/opts.c  2014-02-10 20:58:45.894502379 +
@@ -2216,14 +2216,10 @@ option_name (diagnostic_context *context
return xstrdup (cl_options[option_index].opt_text);
  }
/* A warning without option classified as an error.  */
 -  else if (orig_diag_kind == DK_WARNING || orig_diag_kind == DK_PEDWARN
 -	   || diag_kind == DK_WARNING)
 -    {
 -      if (context->warning_as_error_requested)
 -	return xstrdup (cl_options[OPT_Werror].opt_text);
 -      else
 -	return xstrdup (_("enabled by default"));
 -    }
 +  else if ((orig_diag_kind == DK_WARNING || orig_diag_kind == DK_PEDWARN
 +	    || diag_kind == DK_WARNING)
 +	   && context->warning_as_error_requested)
 +    return xstrdup (cl_options[OPT_Werror].opt_text);
else
  return NULL;
  }
Index: gcc/testsuite/gcc.dg/gomp/simd-clones-5.c
===
--- gcc/testsuite/gcc.dg/gomp/simd-clones-5.c   2014-02-10 20:36:32.380197329 
+
+++ gcc/testsuite/gcc.dg/gomp/simd-clones-5.c   2014-02-10 21:00:32.549412313 
+
@@ -3,7 +3,7 @@

  /* ?? The -w above is to inhibit the following warning for now:
 a.c:2:6: warning: AVX vector argument without AVX enabled changes
-   the ABI [enabled by default].  */
+   the ABI.  */

  #pragma omp declare simd notinbranch simdlen(4)
  void foo (int *a)





[PATCH][LTO] Fix PR60060

2014-02-11 Thread Richard Biener

This fixes the ICE seen in PR60060 which stems from us asking to
output debug-info for a function-scope global twice - once via
emitting the function and its BLOCK-associated vars and once
via the debug-hook through lto_write_globals ->
emit_debug_global_declarations.

The list of global variables does not agree with those from
the frontends (that function-scope global is not in GFortran's list).

Thus the following simply avoids the mess by only ever emitting
debug-info for non-function scope globals here.  As
wrapup_global_declarations looks like a complete no-op for LTO
(DECL_DEFER_OUTPUT is never 1, it's not even streamed ...) I
chose to inline emit_debug_global_declarations.  That whole area
asks for a FE-middle-end interface sanitization ...

LTO bootstrapped / tested on x86_64-unknown-linux-gnu.

Does that look ok?

Thanks,
Richard.

2014-02-11  Richard Biener  rguent...@suse.de

PR lto/60060
* lto-lang.c (lto_write_globals): Do not call
wrapup_global_declarations or emit_debug_global_declarations
but emit debug info for non-function scope variables
directly.

Index: gcc/lto/lto-lang.c
===
*** gcc/lto/lto-lang.c  (revision 207655)
--- gcc/lto/lto-lang.c  (working copy)
*** lto_write_globals (void)
*** 1082,1098 
if (flag_wpa)
  return;
  
!   /* Record the global variables.  */
!   vec<tree> lto_global_var_decls = vNULL;
varpool_node *vnode;
FOR_EACH_DEFINED_VARIABLE (vnode)
! lto_global_var_decls.safe_push (vnode->decl);
! 
!   tree *vec = lto_global_var_decls.address ();
!   int len = lto_global_var_decls.length ();
!   wrapup_global_declarations (vec, len);
!   emit_debug_global_declarations (vec, len);
!   lto_global_var_decls.release ();
  }
  
  static tree
--- 1082,1092 
if (flag_wpa)
  return;
  
!   /* Output debug info for global variables.  */  
varpool_node *vnode;
FOR_EACH_DEFINED_VARIABLE (vnode)
! if (!decl_function_context (vnode->decl))
!   debug_hooks->global_decl (vnode->decl);
  }
  
  static tree


Re: [RFA] [middle-end/54041] Convert modes as needed from expand_expr

2014-02-11 Thread Richard Biener
On Mon, Feb 10, 2014 at 5:35 PM, Jeff Law l...@redhat.com wrote:
 On 02/07/14 02:17, Richard Biener wrote:

 +2014-02-05  Jeff Law  l...@redhat.com
 +
 +   PR middle-end/54041
 +   * expr.c (expand_expr_addr_1): Handle expand_expr returning an
 +   object with an undesirable mode.
 +
   2014-02-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

  * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Change
 diff --git a/gcc/expr.c b/gcc/expr.c
 index 878a51b..9609c45 100644
 --- a/gcc/expr.c
 +++ b/gcc/expr.c
 @@ -7708,6 +7708,11 @@ expand_expr_addr_expr_1 (tree exp, rtx target,
 enum
 machine_mode tmode,
   modifier == EXPAND_INITIALIZER
? EXPAND_INITIALIZER : EXPAND_NORMAL);

 +  /* expand_expr is allowed to return an object in a mode other
 +than TMODE.  If it did, we need to convert.  */
 +  if (tmode != GET_MODE (tmp))
 +   tmp = convert_modes (tmode, GET_MODE (tmp),
 +tmp, TYPE_UNSIGNED (TREE_TYPE (offset)));


 What about CONSTANT_P tmp?  Don't you need to use
 TYPE_MODE (TREE_TYPE (offset)) in that case?


 As I mentioned last week, we want to pass VOIDmode objects (constants) down
 to convert_memory_address_addr_space unchanged.  c_m_a_a_s will handle those
 correctly.

 This patch fixes that oversight and the function name in the ChangeLog
 entry.

 I've verified this version still fixes the original bug report and included
 it in an x86_64-unknown-linux-gnu bootstrap & test for sanity's sake.


 OK for the trunk?

Ok.

Thanks,
Richard.

 Thanks,
 Jeff



 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 2dbab72..eca3e2f 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,9 @@
 +2014-02-05  Jeff Law  l...@redhat.com
 +
 +   PR middle-end/54041
 +   * expr.c (expand_expr_addr_expr_1): Handle expand_expr returning an
 +   object with an undesirable mode.
 +
  2014-02-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Change
 diff --git a/gcc/expr.c b/gcc/expr.c
 index 878a51b..42a451d 100644
 --- a/gcc/expr.c
 +++ b/gcc/expr.c
 @@ -7708,6 +7708,11 @@ expand_expr_addr_expr_1 (tree exp, rtx target, enum
 machine_mode tmode,
  modifier == EXPAND_INITIALIZER
   ? EXPAND_INITIALIZER : EXPAND_NORMAL);

 +  /* expand_expr is allowed to return an object in a mode other
 +than TMODE.  If it did, we need to convert.  */
  +  if (GET_MODE (tmp) != VOIDmode && tmode != GET_MODE (tmp))
 +   tmp = convert_modes (tmode, GET_MODE (tmp),
 +tmp, TYPE_UNSIGNED (TREE_TYPE (offset)));
result = convert_memory_address_addr_space (tmode, result, as);
tmp = convert_memory_address_addr_space (tmode, tmp, as);

 diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
 index c81a00d..283912d 100644
 --- a/gcc/testsuite/ChangeLog
 +++ b/gcc/testsuite/ChangeLog
 @@ -1,3 +1,8 @@
 +2014-02-05  Jeff Law  l...@redhat.com
 +
 +   PR middle-end/54041
 +   * gcc.target/m68k/pr54041.c: New test.
 +
  2014-02-05  Bill Schmidt  wschm...@linux.vnet.ibm.com

 * gcc.dg/vmx/sum2s.c: New.
 diff --git a/gcc/testsuite/gcc.target/m68k/pr54041.c
 b/gcc/testsuite/gcc.target/m68k/pr54041.c
 new file mode 100644
 index 000..645cb6d
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/m68k/pr54041.c
 @@ -0,0 +1,10 @@
 +/* { dg-do compile } */
 +/* { dg-options "-O -mshort" } */
 +
 +extern int r[];
 +
 +int *fn(int i)
 +{
 +   return &r[i];
 +}
 +



Re: RFA: XFAIL gcc.dg/vect/pr56787.c if vect_no_align

2014-02-11 Thread Richard Biener
On Mon, Feb 10, 2014 at 9:38 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Not 100% sure whether this is the preferred fix, but gcc.dg/vect/pr56787.c
 has lots of float * parameters that point who-knows-where and so is only
 vectorised if the target supports misaligned vector accesses:

 void
 foo (unsigned long n, const float *__restrict u0,
  const float *__restrict u1, const float *__restrict u2,
  const float *__restrict u3, const float *__restrict u4,
  const float *__restrict s0, const float *__restrict s1,
  const float *__restrict s2, float *__restrict t3,
  float *__restrict t4)
 {
   unsigned long i;
  for (i = 0; i < n; i++)
 {
   float u[5], f[3][5];
   u[0] = u0[i]; u[1] = u1[i]; u[2] = u2[i]; u[3] = u3[i]; u[4] = u4[i];
   bar (u, f);
   t3[i] = s0[i] * f[0][3] + s1[i] * f[1][3] + s2[i] * f[2][3];
 }
 }

 MIPS paired-single doesn't have any form of vector misalignment.
 I suppose it would be technically possible to use misaligned
 integer accesses and an FPR-GPR move, but that can be expensive.
 I also don't have any hardware to benchmark it on.

 So this patch adds an xfail for vect_no_align.  Tested on mipsisa64-sde-elf.
 OK to install?

Ok.

Thanks,
Richard.

 Thanks,
 Richard


 gcc/testsuite/
 * gcc.dg/vect/pr56787.c: Mark as xfail for vect_no_align.

 Index: gcc/testsuite/gcc.dg/vect/pr56787.c
 ===
 --- gcc/testsuite/gcc.dg/vect/pr56787.c 2014-02-10 20:26:03.870867802 +
 +++ gcc/testsuite/gcc.dg/vect/pr56787.c 2014-02-10 20:36:42.072279177 +
 @@ -31,5 +31,5 @@ foo (unsigned long n, const float *__res
  }
  }

  -/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" } } */
  +/* { dg-final { scan-tree-dump "vectorized 1 loops" "vect" { xfail vect_no_align } } } */
   /* { dg-final { cleanup-tree-dump "vect" } } */


[PING] [test case, patch] ICE in Cilk Plus structured block checker

2014-02-11 Thread Thomas Schwinge
Hi!

Ping again for test case and rather trivial patch to cure a GCC trunk ICE
in the Cilk Plus structured block checker if both -fcilkplus and -fopenmp
are specified.

Also, I'd still like to know whether the »Generalize diagnose_omp_blocks'
structured block logic« patch is fine, too?  (If that one is
acknowledged, I'll then commit the »OpenACC parallel structured blocks«
patch to gomp-4_0-branch.)

On Fri, 24 Jan 2014 16:57:28 +0100, I wrote:
 Ping.  With a different subject line, this time.  Let's first concentrate
 on the ICE in the Cilk Plus structured block checker, then generalize
 diagnose_omp_blocks' structured block logic, before finally getting to
 the OpenACC extension.
 
 
 Here starts the ICE patch:
 
 On Fri, 10 Jan 2014 12:48:21 +0100, I wrote:
  Ping.
  
  On Thu, 19 Dec 2013 17:49:09 +0100, I wrote:
   Ping.
   
   On Tue, 10 Dec 2013 13:53:39 +0100, I wrote:
At the end of this email you can find the patch that I'd like to apply 
to
gomp-4_0-branch for OpenACC structured blocks, after the following two
have been approved:
   
On Fri, 06 Dec 2013 19:33:35 +0100, I wrote:
 On Fri, 15 Nov 2013 14:44:45 -0700, Aldy Hernandez al...@redhat.com 
 wrote:
  --- a/gcc/omp-low.c
  +++ b/gcc/omp-low.c
  @@ -10177,12 +10210,33 @@ diagnose_sb_0 (gimple_stmt_iterator 
  *gsi_p,
   error (invalid entry to OpenMP structured block);
   #endif
   
  +  bool cilkplus_block = false;
  +  if (flag_enable_cilkplus)
  +{
   +  if ((branch_ctx
   +       && gimple_code (branch_ctx) == GIMPLE_OMP_FOR
   +       && gimple_omp_for_kind (branch_ctx) == GF_OMP_FOR_KIND_CILKSIMD)
   +      || (gimple_code (label_ctx) == GIMPLE_OMP_FOR
   +          && gimple_omp_for_kind (label_ctx) == GF_OMP_FOR_KIND_CILKSIMD))
   +    cilkplus_block = true;
   +}
 
 There is one issue here: consider the following code:
 
 void baz()
 {
   bad1:
   #pragma omp parallel
 goto bad1;
 }
 
 Then, if both -fcilkplus and -fopenmp are specified, that will run 
 into a
 SIGSEGV/ICE because of label_ctx == NULL.  The fix is simple enough; 
 OK
 for trunk and gomp-4_0-branch (after full testing)?

Testing looks good.

 The testcase is
 basically a concatenation of gcc.dg/cilk-plus/jump.c and
 gcc.dg/gomp/block-1.c -- should this be done differently/better?
 
 commit eee16f8aad4527b705d327476b00bf9f5ba6dcce
 Author: Thomas Schwinge tho...@codesourcery.com
 Date:   Fri Dec 6 18:55:41 2013 +0100
 
 Fix possible ICE (null pointer dereference) introduced in r204863.
 
   gcc/
   * omp-low.c (diagnose_sb_0): Make sure label_ctx is valid to
   dereference.
   gcc/testsuite/
   * gcc.dg/cilk-plus/jump-openmp.c: New file.
 
 diff --git gcc/omp-low.c gcc/omp-low.c
 index e0f7d1d..91221c0 100644
 --- gcc/omp-low.c
 +++ gcc/omp-low.c
 @@ -10865,7 +10865,8 @@ diagnose_sb_0 (gimple_stmt_iterator *gsi_p,
        if ((branch_ctx
             && gimple_code (branch_ctx) == GIMPLE_OMP_FOR
             && gimple_omp_for_kind (branch_ctx) == GF_OMP_FOR_KIND_CILKSIMD)
  -          || (gimple_code (label_ctx) == GIMPLE_OMP_FOR
  +          || (label_ctx
  +              && gimple_code (label_ctx) == GIMPLE_OMP_FOR
                 && gimple_omp_for_kind (label_ctx) == GF_OMP_FOR_KIND_CILKSIMD))
   cilkplus_block = true;
  }
 diff --git gcc/testsuite/gcc.dg/cilk-plus/jump-openmp.c 
 gcc/testsuite/gcc.dg/cilk-plus/jump-openmp.c
 new file mode 100644
 index 000..95e6b2d
 --- /dev/null
 +++ gcc/testsuite/gcc.dg/cilk-plus/jump-openmp.c
 @@ -0,0 +1,49 @@
 +/* { dg-do compile } */
 +/* { dg-options -fcilkplus -fopenmp } */
 +/* { dg-require-effective-target fopenmp } */
 +
 +int *a, *b, c;
 +
 +void foo()
 +{
 +#pragma simd
  +  for (int i=0; i < 1000; ++i)
 +{
 +  a[i] = b[i];
 +  if (c == 5)
 + return; /* { dg-error invalid branch to/from a Cilk Plus 
 structured block } */
 +}
 +}
 +
 +void bar()
 +{
 +#pragma simd
  +  for (int i=0; i < 1000; ++i)
 +{
 +lab:
 +  a[i] = b[i];
 +}
 +  if (c == 6)
 +goto lab; /* { dg-error invalid entry to Cilk Plus structured 
 block } */
 +}
 +
 +void baz()
 +{
 +  bad1:
 +  #pragma omp parallel
 +goto bad1; /* { dg-error invalid branch to/from an OpenMP 
 structured block } */
 +
 +  goto bad2; /* { dg-error invalid entry to OpenMP structured 
 block } */
 +  #pragma omp parallel
 +{
 +  bad2: ;
 +}
 +
 +  #pragma omp parallel
 +{
 +  int i;
 +  goto ok1;
  +  for (i = 0; i < 10; ++i)
 + { ok1: 

Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Richard Sandiford
Robert Dewar de...@adacore.com writes:
 On 2/11/2014 4:45 AM, Richard Sandiford wrote:
 OK, this version drops the [enabled by default] altogether.
 Tested as before.  OK to install?

 Still a huge earthquake in terms of affecting test suites and
 baselines of many users. is it really worth it? In the case of
 GNAT we have only recently started tagging messages in this
 way, so changes would not be so disruptive, and we can debate
 following whatever gcc does, but I think it is important to
 understand that any change in this area is a big one in terms
 of impact on users.

The patch deliberately didn't affect Ada's diagnostic routines given
your comments from the first round.  Calling this a huge earthquake
for other languages seems like a gross overstatement.

I don't think gcc, g++, gfortran, etc, have ever made a commitment
to producing textually identical warnings and errors for given inputs
across different releases.  It seems ridiculous to require that,
especially if it stands in the way of improving the diagnostics
or introducing finer-grained -W control.

E.g. Florian's complaint was that we shouldn't have warnings that
are not under the control of any -W options.  But by your logic
we couldn't change that either, because all those [enabled by default]s
would become [-Wnew-option]s.

Thanks,
Richard


Only assume 4-byte stack alignment on Solaris 9/x86 (PR libgomp/60107)

2014-02-11 Thread Rainer Orth
As described in the PR, a few libgomp testcases FAIL on Solaris 9/x86
because the SSE insns assume 16-byte stack alignment, while Solaris 9,
unlike Solaris 10, only guarantees the 4-byte alignment prescribed by
the i386 psABI.  This can be fixed, at a slight performance penalty and
code size increase, by defaulting Solaris 9/x86 to
STACK_REALIGN_DEFAULT=1.
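For comparison, the same effect can be requested for a single function with GCC's force_align_arg_pointer attribute; the sketch below is only an illustration of why realignment matters for SSE code on a 4-byte-aligned stack, not part of the patch:

#include <xmmintrin.h>

/* force_align_arg_pointer makes the prologue realign the stack, so
   16-byte vector spills stay safe even when the caller only provides the
   4-byte alignment required by the i386 psABI.  STACK_REALIGN_DEFAULT=1
   makes the compiler assume this situation by default.  */
__attribute__ ((force_align_arg_pointer))
void
scale4 (float *dst, const float *src, float factor)
{
  __m128 v = _mm_mul_ps (_mm_loadu_ps (src), _mm_set1_ps (factor));
  _mm_storeu_ps (dst, v);
}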

This patch does just that.  Bootstrapped without regressions on
i386-pc-solaris2.9, ok'ed for 4.9.0 by Jakub in the PR, installed on
mainline.

Rainer


2014-02-10  Rainer Orth  r...@cebitec.uni-bielefeld.de

PR libgomp/60107
* config/i386/sol2-9.h: New file.
* config.gcc (i[34567]86-*-solaris2* | x86_64-*-solaris2.1[0-9]*,
*-*-solaris2.9*): Use it.

# HG changeset patch
# Parent 3ca092d5b3d8fd8554991136a1a48c4eaabeb941
Only assume 4-byte stack alignment on Solaris 9/x86 (PR libgomp/60107)

diff --git a/gcc/config.gcc b/gcc/config.gcc
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1537,6 +1537,9 @@ i[34567]86-*-solaris2* | x86_64-*-solari
 	esac
 	with_tune_32=${with_tune_32:-generic}
 	case ${target} in
+	*-*-solaris2.9*)
+		tm_file=${tm_file} i386/sol2-9.h
+		;;
 	*-*-solaris2.1[0-9]*)
 		tm_file=${tm_file} i386/x86-64.h i386/sol2-bi.h sol2-bi.h
 		tm_defines=${tm_defines} TARGET_BI_ARCH=1
diff --git a/gcc/config/i386/sol2-9.h b/gcc/config/i386/sol2-9.h
new file mode 100644
--- /dev/null
+++ b/gcc/config/i386/sol2-9.h
@@ -0,0 +1,23 @@
+/* Target definitions for GCC for Intel 80386 running Solaris 9
+   Copyright (C) 2014 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+http://www.gnu.org/licenses/.  */
+
+/* Solaris 9 only guarantees 4-byte stack alignment as required by the i386
+   psABI, so realign it as necessary for SSE instructions.  */
+#undef STACK_REALIGN_DEFAULT
+#define STACK_REALIGN_DEFAULT 1

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.

2014-02-11 Thread Richard Biener
On Tue, 11 Feb 2014, Evgeny Stupachenko wrote:

 Hi,
 
 The patch gives an expected 3 times gain for the test case in the PR52252
 (and even 6 times for AVX2).
 It passes make check and bootstrap on x86.
 spec2000/spec2006 got no regressions/gains on x86.
 
 Is this patch ok?

I've worked on generalizing the permutation support in the light
of the availability of the generic shuffle support in the IL
but hit some road-blocks in the way code-generation works for
group loads with permutations (I don't remember if I posted all patches).

This patch seems to be to a slightly different place but it again
special-cases a specific permutation.  Why's that?  Why can't we
support groups of size 7 for example?  So - can this be generalized
to support arbitrary non-power-of-two load/store groups?

Other than that the patch has to wait for stage1 to open again,
of course.  And it misses a testcase.

Btw, do you have a copyright assignment on file with the FSF covering
work on GCC?

Thanks,
Richard.

 ChangeLog:
 
 2014-02-11  Evgeny Stupachenko  evstu...@gmail.com
 
 * target.h (vect_cost_for_stmt): Defining new cost vec_perm_shuffle.
 * tree-vect-data-refs.c (vect_grouped_store_supported): New
 check for stores group of length 3.
 (vect_permute_store_chain): New permutations for stores group of
 length 3.
 (vect_grouped_load_supported): New check for loads group of length
 3.
 (vect_permute_load_chain): New permutations for loads group of
 length 3.
 * tree-vect-stmts.c (vect_model_store_cost): New cost
 vec_perm_shuffle
 for the new permutations.
 (vect_model_load_cost): Ditto.
 * config/aarch64/aarch64.c (builtin_vectorization_cost): Adding
  vec_perm_shuffle cost as equivalent of vec_perm cost.
 * config/arm/arm.c: Ditto.
 * config/rs6000/rs6000.c: Ditto.
 * config/spu/spu.c: Ditto.
 * config/i386/x86-tune.def (TARGET_SLOW_PHUFFB): Target for slow
 byte
 shuffle on some x86 architectures.
 * config/i386/i386.h (processor_costs): Defining pshuffb cost.
 * config/i386/i386.c (processor_costs): Adding pshuffb cost.
 (ix86_builtin_vectorization_cost): Adding cost for the new
 permutations.
 Fixing cost for other permutations.
 (expand_vec_perm_even_odd_1): Avoid byte shuffles when they are
 slow (TARGET_SLOW_PHUFFB).
 (ix86_add_stmt_cost): Adding cost when STMT is WIDEN_MULTIPLY.
 Adding new shuffle cost only when byte shuffle is expected.
 Fixing cost model for Silvermont.
 
 Thanks,
 Evgeny
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer


Re: [PING] [test case, patch] ICE in Cilk Plus structured block checker

2014-02-11 Thread Thomas Schwinge
Hi!

On Tue, 11 Feb 2014 13:43:46 +0100, I wrote:
 Ping again for test case and rather trivial patch to cure a GCC trunk ICE
 in the Cilk Plus structured block checker if both -fcilkplus and -fopenmp
 are specified.

Jakub asked me to »please repost just the (hopefully small) trunk patch
alone«, so here we go:

Consider the following code:

void baz()
{
  bad1:
  #pragma omp parallel
goto bad1;
}

Then, if both -fcilkplus and -fopenmp are specified, that will run into a
SIGSEGV/ICE because of label_ctx == NULL in omp-low.c:diagnose_sb_0.

The testcase is basically a concatenation of gcc.dg/cilk-plus/jump.c and
gcc.dg/gomp/block-1.c -- should this be done differently/better?

Fix potential ICE (null pointer dereference) in omp-low.c:diagnose_sb_0.

gcc/
* omp-low.c (diagnose_sb_0): Make sure label_ctx is valid to
dereference.
gcc/testsuite/
* gcc.dg/cilk-plus/jump-openmp.c: New file.

diff --git gcc/omp-low.c gcc/omp-low.c
index e0f7d1d..91221c0 100644
--- gcc/omp-low.c
+++ gcc/omp-low.c
@@ -10865,7 +10865,8 @@ diagnose_sb_0 (gimple_stmt_iterator *gsi_p,
       if ((branch_ctx
	    && gimple_code (branch_ctx) == GIMPLE_OMP_FOR
	    && gimple_omp_for_kind (branch_ctx) == GF_OMP_FOR_KIND_CILKSIMD)
-	   || (gimple_code (label_ctx) == GIMPLE_OMP_FOR
+	   || (label_ctx
+	       && gimple_code (label_ctx) == GIMPLE_OMP_FOR
	       && gimple_omp_for_kind (label_ctx) == GF_OMP_FOR_KIND_CILKSIMD))
	 cilkplus_block = true;
     }
diff --git gcc/testsuite/gcc.dg/cilk-plus/jump-openmp.c 
gcc/testsuite/gcc.dg/cilk-plus/jump-openmp.c
new file mode 100644
index 000..95e6b2d
--- /dev/null
+++ gcc/testsuite/gcc.dg/cilk-plus/jump-openmp.c
@@ -0,0 +1,49 @@
+/* { dg-do compile } */
+/* { dg-options "-fcilkplus -fopenmp" } */
+/* { dg-require-effective-target fopenmp } */
+
+int *a, *b, c;
+
+void foo()
+{
+#pragma simd
+  for (int i=0; i < 1000; ++i)
+{
+  a[i] = b[i];
+  if (c == 5)
+   return; /* { dg-error "invalid branch to/from a Cilk Plus structured block" } */
+}
+}
+
+void bar()
+{
+#pragma simd
+  for (int i=0; i < 1000; ++i)
+{
+lab:
+  a[i] = b[i];
+}
+  if (c == 6)
+    goto lab; /* { dg-error "invalid entry to Cilk Plus structured block" } */
+}
+
+void baz()
+{
+  bad1:
+  #pragma omp parallel
+    goto bad1; /* { dg-error "invalid branch to/from an OpenMP structured block" } */
+
+  goto bad2; /* { dg-error "invalid entry to OpenMP structured block" } */
+  #pragma omp parallel
+{
+  bad2: ;
+}
+
+  #pragma omp parallel
+{
+  int i;
+  goto ok1;
+  for (i = 0; i < 10; ++i)
+   { ok1: break; }
+}
+}


Grüße,
 Thomas




Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-11 Thread Richard Biener
On Tue, Feb 11, 2014 at 11:40 AM, Eric Botcazou ebotca...@adacore.com wrote:
 Hi,

 this is an interesting regression from GCC 4.5 present on all active branches
 and triggered by recent SRA improvements.  For the attached testcase, we have
 an unchecked conversion of a 3-byte slice of an array of 4 bytes to a record
 type containing a 3-byte bit-field; an unchecked conversion in this case
 directly translates into a VIEW_CONVERT_EXPR.  Then SRA scalarizes the bit-
 field and turns the original VCE into a VCE of the 3-byte slice to the bit-
 field type, which is a 32-bit integer with precision 24.

 But the expansion of VCE isn't prepared for this and generates a full 32-bit
 read, which thus reads 1 additional byte and doesn't mask it afterwards, thus
 resulting in a wrong value for the scalarized bit-field.

 Proposed fix attached, tested on x86-64/Linux, OK for the mainline?

Hmm.  The intent was of course to only allow truly no-op converts via
VIEW_CONVERT_EXPR - that is, the size of the operand type and the
result type should be the same.  So, isn't SRA doing it wrong when
creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?

The verification we do in tree-cfg.c:verify_types_in_gimple_reference
hints at that we _do_ have even grosser mismatches - so reality may
trump desired design here.

Richard.

 2014-02-11  Eric Botcazou  ebotca...@adacore.com

 * expr.c (REDUCE_BIT_FIELD): Extend the scope of the macro.
 (expand_expr_real_1) case VIEW_CONVERT_EXPR: Deal with bit-field
 destination types.


 2014-02-11  Eric Botcazou  ebotca...@adacore.com

 * gnat.dg/opt31.adb: New test.


 --
 Eric Botcazou


Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Richard Biener
On Tue, Feb 11, 2014 at 1:48 PM, Richard Sandiford
rdsandif...@googlemail.com wrote:
 Robert Dewar de...@adacore.com writes:
 On 2/11/2014 4:45 AM, Richard Sandiford wrote:
 OK, this version drops the [enabled by default] altogether.
 Tested as before.  OK to install?

 Still a huge earthquake in terms of affecting test suites and
 baselines of many users. is it really worth it? In the case of
 GNAT we have only recently started tagging messages in this
 way, so changes would not be so disruptive, and we can debate
 following whatever gcc does, but I think it is important to
 understand that any change in this area is a big one in terms
 of impact on users.

 The patch deliberately didn't affect Ada's diagnostic routines given
 your comments from the first round.  Calling this a huge earthquake
 for other languages seems like a gross overstatement.

 I don't think gcc, g++, gfortran, etc, have ever made a commitment
 to producing textually identical warnings and errors for given inputs
 across different releases.  It seems ridiculous to require that,
 especially if it stands in the way of improving the diagnostics
 or introducing finer-grained -W control.

 E.g. Florian's complaint was that we shouldn't have warnings that
 are not under the control of any -W options.  But by your logic
 we couldn't change that either, because all those [enabled by default]s
 would become [-Wnew-option]s.

Yeah, I think Robert's argument is a red herring - there are loads of
diagnostic changes every release so you cannot expect those to
be stable.

I'm fine with dropping the [enabled by default] as in the patch, but lets
hear another ok before making the change.

Thanks,
Richard.

 Thanks,
 Richard


Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-11 Thread Jakub Jelinek
On Tue, Feb 11, 2014 at 02:17:04PM +0100, Richard Biener wrote:
  this is an interesting regression from GCC 4.5 present on all active 
  branches
  and triggered by recent SRA improvements.  For the attached testcase, we 
  have
  an unchecked conversion of a 3-byte slice of an array of 4 bytes to a record
  type containing a 3-byte bit-field; an unchecked conversion in this case
  directly translates into a VIEW_CONVERT_EXPR.  Then SRA scalarizes the bit-
  field and turns the original VCE into a VCE of the 3-byte slice to the bit-
  field type, which is a 32-bit integer with precision 24.
 
  But the expansion of VCE isn't prepared for this and generates a full 32-bit
  read, which thus reads 1 additional byte and doesn't mask it afterwards, 
  thus
  resulting in a wrong value for the scalarized bit-field.
 
  Proposed fix attached, tested on x86-64/Linux, OK for the mainline?
 
 Hmm.  The intent was of course to only allow truly no-op converts via
 VIEW_CONVERT_EXPR - that is, the size of the operand type and the
 result type should be the same.  So, isn't SRA doing it wrong when
 creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?
 
 The verification we do in tree-cfg.c:verify_types_in_gimple_reference
 hints at that we _do_ have even grosser mismatches - so reality may
 trump desired design here.

I thought we only allow VCE if the bitsize of both types is the same.
If you have different bitsizes, then supposedly VCE to same bitsize integer
followed by zero/sign extension or truncation followed by another VCE should
be used.

Jakub


Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Robert Dewar

On 2/11/2014 7:48 AM, Richard Sandiford wrote:


The patch deliberately didn't affect Ada's diagnostic routines given
your comments from the first round.  Calling this a huge earthquake
for other languages seems like a gross overstatement.


Actually it's much less of an impact for Ada for two reasons. First we
only just started tagging warnings. In fact we have only just released
an official version with the facility for tagging warnings.

Second, this tagging of warnings is not the default (that would have
been a big earthquake) but you have to turn it on explicitly.

But I do indeed think it will have a significant impact for users
of other languages, where this has been done for a while, and if
I am not mistaken, done by default?


I don't think gcc, g++, gfortran, etc, have ever made a commitment
to producing textually identical warnings and errors for given inputs
across different releases.  It seems ridiculous to require that,
especially if it stands in the way of improving the diagnostics
or introducing finer-grained -W control.

E.g. Florian's complaint was that we shouldn't have warnings that
are not under the control of any -W options.  But by your logic
we couldn't change that either, because all those [enabled by default]s
would become [-Wnew-option]s.


I am not saying you can't change it, just that it is indeed a big
earthquake. No of course there is no commitment not to make changes.
But you have to be aware that when you make changes like this, the
impact is very significant in real production environments, and
gcc is as you know extensively used in such environments.

What I am saying here is that this is worth some discussion on what
the best approach is.

Ideally indeed it would be better if all warnings were controlled by
some specific warning category. I am not sure a warning switch that
default-covered all otherwise uncovered cases (as suggested by one
person at least) would be a worthwhile approach.



Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-11 Thread Richard Biener
On Tue, Feb 11, 2014 at 2:22 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Feb 11, 2014 at 02:17:04PM +0100, Richard Biener wrote:
  this is an interesting regression from GCC 4.5 present on all active 
  branches
  and triggered by recent SRA improvements.  For the attached testcase, we 
  have
  an unchecked conversion of a 3-byte slice of an array of 4 bytes to a 
  record
  type containing a 3-byte bit-field; an unchecked conversion in this case
  directly translates into a VIEW_CONVERT_EXPR.  Then SRA scalarizes the bit-
  field and turns the original VCE into a VCE of the 3-byte slice to the bit-
  field type, which is a 32-bit integer with precision 24.
 
  But the expansion of VCE isn't prepared for this and generates a full 
  32-bit
  read, which thus reads 1 additional byte and doesn't mask it afterwards, 
  thus
  resulting in a wrong value for the scalarized bit-field.
 
  Proposed fix attached, tested on x86-64/Linux, OK for the mainline?

 Hmm.  The intent was of course to only allow truly no-op converts via
 VIEW_CONVERT_EXPR - that is, the size of the operand type and the
 result type should be the same.  So, isn't SRA doing it wrong when
 creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?

 The verification we do in tree-cfg.c:verify_types_in_gimple_reference
 hints at that we _do_ have even grosser mismatches - so reality may
 trump desired design here.

 I thought we only allow VCE if the bitsize of both types is the same.
 If you have different bitsizes, then supposedly VCE to same bitsize integer
 followed by zero/sign extension or truncation followed by another VCE should
 be used.

Yeah, but the verification code is conveniently crippled:

  if (TREE_CODE (expr) == VIEW_CONVERT_EXPR)
    {
      /* For VIEW_CONVERT_EXPRs which are allowed here too, we only check
	 that their operand is not an SSA name or an invariant when
	 requiring an lvalue (this usually means there is a SRA or IPA-SRA
	 bug).  Otherwise there is nothing to verify, gross mismatches at
	 most invoke undefined behavior.  */
      if (require_lvalue
	  && (TREE_CODE (op) == SSA_NAME
	      || is_gimple_min_invariant (op)))
	{
	  error ("conversion of an SSA_NAME on the left hand side");
	  debug_generic_stmt (expr);
	  return true;
	}
      else if (TREE_CODE (op) == SSA_NAME
	       && TYPE_SIZE (TREE_TYPE (expr)) != TYPE_SIZE (TREE_TYPE (op)))
	{
	  error ("conversion of register to a different size");
	  debug_generic_stmt (expr);
	  return true;
	}

thus that only applies to register VIEW_CONVERT_EXPRs.  But maybe
we two are missing sth here - it's an Ada testcase after all ;)

Richard.

 Jakub


Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.

2014-02-11 Thread Evgeny Stupachenko
Missed patch attached in plain-text.

I have copyright assignment on file with the FSF covering work on GCC.

Load/store groups of length 3 are the most frequent non-power-of-2
case.  They are used in RGB image processing (like the test case in PR52252).
For sure we can extend the patch to length 5 and more.  However, this
potentially affects performance on some other architectures and
requires larger testing.  So length 3 is just the first step.  The
algorithm in the patch could be modified for a general case in several
steps.
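For reference, an illustrative kernel (an assumed example, not the PR testcase) with the interleaved group-of-3 access pattern that this patch lets the vectorizer handle:

/* Each iteration loads and stores a group of 3 adjacent elements
   (R, G, B), so vectorizing it needs the new non-power-of-two
   load/store permutations.  */
void
scale_rgb (unsigned char *img, int n, int scale)
{
  for (int i = 0; i < n; i++)
    {
      img[3 * i + 0] = (img[3 * i + 0] * scale) >> 8;	/* R */
      img[3 * i + 1] = (img[3 * i + 1] * scale) >> 8;	/* G */
      img[3 * i + 2] = (img[3 * i + 2] * scale) >> 8;	/* B */
    }
}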

I understand that the patch should wait for stage 1; however, since
it's ready we can discuss it right now and make some changes (like
general size of group).

Thanks,
Evgeny

On Tue, Feb 11, 2014 at 5:00 PM, Richard Biener rguent...@suse.de wrote:

 On Tue, 11 Feb 2014, Evgeny Stupachenko wrote:

  Hi,
 
  The patch gives an expected 3 times gain for the test case in the PR52252
  (and even 6 times for AVX2).
  It passes make check and bootstrap on x86.
  spec2000/spec2006 got no regressions/gains on x86.
 
  Is this patch ok?

 I've worked on generalizing the permutation support in the light
 of the availability of the generic shuffle support in the IL
 but hit some road-blocks in the way code-generation works for
 group loads with permutations (I don't remember if I posted all patches).

 This patch seems to be to a slightly different place but it again
 special-cases a specific permutation.  Why's that?  Why can't we
 support groups of size 7 for example?  So - can this be generalized
 to support arbitrary non-power-of-two load/store groups?

 Other than that the patch has to wait for stage1 to open again,
 of course.  And it misses a testcase.

 Btw, do you have a copyright assignment on file with the FSF covering
 work on GCC?

 Thanks,
 Richard.

  ChangeLog:
 
  2014-02-11  Evgeny Stupachenko  evstu...@gmail.com
 
  * target.h (vect_cost_for_stmt): Defining new cost vec_perm_shuffle.
  * tree-vect-data-refs.c (vect_grouped_store_supported): New
  check for stores group of length 3.
  (vect_permute_store_chain): New permutations for stores group of
  length 3.
  (vect_grouped_load_supported): New check for loads group of length
  3.
  (vect_permute_load_chain): New permutations for loads group of
  length 3.
  * tree-vect-stmts.c (vect_model_store_cost): New cost
  vec_perm_shuffle
  for the new permutations.
  (vect_model_load_cost): Ditto.
  * config/aarch64/aarch64.c (builtin_vectorization_cost): Adding
   vec_perm_shuffle cost as equivalent of vec_perm cost.
  * config/arm/arm.c: Ditto.
  * config/rs6000/rs6000.c: Ditto.
  * config/spu/spu.c: Ditto.
  * config/i386/x86-tune.def (TARGET_SLOW_PHUFFB): Target for slow
  byte
  shuffle on some x86 architectures.
  * config/i386/i386.h (processor_costs): Defining pshuffb cost.
  * config/i386/i386.c (processor_costs): Adding pshuffb cost.
  (ix86_builtin_vectorization_cost): Adding cost for the new
  permutations.
  Fixing cost for other permutations.
  (expand_vec_perm_even_odd_1): Avoid byte shuffles when they are
  slow (TARGET_SLOW_PHUFFB).
  (ix86_add_stmt_cost): Adding cost when STMT is WIDEN_MULTIPLY.
  Adding new shuffle cost only when byte shuffle is expected.
  Fixing cost model for Silvermont.
 
  Thanks,
  Evgeny
 

 --
 Richard Biener rguent...@suse.de
 SUSE / SUSE Labs
 SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
 GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer


vect.patch
Description: Binary data


[patch c++]: Fix pr/58835 [4.7/4.8/4.9 Regression] ICE with __PRETTY_FUNCTION__ in broken function

2014-02-11 Thread Kai Tietz
Hi,

the following patch adds missing handling of error_mark_node result of
fname_decl within finish_fname.

ChangeLog

2014-02-11  Kai Tietz  kti...@redhat.com

PR c++/58835
* semantics.c (finish_fname): Handle error_mark_node.

Regression tested for x86_64-unknown-linux-gnu, i686-w64-mingw32.  Ok for apply?

Regards,
Kai

Index: semantics.c
===
--- semantics.c (revision 207686)
+++ semantics.c (working copy)
@@ -2630,7 +2630,8 @@ finish_fname (tree id)
   tree decl;

   decl = fname_decl (input_location, C_RID_CODE (id), id);
-  if (processing_template_decl && current_function_decl)
+  if (processing_template_decl && current_function_decl
+      && decl != error_mark_node)
 decl = DECL_NAME (decl);
   return decl;
 }


Re: [PING] [test case, patch] ICE in Cilk Plus structured block checker

2014-02-11 Thread Jakub Jelinek
On Tue, Feb 11, 2014 at 02:15:00PM +0100, Thomas Schwinge wrote:
 Jakub asked me to »please repost just the (hopefully small) trunk patch
 alone«, so here we go:
 
 Consider the following code:
 
 void baz()
 {
   bad1:
   #pragma omp parallel
 goto bad1;
 }
 
 Then, if both -fcilkplus and -fopenmp are specified, that will run into a
 SIGSEGV/ICE because of label_ctx == NULL in omp-low.c:diagnose_sb_0.
 
 The testcase is basically a concatenation of gcc.dg/cilk-plus/jump.c and
 gcc.dg/gomp/block-1.c -- should this be done differently/better?
 
 Fix potential ICE (null pointer dereference) in omp-low.c:diagnose_sb_0.
 
   gcc/
   * omp-low.c (diagnose_sb_0): Make sure label_ctx is valid to
   dereference.
   gcc/testsuite/
   * gcc.dg/cilk-plus/jump-openmp.c: New file.

Ok, thanks.

Jakub


Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Richard Sandiford
Robert Dewar de...@adacore.com writes:
 I don't think gcc, g++, gfortran, etc, have ever made a commitment
 to producing textually identical warnings and errors for given inputs
 across different releases.  It seems ridiculous to require that,
 especially if it stands in the way of improving the diagnostics
 or introducing finer-grained -W control.

 E.g. Florian's complaint was that we shouldn't have warnings that
 are not under the control of any -W options.  But by your logic
 we couldn't change that either, because all those [enabled by default]s
 would become [-Wnew-option]s.

 I am not saying you can't change it, just that it is indeed a big
 earthquake. No of course there is no commitment not to make changes.
 But you have to be aware that when you make changes like this, the
 impact is very significant in real production environments, and
 gcc is as you know extensively used in such environments.

 What I am saying here is that this is worth some discussion on what
 the best approach is.

But what's the basis for that discussion going to be?  I first made this
suggestion on gcc@, which is the best list we have for getting user feedback,
and no user made this objection.  And when I worked in an environment
where I had direct contact with GCC-using customers, none of them gave
any indication that they were expecting textual stability between releases.
If you know of people who are using non-Ada languages this way then
please describe their set-up.  If you don't, how are we going to know
how such hypothetical users are going to react?  E.g. how many of those
users will have heard of sed?

I thought the trend these days was to move towards -Werror, so that for
many people the expected output is to get no warnings at all.  And bear
in mind that the kind of warnings that are not under -W control tend to
be those that are so likely to be a mistake that no-one has ever had an
incentive to make them optional.  I find it hard to believe that
significant numbers of users are not fixing the sources of those
warnings and are instead requiring every release of GCC to produce
warnings with a particular wording.

Thanks,
Richard


Re: Allow passing arrays in registers on AArch64

2014-02-11 Thread Marcus Shawcroft
On 6 February 2014 22:51, Michael Hudson-Doyle
michael.hud...@canonical.com wrote:

 diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
 index 16c51a8..958c667 100644
 --- a/gcc/config/aarch64/aarch64.c
 +++ b/gcc/config/aarch64/aarch64.c
 @@ -1187,14 +1187,10 @@ aarch64_pass_by_reference (cumulative_args_t pcum 
 ATTRIBUTE_UNUSED,
 size = (mode == BLKmode && type)
  ? int_size_in_bytes (type) : (int) GET_MODE_SIZE (mode);

 -  if (type)
 +  /* Aggregates are passed by reference based on their size.  */
 +  if (type && AGGREGATE_TYPE_P (type))
  {
 -  /* Arrays always passed by reference.  */
 -  if (TREE_CODE (type) == ARRAY_TYPE)
 -   return true;
 -  /* Other aggregates based on their size.  */
 -  if (AGGREGATE_TYPE_P (type))
 -   size = int_size_in_bytes (type);
 +  size = int_size_in_bytes (type);
  }

/* Variable sized arguments are always returned by reference.  */

This version of the patch looks fine.  Since this is a bug I think it
should be committed now in stage 4.  This is OK if the release manager
agrees.

Cheers
/Marcus


Re: Allow passing arrays in registers on AArch64

2014-02-11 Thread Jakub Jelinek
On Tue, Feb 11, 2014 at 02:51:08PM +, Marcus Shawcroft wrote:
 On 6 February 2014 22:51, Michael Hudson-Doyle
 michael.hud...@canonical.com wrote:
 
  diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
  index 16c51a8..958c667 100644
  --- a/gcc/config/aarch64/aarch64.c
  +++ b/gcc/config/aarch64/aarch64.c
  @@ -1187,14 +1187,10 @@ aarch64_pass_by_reference (cumulative_args_t pcum 
  ATTRIBUTE_UNUSED,
  size = (mode == BLKmode && type)
   ? int_size_in_bytes (type) : (int) GET_MODE_SIZE (mode);
 
  -  if (type)
  +  /* Aggregates are passed by reference based on their size.  */
  +  if (type && AGGREGATE_TYPE_P (type))
   {
  -  /* Arrays always passed by reference.  */
  -  if (TREE_CODE (type) == ARRAY_TYPE)
  -   return true;
  -  /* Other aggregates based on their size.  */
  -  if (AGGREGATE_TYPE_P (type))
  -   size = int_size_in_bytes (type);
  +  size = int_size_in_bytes (type);
   }
 
 /* Variable sized arguments are always returned by reference.  */
 
 This version of the patch looks fine.  Since this is a bug I think it
 should be committed now in stage 4.  This is OK if the release manager
 agrees.

Ok.

Jakub


Re: [PATCH][ARM] Fix PR 55426

2014-02-11 Thread Kyrill Tkachov

On 11/02/14 09:38, Kyrill Tkachov wrote:

Hi all,

In this PR the 128-bit load-duplicate intrinsics in neon.exp ICE on big-endian
with an unrecognisable insn error:

neon-vld1_dupQ.c:24:1: error: unrecognizable insn:
(insn 94 93 31 (set (subreg:DI (reg:V2DI 95 d16 [orig:137 D.14400 ] [137]) 0)
  (subreg:DI (reg:V2DI 95 d16 [orig:137 D.14400 ] [137]) 8))

The problem seems to be that the neon_vld1_dupv2di splitter generates subregs
after reload with gen_lowpart and gen_highpart. Since that splitter always
matches after reload, we already know the hard register numbers, so we can just
manipulate those directly to extract the two doubleword parts of a quadword reg.

While we're at it, we might as well use a more general move instruction when the
alignment is natural to potentially take advantage of more complex addressing
modes. We're allowed to do that because the vld1Q_dup*64 intrinsics describe a
behaviour and do not guarantee that a particular instruction will be used.

Therefore the vld1Q_dup*64 tests are updated to be run-time tests instead to
test the functionality. New *_misaligned tests are added, however, to make sure
that we still generate vld1.64 when the address is explicitly unaligned, since
vld1.64 is the only instruction that can handle that.

Did an armeb-none-linux-gnueabihf build.
The vld1Q_dup*64* tests now pass on big and little endian.
arm-none-linux-gnueabihf bootstrap on Chromebook successful.


This is a regression since 4.7. I've tested this on trunk. Will test this on the
4.8 and 4.7 branches.


My apologies, I misread the bug report as if this appears on 4.7. I couldn't 
reproduce it on 4.7 since the offending pattern didn't exist then.

I'm testing a 4.8 variant of this patch.

Kyrill




Ok for those branches if no regressions?

Thanks,
Kyrill


2014-02-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  PR target/55426
  * config/arm/neon.md (neon_vld1_dupv2di): Do not generate
  low and high part subregs, use hard reg numbers.
  * config/arm/arm.c (arm_mem_aligned_p): New function.
  (arm_init_neon_builtins): Allow for memory operands
  in load operations.
  * config/arm/arm-protos.h (arm_mem_aligned_p): Declare
  extern.
  * config/arm/constraints.md (Uo): New constraint.

2014-02-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

  PR target/55426
  * gcc.target/arm/neon/vld1Q_dupp64.c: Change to run-time test.
  * gcc.target/arm/neon/vld1Q_dups64.c: Likewise.
  * gcc.target/arm/neon/vld1Q_dupu64.c: Likewise.
  * gcc.target/arm/neon/vld1Q_dupp64_misaligned.c: New test.
  * gcc.target/arm/neon/vld1Q_dups64_misaligned.c: Likewise.
  * gcc.target/arm/neon/vld1Q_dupu64_misaligned.c: Likewise.





Re: [PATCH][AArch64] Wire up Cortex-A57 rtx costs

2014-02-11 Thread Marcus Shawcroft
On 10 February 2014 09:55, Marcus Shawcroft marcus.shawcr...@gmail.com wrote:
 On 30 January 2014 13:48, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi all,

 This patch wires up the aarch64 backend to use the Cortex-A57 rtx costs
 table that is proposed at
 http://gcc.gnu.org/ml/gcc-patches/2014-01/msg01954.html

 OK if release manager agrees.

 /Marcus

Jakub, are you OK with us taking this patch in stage4 ?

This patch builds on the ARM patch
http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00403.html and like the
ARM patch it is non-default tuning.

/Marcus


Re: [PATCH][ARM] Add -mcpu=native detection for Cortex-A53, A57

2014-02-11 Thread Ramana Radhakrishnan
On Mon, Feb 10, 2014 at 11:01 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi all,

 This patchlet adds the part numbers for the Cortex-A53 and A57 cores so that
 they can be detected when parsing /proc/cpuinfo on AArch32 Linux systems.
 This will allow the -mcpu=native machinery to detect those cores.

 Tested arm-none-eabi on a model.

 This is a fairly innocuous change, is it ok at this stage or for next stage
 1?

I'm happy for this to go in provided an RM doesn't object in 24 hours.

regards
Ramana



 Thanks,
 Kyrill

 2014-02-10  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/driver-arm.c (arm_cpu_table): Add entries for Cortex-A53,
 Cortex-A57.


Re: [PATCH][ARM]fix potential testsuite/gcc.target/arm/fixed_float_conversion.c regression

2014-02-11 Thread Ramana Radhakrishnan
On Mon, Feb 3, 2014 at 3:56 PM, Renlin Li renlin...@arm.com wrote:
 Hi all,

 This patch will ensure testsuite/gcc.target/arm/fixed_float_conversion.c is
 checked only when -mfpu=vfp3 -mfloat-abi=softfp is applicable for the
 target.

 Accordingly, two procs (check_effective_target_arm_vfp3_ok and
 add_options_for_arm_vfp3) are added into
 gcc/testsuite/lib/target-supports.exp.

 I have also updated related documentation.

 Okay for trunk?


This is OK.

Ramana



 Kind regards,
 Renlin Li


 gcc/testsuite/ChangeLog:

 2014-02-03  Renlin Li  renlin...@arm.com

 * gcc.target/arm/fixed_float_conversion.c: Add arm_vfp3 option to the
 test case.
 * lib/target-supports.exp: check_effective_target_arm_vfp3_ok: New.
 add_options_for_arm_vfp3: New.


 gcc/ChangeLog:

 2014-02-03  Renlin Li  renlin...@arm.com

 * doc/sourcebuild.texi: Document check_effective_target_arm_vfp3_ok and
 add_options_for_arm_vfp3


Re: [PATCH][AArch64] Wire up Cortex-A57 rtx costs

2014-02-11 Thread Jakub Jelinek
On Tue, Feb 11, 2014 at 03:03:32PM +, Marcus Shawcroft wrote:
 On 10 February 2014 09:55, Marcus Shawcroft marcus.shawcr...@gmail.com 
 wrote:
  On 30 January 2014 13:48, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
  Hi all,
 
  This patch wires up the aarch64 backend to use the Cortex-A57 rtx costs
  table that is proposed at
  http://gcc.gnu.org/ml/gcc-patches/2014-01/msg01954.html
 
  OK if release manager agrees.
 
  /Marcus
 
 Jakub, are you OK with us taking this patch in stage4 ?
 
 This patch builds on the ARM patch
 http://gcc.gnu.org/ml/gcc-patches/2014-02/msg00403.html and like the
 ARM patch it is non-default tuning.

Ok.

Jakub


Re: [PATCH, PR52252] Vectorization for load/store groups of size 3.

2014-02-11 Thread Richard Biener
On Tue, 11 Feb 2014, Evgeny Stupachenko wrote:

 Missed patch attached in plain-text.
 
 I have copyright assignment on file with the FSF covering work on GCC.
 
 Load/store groups of length 3 are the most frequent non-power-of-2
 case. They are used in RGB image processing (like the test case in
 PR52252). We can certainly extend the patch to length 5 and more;
 however, that potentially affects performance on some other
 architectures and requires larger testing, so length 3 is just the
 first step. The algorithm in the patch could be modified for the
 general case in several steps.
 
 I understand that the patch should wait for stage 1; however, since
 it's ready, we can discuss it right now and make some changes (like a
 general group size).
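
For reference, the group-of-3 shape being discussed is roughly the
interleaved-RGB loop below (a sketch in the spirit of the PR52252 test case,
not the test case itself); each iteration loads and stores three adjacent
elements, which is what the new permutations have to cover:

/* Sketch only: load/store groups of length 3 (interleaved RGB).  */
void
scale_rgb (unsigned char *pix, int n, int kr, int kg, int kb)
{
  int i;
  for (i = 0; i < n; i++)
    {
      pix[3 * i + 0] = pix[3 * i + 0] * kr / 256;	/* R */
      pix[3 * i + 1] = pix[3 * i + 1] * kg / 256;	/* G */
      pix[3 * i + 2] = pix[3 * i + 2] * kb / 256;	/* B */
    }
}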

Other than that I'd like to see a vectorizer hook querying the cost of a
vec_perm_const expansion instead of adding vec_perm_shuffle
(this requires the constant shuffle mask to be passed as well
as the vector type).  That's more useful for other uses that
would require (arbitrary) shuffles.
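
Purely to illustrate the kind of hook being asked for (the name, signature
and cost numbers below are invented for this sketch, not an existing target
hook): the vectorizer would hand the target the element count and the
constant selector and get back a cost for that particular vec_perm_const
expansion.

/* Invented example only: a target might, say, charge more whenever the
   constant selector pulls elements from the second input vector.  */
static int
example_vec_perm_const_cost (unsigned int nelt, const unsigned char *sel)
{
  unsigned int i, cross = 0;
  for (i = 0; i < nelt; i++)
    if (sel[i] >= nelt)		/* element taken from the second operand */
      cross++;
  return cross ? 2 : 1;		/* toy numbers, not a real cost model */
}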

Didn't look at the rest of the patch yet - queued in my review
pipeline.

Thanks,
Richard.

 Thanks,
 Evgeny
 
 On Tue, Feb 11, 2014 at 5:00 PM, Richard Biener rguent...@suse.de wrote:
 
  On Tue, 11 Feb 2014, Evgeny Stupachenko wrote:
 
   Hi,
  
   The patch gives an expected 3 times gain for the test case in the PR52252
   (and even 6 times for AVX2).
   It passes make check and bootstrap on x86.
   spec2000/spec2006 got no regressions/gains on x86.
  
   Is this patch ok?
 
  I've worked on generalizing the permutation support in the light
  of the availability of the generic shuffle support in the IL
  but hit some road-blocks in the way code-generation works for
  group loads with permutations (I don't remember if I posted all patches).
 
  This patch seems to be to a slightly different place but it again
  special-cases a specific permutation.  Why's that?  Why can't we
  support groups of size 7 for example?  So - can this be generalized
  to support arbitrary non-power-of-two load/store groups?
 
  Other than that the patch has to wait for stage1 to open again,
  of course.  And it misses a testcase.
 
  Btw, do you have a copyright assignment on file with the FSF covering
  work on GCC?
 
  Thanks,
  Richard.
 
   ChangeLog:
  
   2014-02-11  Evgeny Stupachenko  evstu...@gmail.com
  
   * target.h (vect_cost_for_stmt): Defining new cost 
   vec_perm_shuffle.
   * tree-vect-data-refs.c (vect_grouped_store_supported): New
   check for stores group of length 3.
   (vect_permute_store_chain): New permutations for stores group of
   length 3.
   (vect_grouped_load_supported): New check for loads group of length
   3.
   (vect_permute_load_chain): New permutations for loads group of
   length 3.
   * tree-vect-stmts.c (vect_model_store_cost): New cost
   vec_perm_shuffle
   for the new permutations.
   (vect_model_load_cost): Ditto.
   * config/aarch64/aarch64.c (builtin_vectorization_cost): Adding
    vec_perm_shuffle cost as equivalent of vec_perm cost.
   * config/arm/arm.c: Ditto.
   * config/rs6000/rs6000.c: Ditto.
   * config/spu/spu.c: Ditto.
   * config/i386/x86-tune.def (TARGET_SLOW_PHUFFB): Target for slow
   byte
   shuffle on some x86 architectures.
   * config/i386/i386.h (processor_costs): Defining pshuffb cost.
   * config/i386/i386.c (processor_costs): Adding pshuffb cost.
   (ix86_builtin_vectorization_cost): Adding cost for the new
   permutations.
   Fixing cost for other permutations.
   (expand_vec_perm_even_odd_1): Avoid byte shuffles when they are
   slow (TARGET_SLOW_PHUFFB).
   (ix86_add_stmt_cost): Adding cost when STMT is WIDEN_MULTIPLY.
   Adding new shuffle cost only when byte shuffle is expected.
   Fixing cost model for Silvermont.
  
   Thanks,
   Evgeny
  
 
  --
  Richard Biener rguent...@suse.de
  SUSE / SUSE Labs
  SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
  GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer
 

-- 
Richard Biener rguent...@suse.de
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendorffer


Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Robert Dewar

On 2/11/2014 9:36 AM, Richard Sandiford wrote:

  I find it hard to believe that
significant numbers of users are not fixing the sources of those
warnings and are instead requiring every release of GCC to produce
warnings with a particular wording.


Good enough for me, I think it is OK to make the change.


Re: [PATCH][ARM]fix potential testsuite/gcc.target/arm/fixed_float_conversion.c regression

2014-02-11 Thread Kyrill Tkachov

On 11/02/14 15:11, Ramana Radhakrishnan wrote:

On Mon, Feb 3, 2014 at 3:56 PM, Renlin Li renlin...@arm.com wrote:

Hi all,

This patch will ensure testsuite/gcc.target/arm/fixed_float_conversion.c is
checked only when -mfpu=vfp3 -mfloat-abi=softfp is applicable for the
target.

Accordingly, two procs (check_effective_target_arm_vfp3_ok and
add_options_for_arm_vfp3) are added into
gcc/testsuite/lib/target-supports.exp.

I have also updated related documentation.

Okay for trunk?


This is OK.


Hi all,

I've committed this on behalf of Renlin as r207691.

Kyrill



Ramana



Kind regards,
Renlin Li


gcc/testsuite/ChangeLog:

2014-02-03  Renlin Li  renlin...@arm.com

 * gcc.target/arm/fixed_float_conversion.c: Add arm_vfp3 option to the
test case.
 * lib/target-supports.exp: check_effective_target_arm_vfp3_ok: New.
 add_options_for_arm_vfp3: New.


gcc/ChangeLog:

2014-02-03  Renlin Li  renlin...@arm.com

 * doc/sourcebuild.texi: Document check_effective_target_arm_vfp3_ok and
 add_options_for_arm_vfp3





Re: [RFC, patch] Detect lack of 32-bit devel environment on x86_64-linux targets

2014-02-11 Thread Jakub Jelinek
On Fri, Jan 31, 2014 at 03:49:40PM +0100, Richard Biener wrote:
 On Fri, Jan 31, 2014 at 3:45 PM, FX fxcoud...@gmail.com wrote:
  I've just seen that an explicit --enable-multilib is a way to do that.
 
  Yes, I was writing that as a reply when I received your email. (Also, it's 
  written in the configure error message.)
 
 Yeah - you know, that message is quite long and somehow I didn't read it
 carefully.  I suspect that will happen to others, too, so we'll get some
 extra complaints from that ;)
 
  Btw, doing the configure check exactly after all-stage1-gcc should be
  an early enough and a serialization point, no?  There you can do the
  check even for when cross-compiling.
 
  Well, you've already built a whole stage, so it's not so early, is it?
 
 Well, building the stage1 compiler is probably the fastest thing nowadays
 (it didn't yet build the target libraries for stage1 with the stage1,
 unoptimized
 and checking-enabled compiler - which is where it would fail in the odd
 way which is what you want to improve).
 
 As I said, you can't properly check it at the point you are checking.
 Which is why I complain - you're not checking this properly!
 
 Anyway, I've fixed the bug on our side with --enable-multilib.

Just hit the same thing: while I have (in mock) the 32-bit devel libc installed,
I don't have 32-bit libgcc_s installed (what for, it will be built by gcc).

Please revert it, or at least improve it (e.g. by trying to build
with -static-libgcc at least).

Jakub


Re: [PATCH 4/6] [GOMP4] OpenACC 1.0+ support in fortran front-end

2014-02-11 Thread Thomas Schwinge
Hi!

On Fri, 31 Jan 2014 15:16:07 +0400, Ilmir Usmanov i.usma...@samsung.com wrote:
  OpenACC 1.0 support -- GENERIC nodes and gimplify stubs.

Thanks!  This one is nearly ready for commit, and can then go in already,
while the Fortran front end patches are still being discussed.  :-)

Please merge »OpenACC 1.0 support -- documentation« into this patch.  In
gcc/doc/generic.texi, please also add an entry for OACC_DECLARE, which is
currently missing.  Ideally, gcc/doc/generic.texi OMP_CLAUSE should
also be updated for the OpenACC clauses (as well as recently added OpenMP
ones), but as that one's outdated already, this is not a prerequisite.

For ChangeLog files updates (on gomp-4_0-branch, use the respective
ChangeLog.gomp files, by the way), should just you be listed as the
author, or also your colleagues?

 --- a/gcc/gimplify.c
 +++ b/gcc/gimplify.c

 @@ -6157,6 +6166,20 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
 *pre_p,
   remove = true;
 break;
  
 +  case OMP_CLAUSE_HOST:
 +  case [...]
 +  case OMP_CLAUSE_VECTOR_LENGTH:
   case OMP_CLAUSE_NOWAIT:
   case OMP_CLAUSE_ORDERED:
   case OMP_CLAUSE_UNTIED:

Indentation (one tab instead of two spaces).

 @@ -6476,6 +6499,23 @@ gimplify_adjust_omp_clauses (tree *list_p)
   }
 break;
  
 +  case OMP_CLAUSE_HOST:
 +  case [...]
 +  case OMP_CLAUSE_VECTOR_LENGTH:
 +sorry ("Clause not supported yet");
 +break;
 +
   case OMP_CLAUSE_REDUCTION:
   case OMP_CLAUSE_COPYIN:
   case OMP_CLAUSE_COPYPRIVATE:

Indentation.

Also, shouldn't the sorry be moved to gimplify_scan_omp_clauses, and a
»remove = true« be added in there, and gimplify_adjust_omp_clauses then
just contain a gcc_unreachable (or, move the new »case OMP_CLAUSE_*« next
to the existing default: that already contains gcc_unreachable)?
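
In other words, the suggestion amounts to roughly the following shape (an
illustrative fragment only, not a tested patch):

/* In gimplify_scan_omp_clauses: diagnose and drop the clause here.  */
	case OMP_CLAUSE_HOST:
	/* ... the other OpenACC clauses ...  */
	case OMP_CLAUSE_VECTOR_LENGTH:
	  sorry ("Clause not supported yet");
	  remove = true;
	  break;

/* In gimplify_adjust_omp_clauses these clauses then never survive, so the
   cases can simply share the existing default: gcc_unreachable ();  */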

 @@ -7988,6 +8028,19 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, 
 gimple_seq *post_p,
 ret = GS_ALL_DONE;
 break;
  
 +  case OACC_KERNELS:
 +  case OACC_DATA:
 +  case OACC_CACHE:
 +  case OACC_WAIT:
 +  case OACC_HOST_DATA:
 +  case OACC_DECLARE:
 +  case OACC_UPDATE:
 +  case OACC_ENTER_DATA:
 +  case OACC_EXIT_DATA:
 +sorry ("directive not yet implemented");
 +ret = GS_ALL_DONE;
 +break;
 +
   case OMP_PARALLEL:
 gimplify_omp_parallel (expr_p, pre_p);
 ret = GS_ALL_DONE;

Indentation.

Further down in gimplify_expr, shouldn't these new OACC_* codes also be
added to the »These expressions should already be in gimple IR form«
assert?

 --- a/gcc/omp-low.c
 +++ b/gcc/omp-low.c
 @@ -1491,6 +1491,18 @@ fixup_child_record_type (omp_context *ctx)
TREE_TYPE (ctx->receiver_decl) = build_pointer_type (type);
  }
  
 +static bool
 +gimple_code_is_oacc (const_gimple g)
 +{
 +  switch (gimple_code (g))
 +{
 +case GIMPLE_OACC_PARALLEL:
 +  return true;
 +default:
 +  return false;
 +}
 +}
 +

Eventually, this will probably end up next to CASE_GIMPLE_OP/is_gimple_op
in gimple.h (or the latter be reworked to be able to ask for is_omp vs.
is_oacc vs. is_omp_or_oacc), but it's fine to do that once we actually
need it in files other than just omp-low.c, and once we support more
GIMPLE_OACC_* codes.

 @@ -1710,17 +1732,34 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)

 +  case OMP_CLAUSE_HOST:
 +  case OMP_CLAUSE_OACC_DEVICE:
 +  case OMP_CLAUSE_DEVICE_RESIDENT:
 +  case OMP_CLAUSE_USE_DEVICE:
 +  case OMP_CLAUSE_ASYNC:
 +  case OMP_CLAUSE_GANG:
 +  case OMP_CLAUSE_WAIT:
 +  case OMP_NO_CLAUSE_CACHE:
 +  case OMP_CLAUSE_INDEPENDENT:
 +  case OMP_CLAUSE_WORKER:
 +  case OMP_CLAUSE_VECTOR:
 +  case OMP_CLAUSE_NUM_GANGS:
 +  case OMP_CLAUSE_NUM_WORKERS:
 +  case OMP_CLAUSE_VECTOR_LENGTH:
 +sorry ("Clause not supported yet");
 +break;
 +
   default:
 gcc_unreachable ();
   }

Indentation.

 @@ -1827,9 +1876,26 @@ scan_sharing_clauses (tree clauses, omp_context *ctx)
   case OMP_CLAUSE__LOOPTEMP_:
   case OMP_CLAUSE_TO:
   case OMP_CLAUSE_FROM:
 -   gcc_assert (gimple_code (ctx->stmt) != GIMPLE_OACC_PARALLEL);
 +   gcc_assert (!gimple_code_is_oacc (ctx->stmt));
 break;
  
 +  case OMP_CLAUSE_HOST:
 +  case OMP_CLAUSE_OACC_DEVICE:
 +  case OMP_CLAUSE_DEVICE_RESIDENT:
 +  case OMP_CLAUSE_USE_DEVICE:
 +  case OMP_CLAUSE_ASYNC:
 +  case OMP_CLAUSE_GANG:
 +  case OMP_CLAUSE_WAIT:
 +  case OMP_NO_CLAUSE_CACHE:
 +  case OMP_CLAUSE_INDEPENDENT:
 +  case OMP_CLAUSE_WORKER:
 +  case OMP_CLAUSE_VECTOR:
 +  case OMP_CLAUSE_NUM_GANGS:
 +  case OMP_CLAUSE_NUM_WORKERS:
 +  case OMP_CLAUSE_VECTOR_LENGTH:
 +sorry ("Clause not supported yet");
 +break;
 +
   default:
 gcc_unreachable ();
   }

Indentation.

 --- a/gcc/tree-core.h
 +++ b/gcc/tree-core.h
 @@ -213,19 +213,19 @@ enum omp_clause_code {
   (c_parser_omp_variable_list).  */
OMP_CLAUSE_ERROR = 0,
  
 -  /* OpenMP clause: private (variable_list).  */
 +  /* OpenMP/OpenACC clause: private (variable_list).  */
   

Simple install hook

2014-02-11 Thread Philip Herron
Added install hook:

/opt/gjit/bin/gcc -g -O2 -Wall t.c -o test -I/opt/gjit/include
-lgccjit -L/opt/gjit/lib

Compiles the helloworld example correctly, and it runs from the install
prefix instead of pointing to my gcc build dir. I'm working on getting more
involved with this and started:

https://github.com/redbrain/spy

It's only just starting to parse stuff, but a C/Python-style language using
gcc as a JIT might be interesting: the idea is a dynamic language for C that
can call C libs safely and easily. Mostly it's just so I can see where to
help out in the jit front-end.

I was also considering some kind of libopcodes work to assemble the code
in memory instead of creating a .so in /tmp. I'm not sure I know enough
about what I'm doing there, but it might be more elegant for stuff using
this front-end.

I'm very impressed by how well this works.

Thanks David.
From c0c1c3d55ad707ef5c62199791acabd1ceea6858 Mon Sep 17 00:00:00 2001
From: Philip redbr...@gcc.gnu.org
Date: Tue, 11 Feb 2014 16:46:15 +
Subject: [PATCH] Add install hook

---
 gcc/jit/Make-lang.in | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/gcc/jit/Make-lang.in b/gcc/jit/Make-lang.in
index 17d27ca..0a468da 100644
--- a/gcc/jit/Make-lang.in
+++ b/gcc/jit/Make-lang.in
@@ -85,7 +85,9 @@ jit.srcman:
 
 #
 # Install hooks:
-jit.install-common:
+jit.install-common: installdirs
+	$(INSTALL_PROGRAM) libgccjit.so $(DESTDIR)/$(libdir)/libgccjit.so
+	$(INSTALL_PROGRAM) $(srcdir)/jit/libgccjit.h $(DESTDIR)/$(includedir)/libgccjit.h
 
 jit.install-man:
 
-- 
1.8.3.2



Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Franz Sirl

Am 2014-02-11 15:36, schrieb Richard Sandiford:

I thought the trend these days was to move towards -Werror, so that for
many people the expected output is to get no warnings at all.  And bear
in mind that the kind of warnings that are not under -W control tend to
be those that are so likely to be a mistake that no-one has ever had an
incentive to make them optional.  I find it hard to believe that
significant numbers of users are not fixing the sources of those
warnings and are instead requiring every release of GCC to produce
warnings with a particular wording.


Hi,

actually at my site we turn more and more warnings into errors, but 
we do it warning by warning with more -Werror=..., so the fine-grained 
warning changes are really nice for us. The problem we face with 
[enabled by default] warnings is not that there are no options to turn 
these warnings off (we _want_ these warnings), but that this also means 
there are no corresponding -Werror= options (and also no 
-Werror=enabled-by-default or -Werror=default-warnings). And pure 
-Werror turns all the other warnings we want to see into errors too :(.
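
To illustrate the distinction being described (the file name and the helper
function below are made up for the example): a diagnostic that belongs to a
named option can be promoted individually with -Werror=<option>, while a
diagnostic that is merely enabled by default offers no such handle short of a
blanket -Werror.

/* example.c: built with e.g.
     gcc -c -Wall -Werror=implicit-function-declaration example.c
   only the implicit-declaration diagnostic becomes a hard error; every
   other warning stays a warning.  A warning that is not tied to any -W
   option has no equivalent -Werror= switch.  */
int
use_helper (void)
{
  return helper (42);	/* warning: implicit declaration of function 'helper' */
}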


regards,
Franz



Re: [patch] Fix wrong code with VCE to bit-field type at -O

2014-02-11 Thread Eric Botcazou
 Hmm.  The intent was of course to only allow truly no-op converts via
 VIEW_CONVERT_EXPR - that is, the size of the operand type and the
 result type should be the same.  So, isn't SRA doing it wrong when
 creating the VIEW_CONVERT_EXPR of a 3-byte type to a 4-byte type?

That's debatable IMO if the 4-byte type has 3-byte precision, but I don't have 
a strong opinion so I can try to fix it in SRA, although it will be weird to 
do low-level fiddling because of precision and size at the Tree level.

-- 
Eric Botcazou


Re: [google gcc-4_8] Don't use gcov counter related ssa name as induction variables

2014-02-11 Thread Xinliang David Li
On Mon, Feb 10, 2014 at 11:48 PM, Wei Mi w...@google.com wrote:
 Here is the updated patch, which follows the UD chain to determine whether
 iv.base is defined by a __gcovx.xxx[] var. It is a lot simpler than
 adding a tree bit.
 
 Regression test and the previously failed benchmark in piii mode are ok.
 Other testing is still going on.

 2014-02-10  Wei Mi  w...@google.com

 * tree-ssa-loop-ivopts.c (defined_by_gcov_counter): New.
 (contains_abnormal_ssa_name_p): Add defined_by_gcov_counter
 check for ssa name.

 * testsuite/gcc.dg/profile-generate-4.c: New.

 Index: tree-ssa-loop-ivopts.c
 ===
 --- tree-ssa-loop-ivopts.c  (revision 207019)
 +++ tree-ssa-loop-ivopts.c  (working copy)
 @@ -705,6 +705,68 @@ idx_contains_abnormal_ssa_name_p (tree b
return !abnormal_ssa_name_p (*index);
  }

 +/* Return true if the use is defined by a gcov counter var.
 +   It is used to check if an iv candidate is generated for
 +   profiling. For profile generated ssa name, we should not
 +   use it as IV because gcov counter may have data-race for
 +   multithread program, it could involve tricky bug to use
 +   such ssa var in IVOPT.
 +

Add the snippets describing how ralloc introduces a second gcov
counter load and asynchronous update of it from another thread leading
to bogus trip count.

Also mention that setting volatile flag on gcov counter accesses may
greatly degrade profile-gen performance.

 +   To limit patterns to be checked, we list the possible cases
 +   here:
 +   Before PRE, the ssa name used to set __gcov counter is as
 +   follows:
 +   for () {
 + PROF_edge_counter_1 = __gcov.foo[i];
 + PROF_edge_counter_2 = PROF_edge_counter_1 + 1;
 + __gcov.foo[i] = PROF_edge_counter_2;
 +   }
 +   If PRE works, the loop may be transformed to:
 +   pretmp_1 = __gcov.foo[i];
 +   for () {
 + prephitmp_1 = PHI (PROF_edge_counter_2, pretmp_1);
 + PROF_edge_counter_1 = prephitmp_1;
 + PROF_edge_counter_2 = PROF_edge_counter_1 + 1;
 + __gcov.foo[i] = PROF_edge_counter_2;
 +   }
 +   So there are two cases:
 +   case1: If PRE doesn't work, PROF_edge_counter_1 and PROF_edge_counter_2
 +   are neither induction variables candidates. We don't have to worry
 +   about this case.
 +   case2: If PRE works, the iv candidate base of PROF_edge_counter_1 and
 +   PROF_edge_counter_2 are pretmp_1 or pretmp_1 + 1. pretmp_1 is defined
 +   by __gcov var.
 +
 +   So this func only has to check case2. For a ssa name which is an iv
 +   candidate, check its base USE and see if it is defined by __gcov var.
 +   Returning true means the ssa name is generated for profiling.  */
 +
 +bool
 +defined_by_gcov_counter (tree use)
 +{
 +  gimple stmt;
 +  tree rhs, decl;
 +  const char *name;
 +
 +  stmt = SSA_NAME_DEF_STMT (use);
 +  if (!is_gimple_assign (stmt))
 +return false;
 +
 +  rhs = gimple_assign_rhs1 (stmt);
 +  if (TREE_CODE (rhs) != ARRAY_REF)
 +return false;
 +
 +  decl = TREE_OPERAND (rhs, 0);
 +  if (TREE_CODE (decl) != VAR_DECL)
 +return false;



Also check TREE_STATIC and DECL_ARTIFICIAL flag.


David

 +
 +  name = IDENTIFIER_POINTER (DECL_NAME (decl));
 +  if (strncmp (name, "__gcov", 6))
 +return false;
 +
 +  return true;
 +}
 +
  /* Returns true if EXPR contains a ssa name that occurs in an
 abnormal phi node.  */

 @@ -721,7 +783,8 @@ contains_abnormal_ssa_name_p (tree expr)
codeclass = TREE_CODE_CLASS (code);

if (code == SSA_NAME)
 -return SSA_NAME_OCCURS_IN_ABNORMAL_PHI (expr) != 0;
 +return SSA_NAME_OCCURS_IN_ABNORMAL_PHI (expr) != 0
 +  || defined_by_gcov_counter (expr);

if (code == INTEGER_CST
|| is_gimple_min_invariant (expr))
 Index: testsuite/gcc.dg/profile-generate-4.c
 ===
 --- testsuite/gcc.dg/profile-generate-4.c   (revision 0)
 +++ testsuite/gcc.dg/profile-generate-4.c   (revision 0)
 @@ -0,0 +1,21 @@
 +/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
 +/* { dg-options "-O2 -fprofile-generate -fno-tree-loop-im
 -fdump-tree-ivopts-details-blocks" } */
 +
 +/* Because gcov counter related var has data race for multithread program,
 +   compiler should prevent them from affecting program correctness. So
 +   PROF_edge_counter variable should not be used as induction variable, or
 +   else IVOPT may use such variable to compute loop boundary.  */
 +
 +void *ptr;
 +int N;
 +
 +void foo(void *t) {
 +  int i;
 +  for (i = 0; i < N; i++) {
 +t = *(void **)t;
 +  }
 +  ptr = t;
 +}
 +
 +/* { dg-final { scan-tree-dump-times "ssa name PROF_edge_counter" 0
 "ivopts" } } */
 +/* { dg-final { cleanup-tree-dump "ivopts" } } */


[PATCH] Use VIEW_CONVERT_EXPR in SRA created debug stmts if needed (PR debug/59776)

2014-02-11 Thread Jakub Jelinek
Hi!

As discussed in the PR, this patch adds VCE if types aren't compatible,
in order not to create invalid debug stmts.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-02-11  Richard Henderson  r...@redhat.com
Jakub Jelinek  ja...@redhat.com

PR debug/59776
* tree-sra.c (load_assign_lhs_subreplacements): Add VIEW_CONVERT_EXPR
around drhs if type conversion to lacc-type is not useless.

* gcc.dg/guality/pr59776.c: New test.

--- gcc/tree-sra.c.jj   2014-02-08 00:53:46.0 +0100
+++ gcc/tree-sra.c  2014-02-11 14:31:51.469937602 +0100
@@ -2950,6 +2950,10 @@ load_assign_lhs_subreplacements (struct
  lacc);
  else
drhs = NULL_TREE;
+ if (drhs
+  && !useless_type_conversion_p (lacc->type, TREE_TYPE (drhs)))
+   drhs = fold_build1_loc (loc, VIEW_CONVERT_EXPR,
+   lacc->type, drhs);
  ds = gimple_build_debug_bind (get_access_replacement (lacc),
drhs, gsi_stmt (*old_gsi));
  gsi_insert_after (new_gsi, ds, GSI_NEW_STMT);
--- gcc/testsuite/gcc.dg/guality/pr59776.c.jj   2014-02-11 14:44:04.604957250 
+0100
+++ gcc/testsuite/gcc.dg/guality/pr59776.c  2014-02-11 14:52:57.0 
+0100
@@ -0,0 +1,29 @@
+/* PR debug/59776 */
+/* { dg-do run } */
+/* { dg-options "-g" } */
+
+#include "../nop.h"
+
+struct S { float f, g; };
+
+__attribute__((noinline, noclone)) void
+foo (struct S *p)
+{
+  struct S s1, s2; /* { dg-final { gdb-test pr59776.c:17 
"s1.f" "5.0" } } */
+  s1 = *p; /* { dg-final { gdb-test pr59776.c:17 
"s1.g" "6.0" } } */
+  s2 = s1; /* { dg-final { gdb-test pr59776.c:17 
"s2.f" "0.0" } } */
+  *(int *) &s2.f = 0;  /* { dg-final { gdb-test pr59776.c:17 
"s2.g" "6.0" } } */
+  asm volatile (NOP : : : "memory");   /* { dg-final { gdb-test pr59776.c:20 
"s1.f" "5.0" } } */
+  asm volatile (NOP : : : "memory");   /* { dg-final { gdb-test pr59776.c:20 
"s1.g" "6.0" } } */
+  s2 = s1; /* { dg-final { gdb-test pr59776.c:20 
"s2.f" "5.0" } } */
+  asm volatile (NOP : : : "memory");   /* { dg-final { gdb-test pr59776.c:20 
"s2.g" "6.0" } } */
+  asm volatile (NOP : : : "memory");
+}
+
+int
+main ()
+{
+  struct S x = { 5.0f, 6.0f };
+  foo (&x);
+  return 0;
+}

Jakub


Re: [PATCH] Use VIEW_CONVERT_EXPR in SRA created debug stmts if needed (PR debug/59776)

2014-02-11 Thread Richard Henderson
On 02/11/2014 09:36 AM, Jakub Jelinek wrote:
 Hi!
 
 As discussed in the PR, this patch adds VCE if types aren't compatible,
 in order not to create invalid debug stmts.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 
 2014-02-11  Richard Henderson  r...@redhat.com
   Jakub Jelinek  ja...@redhat.com
 
   PR debug/59776
   * tree-sra.c (load_assign_lhs_subreplacements): Add VIEW_CONVERT_EXPR
   around drhs if type conversion to lacc-type is not useless.
 
   * gcc.dg/guality/pr59776.c: New test.

Ok.

Nice test case.


r~


[PATCH] Fix up -Wsequence-point handling (PR c/60101)

2014-02-11 Thread Jakub Jelinek
Hi!

Right now we spend several minutes in verify_tree.  The problem I see
is that merge_tlist does the exact opposite of what it should be doing
with the copy flag (most merge_tlist calls are with copy=0, which means
mostly unnecessary copying, but for the single one for SAVE_EXPR it
actually probably breaks things, or can break them).  The middle hunk
is about one spot which IMHO uses copy=1 unnecessarily (nothing uses
tmp_list2 afterwards).

The most important is the last hunk: the SAVE_EXPR handling was doing
merge_tlist first on the whole tmp_nosp chain (with copy=0, which
mistakenly copied everything, see above), and then doing that again with
the whole tmp_nosp list except the first entry, then except the first
two entries, etc.  That is O(n^2) complexity, but IMHO none of the calls
but the first one would actually do anything, because after the first
merge_tlist call all entries should actually be in tmp_list3 (except for
duplicates), and so all further entries would already find a matching
entry (found=1).

With this patch pr60101.c compiles within less than a second.
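
For context only (this is not part of the patch): the tlist machinery exists
to implement -Wsequence-point, whose canonical trigger looks like the minimal
example below; the function name is made up for illustration.

/* -Wsequence-point illustration: 'i' is modified twice without an
   intervening sequence point, so the behaviour is undefined and GCC
   warns about it.  */
int
bump_twice (int i)
{
  i = i++ + 1;	/* warning: operation on 'i' may be undefined */
  return i;
}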

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-02-11  Jakub Jelinek  ja...@redhat.com

PR c/60101
* c-common.c (merge_tlist): If copy is true, call new_tlist,
if false, add ADD itself, rather than vice versa.
(verify_tree): For COND_EXPR, don't call merge_tlist with non-zero
copy.  For SAVE_EXPR, only call merge_tlist once.

* c-c++-common/pr60101.c: New test.

--- gcc/c-family/c-common.c.jj  2014-02-08 10:07:12.0 +0100
+++ gcc/c-family/c-common.c 2014-02-11 13:18:53.270521683 +0100
@@ -2976,7 +2976,7 @@ merge_tlist (struct tlist **to, struct t
  }
   if (!found)
{
- *end = copy ? add : new_tlist (NULL, add->expr, add->writer);
+ *end = copy ? new_tlist (NULL, add->expr, add->writer) : add;
  end = &(*end)->next;
  *end = 0;
}
@@ -3134,7 +3134,7 @@ verify_tree (tree x, struct tlist **pbef
   verify_tree (TREE_OPERAND (x, 0), &tmp_before, &tmp_list2, NULL_TREE);
   warn_for_collisions (tmp_list2);
   merge_tlist (pbefore_sp, tmp_before, 0);
-  merge_tlist (pbefore_sp, tmp_list2, 1);
+  merge_tlist (pbefore_sp, tmp_list2, 0);
 
   tmp_list3 = tmp_nosp = 0;
   verify_tree (TREE_OPERAND (x, 1), &tmp_list3, &tmp_nosp, NULL_TREE);
@@ -3238,12 +3238,7 @@ verify_tree (tree x, struct tlist **pbef
warn_for_collisions (tmp_nosp);
 
tmp_list3 = 0;
-   while (tmp_nosp)
- {
-   struct tlist *t = tmp_nosp;
-   tmp_nosp = t->next;
-   merge_tlist (&tmp_list3, t, 0);
- }
+   merge_tlist (&tmp_list3, tmp_nosp, 0);
 t->cache_before_sp = tmp_before;
 t->cache_after_sp = tmp_list3;
  }
--- gcc/testsuite/c-c++-common/pr60101.c.jj 2014-02-11 14:55:20.728288546 
+0100
+++ gcc/testsuite/c-c++-common/pr60101.c2014-02-11 13:37:28.0 
+0100
@@ -0,0 +1,112 @@
+/* PR c/60101 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wall" } */
+
+extern int *a, b, *c, *d;
+
+void
+foo (double _Complex *x, double _Complex *y, double _Complex *z, unsigned int 
l, int w)
+{
+  unsigned int e = (unsigned int) a[3];
+  double _Complex (*v)[l][4][e][l][4] = (double _Complex (*)[l][4][e][l][4]) z;
+  double _Complex (*f)[l][b][l] = (double _Complex (*)[l][b][l]) y;
+  unsigned int g = c[0] * c[1] * c[2];
+  unsigned int h = d[0] + c[0] * (d[1] + c[1] * d[2]);
+  unsigned int i;
+
+  for (i = 0; i < e; i++)
+{
+  int j = e * d[3] + i;
+
+  unsigned int n0, n1, n2, n3, n4, n5, n6, n7, n8, n9, n10, n11;
+  float _Complex s = 0.;
+  unsigned int t = 0;
+
+  for (n0 = 0; n0 < l; n0++)
+   for (n1 = 0; n1 < l; n1++)
+ for (n2 = 0; n2 < l; n2++)
+   for (n3 = 0; n3 < l; n3++)
+ for (n4 = 0; n4 < l; n4++)
+   for (n5 = 0; n5 < l; n5++)
+ for (n6 = 0; n6 < l; n6++)
+   for (n7 = 0; n7 < l; n7++)
+ for (n8 = 0; n8 < l; n8++)
+   for (n9 = 0; n9 < l; n9++)
+ for (n10 = 0; n10 < l; n10++)
+   for (n11 = 0; n11 < l; n11++)
+ {
+   if (t % g == h)
+ s
+   += f[n0][n4][j][n8] * f[n1][n5][j][n9] * 
~(f[n2][n6][w][n10]) * ~(f[n3][n7][w][n11])
+  * (+0.25 * v[0][n2][0][i][n9][1] * 
v[0][n3][0][i][n5][1] * v[0][n10][0][i][n4][1]
+ * v[0][n7][1][i][n8][0] * 
v[0][n11][1][i][n1][0] * v[0][n6][1][i][n0][0]
+ + 0.25 * v[0][n2][0][i][n9][1] * 
v[0][n3][0][i][n5][1] * v[0][n10][0][i][n4][1]
+ * v[0][n11][1][i][n8][0] * 
v[0][n6][1][i][n1][0] * 

Re: [PING]Re: [PATCH][ARM]fix potential testsuite/gcc.target/arm/fixed_float_conversion.c regression

2014-02-11 Thread H.J. Lu
On Mon, Feb 10, 2014 at 6:02 AM, Renlin Li renlin...@arm.com wrote:
 On 03/02/14 15:56, Renlin Li wrote:

 Hi all,

 This patch will ensure testsuite/gcc.target/arm/fixed_float_conversion.c
 is checked only when -mfpu=vfp3 -mfloat-abi=softfp is applicable for the
 target.

 Accordingly, two procs (check_effective_target_arm_vfp3_ok and
 add_options_for_arm_vfp3) are added into
 gcc/testsuite/lib/target-supports.exp.

 I have also updated related documentation.

 Okay for trunk?

 Kind regards,
 Renlin Li


 gcc/testsuite/ChangeLog:

 2014-02-03  Renlin Li  renlin...@arm.com

 * gcc.target/arm/fixed_float_conversion.c: Add arm_vfp3 option to the
 test case.
 * lib/target-supports.exp: check_effective_target_arm_vfp3_ok: New.
 add_options_for_arm_vfp3: New.


 gcc/ChangeLog:

 2014-02-03  Renlin Li  renlin...@arm.com

 * doc/sourcebuild.texi: Document check_effective_target_arm_vfp3_ok
 and
 add_options_for_arm_vfp3


 Hi,

 Could anybody help me review this patch?


This breaks bootstrap:

../../src-trunk/gcc/doc/sourcebuild.texi:1963: @ref reference to
nonexistent node `arm_vfp3_ok'

I am checking in this as an obvious fix.


-- 
H.J.
--
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 1ea5753..85ef819 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1552,6 +1552,7 @@ ARM target supports @code{-mfpu=vfp -mfloat-abi=softfp}.
 Some multilibs may be incompatible with these options.

 @item arm_vfp3_ok
+@anchor{arm_vfp3_ok}
 ARM target supports @code{-mfpu=vfp3 -mfloat-abi=softfp}.
 Some multilibs may be incompatible with these options.


Re: [PING]Re: [PATCH][ARM]fix potential testsuite/gcc.target/arm/fixed_float_conversion.c regression

2014-02-11 Thread H.J. Lu
On Tue, Feb 11, 2014 at 9:46 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Mon, Feb 10, 2014 at 6:02 AM, Renlin Li renlin...@arm.com wrote:
 On 03/02/14 15:56, Renlin Li wrote:

 Hi all,

 This patch will ensure testsuite/gcc.target/arm/fixed_float_conversion.c
 is checked only when -mfpu=vfp3 -mfloat-abi=softfp is applicable for the
 target.

 Accordingly, two procs (check_effective_target_arm_vfp3_ok and
 add_options_for_arm_vfp3) are added into
 gcc/testsuite/lib/target-supports.exp.

 I have also updated related documentation.

 Okay for trunk?

 Kind regards,
 Renlin Li


 gcc/testsuite/ChangeLog:

 2014-02-03  Renlin Li  renlin...@arm.com

 * gcc.target/arm/fixed_float_conversion.c: Add arm_vfp3 option to the
 test case.
 * lib/target-supports.exp: check_effective_target_arm_vfp3_ok: New.
 add_options_for_arm_vfp3: New.


 gcc/ChangeLog:

 2014-02-03  Renlin Li  renlin...@arm.com

 * doc/sourcebuild.texi: Document check_effective_target_arm_vfp3_ok
 and
 add_options_for_arm_vfp3


 Hi,

 Could anybody help me review this patch?


 This breaks bootstrap:

 ../../src-trunk/gcc/doc/sourcebuild.texi:1963: @ref reference to
 nonexistent node `arm_vfp3_ok'

 I am checking in this as an obvious fix.


 --
 H.J.
 --
 diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
 index 1ea5753..85ef819 100644
 --- a/gcc/doc/sourcebuild.texi
 +++ b/gcc/doc/sourcebuild.texi
 @@ -1552,6 +1552,7 @@ ARM target supports @code{-mfpu=vfp -mfloat-abi=softfp}.
  Some multilibs may be incompatible with these options.

  @item arm_vfp3_ok
 +@anchor{arm_vfp3_ok}
  ARM target supports @code{-mfpu=vfp3 -mfloat-abi=softfp}.
  Some multilibs may be incompatible with these options.

It has been fixed by Uros.


-- 
H.J.


Re: [PING]Re: [PATCH][ARM]fix potential testsuite/gcc.target/arm/fixed_float_conversion.c regression

2014-02-11 Thread Renlin Li

On 11/02/14 17:47, H.J. Lu wrote:

On Tue, Feb 11, 2014 at 9:46 AM, H.J. Lu hjl.to...@gmail.com wrote:

On Mon, Feb 10, 2014 at 6:02 AM, Renlin Li renlin...@arm.com wrote:

On 03/02/14 15:56, Renlin Li wrote:

Hi all,

This patch will ensure testsuite/gcc.target/arm/fixed_float_conversion.c
is checked only when -mfpu=vfp3 -mfloat-abi=softfp is applicable for the
target.

Accordingly, two procs (check_effective_target_arm_vfp3_ok and
add_options_for_arm_vfp3) are added into
gcc/testsuite/lib/target-supports.exp.

I have also updated related documentation.

Okay for trunk?

Kind regards,
Renlin Li


gcc/testsuite/ChangeLog:

2014-02-03  Renlin Li  renlin...@arm.com

 * gcc.target/arm/fixed_float_conversion.c: Add arm_vfp3 option to the
test case.
 * lib/target-supports.exp: check_effective_target_arm_vfp3_ok: New.
 add_options_for_arm_vfp3: New.


gcc/ChangeLog:

2014-02-03  Renlin Li  renlin...@arm.com

 * doc/sourcebuild.texi: Document check_effective_target_arm_vfp3_ok
and
 add_options_for_arm_vfp3


Hi,

Could anybody help me review this patch?


This breaks bootstrap:

../../src-trunk/gcc/doc/sourcebuild.texi:1963: @ref reference to
nonexistent node `arm_vfp3_ok'

I am checking in this as an obvious fix.


--
H.J.
--
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index 1ea5753..85ef819 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1552,6 +1552,7 @@ ARM target supports @code{-mfpu=vfp -mfloat-abi=softfp}.
  Some multilibs may be incompatible with these options.

  @item arm_vfp3_ok
+@anchor{arm_vfp3_ok}
  ARM target supports @code{-mfpu=vfp3 -mfloat-abi=softfp}.
  Some multilibs may be incompatible with these options.

It has been fixed by Uros.



Hi Uros  H.J. ,

Thank you for the fix!
It's definitely my fault for not double-checking the patch.

Kind regards,
Renlin



Re: [PATCH][LTO] Fix PR60060

2014-02-11 Thread Jan Hubicka
 
 This fixes the ICE seen in PR60060 which stems from us asking to
 output debug-info for a function-scope global twice - once via
 emitting the function and its BLOCK-associated vars and once
 via the debug-hook through lto_write_globals ->
 emit_debug_global_declarations.
 
 The list of global variables does not agree with those from
 the frontends (that function-scope global is not in GFortrans list).

Yes, function scope statics are not supposed to be in the global list,
only in the symbol table.
 if (flag_wpa)
   return;
   
 !   /* Output debug info for global variables.  */  
 varpool_node *vnode;
 FOR_EACH_DEFINED_VARIABLE (vnode)
! if (!decl_function_context (vnode->decl))
!   debug_hooks->global_decl (vnode->decl);

If it works, it looks OK to me :)  Wrapup_global_decls does more than
just debug output, but it is already done at compilation time,
so we probably do not need to re-do it.

Honza
   }
   
   static tree


[jit] Improvements to error-handling

2014-02-11 Thread David Malcolm
Committed to branch dmalcolm/jit:

gcc/jit/
* internal-api.c (gcc::jit::recording::context::add_error_va): If
GCC_JIT_STR_OPTION_PROGNAME is NULL, use libgccjit.so, as per
the comment in libgccjit.h
(gcc::jit::recording::label::replay_into): When reporting on an
unplaced label, include the name of the containing function in
the error message.
* libgccjit.c: Remove comment about Functions for use within the
code factory, as this distinction went away in commit
96b218c9a1d5f39fb649e02c0e77586b180e8516.
(RETURN_VAL_IF_FAIL_PRINTF4): New.
(RETURN_NULL_IF_FAIL_PRINTF4): New.
(jit_error): Invoke vfprintf with the correct format string in
the NULL context case.
(gcc_jit_context_new_call): Check for NULL entries within the
array of arguments.
---
 gcc/jit/ChangeLog.jit  | 18 ++
 gcc/jit/internal-api.c | 10 --
 gcc/jit/libgccjit.c| 29 +
 3 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 68c38db..d331082 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,21 @@
+2014-02-11  David Malcolm  dmalc...@redhat.com
+
+   * internal-api.c (gcc::jit::recording::context::add_error_va): If
+   GCC_JIT_STR_OPTION_PROGNAME is NULL, use libgccjit.so, as per
+   the comment in libgccjit.h
+   (gcc::jit::recording::label::replay_into): When reporting on an
+   unplaced label, include the name of the containing function in
+   the error message.
+   * libgccjit.c: Remove comment about Functions for use within the
+   code factory, as this distinction went away in commit
+   96b218c9a1d5f39fb649e02c0e77586b180e8516.
+   (RETURN_VAL_IF_FAIL_PRINTF4): New.
+   (RETURN_NULL_IF_FAIL_PRINTF4): New.
+   (jit_error): Invoke vfprintf with the correct format string in
+   the NULL context case.
+   (gcc_jit_context_new_call): Check for NULL entries within the
+   array of arguments.
+
 2014-02-10  David Malcolm  dmalc...@redhat.com
 
* libgccjit.h (gcc_jit_context_get_int_type): New.
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 9ff0eb8..6c66d3d 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -500,8 +500,12 @@ recording::context::add_error_va (const char *fmt, va_list 
ap)
   char buf[1024];
   vsnprintf (buf, sizeof (buf) - 1, fmt, ap);
 
+  const char *progname = get_str_option (GCC_JIT_STR_OPTION_PROGNAME);
+  if (!progname)
+progname = "libgccjit.so";
+
   fprintf (stderr, "%s: %s\n",
-  get_str_option (GCC_JIT_STR_OPTION_PROGNAME),
+  progname,
   buf);
 
   if (!m_error_count)
@@ -1075,7 +1079,9 @@ recording::label::replay_into (replayer *r)
 {
   if (!m_has_been_placed)
 {
-  r->add_error ("unplaced label: %s", get_debug_string ());
+  r->add_error ("unplaced label within %s: %s",
+   m_func->get_debug_string (),
+   get_debug_string ());
   return;
 }
   set_playback_obj (m_func->playback_function ()
diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
index 94b3271..3c2d962 100644
--- a/gcc/jit/libgccjit.c
+++ b/gcc/jit/libgccjit.c
@@ -100,6 +100,16 @@ struct gcc_jit_loop : public gcc::jit::recording::loop
   }\
   JIT_END_STMT
 
+#define RETURN_VAL_IF_FAIL_PRINTF4(TEST_EXPR, RETURN_EXPR, CTXT, ERR_FMT, A0, 
A1, A2, A3) \
+  JIT_BEGIN_STMT   \
+if (!(TEST_EXPR))  \
+  {\
+   jit_error ((CTXT), "%s: " ERR_FMT,  \
+  __func__, (A0), (A1), (A2), (A3));   \
+   return (RETURN_EXPR);   \
+  }\
+  JIT_END_STMT
+
 #define RETURN_VAL_IF_FAIL_PRINTF6(TEST_EXPR, RETURN_EXPR, CTXT, ERR_FMT, A0, 
A1, A2, A3, A4, A5) \
   JIT_BEGIN_STMT   \
 if (!(TEST_EXPR))  \
@@ -119,6 +129,9 @@ struct gcc_jit_loop : public gcc::jit::recording::loop
 #define RETURN_NULL_IF_FAIL_PRINTF3(TEST_EXPR, CTXT, ERR_FMT, A0, A1, A2) \
   RETURN_VAL_IF_FAIL_PRINTF3 (TEST_EXPR, NULL, CTXT, ERR_FMT, A0, A1, A2)
 
+#define RETURN_NULL_IF_FAIL_PRINTF4(TEST_EXPR, CTXT, ERR_FMT, A0, A1, A2, A3) \
+  RETURN_VAL_IF_FAIL_PRINTF4 (TEST_EXPR, NULL, CTXT, ERR_FMT, A0, A1, A2, A3)
+
 #define RETURN_NULL_IF_FAIL_PRINTF6(TEST_EXPR, CTXT, ERR_FMT, A0, A1, A2, A3, 
A4, A5) \
   RETURN_VAL_IF_FAIL_PRINTF6 (TEST_EXPR, NULL, CTXT, ERR_FMT, A0, A1, A2, A3, 
A4, A5)
 
@@ -174,7 +187,8 @@ jit_error (gcc::jit::recording::context *ctxt, const char 

Re: [google gcc-4_8] Don't use gcov counter related ssa name as induction variables

2014-02-11 Thread Wei Mi
 +/* Return true if the use is defined by a gcov counter var.
 +   It is used to check if an iv candidate is generated for
 +   profiling. For profile generated ssa name, we should not
 +   use it as IV because gcov counter may have data-race for
 +   multithread program, it could involve tricky bug to use
 +   such ssa var in IVOPT.
 +

 Add the snippets describing how ralloc introduces a second gcov
 counter load and asynchronous update of it from another thread leading
 to bogus trip count.

 Also mention that setting volatile flag on gcov counter accesses may
 greatly degrade profile-gen performance.


Comments added.

 +  if (!is_gimple_assign (stmt))
 +return false;
 +
 +  rhs = gimple_assign_rhs1 (stmt);
 +  if (TREE_CODE (rhs) != ARRAY_REF)
 +return false;
 +
 +  decl = TREE_OPERAND (rhs, 0);
 +  if (TREE_CODE (decl) != VAR_DECL)
 +return false;



 Also check TREE_STATIC and DECL_ARTIFICIAL flag.


 David


Check added. Add DECL_ARTIFICIAL setting in build_var() in coverage.c.

The updated patch is attached.

Thanks,
Wei.
2014-02-11  Wei Mi  w...@google.com

* tree-ssa-loop-ivopts.c (defined_by_gcov_counter): New.
(contains_abnormal_ssa_name_p): Add defined_by_gcov_counter
check for ssa name.
* coverage.c (build_var): Set DECL_ARTIFICIAL(gcov var decl)
to be 1.

* testsuite/gcc.dg/profile-generate-4.c: New.

Index: testsuite/gcc.dg/profile-generate-4.c
===
--- testsuite/gcc.dg/profile-generate-4.c   (revision 0)
+++ testsuite/gcc.dg/profile-generate-4.c   (revision 0)
@@ -0,0 +1,21 @@
+/* { dg-do compile { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-options "-O2 -fprofile-generate -fno-tree-loop-im 
-fdump-tree-ivopts-details-blocks" } */
+
+/* Because gcov counter related var has data race for multithread program,
+   compiler should prevent them from affecting program correctness. So
+   PROF_edge_counter variable should not be used as induction variable, or
+   else IVOPT may use such variable to compute loop boundary.  */
+
+void *ptr;
+int N;
+
+void foo(void *t) {
+  int i;
+  for (i = 0; i < N; i++) {
+t = *(void **)t;
+  }
+  ptr = t;
+}
+
+/* { dg-final { scan-tree-dump-times "ssa name PROF_edge_counter" 0 "ivopts" 
} } */
+/* { dg-final { cleanup-tree-dump "ivopts" } } */
Index: coverage.c
===
--- coverage.c  (revision 207019)
+++ coverage.c  (working copy)
@@ -1485,6 +1485,7 @@ build_var (tree fn_decl, tree type, int
   TREE_STATIC (var) = 1;
   TREE_ADDRESSABLE (var) = 1;
   DECL_ALIGN (var) = TYPE_ALIGN (type);
+  DECL_ARTIFICIAL (var) = 1;
 
   return var;
 }
Index: tree-ssa-loop-ivopts.c
===
--- tree-ssa-loop-ivopts.c  (revision 207019)
+++ tree-ssa-loop-ivopts.c  (working copy)
@@ -705,6 +705,115 @@ idx_contains_abnormal_ssa_name_p (tree b
   return !abnormal_ssa_name_p (*index);
 }
 
+/* Return true if the use is defined by a gcov counter var.
+   It is used to check if an iv candidate is generated for
+   profiling. For profile generated ssa name, we should not
+   use it as IV because gcov counter may have data-race for
+   multithread program, it is compiler's responsibility to
+   avoid connecting profile counter related vars with program
+   correctness.
+
+   Without the check, the following bug could happen in
+   following case:
+   * original loop
+
+ int i;
+ for (i = 0; i < N; i++) {
+   t = *(void **)t;
+ }
+
+   * after profile-gen and IVOPT, loop condition is replaced and
+ pretmp_1 is involved in loop boundary computation.
+
+ pretmp_1 = __gcov0.foo[0];
+ _22 = pretmp_1 + 1;
+ ...
+ _31 = (unsigned long) pretmp_1;
+ _32 = _30 + _31;
+ _33 = _32 + 2;
+  label:
+ ivtmp.8_9 = PHI <ivtmp.8_5(5), ivtmp.8_2(3)>
+ PROF_edge_counter_10 = (long int) ivtmp.8_9;
+ __gcov0.foo[0] = PROF_edge_counter_10;
+   ...
+ ivtmp.8_5 = ivtmp.8_9 + 1;
+ if (ivtmp.8_5 != _33)
+   goto label
+
+   * after register allocation, pretmp_1 may be marked as REG_EQUIV in IRA
+ with __gcov0.foo[0] and some references are replaced by __gcov0.foo in 
LRA.
+
+ _22 = __gcov0.foo[0] + 1;
+ ...
+ _31 = (unsigned long) __gcov0.foo[0];
+ _32 = _30 + _31;
+ _33 = _32 + 2;
+  label:
+ 
+
+   * Bug happens when __gcov0.foo[0] is updated asynchronously by other thread
+ between the above __gcov0.foo[0] references statements.
+
+   We don't choose to mark gcov counter as volatile because it may greatly
+   degrade profile-gen performance.
+
+   To limit patterns to be checked, we list the possible cases
+   here:
+   Before PRE, the ssa name used to set __gcov counter is as
+   follows:
+   for () {
+ PROF_edge_counter_1 = __gcov.foo[i];
+ PROF_edge_counter_2 = PROF_edge_counter_1 + 1;
+ __gcov.foo[i] = PROF_edge_counter_2;
+   }
+   If 

fix typo in doc/generic.texi

2014-02-11 Thread Prathamesh Kulkarni
* (generic.texi): Fix typo

Index: generic.texi
===
--- generic.texi(revision 207627)
+++ generic.texi(working copy)
@@ -53,7 +53,7 @@ seems inelegant.
 @node Deficiencies
 @section Deficiencies

-There are many places in which this document is incomplet and incorrekt.
+There are many places in which this document is incomplete and incorrect.
 It is, as of yet, only @emph{preliminary} documentation.

 @c -


Re: [google gcc-4_8] Don't use gcov counter related ssa name as induction variables

2014-02-11 Thread Xinliang David Li
ok for google branch for now.  Please resubmit to trunk and backport
the trunk version if needed.

thanks,

David

On Tue, Feb 11, 2014 at 10:29 AM, Wei Mi w...@google.com wrote:
 +/* Return true if the use is defined by a gcov counter var.
 +   It is used to check if an iv candidate is generated for
 +   profiling. For profile generated ssa name, we should not
 +   use it as IV because gcov counter may have data-race for
 +   multithread program, it could involve tricky bug to use
 +   such ssa var in IVOPT.
 +

 Add the snippets describing how ralloc introduces a second gcov
 counter load and asynchronous update of it from another thread leading
 to bogus trip count.

 Also mention that setting volatile flag on gcov counter accesses may
 greatly degrade profile-gen performance.


 Comments added.

 +  if (!is_gimple_assign (stmt))
 +return false;
 +
 +  rhs = gimple_assign_rhs1 (stmt);
 +  if (TREE_CODE (rhs) != ARRAY_REF)
 +return false;
 +
 +  decl = TREE_OPERAND (rhs, 0);
 +  if (TREE_CODE (decl) != VAR_DECL)
 +return false;



 Also check TREE_STATIC and DECL_ARTIFICIAL flag.


 David


 Check added. Add DECL_ARTIFICIAL setting in build_var() in coverage.c.

 The updated patch is attached.

 Thanks,
 Wei.


Re: fix typo in doc/generic.texi

2014-02-11 Thread Marek Polacek
On Wed, Feb 12, 2014 at 12:00:30AM +0530, Prathamesh Kulkarni wrote:
 * (generic.texi): Fix typo
 
 Index: generic.texi
 ===
 --- generic.texi(revision 207627)
 +++ generic.texi(working copy)
 @@ -53,7 +53,7 @@ seems inelegant.
  @node Deficiencies
  @section Deficiencies
 
 -There are many places in which this document is incomplet and incorrekt.
 +There are many places in which this document is incomplete and incorrect.

I believe this typo is intentional.

Marek


[PATCH, testsuite]: Revert PR testsuite/58630 on mainline

2014-02-11 Thread Uros Bizjak
Hello!

Now that Richard fixed ms_abi limitation w.r.t.
accumulate-outgoing-args, we can revert PR testsuite/58630 fix on
mainline. On mainline, ms_abi works without problems with and without
this option.

2014-02-11  Uros Bizjak  ubiz...@gmail.com

PR target/59927
Revert
2013-12-15  Uros Bizjak  ubiz...@gmail.com

PR testsuite/58630
* gcc.target/i386/pr43662.c (dg-options):
Add -maccumulate-outgoing-args.
* gcc.target/i386/pr43869.c (dg-options): Ditto.
* gcc.target/i386/pr57003.c (dg-options): Ditto.
* gcc.target/i386/avx-vzeroupper-16.c (dg-options):
Remove -mtune=generic and add -maccumulate-outgoing-args instead.
* gcc.target/i386/avx-vzeroupper-17.c (dg-options): Ditto.
* gcc.target/i386/avx-vzeroupper-18.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/func-1.c (dg-options):
Add -maccumulate-outgoing-args.
* gcc.target/x86_64/abi/callabi/func-2a.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/func-2b.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/func-indirect.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/func-indirect-2a.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/func-indirect-2b.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/leaf-1.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/leaf-2.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/pr38891.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/vaarg-1.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/vaarg-2.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/vaarg-3.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/vaarg-4a.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/vaarg-4b.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/vaarg-5a.c (dg-options): Ditto.
* gcc.target/x86_64/abi/callabi/vaarg-5b.c (dg-options): Ditto.

Tested on x86_64-pc-linux-gnu, with and without --with-arch=core2
--with-cpu=core2 configured compiler.

Committed to mainline SVN.

Uros.
Index: ChangeLog
===
--- ChangeLog   (revision 207693)
+++ ChangeLog   (working copy)
@@ -1,3 +1,36 @@
+2014-02-11  Uros Bizjak  ubiz...@gmail.com
+
+   PR target/59927
+   Revert
+   2013-12-15  Uros Bizjak  ubiz...@gmail.com
+
+   PR testsuite/58630
+   * gcc.target/i386/pr43662.c (dg-options):
+   Add -maccumulate-outgoing-args.
+   * gcc.target/i386/pr43869.c (dg-options): Ditto.
+   * gcc.target/i386/pr57003.c (dg-options): Ditto.
+   * gcc.target/i386/avx-vzeroupper-16.c (dg-options):
+   Remove -mtune=generic and add -maccumulate-outgoing-args instead.
+   * gcc.target/i386/avx-vzeroupper-17.c (dg-options): Ditto.
+   * gcc.target/i386/avx-vzeroupper-18.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/func-1.c (dg-options):
+   Add -maccumulate-outgoing-args.
+   * gcc.target/x86_64/abi/callabi/func-2a.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/func-2b.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/func-indirect.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/func-indirect-2a.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/func-indirect-2b.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/leaf-1.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/leaf-2.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/pr38891.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/vaarg-1.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/vaarg-2.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/vaarg-3.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/vaarg-4a.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/vaarg-4b.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/vaarg-5a.c (dg-options): Ditto.
+   * gcc.target/x86_64/abi/callabi/vaarg-5b.c (dg-options): Ditto.
+
 2014-02-11  Renlin Li  renlin...@arm.com
 
* gcc.target/arm/fixed_float_conversion.c: Add arm_vfp3 option.
Index: gcc.target/i386/avx-vzeroupper-16.c
===
--- gcc.target/i386/avx-vzeroupper-16.c (revision 207693)
+++ gcc.target/i386/avx-vzeroupper-16.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile { target lp64 } } */
-/* { dg-options -O2 -mavx -mabi=ms -maccumulate-outgoing-args -dp } */
+/* { dg-options -O2 -mavx -mabi=ms -dp } */
 
 typedef float __m256 __attribute__ ((__vector_size__ (32), __may_alias__));
 
Index: gcc.target/i386/avx-vzeroupper-17.c
===
--- gcc.target/i386/avx-vzeroupper-17.c (revision 207693)
+++ gcc.target/i386/avx-vzeroupper-17.c (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile { target lp64 } } */
-/* { dg-options -O2 -mavx -mabi=ms -maccumulate-outgoing-args 

Re: Use [warning enabled by default] for default warnings

2014-02-11 Thread Ian Lance Taylor
On Tue, Feb 11, 2014 at 5:20 AM, Richard Biener
richard.guent...@gmail.com wrote:
 On Tue, Feb 11, 2014 at 1:48 PM, Richard Sandiford
 rdsandif...@googlemail.com wrote:
 Robert Dewar de...@adacore.com writes:
 On 2/11/2014 4:45 AM, Richard Sandiford wrote:
 OK, this version drops the [enabled by default] altogether.
 Tested as before.  OK to install?

 Still a huge earthquake in terms of affecting test suites and
 baselines of many users. is it really worth it? In the case of
 GNAT we have only recently started tagging messages in this
 way, so changes would not be so disruptive, and we can debate
 following whatever gcc does, but I think it is important to
 understand that any change in this area is a big one in terms
 of impact on users.

 The patch deliberately didn't affect Ada's diagnostic routines given
 your comments from the first round.  Calling this a huge earthquake
 for other languages seems like a gross overstatement.

 I don't think gcc, g++, gfortran, etc, have ever made a commitment
 to producing textually identical warnings and errors for given inputs
 across different releases.  It seems ridiculous to require that,
 especially if it stands in the way of improving the diagnostics
 or introducing finer-grained -W control.

 E.g. Florian's complaint was that we shouldn't have warnings that
 are not under the control of any -W options.  But by your logic
 we couldn't change that either, because all those [enabled by default]s
 would become [-Wnew-option]s.

 Yeah, I think Robert's argument is a red herring - there are loads of
 diagnostic changes every release so you cannot expect those to
 be stable.

 I'm fine with dropping the [enabled by default] as in the patch, but lets
 hear another ok before making the change.

I think this change is OK.

It's obviously not a great situation, but "enabled by default" is
fairly useless information, so this seems like a marginal improvement.

Ian


PATCH: PR target/60151: HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-11 Thread H.J. Lu
Hi,

HAVE_AS_GOTOFF_IN_DATA defines a 32-bit assembler feature, so we need to
pass --32 to the assembler.  Otherwise, we get the wrong result on x86-64.
We already pass --32 to the assembler on x86.  It should be OK to do it
in configure.  OK for trunk?

Thanks.

H.J.
---
2014-02-11  H.J. Lu  hongjiu...@intel.com

PR target/60151
* configure.ac (HAVE_AS_GOTOFF_IN_DATA): Pass --32 to assembler.
* configure: Regenerated.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index ac3d842..0aafbc9 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3868,7 +3868,7 @@ foo:  nop
 # These two are used unconditionally by i386.[ch]; it is to be defined
 # to 1 if the feature is present, 0 otherwise.
 gcc_GAS_CHECK_FEATURE([GOTOFF in data],
-gcc_cv_as_ix86_gotoff_in_data, [2,11,0],,
+gcc_cv_as_ix86_gotoff_in_data, [2,11,0], --32,
 [  .text
 .L0:
nop


Re: PATCH: PR target/60151: HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-11 Thread Rainer Orth
Hi H.J.,

 HAVE_AS_GOTOFF_IN_DATA defines a 32-bit assembler feature, we need to
 pass --32 to assembler. Otherwise, we get the wrong result on x86-64.
 We already pass --32 to assembler on x86.  It should be OK to do it
 in configure.  OK for trunk?

This would break Solaris/x86 with as configurations, where this test
currently passes, but would fail since as doesn't understand --32.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: PATCH: PR target/60151: HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-11 Thread Uros Bizjak
On Tue, Feb 11, 2014 at 8:28 PM, H.J. Lu hongjiu...@intel.com wrote:

 HAVE_AS_GOTOFF_IN_DATA defines a 32-bit assembler feature, we need to
 pass --32 to assembler. Otherwise, we get the wrong result on x86-64.
 We already pass --32 to assembler on x86.  It should be OK to do it
 in configure.  OK for trunk?

Unfortunately, .code32 didn't work as expected...

So, OK.

Thanks,
Uros.


RFA: one more version of patch for PR59535

2014-02-11 Thread Vladimir Makarov
  This is one more version of the patch to fix the PR59535

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59535

  Here are the results of applying the patch:

                        Thumb    Thumb2

reload                  2626334  2400154
lra (before the patch)  2665749  2414926
lra (after the patch)   2626334  2397132


I already wrote that the change in arm.h is to prevent LRA from reloading sp
as an address.  Reload has no such problem, as it uses the legitimate address
hook, while LRA mostly relies on base_reg_class.

Richard, I need an approval for this change.

2014-02-11  Vladimir Makarov  vmaka...@redhat.com

PR rtl-optimization/59535
* lra-constraints.c (process_alt_operands): Encourage alternative
when unassigned pseudo class is superset of the alternative class.
(inherit_reload_reg): Don't inherit when optimizing for code size.
* config/arm/arm.h (MODE_BASE_REG_CLASS): Return CORE_REGS for
Thumb2 and BASE_REGS for modes not less than 4 for LRA.

Index: lra-constraints.c
===
--- lra-constraints.c	(revision 207562)
+++ lra-constraints.c	(working copy)
@@ -2112,6 +2112,21 @@ process_alt_operands (int only_alternati
 		  goto fail;
 		}
 
+	  /* If not assigned pseudo has a class which a subset of
+		 required reg class, it is a less costly alternative
+		 as the pseudo still can get a hard reg of necessary
+		 class.  */
+	      if (! no_regs_p && REG_P (op) && hard_regno[nop] < 0
+		  && (cl = get_reg_class (REGNO (op))) != NO_REGS
+		  && ira_class_subset_p[this_alternative][cl])
+		{
+		  if (lra_dump_file != NULL)
+		fprintf
+		  (lra_dump_file,
+		       "            %d Super set class reg: reject-=3\n", nop);
+		  reject -= 3;
+		}
+
 	  this_alternative_offmemok = offmemok;
 	  if (this_costly_alternative != NO_REGS)
 		{
@@ -4391,6 +4406,9 @@ static bool
 inherit_reload_reg (bool def_p, int original_regno,
 		enum reg_class cl, rtx insn, rtx next_usage_insns)
 {
+  if (optimize_function_for_size_p (cfun))
+return false;
+
   enum reg_class rclass = lra_get_allocno_class (original_regno);
   rtx original_reg = regno_reg_rtx[original_regno];
   rtx new_reg, new_insns, usage_insn;
Index: config/arm/arm.h
===
--- config/arm/arm.h	(revision 207562)
+++ config/arm/arm.h	(working copy)
@@ -1272,8 +1272,10 @@ enum reg_class
when addressing quantities in QI or HI mode; if we don't know the
mode, then we must be conservative.  */
 #define MODE_BASE_REG_CLASS(MODE)	\
-    (TARGET_ARM || (TARGET_THUMB2 && !optimize_size) ? CORE_REGS :	\
-     (((MODE) == SImode) ? BASE_REGS : LO_REGS))
+    (TARGET_ARM || (TARGET_THUMB2 && (!optimize_size || arm_lra_flag))	\
+     ? CORE_REGS : ((MODE) == SImode	\
+		    || (arm_lra_flag && GET_MODE_SIZE (MODE) >= 4)	\
+		    ? BASE_REGS : LO_REGS))
 
 /* For Thumb we can not support SP+reg addressing, so we return LO_REGS
instead of BASE_REGS.  */


Re: PATCH: PR target/60151: HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-11 Thread H.J. Lu
On Tue, Feb 11, 2014 at 11:40 AM, Rainer Orth
r...@cebitec.uni-bielefeld.de wrote:
 Hi H.J.,

 HAVE_AS_GOTOFF_IN_DATA defines a 32-bit assembler feature, we need to
 pass --32 to assembler. Otherwise, we get the wrong result on x86-64.
 We already pass --32 to assembler on x86.  It should be OK to do it
 in configure.  OK for trunk?

 This would break Solaris/x86 with as configurations, where this test
 currently passes, but would fail since as doesn't understand --32.


How about passing --32 to as only for Linux?  OK to install?

Thanks.


-- 
H.J.
---
2014-02-11  H.J. Lu  hongjiu...@intel.com

PR target/60151
* configure.ac (HAVE_AS_GOTOFF_IN_DATA): Pass --32 to assembler
for Linux target.
* configure: Regenerated.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index ac3d842..1b5dca2 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3867,8 +3867,17 @@ foo: nop

 # These two are used unconditionally by i386.[ch]; it is to be defined
 # to 1 if the feature is present, 0 otherwise.
+case $target_os in
+linux*)
+  as_ix86_gotoff_in_data_opt=--32
+  ;;
+*)
+  as_ix86_gotoff_in_data_opt=
+  ;;
+esac
 gcc_GAS_CHECK_FEATURE([GOTOFF in data],
-gcc_cv_as_ix86_gotoff_in_data, [2,11,0],,
+gcc_cv_as_ix86_gotoff_in_data, [2,11,0],
+ $as_ix86_gotoff_in_data_opt,
 [ .text
 .L0:
  nop


Re: [RFA][rtl-optimization/52714] Do not allow combine to create invalid RTL

2014-02-11 Thread Jeff Law

On 02/11/14 02:06, Eric Botcazou wrote:

I pondered changing the condition for swapping the insn order, but it
didn't seem worth it.  I doubt we see many 3-2 combinations where I3 is
a JUMP_INSN, the result turns into two simple sets and the insn swapping
code you wrote decides it needs to swap the insns.


I didn't actually write it, just enhanced it recently, that's why I suggested
to do the same here.  It's one line of code and we have an example of valid
simplification at hand so I think we ought to do it.


It seems to me that as long as we're re-using the existing insns to
contain the simple sets that we have to ensure that they're INSNs, not
CALL_INSNs or JUMP_INSNs.


I disagree, nullifying JUMP_INSNs by changing them to (set (pc) (pc)) is a
standard method in the combiner.
Right, but most of the time we're doing a 2-1 or 3-1 that turns the 
conditional jump into a nop -- at which point all we do is modify I3 
(which is the conditional jump) into a nop-jump.  That common case 
shouldn't be affected by my change.


It's only when we have a 3-2 or 4-2 combination where one of the 
earlier insns has a side effect we need to keep and we're able to 
simplify the conditional into a NOP that would be affected by my change. 
 And I suspect that is much more rare.


Regardless, I'll see what I can do.

THanks,
Jeff


[COMMITTED] Fix target/59927

2014-02-11 Thread Richard Henderson
[Re-send, since I never saw the original arrive on the list.]

After the first version passed Kai's testing, we chatted on IRC about
various other bits that might be relevant to the winx64 abi issues.

The conditional call to ix86_eax_live_at_start_p seemed to be harmless
either way, but confusing as a conditional.  It seems least confusing
to make the call unconditional.

Finally, adjust the comment for ACCUMULATE_OUTGOING_ARGS to match reality.


r~


PR target/59927
* calls.c (expand_call): Don't double-push for reg_parm_stack_space.
* config/i386/i386.c (init_cumulative_args): Remove sorry for 64-bit
ms-abi vs -mno-accumulate-outgoing-args.
(ix86_expand_prologue): Unconditionally call ix86_eax_live_at_start_p.
* config/i386/i386.h (ACCUMULATE_OUTGOING_ARGS): Fix comment with
respect to ms-abi.
diff --git a/gcc/calls.c b/gcc/calls.c
index d574a95..f392319 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -2955,6 +2955,7 @@ expand_call (tree exp, rtx target, int ignore)
   /* If we push args individually in reverse order, perform stack alignment
 before the first push (the last arg).  */
   if (PUSH_ARGS_REVERSED && argblock == 0
+      && adjusted_args_size.constant > reg_parm_stack_space
       && adjusted_args_size.constant != unadjusted_args_size)
{
  /* When the stack adjustment is pending, we get better code
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 1b7bb3e..0a15e44 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -6110,10 +6110,6 @@ init_cumulative_args (CUMULATIVE_ARGS *cum,  /* Argument 
info to initialize */
   cum->caller = caller;
 
   /* Set up the number of registers to use for passing arguments.  */
-
-  if (TARGET_64BIT && cum->call_abi == MS_ABI && !ACCUMULATE_OUTGOING_ARGS)
-    sorry ("ms_abi attribute requires -maccumulate-outgoing-args "
-	   "or subtarget optimization implying it");
   cum->nregs = ix86_regparm;
   if (TARGET_64BIT)
 {
@@ -11032,15 +11028,14 @@ ix86_expand_prologue (void)
 
   if (TARGET_64BIT)
 r10_live = (DECL_STATIC_CHAIN (current_function_decl) != 0);
-  if (!TARGET_64BIT_MS_ABI)
-eax_live = ix86_eax_live_at_start_p ();
 
-  /* Note that SEH directives need to continue tracking the stack
-pointer even after the frame pointer has been set up.  */
+  eax_live = ix86_eax_live_at_start_p ();
   if (eax_live)
{
  insn = emit_insn (gen_push (eax));
  allocate -= UNITS_PER_WORD;
+ /* Note that SEH directives need to continue tracking the stack
+pointer even after the frame pointer has been set up.  */
  if (sp_is_cfa_reg || TARGET_SEH)
{
  if (sp_is_cfa_reg)
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 0e757c9c..b605ae2 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1537,12 +1537,12 @@ enum reg_class
mode the difference is less drastic but visible.  
 
FIXME: Unlike earlier implementations, the size of unwind info seems to
-   actually grouw with accumulation.  Is that because accumulated args
+   actually grow with accumulation.  Is that because accumulated args
unwind info became unnecesarily bloated?
-   
-   64-bit MS ABI seem to require 16 byte alignment everywhere except for
-   function prologue and epilogue.  This is not possible without
-   ACCUMULATE_OUTGOING_ARGS.  
+
+   With the 64-bit MS ABI, we can generate correct code with or without
+   accumulated args, but because of OUTGOING_REG_PARM_STACK_SPACE the code
+   generated without accumulated args is terrible.
 
If stack probes are required, the space used for large function
arguments on the stack must also be probed, so enable


Re: PATCH: PR target/60151: HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-11 Thread Rainer Orth
H.J. Lu hjl.to...@gmail.com writes:

 On Tue, Feb 11, 2014 at 11:40 AM, Rainer Orth
 r...@cebitec.uni-bielefeld.de wrote:
 Hi H.J.,

 HAVE_AS_GOTOFF_IN_DATA defines a 32-bit assembler feature, we need to
 pass --32 to assembler. Otherwise, we get the wrong result on x86-64.
 We already pass --32 to assembler on x86.  It should be OK to do it
 in configure.  OK for trunk?

 This would break Solaris/x86 with as configurations, where this test
 currently passes, but would fail since as doesn't understand --32.


 How about passing --32 to as only for Linux?  OK to install?

I'd rather do it for gas instead, which can be used on non-Linux
systems, too.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH] Avoid uninit warnings with Fortran optional args (PR fortran/52370)

2014-02-11 Thread Jakub Jelinek
Hi!

Optional dummy arg support vars are often undefined in various paths,
while the predicated uninit analysis can sometimes avoid false positives,
sometimes it can't.  So IMHO we should set TREE_NO_WARNING here, it is
already set on the other support vars like size.N, ubound.N, lbound.N, etc.
but not on the actual data pointer.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2014-02-11  Jakub Jelinek  ja...@redhat.com

PR fortran/52370
* trans-decl.c (gfc_build_dummy_array_decl): Set TREE_NO_WARNING
on decl if sym-attr.optional.

* gfortran.dg/pr52370.f90: New test.

--- gcc/fortran/trans-decl.c.jj 2014-01-03 11:40:53.0 +0100
+++ gcc/fortran/trans-decl.c	2014-02-11 18:17:47.315598566 +0100
@@ -1014,6 +1014,10 @@ gfc_build_dummy_array_decl (gfc_symbol *
   TREE_STATIC (decl) = 0;
   DECL_EXTERNAL (decl) = 0;
 
+  /* Avoid uninitialized warnings for optional dummy arguments.  */
+  if (sym->attr.optional)
+TREE_NO_WARNING (decl) = 1;
+
   /* We should never get deferred shape arrays here.  We used to because of
  frontend bugs.  */
   gcc_assert (sym->as->type != AS_DEFERRED);
--- gcc/testsuite/gfortran.dg/pr52370.f90.jj	2014-02-11 18:20:45.704628460 +0100
+++ gcc/testsuite/gfortran.dg/pr52370.f90   2014-02-11 18:20:11.0 +0100
@@ -0,0 +1,21 @@
+! PR fortran/52370
+! { dg-do compile }
! { dg-options "-O1 -Wall" }
+
+module pr52370
+contains
+  subroutine foo(a,b)
+real, intent(out) :: a
+real, dimension(:), optional, intent(out) :: b
+a=0.5
+if (present(b)) then
+  b=1.0
+end if
+  end subroutine foo
+end module pr52370
+
+program prg52370
+  use pr52370
+  real :: a
+  call foo(a)
+end program prg52370

Jakub


[jit] Fix error message in test

2014-02-11 Thread David Malcolm
Committed to branch dmalcolm/jit:

gcc/testsuite/
* jit.dg/test-error-unplaced-label.c (verify_code): Update
expected error message to reflect commit
6cd4f82c5237cc328aea229cdaaa428ff09d6e98.
---
 gcc/testsuite/ChangeLog.jit  | 6 ++
 gcc/testsuite/jit.dg/test-error-unplaced-label.c | 2 +-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/ChangeLog.jit b/gcc/testsuite/ChangeLog.jit
index e108fb7e..25aff70 100644
--- a/gcc/testsuite/ChangeLog.jit
+++ b/gcc/testsuite/ChangeLog.jit
@@ -1,3 +1,9 @@
+2014-02-11  David Malcolm  dmalc...@redhat.com
+
+   * jit.dg/test-error-unplaced-label.c (verify_code): Update
+   expected error message to reflect commit
+   6cd4f82c5237cc328aea229cdaaa428ff09d6e98.
+
 2014-02-10  David Malcolm  dmalc...@redhat.com
 
* jit.dg/test-types.c (struct zoo): Add field m_sized_int_type,
diff --git a/gcc/testsuite/jit.dg/test-error-unplaced-label.c 
b/gcc/testsuite/jit.dg/test-error-unplaced-label.c
index 64073f47..464fdfc 100644
--- a/gcc/testsuite/jit.dg/test-error-unplaced-label.c
+++ b/gcc/testsuite/jit.dg/test-error-unplaced-label.c
@@ -45,6 +45,6 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
 
   /* Verify that the correct error message was emitted.  */
   CHECK_STRING_VALUE (gcc_jit_context_get_first_error (ctxt),
- "unplaced label: foo");
+ "unplaced label within test_fn: foo");
 }
 
-- 
1.7.11.7



Re: PATCH: PR target/60151: HAVE_AS_GOTOFF_IN_DATA is mis-detected on x86-64

2014-02-11 Thread H.J. Lu
On Tue, Feb 11, 2014 at 12:29 PM, Rainer Orth
r...@cebitec.uni-bielefeld.de wrote:
 H.J. Lu hjl.to...@gmail.com writes:

 On Tue, Feb 11, 2014 at 11:40 AM, Rainer Orth
 r...@cebitec.uni-bielefeld.de wrote:
 Hi H.J.,

 HAVE_AS_GOTOFF_IN_DATA defines a 32-bit assembler feature, we need to
 pass --32 to assembler. Otherwise, we get the wrong result on x86-64.
 We already pass --32 to assembler on x86.  It should be OK to do it
 in configure.  OK for trunk?

 This would break Solaris/x86 with as configurations, where this test
 currently passes, but would fail since as doesn't understand --32.


 How about passing --32 to as only for Linux?  OK to install?

 I'd rather do it for gas instead, which can be used on non-Linux
 systems, too.


Sure.  Here is the new patch.  OK to install?

-- 
H.J.
---
2014-02-11  H.J. Lu  hongjiu...@intel.com

PR target/60151
* configure.ac (HAVE_AS_GOTOFF_IN_DATA): Pass --32 to GNU assembler.
* configure: Regenerated.

diff --git a/gcc/configure.ac b/gcc/configure.ac
index ac3d842..333bca6 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -3867,8 +3867,14 @@ foo: nop

 # These two are used unconditionally by i386.[ch]; it is to be defined
 # to 1 if the feature is present, 0 otherwise.
+if test x$gas = xyes; then
+  as_ix86_gotoff_in_data_opt=--32
+else
+  as_ix86_gotoff_in_data_opt=
+fi
 gcc_GAS_CHECK_FEATURE([GOTOFF in data],
-gcc_cv_as_ix86_gotoff_in_data, [2,11,0],,
+gcc_cv_as_ix86_gotoff_in_data, [2,11,0],
+ $as_ix86_gotoff_in_data_opt,
 [ .text
 .L0:
  nop


Harden ipa-devirt for invalid programs, part 1 (and also fix a wrong code issue)

2014-02-11 Thread Jan Hubicka
Hi,
this is the first part of a fix for lto/59468, which is about ipa-devirt ICEs
on invalid C++ programs (violating the ODR rule).

This patch hardens it against various low-level VTABLE corruptions - the cases
where a virtual table gets too small, its initializer is missing, or the user
decided to rewrite the symbol as a different kind of variable so we no longer
understand its constructor.

While working on it I did some testing and noticed that there is an important
bug WRT the complete list.  We may end up not listing a node because we can not
refer to it, but we forget to clear the list.  This is spotted by the testcases
added.

To avoid bugs like this, I decided to make can_refer explicit
when calling gimple_get_virt_method_for_vtable (that function may return NULL
also in other cases when the caller is confused - this will not happen for
ipa-devirt but may happen for the old devirt code in ipa-prop that I plan to
remove next stage1).  This allows a bit more sanity checking when updating the
complete flag.
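
A minimal standalone sketch of that calling convention (plain C, not the GCC
sources; the names below are made up for illustration only): returning NULL
alone is ambiguous, so the extra flag tells the caller whether it must stop
claiming that the target list is complete.

#include <stdbool.h>
#include <stdio.h>

struct fake_vtable { const char *target; bool well_formed; };

static const char *
lookup_virt_method (const struct fake_vtable *v, bool *can_refer)
{
  *can_refer = true;
  if (!v->well_formed)
    {
      /* Pass down that we lost track of the target.  */
      *can_refer = false;
      return NULL;
    }
  return v->target;
}

int
main (void)
{
  struct fake_vtable good = { "foo::bar", true };
  struct fake_vtable corrupt = { NULL, false };
  bool can_refer;
  bool complete = true;
  const char *t;

  t = lookup_virt_method (&good, &can_refer);
  printf ("target: %s, complete: %d\n", t ? t : "(none)", (int) complete);

  t = lookup_virt_method (&corrupt, &can_refer);
  if (!can_refer)
    complete = false;  /* the caller may no longer claim the list is complete */
  printf ("target: %s, complete: %d\n", t ? t : "(none)", (int) complete);
  return 0;
}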

Bootstrapped/regtested x86_64-linux, will commit it after bit of firefox
testing.

As a followup I will add an ODR violation warning for when virtual tables do
not match.

I also noticed that we may end up not shipping virtual tables that matter to
LTO because they are not referenced directly by any of the units.  This seems a
bit more important than anticipated.  I have an old patch to make this happen,
but it seems a bit expensive without measurable benefits (i.e. improvements in
devirtualization counts) and it triggers interesting issues with free_lang_data
that I would like to discuss separately.

Will evaluate it a bit further.

Honza

PR lto/59468
* ipa-utils.h (possible_polymorphic_call_targets): Update prototype
and wrapper.
* ipa-devirt.c: Include demangle.h
(odr_violation_reported): New static variable.
(add_type_duplicate): Update odr_violations.
(maybe_record_node): Add completep parameter; update it.
(record_target_from_binfo): Add COMPLETEP parameter;
update it as needed.
(possible_polymorphic_call_targets_1): Likewise.
(struct polymorphic_call_target_d): Add nonconstruction_targets;
rename FINAL to COMPLETE.
(record_targets_from_bases): Sanity check we found the binfo;
fix COMPLETEP updating.
(possible_polymorphic_call_targets): Add NONCONSTRUTION_TARGETSP
parameter, fix computing of COMPLETEP.
(dump_possible_polymorphic_call_targets): Imrove readability of dump; at
LTO time do demangling.
(ipa_devirt): Use nonconstruction_targets; Improve dumps.
* gimple-fold.c (gimple_get_virt_method_for_vtable): Add can_refer
parameter.
(gimple_get_virt_method_for_binfo): Likewise.
* gimple-fold.h (gimple_get_virt_method_for_binfo,
gimple_get_virt_method_for_vtable): Update prototypes.

* g++.dg/ipa/devirt-27.C: New testcase.
* g++.dg/ipa/devirt-26.C: New testcase.
Index: gimple-fold.c
===
*** gimple-fold.c   (revision 207649)
--- gimple-fold.c   (working copy)
*** fold_const_aggregate_ref (tree t)
*** 3170,3191 
  }
  
  /* Lookup virtual method with index TOKEN in a virtual table V
!at OFFSET.  */
  
  tree
  gimple_get_virt_method_for_vtable (HOST_WIDE_INT token,
   tree v,
!  unsigned HOST_WIDE_INT offset)
  {
tree vtable = v, init, fn;
unsigned HOST_WIDE_INT size;
unsigned HOST_WIDE_INT elt_size, access_index;
tree domain_type;
  
/* First of all double check we have virtual table.  */
if (TREE_CODE (v) != VAR_DECL
|| !DECL_VIRTUAL_P (v))
! return NULL_TREE;
  
init = ctor_for_folding (v);
  
--- 3170,3204 
  }
  
  /* Lookup virtual method with index TOKEN in a virtual table V
!at OFFSET.  
!Set CAN_REFER if non-NULL to false if method
!is not referable or if the virtual table is ill-formed (such as rewriten
!by non-C++ produced symbol). Otherwise just return NULL in that calse.  */
  
  tree
  gimple_get_virt_method_for_vtable (HOST_WIDE_INT token,
   tree v,
!  unsigned HOST_WIDE_INT offset,
!  bool *can_refer)
  {
tree vtable = v, init, fn;
unsigned HOST_WIDE_INT size;
unsigned HOST_WIDE_INT elt_size, access_index;
tree domain_type;
  
+   if (can_refer)
+ *can_refer = true;
+ 
/* First of all double check we have virtual table.  */
if (TREE_CODE (v) != VAR_DECL
|| !DECL_VIRTUAL_P (v))
! {
!   gcc_assert (in_lto_p);
!   /* Pass down that we lost track of the target.  */
!   if (can_refer)
!   *can_refer = false;
!   return NULL_TREE;
! }
  
init = ctor_for_folding (v);
  
*** gimple_get_virt_method_for_vtable (HOST_
*** 3197,3202 
--- 3210,3218 
if 

Re: [PATCH] Avoid uninit warnings with Fortran optional args (PR fortran/52370)

2014-02-11 Thread Steve Kargl
On Tue, Feb 11, 2014 at 09:38:20PM +0100, Jakub Jelinek wrote:
 
 Optional dummy arg support vars are often undefined in various paths,
 while the predicated uninit analysis can sometimes avoid false positives,
 sometimes it can't.  So IMHO we should set TREE_NO_WARNING here, it is
 already set on the other support vars like size.N, ubound.N, lbound.N, etc.
 but not on the actual data pointer.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
 

Yes.  Although trunk is in stage 4, your patch seems to be relatively
benign.

-- 
Steve


Re: [PATCH] Fix PR target/60137, no splitters for vectors that get GPR registers

2014-02-11 Thread David Edelsohn
On Mon, Feb 10, 2014 at 8:14 PM, Michael Meissner
meiss...@linux.vnet.ibm.com wrote:
 This patch fixes PR target/60137 that shows up when some vector types get
 allocated to GPR registers, and there wasn't a splitter for the type.  It 
 shows
 up when targetting ISA 2.07 (power8) when VSX is disabled but Altivec (VMX) is
 enabled, though I suspect there are other failure cases.

 I bootstrapped and checked the patch, and it caused no regressions in the test
 suite.  Is it ok to apply?

 [gcc]
 2014-02-10  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/60137
 * config/rs6000/rs6000.md (128-bit GPR splitter): Add a splitter
 for VSX/Altivec vectors that land in GPR registers.

 [gcc/testsuite]
 2014-02-10  Michael Meissner  meiss...@linux.vnet.ibm.com

 PR target/60137
 * gcc.target/powerpc/pr60137.c: New file.

Okay.

Thanks, David


Re: [google][gcc-4_8][patch]Handle Split functions in the Function Reordering Plugin

2014-02-11 Thread Teresa Johnson
On Mon, Feb 10, 2014 at 7:15 PM, Sriraman Tallam tmsri...@google.com wrote:
 Hi Teresa,

I have attached a patch to recognize and reorder split functions in
 the function reordering plugin. Please review.

 Thanks
 Sri

 Index: function_reordering_plugin/callgraph.c
 ===
 --- function_reordering_plugin/callgraph.c(revision 207671)
 +++ function_reordering_plugin/callgraph.c(working copy)
 @@ -550,6 +550,25 @@ static void set_node_type (Node *n)
s-computed_weight = n-computed_weight;
s-max_count = n-max_count;

 +  /* If s is split into a cold section, assign the split weight to the
 + max count of the split section.   Use this also for the weight of 
 the
 + split section.  */
 +  if (s-split_section)
 +{
 +  s-split_section-max_count = s-split_section-weight = 
 n-split_weight;
 +  /* If split_section is comdat, update all the comdat
 +  candidates for weight.  */
 +  s_comdat = s-split_section-comdat_group;
 +  while (s_comdat != NULL)
 +{
 +  s_comdat-weight = s-split_section-weight;
 +  s_comdat-computed_weight
 + = s-split_section-computed_weight;

Where is s-split_section-computed_weight set?

 +  s_comdat-max_count = s-split_section-max_count;
 +  s_comdat = s_comdat-comdat_group;
 +}
 + }
 +

...


 +  /* It is split and it is not comdat.  */
 +  else if (is_split_function)
 + {
 +   Section_id *cold_section = NULL;
 +   /* Function splitting means that the hot part is really the
 +  relevant section and the other section is unlikely executed and
 +  should not be part of the callgraph.  */

 -  section-comdat_group = kept-comdat_group;
 -  kept-comdat_group = section;
 +   /* Store the new section in the section list.  */
 +   section-next = first_section;
 +   first_section = section;
 +   /* The right section is not in the section_map if .text.unlikely is
 +  not the new section.  */
 +  if (!is_prefix_of (.text.unlikely, section_name))

The double-negative in the above comment is a bit confusing. Can we
make this similar to the check in the earlier split comdat case. I.e.
something like (essentially swapping the if condition and assert):

  /* If the existing section is cold, the newly detected split must
 be hot.  */
  if (is_prefix_of (.text.unlikely, kept-full_name))
{
  assert (!is_prefix_of (.text.unlikely, section_name));
  ...
}
  else
{
  assert (is_prefix_of (.text.unlikely, section_name));
  ...
}

 + {
 +   assert (is_prefix_of (.text.unlikely, kept-full_name));
 +   /* The kept section was the unlikely section.  Change the section
 +  in section_map to be the new section which is the hot one.  */
 +   *slot = section;
 +   /* Record the split cold section in the hot section.  */
 +   section-split_section = kept;
 +   /* Comdats and function splitting are already handled.  */
 +   assert (kept-comdat_group == NULL);
 +   cold_section = kept;
 + }
 +   else
 + {
 +   /* Record the split cold section in the hot section.  */
 +   assert (!is_prefix_of (.text.unlikely, kept-full_name));
 +   kept-split_section = section;
 +   cold_section = section;
 + }
 +   assert (cold_section != NULL  cold_section-comdat_group == NULL);
 +   cold_section-is_split_cold_section = 1;
 + }
...

Thanks,
Teresa


-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


patch fixing PR49008

2014-02-11 Thread Vladimir Makarov
  The following patch fixes

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49008

  This typo was not important in practice: the code worked correctly in most
cases, as units usually require the presence of only one other unit.

  Committed as rev. 207701.

2014-02-11  Vladimir Makarov  vmaka...@redhat.com

PR target/49008
* genautomata.c (add_presence_absence): Fix typo with
{final_}presence_list.

Index: genautomata.c
===
--- genautomata.c	(revision 207697)
+++ genautomata.c	(working copy)
@@ -2348,7 +2348,7 @@ add_presence_absence (unit_set_el_t dest
 		for (prev_el = (presence_p
 ? (final_p
    ? dst->unit_decl->final_presence_list
-   : dst->unit_decl->final_presence_list)
+   : dst->unit_decl->presence_list)
 : (final_p
    ? dst->unit_decl->final_absence_list
    : dst->unit_decl->absence_list));


[jit] Add an explicit boolean type

2014-02-11 Thread David Malcolm
Committed to branch dmalcolm/jit:

gcc/jit/
* libgccjit.h (enum gcc_jit_types): Add GCC_JIT_TYPE_BOOL.

* internal-api.h (gcc::jit::recording::comparison::comparison): Use
GCC_JIT_TYPE_BOOL as the types of comparisons, rather than
GCC_JIT_TYPE_INT.

* internal-api.c (gcc::jit::recording::memento_of_get_type::
dereference): Handle GCC_JIT_TYPE_BOOL (with an error).
(get_type_strings): Add GCC_JIT_TYPE_BOOL.
(get_tree_node_for_type): Handle GCC_JIT_TYPE_BOOL.

gcc/testsuite/
* jit.dg/test-types.c: Add test coverage for getting type
GCC_JIT_TYPE_BOOL.
* jit.dg/test-expressions.c (make_test_of_comparison): Convert
return type from int to bool.
(verify_comparisons): Likewise.
---
 gcc/jit/ChangeLog.jit   | 13 +
 gcc/jit/internal-api.c  |  6 ++
 gcc/jit/internal-api.h  |  2 +-
 gcc/jit/libgccjit.h |  4 
 gcc/testsuite/ChangeLog.jit |  8 
 gcc/testsuite/jit.dg/test-expressions.c |  9 ++---
 gcc/testsuite/jit.dg/test-types.c   | 15 +++
 7 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index d331082..90c2e3f 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,18 @@
 2014-02-11  David Malcolm  dmalc...@redhat.com
 
+   * libgccjit.h (enum gcc_jit_types): Add GCC_JIT_TYPE_BOOL.
+
+   * internal-api.h (gcc::jit::recording::comparison::comparison): Use
+   GCC_JIT_TYPE_BOOL as the types of comparisons, rather than
+   GCC_JIT_TYPE_INT.
+
+   * internal-api.c (gcc::jit::recording::memento_of_get_type::
+   dereference): Handle GCC_JIT_TYPE_BOOL (with an error).
+   (get_type_strings): Add GCC_JIT_TYPE_BOOL.
+   (get_tree_node_for_type): Handle GCC_JIT_TYPE_BOOL.
+
+2014-02-11  David Malcolm  dmalc...@redhat.com
+
* internal-api.c (gcc::jit::recording::context::add_error_va): If
GCC_JIT_STR_OPTION_PROGNAME is NULL, use libgccjit.so, as per
the comment in libgccjit.h
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 6c66d3d..0f140fe 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -657,6 +657,7 @@ recording::memento_of_get_type::dereference ()
 case GCC_JIT_TYPE_VOID_PTR:
   return m_ctxt->get_type (GCC_JIT_TYPE_VOID);
 
+case GCC_JIT_TYPE_BOOL:
 case GCC_JIT_TYPE_CHAR:
 case GCC_JIT_TYPE_SIGNED_CHAR:
 case GCC_JIT_TYPE_UNSIGNED_CHAR:
@@ -698,6 +699,8 @@ static const char * const get_type_strings[] = {
   "void",  /* GCC_JIT_TYPE_VOID */
   "void *",  /* GCC_JIT_TYPE_VOID_PTR */
 
+  "bool",  /* GCC_JIT_TYPE_BOOL */
+
   "char",  /* GCC_JIT_TYPE_CHAR */
   "signed char",  /* GCC_JIT_TYPE_SIGNED_CHAR */
   "unsigned char",  /* GCC_JIT_TYPE_UNSIGNED_CHAR */
@@ -1723,6 +1726,9 @@ get_tree_node_for_type (enum gcc_jit_types type_)
 case GCC_JIT_TYPE_VOID_PTR:
   return ptr_type_node;
 
+case GCC_JIT_TYPE_BOOL:
+  return boolean_type_node;
+
 case GCC_JIT_TYPE_CHAR:
   return char_type_node;
 case GCC_JIT_TYPE_SIGNED_CHAR:
diff --git a/gcc/jit/internal-api.h b/gcc/jit/internal-api.h
index 55772a6..f285039 100644
--- a/gcc/jit/internal-api.h
+++ b/gcc/jit/internal-api.h
@@ -961,7 +961,7 @@ public:
  location *loc,
  enum gcc_jit_comparison op,
  rvalue *a, rvalue *b)
-  : rvalue (ctxt, loc, ctxt->get_type (GCC_JIT_TYPE_INT)), /* FIXME: should be bool? */
+  : rvalue (ctxt, loc, ctxt->get_type (GCC_JIT_TYPE_BOOL)),
 m_op (op),
 m_a (a),
 m_b (b)
diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
index 976bcb2..a4ef65e 100644
--- a/gcc/jit/libgccjit.h
+++ b/gcc/jit/libgccjit.h
@@ -305,6 +305,10 @@ enum gcc_jit_types
   /* void *.  */
   GCC_JIT_TYPE_VOID_PTR,
 
+  /* C++'s bool type; also C99's _Bool type, aka bool if using
+ stdbool.h.  */
+  GCC_JIT_TYPE_BOOL,
+
   /* Various integer types.  */
 
   /* C's char (of some signedness) and the variants where the
diff --git a/gcc/testsuite/ChangeLog.jit b/gcc/testsuite/ChangeLog.jit
index 25aff70..4d8f209 100644
--- a/gcc/testsuite/ChangeLog.jit
+++ b/gcc/testsuite/ChangeLog.jit
@@ -1,5 +1,13 @@
 2014-02-11  David Malcolm  dmalc...@redhat.com
 
+   * jit.dg/test-types.c: Add test coverage for getting type
+   GCC_JIT_TYPE_BOOL.
+   * jit.dg/test-expressions.c (make_test_of_comparison): Convert
+   return type from int to bool.
+   (verify_comparisons): Likewise.
+
+2014-02-11  David Malcolm  dmalc...@redhat.com
+
* jit.dg/test-error-unplaced-label.c (verify_code): Update
expected error message to reflect commit
6cd4f82c5237cc328aea229cdaaa428ff09d6e98.
diff --git a/gcc/testsuite/jit.dg/test-expressions.c 
b/gcc/testsuite/jit.dg/test-expressions.c
index 36588c6..d2fd950 100644
--- a/gcc/testsuite/jit.dg/test-expressions.c
+++ 

Re: [PATCH i386] Enable -freorder-blocks-and-partition

2014-02-11 Thread Teresa Johnson
On Thu, Dec 19, 2013 at 10:19 PM, Teresa Johnson tejohn...@google.com wrote:
 On Thu, Dec 12, 2013 at 5:13 PM, Jan Hubicka hubi...@ucw.cz wrote:
 On Wed, Dec 11, 2013 at 1:21 AM, Martin Liška marxin.li...@gmail.com 
 wrote:
  Hello,
 I prepared a collection of systemtap graphs for GIMP.
 
  1) just my profile-based function reordering: 550 pages
  2) just -freorder-blocks-and-partitions: 646 pages
  3) just -fno-reorder-blocks-and-partitions: 638 pages
 
  Please see attached data.

 Thanks for the data. A few observations/questions:

 With both 1) (your (time-based?) reordering) and 2)
 (-freorder-blocks-and-partitions) there are a fair amount of accesses
 out of the cold section. I'm not seeing so many accesses out of the
 cold section in the apps I am looking at with splitting enabled. In

 I see you already comitted the patch, so perhaps Martin's measurement assume
 the pass is off by default?

 I rebuilded GCC with profiledboostrap and with the linkerscript unmapping
 text.unlikely.  I get ICE in:
 (gdb) bt
 #0  diagnostic_set_caret_max_width(diagnostic_context*, int) () at 
 ../../gcc/diagnostic.c:108
 #1  0x00f68457 in diagnostic_initialize (context=0x18ae000 
 global_diagnostic_context, n_opts=n_opts@entry=1290) at 
 ../../gcc/diagnostic.c:135
 #2  0x0100050e in general_init (argv0=optimized out) at 
 ../../gcc/toplev.c:1110
 #3  toplev_main(int, char**) () at ../../gcc/toplev.c:1922
 #4  0x7774cbe5 in __libc_start_main () from /lib64/libc.so.6
 #5  0x00f7898d in _start () at ../sysdeps/x86_64/start.S:122

 That is relatively early in startup process. The function seems inlined and
 it fails only on second invocation, did not have time to investigate further,
 yet while without -fprofile-use it starts...

 I'll see if I can reproduce this and investigate, although at this
 point that might have to wait until after my holiday vacation.

I tried the linkerscript with cpu2006 and got quite a lot of failures
(using the ref inputs to train, so the behavior should be the same in
both profile-gen and profile-use). I investigated the one in bzip2 and
found an issue that may not be easy to fix and is perhaps something it
is not worth fixing. The case was essentially the following: Function
foo was called by callsites A and B, with call counts 148122615 and
18, respectively.

Within function foo, there was a basic block that had a very low count
(compared to the entry bb count of 148122633), and therefore a 0
frequency:

;;   basic block 6, loop depth 0, count 18, freq 0

The ipa inliner decided to inline into callsite A but not B. Because
the vast majority of the call count was from callsite A, when we
performed execute_fixup_cfg after doing the inline transformation, the
count_scale is 0 and the out-of-line copy of foo's blocks all got
counts 0. However, most of the bbs still had non-zero frequencies. But
bb 6 ended up with a count and frequency of 0, leading us to split it
out. It turns out that at least one of the 18 counts for this block
was from callsite B, and we ended up trying to execute the split bb
in the out-of-line copy from that callsite.
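
To make the arithmetic concrete, here is a small standalone illustration (not
the GCC code itself; it only mimics the fixed-point scaling, assuming the
usual REG_BR_PROB_BASE of 10000): the 18 calls left for the out-of-line copy
give a scale that rounds to zero, so every block count, including bb 6's 18,
is zeroed while its guessed frequency is already 0.

#include <stdio.h>

#define REG_BR_PROB_BASE 10000

int
main (void)
{
  long long entry_count = 148122633;  /* all profiled calls to foo */
  long long inlined = 148122615;      /* calls absorbed by inlining into A */
  long long remaining = entry_count - inlined;  /* 18 calls, from callsite B */
  long long bb6_count = 18;

  long long scale = (remaining * REG_BR_PROB_BASE
                     + entry_count / 2) / entry_count;          /* rounds to 0 */
  long long scaled_bb6 = bb6_count * scale / REG_BR_PROB_BASE;  /* 0 */

  printf ("scale = %lld, scaled bb 6 count = %lld\n", scale, scaled_bb6);
  return 0;
}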

I can think of a couple of ways to prevent this to happen (e.g. have
execute_fixup_cfg give the block a count or frequency of 1 instead of
0, or mark the bb somehow as not eligible for splitting due to a low
confidence in the 0 count/frequency), but they seem a little hacky. I
am thinking that the splitting here is something we shouldn't worry
about - it is so close to 0 count that the occasional jump to the
split section caused by the lack of precision in the frequency is not
a big deal. Unfortunately, without fixing this I can't use the linker
script without disabling inlining to avoid this problem.

I reran cpu2006 with the linker script but with -fno-inline and got 6
more benchmarks to pass. So there are other issues out there. I will
take a look at another one and see if it is a similar
scaling/precision issue. I'm thinking that I may just use my heatmap
scripts (which leverages perf-events profiles) to see if there is any
significant execution in the split cold sections, since it seems we
can't realistically prevent any and all execution of the cold split
sections, and that is more meaningful anyway.

Teresa



 On our periodic testers I see off-noise improvement in crafty 2200-2300
 and regression on Vortex, 2900-2800, plus code size increase.

 I had only run cpu2006, but not cpu2000. I'll see if I can reproduce
 this as well.

 I have been investigating a few places where I saw accesses in the
 cold split regions in internal benchmarks. Here are a couple, and how
 I have addressed them so far:

 1) loop unswitching

 In this case, loop unswitching hoisted a branch from within the loop
 to outside the loop, and in doing so it was effectively speculated
 above several other branches. In it's original location it always went
 to only one of the successors (biased 0/100%). But when it was hoisted
 it sometimes took 

Re: [google][gcc-4_8][patch]Handle Split functions in the Function Reordering Plugin

2014-02-11 Thread Sriraman Tallam
On Tue, Feb 11, 2014 at 1:32 PM, Teresa Johnson tejohn...@google.com wrote:
 On Mon, Feb 10, 2014 at 7:15 PM, Sriraman Tallam tmsri...@google.com wrote:
 Hi Teresa,

I have attached a patch to recognize and reorder split functions in
 the function reordering plugin. Please review.

 Thanks
 Sri

 Index: function_reordering_plugin/callgraph.c
 ===
 --- function_reordering_plugin/callgraph.c(revision 207671)
 +++ function_reordering_plugin/callgraph.c(working copy)
 @@ -550,6 +550,25 @@ static void set_node_type (Node *n)
s-computed_weight = n-computed_weight;
s-max_count = n-max_count;

 +  /* If s is split into a cold section, assign the split weight to the
 + max count of the split section.   Use this also for the weight of 
 the
 + split section.  */
 +  if (s-split_section)
 +{
 +  s-split_section-max_count = s-split_section-weight = 
 n-split_weight;
 +  /* If split_section is comdat, update all the comdat
 +  candidates for weight.  */
 +  s_comdat = s-split_section-comdat_group;
 +  while (s_comdat != NULL)
 +{
 +  s_comdat-weight = s-split_section-weight;
 +  s_comdat-computed_weight
 + = s-split_section-computed_weight;

 Where is s-split_section-computed_weight set

I removed this line as it is not useful. Details:

In general, the computed_weight for sections is assigned in
set_node_type in line:
 s-computed_weight = n-computed_weight;

The computed_weight is obtained by adding the weights of all incoming
edges. However, for the cold part of split functions, this can never
be non-zero as the split cold part is never called and so this line is
not useful.



 +  s_comdat-max_count = s-split_section-max_count;
 +  s_comdat = s_comdat-comdat_group;
 +}
 + }
 +

 ...


 +  /* It is split and it is not comdat.  */
 +  else if (is_split_function)
 + {
 +   Section_id *cold_section = NULL;
 +   /* Function splitting means that the hot part is really the
 +  relevant section and the other section is unlikely executed and
 +  should not be part of the callgraph.  */

 -  section-comdat_group = kept-comdat_group;
 -  kept-comdat_group = section;
 +   /* Store the new section in the section list.  */
 +   section-next = first_section;
 +   first_section = section;
 +   /* The right section is not in the section_map if .text.unlikely is
 +  not the new section.  */
 +  if (!is_prefix_of (.text.unlikely, section_name))

 The double-negative in the above comment is a bit confusing. Can we
 make this similar to the check in the earlier split comdat case. I.e.
 something like (essentially swapping the if condition and assert):

Changed. New patch attached.

Thanks
Sri


   /* If the existing section is cold, the newly detected split must
  be hot.  */
   if (is_prefix_of (.text.unlikely, kept-full_name))
 {
   assert (!is_prefix_of (.text.unlikely, section_name));
   ...
 }
   else
 {
   assert (is_prefix_of (.text.unlikely, section_name));
   ...
 }

 + {
 +   assert (is_prefix_of (.text.unlikely, kept-full_name));
 +   /* The kept section was the unlikely section.  Change the section
 +  in section_map to be the new section which is the hot one.  */
 +   *slot = section;
 +   /* Record the split cold section in the hot section.  */
 +   section-split_section = kept;
 +   /* Comdats and function splitting are already handled.  */
 +   assert (kept-comdat_group == NULL);
 +   cold_section = kept;
 + }
 +   else
 + {
 +   /* Record the split cold section in the hot section.  */
 +   assert (!is_prefix_of (.text.unlikely, kept-full_name));
 +   kept-split_section = section;
 +   cold_section = section;
 + }
 +   assert (cold_section != NULL  cold_section-comdat_group == NULL);
 +   cold_section-is_split_cold_section = 1;
 + }
 ...

 Thanks,
 Teresa


 --
 Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Index: function_reordering_plugin/callgraph.h
===
--- function_reordering_plugin/callgraph.h  (revision 207700)
+++ function_reordering_plugin/callgraph.h  (working copy)
@@ -236,6 +236,8 @@ typedef struct section_id_
  is comdat hot and kept, pointer to the kept cold split
  section.  */
   struct section_id_ *split_section;
+  /* If this is the cold part of a split section.  */
+  char is_split_cold_section;
   /* Check if this section has been considered for output.  */
   char processed;
 } Section_id;

Re: [PATCH] Handle more COMDAT profiling issues

2014-02-11 Thread Xinliang David Li
Is it better to add some logic in counts_to_freq to determine if the
profile count needs to be dropped completely to force profile
estimation?

David

On Mon, Feb 10, 2014 at 2:12 PM, Teresa Johnson tejohn...@google.com wrote:
 This patch attempts to address the lost profile issue for COMDATs in
 more circumstances, exposed by function splitting.

 My earlier patch handled the case where the comdat had 0 counts since
 the linker kept the copy in a different module. In that case we
 prevent the guessed frequencies on 0-count functions from being
 dropped by counts_to_freq, and then later mark any reached via
 non-zero callgraph edges as guessed. Finally, when one such 0-count
 comdat is inlined the call count is propagated to the callee blocks
 using the guessed probabilities.

 However, in this case, there was a comdat that had a very small
 non-zero count, that was being inlined to a much hotter callsite. This
 could happen when there was a copy that was ipa-inlined
 in the profile gen compile, so the copy in that module gets some
 non-zero counts from the ipa inlined instance, but when the out of
 line copy was eliminated by the linker (selected from a different
 module). In this case the inliner was scaling the bb counts up quite a
 lot when inlining. The problem is that you most likely can't trust
 that the 0 count bbs in such a case are really not executed by the
 callsite it is being into, since the counts are very small and
 correspond to a different callsite. In some internal C++ applications
 I am seeing that we execute out of the split cold portion of COMDATs
 for this reason.

 This problem is more complicated to address than the 0-count instance,
 because we need the cgraph to determine which functions to drop the
 profile on, and at that point the estimated frequencies have already
 been overwritten by counts_to_freqs. To handle this broader case, I
 have changed the drop_profile routine to simply re-estimate the
 probabilities/frequencies (and translate these into counts scaled from
 the incoming call edge counts). This unfortunately necessitates
 rebuilding the cgraph, to propagate the new synthesized counts and
 avoid checking failures downstream. But it will only be rebuilt if we
 dropped any profiles. With this solution, some of the older approach
 can be removed (e.g. propagating counts using the guessed
 probabilities during inlining).
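
One plausible reading of the new min-caller-reestimate-ratio parameter, as a
standalone sketch rather than the patch itself (the exact semantics here are
an assumption; the quoted patch is authoritative):

#include <stdbool.h>
#include <stdio.h>

static bool
should_drop_profile (long long callee_count, long long max_edge_count,
                     int ratio)
{
  /* A 0-count callee reached over a non-zero edge lost its counts to a
     COMDAT copy kept in another module.  */
  if (callee_count == 0)
    return max_edge_count > 0;
  /* ratio == 0 means: only drop in the 0-count case above.  */
  if (ratio == 0)
    return false;
  /* Otherwise drop (and re-estimate) when a caller is vastly hotter than
     the counts this copy of the callee retained.  */
  return max_edge_count >= (long long) ratio * callee_count;
}

int
main (void)
{
  printf ("%d\n", should_drop_profile (0, 1000, 10));        /* 1 */
  printf ("%d\n", should_drop_profile (18, 148122615, 10));  /* 1 */
  printf ("%d\n", should_drop_profile (500, 600, 10));       /* 0 */
  return 0;
}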

 Patch is below. Bootstrapped and tested on x86-64-unknown-linux-gnu.
 Also tested on
 a profile-use build of SPEC cpu2006. Ok for trunk when stage 1 reopens?

 Thanks,
 Teresa

 2014-02-10  Teresa Johnson  tejohn...@google.com

 * graphite.c (graphite_finalize): Pass new parameter.
 * params.def (PARAM_MIN_CALLER_REESTIMATE_RATIO): New.
 * predict.c (tree_estimate_probability): New parameter.
 (tree_estimate_probability_worker): Renamed from
 tree_estimate_probability_driver.
 (tree_reestimate_probability): New function.
 (tree_estimate_probability_driver): Invoke
 tree_estimate_probability_worker.
 (freqs_to_counts): Move from tree-inline.c.
 (drop_profile): Re-estimated profiles when dropping counts.
 (handle_missing_profiles): Drop for some non-zero functions as well.
 (counts_to_freqs): Remove code obviated by reestimation.
 * predict.h (handle_missing_profiles): Update declartion.
 (tree_estimate_probability): Ditto.
 * tree-inline.c (freqs_to_counts): Move to predict.c.
 (copy_cfg_body): Remove code obviated by reestimation.
 * tree-profile.c (gimple_gen_ior_profiler):
 (rebuild_cgraph): Code extracted from tree_profiling to rebuild 
 cgraph.
 (tree_profiling): Invoke rebuild_cgraph as needed.

 Index: graphite.c
 ===
 --- graphite.c  (revision 207436)
 +++ graphite.c  (working copy)
 @@ -247,7 +247,7 @@ graphite_finalize (bool need_cfg_cleanup_p)
cleanup_tree_cfg ();
profile_status_for_fn (cfun) = PROFILE_ABSENT;
release_recorded_exits ();
 -  tree_estimate_probability ();
 +  tree_estimate_probability (false);
  }

cloog_state_free (cloog_state);
 Index: params.def
 ===
 --- params.def  (revision 207436)
 +++ params.def  (working copy)
 @@ -44,6 +44,12 @@ DEFPARAM (PARAM_PREDICTABLE_BRANCH_OUTCOME,
   Maximal estimated outcome of branch considered predictable,
   2, 0, 50)

 +DEFPARAM (PARAM_MIN_CALLER_REESTIMATE_RATIO,
 + min-caller-reestimate-ratio,
 + Minimum caller-to-callee node count ratio to force
 reestimated branch 
 +  probabilities in callee (where 0 means only when callee
 count is 0),
 + 10, 0, 0)
 +
  DEFPARAM (PARAM_INLINE_MIN_SPEEDUP,
   inline-min-speedup,
   The minimal estimated speedup allowing inliner to ignore
 inline-insns-single and 

[google gcc-4_8] Remove DW_AT_GNU_addr_base from skeleton type unit DIEs

2014-02-11 Thread Cary Coutant
Remove DW_AT_GNU_addr_base from skeleton type unit DIEs.

GCC currently adds a DW_AT_GNU_addr_base attribute to skeleton type unit
DIEs, even though it's not needed there. By removing it, we can save
the 8 bytes that a DW_FORM_addr takes up, plus the 24 bytes that the
corresponding relocation takes.
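
Those numbers can be sanity-checked on a 64-bit ELF host (assuming glibc's
<elf.h> is available; this is only an illustration, not part of the patch):
the DW_FORM_addr payload is one address, and the matching relocation is an
Elf64_Rela record.

#include <elf.h>
#include <stdio.h>

int
main (void)
{
  printf ("DW_FORM_addr payload: %zu bytes\n", sizeof (Elf64_Addr));  /* 8 */
  printf ("RELA relocation:      %zu bytes\n", sizeof (Elf64_Rela));  /* 24 */
  return 0;
}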

This patch is for the google/gcc-4_8 branch. I will submit it
for trunk when stage 1 is open.

Tested with crosstool_validate.py.


2014-02-11  Cary Coutant  ccout...@google.com

* dwarf2out.c (add_top_level_skeleton_die_attrs): Don't add
DW_AT_GNU_addr_base to all skeleton DIEs.
(dwarf2out_finish): Add it only to the skeleton comp_unit DIE.


Index: dwarf2out.c
===
--- dwarf2out.c (revision 207671)
+++ dwarf2out.c (working copy)
@@ -8918,7 +8918,6 @@ add_top_level_skeleton_die_attrs (dw_die
   if (comp_dir != NULL)
 add_skeleton_AT_string (die, DW_AT_comp_dir, comp_dir);
   add_AT_pubnames (die);
-  add_AT_lineptr (die, DW_AT_GNU_addr_base, debug_addr_section_label);
 }
 
 /* Return the single type-unit die for skeleton type units.  */
@@ -23929,6 +23928,8 @@ dwarf2out_finish (const char *filename)
  skeleton die attrs are added when the skeleton type unit is
  created, so ensure it is created by this point.  */
   add_top_level_skeleton_die_attrs (main_comp_unit_die);
+  add_AT_lineptr (main_comp_unit_die, DW_AT_GNU_addr_base,
+  debug_addr_section_label);
   (void) get_skeleton_type_unit ();
   htab_traverse_noresize (debug_str_hash, index_string, index);
 }


Re: [google gcc-4_8] Remove DW_AT_GNU_addr_base from skeleton type unit DIEs

2014-02-11 Thread Doug Evans
On Tue, Feb 11, 2014 at 3:55 PM, Cary Coutant ccout...@google.com wrote:
 Remove DW_AT_GNU_addr_base from skeleton type unit DIEs.

 GCC currently adds a DW_AT_GNU_addr_base attribute to skeleton type unit
 DIEs, even though it's not needed there. By removing it, we can save
 the 8 bytes that a DW_FORM_addr takes up, plus the 24 bytes that the
 corresponding relocation takes.

 This patch is for the google/gcc-4_8 branch. I will submit it
 for trunk when stage 1 is open.

 Tested with crosstool_validate.py.


 2014-02-11  Cary Coutant  ccout...@google.com

 * dwarf2out.c (add_top_level_skeleton_die_attrs): Don't add
 DW_AT_GNU_addr_base to all skeleton DIEs.
 (dwarf2out_finish): Add it only to the skeleton comp_unit DIE.


 Index: dwarf2out.c
 ===
 --- dwarf2out.c (revision 207671)
 +++ dwarf2out.c (working copy)
 @@ -8918,7 +8918,6 @@ add_top_level_skeleton_die_attrs (dw_die
if (comp_dir != NULL)
  add_skeleton_AT_string (die, DW_AT_comp_dir, comp_dir);
add_AT_pubnames (die);
 -  add_AT_lineptr (die, DW_AT_GNU_addr_base, debug_addr_section_label);
  }

  /* Return the single type-unit die for skeleton type units.  */
 @@ -23929,6 +23928,8 @@ dwarf2out_finish (const char *filename)
   skeleton die attrs are added when the skeleton type unit is
   created, so ensure it is created by this point.  */
add_top_level_skeleton_die_attrs (main_comp_unit_die);
 +  add_AT_lineptr (main_comp_unit_die, DW_AT_GNU_addr_base,
 +  debug_addr_section_label);
(void) get_skeleton_type_unit ();
htab_traverse_noresize (debug_str_hash, index_string, index);
  }

LGTM


Re: [PATCH, WWW] [AVX-512] Add news about AVX-512.

2014-02-11 Thread Gerald Pfeifer
On Mon, 10 Feb 2014, Kirill Yukhin wrote:
 Index: htdocs/index.html
 ===
 + <dt><span>Intel AVX-512 support</span>
 + <span class="date">[2014-02-07]</span></dt>
 + <dd>Intel AVX-512 support was added to GCC.  That includes inline assembly
 +   support, extended existing and new registers, intrinsics set (covered by
 +   corresponding testsuite), and basic autovectorization.</dd>

How about

  support for inline assembly, new registers and extending existing ones,
  new intrinsics, and basic autovectorization

?  Please keep an eye on whether I might have changed the meaning; the
above is my interpretation of what I _think_ I read.

(To not make this too long I'd omit the testsuite aspect unless you feel
strong about it.  Then I'd ask the question why only that part is covered
by the testsuite, though. ;-)

No attribution to Intel or the actual developers as we have done in
other cases?

 Index: htdocs/gcc-4.9/changes.html
 ===
 + <li>GCC now supports Intel Advanced Vector Extensions 512 instructions
 +   (AVX-512), including inline assembly support, extended existing and
 +   new registers, intrinsics set (covered by corresponding testsuite) and
 +   basic autovectorization. 

Can you adjust this similarly (but keep/add the testsuite piece)?

 AVX-512 instructions are available via
 +   following GCC switches: AVX-512 foundamental instructions
 +   <code>-mavx512f</code>, AVX-512 prefetch instructions <code>-mavx512pf</code>,
 +   AVX-512 exponential and reciprocal instructions <code>-mavx512er</code>,
 +   AVX-512 conflict detection instructions <code>-mavx512cd</code>.

via the following (add the)

Here I'd add a colon after each instructions.

Is there an option to enable all of them together?


This is fine with these changes or different ones addressing the
items I pointed out.

Thanks,
Gerald

PS: This is my primary e-mail addres (just in case you have put
the other into your address book or so).


Re: [google][gcc-4_8][patch]Handle Split functions in the Function Reordering Plugin

2014-02-11 Thread Sriraman Tallam
On Tue, Feb 11, 2014 at 3:29 PM, Teresa Johnson tejohn...@google.com wrote:

 On Feb 11, 2014 2:37 PM, Sriraman Tallam tmsri...@google.com wrote:

 On Tue, Feb 11, 2014 at 1:32 PM, Teresa Johnson tejohn...@google.com
 wrote:
  On Mon, Feb 10, 2014 at 7:15 PM, Sriraman Tallam tmsri...@google.com
  wrote:
  Hi Teresa,
 
 I have attached a patch to recognize and reorder split functions in
  the function reordering plugin. Please review.
 
  Thanks
  Sri
 
  Index: function_reordering_plugin/callgraph.c
  ===
  --- function_reordering_plugin/callgraph.c(revision 207671)
  +++ function_reordering_plugin/callgraph.c(working copy)
  @@ -550,6 +550,25 @@ static void set_node_type (Node *n)
 s-computed_weight = n-computed_weight;
 s-max_count = n-max_count;
 
  +  /* If s is split into a cold section, assign the split weight to
  the
  + max count of the split section.   Use this also for the
  weight of the
  + split section.  */

  +  if (s-split_section)
  +{
  +  s-split_section-max_count = s-split_section-weight =
  n-split_weight;
  +  /* If split_section is comdat, update all the comdat
  +  candidates for weight.  */
  +  s_comdat = s-split_section-comdat_group;
  +  while (s_comdat != NULL)
  +{
  +  s_comdat-weight = s-split_section-weight;
  +  s_comdat-computed_weight
  + = s-split_section-computed_weight;
 
  Where is s-split_section-computed_weight set

 I removed this line as it is not useful. Details:

 In general, the computed_weight for sections is assigned in
 set_node_type in line:
  s-computed_weight = n-computed_weight;

 The computed_weight is obtained by adding the weights of all incoming
 edges. However, for the cold part of split functions, this can never
 be non-zero as the split cold part is never called and so this line is
 not useful.


 
  +  s_comdat-max_count = s-split_section-max_count;
  +  s_comdat = s_comdat-comdat_group;
  +}
  + }
  +
 
  ...
 
 
  +  /* It is split and it is not comdat.  */
  +  else if (is_split_function)
  + {
  +   Section_id *cold_section = NULL;
  +   /* Function splitting means that the hot part is really the
  +  relevant section and the other section is unlikely executed
  and
  +  should not be part of the callgraph.  */
 
  -  section-comdat_group = kept-comdat_group;
  -  kept-comdat_group = section;
  +   /* Store the new section in the section list.  */
  +   section-next = first_section;
  +   first_section = section;
  +   /* The right section is not in the section_map if
  .text.unlikely is
  +  not the new section.  */
  +  if (!is_prefix_of (.text.unlikely, section_name))
 
  The double-negative in the above comment is a bit confusing. Can we
  make this similar to the check in the earlier split comdat case. I.e.
  something like (essentially swapping the if condition and assert):

 Changed. New patch attached.

 The comment is fixed but the checks stayed the same and seem out of order
 now. Teresa

Ah!, sorry. Changed and patch attached.

Sri




 Thanks
 Sri

 
/* If the existing section is cold, the newly detected split
  must
   be hot.  */
if (is_prefix_of (.text.unlikely, kept-full_name))
  {
assert (!is_prefix_of (.text.unlikely, section_name));
...
  }
else
  {
assert (is_prefix_of (.text.unlikely, section_name));
...
  }
 
  + {
  +   assert (is_prefix_of (.text.unlikely, kept-full_name));
  +   /* The kept section was the unlikely section.  Change the
  section
  +  in section_map to be the new section which is the hot
  one.  */
  +   *slot = section;
  +   /* Record the split cold section in the hot section.  */
  +   section-split_section = kept;
  +   /* Comdats and function splitting are already handled.  */
  +   assert (kept-comdat_group == NULL);
  +   cold_section = kept;
  + }
  +   else
  + {
  +   /* Record the split cold section in the hot section.  */
  +   assert (!is_prefix_of (.text.unlikely, kept-full_name));
  +   kept-split_section = section;
  +   cold_section = section;
  + }
  +   assert (cold_section != NULL  cold_section-comdat_group ==
  NULL);
  +   cold_section-is_split_cold_section = 1;
  + }
  ...
 
  Thanks,
  Teresa
 
 
  --
  Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413
Index: 
gcc/testsuite/g++.dg/tree-prof/func_reorder_gold_plugin_split_functions_1.C
===
--- 

Re: [google][gcc-4_8][patch]Handle Split functions in the Function Reordering Plugin

2014-02-11 Thread Teresa Johnson
On Tue, Feb 11, 2014 at 4:32 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Tue, Feb 11, 2014 at 3:29 PM, Teresa Johnson tejohn...@google.com wrote:

 On Feb 11, 2014 2:37 PM, Sriraman Tallam tmsri...@google.com wrote:

 On Tue, Feb 11, 2014 at 1:32 PM, Teresa Johnson tejohn...@google.com
 wrote:
  On Mon, Feb 10, 2014 at 7:15 PM, Sriraman Tallam tmsri...@google.com
  wrote:
  Hi Teresa,
 
 I have attached a patch to recognize and reorder split functions in
  the function reordering plugin. Please review.
 
  Thanks
  Sri
 
  Index: function_reordering_plugin/callgraph.c
  ===
  --- function_reordering_plugin/callgraph.c(revision 207671)
  +++ function_reordering_plugin/callgraph.c(working copy)
  @@ -550,6 +550,25 @@ static void set_node_type (Node *n)
 s-computed_weight = n-computed_weight;
 s-max_count = n-max_count;
 
  +  /* If s is split into a cold section, assign the split weight to
  the
  + max count of the split section.   Use this also for the
  weight of the
  + split section.  */

  +  if (s-split_section)
  +{
  +  s-split_section-max_count = s-split_section-weight =
  n-split_weight;
  +  /* If split_section is comdat, update all the comdat
  +  candidates for weight.  */
  +  s_comdat = s-split_section-comdat_group;
  +  while (s_comdat != NULL)
  +{
  +  s_comdat-weight = s-split_section-weight;
  +  s_comdat-computed_weight
  + = s-split_section-computed_weight;
 
  Where is s-split_section-computed_weight set

 I removed this line as it is not useful. Details:

 In general, the computed_weight for sections is assigned in
 set_node_type in line:
  s-computed_weight = n-computed_weight;

 The computed_weight is obtained by adding the weights of all incoming
 edges. However, for the cold part of split functions, this can never
 be non-zero as the split cold part is never called and so this line is
 not useful.


 
  +  s_comdat-max_count = s-split_section-max_count;
  +  s_comdat = s_comdat-comdat_group;
  +}
  + }
  +
 
  ...
 
 
  +  /* It is split and it is not comdat.  */
  +  else if (is_split_function)
  + {
  +   Section_id *cold_section = NULL;
  +   /* Function splitting means that the hot part is really the
  +  relevant section and the other section is unlikely executed
  and
  +  should not be part of the callgraph.  */
 
  -  section-comdat_group = kept-comdat_group;
  -  kept-comdat_group = section;
  +   /* Store the new section in the section list.  */
  +   section-next = first_section;
  +   first_section = section;
  +   /* The right section is not in the section_map if
  .text.unlikely is
  +  not the new section.  */
  +  if (!is_prefix_of (.text.unlikely, section_name))
 
  The double-negative in the above comment is a bit confusing. Can we
  make this similar to the check in the earlier split comdat case. I.e.
  something like (essentially swapping the if condition and assert):

 Changed. New patch attached.

 The comment is fixed but the checks stayed the same and seem out of order
 now. Teresa

 Ah!, sorry. Changed and patch attached.

The assert in the else clause is unnecessary (since you have landed
there after doing that same check already). It would be good to have
asserts in both the if and else clauses on the new section_name (see
my suggested code in the initial response).

Teresa


 Sri




 Thanks
 Sri

 
/* If the existing section is cold, the newly detected split
  must
   be hot.  */
if (is_prefix_of (.text.unlikely, kept-full_name))
  {
assert (!is_prefix_of (.text.unlikely, section_name));
...
  }
else
  {
assert (is_prefix_of (.text.unlikely, section_name));
...
  }
 
  + {
  +   assert (is_prefix_of (.text.unlikely, kept-full_name));
  +   /* The kept section was the unlikely section.  Change the
  section
  +  in section_map to be the new section which is the hot
  one.  */
  +   *slot = section;
  +   /* Record the split cold section in the hot section.  */
  +   section-split_section = kept;
  +   /* Comdats and function splitting are already handled.  */
  +   assert (kept-comdat_group == NULL);
  +   cold_section = kept;
  + }
  +   else
  + {
  +   /* Record the split cold section in the hot section.  */
  +   assert (!is_prefix_of (.text.unlikely, kept-full_name));
  +   kept-split_section = section;
  +   cold_section = section;
  + }
  +   assert (cold_section != NULL  cold_section-comdat_group ==
  NULL);
  +  

[GOOGLE] Emit a single unaligned load/store instruction for i386 m_GENERIC

2014-02-11 Thread Cong Hou
This small patch lets GCC emit a single unaligned load/store
instruction for m_GENERIC i386 CPUs.

Bootstrapped and passed regression test.

OK for Google branch?


thanks,
Cong


Index: gcc/config/i386/i386.c
===
--- gcc/config/i386/i386.c (revision 207701)
+++ gcc/config/i386/i386.c (working copy)
@@ -1903,10 +1903,10 @@ static unsigned int initial_ix86_tune_fe
   m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_ATOM  | m_AMDFAM10 | m_BDVER
| m_GENERIC,

   /* X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL */
-  m_COREI7 | m_COREI7_AVX | m_AMDFAM10 | m_BDVER | m_BTVER,
+  m_COREI7 | m_COREI7_AVX | m_AMDFAM10 | m_BDVER | m_BTVER | m_GENERIC,

   /* X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL */
-  m_COREI7 | m_COREI7_AVX | m_BDVER,
+  m_COREI7 | m_COREI7_AVX | m_BDVER | m_GENERIC,

   /* X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL */
   m_BDVER ,
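
For reference, the kind of loop this affects looks roughly like the
following (purely illustrative; the function and options are not from
the patch).  With the two tuning bits set for m_GENERIC, the vectorizer
is expected to emit single unaligned SSE loads/stores (movups/movdqu
style) here instead of splitting each access into a low/high half pair:

/* Built with something like -O2 -ftree-vectorize -mtune=generic;
   dst and src are not known to be 16-byte aligned, so the vectorized
   loop needs unaligned loads and stores.  */
void
scale (float *dst, const float *src, int n)
{
  for (int i = 0; i < n; i++)
    dst[i] = src[i] * 2.0f;
}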


Re: [GOOGLE] Emit a single unaligned load/store instruction for i386 m_GENERIC

2014-02-11 Thread Xinliang David Li
ok.

David

On Tue, Feb 11, 2014 at 4:50 PM, Cong Hou co...@google.com wrote:
 This small patch lets GCC emit a single unaligned load/store
 instruction for m_GENERIC i386 CPUs.

 Bootstrapped and passed regression test.

 OK for Google branch?


 thanks,
 Cong


 Index: gcc/config/i386/i386.c
 ===
 --- gcc/config/i386/i386.c (revision 207701)
 +++ gcc/config/i386/i386.c (working copy)
 @@ -1903,10 +1903,10 @@ static unsigned int initial_ix86_tune_fe
m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_ATOM  | m_AMDFAM10 | m_BDVER
 | m_GENERIC,

/* X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL */
 -  m_COREI7 | m_COREI7_AVX | m_AMDFAM10 | m_BDVER | m_BTVER,
 +  m_COREI7 | m_COREI7_AVX | m_AMDFAM10 | m_BDVER | m_BTVER | m_GENERIC,

/* X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL */
 -  m_COREI7 | m_COREI7_AVX | m_BDVER,
 +  m_COREI7 | m_COREI7_AVX | m_BDVER | m_GENERIC,

/* X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL */
m_BDVER ,


Re: [google][gcc-4_8][patch]Handle Split functions in the Function Reordering Plugin

2014-02-11 Thread Sriraman Tallam
On Tue, Feb 11, 2014 at 4:39 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Feb 11, 2014 at 4:32 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Tue, Feb 11, 2014 at 3:29 PM, Teresa Johnson tejohn...@google.com wrote:

 On Feb 11, 2014 2:37 PM, Sriraman Tallam tmsri...@google.com wrote:

 On Tue, Feb 11, 2014 at 1:32 PM, Teresa Johnson tejohn...@google.com
 wrote:
  On Mon, Feb 10, 2014 at 7:15 PM, Sriraman Tallam tmsri...@google.com
  wrote:
  Hi Teresa,
 
 I have attached a patch to recognize and reorder split functions in
  the function reordering plugin. Please review.
 
  Thanks
  Sri
 
  Index: function_reordering_plugin/callgraph.c
  ===
  --- function_reordering_plugin/callgraph.c(revision 207671)
  +++ function_reordering_plugin/callgraph.c(working copy)
  @@ -550,6 +550,25 @@ static void set_node_type (Node *n)
 s-computed_weight = n-computed_weight;
 s-max_count = n-max_count;
 
  +  /* If s is split into a cold section, assign the split weight to
  the
  + max count of the split section.   Use this also for the
  weight of the
  + split section.  */

  +  if (s-split_section)
  +{
  +  s-split_section-max_count = s-split_section-weight =
  n-split_weight;
  +  /* If split_section is comdat, update all the comdat
  +  candidates for weight.  */
  +  s_comdat = s-split_section-comdat_group;
  +  while (s_comdat != NULL)
  +{
  +  s_comdat-weight = s-split_section-weight;
  +  s_comdat-computed_weight
  + = s-split_section-computed_weight;
 
  Where is s-split_section-computed_weight set

 I removed this line as it is not useful. Details:

 In general, the computed_weight for sections is assigned in
 set_node_type in line:
  s-computed_weight = n-computed_weight;

 The computed_weight is obtained by adding the weights of all incoming
 edges. However, for the cold part of split functions, this can never
 be non-zero as the split cold part is never called and so this line is
 not useful.


 
  +  s_comdat-max_count = s-split_section-max_count;
  +  s_comdat = s_comdat-comdat_group;
  +}
  + }
  +
 
  ...
 
 
  +  /* It is split and it is not comdat.  */
  +  else if (is_split_function)
  + {
  +   Section_id *cold_section = NULL;
  +   /* Function splitting means that the hot part is really the
  +  relevant section and the other section is unlikely executed
  and
  +  should not be part of the callgraph.  */
 
  -  section-comdat_group = kept-comdat_group;
  -  kept-comdat_group = section;
  +   /* Store the new section in the section list.  */
  +   section-next = first_section;
  +   first_section = section;
  +   /* The right section is not in the section_map if
  .text.unlikely is
  +  not the new section.  */
  +  if (!is_prefix_of (.text.unlikely, section_name))
 
  The double-negative in the above comment is a bit confusing. Can we
  make this similar to the check in the earlier split comdat case. I.e.
  something like (essentially swapping the if condition and assert):

 Changed. New patch attached.

 The comment is fixed but the checks stayed the same and seem out of order
 now. Teresa

 Ah!, sorry. Changed and patch attached.

 The assert in the else clause is unnecessary (since you have landed
 there after doing that same check already). It would be good to have
 asserts in both the if and else clauses on the new section_name (see
 my suggested code in the initial response).

Ok, I overlooked the code you suggested in the original response,
sorry about that. I have included those asserts you suggested in both
places where we swap the new section with the existing section.

Thanks
Sri


 Teresa


 Sri




 Thanks
 Sri

 
/* If the existing section is cold, the newly detected split
  must
   be hot.  */
if (is_prefix_of (.text.unlikely, kept-full_name))
  {
assert (!is_prefix_of (.text.unlikely, section_name));
...
  }
else
  {
assert (is_prefix_of (.text.unlikely, section_name));
...
  }
 
  + {
  +   assert (is_prefix_of (.text.unlikely, kept-full_name));
  +   /* The kept section was the unlikely section.  Change the
  section
  +  in section_map to be the new section which is the hot
  one.  */
  +   *slot = section;
  +   /* Record the split cold section in the hot section.  */
  +   section-split_section = kept;
  +   /* Comdats and function splitting are already handled.  */
  +   assert (kept-comdat_group == NULL);
  +   cold_section = kept;
  + }
  +   else
  + {
  +   /* Record 

Re: [PATCH] Fix up -Wsequence-point handling (PR c/60101)

2014-02-11 Thread Joseph S. Myers
On Tue, 11 Feb 2014, Jakub Jelinek wrote:

 Hi!
 
 Right now we spend several minutes in verify_tree.  The problem I see
 is that merge_tlist does the exact opposite of what it should be doing with
 the copy flag (most merge_tlist calls are with copy=0, so that means mostly
 unnecessary copying, but for the single one for SAVE_EXPR it actually
 probably breaks things or can break); the middle hunk is about one spot
 which IMHO uses copy=1 unnecessarily (nothing uses tmp_list2 afterwards).
 
 The most important is the last hunk, the SAVE_EXPR handling was doing
 merge_tlist first on the whole tmp_nosp chain (with copy=0 which mistakenly
 copied everything, see above), and then doing that again with the whole
 tmp_nosp list except the first entry, then except the first two entries etc.
 O(n^2) complexity, but IMHO none of the calls but the first one actually
 would do anything, because after the first merge_tlist call all entries
 should be actually in tmp_list3 (except for duplicates), and so for all
 further entries it would be finding a matching entry already (found=1).
 
 With this patch pr60101.c compiles within less than a second.
 
 Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Simple install hook

2014-02-11 Thread David Malcolm
On Tue, 2014-02-11 at 16:51 +, Philip Herron wrote:
[adding the j...@gcc.gnu.org ML to the CC]

 Added install hook:

Thanks!

I don't know that this is needed for a 3-line patch, but have you done
the copyright assignment paperwork for GCC contribution?  (I hope to
merge my branch into gcc trunk at some point).  [Also, I'd love to have
more, larger, patches from you for the jit branch!]

 /opt/gjit/bin/gcc -g -O2 -Wall t.c -o test -I/opt/gjit/include
 -lgccjit -L/opt/gjit/lib
 
 Compiles the helloworld examples correctly and it runs instead of
 pointing to my gcc build dir. Am working on getting more involved with
 this and started:
 
 https://github.com/redbrain/spy
 
 It's only just starting to parse stuff, but a C/Python-style language
 using gcc as a JIT might be interesting: the idea is a kind of dynamic
 language for C that can call C libs safely and easily.
 Mostly just so I can see where to help out in the jit front-end.

Excellent!  Looks promising - though it looks like the backend is all
stubbed out at the moment.

Note that the JIT API isn't frozen yet.  I try to remember to add API
change to the subject line when posting my commits, but I don't always
remember.

Let me know if you have any questions on how the JIT API works - or
input on how it *should* work.

FWIW I've been experimentally porting GNU Octave's LLVM-based JIT to
using libgccjit, and finding and fixing various issues in the latter on
the way - that's been driving a lot of the patches to the jit branch
lately.

 Was also considering some kind of libopcodes work to assemble the code
 in memory instead of creating a .so in /tmp. Not sure if I know what I
 am doing well enough there, but it might be more elegant for stuff using
 this front-end.

My thinking here was that the core code of the GNU assembler could gain
the option of being built as a shared library, isolating its state in a
context object, and we could try to hack the two projects into meeting in
the middle.  That is a large amount of work though (and a different
mailing list), hence the crude .so hack for now.

 Am so impressed how well this works.

Cheers!
Dave



Re: [google][gcc-4_8][patch]Handle Split functions in the Function Reordering Plugin

2014-02-11 Thread Teresa Johnson
On Tue, Feb 11, 2014 at 4:51 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Tue, Feb 11, 2014 at 4:39 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Feb 11, 2014 at 4:32 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Tue, Feb 11, 2014 at 3:29 PM, Teresa Johnson tejohn...@google.com 
 wrote:

 On Feb 11, 2014 2:37 PM, Sriraman Tallam tmsri...@google.com wrote:

 On Tue, Feb 11, 2014 at 1:32 PM, Teresa Johnson tejohn...@google.com
 wrote:
  On Mon, Feb 10, 2014 at 7:15 PM, Sriraman Tallam tmsri...@google.com
  wrote:
  Hi Teresa,
 
 I have attached a patch to recognize and reorder split functions in
  the function reordering plugin. Please review.
 
  Thanks
  Sri
 
  Index: function_reordering_plugin/callgraph.c
  ===
  --- function_reordering_plugin/callgraph.c(revision 207671)
  +++ function_reordering_plugin/callgraph.c(working copy)
  @@ -550,6 +550,25 @@ static void set_node_type (Node *n)
 s-computed_weight = n-computed_weight;
 s-max_count = n-max_count;
 
  +  /* If s is split into a cold section, assign the split weight to
  the
  + max count of the split section.   Use this also for the
  weight of the
  + split section.  */

  +  if (s-split_section)
  +{
  +  s-split_section-max_count = s-split_section-weight =
  n-split_weight;
  +  /* If split_section is comdat, update all the comdat
  +  candidates for weight.  */
  +  s_comdat = s-split_section-comdat_group;
  +  while (s_comdat != NULL)
  +{
  +  s_comdat-weight = s-split_section-weight;
  +  s_comdat-computed_weight
  + = s-split_section-computed_weight;
 
  Where is s-split_section-computed_weight set

 I removed this line as it is not useful. Details:

 In general, the computed_weight for sections is assigned in
 set_node_type in line:
  s-computed_weight = n-computed_weight;

 The computed_weight is obtained by adding the weights of all incoming
 edges. However, for the cold part of split functions, this can never
 be non-zero as the split cold part is never called and so this line is
 not useful.


 
  +  s_comdat-max_count = s-split_section-max_count;
  +  s_comdat = s_comdat-comdat_group;
  +}
  + }
  +
 
  ...
 
 
  +  /* It is split and it is not comdat.  */
  +  else if (is_split_function)
  + {
  +   Section_id *cold_section = NULL;
  +   /* Function splitting means that the hot part is really the
  +  relevant section and the other section is unlikely executed
  and
  +  should not be part of the callgraph.  */
 
  -  section-comdat_group = kept-comdat_group;
  -  kept-comdat_group = section;
  +   /* Store the new section in the section list.  */
  +   section-next = first_section;
  +   first_section = section;
  +   /* The right section is not in the section_map if
  .text.unlikely is
  +  not the new section.  */
  +  if (!is_prefix_of (.text.unlikely, section_name))
 
  The double-negative in the above comment is a bit confusing. Can we
  make this similar to the check in the earlier split comdat case. I.e.
  something like (essentially swapping the if condition and assert):

 Changed. New patch attached.

 The comment is fixed but the checks stayed the same and seem out of order
 now. Teresa

 Ah!, sorry. Changed and patch attached.

 The assert in the else clause is unnecessary (since you have landed
 there after doing that same check already). It would be good to have
 asserts in both the if and else clauses on the new section_name (see
 my suggested code in the initial response).

 Ok, I overlooked the code you suggested in the original response,
 sorry about that. I have included those asserts you suggested in both
 places where we swap the new section with the existing section.

Ok, thanks! Looks good.
Teresa


 Thanks
 Sri


 Teresa


 Sri




 Thanks
 Sri

 
/* If the existing section is cold, the newly detected split
  must
   be hot.  */
if (is_prefix_of (.text.unlikely, kept-full_name))
  {
assert (!is_prefix_of (.text.unlikely, section_name));
...
  }
else
  {
assert (is_prefix_of (.text.unlikely, section_name));
...
  }
 
  + {
  +   assert (is_prefix_of (.text.unlikely, kept-full_name));
  +   /* The kept section was the unlikely section.  Change the
  section
  +  in section_map to be the new section which is the hot
  one.  */
  +   *slot = section;
  +   /* Record the split cold section in the hot section.  */
  +   section-split_section = kept;
  +   /* Comdats and function splitting are already handled.  */
  +   assert 

Re: [PATCH] Handle more COMDAT profiling issues

2014-02-11 Thread Teresa Johnson
On Tue, Feb 11, 2014 at 2:56 PM, Xinliang David Li davi...@google.com wrote:
 Is it better to add some logic in counts_to_freq to determine if the
 profile count needs to be dropped completely to force profile
 estimation?

This is the problem I was mentioning below where we call
counts_to_freqs before we have the cgraph and can tell that we will
need to drop the profile. When we were only dropping the profile for
functions with all 0 counts, we simply avoided doing the
counts_to_freqs when the counts were all 0, which works since the 0
counts don't leave things in an inconsistent state (counts vs
estimated frequencies).

Teresa


 David

 On Mon, Feb 10, 2014 at 2:12 PM, Teresa Johnson tejohn...@google.com wrote:
 This patch attempts to address the lost profile issue for COMDATs in
 more circumstances, exposed by function splitting.

 My earlier patch handled the case where the comdat had 0 counts since
 the linker kept the copy in a different module. In that case we
 prevent the guessed frequencies on 0-count functions from being
 dropped by counts_to_freq, and then later mark any reached via
 non-zero callgraph edges as guessed. Finally, when one such 0-count
 comdat is inlined the call count is propagated to the callee blocks
 using the guessed probabilities.

 However, in this case, there was a comdat that had a very small
 non-zero count, that was being inlined to a much hotter callsite. This
 could happen when there was a copy that was ipa-inlined
 in the profile gen compile, so the copy in that module gets some
 non-zero counts from the ipa inlined instance, but when the out of
 line copy was eliminated by the linker (selected from a different
 module). In this case the inliner was scaling the bb counts up quite a
 lot when inlining. The problem is that you most likely can't trust
 that the 0 count bbs in such a case are really not executed by the
 callsite it is being inlined into, since the counts are very small and
 correspond to a different callsite. In some internal C++ applications
 I am seeing that we execute out of the split cold portion of COMDATs
 for this reason.

 This problem is more complicated to address than the 0-count instance,
 because we need the cgraph to determine which functions to drop the
 profile on, and at that point the estimated frequencies have already
 been overwritten by counts_to_freqs. To handle this broader case, I
 have changed the drop_profile routine to simply re-estimate the
 probabilities/frequencies (and translate these into counts scaled from
 the incoming call edge counts). This unfortunately necessitates
 rebuilding the cgraph, to propagate the new synthesized counts and
 avoid checking failures downstream. But it will only be rebuilt if we
 dropped any profiles. With this solution, some of the older approach
 can be removed (e.g. propagating counts using the guessed
 probabilities during inlining).

 Patch is below. Bootstrapped and tested on x86-64-unknown-linux-gnu.
 Also tested on
 a profile-use build of SPEC cpu2006. Ok for trunk when stage 1 reopens?

 Thanks,
 Teresa

 2014-02-10  Teresa Johnson  tejohn...@google.com

 * graphite.c (graphite_finalize): Pass new parameter.
 * params.def (PARAM_MIN_CALLER_REESTIMATE_RATIO): New.
 * predict.c (tree_estimate_probability): New parameter.
 (tree_estimate_probability_worker): Renamed from
 tree_estimate_probability_driver.
 (tree_reestimate_probability): New function.
 (tree_estimate_probability_driver): Invoke
 tree_estimate_probability_worker.
 (freqs_to_counts): Move from tree-inline.c.
 (drop_profile): Re-estimated profiles when dropping counts.
 (handle_missing_profiles): Drop for some non-zero functions as well.
 (counts_to_freqs): Remove code obviated by reestimation.
 * predict.h (handle_missing_profiles): Update declaration.
 (tree_estimate_probability): Ditto.
 * tree-inline.c (freqs_to_counts): Move to predict.c.
 (copy_cfg_body): Remove code obviated by reestimation.
 * tree-profile.c (gimple_gen_ior_profiler):
 (rebuild_cgraph): Code extracted from tree_profiling to rebuild 
 cgraph.
 (tree_profiling): Invoke rebuild_cgraph as needed.

 Index: graphite.c
 ===
 --- graphite.c  (revision 207436)
 +++ graphite.c  (working copy)
 @@ -247,7 +247,7 @@ graphite_finalize (bool need_cfg_cleanup_p)
cleanup_tree_cfg ();
profile_status_for_fn (cfun) = PROFILE_ABSENT;
release_recorded_exits ();
 -  tree_estimate_probability ();
 +  tree_estimate_probability (false);
  }

cloog_state_free (cloog_state);
 Index: params.def
 ===
 --- params.def  (revision 207436)
 +++ params.def  (working copy)
 @@ -44,6 +44,12 @@ DEFPARAM (PARAM_PREDICTABLE_BRANCH_OUTCOME,
   Maximal estimated outcome of 

Re: [PATCH] Handle more COMDAT profiling issues

2014-02-11 Thread Xinliang David Li
Why is call graph needed to determine whether to drop the profile?

If that is needed, it might be possible to leverage the ipa_profile
pass as it will walk through all function nodes to do profile
annotation. With this you can make decision to drop callee profile in
caller's context.

David

On Tue, Feb 11, 2014 at 5:04 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Feb 11, 2014 at 2:56 PM, Xinliang David Li davi...@google.com wrote:
 Is it better to add some logic in counts_to_freq to determine if the
 profile count needs to be dropped completely to force profile
 estimation?

 This is the problem I was mentioning below where we call
 counts_to_freqs before we have the cgraph and can tell that we will
 need to drop the profile. When we were only dropping the profile for
 functions with all 0 counts, we simply avoided doing the
 counts_to_freqs when the counts were all 0, which works since the 0
 counts don't leave things in an inconsistent state (counts vs
 estimated frequencies).

 Teresa


 David

 On Mon, Feb 10, 2014 at 2:12 PM, Teresa Johnson tejohn...@google.com wrote:
 This patch attempts to address the lost profile issue for COMDATs in
 more circumstances, exposed by function splitting.

 My earlier patch handled the case where the comdat had 0 counts since
 the linker kept the copy in a different module. In that case we
 prevent the guessed frequencies on 0-count functions from being
 dropped by counts_to_freq, and then later mark any reached via
 non-zero callgraph edges as guessed. Finally, when one such 0-count
 comdat is inlined the call count is propagated to the callee blocks
 using the guessed probabilities.

 However, in this case, there was a comdat that had a very small
 non-zero count, that was being inlined to a much hotter callsite. This
 could happen when there was a copy that was ipa-inlined
 in the profile gen compile, so the copy in that module gets some
 non-zero counts from the ipa inlined instance, but when the out of
 line copy was eliminated by the linker (selected from a different
 module). In this case the inliner was scaling the bb counts up quite a
 lot when inlining. The problem is that you most likely can't trust
 that the 0 count bbs in such a case are really not executed by the
 callsite it is being inlined into, since the counts are very small and
 correspond to a different callsite. In some internal C++ applications
 I am seeing that we execute out of the split cold portion of COMDATs
 for this reason.

 This problem is more complicated to address than the 0-count instance,
 because we need the cgraph to determine which functions to drop the
 profile on, and at that point the estimated frequencies have already
 been overwritten by counts_to_freqs. To handle this broader case, I
 have changed the drop_profile routine to simply re-estimate the
 probabilities/frequencies (and translate these into counts scaled from
 the incoming call edge counts). This unfortunately necessitates
 rebuilding the cgraph, to propagate the new synthesized counts and
 avoid checking failures downstream. But it will only be rebuilt if we
 dropped any profiles. With this solution, some of the older approach
 can be removed (e.g. propagating counts using the guessed
 probabilities during inlining).

 Patch is below. Bootstrapped and tested on x86-64-unknown-linux-gnu.
 Also tested on
 a profile-use build of SPEC cpu2006. Ok for trunk when stage 1 reopens?

 Thanks,
 Teresa

 2014-02-10  Teresa Johnson  tejohn...@google.com

 * graphite.c (graphite_finalize): Pass new parameter.
 * params.def (PARAM_MIN_CALLER_REESTIMATE_RATIO): New.
 * predict.c (tree_estimate_probability): New parameter.
 (tree_estimate_probability_worker): Renamed from
 tree_estimate_probability_driver.
 (tree_reestimate_probability): New function.
 (tree_estimate_probability_driver): Invoke
 tree_estimate_probability_worker.
 (freqs_to_counts): Move from tree-inline.c.
 (drop_profile): Re-estimated profiles when dropping counts.
 (handle_missing_profiles): Drop for some non-zero functions as well.
 (counts_to_freqs): Remove code obviated by reestimation.
 * predict.h (handle_missing_profiles): Update declaration.
 (tree_estimate_probability): Ditto.
 * tree-inline.c (freqs_to_counts): Move to predict.c.
 (copy_cfg_body): Remove code obviated by reestimation.
 * tree-profile.c (gimple_gen_ior_profiler):
 (rebuild_cgraph): Code extracted from tree_profiling to rebuild 
 cgraph.
 (tree_profiling): Invoke rebuild_cgraph as needed.

 Index: graphite.c
 ===
 --- graphite.c  (revision 207436)
 +++ graphite.c  (working copy)
 @@ -247,7 +247,7 @@ graphite_finalize (bool need_cfg_cleanup_p)
cleanup_tree_cfg ();
profile_status_for_fn (cfun) = PROFILE_ABSENT;
release_recorded_exits ();
 -  

[GOOGLE] Prevent x_flag_complex_method to be set to 2 for C++.

2014-02-11 Thread Cong Hou
With this patch x_flag_complex_method won't be set to 2 for C++ so
that multiply/divide between std::complex objects won't be replaced by
expensive builtin function calls.

Bootstrapped and passed regression test.

OK for Google branch?


thanks,
Cong



Index: gcc/c-family/c-opts.c
===
--- gcc/c-family/c-opts.c (revision 207701)
+++ gcc/c-family/c-opts.c (working copy)
@@ -204,8 +204,10 @@ c_common_init_options_struct (struct gcc
   opts-x_warn_write_strings = c_dialect_cxx ();
   opts-x_flag_warn_unused_result = true;

-  /* By default, C99-like requirements for complex multiply and divide.  */
-  opts-x_flag_complex_method = 2;
+  /* By default, C99-like requirements for complex multiply and divide.
+ But for C++ this should not be required.  */
+  if (c_language != clk_cxx)
+opts-x_flag_complex_method = 2;
 }

 /* Common initialization before calling option handlers.  */
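
For reference, the kind of code this changes looks like the example
below (illustrative only, not part of the patch).  With the C99-style
setting the multiplication is expanded with a call into the __mulsc3
libgcc helper (possibly behind a NaN check) to get Annex G infinity
handling; with the cheaper setting it stays as the plain inline formula
of four multiplies and two adds.

#include <complex>

// Illustrative: the flag decides whether this operator* may end up
// calling __mulsc3 or is kept fully inline.
std::complex<float> mul (std::complex<float> a, std::complex<float> b)
{
  return a * b;
}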


Re: [GOOGLE] Prevent x_flag_complex_method to be set to 2 for C++.

2014-02-11 Thread Xinliang David Li
ok.

David

On Tue, Feb 11, 2014 at 5:30 PM, Cong Hou co...@google.com wrote:
 With this patch x_flag_complex_method won't be set to 2 for C++ so
 that multiply/divide between std::complex objects won't be replaced by
 expensive builtin function calls.

 Bootstrapped and passed regression test.

 OK for Google branch?


 thanks,
 Cong



 Index: gcc/c-family/c-opts.c
 ===
 --- gcc/c-family/c-opts.c (revision 207701)
 +++ gcc/c-family/c-opts.c (working copy)
 @@ -204,8 +204,10 @@ c_common_init_options_struct (struct gcc
opts-x_warn_write_strings = c_dialect_cxx ();
opts-x_flag_warn_unused_result = true;

 -  /* By default, C99-like requirements for complex multiply and divide.  */
 -  opts-x_flag_complex_method = 2;
 +  /* By default, C99-like requirements for complex multiply and divide.
 + But for C++ this should not be required.  */
 +  if (c_language != clk_cxx)
 +opts-x_flag_complex_method = 2;
  }

  /* Common initialization before calling option handlers.  */


Re: [PATCH] Handle more COMDAT profiling issues

2014-02-11 Thread Teresa Johnson
On Tue, Feb 11, 2014 at 5:16 PM, Xinliang David Li davi...@google.com wrote:
 Why is call graph needed to determine whether to drop the profile?

Because we detect this situation by looking for cases where the call
edge counts greatly exceed the callee node count.
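
In sketch form, the check amounts to something like the following
(illustrative only; the names and the exact comparison are not
necessarily what the patch uses):

  /* Flag a callee whose largest incoming call edge count dwarfs its
     own entry count, using the new ratio parameter.  */
  static bool
  caller_counts_suspicious_p (struct cgraph_node *node)
  {
    struct cgraph_edge *e;
    gcov_type max_edge_count = 0;
    for (e = node->callers; e; e = e->next_caller)
      if (e->count > max_edge_count)
        max_edge_count = e->count;
    return max_edge_count
           > node->count * PARAM_VALUE (PARAM_MIN_CALLER_REESTIMATE_RATIO);
  }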


 If that is needed, it might be possible to leverage the ipa_profile
 pass as it will walk through all function nodes to do profile
 annotation. With this you can make decision to drop callee profile in
 caller's context.

There are 2 ipa profiling passes, which are somewhat confusingly named
(to me at least. =). This is being done during the first.

The first is pass_ipa_tree_profile in tree-profile.c, but is a
SIMPLE_IPA_PASS and has the name profile in the dump. The second is
pass_ipa_profile in ipa-profile.c, which is an IPA_PASS and has the
name profile_estimate in the dump. I assume you are suggesting to
move this into the latter? But I'm not clear on what benefit that
gives - the functions are not being traversed in order, so there is
still the issue of needing to rebuild the cgraph after dropping
profiles, which might be best done earlier as I have in the patch.

Teresa


 David

 On Tue, Feb 11, 2014 at 5:04 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Feb 11, 2014 at 2:56 PM, Xinliang David Li davi...@google.com 
 wrote:
 Is it better to add some logic in counts_to_freq to determine if the
 profile count needs to be dropped completely to force profile
 estimation?

 This is the problem I was mentioning below where we call
 counts_to_freqs before we have the cgraph and can tell that we will
 need to drop the profile. When we were only dropping the profile for
 functions with all 0 counts, we simply avoided doing the
 counts_to_freqs when the counts were all 0, which works since the 0
 counts don't leave things in an inconsistent state (counts vs
 estimated frequencies).

 Teresa


 David

 On Mon, Feb 10, 2014 at 2:12 PM, Teresa Johnson tejohn...@google.com 
 wrote:
 This patch attempts to address the lost profile issue for COMDATs in
 more circumstances, exposed by function splitting.

 My earlier patch handled the case where the comdat had 0 counts since
 the linker kept the copy in a different module. In that case we
 prevent the guessed frequencies on 0-count functions from being
 dropped by counts_to_freq, and then later mark any reached via
 non-zero callgraph edges as guessed. Finally, when one such 0-count
 comdat is inlined the call count is propagated to the callee blocks
 using the guessed probabilities.

 However, in this case, there was a comdat that had a very small
 non-zero count, that was being inlined to a much hotter callsite. This
 could happen when there was a copy that was ipa-inlined
 in the profile gen compile, so the copy in that module gets some
 non-zero counts from the ipa inlined instance, but when the out of
 line copy was eliminated by the linker (selected from a different
 module). In this case the inliner was scaling the bb counts up quite a
 lot when inlining. The problem is that you most likely can't trust
 that the 0 count bbs in such a case are really not executed by the
 callsite it is being inlined into, since the counts are very small and
 correspond to a different callsite. In some internal C++ applications
 I am seeing that we execute out of the split cold portion of COMDATs
 for this reason.

 This problem is more complicated to address than the 0-count instance,
 because we need the cgraph to determine which functions to drop the
 profile on, and at that point the estimated frequencies have already
 been overwritten by counts_to_freqs. To handle this broader case, I
 have changed the drop_profile routine to simply re-estimate the
 probabilities/frequencies (and translate these into counts scaled from
 the incoming call edge counts). This unfortunately necessitates
 rebuilding the cgraph, to propagate the new synthesized counts and
 avoid checking failures downstream. But it will only be rebuilt if we
 dropped any profiles. With this solution, some of the older approach
 can be removed (e.g. propagating counts using the guessed
 probabilities during inlining).

 Patch is below. Bootstrapped and tested on x86-64-unknown-linux-gnu.
 Also tested on
 a profile-use build of SPEC cpu2006. Ok for trunk when stage 1 reopens?

 Thanks,
 Teresa

 2014-02-10  Teresa Johnson  tejohn...@google.com

 * graphite.c (graphite_finalize): Pass new parameter.
 * params.def (PARAM_MIN_CALLER_REESTIMATE_RATIO): New.
 * predict.c (tree_estimate_probability): New parameter.
 (tree_estimate_probability_worker): Renamed from
 tree_estimate_probability_driver.
 (tree_reestimate_probability): New function.
 (tree_estimate_probability_driver): Invoke
 tree_estimate_probability_worker.
 (freqs_to_counts): Move from tree-inline.c.
 (drop_profile): Re-estimated profiles when dropping counts.
 (handle_missing_profiles): Drop for some 

Re: [PATCH] Handle more COMDAT profiling issues

2014-02-11 Thread Xinliang David Li
On Tue, Feb 11, 2014 at 5:36 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Feb 11, 2014 at 5:16 PM, Xinliang David Li davi...@google.com wrote:
 Why is call graph needed to determine whether to drop the profile?

 Because we detect this situation by looking for cases where the call
 edge counts greatly exceed the callee node count.


 If that is needed, it might be possible to leverage the ipa_profile
 pass as it will walk through all function nodes to do profile
 annotation. With this you can make decision to drop callee profile in
 caller's context.

 There are 2 ipa profiling passes, which are somewhat confusingly named
 (to me at least. =). This is being done during the first.

 The first is pass_ipa_tree_profile in tree-profile.c, but is a
 SIMPLE_IPA_PASS and has the name profile in the dump. The second is
 pass_ipa_profile in ipa-profile.c, which is an IPA_PASS and has the
 name profile_estimate in the dump. I assume you are suggesting to
 move this into the latter? But I'm not clear on what benefit that
 gives - the functions are not being traversed in order, so there is
 still the issue of needing to rebuild the cgraph after dropping
 profiles, which might be best done earlier as I have in the patch.


I meant the tree-profile one.  I think this might work: after all the
functions' profile counts are annotated, add another walk of the
call graph nodes to drop bad profiles before the call graph is
rebuilt (the call graph does exist at that point).
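
Roughly, as an illustrative sketch only (the predicate and the exact
drop_profile interface here are hypothetical, not the actual change):

  /* In tree_profiling, after the per-function annotation loop and
     before the call graph edges are rebuilt.  */
  struct cgraph_node *node;
  FOR_EACH_DEFINED_FUNCTION (node)
    if (profile_suspicious_p (node))  /* hypothetical check */
      drop_profile (node);            /* hypothetical signature */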

David

 Teresa


 David

 On Tue, Feb 11, 2014 at 5:04 PM, Teresa Johnson tejohn...@google.com wrote:
 On Tue, Feb 11, 2014 at 2:56 PM, Xinliang David Li davi...@google.com 
 wrote:
 Is it better to add some logic in counts_to_freq to determine if the
 profile count needs to be dropped completely to force profile
 estimation?

 This is the problem I was mentioning below where we call
 counts_to_freqs before we have the cgraph and can tell that we will
 need to drop the profile. When we were only dropping the profile for
 functions with all 0 counts, we simply avoided doing the
 counts_to_freqs when the counts were all 0, which works since the 0
 counts don't leave things in an inconsistent state (counts vs
 estimated frequencies).

 Teresa


 David

 On Mon, Feb 10, 2014 at 2:12 PM, Teresa Johnson tejohn...@google.com 
 wrote:
 This patch attempts to address the lost profile issue for COMDATs in
 more circumstances, exposed by function splitting.

 My earlier patch handled the case where the comdat had 0 counts since
 the linker kept the copy in a different module. In that case we
 prevent the guessed frequencies on 0-count functions from being
 dropped by counts_to_freq, and then later mark any reached via
 non-zero callgraph edges as guessed. Finally, when one such 0-count
 comdat is inlined the call count is propagated to the callee blocks
 using the guessed probabilities.

 However, in this case, there was a comdat that had a very small
 non-zero count, that was being inlined to a much hotter callsite. This
 could happen when there was a copy that was ipa-inlined
 in the profile gen compile, so the copy in that module gets some
 non-zero counts from the ipa inlined instance, but when the out of
 line copy was eliminated by the linker (selected from a different
 module). In this case the inliner was scaling the bb counts up quite a
 lot when inlining. The problem is that you most likely can't trust
 that the 0 count bbs in such a case are really not executed by the
 callsite it is being inlined into, since the counts are very small and
 correspond to a different callsite. In some internal C++ applications
 I am seeing that we execute out of the split cold portion of COMDATs
 for this reason.

 This problem is more complicated to address than the 0-count instance,
 because we need the cgraph to determine which functions to drop the
 profile on, and at that point the estimated frequencies have already
 been overwritten by counts_to_freqs. To handle this broader case, I
 have changed the drop_profile routine to simply re-estimate the
 probabilities/frequencies (and translate these into counts scaled from
 the incoming call edge counts). This unfortunately necessitates
 rebuilding the cgraph, to propagate the new synthesized counts and
 avoid checking failures downstream. But it will only be rebuilt if we
 dropped any profiles. With this solution, some of the older approach
 can be removed (e.g. propagating counts using the guessed
 probabilities during inlining).

 Patch is below. Bootstrapped and tested on x86-64-unknown-linux-gnu.
 Also tested on
 a profile-use build of SPEC cpu2006. Ok for trunk when stage 1 reopens?

 Thanks,
 Teresa

 2014-02-10  Teresa Johnson  tejohn...@google.com

 * graphite.c (graphite_finalize): Pass new parameter.
 * params.def (PARAM_MIN_CALLER_REESTIMATE_RATIO): New.
 * predict.c (tree_estimate_probability): New parameter.
 (tree_estimate_probability_worker): Renamed from
 

Warn about virtual table mismatches

2014-02-11 Thread Jan Hubicka
Hi,
this patch implements a warning when we merge two virtual tables that do not
match.  It compares the size and also goes into the individual fields.  I had
to implement my own comparison code, since the vtables may subtly differ (in
RTTI), and I also need to do the comparison during DECL merging (since we will
forget about the decl then).

Here is an example of a real ODR violation in the bugzilla that it finds:
../../dist/include/mozilla/a11y/DocAccessible.h:40:0: warning: type ‘struct 
DocAccessible’ violates one definition rule [enabled by default]
 class DocAccessible : public HyperTextAccessibleWrap,
 ^
/aux/hubicka/firefox/accessible/src/generic/DocAccessible.h:40:0: note: a type 
with the same name but different virtual table is defined in another 
translation unit
 class DocAccessible : public HyperTextAccessibleWrap,
 ^
/aux/hubicka/firefox/accessible/src/generic/DocAccessible.cpp:1232:0: note: the 
 first different method is ‘HandleAccEvent’
 DocAccessible::HandleAccEvent(AccEvent* aEvent)
 ^
/aux/hubicka/firefox/accessible/src/atk/AccessibleWrap.cpp:956:0: note: a 
 conflicting method is ‘HandleAccEvent’
 AccessibleWrap::HandleAccEvent(AccEvent* aEvent)
 ^
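
A minimal reproducer for this kind of mismatch (illustrative, not the
Firefox code) is two translation units that define the same class with
the same vtable size but a different virtual member, linked with -flto:

// a.cc
struct S { virtual int f (); virtual int g (); };
int S::f () { return 1; }
int S::g () { return 2; }
S *make_a () { return new S; }

// b.cc - same class name, but the second vtable slot refers to a
// different method, so the merged virtual tables disagree.
struct S { virtual int f (); virtual int h (); };
int S::f () { return 1; }
int S::h () { return 3; }
S *make_b () { return new S; }

When the two copies of S's virtual table meet at LTO time, the
field-by-field walk finds entries referring to different functions,
which is the situation the new warning is meant to diagnose.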

The patch is not terribly pretty, but does the job.  I will wait for a few
days for comments/suggestions and then plan to commit it.

PR lto/59468
* lto/lto-symtab.c (lto_symtab_prevailing_decl): Check
virtual tables for ODR violations.
* ipa-devirt.c (compare_virtual_tables): New function.
* ipa-utils.h (compare_virtual_tables): Declare.
Index: lto/lto-symtab.c
===
--- lto/lto-symtab.c(revision 207702)
+++ lto/lto-symtab.c(working copy)
@@ -679,5 +679,13 @@ lto_symtab_prevailing_decl (tree decl)
   if (!ret)
 return decl;
 
+  /* Check and report ODR violations on virtual tables.  */
+  if (decl != ret-decl  DECL_VIRTUAL_P (decl))
+{
+  compare_virtual_tables (ret-decl, decl);
+  /* We are done with checking and DECL will die after merigng.  */
+  DECL_VIRTUAL_P (decl) = 0;
+}
+
   return ret-decl;
 }
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 207702)
+++ ipa-devirt.c(working copy)
@@ -1675,6 +1673,101 @@ update_type_inheritance_graph (void)
 }
 
 
+/* Compare two virtual tables, PREVAILING and VTABLE and output ODR
+   violation warings.  */
+
+void
+compare_virtual_tables (tree prevailing, tree vtable)
+{
+  tree init1 = DECL_INITIAL (prevailing), init2 = DECL_INITIAL (vtable);
+  if (init1 == init2)
+return;
+  if (init2 == error_mark_node)
+return;
+  /* Be sure to keep virtual table contents even for external
+ vtables when they are available.  */
+  if (init1 == error_mark_node || !init1)
+{
+  DECL_INITIAL (prevailing) = DECL_INITIAL (vtable);
+  return;
+}
+  if (!init2  DECL_EXTERNAL (vtable))
+return;
+  if (DECL_VIRTUAL_P (prevailing)  init1  init2
+   CONSTRUCTOR_NELTS (init1) == CONSTRUCTOR_NELTS (init2))
+{
+  unsigned i;
+  tree field1, field2;
+  tree val1, val2;
+  bool matched = true;
+
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init1),
+i, field1, val1)
+   {
+ gcc_assert (!field1);
+ field2 = CONSTRUCTOR_ELT (init2, i)-index;
+ val2 = CONSTRUCTOR_ELT (init2, i)-value;
+ gcc_assert (!field2);
+ if (val2 == val1)
+   continue;
+ STRIP_NOPS (val1);
+ STRIP_NOPS (val2);
+ /* Unwind
+ val addr_expr type pointer_type
+  readonly constant
+  arg 0 mem_ref type pointer_type __vtbl_ptr_type
+  readonly
+  arg 0 addr_expr type pointer_type
+  arg 0 var_decl _ZTCSd0_Si arg 1 integer_cst 24 */
+
+ while ((TREE_CODE (val1) == MEM_REF
+  TREE_CODE (val2) == MEM_REF
+  (TREE_OPERAND (val1, 1)
+ == TREE_OPERAND (val2, 1)))
+|| (TREE_CODE (val1) == ADDR_EXPR
+ TREE_CODE (val2) == ADDR_EXPR))
+   {
+ val1 = TREE_OPERAND (val1, 0);
+ val2 = TREE_OPERAND (val2, 0);
+   }
+ if (val1 == val2)
+   continue;
+ if (TREE_CODE (val2) == ADDR_EXPR)
+   {
+ tree tmp = val1;
+ val1 = val2;
+ val2 = tmp;
+}
+ /* Allow combining RTTI and non-RTTI is OK.  */
+ if (TREE_CODE (val1) == ADDR_EXPR
+  TREE_CODE (TREE_OPERAND (val1, 0)) == VAR_DECL
+  !DECL_VIRTUAL_P (TREE_OPERAND (val1, 0))
+  TREE_CODE (val2) == NOP_EXPR
+  integer_zerop (TREE_OPERAND (val2, 0)))
+   continue;
+ if (TREE_CODE (val1) == NOP_EXPR
+  TREE_CODE (val2) == NOP_EXPR
+  integer_zerop (TREE_OPERAND (val1, 0))
+  integer_zerop 

Re: Warn about virtual table mismatches

2014-02-11 Thread Jason Merrill

On 02/11/2014 07:54 PM, Jan Hubicka wrote:

+ /* Allow combining RTTI and non-RTTI is OK.  */


You mean combining -frtti and -fno-rtti compiles?  Yes, that's fine, 
though you need to prefer the -frtti version in case code from that 
translation unit uses the RTTI info.



/aux/hubicka/firefox/accessible/src/generic/DocAccessible.cpp:1232:0: note: the 
first different method is ‘HandleAccEvent’


I don't see where this note would come from in the patch.

Jason



Re: Warn about virtual table mismatches

2014-02-11 Thread Jan Hubicka
 On 02/11/2014 07:54 PM, Jan Hubicka wrote:
 +  /* Allow combining RTTI and non-RTTI is OK.  */
 
 You mean combining -frtti and -fno-rtti compiles?  Yes, that's fine,
 though you need to prefer the -frtti version in case code from that
 translation unit uses the RTTI info.

Is there some mechanism so that the linker will do so? At the moment we just
choose the variant that would be selected by the linker.  I can make the
choice, but what happens with non-LTO?
 
 /aux/hubicka/firefox/accessible/src/generic/DocAccessible.cpp:1232:0: note: 
 the first different method is ‘HandleAccEvent’
 
 I don't see where this note would come from in the patch.
 
Sorry, diffed old tree

Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 207702)
+++ ipa-devirt.c(working copy)
@@ -1675,6 +1675,132 @@
 }
 
 
+/* Compare two virtual tables, PREVAILING and VTABLE and output ODR
+   violation warings.  */
+
+void
+compare_virtual_tables (tree prevailing, tree vtable)
+{
+  tree init1 = DECL_INITIAL (prevailing), init2 = DECL_INITIAL (vtable);
+  tree val1 = NULL, val2 = NULL;
+  if (!DECL_VIRTUAL_P (prevailing))
+{
+  odr_violation_reported = true;
+  if (warning_at (DECL_SOURCE_LOCATION (prevailing), 0,
+declaration %D conflict with a virtual table,
+prevailing))
+   inform (DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT (vtable))),
+   a type defining the virtual table in another translation 
unit);
+  return;
+}
+  if (init1 == init2 || init2 == error_mark_node)
+return;
+  /* Be sure to keep virtual table contents even for external
+ vtables when they are available.  */
+  gcc_assert (init1  init1 != error_mark_node);
+  if (!init2  DECL_EXTERNAL (vtable))
+return;
+  if (init1  init2
+   CONSTRUCTOR_NELTS (init1) == CONSTRUCTOR_NELTS (init2))
+{
+  unsigned i;
+  tree field1, field2;
+  bool matched = true;
+
+  FOR_EACH_CONSTRUCTOR_ELT (CONSTRUCTOR_ELTS (init1),
+i, field1, val1)
+   {
+ gcc_assert (!field1);
+ field2 = CONSTRUCTOR_ELT (init2, i)-index;
+ val2 = CONSTRUCTOR_ELT (init2, i)-value;
+ gcc_assert (!field2);
+ if (val2 == val1)
+   continue;
+ if (TREE_CODE (val1) == NOP_EXPR)
+   val1 = TREE_OPERAND (val1, 0);
+ if (TREE_CODE (val2) == NOP_EXPR)
+   val2 = TREE_OPERAND (val2, 0);
+ /* Unwind
+ val addr_expr type pointer_type
+  readonly constant
+  arg 0 mem_ref type pointer_type __vtbl_ptr_type
+  readonly
+  arg 0 addr_expr type pointer_type
+  arg 0 var_decl _ZTCSd0_Si arg 1 integer_cst 24 */
+
+ while (TREE_CODE (val1) == TREE_CODE (val2)
+ (((TREE_CODE (val1) == MEM_REF
+  || TREE_CODE (val1) == POINTER_PLUS_EXPR)
+  (TREE_OPERAND (val1, 1))
+   == TREE_OPERAND (val2, 1))
+|| TREE_CODE (val1) == ADDR_EXPR))
+   {
+ val1 = TREE_OPERAND (val1, 0);
+ val2 = TREE_OPERAND (val2, 0);
+ if (TREE_CODE (val1) == NOP_EXPR)
+   val1 = TREE_OPERAND (val1, 0);
+ if (TREE_CODE (val2) == NOP_EXPR)
+   val2 = TREE_OPERAND (val2, 0);
+   }
+ if (val1 == val2)
+   continue;
+ if (TREE_CODE (val2) == ADDR_EXPR)
+   {
+ tree tmp = val1;
+ val1 = val2;
+ val2 = tmp;
+}
+ /* Combining RTTI and non-RTTI vtables is OK.
+??? Perhaps we can also (optionally) warn here?  */
+ if (TREE_CODE (val1) == ADDR_EXPR
+  TREE_CODE (TREE_OPERAND (val1, 0)) == VAR_DECL
+  !DECL_VIRTUAL_P (TREE_OPERAND (val1, 0))
+  integer_zerop (val2))
+   continue;
+ /* For some reason zeros gets NOP_EXPR wrappers.  */
+ if (integer_zerop (val1)
+  integer_zerop (val2))
+   continue;
+ /* Compare assembler names; this function is run during
+declaration merging.  */
+ if (DECL_P (val1)  DECL_P (val2)
+  DECL_ASSEMBLER_NAME (val1) == DECL_ASSEMBLER_NAME (val2))
+   continue;
+ matched = false;
+ break;
+   }
+  if (!matched)
+   {
+ odr_violation_reported = true;
+ if (warning_at (DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT 
(vtable))), 0,
+type %qD violates one definition rule,
+DECL_CONTEXT (vtable)))
+   {
+ inform (DECL_SOURCE_LOCATION (TYPE_NAME (DECL_CONTEXT 
(prevailing))),
+ a type with the same name but different virtual table is 

+ defined in another translation unit);
+  

[Ping] Re: [C++ Patch/RFC] PR 60047

2014-02-11 Thread Paolo Carlini

Hi,

the last version of this work is unreviewed, should be rather 
straightforward...


Paolo.


Re: [PATCH] Documentation for dump and optinfo output

2014-02-11 Thread Thomas Schwinge
Hi!

On Wed, 5 Feb 2014 16:33:19 -0800, Sharad Singhai sing...@google.com wrote:
 I am really sorry about the delay.

No worries; not exactly a severe issue.  ;-)

 I couldn't exactly reproduce the
 warning which you described

Maybe the version of makeinfo is relevant?

$ makeinfo --version | head -n 1
makeinfo (GNU texinfo) 5.1

 but I found a place where two nodes were
 out of order in optinfo.texi. Could you please apply the following
 patch and see if the problem goes away? If it works for you, I will
 commit the doc fixes.
 
 Also I would appreciate the exact command which produces these warnings.

Your patch does fix the problem; see the following diff of the build log,
where the warnings are now gone, and which also happens to contain the
makeinfo command line.

@@ -4199,12 +4199,6 @@ if [ xinfo = xinfo ]; then \
makeinfo --split-size=500 --split-size=500 --no-split 
-I . -I ../../source/gcc/doc \
-I ../../source/gcc/doc/include -o doc/gccint.info 
../../source/gcc/doc/gccint.texi; \
fi
-../../source/gcc/doc/optinfo.texi:45: warning: node next `Optimization groups' 
in menu `Dump output verbosity' and in sectioning `Dump files and streams' 
differ
-../../source/gcc/doc/optinfo.texi:77: warning: node next `Dump files and 
streams' in menu `Dump types' and in sectioning `Dump output verbosity' differ
-../../source/gcc/doc/optinfo.texi:77: warning: node prev `Dump files and 
streams' in menu `Dump output verbosity' and in sectioning `Optimization 
groups' differ
-../../source/gcc/doc/optinfo.texi:104: warning: node next `Dump output 
verbosity' in menu `Dump files and streams' and in sectioning `Dump types' 
differ
-../../source/gcc/doc/optinfo.texi:104: warning: node prev `Dump output 
verbosity' in menu `Optimization groups' and in sectioning `Dump files and 
streams' differ
-../../source/gcc/doc/optinfo.texi:137: warning: node prev `Dump types' in menu 
`Dump files and streams' and in sectioning `Dump output verbosity' differ
 if [ xinfo = xinfo ]; then \
makeinfo --split-size=500 --split-size=500 --no-split 
-I ../../source/gcc/doc \
-I ../../source/gcc/doc/include -o doc/gccinstall.info 
../../source/gcc/doc/install.texi; \

 * doc/optinfo.texi: Fix order of nodes.

Thanks, please commit.


Regards,
 Thomas

