[PING][PATCH, ARM] Fix PR63718, Thumb1 bootstrap -- disable fuse-caller-save for Thumb1

2014-12-02 Thread Tom de Vries

On 20-11-14 12:54, Tom de Vries wrote:

Richard,

This patch fixes PR63718, which currently breaks Thumb1 bootstrap.

The problem is that in Thumb1 mode, we emit the epilogue in RTL, but the last
insn - epilogue_insns - does not accurately model the corresponding insns
emitted in the asm file. F.i., the asm file may contain an insn:
...
   pop {r0}

while the corresponding RTL pattern looks like this:
...
(jump_insn (unspec_volatile [
 (return)
  ] VUNSPEC_EPILOGUE))
...

As a consequence, the epilogue may clobber registers without fuse-caller-save
being able to analyze that.

Adding the missing clobbers to epilogue_insns is not trivial, and probably not a
good idea for stage3. The patch works around the problem by disabling
fuse-caller-save in Thumb1 mode.

Build and reg-tested on arm-none-eabi.

OK for stage3?



Ping.

Thanks,
- Tom



Re: [PATCH] Fix PR64126

2014-12-02 Thread Richard Biener
On Mon, 1 Dec 2014, Marc Glisse wrote:

 On Mon, 1 Dec 2014, Richard Biener wrote:
 
  The following fixes PR64126.
  
  Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.
  
  Richard.
  
  2014-12-01  Richard Biener  rguent...@suse.de
  
  PR middle-end/64126
  * match.pd: Allow conversions in ~A + 1 - -A, add -A - 1 - ~A
  and -1 - A - ~A.
  * fold-const.c (fold_binary_loc): Remove transforms here.
  
  Index: gcc/match.pd
  ===
  --- gcc/match.pd(revision 218144)
  +++ gcc/match.pd(working copy)
  @@ -484,8 +522,22 @@ (define_operator_list inverted_tcc_compa
  
/* ~A + 1 - -A */
(simplify
  -   (plus (bit_not @0) integer_each_onep)
  -   (negate @0))
  +   (plus (convert? (bit_not @0)) integer_each_onep)
  +   (if (tree_nop_conversion_p (type, TREE_TYPE (@0)))
  +(negate (convert @0
  +
  +  /* -A - 1 - ~A */
  +  (simplify
  +   (minus (convert? (negate @0)) integer_each_onep)
  +   (if (!TYPE_OVERFLOW_TRAPS (type)
 
 I don't understand why TYPE_OVERFLOW_TRAPS is tested for this one but not the
 others.

No idea - I just copied from fold-const.c.  The whole -ftrapv and
now -fsanitize=overflow stuff is just very inconsistently checked.

  +tree_nop_conversion_p (type, TREE_TYPE (@0)))
  +(bit_not (convert @0
  +
  +  /* -1 - A - ~A */
  +  (simplify
  +   (minus integer_all_onesp @0)
  +   (if (TREE_CODE (type) != COMPLEX_TYPE)
  +(bit_not @0)))
 
 It should also be true for COMPLEX_TYPE where integer_all_onesp tests for
 -1-i.

Yeah, I have no idea why it was disabled for complex.  Maybe expansion
doesn't deal with ~complex properly (on some targets?).  Not sure.

Didn't bother to investigate as ...

 (I know you are just copying from fold-const)

... I was just copying from fold-const.c ;)

Richard.


Re: [PATCH] Fix PR64137

2014-12-02 Thread Richard Biener
On Mon, 1 Dec 2014, Richard Biener wrote:

 On Mon, 1 Dec 2014, FX wrote:
 
  Your change is OK (we don’t want to use the type of the result, but the 
  type of the argument indeed).
  
  
   Index: gcc/fortran/trans-intrinsic.c
   ===
   --- gcc/fortran/trans-intrinsic.c (revision 218211)
   +++ gcc/fortran/trans-intrinsic.c (working copy)
   @@ -3729,7 +3729,7 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * s
possible value is HUGE in both cases.  */
 if (op == GT_EXPR)
   tmp = fold_build1_loc (input_location, NEGATE_EXPR, TREE_TYPE (tmp), 
   tmp);
   -  if (op == GT_EXPR  expr-ts.type == BT_INTEGER)
   +  if (op == GT_EXPR  arrayexpr-ts.type == BT_INTEGER)
   tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (tmp), 
   tmp,
build_int_cst (type, 1));
  
  
  Logic would dictate that it is build_int_cst (TREE_TYPE (tmp), 1)” 
  instead of build_int_cst (type, 1)” in that last line. Probably doesn’t 
  matter much, as it will be all folded into the same value anyway, but 
  could you test and commit that change together, while you’re at it?
 
 Sure, will re-test and commit tomorrow.

The following is what I have applied after bootstrap  regtest on
x86_64-unknown-linux-gnu.

Richard.

2014-12-02  Richard Biener  rguent...@suse.de

PR fortran/64137
* trans-intrinsic.c (gfc_conv_intrinsic_minmaxloc): Check
proper expressions type, use proper type for computing
-Huge - 1.

Index: gcc/fortran/trans-intrinsic.c
===
--- gcc/fortran/trans-intrinsic.c   (revision 218225)
+++ gcc/fortran/trans-intrinsic.c   (working copy)
@@ -3729,9 +3729,9 @@ gfc_conv_intrinsic_minmaxloc (gfc_se * s
  possible value is HUGE in both cases.  */
   if (op == GT_EXPR)
 tmp = fold_build1_loc (input_location, NEGATE_EXPR, TREE_TYPE (tmp), tmp);
-  if (op == GT_EXPR  expr-ts.type == BT_INTEGER)
+  if (op == GT_EXPR  arrayexpr-ts.type == BT_INTEGER)
 tmp = fold_build2_loc (input_location, MINUS_EXPR, TREE_TYPE (tmp), tmp,
-  build_int_cst (type, 1));
+  build_int_cst (TREE_TYPE (tmp), 1));
 
   gfc_add_modify (se-pre, limit, tmp);
 

Re: [PATCH] Fix PR15346

2014-12-02 Thread Richard Biener
On Mon, 1 Dec 2014, Joseph Myers wrote:

 On Mon, 1 Dec 2014, Richard Biener wrote:
 
  +/* Combine two successive divisions.  */
  +(for div (trunc_div ceil_div floor_div round_div exact_div)
 
 This doesn't seem correct for all kinds of division and signedness of 
 arguments.
 
 TRUNC_DIV_EXPR (C division) and EXACT_DIV_EXPR should be OK regardless.  
 But for CEIL_DIV_EXPR and FLOOR_DIV_EXPR, in ((a / b) / c) you need c to 
 be positive so that both roundings are in the same direction (consider 
 e.g. 3 FLOOR_DIV -2 FLOOR_DIV -2 == 1, but 3 FLOOR_DIV 4 == 0).  And for 
 ROUND_DIV_EXPR, it can be incorrect whatever the signs (e.g. 2 ROUND_DIV 3 
 ROUND_DIV 2 == 1, but 2 ROUND_DIV 6 == 0).
 
 I'm not sure if these forms of division actually occur in places where 
 this could cause a problem, but it does look like Ada may enable you to 
 generate ROUND_DIV_EXPR.

Hmm.  I thought I was following what extract_muldiv_1 does (but of course
that function is quite hard to follow).

case TRUNC_DIV_EXPR:  case CEIL_DIV_EXPR:  case FLOOR_DIV_EXPR:
case ROUND_DIV_EXPR:  case EXACT_DIV_EXPR:
...
  /* If these are the same operation types, we can associate them
 assuming no overflow.  */
  if (tcode == code)
{
  bool overflow_p = false;
  bool overflow_mul_p;
  signop sign = TYPE_SIGN (ctype);
  wide_int mul = wi::mul (op1, c, sign, overflow_mul_p);
  overflow_p = TREE_OVERFLOW (c) | TREE_OVERFLOW (op1);
  if (overflow_mul_p
   ((sign == UNSIGNED  tcode != MULT_EXPR) || sign == 
SIGNED))
overflow_p = true;
  if (!overflow_p)
return fold_build2 (tcode, ctype, fold_convert (ctype, op0),
wide_int_to_tree (ctype, mul));

but yeah, I see you are correct.  I'll restrict the pattern to
TRUNC_DIV_EXPR and EXACT_DIV_EXPR which are the only ones I've
ever seen generated from C.

I don't know how to generate testcases for the other division
opcodes - eventually we should have builtins just for the sake
of being able to generate testcases...

Committed as obvoious.

Thanks,
Richard.

2014-12-02  Richard Biener  rguent...@suse.de

* match.pd: Restrict division combining to trunc_div and
exact_div.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 218258)
+++ gcc/match.pd(working copy)
@@ -129,8 +129,9 @@ (define_operator_list inverted_tcc_compa
TYPE_UNSIGNED (type))
   (trunc_div @0 @1)))
 
-/* Combine two successive divisions.  */
-(for div (trunc_div ceil_div floor_div round_div exact_div)
+/* Combine two successive divisions.  Note that combining ceil_div
+   and floor_div is trickier and combining round_div even more so.  */
+(for div (trunc_div exact_div)
  (simplify
   (div (div @0 INTEGER_CST@1) INTEGER_CST@2)
   (with {


Re: [PATCH] Fix PR15346

2014-12-02 Thread Richard Biener
On Mon, 1 Dec 2014, Marc Glisse wrote:

 On Mon, 1 Dec 2014, Richard Biener wrote:
 
  The following fixes the oldest bug in my list of assigned PRs, namely
  combining A / CST / CST' to A / (CST * CST').  extract_muldiv does
  this in fold-const.c (but it fails to optimize to zero when CST * CST'
  overflows).
  
  Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
  
  Richard.
  
  2014-12-01  Richard Biener  rguent...@suse.de
  
  PR tree-optimization/15346
  * Makefile.in (gimple-match.o-warn): Remove -Wno-unused-parameter,
  add -Wno-unused-but-set-variable.
  * match.pd: Combine two successive divisions.
  
  * gcc.dg/tree-ssa/forwprop-32.c: New testcase.
  
  Index: gcc/match.pd
  ===
  --- gcc/match.pd(revision 218140)
  +++ gcc/match.pd(working copy)
  @@ -129,6 +129,19 @@ (define_operator_list inverted_tcc_compa
 TYPE_UNSIGNED (type))
(trunc_div @0 @1)))
  
  +/* Combine two successive divisions.  */
  +(for div (trunc_div ceil_div floor_div round_div exact_div)
  + (simplify
  +  (div (div @0 INTEGER_CST@1) INTEGER_CST@2)
  +  (with {
  +bool overflow_p;
  +wide_int mul = wi::mul (@1, @2, TYPE_SIGN (type), overflow_p);
  +   }
  +   (if (!overflow_p)
  +(div @0 { wide_int_to_tree (type, mul); }))
  +   (if (overflow_p)
  +{ build_zero_cst (type); }
  +
 
 Can't you have something like:
 INT_MIN / 2 / (INT_MIN / -2) == -1
 where
 2 * (INT_MIN / -2) overflows?

Hmm, right.  It should work if overflow_p  mul != INT_MIN as far
as I can see.

I am testing the following.

Thanks,
Richard.

2014-12-02  Richard Biener  rguent...@suse.de

* match.pd: When combining divisions exclude the degenerate
case involving INT_MIN from overflow handling.

* gcc.dg/torture/20141202-1.c: New testcase.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 218260)
+++ gcc/match.pd(working copy)
@@ -140,7 +140,9 @@ (define_operator_list inverted_tcc_compa
}
(if (!overflow_p)
 (div @0 { wide_int_to_tree (type, mul); }))
-   (if (overflow_p)
+   (if (overflow_p
+ (TYPE_UNSIGNED (type)
+   || mul != wi::min_value (TYPE_PRECISION (type), SIGNED)))
 { build_zero_cst (type); }
 
 /* Optimize A / A to 1.0 if we don't care about
Index: gcc/testsuite/gcc.dg/torture/20141202-1.c
===
--- gcc/testsuite/gcc.dg/torture/20141202-1.c   (revision 0)
+++ gcc/testsuite/gcc.dg/torture/20141202-1.c   (working copy)
@@ -0,0 +1,15 @@
+/* { dg-do run } */
+
+extern void abort (void);
+
+int foo (int x)
+{
+  return (x / 2) / ((-__INT_MAX__ - 1) / -2);
+}
+
+int main()
+{
+  if (foo (- __INT_MAX__ - 1) != -1)
+abort ();
+  return 0;
+}




Re: [PATCH] Adjust gimple_build_assign* gimple.texi documentation

2014-12-02 Thread Richard Biener
On Tue, 2 Dec 2014, Jakub Jelinek wrote:

 Hi!
 
 Here is an attempt to adjust gimple_build_assign* documentation.
 Ok for trunk?

Ok.

Thanks,
Richard.

 Note, apparently the documentation has not been adjusted for
 the gimple - gassign * etc. changes, David, can you please work on
 adjusting gimple.texi to match the reality after your changes?
 
 2014-12-02  Jakub Jelinek  ja...@redhat.com
 
   * doc/gimple.texi (gimple_build_assign_with_ops): Remove.
   (gimple_build_assign): Document the new overloads.
 
 --- gcc/doc/gimple.texi.jj2014-11-20 17:06:24.0 +0100
 +++ gcc/doc/gimple.texi   2014-12-02 08:27:06.898386806 +0100
 @@ -1124,9 +1124,35 @@ already have a tree expression that you
  tuple.  However, try to avoid building expression trees for the
  sole purpose of calling this function.  If you already have the
  operands in separate trees, it is better to use
 -@code{gimple_build_assign_with_ops}.
 +@code{gimple_build_assign} with @code{enum tree_code} argument and separate
 +arguments for each operand.
  @end deftypefn
  
 +@deftypefn {GIMPLE function} gimple gimple_build_assign @
 +(tree lhs, enum tree_code subcode, tree op1, tree op2, tree op3)
 +This function is similar to two operand @code{gimple_build_assign},
 +but is used to build a @code{GIMPLE_ASSIGN} statement when the operands of 
 the
 +right-hand side of the assignment are already split into
 +different operands.
 +
 +The left-hand side is an lvalue passed in lhs.  Subcode is the
 +@code{tree_code} for the right-hand side of the assignment.  Op1, op2 and op3
 +are the operands.
 +@end deftypefn
 +
 +@deftypefn {GIMPLE function} gimple gimple_build_assign @
 +(tree lhs, enum tree_code subcode, tree op1, tree op2)
 +Like the above 5 operand @code{gimple_build_assign}, but with the last
 +argument @code{NULL} - this overload should not be used for
 +@code{GIMPLE_TERNARY_RHS} assignments.
 +@end deftypefn
 +
 +@deftypefn {GIMPLE function} gimple gimple_build_assign @
 +(tree lhs, enum tree_code subcode, tree op1)
 +Like the above 4 operand @code{gimple_build_assign}, but with the last
 +argument @code{NULL} - this overload should be used only for
 +@code{GIMPLE_UNARY_RHS} and @code{GIMPLE_SINGLE_RHS} assignments.
 +@end deftypefn
  
  @deftypefn {GIMPLE function} gimple gimplify_assign (tree dst, tree src, 
 gimple_seq *seq_p)
  Build a new @code{GIMPLE_ASSIGN} tuple and append it to the end of
 @@ -1139,19 +1165,6 @@ case they will be converted to a gimple
  
  This function returns the newly created @code{GIMPLE_ASSIGN} tuple.
  
 -@deftypefn {GIMPLE function} gimple gimple_build_assign_with_ops @
 -(enum tree_code subcode, tree lhs, tree op1, tree op2)
 -This function is similar to @code{gimple_build_assign}, but is used to
 -build a @code{GIMPLE_ASSIGN} statement when the operands of the
 -right-hand side of the assignment are already split into
 -different operands.
 -
 -The left-hand side is an lvalue passed in lhs.  Subcode is the
 -@code{tree_code} for the right-hand side of the assignment.  Op1 and op2
 -are the operands.  If op2 is null, subcode must be a @code{tree_code}
 -for a unary expression.
 -@end deftypefn
 -
  @deftypefn {GIMPLE function} {enum tree_code} gimple_assign_rhs_code (gimple 
 g)
  Return the code of the expression computed on the @code{RHS} of
  assignment statement @code{G}.
 
   Jakub
 
 

-- 
Richard Biener rguent...@suse.de
SUSE LINUX GmbH, GF: Jeff Hawn, Jennifer Guild, Felix Imendoerffer, HRB 21284
(AG Nuernberg)
Maxfeldstrasse 5, 90409 Nuernberg, Germany


[match-and-simplify] Merge from trunk

2014-12-02 Thread Richard Biener

2014-12-02  Richard Biener  rguent...@suse.de

Merge from trunk r217561 through r218261.



[PATCH, alpha]: Fix PR 64113, Error: No lda !gpdisp!282 was found

2014-12-02 Thread Uros Bizjak
Hello!

As mentioned in the PR, we need accurate gp live analysis to split
__tls_get_addr call. Move split from post-reload splitter to peephole2
pass, as is the case with other call insns.

2014-12-02  Uros Bizjak  ubiz...@gmail.com

PR target/64113
* config/alpha/alpha.md (call_value_osf_tlsgd): Do not split insn
using post-reload splitter.  Use peephole2 pass instead.
(call_value_osf_tlsldm): Ditto.
(TLS_CALL): New int iterator.
(tls): New int attribute.
(call_value_osf_tls): Merge insn pattern from call_value_osf_tlsgd
and call_value_tlsldm using TLS_CALL int iterator.

Bootstrapped and regression tested on alphaev68-linux-gnu, also tested
by building systemd-217 that failed to compile without the patch.

Committed to mainline SVN, will be committed to 4.9 branch.

Uros.
Index: config/alpha/alpha.md
===
--- config/alpha/alpha.md   (revision 218160)
+++ config/alpha/alpha.md   (working copy)
@@ -5984,16 +5984,38 @@
   [(set_attr type jsr)
(set_attr length *,*,8)])
 
-(define_insn_and_split call_value_osf_tlsgd
+(define_int_iterator TLS_CALL
+   [UNSPEC_TLSGD_CALL
+UNSPEC_TLSLDM_CALL])
+
+(define_int_attr tls
+   [(UNSPEC_TLSGD_CALL tlsgd)
+(UNSPEC_TLSLDM_CALL tlsldm)])
+
+(define_insn call_value_osf_tls
   [(set (match_operand 0)
(call (mem:DI (match_operand:DI 1 symbolic_operand))
  (const_int 0)))
-   (unspec [(match_operand:DI 2 const_int_operand)] UNSPEC_TLSGD_CALL)
+   (unspec [(match_operand:DI 2 const_int_operand)] TLS_CALL)
(use (reg:DI 29))
(clobber (reg:DI 26))]
   HAVE_AS_TLS
-  #
-   reload_completed
+  ldq $27,%1($29)\t\t!literal!%2\;jsr $26,($27),%1\t\t!lituse_tls!%2\;ldah 
$29,0($26)\t\t!gpdisp!%*\;lda $29,0($29)\t\t!gpdisp!%*
+  [(set_attr type jsr)
+   (set_attr length 16)])
+
+;; We must use peep2 instead of a split because we need accurate life
+;; information for $gp.
+(define_peephole2
+  [(parallel
+[(set (match_operand 0)
+ (call (mem:DI (match_operand:DI 1 symbolic_operand))
+   (const_int 0)))
+ (unspec [(match_operand:DI 2 const_int_operand)] TLS_CALL)
+ (use (reg:DI 29))
+ (clobber (reg:DI 26))])]
+  HAVE_AS_TLS  reload_completed
+peep2_regno_dead_p (1, 29)
   [(set (match_dup 3)
(unspec:DI [(match_dup 5)
(match_dup 1)
@@ -6001,10 +6023,9 @@
(parallel [(set (match_dup 0)
   (call (mem:DI (match_dup 3))
 (const_int 0)))
- (set (match_dup 5)
-  (unspec:DI [(match_dup 5) (match_dup 4)] UNSPEC_LDGP1))
+ (use (match_dup 5))
  (use (match_dup 1))
- (use (unspec [(match_dup 2)] UNSPEC_TLSGD_CALL))
+ (use (unspec [(match_dup 2)] TLS_CALL))
  (clobber (reg:DI 26))])
(set (match_dup 5)
(unspec:DI [(match_dup 5) (match_dup 4)] UNSPEC_LDGP2))]
@@ -6012,19 +6033,18 @@
   operands[3] = gen_rtx_REG (Pmode, 27);
   operands[4] = GEN_INT (alpha_next_sequence_number++);
   operands[5] = pic_offset_table_rtx;
-}
-  [(set_attr type multi)])
+})
 
-(define_insn_and_split call_value_osf_tlsldm
-  [(set (match_operand 0)
-   (call (mem:DI (match_operand:DI 1 symbolic_operand))
- (const_int 0)))
-   (unspec [(match_operand:DI 2 const_int_operand)] UNSPEC_TLSLDM_CALL)
-   (use (reg:DI 29))
-   (clobber (reg:DI 26))]
-  HAVE_AS_TLS
-  #
-   reload_completed
+(define_peephole2
+  [(parallel
+[(set (match_operand 0)
+ (call (mem:DI (match_operand:DI 1 symbolic_operand))
+   (const_int 0)))
+ (unspec [(match_operand:DI 2 const_int_operand)] TLS_CALL)
+ (use (reg:DI 29))
+ (clobber (reg:DI 26))])]
+  HAVE_AS_TLS  reload_completed
+!peep2_regno_dead_p (1, 29)
   [(set (match_dup 3)
(unspec:DI [(match_dup 5)
(match_dup 1)
@@ -6035,7 +6055,7 @@
  (set (match_dup 5)
   (unspec:DI [(match_dup 5) (match_dup 4)] UNSPEC_LDGP1))
  (use (match_dup 1))
- (use (unspec [(match_dup 2)] UNSPEC_TLSLDM_CALL))
+ (use (unspec [(match_dup 2)] TLS_CALL))
  (clobber (reg:DI 26))])
(set (match_dup 5)
(unspec:DI [(match_dup 5) (match_dup 4)] UNSPEC_LDGP2))]
@@ -6043,8 +6063,7 @@
   operands[3] = gen_rtx_REG (Pmode, 27);
   operands[4] = GEN_INT (alpha_next_sequence_number++);
   operands[5] = pic_offset_table_rtx;
-}
-  [(set_attr type multi)])
+})
 
 (define_insn *call_value_osf_1
   [(set (match_operand 0)


Re: [Patch] Improving jump-thread pass for PR 54742

2014-12-02 Thread Richard Biener
On Mon, Dec 1, 2014 at 10:06 PM, Jeff Law l...@redhat.com wrote:
 On 11/25/14 14:16, Sebastian Pop wrote:

 Sebastian Pop wrote:

 I will bootstrap and regression test this patch on x86_64-linux and
 powerpc64-linux.  I will also run it on our internal benchmarks,
  coremark, and
 the llvm test-suite.
 
 I will also include a longer testcase that makes sure we do not regress
  on
 coremark.

 Done all the above.  Attached is the new patch with a new testcase.  I
 have also
 added verify_seme inspired by the recent patch adding verify_sese.

 Sebastian


 0001-extend-jump-thread-for-finite-state-automata-PR-5474.patch


  From ca222d5222fb976c7aa258d3e3c04e593f42f7a2 Mon Sep 17 00:00:00 2001
 From: Sebastian Pops@samsung.com
 Date: Fri, 26 Sep 2014 14:54:20 -0500
 Subject: [PATCH] extend jump thread for finite state automata PR 54742

 Adapted from a patch from James Greenhalgh.

 * params.def (max-fsm-thread-path-insns, max-fsm-thread-length,
 max-fsm-thread-paths): New.

 * doc/invoke.texi (max-fsm-thread-path-insns,
 max-fsm-thread-length,
 max-fsm-thread-paths): Documented.

 * tree-cfg.c (split_edge_bb_loc): Export.
 * tree-cfg.h (split_edge_bb_loc): Declared extern.

 * tree-ssa-threadedge.c (simplify_control_stmt_condition): Restore
 the
 original value of cond when simplification fails.
 (fsm_find_thread_path): New.
 (fsm_find_control_statement_thread_paths): New.
 (fsm_thread_through_normal_block): Call
 find_control_statement_thread_paths.

 * tree-ssa-threadupdate.c (dump_jump_thread_path): Pretty print
 EDGE_START_FSM_THREAD.
 (verify_seme): New.
 (duplicate_seme_region): New.
 (thread_through_all_blocks): Generate code for
 EDGE_START_FSM_THREAD edges
 calling gimple_duplicate_sese_region.

 * tree-ssa-threadupdate.h (jump_thread_edge_type): Add
 EDGE_START_FSM_THREAD.

 * testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c: New.
 * testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c: New.
 ---
   gcc/doc/invoke.texi  |   12 ++
   gcc/params.def   |   15 ++
   gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c |   43 +
   gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c |  127 +
   gcc/tree-cfg.c   |2 +-
   gcc/tree-cfg.h   |1 +
   gcc/tree-ssa-threadedge.c|  215
 +-
   gcc/tree-ssa-threadupdate.c  |  201
 +++-
   gcc/tree-ssa-threadupdate.h  |1 +
   9 files changed, 614 insertions(+), 3 deletions(-)
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c
   create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-7.c

 diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
 index 89edddb..074183f 100644
 --- a/gcc/doc/invoke.texi
 +++ b/gcc/doc/invoke.texi
 @@ -10624,6 +10624,18 @@ large and significantly increase compile time at
 optimization level
   @option{-O1} and higher.  This parameter is a maximum nubmer of
 statements
   in a single generated constructor.  Default value is 5000.

 +@item max-fsm-thread-path-insns
 +Maximum number of instructions to copy when duplicating blocks on a
 +finite state automaton jump thread path.  The default is 100.
 +
 +@item max-fsm-thread-length
 +Maximum number of basic blocks on a finite state automaton jump thread
 +path.  The default is 10.
 +
 +@item max-fsm-thread-paths
 +Maximum number of new jump thread paths to create for a finite state
 +automaton.  The default is 50.

 Has there been any tuning on these defaults.  I don't have any strong
 opinions about what they ought to be, this is more to get any such
 information recorded on the lists for historical purposes.

 I think it's worth a note in the debug dump anytime you abort threading when
 you hit a limit.

 I'm a bit worried about compile-time impacts of the all the recursion, but
 I'm willing to wait and see if it turns out to be a problem in practice.

Please consider restricting it to -fexpensive-optimizations (-O2+).

Richard.


 diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
 index 8b0b7b8..c9fe212 100644
 --- a/gcc/tree-ssa-threadedge.c
 +++ b/gcc/tree-ssa-threadedge.c
 @@ -56,6 +56,8 @@ along with GCC; see the file COPYING3.  If not see
   #include params.h
   #include tree-ssa-threadedge.h
   #include builtins.h
 +#include cfg.h
 +#include cfganal.h

   /* To avoid code explosion due to jump threading, we limit the
  number of statements we are going to copy.  This variable
 @@ -661,6 +663,7 @@ simplify_control_stmt_condition (edge e,
rather than use a relational operator.  These are simpler to
 handle.  */
 if (TREE_CODE (cond) == SSA_NAME)
   {
 +  tree original_lhs = cond;
 cached_lhs = cond;

 /* Get the 

Re: [testsuite] Fix multiple definitions of _init

2014-12-02 Thread Richard Biener
On Mon, Dec 1, 2014 at 11:48 PM, Oleg Endo oleg.e...@t-online.de wrote:
 On Mon, 2014-12-01 at 12:09 +0100, Richard Biener wrote:
 On Mon, Dec 1, 2014 at 8:21 AM, Oleg Endo oleg.e...@t-online.de wrote:
  Hi,
 
  When running the testsuite on a sh-elf configuration, some test cases
  fail due to multiple definitions of the function '_init'.  This is
  because on sh-elf every function is automatically prefixed with a '_'
  char.  When defining a C function 'init' in some code, the actual name
  becomes '_init' and this conflicts with the _init function in libgcc
  (crti.S).  While the behavior of adding the '_' prefix is debatable, the
  testsuite failures can be easily fixed by the attached patch.
  Is this OK for trunk?

 The testcases are certainly valid C testcases and thus sh-elf is buggy
 here.  As '_init' is a reserved identifier it shouldn't mangle 'init' as
 '_init'.

 Do you never run into real applications naming a function 'init'?

 Looking for USER_LABEL_PREFIX shows that the '_' prefix mangling scheme
 is being used by quite some targets/configs.  So it's not just sh-elf,
 but other bare metal configs and their ABIs, where 'init' gets mangled
 to '_init' and there's a '.global _init' in libgcc.  Yes, it does break
 valid standard C code 'void init (void)', but it's been doing that for
 quite a while.

Oh, and I wonder why the global init isn't then __init (two underscores),
as obviously on those targets symbols with _ prefix are not in the
implementation namespace.

 I didn't mean to open a can of worms because of 4 test cases showing up
 in sh-elf sim tests.  From what I can see, those 4 cases would also pop
 up for other configs, not only sh-elf.  Anyway, the function names in
 those test cases don't matter at all, do they?

No, but it looks like papering over a problem.  After all we could have
a valid testcase checking for exactly this situation - that a C function
named 'init' works fine.

Joseph may have more experience with how targets should setup
USER_LABEL_PREFIX to avoid this situation.

Richard.

 Cheers,
 Oleg



[PATCH, REPOST] Fix PR fortran/60718

2014-12-02 Thread Bernd Edlinger
Hello Tobias,

a long time ago, I posted this patch, but it got forgotten.
However the described problem is still unsolved,
so I thought my patch should be re-posted now.


Boot-strapped and regression-tested on arm-linux-gnueabihf.
OK for trunk?


Thanks
Bernd.



On Tue, 15 Apr 2014 13:49:37, Bernd Edlinger wrote:

 Hi Tobias,

 On Fri, 11 Apr 2014 16:04:51, Tobias Burnus wrote:

 Hi Tobias,

 On Fri, Apr 11, 2014 at 02:39:57PM +0200, Bernd Edlinger wrote:
 On Fri, 11 Apr 2014 13:37:46, Tobias Burnus wrote:
 Hmm,

 I was hoping somehow that only that test case is broken,
 and needs to be fixed. The target attribute is somehow simple,
 it implies intent(in) and the actual value will in most cases
 be a pointer, as in the example.

 I think that passing another nonpointer TARGET to a dummy argument
 which has a TARGET attribute is at least as common as passing a
 POINTER to a TARGET.

 TARGET is roughtly the opposite to the restrict qualifier. By default
 any nonpointer variable does not alias with something else, unless
 it has the TARGET attribute; if it has, it (its address) can then
 be assigned to a pointer. POINTER intrinsically alias and cannot
 have the TARGET attribute.

 Pointer - Nonalloc
 Allocatable - Noalloc
 Nonallocatable*/Allocatable* - Pointer with intent(in)

 Well, this approach does not handle intent(inout) at all.



 Now I have created a test case for the different aliasing issues
 with may arise with scalar objects.

 As you pointed out, also conversions of allocatable - nonalloc,
 allocatable - pointer and nonalloc - pointer  turn out to
 violate the strict aliasing rules. However, conversions of
 arrays of objects with different attributes seem to be safe.

 I have not been able to find an example where it would be
 necessary to write the modified class object back to the original
 location. But I am not really a Fortran expert.

 Unfortunately there are also conversions of optional allocatable -
 optional pointer, which complicate the whole thing quite a lot.
 I have found these in class_optional_2.f90.

 Boot-strapped and regression-tested on x86_64-linux-gnu.
 OK for trunk?


 Thanks
 Bernd.

  2014-04-15  Bernd Edlinger  bernd.edlin...@hotmail.de

PR fortran/60718
* trans-expr.c (gfc_conv_procedure_call): Fix a strict aliasing
violation when passing a class object to a formal parameter which has
different pointer or allocatable attributes.

testsuite:
2014-04-14  Bernd Edlinger  bernd.edlin...@hotmail.de

PR fortran/60718
* gfortran.dg/class_alias.f90: New.



patch-pr60718.diff
Description: Binary data


Re: PR 13631 Problems in messages

2014-12-02 Thread Jonathan Wakely

On 01/12/14 22:42 +0100, François Dumont wrote:

Hi

   Here is another proposal that consider all your remarks except 
one. I finally prefer to go with std::vector of pointers. Dynamically 
allocating Catalog_info allow to avoid numerous copies of locale when 
we find this pointer from the catalog info.




+ result_type

+_M_get(catalog_id __c) const
+{
+  __gnu_cxx::__scoped_lock lock(_M_mutex);
+
+  const catalog_info* __entry =
+lower_bound(_M_map, _M_map + _M_nb_entry, __c, _Comp());
+  if (__entry != _M_map + _M_nb_entry  __entry-_M_id == __c)
+return result_type(__entry-_M_domain, __entry-_M_locale);
+  return result_type(0, locale());


I assume the second return was just for testing, it should be removed.


I don't get this one. I still need to return something when I can't 
find the catalog so the 2 return statements.


I thought I remembered seeing two returns at the same level of
indentation, but obviously they are not! Maybe I was looking it it in
gmail's attachment viewer which eats leading whitespace. Ignore that
comment, sorry.


Index: config/locale/gnu/messages_members.h

===
--- config/locale/gnu/messages_members.h(revision 218027)
+++ config/locale/gnu/messages_members.h(working copy)
@@ -83,22 +83,6 @@
 _S_destroy_c_locale(_M_c_locale_messages); }

-  templatetypename _CharT
-typename messages_CharT::catalog - 
messages_CharT::do_open(const basic_stringchar __s, -  
const locale) const
-{ -  // No error checking is done, assume the catalog 
exists and can

-  // be used.
-  textdomain(__s.c_str());
-  return 0;
-}
-
-  templatetypename _CharT
-void-messages_CharT::do_close(catalog) const -{ }
-
  // messages_byname
  templatetypename _CharT
messages_byname_CharT::messages_byname(const char* __s, 
size_t __refs)


Unless I'm misreading this patch, you've removed the definitions of
messages_CharT::do_open() and messages_CharT::do_close() for the
primary template. They would stil be needed if users instantiate e.g.
messageschar16_t or messagessigned char.


Yes but do you confirm that it is already the same problem with do_get ?


Yes, I'm just wondering why make the situation worse than it already
is. Does leaving the default definitions cause a problem?

If some user has got code that uses messagessigned char and they
provide a definition for messagessigned char::do_get() their code
will break if do_open and do_close disappear. (Realistically I doubt
anyone is doing that though, it may not even work.)

In my opinion we could provide template implementations of all those 
methods relying on codecvt_CharT, char, mbstate_t, even for do_get. 
But in this case some implementation details will be exposed in the 
header files and additional symbols will have to be exported I think.


In fact I have already started doing something like that but then 
start facing issue accessing nl_langinfo_l. Shall I go further and 
provide a patch doing this ?


IMHO no.


+catalog
+_M_add(const string __domain, locale __l)
+{
+  __gnu_cxx::__scoped_lock lock(_M_mutex);
+
+  // The counter is not likely to roll unless catalogs keep on being
+  // opened/closed which is consider as an application mistake for the
+  // moment.
+  if (_M_catalog_counter == numeric_limitscatalog::max())
+   return -1;
+
+  _M_infos.push_back(new Catalog_info(_M_catalog_counter++, __domain, 
__l));
+  return _M_infos.back()-_M_id;


This is not exception safe. You can use auto_ptr to fix it:

 std::auto_ptrCatalog_info info(new Catalog_info(_M_catalog_counter++,
   __domain, __l));
 _M_infos.push_back(info.get());
 return info.release()-_M_id;



Re: [testsuite] Fix multiple definitions of _init

2014-12-02 Thread Oleg Endo
On Tue, 2014-12-02 at 11:22 +0100, Richard Biener wrote:
 On Mon, Dec 1, 2014 at 11:48 PM, Oleg Endo oleg.e...@t-online.de wrote:
  On Mon, 2014-12-01 at 12:09 +0100, Richard Biener wrote:
  On Mon, Dec 1, 2014 at 8:21 AM, Oleg Endo oleg.e...@t-online.de wrote:
   Hi,
  
   When running the testsuite on a sh-elf configuration, some test cases
   fail due to multiple definitions of the function '_init'.  This is
   because on sh-elf every function is automatically prefixed with a '_'
   char.  When defining a C function 'init' in some code, the actual name
   becomes '_init' and this conflicts with the _init function in libgcc
   (crti.S).  While the behavior of adding the '_' prefix is debatable, the
   testsuite failures can be easily fixed by the attached patch.
   Is this OK for trunk?
 
  The testcases are certainly valid C testcases and thus sh-elf is buggy
  here.  As '_init' is a reserved identifier it shouldn't mangle 'init' as
  '_init'.
 
  Do you never run into real applications naming a function 'init'?
 
  Looking for USER_LABEL_PREFIX shows that the '_' prefix mangling scheme
  is being used by quite some targets/configs.  So it's not just sh-elf,
  but other bare metal configs and their ABIs, where 'init' gets mangled
  to '_init' and there's a '.global _init' in libgcc.  Yes, it does break
  valid standard C code 'void init (void)', but it's been doing that for
  quite a while.
 
 Oh, and I wonder why the global init isn't then __init (two underscores),
 as obviously on those targets symbols with _ prefix are not in the
 implementation namespace.

My guess:  Because it's written in asm, and there it's not known whether
a _ prefix should be used or not.  It's an ABI/convention.  BTW it's not
only the init function, but potentially several others.  E.g.
newlib/libc/sys/sh/crt0.S refers to other symbols such as _stack,
_edata, _end, _main, _exit.  Thus, having something global named stack
in a C program will potentially clash.  So while we maybe could use
USER_LABEL_PREFIX in libgcc to add the prefix, there would still be
issues with things defined in e.g. newlib or the linker script.

For bare metal configs / programs having some additional
reserved/prohibited names is not a big deal, as there are often other
restrictions which are even worse.  I guess the issue isn't that bad.
Out of all the gcc test cases, I hit only 4.  In the context of gcc sim
testing, one option could be to use a special toolchain config, which
does not do the _ prefixing.  However, there's no such configure option
and the only way to change it is to change the USER_LABEL_PREFIX
definition, or add a subtarget which doesn't prefix _.  Another option
could be adding something to the global configure stuff so that the
option -fno-leading-underscore can be turned on by default for the
resulting compiler.

Cheers,
Oleg




Re: [PATCH, i386] Add prefixes avoidance tuning for silvermont target

2014-12-02 Thread Ilya Enkovich
On 05 Nov 11:00, Uros Bizjak wrote:
 On Wed, Nov 5, 2014 at 10:35 AM, Ilya Enkovich enkovich@gmail.com wrote:
  Hi,
 
  Having stage1 close to end, may we make some decision regarding this
  patch? Having a couple of working variants, may we choose and use one
  of them?
 
 I propose to wait for Vlad for an update about his plans on register
 preference algorythm that would fix this (and other Ya*r-type
 issues).
 
 In the absence of the fix, we'll go with Yr,*x.
 
 Uros.
 

Hi,

Here is an updated patch which uses Yr,*x option to avoid long prefixes for 
Silvermont.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

* config/i386/constraints.md (Yr): New.
* config/i386/i386.h (reg_class): Add NO_REX_SSE_REGS.
(REG_CLASS_NAMES): Likewise.
(REG_CLASS_CONTENTS): Likewise.
* config/i386/sse.md (*vec_concatv2sf_sse4_1): Add alternatives
which use only NO_REX_SSE_REGS.
(vec_setmode_0): Likewise.
(*vec_setv4sf_sse4_1): Likewise.
(sse4_1_insertps): Likewise.
(*sse4_1_extractps): Likewise.
(*sse4_1_mulv2siv2di3mask_name): Likewise.
(*sse4_1_avx2_mulmode3mask_name): Likewise.
(*sse4_1_codemode3mask_name): Likewise.
(*sse4_1_codemode3): Likewise.
(*sse4_1_eqv2di3): Likewise.
(sse4_2_gtv2di3): Likewise.
(*vec_extractv4si): Likewise.
(*vec_concatv2si_sse4_1): Likewise.
(vec_concatv2di): Likewise.
(sse4_1_blendssemodesuffixavxsizesuffix): Likewise.
(sse4_1_blendvssemodesuffixavxsizesuffix): Likewise.
(sse4_1_dpssemodesuffixavxsizesuffix): Likewise.
(vi8_sse4_1_avx2_avx512_movntdqa): Likewise.
(sse4_1_avx2_mpsadbw): Likewise.
(sse4_1_avx2packusdwmask_name): Likewise.
(sse4_1_avx2_pblendvb): Likewise.
(sse4_1_pblendw): Likewise.
(sse4_1_phminposuw): Likewise.
(sse4_1_codev8qiv8hi2mask_name): Likewise.
(sse4_1_codev4qiv4si2mask_name): Likewise.
(sse4_1_codev4hiv4si2mask_name): Likewise.
(sse4_1_codev2qiv2di2mask_name): Likewise.
(sse4_1_codev2hiv2di2mask_name): Likewise.
(sse4_1_codev2siv2di2mask_name): Likewise.
(sse4_1_ptest): Likewise.
(sse4_1_roundssemodesuffixavxsizesuffix): Likewise.
(sse4_1_roundssescalarmodesuffix): Likewise.
* config/i386/subst.md (mask_prefix4): New.
* config/i386/x86-tune.def (X86_TUNE_AVOID_4BYTE_PREFIXES): New.

gcc/testsuites/

2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.target/i386/sse2-init-v2di-2.c: Adjust to changed
vec_concatv2di template.
diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index 4e07d70..72925cd 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -108,6 +108,8 @@
 ;;  d  Integer register when integer DFmode moves are enabled
 ;;  x  Integer register when integer XFmode moves are enabled
 ;;  f  x87 register when 80387 floating point arithmetic is enabled
+;;  r  SSE regs not requiring REX prefix when prefixes avoidance is enabled
+;; and all SSE regs otherwise
 
 (define_register_constraint Yz TARGET_SSE ? SSE_FIRST_REG : NO_REGS
  First SSE register (@code{%xmm0}).)
@@ -150,6 +152,10 @@
  (ix86_fpmath  FPMATH_387) ? FLOAT_REGS : NO_REGS
  @internal Any x87 register when 80387 FP arithmetic is enabled.)
 
+(define_register_constraint Yr
+ TARGET_SSE ? (X86_TUNE_AVOID_4BYTE_PREFIXES ? NO_REX_SSE_REGS : 
ALL_SSE_REGS) : NO_REGS
+ @internal Lower SSE register when avoiding REX prefix and all SSE registers 
otherwise.)
+
 ;; We use the B prefix to denote any number of internal operands:
 ;;  s  Sibcall memory operand, not valid for TARGET_X32
 ;;  w  Call memory operand, not valid for TARGET_X32
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 3f5f979..6d9df65 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1311,6 +1311,7 @@ enum reg_class
   FP_TOP_REG, FP_SECOND_REG,   /* %st(0) %st(1) */
   FLOAT_REGS,
   SSE_FIRST_REG,
+  NO_REX_SSE_REGS,
   SSE_REGS,
   EVEX_SSE_REGS,
   BND_REGS,
@@ -1369,6 +1370,7 @@ enum reg_class
FP_TOP_REG, FP_SECOND_REG,  \
FLOAT_REGS,   \
SSE_FIRST_REG,\
+   NO_REX_SSE_REGS,  \
SSE_REGS, \
EVEX_SSE_REGS,\
BND_REGS, \
@@ -1409,6 +1411,7 @@ enum reg_class
 { 0x0200,   0x0,0x0 },   /* FP_SECOND_REG */ \
 { 0xff00,   0x0,0x0 },   /* FLOAT_REGS */\
   { 0x20,   0x0,0x0 },   /* SSE_FIRST_REG */ \
+{ 0x1fe0,  0x00,0x0 },   /* NO_REX_SSE_REGS */   \
 { 0x1fe0,  0x1fe000,0x0 },   /* SSE_REGS */  \
{ 0x0,0xffe0,   0x1f },  

Re: [PATCH, i386] Add prefixes avoidance tuning for silvermont target

2014-12-02 Thread Uros Bizjak
On Tue, Dec 2, 2014 at 12:08 PM, Ilya Enkovich enkovich@gmail.com wrote:

  Having stage1 close to end, may we make some decision regarding this
  patch? Having a couple of working variants, may we choose and use one
  of them?

 I propose to wait for Vlad for an update about his plans on register
 preference algorythm that would fix this (and other Ya*r-type
 issues).

 In the absence of the fix, we'll go with Yr,*x.

 Uros.


 Hi,

 Here is an updated patch which uses Yr,*x option to avoid long prefixes for 
 Silvermont.  Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for 
 trunk?

 Thanks,
 Ilya
 --
 gcc/

 2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

 * config/i386/constraints.md (Yr): New.
 * config/i386/i386.h (reg_class): Add NO_REX_SSE_REGS.
 (REG_CLASS_NAMES): Likewise.
 (REG_CLASS_CONTENTS): Likewise.
 * config/i386/sse.md (*vec_concatv2sf_sse4_1): Add alternatives
 which use only NO_REX_SSE_REGS.
 (vec_setmode_0): Likewise.
 (*vec_setv4sf_sse4_1): Likewise.
 (sse4_1_insertps): Likewise.
 (*sse4_1_extractps): Likewise.
 (*sse4_1_mulv2siv2di3mask_name): Likewise.
 (*sse4_1_avx2_mulmode3mask_name): Likewise.
 (*sse4_1_codemode3mask_name): Likewise.
 (*sse4_1_codemode3): Likewise.
 (*sse4_1_eqv2di3): Likewise.
 (sse4_2_gtv2di3): Likewise.
 (*vec_extractv4si): Likewise.
 (*vec_concatv2si_sse4_1): Likewise.
 (vec_concatv2di): Likewise.
 (sse4_1_blendssemodesuffixavxsizesuffix): Likewise.
 (sse4_1_blendvssemodesuffixavxsizesuffix): Likewise.
 (sse4_1_dpssemodesuffixavxsizesuffix): Likewise.
 (vi8_sse4_1_avx2_avx512_movntdqa): Likewise.
 (sse4_1_avx2_mpsadbw): Likewise.
 (sse4_1_avx2packusdwmask_name): Likewise.
 (sse4_1_avx2_pblendvb): Likewise.
 (sse4_1_pblendw): Likewise.
 (sse4_1_phminposuw): Likewise.
 (sse4_1_codev8qiv8hi2mask_name): Likewise.
 (sse4_1_codev4qiv4si2mask_name): Likewise.
 (sse4_1_codev4hiv4si2mask_name): Likewise.
 (sse4_1_codev2qiv2di2mask_name): Likewise.
 (sse4_1_codev2hiv2di2mask_name): Likewise.
 (sse4_1_codev2siv2di2mask_name): Likewise.
 (sse4_1_ptest): Likewise.
 (sse4_1_roundssemodesuffixavxsizesuffix): Likewise.
 (sse4_1_roundssescalarmodesuffix): Likewise.
 * config/i386/subst.md (mask_prefix4): New.
 * config/i386/x86-tune.def (X86_TUNE_AVOID_4BYTE_PREFIXES): New.

 gcc/testsuites/

 2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

 * gcc.target/i386/sse2-init-v2di-2.c: Adjust to changed
 vec_concatv2di template.

OK for mainline with a change below.

@@ -6544,13 +6550,14 @@
 })

 (define_insn_and_split *sse4_1_extractps
-  [(set (match_operand:SF 0 nonimmediate_operand =rm,x,x)
+  [(set (match_operand:SF 0 nonimmediate_operand =*rm,rm,x,x)
  (vec_select:SF

Please do not change preferences of non-SSE registers. Please check
the patch that similar changes didn't creep in.

Thanks,
Uros.


RE: [PATCH] Improve spillcost of literal pool loads

2014-12-02 Thread Wilco Dijkstra
 Jeff Law wrote:
 OK with the appropropriate ChangeLog entires.  THe original for
 ira-costs.c was fine, so you just need the trivial one for the testcase.

ChangeLog below - Jiong, could you commit for me please?

2014-12-02  Wilco Dijkstra  wdijk...@arm.com

* gcc/ira-costs.c (scan_one_insn): Improve spill cost adjustment.
* gcc/testsuite/gcc.target/aarch64/remat1.c: New testcase.
---
 gcc/ira-costs.c   |  5 ++---
 gcc/testsuite/gcc.target/aarch64/remat1.c | 20 
 2 files changed, 22 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/remat1.c

diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
index 122815b..95d266e 100644
--- a/gcc/ira-costs.c
+++ b/gcc/ira-costs.c
@@ -1462,12 +1462,11 @@ scan_one_insn (rtx_insn *insn)
((MEM_P (XEXP (note, 0))
!side_effects_p (SET_SRC (set)))
  || (CONSTANT_P (XEXP (note, 0))
-  targetm.legitimate_constant_p (GET_MODE (SET_DEST (set)),
-   XEXP (note, 0))
+  (! flag_pic || LEGITIMATE_PIC_OPERAND_P (XEXP (note, 0)))
   REG_N_SETS (REGNO (SET_DEST (set))) == 1))
general_operand (SET_SRC (set), GET_MODE (SET_SRC (set
 {
-  enum reg_class cl = GENERAL_REGS;
+  enum reg_class cl = ALL_REGS;
   rtx reg = SET_DEST (set);
   int num = COST_INDEX (REGNO (reg));
 
diff --git a/gcc/testsuite/gcc.target/aarch64/remat1.c 
b/gcc/testsuite/gcc.target/aarch64/remat1.c
new file mode 100644
index 000..999577e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/remat1.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options -O2 -fomit-frame-pointer -fcaller-saves -ffixed-d8 -ffixed-d9 
-ffixed-d10 -ffixed-d11 -ffixed-d12 -ffixed-d13 -ffixed-d14 -ffixed-d15 } */
+
+/* Under high register pressure FP immediates should be rematerialized
+   as literal loads rather than being caller-saved to the stack.  */
+
+void
+g (void);
+
+float
+f (float x)
+{
+  x += 3.1f;
+  g ();
+  x *= 3.1f;
+  return x;
+}
+
+/* { dg-final { scan-assembler-times ldr\ts\[0-9]+, .LC0 2 } } */
+
-- 
1.9.1



Re: [PATCH, ARM] Fix PR63718, Thumb1 bootstrap -- disable fuse-caller-save for Thumb1

2014-12-02 Thread Ramana Radhakrishnan



On 20/11/14 11:54, Tom de Vries wrote:

Richard,

This patch fixes PR63718, which currently breaks Thumb1 bootstrap.

The problem is that in Thumb1 mode, we emit the epilogue in RTL, but the last
insn - epilogue_insns - does not accurately model the corresponding insns
emitted in the asm file. F.i., the asm file may contain an insn:
...
pop {r0}

while the corresponding RTL pattern looks like this:
...
(jump_insn (unspec_volatile [
  (return)
   ] VUNSPEC_EPILOGUE))
...

As a consequence, the epilogue may clobber registers without fuse-caller-save
being able to analyze that.

Adding the missing clobbers to epilogue_insns is not trivial, and probably not a
good idea for stage3. The patch works around the problem by disabling
fuse-caller-save in Thumb1 mode.

Build and reg-tested on arm-none-eabi.

OK for stage3?


This is OK, can you make sure that a bug is created to reflect the work 
required to make this work in the backend.


Sorry about the delay in review.

regards
Ramana



Thanks,
- Tom



Re: [PATCH PR59593] [arm] Backport r217772 r217826 to 4.8 4.9

2014-12-02 Thread Ramana Radhakrishnan



On 29/11/14 06:50, Chen Shanyao wrote:

I've backported this fix to 4.8  4.9 branch.
These patches have been tested for armeb-none-eabi-gcc/g++ with qemu,
and both the test results were ok.



The Changelog should mention all authors of the original patches i.e. 
include my name.


Otherwise this is OK unless an RM objects in the next 24 hours.



regards
Ramana


Re: [rtlanal.c][BE][1/2] Fix vector load/stores to not use ld1/st1

2014-12-02 Thread Alan Hayward

On 21/11/2014 14:08, Alan Hayward alan.hayw...@arm.com wrote:


On 14/11/2014 16:48, Alan Hayward alan.hayw...@arm.com wrote:

This is a new version of my BE patch from a few weeks ago.
This is part 1 and covers rtlanal.c. The second part will be aarch64
specific.

When combined with the second patch, It fixes up movoi/ci/xi for Big
Endian, so that we end up with the lab of a big-endian integer to be in
the low byte of the highest-numbered register.

This will apply cleanly by itself and no regressions were seen when
testing aarch64 and x86_64 on make check.


Changelog:

2014-11-14  Alan Hayward  alan.hayw...@arm.com

* rtlanal.c
(subreg_get_info): Exit early for simple and common cases


Alan.

Hi,

The second part to this patch (aarch64 specific) has been approved.


Could someone review this one please.


Thanks,
Alan.


Ping.


Thanks,
Alan.


0001-BE-fix-load-stores.-Common-code.patch
Description: Binary data


Re: [PATCH 1/2, PR 63814] Strengthen cgraph_edge_brings_value_p

2014-12-02 Thread H.J. Lu
On Fri, Nov 21, 2014 at 10:59 AM, Martin Jambor mjam...@suse.cz wrote:
 Hi,

 PR 63814 is caused by cgraph_edge_brings_value_p misidentifying an
 edge to an expanded artificial thunk as an edge to the original node,
 which then leads to crazy double-cloning and doubling the thunks along
 the call.

 This patch fixes the bug by strengthening the predicate so that it
 knows where the value is supposed to go and can check that it goes
 there and not anywhere else.  It also adds an extra availability check
 that was probably missing in it.

 Bootstrapped and tested on x86_64-linux, and i686-linux.  OK for
 trunk?

 Thanks,

 Martin


 2014-11-20  Martin Jambor  mjam...@suse.cz

 PR ipa/63814
 * ipa-cp.c (same_node_or_its_all_contexts_clone_p): New function.
 (cgraph_edge_brings_value_p): New parameter dest, use
 same_node_or_its_all_contexts_clone_p and check availability.
 (cgraph_edge_brings_value_p): Likewise.
 (get_info_about_necessary_edges): New parameter dest, pass it to
 cgraph_edge_brings_value_p.  Update caller.
 (gather_edges_for_value): Likewise.
 (perhaps_add_new_callers): Use cgraph_edge_brings_value_p to check
 both the destination and availability.



I checked in this testcase:

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index a17cc6c..1410f10 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-12-02  H.J. Lu  hongjiu...@intel.com
+
+ PR ipa/63814
+ * g++.dg/ipa/pr63814.C: New test.
+
 2014-12-02  Wilco Dijkstra  wilco.dijks...@arm.com

  * gcc.target/aarch64/remat1.c: New testcase.
diff --git a/gcc/testsuite/g++.dg/ipa/pr63814.C
b/gcc/testsuite/g++.dg/ipa/pr63814.C
new file mode 100644
index 000..15a7dda
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr63814.C
@@ -0,0 +1,29 @@
+// { dg-do run { target fpic } }
+// { dg-options -O3 -fpic }
+
+struct CBase {
+  virtual void BaseFunc () {}
+};
+
+struct MMixin {
+  virtual void * MixinFunc (int, int) = 0;
+};
+
+struct CExample: CBase, public MMixin
+{
+  void *MixinFunc (int arg, int arg2)
+  {
+return this;
+  }
+};
+
+void *test (MMixin  anExample)
+{
+  return anExample.MixinFunc (0, 0);
+}
+
+int main ()
+{
+  CExample c;
+  return (test (c) != c);
+}

H.J.


[PATCH] rs6000: Fix PR target/64115

2014-12-02 Thread Ulrich Weigand
Hello,

as discussed in the PR, this patch removes an invalid ENABLE_CHECKING
sanity check in rs6000_delegitimize_address, fixing the ICE.

Tested on powerpc64-linux.

OK for mainline / 4.9 / 4.8?

Bye,
Ulrich


ChangeLog:

PR target/64115
* config/rs6000/rs6000.c (rs6000_delegitimize_address): Remove
invalid UNSPEC_TOCREL sanity check under ENABLE_CHECKING.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 218210)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -7033,24 +7033,6 @@
   if (GET_CODE (y) == UNSPEC
XINT (y, 1) == UNSPEC_TOCREL)
 {
-#ifdef ENABLE_CHECKING
-  if (REG_P (XVECEXP (y, 0, 1))
-  REGNO (XVECEXP (y, 0, 1)) == TOC_REGISTER)
-   {
- /* All good.  */
-   }
-  else if (GET_CODE (XVECEXP (y, 0, 1)) == DEBUG_EXPR)
-   {
- /* Weirdness alert.  df_note_compute can replace r2 with a
-debug_expr when this unspec is in a debug_insn.
-Seen in gcc.dg/pr51957-1.c  */
-   }
-  else
-   {
- debug_rtx (orig_x);
- abort ();
-   }
-#endif
   y = XVECEXP (y, 0, 0);
 
 #ifdef HAVE_AS_TLS

-- 
  Dr. Ulrich Weigand
  GNU/Linux compilers and toolchain
  ulrich.weig...@de.ibm.com



Re: [PATCH fortran/linemap] Add enough column hint to fit any possible offset

2014-12-02 Thread Manuel López-Ibáñez
On 2 December 2014 at 07:50, Tobias Burnus bur...@net-b.de wrote:
 Hi Manuel,

 Manuel López-Ibáñez wrote:

 This patch actually does not touch linemap but I will appreciate
 Dodji's comments about the approach. The problem is that in case of
 long lines, the column hint of 120 might be too small, thus we do not
 have enough locations within one line to point to a higher column
 (like 132 in the testcase). Giving as column hint the length of the
 line seems the right fix.
 [...]
 My fix here is to create a dummy location for line_len -1 (the last column
 in the line). This forces the next map to start after this last column's
 location.


 My feeling is that that doesn't work for
 -ffree-line-length-none/-ffixed-line-length-none. In that case,
 gfc_option.free_line_length == 0 and there is no limit to the number of
 characters per line. Thus, I fear that

b-location
 -   = linemap_line_start (line_table, current_file-line++, 120);
 +   = linemap_line_start (line_table, current_file-line++, line_len);


 might even set the line length to 0 - but I haven't checked. For all other
 values, it should work.

My understanding of this piece of code is that Fortran reads the whole
line into a buffer, and that line_len is the length of the line just
read. Am I wrong? If so, is there any way to compute the length of the
current line?

A small optimization would be to use all that information about
truncation limits to give a smaller column hint when possible, but
someone more knowledgeable in Fortran can do that as a follow-up. I
think setting it to the actual length line will fix the ICEs and only
waste a bit of memory if the lines are quite long.

Cheers,

Manuel.


Re: [PATCH] DWARF add DW_AT_noreturn on noreturn function subprogram.

2014-12-02 Thread Mark Wielaard
On Mon, 2014-12-01 at 14:31 -0700, Jeff Law wrote:
 On 11/25/14 15:57, Mark Wielaard wrote:
  This implements the DWARFv5 noreturn proposal:
  http://dwarfstd.org/ShowIssue.php?issue=140331.1
 
  TREE_THIS_VOLATILE on a FUNCTION_DECL node means the function does not
  return normally. This catches the traditional noreturn GNU attribute,
  the C11 _Noreturn keyword and the C++11 [[noreturn]] attribute.
 
  This relies on the DW_AT_noreturn constant defined in the DWARFv5 DRAFT:
  http://www.dwarfstd.org/doc/dwarf5.20141029.pdf
  We could also use a new GNU extension if we don't trust these constants
  to be stable.
 
 Presumably we don't have any sense when the values will be assigned, 
 right?  Any chance we could speed that along a bit?

Sadly I have no idea. Officially the DWARF committee hasn't accepted new
proposals since March 2014 for DWARFv5. But some of my proposals have
been accepted anyway as late as last month. There is not really a way to
know when which proposal is submitted or considered for incorporation by
people outside the committee.

There is the above draft that contains the currently assigned constants.
But I don't know when the committee will ask for comments on the draft.
Or how long the comment period will be. I have just asked on the DWARF
mailinglist.

I don't mind using GNU vendor constants instead. But people on the
committee have convinced me it should be fine to use the constants from
the current draft. Given that Jakub, Jason and Cary are all on the DWARF
committee I assume they would yell and scream at me if they believed
using these constants were not stable. Most usage is gated behind the
experimental GCC5 -gdwarf-5 switch anyway (although not this attribute,
which is also emitted in non-strict mode).

BTW. I do make sure to update bintuils, gdb, elfutils and valgrind to
match the GCC5 output, so the constants should at least be consistent
across the GNU toolchain and friends. All GNU and DWARFv5 extensions are
documented: https://fedorahosted.org/elfutils/wiki/DwarfExtensions

Cheers,

Mark


Re: PATCH: ICE: SIGSEGV in decide_alg() with -mmemset-strategy=libcall:-1:align -minline-all-stringops

2014-12-02 Thread Uros Bizjak
On Sat, Nov 29, 2014 at 3:00 PM, H.J. Lu hjl.to...@gmail.com wrote:
 When searching for an usable algorithm with -minline-all-stringops,
 decide_alg stops when it sees libcall even if there is a usable
 algorithm.  It goes into an infinite loop.  This patch changes
 decide_alg to stop searching only if there aren't any usable algorithms.
 Testd on Linux/x86-64.  OK for trunk.


 H.J.
 
 gcc/

 2014-11-29  H.J. Lu  hongjiu...@intel.com

 PR target/64108
 * config/i386/i386.c (decide_alg): Stop only if there aren't
 any usable algorithms.

 gcc/testsuite/

 2014-11-29  H.J. Lu  hongjiu...@intel.com

 PR target/64108
 * gcc.target/i386/memset-strategy-2.c: New test.

OK with a change below. Please wait a couple of days before
backporting the fix - the fix looks kind of obvious, but this part of
the code is somehow complex and has biten us in the past.

Thanks,
Uros.

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 2493130..d789635 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -24464,7 +24464,9 @@ decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT 
 expected_size,
   *noalign = alg_noalign;
   return alg;
 }
 - break;
 + /* Stop only if there aren't any usable algorithms.  */
 + if (!any_alg_usable_p)
 +   break;

Please use else if ... and remove unneded argument. IMO, the code
speaks for itself.

 }
   else if (alg_usable_p (candidate, memset))
 {
 diff --git a/gcc/testsuite/gcc.target/i386/memset-strategy-2.c 
 b/gcc/testsuite/gcc.target/i386/memset-strategy-2.c
 new file mode 100644
 index 000..aafa54d
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/i386/memset-strategy-2.c
 @@ -0,0 +1,10 @@
 +/* PR target/64108 */
 +/* { dg-do compile } */
 +/* { dg-options -O2 -march=atom -mmemset-strategy=libcall:-1:align 
 -minline-all-stringops } */
 +
 +char a[2048];
 +void t (void)
 +{
 +  __builtin_memset (a, 1, 2048);
 +}
 +


[patch] Update libstdc++ status docs

2014-12-02 Thread Jonathan Wakely

Update the docs. Committed to trunk.
commit 388d7943d85f505381f6e818d2de07978d022498
Author: Jonathan Wakely jwak...@redhat.com
Date:   Tue Dec 2 13:13:33 2014 +

	* doc/xml/manual/status_cxx2011.xml: Update.
	* doc/xml/manual/status_cxx2014.xml: Update.
	* doc/html/manual/status.html: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
index 455c704..c4eec2c 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2011.xml
@@ -2434,11 +2434,10 @@ particular release.
   entry/
 /row
 row
-  ?dbhtml bgcolor=#B0B0B0 ?
   entry30.5/entry
   entryCondition variables/entry
-  entryPartial/entry
-  entryMissing notify_all_at_thread_exit/entry
+  entryY/entry
+  entry/
 /row
 row
   entry30.5.1/entry
@@ -2483,11 +2482,10 @@ particular release.
   entry/
 /row
 row
-  ?dbhtml bgcolor=#B0B0B0 ?
   entry30.6.5/entry
   entryClass template codepromise/code/entry
-  entryPartial/entry
-  entryMissing set_*_at_thread_exit/entry
+  entryY/entry
+  entry/
 /row
 row
   entry30.6.6/entry
@@ -2508,11 +2506,10 @@ particular release.
   entry/
 /row
 row
-  ?dbhtml bgcolor=#B0B0B0 ?
   entry30.6.9/entry
   entryClass template codepackaged_task/code/entry
-  entryPartial/entry
-  entryMissing make_ready_at_thread_exit/entry
+  entryY/entry
+  entry/
 /row
 row
   entry
diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
index f7d18a2..f716739 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2014.xml
@@ -283,18 +283,6 @@ not in any particular release.
 /row
 
 row
-  ?dbhtml bgcolor=#C8B0B0 ?
-  entry
-	link xmlns:xlink=http://www.w3.org/1999/xlink; xlink:href=http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3783.pdf;
-	  N3783
-	/link
-  /entry
-  entryNetwork byte order conversion/entry
-  entryN/entry
-  entryLibrary Fundamentals TS/entry
-/row
-
-row
   entry
 	link xmlns:xlink=http://www.w3.org/1999/xlink; xlink:href=http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2013/n3793.html;
 	  N3793
@@ -397,14 +385,13 @@ not in any particular release.
 /row
 
 row
-  ?dbhtml bgcolor=#C8C8B0 ?
   entry
 	link xmlns:xlink=http://www.w3.org/1999/xlink; xlink:href=http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n3932.htm;
 	  N3932
 	/link
   /entry
   entryVariable Templates For Type Traits/entry
-  entryPartial/entry
+  entryY/entry
   entryLibrary Fundamentals TS/entry
 /row
 


Re: [PATCH 2/3] Extended if-conversion

2014-12-02 Thread Richard Biener
On Mon, Dec 1, 2014 at 4:53 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Hi Richard,

 I resend you patch1 and patch2 with minor changes:
 1. I renamed flag_force_vectorize to aggressive_if_conv.
 2. Use static cast for the first argument of gimple_phi_arg_edge.
 I also very sorry that I sent you bad patch.

 Now let me answer on your questions related to second patch.
 1. Why we need both predicate_extended_scalar_phi and
 predicate_arbitrary_scalar_phi?

 Let's consider the following simple test-case:

   #pragma omp simd safelen(8)
   for (i=0; i512; i++)
   {
 float t = a[i];
 if (t  0.0f  t  1.0e+17f)
   if (c[i] != 0)  /* c is integer array. */
 res += 1;
   }

 we can see the following phi node correspondent to res:

 # res_1 = PHI res_15(3), res_15(4), res_10(5)

 It is clear that we can optimize it to phi node with 2 arguments only
 and only one check can be used for phi predication (for reduction in
 our case), namely predicate of bb_5. In general case we can't do it
 even if we sort all phi argument values since we still have to produce
 a chain of cond expressions to perform phi predication (see comments
 for predicate_arbitrary_scalar_phi).

How so?  We can always use !(condition) for the last value, thus
treat it as an 'else' case.  That even works for

# res_1 = PHI res_15(3), res_15(4), res_10(5), res_10(7)

where the condition for edges 5 and 7 can be computed as
! (condition for 3 || condition for 4).

Of course it is worthwhile to also sort single-occurances first
so your case gets just the condiiton for edge 5 and its inversion
used for edges 3 and 4 combined.

 2. Why we need to introduce find_insertion_point?
  Let's consider another test-case extracted from 175.vpr ( t5.c is
 attached) and we can see that bb_7 and bb_9 containig phi nodes has
 only critical incoming edges and both contain code computing edge
 predicates, e.g.

 bb 7:
 # xmax_edge_30 = PHI xmax_edge_36(6), xmax_edge_18(5)
 _46 = xmax_17 == xmax_37;
 _47 = xmax_17 == xmax_27;
 _48 = _46  _47;
 _53 = xmax_17 == xmax_37;
 _54 = ~_53;
 _55 = xmax_17 == xmax_27;
 _56 = _54  _55;
 _57 = _48 | _56;
 xmax_edge_19 = xmax_edge_39 + 1;
 goto bb 11;

 It is evident that we can not put phi predication at the block
 beginning but need to put it after predicate computations.
 Note also that if there are no critical edges for phi arguments
 insertion point will be after labels Note also that phi result can
 have use in this block too, so we can't put predication code to the
 block end.

So the issue is that predicate insertion for edge predicates does
not happen on the edge but somewhere else (generally impossible
for critical edges unless you split them).

I think I've told you before that I prefer simple solutions to such issues,
like splitting the edge!  Certainly not involving a function walking
GENERIC expressions.

Thanks,
Richard.

 Let me know if you still have any questions.

 Best regards.
 Yuri.




 2014-11-28 15:43 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 12, 2014 at 2:35 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Hi All,

 Here is the second patch related to extended predication.
 Few comments which explain a main goal of design.

 1. I don't want to insert any critical edge splitting since it may
 lead to less efficient binaries.
 2. One special case of extended PHI node predication was introduced
 when #arguments is more than 2 but only two arguments are different
 and one argument has the only occurrence. For such PHI conditional
 scalar reduction is applied.
 This is correspondent to the following statement:
 if (q1  q2  q3) var++
  New function phi_has_two_different_args was introduced to detect such phi.
 3. Original algorithm for PHI predication used assumption that at
 least one incoming edge for blocks containing PHI is not critical - it
 guarantees that all computations related to predicate of normal edge
 are already inserted above this block and
 code related to PHI predication can be inserted at the beginning of
 block. But this is not true for critical edges for which predicate
 computations are  in the block where code for phi predication must be
 inserted. So new function find_insertion_point is introduced which is
 simply found out the last statement in block defining predicates
 correspondent to all incoming edges and insert phi predication code
 after it (with some minor exceptions).

 Unfortunately the patch doesn't apply for me - I get

 patch:  malformed patch at line 505: @@ -1720,6 +2075,8 @@
 predicate_all_scalar_phis (struct loop *loop)

 a few remarks nevertheless.  I don't see how we need both
 predicate_extended_scalar_phi and predicate_arbitrary_scalar_phi.
 Couldn't we simply sort an array of (edge, value) pairs after value
 and handle equal values specially in predicate_extended_scalar_phi?
 That would even make PHI a, a, b, c, c more optimal.

 I don't understand the need for find_insertion_point.  All SSA names
 required for the 

Re: [PATCH linemap] Make some asserts fail gracefully

2014-12-02 Thread Dodji Seketeli
Manuel López-Ibáñez lopeziba...@gmail.com writes:


 +/* Assert that becomes a conditional expression when not checking.

For the sake of clarity towards newcomers, I'd say:

Assert that becomes a conditional expression when checking is
disabled at compilation time.

 +   Use this for conditions that should not happen but if they happen, it
 +   is better to handle them gracefully than ICE. Usage:

Similarly, I'd say:

Use this for conditions that should not happen but if they happen, it is
better to handle them gracefully rather than crash randomly later.
Usage:

OK with those changes and thank for your time and dedication.

Cheers,

-- 
Dodji


[PATCH, CHKP] Don't generate bndret for not instrumented calls

2014-12-02 Thread Ilya Enkovich
Hi,

Currently bndret is generated each time we need to get bounds for returned 
pointer.  It causes bndret generated for not instrumented calls incuding 
builtin function calls.  Troubles appear when such builtin call is optimized 
out - bndret needs to be handled appropriately.  Since we don't want not 
instrumented builtin calls optimizers to be affected by instrumentation, we 
better avoid bndret for not instrumented calls.  This patch uses zero bounds 
when we don't expect call to return bounds.

Bootstrapped and tested on  x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

* tree-chkp.c (chkp_call_returns_bounds_p): New.
(chkp_build_returned_bound): Use zero bounds as
returned by calls not returning bounds.

gcc/testsuite/

2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.target/i386/chkp-bndret.c: New.
* gcc.target/i386/chkp-strchr.c: New.


diff --git a/gcc/testsuite/gcc.target/i386/chkp-bndret.c 
b/gcc/testsuite/gcc.target/i386/chkp-bndret.c
new file mode 100644
index 000..3498058
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/chkp-bndret.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target mpx } */
+/* { dg-options -fcheck-pointer-bounds -mmpx -O2 -fdump-tree-chkp } */
+/* { dg-final { scan-tree-dump-not bndret chkp } } */
+/* { dg-final { cleanup-tree-dump chkp } } */
+
+#include string.h
+
+extern int *test1 (int *p) __attribute__((bnd_legacy));
+
+int *
+test2 (int *p)
+{
+  return test1 (p);
+}
diff --git a/gcc/testsuite/gcc.target/i386/chkp-strchr.c 
b/gcc/testsuite/gcc.target/i386/chkp-strchr.c
new file mode 100644
index 000..94a5eaa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/chkp-strchr.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target mpx } */
+/* { dg-options -fcheck-pointer-bounds -mmpx -O2 } */
+
+#include string.h
+
+static char *
+test1 (char *str)
+{
+  return strrchr (str, '_');
+}
+
+char *
+test2 ()
+{
+  return test1 (test_string);
+}
diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index 3e38691..f7def51 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -2080,6 +2080,38 @@ chkp_get_nonpointer_load_bounds (void)
   return chkp_get_zero_bounds ();
 }
 
+/* Return 1 if may use bndret call to get bounds for pointer
+   returned by CALL.  */
+static bool
+chkp_call_returns_bounds_p (gcall *call)
+{
+  if (gimple_call_internal_p (call))
+return false;
+
+  tree fndecl = gimple_call_fndecl (call);
+
+  if (fndecl  DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD)
+return false;
+
+  if (fndecl
+   lookup_attribute (bnd_legacy, DECL_ATTRIBUTES (fndecl)))
+return false;
+
+  if (fndecl  DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL)
+{
+  if (chkp_instrument_normal_builtin (fndecl))
+   return true;
+
+  if (!lookup_attribute (always_inline, DECL_ATTRIBUTES (fndecl)))
+   return false;
+
+  struct cgraph_node *clone = chkp_maybe_create_clone (fndecl);
+  return (clone  gimple_has_body_p (clone-decl));
+}
+
+  return true;
+}
+
 /* Build bounds returned by CALL.  */
 static tree
 chkp_build_returned_bound (gcall *call)
@@ -2154,7 +2186,7 @@ chkp_build_returned_bound (gcall *call)
 
   bounds = chkp_find_bounds (gimple_call_arg (call, argno), iter);
 }
-  else
+  else if (chkp_call_returns_bounds_p (call))
 {
   gcc_assert (TREE_CODE (gimple_call_lhs (call)) == SSA_NAME);
 
@@ -2172,6 +2204,8 @@ chkp_build_returned_bound (gcall *call)
 
   update_stmt (stmt);
 }
+  else
+bounds = chkp_get_zero_bounds ();
 
   if (dump_file  (dump_flags  TDF_DETAILS))
 {


[PATCH, CHKP] Don't try to optimize bounds returned by strchr

2014-12-02 Thread Ilya Enkovich
Hi,

For strchr calls bounds of the first argument are considered as returned which 
is wrong because NULL may be returned.  This patch fixes that.  Bootstrapped 
and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

* tree-chkp.c (chkp_build_returned_bound): Don't predict
return bounds for strchr calls.


diff --git a/gcc/tree-chkp.c b/gcc/tree-chkp.c
index f7def51..1d91aa6 100644
--- a/gcc/tree-chkp.c
+++ b/gcc/tree-chkp.c
@@ -2163,8 +2163,7 @@ chkp_build_returned_bound (gcall *call)
 }
   /* Do not use retbnd when returned bounds are equal to some
  of passed bounds.  */
-  else if ((gimple_call_return_flags (call)  ERF_RETURNS_ARG)
-  || gimple_call_builtin_p (call, BUILT_IN_STRCHR))
+  else if (gimple_call_return_flags (call)  ERF_RETURNS_ARG)
 {
   gimple_stmt_iterator iter = gsi_for_stmt (call);
   unsigned int retarg = 0, argno;


Re: [PATCH] rs6000: Fix PR target/64115

2014-12-02 Thread David Edelsohn
On Tue, Dec 2, 2014 at 7:56 AM, Ulrich Weigand uweig...@de.ibm.com wrote:
 Hello,

 as discussed in the PR, this patch removes an invalid ENABLE_CHECKING
 sanity check in rs6000_delegitimize_address, fixing the ICE.

 Tested on powerpc64-linux.

 OK for mainline / 4.9 / 4.8?

 Bye,
 Ulrich


 ChangeLog:

 PR target/64115
 * config/rs6000/rs6000.c (rs6000_delegitimize_address): Remove
 invalid UNSPEC_TOCREL sanity check under ENABLE_CHECKING.

Okay.

Thanks, David


Re: [PATCH] Improve spillcost of literal pool loads

2014-12-02 Thread H.J. Lu
On Tue, Dec 2, 2014 at 3:34 AM, Wilco Dijkstra wdijk...@arm.com wrote:
 Jeff Law wrote:
 OK with the appropropriate ChangeLog entires.  THe original for
 ira-costs.c was fine, so you just need the trivial one for the testcase.

 ChangeLog below - Jiong, could you commit for me please?

 2014-12-02  Wilco Dijkstra  wdijk...@arm.com

 * gcc/ira-costs.c (scan_one_insn): Improve spill cost adjustment.
 * gcc/testsuite/gcc.target/aarch64/remat1.c: New testcase.

How did you test on Linux/ia32?  It introduced many regressions:

FAIL: gcc.dg/hoist-register-pressure-3.c scan-rtl-dump hoist
PRE/HOIST: end of bb .* copying expression
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O1 -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O1 -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O2 -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O2 -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
-fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions -mforce-drap -fpic execution
test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
-funroll-loops -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
-funroll-loops -mforce-drap -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
-mforce-drap -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -g -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -g -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -Os -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -Os -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O2 -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O2 -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
-fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions -mforce-drap -fpic execution
test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
-funroll-loops -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
-funroll-loops -mforce-drap -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
-mforce-drap -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -g -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -g -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -Os -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -Os -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O1 -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O1 -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O2 -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O2 -mforce-drap -fpic
execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer
-fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer
-funroll-all-loops -finline-functions -mforce-drap -fpic execution
test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer
-funroll-loops -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer
-funroll-loops -mforce-drap -fpic execution test
FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer

[PATCH, CHKP] Keep instrumentation references consistency when merging cgraph nodes

2014-12-02 Thread Ilya Enkovich
Hi,

Currently cgraph nodes merge may break instrumented_version references.  It 
depends on order nodes are read and merged and therefore problem is not nicely 
reproducible.  I couldn't make a small reproducer.  This patch fixes problem 
and 253.perlbmk benchmark build with '-O3 -flto -fcheck-pointer-bounds -mmpx'.  
Bootstrapped and tested on x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

* lto-cgraph.c (input_cgraph_1): Don't break existing
instrumentation clone references.
* lto/lto-symtab.c (lto_cgraph_replace_node): Redirect
instrumented_version references appropriately.


diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c
index d424e14..b310b44 100644
--- a/gcc/lto-cgraph.c
+++ b/gcc/lto-cgraph.c
@@ -1562,7 +1562,18 @@ input_cgraph_1 (struct lto_file_decl_data *file_data,
 
  cnode-instrumented_version = cgraph_node::get (cnode-orig_decl);
  if (cnode-instrumented_version)
-   cnode-instrumented_version-instrumented_version = cnode;
+   {
+ /* We may have multiple nodes for a single function which
+will be merged later.  To have a proper merge we need
+to keep instrumentation_version reference between nodes
+consistent: each instrumented_version reference should
+have proper reverse reference.  Thus don't break existing
+instrumented_version reference if it already exists.  */
+ if (cnode-instrumented_version-instrumented_version)
+   cnode-instrumented_version = NULL;
+ else
+   cnode-instrumented_version-instrumented_version = cnode;
+   }
 
  /* Restore decl names reference.  */
  if (IDENTIFIER_TRANSPARENT_ALIAS (DECL_ASSEMBLER_NAME 
(cnode-decl))
diff --git a/gcc/lto/lto-symtab.c b/gcc/lto/lto-symtab.c
index 4c4e48a..f5d82a7 100644
--- a/gcc/lto/lto-symtab.c
+++ b/gcc/lto/lto-symtab.c
@@ -99,6 +99,20 @@ lto_cgraph_replace_node (struct cgraph_node *node,
   /* Redirect incomming references.  */
   prevailing_node-clone_referring (node);
 
+  /* Fix instrumentation references.  */
+  if (node-instrumented_version)
+{
+  gcc_assert (node-instrumentation_clone
+ == prevailing_node-instrumentation_clone);
+  node-instrumented_version-instrumented_version = prevailing_node;
+  if (!prevailing_node-instrumented_version)
+   prevailing_node-instrumented_version = node-instrumented_version;
+  /* Need to reset node-instrumented_version to NULL,
+otherwise node removal code would reset
+node-instrumented_version-instrumented_version.  */
+  node-instrumented_version = NULL;
+}
+
   ipa_merge_profiles (prevailing_node, node);
   lto_free_function_in_decl_state_for_node (node);
 


Re: [PATCH] Improve spillcost of literal pool loads

2014-12-02 Thread H.J. Lu
On Tue, Dec 2, 2014 at 5:51 AM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Dec 2, 2014 at 3:34 AM, Wilco Dijkstra wdijk...@arm.com wrote:
 Jeff Law wrote:
 OK with the appropropriate ChangeLog entires.  THe original for
 ira-costs.c was fine, so you just need the trivial one for the testcase.

 ChangeLog below - Jiong, could you commit for me please?

 2014-12-02  Wilco Dijkstra  wdijk...@arm.com

 * gcc/ira-costs.c (scan_one_insn): Improve spill cost adjustment.
 * gcc/testsuite/gcc.target/aarch64/remat1.c: New testcase.

 How did you test on Linux/ia32?  It introduced many regressions:

 FAIL: gcc.dg/hoist-register-pressure-3.c scan-rtl-dump hoist
 PRE/HOIST: end of bb .* copying expression
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O1 -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O1 -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O2 -flto
 -fno-use-linker-plugin -flto-partition=none -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O2 -flto
 -fno-use-linker-plugin -flto-partition=none -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O2 -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O2 -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
 -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
 -funroll-all-loops -finline-functions -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
 -funroll-all-loops -finline-functions -mforce-drap -fpic execution
 test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
 -funroll-loops -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
 -funroll-loops -mforce-drap -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -fomit-frame-pointer
 -mforce-drap -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -g -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -O3 -g -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -Os -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-1.c   -Os -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O2 -flto
 -fno-use-linker-plugin -flto-partition=none -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O2 -flto
 -fno-use-linker-plugin -flto-partition=none -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O2 -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O2 -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
 -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
 -funroll-all-loops -finline-functions -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
 -funroll-all-loops -finline-functions -mforce-drap -fpic execution
 test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
 -funroll-loops -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
 -funroll-loops -mforce-drap -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -fomit-frame-pointer
 -mforce-drap -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -g -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O3 -g -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -Os -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -Os -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O1 -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O1 -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O2 -flto
 -fno-use-linker-plugin -flto-partition=none -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O2 -flto
 -fno-use-linker-plugin -flto-partition=none -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O2 -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O2 -mforce-drap -fpic
 execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer
 -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer
 -funroll-all-loops -finline-functions -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer
 -funroll-all-loops -finline-functions -mforce-drap -fpic execution
 test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   -O3 -fomit-frame-pointer
 -funroll-loops -fpic execution test
 FAIL: gcc.dg/torture/stackalign/setjmp-4.c   

Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-12-02 Thread Kyrill Tkachov


On 23/09/14 09:27, James Greenhalgh wrote:

On Mon, Sep 15, 2014 at 11:56:03AM +0100, Andrew Stubbs wrote:

On 15/09/14 10:46, Richard Earnshaw wrote:

Hmm, I wonder if arm_override_options should reject neon + (arch  7).

Is this more to your taste?

Is this really such a good idea? It causes carnage throughout the
testsuite if you have configured with support for Neon and the testcase
is written with dg-options for a pre-armv7-a -march value.

For example in:
   testsuite/gcc.target/arm/di-longlong64-sync-withhelpers.c

Which forces -march=armv5.

Perhaps you just have to fix the effective-target-ok tests - but then
we lose swathes of test coverage.


This also causes subtle Linux kernel compile failures.
Over there they use make rules where they check if the compiler supports 
-march=armv5te and if not use -march=armv4t.
With this patch if the compiler is configured with something like 
--with-fpu=neon the test will fail with your error message,

even though the compiler supports -march=armv5te.

Kyrill




Thanks,
James


Andrew

P.S. arm_override_options was renamed in 2010.
2014-09-15  Andrew Stubbs  a...@codesourcery.com

* gcc/config/arm/arm.c (arm_option_override): Reject -mfpu=neon
when architecture is older than ARMv7.

Index: gcc/config/arm/arm.c
===
--- gcc/config/arm/arm.c(revision 215228)
+++ gcc/config/arm/arm.c(working copy)
@@ -2845,6 +2845,9 @@
  
arm_fpu_desc = all_fpus[arm_fpu_index];
  
+  if (TARGET_NEON  !arm_arch7)

+error (target CPU does not support NEON);
+
switch (arm_fpu_desc-model)
  {
  case ARM_FP_MODEL_VFP:





Re: PATCH: ICE: SIGSEGV in decide_alg() with -mmemset-strategy=libcall:-1:align -minline-all-stringops

2014-12-02 Thread H.J. Lu
On Tue, Dec 2, 2014 at 5:25 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Sat, Nov 29, 2014 at 3:00 PM, H.J. Lu hjl.to...@gmail.com wrote:
 When searching for an usable algorithm with -minline-all-stringops,
 decide_alg stops when it sees libcall even if there is a usable
 algorithm.  It goes into an infinite loop.  This patch changes
 decide_alg to stop searching only if there aren't any usable algorithms.
 Testd on Linux/x86-64.  OK for trunk.


 H.J.
 
 gcc/

 2014-11-29  H.J. Lu  hongjiu...@intel.com

 PR target/64108
 * config/i386/i386.c (decide_alg): Stop only if there aren't
 any usable algorithms.

 gcc/testsuite/

 2014-11-29  H.J. Lu  hongjiu...@intel.com

 PR target/64108
 * gcc.target/i386/memset-strategy-2.c: New test.

 OK with a change below. Please wait a couple of days before
 backporting the fix - the fix looks kind of obvious, but this part of
 the code is somehow complex and has biten us in the past.

Will do.

 Thanks,
 Uros.

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 2493130..d789635 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -24464,7 +24464,9 @@ decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT 
 expected_size,
   *noalign = alg_noalign;
   return alg;
 }
 - break;
 + /* Stop only if there aren't any usable algorithms.  */
 + if (!any_alg_usable_p)
 +   break;

 Please use else if ... and remove unneded argument. IMO, the code
 speaks for itself.

Which unneeded argument to remove? All arguments of decide_alg are
used.

 }
   else if (alg_usable_p (candidate, memset))
 {
 diff --git a/gcc/testsuite/gcc.target/i386/memset-strategy-2.c 
 b/gcc/testsuite/gcc.target/i386/memset-strategy-2.c
 new file mode 100644
 index 000..aafa54d
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/i386/memset-strategy-2.c
 @@ -0,0 +1,10 @@
 +/* PR target/64108 */
 +/* { dg-do compile } */
 +/* { dg-options -O2 -march=atom -mmemset-strategy=libcall:-1:align 
 -minline-all-stringops } */
 +
 +char a[2048];
 +void t (void)
 +{
 +  __builtin_memset (a, 1, 2048);
 +}
 +



-- 
H.J.


Re: PATCH: ICE: SIGSEGV in decide_alg() with -mmemset-strategy=libcall:-1:align -minline-all-stringops

2014-12-02 Thread Uros Bizjak
On Tue, Dec 2, 2014 at 3:03 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Dec 2, 2014 at 5:25 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Sat, Nov 29, 2014 at 3:00 PM, H.J. Lu hjl.to...@gmail.com wrote:
 When searching for an usable algorithm with -minline-all-stringops,
 decide_alg stops when it sees libcall even if there is a usable
 algorithm.  It goes into an infinite loop.  This patch changes
 decide_alg to stop searching only if there aren't any usable algorithms.
 Testd on Linux/x86-64.  OK for trunk.


 H.J.
 
 gcc/

 2014-11-29  H.J. Lu  hongjiu...@intel.com

 PR target/64108
 * config/i386/i386.c (decide_alg): Stop only if there aren't
 any usable algorithms.

 gcc/testsuite/

 2014-11-29  H.J. Lu  hongjiu...@intel.com

 PR target/64108
 * gcc.target/i386/memset-strategy-2.c: New test.

 OK with a change below. Please wait a couple of days before
 backporting the fix - the fix looks kind of obvious, but this part of
 the code is somehow complex and has biten us in the past.

 Will do.

 Thanks,
 Uros.

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 2493130..d789635 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -24464,7 +24464,9 @@ decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT 
 expected_size,
   *noalign = alg_noalign;
   return alg;
 }
 - break;
 + /* Stop only if there aren't any usable algorithms.  */
 + if (!any_alg_usable_p)
 +   break;

 Please use else if ... and remove unneded argument. IMO, the code
 speaks for itself.

 Which unneeded argument to remove? All arguments of decide_alg are
 used.

Eh, slip of the tongue. Unnedded *comment*, the one your patch is adding.

Uros.


Re: PATCH: ICE: SIGSEGV in decide_alg() with -mmemset-strategy=libcall:-1:align -minline-all-stringops

2014-12-02 Thread H.J. Lu
On Tue, Dec 2, 2014 at 6:05 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, Dec 2, 2014 at 3:03 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Dec 2, 2014 at 5:25 AM, Uros Bizjak ubiz...@gmail.com wrote:
 On Sat, Nov 29, 2014 at 3:00 PM, H.J. Lu hjl.to...@gmail.com wrote:
 When searching for an usable algorithm with -minline-all-stringops,
 decide_alg stops when it sees libcall even if there is a usable
 algorithm.  It goes into an infinite loop.  This patch changes
 decide_alg to stop searching only if there aren't any usable algorithms.
 Testd on Linux/x86-64.  OK for trunk.


 H.J.
 
 gcc/

 2014-11-29  H.J. Lu  hongjiu...@intel.com

 PR target/64108
 * config/i386/i386.c (decide_alg): Stop only if there aren't
 any usable algorithms.

 gcc/testsuite/

 2014-11-29  H.J. Lu  hongjiu...@intel.com

 PR target/64108
 * gcc.target/i386/memset-strategy-2.c: New test.

 OK with a change below. Please wait a couple of days before
 backporting the fix - the fix looks kind of obvious, but this part of
 the code is somehow complex and has biten us in the past.

 Will do.

 Thanks,
 Uros.

 diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
 index 2493130..d789635 100644
 --- a/gcc/config/i386/i386.c
 +++ b/gcc/config/i386/i386.c
 @@ -24464,7 +24464,9 @@ decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT 
 expected_size,
   *noalign = alg_noalign;
   return alg;
 }
 - break;
 + /* Stop only if there aren't any usable algorithms.  */
 + if (!any_alg_usable_p)
 +   break;

 Please use else if ... and remove unneded argument. IMO, the code
 speaks for itself.

 Which unneeded argument to remove? All arguments of decide_alg are
 used.

 Eh, slip of the tongue. Unnedded *comment*, the one your patch is adding.

 Uros.

This is what I checked in.  I will wait for a few days before backporting
it to 4.9.

Thanks.

-- 
H.J.
---
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3eacc5a..c746d0f 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2014-12-02  H.J. Lu  hongjiu...@intel.com
+
+ PR target/64108
+ * config/i386/i386.c (decide_alg): Stop only if there aren't
+ any usable algorithms.
+
 2014-12-02  Tom de Vries  t...@codesourcery.com

  PR rtl-optimization/63718
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 3397167..211c9e6 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -24464,7 +24464,8 @@ decide_alg (HOST_WIDE_INT count, HOST_WIDE_INT
expected_size,
   *noalign = alg_noalign;
   return alg;
 }
-  break;
+  else if (!any_alg_usable_p)
+break;
  }
   else if (alg_usable_p (candidate, memset))
  {
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 00da0bd..5d66849 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2014-12-02  H.J. Lu  hongjiu...@intel.com
+
+ PR target/64108
+ * gcc.target/i386/memset-strategy-2.c: New test.
+
 2014-12-02  Richard Biener  rguent...@suse.de

  * gcc.dg/torture/20141202-1.c: New testcase.
diff --git a/gcc/testsuite/gcc.target/i386/memset-strategy-2.c
b/gcc/testsuite/gcc.target/i386/memset-strategy-2.c
new file mode 100644
index 000..aafa54d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/memset-strategy-2.c
@@ -0,0 +1,10 @@
+/* PR target/64108 */
+/* { dg-do compile } */
+/* { dg-options -O2 -march=atom -mmemset-strategy=libcall:-1:align
-minline-all-stringops } */
+
+char a[2048];
+void t (void)
+{
+  __builtin_memset (a, 1, 2048);
+}
+


Re: [C++ PATCH] Detect UB in shifts in constexpr functions

2014-12-02 Thread Marek Polacek
On Fri, Nov 28, 2014 at 09:28:03AM -0500, Jason Merrill wrote:
 I was thinking even more detailed: one diagnostic for negative count, one
 diagnostic for count larger than the precision of the lhs, and then a third
 for overflow.
 
Alright, done.

 +  /* For signed x  y the following:
 + (unsigned) x  ((prec (lhs) - 1) - y)
 + if  1, is undefined.  */
 
 I meant briefly explaining where that formula comes from.

This was a little bit problematic, because I don't really know where
that formula comes from.  We use it in c-ubsan.c, so I've just taken
it from there.  It was Jakub who came with that formula, so maybe he
remembers its origin.  I wrote something more to the comment describing
it, even though that might not be enough.

Another change I made is that I've added missing cxx11 check for the
signed x  case.

Bootstrapped/regtested on ppc64-linux, ok for trunk?

2014-12-02  Marek Polacek  pola...@redhat.com

* constexpr.c (cxx_eval_check_shift_p): New function.
(cxx_eval_binary_expression): Call it.

* g++.dg/cpp0x/constexpr-shift1.C: New test.
* g++.dg/cpp1y/constexpr-shift1.C: New test.
* g++.dg/ubsan/pr63956.C: Add dg-errors.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 2184a2f..556c422 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -1470,6 +1470,73 @@ verify_constant (tree t, bool allow_non_constant, bool 
*non_constant_p,
   return *non_constant_p;
 }
 
+/* Check whether the shift operation with code CODE and type TYPE on LHS
+   and RHS is undefined.  If it is, explain why and give an error.  */
+
+static void
+cxx_eval_check_shift_p (location_t loc, enum tree_code code, tree type,
+   tree lhs, tree rhs)
+{
+  if ((code != LSHIFT_EXPR  code != RSHIFT_EXPR)
+  || TREE_CODE (lhs) != INTEGER_CST
+  || TREE_CODE (rhs) != INTEGER_CST)
+return;
+
+  tree lhstype = TREE_TYPE (lhs);
+  unsigned HOST_WIDE_INT uprec = TYPE_PRECISION (TREE_TYPE (lhs));
+
+  /* [expr.shift] The behavior is undefined if the right operand
+ is negative, or greater than or equal to the length in bits
+ of the promoted left operand.  */
+  if (tree_int_cst_sgn (rhs) == -1)
+{
+  error_at (loc, right operand of shift expression %q+E is negative,
+   build2_loc (loc, code, type, lhs, rhs));
+  return;
+}
+  if (compare_tree_int (rhs, uprec) = 0)
+{
+  error_at (loc, right operand of shift expression %q+E is = than 
+   the precision of the left operand,
+   build2_loc (loc, code, type, lhs, rhs));
+  return;
+}
+
+  /* The value of E1  E2 is E1 left-shifted E2 bit positions; [...]
+ if E1 has a signed type and non-negative value, and E1x2^E2 is
+ representable in the corresponding unsigned type of the result type,
+ then that value, converted to the result type, is the resulting value;
+ otherwise, the behavior is undefined.  */
+  if (code == LSHIFT_EXPR  !TYPE_UNSIGNED (lhstype)
+   (cxx_dialect = cxx11))
+{
+  if (tree_int_cst_sgn (lhs) == -1)
+   {
+ error_at (loc, left operand of shift expression %q+E is negative,
+   build2_loc (loc, code, type, lhs, rhs));
+ return;
+   }
+  /* For signed x  y the following:
+(unsigned) x  ((prec (lhs) - 1) - y)
+if  1, is undefined.  The right-hand side of this formula
+is the highest bit of the LHS that can be set (starting from 0),
+so that the shift doesn't overflow.  We then right-shift the LHS
+to see whether any other bit is set making the original shift
+undefined -- the result is not representable in the corresponding
+unsigned type.  */
+  tree t = build_int_cst (unsigned_type_node, uprec - 1);
+  t = fold_build2 (MINUS_EXPR, unsigned_type_node, t, rhs);
+  tree ulhs = fold_convert (unsigned_type_for (lhstype), lhs);
+  t = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ulhs), ulhs, t);
+  if (tree_int_cst_lt (integer_one_node, t))
+   {
+ error_at (loc, shift expression %q+E overflows,
+   build2_loc (loc, code, type, lhs, rhs));
+ return;
+   }
+}
+}
+
 /* Subroutine of cxx_eval_constant_expression.
Attempt to reduce the unary expression tree T to a compile time value.
If successful, return the value.  Otherwise issue a diagnostic
@@ -1532,6 +1599,8 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, 
tree t,
   else
r = build2_loc (loc, code, type, lhs, rhs);
 }
+  else
+cxx_eval_check_shift_p (loc, code, type, lhs, rhs);
   VERIFY_CONSTANT (r);
   return r;
 }
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C
index e69de29..1f4ee73 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C
@@ -0,0 +1,73 @@
+// { dg-do compile { target c++11 } }
+
+constexpr int
+fn1 (int i, int j)
+{
+  

Re: [PATCH fortran/linemap] Add enough column hint to fit any possible offset

2014-12-02 Thread Dodji Seketeli
Hello Manuel, Tobias,

Manuel López-Ibáñez lopeziba...@gmail.com writes:

 This patch actually does not touch linemap but I will appreciate
 Dodji's comments about the approach.

Thanks :-)

 The problem is that in case of long lines, the column hint of 120
 might be too small, thus we do not have enough locations within one
 line to point to a higher column (like 132 in the testcase). Giving as
 column hint the length of the line seems the right fix.

I see.

 The other issue that triggers with this testcase is that, even though
 the column hint allows having line 6 + column 132, since we never
 generate a location except for 6:0, if a new map is created for the
 new line, then it will consider that the last location was 6:0 and the
 new map for line 7 will start at location+1, hence, no offset within
 line 6 can be represented.

Right.  This is done by design to allow for a kind of 'compression' of
the theoretical source location space.  That is, the number of source
locations known to the line map sub-system roughly becomes (roughly) the
number of tokens issued by the tokenizer, rather than the total number
of columns multiplied by the number of lines of the input program.

And of course, we are we today willing something in-between, I guess.
:-)

 My fix here is to create a dummy location for line_len -1 (the last
 column in the line). This forces the next map to start after this last
 column's location.

I think this makes sense.  I'll get to the theoretically unlimited line
length case that Tobias alludes to later below.

 I'm not sure if this latter fix is the approach we want to take.

I think this approach is fine.

 If so, then we may want to change linemap.c itself to force new maps
 to reserve enough locations for any offset in the line instead of
 doing last_location+1.

Hmmh, we must keep in mind that this whole line map thing should keep
working for cases where the primary way to interact with the source
location management sub-system is basically cpp_get_token().  So if that
condition is satisfied, then, why not.  I guess I'd need to look at
specifics of what you are proposing here before talking :-)

[...]

Tobias Burnus bur...@net-b.de writes:

[...]

 My fix here is to create a dummy location for line_len -1 (the last
 column in the line). This forces the next map to start after this
 last column's location.

 My feeling is that that doesn't work for
 -ffree-line-length-none/-ffixed-line-length-none. In that case,
 gfc_option.free_line_length == 0 and there is no limit to the number
 of characters per line. Thus, I fear that

b-location
 - = linemap_line_start (line_table, current_file-line++, 120);
 + = linemap_line_start (line_table, current_file-line++, line_len);


 might even set the line length to 0 – but I haven't checked. For all
 other values, it should work.

I am not sure what the value of line_len is when
gfc_option.free_line_length == 0.  If it's not zero, then we are good.
If it's zero (with the semantics of that zero meaning that the line
might be very big), then we might just get the actual length of the line
in terms of column count and pass that to linemap_line_start.

We must just keep in mind that the line subsystem has a hard limit on
column numbers (I think it's 100 000 columns at the moment).  Passed
that limit, we give up tracking column numbers in source locations; that
means that all columns numbers are set to zero.


I hope this is useful.

Cheers,

-- 
Dodji


[PATCH, CHKP] Fix instrumentation clones privatization

2014-12-02 Thread Ilya Enkovich
Hi,

Currently symbol names privatization doesn't work for instrumentation clones 
because clones assembler name is transparent alias and therefore alias target 
should be privatized instead.  This patch does it.  Bootstrapped and tested on 
x86_64-unknown-linux-gnu.  OK for trunk?

Thanks,
Ilya
--
gcc/

2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

* lto/lto-partition.c (privatize_symbol_name): Correctly
privatize instrumentation clones.

gcc/testsuite/

2014-12-02  Ilya Enkovich  ilya.enkov...@intel.com

* gcc.dg/lto/lto.exp: Load mpx-dg.exp.
* gcc.dg/lto/chkp-privatize_0.c: New.
* gcc.dg/lto/chkp-privatize_1.c: New.


diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 65f0582..809a493 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -787,8 +787,16 @@ static bool
 privatize_symbol_name (symtab_node *node)
 {
   tree decl = node-decl;
-  const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
-  cgraph_node *cnode;
+  cgraph_node *cnode = dyn_cast cgraph_node * (node);
+  const char *name;
+
+  /* If we want to privatize instrumentation clone
+ then we need to change original function name
+ which is used via transparent alias chain.  */
+  if (cnode  cnode-instrumentation_clone)
+decl = cnode-orig_decl;
+
+  name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
 
   /* Our renaming machinery do not handle more than one change of assembler 
name.
  We should not need more than one anyway.  */
@@ -821,15 +829,20 @@ privatize_symbol_name (symtab_node *node)
 (DECL_ASSEMBLER_NAME (decl)));
   /* We could change name which is a target of transparent alias
  chain of instrumented function name.  Fix alias chain if so  .*/
-  if ((cnode = dyn_cast cgraph_node * (node))
-   !cnode-instrumentation_clone
-   cnode-instrumented_version
-   cnode-instrumented_version-orig_decl == decl)
+  if (cnode)
 {
-  tree iname = DECL_ASSEMBLER_NAME (cnode-instrumented_version-decl);
-
-  gcc_assert (IDENTIFIER_TRANSPARENT_ALIAS (iname));
-  TREE_CHAIN (iname) = DECL_ASSEMBLER_NAME (decl);
+  tree iname = NULL_TREE;
+  if (cnode-instrumentation_clone)
+   iname = DECL_ASSEMBLER_NAME (cnode-decl);
+  else if (cnode-instrumented_version
+   cnode-instrumented_version-orig_decl == decl)
+   iname = DECL_ASSEMBLER_NAME (cnode-instrumented_version-decl);
+
+  if (iname)
+   {
+ gcc_assert (IDENTIFIER_TRANSPARENT_ALIAS (iname));
+ TREE_CHAIN (iname) = DECL_ASSEMBLER_NAME (decl);
+   }
 }
   if (symtab-dump_file)
 fprintf (symtab-dump_file,
diff --git a/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c 
b/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c
new file mode 100644
index 000..4c899e8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/chkp-privatize_0.c
@@ -0,0 +1,18 @@
+/* { dg-lto-do link } */
+/* { dg-require-effective-target mpx { target { i?86-*-* x86_64-*-* } } } */
+/* { dg-lto-options { { -fPIC -flto -flto-partition=max -fcheck-pointer-bounds 
-mmpx } } } */
+
+static int
+__attribute__ ((noinline))
+func1 (int i)
+{
+  return i + 2;
+}
+
+extern int func2 (int i);
+
+int
+main (int argc, char **argv)
+{
+  return func1 (argc) + func2 (argc) + 1;
+}
diff --git a/gcc/testsuite/gcc.dg/lto/chkp-privatize_1.c 
b/gcc/testsuite/gcc.dg/lto/chkp-privatize_1.c
new file mode 100644
index 000..db39e7f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/chkp-privatize_1.c
@@ -0,0 +1,8 @@
+int func1 = 10;
+
+int
+func2 (int i)
+{
+  func1++;
+  return i + func1;
+}
diff --git a/gcc/testsuite/gcc.dg/lto/lto.exp b/gcc/testsuite/gcc.dg/lto/lto.exp
index 2586297..797c02f 100644
--- a/gcc/testsuite/gcc.dg/lto/lto.exp
+++ b/gcc/testsuite/gcc.dg/lto/lto.exp
@@ -30,6 +30,7 @@ if $tracelevel then {
 # Load procedures from common libraries. 
 load_lib standard.exp
 load_lib gcc.exp
+load_lib mpx-dg.exp
 
 # Load the language-independent compabibility support procedures.
 load_lib lto.exp


[PATCH][1/n] Fix PR14541

2014-12-02 Thread Richard Biener

This is a (tiny) piece of fixing PR14541 which basically complains
that most math function simplification doesn't happen on GIMPLE.
This is of course because nobody re-builds the nested GENERIC
call expressions that would be necessary to trigger the various
simplifications implemneted in builtins.c.

The following patch thus moves the simplifications from
fold_builtin_logarithm to match.pd patterns, removing
the corresponding code from builtins.c.

Note that this results in those patterns no longer applying
at -O0 (or in C/C++ constant expression simplification) as
currently there is no builtin folding implemented in the
GENERIC code-path of match-and-simplify.  I am considering
that premature optimization (even more so at -O0...).
Note that the machinery only omits support for outermost
calls, thus we cannot simplify log (exp (x)) but we can
simplify x * pow (x, CST) to pow (x, CST+1) for example.
The simplifications also won't apply during gimplification
but only after SSA is built.

On the branch there are also complete implementations of
the fold_builtin_pow and fold_builtin_sqrt simplifications.

I'd like to get a little bit coverage of builtin simplification
in the wild and wonder if the following is a piece that would
be acceptable for trunk at this point (well, it fixes part of
a bug which is fine in stage3).  Eventually also merging
the builtin bits from the branch.

Are there otherwise any opinions on not handling these for GENERIC?
Of course in the end I was just lazy and support shouldn't be too
difficult.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Thanks,
Richard.

2014-12-02  Richard Biener  rguent...@suse.de

PR middle-end/14541
* builtins.c (fold_builtin_logarithm): Implement simplifications ...
* match.pd: ... here as patterns.

Index: gcc/match.pd
===
--- gcc/match.pd(revision 218265)
+++ gcc/match.pd(working copy)
@@ -916,3 +916,90 @@ (define_operator_list inverted_tcc_compa
 (icmp @0 @1))
(if (ic == ncmp)
 (ncmp @0 @1)
+
+
+/* Simplification of math builtins.  */
+
+(define_operator_list LOG BUILT_IN_LOGF BUILT_IN_LOG BUILT_IN_LOGL)
+(define_operator_list EXP BUILT_IN_EXPF BUILT_IN_EXP BUILT_IN_EXPL)
+(define_operator_list LOG2 BUILT_IN_LOG2F BUILT_IN_LOG2 BUILT_IN_LOG2L)
+(define_operator_list EXP2 BUILT_IN_EXP2F BUILT_IN_EXP2 BUILT_IN_EXP2L)
+(define_operator_list LOG10 BUILT_IN_LOG10F BUILT_IN_LOG10 BUILT_IN_LOG10L)
+(define_operator_list EXP10 BUILT_IN_EXP10F BUILT_IN_EXP10 BUILT_IN_EXP10L)
+(define_operator_list POW BUILT_IN_POWF BUILT_IN_POW BUILT_IN_POWL)
+(define_operator_list POW10 BUILT_IN_POW10F BUILT_IN_POW10 BUILT_IN_POW10L)
+(define_operator_list SQRT BUILT_IN_SQRTF BUILT_IN_SQRT BUILT_IN_SQRTL)
+(define_operator_list CBRT BUILT_IN_CBRTF BUILT_IN_CBRT BUILT_IN_CBRTL)
+
+
+/* fold_builtin_logarithm */
+(if (flag_unsafe_math_optimizations)
+ /* Special case, optimize logN(expN(x)) = x.  */
+ (for logs (LOG LOG2 LOG10)
+  exps (EXP EXP2 EXP10)
+  (simplify
+   (logs (exps @0))
+@0))
+ /* Optimize logN(func()) for various exponential functions.  We
+want to determine the value x and the power exponent in
+order to transform logN(x**exponent) into exponent*logN(x).  */
+ (for logs (LOG LOG LOG LOG
+LOG2 LOG2 LOG2 LOG2
+   LOG10 LOG10 LOG10 LOG10)
+  exps (EXP EXP2 EXP10 POW10)
+  (simplify
+   (logs (exps @0))
+   (with {
+ tree x;
+ switch (exps)
+   {
+   CASE_FLT_FN (BUILT_IN_EXP):
+ /* Prepare to do logN(exp(exponent) - exponent*logN(e).  */
+x = build_real (type, real_value_truncate (TYPE_MODE (type),
+   dconst_e ()));
+ break;
+   CASE_FLT_FN (BUILT_IN_EXP2):
+ /* Prepare to do logN(exp2(exponent) - exponent*logN(2).  */
+ x = build_real (type, dconst2);
+ break;
+   CASE_FLT_FN (BUILT_IN_EXP10):
+   CASE_FLT_FN (BUILT_IN_POW10):
+/* Prepare to do logN(exp10(exponent) - exponent*logN(10).  */
+{
+  REAL_VALUE_TYPE dconst10;
+  real_from_integer (dconst10, VOIDmode, 10, SIGNED);
+  x = build_real (type, dconst10);
+}
+ break;
+   }
+ }
+(mult (logs { x; }) @0
+ (for logs (LOG LOG
+LOG2 LOG2
+   LOG10 LOG10)
+  exps (SQRT CBRT)
+  (simplify
+   (logs (exps @0))
+   (with {
+ tree x;
+ switch (exps)
+   {
+   CASE_FLT_FN (BUILT_IN_SQRT):
+/* Prepare to do logN(sqrt(x) - 0.5*logN(x).  */
+x = build_real (type, dconsthalf);
+ break;
+   CASE_FLT_FN (BUILT_IN_CBRT):
+/* Prepare to do logN(cbrt(x) - (1/3)*logN(x).  */
+ x = build_real (type, real_value_truncate (TYPE_MODE (type),
+   dconst_third ()));
+ break;
+   }
+ }
+(mult { x; } (logs @0)
+ /* 

Re: [C++ PATCH] Detect UB in shifts in constexpr functions

2014-12-02 Thread Jason Merrill

On 12/02/2014 09:14 AM, Marek Polacek wrote:

On Fri, Nov 28, 2014 at 09:28:03AM -0500, Jason Merrill wrote:

I was thinking even more detailed: one diagnostic for negative count, one
diagnostic for count larger than the precision of the lhs, and then a third
for overflow.


Alright, done.


Thanks.  These errors also need to be conditional on (!ctx-quiet), and 
if we see one of these conditions we need to set *non_constant_p.



This was a little bit problematic, because I don't really know where
that formula comes from.  We use it in c-ubsan.c, so I've just taken
it from there.  It was Jakub who came with that formula, so maybe he
remembers its origin.  I wrote something more to the comment describing
it, even though that might not be enough.


Looks good.

Jason



Re: [PATCH 2/3] Extended if-conversion

2014-12-02 Thread Yuri Rumyantsev
Thanks Richard for your quick reply!

1. I agree that we can combine predicate_extended_ and
predicate_arbitrary_ to one function as you proposed.
2. What is your opinion about using more simple decision about
insertion point - if bb has use of phi result insert phi predication
before it and at the bb end otherwise. I assume that critical edge
splitting is not a good decision.

Best regards.
Yuri.

2014-12-02 16:28 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Mon, Dec 1, 2014 at 4:53 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Hi Richard,

 I resend you patch1 and patch2 with minor changes:
 1. I renamed flag_force_vectorize to aggressive_if_conv.
 2. Use static cast for the first argument of gimple_phi_arg_edge.
 I also very sorry that I sent you bad patch.

 Now let me answer on your questions related to second patch.
 1. Why we need both predicate_extended_scalar_phi and
 predicate_arbitrary_scalar_phi?

 Let's consider the following simple test-case:

   #pragma omp simd safelen(8)
   for (i=0; i512; i++)
   {
 float t = a[i];
 if (t  0.0f  t  1.0e+17f)
   if (c[i] != 0)  /* c is integer array. */
 res += 1;
   }

 we can see the following phi node correspondent to res:

 # res_1 = PHI res_15(3), res_15(4), res_10(5)

 It is clear that we can optimize it to phi node with 2 arguments only
 and only one check can be used for phi predication (for reduction in
 our case), namely predicate of bb_5. In general case we can't do it
 even if we sort all phi argument values since we still have to produce
 a chain of cond expressions to perform phi predication (see comments
 for predicate_arbitrary_scalar_phi).

 How so?  We can always use !(condition) for the last value, thus
 treat it as an 'else' case.  That even works for

 # res_1 = PHI res_15(3), res_15(4), res_10(5), res_10(7)

 where the condition for edges 5 and 7 can be computed as
 ! (condition for 3 || condition for 4).

 Of course it is worthwhile to also sort single-occurances first
 so your case gets just the condiiton for edge 5 and its inversion
 used for edges 3 and 4 combined.

 2. Why we need to introduce find_insertion_point?
  Let's consider another test-case extracted from 175.vpr ( t5.c is
 attached) and we can see that bb_7 and bb_9 containig phi nodes has
 only critical incoming edges and both contain code computing edge
 predicates, e.g.

 bb 7:
 # xmax_edge_30 = PHI xmax_edge_36(6), xmax_edge_18(5)
 _46 = xmax_17 == xmax_37;
 _47 = xmax_17 == xmax_27;
 _48 = _46  _47;
 _53 = xmax_17 == xmax_37;
 _54 = ~_53;
 _55 = xmax_17 == xmax_27;
 _56 = _54  _55;
 _57 = _48 | _56;
 xmax_edge_19 = xmax_edge_39 + 1;
 goto bb 11;

 It is evident that we can not put phi predication at the block
 beginning but need to put it after predicate computations.
 Note also that if there are no critical edges for phi arguments
 insertion point will be after labels Note also that phi result can
 have use in this block too, so we can't put predication code to the
 block end.

 So the issue is that predicate insertion for edge predicates does
 not happen on the edge but somewhere else (generally impossible
 for critical edges unless you split them).

 I think I've told you before that I prefer simple solutions to such issues,
 like splitting the edge!  Certainly not involving a function walking
 GENERIC expressions.

 Thanks,
 Richard.

 Let me know if you still have any questions.

 Best regards.
 Yuri.




 2014-11-28 15:43 GMT+03:00 Richard Biener richard.guent...@gmail.com:
 On Wed, Nov 12, 2014 at 2:35 PM, Yuri Rumyantsev ysrum...@gmail.com wrote:
 Hi All,

 Here is the second patch related to extended predication.
 Few comments which explain a main goal of design.

 1. I don't want to insert any critical edge splitting since it may
 lead to less efficient binaries.
 2. One special case of extended PHI node predication was introduced
 when #arguments is more than 2 but only two arguments are different
 and one argument has the only occurrence. For such PHI conditional
 scalar reduction is applied.
 This is correspondent to the following statement:
 if (q1  q2  q3) var++
  New function phi_has_two_different_args was introduced to detect such phi.
 3. Original algorithm for PHI predication used assumption that at
 least one incoming edge for blocks containing PHI is not critical - it
 guarantees that all computations related to predicate of normal edge
 are already inserted above this block and
 code related to PHI predication can be inserted at the beginning of
 block. But this is not true for critical edges for which predicate
 computations are  in the block where code for phi predication must be
 inserted. So new function find_insertion_point is introduced which is
 simply found out the last statement in block defining predicates
 correspondent to all incoming edges and insert phi predication code
 after it (with some minor exceptions).

 Unfortunately the patch doesn't apply for me - I get

 patch:  malformed patch at line 

Re: [C++ PATCH] Detect UB in shifts in constexpr functions

2014-12-02 Thread Marek Polacek
On Tue, Dec 02, 2014 at 09:59:26AM -0500, Jason Merrill wrote:
 Thanks.  These errors also need to be conditional on (!ctx-quiet), and if
 we see one of these conditions we need to set *non_constant_p.

Ah, sorry.  Fixed in the following.

Regtested on ppc64-linux, bootstrap in progress.

2014-12-02  Marek Polacek  pola...@redhat.com

* constexpr.c (cxx_eval_check_shift_p): New function.
(cxx_eval_binary_expression): Call it.  Set NON_CONSTANT_P if it
returns true.

* g++.dg/cpp0x/constexpr-shift1.C: New test.
* g++.dg/cpp1y/constexpr-shift1.C: New test.
* g++.dg/ubsan/pr63956.C: Add dg-errors.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 2184a2f..48bc8f1 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -1470,6 +1470,79 @@ verify_constant (tree t, bool allow_non_constant, bool 
*non_constant_p,
   return *non_constant_p;
 }
 
+/* Check whether the shift operation with code CODE and type TYPE on LHS
+   and RHS is undefined.  If it is, give an error with an explanation,
+   and return true; return false otherwise.  */
+
+static bool
+cxx_eval_check_shift_p (location_t loc, const constexpr_ctx *ctx,
+   enum tree_code code, tree type, tree lhs, tree rhs)
+{
+  if ((code != LSHIFT_EXPR  code != RSHIFT_EXPR)
+  || TREE_CODE (lhs) != INTEGER_CST
+  || TREE_CODE (rhs) != INTEGER_CST)
+return false;
+
+  tree lhstype = TREE_TYPE (lhs);
+  unsigned HOST_WIDE_INT uprec = TYPE_PRECISION (TREE_TYPE (lhs));
+
+  /* [expr.shift] The behavior is undefined if the right operand
+ is negative, or greater than or equal to the length in bits
+ of the promoted left operand.  */
+  if (tree_int_cst_sgn (rhs) == -1)
+{
+  if (!ctx-quiet)
+   error_at (loc, right operand of shift expression %q+E is negative,
+ build2_loc (loc, code, type, lhs, rhs));
+  return true;
+}
+  if (compare_tree_int (rhs, uprec) = 0)
+{
+  if (!ctx-quiet)
+   error_at (loc, right operand of shift expression %q+E is = than 
+ the precision of the left operand,
+ build2_loc (loc, code, type, lhs, rhs));
+  return true;
+}
+
+  /* The value of E1  E2 is E1 left-shifted E2 bit positions; [...]
+ if E1 has a signed type and non-negative value, and E1x2^E2 is
+ representable in the corresponding unsigned type of the result type,
+ then that value, converted to the result type, is the resulting value;
+ otherwise, the behavior is undefined.  */
+  if (code == LSHIFT_EXPR  !TYPE_UNSIGNED (lhstype)
+   (cxx_dialect = cxx11))
+{
+  if (tree_int_cst_sgn (lhs) == -1)
+   {
+ if (!ctx-quiet)
+   error_at (loc, left operand of shift expression %q+E is negative,
+ build2_loc (loc, code, type, lhs, rhs));
+ return true;
+   }
+  /* For signed x  y the following:
+(unsigned) x  ((prec (lhs) - 1) - y)
+if  1, is undefined.  The right-hand side of this formula
+is the highest bit of the LHS that can be set (starting from 0),
+so that the shift doesn't overflow.  We then right-shift the LHS
+to see whether any other bit is set making the original shift
+undefined -- the result is not representable in the corresponding
+unsigned type.  */
+  tree t = build_int_cst (unsigned_type_node, uprec - 1);
+  t = fold_build2 (MINUS_EXPR, unsigned_type_node, t, rhs);
+  tree ulhs = fold_convert (unsigned_type_for (lhstype), lhs);
+  t = fold_build2 (RSHIFT_EXPR, TREE_TYPE (ulhs), ulhs, t);
+  if (tree_int_cst_lt (integer_one_node, t))
+   {
+ if (!ctx-quiet)
+   error_at (loc, shift expression %q+E overflows,
+ build2_loc (loc, code, type, lhs, rhs));
+ return true;
+   }
+}
+  return false;
+}
+
 /* Subroutine of cxx_eval_constant_expression.
Attempt to reduce the unary expression tree T to a compile time value.
If successful, return the value.  Otherwise issue a diagnostic
@@ -1532,6 +1605,8 @@ cxx_eval_binary_expression (const constexpr_ctx *ctx, 
tree t,
   else
r = build2_loc (loc, code, type, lhs, rhs);
 }
+  else if (cxx_eval_check_shift_p (loc, ctx, code, type, lhs, rhs))
+*non_constant_p = true;
   VERIFY_CONSTANT (r);
   return r;
 }
diff --git gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C 
gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C
index e69de29..1f4ee73 100644
--- gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C
+++ gcc/testsuite/g++.dg/cpp0x/constexpr-shift1.C
@@ -0,0 +1,73 @@
+// { dg-do compile { target c++11 } }
+
+constexpr int
+fn1 (int i, int j)
+{
+  return i  j; // { dg-error is negative }
+}
+
+constexpr int i1 = fn1 (1, -1);
+
+constexpr int
+fn2 (int i, int j)
+{
+  return i  j; // { dg-error is = than the precision of the left operand }
+}
+
+constexpr int i2 = fn2 (1, 200);
+
+constexpr int
+fn3 (int i, int j)
+{
+  return i  

[patch, 4.9] backport parallel header changes to the 4.9 branch

2014-12-02 Thread Matthias Klose
Backport these header changes to the 4.9 branch. No regression, but reported to 
the Debian BTS.  Safe to backport according to Jonathan Wakely. No regressions 
in the testsuite on x86_64-linux-gnu. Ok for the branch?


libstdc++-v3/

2014-12-02  Matthias Klose  d...@ubuntu.com

PR libstdc++/64103
Backport from mainline
2014-11-03  Paolo Carlini  paolo.carl...@oracle.com

* include/parallel/algo.h: Do not use default arguments in function
template redeclarations (definitions).
* include/parallel/numeric.h: Likewise.

libstdc++-v3/

2014-12-02  Matthias Klose  d...@ubuntu.com

	PR libstdc++/64103
	Backport from mainline
	2014-11-03  Paolo Carlini  paolo.carl...@oracle.com

	* include/parallel/algo.h: Do not use default arguments in function
	template redeclarations (definitions).
	* include/parallel/numeric.h: Likewise.

 
Index: libstdc++-v3/include/parallel/algo.h
===
--- libstdc++-v3/include/parallel/algo.h	(revision 218277)
+++ libstdc++-v3/include/parallel/algo.h	(working copy)
@@ -81,9 +81,8 @@
   templatetypename _RAIter, typename _Function
 _Function
 __for_each_switch(_RAIter __begin, _RAIter __end, 
-_Function __f, random_access_iterator_tag, 
-__gnu_parallel::_Parallelism __parallelism_tag
-= __gnu_parallel::parallel_balanced)
+_Function __f, random_access_iterator_tag,
+__gnu_parallel::_Parallelism __parallelism_tag)
 {
   if (_GLIBCXX_PARALLEL_CONDITION(
 static_cast__gnu_parallel::_SequenceIndex(__end - __begin)
@@ -896,8 +895,7 @@
 typename iterator_traits_RAIter::difference_type
 __count_switch(_RAIter __begin, _RAIter __end, 
  const _Tp __value, random_access_iterator_tag, 
- __gnu_parallel::_Parallelism __parallelism_tag 
- = __gnu_parallel::parallel_unbalanced)
+ __gnu_parallel::_Parallelism __parallelism_tag)
 {
   typedef iterator_traits_RAIter _TraitsType;
   typedef typename _TraitsType::value_type _ValueType;
@@ -966,8 +964,7 @@
 typename iterator_traits_RAIter::difference_type
 __count_if_switch(_RAIter __begin, _RAIter __end, 
 _Predicate __pred, random_access_iterator_tag,
-__gnu_parallel::_Parallelism __parallelism_tag
-= __gnu_parallel::parallel_unbalanced)
+__gnu_parallel::_Parallelism __parallelism_tag)
 {
   typedef iterator_traits_RAIter _TraitsType;
   typedef typename _TraitsType::value_type _ValueType;
@@ -1225,8 +1222,7 @@
 __transform1_switch(_RAIter1 __begin, _RAIter1 __end,
   _RAIter2 __result, _UnaryOperation __unary_op,
   random_access_iterator_tag, random_access_iterator_tag,
-  __gnu_parallel::_Parallelism __parallelism_tag
-  = __gnu_parallel::parallel_balanced)
+  __gnu_parallel::_Parallelism __parallelism_tag)
 {
   if (_GLIBCXX_PARALLEL_CONDITION(
 static_cast__gnu_parallel::_SequenceIndex(__end - __begin)
@@ -1315,8 +1311,7 @@
   _RAIter3 __result, _BinaryOperation __binary_op,
   random_access_iterator_tag, random_access_iterator_tag,
   random_access_iterator_tag,
-  __gnu_parallel::_Parallelism __parallelism_tag 
-  = __gnu_parallel::parallel_balanced)
+  __gnu_parallel::_Parallelism __parallelism_tag)
 {
   if (_GLIBCXX_PARALLEL_CONDITION(
 (__end1 - __begin1) =
@@ -1422,8 +1417,7 @@
 __replace_switch(_RAIter __begin, _RAIter __end, 
const _Tp __old_value, const _Tp __new_value, 
random_access_iterator_tag, 
-   __gnu_parallel::_Parallelism __parallelism_tag
-   = __gnu_parallel::parallel_balanced)
+   __gnu_parallel::_Parallelism __parallelism_tag)
 {
   // XXX parallel version is where?
   replace(__begin, __end, __old_value, __new_value, 
@@ -1478,8 +1472,7 @@
 __replace_if_switch(_RAIter __begin, _RAIter __end,
   _Predicate __pred, const _Tp __new_value,
   random_access_iterator_tag,
-  __gnu_parallel::_Parallelism __parallelism_tag
-  = __gnu_parallel::parallel_balanced)
+  __gnu_parallel::_Parallelism __parallelism_tag)
 {
   if (_GLIBCXX_PARALLEL_CONDITION(
 static_cast__gnu_parallel::_SequenceIndex(__end - __begin)
@@ -1544,8 +1537,7 @@
 void
 __generate_switch(_RAIter __begin, _RAIter __end,
 _Generator __gen, random_access_iterator_tag, 
-__gnu_parallel::_Parallelism __parallelism_tag
- 

Re: [patch, 4.9] backport parallel header changes to the 4.9 branch

2014-12-02 Thread Jonathan Wakely

On 02/12/14 16:24 +, Jonathan Wakely wrote:

On 02/12/14 17:20 +0100, Matthias Klose wrote:
Backport these header changes to the 4.9 branch. No regression, but 
reported to the Debian BTS.  Safe to backport according to Jonathan 
Wakely. No regressions in the testsuite on x86_64-linux-gnu. Ok for 
the branch?


You might as well include the additional fixes in r217069 which fix
more instances of the same problem.


Which you already did! Sorry.

OK for the 4.9 branch, thanks for doing this.


[PATCH][wwwdocs] Add note about removal of -mwords-little-endian option to changes.html for GCC 5

2014-12-02 Thread Kyrill Tkachov

Hi all,

Back in July this deprecated option was removed.
This patch adds a note to changes.html for that.

Looked at the result in firefox, looked ok to me.

Ok to commit?

Thanks,
Kyrill
Index: htdocs/gcc-5/changes.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-5/changes.html,v
retrieving revision 1.44
diff -U 3 -r1.44 changes.html
--- htdocs/gcc-5/changes.html	28 Nov 2014 12:20:01 -	1.44
+++ htdocs/gcc-5/changes.html	2 Dec 2014 15:36:03 -
@@ -406,6 +406,9 @@
code-mcpu=cortex-a17.cortex-a7 /code and
code-mtune=cortex-a17.cortex-a7/code options.
   /li
+  li The deprecated option code-mwords-little-endian/code
+   has been removed.
+  /li
 
  /ul
 

Re: [patch, 4.9] backport parallel header changes to the 4.9 branch

2014-12-02 Thread Jonathan Wakely

On 02/12/14 17:20 +0100, Matthias Klose wrote:
Backport these header changes to the 4.9 branch. No regression, but 
reported to the Debian BTS.  Safe to backport according to Jonathan 
Wakely. No regressions in the testsuite on x86_64-linux-gnu. Ok for 
the branch?


You might as well include the additional fixes in r217069 which fix
more instances of the same problem.


Re: [PATCH 08/08] PR/64003 workaround (uninit memory in i386.md)

2014-12-02 Thread David Malcolm
On Wed, 2014-11-26 at 10:41 -0700, Jeff Law wrote:
 On 11/25/14 18:39, David Malcolm wrote:
  I suspect this is papering over a real problem, but I've been
  applying this workaround locally to keep my valgrind output clean.
 
  gcc/ChangeLog:
  PR/64003
  * final.c (shorten_branches): Allocate insn_lengths with
  XCNEWVEC rather than XNEWVEC; remove subsequent per-element
  initialization.
 I'd like more information on this one.

I've spent some time trying to track this down, and I've added detailed
notes to the bug.

In short, I believe the problem occurs with a *jcc_1 insn that jumps
forwards, but the full details are in the bug.

 My first thought is that something must be creating a new insn after 
 shorten_branches is complete or an existing insn that was not on the 
 chain when we called shorten-branches, but got threaded onto the chain 
 later.  Either would be considered bad in various ways, so we'd like to 
 fix it.

I don't think either of these are the case.  I believe it's due to the
size of the jcc_1 insn being affected by the distance to the jump
target, which for a forward jump is a bit of a chicken-and-egg issue,
since that distance is affected by the size of the jcc_1 insn itself.  

It looks like align_fuzz exists in order to cope with this kind of
circular definition, but the issue seems to occur inside align_fuzz
itself.

 Presumably you're not doing an out-of-range access into insn_lengths? 
 Right?
 
 Do you have a reasonable testcase for this?

I've added a more minimal reproducer to the bug (8 lines); see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64003#c7

I see it with latest trunk (r218254) on x86_64.

See my notes in the bug.

Dave



[PING] [PATCH][AARCH64]Use selected cpu's tuning when no tuning parameter is specified.

2014-12-02 Thread Renlin Li

On 27/11/14 11:27, Renlin Li wrote:

Hi all,

We have the following code in aarch64_override_options() function.

  /* The selected cpu may be an architecture, so lookup tuning by core 
ID.  */

  if (!selected_tune)
selected_tune = all_cores[selected_cpu-core];

However, the logic here is not right any more according to our current 
code. selected_cpu will never be an architecture at this point.


This patch will correct the behaviour and use selected_cpu's tuning 
directly if selected_tune is still not decided at this point.



We have the following consequence after the change.
Previously we have the following tuning settings with the following 
command line options:
-march=armv8-a -- selected_cpu = generic, aarch64_tune_params = 
cortexa53_tunings
Nothing specified and selected_cpu == generic -- aarch64_tune_params 
= cortexa53_tunings


After the change, we got something different:
-march=armv8-a -- selected_cpu = generic, aarch64_tune_params = 
generic_tunings
Nothing specified and selected_cpu == generic -- aarch64_tune_params 
= generic_tunings


All other configuration(-march, -mcpu, -mtune) combinations should be 
unaffected.



aarch64-none-elf has been built and tested on the model, no issue.

Is it Okay for trunk?


gcc/ChangeLog:

2014-11-27  Renlin Li  renlin...@arm.com

* config/aarch64/aarch64.c (aarch64_parse_cpu): Don't define 
selected_tune.

(aarch64_override_options): Use selected_cpu's tuning.




PING~

Regards,
Renlin



Re: [C++ PATCH] Detect UB in shifts in constexpr functions

2014-12-02 Thread Jason Merrill

OK, thanks.

Jason


[PATCH, testsuite, i386]: Add some missing defines to AVX512 testcases

2014-12-02 Thread Uros Bizjak
Hello!

2014-12-02  Uros Bizjak  ubiz...@gmail.com

* gcc.target/i386/avx512ifma-vpmaddhuq-2.c: Define AVX512IFMA.
* gcc.target/i386/avx512ifma-vpmaddluq-2.c: Ditto.
* gcc.target/i386/avx512vbmi-vpermb-2.c: Define AVX512VBMI.
* gcc.target/i386/avx512vbmi-vpermi2b-2.c: Ditto.
* gcc.target/i386/avx512vbmi-vpermt2b-2.c: Ditto.
* gcc.target/i386/avx512vbmi-vpmultishiftqb-2.c: Ditto.

Tested on x86_64-linux-gnu and committed to mainline SVN.

Uros.
Index: gcc.target/i386/avx512ifma-vpmaddhuq-2.c
===
--- gcc.target/i386/avx512ifma-vpmaddhuq-2.c(revision 218276)
+++ gcc.target/i386/avx512ifma-vpmaddhuq-2.c(working copy)
@@ -2,6 +2,8 @@
 /* { dg-options -O2 -mavx512ifma } */
 /* { dg-require-effective-target avx512ifma } */
 
+#define AVX512IFMA
+
 #include avx512f-helper.h
 
 #define SIZE (AVX512F_LEN / 64)
Index: gcc.target/i386/avx512ifma-vpmaddluq-2.c
===
--- gcc.target/i386/avx512ifma-vpmaddluq-2.c(revision 218276)
+++ gcc.target/i386/avx512ifma-vpmaddluq-2.c(working copy)
@@ -2,6 +2,8 @@
 /* { dg-options -O2 -mavx512ifma } */
 /* { dg-require-effective-target avx512ifma } */
 
+#define AVX512IFMA
+
 #include avx512f-helper.h
 
 #define SIZE (AVX512F_LEN / 64)
Index: gcc.target/i386/avx512vbmi-vpermb-2.c
===
--- gcc.target/i386/avx512vbmi-vpermb-2.c   (revision 218276)
+++ gcc.target/i386/avx512vbmi-vpermb-2.c   (working copy)
@@ -2,6 +2,8 @@
 /* { dg-options -O2 -mavx512vbmi } */
 /* { dg-require-effective-target avx512vbmi } */
 
+#define AVX512VBMI
+
 #include avx512f-helper.h
 
 #define SIZE (AVX512F_LEN / 8)
Index: gcc.target/i386/avx512vbmi-vpermi2b-2.c
===
--- gcc.target/i386/avx512vbmi-vpermi2b-2.c (revision 218276)
+++ gcc.target/i386/avx512vbmi-vpermi2b-2.c (working copy)
@@ -2,6 +2,8 @@
 /* { dg-options -O2 -mavx512vbmi } */
 /* { dg-require-effective-target avx512vbmi } */
 
+#define AVX512VBMI
+
 #include avx512f-helper.h
 
 #define SIZE (AVX512F_LEN / 8)
Index: gcc.target/i386/avx512vbmi-vpermt2b-2.c
===
--- gcc.target/i386/avx512vbmi-vpermt2b-2.c (revision 218276)
+++ gcc.target/i386/avx512vbmi-vpermt2b-2.c (working copy)
@@ -2,6 +2,8 @@
 /* { dg-options -O2 -mavx512vbmi } */
 /* { dg-require-effective-target avx512vbmi } */
 
+#define AVX512VBMI
+
 #include avx512f-helper.h
 
 #define SIZE (AVX512F_LEN / 8)
Index: gcc.target/i386/avx512vbmi-vpmultishiftqb-2.c
===
--- gcc.target/i386/avx512vbmi-vpmultishiftqb-2.c   (revision 218276)
+++ gcc.target/i386/avx512vbmi-vpmultishiftqb-2.c   (working copy)
@@ -2,6 +2,8 @@
 /* { dg-options -O2 -mavx512vbmi } */
 /* { dg-require-effective-target avx512vbmi } */
 
+#define AVX512VBMI
+
 #include avx512f-helper.h
 
 #define SIZE (AVX512F_LEN / 8)


[Patch]: Check __gthread_setspecific return

2014-12-02 Thread Ryan Mansfield

Hi,

Underlying pthread_setspecific can return non-zero with ENOMEM or EINVAL.

2014-12-02  Ryan Mansfield  rmansfi...@qnx.com

* emutls.c (__emutls_get_address): Check __gthread_setspecific 
returns.


OK?

Regards,

Ryan Mansfield

Index: libgcc/emutls.c
===
--- libgcc/emutls.c	(revision 218278)
+++ libgcc/emutls.c	(working copy)
@@ -160,7 +160,8 @@
   if (arr == NULL)
 	abort ();
   arr-size = size;
-  __gthread_setspecific (emutls_key, (void *) arr);
+  if (__gthread_setspecific (emutls_key, (void *) arr) != 0)
+	abort ();
 }
   else if (__builtin_expect (offset  arr-size, 0))
 {
@@ -174,7 +175,8 @@
   arr-size = size;
   memset (arr-data + orig_size, 0,
 	  (size - orig_size) * sizeof (void *));
-  __gthread_setspecific (emutls_key, (void *) arr);
+  if (__gthread_setspecific (emutls_key, (void *) arr) != 0)
+	abort ();
 }
 
   void *ret = arr-data[offset - 1];


[C++ Patch] PR 60978

2014-12-02 Thread Paolo Carlini

Hi,

another simple issue: this one argues that the warning is overeager when 
anonymous enums are involved, which often are just used as named 
constants. Details: maybe write the conditional in a different way; 
maybe even use  instead of ||. Tested x86_64-linux.


Thanks,
Paolo.

//
/cp
2014-12-02  Paolo Carlini  paolo.carl...@oracle.com

PR c++/60978
* call.c (build_conditional_expr_1): Do not -Wenum-compare warn
for anonymous enums.

/testsuite
2014-12-02  Paolo Carlini  paolo.carl...@oracle.com

PR c++/60978
* g++.dg/warn/Wenum-compare-2.C: New.
Index: cp/call.c
===
--- cp/call.c   (revision 218278)
+++ cp/call.c   (working copy)
@@ -5014,6 +5014,10 @@ build_conditional_expr_1 (location_t loc, tree arg
   DECL_CONTEXT (orig_arg2) == DECL_CONTEXT (orig_arg3))
/* Two enumerators from the same enumeration can have different
   types when the enumeration is still being defined.  */;
+ else if (TYPE_ANONYMOUS_P (arg2_type)
+  || TYPE_ANONYMOUS_P (arg3_type))
+   /* Avoid warning for anonymous enums, probably just used as
+  named constants.  */;
   else if (complain  tf_warning)
 warning_at (loc, OPT_Wenum_compare, enumeral mismatch in 
conditional expression: %qT vs %qT,
Index: testsuite/g++.dg/warn/Wenum-compare-2.C
===
--- testsuite/g++.dg/warn/Wenum-compare-2.C (revision 0)
+++ testsuite/g++.dg/warn/Wenum-compare-2.C (working copy)
@@ -0,0 +1,10 @@
+// PR c++/60978
+// { dg-options -Wenum-compare }
+
+enum { A };
+enum { B };
+
+int foo(int x)
+{
+  return x ? A : B;
+}


asan: support for globals in kernel

2014-12-02 Thread Dmitry Vyukov
Hi,

The following patch adds support for instrumentation of globals for
Linux kernel (-fsanitize=kernel-address). Kernel only supports
constructors with default priority, but the rest works fine.

OK for trunk?


https://codereview.appspot.com/176570043

Index: gcc/ChangeLog
===
--- gcc/ChangeLog (revision 218280)
+++ gcc/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2014-12-02  Dmitry Vyukov  dvyu...@google.com
+
+ * asan.c: (asan_finish_file): Use default priority for constructors
+ in kernel mode.
+
 2014-12-02  Ulrich Weigand  ulrich.weig...@de.ibm.com

  PR target/64115
Index: gcc/asan.c
===
--- gcc/asan.c (revision 218280)
+++ gcc/asan.c (working copy)
@@ -1348,7 +1348,9 @@
  the var that is selected by the linker will have
  padding or not.  */
   || DECL_ONE_ONLY (decl)
-  /* Similarly for common vars.  People can use -fno-common.  */
+  /* Similarly for common vars.  People can use -fno-common.
+ Note: Linux kernel is built with -fno-common, so we do instrument
+ globals there even if it is C.  */
   || (DECL_COMMON (decl)  TREE_PUBLIC (decl))
   /* Don't protect if using user section, often vars placed
  into user section from multiple TUs are then assumed
@@ -2440,6 +2442,7 @@
 {
   varpool_node *vnode;
   unsigned HOST_WIDE_INT gcount = 0;
+  int priority;

   if (shadow_ptr_types[0] == NULL_TREE)
 asan_init_shadow_ptr_types ();
@@ -2448,6 +2451,13 @@
  nor after .LASAN* array.  */
   flag_sanitize = ~SANITIZE_ADDRESS;

+  /* For user-space we want asan constructors to run first.
+ Linux kernel does not support priorities other than default, and the only
+ other user of constructors is coverage. So we run with the default
+ priority.  */
+  priority = flag_sanitize  SANITIZE_USER_ADDRESS ?
+  MAX_RESERVED_INIT_PRIORITY - 1 : DEFAULT_INIT_PRIORITY;
+
   if (flag_sanitize  SANITIZE_USER_ADDRESS)
 {
   tree fn = builtin_decl_implicit (BUILT_IN_ASAN_INIT);
@@ -2503,12 +2513,10 @@
  build_fold_addr_expr (var),
  gcount_tree),
  dtor_statements);
-  cgraph_build_static_cdtor ('D', dtor_statements,
- MAX_RESERVED_INIT_PRIORITY - 1);
+  cgraph_build_static_cdtor ('D', dtor_statements, priority);
 }
   if (asan_ctor_statements)
-cgraph_build_static_cdtor ('I', asan_ctor_statements,
-   MAX_RESERVED_INIT_PRIORITY - 1);
+cgraph_build_static_cdtor ('I', asan_ctor_statements, priority);
   flag_sanitize |= SANITIZE_ADDRESS;
 }


Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread Sriraman Tallam
Ping.

On Mon, Nov 10, 2014 at 3:22 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Mon, Oct 6, 2014 at 1:43 PM, Sriraman Tallam tmsri...@google.com wrote:
 Ping.

 On Mon, Sep 29, 2014 at 10:57 AM, Sriraman Tallam tmsri...@google.com 
 wrote:
 Ping.

 On Fri, Sep 19, 2014 at 2:11 PM, Sriraman Tallam tmsri...@google.com 
 wrote:
 Hi Richard,

 I also ran the gcc testsuite with
 RUNTESTFLAGS=--tool_opts=-mcopyrelocs to check for issues.  The only
 test that failed was g++.dg/tsan/default_options.C.  It uses -fpie
 -pie and BFD ld to link. Since BFD ld does not support copy
 relocations with -pie, it does not link. I linked with gold to make
 the test pass.

 Could you please take another look at this patch?

 Thanks
 Sri

 On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam tmsri...@google.com 
 wrote:
 On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson r...@redhat.com wrote:
 On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
 Index: config/i386/i386.c
 ===
 --- config/i386/i386.c(revision 211826)
 +++ config/i386/i386.c(working copy)
 @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
   return true;
   }
 else if (!SYMBOL_REF_FAR_ADDR_P (op0)
 - SYMBOL_REF_LOCAL_P (op0)
 + (SYMBOL_REF_LOCAL_P (op0)
 +|| (TARGET_64BIT  ix86_copyrelocs  flag_pie
 + !SYMBOL_REF_FUNCTION_P (op0)))
   ix86_cmodel != CM_LARGE_PIC)
   return true;
 break;

 This is the wrong place to patch.

 You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
 TARGET_BINDS_LOCAL_P.

 I have done this in the new attached patch, I added a new function
 i386_binds_local_p which will check for this and call
 default_binds_local_p otherwise.


 Note in particular that I believe that you are doing the wrong thing 
 with weak
 and COMMON symbols, in that you probably ought not force a copy reloc 
 there.

 I added an extra check to not do this for WEAK symbols. I also added a
 check for DECL_EXTERNAL so I believe this will also not be called for
 COMMON symbols.


 Note the complexity of default_binds_local_p_1, and the fact that all you
 really want to modify is

   /* If PIC, then assume that any global name can be overridden by
  symbols resolved from other modules.  */
   else if (shlib)
 local_p = false;

 near the bottom of that function.

 I did not understand what you mean here? Were you suggesting an
 alternative way of doing this?

 Thanks for reviewing
 Sri



 r~


Re: asan: support for globals in kernel

2014-12-02 Thread Jakub Jelinek
On Tue, Dec 02, 2014 at 09:56:36PM +0400, Dmitry Vyukov wrote:
 --- gcc/ChangeLog (revision 218280)
 +++ gcc/ChangeLog (working copy)
 @@ -1,3 +1,8 @@
 +2014-12-02  Dmitry Vyukov  dvyu...@google.com
 +
 + * asan.c: (asan_finish_file): Use default priority for constructors
 + in kernel mode.

Seems your MUA is eating tabs, I'll assume they are where I'd expect them to
be.

 @@ -2440,6 +2442,7 @@
  {
varpool_node *vnode;
unsigned HOST_WIDE_INT gcount = 0;
 +  int priority;
 
if (shadow_ptr_types[0] == NULL_TREE)
  asan_init_shadow_ptr_types ();

No need to declare this separately, 

 @@ -2448,6 +2451,13 @@
   nor after .LASAN* array.  */
flag_sanitize = ~SANITIZE_ADDRESS;
 
 +  /* For user-space we want asan constructors to run first.
 + Linux kernel does not support priorities other than default, and the 
 only
 + other user of constructors is coverage. So we run with the default
 + priority.  */
 +  priority = flag_sanitize  SANITIZE_USER_ADDRESS ?
 +  MAX_RESERVED_INIT_PRIORITY - 1 : DEFAULT_INIT_PRIORITY;

formatting plus declare it here too.  This should be
  int priority = flag_sanitize  SANITIZE_USER_ADDRESS
 ? MAX_RESERVED_INIT_PRIORITY - 1 : DEFAULT_INIT_PRIORITY;
(? should be on the second line and aligned below flag_sanitize).

Ok for trunk with that change.

Shall we backport it to 4.9 branch too?

Jakub


Re: asan: support for globals in kernel

2014-12-02 Thread Dmitry Vyukov
On Tue, Dec 2, 2014 at 9:04 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Dec 02, 2014 at 09:56:36PM +0400, Dmitry Vyukov wrote:
 --- gcc/ChangeLog (revision 218280)
 +++ gcc/ChangeLog (working copy)
 @@ -1,3 +1,8 @@
 +2014-12-02  Dmitry Vyukov  dvyu...@google.com
 +
 + * asan.c: (asan_finish_file): Use default priority for constructors
 + in kernel mode.

 Seems your MUA is eating tabs, I'll assume they are where I'd expect them to
 be.

 @@ -2440,6 +2442,7 @@
  {
varpool_node *vnode;
unsigned HOST_WIDE_INT gcount = 0;
 +  int priority;

if (shadow_ptr_types[0] == NULL_TREE)
  asan_init_shadow_ptr_types ();

 No need to declare this separately,

 @@ -2448,6 +2451,13 @@
   nor after .LASAN* array.  */
flag_sanitize = ~SANITIZE_ADDRESS;

 +  /* For user-space we want asan constructors to run first.
 + Linux kernel does not support priorities other than default, and the 
 only
 + other user of constructors is coverage. So we run with the default
 + priority.  */
 +  priority = flag_sanitize  SANITIZE_USER_ADDRESS ?
 +  MAX_RESERVED_INIT_PRIORITY - 1 : DEFAULT_INIT_PRIORITY;

 formatting plus declare it here too.  This should be
   int priority = flag_sanitize  SANITIZE_USER_ADDRESS
  ? MAX_RESERVED_INIT_PRIORITY - 1 : DEFAULT_INIT_PRIORITY;
 (? should be on the second line and aligned below flag_sanitize).

 Ok for trunk with that change.

 Shall we backport it to 4.9 branch too?

If it's doable, it would be nice. Thanks.
When do we expect next 4.9 release?

Yes, my mail client eats tabs, here is proper version:
https://codereview.appspot.com/176570043/diff/20001/gcc/ChangeLog
Sorry.


And here is updated text version with eaten tabs:


Index: gcc/ChangeLog
===
--- gcc/ChangeLog (revision 218280)
+++ gcc/ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2014-12-02  Dmitry Vyukov  dvyu...@google.com
+
+ * asan.c: (asan_finish_file): Use default priority for constructors
+ in kernel mode.
+
 2014-12-02  Ulrich Weigand  ulrich.weig...@de.ibm.com

  PR target/64115
Index: gcc/asan.c
===
--- gcc/asan.c (revision 218280)
+++ gcc/asan.c (working copy)
@@ -1348,7 +1348,9 @@
  the var that is selected by the linker will have
  padding or not.  */
   || DECL_ONE_ONLY (decl)
-  /* Similarly for common vars.  People can use -fno-common.  */
+  /* Similarly for common vars.  People can use -fno-common.
+ Note: Linux kernel is built with -fno-common, so we do instrument
+ globals there even if it is C.  */
   || (DECL_COMMON (decl)  TREE_PUBLIC (decl))
   /* Don't protect if using user section, often vars placed
  into user section from multiple TUs are then assumed
@@ -2448,6 +2450,13 @@
  nor after .LASAN* array.  */
   flag_sanitize = ~SANITIZE_ADDRESS;

+  /* For user-space we want asan constructors to run first.
+ Linux kernel does not support priorities other than default, and the only
+ other user of constructors is coverage. So we run with the default
+ priority.  */
+  int priority = flag_sanitize  SANITIZE_USER_ADDRESS
+ ? MAX_RESERVED_INIT_PRIORITY - 1 : DEFAULT_INIT_PRIORITY;
+
   if (flag_sanitize  SANITIZE_USER_ADDRESS)
 {
   tree fn = builtin_decl_implicit (BUILT_IN_ASAN_INIT);
@@ -2503,12 +2512,10 @@
  build_fold_addr_expr (var),
  gcount_tree),
  dtor_statements);
-  cgraph_build_static_cdtor ('D', dtor_statements,
- MAX_RESERVED_INIT_PRIORITY - 1);
+  cgraph_build_static_cdtor ('D', dtor_statements, priority);
 }
   if (asan_ctor_statements)
-cgraph_build_static_cdtor ('I', asan_ctor_statements,
-   MAX_RESERVED_INIT_PRIORITY - 1);
+cgraph_build_static_cdtor ('I', asan_ctor_statements, priority);
   flag_sanitize |= SANITIZE_ADDRESS;
 }


[PING][PATCH V2] plugin event for C/C++ function definitions

2014-12-02 Thread Andres Tiraboschi
Hi, this patch adds a new plugin event PLUGIN_START_PARSE_FUNCTION and 
PLUGIN_FINISH_PARSE_FUNCTION that are invoked at start_function and 
finish_function respectively in the C and C++ frontends. 
PLUGIN_START_PARSE_FUNCTION is called before parsing the function body.
PLUGIN_FINISH_PARSE_FUNCTION is called after parsing a function definition.

Since I have no write privileges please commit this for me if ok.

changelog:

 gcc/c/c-decl.c: Invoke callbacks in start_function and finish_function.

 gcc/cp/decl.c: Invoke callbacks in start_function and finish_function.

 gcc/doc/plugins.texi: Add documentation about PLUGIN_START_FUNCTION and 
PLUGIN_FINISH_FUNCTION
 gcc/plugin.def: Add events for start_function and finish_function. 
 
 gcc/plugin.c (register_callback, invoke_plugin_callbacks): Same.

 gcc/testsuite/g++.dg/plugin/def_plugin.c: New test plugin. 
 
 gcc/testsuite/g++.dg/plugin/def-plugin-test.C: Testcase for above plugin.
 gcc/testsuite/g++.dg/plugin/plugin.exp 

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 6413e6f..d9d922c 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -4449,6 +4449,7 @@ start_decl (struct c_declarator *declarator, struct 
c_declspecs *declspecs,
   decl = grokdeclarator (declarator, declspecs,
 NORMAL, initialized, NULL, attributes, expr, NULL,
 deprecated_state);
+  invoke_plugin_callbacks (PLUGIN_START_PARSE_FUNCTION, decl);
   if (!decl)
 return 0;
 
@@ -9031,6 +9032,7 @@ finish_function (void)
  It's still in DECL_STRUCT_FUNCTION, and we'll restore it in
  tree_rest_of_compilation.  */
   set_cfun (NULL);
+  invoke_plugin_callbacks (PLUGIN_FINISH_PARSE_FUNCTION, 
current_function_decl);
   current_function_decl = NULL;
 }
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 716ab5f..6adf2e4 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13671,6 +13671,7 @@ start_function (cp_decl_specifier_seq *declspecs,
   tree decl1;
 
   decl1 = grokdeclarator (declarator, declspecs, FUNCDEF, 1, attrs);
+  invoke_plugin_callbacks (PLUGIN_START_PARSE_FUNCTION, decl1);
   if (decl1 == error_mark_node)
 return false;
   /* If the declarator is not suitable for a function definition,
@@ -14301,6 +14302,7 @@ finish_function (int flags)
   vec_free (deferred_mark_used_calls);
 }
 
+  invoke_plugin_callbacks (PLUGIN_FINISH_PARSE_FUNCTION, fndecl);
   return fndecl;
 }
 
diff --git a/gcc/doc/plugins.texi b/gcc/doc/plugins.texi
index 4a839b8..1c9e074 100644
--- a/gcc/doc/plugins.texi
+++ b/gcc/doc/plugins.texi
@@ -174,6 +174,8 @@ Callbacks can be invoked at the following pre-determined 
events:
 @smallexample
 enum plugin_event
 @{
+  PLUGIN_START_PARSE_FUNCTION,  /* Called before parsing the body of a 
function. */
+  PLUGIN_FINISH_PARSE_FUNCTION, /* After finishing parsing a function. */
   PLUGIN_PASS_MANAGER_SETUP,/* To hook into pass manager.  */
   PLUGIN_FINISH_TYPE,   /* After finishing parsing a type.  */
   PLUGIN_FINISH_DECL,   /* After finishing parsing a declaration. */
diff --git a/gcc/plugin.c b/gcc/plugin.c
index 8debc09..f7a8b64 100644
--- a/gcc/plugin.c
+++ b/gcc/plugin.c
@@ -433,6 +433,8 @@ register_callback (const char *plugin_name,
return;
  }
   /* Fall through.  */
+  case PLUGIN_START_PARSE_FUNCTION:
+  case PLUGIN_FINISH_PARSE_FUNCTION:
   case PLUGIN_FINISH_TYPE:
   case PLUGIN_FINISH_DECL:
   case PLUGIN_START_UNIT:
@@ -511,6 +513,8 @@ invoke_plugin_callbacks_full (int event, void *gcc_data)
gcc_assert (event = PLUGIN_EVENT_FIRST_DYNAMIC);
gcc_assert (event  event_last);
   /* Fall through.  */
+  case PLUGIN_START_PARSE_FUNCTION:
+  case PLUGIN_FINISH_PARSE_FUNCTION:
   case PLUGIN_FINISH_TYPE:
   case PLUGIN_FINISH_DECL:
   case PLUGIN_START_UNIT:
diff --git a/gcc/plugin.def b/gcc/plugin.def
index df5d383..4b7f6ef 100644
--- a/gcc/plugin.def
+++ b/gcc/plugin.def
@@ -17,6 +17,11 @@ You should have received a copy of the GNU General Public 
License
 along with GCC; see the file COPYING3.  If not see
 http://www.gnu.org/licenses/.  */
 
+/* Called before parsing the body of a function.  */
+DEFEVENT (PLUGIN_START_PARSE_FUNCTION)
+
+/* After finishing parsing a function. */
+DEFEVENT (PLUGIN_FINISH_PARSE_FUNCTION)
 
 /* To hook into pass manager.  */
 DEFEVENT (PLUGIN_PASS_MANAGER_SETUP)
diff --git a/gcc/testsuite/g++.dg/plugin/def-plugin-test.C 
b/gcc/testsuite/g++.dg/plugin/def-plugin-test.C
new file mode 100644
index 000..b7f2d3d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/plugin/def-plugin-test.C
@@ -0,0 +1,13 @@
+int global = 12;
+
+int function1(void);
+
+int function2(int a) // { dg-warning Start fndef function2 }
+{
+  return function1() + a;
+} //  { dg-warning Finish fndef function2 }
+
+int function1(void) // { dg-warning Start fndef function1 }
+{
+  

Re: asan: support for globals in kernel

2014-12-02 Thread Jakub Jelinek
On Tue, Dec 02, 2014 at 10:12:30PM +0400, Dmitry Vyukov wrote:
  Shall we backport it to 4.9 branch too?
 
 If it's doable, it would be nice. Thanks.

Bet the same patch will just apply there.

 When do we expect next 4.9 release?

Probably in March/April timeframe or so, 4.9.2 has been released
just a month ago.

 And here is updated text version with eaten tabs:

Ok.

Jakub


Re: [testsuite] Fix multiple definitions of _init

2014-12-02 Thread Mike Stump
On Nov 30, 2014, at 11:21 PM, Oleg Endo oleg.e...@t-online.de wrote:
 When running the testsuite on a sh-elf configuration, some test cases
 fail due to multiple definitions of the function '_init'.  This is
 because on sh-elf every function is automatically prefixed with a '_'
 char.  When defining a C function 'init' in some code, the actual name
 becomes '_init' and this conflicts with the _init function in libgcc
 (crti.S).  While the behavior of adding the '_' prefix is debatable, the
 testsuite failures can be easily fixed by the attached patch.
 Is this OK for trunk?

No.  It is reasonable for the test suite to fail when the implementation of gcc 
is wrong (unclean) or newlib startup code is wrong (unclean).  Since that is 
what happened, the fix is to fix the cleanliness problem.

I’ve read through the thread…  I think the sh newlib port has a trivial bug and 
should be fixed.  Either, add an underscore or remove one, just like _all_ the 
other ports.  There is no complexity here, just a typo (or someone copied the 
wrong port).

When newlib/libc/sys/sh/crt0.S is fixed, and libgcc/config/sh/crt1.S is fixed, 
then these test cases will pass.

USER_LABEL_PREFIX  is not the issue.  Let me quote from the code:

.long   _atexit
.long   _init

This is wrong.

Also, _fini needs to be fixed as well.  It is unfortunate that the fix involves 
two packages, but, such is life.

Re: asan: support for globals in kernel

2014-12-02 Thread Dmitry Vyukov
On Tue, Dec 2, 2014 at 9:17 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Dec 02, 2014 at 10:12:30PM +0400, Dmitry Vyukov wrote:
  Shall we backport it to 4.9 branch too?

 If it's doable, it would be nice. Thanks.

 Bet the same patch will just apply there.

Do I need to do anything additional for this? Or you will backport it?


Committed in revision 218281.


 When do we expect next 4.9 release?

 Probably in March/April timeframe or so, 4.9.2 has been released
 just a month ago.

 And here is updated text version with eaten tabs:

 Ok.

 Jakub


Re: [PATCH] Fix PR15346

2014-12-02 Thread Joseph Myers
On Tue, 2 Dec 2014, Richard Biener wrote:

  I'm not sure if these forms of division actually occur in places where 
  this could cause a problem, but it does look like Ada may enable you to 
  generate ROUND_DIV_EXPR.
 
 Hmm.  I thought I was following what extract_muldiv_1 does (but of course
 that function is quite hard to follow).

I don't claim existing optimizations are correct for all these forms of 
division.

In fact even wide-int doesn't handle ROUND_DIV_EXPR correctly (round to 
nearest, ties away from 0) - at a glance, without testing, ROUND_MOD_EXPR 
has the same bug (wi::ges_p (wi::abs (remainder), wi::lrshift (wi::abs 
(y), 1)) is not the right condition for rounding away).  The following Ada 
testcase test_round_div.adb will generate a ROUND_DIV_EXPR which is 
executed correctly at -O0, but fails at -O3.  (The separate division 
function seems necessary to stop the front end from doing its own constant 
folding, though there may for all I know be more idiomatic Ada ways of 
doing that.)

procedure Test_Round_Div is
   type Fixed is delta 1.0 range -2147483648.0 .. 2147483647.0;
   A : Fixed := 1.0;
   B : Fixed := 3.0;
   C : Integer;
   function Divide (X, Y : Fixed) return Integer is
   begin
  return Integer (X / Y);
   end;
begin
   C := Divide (A, B);
   if C /= 0 then
  raise Program_Error;
   end if;
end Test_Round_Div;

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread H.J. Lu
On Mon, Sep 8, 2014 at 3:19 PM, Sriraman Tallam tmsri...@google.com wrote:
 On Tue, Sep 2, 2014 at 1:40 PM, Richard Henderson r...@redhat.com wrote:
 On 06/20/2014 05:17 PM, Sriraman Tallam wrote:
 Index: config/i386/i386.c
 ===
 --- config/i386/i386.c(revision 211826)
 +++ config/i386/i386.c(working copy)
 @@ -12691,7 +12691,9 @@ legitimate_pic_address_disp_p (rtx disp)
   return true;
   }
 else if (!SYMBOL_REF_FAR_ADDR_P (op0)
 - SYMBOL_REF_LOCAL_P (op0)
 + (SYMBOL_REF_LOCAL_P (op0)
 +|| (TARGET_64BIT  ix86_copyrelocs  flag_pie
 + !SYMBOL_REF_FUNCTION_P (op0)))
   ix86_cmodel != CM_LARGE_PIC)
   return true;
 break;

 This is the wrong place to patch.

 You ought to be adjusting SYMBOL_REF_LOCAL_P, by providing a modified
 TARGET_BINDS_LOCAL_P.

 I have done this in the new attached patch, I added a new function
 i386_binds_local_p which will check for this and call
 default_binds_local_p otherwise.


 Note in particular that I believe that you are doing the wrong thing with 
 weak
 and COMMON symbols, in that you probably ought not force a copy reloc there.

 I added an extra check to not do this for WEAK symbols. I also added a
 check for DECL_EXTERNAL so I believe this will also not be called for
 COMMON symbols.


 Note the complexity of default_binds_local_p_1, and the fact that all you
 really want to modify is

   /* If PIC, then assume that any global name can be overridden by
  symbols resolved from other modules.  */
   else if (shlib)
 local_p = false;

 near the bottom of that function.

 I did not understand what you mean here? Were you suggesting an
 alternative way of doing this?

 Thanks for reviewing

I'd like to see a few testcases:

1. One test to show it does the right thing for external variable.
2. One test to show it does the right thing for common symbol.
3. One test to show it does the right thing for weak symbol.
4. One test to show it does the right thing for external function.

Thanks.

-- 
H.J.


Re: [PATCH] Fix PR15346

2014-12-02 Thread Joseph Myers
On Tue, 2 Dec 2014, Joseph Myers wrote:

 (y), 1)) is not the right condition for rounding away).  The following Ada 
 testcase test_round_div.adb will generate a ROUND_DIV_EXPR which is 

I should add that I'm not sure if Ada requires correct rounding for 
fixed-point division converted to integer like this, or, if not, whether 
GNU Ada nevertheless guarantees it; it's simply a case that does generate 
such a ROUND_DIV_EXPR and so can be used to test how ROUND_DIV_EXPR 
behaves (in the absence of built-in functions).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread Uros Bizjak
Hello!

 Ping.
 Ping.
 Ping.
 Ping.

It would probably help reviewers if you pointed to actual path
submission [1], which unfortunately contains the explanation in the
patch itself [2], which further explains that this functionality is
currently only supported with gold, patched with [3].

[1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
[2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
[3] https://sourceware.org/ml/binutils/2014-05/msg00092.html

After a bit of the above detective work, I think that new gcc option
is not necessary. The configure should detect if new functionality is
supported in the linker, and auto-configure gcc to use it when
appropriate.

I have also added a couple of linker experts in the CC.

Uros.


Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread H.J. Lu
On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Ping.
 Ping.
 Ping.
 Ping.

 It would probably help reviewers if you pointed to actual path
 submission [1], which unfortunately contains the explanation in the
 patch itself [2], which further explains that this functionality is
 currently only supported with gold, patched with [3].

 [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
 [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
 [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html

 After a bit of the above detective work, I think that new gcc option
 is not necessary. The configure should detect if new functionality is
 supported in the linker, and auto-configure gcc to use it when
 appropriate.

 I have also added a couple of linker experts in the CC.

I don't think i386_binds_local_p is correct.  What does it
return for hidden external variable?  I think it should be

bool local = default_binds_local_p (exp);
if (!local)
   local = ...
return local;


H.J.


Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread H.J. Lu
On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Ping.
 Ping.
 Ping.
 Ping.

 It would probably help reviewers if you pointed to actual path
 submission [1], which unfortunately contains the explanation in the
 patch itself [2], which further explains that this functionality is
 currently only supported with gold, patched with [3].

 [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
 [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
 [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html

 After a bit of the above detective work, I think that new gcc option
 is not necessary. The configure should detect if new functionality is
 supported in the linker, and auto-configure gcc to use it when
 appropriate.

I think GCC option is needed since one can use -fuse-ld= to
change linker.


-- 
H.J.


Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread Uros Bizjak
On Tue, Dec 2, 2014 at 8:40 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Ping.
 Ping.
 Ping.
 Ping.

 It would probably help reviewers if you pointed to actual path
 submission [1], which unfortunately contains the explanation in the
 patch itself [2], which further explains that this functionality is
 currently only supported with gold, patched with [3].

 [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
 [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
 [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html

 After a bit of the above detective work, I think that new gcc option
 is not necessary. The configure should detect if new functionality is
 supported in the linker, and auto-configure gcc to use it when
 appropriate.

 I think GCC option is needed since one can use -fuse-ld= to
 change linker.

IMO, nobody will use this highly special x86_64-only option. It would
be best for gnu-ld to reach feature parity with gold as far as this
functionality is concerned. In this case, the optimization would be
auto-configured, and would fire automatically, without any user
intervention.

Uros.


Re: [PATCH] Make IPA-CP propagate alignment information of pointers

2014-12-02 Thread Martin Jambor
Hi,

thanks for the comments.

On Mon, Dec 01, 2014 at 11:40:59PM +0100, Jan Hubicka wrote:
   2014-11-19  Martin Jambor  mjam...@suse.cz
   
 * ipa-prop.h (ipa_alignment): New type.
 (ipa_jump_func): New field alignment.
 (ipcp_transformation_summary) New type.
 (ipcp_grow_transformations_if_necessary): Declare.
 (ipa_node_agg_replacements): Removed.
 (ipcp_transformations): Declare.
 (ipcp_get_transformation_summary): New function.
 (ipa_get_agg_replacements_for_node): Use it.
 * ipa-cp.c (ipcp_param_lattices): New field alignment.
 (print_all_lattices): Also print alignment.
 (alignment_bottom_p): New function.
 (set_alignment_to_bottom): Likewise.
 (set_all_contains_variable): Also set alignment to bottom.
 (initialize_node_lattices): Likewise.
 (propagate_alignment_accross_jump_function): New function.
 (propagate_constants_accross_call): Call it.
 (ipcp_store_alignment_results): New function.
 (ipcp_driver): Call it.
 * ipa-prop.c (ipa_node_agg_replacements): Removed.
 (ipcp_transformations): New.
 (ipa_print_node_jump_functions_for_edge): Also print alignment.
 (ipa_set_jf_unknown): New function.
 (detect_type_change_from_memory_writes): Use ipa_set_jf_unknown.
 (ipa_compute_jump_functions_for_edge): Also calculate alignment.
 (update_jump_functions_after_inlining): Use ipa_set_jf_unknown.
 (ipcp_grow_transformations_if_necessary): New function.
 (ipa_set_node_agg_value_chain): Use ipcp_transformations.
 (ipa_node_removal_hook): Likewise.
 (ipa_node_duplication_hook): Also duplicate alignment results.
 (ipa_write_jump_function): Also stream alignments.
 (ipa_read_jump_function): Use ipa_set_jf_unknown, also stream
 alignments.
 (write_agg_replacement_chain): Renamed to
 write_ipcp_transformation_info, also stream alignments.
 (read_agg_replacement_chain): Renamed to
 read_ipcp_transformation_info, also stream alignments.
 (ipa_prop_write_all_agg_replacement): Renamed to
 ipcp_write_transformation_summaries. Stream always.
 (ipa_prop_read_all_agg_replacement): Renamed to
 ipcp_read_transformation_summaries.
 (ipcp_update_alignments): New function.
 (ipcp_transform_function): Call it, free also alignments.
 
 In longer term I think we should just propagate value range and known to be 
 zero/nonzero
 bits.  This is stronger and more universal than inventing propagation for 
 subproblems
 (like non-NULL would be useful).
 I think it would be also good excuse to do (simplified) early VRP to populate 
 the vlaue
 ranges locally.  This seems to make quite some difference on C++.

Well, it would essentially need some sort of early VRP.  So yes,
hopefully in the next major version we can do that.

 
 But for 5.0 it seems like resonable thing to do.
 
 Since ipa-prop itself is built as propaation engine for multiple types of 
 values
 (scalars, aggregates, contexts and now value ranges).

I believe that if we start propagating value ranges, this would make
propagating scalars redundant?  Also it would need some rethinking of
the ipa-cp propagation heuristics.

 I wonder if the code can not be
 organized by type of propagated value and perhaps split into multiple file to 
 make
 it more readable.

Splitting off of some part to extra files might help but first it is
necessary to figure out how to reorganize the contents of the file in
general.  Jump functions should probably be re-thought a little bit,
especially once we add information like reference mapping from my
aggregate propagation patches.

It would also be beneficial to do some analysis together with
generating the inline summaries and tweak ipa-analysis summary
conditions so that they can tell which value ranges would lead to the
same results.  And so on and so forth.

 
 I think the way we split polymorphic-call-context lattice operations into 
 separate
 file quite works but more of code separation would be nice.
 
 Well, a bit like df.c infrastructure is done.
   +  else if (jfunc-type == IPA_JF_PASS_THROUGH
   +|| jfunc-type == IPA_JF_ANCESTOR)
   +{
   +  struct ipa_node_params *caller_info = IPA_NODE_REF (cs-caller);
   +  struct ipcp_param_lattices *src_lats;
   +  HOST_WIDE_INT offset = 0;
   +  int src_idx;
   +
 
 We probably chould avoid new places that rely on tree_fits_shwi_p and use 
 wide_int
 instead.

On IRC we agreed that this making change here alone is of very limited
benefit, unlike using wide_int more in IPA data structures overall,
something which should be done in 6.0.

   +  if (POINTER_TYPE_P (TREE_TYPE(arg)))
   + {
   +   unsigned HOST_WIDE_INT hwi_bitpos;
   +   unsigned align;
   +
   +   if (get_pointer_alignment_1 (arg, align, hwi_bitpos)
   +align  BITS_PER_UNIT)
 
 OK, so here we get the nonzero/zero bits propagation from CCP used, right?
 
   +  ipcp_transformation_summary *ts = ipcp_get_transformation_summary 
   

Re: [PATCH 5/6] combine: handle REG_UNUSED in reg_dead_at_p (PR59278)

2014-12-02 Thread Jeff Law

On 12/01/14 16:30, Oleg Endo wrote:

On Mon, 2014-12-01 at 10:38 -0700, Jeff Law wrote:

On 11/27/14 18:44, Segher Boessenkool wrote:

Currently reg_dead_at_p returns 0 for registers that are set but never
used.  This patch repairs that oversight.  This fixes PR59278.


2014-11-27  Segher Boessenkool  seg...@kernel.crashing.org

gcc/
PR rtl-optimization/59278
combine (reg_dead_at_p): Consider REG_UNUSED notes.

OK with a testcase.  ISTM you can just create an SH specific testcase
from the code in the BZ.  Pre-approved with suitable testcase.

jeff



Attached is the SH test case for the PR, which passes with the combine
patch applied for make -k check-gcc RUNTESTFLAGS=sh.exp=pr59278.c
--target_board=sh-sim\{-m4/-ml,-m4/-mb,-m2a/-mb,-m2/-mb}

Cheers,
Oleg

testsuite/ChangeLog:
* gcc.target/sh/pr59278.c: New.

OK, please commit after Segher or together as a unified patch+testcase.

jeff


Re: [Patch] Improving jump-thread pass for PR 54742

2014-12-02 Thread Jeff Law

On 12/02/14 03:15, Richard Biener wrote:


I'm a bit worried about compile-time impacts of the all the recursion, but
I'm willing to wait and see if it turns out to be a problem in practice.


Please consider restricting it to -fexpensive-optimizations (-O2+).

Yea, let's go ahead and do that.

jeff



[PATCH] Update docs to reflect use of gimple subclasses

2014-12-02 Thread David Malcolm
On Tue, 2014-12-02 at 08:49 +0100, Jakub Jelinek wrote:
Hi!
 
 Here is an attempt to adjust gimple_build_assign* documentation.
 Ok for trunk?
 
 Note, apparently the documentation has not been adjusted for
 the gimple - gassign * etc. changes, David, can you please work on
 adjusting gimple.texi to match the reality after your changes?

I went through cfg.texi and gimple.texi updating all of the
documentation to reflect the changed signatures.

gimple.texi has numerous overlong lines; I wordwrapped any that I
touched.

make html succeeds; I visually inspected and sanity-checked the
built HTML.

This will have merger conflicts with Jakub's patch, but it should be
trivial to fix them up.

OK for trunk?

gcc/ChangeLog:
* doc/cfg.texi (GIMPLE statement iterators): Add note about
gphi_iterator, and use one in the example.
* doc/gimple.texi (Tuple specific accessors): Add missing
GIMPLE_GOTO section and menu item.
(gimple_build_asm, gimple gimple_build_assign_with_ops)
gimple_call_mark_uninlinable, gimple_call_cannot_inline_p): Remove
description of removed functions.
(gimple_build_assign, gimple_build_bind, gimple_build_call,
gimple_build_call_from_tree, gimple_build_call_vec,
gimple_build_catch, gimple_build_cond,
gimple_build_cond_from_tree, gimple_build_debug_bind,
gimple_build_eh_filter, gimple_build_label, gimple_build_goto,
gimple_build_omp_atomic_load, gimple_build_omp_atomic_store,
gimple_build_omp_continue, gimple_build_omp_critical,
gimple_build_omp_for, gimple_build_omp_parallel,
gimple_build_omp_sections, gimple_build_omp_single,
gimple_build_return, gimple_build_resx, gimple_build_switch,
gimple_build_try): Update return type within description to
reflect changes in gimple.h to using gimple subclasses.
(gimple_build_asm_vec): Update return type, params and
description.
(gimple_asm_ninputs): Update param.
(gimple_asm_noutputs, gimple_asm_nclobbers, gimple_asm_input_op
gimple_asm_set_input_op, gimple_asm_output_op
gimple_asm_set_output_op, gimple_asm_clobber_op,
gimple_asm_set_clobber_op, gimple_asm_string,
gimple_asm_volatile_p, gimple_asm_set_volatile, gimple_bind_vars,
gimple_bind_set_vars, gimple_bind_append_vars, gimple_bind_body,
gimple_bind_set_body, gimple_bind_add_stmt, gimple_bind_add_seq,
gimple_bind_block, gimple_bind_set_block, gimple_call_set_fn,
gimple_call_return_type, gimple_call_set_chain,
gimple_call_set_tail, gimple_call_tail_p,
gimple_call_copy_skip_args, gimple_catch_types,
gimple_catch_types_ptr, gimple_catch_handler,
gimple_catch_set_types, gimple_catch_set_handler,
gimple_cond_set_code, gimple_cond_set_lhs, gimple_cond_set_rhs,
gimple_cond_true_label, gimple_cond_set_true_label,
gimple_cond_set_false_label, gimple_cond_false_label,
gimple_cond_make_false, gimple_cond_make_true,
gimple_eh_filter_set_types, gimple_eh_filter_set_failure,
gimple_eh_must_not_throw_fndecl,
gimple_eh_must_not_throw_set_fndecl, gimple_label_label,
gimple_label_set_label, gimple_goto_set_dest,
gimple_omp_atomic_load_set_lhs, gimple_omp_atomic_load_lhs,
gimple_omp_atomic_load_set_rhs, gimple_omp_atomic_load_rhs,
gimple_omp_atomic_store_set_val, gimple_omp_atomic_store_val,
gimple_omp_continue_control_def,
gimple_omp_continue_control_def_ptr,
gimple_omp_continue_set_control_def,
gimple_omp_continue_control_use,
gimple_omp_continue_control_use_ptr,
gimple_omp_continue_set_control_use, gimple_omp_critical_name,
gimple_omp_critical_name_ptr, gimple_omp_critical_set_name,
gimple_omp_parallel_clauses_ptr, gimple_omp_parallel_set_clauses,
gimple_omp_parallel_child_fn, gimple_omp_parallel_child_fn_ptr,
gimple_omp_parallel_set_child_fn, gimple_omp_parallel_data_arg,
gimple_omp_parallel_data_arg_ptr,
gimple_omp_parallel_set_data_arg, gimple_omp_single_set_clauses,
gimple_phi_set_result, gimple_phi_set_arg, gimple_resx_region,
gimple_resx_set_region, gimple_return_retval,
gimple_return_set_retval, gimple_switch_num_labels,
gimple_switch_set_num_labels, gimple_switch_index,
gimple_switch_set_index, gimple_switch_label,
gimple_switch_set_label, gimple_switch_default_label,
gimple_switch_set_default_label, gimple_try_set_eval,
gimple_try_set_cleanup): Update initial param within description
to reflect changes in gimple.h to using gimple subclasses.
(Adding a new GIMPLE statement code): Update to reflect gimple
statement subclassing.
---
 gcc/doc/cfg.texi|   8 +-
 gcc/doc/gimple.texi | 354 +---
 2 files changed, 204 

Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread Jakub Jelinek
On Tue, Dec 02, 2014 at 12:16:09PM -0800, H.J. Lu wrote:
  IMO, nobody will use this highly special x86_64-only option. It would
  be best for gnu-ld to reach feature parity with gold as far as this
  functionality is concerned. In this case, the optimization would be
  auto-configured, and would fire automatically, without any user
  intervention.
 
 I will implement it in ld after its support is checked into GCC.

I think it would be better to do it the other way around, so that gcc can be
configured against the right ld from the start.

Jakub


Re: [PATCH 2/2, PR 63814] Do not re-create expanded artificial thunks

2014-12-02 Thread Martin Jambor
Hi,

On Mon, Dec 01, 2014 at 10:43:19PM +0100, Jan Hubicka wrote:
  On Fri, Nov 21, 2014 at 08:18:12PM +0100, Martin Jambor wrote:
   Hi,
   
   when debugging PR 63814 I noticed that when cgraph_node::create_clone
   was using redirect_edge_duplicating_thunks to redirect two edges to a
   thunk of a clone, two thunks were created, one for each edge.  The
   reason is that even though duplicate_thunk_for_node attempts to locate
   an already created thunk, it does so by looking for a caller with
   thunk.thunk_p set and the previously created one does not have it set
   because (on i686) expand_thunk has expanded the thunk to gimple and
   cleared the flag.
   
   This patch fixes the issue by marking such expanded thunks with yet
   another flag and then uses the flag to identify such expanded thunks.
   Bootstrapped and tested on x86_64-linux and i686-linux.  Honza, do you
   think this is a good approach?  Is the patch OK for trunk?
 
 Hmm, I was sort of hoping to drop the whole .thunk structure into summary and
 do not keep it for functions with Gimple body, so this change will imply the
 need to keep this datastructure for thunks that are already expanded.
 moreover we do not stream the flag, so not all expanded thunks are handled,
 only those that was introduced during WPA.
 
 It may be cleaner to simply donot expand thunks proactively before all
 the clonning is finished?
 

I did not really like it either (although I must say that in general I
really dislike the need to call expand_thunk but telling it to not
really expand anything if possible).  So here is another approach,
which works by delaying expansion of thunks but only after all
redirections within create_clone are done.  This has the advantage
that the new method expand_all_artificial_thunks is not exposed to
every (indirect) user of cgraph_node::create_clone (as opposed to
requiring them all to call that function at some point after making
any redirections at all).  However, it also has the disadvantage that
in the unlikeliest of circumstances (strongly connected components in
IPA-CP value dependencies connected by thunks and these thunks ceasing
to be assembly thunks in the course of cloning at the same time),
IPA-CP might still create extra gimple-expanded thunks.

The alternative not suffering from this problem would be either
exposing expand_all_artificial_thunks to all users and requiring them
to call it, with IPA-CP doing it for all created clones at the very
end of propagation, or teaching the infrastructure and pass manager
about this and then expanding them at some convenient time
(before/after inlining?).  However, the first option seems to be too
ugly for very little benefit (but I am willing to implement that) and
the second does not seem to be stage3 material.

What do you think?  Bootstrapped and tested on x86_64-linux.

Martin


2014-12-02  Martin Jambor  mjam...@suse.cz

* cgraph.h (cgraph_node): New method expand_all_artificial_thunks.
(cgraph_edge): New method redirect_callee_duplicating_thunks.
* cgraphclones.c (duplicate_thunk_for_node): Donot expand newly
created thunks.
(redirect_edge_duplicating_thunks): Turned into edge method
redirect_callee_duplicating_thunks.
(cgraph_node::expand_all_artificial_thunks): New method.
(create_clone): Call expand_all_artificial_thunks.
* ipa-cp.c (perhaps_add_new_callers): Call
redirect_callee_duplicating_thunks instead of redirect_callee.
Also call expand_all_artificial_thunks.

Index: src/gcc/cgraph.h
===
--- src.orig/gcc/cgraph.h
+++ src/gcc/cgraph.h
@@ -908,6 +908,10 @@ public:
  thunks that are not lowered.  */
   bool expand_thunk (bool output_asm_thunks, bool force_gimple_thunk);
 
+  /*  Call expand_thunk on all callers that are thunks and if analyze those
+  nodes that were expanded.  */
+  void expand_all_artificial_thunks ();
+
   /* Assemble thunks and aliases associated to node.  */
   void assemble_thunks_and_aliases (void);
 
@@ -1477,6 +1481,12 @@ struct GTY((chain_next (%h.next_caller
  call expression.  */
   void redirect_callee (cgraph_node *n);
 
+  /* If the edge does not lead to a thunk, simply redirect it to N.  Otherwise
+ create one or more equivalent thunks for N and redirect E to the first in
+ the chain.  Note that it is then necessary to call
+ n-expand_all_artificial_thunks once all callers are redirected.  */
+  void redirect_callee_duplicating_thunks (cgraph_node *n);
+
   /* Make an indirect edge with an unknown callee an ordinary edge leading to
  CALLEE.  DELTA is an integer constant that is to be added to the this
  pointer (first parameter) to compensate for skipping
Index: src/gcc/cgraphclones.c
===
--- src.orig/gcc/cgraphclones.c
+++ src/gcc/cgraphclones.c
@@ -370,28 +370,47 @@ 

[PATCH, PR 64153] Check type sizes before V_C_Eing in evaluate_conditions_for_known_args

2014-12-02 Thread Martin Jambor
Hi,

apparently it is necessary to check that type sizes match before
attempting to fold-V_C_E them in evaluate_conditions_for_known_args.
So this patch does this.

It passes bootstrap and testing on x86_64-linux and I have verified
with a cross compiler that the reported bug is fixed (the generated
assembly is the same as before the commit which introduced the
problem).

OK for trunk and (after a bootstrap and testing there) the 4.9 branch?

Thanks,

Martin


2014-12-02  Martin Jambor  mjam...@suse.cz

PR ipa/64153
* ipa-inline-analysis.c (evaluate_conditions_for_known_args): Check
type sizes before view_converting.

Index: src/gcc/ipa-inline-analysis.c
===
--- src.orig/gcc/ipa-inline-analysis.c
+++ src/gcc/ipa-inline-analysis.c
@@ -880,12 +880,19 @@ evaluate_conditions_for_known_args (stru
}
   if (c-code == IS_NOT_CONSTANT || c-code == CHANGED)
continue;
-  val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (c-val), val);
-  res = val
-   ? fold_binary_to_constant (c-code, boolean_type_node, val, c-val)
-   : NULL;
-  if (res  integer_zerop (res))
-   continue;
+
+  if (operand_equal_p (TYPE_SIZE (TREE_TYPE (c-val)),
+  TYPE_SIZE (TREE_TYPE (val)), 0))
+   {
+ val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (c-val), val);
+
+ res = val
+   ? fold_binary_to_constant (c-code, boolean_type_node, val, c-val)
+   : NULL;
+
+ if (res  integer_zerop (res))
+   continue;
+   }
   clause |= 1  (i + predicate_first_dynamic_condition);
 }
   return clause;


Re: [PATCH 08/08] PR/64003 workaround (uninit memory in i386.md)

2014-12-02 Thread Jeff Law

On 12/02/14 09:20, David Malcolm wrote:

In short, I believe the problem occurs with a *jcc_1 insn that jumps
forwards, but the full details are in the bug.


My first thought is that something must be creating a new insn after
shorten_branches is complete or an existing insn that was not on the
chain when we called shorten-branches, but got threaded onto the chain
later.  Either would be considered bad in various ways, so we'd like to
fix it.


I don't think either of these are the case.  I believe it's due to the
size of the jcc_1 insn being affected by the distance to the jump
target, which for a forward jump is a bit of a chicken-and-egg issue,
since that distance is affected by the size of the jcc_1 insn itself.

It looks like align_fuzz exists in order to cope with this kind of
circular definition, but the issue seems to occur inside align_fuzz
itself.
Sorry, I didn't look at the BZ, you had already put a fair amount of 
analysis in there.  My bad.


This feels so familiar.

Jeff


Re: attribute handler oddness in MEP and STORMY16 ports

2014-12-02 Thread DJ Delorie

My memories of why I did MeP the way I did are... vague.  I recall it
had to do with getting the attributes to apply to C++ objects
correctly, since C++ objects tend to be complicated and gcc didn't
always pass me what I expected.

 think they are suppose to.   They build, but I cant test them... can 
 some with that ability run them and see if they are correct?

MeP and stormy16 are both cgen/sid simulators.


Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread H.J. Lu
On Tue, Dec 2, 2014 at 12:01 PM, Uros Bizjak ubiz...@gmail.com wrote:
 On Tue, Dec 2, 2014 at 8:40 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Dec 2, 2014 at 11:19 AM, Uros Bizjak ubiz...@gmail.com wrote:
 Hello!

 Ping.
 Ping.
 Ping.
 Ping.

 It would probably help reviewers if you pointed to actual path
 submission [1], which unfortunately contains the explanation in the
 patch itself [2], which further explains that this functionality is
 currently only supported with gold, patched with [3].

 [1] https://gcc.gnu.org/ml/gcc-patches/2014-09/msg00645.html
 [2] https://gcc.gnu.org/ml/gcc-patches/2014-09/txt2CHtu81P1O.txt
 [3] https://sourceware.org/ml/binutils/2014-05/msg00092.html

 After a bit of the above detective work, I think that new gcc option
 is not necessary. The configure should detect if new functionality is
 supported in the linker, and auto-configure gcc to use it when
 appropriate.

 I think GCC option is needed since one can use -fuse-ld= to
 change linker.

 IMO, nobody will use this highly special x86_64-only option. It would
 be best for gnu-ld to reach feature parity with gold as far as this
 functionality is concerned. In this case, the optimization would be
 auto-configured, and would fire automatically, without any user
 intervention.

I will implement it in ld after its support is checked into GCC.

-- 
H.J.


Re: [PATCH, PR 64153] Check type sizes before V_C_Eing in evaluate_conditions_for_known_args

2014-12-02 Thread Jan Hubicka
 Hi,
 
 apparently it is necessary to check that type sizes match before
 attempting to fold-V_C_E them in evaluate_conditions_for_known_args.
 So this patch does this.
 
 It passes bootstrap and testing on x86_64-linux and I have verified
 with a cross compiler that the reported bug is fixed (the generated
 assembly is the same as before the commit which introduced the
 problem).
 
 OK for trunk and (after a bootstrap and testing there) the 4.9 branch?
 
 Thanks,
 
 Martin
 
 
 2014-12-02  Martin Jambor  mjam...@suse.cz
 
   PR ipa/64153
   * ipa-inline-analysis.c (evaluate_conditions_for_known_args): Check
   type sizes before view_converting.

OK,
Honza
 
 Index: src/gcc/ipa-inline-analysis.c
 ===
 --- src.orig/gcc/ipa-inline-analysis.c
 +++ src/gcc/ipa-inline-analysis.c
 @@ -880,12 +880,19 @@ evaluate_conditions_for_known_args (stru
   }
if (c-code == IS_NOT_CONSTANT || c-code == CHANGED)
   continue;
 -  val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (c-val), val);
 -  res = val
 - ? fold_binary_to_constant (c-code, boolean_type_node, val, c-val)
 - : NULL;
 -  if (res  integer_zerop (res))
 - continue;
 +
 +  if (operand_equal_p (TYPE_SIZE (TREE_TYPE (c-val)),
 +TYPE_SIZE (TREE_TYPE (val)), 0))
 + {
 +   val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (c-val), val);
 +
 +   res = val
 + ? fold_binary_to_constant (c-code, boolean_type_node, val, c-val)
 + : NULL;
 +
 +   if (res  integer_zerop (res))
 + continue;
 + }
clause |= 1  (i + predicate_first_dynamic_condition);
  }
return clause;


[PATCH] fix PR testsuite/64145

2014-12-02 Thread Jack Howarth
The attached patch fixes the regression in the
gcc/testsuite/gcc.dg/graphite/isl-codegen-loop-dumping.c testcase
caused by the accidental removal of -fgraphite-identity from
dg-options at r217315. Okay for gcc trunk?
 Jack
2014-12-01  Jack Howarth  howa...@bromo.med.uc.edu

PR testsuite/64145
* gcc.dg/graphite/isl-codegen-loop-dumping.c: Restore 
-fgraphite-identity.

Index: gcc/testsuite/gcc.dg/graphite/isl-codegen-loop-dumping.c
===
--- gcc/testsuite/gcc.dg/graphite/isl-codegen-loop-dumping.c(revision 
218285)
+++ gcc/testsuite/gcc.dg/graphite/isl-codegen-loop-dumping.c(working copy)
@@ -1,4 +1,4 @@
-/* { dg-options -O2 -fdump-tree-graphite-all } */
+/* { dg-options -O2 -fgraphite-identity -fdump-tree-graphite-all } */
 
 int
 main (int n, int *a)


Fwd: [PATCH] fix PR testsuite/64145

2014-12-02 Thread Roman Gareev
Looks good to me. Thank you.


2014-12-03 2:09 GMT+05:00 Jack Howarth howarth.at@gmail.com:
 The attached patch fixes the regression in the
 gcc/testsuite/gcc.dg/graphite/isl-codegen-loop-dumping.c testcase
 caused by the accidental removal of -fgraphite-identity from
 dg-options at r217315. Okay for gcc trunk?
  Jack



--
Cheers, Roman Gareev.


Re: [PATCH 08/08] PR/64003 workaround (uninit memory in i386.md)

2014-12-02 Thread Jeff Law

On 12/02/14 09:20, David Malcolm wrote:


I've spent some time trying to track this down, and I've added detailed
notes to the bug.

In short, I believe the problem occurs with a *jcc_1 insn that jumps
forwards, but the full details are in the bug.


My first thought is that something must be creating a new insn after
shorten_branches is complete or an existing insn that was not on the
chain when we called shorten-branches, but got threaded onto the chain
later.  Either would be considered bad in various ways, so we'd like to
fix it.


I don't think either of these are the case.  I believe it's due to the
size of the jcc_1 insn being affected by the distance to the jump
target, which for a forward jump is a bit of a chicken-and-egg issue,
since that distance is affected by the size of the jcc_1 insn itself.
I thought we had code somewhere that made worst case assumptions about 
the lengths, to get those arrays initialized with reasonable value 
without looking forward.  But that may have also been in place prior to 
the addition of align_fuzz.




It looks like align_fuzz exists in order to cope with this kind of
circular definition, but the issue seems to occur inside align_fuzz
itself.
align_fuzz is meant to estimate padding due to alignments.  Joern and I 
went round and round on that code and I never managed to wrap my head 
around it or its correctness.  Jim W. might have stepped in at some 
point to arbitrate.  I may have to pull out my archives and review that 
history (IIRC is predates egcs).


Interestingly there was a great discussion around the code in 2009 which 
mirrored some of my initial concerns with this code.  Ultimately Bernd 
recommended removing it, but nobody reviewed/approved the patch.


I think you should ping Joern to chime in here about the proper course 
of action to take if we're going to keep this code in GCC.  You could 
just assign in the BZ to him.  FWIW, I can't convince myself 
initializing the value to zero is actually safe or not.



jeff




Re: [PATCH][wwwdocs] Add note about removal of -mwords-little-endian option to changes.html for GCC 5

2014-12-02 Thread Gerald Pfeifer

On Tuesday 2014-12-02 16:34, Kyrill Tkachov wrote:

Back in July this deprecated option was removed.
This patch adds a note to changes.html for that.

Looked at the result in firefox, looked ok to me.

Ok to commit?


Yep.

And I very comfortable for you to commit changes like this
without asking for approval.  Mind, I am happy to review 
(except for Dec 6h-17th where I expect I'll be off-offline),

I just don't want to enforce too much process on you.

Gerald


Re: [arm][patch] fix arm_neon_ok check on !arm_arch7

2014-12-02 Thread Ramana Radhakrishnan
On Tue, Dec 2, 2014 at 2:01 PM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:

 On 23/09/14 09:27, James Greenhalgh wrote:

 On Mon, Sep 15, 2014 at 11:56:03AM +0100, Andrew Stubbs wrote:

 On 15/09/14 10:46, Richard Earnshaw wrote:

 Hmm, I wonder if arm_override_options should reject neon + (arch  7).

 Is this more to your taste?

 Is this really such a good idea? It causes carnage throughout the
 testsuite if you have configured with support for Neon and the testcase
 is written with dg-options for a pre-armv7-a -march value.

 For example in:
testsuite/gcc.target/arm/di-longlong64-sync-withhelpers.c

 Which forces -march=armv5.

 Perhaps you just have to fix the effective-target-ok tests - but then
 we lose swathes of test coverage.


 This also causes subtle Linux kernel compile failures.
 Over there they use make rules where they check if the compiler supports
 -march=armv5te and if not use -march=armv4t.
 With this patch if the compiler is configured with something like
 --with-fpu=neon the test will fail with your error message,
 even though the compiler supports -march=armv5te.

I've spent some time this evening pondering over your patch. Firstly
it appears that the current behaviour is going to cause more breakage
than originally expected. If this is to go in we'd have a number of
users having to add -mfloat-abi=soft to the command line option to
ensure that -march=armv5te works just fine on the files where
march=armv5te in the first places.

I'm not sure that the original patch is enough.

The tools have always allowed us to drop down the arch to
march=armv5te along with using -mfpu=neon. We are now changing command
line behaviour, so an inform in terms of diagnostics to the user would
be useful as it states that we don't really have mfpu=neon generating
neon code any more because of this particular case. If we are to do
this then the original patch is probably not enough as it then doesn't
handle the case of TARGET_VFP3 / TARGET_VFP5 / TARGET_NEON_FP16 /
TARGET_FP16 / TARGET_FPU_ARMV8 etc. etc. etc.




regards
Ramana





 Kyrill




 Thanks,
 James

 Andrew

 P.S. arm_override_options was renamed in 2010.
 2014-09-15  Andrew Stubbs  a...@codesourcery.com

 * gcc/config/arm/arm.c (arm_option_override): Reject -mfpu=neon
 when architecture is older than ARMv7.

 Index: gcc/config/arm/arm.c
 ===
 --- gcc/config/arm/arm.c(revision 215228)
 +++ gcc/config/arm/arm.c(working copy)
 @@ -2845,6 +2845,9 @@
   arm_fpu_desc = all_fpus[arm_fpu_index];
   +  if (TARGET_NEON  !arm_arch7)
 +error (target CPU does not support NEON);
 +
 switch (arm_fpu_desc-model)
   {
   case ARM_FP_MODEL_VFP:





Re: [PATCH] DWARF add DW_AT_noreturn on noreturn function subprogram.

2014-12-02 Thread Michael Eager

On 12/01/14 14:02, Cary Coutant wrote:

[+cc Michael Eager]


Rather than having to lobby to keep it unchanged because we jumped the gun,
can we lobby to get the number assigned in the near future rather than in
the potentially far future?  That feels more cooperative to me :-)

Would that make Michael happier?


I'm pretty confident that Michael will say, don't rely on the value
until the final spec is published. (He's told me exactly that in the
past. I've been guilty of jumping the gun on a couple of DW_LANG codes
and a few DW_AT values.)

But we should let Michael answer for himself...


As Cary said, the values are not fixed until the spec is released.
As Jason said, the values are unlikely to change.

The standards process can be a bit messy at times, particularly
when we realize that something we added to a draft needs to be changed.
We try to avoid making unnecessary changes, naturally.  On the other
hand, we don't want something in an internal draft of the DWARF
specification to prevent us from making necessary changes.

We will avoid changing the value of the DW_AT_noreturn or any other
new attribute, if for no reason than it makes additional work for us.
We're also aware that changes may affect work on GCC, GDB, or other
tools, which we would like to avoid.

The most conservative approach is to wait for the DWARF Version
5 spec to be released.  While there is no guarantee that attribute
values will not change in the final spec, the risk if this happening
seems small.



Michael, for your reference, here's the start of this thread:

https://gcc.gnu.org/ml/gcc-patches/2014-11/msg03195.html

and its continuation into December:

https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00099.html

Michael, are you OK with sharing your target dates for publishing the spec?


We will likely have a public review draft available in April or May
with a 60 period for comments, with final release most likely in July.


--
Michael Eagerea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


Re: [testsuite] Fix multiple definitions of _init

2014-12-02 Thread Joseph Myers
On Tue, 2 Dec 2014, Richard Biener wrote:

 Joseph may have more experience with how targets should setup
 USER_LABEL_PREFIX to avoid this situation.

See e.g. config/arm/lib1funcs.S:

#define SYM(x) CONCAT1 (__USER_LABEL_PREFIX__, x)

(and the associated macro definition of CONCAT1 that uses, etc.).  If you 
have .S sources that may be used on targets both with and without a 
prefix, you should do something similar.  (The ELF gABI says External C 
symbols have the same names in C and object files' symbol tables., so ELF 
targets using leading underscores on C symbol names are going against the 
gABI.)

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [C++ Patch] PR 60978

2014-12-02 Thread Jason Merrill
See my comment on the PR; I think the testcase illustrates why we still 
want this warning for anonymous enums.


Jason


Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread H.J. Lu
On Tue, Dec 2, 2014 at 12:19 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Dec 02, 2014 at 12:16:09PM -0800, H.J. Lu wrote:
  IMO, nobody will use this highly special x86_64-only option. It would
  be best for gnu-ld to reach feature parity with gold as far as this
  functionality is concerned. In this case, the optimization would be
  auto-configured, and would fire automatically, without any user
  intervention.

 I will implement it in ld after its support is checked into GCC.

 I think it would be better to do it the other way around, so that gcc can be
 configured against the right ld from the start.


Consider it is done.  I will check it into binutils ths week.


-- 
H.J.


Re: [C++ Patch] PR 60978

2014-12-02 Thread Paolo Carlini

Hi,

On 12/02/2014 11:01 PM, Jason Merrill wrote:
See my comment on the PR; I think the testcase illustrates why we 
still want this warning for anonymous enums.
Ok... Thus, barring further discussion, I will simply close the Bug as 
invalid.


Thanks!
Paolo.


Re: [PATCH 0/2, AArch64, v3] APM X-Gene 1 cost-table and pipeline model

2014-12-02 Thread Ramana Radhakrishnan
On Fri, Nov 21, 2014 at 6:44 PM, Philipp Tomsich
philipp.toms...@theobroma-systems.com wrote:
 The following patch-series adds optimized support for the APM X-Gene 1
 by providing a cost-model and pipeline-model. The pipeline-model has a
 few long reservation-chains, but looking at the stats for the generated
 NDA shows that it's well below other AArch64 cores (e.g. Cortex-A53) in
 overall size.

 This includes all the requested enhancements and cleans up the naming of
 the various states and reservations in 'xgene1.md'.

 Even though it isn't wired into the 32bit ARM backend yet, we've decided
 to keep the machine-description in config/arm... after all, the X-Gene
 family is backwards compatible with ARMv7 and our benchmarking has shown
 good potential for performance improvements from improving the instruction
 selection and scheduling when using ARMv7 code (after all, X-Gene 1 is a
 4-way superscalar design).

Good, this being the case I would strongly suggest that -mcpu=xgene1
is wired up properly in the ARM backend by the end of stage3. It's
better to do this sooner rather than later in order to minimize issues
late in the release cycle. Afterwards its just bug fixing / tuning
specifically for xgene. As I've said in the past I would strongly
discourage an options break between the 2 backends, it will only come
back to haunt the backends. i.e. please wire up xgene1 in
config/arm/arm-cores.def

Please also submit a patch to changes.html for AArch64 and ARM
backends once this patch is reviewed / queued.

Ramana



 After having a few further discussions with my colleagues regarding the
 latencies and modelling of divides in the pipeline, we've readjusted the
 modelling of the divides another time... even though it doesn't make a
 difference in real-world benchmarks.

 Thanks to everyone who took the time to review and comment.


 Philipp Tomsich (2):
   Core definition for APM XGene-1 and associated cost-table.
   Pipeline model for APM XGene-1.

  gcc/ChangeLog|  14 +
  gcc/config/aarch64/aarch64-cores.def |   1 +
  gcc/config/aarch64/aarch64-tune.md   |   2 +-
  gcc/config/aarch64/aarch64.c |  62 
  gcc/config/aarch64/aarch64.md|   3 +-
  gcc/config/arm/aarch-cost-tables.h   | 101 +++
  gcc/config/arm/xgene1.md | 532 
 +++
  gcc/doc/invoke.texi  |   3 +-
  8 files changed, 715 insertions(+), 3 deletions(-)
  create mode 100644 gcc/config/arm/xgene1.md

 --
 1.9.1



Re: [PATCH 1/2] Core definition for APM XGene-1 and associated cost-table.

2014-12-02 Thread Ramana Radhakrishnan
On Fri, Nov 21, 2014 at 6:44 PM, Philipp Tomsich
philipp.toms...@theobroma-systems.com wrote:
 To keep this change separately buildable from the pipeline model,
 this patch directs the APM XGene-1 to use the generic scheduling
 model.
 ---
  gcc/ChangeLog|   8 +++
  gcc/config/aarch64/aarch64-cores.def |   1 +
  gcc/config/aarch64/aarch64-tune.md   |   2 +-
  gcc/config/aarch64/aarch64.c |  62 +
  gcc/config/arm/aarch-cost-tables.h   | 101 
 +++
  gcc/doc/invoke.texi  |   3 +-
  6 files changed, 175 insertions(+), 2 deletions(-)

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index 2fa58ca..c9ac0d9 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,3 +1,11 @@
 +2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com
 +
 +   * config/aarch64/aarch64-cores.def (xgene1): Update/add the
 +   xgene1 (APM XGene-1) core definition.
 +   * gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1
 +   * config/arm/aarch-cost-tables.h: Add cost tables for APM XGene-1
 +   * doc/invoke.texi: Document -mcpu=xgene1.

Changes to config/arm/ files are ok as long as you've built a
cross-compiler for arm-none-eabi and ensured that a build succeeds.

regards
Ramana

 +
  2014-11-18  Maciej W. Rozycki  ma...@codesourcery.com

 * config/mips/mips.md (compression): Add `micromips32' setting.
 diff --git a/gcc/config/aarch64/aarch64-cores.def 
 b/gcc/config/aarch64/aarch64-cores.def
 index 312941f..e553e50 100644
 --- a/gcc/config/aarch64/aarch64-cores.def
 +++ b/gcc/config/aarch64/aarch64-cores.def
 @@ -37,6 +37,7 @@
  AARCH64_CORE(cortex-a53,  cortexa53, cortexa53, 8,  AARCH64_FL_FPSIMD | 
 AARCH64_FL_CRC, cortexa53)
  AARCH64_CORE(cortex-a57,  cortexa15, cortexa15, 8,  AARCH64_FL_FPSIMD | 
 AARCH64_FL_CRC, cortexa57)
  AARCH64_CORE(thunderx,thunderx,  thunderx, 8,  AARCH64_FL_FPSIMD | 
 AARCH64_FL_CRC | AARCH64_FL_CRYPTO, thunderx)
 +AARCH64_CORE(xgene1,  xgene1,xgene1,8,  AARCH64_FL_FPSIMD, 
 xgene1)

  /* V8 big.LITTLE implementations.  */

 diff --git a/gcc/config/aarch64/aarch64-tune.md 
 b/gcc/config/aarch64/aarch64-tune.md
 index c717ea8..6409082 100644
 --- a/gcc/config/aarch64/aarch64-tune.md
 +++ b/gcc/config/aarch64/aarch64-tune.md
 @@ -1,5 +1,5 @@
  ;; -*- buffer-read-only: t -*-
  ;; Generated automatically by gentune.sh from aarch64-cores.def
  (define_attr tune
 -   cortexa53,cortexa15,thunderx,cortexa57cortexa53
 +   cortexa53,cortexa15,thunderx,xgene1,cortexa57cortexa53
 (const (symbol_ref ((enum attr_tune) aarch64_tune
 diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
 index 4fec21e..9b92527 100644
 --- a/gcc/config/aarch64/aarch64.c
 +++ b/gcc/config/aarch64/aarch64.c
 @@ -226,6 +226,27 @@ static const struct cpu_addrcost_table 
 cortexa57_addrcost_table =
  #if HAVE_DESIGNATED_INITIALIZERS  GCC_VERSION = 2007
  __extension__
  #endif
 +static const struct cpu_addrcost_table xgene1_addrcost_table =
 +{
 +#if HAVE_DESIGNATED_INITIALIZERS
 +  .addr_scale_costs =
 +#endif
 +{
 +  NAMED_PARAM (hi, 1),
 +  NAMED_PARAM (si, 0),
 +  NAMED_PARAM (di, 0),
 +  NAMED_PARAM (ti, 1),
 +},
 +  NAMED_PARAM (pre_modify, 1),
 +  NAMED_PARAM (post_modify, 0),
 +  NAMED_PARAM (register_offset, 0),
 +  NAMED_PARAM (register_extend, 1),
 +  NAMED_PARAM (imm_offset, 0),
 +};
 +
 +#if HAVE_DESIGNATED_INITIALIZERS  GCC_VERSION = 2007
 +__extension__
 +#endif
  static const struct cpu_regmove_cost generic_regmove_cost =
  {
NAMED_PARAM (GP2GP, 1),
 @@ -262,6 +283,17 @@ static const struct cpu_regmove_cost 
 thunderx_regmove_cost =
NAMED_PARAM (FP2FP, 4)
  };

 +static const struct cpu_regmove_cost xgene1_regmove_cost =
 +{
 +  NAMED_PARAM (GP2GP, 1),
 +  NAMED_PARAM (GP2FP, 8),
 +  NAMED_PARAM (FP2GP, 8),
 +  /* We currently do not provide direct support for TFmode Q-Q move.
 + Therefore we need to raise the cost above 2 in order to have
 + reload handle the situation.  */
 +  NAMED_PARAM (FP2FP, 4)
 +};
 +
  /* Generic costs for vector insn classes.  */
  #if HAVE_DESIGNATED_INITIALIZERS  GCC_VERSION = 2007
  __extension__
 @@ -302,6 +334,26 @@ static const struct cpu_vector_cost 
 cortexa57_vector_cost =
NAMED_PARAM (cond_not_taken_branch_cost, 1)
  };

 +/* Generic costs for vector insn classes.  */
 +#if HAVE_DESIGNATED_INITIALIZERS  GCC_VERSION = 2007
 +__extension__
 +#endif
 +static const struct cpu_vector_cost xgene1_vector_cost =
 +{
 +  NAMED_PARAM (scalar_stmt_cost, 1),
 +  NAMED_PARAM (scalar_load_cost, 5),
 +  NAMED_PARAM (scalar_store_cost, 1),
 +  NAMED_PARAM (vec_stmt_cost, 2),
 +  NAMED_PARAM (vec_to_scalar_cost, 4),
 +  NAMED_PARAM (scalar_to_vec_cost, 4),
 +  NAMED_PARAM (vec_align_load_cost, 10),
 +  NAMED_PARAM (vec_unalign_load_cost, 10),
 +  NAMED_PARAM (vec_unalign_store_cost, 2),
 +  NAMED_PARAM (vec_store_cost, 2),
 +  NAMED_PARAM (cond_taken_branch_cost, 2),

Re: [PATCH 2/2] Pipeline model for APM XGene-1.

2014-12-02 Thread Ramana Radhakrishnan
On Fri, Nov 21, 2014 at 6:44 PM, Philipp Tomsich
philipp.toms...@theobroma-systems.com wrote:
 ---
  gcc/ChangeLog |   6 +
  gcc/config/aarch64/aarch64.md |   3 +-
  gcc/config/arm/xgene1.md  | 532 
 ++
  3 files changed, 540 insertions(+), 1 deletion(-)
  create mode 100644 gcc/config/arm/xgene1.md

 diff --git a/gcc/ChangeLog b/gcc/ChangeLog
 index c9ac0d9..dad2278 100644
 --- a/gcc/ChangeLog
 +++ b/gcc/ChangeLog
 @@ -1,5 +1,11 @@
  2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com

 +   * config/aarch64/aarch64.md: Include xgene1.md.
 +   (generic_sched): Set to no for xgene1.
 +   * config/arm/xgene1.md: New file.
 +

config/arm/xgene1.md is fine as long as you've built a cross-compiler
(building cc1 / cc1plus is enough) to arm-none-eabi to ensure no build
breakage.

Please wait for Marcus or Richard to approve the AArch64 bits.

regards
Ramana

 +2014-11-19  Philipp Tomsich  philipp.toms...@theobroma-systems.com
 +
 * config/aarch64/aarch64-cores.def (xgene1): Update/add the
 xgene1 (APM XGene-1) core definition.
 * gcc/config/aarch64/aarch64.c: Add cost tables for APM XGene-1
 diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
 index 597ff8c..1b36384 100644
 --- a/gcc/config/aarch64/aarch64.md
 +++ b/gcc/config/aarch64/aarch64.md
 @@ -191,7 +191,7 @@

  (define_attr generic_sched yes,no
(const (if_then_else
 -  (eq_attr tune cortexa53,cortexa15,thunderx)
 +  (eq_attr tune cortexa53,cortexa15,thunderx,xgene1)
(const_string no)
(const_string yes

 @@ -199,6 +199,7 @@
  (include ../arm/cortex-a53.md)
  (include ../arm/cortex-a15.md)
  (include thunderx.md)
 +(include ../arm/xgene1.md)

  ;; ---
  ;; Jumps and other miscellaneous insns
 diff --git a/gcc/config/arm/xgene1.md b/gcc/config/arm/xgene1.md
 new file mode 100644
 index 000..563a959
 --- /dev/null
 +++ b/gcc/config/arm/xgene1.md
 @@ -0,0 +1,532 @@
 +;; Machine description for AppliedMicro xgene1 core.
 +;; Copyright (C) 2012-2014 Free Software Foundation, Inc.
 +;; Contributed by Theobroma Systems Design und Consulting GmbH.
 +;;See http://www.theobroma-systems.com for more info.
 +;;
 +;; This file is part of GCC.
 +;;
 +;; GCC is free software; you can redistribute it and/or modify it
 +;; under the terms of the GNU General Public License as published by
 +;; the Free Software Foundation; either version 3, or (at your option)
 +;; any later version.
 +;;
 +;; GCC is distributed in the hope that it will be useful, but
 +;; WITHOUT ANY WARRANTY; without even the implied warranty of
 +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
 +;; General Public License for more details.
 +;;
 +;; You should have received a copy of the GNU General Public License
 +;; along with GCC; see the file COPYING3.  If not see
 +;; http://www.gnu.org/licenses/.
 +
 +;; Pipeline description for the xgene1 micro-architecture
 +
 +(define_automaton xgene1)
 +
 +(define_cpu_unit xgene1_decode_out0 xgene1)
 +(define_cpu_unit xgene1_decode_out1 xgene1)
 +(define_cpu_unit xgene1_decode_out2 xgene1)
 +(define_cpu_unit xgene1_decode_out3 xgene1)
 +
 +(define_cpu_unit xgene1_divide xgene1)
 +(define_cpu_unit xgene1_fp_divide xgene1)
 +(define_cpu_unit xgene1_fsu xgene1)
 +(define_cpu_unit xgene1_fcmp xgene1)
 +
 +(define_reservation xgene1_decode1op
 +( xgene1_decode_out0 )
 +|( xgene1_decode_out1 )
 +|( xgene1_decode_out2 )
 +|( xgene1_decode_out3 )
 +)
 +(define_reservation xgene1_decode2op
 +( xgene1_decode_out0 + xgene1_decode_out1 )
 +|( xgene1_decode_out0 + xgene1_decode_out2 )
 +|( xgene1_decode_out0 + xgene1_decode_out3 )
 +|( xgene1_decode_out1 + xgene1_decode_out2 )
 +|( xgene1_decode_out1 + xgene1_decode_out3 )
 +|( xgene1_decode_out2 + xgene1_decode_out3 )
 +)
 +(define_reservation xgene1_decodeIsolated
 +( xgene1_decode_out0 + xgene1_decode_out1 + xgene1_decode_out2 + 
 xgene1_decode_out3 )
 +)
 +
 +(define_insn_reservation xgene1_branch 1
 +  (and (eq_attr tune xgene1)
 +   (eq_attr type branch))
 +  xgene1_decode1op)
 +
 +(define_insn_reservation xgene1_nop 1
 +  (and (eq_attr tune xgene1)
 +   (eq_attr type no_insn))
 +  xgene1_decode1op)
 +
 +(define_insn_reservation xgene1_call 1
 +  (and (eq_attr tune xgene1)
 +   (eq_attr type call))
 +  xgene1_decode2op)
 +
 +(define_insn_reservation xgene1_f_load 10
 +  (and (eq_attr tune xgene1)
 +   (eq_attr type f_loadd,f_loads))
 +  xgene1_decode2op)
 +
 +(define_insn_reservation xgene1_f_store 4
 +  (and (eq_attr tune xgene1)
 +   (eq_attr type f_stored,f_stores))
 +  xgene1_decode2op)
 +
 +(define_insn_reservation xgene1_fmov 2
 +  (and (eq_attr tune xgene1)
 +   (eq_attr type fmov,fconsts,fconstd))
 +  xgene1_decode1op)

Re: [Ping]Re: [PR63762][4.9] Backport the patch which fixes GCC generates UNPREDICTABLE STR with Rn = Rt for arm

2014-12-02 Thread Ramana Radhakrishnan
CCing release maintainers as well as they need to approve this
backport if Vlad is happy with it.

Vlad - is this ok to go back as it fixes a bug for ARM in the 4.9 tree
that came up in building bits of debian.

Ramana

On Mon, Dec 1, 2014 at 5:24 PM, Renlin Li renlin...@arm.com wrote:
 On 01/12/14 15:58, H.J. Lu wrote:

 On Thu, Nov 27, 2014 at 8:38 AM, Renlin Li renlin...@arm.com wrote:

 On 27/11/14 15:37, H.J. Lu wrote:

 On Thu, Nov 27, 2014 at 7:32 AM, Renlin Li renlin...@arm.com wrote:

 On 26/11/14 18:12, H.J. Lu wrote:

 On Wed, Nov 26, 2014 at 10:09 AM, Renlin Li renlin...@arm.com wrote:

 On 26/11/14 12:16, H.J. Lu wrote:

 On Wed, Nov 26, 2014 at 4:07 AM, Renlin Li renlin...@arm.com
 wrote:

 On 20/11/14 16:17, Renlin Li wrote:

 Hi all,

 This is a backport for gcc-4_9-branch of the patch [PR63762]GCC
 generates
 UNPREDICTABLE STR with Rn = Rt for arm posted in:
 https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02253.html

 arm-none-eabi has been test on the model, no new issues.
 bootstrapping
 and
 regression tested on x86, no new issues.

 Is it Okay for gcc-4_9-branch?

 gcc/ChangeLog:

 2014-11-20  Renlin Li  renlin...@arm.com

 PR middle-end/63762
 * ira.c (ira): Update preferred class.

 gcc/testsuite/ChangeLog:

 2014-11-20  Renlin Li  renlin...@arm.com

 PR middle-end/63762
 * gcc.dg/pr63762.c: New.

 Ping for it.

 Please verify if it is the real fix for

 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63661

 If yes, please add a testcase for PR 63661 and mention it in
 your ChangeLog entry.

 Thanks.


 Hi H.J.

 Yes, I have verified that, this patch additionally fixes PR 63661.

 I observed the same behaviour as I saw on arm backend. It will be
 great
 if
 you can double check they are caused by exactly the same reason.

 I will ask our people to take a look.

 A new testcase has been added, ChangeLog has been updated to reflect
 the
 change. Updated patch has bee attached.
 Okay for gcc-4_9-branch?

 Regards,
 Renlin Li


 gcc/ChangeLog:

 2014-11-26  Renlin Li  renlin...@arm.com

PR middle-end/63762
PR middle-end/63661
* ira.c (ira): Update preferred class.

 gcc/testsuite/ChangeLog:

 2014-11-26  Renlin Li  renlin...@arm.com

PR middle-end/63661
PR middle-end/63762
* testsuite/gcc.dg/pr63661.c: New.
* testsuite/gcc.dg/pr63762.c: New.


 pr63661.c should be moved to gcc.target/i386 and run it
 on PIC target.

 Thanks.

 Hi H.J.

 The patch has been adjusted according to your suggestion.


 gcc/ChangeLog:

 2014-11-27  Renlin Li  renlin...@arm.com

   PR middle-end/63762
   PR target/63661
   * ira.c (ira): Update preferred class.

 gcc/testsuite/ChangeLog:

 2014-11-27  Renlin Li  renlin...@arm.com

   PR middle-end/63762
   PR target/63661
   * testsuite/gcc.dg/pr63762.c: New.
   * testsuite/gcc.target/i386/pr63661.c: New.

   ^^^  No testsuite/


 Hi H.J.

 gcc.target/i386/pr63661.c should be checked into trunk first.

 A separate patch is sent to mailing list for this.


 ChangeLog has been corrected.

 Regards,
 Renlin Li

 gcc/ChangeLog:

 2014-11-27  Renlin Li  renlin...@arm.com

  PR middle-end/63762
  PR target/63661
  * ira.c (ira): Update preferred class.

 gcc/testsuite/ChangeLog:

 2014-11-27  Renlin Li  renlin...@arm.com

  PR middle-end/63762
  PR target/63661
  * gcc.dg/pr63762.c: New.
  * gcc.target/i386/pr63661.c: New.


 You need to update gcc.target/i386/pr63661.c from trunk.



 Update the test case for pr63661 from trunk.

 Okay for gcc-4_9-branch?

 gcc/ChangeLog:

 2014-12-01  Renlin Li  renlin...@arm.com

 PR middle-end/63762
 PR target/63661
 * ira.c (ira): Update preferred class.

 gcc/testsuite/ChangeLog:

 2014-12-01  Renlin Li  renlin...@arm.com
  H.J Lu  hongjiu...@intel.com


 PR middle-end/63762
 PR target/63661
 * gcc.dg/pr63762.c: New.
 * gcc.target/i386/pr63661.c: New.


[Patch, committed] contrib/download_prerequisites - download ISL 0.14

2014-12-02 Thread Tobias Burnus
Committed as Rev. 218294, now that ISL 0.14 is supported by GCC 5 and 
ISL 0.14 is in the infrastructure directory.


Tobias
Index: contrib/ChangeLog
===
--- contrib/ChangeLog	(Revision 218293)
+++ contrib/ChangeLog	(Arbeitskopie)
@@ -1,3 +1,7 @@
+2014-12-02  Tobias Burnus  bur...@net-b.de
+
+	* download_prerequisites: Download ISL 0.14 instead of 0.12.2.
+
 2014-11-25  Tom de Vries  t...@codesourcery.com
 	Peter Bergner  berg...@vnet.ibm.com
 
Index: contrib/download_prerequisites
===
--- contrib/download_prerequisites	(Revision 218293)
+++ contrib/download_prerequisites	(Arbeitskopie)
@@ -43,7 +43,7 @@ ln -sf $MPC mpc || exit 1
 
 # Necessary to build GCC with the Graphite loop optimizations.
 if [ $GRAPHITE_LOOP_OPT = yes ] ; then
-  ISL=isl-0.12.2
+  ISL=isl-0.14
 
   wget ftp://gcc.gnu.org/pub/gcc/infrastructure/$ISL.tar.bz2 || exit 1
   tar xjf $ISL.tar.bz2  || exit 1


Re: PR 13631 Problems in messages

2014-12-02 Thread François Dumont

So here is another proposal with all your remarks implemented.

2014-12-02  François Dumont  fdum...@gcc.gnu.org

DR libstdc++/13631
* include/bits/codecvt.h (codecvtchar, char, mbstate_t): friend class
std::messageschar.
(codecvtwchar_t, char, mbstate_t): friend class
std::messageswchar_t.
* config/locale/gnu/messages_member.h
(messageschar::do_open): Specialized.
(messageschar::do_close): Likewise.
(messageswchar_t::do_open): Likewise.
(messageswchar_t::do_close): Likewise.
* config/locale/gnu/messages_member.cc:
(messageschar::do_open): Implement. Use bind_textdomain_codeset based
on codecvtchar, char, mbstate_t._M_c_locale_codecvt code set. Use
internal cache to keep opened domain name with locale information.
(messageswchar_t::do_open): Likewise with
codecvtwchar_t, char, mbstate_t.
(messageschar::do_close): Implement. Clean cache information.
(messageswchar_t::do_close): Likewise.
(get_glibc_msg): New. Use dgettext rather than gettext using cached
domain name associated to catalog id.
(messageschar::do_get): Use latter.
(messageswchar_t::do_get): Likewise and use also cached locale
codecvtwchar_t, char, mbstate_t facet to convert wchar_t default
value to char and the result back to wchar_t.
* testsuite/22_locale/messages/13631.cc: New.
* testsuite/22_locale/messages/members/char/2.cc: Use also fr_FR locale
for charset conversion to get the expected accented character.

Tested under Linux x86_64.

Ok to commit ?

François


On 02/12/2014 11:55, Jonathan Wakely wrote:

If some user has got code that uses messagessigned char and they
provide a definition for messagessigned char::do_get() their code
will break if do_open and do_close disappear. (Realistically I doubt
anyone is doing that though, it may not even work.)

I restored default implementation, it does no harm.


This is not exception safe. You can use auto_ptr to fix it:

 std::auto_ptrCatalog_info info(new 
Catalog_info(_M_catalog_counter++,

   __domain, __l));
 _M_infos.push_back(info.get());
 return info.release()-_M_id 

Good catch.

Index: config/locale/gnu/messages_members.cc
===
--- config/locale/gnu/messages_members.cc	(revision 218285)
+++ config/locale/gnu/messages_members.cc	(working copy)
@@ -31,20 +31,123 @@
 #include locale
 #include bits/c++locale_internal.h
 
-namespace std _GLIBCXX_VISIBILITY(default)
+#include limits
+#include algorithm
+#include vector
+
+#include backward/auto_ptr.h
+#include ext/concurrence.h
+
+namespace
 {
-_GLIBCXX_BEGIN_NAMESPACE_VERSION
+  using namespace std;
 
-  // Specializations.
-  template
-string
-messageschar::do_get(catalog, int, int, const string __dfault) const
+  typedef messages_base::catalog catalog;
+
+  struct Catalog_info
 {
+Catalog_info(catalog __id, const string __domain, locale __loc)
+  : _M_id(__id), _M_domain(__domain), _M_locale(__loc)
+{ }
+
+catalog _M_id;
+string _M_domain;
+locale _M_locale;
+  };
+
+  class Catalogs
+  {
+  public:
+Catalogs() : _M_catalog_counter(0) { }
+
+~Catalogs()
+{
+  for (vectorCatalog_info*::iterator __it = _M_infos.begin();
+	   __it != _M_infos.end(); ++__it)
+	delete *__it;
+}
+
+catalog
+_M_add(const string __domain, locale __l)
+{
+  __gnu_cxx::__scoped_lock lock(_M_mutex);
+
+  // The counter is not likely to roll unless catalogs keep on being
+  // opened/closed which is consider as an application mistake for the
+  // moment.
+  if (_M_catalog_counter == numeric_limitscatalog::max())
+	return -1;
+
+  std::auto_ptrCatalog_info info(new Catalog_info(_M_catalog_counter++,
+			__domain, __l));
+  _M_infos.push_back(info.get());
+  return info.release()-_M_id;
+}
+
+void
+_M_erase(catalog __c)
+{
+  __gnu_cxx::__scoped_lock lock(_M_mutex);
+
+  vectorCatalog_info*::iterator __res =
+	lower_bound(_M_infos.begin(), _M_infos.end(), __c, _Comp());
+  if (__res == _M_infos.end() || (*__res)-_M_id != __c)
+	return;
+
+  delete *__res;
+  _M_infos.erase(__res);
+
+  // Just in case closed catalog was the last open.
+  if (__c == _M_catalog_counter - 1)
+	--_M_catalog_counter;
+}
+
+const Catalog_info*
+_M_get(catalog __c) const
+{
+  __gnu_cxx::__scoped_lock lock(_M_mutex);
+
+  vectorCatalog_info*::const_iterator __res =
+	lower_bound(_M_infos.begin(), _M_infos.end(), __c, _Comp());
+
+  if (__res != _M_infos.end()  (*__res)-_M_id == __c)
+	return *__res;
+
+  return 0;
+}
+
+  private:
+struct _Comp
+{
+  bool operator()(catalog __cat, const Catalog_info* __info) const
+  { return __cat  __info-_M_id; }
+
+  bool operator()(const Catalog_info* __info, catalog __cat) const
+  { return __info-_M_id  __cat; }
+  

Re: [PATCH][ARM] Implement TARGET_SCHED_MACRO_FUSION_PAIR_P

2014-12-02 Thread Ramana Radhakrishnan
On Tue, Nov 11, 2014 at 11:55 AM, Kyrill Tkachov kyrylo.tkac...@arm.com wrote:
 Hi all,

 This is the arm implementation of the macro fusion hook.
 It tries to fuse movw+movt operations together. It also tries to take lo_sum
 RTXs into account since those generate movt instructions as well.

 Bootstrapped and tested on arm-none-linux-gnueabihf.

 Ok for trunk?



  if (current_tune-fuseable_ops  ARM_FUSE_MOVW_MOVT)
+{
+  /* We are trying to fuse
+ movw imm / movt imm
+ instructions as a group that gets scheduled together.  */
+

A comment here about the insn structure would be useful.

+  set_dest = SET_DEST (curr_set);
+  if (GET_CODE (set_dest) == ZERO_EXTRACT)
+{
+  if (CONST_INT_P (SET_SRC (curr_set))
+   CONST_INT_P (SET_SRC (prev_set))
+   REG_P (XEXP (set_dest, 0))
+   REG_P (SET_DEST (prev_set))
+   REGNO (XEXP (set_dest, 0)) == REGNO (SET_DEST (prev_set)))
+return true;
+}
+  else if (GET_CODE (SET_SRC (curr_set)) == LO_SUM
+REG_P (SET_DEST (curr_set))
+REG_P (SET_DEST (prev_set))
+GET_CODE (SET_SRC (prev_set)) == HIGH
+REGNO (SET_DEST (curr_set)) == REGNO (SET_DEST (prev_set)))
+{
+  return true;
+}

Can we add a fast path exit to be

if (GET_MODE (set_dest) != SImode)
  return false;

I did think whether we wanted to use reg_overlap_mentioned_p as that
may simplify the logic a bit but that's  overkill here as we still
want to restrict it to the cases above.

Otherwise OK.

Ramana




+}
+  return false;


 Thanks,
 Kyrill

 2014-11-11  Kyrylo Tkachov  kyrylo.tkac...@arm.com

 * config/arm/arm-protos.h (tune_params): Add fuseable_ops field.
 * config/arm/arm.c (arm_macro_fusion_p): New function.
 (arm_macro_fusion_pair_p): Likewise.
 (TARGET_SCHED_MACRO_FUSION_P): Define.
 (TARGET_SCHED_MACRO_FUSION_PAIR_P): Likewise.
 (ARM_FUSE_NOTHING): Likewise.
 (ARM_FUSE_MOVW_MOVT): Likewise.
 (arm_slowmul_tune, arm_fastmul_tune, arm_strongarm_tune,
 arm_xscale_tune, arm_9e_tune, arm_v6t2_tune, arm_cortex_tune,
 arm_cortex_a8_tune, arm_cortex_a7_tune, arm_cortex_a15_tune,
 arm_cortex_a53_tune, arm_cortex_a57_tune, arm_cortex_a9_tune,
 arm_cortex_a12_tune, arm_v7m_tune, arm_v6m_tune, arm_fa726te_tune
 arm_cortex_a5_tune): Specify fuseable_ops value.


Re: [PATCH driver/diagnostics] init color earlier, add color to driver

2014-12-02 Thread Joseph Myers
On Tue, 2 Dec 2014, Manuel López-Ibáñez wrote:

 2014-12-02  Manuel López-Ibáñez  m...@gcc.gnu.org
 
 * diagnostic.c (diagnostic_color_init): New.
 * diagnostic.h: Declare.
 * gcc.c (driver::global_initializations): Use it.
 (driver_handle_option): Handle -fdiagnostics-color_.
 * toplev.c: Do not include diagnostic-color.h.
 (process_options): Do not initialize color diagnostics here.
 * common.opt (fdiagnostics-color=): Add Driver.
 * opts-global.c (init_options_once): Initialize color here.

The gcc.c / common.opt / opts-global.c changes are OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH obvious/diagnostics] Honor override_column when printing the caret

2014-12-02 Thread Manuel López-Ibáñez
libcpp uses diagnostic-override_column to give a custom column number
to diagnostics. This is taken into account when building the prefix,
but it was missing when placing the caret.

Before:

/home/manuel/override_column.c:1:4: warning: /* within comment [-Wcomment]
 /* /* */
 ^

After:

/home/manuel/override_column.c:1:4: warning: /* within comment [-Wcomment]
 /* /* */
^


Committed as obvious in r218295.


2014-12-03  Manuel López-Ibáñez  m...@gcc.gnu.org

* diagnostic.c (diagnostic_show_locus): Honor override_column when
placing the caret.

--- gcc/diagnostic.c(revision 218278)
+++ gcc/diagnostic.c(working copy)
@@ -308,10 +308,12 @@ diagnostic_show_locus (diagnostic_contex
   || diagnostic-location == context-last_location)
 return;

   context-last_location = diagnostic-location;
   s = expand_location_to_spelling_point (diagnostic-location);
+  if (diagnostic-override_column)
+s.column = diagnostic-override_column;
   line = location_get_source_line (s, line_width);
   if (line == NULL || s.column  line_width)
 return;


Re: [PATCH x86_64] Optimize access to globals in -fpie -pie builds with copy relocations

2014-12-02 Thread H.J. Lu
On Tue, Dec 2, 2014 at 2:14 PM, H.J. Lu hjl.to...@gmail.com wrote:
 On Tue, Dec 2, 2014 at 12:19 PM, Jakub Jelinek ja...@redhat.com wrote:
 On Tue, Dec 02, 2014 at 12:16:09PM -0800, H.J. Lu wrote:
  IMO, nobody will use this highly special x86_64-only option. It would
  be best for gnu-ld to reach feature parity with gold as far as this
  functionality is concerned. In this case, the optimization would be
  auto-configured, and would fire automatically, without any user
  intervention.

 I will implement it in ld after its support is checked into GCC.

 I think it would be better to do it the other way around, so that gcc can be
 configured against the right ld from the start.


 Consider it is done.  I will check it into binutils ths week.


It is on binutils master branch now:

https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=9a926d55ab4b6667f6c35b518d59b902fe490d9d

-- 
H.J.


  1   2   >