date:20170213

Re: [PATCH][RFC] Fix PR79432, SSA from the gimplifier and abnormal edges

2017-02-13 Thread Richard Biener

On Mon, 13 Feb 2017, Jeff Law wrote:

> On 02/10/2017 05:24 AM, Richard Biener wrote:
> > 
> > It turns out the SSA var defs/uses generated by the gimplifier can break
> > apart in a way no longer satisfying the dominance requirement of SSA
> > uses vs. their defs by means of CFG construction adding abnormal edges
> > for stuff like setjmp (but also non-local gotos I guess).
> > 
> > This would be quite costly to overcome in gimplification - one needs
> > to check whether a (part?) of an expression to be gimplified may
> > produce such edges and disable SSA name generation for them.  With
> > the recursive nature of the gimplifier plus the general complexity
> > of GENERIC I can't see how to do this.
> ISTM there is no a priori way to know if any object is live across a call which
> would create these abnormal edges.  How in the world did this work in the
> past?

It worked in the past because we didn't use SSA names for temporaries in
the gimplifier and regular decls just got effectively uninitialized
uses across such edges (by into-SSA and PHI insertion, exactly how I
fix this "after the fact").

> Sigh.  Because the problem starts at gimplification, we can't even do
> something like pre-scan the block for problem calls -- we don't have a CFG.
> 
> Can this only happen during gimplification of an expression where a
> sub-expression has a problematic call?  How bad do you think things would blow
> up if we scanned for a call and avoided using SSA_NAMEs when there's a
> problematical call within the expression?  Or is there no way structurally to
> do that within the gimplifier?

Well, the only thing I could imagine is doing walk_tree on all GENERIC
where it matters (all?).

> Are setjmp/longjmp the only problems here, or does EH tickle these issues as
> well (if so, then ISTM you'd need a different test)?

The only issue is when CFG construction adds backedges which only happens
for abnormals.  Only backedges can make uses no longer dominate defs.
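
As a minimal illustration of such an abnormal edge (a sketch only, not the PR79432 testcase; the function name count_setjmp_returns and its parameter are made up for this example): longjmp transfers control back to the setjmp call, so the code following the call is reached again along an edge that does not appear in the structured source.

```c
#include <setjmp.h>

static jmp_buf env;

/* Returns how many times control passed the setjmp call when longjmp
   is invoked JUMPS times.  Hypothetical helper for illustration.  */
int count_setjmp_returns(int jumps)
{
    /* volatile: locals modified between setjmp and longjmp must be
       volatile to have a defined value after the jump.  */
    volatile int passes = 0;
    volatile int remaining = jumps;

    setjmp(env);            /* longjmp re-enters here: the abnormal back edge */
    passes++;
    if (remaining > 0) {
        remaining--;
        longjmp(env, 1);    /* jump back to the setjmp call above */
    }
    return passes;
}
```

Calling count_setjmp_returns(3) passes the setjmp call four times: once on the normal path and three times via longjmp.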

> > 
> > Thus the following patch "recovers" from the extra abnormal edges
> > by effectively treating SSA vars pre into-SSA as "non-SSA" and thus
> > doing PHI insertion for them when rewriting the function into SSA.
> > Implementation-wise the easiest thing was to re-write the affected
> > SSA vars out-of-SSA (replace them by a decl).
> If we can't do the right thing from the start, then "recovery" seems to be the
> only option.

I think it's a question of cost and maintainability.  It for sure must
be possible to fix this in the gimplifier but I think the costs are too 
high.

> 
> > 
> > The out-of-SSA rewriting is placed in insert_phi_nodes because that's
> > the first point in time we have immediate uses present (otherwise
> > it could be done at any point after CFG construction).
> The earlier the better.  So as soon as we have a CFG & SSA we have to fix up.
> So location seems right.

Yeah, if we'd have immediate uses at CFG construction then I'd fix it
at the time we introduce those abnormal edges.  For GCC 8 we might want
to consider this (building SSA operands earlier).

> > 
> > I'm not 100% happy with this but after trying for a day I can't come
> > up with a better solution - it has the chance of papering over
> > bogus gimplifications but OTOH that we only do this for functions
> > calling setjmp or having non-local labels would make those trigger
> > SSA verification in the other cases.
> > 
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> > 
> > Any comments / ideas?  Anyone with good arguments that this is
> > even conceptually the correct thing to do?
> > 
> > I'm bootstrapping this with the calls_setjmp/has_nonlocal_label
> > short-cutted with a 1 || as well as without (to increase testing
> > coverage).
> > 
> > Thanks,
> > Richard.
> > 
> > 2017-02-10  Richard Biener  
> > 
> > PR middle-end/79432
> > * tree-into-ssa.c (insert_phi_nodes): When the function can
> > have abnormal edges rewrite SSA names with broken use-def
> > dominance out of SSA and register them for PHI insertion.
> > 
> > * gcc.dg/torture/pr79432.c: New testcase.
> It's gross.  But unless we have a reasonable way to do this during
> gimplification, I don't see anything better.

Ok.  I've applied the patch for now.  If anybody comes up with a cheap
enough idea to cover these in the gimplifier we can reconsider.

Richard.

> jeff
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)

[PATCH] [X86_64] Fix alignment for znver1 arch.

2017-02-13 Thread Pawar, Amit

Hi maintainers,

Please find the below patch which changes the code alignment values for znver1. 
Bootstrap and regression test passed on x86_64.
OK to apply?

Thanks,
Amit Pawar


diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 2561d53..a5b0159 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2017-02-13  Amit Pawar  
+
+ * config/i386/i386.c (znver1_cost): Fix the alignment for function and
+ max skip bytes for function, loop and jump.
+
 2017-02-13  Martin Sebor  

  PR middle-end/79496
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index d7dce4b..d9a4a38 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -2672,7 +2672,7 @@ static const struct ptt processor_target_table[PROCESSOR_max] =
   {"bdver4", &bdver4_cost, 16, 10, 16, 7, 11},
   {"btver1", &btver1_cost, 16, 10, 16, 7, 11},
   {"btver2", &btver2_cost, 16, 10, 16, 7, 11},
-  {"znver1", &znver1_cost, 16, 10, 16, 7, 11}
+  {"znver1", &znver1_cost, 16, 15, 16, 15, 16}
 };
 ^L
 static unsigned int

Re: [PATCH] use zero as the lower bound for a signed-unsigned range (PR 79327)

2017-02-13 Thread Jakub Jelinek

On Mon, Feb 13, 2017 at 04:53:19PM -0700, Jeff Law wrote:
> > dirtype is one of the standard {un,}signed {char,short,int,long,long long}
> > types, all of them have 0 in their ranges.
> > For VR_RANGE we almost always set res.knownrange to true:
> >   /* Set KNOWNRANGE if the argument is in a known subrange
> >  of the directive's type (KNOWNRANGE may be reset below).  */
> >   res.knownrange
> > = (!tree_int_cst_equal (TYPE_MIN_VALUE (dirtype), argmin)
> >|| !tree_int_cst_equal (TYPE_MAX_VALUE (dirtype), argmax));
> > (the exception is in case that range clearly has to include zero),
> > and reset it only if adjust_range_for_overflow returned true, which means
> > it also set the range to TYPE_M{IN,AX}_VALUE (dirtype) and again
> > includes zero.
> > So IMNSHO likely_adjust in what you've committed is always true
> > when you use it and thus just a useless computation and something to make
> > the code harder to understand.
> If KNOWNRANGE is false, then LIKELY_ADJUST will be true.  But I don't see
> how we can determine anything for LIKELY_ADJUST if KNOWNRANGE is true.

We can't, but that doesn't matter, we only use it if KNOWNRANGE is false.
The only user of LIKELY_ADJUST is:
 
  if (res.knownrange)
    res.range.likely = res.range.max;
  else
    {
      // -- Here we know res.knownrange is false
      res.range.likely = res.range.min;
      if (likely_adjust && maybebase && base != 10)
        // -- and here is the only user of likely_adjust
        {
          if (res.range.min == 1)
            res.range.likely += base == 8 ? 1 : 2;
          else if (res.range.min == 2
                   && base == 16
                   && (dir.width[0] == 2 || dir.prec[0] == 2))
            ++res.range.likely;
        }
    }
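
The quoted logic can be modelled outside GCC as a small standalone function. This is a simplified sketch with likely_adjust already removed, as argued above; the name likely_bytes and the flat scalar parameters are assumptions made for illustration and do not match the real fmtresult/directive types.

```c
/* Simplified model of the "likely" byte count computed above.
   KNOWNRANGE, MIN and MAX stand in for res.knownrange and
   res.range.{min,max}; MAYBEBASE is nonzero when a base prefix
   ("0" or "0x") may be printed; WIDTH and PREC stand in for
   dir.width[0] and dir.prec[0].  All names are illustrative.  */
unsigned long likely_bytes(int knownrange, unsigned long min,
                           unsigned long max, int maybebase,
                           int base, int width, int prec)
{
    if (knownrange)
        return max;

    unsigned long likely = min;
    if (maybebase && base != 10) {
        if (min == 1)
            /* Room for the prefix: 1 byte for octal, 2 for hex.  */
            likely += base == 8 ? 1 : 2;
        else if (min == 2 && base == 16 && (width == 2 || prec == 2))
            ++likely;
    }
    return likely;
}
```

With an unknown range, a minimum of one byte and a hexadecimal directive that may print a prefix, the likely count becomes 3; for decimal it stays at the minimum.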

> > Even if you don't trust this, with the ranges in argmin/argmax, it is
> > IMHO undesirable to set it differently at the different code paths,
> > if you want to check whether the final range includes zero and at least
> > one other value, just do
> > -  if (likely_adjust && maybebase && base != 10)
> > +  if ((tree_int_cst_sgn (argmin) < 0 || tree_int_cst_sgn (argmax) > 0)
> >&& maybebase && base != 10)
> > Though, it is useless both for the above reason and for the reason that you
> > actually do something only:
> I'm not convinced it's useless, but it does seem advisable to bring the test
> down to where it's actually used and to base it strictly on argmin/argmax.
> Can you test a patch which does that?

That would then be (the only difference compared to the previous patch is
the last hunk) the following.  I can surely test that, I'm still convinced it
would work equally if that
(tree_int_cst_sgn (argmin) < 0 || tree_int_cst_sgn (argmax) > 0)
is just gcc_checking_assert.

2017-02-14  Jakub Jelinek  

PR tree-optimization/79327
* gimple-ssa-sprintf.c (format_integer): Remove likely_adjust
variable, its initialization and use.

--- gcc/gimple-ssa-sprintf.c.jj 2017-02-04 08:43:12.0 +0100
+++ gcc/gimple-ssa-sprintf.c2017-02-04 08:45:33.173709580 +0100
@@ -1232,10 +1232,6 @@ format_integer (const directive , tr
of the format string by returning [-1, -1].  */
 return fmtresult ();
 
-  /* True if the LIKELY counter should be adjusted upward from the MIN
- counter to account for arguments with unknown values.  */
-  bool likely_adjust = false;
-
   fmtresult res;
 
   /* Using either the range the non-constant argument is in, or its
@@ -1265,14 +1261,6 @@ format_integer (const directive , tr
 
  res.argmin = argmin;
  res.argmax = argmax;
-
- /* Set the adjustment for an argument whose range includes
-zero since that doesn't include the octal or hexadecimal
-base prefix.  */
- wide_int wzero = wi::zero (wi::get_precision (min));
- if (wi::le_p (min, wzero, SIGNED)
- && !wi::neg_p (max))
-   likely_adjust = true;
}
   else if (range_type == VR_ANTI_RANGE)
{
@@ -1307,11 +1295,6 @@ format_integer (const directive , tr
 
   if (!argmin)
 {
-  /* Set the adjustment for an argument whose range includes
-zero since that doesn't include the octal or hexadecimal
-base prefix.  */
-  likely_adjust = true;
-
   if (TREE_CODE (argtype) == POINTER_TYPE)
{
  argmin = build_int_cst (pointer_sized_int_node, 0);
@@ -1371,7 +1354,8 @@ format_integer (const directive , tr
   else
 {
   res.range.likely = res.range.min;
-  if (likely_adjust && maybebase && base != 10)
+  if (maybebase && base != 10
+ && (tree_int_cst_sgn (argmin) < 0 || tree_int_cst_sgn (argmax) > 0))
{
  if (res.range.min == 1)
res.range.likely += base == 8 ? 1 : 2;


Jakub

[RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR V3

2017-02-13 Thread Jeff Law



This is the first patch in the series with Richi's comments from last 
week addressed.  #2, #3 and #4 were unchanged.


Richi asked for the EXACT_DIV_EXPR handling in 
extract_range_from_binary_expr_1 to move out one IF conditional nesting 
level.


Richi noted that the use of symbolic_range_based_on_p was unsafe in the 
context I used it in extract_range_from_binary_expr, and that the test 
we want to make happens to be simpler as well.


And finally we use Richi's heuristic for when to prefer ~[0,0] over a 
wide normal range.
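
A standalone sketch of that width heuristic, as I read it from the intersect_ranges hunk later in this message: the anti-range ~[0,0] is preferred when the leading-zero count of (max - min) is below half the precision, i.e. the normal range spans more than half of the available bits. The sketch fixes the precision at 64 bits and uses plain integers; the helper names clz64 and prefer_nonzero_antirange are made up here, and the patch's additional checks (pointer-sized precision, INTEGER_CST bounds) are omitted.

```c
#include <stdint.h>

/* Count leading zero bits of X; returns 64 for X == 0, mirroring
   what a full-precision clz would report.  Illustrative helper.  */
static unsigned clz64(uint64_t x)
{
    unsigned n = 0;
    if (x == 0)
        return 64;
    while ((x & (UINT64_C(1) << 63)) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Returns 1 if the range [MIN, MAX] is wide enough that ~[0,0]
   would be preferred under the heuristic, 0 otherwise.  */
int prefer_nonzero_antirange(uint64_t min, uint64_t max)
{
    return clz64(max - min) < 64 / 2;
}
```

Under this model a range such as [0, 10] keeps the normal range, while a range whose width needs more than 32 bits gives way to ~[0,0].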


Bootstrapped and regression tested with the other 3 patches in this kit 
on x86_64-unknown-linux-gnu.


All 4 patches are attached to this message for ease of review.


Ok for the trunk?

Thanks,
jeff
* tree-vrp.c (extract_range_from_binary_expr_1): For EXACT_DIV_EXPR,
if the numerator has the range ~[0,0] make the resultant range ~[0,0].
(extract_range_from_binary_expr): For MINUS_EXPR with no derived range,
if the operands are known to be not equal, then the resulting range
is ~[0,0].
(intersect_ranges): If the new range is ~[0,0] and the old range is
wide, then prefer ~[0,0].
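
The first two ChangeLog items rest on simple arithmetic facts, sketched here in plain C rather than on GCC's value_range type: a nonzero numerator that is an exact multiple of the divisor cannot divide to zero, and two unequal pointers into the same array cannot have a zero difference. The helper names exact_div and elem_distance are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Exact division: the caller guarantees D != 0 and that N is a
   multiple of D, which is the contract of EXACT_DIV_EXPR.  */
int64_t exact_div(int64_t n, int64_t d)
{
    return n / d;
}

/* Element distance between two pointers into the same array;
   nonzero whenever P != Q.  */
ptrdiff_t elem_distance(const int *p, const int *q)
{
    return p - q;
}
```

If n is known to be in ~[0,0], then exact_div(n, d) is in ~[0,0] as well; likewise elem_distance(p, q) is in ~[0,0] whenever p is known to differ from q.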

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index b429217..9174948 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -2259,6 +2259,19 @@ extract_range_from_binary_expr_1 (value_range *vr,
   else if (vr1.type == VR_UNDEFINED)
 set_value_range_to_varying ();
 
+  /* We get imprecise results from ranges_from_anti_range when
+ code is EXACT_DIV_EXPR.  We could mask out bits in the resulting
+ range, but then we also need to hack up vrp_meet.  It's just
+ easier to special case when vr0 is ~[0,0] for EXACT_DIV_EXPR.  */
+  if (code == EXACT_DIV_EXPR
+  && vr0.type == VR_ANTI_RANGE
+  && vr0.min == vr0.max
+  && integer_zerop (vr0.min))
+{
+  set_value_range_to_nonnull (vr, expr_type);
+  return;
+}
+
   /* Now canonicalize anti-ranges to ranges when they are not symbolic
  and express ~[] op X as ([]' op X) U ([]'' op X).  */
   if (vr0.type == VR_ANTI_RANGE
@@ -3298,6 +3311,21 @@ extract_range_from_binary_expr (value_range *vr,
 
   extract_range_from_binary_expr_1 (vr, code, expr_type, _vr0, );
 }
+
+  /* If we didn't derive a range for MINUS_EXPR, and
+ op1's range is ~[op0,op0] or vice-versa, then we
+ can derive a non-null range.  This happens often for
+ pointer subtraction.  */
+  if (vr->type == VR_VARYING
+  && code == MINUS_EXPR
+  && TREE_CODE (op0) == SSA_NAME
+  && ((vr0.type == VR_ANTI_RANGE
+  && vr0.min == op1
+  && vr0.min == vr0.max)
+ || (vr1.type == VR_ANTI_RANGE
+ && vr1.min == op0
+ && vr1.min == vr1.max)))
+  set_value_range_to_nonnull (vr, TREE_TYPE (op0));
 }
 
 /* Extract range information from a unary operation CODE based on
@@ -8620,6 +8648,17 @@ intersect_ranges (enum value_range_type *vr0type,
  else if (vrp_val_is_min (vr1min)
   && vrp_val_is_max (vr1max))
;
+ /* Choose the anti-range if it is ~[0,0], that range is special
+enough to special case when vr1's range is relatively wide.  */
+ else if (*vr0min == *vr0max
+  && integer_zerop (*vr0min)
+  && (TYPE_PRECISION (TREE_TYPE (*vr0min))
+  == TYPE_PRECISION (ptr_type_node))
+  && TREE_CODE (vr1max) == INTEGER_CST
+  && TREE_CODE (vr1min) == INTEGER_CST
+  && (wi::clz (wi::sub (vr1max, vr1min))
+  < TYPE_PRECISION (TREE_TYPE (*vr0min)) / 2))
+   ;
  /* Else choose the range.  */
  else
{
* tree-vrp.c (overflow_comparison_p_1): New function.
(overflow_comparison_p): New function.

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index ad8173c..2c03a74 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -5186,6 +5186,118 @@ masked_increment (const wide_int _in, const 
wide_int ,
   return val ^ sgnbit;
 }
 
+/* Helper for overflow_comparison_p
+
+   OP0 CODE OP1 is a comparison.  Examine the comparison and potentially
+   OP1's defining statement to see if it ultimately has the form
+   OP0 CODE (OP0 PLUS INTEGER_CST)
+
+   If so, return TRUE indicating this is an overflow test and store into
+   *NEW_CST an updated constant that can be used in a narrowed range test.
+
+   REVERSED indicates if the comparison was originally:
+
+   OP1 CODE' OP0.
+
+   This affects how we build the updated constant.  */
+
+static bool
+overflow_comparison_p_1 (enum tree_code code, tree op0, tree op1,
+bool follow_assert_exprs, bool reversed, tree *new_cst)
+{
+  /* See if this is a relational operation between two SSA_NAMES with
+ unsigned, overflow wrapping values.  If so, check it more deeply.  */
+  if ((code == LT_EXPR || code == LE_EXPR
+   || code == GE_EXPR || code == GT_EXPR)
+  &&

[PATCH][RFA][target/79404] Fix uninitialized reference to ira_register_move_cost[mode]

2017-02-13 Thread Jeff Law



So imagine we have two allocnos related by a copy chain (two operand 
architecture).


(gdb) p *cp->first
$11 = {num = 9, regno = 33, mode = DImode, wmode = DImode, aclass = 
GENERAL_REGS, dont_reassign_p = 0,
  bad_spill_p = 0, assigned_p = 1, conflict_vec_p = 0, hard_regno = -1, 
next_regno_allocno = 0x0,
  loop_tree_node = 0x1e0b190, nrefs = 13, freq = 8069, class_cost = 
1380, updated_class_cost = 1380,
  memory_cost = 29656, updated_memory_cost = 29656, 
excess_pressure_points_num = 17, allocno_prefs = 0x0,
  allocno_copies = 0x1e4b400, cap = 0x0, cap_member = 0x0, num_objects 
= 1, objects = {0x1e8b6a0, 0x0},
  call_freq = 0, calls_crossed_num = 0, cheap_calls_crossed_num = 0, 
crossed_calls_clobbered_regs = 0,
  hard_reg_costs = 0x1da9510, updated_hard_reg_costs = 0x0, 
conflict_hard_reg_costs = 0x0,

  updated_conflict_hard_reg_costs = 0x0, add_data = 0x1e04378}

(gdb) p *cp->second
$12 = {num = 12, regno = 39, mode = SImode, wmode = SImode, aclass = 
GENERAL_REGS, dont_reassign_p = 0,
  bad_spill_p = 1, assigned_p = 1, conflict_vec_p = 0, hard_regno = 2, 
next_regno_allocno = 0x0,
  loop_tree_node = 0x1e0b190, nrefs = 2, freq = 388, class_cost = 0, 
updated_class_cost = 0, memory_cost = 1552,
  updated_memory_cost = 1552, excess_pressure_points_num = 0, 
allocno_prefs = 0x0, allocno_copies = 0x1e4b400,
  cap = 0x0, cap_member = 0x0, num_objects = 2, objects = {0x1e8b7e0, 
0x1e8b830}, call_freq = 0,
  calls_crossed_num = 0, cheap_calls_crossed_num = 0, 
crossed_calls_clobbered_regs = 0,
  hard_reg_costs = 0x1da9550, updated_hard_reg_costs = 0x0, 
conflict_hard_reg_costs = 0x0,

  updated_conflict_hard_reg_costs = 0x0, add_data = 0x1e04480}


Note how cp->first is mode DImode.

Now assume that all the real uses of cp->first occur as SUBREG 
expressions.  But there is a DImode clobber of cp->first.  Like this:



(insn 7 2 3 2 (clobber (reg/v:DI 33 [ u ])) 
"/home/gcc/GIT-2/gcc/libgcc/libgcc2.c":404 -1

 (nil))
(insn 3 7 4 2 (set (subreg:HI (reg/v:DI 33 [ u ]) 0)
(mem/c:HI (reg/f:HI 9 ap) [4 u+0 S2 A16])) 
"/home/gcc/GIT-2/gcc/libgcc/libgcc2.c":404 5 {*movhi_h8300}

 (nil))
(insn 4 3 5 2 (set (subreg:HI (reg/v:DI 33 [ u ]) 2)
(mem/c:HI (plus:HI (reg/f:HI 9 ap)
(const_int 2 [0x2])) [4 u+2 S2 A16])) 
"/home/gcc/GIT-2/gcc/libgcc/libgcc2.c":404 5 {*movhi_h8300}

 (nil))
[ ... ]
(insn 35 32 37 5 (parallel [
(set (reg:SI 39 [ _32 ])
(lshiftrt:SI (subreg:SI (reg/v:DI 33 [ u ]) 0)
(subreg:QI (reg:HI 38) 1)))
(clobber (scratch:QI))
]) "/home/gcc/GIT-2/gcc/libgcc/libgcc2.c":415 229 {*shiftsi}
 (expr_list:REG_DEAD (reg:HI 38)
(expr_list:REG_DEAD (reg/v:DI 33 [ u ])
(expr_list:REG_EQUIV (mem/j/c:SI (plus:HI (reg/f:HI 11 fp)
(const_int -4 [0xfffc])) [1 
w.s.low+0 S4 A16])

(nil)

There are other references to (reg 33), but again, they all use subregs. 
The only real DImode reference to (reg 33) is in the clobber.  And 
remember that (reg 33) is involved in a copy chain.



So we'll eventually call allocno_copy_cost_saving and try to compute a 
cost savings using:


  cost += cp->freq * ira_register_move_cost[allocno_mode][rclass][rclass];


But ira_register_move_cost[DImode] is NULL -- it's never been 
initialized, presumably because we never see a real DImode reference to 
anything except in CLOBBER statements.
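
The failure mode can be pictured with a toy model of a per-mode cost table that is filled in lazily. This is a sketch under stated assumptions, not IRA's actual data structures: the toy_ names, the two-entry mode enum and the cost values are all invented for illustration.

```c
#include <stdlib.h>

enum toy_mode { TOY_SImode, TOY_DImode, TOY_NUM_MODES };

/* One cost row per mode, allocated on first use.  A NULL row means
   the mode was never seen by the scanner.  */
static int *toy_move_cost[TOY_NUM_MODES];

/* Allocate and fill the row for MODE unless that already happened.  */
void toy_init_move_cost_if_necessary(enum toy_mode mode)
{
    if (toy_move_cost[mode] != NULL)
        return;
    toy_move_cost[mode] = (int *) malloc(sizeof(int));
    if (toy_move_cost[mode] == NULL)
        abort();
    toy_move_cost[mode][0] = (mode == TOY_DImode) ? 4 : 2; /* made-up costs */
}

/* Returns 1 if a cost lookup for MODE is safe.  */
int toy_cost_ready(enum toy_mode mode)
{
    return toy_move_cost[mode] != NULL;
}

/* Models the cost-saving lookup; dereferences the row, so the row
   must have been initialized first.  */
int toy_copy_cost(int freq, enum toy_mode mode)
{
    return freq * toy_move_cost[mode][0];
}
```

In this model, skipping the initialization for a mode that only appears in a CLOBBER leaves its row NULL, and the later toy_copy_cost lookup would dereference it; initializing when the CLOBBER is scanned avoids that.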


We can fix this in scan_one_insn via the attached patch.  I'm not sure 
if this is the best place to catch this or not.


I haven't included a testcase as this trips just building libgcc on the 
H8 target.  I could easily reduce it if folks think it's worth the trouble.


I've verified this allows libgcc to build on the H8 target and 
bootstrapped/regression tested the change on x86_64-unknown-linux-gnu as 
well.


Vlad, is this OK for the trunk, or should we be catching this elsewhere?



Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index f352051..2170e57 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,9 @@
+2017-02-13 Jeff Law  
+
+   PR target/79404
+   * ira-costs.c (scan_one_insn): Initialize register move costs
+   for pseudos seen in USE/CLOBBER insns.
+
 2017-02-13  Aaron Sawdey  
 
PR target/79449
diff --git a/gcc/ira-costs.c b/gcc/ira-costs.c
index c561db6..1737430 100644
--- a/gcc/ira-costs.c
+++ b/gcc/ira-costs.c
@@ -1438,9 +1438,25 @@ scan_one_insn (rtx_insn *insn)
 return insn;
 
   pat_code = GET_CODE (PATTERN (insn));
-  if (pat_code == USE || pat_code == CLOBBER || pat_code == ASM_INPUT)
+  if (pat_code == ASM_INPUT)
 return insn;
 
+  /* If INSN is a USE/CLOBBER of a pseudo in a mode M then go ahead
+ and initialize the register move costs of mode M.
+
+ The pseudo may be related to another pseudo via a copy (implicit or
+ explicit) and if there are no mode M uses/sets of the original
+

[patch, doc] copy-edit ARC options documentation

2017-02-13 Thread Sandra Loosemore


On 02/11/2017 09:18 PM, Sandra Loosemore wrote:

I noticed a bunch of copy-editing issues in the "ARC Options" section of
invoke.texi.  I'm willing to take a stab at fixing them, but I need some
technical assistance since I'm not familiar with the details of this
architecture myself.


Here's the patch I've come up with so far.  Does this look OK 
technically?  I did the best I could with trying to identify instruction 
names, rewrite jargon, etc but I may have guessed wrong in some cases.


I also found a couple more issues not addressed here.

(1) It looks from the implementation in config/arc/arc.opt that 
-mmpy-option= takes string arguments as well as the numbers currently 
documented.  It also supports options numbered 7 (plus_dmpy), 8 
(plus_macd), and 9 (plus_qmacw) that are not currently documented at 
all.  Help?


(2) I have to wonder about the usefulness of some of the documented code 
generation options.  I left this stuff alone for now, but e.g. 
documentation like "Enable bbit peephole2" is not helpful to users, and 
are options that enable constraints that are *required* for code 
generation useful to anyone?  Suggesting that some of these options be 
removed is out of my realm as a documentation maintainer, but I keep 
hearing complaints that we have too many options, the manual is too 
long, it devotes too much space to obscure features users shouldn't 
really need to know about, etc.


-Sandra


2017-02-13  Sandra Loosemore  

	gcc/
	* doc/invoke.texi (ARC Options): Copy-edit to fix punctuation,
	markup, and similar issues.  Remove @opindex entries for things
	that aren't options.
Index: gcc/doc/invoke.texi
===
--- gcc/doc/invoke.texi	(revision 245370)
+++ gcc/doc/invoke.texi	(working copy)
@@ -14247,70 +14247,58 @@ Compile for ARC EM.
 Compile for ARC HS.
 
 @item em
-@opindex em
-Compile for ARC EM cpu with no hardware extension.
+Compile for ARC EM CPU with no hardware extension.
 
 @item em4
-@opindex em4
-Compile for ARC EM4 cpu.
+Compile for ARC EM4 CPU.
 
 @item em4_dmips
-@opindex em4_dmips
-Compile for ARC EM4 DMIPS cpu.
+Compile for ARC EM4 DMIPS CPU.
 
 @item em4_fpus
-@opindex em4_fpus
-Compile for ARC EM4 DMIPS cpu with single precision floating point
+Compile for ARC EM4 DMIPS CPU with single-precision floating-point
 extension.
 
 @item em4_fpuda
-@opindex em4_fpuda
-Compile for ARC EM4 DMIPS cpu with single precision floating point and
-double assists instructions.
+Compile for ARC EM4 DMIPS CPU with single-precision floating-point and
+double assist instructions.
 
 @item hs
-@opindex hs
-Compile for ARC HS cpu with no hardware extension, except the atomic
+Compile for ARC HS CPU with no hardware extensions except the atomic
 instructions.
 
 @item hs34
-@opindex hs34
-Compile for ARC HS34 cpu.
+Compile for ARC HS34 CPU.
 
 @item hs38
-@opindex hs38
-Compile for ARC HS38 cpu.
+Compile for ARC HS38 CPU.
 
 @item hs38_linux
-@opindex hs38_linux
-Compile for ARC HS38 cpu with all hardware extensions on.
+Compile for ARC HS38 CPU with all hardware extensions on.
 
 @item arc600_norm
-@opindex arc600_norm
-Compile for ARC 600 cpu with norm instruction enabled.
+Compile for ARC 600 CPU with @code{norm} instructions enabled.
 
 @item arc600_mul32x16
-@opindex arc600_mul32x16
-Compile for ARC 600 cpu with norm and mul32x16 instructions enabled.
+Compile for ARC 600 CPU with @code{norm} and 32x16-bit multiply 
+instructions enabled.
 
 @item arc600_mul64
-@opindex arc600_mul64
-Compile for ARC 600 cpu with norm and mul64 instructions enabled.
+Compile for ARC 600 CPU with @code{norm} and @code{mul64}-family 
+instructions enabled.
 
 @item arc601_norm
-@opindex arc601_norm
-Compile for ARC 601 cpu with norm instruction enabled.
+Compile for ARC 601 CPU with @code{norm} instructions enabled.
 
 @item arc601_mul32x16
-@opindex arc601_mul32x16
-Compile for ARC 601 cpu with norm and mul32x16 instructions enabled.
+Compile for ARC 601 CPU with @code{norm} and 32x16-bit multiply
+instructions enabled.
 
 @item arc601_mul64
-@opindex arc601_mul64
-Compile for ARC 601 cpu with norm and mul64 instructions enabled.
+Compile for ARC 601 CPU with @code{norm} and @code{mul64}-family
+instructions enabled.
 
 @item nps400
-@opindex nps400
 Compile for ARC 700 on NPS400 chip.
 
 @end table
@@ -14339,20 +14327,21 @@ supported.  This is always enabled for @
 
 @item -mno-mpy
 @opindex mno-mpy
-Do not generate mpy instructions for ARC700.  This instruction is
+Do not generate @code{mpy}-family instructions for ARC700.  This option is
 deprecated.
 
 @item -mmul32x16
 @opindex mmul32x16
-Generate 32x16 bit multiply and mac instructions.
+Generate 32x16-bit multiply and multiply-accumulate instructions.
 
 @item -mmul64
 @opindex mmul64
-Generate mul64 and mulu64 instructions.  Only valid for @option{-mcpu=ARC600}.
+Generate @code{mul64} and @code{mulu64} instructions.  
+Only valid for @option{-mcpu=ARC600}.

Re: PR rtl-optimization/64081: Enable RTL loop unrolling for duplicated exit blocks and back edges (with AIX fixes)

2017-02-13 Thread Jeff Law


On 02/09/2017 11:08 AM, Aldy Hernandez wrote:

For those of you not following the PR, this is a re-post of a patch that
was approved in some form a year+ ago, but was reverted because it
caused an undiagnosed bootstrap problem on AIX:

https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00421.html

Annoyingly, the AIX problem disappeared with some unrelated changes to
SCCVN by Richard in more recent GCC's.  Consequently, the patch got
blocked because we could no longer reproduce the problem, but it wasn't
entirely clear that it was gone.

As penance for sins in a previous life, I have taken it upon myself to
reproduce this problem back in time and find what caused the AIX failure
back then.  I will spare the list the boring process, but the problem
with the original patch was that it changed the semantics of
check_simple_exit, but not all of its indirect callers.  This caused
two versions of the same loop to have different unsynchronized iteration
variables-- one in a simple register, and one in a doloop variant on ppc.

I have bootstrapped this patch around trunk@226811 on AIX (all
languages), which is the latest time AIX failed to bootstrap with the
original patch.  My testing environment included a handful of other
unrelated changes that were required to coerce AIX into building
GCC5-ish with GCC6.1.

I have also bootstrapped and tested this patch at today's trunk on
x86-64 Linux with no adverse effects.

As I've mentioned, I believe the original patch was previously approved,
so to aid in reviewing I am including the full patch ("full-patch") with
my changes on top of the original patch, as well as an incremental patch
("incremental-patch") with my recent changes to bring AIX to happiness.

I understand if changes to the RTL looping infrastructure are too late
for this release cycle, and as I am not the original author-- don't kill
the messenger!  I just want to move this forward, even if it means
getting approval for GCC8.  This will at the very least allow me to
sleep at night, knowing I won't have to touch AIX for a while (or at
least wrt this PR) ;-).

Original author(s) CC'ed.
So it seems in your updated patch there is only one call where we ask 
for LOOP_EXIT_COMPLEX, specifically the call from get_loop_location.


But I don't see how asking for LOOP_EXIT_COMPLEX from that location 
would change whether or not we unroll any given loop (which is the core 
of bz64081).


Am I missing something?

Jeff

Re: [PATCH] use zero as the lower bound for a signed-unsigned range (PR 79327)

2017-02-13 Thread Jeff Law


On 02/04/2017 01:07 AM, Jakub Jelinek wrote:

On Fri, Feb 03, 2017 at 05:39:21PM +0100, Jakub Jelinek wrote:

Say in the http://gcc.gnu.org/ml/gcc-patches/2017-02/msg00234.html
patch, you would with my patch need just the tree_digits part,
and then the very last hunk in gimple-ssa-sprintf.c (with
likely_adjust &&
removed).  Because you do the adjustments only if !res.knownrange
and in that case you know argmin/argmax are actually dirtype's min/max,
so 0 must be in the range.


You've committed the patch unnecessarily complicated, see above.
The following gives the same testsuite result.

As you know, just getting the same testsuite result is not sufficient ;-)



dirtype is one of the standard {un,}signed {char,short,int,long,long long}
types, all of them have 0 in their ranges.
For VR_RANGE we almost always set res.knownrange to true:
  /* Set KNOWNRANGE if the argument is in a known subrange
 of the directive's type (KNOWNRANGE may be reset below).  */
  res.knownrange
= (!tree_int_cst_equal (TYPE_MIN_VALUE (dirtype), argmin)
   || !tree_int_cst_equal (TYPE_MAX_VALUE (dirtype), argmax));
(the exception is in case that range clearly has to include zero),
and reset it only if adjust_range_for_overflow returned true, which means
it also set the range to TYPE_M{IN,AX}_VALUE (dirtype) and again
includes zero.
So IMNSHO likely_adjust in what you've committed is always true
when you use it and thus just a useless computation and something to make
the code harder to understand.
If KNOWNRANGE is false, then LIKELY_ADJUST will be true.  But I don't 
see how we can determine anything for LIKELY_ADJUST if KNOWNRANGE is true.





Even if you don't trust this, with the ranges in argmin/argmax, it is
IMHO undesirable to set it differently at the different code paths,
if you want to check whether the final range includes zero and at least
one other value, just do
-  if (likely_adjust && maybebase && base != 10)
+  if ((tree_int_cst_sgn (argmin) < 0 || tree_int_cst_sgn (argmax) > 0)
   && maybebase && base != 10)
Though, it is useless both for the above reason and for the reason that you
actually do something only:
I'm not convinced it's useless, but it does seem advisable to bring the test 
down to where it's actually used and to base it strictly on 
argmin/argmax.  Can you test a patch which does that?



Jeff

Re: [PATCH] suppress unhelpful -Wformat-truncation=2 INT_MAX warning (PR 79448)

2017-02-13 Thread Jeff Law


On 02/10/2017 10:55 AM, Martin Sebor wrote:

The recent Fedora mass rebuild revealed that the Wformat-truncation=2
checker is still a bit too aggressive and complains about potentially
unbounded strings causing subsequent directives to exceed the INT_MAX
limit.  (It's unclear how the build ended up enabling level 2 of
the warning.)

This is because for the purposes of the return value optimization
the pass must assume that such strings really are potentially unbounded
and result in as many as INT_MAX bytes (or more).  That doesn't mean
that it should warn on such cases.

The attached patch relaxes the checker to avoid the warning in this
case.  Since there's no easy way for a user to suppress the warning,
is this change okay for trunk at this stage?

Martin

gcc-79448.diff


PR middle-end/79448 - unhelpful -Wformat-truncation=2 warning

gcc/testsuite/ChangeLog:

PR middle-end/79448
* gcc.dg/tree-ssa/builtin-snprintf-warn-3.c: New test.
* gcc.dg/tree-ssa/pr79448-2.c: New test.
* gcc.dg/tree-ssa/pr79448.c: New test.

gcc/ChangeLog:

PR middle-end/79448
* gimple-ssa-sprintf.c (format_directive): Avoid issuing INT_MAX
  warning for strings of unknown length.

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index e6cc31d..bf76162 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -2561,11 +2561,15 @@ format_directive (const pass_sprintf_length::call_info 
,
   /* Raise the total unlikely maximum by the larger of the maximum
  and the unlikely maximum.  It doesn't matter if the unlikely
  maximum overflows.  */
+  unsigned HOST_WIDE_INT save = res->range.unlikely;
   if (fmtres.range.max < fmtres.range.unlikely)
 res->range.unlikely += fmtres.range.unlikely;
   else
 res->range.unlikely += fmtres.range.max;

+  if (res->range.unlikely < save)
+res->range.unlikely = HOST_WIDE_INT_M1U;
+
So this looks like you're doing an overflow check -- yet earlier your 
comment says "It doesn't matter if the unlikely maximum overflows". 
ISTM that comment needs updating -- if it doesn't matter, then why check 
for it and clamp the value?
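
For reference, the pattern in the hunk above is the usual wraparound test for unsigned addition followed by a clamp. A minimal standalone sketch, assuming a 64-bit unsigned type in place of unsigned HOST_WIDE_INT and UINT64_MAX in place of HOST_WIDE_INT_M1U (the function name add_clamped is made up):

```c
#include <stdint.h>

/* Add INCREMENT to TOTAL, clamping to the all-ones value instead of
   wrapping around.  */
uint64_t add_clamped(uint64_t total, uint64_t increment)
{
    uint64_t saved = total;
    total += increment;        /* unsigned arithmetic wraps modulo 2^64 */
    if (total < saved)         /* a smaller result means the sum wrapped */
        total = UINT64_MAX;    /* stand-in for HOST_WIDE_INT_M1U */
    return total;
}
```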




   res->range.min += fmtres.range.min;
   res->range.likely += fmtres.range.likely;

@@ -2616,7 +2620,12 @@ format_directive (const pass_sprintf_length::call_info 
,

   /* Has the likely and maximum directive output exceeded INT_MAX?  */
   bool likelyximax = *dir.beg && res->range.likely > target_int_max ();
-  bool maxximax = *dir.beg && res->range.max > target_int_max ();
+  /* Don't consider the maximum to be in excess when it's the result
+ of a string of unknown length (i.e., whose maximum has been set
+ to HOST_WIDE_INT_M1U.  */
+  bool maxximax = (*dir.beg
+  && res->range.max > target_int_max ()
+  && res->range.max < HOST_WIDE_INT_MAX);
So your comment mentions HOST_WIDE_INT_M1U as the key for a string of 
unknown length.  But that doesn't obviously correspond to what the code 
checks.


Can you please fix up the two comments.  With the comments fixed, this 
is OK.


jeff
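
The overflow check discussed above can be read as a saturating add: unsigned addition wraps, so a sum that is smaller than the starting value signals wraparound, and the counter is then pinned at the all-ones value. A minimal sketch in plain C follows. It uses `unsigned long long` as a stand-in for `unsigned HOST_WIDE_INT` and `ULLONG_MAX` for `HOST_WIDE_INT_M1U`; both stand-ins and the name `saturating_add` are assumptions made for illustration, not GCC code.

```c
#include <limits.h>

/* Stand-in for GCC's unsigned HOST_WIDE_INT (an assumption of this sketch).  */
typedef unsigned long long uhwi;

/* Add INCR to TOTAL.  On wraparound return the all-ones value instead,
   mirroring the save / compare / clamp sequence in the patch.  */
uhwi
saturating_add (uhwi total, uhwi incr)
{
  uhwi save = total;
  total += incr;          /* Unsigned arithmetic wraps modulo 2^N.  */
  if (total < save)       /* A smaller result means the sum wrapped.  */
    total = ULLONG_MAX;   /* Plays the role of HOST_WIDE_INT_M1U.  */
  return total;
}
```

With this reading, a later test of the form "greater than INT_MAX but less than the sentinel" can tell a genuinely large bounded result apart from one that saturated because a string of unknown length was involved.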

Re: [PATCH] Introduce BUGURL in gcc/po/exgettext

2017-02-13 Thread Jeff Law


On 02/09/2017 07:23 AM, Gerald Pfeifer wrote:

On Thu, 9 Feb 2017, Gerald Pfeifer wrote:

As Jakub also pointed out, there are various places where our
BUGURL is defined (not just used).


So, this one I found particularly interesting.

The same URL is used four times in the same file, which begs for
a little abstraction. ;-)  I did that, and adjusted the link (once).

Okay with a proper ChangeLog entry?

Of course.
jeff

Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR

2017-02-13 Thread Jeff Law


On 02/07/2017 01:39 AM, Richard Biener wrote:

On Mon, Feb 6, 2017 at 10:57 PM, Jeff Law  wrote:

On 02/06/2017 08:33 AM, Richard Biener wrote:


ah, indeed vr0type is VR_ANTI_RANGE and yes we have the case of a
range with an anti-range "inside".  This also covers [-1,1] v
~[0,0] where you choose the much larger anti-range now.  So at
least we want to have some idea about the sizes of the ranges
(ideally we'd choose the smaller though for most further
propagations anti-ranges often degenerate to varying...)


vr0 as an anti-singleton range like ~[0,0] is the only one likely
of any interest right now and that's always going to have a range
that is all but one value :-)

vr1 is the tricky case.  We could do vr1.max - vr1.min and if that
overflows or is some "large" value (say > 65536 just to throw out a
value), then we conclude creating the singleton anti-range like
~[0,0] is more useful.


Yes, but it's hard to tell.  As said, anti-ranges quickly degrade in
further propagation and I fear that without a better range
representation it's hard to do better in all cases here.  The fact is
we can't represent the result of the intersection and thus we have to
conservatively choose an approximation.  Sometimes we have the other
range on an SSA name and thus can use equivalences (when coming from
assert processing), but most of the time not and thus we can't use
equivalences (which use SSA name versions rather than an index into
a ranges array - one possible improvement to the range
representation).  If ~[0,0] is useful information, querying something for
non-null should also look at equivalences, btw.
I spoke with Andrew a bit today, he's consistently seeing cases where 
the union of 3 ranges is necessary to resolve the kinds of queries we're 
interested in.  He's made a design decision not to use anti-ranges in 
his work, so y'all are in sync on that long term.


He and Aldy have some bits to change the underlying range representation 
that might make sense to hash through right after stage1 reopens.


Jeff
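
To make the representation problem concrete: intersecting [-1,1] with ~[0,0] yields the two points -1 and 1, which neither a single range nor a single anti-range can express exactly. The sketch below shows one way a small union of ordinary ranges could hold that result. The type and function names (`multi_range`, `all_but`, `intersect_with`, `mr_contains`) and the cap of three subranges are hypothetical; this is not the design Andrew and Aldy are working on, only an illustration of the idea.

```c
#include <limits.h>
#include <stdbool.h>

#define MAX_SUBRANGES 3   /* Arbitrary cap, echoing "union of 3 ranges".  */

struct subrange { int lo, hi; };        /* Closed interval [lo, hi].  */

struct multi_range
{
  int n;                                /* Number of subranges in use.  */
  struct subrange r[MAX_SUBRANGES];     /* Sorted and disjoint.  */
};

/* Build the set "every int except X", i.e. what VRP writes as ~[X,X].  */
struct multi_range
all_but (int x)
{
  struct multi_range m = { 0, { { 0, 0 } } };
  if (x > INT_MIN)
    m.r[m.n++] = (struct subrange) { INT_MIN, x - 1 };
  if (x < INT_MAX)
    m.r[m.n++] = (struct subrange) { x + 1, INT_MAX };
  return m;
}

/* Intersect M with the single closed range [LO, HI].  The result never
   has more pieces than M, so it always fits.  */
struct multi_range
intersect_with (struct multi_range m, int lo, int hi)
{
  struct multi_range out = { 0, { { 0, 0 } } };
  for (int i = 0; i < m.n; i++)
    {
      int l = m.r[i].lo > lo ? m.r[i].lo : lo;
      int h = m.r[i].hi < hi ? m.r[i].hi : hi;
      if (l <= h)
        out.r[out.n++] = (struct subrange) { l, h };
    }
  return out;
}

/* Membership test over all subranges.  */
bool
mr_contains (struct multi_range m, int v)
{
  for (int i = 0; i < m.n; i++)
    if (m.r[i].lo <= v && v <= m.r[i].hi)
      return true;
  return false;
}
```

Here `intersect_with (all_but (0), -1, 1)` keeps both [-1,-1] and [1,1], so no conservative choice between the range and the anti-range is needed.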

Re: [PATCH][RFC] Fix PR79432, SSA from the gimplifier and abnormal edges

2017-02-13 Thread Jeff Law


On 02/10/2017 05:24 AM, Richard Biener wrote:


It turns out the SSA var defs/uses generated by the gimplifier can break
apart in a way no longer satisfying the dominance requirement of SSA
uses vs. their defs by means of CFG construction adding abnormal edges
for stuff like setjmp (but also non-local gotos I guess).

This would be quite costly to overcome in gimplification - one needs
to check whether a (part?) of an expression to be gimplified may
produce such edges and disable SSA name generation for them.  With
the recursive nature of the gimplifier plus the general complexity
of GENERIC I can't see how to do this.
ISTM there is no a priori way to know if any object is live across a call 
which would create these abnormal edges.  How in the world did this work 
in the past?


Sigh.  Because the problem starts at gimplification, we can't even do 
something like pre-scan the block for problem calls -- we don't have a CFG.


Can this only happen during gimplification of an expression where a 
sub-expression has a problematic call?  How bad do you think things 
would blow up if we scanned for a call and avoided using SSA_NAMEs when 
there's a problematical call within the expression?  Or is there no way 
structurally to do that within the gimplifier?


Are setjmp/longjmp the only problems here, or does EH tickle these 
issues as well (if so, then ISTM you'd need a different test).




Thus the following patch "recovers" from the extra abnormal edges
by effectively treating SSA vars pre into-SSA as "non-SSA" and thus
doing PHI insertion for them when rewriting the function into SSA.
Implementation-wise the easiest thing was to re-write the affected
SSA vars out-of-SSA (replace them by a decl).
If we can't do the right thing from the start, then "recovery" seems to 
be the only option.





The out-of-SSA rewriting is placed in insert_phi_nodes because that's
the first point in time we have immediate uses present (otherwise
it could be done at any point after CFG construction).
The earlier the better.  So as soon as we have a CFG and SSA we have to 
fix up.  So location seems right.




I'm not 100% happy with this but after trying for a day I can't come
up with a better solution - it has the chance of papering over
bogus gimplifications but OTOH that we only do this for functions
calling setjmp or having non-local labels would make those trigger
SSA verification in the other cases.

Bootstrap and regtest running on x86_64-unknown-linux-gnu.

Any comments / ideas?  Anyone with good arguments that this is
even conceptually the correct thing to do?

I'm bootstrapping this with the calls_setjmp/has_nonlocal_label
short-cutted with a 1 || as well as without (to increase testing
coverage).

Thanks,
Richard.

2017-02-10  Richard Biener  

PR middle-end/79432
* tree-into-ssa.c (insert_phi_nodes): When the function can
have abnormal edges rewrite SSA names with broken use-def
dominance out of SSA and register them for PHI insertion.

* gcc.dg/torture/pr79432.c: New testcase.
It's gross.  But unless we have a reasonable way to do this during 
gimplification, I don't see anything better.


jeff
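
For readers less familiar with the problem: the point right after a `setjmp` call can be entered a second time, from any later call that ends in `longjmp`, and the abnormal edges added during CFG construction model exactly that. A hypothetical C function of this general shape is sketched below. It is not the pr79432.c testcase (which is not reproduced in this thread), and the names `env`, `may_longjmp` and `f` are made up for illustration.

```c
#include <setjmp.h>

static jmp_buf env;

/* Stand-in for any call that may transfer control back to setjmp.  */
void
may_longjmp (int v)
{
  if (v > 0)
    longjmp (env, v);
}

int
f (int x)
{
  /* volatile: ISO C only guarantees the value of a local that was
     modified between setjmp and longjmp if it is volatile.  */
  volatile int count = 0;

  /* setjmp returns 0 on the direct call and nonzero when control
     re-enters here through longjmp.  */
  if (setjmp (env) != 0)
    return count;

  count = 10;
  may_longjmp (x);   /* For x > 0 this "returns" through setjmp above.  */
  return -1;
}
```

Calling `f (1)` takes the second path and sees the store to `count` that happened after the first return from `setjmp`; calling `f (0)` never jumps back and returns -1.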

Re: [PATCH] Workaround extended precision decimal computations in float-cast-overflow-10.c (PR sanitizer/79341)

2017-02-13 Thread Jeff Law


On 02/10/2017 12:58 PM, Jakub Jelinek wrote:

Hi!

Apparently on s390x with -march=z10 and above at -O2, some of the _Decimal32
computations are performed in _Decimal64 precision.  As the test intent
is to test the diagnostics on cast from floating point types to integral
types, the values we want to cast matter most (we want values that are
smallest or largest values already not representable in the integral type),
and already use volatile for the tem variable, so I think it is fine to
extend it.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-10  Jakub Jelinek  

PR sanitizer/79341
* c-c++-common/ubsan/float-cast-overflow-8.c (TEST): Make min and max
variables volatile.

OK.
jeff

C++ PATCH for c++/79461 (ICE with lambda and constexpr ctor)

2017-02-13 Thread Jason Merrill

Here build_data_member_initialization got confused by the
initialization of a field of local variable f and thought it was
initializing a field of S; when we then went looking for that field in
S, we didn't find it, and crashed.  But if the target of an
initialization is a member of a VAR_DECL, it can't be part of *this,
so we should ignore it here.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 227008348567729cd124bea44227771681361c40
Author: Jason Merrill 
Date:   Mon Feb 13 15:09:37 2017 -0500

PR c++/79461 - ICE with lambda in constexpr constructor

* constexpr.c (build_data_member_initialization): Ignore
initialization of a local variable.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index bfdde9e..004bb45 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -379,6 +379,9 @@ build_data_member_initialization (tree t, 
vec **vec)
   if (TREE_CODE (member) == COMPONENT_REF)
 {
   tree aggr = TREE_OPERAND (member, 0);
+  if (TREE_CODE (aggr) == VAR_DECL)
+   /* Initializing a local variable, don't add anything.  */
+   return true;
   if (TREE_CODE (aggr) != COMPONENT_REF)
/* Normal member initialization.  */
member = TREE_OPERAND (member, 1);
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda15.C 
b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda15.C
new file mode 100644
index 000..7e05481
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-lambda15.C
@@ -0,0 +1,10 @@
+// PR c++/79461
+// { dg-options -std=c++1z }
+
+struct S {
+  constexpr S(int i) {
+auto f = [i]{};
+  }
+};
+int main() {}
+

[PATCH] rs6000: float128 on BE and 32-bit

2017-02-13 Thread Segher Boessenkool

This fixes float128 on BE and on 32-bit.

The configure tests need to use -mabi=altivec for 32-bit, since it is
not the default there.  That also enables the "vector" keyword, used by
the tests.  To do this it temporarily adds a few flags to the CFLAGS
variable.

It also fixes a syntax error in the libgcc_cv_powerpc_float128_hw test
(the function name was missing in the function declaration).

Regenerating config.in (via autoreconf) removed the duplicate definition
of HAVE_SOLARIS_CRTS.

Finally, this adds a "-mfloat128-hardware requires -m64" test to
rs6000.c: all the current patterns need 64-bit registers.  Maybe we'll
want to add float128 hardware support to 32-bit some day, but certainly
not today.

Tested on powerpc64-linux {-m32,-m64} so far.  I see I also need to
update some code comments, ugh.


Segher


2017-02-13  Segher Boessenkool  
* config/rs6000/rs6000.c (rs6000_option_override_internal): Disallow
-mfloat128-hardware without -m64.

libgcc/
* configure.ac (test for libgcc_cv_powerpc_float128): Temporarily
modify CFLAGS.  Add -mabi=altivec.
(test for libgcc_cv_powerpc_float128_hw): Add -mpower9-vector and
-mfloat128-hardware to the CFLAGS.  Fix syntax error in the C snippet.
* configure: Regenerate.
* config.in: Regenerate.

---
 gcc/config/rs6000/rs6000.c |  8 
 libgcc/config.in   |  3 ---
 libgcc/configure   | 12 +++-
 libgcc/configure.ac| 12 +++-
 4 files changed, 22 insertions(+), 13 deletions(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 11f11f0..8ca42fd 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -4686,6 +4686,14 @@ rs6000_option_override_internal (bool global_init_p)
   rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
 }
 
+  if (TARGET_FLOAT128_HW && !TARGET_64BIT)
+{
+  if ((rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW) != 0)
+   error ("-mfloat128-hardware requires -m64");
+
+  rs6000_isa_flags &= ~OPTION_MASK_FLOAT128_HW;
+}
+
   if (TARGET_FLOAT128_HW && !TARGET_FLOAT128_KEYWORD
   && (rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_HW) != 0
   && (rs6000_isa_flags_explicit & OPTION_MASK_FLOAT128_KEYWORD) == 0)
diff --git a/libgcc/config.in b/libgcc/config.in
index 4d33411..25aa0d9 100644
--- a/libgcc/config.in
+++ b/libgcc/config.in
@@ -21,9 +21,6 @@
 /* Define if the system-provided CRTs are present on Solaris. */
 #undef HAVE_SOLARIS_CRTS
 
-/* Define if the system-provided CRTs are present on Solaris. */
-#undef HAVE_SOLARIS_CRTS
-
 /* Define to 1 if you have the <stdint.h> header file. */
 #undef HAVE_STDINT_H
 
diff --git a/libgcc/configure b/libgcc/configure
index 5c900cc..7ef84bc 100644
--- a/libgcc/configure
+++ b/libgcc/configure
@@ -4779,6 +4779,8 @@ case ${host} in
 # software libraries, and whether the assembler can handle xsaddqp
 # for hardware support.
 powerpc*-*-linux*)
+  saved_CFLAGS="$CFLAGS"
+  CFLAGS="$CFLAGS -mabi=altivec -mvsx"
   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for PowerPC ISA 2.06 to 
build __float128 libraries" >&5
 $as_echo_n "checking for PowerPC ISA 2.06 to build __float128 libraries... " 
>&6; }
 if test "${libgcc_cv_powerpc_float128+set}" = set; then :
@@ -4786,8 +4788,7 @@ if test "${libgcc_cv_powerpc_float128+set}" = set; then :
 else
   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
-#pragma GCC target ("vsx")
- vector double dadd (vector double a, vector double b) { return a + b; }
+vector double dadd (vector double a, vector double b) { return a + b; }
 _ACEOF
 if ac_fn_c_try_compile "$LINENO"; then :
   libgcc_cv_powerpc_float128=yes
@@ -4799,6 +4800,7 @@ fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: $libgcc_cv_powerpc_float128" 
>&5
 $as_echo "$libgcc_cv_powerpc_float128" >&6; }
 
+  CFLAGS="$CFLAGS -mpower9-vector -mfloat128 -mfloat128-hardware"
   { $as_echo "$as_me:${as_lineno-$LINENO}: checking for PowerPC ISA 3.0 to 
build hardware __float128 libraries" >&5
 $as_echo_n "checking for PowerPC ISA 3.0 to build hardware __float128 
libraries... " >&6; }
 if test "${libgcc_cv_powerpc_float128_hw+set}" = set; then :
@@ -4806,12 +4808,11 @@ if test "${libgcc_cv_powerpc_float128_hw+set}" = set; 
then :
 else
   cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
-#pragma GCC target ("vsx,power9-vector")
- #include 
+#include 
  #ifndef AT_PLATFORM
  #error "AT_PLATFORM is not defined"
  #endif
- vector unsigned char (vector unsigned char a, vector unsigned char b)
+ vector unsigned char add (vector unsigned char a, vector unsigned char b)
  {
vector unsigned char ret;
__asm__ ("xsaddqp %0,%1,%2" : "=v" (ret) : "v" (a), "v" (b));
@@ -4830,6 +4831,7 @@ rm -f core conftest.err conftest.$ac_objext 
conftest.$ac_ext
 fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$libgcc_cv_powerpc_float128_hw" >&5

Re: [PATCH] Fix various cases of missing whitespace in string literals split across lines

2017-02-13 Thread Jeff Law


On 02/13/2017 12:42 PM, Jakub Jelinek wrote:

Hi!

While looking at PR79475 (already fixed), I wrote a small scriptlet
#!/bin/awk -f
/^[[:blank:]]*"[^[:blank:]]/ {
  if (last)
{
  print last
  print
}
}
{
  last = ""
}
/[^[:blank:]]"[[:blank:]]*\\?$/ {
  if ($0 ~ /\\[tnv]"[[:blank:]]*\\?$/)
last = ""
  else
last = $0
}
to look for possible issues like
"something"
"and something"
where there probably is supposed to be a space in between.  It shows
various false positives (especially in the spec handling stuff) and while
it handles
"something\n"
"something else"
it doesn't handle
"something"
"\nsomething else"
but in any case the false positive ratio was small enough that I could
easily look for actual bugs.  This patch fixes various of these
in the middle end as well as C and fortran FEs.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-13  Jakub Jelinek  

* cprop.c (cprop_jump): Add missing space in string literal.
* tree-ssa-structalias.c (rewrite_constraints): Likewise.
(get_constraint_for_component_ref): Likewise.
* df-core.c (df_worklist_dataflow_doublequeue): Likewise.
* tree-outof-ssa.c (insert_partition_copy_on_edge): Likewise.
* lra-constraints.c (process_alt_operands): Likewise.
* ipa-inline.c (inline_small_functions): Likewise.
* tree-ssa-sccvn.c (visit_reference_op_store): Likewise.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Likewise.
* trans-mem.c (diagnose_tm_1_op): Likewise.
* omp-grid.c (grid_find_single_omp_among_assignments): Likewise.
(grid_parallel_clauses_gridifiable): Likewise.
c/
* c-parser.c (c_parser_oacc_declare): Add missing space in
diagnostics.
fortran/
* trans-expr.c (gfc_conv_substring): Add missing space in diagnostics.

OK.
jeff
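
The class of bug the script looks for comes from how C joins adjacent string literals: they are concatenated with nothing in between, so a message split across two source lines silently loses the space at the break. A small sketch, with made-up function names and the message text borrowed from the example in the mail:

```c
#include <string.h>

/* The two literals are pasted together as "somethingand something".  */
const char *
broken_message (void)
{
  return "something"
         "and something";
}

/* A trailing space in the first literal restores the word break.  */
const char *
fixed_message (void)
{
  return "something "
         "and something";
}

/* The two versions differ in length by exactly the one missing space.  */
size_t
missing_chars (void)
{
  return strlen (fixed_message ()) - strlen (broken_message ());
}
```

Because both forms compile cleanly, only reading the emitted diagnostic or scanning the source mechanically reveals the difference.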

Re: [PATCH] Minor formatting improvement in nvptx mkoffload

2017-02-13 Thread Jeff Law


On 02/13/2017 12:49 PM, Jakub Jelinek wrote:

Hi!

Another case of missing whitespace when string literals are split
across multiple lines, I believe we should emit a space here
for nicer formatting.

Ok for trunk?

2017-02-13  Jakub Jelinek  

* config/nvptx/mkoffload.c (process): Add space in between
, and %d.

OK
jeff

Re: [C++ PATCH] Fix 3 diagnostic whitespace issues

2017-02-13 Thread Jeff Law


On 02/13/2017 12:50 PM, Jakub Jelinek wrote:

Hi!

And lastly this patch fixes 3 diagnostics issues found by that script
in the C++ FE.  Bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2017-02-13  Jakub Jelinek  

* init.c (warn_placement_new_too_small): Add missing space in
diagnostics.
* parser.c (cp_parser_oacc_declare): Likewise.
* mangle.c (maybe_check_abi_tags): Likewise.

OK.
jeff

Re: [PATCH] avoid eliminating snprintf(d, n, ...) whose zero size comes from a range (PR 79496)

2017-02-13 Thread Jeff Law


On 02/13/2017 01:11 PM, Martin Sebor wrote:

When the size of the destination in a call to snprintf is in
a range, at level 1 -Wformat-truncation uses the upper bound
as the size while the stricter level 2 uses the lower bound.
However, when the lower bound is zero treating it the same as
a constant zero and optimizing the call into a constant isn't
correct because the actual argument need not be zero and the
output of the function is important.  The attached patch
avoids this unsafe transformation.

Is this okay for trunk?

Martin

gcc-79496.diff


PR middle-end/79496 - call to snprintf with zero size eliminated with 
-Wformat-truncation=2

gcc/ChangeLog:

PR middle-end/79496
* gimple-ssa-sprintf.c (pass_sprintf_length::handle_gimple_call): Avoid
clearing info.nowrite flag when snprintf size argument is a range.

gcc/testsuite/ChangeLog:

PR middle-end/79496
* gcc.dg/tree-ssa/builtin-snprintf-2.c: New test.

OK.
jeff
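
Why the output matters can be shown with a plain call to `snprintf`: with a size of zero nothing is stored and only the return value is meaningful, but with any nonzero size the destination is written. If the compiler only knows that the size lies in a range starting at zero, it cannot drop the store. The helper names below (`format_number`, `first_byte_after`) are hypothetical and exist only for this sketch.

```c
#include <stdio.h>

/* N is a run-time value; a compiler may only know that it lies in some
   range such as [0, 8].  */
int
format_number (char *dst, size_t n, int value)
{
  /* With n == 0 nothing is stored and only the return value matters.
     With n > 0 up to n - 1 characters plus a NUL are stored in DST,
     so the call cannot be replaced by its return value alone.  */
  return snprintf (dst, n, "%d", value);
}

/* Format 42 into a scratch buffer of 8 bytes using size N (N <= 8)
   and report the first byte of the buffer afterwards.  */
char
first_byte_after (size_t n)
{
  char buf[8] = "x";
  format_number (buf, n, 42);
  return buf[0];
}
```

With a size of 0 the buffer still starts with 'x', with a size of 1 it starts with the terminating NUL, and with a size of 8 it starts with '4'. In every case the return value is the length the full output would have had.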

Re: [PATCH doc] update -dM to mention excluded macros (PR 41540)

2017-02-13 Thread Martin Sebor


On 02/12/2017 01:27 AM, Gerald Pfeifer wrote:

On Tue, 7 Feb 2017, Martin Sebor wrote:

The attached documentation-only patch clarifies the description
of the -dM option to mention that __FILE__ (and other predefined
macros) do not appear on the list generated by the option.


+The predefined macros @code{__FILE__}, @code{__LINE__}, @code{__DATE__},
+and @code{__TIME__} are excluded from this list because their replacements
+may change from one line of output to the next.

Is this ("their replacements may change") true for __FILE__ as well?

In any case, this may be too nit-picking, and the patch looks fine
to me.


Thanks.  __FILE__ too can change between different lines of output:
by means of a #line directive.

I actually think it would be preferable if GCC printed the value of
__FILE__ and all the other macros, including the four mentioned here.
But since in response to the bug report Andrew implied that not
printing them is intentional I figured I'd just update the docs
and resolve this old bug (raised in 2009).

That said, I find the argument that these four are special because
their value can change from line to line a weak one given that the
definition of any predefined macro can change (simply by undefining
and redefining it as something else).

Is there a better argument for not printing __FILE__ et al?  Do we
want to document this or change GCC to print all macros?

Martin
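
The point about `#line` can be illustrated with a few lines of C. After the directive, both `__LINE__` and `__FILE__` expand according to what the directive says rather than the real position in the real file. The file name "renamed.c" and the function names are arbitrary choices for this sketch.

```c
#include <string.h>

#line 1000 "renamed.c"
int line_after (void) { return __LINE__; }   /* This line is now number 1000.  */

/* __FILE__ also follows the directive, whatever the real file name is.  */
int
file_is_renamed (void)
{
  return strcmp (__FILE__, "renamed.c") == 0;
}
```

This is the sense in which the replacement of `__FILE__` can differ from one line to the next, just as it does for `__LINE__`.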

[PATCH] rs6000: Fix sanitizer frame unwind on 32-bit ABIs

2017-02-13 Thread Segher Boessenkool

This fixes many sanitizer problems with -m32.

This patch should go in via LLVM; it is posted here just as an FYI for now.


Segher


---
 libsanitizer/sanitizer_common/sanitizer_stacktrace.cc | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc 
b/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
index cbb3af2..8bcc4e6 100644
--- a/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
+++ b/libsanitizer/sanitizer_common/sanitizer_stacktrace.cc
@@ -78,14 +78,21 @@ void BufferedStackTrace::FastUnwindStack(uptr pc, uptr bp, 
uptr stack_top,
  IsAligned((uptr)frame, sizeof(*frame)) &&
  size < max_depth) {
 #ifdef __powerpc__
-// PowerPC ABIs specify that the return address is saved at offset
-// 16 of the *caller's* stack frame.  Thus we must dereference the
-// back chain to find the caller frame before extracting it.
+// PowerPC ABIs specify that the return address is saved on the
+// *caller's* stack frame.  Thus we must dereference the back chain
+// to find the caller frame before extracting it.
 uhwptr *caller_frame = (uhwptr*)frame[0];
 if (!IsValidFrame((uptr)caller_frame, stack_top, bottom) ||
 !IsAligned((uptr)caller_frame, sizeof(uhwptr)))
   break;
+// For most ABIs the offset where the return address is saved is two
+// register sizes.  The exception is the SVR4 ABI, which uses an
+// offset of only one register size.
+#ifdef _CALL_SYSV
+uhwptr pc1 = caller_frame[1];
+#else
 uhwptr pc1 = caller_frame[2];
+#endif
 #elif defined(__s390__)
 uhwptr pc1 = frame[14];
 #else
-- 
1.9.3
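
The lookup the patch adjusts can be sketched outside the sanitizer runtime. Slot 0 of a frame holds the back chain (a pointer to the caller's frame), and the saved return address sits in the caller's frame at slot 1 for the 32-bit SVR4 ABI or slot 2 otherwise, as the new comment in the patch says. The sketch below simulates frames with small arrays; `saved_return_address` and `demo` are hypothetical names, `uhwptr` is only a stand-in typedef, and the validity and alignment checks of the real code are left out.

```c
#include <stdint.h>

typedef uintptr_t uhwptr;   /* Stand-in for the sanitizer's uhwptr type.  */

/* Return the saved return address for FRAME, where FRAME[0] is the back
   chain.  RA_SLOT is 1 for the 32-bit SVR4 ABI and 2 for the other
   PowerPC ABIs.  */
uhwptr
saved_return_address (const uhwptr *frame, int ra_slot)
{
  const uhwptr *caller_frame = (const uhwptr *) frame[0];
  return caller_frame[ra_slot];
}

/* Build a two-frame fake stack and read it back; for illustration only.  */
uhwptr
demo (int ra_slot)
{
  uhwptr caller[3] = { 0, 0x1111, 0x2222 };
  uhwptr callee[3] = { (uhwptr) caller, 0, 0 };
  return saved_return_address (callee, ra_slot);
}
```

Reading slot 2 on an ABI that stores the address in slot 1 yields an unrelated word from the caller's frame, which would send the unwinder to the wrong place.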

Re: [Fortran, Patch, CAF] Failed Images patch (TS 18508)

2017-02-13 Thread Alessandro Fanfarillo

Now with the patch attached.

2017-02-13 13:35 GMT-07:00 Alessandro Fanfarillo :
> Thanks Jerry. That test case is supposed only to be compiled (it never
> runs). Anyway, the attached patch has been modified according to your
> suggestion.
>
> Patch built and regtested on x86_64-pc-linux-gnu.
>
> 2017-02-12 10:24 GMT-07:00 Jerry DeLisle :
>> On 02/11/2017 03:02 PM, Alessandro Fanfarillo wrote:
>>>
>>> Dear all,
>>> please find in attachment a new patch following the discussion at
>>> https://gcc.gnu.org/ml/fortran/2017-01/msg00054.html.
>>>
>>> Suggestions on how to fix potential issues are more than welcome.
>>>
>>> Regards,
>>> Alessandro
>>>
>>
>> On the failed images test:
>>
>> program test_image_status
>> +  implicit none
>> +
>> +  write(*,*) image_status(1)
>> +
>>
>> Write to a string and test the results.
>>
>> I assume you have regression tested this again as stated in the earlier
>> discussion.
>>
>> I think this is OK to go in.
>>
>> Jerry
>>
>>
commit 06ed189ff99710d4d18fefa7a83e12192c5d10bf
Author: Alessandro Fanfarillo 
Date:   Mon Feb 13 12:54:22 2017 -0700

Resurrected patch and tests - REV1

diff --git a/gcc/fortran/check.c b/gcc/fortran/check.c
index c22bfa9..ed88a19 100644
--- a/gcc/fortran/check.c
+++ b/gcc/fortran/check.c
@@ -1136,6 +1136,116 @@ gfc_check_atomic_ref (gfc_expr *value, gfc_expr *atom, 
gfc_expr *stat)
   return gfc_check_atomic (atom, 1, value, 0, stat, 2);
 }
 
+bool
+gfc_check_image_status (gfc_expr *image, gfc_expr *team)
+{
+  if (!type_check (image, 1, BT_INTEGER))
+return false;
+
+  int i = gfc_validate_kind (BT_INTEGER, image->ts.kind, false);
+  int j = gfc_validate_kind (BT_INTEGER, gfc_default_integer_kind, false);
+
+  if (gfc_integer_kinds[i].range < gfc_integer_kinds[j].range)
+{
+  gfc_error ("IMAGE argument of the IMAGE_STATUS intrinsic function at %L "
+"shall have at least the range of the default integer",
+&image->where);
+  return false;
+}
+
+  j = gfc_validate_kind (BT_INTEGER, gfc_default_integer_kind*2, false);
+
+  if (gfc_integer_kinds[i].range > gfc_integer_kinds[j].range)
+{
+  gfc_error ("IMAGE argument of the IMAGE_STATUS intrinsic function at %L "
+"shall have at most the range of the double precision integer",
+&image->where);
+  return false;
+}
+
+  if (team)
+{
+  gfc_error ("TEAM argument of the IMAGE_STATUS intrinsic function at %L "
+"not yet supported",
+&team->where);
+  return false;
+}
+  return true;
+}
+
+bool
+gfc_check_failed_images (gfc_expr *team, gfc_expr *kind)
+{
+  if (team)
+{
+  gfc_error ("TEAM argument of the FAILED_IMAGES intrinsic function "
+"at %L not yet supported", &team->where);
+  return false;
+}
+
+  if (kind)
+{
+  int i = gfc_validate_kind (BT_INTEGER, kind->ts.kind, false);
+  int j = gfc_validate_kind (BT_INTEGER, gfc_default_integer_kind, false);
+
+  if (gfc_integer_kinds[i].range < gfc_integer_kinds[j].range)
+   {
+ gfc_error ("KIND argument of the FAILED_IMAGES intrinsic function "
+"at %L shall have at least the range "
+"of the default integer", &kind->where);
+ return false;
+   }
+
+  j = gfc_validate_kind (BT_INTEGER, gfc_default_integer_kind*2, false);
+
+  if (gfc_integer_kinds[i].range > gfc_integer_kinds[j].range)
+   {
+ gfc_error ("KIND argument of the FAILED_IMAGES "
+"intrinsic function at %L shall have at most the "
+"range of the double precision integer",
+&kind->where);
+ return false;
+   }
+}
+  return true;
+}
+
+bool
+gfc_check_stopped_images (gfc_expr *team, gfc_expr *kind)
+{
+  if (team)
+{
+  gfc_error ("TEAM argument of the STOPPED_IMAGES intrinsic function "
+"at %L not yet supported", &team->where);
+  return false;
+}
+
+  if (kind)
+{
+  int i = gfc_validate_kind (BT_INTEGER, kind->ts.kind, false);
+  int j = gfc_validate_kind (BT_INTEGER, gfc_default_integer_kind, false);
+
+  if (gfc_integer_kinds[i].range < gfc_integer_kinds[j].range)
+   {
+ gfc_error ("KIND argument of the STOPPED_IMAGES intrinsic function "
+"at %L shall have at least the range "
+"of the default integer", &kind->where);
+ return false;
+   }
+
+  j = gfc_validate_kind (BT_INTEGER, gfc_default_integer_kind*2, false);
+
+  if (gfc_integer_kinds[i].range > gfc_integer_kinds[j].range)
+   {
+ gfc_error ("KIND argument of the STOPPED_IMAGES "
+"intrinsic function at %L shall have at most the "
+"range of the double precision integer",
+&kind->where);
+ return false;
+   }
+}
+  return true;
+}
 
 bool
 gfc_check_atomic_cas

Re: [Fortran, Patch, CAF] Failed Images patch (TS 18508)

2017-02-13 Thread Alessandro Fanfarillo

Thanks Jerry. That test case is supposed only to be compiled (it never
runs). Anyway, the attached patch has been modified according to your
suggestion.

Patch built and regtested on x86_64-pc-linux-gnu.

2017-02-12 10:24 GMT-07:00 Jerry DeLisle :
> On 02/11/2017 03:02 PM, Alessandro Fanfarillo wrote:
>>
>> Dear all,
>> please find in attachment a new patch following the discussion at
>> https://gcc.gnu.org/ml/fortran/2017-01/msg00054.html.
>>
>> Suggestions on how to fix potential issues are more than welcome.
>>
>> Regards,
>> Alessandro
>>
>
> On the failed images test:
>
> program test_image_status
> +  implicit none
> +
> +  write(*,*) image_status(1)
> +
>
> Write to a string and test the results.
>
> I assume you have regression tested this again as stated in the earlier
> discussion.
>
> I think this is OK to go in.
>
> Jerry
>
>

[PATCH v3] aarch64: Add split-stack support

2017-02-13 Thread Adhemerval Zanella

Changes from previous version:

  - The split-stack area is now allocated before the thread pointer
    (instead of in tcbhead_t, which is positioned after it) as described
    in glibc patch [1].  This has the advantage of not requiring linker
    changes to take into consideration the new tcbhead_t size, and it
    also fixes the regression I was seeing with the previous version of
    this patch at runtime/pprof: __splitstack_setcontext updates some
    internal TLS variables, and with split-stack trying to update
    a larger tcbhead_t field it clobbered the TLS variable when
    using a default static linker (which expected a default tcbhead_t
    structure with a size of 16).

  - Function aarch64_split_stack_space_check is changed back to the previous
    implementation because a wrong comment in my previous version led
    to wrong assumptions in [2].  The function will generate a jump
    to the 'label' variable (which will call __morestack_allocate_stack_space),
    not to the arch-specific __morestack implementation.  I corrected the
    comment as well.

  - Change the stack adjustments to always use 2 instructions. [3]

I am no longer seeing any issues when using split-stack, nor any check-go
regressions, with this version.

[1] https://sourceware.org/ml/libc-alpha/2017-02/msg00247.html
[2] https://gcc.gnu.org/ml/gcc-patches/2017-01/msg00093.html
[3] https://gcc.gnu.org/ml/gcc-patches/2017-02/msg00055.html

--

This patch adds the split-stack support on aarch64 (PR #67877).  As for
other ports this patch should be used along with glibc and gold support.

The support is done similarly to other architectures: a __private_ss field is
added to the TCB in glibc, a target-specific __morestack implementation and
helper functions are added in libgcc, and compiler support is adjusted
(split-stack prologue, va_start for argument handling).  I also plan to
send the gold support to adjust stack allocation across split-stack
and default code calls.

The current approach is similar to the powerpc one: at most 2 GB of stack
allocation is supported so stack adjustments can be done with 2 instructions
(either just a movn plus nop or a movn followed by movk).  The morestack call
is non standard, with x10 holding the requested stack pointer, x11 the argument
pointer, and x12 the continuation address to return to.  Unwinding is handled
by a personality routine that knows how to find stack segments.

The split-stack prologue on function entry is as follows (this goes before the
usual function prologue):

function:
        mrs    x9, tpidr_el0
        mov    x10, -
        movk   0x0
        add    x10, sp, x10
        mov    x11, sp  # if function has stacked arguments
        adrp   x12, main_fn_entry
        add    x12, x12, :lo12:.L2
        cmp    x9, x10
        b.lt   
        b      __morestack
main_fn_entry:
        [function prologue]
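
The decision the prologue makes can be paraphrased in C. The sketch below assumes that x9 ends up holding the lowest usable address of the current stack segment (obtained through the thread pointer) and that x10 holds the stack pointer after subtracting the frame size; the function name `needs_morestack` and the use of unsigned arithmetic are simplifications of this sketch, not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the current stack segment cannot hold FRAME_SIZE more bytes.
   LIMIT is the lowest usable address of the segment; the stack grows
   down, so the new stack pointer is SP minus the frame size.  */
bool
needs_morestack (uintptr_t limit, uintptr_t sp, uintptr_t frame_size)
{
  uintptr_t new_sp = sp - frame_size;   /* Where sp would end up.  */

  /* Mirrors "cmp x9, x10; b.lt main_fn_entry": only when the limit is
     strictly below the new stack pointer does the function proceed
     without calling __morestack.  */
  return !(limit < new_sp);
}
```

When the result is false, control falls straight into the regular function prologue; otherwise `__morestack` is called with the requested stack pointer, argument pointer and continuation address in x10, x11 and x12 as described above.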

Notes:

1. Even if a function does not allocate a stack frame, a split-stack prologue
   is created.  It is to avoid issues with tail call for external symbols
   which might require linker adjustment (libgo/runtime/go-varargs.c).

2. Basic-block reordering (enabled with -O2) will move split-stack TCB ldr
   to after the required stack calculation.

3. Similar to powerpc, when the linker detects a call from split-stack to
   non-split-stack code, it adds 16k (or more) to the value found in "allocate"
   instructions (so non-split-stack code gets a larger stack).  The amount is
   tunable by a linker option.  The edit means aarch64 does not need to
   implement __morestack_non_split, necessary on x86 because insufficient
   space is available there to edit the stack comparison code.  This feature
   is only implemented in the GNU gold linker.

4. AArch64 does not handle >4G stack initially and although it is possible
   to implement it, limiting to 4G allows materializing the allocation with
   only 2 instructions (mov + movk), thus simplifying the linker
   adjustments required.  Supporting multiple threads each requiring more
   than 4G of stack is probably not that important, and likely to OOM at
   run time.

5. The TCB support on GLIBC is meant to be included in version 2.25.

6. The continuation address materialized on x12 is done using 'adrp'
   plus add and a static relocation.  Current code uses the
   aarch64_expand_mov_immediate function and since a better alternative
   would be 'adp', it could be a future optimization (not implemented
   in this patch).

libgcc/ChangeLog:

* libgcc/config.host: Use t-stack and t-stack-aarch64 for
aarch64*-*-linux.
* libgcc/config/aarch64/morestack-c.c: New file.
* libgcc/config/aarch64/morestack.S: Likewise.
* libgcc/config/aarch64/t-stack-aarch64: Likewise.
* libgcc/generic-morestack.c (__splitstack_find): Add aarch64-specific
code.

gcc/ChangeLog:

PR 67877
* common/config/aarch64/aarch64-common.c
(aarch64_supports_split_stack): New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.

[PATCH] avoid eliminating snprintf(d, n, ...) whose zero size comes from a range (PR 79496)

2017-02-13 Thread Martin Sebor


When the size of the destination in a call to snprintf is in
a range, at level 1 -Wformat-truncation uses the upper bound
as the size while the stricter level 2 uses the lower bound.
However, when the lower bound is zero treating it the same as
a constant zero and optimizing the call into a constant isn't
correct because the actual argument need not be zero and the
output of the function is important.  The attached patch
avoids this unsafe transformation.

Is this okay for trunk?

Martin
PR middle-end/79496 - call to snprintf with zero size eliminated with -Wformat-truncation=2

gcc/ChangeLog:

	PR middle-end/79496
	* gimple-ssa-sprintf.c (pass_sprintf_length::handle_gimple_call): Avoid
	clearing info.nowrite flag when snprintf size argument is a range.

gcc/testsuite/ChangeLog:

	PR middle-end/79496
	* gcc.dg/tree-ssa/builtin-snprintf-2.c: New test.

diff --git a/gcc/gimple-ssa-sprintf.c b/gcc/gimple-ssa-sprintf.c
index bf76162..1079a41 100644
--- a/gcc/gimple-ssa-sprintf.c
+++ b/gcc/gimple-ssa-sprintf.c
@@ -3452,6 +3452,10 @@ pass_sprintf_length::handle_gimple_call (gimple_stmt_iterator *gsi)
 
   info.format = gimple_call_arg (info.callstmt, idx_format);
 
+  /* True when the destination size is constant as opposed to the lower
+ or upper bound of a range.  */
+  bool dstsize_cst_p = true;
+
   if (idx_dstsize == HOST_WIDE_INT_M1U)
 {
   /* For non-bounded functions like sprintf, determine the size
@@ -3492,8 +3496,8 @@ pass_sprintf_length::handle_gimple_call (gimple_stmt_iterator *gsi)
   else if (TREE_CODE (size) == SSA_NAME)
 	{
 	  /* Try to determine the range of values of the argument
-	 and use the greater of the two at -Wformat-level 1 and
-	 the smaller of them at level 2.  */
+	 and use the greater of the two at level 1 and the smaller
+	 of them at level 2.  */
 	  wide_int min, max;
 	  enum value_range_type range_type
 	= get_range_info (size, &min, &max);
@@ -3504,6 +3508,11 @@ pass_sprintf_length::handle_gimple_call (gimple_stmt_iterator *gsi)
 		   ? wi::fits_uhwi_p (max) ? max.to_uhwi () : max.to_shwi ()
 		   : wi::fits_uhwi_p (min) ? min.to_uhwi () : min.to_shwi ());
 	}
+
+	  /* The destination size is not constant.  If the function is
+	 bounded (e.g., snprintf) a lower bound of zero doesn't
+	 necessarily imply it can be eliminated.  */
+	  dstsize_cst_p = false;
 	}
 }
 
@@ -3520,7 +3529,7 @@ pass_sprintf_length::handle_gimple_call (gimple_stmt_iterator *gsi)
 	 without actually producing any.  Pretend the size is
 	 unlimited in this case.  */
   info.objsize = HOST_WIDE_INT_MAX;
-  info.nowrite = true;
+  info.nowrite = dstsize_cst_p;
 }
   else
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/builtin-snprintf-2.c b/gcc/testsuite/gcc.dg/tree-ssa/builtin-snprintf-2.c
new file mode 100644
index 000..a192aee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/builtin-snprintf-2.c
@@ -0,0 +1,24 @@
+/* PR middle-end/79496 - call to snprintf with non-zero size eliminated
+   with -Wformat-truncation=2
+   { dg-do compile }
+   { dg-options "-O2 -Wall -Wformat-truncation=2 -fprintf-return-value -fdump-tree-optimized" } */
+
+char d[2];
+
+int test_cst (unsigned n)
+{
+  if (1 < n)
+n = 0;
+
+  return __builtin_snprintf (d, n, "%d", 1);
+}
+
+int test_var (char *d, unsigned n)
+{
+  if (2 < n)
+n = 0;
+
+  return __builtin_snprintf (d, n, "%i", 1);
+}
+
+/* { dg-final { scan-tree-dump-times "snprintf" 2 "optimized"} } */

[C++ PATCH] Fix 3 diagnostic whitespace issues

2017-02-13 Thread Jakub Jelinek

Hi!

And lastly this patch fixes 3 diagnostics issues found by that script
in the C++ FE.  Bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?

2017-02-13  Jakub Jelinek  

* init.c (warn_placement_new_too_small): Add missing space in
diagnostics.
* parser.c (cp_parser_oacc_declare): Likewise.
* mangle.c (maybe_check_abi_tags): Likewise.

--- gcc/cp/init.c.jj2017-01-31 09:26:02.0 +0100
+++ gcc/cp/init.c   2017-02-13 11:39:57.145924623 +0100
@@ -2606,7 +2606,7 @@ warn_placement_new_too_small (tree type,
exact_size ?
"placement new constructing an object of type %qT "
"and size %qwu in a region of type %qT and size %qwi"
-   : "placement new constructing an object of type %qT"
+   : "placement new constructing an object of type %qT "
"and size %qwu in a region of type %qT and size "
"at most %qwu",
type, bytes_need, TREE_TYPE (oper),
--- gcc/cp/parser.c.jj  2017-02-10 21:35:26.0 +0100
+++ gcc/cp/parser.c 2017-02-13 11:41:21.162795860 +0100
@@ -36177,7 +36177,7 @@ cp_parser_oacc_declare (cp_parser *parse
   || !DECL_EXTERNAL (decl)))
{
  error_at (loc,
-   "%qD must be a global variable in"
+   "%qD must be a global variable in "
"%<#pragma acc declare link%>",
decl);
  error = true;
--- gcc/cp/mangle.c.jj  2017-01-19 16:58:17.0 +0100
+++ gcc/cp/mangle.c 2017-02-13 11:40:41.354330685 +0100
@@ -4274,8 +4274,9 @@ maybe_check_abi_tags (tree t, tree for_d
for_decl, flag_abi_version, warn_abi_version);
   else
warning_at (DECL_SOURCE_LOCATION (t), OPT_Wabi,
-   "the mangled name of the initialization guard variable for"
-   "%qD changes between -fabi-version=%d and -fabi-version=%d",
+   "the mangled name of the initialization guard variable "
+   "for %qD changes between -fabi-version=%d and "
+   "-fabi-version=%d",
t, flag_abi_version, warn_abi_version);
 }
 }

Jakub

Re: [PATCH] Fix i386 REG_CLASS_NAMES

2017-02-13 Thread Uros Bizjak

On Mon, Feb 13, 2017 at 8:45 PM, Jakub Jelinek  wrote:
> Hi!
>
> Unlike the previous patch that is mainly about diagnostics or dump
> text, this one is something that could potentially crash the compiler,
> the missing , in between means MOD4_SSE_REGS class has
> MOD4_SSE_REGSALL_REGS name and ALL_REGS class will have NULL or whatever.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2017-02-13  Jakub Jelinek  
>
> * config/i386/i386.h (REG_CLASS_NAMES): Add , in between
> "MOD4_SSE_REGS" and "ALL_REGS".

OK.

Thanks,
Uros.

> --- gcc/config/i386/i386.h.jj   2017-02-07 16:40:46.0 +0100
> +++ gcc/config/i386/i386.h  2017-02-13 11:31:46.020532661 +0100
> @@ -1427,7 +1427,7 @@ enum reg_class
> "FLOAT_INT_SSE_REGS",   \
> "MASK_EVEX_REGS",   \
> "MASK_REGS",\
> -   "MOD4_SSE_REGS" \
> +   "MOD4_SSE_REGS",\
> "ALL_REGS" }
>
>  /* Define which registers fit in which classes.  This is an initializer
>
> Jakub

Re: [PATCH] Fix spellcheck test data

2017-02-13 Thread David Malcolm

On Mon, 2017-02-13 at 20:47 +0100, Jakub Jelinek wrote:
> Hi!
> 
> Not really sure about this, but I'd expect if you meant to write
>   "foofood",
> you'd have written it that way and not
>   "foo"
>   "food",
> Passes -fself-test either way, bootstrapped/regtested on x86_64-linux
> and
> i686-linux, ok for trunk?

Oops, yes, this was a mistake.

> 2017-02-13  Jakub Jelinek  
> 
>   * spellcheck.c (test_data): Add , in between "foo" and "food".
> 
> --- gcc/spellcheck.c.jj   2017-01-01 12:45:35.0 +0100
> +++ gcc/spellcheck.c  2017-02-13 11:34:22.352425744 +0100
> @@ -221,7 +221,7 @@ test_find_closest_string ()
>  
>  static const char * const test_data[] = {
>"",
> -  "foo"
> +  "foo",
>"food",
>"boo",
>   
>  "1234567890123456789012345678901234567890123456789012345678901234567
> 890"
> 
>   Jakub

[PATCH] Minor formatting improvement in nvptx mkoffload

2017-02-13 Thread Jakub Jelinek

Hi!

Another case of missing whitespace when string literals are split
across multiple lines, I believe we should emit a space here
for nicer formatting.

Ok for trunk?

2017-02-13  Jakub Jelinek  

* config/nvptx/mkoffload.c (process): Add space in between
, and %d.

--- gcc/config/nvptx/mkoffload.c.jj 2017-01-01 12:45:43.0 +0100
+++ gcc/config/nvptx/mkoffload.c2017-02-13 11:37:21.065021563 +0100
@@ -338,7 +338,7 @@ process (FILE *in, FILE *out)
   fprintf (out, "static __attribute__((constructor)) void init (void)\n"
   "{\n"
   "  GOMP_offload_register_ver (%#x, __OFFLOAD_TABLE__,"
-  "%d/*NVIDIA_PTX*/, _data);\n"
+  " %d/*NVIDIA_PTX*/, _data);\n"
   "};\n",
   GOMP_VERSION_PACK (GOMP_VERSION, GOMP_VERSION_NVIDIA_PTX),
   GOMP_DEVICE_NVIDIA_PTX);
@@ -346,7 +346,7 @@ process (FILE *in, FILE *out)
   fprintf (out, "static __attribute__((destructor)) void fini (void)\n"
   "{\n"
   "  GOMP_offload_unregister_ver (%#x, __OFFLOAD_TABLE__,"
-  "%d/*NVIDIA_PTX*/, _data);\n"
+  " %d/*NVIDIA_PTX*/, _data);\n"
   "};\n",
   GOMP_VERSION_PACK (GOMP_VERSION, GOMP_VERSION_NVIDIA_PTX),
   GOMP_DEVICE_NVIDIA_PTX);

Jakub

[PATCH] Fix spellcheck test data

2017-02-13 Thread Jakub Jelinek

Hi!

Not really sure about this, but I'd expect if you meant to write
  "foofood",
you'd have written it that way and not
  "foo"
  "food",
Passes -fself-test either way, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2017-02-13  Jakub Jelinek  

* spellcheck.c (test_data): Add , in between "foo" and "food".

--- gcc/spellcheck.c.jj 2017-01-01 12:45:35.0 +0100
+++ gcc/spellcheck.c2017-02-13 11:34:22.352425744 +0100
@@ -221,7 +221,7 @@ test_find_closest_string ()
 
 static const char * const test_data[] = {
   "",
-  "foo"
+  "foo",
   "food",
   "boo",
   "1234567890123456789012345678901234567890123456789012345678901234567890"

Jakub

[PATCH] Fix i386 REG_CLASS_NAMES

2017-02-13 Thread Jakub Jelinek

Hi!

Unlike the previous patch that is mainly about diagnostics or dump
text, this one is something that could potentially crash the compiler,
the missing , in between means MOD4_SSE_REGS class has
MOD4_SSE_REGSALL_REGS name and ALL_REGS class will have NULL or whatever.
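
A reduced sketch of the effect, assuming a table declared with an explicit
size.  The array names and N_CLASSES are hypothetical stand-ins, not the GCC
declarations.  The two adjacent literals are concatenated into one element
and the last slot is left zero-initialized.

```c
#include <stddef.h>

/* Hypothetical stand-in for the number of register classes.  */
enum { N_CLASSES = 3 };

/* Missing comma after "MOD4_SSE_REGS": the two literals are pasted
   together and only two elements are initialized.  */
static const char *const buggy_names[N_CLASSES] = {
  "MASK_REGS",
  "MOD4_SSE_REGS"
  "ALL_REGS"
};

/* With the comma every class gets its own name.  */
static const char *const fixed_names[N_CLASSES] = {
  "MASK_REGS",
  "MOD4_SSE_REGS",
  "ALL_REGS"
};
```

With an array of unspecified size the table would simply be one element
short, so indexing the last class would read past its end.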

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-13  Jakub Jelinek  

* config/i386/i386.h (REG_CLASS_NAMES): Add , in between
"MOD4_SSE_REGS" and "ALL_REGS".

--- gcc/config/i386/i386.h.jj   2017-02-07 16:40:46.0 +0100
+++ gcc/config/i386/i386.h  2017-02-13 11:31:46.020532661 +0100
@@ -1427,7 +1427,7 @@ enum reg_class
"FLOAT_INT_SSE_REGS",   \
"MASK_EVEX_REGS",   \
"MASK_REGS",\
-   "MOD4_SSE_REGS" \
+   "MOD4_SSE_REGS",\
"ALL_REGS" }
 
 /* Define which registers fit in which classes.  This is an initializer

Jakub

[PATCH] Fix various cases of missing whitespace in string literals split across lines

2017-02-13 Thread Jakub Jelinek

Hi!

While looking at PR79475 (already fixed), I wrote a small scriptlet
#!/bin/awk -f
/^[[:blank:]]*"[^[:blank:]]/ {
  if (last)
    {
      print last
      print
    }
}
{
  last = ""
}
/[^[:blank:]]"[[:blank:]]*\\?$/ {
  if ($0 ~ /\\[tnv]"[[:blank:]]*\\?$/)
    last = ""
  else
    last = $0
}
}
to look for possible issues like
"something"
"and something"
where there probably is supposed to be a space in between.  It shows
various false positives (especially in the spec handling stuff) and while
it handles
"something\n"
"something else"
it doesn't handle
"something"
"\nsomething else"
but in any case the false positive ratio was small enough that I could
easily look for actual bugs.  This patch fixes various of these
in the middle end as well as C and fortran FEs.
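
For reference, a minimal C sketch of the problem the script looks for.  The
strings are borrowed from one of the fixed dump messages and the variable
names are made up.  Adjacent string literals are concatenated with nothing in
between, so the space has to be inside one of them.

```c
/* Without a trailing space the two pieces run together as
   "...variable,ignoring...".  */
static const char without_space[] = "is a non-pointer variable,"
				    "ignoring constraint:";

/* With the space inside the first literal the text reads as intended.  */
static const char with_space[] = "is a non-pointer variable, "
				 "ignoring constraint:";
```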

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-13  Jakub Jelinek  

* cprop.c (cprop_jump): Add missing space in string literal.
* tree-ssa-structalias.c (rewrite_constraints): Likewise.
(get_constraint_for_component_ref): Likewise.
* df-core.c (df_worklist_dataflow_doublequeue): Likewise.
* tree-outof-ssa.c (insert_partition_copy_on_edge): Likewise.
* lra-constraints.c (process_alt_operands): Likewise.
* ipa-inline.c (inline_small_functions): Likewise.
* tree-ssa-sccvn.c (visit_reference_op_store): Likewise.
* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Likewise.
* trans-mem.c (diagnose_tm_1_op): Likewise.
* omp-grid.c (grid_find_single_omp_among_assignments): Likewise.
(grid_parallel_clauses_gridifiable): Likewise.
c/
* c-parser.c (c_parser_oacc_declare): Add missing space in
diagnostics.
fortran/
* trans-expr.c (gfc_conv_substring): Add missing space in diagnostics.

--- gcc/cprop.c.jj  2017-02-07 18:45:12.0 +0100
+++ gcc/cprop.c 2017-02-13 11:47:26.122780219 +0100
@@ -972,7 +972,7 @@ cprop_jump (basic_block bb, rtx_insn *se
   if (dump_file != NULL)
 {
   fprintf (dump_file,
-  "GLOBAL CONST-PROP: Replacing reg %d in jump_insn %d with"
+  "GLOBAL CONST-PROP: Replacing reg %d in jump_insn %d with "
   "constant ", REGNO (from), INSN_UID (jump));
   print_rtl (dump_file, src);
   fprintf (dump_file, "\n");
--- gcc/tree-ssa-structalias.c.jj   2017-01-01 12:45:38.0 +0100
+++ gcc/tree-ssa-structalias.c  2017-02-13 11:46:31.354548069 +0100
@@ -2564,7 +2564,7 @@ rewrite_constraints (constraint_graph_t
  if (dump_file && (dump_flags & TDF_DETAILS))
{
 
- fprintf (dump_file, "%s is a non-pointer variable,"
+ fprintf (dump_file, "%s is a non-pointer variable, "
   "ignoring constraint:",
   get_varinfo (lhs.var)->name);
  dump_constraint (dump_file, c);
@@ -2579,7 +2579,7 @@ rewrite_constraints (constraint_graph_t
  if (dump_file && (dump_flags & TDF_DETAILS))
{
 
- fprintf (dump_file, "%s is a non-pointer variable,"
+ fprintf (dump_file, "%s is a non-pointer variable, "
   "ignoring constraint:",
   get_varinfo (rhs.var)->name);
  dump_constraint (dump_file, c);
@@ -3295,7 +3295,7 @@ get_constraint_for_component_ref (tree t
   else if (bitmaxsize == 0)
{
  if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file, "Access to zero-sized part of variable,"
+   fprintf (dump_file, "Access to zero-sized part of variable, "
 "ignoring\n");
}
   else
--- gcc/df-core.c.jj2017-01-01 12:45:38.0 +0100
+++ gcc/df-core.c   2017-02-13 11:46:58.707164586 +0100
@@ -1064,7 +1064,7 @@ df_worklist_dataflow_doublequeue (struct
   /* Dump statistics. */
   if (dump_file)
 fprintf (dump_file, "df_worklist_dataflow_doublequeue:"
-"n_basic_blocks %d n_edges %d"
+" n_basic_blocks %d n_edges %d"
 " count %d (%5.2g)\n",
 n_basic_blocks_for_fn (cfun), n_edges_for_fn (cfun),
 dcount, dcount / (float)n_basic_blocks_for_fn (cfun));
--- gcc/tree-outof-ssa.c.jj 2017-01-01 12:45:36.0 +0100
+++ gcc/tree-outof-ssa.c2017-02-13 11:42:55.241531916 +0100
@@ -242,7 +242,7 @@ insert_partition_copy_on_edge (edge e, i
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
   fprintf (dump_file,
-  "Inserting a partition copy on edge BB%d->BB%d :"
+  "Inserting a partition copy on edge BB%d->BB%d : "
   "PART.%d = PART.%d",
   e->src->index,
   e->dest->index, dest, src);
--- gcc/lra-constraints.c.jj2017-01-30 09:31:48.0 +0100
+++ gcc/lra-constraints.c   2017-02-13 11:50:31.671178832 +0100
@@ -2848,7 +2848,7 @@ process_alt_operands (int only_alternati

Re: [PATCH] Improve x % y to x VRP optimization (PR tree-optimization/79408)

2017-02-13 Thread Jakub Jelinek

On Mon, Feb 13, 2017 at 12:24:08PM +0100, Richard Biener wrote:
> You'd of course allocate it on the stack.  But yeah, sth like your patch
> works for me.

Now bootstrapped/regtested successfully on x86_64-linux and i686-linux.
So is this ok for trunk and perhaps we can add new APIs later?

> > 2017-02-13  Jakub Jelinek  
> > 
> > PR tree-optimization/79408
> > * tree-vrp.c (simplify_div_or_mod_using_ranges): Handle also the
> > case when on TRUNC_MOD_EXPR op0 is INTEGER_CST.
> > (simplify_stmt_using_ranges): Call simplify_div_or_mod_using_ranges
> > also if rhs1 is INTEGER_CST.
> > 
> > * gcc.dg/tree-ssa/pr79408-2.c: New test.

Jakub

[PATCH] Add missing _mm512_prefetch_i{32,64}gather_{pd,ps} (PR target/79481)

2017-02-13 Thread Jakub Jelinek

Hi!

As mentioned in the PR, ICC as well as clang have these non-masked
gather prefetch intrinsics in addition to masked (and for scatter
even GCC has both masked and non-masked), but GCC does not (the
SDM actually doesn't mention those, only those for scatters).

The following patch implements those, I think it is useful to have
them for compatibility with the other compilers as well as for consistency.
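
The idea is simply that a non-masked operation is the masked one with every
mask bit set.  Below is a scalar model of that relationship in plain C.  It
is not the intrinsic itself: both function names are hypothetical and nothing
is actually prefetched.

```c
#include <stdint.h>

/* Hypothetical scalar model of a masked 8-lane operation: returns how
   many lanes have their mask bit set, i.e. how many lanes would be
   acted on.  */
static int
model_masked_op (uint8_t mask, const int index[8])
{
  int touched = 0;
  for (int lane = 0; lane < 8; lane++)
    if (mask & (1u << lane))
      {
	(void) index[lane];	/* A real implementation would use this lane.  */
	touched++;
      }
  return touched;
}

/* The non-masked variant just forwards an all-ones mask.  */
static int
model_op (const int index[8])
{
  return model_masked_op (0xFF, index);
}
```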

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-02-13  Jakub Jelinek  

PR target/79481
* config/i386/avx512pfintrin.h (_mm512_prefetch_i32gather_pd,
_mm512_prefetch_i32gather_ps, _mm512_prefetch_i64gather_pd,
_mm512_prefetch_i64gather_ps): New inline functions and macros.

* gcc.target/i386/sse-14.c (test_2vx): Add void return type.
(test_3vx): Change return type from int to void. 
(_mm512_prefetch_i32gather_ps, _mm512_prefetch_i32scatter_ps,
_mm512_prefetch_i64gather_ps, _mm512_prefetch_i64scatter_ps,
_mm512_prefetch_i32gather_pd, _mm512_prefetch_i32scatter_pd,
_mm512_prefetch_i64gather_pd, _mm512_prefetch_i64scatter_pd): New
tests.
* gcc.target/i386/sse-22.c (test_2vx): Add void return type.
(test_3vx): Change return type from int to void.
(_mm512_prefetch_i32gather_ps, _mm512_prefetch_i32scatter_ps,
_mm512_prefetch_i64gather_ps, _mm512_prefetch_i64scatter_ps,
_mm512_prefetch_i32gather_pd, _mm512_prefetch_i32scatter_pd,
_mm512_prefetch_i64gather_pd, _mm512_prefetch_i64scatter_pd): New
tests.
* gcc.target/i386/avx512pf-vgatherpf0dpd-1.c: Add non-masked
intrinsic.  Change scan-assembler-times number from 1 to 2.
* gcc.target/i386/avx512pf-vgatherpf0dps-1.c: Likewise.
* gcc.target/i386/avx512pf-vgatherpf0qpd-1.c: Likewise.
* gcc.target/i386/avx512pf-vgatherpf0qps-1.c: Likewise.
* gcc.target/i386/avx512pf-vgatherpf1dpd-1.c: Likewise.
* gcc.target/i386/avx512pf-vgatherpf1dps-1.c: Likewise.
* gcc.target/i386/avx512pf-vgatherpf1qpd-1.c: Likewise.
* gcc.target/i386/avx512pf-vgatherpf1qps-1.c: Likewise.

--- gcc/config/i386/avx512pfintrin.h.jj 2017-01-17 18:40:59.0 +0100
+++ gcc/config/i386/avx512pfintrin.h2017-02-13 09:56:21.03124 +0100
@@ -48,6 +48,24 @@ typedef unsigned short __mmask16;
 #ifdef __OPTIMIZE__
 extern __inline void
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_prefetch_i32gather_pd (__m256i __index, void const *__addr,
+ int __scale, int __hint)
+{
+  __builtin_ia32_gatherpfdpd ((__mmask8) 0xFF, (__v8si) __index, __addr,
+ __scale, __hint);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_prefetch_i32gather_ps (__m512i __index, void const *__addr,
+ int __scale, int __hint)
+{
+  __builtin_ia32_gatherpfdps ((__mmask16) 0xFFFF, (__v16si) __index, __addr,
+ __scale, __hint);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_mask_prefetch_i32gather_pd (__m256i __index, __mmask8 __mask,
   void const *__addr, int __scale, int __hint)
 {
@@ -66,6 +84,24 @@ _mm512_mask_prefetch_i32gather_ps (__m51
 
 extern __inline void
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_prefetch_i64gather_pd (__m512i __index, void const *__addr,
+ int __scale, int __hint)
+{
+  __builtin_ia32_gatherpfqpd ((__mmask8) 0xFF, (__v8di) __index, __addr,
+ __scale, __hint);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_prefetch_i64gather_ps (__m512i __index, void const *__addr,
+ int __scale, int __hint)
+{
+  __builtin_ia32_gatherpfqps ((__mmask8) 0xFF, (__v8di) __index, __addr,
+ __scale, __hint);
+}
+
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm512_mask_prefetch_i64gather_pd (__m512i __index, __mmask8 __mask,
   void const *__addr, int __scale, int __hint)
 {
@@ -155,6 +191,14 @@ _mm512_mask_prefetch_i64scatter_ps (void
 }
 
 #else
+#define _mm512_prefetch_i32gather_pd(INDEX, ADDR, SCALE, HINT)  \
+  __builtin_ia32_gatherpfdpd ((__mmask8)0xFF, (__v8si)(__m256i)INDEX,   \
+ (void const *)ADDR, (int)SCALE, (int)HINT)
+
+#define _mm512_prefetch_i32gather_ps(INDEX, ADDR, SCALE, HINT)  \
+  __builtin_ia32_gatherpfdps ((__mmask16)0xFFFF, (__v16si)(__m512i)INDEX,\
+ (void const *)ADDR, (int)SCALE, (int)HINT)
+
 #define _mm512_mask_prefetch_i32gather_pd(INDEX, MASK, ADDR, SCALE, HINT)\
   __builtin_ia32_gatherpfdpd

Re: Patch ping^2

2017-02-13 Thread Nathan Sidwell


On 02/13/2017 12:28 PM, Jakub Jelinek wrote:



The reason for that VOID_TYPE_P stuff in COND_EXPR handling is not
that stabilize_expr doesn't like it, but to avoid warning twice.
If I replace all 3 VOID_TYPE_P tests (with the patch) in
cp_build_modify_expr with 0 && VOID_TYPE_P, the above testcase
errors 6 times instead of 4 times as it should.
For the case when lhs is a pre-inc/decrement or modify_expr, we don't have
this problem, rhs is used just once in cp_convert_for_assignment where this
error is diagnosed normally.

Thus, I'll bootstrap/regtest following simplified version of the patch
instead:



ok, thanks

nathan

--
Nathan Sidwell

libgo patch committed: Fix s390x tests

2017-02-13 Thread Ian Lance Taylor

This patch to libgo fixes some tests on s390x that rely on assembler
code that is not (yet) implemented for gccgo.  The tests are simply
marked to be ignored.  This fixes GCC PR 79443.  Bootstrapped and
tested on x86_64-pc-linux-gnu and (by Dominik Vogt) on s390x.
Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 245052)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-7fa4eb4b7a32953c2e838f1b0c684a6733172b43
+c3935e1f20ad5b1d4c41150f11fb266913c04df7
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/crypto/sha256/fallback_test.go
===
--- libgo/go/crypto/sha256/fallback_test.go (revision 245052)
+++ libgo/go/crypto/sha256/fallback_test.go (working copy)
@@ -2,6 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
+// +build ignore
 // +build s390x
 
 package sha256
Index: libgo/go/math/export_s390x_test.go
===
--- libgo/go/math/export_s390x_test.go  (revision 245052)
+++ libgo/go/math/export_s390x_test.go  (working copy)
@@ -2,6 +2,8 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
+// +build ignore
+
 package math
 
 // Export internal functions and variable for testing.

Re: Patch ping^2

2017-02-13 Thread Jakub Jelinek

On Mon, Feb 13, 2017 at 05:44:54PM +0100, Jakub Jelinek wrote:
> On Mon, Feb 13, 2017 at 11:41:48AM -0500, Nathan Sidwell wrote:
> > On 02/13/2017 10:46 AM, Jakub Jelinek wrote:
> > > Hi!
> > > 
> > > I'd like to ping a couple of patches:
> > > 
> > > - C++ P1 PR79232 - ICEs and wrong-code with COMPOUND_EXPR on lhs of 
> > > assignment
> > >   http://gcc.gnu.org/ml/gcc-patches/2017-01/msg02341.html
> > 
> > What puzzles me about (and may be an existing orthogonal issue), is the
> > checking for TYPE(rhs) == VOID.  In the current code it's only explicitly
> > checked in the lhs == COND_EXPR case, which is strange.  Why isn't it a
> > general constraint?
> 
> I'll double check; copied this from the COND_EXPR case which had the same.
> I believe the reason is that this is (or ought to be) checked later on,
> but stabilize_expr wouldn't work well if it is called with a void
> expression.

Ok, so tried
void bar (int);

int
foo (int x, int *y, int z, int w, int u)
{
  y[++z] = bar (++u);
  (x++, y[++z]) = bar (++u);
  (w ? y[++z] : y[z++]) = bar (++u);
  (x++, (w ? y[++z] : y[z++])) = bar (++u);
  return x + z + u;
}

The reason for that VOID_TYPE_P stuff in COND_EXPR handling is not
that stabilize_expr doesn't like it, but to avoid warning twice.
If I replace all 3 VOID_TYPE_P tests (with the patch) in
cp_build_modify_expr with 0 && VOID_TYPE_P, the above testcase
errors 6 times instead of 4 times as it should.
For the case when lhs is a pre-inc/decrement or modify_expr, we don't have
this problem, rhs is used just once in cp_convert_for_assignment where this
error is diagnosed normally.

Thus, I'll bootstrap/regtest following simplified version of the patch
instead:

2017-02-13  Jakub Jelinek  

PR c++/79232
* typeck.c (cp_build_modify_expr): Handle properly COMPOUND_EXPRs
on lhs that have {PRE{DEC,INC}REMENT,MODIFY,MIN,MAX,COND}_EXPR
in the rightmost operand.

* g++.dg/cpp1z/eval-order4.C: New test.
* g++.dg/other/pr79232.C: New test.

--- gcc/cp/typeck.c.jj  2017-01-31 22:36:33.762297835 +0100
+++ gcc/cp/typeck.c 2017-02-13 18:22:06.927396122 +0100
@@ -7571,16 +7571,26 @@ tree
 cp_build_modify_expr (location_t loc, tree lhs, enum tree_code modifycode,
  tree rhs, tsubst_flags_t complain)
 {
-  tree result;
+  tree result = NULL_TREE;
   tree newrhs = rhs;
   tree lhstype = TREE_TYPE (lhs);
+  tree olhs = lhs;
   tree olhstype = lhstype;
   bool plain_assign = (modifycode == NOP_EXPR);
+  bool compound_side_effects_p = false;
+  tree preeval = NULL_TREE;
 
   /* Avoid duplicate error messages from operands that had errors.  */
   if (error_operand_p (lhs) || error_operand_p (rhs))
 return error_mark_node;
 
+  while (TREE_CODE (lhs) == COMPOUND_EXPR)
+{
+  if (TREE_SIDE_EFFECTS (TREE_OPERAND (lhs, 0)))
+   compound_side_effects_p = true;
+  lhs = TREE_OPERAND (lhs, 1);
+}
+
   /* Handle control structure constructs used as "lvalues".  Note that we
  leave COMPOUND_EXPR on the LHS because it is sequenced after the RHS.  */
   switch (TREE_CODE (lhs))
@@ -7588,20 +7598,41 @@ cp_build_modify_expr (location_t loc, tr
   /* Handle --foo = 5; as these are valid constructs in C++.  */
 case PREDECREMENT_EXPR:
 case PREINCREMENT_EXPR:
+  if (compound_side_effects_p)
+   newrhs = rhs = stabilize_expr (rhs, &preeval);
   if (TREE_SIDE_EFFECTS (TREE_OPERAND (lhs, 0)))
lhs = build2 (TREE_CODE (lhs), TREE_TYPE (lhs),
  cp_stabilize_reference (TREE_OPERAND (lhs, 0)),
  TREE_OPERAND (lhs, 1));
   lhs = build2 (COMPOUND_EXPR, lhstype, lhs, TREE_OPERAND (lhs, 0));
+maybe_add_compound:
+  /* If we had (bar, --foo) = 5; or (bar, (baz, --foo)) = 5;
+and looked through the COMPOUND_EXPRs, readd them now around
+the resulting lhs.  */
+  if (TREE_CODE (olhs) == COMPOUND_EXPR)
+   {
+ lhs = build2 (COMPOUND_EXPR, lhstype, TREE_OPERAND (olhs, 0), lhs);
+ tree *ptr = &TREE_OPERAND (lhs, 1);
+ for (olhs = TREE_OPERAND (olhs, 1);
+  TREE_CODE (olhs) == COMPOUND_EXPR;
+  olhs = TREE_OPERAND (olhs, 1))
+   {
+ *ptr = build2 (COMPOUND_EXPR, lhstype,
+TREE_OPERAND (olhs, 0), *ptr);
+ ptr = &TREE_OPERAND (*ptr, 1);
+   }
+   }
   break;
 
 case MODIFY_EXPR:
+  if (compound_side_effects_p)
+   newrhs = rhs = stabilize_expr (rhs, &preeval);
   if (TREE_SIDE_EFFECTS (TREE_OPERAND (lhs, 0)))
lhs = build2 (TREE_CODE (lhs), TREE_TYPE (lhs),
  cp_stabilize_reference (TREE_OPERAND (lhs, 0)),
  TREE_OPERAND (lhs, 1));
   lhs = build2 (COMPOUND_EXPR, lhstype, lhs, TREE_OPERAND (lhs, 0));
-  break;
+  goto maybe_add_compound;
 
 case MIN_EXPR:
 case MAX_EXPR:
@@ -7629,7 +7660,6 @@ cp_build_modify_expr (location_t loc, tr
   except that the

Re: [C++ PATCH] 79296, ICE mangling local class

2017-02-13 Thread Jason Merrill

OK.

On Mon, Feb 13, 2017 at 12:04 PM, Nathan Sidwell  wrote:
> On 02/06/2017 02:19 PM, Jason Merrill wrote:
>
>>> AFAICT we cannot easily assert the condition anymore -- C++11 added the
>>> ability for instantiations involving local types.
>>
>>
>> That doesn't mean they have linkage, though; this type isn't subject
>> to the ODR, so need_assembler_name_p should be false.
>>
>> I notice that for some reason TREE_PUBLIC is set on the TYPE_DECL for
>> A we're trying to mangle, which might be where the problem comes in.
>
>
> This patch amends determine_visibility to use the template fn context for a
> local class instantiation.  With that TREE_PUBLIC gets cleared, and we no
> longer ICE.
>
> ok?
>
> nathan
>
> --
> Nathan Sidwell

[PATCH] rs6000: Fix gcc.dg/tree-ssa/ssa-dom-cse-2.c

2017-02-13 Thread Segher Boessenkool

The testcase should xfail when compiling for a 64-bit target, not when
the default target is 64-bit.

Tested on powerpc-linux {-m32,-m64}, committing to trunk.


Segher


2017-02-13  Segher Boessenkool  

gcc/testsuite/
* gcc.dg/tree-ssa/ssa-dom-cse-2.c: Do not xfail powerpc64*-*-*.
Instead, xfail powerpc*-*-* && lp64.

---
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
index d2600db..a660e82 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-cse-2.c
@@ -25,4 +25,4 @@ foo ()
but the loop reads only one element at a time, and DOM cannot resolve these.
The same happens on powerpc depending on the SIMD support available.  */
 
-/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail { { alpha*-*-* 
hppa*64*-*-* powerpc64*-*-* } || { { sparc*-*-* && lp64 } || { riscv*-*-* && 
lp64 } } } } } } */
+/* { dg-final { scan-tree-dump "return 28;" "optimized" { xfail { { alpha*-*-* 
hppa*64*-*-* } || { lp64 && { powerpc*-*-* sparc*-*-* riscv*-*-* } } } } } } */
-- 
1.9.3

Re: [www patch] sort branches

2017-02-13 Thread Michael Eager


On 02/13/2017 06:34 AM, Nathan Sidwell wrote:

I've applied this patch to sort the other branch lists in svn.html.  I also 
added an index and split
the inactive branch list into merged and plain inactive.

Attention branch maintainers:  Please check whether I've incorrectly put a 
merged branch on the
inactive list.

nathan


The microblaze branch has been merged into the gcc mainline.

diff -urNp svn.html-orig svn.html
--- svn.html-orig   2017-02-13 09:00:36.0 -0800
+++ svn.html2017-02-13 09:04:50.0 -0800
@@ -455,12 +455,6 @@ the command svn log --stop-on-copy
   are Dwarakanath Rajagopal href="mailto:dwarak.rajago...@amd.com;>dwarak.rajago...@amd.com

   and H.J. Lu mailto:hjl.to...@gmail.com;>hjl.to...@gmail.com.

-  microblaze
-  This branch contains support for the Xilinx MicroBlaze architecture.
-  This branch will be used to update MicroBlaze support from gcc-4.1.2 to
-  to the head.  It is maintained by Michael Eager
-  mailto:ea...@eagercon.com;>ea...@eagercon.com.
-
   mpx
   The goal of this branch is to support Intel MPX technology
   (href="https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf;>link).

@@ -790,6 +784,13 @@ be prefixed with the initials of the dis
   and is currently unmaintained.
   This work now continues on the autovect-branch.

+  microblaze
+  This branch contained support for updating the Xilinx MicroBlaze
+  architecture to gcc-4.1.2.
+  It was created by Michael Eager
+  mailto:ea...@eagercon.com;>ea...@eagercon.com.
+  All changes have been merged into mainline.
+
   https://gcc.gnu.org/wiki/MemRef;>mem-ref2
   mips-3_4-rewrite-branch
   named-addr-spaces-branch

--
Michael Eagerea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

[gomp4] adjust num_gangs and add a diagnostic for unsupported num_workers

2017-02-13 Thread Cesar Philippidis

This patch does the following:

 * Adjusts the default num_gangs to utilize more of the GPU hardware.
 * Teaches libgomp to emit a diagnostic when num_workers isn't supported.

According to the confusing CUDA literature, it appears that the previous
num_gangs wasn't fully utilizing the GPU hardware. The previous strategy
was to set num_gangs to the number of SM processors. However, SM
processors can execute multiple CUDA blocks concurrently. In this patch,
I'm using a relaxed version of the formulas from Nvidia's CUDA Occupancy
Calculator spreadsheet to determine num_gangs. More specifically, since
we're not using shared-memory that extensively right now, I've omitted
that constraint from the formula.

While I was at it, I also taught the nvptx plugin how to emit a
diagnostic when the hardware doesn't have enough registers to support
the requested num_workers at run time. There are two problems here. 1)
The register file is a shared resource between all of the threads in a
SM. The more registers each thread in the SM utilizes, the fewer threads
the CUDA blocks can contain. 2) In order to eliminate MIN_EXPRs in the
for-loop branches in worker-partitioned loops, the nvptx BE is currently
hard-coding the default num_workers to 32 for any parallel region that
doesn't contain an explicit num_workers. When I disabled that
optimization, I observed a 2.5x slow down in CloverLeaf. So rather than
disabling that optimization, I taught the runtime to give the end user
some performance hints. E.g., recompile your program with
-fopenacc-dim=-:num_workers.
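
To make the register-file limit concrete, here is a rough sketch of the kind
of rounding the occupancy calculation involves.  It is not the code in the
plugin: the function names, the exact formula and the input values are
assumptions for illustration, and only the two granularity constants come
from the patch.

```c
/* Round X up or down to a multiple of M (M > 0).  */
static int
round_up (int x, int m)
{
  return (x + m - 1) / m * m;
}

static int
round_down (int x, int m)
{
  return x / m * m;
}

/* Hypothetical estimate of how many warps one SM's register file can
   hold.  Assumes REGS_PER_THREAD and WARP_SIZE are positive.  */
static int
warps_per_sm (int rf_size, int regs_per_thread, int warp_size)
{
  const int reg_granularity = 256;	/* As in the patch.  */
  const int warp_granularity = 4;	/* As in the patch.  */

  /* Registers are allocated per warp in units of REG_GRANULARITY.  */
  int regs_per_warp = round_up (regs_per_thread * warp_size,
				reg_granularity);

  /* Warps are allocated in groups of WARP_GRANULARITY.  */
  return round_down (rf_size / regs_per_warp, warp_granularity);
}
```

Under these assumptions, the more registers each thread uses, the fewer warps
fit, which is the situation where a requested num_workers can no longer be
honored.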

This patch has been applied to gomp-4_0-branch.

Cesar
2017-02-13  Cesar Philippidis  

	libgomp/
	* plugin/plugin-nvptx.c (nvptx_exec): Adjust the default num_gangs.
	Add diagnostic when the hardware cannot support the requested
	num_workers.


diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index d1261b4..8c696eb 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -899,6 +899,7 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
   CUdeviceptr dp;
   struct nvptx_thread *nvthd = nvptx_thread ();
   const char *maybe_abort_msg = "(perhaps abort was called)";
+  static int warp_size, block_size, dev_size, cpu_size, rf_size, sm_size;
 
   function = targ_fn->fn;
 
@@ -917,90 +918,145 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
 	seen_zero = 1;
 }
 
-  if (seen_zero)
-{
-  /* See if the user provided GOMP_OPENACC_DIM environment
-	 variable to specify runtime defaults. */
-  static int default_dims[GOMP_DIM_MAX];
+  /* Both reg_granuarlity and warp_granuularity were extracted from
+ the "Register Allocation Granularity" in Nvidia's CUDA Occupancy
+ Calculator spreadsheet.  Specifically, this required SM_30+
+ targets.  */
+  const int reg_granularity = 256;
+  const int warp_granularity = 4;
 
-  pthread_mutex_lock (_dev_lock);
-  if (!default_dims[0])
+  /* See if the user provided GOMP_OPENACC_DIM environment variable to
+ specify runtime defaults. */
+  static int default_dims[GOMP_DIM_MAX];
+
+  pthread_mutex_lock (_dev_lock);
+  if (!default_dims[0])
+{
+  /* We only read the environment variable once.  You can't
+	 change it in the middle of execution.  The syntax  is
+	 the same as for the -fopenacc-dim compilation option.  */
+  const char *env_var = getenv ("GOMP_OPENACC_DIM");
+  if (env_var)
 	{
-	  /* We only read the environment variable once.  You can't
-	 change it in the middle of execution.  The syntax  is
-	 the same as for the -fopenacc-dim compilation option.  */
-	  const char *env_var = getenv ("GOMP_OPENACC_DIM");
-	  if (env_var)
-	{
-	  const char *pos = env_var;
+	  const char *pos = env_var;
 
-	  for (i = 0; *pos && i != GOMP_DIM_MAX; i++)
+	  for (i = 0; *pos && i != GOMP_DIM_MAX; i++)
+	{
+	  if (i && *pos++ != ':')
+		break;
+	  if (*pos != ':')
 		{
-		  if (i && *pos++ != ':')
+		  const char *eptr;
+
+		  errno = 0;
+		  long val = strtol (pos, (char **)&eptr, 10);
+		  if (errno || val < 0 || (unsigned)val != val)
 		break;
-		  if (*pos != ':')
-		{
-		  const char *eptr;
-
-		  errno = 0;
-		  long val = strtol (pos, (char **)&eptr, 10);
-		  if (errno || val < 0 || (unsigned)val != val)
-			break;
-		  default_dims[i] = (int)val;
-		  pos = eptr;
-		}
+		  default_dims[i] = (int)val;
+		  pos = eptr;
 		}
 	}
+	}
 
-	  int warp_size, block_size, dev_size, cpu_size;
-	  CUdevice dev = nvptx_thread()->ptx_dev->dev;
-	  /* 32 is the default for known hardware.  */
-	  int gang = 0, worker = 32, vector = 32;
-	  CUdevice_attribute cu_tpb, cu_ws, cu_mpc, cu_tpm;
-
-	  cu_tpb = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_BLOCK;
-	  cu_ws = CU_DEVICE_ATTRIBUTE_WARP_SIZE;
-	  cu_mpc = CU_DEVICE_ATTRIBUTE_MULTIPROCESSOR_COUNT;
-	  cu_tpm  = CU_DEVICE_ATTRIBUTE_MAX_THREADS_PER_MULTIPROCESSOR;
-
-	  if

Re: [C++ PATCH] 79296, ICE mangling local class

2017-02-13 Thread Nathan Sidwell


On 02/06/2017 02:19 PM, Jason Merrill wrote:


AFAICT we cannot easily assert the condition anymore -- C++11 added the
ability for instantiations involving local types.


That doesn't mean they have linkage, though; this type isn't subject
to the ODR, so need_assembler_name_p should be false.

I notice that for some reason TREE_PUBLIC is set on the TYPE_DECL for
A we're trying to mangle, which might be where the problem comes in.


This patch amends determine_visibility to use the template fn context 
for a local class instantiation.  With that TREE_PUBLIC gets cleared, 
and we no longer ICE.


ok?

nathan

--
Nathan Sidwell
2017-02-13  Nathan Sidwell  

	PR c++/79296 - ICE mangling localized template instantiation
	* decl2.c (determine_visibility): Use template fn context for
	local class instantiations.

	PR c++/79296
	* g++.dg/cpp0x/pr79296.C: New.

Index: cp/decl2.c
===
--- cp/decl2.c	(revision 245387)
+++ cp/decl2.c	(working copy)
@@ -2225,11 +2225,6 @@ constrain_visibility_for_template (tree
 void
 determine_visibility (tree decl)
 {
-  tree class_type = NULL_TREE;
-  bool use_template;
-  bool orig_visibility_specified;
-  enum symbol_visibility orig_visibility;
-
   /* Remember that all decls get VISIBILITY_DEFAULT when built.  */
 
   /* Only relevant for names with external linkage.  */
@@ -2241,25 +2236,28 @@ determine_visibility (tree decl)
  maybe_clone_body.  */
   gcc_assert (!DECL_CLONED_FUNCTION_P (decl));
 
-  orig_visibility_specified = DECL_VISIBILITY_SPECIFIED (decl);
-  orig_visibility = DECL_VISIBILITY (decl);
+  bool orig_visibility_specified = DECL_VISIBILITY_SPECIFIED (decl);
+  enum symbol_visibility orig_visibility = DECL_VISIBILITY (decl);
 
+  /* The decl may be a template instantiation, which could influence
+ visibilty.  */
+  tree template_decl = NULL_TREE;
   if (TREE_CODE (decl) == TYPE_DECL)
 {
   if (CLASS_TYPE_P (TREE_TYPE (decl)))
-	use_template = CLASSTYPE_USE_TEMPLATE (TREE_TYPE (decl));
+	{
+	  if (CLASSTYPE_USE_TEMPLATE (TREE_TYPE (decl)))
+	template_decl = decl;
+	}
   else if (TYPE_TEMPLATE_INFO (TREE_TYPE (decl)))
-	use_template = 1;
-  else
-	use_template = 0;
+	template_decl = decl;
 }
-  else if (DECL_LANG_SPECIFIC (decl))
-use_template = DECL_USE_TEMPLATE (decl);
-  else
-use_template = 0;
+  else if (DECL_LANG_SPECIFIC (decl) && DECL_USE_TEMPLATE (decl))
+template_decl = decl;
 
   /* If DECL is a member of a class, visibility specifiers on the
  class can influence the visibility of the DECL.  */
+  tree class_type = NULL_TREE;
   if (DECL_CLASS_SCOPE_P (decl))
 class_type = DECL_CONTEXT (decl);
   else
@@ -2302,8 +2300,11 @@ determine_visibility (tree decl)
 	}
 
 	  /* Local classes in templates have CLASSTYPE_USE_TEMPLATE set,
-	 but have no TEMPLATE_INFO, so don't try to check it.  */
-	  use_template = 0;
+	 but have no TEMPLATE_INFO.  Their containing template
+	 function does, and the local class could be constrained
+	 by that.  */
+	  if (template_decl)
+	template_decl = fn;
 	}
   else if (VAR_P (decl) && DECL_TINFO_P (decl)
 	   && flag_visibility_ms_compat)
@@ -2333,7 +2334,7 @@ determine_visibility (tree decl)
 	  && !CLASSTYPE_VISIBILITY_SPECIFIED (TREE_TYPE (DECL_NAME (decl
 	targetm.cxx.determine_class_data_visibility (decl);
 	}
-  else if (use_template)
+  else if (template_decl)
 	/* Template instantiations and specializations get visibility based
 	   on their template unless they override it with an attribute.  */;
   else if (! DECL_VISIBILITY_SPECIFIED (decl))
@@ -2350,11 +2351,11 @@ determine_visibility (tree decl)
 	}
 }
 
-  if (use_template)
+  if (template_decl)
 {
   /* If the specialization doesn't specify visibility, use the
 	 visibility from the template.  */
-  tree tinfo = get_template_info (decl);
+  tree tinfo = get_template_info (template_decl);
   tree args = TI_ARGS (tinfo);
   tree attribs = (TREE_CODE (decl) == TYPE_DECL
 		  ? TYPE_ATTRIBUTES (TREE_TYPE (decl))
Index: testsuite/g++.dg/cpp0x/pr79296.C
===
--- testsuite/g++.dg/cpp0x/pr79296.C	(revision 0)
+++ testsuite/g++.dg/cpp0x/pr79296.C	(working copy)
@@ -0,0 +1,18 @@
+// { dg-require-effective-target lto }
+// { dg-additional-options "-flto" }
+// { dg-do compile { target c++11 } }
+
+// PR 79296 ICE mangling local class of localized instantiation
+
+struct X {
+  template <typename T> X (T const *) {
+struct Z {};
+  }
+};
+
+void Baz ()
+{
+  struct Y { } y;
+
+  0, X (&y);
+}

Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR V2

2017-02-13 Thread Jeff Law


On 02/13/2017 09:15 AM, Marc Glisse wrote:

On Mon, 13 Feb 2017, Jeff Law wrote:


On 02/12/2017 12:13 AM, Marc Glisse wrote:

On Tue, 7 Feb 2017, Jeff Law wrote:


* tree-vrp.c (extract_range_from_binary_expr_1): For EXACT_DIV_EXPR,
if the numerator has the range ~[0,0] make the resultant range ~[0,0].


If I understand correctly, for x /[ex] 4 with x!=0, we currently split
~[0,0] into [INT_MIN,-1] and [1,INT_MAX], then apply EXACT_DIV_EXPR
which gives [INT_MIN/4,-1] and [1,INT_MAX/4], and finally compute the
union of those, which prefers [INT_MIN/4,INT_MAX/4] over ~[0,0]. We
could change the union function, but this patch prefers changing the
code elsewhere so that the new behavior is restricted to the
EXACT_DIV_EXPR case (and supposedly the patch will be reverted if we get
a new non-contiguous version of ranges, where union would already work).
Is that it?

That was one of alternate approaches I suggested.

Part of the problem is the conversion of ~[0,0] is imprecise when it's
going to be used in EXACT_DIV_EXPR, which I outlined elsewhere in the
thread.  As a result of that imprecision, the ranges we get are
[INT_MIN/4,0] U [0,INT_MAX/4].


If VRP for [1, INT_MAX] /[ex] 4 produces [0, INT_MAX/4] instead of [1,
INT_MAX/4], that's a bug that should be fixed in any case. You shouldn't
need [4, INT_MAX] for that.
Agreed.  But given it doesn't actually make anything around 79095 
easier, I'd just as soon defer it to gcc-8.


I suspect that we'll see nicely refined anti-ranges, but rarely see 
improvements in the generated code.





If we fix that imprecision so that the conversion yields [INT_MIN,-4]
U [4, INT_MAX] then apply EXACT_DIV_EXPR we get [INT_MIN/4,-1] U
[1,INT_MAX/4], which union_ranges turns into [INT_MIN/4,INT_MAX/4].
We still end up needing a hack in union_ranges that will look a hell
of a lot like the one we're contemplating for intersect_ranges.


That was the point of my question. Do we want to put that "hack" (prefer
an anti-range in some cases) in a central place where it would apply any
time we try to replace [a,b]U[c,d] (b+1

Re: [PATCH] Update baseline_symbols.txt for {x86_64,i?86,aarch64,powerpc64,s390{,x}}-linux (PR libstdc++/79348)

2017-02-13 Thread H.J. Lu

On Mon, Feb 13, 2017 at 4:51 AM, Jonathan Wakely  wrote:
> On 13/02/17 13:09 +0100, Jakub Jelinek wrote:
>>
>> Hi!
>>
>> This patch updates baseline_symbols.txt mostly from our latest rpm builds,
>> x86_64 and i686 have been also compared to my local bootstrap abi lists
>> and s390 (31-bit) comes from my bootstrap (and 64-bit s390x compared
>> against
>> the rpm builds too).
>> s390{,x}-linux weren't updated the last time (for GCC 6), so it now even
>> fails the abi check without this patch, the rest has been.
>> s390x 64-bit also is identical to powerpc64 64-bit (which is right, both
>> have the same mangling of important types and also the long double
>> situation
>> is the same (well, powerpc64 also has the even newer long double, but I
>> think we don't instantiate in libstdc++ anything for that yet).
>>
>> Ok for trunk?
>
>
> OK, thanks.
>

I checked in this patch for x32.

-- 
H.J.
---
Index: ChangeLog
===
--- ChangeLog (revision 245393)
+++ ChangeLog (working copy)
@@ -1,3 +1,8 @@
+2017-02-13  H.J. Lu  
+
+ PR libstdc++/79348
+ * config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt: Updated.
+
 2017-02-13  Jakub Jelinek  

  PR libstdc++/79348
Index: config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt
===
--- config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt (revision 245393)
+++ config/abi/post/x86_64-linux-gnu/x32/baseline_symbols.txt (working copy)
@@ -1580,6 +1580,7 @@ FUNC:_ZNSsC1EPKcRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC1EPKcjRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC1ERKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC1ERKSs@@GLIBCXX_3.4
+FUNC:_ZNSsC1ERKSsjRKSaIcE@@GLIBCXX_3.4.23
 FUNC:_ZNSsC1ERKSsjj@@GLIBCXX_3.4
 FUNC:_ZNSsC1ERKSsjjRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC1ESt16initializer_listIcERKSaIcE@@GLIBCXX_3.4.11
@@ -1593,6 +1594,7 @@ FUNC:_ZNSsC2EPKcRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC2EPKcjRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC2ERKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC2ERKSs@@GLIBCXX_3.4
+FUNC:_ZNSsC2ERKSsjRKSaIcE@@GLIBCXX_3.4.23
 FUNC:_ZNSsC2ERKSsjj@@GLIBCXX_3.4
 FUNC:_ZNSsC2ERKSsjjRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC2ESt16initializer_listIcERKSaIcE@@GLIBCXX_3.4.11
@@ -2232,6 +2234,7 @@ FUNC:_ZNSt15_List_node_base8transferEPS_
 FUNC:_ZNSt15_List_node_base9_M_unhookEv@@GLIBCXX_3.4.14
 FUNC:_ZNSt15__exception_ptr13exception_ptr4swapERS0_@@CXXABI_1.3.3
 FUNC:_ZNSt15__exception_ptr13exception_ptrC1EMS0_FvvE@@CXXABI_1.3.3
+FUNC:_ZNSt15__exception_ptr13exception_ptrC1EPv@@CXXABI_1.3.11
 FUNC:_ZNSt15__exception_ptr13exception_ptrC1ERKS0_@@CXXABI_1.3.3
 FUNC:_ZNSt15__exception_ptr13exception_ptrC1Ev@@CXXABI_1.3.3
 FUNC:_ZNSt15__exception_ptr13exception_ptrC2EMS0_FvvE@@CXXABI_1.3.3
@@ -2777,7 +2780,9 @@ FUNC:_ZNSt7__cxx1112basic_stringIcSt11ch
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE10_M_replaceEjjPKcj@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE10_S_compareEjj@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE11_M_capacityEj@@GLIBCXX_3.4.21
+FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC1EPcOS3_@@GLIBCXX_3.4.23
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC1EPcRKS3_@@GLIBCXX_3.4.21
+FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC2EPcOS3_@@GLIBCXX_3.4.23
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC2EPcRKS3_@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructEjc@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructIN9__gnu_cxx17__normal_iteratorIPKcS4_vT_SB_St20forward_iterator_tag@@GLIBCXX_3.4.21
@@ -2876,6 +2881,7 @@ FUNC:_ZNSt7__cxx1112basic_stringIcSt11ch
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS3_@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_RKS3_@@GLIBCXX_3.4.21
+FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_jRKS3_@@GLIBCXX_3.4.23
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_jj@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ERKS4_jjRKS3_@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC1ESt16initializer_listIcERKS3_@@GLIBCXX_3.4.21
@@ -2891,6 +2897,7 @@ FUNC:_ZNSt7__cxx1112basic_stringIcSt11ch
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS3_@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_@@GLIBCXX_3.4.21
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_RKS3_@@GLIBCXX_3.4.21
+FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_jRKS3_@@GLIBCXX_3.4.23
 FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEC2ERKS4_jj@@GLIBCXX_3.4.21

Re: Patch ping^2

2017-02-13 Thread Nathan Sidwell


On 02/13/2017 10:46 AM, Jakub Jelinek wrote:

Hi!

I'd like to ping a couple of patches:

- C++ P1 PR79232 - ICEs and wrong-code with COMPOUND_EXPR on lhs of assignment
  http://gcc.gnu.org/ml/gcc-patches/2017-01/msg02341.html


What puzzles me about it (and this may be an existing orthogonal issue) 
is the check for TYPE(rhs) == VOID.  In the current code it's only 
explicitly checked in the lhs == COND_EXPR case, which is strange.  Why 
isn't it a general constraint?


nathan

--
Nathan Sidwell

Re: Patch ping^2

2017-02-13 Thread Jakub Jelinek

On Mon, Feb 13, 2017 at 11:41:48AM -0500, Nathan Sidwell wrote:
> On 02/13/2017 10:46 AM, Jakub Jelinek wrote:
> > Hi!
> > 
> > I'd like to ping a couple of patches:
> > 
> > - C++ P1 PR79232 - ICEs and wrong-code with COMPOUND_EXPR on lhs of 
> > assignment
> >   http://gcc.gnu.org/ml/gcc-patches/2017-01/msg02341.html
> 
> What puzzles me about (and may be an existing orthogonal issue), is the
> checking for TYPE(rhs) == VOID.  In the current code it's only explicitly
> checked in the lhs == COND_EXPR case, which is strange.  Why isn't it a
> general constraint?

I'll double check; copied this from the COND_EXPR case which had the same.
I believe the reason is that this is (or ought to be) checked later on,
but stabilize_expr wouldn't work well if it is called with a void
expression.

Jakub

[PATCH] rs6000: testsuite: Fix vec-adde[c]-int128.c

2017-02-13 Thread Segher Boessenkool

These are runtime testcases, so they should test p8vector_hw instead of
powerpc_p8vector_ok, or they will fail with an illegal instruction on
older processors.

Also, they can run on any PowerPC target, not just with compilers that
were configured to default to 64-bit targets.

Tested on powerpc64-linux {-m32,-m64}, committing to trunk.


Segher


2017-02-13  Segher Boessenkool  

gcc/testsuite/
* gcc.target/powerpc/vec-adde-int128.c: Use p8vector_hw instead of
powerpc_p8vector_ok.
* gcc.target/powerpc/vec-addec-int128.c: Ditto.

---
 gcc/testsuite/gcc.target/powerpc/vec-adde-int128.c  | 3 +--
 gcc/testsuite/gcc.target/powerpc/vec-addec-int128.c | 3 +--
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/vec-adde-int128.c b/gcc/testsuite/gcc.target/powerpc/vec-adde-int128.c
index 4f951a9..8eed7f5 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-adde-int128.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-adde-int128.c
@@ -1,5 +1,4 @@
-/* { dg-do run { target { powerpc64*-*-* } } } */
-/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-do run { target { powerpc*-*-* && p8vector_hw } } } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
 /* { dg-options "-mcpu=power8 -O3" } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-addec-int128.c b/gcc/testsuite/gcc.target/powerpc/vec-addec-int128.c
index f95143a..4388e06 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-addec-int128.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-addec-int128.c
@@ -1,5 +1,4 @@
-/* { dg-do run { target { powerpc64*-*-* } } } */
-/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-do run { target { powerpc*-*-* && p8vector_hw } } } */
 /* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
 /* { dg-options "-mcpu=power8 -O3" } */
 
-- 
1.9.3

Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR V2

2017-02-13 Thread Marc Glisse


On Mon, 13 Feb 2017, Jeff Law wrote:


On 02/12/2017 12:13 AM, Marc Glisse wrote:

On Tue, 7 Feb 2017, Jeff Law wrote:


* tree-vrp.c (extract_range_from_binary_expr_1): For EXACT_DIV_EXPR,
if the numerator has the range ~[0,0] make the resultant range ~[0,0].


If I understand correctly, for x /[ex] 4 with x!=0, we currently split
~[0,0] into [INT_MIN,-1] and [1,INT_MAX], then apply EXACT_DIV_EXPR
which gives [INT_MIN/4,-1] and [1,INT_MAX/4], and finally compute the
union of those, which prefers [INT_MIN/4,INT_MAX/4] over ~[0,0]. We
could change the union function, but this patch prefers changing the
code elsewhere so that the new behavior is restricted to the
EXACT_DIV_EXPR case (and supposedly the patch will be reverted if we get
a new non-contiguous version of ranges, where union would already work).
Is that it?

That was one of alternate approaches I suggested.

Part of the problem is the conversion of ~[0,0] is imprecise when it's going 
to be used in EXACT_DIV_EXPR, which I outlined elsewhere in the thread.  As a 
result of that imprecision, the ranges we get are [INT_MIN/4,0] U 
[0,INT_MAX/4].


If VRP for [1, INT_MAX] /[ex] 4 produces [0, INT_MAX/4] instead of [1, 
INT_MAX/4], that's a bug that should be fixed in any case. You shouldn't 
need [4, INT_MAX] for that.
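
To make the arithmetic concrete, here is a small C sketch of the bound computation for an exact division by a positive constant. It assumes a plain closed-interval representation; struct range and the helper names are hypothetical and are not VRP's data structures. Because only multiples of the divisor can occur, the bounds round inward, so [1, INT_MAX] /[ex] 4 gives [1, INT_MAX/4] directly.

```c
#include <assert.h>
#include <limits.h>

struct range { int lo, hi; };   /* closed interval [lo, hi] */

/* Floor and ceiling division for a positive divisor D.  C division
   truncates toward zero, so adjust when the remainder is nonzero.  */
static int
floor_div (int a, int d)
{
  return a / d - (a % d != 0 && a < 0);
}

static int
ceil_div (int a, int d)
{
  return a / d + (a % d != 0 && a > 0);
}

/* Range of X / D when X is known to be a multiple of D and D > 0.
   Only multiples of D inside [lo, hi] can occur, so the lower bound
   rounds up and the upper bound rounds down.  A result with lo > hi
   means no multiple of D lies in the input range.  */
static struct range
exact_div_range (struct range r, int d)
{
  assert (d > 0 && r.lo <= r.hi);
  struct range out = { ceil_div (r.lo, d), floor_div (r.hi, d) };
  return out;
}
```

Applied to the two halves of ~[0,0], this yields [INT_MIN/4, -1] and [1, INT_MAX/4], the pair of ranges discussed in this thread.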


If we fix that imprecision so that the conversion yields [INT_MIN,-4] U [4, 
INT_MAX] then apply EXACT_DIV_EXPR we get [INT_MIN/4,-1] U [1,INT_MAX/4], 
which union_ranges turns into [INT_MIN/4,INT_MAX/4].  We still end up needing 
a hack in union_ranges that will look a hell of a lot like the one we're 
contemplating for intersect_ranges.


That was the point of my question. Do we want to put that "hack" (prefer 
an anti-range in some cases) in a central place where it would apply any 
time we try to replace [a,b]U[c,d] (b+1

Re: [RFA][PR tree-optimization/79095] [PATCH 1/4] Improve ranges for MINUS_EXPR and EXACT_DIV_EXPR V2

2017-02-13 Thread Jeff Law


On 02/12/2017 12:13 AM, Marc Glisse wrote:

On Tue, 7 Feb 2017, Jeff Law wrote:


* tree-vrp.c (extract_range_from_binary_expr_1): For EXACT_DIV_EXPR,
if the numerator has the range ~[0,0] make the resultant range ~[0,0].


If I understand correctly, for x /[ex] 4 with x!=0, we currently split
~[0,0] into [INT_MIN,-1] and [1,INT_MAX], then apply EXACT_DIV_EXPR
which gives [INT_MIN/4,-1] and [1,INT_MAX/4], and finally compute the
union of those, which prefers [INT_MIN/4,INT_MAX/4] over ~[0,0]. We
could change the union function, but this patch prefers changing the
code elsewhere so that the new behavior is restricted to the
EXACT_DIV_EXPR case (and supposedly the patch will be reverted if we get
a new non-contiguous version of ranges, where union would already work).
Is that it?

That was one of alternate approaches I suggested.

Part of the problem is the conversion of ~[0,0] is imprecise when it's 
going to be used in EXACT_DIV_EXPR, which I outlined elsewhere in the 
thread.  As a result of that imprecision, the ranges we get are 
[INT_MIN/4,0] U [0,INT_MAX/4].


If we fix that imprecision so that the conversion yields [INT_MIN,-4] U 
[4, INT_MAX] then apply EXACT_DIV_EXPR we get [INT_MIN/4,-1] U 
[1,INT_MAX/4], which union_ranges turns into [INT_MIN/4,INT_MAX/4].  We 
still end up needing a hack in union_ranges that will look a hell of a 
lot like the one we're contemplating for intersect_ranges.
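
The trade-off in union_ranges can be illustrated with a toy model. The sketch below assumes a plain closed-interval representation (struct range and both helpers are hypothetical, not the tree-vrp.c code): a union that must be one contiguous range has to take the convex hull, which swallows 0, while describing the result by the gap between the two inputs keeps the fact that 0 is excluded.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

struct range { int lo, hi; };   /* closed interval [lo, hi] */

/* Convex hull: the only answer available when the union has to be a
   single contiguous range.  */
static struct range
hull (struct range a, struct range b)
{
  struct range r = { a.lo < b.lo ? a.lo : b.lo,
                     a.hi > b.hi ? a.hi : b.hi };
  return r;
}

/* If A lies entirely below B with at least one value between them,
   store that gap in *HOLE and return true.  The anti-range ~[hole] is
   a superset of A U B that still excludes the values in the gap.  */
static bool
gap_between (struct range a, struct range b, struct range *hole)
{
  assert (a.lo <= a.hi && b.lo <= b.hi);
  /* a.hi + 1 cannot overflow here because a.hi < b.lo <= INT_MAX.  */
  if (a.hi < b.lo && a.hi + 1 < b.lo)
    {
      hole->lo = a.hi + 1;
      hole->hi = b.lo - 1;
      return true;
    }
  return false;
}
```

For [INT_MIN/4, -1] and [1, INT_MAX/4] the hull is [INT_MIN/4, INT_MAX/4], whereas the gap is [0, 0], i.e. the anti-range ~[0,0]. Neither answer is strictly better: the hull has tighter outer bounds, the anti-range remembers that the value is nonzero, which is why choosing between them needs a heuristic.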


Jeff

Re: [PATCH][AArch64] Use contains_mem_rtx_p to detect memory sub-rtxes

2017-02-13 Thread Kyrill Tkachov



On 13/02/17 15:53, Kyrill Tkachov wrote:

Hi all,

We recently (well, within the last year or two) introduced a general function 
to detect MEM sub-rtxes in rtlanal.c: contains_mem_rtx_p.
We can use that in aarch64.c and remove the custom has_memory_op that is 
defined in the same way (except that it takes an rtx_insn * instead of an rtx).

Bootstrapped and tested on aarch64-none-linux-gnu.
Committing as obvious.

Thanks,
Kyrill

2016-02-13  Kyrylo Tkachov  

* config/aarch64/aarch64.c (has_memory_op): Delete.
(aarch64_madd_needs_nop): Use contains_mem_rtx_p instead of
has_memory_op.


Committed as r245391 with the ChangeLog year fixed to the correct 2017.

Kyrill

[PATCH][AArch64] Use contains_mem_rtx_p to detect memory sub-rtxes

2017-02-13 Thread Kyrill Tkachov


Hi all,

We recently (well, within the last year or two) introduced a general function 
to detect MEM sub-rtxes in rtlanal.c: contains_mem_rtx_p.
We can use that in aarch64.c and remove the custom has_memory_op that is 
defined in the same way (except that it takes an rtx_insn * instead of an rtx).
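
For readers unfamiliar with the idiom, the check amounts to a search of an expression tree for a node of a given kind. The following C sketch shows that on a toy expression type; the enum, struct and function names are hypothetical and are not GCC's RTL API, which iterates with FOR_EACH_SUBRTX rather than recursing.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy expression codes; real RTL has many more.  */
enum toy_code { TOY_REG, TOY_CONST, TOY_MEM, TOY_PLUS, TOY_SET };

struct toy_expr
{
  enum toy_code code;
  const struct toy_expr *op[2];   /* operands, NULL when unused */
};

/* Return true if X or any sub-expression of X is a TOY_MEM.  */
static bool
toy_contains_mem (const struct toy_expr *x)
{
  if (x == NULL)
    return false;
  if (x->code == TOY_MEM)
    return true;
  return toy_contains_mem (x->op[0]) || toy_contains_mem (x->op[1]);
}
```

Taking the expression rather than the instruction as the argument is what lets one helper serve every caller; a caller holding an instruction passes its pattern, as the patch does with PATTERN (prev).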

Bootstrapped and tested on aarch64-none-linux-gnu.
Committing as obvious.

Thanks,
Kyrill

2016-02-13  Kyrylo Tkachov  

* config/aarch64/aarch64.c (has_memory_op): Delete.
(aarch64_madd_needs_nop): Use contains_mem_rtx_p instead of
has_memory_op.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f72e4c4423d28af66f3bd8068eeb83060d541839..e0289dd3559f07572d581d533212105c4ca90619 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -10965,21 +10965,6 @@ aarch64_mangle_type (const_tree type)
   return NULL;
 }
 
-
-/* Return true if the rtx_insn contains a MEM RTX somewhere
-   in it.  */
-
-static bool
-has_memory_op (rtx_insn *mem_insn)
-{
-  subrtx_iterator::array_type array;
-  FOR_EACH_SUBRTX (iter, array, PATTERN (mem_insn), ALL)
-if (MEM_P (*iter))
-  return true;
-
-  return false;
-}
-
 /* Find the first rtx_insn before insn that will generate an assembly
instruction.  */
 
@@ -11072,7 +11057,7 @@ aarch64_madd_needs_nop (rtx_insn* insn)
  Restore recog state to INSN to avoid state corruption.  */
   extract_constrain_insn_cached (insn);
 
-  if (!prev || !has_memory_op (prev))
+  if (!prev || !contains_mem_rtx_p (PATTERN (prev)))
 return false;
 
   body = single_set (prev);

Patch ping^2

2017-02-13 Thread Jakub Jelinek

Hi!

I'd like to ping a couple of patches:

- C++ P1 PR79232 - ICEs and wrong-code with COMPOUND_EXPR on lhs of assignment
  http://gcc.gnu.org/ml/gcc-patches/2017-01/msg02341.html

- C++ P1 PR79288 - wrong default TLS model for __thread static data members
  http://gcc.gnu.org/ml/gcc-patches/2017-01/msg02349.html

- small simplification for gimple-ssa-sprintf.c
  http://gcc.gnu.org/ml/gcc-patches/2017-02/msg00331.html

- noipa attribute addition
  http://gcc.gnu.org/ml/gcc-patches/2016-12/msg01501.html

Jakub

Re: [PATCH] Invalidate combiner's cached last value upon insn removal (PR rtl-optimization/79388, PR rtl-optimization/79450)

2017-02-13 Thread Segher Boessenkool

Hi Jakub,

On Fri, Feb 10, 2017 at 08:50:36PM +0100, Jakub Jelinek wrote:
> On the following testcases combiner during notes distribution sees a
> REG_DEAD note and decides to remove the setter thereof.  That register
> has remembered a last value on that insn though, and it is a pseudo set
> multiple times.  Later on we combine some further insns into
> pseudo = pseudo op something and use the last known value of the pseudo at
> the already deleted insn (0) and therefore simplify it into pseudo =
> something.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
> 
> 2017-02-10  Jakub Jelinek  
> 
>   PR rtl-optimization/79388
>   PR rtl-optimization/79450
>   * combine.c (distribute_notes): When removing TEM_INSN for which
>   corresponding dest has last value recorded, invalidate that last
>   value.
> 
>   * gcc.c-torture/execute/pr79388.c: New test.
>   * gcc.c-torture/execute/pr79450.c: New test.
> 
> --- gcc/combine.c.jj  2017-01-30 09:31:48.0 +0100
> +++ gcc/combine.c 2017-02-10 17:05:57.500482518 +0100
> @@ -14288,6 +14288,11 @@ distribute_notes (rtx notes, rtx_insn *f
>   NULL_RTX, NULL_RTX, NULL_RTX);
> distribute_links (LOG_LINKS (tem_insn));
>  
> +   unsigned int regno = REGNO (XEXP (note, 0));
> +   reg_stat_type *rsp = &reg_stat[regno];
> +   if (rsp->last_set == tem_insn)
> + record_value_for_reg (XEXP (note, 0), NULL, NULL_RTX);
> +
> SET_INSN_DELETED (tem_insn);
> if (tem_insn == i2)
>   i2 = NULL;

This looks terribly ad hoc, but everything similar is handled the same
way.  Some day it will need to be made more robust, but the patch is
okay for now.  Thanks!
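
The pattern behind the fix, dropping a cached fact when the instruction it came from is deleted, can be shown with a toy model. Everything in this C sketch is hypothetical (the struct layout, NUM_REGS, and the function names); it only mirrors the shape of the check added to distribute_notes, not combine.c itself.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_REGS 8

struct toy_insn { int id; int deleted; };

/* Per-register cache: which insn last set the register, and the value
   it is known to hold after that insn (if KNOWN is nonzero).  */
struct toy_reg_stat
{
  const struct toy_insn *last_set;
  int known;
  int value;
};

static struct toy_reg_stat stats[NUM_REGS];

/* Record that SETTER leaves VALUE in REGNO; a NULL setter means that
   nothing is known about the register any more.  */
static void
record_value (int regno, const struct toy_insn *setter, int value)
{
  assert (regno >= 0 && regno < NUM_REGS);
  stats[regno].last_set = setter;
  stats[regno].known = setter != NULL;
  stats[regno].value = value;
}

/* Delete INSN, which sets REGNO.  If the cache still refers to INSN,
   forget the cached value first so later simplifications cannot use a
   value computed by an insn that no longer exists.  */
static void
delete_setter (struct toy_insn *insn, int regno)
{
  assert (regno >= 0 && regno < NUM_REGS);
  if (stats[regno].last_set == insn)
    record_value (regno, NULL, 0);
  insn->deleted = 1;
}
```

Deleting an insn that is not the recorded setter leaves the cache untouched, which matches the rsp->last_set == tem_insn guard in the patch.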


Segher

Re: [PATCH][ARM] Fix assembly comment syntax in -mprint-tune-info

2017-02-13 Thread Richard Earnshaw (lists)

On 07/02/17 14:04, Kyrill Tkachov wrote:
> Hi all,
> 
> Currently, -mprint-tune-info gives an assembly file that cannot be
> assembled because the branch cost values
> are not properly commented. For example:
> @.tune parameters
> @constant_limit:1
> @max_insns_skipped: 5
> @prefetch.num_slots:0
> @prefetch.l1_cache_size:-1
> @prefetch.l1_cache_line_size:   -1
> @prefer_constant_pool:  0
> @branch_cost:   (s:speed, p:predictable)
> s cost
> 00  1
> 01  1
> 10  4
> 11  4
> @prefer_ldrd_strd:  0
> @logical_op_non_short_circuit:  [1,1]
> @prefer_neon_for_64bits:0
> @disparage_flag_setting_t16_encodings:  0
> @string_ops_prefer_neon:1
> @max_insns_inline_memset:   8
> @fusible_ops:   3
> @sched_autopref:0
> 
> 
> This patch fixes it in the obvious way by adding the comment character
> on the branch costs printout:
> @.tune parameters
> @constant_limit:1
> @max_insns_skipped: 5
> @prefetch.num_slots:0
> @prefetch.l1_cache_size:-1
> @prefetch.l1_cache_line_size:   -1
> @prefer_constant_pool:  0
> @branch_cost:   (s:speed, p:predictable)
> @   s cost
> @   00  1
> @   01  1
> @   10  4
> @   11  4
> @prefer_ldrd_strd:  0
> @logical_op_non_short_circuit:  [1,1]
> @prefer_neon_for_64bits:0
> @disparage_flag_setting_t16_encodings:  0
> @string_ops_prefer_neon:1
> @max_insns_inline_memset:   8
> @fusible_ops:   3
> @sched_autopref:0
> 
> This assembles properly. Also, for pedantry's sake I've replaced the
> explicit '@' comment character
> with ASM_COMMENT_START, which I believe is the technically right thing
> to do.
> 
> From my testing this has been broken since GCC 5 when the option was
> introduced, so I guess this isn't a regression fix.
> However, it is of minimal impact and quite obvious.
> 
> So ok for trunk?
> 
> Bootstrapped and tested on arm-none-linux-gnueabihf.
> 
> Thanks,
> Kyrill
> 
> 
> 2016-02-07  Kyrylo Tkachov  
> 
> * config/arm/arm.c (arm_print_tune_info): Use ASM_COMMENT_START instead
> of explicit '@'.  Add missing assembly comment marker on branch costs
> printout.
> 
> arm-print-tune.patch
> 

OK.

R.

> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index a5ca53d16bc756704c92f9d9de022b4fa6147fed..b7f7179d99ff211e6be518fdbbc4bdff312d6a07 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -25955,46 +25955,55 @@ arm_emit_eabi_attribute (const char *name, int num, int val)
>  void
>  arm_print_tune_info (void)
>  {
> -  asm_fprintf (asm_out_file, "\t@.tune parameters\n");
> -  asm_fprintf (asm_out_file, "\t\t@constant_limit:\t%d\n",
> +  asm_fprintf (asm_out_file, "\t" ASM_COMMENT_START ".tune parameters\n");
> +  asm_fprintf (asm_out_file, "\t\t" ASM_COMMENT_START 
> "constant_limit:\t%d\n",
>  current_tune->constant_limit);
> -  asm_fprintf (asm_out_file, "\t\t@max_insns_skipped:\t%d\n",
> -current_tune->max_insns_skipped);
> -  asm_fprintf (asm_out_file, "\t\t@prefetch.num_slots:\t%d\n",
> -current_tune->prefetch.num_slots);
> -  asm_fprintf (asm_out_file, "\t\t@prefetch.l1_cache_size:\t%d\n",
> +  asm_fprintf (asm_out_file, "\t\t" ASM_COMMENT_START
> +"max_insns_skipped:\t%d\n", current_tune->max_insns_skipped);
> +  asm_fprintf (asm_out_file, "\t\t" ASM_COMMENT_START
> +"prefetch.num_slots:\t%d\n", current_tune->prefetch.num_slots);
> +  asm_fprintf (asm_out_file, "\t\t" ASM_COMMENT_START
> +"prefetch.l1_cache_size:\t%d\n",
>  current_tune->prefetch.l1_cache_size);
> -  asm_fprintf (asm_out_file, "\t\t@prefetch.l1_cache_line_size:\t%d\n",
> +  asm_fprintf (asm_out_file, "\t\t" ASM_COMMENT_START
> +"prefetch.l1_cache_line_size:\t%d\n",
>  current_tune->prefetch.l1_cache_line_size);
> -  asm_fprintf (asm_out_file, "\t\t@prefer_constant_pool:\t%d\n",
> +  asm_fprintf (asm_out_file, "\t\t" ASM_COMMENT_START
> +"prefer_constant_pool:\t%d\n",
>  (int) current_tune->prefer_constant_pool);
> -  asm_fprintf (asm_out_file, "\t\t@branch_cost:\t(s:speed, 
>

Re: C++ Modules branch

2017-02-13 Thread Nathan Sidwell

On 02/12/2017 05:58 AM, Gerald Pfeifer wrote:

On Mon, 6 Feb 2017, Nathan Sidwell wrote:

Are you planning to add this to svn.html

Ah, thanks for the reminder.

And here's the patch I committed to document the branch.

nathan

--
Nathan Sidwell
? htdocs/.#svn.html.1.208
Index: htdocs/svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.212
diff -r1.212 svn.html
518a519,522
>   https://gcc.gnu.org/wiki/cxx-modules;>c++-modules
>   This branch is for development of a C++ modules system.  It is
>   maintained by Nathan Sidwell.
>

Re: [www patch] sort branches

2017-02-13 Thread Nathan Sidwell

On 02/13/2017 09:34 AM, Nathan Sidwell wrote:

I've applied this patch to sort the other branch lists in svn.html.  I
also added an index and split the inactive branch list into merged and
plain inactive.

I think these </p> tags are needed.  At least that's my interpretation of 
the validator error.

nathan

--
Nathan Sidwell
? htdocs/.#svn.html.1.208
Index: htdocs/svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.211
diff -r1.211 svn.html
709c709
< These development have been merged to gcc mainline, and thus inactive.
---
> These development have been merged to gcc mainline, and thus inactive.
829c829
< merged.
---
> merged.

[www patch] sort branches

2017-02-13 Thread Nathan Sidwell

I've applied this patch to sort the other branch lists in svn.html.  I 
also added an index and split the inactive branch list into merged and 
plain inactive.


Attention branch maintainers:  Please check whether I've incorrectly 
put a merged branch on the inactive list.


nathan
--
Nathan Sidwell
? htdocs/.#svn.html.1.208
Index: htdocs/svn.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/svn.html,v
retrieving revision 1.210
diff -r1.210 svn.html
150c150,161
< General Infrastructure
---
> 
> General
> Architecture
> Target
> Language
> Distribution
> Merged
> Inactive
> 	  
> 
>   
> General Infrastructure
371c382
< Architecture-specific
---
> Architecture-specific
374,381c385,388
<   avx-512vlbwdq
<   The goal of this branch is to implement the Intel AVX-512{VL,BW,DQ}
<   Programming Reference
<   (https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf;>link).
<   The branch is maintained by Yukhin Kirill kirill.yuk...@intel.com.
<   Patches should be marked with the tag [AVX512] in the subject
<   line.
---
>   arc-20081210-branch
>   The goal of this branch is to make the port to the ARCompact
>   architecture available.  This branch is maintained by Joern Rennecke
>   during spring 2009, and is expected to be unmaintained thereafter.
391,392c398,400
<   mpx
<   The goal of this branch is to support Intel MPX technology
---
>   avx-512vlbwdq
>   The goal of this branch is to implement the Intel AVX-512{VL,BW,DQ}
>   Programming Reference
394,396c402,404
<   The branch is maintained by
<   Ilya Enkovich ilya.enkov...@intel.com
<   Patches should be marked with the tag [MPX] in the subject
---
>   The branch is maintained by Yukhin Kirillhref="mailto:kirill.yuk...@intel.com">kirill.yuk...@intel.com.
>   Patches should be marked with the tag [AVX512] in the subject
399,408d406
<   st/cli-be
<   The goal of the branch is to develop a back end producing CLI binaries,
<   compliant with ECMA-335 specification.
<   This branch was originally maintained by Roberto Costa
<   robsettanta...@gmail.com.
<   Since May 2007, the current maintainers are Andrea Ornstein
<   andrea.ornst...@st.com
<   and Erven Rohou
<   erven.ro...@st.com.
< 
416a415,424
>   cell-4_3-branch
>   The goal of this branch is to add fixes and additional features required
>   for the Cell/B.E. processor (both PPE and SPE) to GCC 4.3.x.  This branch
>   is maintained by Ulrich Weigand.
> 
>   cell-4_4-branch
>   The goal of this branch is to back-port from mainline fixes and additional
>   features required for the Cell/B.E. SPE processor to GCC 4.4.x.  This branch
>   is maintained by Ulrich Weigand.  The branch is merged from gcc-4_4-branch.
> 
450,453c458,462
<   cell-4_3-branch
<   The goal of this branch is to add fixes and additional features required
<   for the Cell/B.E. processor (both PPE and SPE) to GCC 4.3.x.  This branch
<   is maintained by Ulrich Weigand.
---
>   microblaze
>   This branch contains support for the Xilinx MicroBlaze architecture.
>   This branch will be used to update MicroBlaze support from gcc-4.1.2 to
>   to the head.  It is maintained by Michael Eager 
>   ea...@eagercon.com.
455,458c464,480
<   cell-4_4-branch
<   The goal of this branch is to back-port from mainline fixes and additional
<   features required for the Cell/B.E. SPE processor to GCC 4.4.x.  This branch
<   is maintained by Ulrich Weigand.  The branch is merged from gcc-4_4-branch.
---
>   mpx
>   The goal of this branch is to support Intel MPX technology
>   (https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf;>link).
>   The branch is maintained by
>   Ilya Enkovich ilya.enkov...@intel.com
>   Patches should be marked with the tag [MPX] in the subject
>   line.
> 
>   st/cli-be
>   The goal of the branch is to develop a back end producing CLI binaries,
>   compliant with ECMA-335 specification.
>   This branch was originally maintained by Roberto Costa
>   robsettanta...@gmail.com.
>   Since May 2007, the current maintainers are Andrea Ornstein
>   andrea.ornst...@st.com
>   and Erven Rohou
>   erven.ro...@st.com.
466,476d487
<   arc-20081210-branch
<   The goal of this branch is to make the port to the ARCompact
<   architecture available.  This branch is maintained by Joern Rennecke
<   during spring 2009, and is expected to be unmaintained thereafter.
< 
<   microblaze
<   This branch contains support for the Xilinx MicroBlaze architecture.
<   This branch will be used to update MicroBlaze support from gcc-4.1.2 to
<   to the head.  It is maintained by Michael Eager 
<   ea...@eagercon.com.
< 
484c495
< Target-specific
---
> Target-specific
499c510
< Language-specific
---
> Language-specific
501a513,526
>   c++-concepts
>   This is the sandbox for renewed work on concepts for C++.
>   It was originally created by Gabriel

Re: [wwwdocs] Added /gcc-7/porting_to.html

2017-02-13 Thread Jonathan Wakely


On 13/02/17 21:40 +0800, Tim Song wrote:

On Tue, Jan 31, 2017 at 1:54 AM, Jonathan Wakely  wrote:

 after including unrelated headers such as <memory>, <futex>, <mutex>, and <regex>



 <futex> or <future>?


Thanks, I think my fingers got confused by "mutex" and "regex" and
started ending everything with -ex.

Fixed with this patch, committed to wwwdocs.


Index: htdocs/gcc-7/porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/porting_to.html,v
retrieving revision 1.7
diff -u -r1.7 porting_to.html
--- htdocs/gcc-7/porting_to.html	13 Feb 2017 10:21:30 -	1.7
+++ htdocs/gcc-7/porting_to.html	13 Feb 2017 13:43:06 -
@@ -131,7 +131,7 @@
 Previously components such as std::bind
 and std::function were implicitly defined after including
 unrelated headers such as memory,
-futex, mutex, and
+future, mutex, and
 regex.
 Correct code should #include functional to define them.
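As a hedged illustration of the porting note in this patch (a sketch, not text from the GCC documentation), code that uses std::bind and std::function can include <functional> directly instead of relying on another header to pull it in. The names add and make_adder are hypothetical:

```cpp
// Include <functional> explicitly instead of relying on <memory>,
// <future>, <mutex>, or <regex> to define std::bind and std::function.
#include <functional>

// Hypothetical helper used only for this illustration.
int add(int a, int b) { return a + b; }

// Returns a callable that adds n to its argument.
std::function<int(int)> make_adder(int n)
{
  return std::bind(add, n, std::placeholders::_1);
}
```

With the explicit include, the code does not depend on which other standard headers happen to include <functional> internally.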

Re: PR79478

2017-02-13 Thread Richard Biener

On 13/02/17 12:57, Prathamesh Kulkarni wrote:
> Hi,
> As mentioned in PR, the attached patch sets source range when parsing ssa-name
> in c_parser_gimple_postfix_expression which avoids uninitialized use.
> Is it OK to commit after bootstrap+test ?

Ok.

Richard.

> Thanks,
> Prathamesh
>

Re: [wwwdocs] Added /gcc-7/porting_to.html

2017-02-13 Thread Tim Song

On Tue, Jan 31, 2017 at 1:54 AM, Jonathan Wakely  wrote:
>  after including unrelated headers such as <memory>, <futex>, <mutex>, and <regex>
> 

 <futex> or <future>?

Re: [libstdc++,doc] Standardize references to Boost documentation

2017-02-13 Thread Jonathan Wakely


On 12/02/17 15:47 +0100, Gerald Pfeifer wrote:

It appears we have been using various ways to refer to bits of Boost
documentation, and I suggest to standardize this a bit per the patch
below.

http://www.boost.org/libs/ seems to be the shortest and
simplest form doing so.

Thoughts?


Works for me.

[PATCH] PR libstdc++/79486 use lvalues in result_of expressions

2017-02-13 Thread Jonathan Wakely


This is a similar bug to the is_callable assertions I fixed in
shared_ptr the other day: it needs to use the correct value category
in the result_of type. This is also a regression, in code that has
been refactored on trunk.

PR libstdc++/79486
* include/std/future (__future_base::_Task_state::_M_run)
(__future_base::_Task_state::_M_run_delayed): Use lvalue types in
result_of expressions.
* testsuite/30_threads/packaged_task/79486.cc: New.

Tested powerpc64le-linux, committed to trunk.
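To illustrate why the value category matters here, a minimal sketch follows. It assumes a functor like the one in the new test, callable only as an lvalue; the trait callable_as is hypothetical and is not part of libstdc++. result_of<_Fn(_Args...)> asks about calling an rvalue _Fn, while result_of<_Fn&(_Args...)> asks about calling an lvalue, which is what _M_run actually does with _M_impl._M_fn.

```cpp
#include <type_traits>
#include <utility>

// Mirrors the functor in the new test: lvalue calls are fine,
// rvalue calls select a deleted overload.
struct F {
  void operator()() & { }
  void operator()() && = delete;
};

// Hypothetical detection trait: true when an object of type Fn
// (with Fn's value category) can be called with no arguments.
template<typename Fn, typename = void>
struct callable_as : std::false_type { };

template<typename Fn>
struct callable_as<Fn, decltype(void(std::declval<Fn>()()))>
  : std::true_type { };

static_assert(callable_as<F&>::value,
              "calling an lvalue F is well-formed");
static_assert(!callable_as<F>::value,
              "calling an rvalue F selects the deleted overload");
```

The second assertion corresponds to the old result_of<_Fn(_Args&&...)> form, which is why the return type could not be computed for such functors.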


commit 10639bc87b988d5cdac07fc905fe4d8c800a6283
Author: Jonathan Wakely 
Date:   Mon Feb 13 12:51:51 2017 +

PR libstdc++/79486 use lvalues in result_of expressions

PR libstdc++/79486
* include/std/future (__future_base::_Task_state::_M_run)
(__future_base::_Task_state::_M_run_delayed): Use lvalue types in
result_of expressions.
* testsuite/30_threads/packaged_task/79486.cc: New.

diff --git a/libstdc++-v3/include/std/future b/libstdc++-v3/include/std/future
index 6351d7e..cb53561 100644
--- a/libstdc++-v3/include/std/future
+++ b/libstdc++-v3/include/std/future
@@ -1416,7 +1416,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   virtual void
   _M_run(_Args&&... __args)
   {
-   auto __boundfn = [&] () -> typename result_of<_Fn(_Args&&...)>::type {
+   auto __boundfn = [&] () -> typename result_of<_Fn&(_Args&&...)>::type {
return std::__invoke(_M_impl._M_fn, std::forward<_Args>(__args)...);
};
this->_M_set_result(_S_task_setter(this->_M_result, __boundfn));
@@ -1425,7 +1425,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   virtual void
   _M_run_delayed(_Args&&... __args, weak_ptr<_State_base> __self)
   {
-   auto __boundfn = [&] () -> typename result_of<_Fn(_Args&&...)>::type {
+   auto __boundfn = [&] () -> typename result_of<_Fn&(_Args&&...)>::type {
return std::__invoke(_M_impl._M_fn, std::forward<_Args>(__args)...);
};
this->_M_set_delayed_result(_S_task_setter(this->_M_result, __boundfn),
diff --git a/libstdc++-v3/testsuite/30_threads/packaged_task/79486.cc 
b/libstdc++-v3/testsuite/30_threads/packaged_task/79486.cc
new file mode 100644
index 000..46b4f3d
--- /dev/null
+++ b/libstdc++-v3/testsuite/30_threads/packaged_task/79486.cc
@@ -0,0 +1,27 @@
+// Copyright (C) 2017 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// { dg-do compile { target c++11 } }
+
+#include <future>
+
+struct F {
+  void operator()() & { }
+  void operator()() && = delete; // PR libstdc++/79486
+};
+
+std::packaged_task<void()> t{F{}};

Re: [PATCH] Replace XALLOCAVEC with XCNEWVEC (PR c/79471).

2017-02-13 Thread Bernd Schmidt


On 02/13/2017 02:06 PM, Martin Liška wrote:

On 02/13/2017 01:58 PM, Bernd Schmidt wrote:

On 02/13/2017 11:15 AM, Martin Liška wrote:

In order to not cause a stack overflow, let's use a vector allocated on the heap
instead of the one created by XALLOCAVEC.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.


Ok. I'm surprised this is marked as a regression, but it's simple enough that 
it probably ought to be fixed in any case.


Yep. I've just removed that. Is it also fine for both active branches after it 
finishes regression tests?


Sure.


Bernd

Re: [PATCH] Replace XALLOCAVEC with XCNEWVEC (PR c/79471).

2017-02-13 Thread Martin Liška

On 02/13/2017 01:58 PM, Bernd Schmidt wrote:
> On 02/13/2017 11:15 AM, Martin Liška wrote:
>> In order to not cause a stack overflow, let's use a vector allocated on the heap
>> instead of the one created by XALLOCAVEC.
>>
>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ok. I'm surprised this is marked as a regression, but it's simple enough that 
> it probably ought to be fixed in any case.

Yep. I've just removed that. Is it also fine for both active branches after it 
finishes regression tests?

Thanks,
Martin

> 
> 
> Bernd

Re: [PATCH] Invalidate combiner's cached last value upon insn removal (PR rtl-optimization/79388, PR rtl-optimization/79450)

2017-02-13 Thread Bernd Schmidt


On 02/10/2017 08:50 PM, Jakub Jelinek wrote:


2017-02-10  Jakub Jelinek  

PR rtl-optimization/79388
PR rtl-optimization/79450
* combine.c (distribute_notes): When removing TEM_INSN for which
corresponding dest has last value recorded, invalidate that last
value.

* gcc.c-torture/execute/pr79388.c: New test.
* gcc.c-torture/execute/pr79450.c: New test.


Ok.


Bernd

Re: [PATCH][GRAPHITE] Remove support for ISL 0.14

2017-02-13 Thread Martin Liška

On 02/11/2017 08:24 AM, Richard Biener wrote:
> On February 11, 2017 12:38:32 AM GMT+01:00, Jakub Jelinek  
> wrote:
>> On Fri, Feb 10, 2017 at 04:34:30PM -0700, Jeff Law wrote:
 2017-02-10  Richard Biener  

config/
* isl.m4: Remove support for ISL 0.14.

* configure: Re-generate.

gcc/
* configure.ac (HAVE_ISL_OPTIONS_SET_SCHEDULE_SERIALIZE_SCCS):
Remove.
* configure: Re-generate.
* config.in: Likewise.
* graphite-dependences.c: Simplify as if
HAVE_ISL_OPTIONS_SET_SCHEDULE_SERIALIZE_SCCS was defined.
* graphite-isl-ast-to-gimple.c: Likewise.
* graphite-optimize-isl.c: Likewise.
* graphite-poly.c: Likewise.
* graphite-sese-to-poly.c: Likewise.
* graphite.h: Likewise.
* toplev.c: Include isl/version.h and use isl_version () for
printing the ISL version.
* doc/install.texi: Update ISL requirement.
>>> My concern here would be that distributions may still be shipping
>> isl-0.14.
>>> Fedora 25 (for example) still uses isl-0.14.
>>
>> isl isn't a mandatory requirement for building gcc, worst case you just
>> don't have graphite support.  Fedora 26 already has isl-0.16.1.
> 
> Yeah.  Note that even with ISL 0.15 there are known bugs.  I just hope that 
> w/o the legacy code maintenance would be easier.

And there are at least 2 bugs (PR79471 and PR69675) which ICE with ISL 0.16 and 
are fixed (well, no transformation is done)
in 0.18. But maybe it just hides the ICE instead of actually avoiding the 
introduction of PHI node uses before a definition.

Martin

> 
> Richard.
> 
>>  Jakub
>

Re: [PATCH] Replace XALLOCAVEC with XCNEWVEC (PR c/79471).

2017-02-13 Thread Bernd Schmidt


On 02/13/2017 11:15 AM, Martin Liška wrote:

In order to not cause a stack overflow, let's use a vector allocated on the heap
instead of the one created by XALLOCAVEC.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.


Ok. I'm surprised this is marked as a regression, but it's simple enough 
that it probably ought to be fixed in any case.



Bernd

Re: [PATCH] Update baseline_symbols.txt for {x86_64,i?86,aarch64,powerpc64,s390{,x}}-linux (PR libstdc++/79348)

2017-02-13 Thread Jonathan Wakely


On 13/02/17 13:09 +0100, Jakub Jelinek wrote:

Hi!

This patch updates baseline_symbols.txt mostly from our latest rpm builds,
x86_64 and i686 have been also compared to my local bootstrap abi lists
and s390 (31-bit) comes from my bootstrap (and 64-bit s390x compared against
the rpm builds too).
s390{,x}-linux weren't updated the last time (for GCC 6), so it now even
fails the abi check without this patch, the rest has been.
s390x 64-bit also is identical to powerpc64 64-bit (which is right, both
have the same mangling of important types and also the long double situation
is the same (well, powerpc64 also has the even newer long double, but I
think we don't instantiate in libstdc++ anything for that yet)).

Ok for trunk?


OK, thanks.


The only weird thing I've noticed in the changes is that
_ZNSbIwSt11char_traitsIwESaIwEEC1ERKS2_mRKS1_
_ZNSbIwSt11char_traitsIwESaIwEEC2ERKS2_mRKS1_
is added on all arches but i?86, any explanation for that?


Huh, no. That symbol comes from the instantiations in
src/c++11/cow-wstring-inst.cc but I don't know why it would be
different for i?86.
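For readers who do not read mangled names: assuming the usual Itanium C++ ABI demangling, the two symbols discussed above are the C1/C2 variants of the basic_string<wchar_t> constructor taking (const basic_string&, size_type, const allocator&), with size_type mangled as `m` (unsigned long). A minimal sketch of a call that selects that overload follows. The function tail_from is hypothetical, and with the default (new) string ABI the call would bind to the corresponding __cxx11 symbol instead of the one quoted in this thread.

```cpp
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical helper: copy the tail of SRC starting at POS, passing
// an allocator explicitly so that the (str, pos, alloc) constructor
// overload is the one selected.
std::wstring tail_from(const std::wstring& src, std::size_t pos)
{
  std::allocator<wchar_t> alloc;
  return std::wstring(src, pos, alloc);
}
```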

[PATCH] Update baseline_symbols.txt for {x86_64,i?86,aarch64,powerpc64,s390{,x}}-linux (PR libstdc++/79348)

2017-02-13 Thread Jakub Jelinek

Hi!

This patch updates baseline_symbols.txt mostly from our latest rpm builds,
x86_64 and i686 have been also compared to my local bootstrap abi lists
and s390 (31-bit) comes from my bootstrap (and 64-bit s390x compared against
the rpm builds too).
s390{,x}-linux weren't updated the last time (for GCC 6), so it now even
fails the abi check without this patch, the rest has been.
s390x 64-bit also is identical to powerpc64 64-bit (which is right, both
have the same mangling of important types and also the long double situation
is the same (well, powerpc64 also has the even newer long double, but I
think we don't instantiate in libstdc++ anything for that yet)).

Ok for trunk?

The only weird thing I've noticed in the changes is that
_ZNSbIwSt11char_traitsIwESaIwEEC1ERKS2_mRKS1_
_ZNSbIwSt11char_traitsIwESaIwEEC2ERKS2_mRKS1_
is added on all arches but i?86, any explanation for that?

2017-02-13  Jakub Jelinek  

PR libstdc++/79348
* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/x86_64-linux-gnu/32/baseline_symbols.txt: Likewise.
* config/abi/post/i386-linux-gnu/baseline_symbols.txt: Likewise.
* config/abi/post/i486-linux-gnu/baseline_symbols.txt: Likewise.
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Likewise.
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Likewise.
* config/abi/post/s390x-linux-gnu/32/baseline_symbols.txt: Likewise.
* config/abi/post/s390-linux-gnu/baseline_symbols.txt: Likewise.
* config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Likewise.

--- libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt.jj   
2016-08-06 12:12:32.0 +0200
+++ libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt  
2017-02-13 12:27:01.721264985 +0100
@@ -1329,6 +1329,7 @@ FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC1EP
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC1EPKwmRKS1_@@GLIBCXX_3.4
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC1ERKS1_@@GLIBCXX_3.4
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC1ERKS2_@@GLIBCXX_3.4
+FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC1ERKS2_mRKS1_@@GLIBCXX_3.4.23
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC1ERKS2_mm@@GLIBCXX_3.4
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC1ERKS2_mmRKS1_@@GLIBCXX_3.4
 
FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC1ESt16initializer_listIwERKS1_@@GLIBCXX_3.4.11
@@ -1342,6 +1343,7 @@ FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC2EP
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC2EPKwmRKS1_@@GLIBCXX_3.4
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC2ERKS1_@@GLIBCXX_3.4
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC2ERKS2_@@GLIBCXX_3.4
+FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC2ERKS2_mRKS1_@@GLIBCXX_3.4.23
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC2ERKS2_mm@@GLIBCXX_3.4
 FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC2ERKS2_mmRKS1_@@GLIBCXX_3.4
 
FUNC:_ZNSbIwSt11char_traitsIwESaIwEEC2ESt16initializer_listIwERKS1_@@GLIBCXX_3.4.11
@@ -1580,6 +1582,7 @@ FUNC:_ZNSsC1EPKcRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC1EPKcmRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC1ERKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC1ERKSs@@GLIBCXX_3.4
+FUNC:_ZNSsC1ERKSsmRKSaIcE@@GLIBCXX_3.4.23
 FUNC:_ZNSsC1ERKSsmm@@GLIBCXX_3.4
 FUNC:_ZNSsC1ERKSsmmRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC1ESt16initializer_listIcERKSaIcE@@GLIBCXX_3.4.11
@@ -1593,6 +1596,7 @@ FUNC:_ZNSsC2EPKcRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC2EPKcmRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC2ERKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC2ERKSs@@GLIBCXX_3.4
+FUNC:_ZNSsC2ERKSsmRKSaIcE@@GLIBCXX_3.4.23
 FUNC:_ZNSsC2ERKSsmm@@GLIBCXX_3.4
 FUNC:_ZNSsC2ERKSsmmRKSaIcE@@GLIBCXX_3.4
 FUNC:_ZNSsC2ESt16initializer_listIcERKSaIcE@@GLIBCXX_3.4.11
@@ -2232,6 +2236,7 @@ FUNC:_ZNSt15_List_node_base8transferEPS_
 FUNC:_ZNSt15_List_node_base9_M_unhookEv@@GLIBCXX_3.4.14
 FUNC:_ZNSt15__exception_ptr13exception_ptr4swapERS0_@@CXXABI_1.3.3
 FUNC:_ZNSt15__exception_ptr13exception_ptrC1EMS0_FvvE@@CXXABI_1.3.3
+FUNC:_ZNSt15__exception_ptr13exception_ptrC1EPv@@CXXABI_1.3.11
 FUNC:_ZNSt15__exception_ptr13exception_ptrC1ERKS0_@@CXXABI_1.3.3
 FUNC:_ZNSt15__exception_ptr13exception_ptrC1Ev@@CXXABI_1.3.3
 FUNC:_ZNSt15__exception_ptr13exception_ptrC2EMS0_FvvE@@CXXABI_1.3.3
@@ -2777,7 +2782,9 @@ FUNC:_ZNSt7__cxx1112basic_stringIcSt11ch
 
FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE10_M_replaceEmmPKcm@@GLIBCXX_3.4.21
 
FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE10_S_compareEmm@@GLIBCXX_3.4.21
 
FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE11_M_capacityEm@@GLIBCXX_3.4.21
+FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC1EPcOS3_@@GLIBCXX_3.4.23
 
FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC1EPcRKS3_@@GLIBCXX_3.4.21
+FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC2EPcOS3_@@GLIBCXX_3.4.23
 
FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_Alloc_hiderC2EPcRKS3_@@GLIBCXX_3.4.21
 
FUNC:_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE12_M_constructEmc@@GLIBCXX_3.4.21

PR79478

2017-02-13 Thread Prathamesh Kulkarni

Hi,
As mentioned in PR, the attached patch sets source range when parsing ssa-name
in c_parser_gimple_postfix_expression which avoids uninitialized use.
Is it OK to commit after bootstrap+test ?

Thanks,
Prathamesh
2017-02-13  Prathamesh Kulkarni  

PR c/79478
* gimple-parser.c (c_parser_gimple_postfix_expression): Call
set_c_expr_source_range when parsing ssa-name.

diff --git a/gcc/c/gimple-parser.c b/gcc/c/gimple-parser.c
index 681951c..db7e407 100644
--- a/gcc/c/gimple-parser.c
+++ b/gcc/c/gimple-parser.c
@@ -863,6 +863,7 @@ c_parser_gimple_postfix_expression (c_parser *parser)
  c_parser_consume_token (parser);
  expr.value = c_parser_parse_ssa_name (parser, id, NULL_TREE,
version, ver_offset);
+ set_c_expr_source_range (&expr, tok_range);
  /* For default definition SSA names.  */
  if (c_parser_next_token_is (parser, CPP_OPEN_PAREN)
  && c_parser_peek_2nd_token (parser)->type == CPP_NAME

Re: [PATCH] Improve x % y to x VRP optimization (PR tree-optimization/79408)

2017-02-13 Thread Marc Glisse


On Mon, 13 Feb 2017, Jakub Jelinek wrote:


On Mon, Feb 13, 2017 at 09:59:12AM +0100, Richard Biener wrote:

On Sun, 12 Feb 2017, Marc Glisse wrote:


On Sun, 12 Feb 2017, Marc Glisse wrote:


On Tue, 7 Feb 2017, Jakub Jelinek wrote:


* tree-vrp.c (simplify_div_or_mod_using_ranges): If op1 is not
constant, but SSA_NAME with a known integer range, use the minimum
of that range instead of op1 to determine if modulo can be replaced
with its first operand.


Would it make sense to use something like the operand_less_p helper so we
would also handle an INTEGER_CST lhs?


Oops, operand_less_p is just a helper for compare_values_warnv and even that
one doesn't seem to use ranges, so there may not already be a nice function we
can call (?). The idea remains that reusing such code would help handle more
cases (it may even handle a few symbolic cases).


Yeah, we only have the compare_range_with_value / compare_ranges functions
and those require you to "expand" the value you have a range for first.


Nice, I was looking for them but I'd missed those functions.


I think we can do something like following, but not sure how you want to
actually simplify it using helpers.  The only thing I can think of is
something like get_value_range that works on both SSA_NAMEs and
INTEGER_CSTs, but then it either has to allocate a new value range struct
for the INTEGER_CST case, or be more like extract_range_from_tree and
then it would copy the range for the SSA_NAME case, penalizing the code.


If allocating a new range for INTEGER_CST is too expensive (I wouldn't 
expect it to matter, but I never profiled gcc), we could have it on the 
stack:


value_range tmp;
value_range* vr0;
if(TREE_CODE(op0)==INTEGER_CST){
  tmp.max=tmp.min=op0;
  tmp.type=VR_RANGE;
  vr0=&tmp;
} else {
  vr0=get_value_range(op0);
}
// same for op1
if(TYPE_UNSIGNED(TREE_TYPE(op0))&&compare_ranges(LT_EXPR,vr0,vr1,))
  // op0 % op1 is just op0
etc.

That's not so different from what you did, but compare_ranges doesn't 
dismiss symbolic ranges (I am not sure there are any cases where that 
would help here), and it feels like it is duplicating less code. On the 
other hand I haven't tried to write the signed case, which may be uglier 
with this version since we would have to generate a range for -op1, so 
maybe your version is better?
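The arithmetic behind the simplification can be sketched independently of GCC's internals. The helper below is hypothetical (mod_is_identity is not a GCC function) and works on plain 64-bit bounds instead of trees. It mirrors the condition in Jakub's patch: if op1 is known to be at least op1min > 0 and every value of op0 lies strictly between -op1min and op1min, then op0 % op1 is just op0 under C's truncating modulo.

```cpp
#include <cstdint>

// Hypothetical model of the range test: [op0min, op0max] bounds the
// dividend, op1min is the smallest possible value of the divisor.
// Returns true when op0 % op1 == op0 for every value in those ranges.
bool mod_is_identity(std::int64_t op0min, std::int64_t op0max,
                     std::int64_t op1min)
{
  // The divisor must be known to be positive.
  if (op1min <= 0)
    return false;
  // Every dividend must be smaller than every divisor.
  if (op0max >= op1min)
    return false;
  // Non-negative dividends are fine; negative ones must stay
  // strictly above -op1min.
  return op0min >= 0 || -op1min < op0min;
}
```

For the new test case, the dividend is the constant 7312 and the divisor is known to exceed 7312, so mod_is_identity(7312, 7312, 7313) holds and the comparison against 7312 can be folded.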


(thanks for the patch)

--
Marc Glisse

Re: [PATCH] Improve x % y to x VRP optimization (PR tree-optimization/79408)

2017-02-13 Thread Richard Biener

On Mon, 13 Feb 2017, Jakub Jelinek wrote:

> On Mon, Feb 13, 2017 at 09:59:12AM +0100, Richard Biener wrote:
> > On Sun, 12 Feb 2017, Marc Glisse wrote:
> > 
> > > On Sun, 12 Feb 2017, Marc Glisse wrote:
> > > 
> > > > On Tue, 7 Feb 2017, Jakub Jelinek wrote:
> > > > 
> > > > >   * tree-vrp.c (simplify_div_or_mod_using_ranges): If op1 is not
> > > > >   constant, but SSA_NAME with a known integer range, use the 
> > > > > minimum
> > > > >   of that range instead of op1 to determine if modulo can be 
> > > > > replaced
> > > > >   with its first operand.
> > > > 
> > > > Would it make sense to use something like the operand_less_p helper so 
> > > > we
> > > > would also handle an INTEGER_CST lhs?
> > > 
> > > Oops, operand_less_p is just a helper for compare_values_warnv and even 
> > > that
> > > one doesn't seem to use ranges, so there may not already be a nice 
> > > function we
> > > can call (?). The idea remains that reusing such code would help handle 
> > > more
> > > cases (it may even handle a few symbolic cases).
> > 
> > Yeah, we only have the compare_range_with_value / compare_ranges functions
> > and those require you to "expand" the value you have a range for first.
> 
> I think we can do something like following, but not sure how you want to
> actually simplify it using helpers.  The only thing I can think of is
> something like get_value_range that works on both SSA_NAMEs and
> INTEGER_CSTs, but then it either has to allocate a new value range struct
> for the INTEGER_CST case, or be more like extract_range_from_tree and
> then it would copy the range for the SSA_NAME case, penalizing the code.

You'd of course allocate it on the stack.  But yeah, sth like your patch
works for me.

Richard.

> 2017-02-13  Jakub Jelinek  
> 
>   PR tree-optimization/79408
>   * tree-vrp.c (simplify_div_or_mod_using_ranges): Handle also the
>   case when on TRUNC_MOD_EXPR op0 is INTEGER_CST.
>   (simplify_stmt_using_ranges): Call simplify_div_or_mod_using_ranges
>   also if rhs1 is INTEGER_CST.
> 
>   * gcc.dg/tree-ssa/pr79408-2.c: New test.
> 
> --- gcc/tree-vrp.c.jj 2017-02-08 10:21:31.0 +0100
> +++ gcc/tree-vrp.c2017-02-13 10:55:00.755070795 +0100
> @@ -9241,8 +9241,24 @@ simplify_div_or_mod_using_ranges (gimple
>tree val = NULL;
>tree op0 = gimple_assign_rhs1 (stmt);
>tree op1 = gimple_assign_rhs2 (stmt);
> +  tree op0min = NULL_TREE, op0max = NULL_TREE;
>tree op1min = op1;
> -  value_range *vr = get_value_range (op0);
> +  value_range *vr = NULL;
> +
> +  if (TREE_CODE (op0) == INTEGER_CST)
> +{
> +  op0min = op0;
> +  op0max = op0;
> +}
> +  else
> +{
> +  vr = get_value_range (op0);
> +  if (range_int_cst_p (vr))
> + {
> +   op0min = vr->min;
> +   op0max = vr->max;
> + }
> +}
>  
>if (rhs_code == TRUNC_MOD_EXPR
>&& TREE_CODE (op1) == SSA_NAME)
> @@ -9254,13 +9270,13 @@ simplify_div_or_mod_using_ranges (gimple
>if (rhs_code == TRUNC_MOD_EXPR
>&& TREE_CODE (op1min) == INTEGER_CST
>&& tree_int_cst_sgn (op1min) == 1
> -  && range_int_cst_p (vr)
> -  && tree_int_cst_lt (vr->max, op1min))
> +  && op0max
> +  && tree_int_cst_lt (op0max, op1min))
>  {
>if (TYPE_UNSIGNED (TREE_TYPE (op0))
> -   || tree_int_cst_sgn (vr->min) >= 0
> +   || tree_int_cst_sgn (op0min) >= 0
> || tree_int_cst_lt (fold_unary (NEGATE_EXPR, TREE_TYPE (op1min), 
> op1min),
> -   vr->min))
> +   op0min))
>   {
> /* If op0 already has the range op0 % op1 has,
>then TRUNC_MOD_EXPR won't change anything.  */
> @@ -9269,6 +9285,9 @@ simplify_div_or_mod_using_ranges (gimple
>   }
>  }
>  
> +  if (TREE_CODE (op0) != SSA_NAME)
> +return false;
> +
>if (!integer_pow2p (op1))
>  {
>/* X % -Y can be only optimized into X % Y either if
> @@ -10377,7 +10396,8 @@ simplify_stmt_using_ranges (gimple_stmt_
>range.  */
>   case TRUNC_DIV_EXPR:
>   case TRUNC_MOD_EXPR:
> -   if (TREE_CODE (rhs1) == SSA_NAME
> +   if ((TREE_CODE (rhs1) == SSA_NAME
> +|| TREE_CODE (rhs1) == INTEGER_CST)
> && INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
>   return simplify_div_or_mod_using_ranges (gsi, stmt);
> break;
> --- gcc/testsuite/gcc.dg/tree-ssa/pr79408-2.c.jj  2017-02-13 
> 10:51:13.939063664 +0100
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr79408-2.c 2017-02-13 10:52:33.868008990 
> +0100
> @@ -0,0 +1,34 @@
> +/* PR tree-optimization/79408 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +void link_error (void);
> +
> +void
> +foo (unsigned int y)
> +{
> +  if (y <= 7312)
> +return;
> +  if (7312 % y != 7312)
> +link_error ();
> +}
> +
> +void
> +bar (int x, int y)
> +{
> +  if (y <= 7312)
> +return;
> +  if (7312 % y != 7312)
> +link_error ();
> +}
>

Re: [wwwdocs] Add a case to porting_to + a question wrt validity of another one

2017-02-13 Thread Marek Polacek

On Sun, Feb 12, 2017 at 09:08:42AM +0100, Gerald Pfeifer wrote:
> On Wed, 8 Feb 2017, Marek Polacek wrote:
> > Like this?
> 
>  As a consequence, the following examples are invalid and G++ will no longer
> -compile them:
> +compile them, because, in the following examples, G++ used to treat
> +this->member where member has a non-dependent type, as
> +type-dependent, and now it doesn't.
> 
> This has two instances of "the following examples".  Perhaps omit
> the second instance and break the sentence, putting "G++ used to
> treat..." in parenthesis after the first sentence, or adding this
> explanation after the examples?
 
I had already fixed this...

> Also you'll need to write "-&gt;" instead of "->", and member
> the second time as well (or member which we use in other places
> in changes.html for this kind of usage).

...and this I'm fixing with the following, thanks.

Index: porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-7/porting_to.html,v
retrieving revision 1.6
diff -u -r1.6 porting_to.html
--- porting_to.html 8 Feb 2017 18:48:11 -   1.6
+++ porting_to.html 13 Feb 2017 10:21:05 -
@@ -52,7 +52,7 @@
 
 
 As a consequence, the following examples are invalid and G++ will no longer
-compile them, because G++ used to treat this->member
+compile them, because G++ used to treat this-&gt;member
 where member has a non-dependent type, as type-dependent, and now it doesn't.
 
 

Marek

[PATCH] Replace XALLOCAVEC with XCNEWVEC (PR c/79471).

2017-02-13 Thread Martin Liška

Hello.

In order to not cause a stack overflow, let's use a vector allocated on the heap
instead of the one created by XALLOCAVEC.

Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
From c71fadc2104c729ae5625e06c54239998dd794a5 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Mon, 13 Feb 2017 09:25:36 +0100
Subject: [PATCH] Replace XALLOCAVEC with XCNEWVEC (PR c/79471).

gcc/ChangeLog:

2017-02-13  Martin Liska  

	PR c/79471
	* calls.c (expand_call): Replace XALLOCAVEC with XCNEWVEC.
---
 gcc/calls.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/calls.c b/gcc/calls.c
index 7b45b9a111d..6d5ef4e02a0 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -3262,8 +3262,7 @@ expand_call (tree exp, rtx target, int ignore)
 n_named_args = num_actuals;
 
   /* Make a vector to hold all the information about each arg.  */
-  args = XALLOCAVEC (struct arg_data, num_actuals);
-  memset (args, 0, num_actuals * sizeof (struct arg_data));
+  args = XCNEWVEC (struct arg_data, num_actuals);
 
   /* Build up entries in the ARGS array, compute the size of the
  arguments into ARGS_SIZE, etc.  */
@@ -4265,6 +4264,7 @@ expand_call (tree exp, rtx target, int ignore)
   currently_expanding_call--;
 
   free (stack_usage_map_buf);
+  free (args);
 
   /* Join result with returned bounds so caller may use them if needed.  */
   target = chkp_join_splitted_slot (target, valbnd);
-- 
2.11.0
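The pattern in the patch above can be sketched outside GCC. XALLOCAVEC allocates with alloca, so the array lives on the stack and its size is bounded by the stack limit; XCNEWVEC goes through xcalloc, so the array is zeroed heap memory that must be freed. The sketch below uses std::calloc as a stand-in for xcalloc (unlike xcalloc, it can return a null pointer on failure), and arg_record is a hypothetical replacement for struct arg_data.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for GCC's struct arg_data.
struct arg_record {
  int size;
  int partial;
};

// Heap-allocate COUNT zero-initialized records, as XCNEWVEC does via
// xcalloc.  No separate memset is needed.
arg_record *new_zeroed_records(std::size_t count)
{
  return static_cast<arg_record *>(std::calloc(count, sizeof(arg_record)));
}

// Allocate, inspect, and release the records.  Returns true when the
// allocation succeeded and every record started out zeroed.
bool all_records_zeroed(std::size_t count)
{
  arg_record *args = new_zeroed_records(count);
  if (!args)
    return false;
  bool ok = true;
  for (std::size_t i = 0; i < count; ++i)
    if (args[i].size != 0 || args[i].partial != 0)
      ok = false;
  std::free(args);  // the heap version needs an explicit free
  return ok;
}
```

The explicit free is the price of moving off the stack, which is why the patch adds free (args) next to the existing free (stack_usage_map_buf).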

[gomp4] Async related additions to OpenACC runtime library

2017-02-13 Thread Chung-Lin Tang

This patch adds:

// New functions to set/get the current default async queue
void acc_set_default_async (int);
int acc_get_default_async (void);

and _async versions of a few existing API functions:

void acc_copyin_async (void *, size_t, int);
void acc_create_async (void *, size_t, int);
void acc_copyout_async (void *, size_t, int);
void acc_delete_async (void *, size_t, int);
void acc_update_device_async (void *, size_t, int);
void acc_update_self_async (void *, size_t, int);
void acc_memcpy_to_device_async (void *, void *, size_t, int);
void acc_memcpy_from_device_async (void *, void *, size_t, int);

These implement part of the additional requirements for OpenACC 2.5.
Tested and committed to gomp-4_0-branch.
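The default-async getter/setter pair described above keeps one value per thread and rejects arguments below the "sync" sentinel. A self-contained model of that behaviour is sketched below. Everything prefixed sketch_ is hypothetical: the real functions live in libgomp, use struct goacc_thread instead of a thread_local variable, and call gomp_fatal instead of throwing. The sentinel value -2 and the initial value 0 are assumptions made for the sketch.

```cpp
#include <stdexcept>

// Assumed sentinel for "synchronous"; the real constant is
// acc_async_sync from openacc.h.
constexpr int sketch_async_sync = -2;

// Per-thread default queue; the initial value here is arbitrary.
thread_local int sketch_default_async = 0;

// Model of acc_get_default_async: return this thread's default queue.
int sketch_get_default_async()
{
  return sketch_default_async;
}

// Model of acc_set_default_async: validate, then store per thread.
void sketch_set_default_async(int async)
{
  if (async < sketch_async_sync)
    throw std::invalid_argument("invalid async argument");
  sketch_default_async = async;
}
```

Because the value is per thread, one thread changing its default queue does not affect async operations issued from other threads.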

Chung-Lin

2017-02-13  Chung-Lin Tang  

libgomp/
* oacc-async.c (acc_get_default_async): New API function.
(acc_set_default_async): Likewise.
* oacc-init.c ():
* oacc-int.h (struct goacc_thread): Add default_async field.
* oacc-mem.c (memcpy_tofrom_device): New function, combined from
acc_memcpy_to/from_device functions, now with async parameter.
(acc_memcpy_to_device): Modify to use memcpy_tofrom_device.
(acc_memcpy_from_device): Likewise.
(acc_memcpy_to_device_async): New API function.
(acc_memcpy_from_device_async): Likewise.
(present_create_copy): Add async parameter.
(acc_create): Adjust present_create_copy call.
(acc_copyin): Likewise.
(acc_present_or_create): Likewise.
(acc_present_or_copyin): Likewise.
(acc_create_async): New API function.
(acc_copyin_async): New API function.
(delete_copyout): Add async parameter.
(acc_delete): Adjust delete_copyout call.
(acc_copyout): Likewise.
(acc_delete_async): New API function.
(acc_copyout_async): Likewise.
(update_dev_host): Add async parameter.
(acc_update_device): Adjust update_dev_host call.
(acc_update_self): Likewise.
(acc_update_device_async): New API function.
(acc_update_self_async): Likewise.
* oacc-plugin.c (GOMP_PLUGIN_acc_thread_default_async): New function.
* oacc-plugin.h (GOMP_PLUGIN_acc_thread_default_async): Declare.
* openacc.f90 (acc_async_default): Declare.
(acc_set_default_async): Likewise.
(acc_get_default_async): Likewise.
* openacc_lib.h (acc_async_default): Declare.
(acc_set_default_async): Likewise.
(acc_get_default_async): Likewise.
* testsuite/libgomp.oacc-c-c++-common/asyncwait-2.c: New test.
* testsuite/libgomp.oacc-c-c++-common/lib-94.c: New test.
* testsuite/libgomp.oacc-c-c++-common/lib-95.c: New test.
* testsuite/libgomp.oacc-fortran/lib-16.f90: New test.

include/
* gomp-constants.h (GOMP_ASYNC_DEFAULT): Define.
Index: libgomp/oacc-async.c
===
--- libgomp/oacc-async.c (revision 245382)
+++ libgomp/oacc-async.c (working copy)
@@ -105,3 +105,28 @@ acc_wait_all_async (int async)
 
   thr->dev->openacc.async_wait_all_async_func (async);
 }
+
+int
+acc_get_default_async (void)
+{
+  struct goacc_thread *thr = goacc_thread ();
+
+  if (!thr || !thr->dev)
+gomp_fatal ("no device active");
+
+  return thr->default_async;
+}
+
+void
+acc_set_default_async (int async)
+{
+  if (async < acc_async_sync)
+gomp_fatal ("invalid async argument: %d", async);
+
+  struct goacc_thread *thr = goacc_thread ();
+
+  if (!thr || !thr->dev)
+gomp_fatal ("no device active");
+
+  thr->default_async = async;
+}
Index: libgomp/oacc-init.c
===
--- libgomp/oacc-init.c (revision 245382)
+++ libgomp/oacc-init.c (working copy)
@@ -437,6 +437,8 @@ goacc_attach_host_thread_to_device (int ord)
   
   thr->target_tls
 = acc_dev->openacc.create_thread_data_func (ord);
+
+  thr->default_async = acc_async_default;
   
   acc_dev->openacc.async_set_async_func (acc_async_sync);
 }
Index: libgomp/oacc-int.h
===
--- libgomp/oacc-int.h  (revision 245382)
+++ libgomp/oacc-int.h  (working copy)
@@ -73,6 +73,9 @@ struct goacc_thread
 
   /* Target-specific data (used by plugin).  */
   void *target_tls;
+
+  /* Default OpenACC async queue for current thread, exported to plugin.  */
+  int default_async;
 };
 
 #if defined HAVE_TLS || defined USE_EMUTLS
Index: libgomp/oacc-mem.c
===
--- libgomp/oacc-mem.c  (revision 245382)
+++ libgomp/oacc-mem.c  (working copy)
@@ -153,8 +153,9 @@ acc_free (void *d)
 gomp_fatal ("error in freeing device memory in %s", __FUNCTION__);
 }
 
-void
-acc_memcpy_to_device (void *d, void *h, size_t s)
+static void
+memcpy_tofrom_device (bool from, void *d, void *h, size_t s, int async,
+

Re: [PATCH] Improve x % y to x VRP optimization (PR tree-optimization/79408)

2017-02-13 Thread Jakub Jelinek

On Mon, Feb 13, 2017 at 09:59:12AM +0100, Richard Biener wrote:
> On Sun, 12 Feb 2017, Marc Glisse wrote:
> 
> > On Sun, 12 Feb 2017, Marc Glisse wrote:
> > 
> > > On Tue, 7 Feb 2017, Jakub Jelinek wrote:
> > > 
> > > > * tree-vrp.c (simplify_div_or_mod_using_ranges): If op1 is not
> > > > constant, but SSA_NAME with a known integer range, use the minimum
> > > > of that range instead of op1 to determine if modulo can be replaced
> > > > with its first operand.
> > > 
> > > Would it make sense to use something like the operand_less_p helper so we
> > > would also handle an INTEGER_CST lhs?
> > 
> > Oops, operand_less_p is just a helper for compare_values_warnv and even that
> > one doesn't seem to use ranges, so there may not already be a nice function we
> > can call (?). The idea remains that reusing such code would help handle more
> > cases (it may even handle a few symbolic cases).
> 
> Yeah, we only have the compare_range_with_value / compare_ranges functions
> and those require you to "expand" the value you have a range for first.

I think we can do something like the following, but I am not sure how you want to
actually simplify it using helpers.  The only thing I can think of is
something like get_value_range that works on both SSA_NAMEs and
INTEGER_CSTs, but then it either has to allocate a new value range struct
for the INTEGER_CST case, or be more like extract_range_from_tree and
then it would copy the range for the SSA_NAME case, penalizing the code.
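
For readers following the condition the patch checks, here is a stand-alone
sketch on plain integers. It is not GCC code: mod_is_identity and its
parameters are hypothetical names, long long stands in for the INTEGER_CST
bounds, and op0_unsigned stands in for TYPE_UNSIGNED (TREE_TYPE (op0)). A
constant op0 is simply the case op0min == op0max.

```c
#include <assert.h>
#include <stdbool.h>

/* Sketch: may op0 % op1 be replaced by op0, given that op0 lies in
   [op0min, op0max] and op1 is known to be at least op1min?  */
bool
mod_is_identity (long long op0min, long long op0max, long long op1min,
                 bool op0_unsigned)
{
  /* The divisor's lower bound must be strictly positive.  */
  if (op1min <= 0)
    return false;

  /* The largest possible dividend must be below the smallest divisor.  */
  if (!(op0max < op1min))
    return false;

  /* A non-negative dividend below the divisor is its own remainder.  */
  if (op0_unsigned || op0min >= 0)
    return true;

  /* Negative dividends: C's % truncates toward zero, so the result is
     op0 as long as op0 > -op1min.  */
  return -op1min < op0min;
}
```

With y known to be greater than 7312 (so op1min is 7313), both
mod_is_identity (7312, 7312, 7313, false) and
mod_is_identity (-7312, -7312, 7313, false) hold, which corresponds to the bar
and baz functions in the new test below.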

2017-02-13  Jakub Jelinek  

PR tree-optimization/79408
* tree-vrp.c (simplify_div_or_mod_using_ranges): Handle also the
case when on TRUNC_MOD_EXPR op0 is INTEGER_CST.
(simplify_stmt_using_ranges): Call simplify_div_or_mod_using_ranges
also if rhs1 is INTEGER_CST.

* gcc.dg/tree-ssa/pr79408-2.c: New test.

--- gcc/tree-vrp.c.jj   2017-02-08 10:21:31.0 +0100
+++ gcc/tree-vrp.c  2017-02-13 10:55:00.755070795 +0100
@@ -9241,8 +9241,24 @@ simplify_div_or_mod_using_ranges (gimple
   tree val = NULL;
   tree op0 = gimple_assign_rhs1 (stmt);
   tree op1 = gimple_assign_rhs2 (stmt);
+  tree op0min = NULL_TREE, op0max = NULL_TREE;
   tree op1min = op1;
-  value_range *vr = get_value_range (op0);
+  value_range *vr = NULL;
+
+  if (TREE_CODE (op0) == INTEGER_CST)
+{
+  op0min = op0;
+  op0max = op0;
+}
+  else
+{
+  vr = get_value_range (op0);
+  if (range_int_cst_p (vr))
+   {
+ op0min = vr->min;
+ op0max = vr->max;
+   }
+}
 
   if (rhs_code == TRUNC_MOD_EXPR
   && TREE_CODE (op1) == SSA_NAME)
@@ -9254,13 +9270,13 @@ simplify_div_or_mod_using_ranges (gimple
   if (rhs_code == TRUNC_MOD_EXPR
   && TREE_CODE (op1min) == INTEGER_CST
   && tree_int_cst_sgn (op1min) == 1
-  && range_int_cst_p (vr)
-  && tree_int_cst_lt (vr->max, op1min))
+  && op0max
+  && tree_int_cst_lt (op0max, op1min))
 {
   if (TYPE_UNSIGNED (TREE_TYPE (op0))
- || tree_int_cst_sgn (vr->min) >= 0
+ || tree_int_cst_sgn (op0min) >= 0
  || tree_int_cst_lt (fold_unary (NEGATE_EXPR, TREE_TYPE (op1min), op1min),
- vr->min))
+ op0min))
{
  /* If op0 already has the range op0 % op1 has,
 then TRUNC_MOD_EXPR won't change anything.  */
@@ -9269,6 +9285,9 @@ simplify_div_or_mod_using_ranges (gimple
}
 }
 
+  if (TREE_CODE (op0) != SSA_NAME)
+return false;
+
   if (!integer_pow2p (op1))
 {
   /* X % -Y can be only optimized into X % Y either if
@@ -10377,7 +10396,8 @@ simplify_stmt_using_ranges (gimple_stmt_
 range.  */
case TRUNC_DIV_EXPR:
case TRUNC_MOD_EXPR:
- if (TREE_CODE (rhs1) == SSA_NAME
+ if ((TREE_CODE (rhs1) == SSA_NAME
+  || TREE_CODE (rhs1) == INTEGER_CST)
  && INTEGRAL_TYPE_P (TREE_TYPE (rhs1)))
return simplify_div_or_mod_using_ranges (gsi, stmt);
  break;
--- gcc/testsuite/gcc.dg/tree-ssa/pr79408-2.c.jj 2017-02-13 10:51:13.939063664 +0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr79408-2.c   2017-02-13 10:52:33.868008990 +0100
@@ -0,0 +1,34 @@
+/* PR tree-optimization/79408 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+void link_error (void);
+
+void
+foo (unsigned int y)
+{
+  if (y <= 7312)
+return;
+  if (7312 % y != 7312)
+link_error ();
+}
+
+void
+bar (int x, int y)
+{
+  if (y <= 7312)
+return;
+  if (7312 % y != 7312)
+link_error ();
+}
+
+void
+baz (int x, int y)
+{
+  if (y <= 7312)
+return;
+  if (-7312 % y != -7312)
+link_error ();
+}
+
+/* { dg-final { scan-tree-dump-times "link_error" 0 "optimized"} } */


Jakub

Re: [PATCH] Improve x % y to x VRP optimization (PR tree-optimization/79408)

2017-02-13 Thread Richard Biener

On Sun, 12 Feb 2017, Marc Glisse wrote:

> On Sun, 12 Feb 2017, Marc Glisse wrote:
> 
> > On Tue, 7 Feb 2017, Jakub Jelinek wrote:
> > 
> > >   * tree-vrp.c (simplify_div_or_mod_using_ranges): If op1 is not
> > >   constant, but SSA_NAME with a known integer range, use the minimum
> > >   of that range instead of op1 to determine if modulo can be replaced
> > >   with its first operand.
> > 
> > Would it make sense to use something like the operand_less_p helper so we
> > would also handle an INTEGER_CST lhs?
> 
> Oops, operand_less_p is just a helper for compare_values_warnv and even that
> one doesn't seem to use ranges, so there may not already be a nice function we
> can call (?). The idea remains that reusing such code would help handle more
> cases (it may even handle a few symbolic cases).

Yeah, we only have the compare_range_with_value / compare_ranges functions
and those require you to "expand" the value you have a range for first.

Richard.

> > unsigned f(unsigned x){
> >  if(x<42)return 18;
> >  return 33%x;
> > }
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
