Re: [PATCH] Builtin function roundeven folding implementation

2019-08-13 Thread Tejas Joshi
Hi.
Here is a clean patch that does not fold roundeven when the result is
of integer type, and in which the conditions for folding the functions
round/ceil/floor/roundeven and trunc check only for signaling NaN.
The patch also conforms to the GNU coding style and standards.  The
change from CASE_MATHFN to CASE_MATHFN_FLOATN for roundeven is made in
the i386 expansion patch.

Thanks,
Tejas

gcc/ChangeLog:

2019-06-12  Tejas Joshi  

* builtins.c (mathfn_built_in_2): Added CASE_MATHFN for ROUNDEVEN.
* builtins.def: Added function definitions for roundeven function
variants.
* fold-const-call.c (fold_const_call_ss): Added case for roundeven
function call.
* fold-const.c (negate_mathfn_p): Added case for roundeven function.
(tree_call_nonnegative_warnv_p): Added case for roundeven function.
(integer_valued_real_call_p): Added case for roundeven function.
* real.c (is_even): New function. Returns true if real number is
even, otherwise returns false.
(is_halfway_below): New function. Returns true if real number is
halfway between two integers, else return false.
(real_roundeven): New function. Round real number to nearest
integer, rounding halfway cases towards even.
* real.h (real_value): Added descriptive comments.
Added function declaration for roundeven function.

gcc/testsuite/ChangeLog:

2019-06-12  Tejas Joshi  

* gcc.dg/torture/builtin-round-roundeven.c: New test.
* gcc.dg/torture/builtin-round-roundevenf128.c: New test.

On Sat, 10 Aug 2019 at 02:15, Joseph Myers  wrote:
>
> On Fri, 28 Jun 2019, Tejas Joshi wrote:
>
> > +CASE_CFN_ROUNDEVEN:
> > +CASE_CFN_ROUNDEVEN_FN:
> > +  if (!REAL_VALUE_ISNAN (*arg) || !flag_errno_math)
>
> Checking flag_errno_math here does not make sense.  roundeven never sets
> errno (at least, TS 18661-1 makes it implementation-defined whether sNaN
> input counts as a domain error, but I'm not aware of implementations that
> make it a domain error and set errno, and typically GCC follows glibc in
> such cases in the absence of known implementations requiring a different
> approach).
>
> The only case where you need to avoid folding is where the argument is a
> signaling NaN (it's fine to fold for quiet NaNs).  In that case, you need
> to avoid folding to avoid losing an exception (if the user cares about
> signaling NaNs, they probably care about exceptions) - so it still doesn't
> matter whether the library implementation also sets errno or not.
>
> (Yes, this means the existing ceil / floor / round checks should be
> adjusted just to check for signaling NaN, though that's fairly cosmetic as
> calls to those functions with quiet NaN argument still get folded via
> match.pd.  trunc ought also check for signaling NaN rather than folding
> unconditionally, so all those functions should end up with the same
> conditions for folding.)
>
> > @@ -898,6 +907,10 @@ fold_const_call_ss (wide_int *result, combined_fn fn,
> >return fold_const_conversion (result, real_round, arg,
> >   precision, format);
> >
> > +CASE_CFN_ROUNDEVEN:
> > +CASE_CFN_ROUNDEVEN_FN:
> > +  return fold_const_conversion (result, real_roundeven, arg, 
> > precision, format);
> > +
>
> This is the version of fold_const_call_ss for functions returning a result
> of integer type; roundeven returns an integer value in a floating-point
> type.  I don't think this code should be there, and I don't think this
> version of the function should be called at all for roundeven.
>
> --
> Joseph S. Myers
> jos...@codesourcery.com
diff --git a/gcc/builtins.c b/gcc/builtins.c
index e2ba356c0d3..8ceb077b0bf 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2056,6 +2056,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
 CASE_MATHFN (REMQUO)
 CASE_MATHFN_FLOATN (RINT)
 CASE_MATHFN_FLOATN (ROUND)
+CASE_MATHFN (ROUNDEVEN)
 CASE_MATHFN (SCALB)
 CASE_MATHFN (SCALBLN)
 CASE_MATHFN (SCALBN)
diff --git a/gcc/builtins.def b/gcc/builtins.def
index 6d41bdb4f44..8bb7027aac7 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -542,12 +542,18 @@ DEF_C99_BUILTIN(BUILT_IN_RINTL, "rintl", BT_FN_LONGDOUBLE_LONGDOUBLE, AT
 #define RINT_TYPE(F) BT_FN_##F##_##F
 DEF_EXT_LIB_FLOATN_NX_BUILTINS (BUILT_IN_RINT, "rint", RINT_TYPE, ATTR_CONST_NOTHROW_LEAF_LIST)
 #undef RINT_TYPE
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVEN, "roundeven", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENF, "roundevenf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_EXT_LIB_BUILTIN(BUILT_IN_ROUNDEVENL, "roundevenl", BT_FN_LONGDOUBLE_LONGDOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_ROUND, "round", BT_FN_DOUBLE_DOUBLE, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_ROUNDF, "roundf", BT_FN_FLOAT_FLOAT, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_C99_BUILTIN(BUILT_IN_ROUNDL, "roundl", 

[Committed] PR fortran/87991 -- Pointer data object requires a target

2019-08-13 Thread Steve Kargl
Committed as obviously.

2019-08-13  Steven G. Kargl  

PR fortran/87991
* resolve.c (check_data_variable): data-stmt-object with pointer
attribute requires a data-stmt-value with the target attribute.
 
2019-08-13  Steven G. Kargl  

PR fortran/87991
* gfortran.dg/pr87991.f90: New test.

-- 
Steve
Index: gcc/fortran/resolve.c
===
--- gcc/fortran/resolve.c	(revision 274398)
+++ gcc/fortran/resolve.c	(working copy)
@@ -15726,8 +15726,6 @@ check_data_variable (gfc_data_variable *var, locus *wh
   return false;
 }
 
-  has_pointer = sym->attr.pointer;
-
   if (gfc_is_coindexed (e))
 {
   gfc_error ("DATA element %qs at %L cannot have a coindex", sym->name,
@@ -15735,19 +15733,30 @@ check_data_variable (gfc_data_variable *var, locus *wh
   return false;
 }
 
+  has_pointer = sym->attr.pointer;
+
   for (ref = e->ref; ref; ref = ref->next)
 {
   if (ref->type == REF_COMPONENT && ref->u.c.component->attr.pointer)
 	has_pointer = 1;
 
-  if (has_pointer
-	&& ref->type == REF_ARRAY
-	&& ref->u.ar.type != AR_FULL)
-	  {
-	gfc_error ("DATA element %qs at %L is a pointer and so must "
-			"be a full array", sym->name, where);
-	return false;
-	  }
+  if (has_pointer)
+	{
+	  if (ref->type == REF_ARRAY && ref->u.ar.type != AR_FULL)
+	{
+	  gfc_error ("DATA element %qs at %L is a pointer and so must "
+			 "be a full array", sym->name, where);
+	  return false;
+	}
+
+	  if (values.vnode->expr->expr_type == EXPR_CONSTANT)
+	{
+	  gfc_error ("DATA object near %L has the pointer attribute "
+			 "and the corresponding DATA value is not a valid "
+			 "initial-data-target", where);
+	  return false;
+	}
+	}
 }
 
   if (e->rank == 0 || has_pointer)
Index: gcc/testsuite/gfortran.dg/pr87991.f90
===
--- gcc/testsuite/gfortran.dg/pr87991.f90	(nonexistent)
+++ gcc/testsuite/gfortran.dg/pr87991.f90	(working copy)
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! { dg-options "-w" }
+! PR fortran/87991
+program p
+   type t
+  character(:), pointer :: c
+   end type
+   type(t) :: x
+   allocate (character(3) :: x%c)
+   data x%c /'abc'/   ! { dg-error "has the pointer attribute" }
+end


Re: enforce canonicalization of value_range's

2019-08-13 Thread Aldy Hernandez

On 8/12/19 7:32 PM, Jeff Law wrote:

On 8/12/19 12:48 PM, Aldy Hernandez wrote:

This is a ping of:

https://gcc.gnu.org/ml/gcc-patches/2019-07/msg8.html

I have addressed the comments Jeff mentioned there.

This patch goes before the VR_VARYING + types patch, but I can adapt
either one to come first.  I can even munge them together into one
patch, if it helps review.

Tested on x86-64 Linux.

OK (assuming ChangeLog entries)?

Aldy


curr.patch

commit ea908bdbfc8cdb4bb63e7d42630d01203edbac41
Author: Aldy Hernandez 
Date:   Mon Jul 15 18:09:27 2019 +0200

 Enforce canonicalization in value_range.

Per your other message, I'll assume you'll write a suitable ChangeLog
entry before committing :-)


Attached patch has ChangeLog entries.






diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index 594ee9adc17..de2f39d8487 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -69,23 +69,20 @@ along with GCC; see the file COPYING3.  If not see
  #include "builtins.h"
  #include "wide-int-range.h"
  
+static bool
+ranges_from_anti_range (const value_range_base *ar,
+   value_range_base *vr0, value_range_base *vr1,
+   bool handle_pointers = false);
+

Presumably this was better than moving the implementation earlier.


Actually, it was for ease of review.  I made some changes to the 
function, and I didn't want the reviewer to miss them because I had 
moved the function wholesale.  I can move the function earlier, after we 
agree on the changes (see below).






+/* Normalize symbolics into constants.  */
+
+value_range_base
+value_range_base::normalize_symbolics () const
+{
+  if (varying_p () || undefined_p ())
+return *this;
+  tree ttype = type ();
+  bool min_symbolic = !is_gimple_min_invariant (min ());
+  bool max_symbolic = !is_gimple_min_invariant (max ());

Sadly "symbolic" is a particularly poor term.  When I think symbolics, I
think SYMBOL_REF, LABEL_REF and the tree equivalents.  They're
*symbols*.  Symbolics in this case are really things like SSA_NAMEs.
Confused the hell out of me briefly.


Sigh, yeah.  Unfortunately, symbolics is what VRP calls anything that is 
not an INTEGER_CST (that is, PLUS_EXPR and SSA_NAMEs, etc).




If we weren't on a path to kill VRP I'd probably suggest a distinct
effort to constify this code.  Some of the changes were a bit confusing
when it looked like we'd dropped a call to set the range of an object.
But those were just local copies, so setting the type/min/max directly
was actually fine.  constification would make this a bit clearer.  But
again, I don't think it's worth the effort given the long term
trajectory for tree-vrp.c.


I shouldn't be introducing any new confusion.  Did I add any new methods 
that should've been const that aren't?  I can't see any.  I'm happy to 
fix anything I introduced.





So where does the handle_pointers stuff matter?   I'm a bit surprised we
have to do anything special for them.


I've learned to touch as little of VRP as is necessary, as changing 
anything to be more consistent breaks things in unexpected ways ;-).


In this particular case, TYPE_MIN_VALUE and TYPE_MAX_VALUE are not 
defined for pointers, and I didn't want to change the meaning of 
vrp_val_{min,max} throughout.  I was trying to minimize the changes to 
existing behavior.  If it bothers you too much, we could remove it as a 
follow up when we are sure there are no unexpected side-effects from the 
rest of the patch.





OK.  I don't expect the answers to the minor questions above will
ultimately change anything.


I would appreciate a final nod before I commit.  And even then, I will 
wait until the other patch is approved and commit them simultaneously. 
They are a bit intertwined.


Thanks for your prompt reviews.

Aldy
commit 39be54a2e0ef19b3fb1cc3e51b380a6811e50ab4
Author: Aldy Hernandez 
Date:   Mon Jul 15 18:09:27 2019 +0200

Enforce canonicalization in value_range.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 93e600d68f2..3fa864624c9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,37 @@
+2019-08-13  Aldy Hernandez  
+
+	* tree-vrp.c (value_range_base::set): Merge in code from
+	value_range_base::set_and_canonicalize.
+	Enforce canonicalization at set time.
+	Normalize [MIN, MAX] into VARYING and ~[MIN, MAX] into UNDEFINED.
+	(value_range_base::set_undefined): Inline call to set().
+	(value_range_base::set_varying): Same.
+	(value_range_base::singleton_p): Handle VR_ANTI_RANGEs.
+	(vrp_val_max): New argument handle_pointers.
+	(vrp_val_min): Same.
+	(ranges_from_anti_range): Same.
+	(extract_range_into_wide_ints): Use tree argument instead of sign
+	and precision.
+	(extract_range_from_multiplicative_op): Take in tree type instead
+	of precision and sign.  Adapt function for canonicalized ranges.
+	(extract_range_from_binary_expr): Pass type to
+	extract_range_from_multiplicative_op.
+	Adapt for canonicalized ranges.
+	(extract_range_from_unary_expr): Same.
+	

Re: types for VR_VARYING

2019-08-13 Thread Aldy Hernandez



On 8/12/19 7:46 PM, Jeff Law wrote:

On 8/12/19 12:43 PM, Aldy Hernandez wrote:

This is a fresh re-post of:

https://gcc.gnu.org/ml/gcc-patches/2019-07/msg6.html

Andrew gave me some feedback a week ago, and I obviously don't remember
what it was because I was about to leave on PTO.  However, I do remember
I addressed his concerns before getting drunk on rum in tropical islands.

FWIW found a great coffee infused rum while in Kauai last week.  I'm not
a coffee fan, but it was wonderful.  The one bottle we brought back
isn't going to last until Cauldron and I don't think I can get a special
order filled before I leave :(


You must bring some to Cauldron before we believe you. :)



Their more traditional rums were good as well.  Koloa Rum.  See if you
can get it locally :-)




This patch adds MIN/MAX values for VR_VARYING.  As was discussed
earlier, we are only changing VR_VARYING, not VR_UNDEFINED as was the
original plan.

As I mentioned earlier, I am tired of re-doing ChangeLogs, so I'll whip
up the changelog when the review starts.

ACK.


ChangeLog entries included in attached patch.






@@ -150,12 +166,11 @@ vr_values::set_defs_to_varying (gimple *stmt)
ssa_op_iter i;
tree def;
FOR_EACH_SSA_TREE_OPERAND (def, stmt, i, SSA_OP_DEF)
-{
-  value_range *vr = get_value_range (def);
-  /* Avoid writing to vr_const_varying get_value_range may return.  */
-  if (!vr->varying_p ())
-   vr->set_varying ();
-}
+if (value_range_base::supports_type_p (TREE_TYPE (def)))
+  {
+   value_range *vr = get_value_range (def);
+   vr->set_varying (TREE_TYPE (def));
+  }
  }

Is the supports_type_p stuff there to placate the calls from ipa-cp?  I
can live with it in the short term, but it really feels like there
should be something in the ipa-cp client that avoids this silliness.


I am not happy with this either, but there are various places where 
statements that are !stmt_interesting_for_vrp() are still setting a 
range of VARYING, which is then being ignored at a later time.


For example, vrp_initialize:

  if (!stmt_interesting_for_vrp (phi))
{
  tree lhs = PHI_RESULT (phi);
  set_def_to_varying (lhs);
  prop_set_simulate_again (phi, false);
}

Also in evrp_range_analyzer::record_ranges_from_stmt(): if the 
statement is interesting for VRP but extract_range_from_stmt() does not 
produce a useful range, we also set a varying for a range we will never 
use.  Similarly for a statement that is not interesting in this hunk.


Then there is vrp_prop::visit_stmt() where we also set VARYING for types 
that VRP will never handle:


  case IFN_ADD_OVERFLOW:
  case IFN_SUB_OVERFLOW:
  case IFN_MUL_OVERFLOW:
  case IFN_ATOMIC_COMPARE_EXCHANGE:
/* These internal calls return _Complex integer type,
   which VRP does not track, but the immediate uses
   thereof might be interesting.  */
if (lhs && TREE_CODE (lhs) == SSA_NAME)
  {
imm_use_iterator iter;
use_operand_p use_p;
enum ssa_prop_result res = SSA_PROP_VARYING;

set_def_to_varying (lhs);

I've adjusted the patch so that set_def_to_varying will set the range to 
VR_UNDEFINED if !supports_type_p.  This is a fail safe, as we can't 
really do anything with a nonsensical range.  I just don't want to leave 
the range in an indeterminate state.





  
  /* Update the value range and equivalence set for variable VAR to



@@ -1920,7 +1935,7 @@ vr_values::dump_all_value_ranges (FILE *file)
  vr_values::vr_values () : vrp_value_range_pool ("Tree VRP value ranges")
  {
values_propagated = false;
-  num_vr_values = num_ssa_names;
+  num_vr_values = num_ssa_names + num_ssa_names / 10;
vr_value = XCNEWVEC (value_range *, num_vr_values);
vr_phi_edge_counts = XCNEWVEC (int, num_ssa_names);
bitmap_obstack_initialize (&vr_equiv_obstack);

My recollection was someone (Richi) proposed 2X for the initial
allocation which should ensure that in all but the most pathological
cases that we never trigger a reallocation.  In terms of "wasted" space,
it shouldn't matter in practice.

So if you could check the old thread and verify that Richi recommended
the 2X initial allocation and if so, make that minor update.


Yes, it was 2X.

I noticed that Richi made some changes to the lattice handling for 
VARYING while the discussion was on-going.  I missed these, and had 
failed to adapt the patch for it.  I would appreciate a final review of 
the attached patch, especially the vr-values.c changes, which I have 
modified to play nice with current trunk.


I also noticed that Andrew's patch was setting num_vr_values to 
num_ssa_names + num_ssa_names / 10.  I think he meant num_vr_values + 
num_vr_values / 10.  Please verify the current incantation makes sense.


Tested on x86-64 Linux.

OK for trunk?

Aldy
commit 4c21b0f667176933f67469372da7459b8cf0ef71

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Marc Glisse

On Tue, 13 Aug 2019, Robin Dapp wrote:


diff --git a/gcc/match.pd b/gcc/match.pd
index 0317bc704f7..94400529ad8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2020,6 +2020,107 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(if (cst && !TREE_OVERFLOW (cst))
 (plus { cst; } @0

+/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST  */
+#if GIMPLE
+  (simplify
+(plus (convert (plus @0 INTEGER_CST@1)) INTEGER_CST@2)
+  (if (INTEGRAL_TYPE_P (type)
+   && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)))
+   /* We actually want to perform two simplifications here:


Maybe two "transformations"? Since (1) is the exact opposite of the other 
simplification you are adding... It would have been nice if it could 
indeed be split into 2 simplifications, in particular because we already 
have a better (?) version of (2) not far above in the file. But we cannot 
add (1) and I can't think of a good way to reuse the existing version of 
(2) :-(



+ (1) (T)(A + CST1) + CST2  --> (T)(A) + (T)(CST1)
+ If for (A + CST1) we either do not care about overflow (e.g.
+ for a signed inner type) or the overflow is ok for an unsigned
+ inner type.


From the code, it seems like the condition is really "If we know that no 
overflow can occur, either because it would be UB (signed type), or thanks 
to range information". Maybe that's the meaning of what you wrote, but I 
had trouble interpreting it.



+ (2) (T)(A) + (T)(CST1) + CST2 --> (T)(A) + (T)(CST1 + CST2)
+ If (CST1 + CST2) does not overflow and we do care about overflow
+ (for a signed outer type) or we do not care about overflow in an
+ unsigned outer type.  */
+   (with
+   {
+ tree inner_type = TREE_TYPE (@0);
+ wide_int wmin0, wmax0;
+ wide_int cst1 = wi::to_wide (@1);
+
+wi::overflow_type min_ovf = wi::OVF_OVERFLOW,


There seems to be an inconsistency between spaces and TABs depending on 
the lines.



+  max_ovf = wi::OVF_OVERFLOW;
+
+/* Get overflow behavior.  */
+ bool ovf_undef_inner = TYPE_OVERFLOW_UNDEFINED (inner_type);
+ bool ovf_undef_outer = TYPE_OVERFLOW_UNDEFINED (type);
+
+/* Get value range of A.  */
+ enum value_range_kind vr0 = get_range_info (@0, &wmin0, &wmax0);
+
+/* If we have a proper range, determine min and max overflow
+   of (A + CST1).


At some point we may want to split this into a helper function 
operation_may_overflow or something, that takes an arbitrary mix of 
SSA_NAME and constants. Not necessary for this patch though.



+   ??? We might also want to handle anti ranges.  */
+if (vr0 == VR_RANGE)
+  {
+wi::add (wmin0, cst1, TYPE_SIGN (inner_type), &min_ovf);
+wi::add (wmax0, cst1, TYPE_SIGN (inner_type), &max_ovf);
+  }
+
+/* Inner overflow does not matter in this case.  */
+if (ovf_undef_inner)
+  {
+min_ovf = wi::OVF_NONE;
+max_ovf = wi::OVF_NONE;
+  }


Seems that you could switch the conditions, to avoid unnecessary additions
if (ovf_undef_inner) ... else if (vr0 == VR_RANGE) ...

Maybe even move get_range_info then.


+
+/* Extend CST from INNER_TYPE to TYPE.  */
+cst1 = cst1.from (cst1, TYPE_PRECISION (type), TYPE_SIGN (inner_type));


wide_int::from? It looks strange to use an arbitrary variable to access a 
static function, but that's a style preference, it is ok as is.



+
+/* Check for overflow of (TYPE)(CST1 + CST2).  */
+wi::overflow_type outer_ovf = wi::OVF_OVERFLOW;
+wide_int cst = wi::add (cst1, wi::to_wide (@2), TYPE_SIGN (type),
+&outer_ovf);
+
+/* We *do* care about an overflow here as we do not want to introduce
+   new undefined behavior that was not there before.  */


There is always the possibility to cast to an unsigned type, perform the 
operation, then cast back, like the "other (2)" does. But that makes the 
transformation even longer, and can lose some information.



+if (ovf_undef_outer && outer_ovf)
+  {
+/* Set these here to prevent the final conversion below
+   to take place instead of introducing a new guard variable.  */
+min_ovf = wi::OVF_OVERFLOW;
+max_ovf = wi::OVF_OVERFLOW;
+  }
+   }
+   (if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
+(plus (convert @0) { wide_int_to_tree (type, cst); }
+ )
+#endif
+
+/* ((T)(A)) + CST -> (T)(A + CST)  */


The other transformation is a true simplification. This one is more a 
canonicalization. It does make sense to perform the operation in the 
narrower type, I just hope we have enough machinery to revert this later 
for platforms that do not like arithmetic in sub-word size. Or probably 
this is still restricted enough (doesn't narrow all operations) that it 
doesn't matter.



+#if 

Re: [PATCH] fold more string comparison with known result (PR 90879)

2019-08-13 Thread Jeff Law
On 8/13/19 3:43 PM, Martin Sebor wrote:
> On 8/13/19 2:07 PM, Jeff Law wrote:
>> On 8/9/19 10:51 AM, Martin Sebor wrote:
>>>
>>> PR tree-optimization/90879 - fold zero-equality of strcmp between a
>>> longer string and a smaller array
>>>
>>> gcc/c-family/ChangeLog:
>>>
>>> PR tree-optimization/90879
>>> * c.opt (-Wstring-compare): New option.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> PR tree-optimization/90879
>>> * gcc.dg/Wstring-compare-2.c: New test.
>>> * gcc.dg/Wstring-compare.c: New test.
>>> * gcc.dg/strcmpopt_3.c: Scan the optimized dump instead of strlen.
>>> * gcc.dg/strcmpopt_6.c: New test.
>>> * gcc.dg/strlenopt-65.c: Remove unnecessary declarations, add
>>> test cases.
>>> * gcc.dg/strlenopt-66.c: Run it.
>>> * gcc.dg/strlenopt-68.c: New test.
>>>
>>> gcc/ChangeLog:
>>>
>>> PR tree-optimization/90879
>>> * builtins.c (check_access): Avoid using maxbound when null.
>>> * calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen
>>> change.
>>> * doc/invoke.texi (-Wstring-compare): Document new warning option.
>>> * gimple-fold.c (get_range_strlen_tree): Make setting maxbound
>>> conditional.
>>> (get_range_strlen): Overwrite initial maxbound when non-null.
>>> * gimple-ssa-sprintf.c (get_string_length): Adjust to
>>> get_range_strlen
>>> change.
>>> * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
>>> (used_only_for_zero_equality): New function.
>>> (handle_builtin_memcmp): Call it.
>>> (determine_min_objsize): Return an integer instead of tree.
>>> (get_len_or_size, strxcmp_eqz_result): New functions.
>>> (maybe_warn_pointless_strcmp): New function.
>>> (handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
>>> between a longer string and a smaller array.
>>>
>>
>>> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
>>> index 4af47855e7c..31e012b741b 100644
>>> --- a/gcc/tree-ssa-strlen.c
>>> +++ b/gcc/tree-ssa-strlen.c
>>
>>> @@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
>>>       type = TYPE_MAIN_VARIANT (type);
>>>   -  /* We cannot determine the size of the array if it's a flexible
>>> array,
>>> - which is declared at the end of a structure.  */
>>> -  if (TREE_CODE (type) == ARRAY_TYPE
>>> -  && !array_at_struct_end_p (dest))
>>> +  /* The size of a flexible array cannot be determined.  Otherwise,
>>> + for arrays with more than one element, return the size of its
>>> + type.  GCC itself misuses arrays of both zero and one elements
>>> + as flexible array members so they are excluded as well.  */
>>> +  if (TREE_CODE (type) != ARRAY_TYPE
>>> +  || !array_at_struct_end_p (dest))
>>>   {
>>> -  tree size_t = TYPE_SIZE_UNIT (type);
>>> -  if (size_t && TREE_CODE (size_t) == INTEGER_CST
>>> -  && !integer_zerop (size_t))
>>> -    return size_t;
>>> +  tree type_size = TYPE_SIZE_UNIT (type);
>>> +  if (type_size && TREE_CODE (type_size) == INTEGER_CST
>>> +  && !integer_onep (type_size)
>>> +  && !integer_zerop (type_size))
>>> +    return tree_to_uhwi (type_size);
>> So I nearly commented on this when looking at the original patch.  Can
>> we really depend on the size when we've got an array at the end of a
>> struct with a declared size other than 0/1?   While 0/1 are by far the
>> most common way to declare them, couldn't someone have used other sizes?
>>   I think we pondered doing that at one time to cut down on the noise
>> from Coverity for RTL and TREE operand accessors.
>>
>> Your code makes us safer, so I'm not saying you've done anything wrong,
>> just trying to decide if we need to tighten this up even further.
> 
> This patch issues a warning in these cases, i.e., when it sees
> a call like, say, strcmp("foobar", A) with an A that's smaller
> than the string, because it seems they are likely (rare) bugs.
> I haven't seen the warning in any of the projects I tested it
> with (Binutils/GDB, GCC, Glibc, the Linux kernel, and LLVM).
> 
> The warning uses strcmp to detect these mistakes (or misuses)
> but I'd like to add similar warnings for other string functions
> as well and have code out there that does this on purpose use
> true flexible array members (or the zero-length extension)
> instead.  That makes the intent clear.
> 
> It's a judgment call whether to also fold (or do something else
> like insert a trap) in addition to issuing a warning.  In this
> case (reading) I don't think it matters as much as it does for
> writes.  Either way, it would be nice to set a policy and
> document it in the manual so users know what to expect and
> so we don't have to revisit this question for each patch that
> touches on this subject.
The GCC manual documents zero length arrays at the end of an aggregate
as a GNU extension for variable length objects.  The manual also
documents that it could be done with single element arrays, but that
doing so does contribute to the base 

Re: [PATCH] fold more string comparison with known result (PR 90879)

2019-08-13 Thread Martin Sebor

On 8/13/19 2:07 PM, Jeff Law wrote:

On 8/9/19 10:51 AM, Martin Sebor wrote:


PR tree-optimization/90879 - fold zero-equality of strcmp between a longer 
string and a smaller array

gcc/c-family/ChangeLog:

PR tree-optimization/90879
* c.opt (-Wstring-compare): New option.

gcc/testsuite/ChangeLog:

PR tree-optimization/90879
* gcc.dg/Wstring-compare-2.c: New test.
* gcc.dg/Wstring-compare.c: New test.
* gcc.dg/strcmpopt_3.c: Scan the optimized dump instead of strlen.
* gcc.dg/strcmpopt_6.c: New test.
* gcc.dg/strlenopt-65.c: Remove unnecessary declarations, add
test cases.
* gcc.dg/strlenopt-66.c: Run it.
* gcc.dg/strlenopt-68.c: New test.

gcc/ChangeLog:

PR tree-optimization/90879
* builtins.c (check_access): Avoid using maxbound when null.
* calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen change.
* doc/invoke.texi (-Wstring-compare): Document new warning option.
* gimple-fold.c (get_range_strlen_tree): Make setting maxbound
conditional.
(get_range_strlen): Overwrite initial maxbound when non-null.
* gimple-ssa-sprintf.c (get_string_length): Adjust to get_range_strlen
change.
* tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
(used_only_for_zero_equality): New function.
(handle_builtin_memcmp): Call it.
(determine_min_objsize): Return an integer instead of tree.
(get_len_or_size, strxcmp_eqz_result): New functions.
(maybe_warn_pointless_strcmp): New function.
(handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
between a longer string and a smaller array.




diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
index 4af47855e7c..31e012b741b 100644
--- a/gcc/tree-ssa-strlen.c
+++ b/gcc/tree-ssa-strlen.c



@@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
  
type = TYPE_MAIN_VARIANT (type);
  
-  /* We cannot determine the size of the array if it's a flexible array,

- which is declared at the end of a structure.  */
-  if (TREE_CODE (type) == ARRAY_TYPE
-  && !array_at_struct_end_p (dest))
+  /* The size of a flexible array cannot be determined.  Otherwise,
+ for arrays with more than one element, return the size of its
+ type.  GCC itself misuses arrays of both zero and one elements
+ as flexible array members so they are excluded as well.  */
+  if (TREE_CODE (type) != ARRAY_TYPE
+  || !array_at_struct_end_p (dest))
  {
-  tree size_t = TYPE_SIZE_UNIT (type);
-  if (size_t && TREE_CODE (size_t) == INTEGER_CST
- && !integer_zerop (size_t))
-return size_t;
+  tree type_size = TYPE_SIZE_UNIT (type);
+  if (type_size && TREE_CODE (type_size) == INTEGER_CST
+ && !integer_onep (type_size)
+ && !integer_zerop (type_size))
+return tree_to_uhwi (type_size);

So I nearly commented on this when looking at the original patch.  Can
we really depend on the size when we've got an array at the end of a
struct with a declared size other than 0/1?   While 0/1 are by far the
most common way to declare them, couldn't someone have used other sizes?
  I think we pondered doing that at one time to cut down on the noise
from Coverity for RTL and TREE operand accessors.

Your code makes us safer, so I'm not saying you've done anything wrong,
just trying to decide if we need to tighten this up even further.


This patch issues a warning in these cases, i.e., when it sees
a call like, say, strcmp("foobar", A) with an A that's smaller
than the string, because it seems they are likely (rare) bugs.
I haven't seen the warning in any of the projects I tested it
with (Binutils/GDB, GCC, Glibc, the Linux kernel, and LLVM).

The warning uses strcmp to detect these mistakes (or misuses)
but I'd like to add similar warnings for other string functions
as well and have code out there that does this on purpose use
true flexible array members (or the zero-length extension)
instead.  That makes the intent clear.

It's a judgment call whether to also fold (or do something else
like insert a trap) in addition to issuing a warning.  In this
case (reading) I don't think it matters as much as it does for
writes.  Either way, it would be nice to set a policy and
document it in the manual so users know what to expect and
so we don't have to revisit this question for each patch that
touches on this subject.

Martin


[PATCH 3/3] [og9] Wait on queue-full condition in AMD GCN libgomp offloading plugin

2019-08-13 Thread Julian Brown
This patch lets the AMD GCN libgomp plugin wait for asynchronous queues
to have some space to push new operations when they are full, rather
than just erroring out immediately on that condition. This fixes the
libgomp.oacc-c-c++-common/da-4.c test.

Julian

ChangeLog

libgomp/
* plugin/plugin-gcn.c (queue_push_callback): Wait on queue-full
condition.
---
 libgomp/ChangeLog.openacc   |  5 +
 libgomp/plugin/plugin-gcn.c | 11 +--
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/libgomp/ChangeLog.openacc b/libgomp/ChangeLog.openacc
index 2a9a7f18ca2..f9d8e6ecd39 100644
--- a/libgomp/ChangeLog.openacc
+++ b/libgomp/ChangeLog.openacc
@@ -1,3 +1,8 @@
+2019-08-13  Julian Brown  
+
+   * plugin/plugin-gcn.c (queue_push_callback): Wait on queue-full
+   condition.
+
 2019-08-13  Julian Brown  
 
* plugin/plugin-gcn.c (struct copy_data): Add using_src_copy field.
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index 65690e643ed..099f70b647c 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -1416,8 +1416,15 @@ queue_push_callback (struct goacc_asyncqueue *aq, void (*fn)(void *),
 void *data)
 {
   if (aq->queue_n == ASYNC_QUEUE_SIZE)
-GOMP_PLUGIN_fatal ("Async thread %d:%d: error: queue overflowed",
-  aq->agent->device_id, aq->id);
+{
+  pthread_mutex_lock (&aq->mutex);
+
+  /* Queue is full.  Wait for it to not be full.  */
+  while (aq->queue_n == ASYNC_QUEUE_SIZE)
+   pthread_cond_wait (&aq->queue_cond_out, &aq->mutex);
+
+  pthread_mutex_unlock (&aq->mutex);
+}

   pthread_mutex_lock (&aq->mutex);
 
-- 
2.22.0



[PATCH 2/3] [og9] Use temporary buffers for async host2dev copies

2019-08-13 Thread Julian Brown
In libgomp, host-to-device transfers are instigated in several places
where the source data is either on the stack, or in an unstable
heap location (i.e. one that is freed immediately after initiating
the host-to-device transfer).

When the transfer is asynchronous, this means that taking the address
of source data and attempting the copy from that at some later point
is extremely likely to fail. A previous fix for this problem (from our
internal branch, and included with the AMD GCN offloading patches)
attempted to separate transfers from the stack (performing them
immediately) from transfers from the heap (which can safely be done some
time later).

Unfortunately that doesn't work well with more recent changes to libgomp
and the GCN plugin. So instead, this patch copies the source data for
asynchronous host-to-device copies immediately to a temporary buffer,
then the transfer to the device can safely take place asynchronously
some time later.

Julian

ChangeLog

libgomp/
* plugin/plugin-gcn.c (struct copy_data): Add using_src_copy field.
(copy_data): Free temporary buffer if using.
(queue_push_copy): Add using_src_copy parameter.
(GOMP_OFFLOAD_dev2dev, GOMP_OFFLOAD_async_dev2host): Update calls to
queue_push_copy.
(GOMP_OFFLOAD_async_host2dev): Likewise.  Allocate temporary buffer and
copy source data to it immediately.
* target.c (gomp_copy_host2dev): Update function comment.
(copy_host2dev_immediate): Remove.
(gomp_map_pointer, gomp_map_vars_internal): Replace calls to
copy_host2dev_immediate with calls to gomp_copy_host2dev.
---
 libgomp/ChangeLog.openacc   | 14 ++
 libgomp/plugin/plugin-gcn.c | 20 ++---
 libgomp/target.c| 56 +++--
 3 files changed, 52 insertions(+), 38 deletions(-)

diff --git a/libgomp/ChangeLog.openacc b/libgomp/ChangeLog.openacc
index 2279545f361..2a9a7f18ca2 100644
--- a/libgomp/ChangeLog.openacc
+++ b/libgomp/ChangeLog.openacc
@@ -1,3 +1,17 @@
+2019-08-13  Julian Brown  
+
+   * plugin/plugin-gcn.c (struct copy_data): Add using_src_copy field.
+   (copy_data): Free temporary buffer if using.
+   (queue_push_copy): Add using_src_copy parameter.
+   (GOMP_OFFLOAD_dev2dev, GOMP_OFFLOAD_async_dev2host): Update calls to
+   queue_push_copy.
+   (GOMP_OFFLOAD_async_host2dev): Likewise.  Allocate temporary buffer and
+   copy source data to it immediately.
+   * target.c (gomp_copy_host2dev): Update function comment.
+   (copy_host2dev_immediate): Remove.
+   (gomp_map_pointer, gomp_map_vars_internal): Replace calls to
+   copy_host2dev_immediate with calls to gomp_copy_host2dev.
+
 2019-08-08  Julian Brown  
 
* plugin/plugin-gcn.c (gcn_exec): Use 1 for the default number of
diff --git a/libgomp/plugin/plugin-gcn.c b/libgomp/plugin/plugin-gcn.c
index a41568b3306..65690e643ed 100644
--- a/libgomp/plugin/plugin-gcn.c
+++ b/libgomp/plugin/plugin-gcn.c
@@ -3063,6 +3063,7 @@ struct copy_data
   const void *src;
   size_t len;
   bool use_hsa_memory_copy;
+  bool using_src_copy;
   struct goacc_asyncqueue *aq;
 };
 
@@ -3077,12 +3078,14 @@ copy_data (void *data_)
 hsa_fns.hsa_memory_copy_fn (data->dst, data->src, data->len);
   else
 memcpy (data->dst, data->src, data->len);
+  if (data->using_src_copy)
+free ((void *) data->src);
   free (data);
 }
 
 static void
 queue_push_copy (struct goacc_asyncqueue *aq, void *dst, const void *src,
-size_t len, bool use_hsa_memory_copy)
+size_t len, bool use_hsa_memory_copy, bool using_src_copy)
 {
   if (DEBUG_QUEUES)
 HSA_DEBUG ("queue_push_copy %d:%d: %zu bytes from (%p) to (%p)\n",
@@ -3093,6 +3096,7 @@ queue_push_copy (struct goacc_asyncqueue *aq, void *dst, 
const void *src,
   data->src = src;
   data->len = len;
   data->use_hsa_memory_copy = use_hsa_memory_copy;
+  data->using_src_copy = using_src_copy;
   data->aq = aq;
   queue_push_callback (aq, copy_data, data);
 }
@@ -3137,7 +3141,7 @@ GOMP_OFFLOAD_dev2dev (int device, void *dst, const void 
*src, size_t n)
 {
   struct agent_info *agent = get_agent_info (device);
   maybe_init_omp_async (agent);
-  queue_push_copy (agent->omp_async_queue, dst, src, n, false);
+  queue_push_copy (agent->omp_async_queue, dst, src, n, false, false);
   return true;
 }
 
@@ -3469,7 +3473,15 @@ GOMP_OFFLOAD_openacc_async_host2dev (int device, void 
*dst, const void *src,
 {
   struct agent_info *agent = get_agent_info (device);
   assert (agent == aq->agent);
-  queue_push_copy (aq, dst, src, n, image_address_p (agent, dst));
+  /* The source data does not necessarily remain live until the deferred
+ copy happens.  Taking a snapshot of the data here avoids reading
+ uninitialised data later, but means that (a) data is copied twice and
+ (b) modifications to the copied data between the "spawning" point of
+ the 

[PATCH 1/3] [og9] Wait at end of OpenACC asynchronous kernels regions

2019-08-13 Thread Julian Brown
This patch provides a workaround for unreliable operation of asynchronous
kernels regions on AMD GCN. At present, kernels regions are decomposed
into a series of parallel regions surrounded by a data region capturing
the data-movement clauses needed by the region as a whole:

  #pragma acc kernels async(n)
  { ... }

is translated to:

  #pragma acc data copyin(...) copyout(...)
  {
#pragma acc parallel async(n) present(...)
{ ... }
#pragma acc parallel async(n) present(...)
{ ... }
  }

This is however problematic for two reasons:

 - Variables mapped by the data clause will be unmapped immediately at the end
   of the data region, regardless of whether the inner asynchronous
   parallels have completed. (This causes crashes for GCN.)

 - Even if the "present" clause caused the reference count to stay above zero
   at the end of the data region -- which it doesn't -- the "present"
   clauses on the inner parallel regions would not cause "copyout"
   variables to be transferred back to the host at the appropriate time,
   i.e. when the async parallel region had completed.

There is no "async" data construct in OpenACC, so the correct solution
(which I am deferring on for now) is probably to use asynchronous
"enter data" and "exit data" directives when translating asynchronous
kernels regions instead.

The attached patch just adds a "wait" operation before the end of
the enclosing data region. This works, but introduces undesirable
synchronisation with the host.

Julian

ChangeLog

gcc/
* omp-oacc-kernels.c (add_wait): New function, split out of...
(add_async_clauses_and_wait): ...here. Call new outlined function.
(decompose_kernels_region_body): Add wait at the end of
explicitly-asynchronous kernels regions.
---
 gcc/ChangeLog.openacc  |  7 +++
 gcc/omp-oacc-kernels.c | 28 +---
 2 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/gcc/ChangeLog.openacc b/gcc/ChangeLog.openacc
index 84d80511603..a22f07c817c 100644
--- a/gcc/ChangeLog.openacc
+++ b/gcc/ChangeLog.openacc
@@ -1,3 +1,10 @@
+2019-08-13  Julian Brown  
+
+   * omp-oacc-kernels.c (add_wait): New function, split out of...
+   (add_async_clauses_and_wait): ...here. Call new outlined function.
+   (decompose_kernels_region_body): Add wait at the end of
+   explicitly-asynchronous kernels regions.
+
 2019-08-08  Julian Brown  
 
* config/gcn/gcn.c (gcn_goacc_validate_dims): Ensure
diff --git a/gcc/omp-oacc-kernels.c b/gcc/omp-oacc-kernels.c
index 20913859c12..a6c4220f472 100644
--- a/gcc/omp-oacc-kernels.c
+++ b/gcc/omp-oacc-kernels.c
@@ -900,6 +900,18 @@ maybe_build_inner_data_region (location_t loc, gimple 
*body,
   return body;
 }
 
+static void
+add_wait (location_t loc, gimple_seq *region_body)
+{
+  /* A "#pragma acc wait" is just a call GOACC_wait (acc_async_sync, 0).  */
+  tree wait_fn = builtin_decl_explicit (BUILT_IN_GOACC_WAIT);
+  tree sync_arg = build_int_cst (integer_type_node, GOMP_ASYNC_SYNC);
+  gimple *wait_call = gimple_build_call (wait_fn, 2,
+ sync_arg, integer_zero_node);
+  gimple_set_location (wait_call, loc);
+  gimple_seq_add_stmt (region_body, wait_call);
+}
+
 /* Helper function of decompose_kernels_region_body.  The statements in
REGION_BODY are expected to be decomposed parallel regions; add an
"async" clause to each.  Also add a "wait" pragma at the end of the
@@ -923,13 +935,7 @@ add_async_clauses_and_wait (location_t loc, gimple_seq 
*region_body)
   gimple_omp_target_set_clauses (as_a <gomp_target *> (stmt),
  target_clauses);
 }
-  /* A "#pragma acc wait" is just a call GOACC_wait (acc_async_sync, 0).  */
-  tree wait_fn = builtin_decl_explicit (BUILT_IN_GOACC_WAIT);
-  tree sync_arg = build_int_cst (integer_type_node, GOMP_ASYNC_SYNC);
-  gimple *wait_call = gimple_build_call (wait_fn, 2,
- sync_arg, integer_zero_node);
-  gimple_set_location (wait_call, loc);
-  gimple_seq_add_stmt (region_body, wait_call);
+  add_wait (loc, region_body);
 }
 
 /* Auxiliary analysis of the body of a kernels region, to determine for each
@@ -1378,6 +1384,14 @@ decompose_kernels_region_body (gimple *kernels_region, 
tree kernels_clauses)
  a wait directive at the end.  */
   if (async_clause == NULL)
     add_async_clauses_and_wait (loc, &region_body);
+  else
+/* !!! If we have asynchronous parallel blocks inside a (synchronous) data
+   region, then target memory will get unmapped at the point the data
+   region ends, even if the inner asynchronous parallels have not yet
+   completed.  For kernels marked "async", we might want to use "enter data
+   async(...)" and "exit data async(...)" instead.
+   For now, insert a (synchronous) wait at the end of the block.  */
+add_wait (loc, &region_body);
 
   tree kernels_locals = gimple_bind_vars (as_a <gbind *> (kernels_body));
   gimple *body 

[PATCH 0/3] [og9] OpenACC async fixes for AMD GCN

2019-08-13 Thread Julian Brown
These patches stabilise async support for AMD GCN. Several tests that
previously failed (some intermittently) now work.

Further commentary is provided alongside each patch. Tested with
offloading to AMD GCN.

I will apply shortly to the openacc-gcc-9-branch.

Thanks,

Julian

Julian Brown (3):
  [og9] Wait at end of OpenACC asynchronous kernels regions
  [og9] Use temporary buffers for async host2dev copies
  [og9] Wait on queue-full condition in AMD GCN libgomp offloading
plugin

 gcc/ChangeLog.openacc   |  7 +
 gcc/omp-oacc-kernels.c  | 28 ++-
 libgomp/ChangeLog.openacc   | 19 +
 libgomp/plugin/plugin-gcn.c | 31 
 libgomp/target.c| 56 +++--
 5 files changed, 94 insertions(+), 47 deletions(-)

-- 
2.22.0



Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Jeff Law
On 8/13/19 2:42 PM, Robin Dapp wrote:
>> +/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST  */
>> Do you want to handle MINUS?  What about POINTER_PLUS_EXPR?
> 
> When I last attempted this patch I had the MINUS still in it but got
> confused easily by needing to think of too many cases at once leading to
> lots of stupid mistakes.  Hence, I left it out in the last revision :)
> Once everything is in place, I guess the additional cases can be added
> more easily - hope that's reasonable.
It's fine by me.  Richi has the final say though...

jeff


Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Robin Dapp
> +/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST  */
> Do you want to handle MINUS?  What about POINTER_PLUS_EXPR?

When I last attempted this patch I had the MINUS still in it but got
confused easily by needing to think of too many cases at once leading to
lots of stupid mistakes.  Hence, I left it out in the last revision :)
Once everything is in place, I guess the additional cases can be added
more easily - hope that's reasonable.

Regards
 Robin



Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Jeff Law
On 8/13/19 2:33 AM, Robin Dapp wrote:
> We would like to simplify code like
>  (larger_type)(var + const1) + const2
> to
>  (larger_type)(var + combined_const1_const2)
> when we know that no overflow happens.
> ---
>  gcc/match.pd | 101 +++
>  1 file changed, 101 insertions(+)
> 
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 0317bc704f7..94400529ad8 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -2020,6 +2020,107 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (if (cst && !TREE_OVERFLOW (cst))
>   (plus { cst; } @0
>  
> +/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST  */
Do you want to handle MINUS?  What about POINTER_PLUS_EXPR?

Richi and Marc are better suited to review the guts of this change than
I am.

I though we had these transformations in fold-const.c and I was going to
suggest removing them as a part of this patch.  *But* I can't seem to
find them.

Jeff



Re: [PATCH 2/2] Add more entries to the C++ get_std_name_hint array

2019-08-13 Thread Jason Merrill

On 8/13/19 9:36 AM, Jonathan Wakely wrote:

This adds some commonly-used C++11/14 names, and some new C++17/20
names. The latter aren't available when using the -std=gnu++14
default, so the fix-it suggesting to use a newer dialect is helpful.

 * name-lookup.c (get_std_name_hint): Add more entries.

Tested x86_64-linux. OK for trunk?


OK.

Jason



Re: [PATCH 1/2] PR c++/91436 fix C++ dialect for std::make_unique fix-it hint

2019-08-13 Thread Jason Merrill

On 8/13/19 9:32 AM, Jonathan Wakely wrote:

 * g++.dg/lookup/missing-std-include-6.C: Don't check make_unique in
 test that runs for C++11.


I'm not comfortable removing this test coverage entirely.  Doesn't it 
give a useful diagnostic in C++11 mode as well?


Jason


Re: [PATCH] fold more string comparison with known result (PR 90879)

2019-08-13 Thread Jeff Law
On 8/9/19 10:51 AM, Martin Sebor wrote:
> 
> PR tree-optimization/90879 - fold zero-equality of strcmp between a longer 
> string and a smaller array
> 
> gcc/c-family/ChangeLog:
> 
>   PR tree-optimization/90879
>   * c.opt (-Wstring-compare): New option.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/90879
>   * gcc.dg/Wstring-compare-2.c: New test.
>   * gcc.dg/Wstring-compare.c: New test.
>   * gcc.dg/strcmpopt_3.c: Scan the optimized dump instead of strlen.
>   * gcc.dg/strcmpopt_6.c: New test.
>   * gcc.dg/strlenopt-65.c: Remove unnecessary declarations, add
>   test cases.
>   * gcc.dg/strlenopt-66.c: Run it.
>   * gcc.dg/strlenopt-68.c: New test.
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/90879
>   * builtins.c (check_access): Avoid using maxbound when null.
>   * calls.c (maybe_warn_nonstring_arg): Adjust to get_range_strlen change.
>   * doc/invoke.texi (-Wstring-compare): Document new warning option.
>   * gimple-fold.c (get_range_strlen_tree): Make setting maxbound
>   conditional.
>   (get_range_strlen): Overwrite initial maxbound when non-null.
>   * gimple-ssa-sprintf.c (get_string_length): Adjust to get_range_strlen
>   change.
>   * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Same.
>   (used_only_for_zero_equality): New function.
>   (handle_builtin_memcmp): Call it.
>   (determine_min_objsize): Return an integer instead of tree.
>   (get_len_or_size, strxcmp_eqz_result): New functions.
>   (maybe_warn_pointless_strcmp): New function.
>   (handle_builtin_string_cmp): Call it.  Fold zero-equality of strcmp
>   between a longer string and a smaller array.
> 

> diff --git a/gcc/tree-ssa-strlen.c b/gcc/tree-ssa-strlen.c
> index 4af47855e7c..31e012b741b 100644
> --- a/gcc/tree-ssa-strlen.c
> +++ b/gcc/tree-ssa-strlen.c

> @@ -3079,196 +3042,388 @@ determine_min_objsize (tree dest)
>  
>type = TYPE_MAIN_VARIANT (type);
>  
> -  /* We cannot determine the size of the array if it's a flexible array,
> - which is declared at the end of a structure.  */
> -  if (TREE_CODE (type) == ARRAY_TYPE
> -  && !array_at_struct_end_p (dest))
> +  /* The size of a flexible array cannot be determined.  Otherwise,
> + for arrays with more than one element, return the size of its
> + type.  GCC itself misuses arrays of both zero and one elements
> + as flexible array members so they are excluded as well.  */
> +  if (TREE_CODE (type) != ARRAY_TYPE
> +  || !array_at_struct_end_p (dest))
>  {
> -  tree size_t = TYPE_SIZE_UNIT (type);
> -  if (size_t && TREE_CODE (size_t) == INTEGER_CST
> -   && !integer_zerop (size_t))
> -return size_t;
> +  tree type_size = TYPE_SIZE_UNIT (type);
> +  if (type_size && TREE_CODE (type_size) == INTEGER_CST
> +   && !integer_onep (type_size)
> +   && !integer_zerop (type_size))
> +return tree_to_uhwi (type_size);
So I nearly commented on this when looking at the original patch.  Can
we really depend on the size when we've got an array at the end of a
struct with a declared size other than 0/1?   While 0/1 are by far the
most common way to declare them, couldn't someone have used other sizes?
 I think we pondered doing that at one time to cut down on the noise
from Coverity for RTL and TREE operand accessors.

Your code makes us safer, so I'm not saying you've done anything wrong,
just trying to decide if we need to tighten this up even further.

No additional comments beyond what I pointed out yesterday against the
original patch.

Jeff

>


Re: [PATCH][RFC][x86] Fix PR91154, add SImode smax, allow SImode add in SSE regs

2019-08-13 Thread H.J. Lu
On Tue, Aug 13, 2019 at 8:20 AM Jeff Law  wrote:
>
> On 8/12/19 6:27 AM, Richard Biener wrote:
> > On Fri, 9 Aug 2019, Uros Bizjak wrote:
> >
> >> On Fri, Aug 9, 2019 at 3:00 PM Richard Biener  wrote:
> >>
>  (define_mode_iterator MAXMIN_IMODE [SI "TARGET_SSE4_1"] [DI 
>  "TARGET_AVX512F"])
> 
>  and then we need to split DImode for 32bits, too.
> >>>
> >>> For now, please add "TARGET_64BIT && TARGET_AVX512F" for DImode
> >>> condition, I'll provide _doubleword splitter later.
> >>
> >> Shouldn't that be TARGET_AVX512VL instead?  Or does the insn use 
> >> %g0 etc.
> >> to force use of %zmmN?
> >
> > It generates V4SI mode, so - yes, AVX512VL.
> 
>  case SMAX:
>  case SMIN:
>  case UMAX:
>  case UMIN:
>    if ((mode == DImode && (!TARGET_64BIT || !TARGET_AVX512VL))
>    || (mode == SImode && !TARGET_SSE4_1))
>  return false;
> 
>  so there's no way to use AVX512VL for 32bit?
> >>>
> >>> There is a way, but on 32bit targets, we need to split DImode
> >>> operation to a sequence of SImode operations for unconverted pattern.
> >>> This is of course doable, but somehow more complex than simply
> >>> emitting a DImode compare + DImode cmove, which is what current
> >>> splitter does. So, a follow-up task.
> >>
> >> Please find attached the complete .md part that enables SImode for
> >> TARGET_SSE4_1 and DImode for TARGET_AVX512VL for both, 32bit and 64bit
> >> targets. The patterns also allows for memory operand 2, so STV has
> >> chance to create the vector pattern with implicit load. In case STV
> >> fails, the memory operand 2 is loaded to the register first;  operand
> >> 2 is used in compare and cmove instruction, so pre-loading of the
> >> operand should be beneficial.
> >
> > Thanks.
> >
> >> Also note, that splitting should happen rarely. Due to the cost
> >> function, STV should effectively always convert minmax to a vector
> >> insn.
> >
> > I've analyzed the 464.h264ref slowdown on Haswell and it is due to
> > this kind of "simple" conversion:
> >
> >   5.50 │1d0:   test   %esi,%esi
> >   0.07 │   mov    $0x0,%eax
> >│   cmovs  %eax,%esi
> >   5.84 │   imul   %r8d,%esi
> >
> > to
> >
> >   0.65 │1e0:   vpxor  %xmm0,%xmm0,%xmm0
> >   0.32 │   vpmaxsd -0x10(%rsp),%xmm0,%xmm0
> >  40.45 │   vmovd  %xmm0,%eax
> >   2.45 │   imul   %r8d,%eax
> >
> > which looks like a RA artifact in the end.  We spill %esi only
> > with -mstv here as STV introduces a (subreg:V4SI ...) use
> > of a pseudo ultimatively set from di.  STV creates an additional
> > pseudo for this (copy-in) but it places that copy next to the
> > original def rather than next to the start of the chain it
> > converts which is probably the issue why we spill.  And this
> > is because it inserts those at each definition of the pseudo
> > rather than just at the reaching definition(s) or at the
> > uses of the pseudo in the chain (that because there may be
> > defs of that pseudo in the chain itself).  Note that STV emits
> > such "conversion" copies as simple reg-reg moves:
> >
> > (insn 1094 3 4 2 (set (reg:SI 777)
> > (reg/v:SI 438 [ y ])) "refbuf.i":4:1 -1
> >  (nil))
> >
> > but those do not prevail very long (this one gets removed by CSE2).
> > So IRA just sees the (subreg:V4SI (reg/v:SI 438 [ y ]) 0) use
> > and computes
> >
> > r438: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
> > a297(r438,l0) costs: SSE_REGS:5628,5628 MEM:3618,3618
> >
> > so I wonder if STV shouldn't instead emit gpr->xmm moves
> > here (but I guess nothing again prevents RTL optimizers from
> > combining that with the single-use in the max instruction...).
> >
> > So this boils down to STV splitting live-ranges but other
> > passes undoing that and then RA not considering splitting
> > live-ranges here, arriving at unoptimal allocation.
> >
> > A testcase showing this issue is (simplified from 464.h264ref
> > UMVLine16Y_11):
> >
> > unsigned short
> > UMVLine16Y_11 (short unsigned int * Pic, int y, int width)
> > {
> >   if (y != width)
> > {
> >   y = y < 0 ? 0 : y;
> >   return Pic[y * width];
> > }
> >   return Pic[y];
> > }
> >
> > where the condition and the Pic[y] load mimics the other use of y.
> > Different, even worse spilling is generated by
> >
> > unsigned short
> > UMVLine16Y_11 (short unsigned int * Pic, int y, int width)
> > {
> >   y = y < 0 ? 0 : y;
> >   return Pic[y * width] + y;
> > }
> 

[Darwin, committed] There is no need to distinguish PIC/non-PIC symbol stubs.

2019-08-13 Thread Iain Sandoe
As per $SUBJECT,

We can use a single flag for both pic and non-pic.
Rename this now, before a confusing name gets into the wild.

Other than the name, this is NFC.

tested on i686/powerpc/x86-64 darwin, x86_64-linux-gnu, powerpc64-linux-gnu
applied to mainline.

thanks
Iain

gcc/

2019-08-13 Iain Sandoe 

* config/darwin.c (machopic_indirect_call_target): Rename symbol stub
flag.
(darwin_override_options): Likewise.
* config/darwin.h: Likewise.
* config/darwin.opt: Likewise.
* config/i386/i386.c (output_pic_addr_const): Likewise.
* config/rs6000/darwin.h: Likewise.
* config/rs6000/rs6000.c (rs6000_call_darwin_1): Likewise.
* config/i386/darwin.h (TARGET_MACHO_PICSYM_STUBS): Rename to ...
... this TARGET_MACHO_SYMBOL_STUBS.
(FUNCTION_PROFILER): Likewise.
* config/i386/i386.h: Likewise.

gcc/testsuite/

2019-08-13  Iain Sandoe  

* obj-c++.dg/stubify-1.mm: Rename symbol stub option.
* obj-c++.dg/stubify-2.mm: Likewise.
* objc.dg/stubify-1.m: Likewise.
* objc.dg/stubify-2.m: Likewise.

diff --git a/gcc/config/darwin.c b/gcc/config/darwin.c
index 5ac092583b..fdd23c49bb 100644
--- a/gcc/config/darwin.c
+++ b/gcc/config/darwin.c
@@ -788,7 +788,7 @@ machopic_indirect_data_reference (rtx orig, rtx reg)
 rtx
 machopic_indirect_call_target (rtx target)
 {
-  if (! darwin_picsymbol_stubs)
+  if (! darwin_symbol_stubs)
 return target;
 
   if (GET_CODE (target) != MEM)
@@ -3268,13 +3268,13 @@ darwin_override_options (void)
  Linkers that don't need stubs, don't need the EH symbol markers either.
   */
 
-  if (!global_options_set.x_darwin_picsymbol_stubs)
+  if (!global_options_set.x_darwin_symbol_stubs)
 {
   if (darwin_target_linker)
{
  if (strverscmp (darwin_target_linker, MIN_LD64_OMIT_STUBS) < 0)
{
- darwin_picsymbol_stubs = true;
+ darwin_symbol_stubs = true;
  ld_needs_eh_markers = true;
}
}
@@ -3283,15 +3283,15 @@ darwin_override_options (void)
  /* If we don't know the linker version and we're targeting an old
 system, we know no better than to assume the use of an earlier
 linker.  */
- darwin_picsymbol_stubs = true;
+ darwin_symbol_stubs = true;
  ld_needs_eh_markers = true;
}
 }
-  else if (DARWIN_X86 && darwin_picsymbol_stubs && TARGET_64BIT)
+  else if (DARWIN_X86 && darwin_symbol_stubs && TARGET_64BIT)
 {
   inform (input_location,
  "%<-mpic-symbol-stubs%> is not required for 64b code (ignored)");
-  darwin_picsymbol_stubs = false;
+  darwin_symbol_stubs = false;
 }
 
   if (generating_for_darwin_version >= 9)
diff --git a/gcc/config/darwin.h b/gcc/config/darwin.h
index ab5ad03f17..93dc638f8b 100644
--- a/gcc/config/darwin.h
+++ b/gcc/config/darwin.h
@@ -1019,7 +1019,7 @@ extern void darwin_driver_init (unsigned int *,struct 
cl_decoded_option **);
_tested_ version known to support this so far.  */
 #define MIN_LD64_NO_COAL_SECTS "236.4"
 
-/* From at least version 62.1, ld64 can build PIC indirection stubs as
+/* From at least version 62.1, ld64 can build symbol indirection stubs as
needed, and there is no need for the compiler to emit them.  */
 #define MIN_LD64_OMIT_STUBS "62.1"
 
diff --git a/gcc/config/darwin.opt b/gcc/config/darwin.opt
index d7e5e7bfca..7f5616cbe0 100644
--- a/gcc/config/darwin.opt
+++ b/gcc/config/darwin.opt
@@ -75,9 +75,9 @@ mone-byte-bool
 Target RejectNegative Report Var(darwin_one_byte_bool)
 Set sizeof(bool) to 1.
 
-mpic-symbol-stubs
-Target Report Var(darwin_picsymbol_stubs) Init(0)
-Force generation of PIC symbol stubs.
+msymbol-stubs
+Target Report Var(darwin_symbol_stubs) Init(0)
+Force generation of external symbol indirection stubs.
 
 ; Some code-gen may be improved / adjusted if the linker is sufficiently 
modern.
 mtarget-linker=
diff --git a/gcc/config/i386/darwin.h b/gcc/config/i386/darwin.h
index 84afedccfb..385d253692 100644
--- a/gcc/config/i386/darwin.h
+++ b/gcc/config/i386/darwin.h
@@ -91,8 +91,8 @@ along with GCC; see the file COPYING3.  If not see
 #define WCHAR_TYPE_SIZE 32
 
 /* Generate pic symbol indirection stubs if this is true.  */
-#undef TARGET_MACHO_PICSYM_STUBS
-#define TARGET_MACHO_PICSYM_STUBS (darwin_picsymbol_stubs)
+#undef TARGET_MACHO_SYMBOL_STUBS
+#define TARGET_MACHO_SYMBOL_STUBS (darwin_symbol_stubs)
 
 /* For compatibility with OSX system tools, use the new style of pic stub
if this is set (default).  */
@@ -243,7 +243,7 @@ along with GCC; see the file COPYING3.  If not see
 #undef FUNCTION_PROFILER
 #define FUNCTION_PROFILER(FILE, LABELNO)   \
   do { \
-if (TARGET_MACHO_PICSYM_STUBS  \
+if (TARGET_MACHO_SYMBOL_STUBS  \

Re: [PATCH] PR fortran/89647 -- Allow host associated procedure to be a binding target

2019-08-13 Thread Steve Kargl
On Tue, Aug 13, 2019 at 09:52:01AM +0300, Janne Blomqvist wrote:
> On Tue, Aug 13, 2019 at 1:32 AM Steve Kargl
>  wrote:
> >
> > The attached patch fixes PR fortran/89647, and has been
> > regression tested on x86_64-*-freebsd.
> >
> > If a procedure is made accessible by host association, then
> > that procedure can be used as a binding target.  During the
> > resolution of a type bound procedure, gfortran needs to
> > check the parent namespace for a procedure before it fails.
> >
> > 2019-08-12  Steven G. Kargl  
> >
> > PR fortran/89647
> > * resolve.c (resolve_typebound_procedure): Allow host associated
> > procedure to be a binding target.  While here, wrap long line.
> >
> > 2019-08-12  Steven G. Kargl  
> >
> > PR fortran/89647
> > * gfortran.dg/pr89647.f90: New test.
> >
> > OK to commit?
> 
> Ok, thanks.

Committed to trunk.
Committed to 9-branch after successful regression testing.

-- 
Steve


[patch, fortran, committed] Fix PR 90563, error while warning about do subscripts

2019-08-13 Thread Thomas Koenig

Hi,

I just committed the patch below as simple and obvious. This
fixes an 8/9/10 regression where a known false positive with
-Wdo-subscript was compounded by an error that was also invalid.

Fixed by suppressing errors at the right time.

I will also commit the patch to the other affected branches.

Regards

Thomas

2019-08-13  Thomas Koenig  

PR fortran/90563
* frontend-passes.c (insert_index): Suppress errors while
simplifying the resulting expression.

2019-08-13  Thomas Koenig  

PR fortran/90563
* gfortran.dg/do_subscript_5.f90: New test.
Index: testsuite/gfortran.dg/do_subscript_5.f90
===
--- testsuite/gfortran.dg/do_subscript_5.f90	(Revision 274394)
+++ testsuite/gfortran.dg/do_subscript_5.f90	(Arbeitskopie)
@@ -1,4 +1,5 @@
 ! { dg-do compile }
+! { dg-additional-options "-Wdo-subscript" }
 ! PR 90563 - this used to be rejected, wrongly
 ! Original test case by Tobias Neumann
 program test
@@ -9,9 +10,11 @@
 
   p = 0.0
 
-  do j=1,6
+  ! The following warnings are actually bogus, but we are not yet
+  ! clever enough to suppress them.
+  do j=1,6 ! { dg-warning "out of bounds" }
   if (j<5) then
-  p(j) = p(swap(j))
+  p(j) = p(swap(j)) ! { dg-warning "out of bounds" }
   endif
   enddo
 end program
! { dg-do compile }
! { dg-additional-options "-Wdo-subscript" }
! PR 90563 - this used to be rejected, wrongly
! Original test case by Tobias Neumann
program test
  implicit none
  integer, parameter :: swap(4) = [2,1,3,4]
  real :: p(20)
  integer :: j

  p = 0.0

  ! The following warnings are actually bogus, but we are not yet
  ! clever enough to suppress them.
  do j=1,6 ! { dg-warning "out of bounds" }
  if (j<5) then
  p(j) = p(swap(j)) ! { dg-warning "out of bounds" }
  endif
  enddo
end program


Re: Use checking forms of DECL_FUNCTION_CODE (PR 91421)

2019-08-13 Thread Jeff Law
On 8/13/19 3:36 AM, Richard Sandiford wrote:
> We were shoe-horning all built-in enumerations (including frontend
> and target-specific ones) into a field of type built_in_function.  This
> was accessed as either an lvalue or an rvalue using DECL_FUNCTION_CODE.
> 
> The obvious danger with this (as was noted by several ??? comments)
> is that the ranges have nothing to do with each other, and targets can
> easily have more built-in functions than generic code.  But my patch to
> make the field bigger was the straw that finally made the problem visible.
> 
> This patch therefore:
> 
> - replaces the field with a plain unsigned int
> 
> - turns DECL_FUNCTION_CODE into an rvalue-only accessor that checks
>   that the function really is BUILT_IN_NORMAL
> 
> - adds corresponding DECL_MD_FUNCTION_CODE and DECL_FE_FUNCTION_CODE
>   accessors for BUILT_IN_MD and BUILT_IN_FRONTEND respectively
> 
> - adds DECL_UNCHECKED_FUNCTION_CODE for places that need to access the
>   underlying field (should be low-level code only)
> 
> - adds new helpers for setting the built-in class and function code
> 
> - makes DECL_BUILT_IN_CLASS an rvalue-only accessor too, since all
>   assignments should go through the new helpers
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu (--enable-languages=all
> --enable-host-shared).  OK to install?
> 
> Richard
> 
> 
> 2019-08-13  Richard Sandiford  
> 
> gcc/
>   PR middle-end/91421
>   * tree-core.h (function_decl::function_code): Change type to
>   unsigned int.
>   * tree.h (DECL_FUNCTION_CODE): Rename old definition to...
>   (DECL_UNCHECKED_FUNCTION_CODE): ...this.
>   (DECL_BUILT_IN_CLASS): Make an rvalue macro only.
>   (DECL_FUNCTION_CODE): New function.  Assert that the built-in class
>   is BUILT_IN_NORMAL.
>   (DECL_MD_FUNCTION_CODE, DECL_FE_FUNCTION_CODE): New functions.
>   (set_decl_built_in_function, copy_decl_built_in_function): Likewise.
>   (fndecl_built_in_p): Change the type of the "name" argument to
>   unsigned int.
>   * builtins.c (expand_builtin): Move DECL_FUNCTION_CODE use
>   after check for DECL_BUILT_IN_CLASS.
>   * cgraphclones.c (build_function_decl_skip_args): Use
>   set_decl_built_in_function.
>   * ipa-param-manipulation.c (ipa_modify_formal_parameters): Likewise.
>   * ipa-split.c (split_function): Likewise.
>   * langhooks.c (add_builtin_function_common): Likewise.
>   * omp-simd-clone.c (simd_clone_create): Likewise.
>   * tree-streamer-in.c (unpack_ts_function_decl_value_fields): Likewise.
>   * config/darwin.c (darwin_init_cfstring_builtins): Likewise.
>   (darwin_fold_builtin): Use DECL_MD_FUNCTION_CODE instead of
>   DECL_FUNCTION_CODE.
>   * fold-const.c (operand_equal_p): Compare DECL_UNCHECKED_FUNCTION_CODE
>   instead of DECL_FUNCTION_CODE.
>   * lto-streamer-out.c (hash_tree): Use DECL_UNCHECKED_FUNCTION_CODE
>   instead of DECL_FUNCTION_CODE.
>   * tree-streamer-out.c (pack_ts_function_decl_value_fields): Likewise.
>   * print-tree.c (print_node): Use DECL_MD_FUNCTION_CODE when
>   printing DECL_BUILT_IN_MD.  Handle DECL_BUILT_IN_FRONTEND.
>   * config/aarch64/aarch64-builtins.c (aarch64_expand_builtin)
>   (aarch64_fold_builtin, aarch64_gimple_fold_builtin): Use
>   DECL_MD_FUNCTION_CODE instead of DECL_FUNCTION_CODE.
>   * config/aarch64/aarch64.c (aarch64_builtin_reciprocal): Likewise.
>   * config/alpha/alpha.c (alpha_expand_builtin, alpha_fold_builtin):
>   (alpha_gimple_fold_builtin): Likewise.
>   * config/arc/arc.c (arc_expand_builtin): Likewise.
>   * config/arm/arm-builtins.c (arm_expand_builtin): Likewise.
>   * config/avr/avr-c.c (avr_resolve_overloaded_builtin): Likewise.
>   * config/avr/avr.c (avr_expand_builtin, avr_fold_builtin): Likewise.
>   * config/bfin/bfin.c (bfin_expand_builtin): Likewise.
>   * config/c6x/c6x.c (c6x_expand_builtin): Likewise.
>   * config/frv/frv.c (frv_expand_builtin): Likewise.
>   * config/gcn/gcn.c (gcn_expand_builtin_1): Likewise.
>   (gcn_expand_builtin): Likewise.
>   * config/i386/i386-builtins.c (ix86_builtin_reciprocal): Likewise.
>   (fold_builtin_cpu): Likewise.
>   * config/i386/i386-expand.c (ix86_expand_builtin): Likewise.
>   * config/i386/i386.c (ix86_fold_builtin): Likewise.
>   (ix86_gimple_fold_builtin): Likewise.
>   * config/ia64/ia64.c (ia64_fold_builtin): Likewise.
>   (ia64_expand_builtin): Likewise.
>   * config/iq2000/iq2000.c (iq2000_expand_builtin): Likewise.
>   * config/mips/mips.c (mips_expand_builtin): Likewise.
>   * config/msp430/msp430.c (msp430_expand_builtin): Likewise.
>   * config/nds32/nds32-intrinsic.c (nds32_expand_builtin_impl): Likewise.
>   * config/nios2/nios2.c (nios2_expand_builtin): Likewise.
>   * config/nvptx/nvptx.c (nvptx_expand_builtin): Likewise.
>   * config/pa/pa.c (pa_expand_builtin): Likewise.
>  

Re: [PATCH] PR fortran/87993 -- An array can have a kind type inquiry suffix

2019-08-13 Thread Steve Kargl
On Tue, Aug 13, 2019 at 09:53:34AM +0300, Janne Blomqvist wrote:
> On Tue, Aug 13, 2019 at 3:35 AM Steve Kargl
>  wrote:
> >
> > > The attached patch has been regression tested on x86_64-*-freebsd.
> > It probably borders on obvious, but I'll ask away.  OK to commit?
> >
> > 2019-08-12  Steven G. Kargl  
> >
> > PR fortran/87993
> > > * expr.c (gfc_simplify_expr): Simplification of an array with a kind
> > type inquiry suffix yields a constant expression.
> >
> > 2019-08-12  Steven G. Kargl  
> >
> > PR fortran/87993
> > * gfortran.dg/pr87993.f90: New test.
> 
> Ok.
> 

Committed to trunk.
Committed to 9-branch after successful regression testing.

-- 
Steve


Re: Optimise constant IFN_WHILE_ULTs

2019-08-13 Thread Jeff Law
On 8/13/19 4:54 AM, Richard Sandiford wrote:
> This patch is a combination of two changes that have to be committed as
> a single unit, one target-independent and one target-specific:
> 
> (1) Try to fold IFN_WHILE_ULTs with constant arguments to a VECTOR_CST
> (which is always possible for fixed-length vectors but is not
> necessarily so for variable-length vectors)
> 
> (2) Make the SVE port recognise constants that map to PTRUE VLn,
> which includes those generated by the new fold.
> 
> (2) can't be tested without (1) and (1) would be a significant
> pessimisation without (2).
> 
> The target-specific parts also start moving towards doing predicate
> manipulation in a canonical VNx16BImode form, using rtx_vector_builders.
> 
> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf and
> x86_64-linux-gnu.  OK for the generic bits (= the first three files
> in the diff)?
> 
> Thanks,
> Richard
> 
> 
> 2019-08-13  Richard Sandiford  
> 
> gcc/
>   * tree.h (build_vector_a_then_b): Declare.
>   * tree.c (build_vector_a_then_b): New function.
>   * fold-const-call.c (fold_while_ult): Likewise.
>   (fold_const_call): Use it to handle IFN_WHILE_ULT.
>   * config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPATTERN): New macro.
>   (aarch64_svpattern): New enum.
>   * config/aarch64/aarch64-sve.md (mov): Pass
>   constants through aarch64_expand_mov_immediate.
>   (*aarch64_sve_mov): Use aarch64_mov_operand rather
>   than general_operand as the predicate for operand 1.
>   (while_ult): Add a '@' marker.
>   * config/aarch64/aarch64.c (simd_immediate_info::PTRUE): New
>   insn_type.
>   (simd_immediate_info::simd_immediate_info): New overload that
>   takes a scalar_int_mode and an svpattern.
>   (simd_immediate_info::u): Add a "pattern" field.
>   (svpattern_token): New function.
>   (aarch64_get_sve_pred_bits, aarch64_widest_sve_pred_elt_size)
>   (aarch64_partial_ptrue_length, aarch64_svpattern_for_vl)
>   (aarch64_sve_move_pred_via_while): New functions.
>   (aarch64_expand_mov_immediate): Try using
>   aarch64_sve_move_pred_via_while for predicates that contain N ones
>   followed by M zeros but that do not correspond to a VLnnn pattern.
>   (aarch64_sve_pred_valid_immediate): New function.
>   (aarch64_simd_valid_immediate): Use it instead of dealing directly
>   with PTRUE and PFALSE.
>   (aarch64_output_sve_mov_immediate): Handle new simd_immediate_info
>   forms.
> 
> gcc/testsuite/
>   * gcc.target/aarch64/sve/spill_2.c: Increase iteration counts
>   beyond the range of a PTRUE.
>   * gcc.target/aarch64/sve/while_6.c: New test.
>   * gcc.target/aarch64/sve/while_7.c: Likewise.
>   * gcc.target/aarch64/sve/while_8.c: Likewise.
>   * gcc.target/aarch64/sve/while_9.c: Likewise.
>   * gcc.target/aarch64/sve/while_10.c: Likewise.
> 
Generic bits are fine.

jeff


[PATCH, i386]: Add missing *mmx_pextrb and *mmx_pextr{b,w}_zext patterns

2019-08-13 Thread Uros Bizjak
We can implement these for TARGET_MMX_WITH_SSE (and TARGET_SSE4_1 in
the case of pextrb).

2019-08-13  Uroš Bizjak  

* config/i386/i386-expand.c (ix86_expand_vector_extract)
<case E_V2SImode>: Use vec_extr path for TARGET_MMX_WITH_SSE && TARGET_SSE4_1.
<case E_V8QImode>: Ditto.
* config/i386/mmx.md (*mmx_pextrw_zext): Rename from mmx_pextrw.
Use SWI48 mode iterator.  Use %k to output operand 0.
(*mmx_pextrw): New insn pattern.
(*mmx_pextrb): Ditto.
(*mmx_pextrb_zext): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 718de73395c4..176347cd4e6d 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -14617,6 +14617,11 @@ ix86_expand_vector_extract (bool mmx_ok, rtx target, 
rtx vec, int elt)
   switch (mode)
 {
 case E_V2SImode:
+  use_vec_extr = TARGET_MMX_WITH_SSE && TARGET_SSE4_1;
+  if (use_vec_extr)
+   break;
+  /* FALLTHRU */
+
 case E_V2SFmode:
   if (!mmx_ok)
break;
@@ -14866,7 +14871,10 @@ ix86_expand_vector_extract (bool mmx_ok, rtx target, 
rtx vec, int elt)
   return;
 
 case E_V8QImode:
+  use_vec_extr = TARGET_MMX_WITH_SSE && TARGET_SSE4_1;
   /* ??? Could extract the appropriate HImode element and shift.  */
+  break;
+
 default:
   break;
 }
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 5ae27c85fe5c..33eb15fae207 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1510,23 +1510,73 @@
(set_attr "prefix" "orig,vex")
(set_attr "mode" "TI")])
 
-(define_insn "mmx_pextrw"
-  [(set (match_operand:SI 0 "register_operand" "=r,r")
-(zero_extend:SI
+(define_insn "*mmx_pextrw"
+  [(set (match_operand:HI 0 "register_sse4nonimm_operand" "=r,r,m")
+   (vec_select:HI
+ (match_operand:V4HI 1 "register_operand" "y,Yv,Yv")
+ (parallel [(match_operand:SI 2 "const_0_to_3_operand" "n,n,n")])))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
+   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "@
+   pextrw\t{%2, %1, %k0|%k0, %1, %2}
+   %vpextrw\t{%2, %1, %k0|%k0, %1, %2}
+   %vpextrw\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "*,sse2,sse4")
+   (set_attr "mmx_isa" "native,*,*")
+   (set_attr "type" "mmxcvt,sselog1,sselog1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "orig,maybe_vex,maybe_vex")
+   (set_attr "mode" "DI,TI,TI")])
+
+(define_insn "*mmx_pextrw_zext"
+  [(set (match_operand:SWI48 0 "register_operand" "=r,r")
+   (zero_extend:SWI48
  (vec_select:HI
(match_operand:V4HI 1 "register_operand" "y,Yv")
(parallel [(match_operand:SI 2 "const_0_to_3_operand" "n,n")]))))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE)
&& (TARGET_SSE || TARGET_3DNOW_A)"
   "@
-   pextrw\t{%2, %1, %0|%0, %1, %2}
-   %vpextrw\t{%2, %1, %0|%0, %1, %2}"
+   pextrw\t{%2, %1, %k0|%k0, %1, %2}
+   %vpextrw\t{%2, %1, %k0|%k0, %1, %2}"
   [(set_attr "isa" "*,sse2")
(set_attr "mmx_isa" "native,*")
(set_attr "type" "mmxcvt,sselog1")
(set_attr "length_immediate" "1")
+   (set_attr "prefix" "orig,maybe_vex")
(set_attr "mode" "DI,TI")])
 
+(define_insn "*mmx_pextrb"
+  [(set (match_operand:QI 0 "nonimmediate_operand" "=r,m")
+   (vec_select:QI
+ (match_operand:V8QI 1 "register_operand" "Yv,Yv")
+ (parallel [(match_operand:SI 2 "const_0_to_7_operand" "n,n")])))]
+  "TARGET_MMX_WITH_SSE && TARGET_SSE4_1"
+  "@
+   %vpextrb\t{%2, %1, %k0|%k0, %1, %2}
+   %vpextrb\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_data16" "1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "TI")])
+
+(define_insn "*mmx_pextrb_zext"
+  [(set (match_operand:SWI248 0 "register_operand" "=r")
+   (zero_extend:SWI248
+ (vec_select:QI
+   (match_operand:V8QI 1 "register_operand" "Yv")
+   (parallel [(match_operand:SI 2 "const_0_to_7_operand" "n")]))))]
+  "TARGET_MMX_WITH_SSE && TARGET_SSE4_1"
+  "%vpextrb\t{%2, %1, %k0|%k0, %1, %2}"
+  [(set_attr "type" "sselog1")
+   (set_attr "prefix_data16" "1")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "TI")])
+
 (define_expand "mmx_pshufw"
   [(match_operand:V4HI 0 "register_operand")
(match_operand:V4HI 1 "register_mmxmem_operand")


Re: [PR other/79543] Fix GNU ld --version scanning to conform to the GNU Coding Standards

2019-08-13 Thread Rainer Orth
Hi Joseph,

> However, I strongly encourage a followup to refactor this code 
> (*_CHECK_LINKER_FEATURES and *_ENABLE_SYMVERS that use it, not just the 
> fragment that determines the linker version number), which is evidently 
> duplicated far too much in different target library directories, into 
> common macros in a .m4 file in the top-level config/ directory, so it's 
> more maintainable in future.  (Note 1: I don't know what differences there 
> might be between the versions in different directories; that would need 
> investigating as part of such a refactoring; differences need not be 
> deliberate, they could easily have arisen by accident.  Note 2: although 
> libffi is maintained outside of GCC, I think such a refactoring should 
> still be applied to the libffi directory along with the others; standalone 
> libffi would simply need its own copy of the relevant .m4 file.  Note 3: 
> it should be possible to do such a refactoring bit by bit if that's more 
> approachable, rather than necessarily doing a complete refactoring of all 
> the definitions of all these macros at once.)

as it happens, I've been working on exactly this.  I'd noticed that
*_CHECK_LINKER_FEATURES enabled --gc-sections only with gld although
recent Solaris ld supports it as well.  What's worse, even with gld the
option is detected and used only for some of the libraries due to
inconsistencies between the different versions of the macro.

I'd meant to unify *_ENABLE_SYMVERS for a long time and this seemed the
perfect opportunity to do so, among others because several libs require
it from *_CHECK_LINKER_FEATURES to get the linker version info ;-(

On top of that, there are many different copies of the code that handles
--enable-version-specific-runtime-libs, sets toolexeclibdir and
--enable-generated-files-in-srcdir, as well as several instances of
*_CHECK_ATTRIBUTE_ALIAS, *_CHECK_ATTRIBUTE_DLLEXPORT, and
*_CHECK_ATTRIBUTE_VISIBILITY, and probably more that I've missed so
far.  Overall, we've created an incredible mess here, and I'm probably
partially responsible at least for the *_ENABLE_SYMVERS part.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] Add generic support for "noinit" attribute

2019-08-13 Thread Christophe Lyon
Ping?

On Tue, 30 Jul 2019 at 15:35, Christophe Lyon
 wrote:
>
> Hi,
>
> Thanks for the useful feedback.
>
>
> On Tue, 16 Jul 2019 at 11:54, Richard Sandiford
>  wrote:
> >
> > Thanks for doing this in a generic way.
> >
> > Christophe Lyon  writes:
> > > @@ -2224,6 +2234,50 @@ handle_weak_attribute (tree *node, tree name,
> > >return NULL_TREE;
> > >  }
> > >
> > > +/* Handle a "noinit" attribute; arguments as in struct
> > > +   attribute_spec.handler.  Check whether the attribute is allowed
> > > +   here and add the attribute to the variable decl tree or otherwise
> > > +   issue a diagnostic.  This function checks NODE is of the expected
> > > +   type and issues diagnostics otherwise using NAME.  If it is not of
> > > +   the expected type *NO_ADD_ATTRS will be set to true.  */
> > > +
> > > +static tree
> > > +handle_noinit_attribute (tree * node,
> > > +   tree   name,
> > > +   tree   args,
> > > +   intflags ATTRIBUTE_UNUSED,
> > > +   bool *no_add_attrs)
> > > +{
> > > +  const char *message = NULL;
> > > +
> > > +  gcc_assert (DECL_P (*node));
> > > +  gcc_assert (args == NULL);
> > > +
> > > +  if (TREE_CODE (*node) != VAR_DECL)
> > > +message = G_("%qE attribute only applies to variables");
> > > +
> > > +  /* Check that it's possible for the variable to have a section.  */
> > > +  if ((TREE_STATIC (*node) || DECL_EXTERNAL (*node) || in_lto_p)
> > > +  && DECL_SECTION_NAME (*node))
> > > +message = G_("%qE attribute cannot be applied to variables "
> > > +  "with specific sections");
> > > +
> > > +  /* If this var is thought to be common, then change this.  Common
> > > + variables are assigned to sections before the backend has a
> > > + chance to process them.  */
> > > +  if (DECL_COMMON (*node))
> > > +DECL_COMMON (*node) = 0;
> > > +
> > > +  if (message)
> > > +{
> > > +  warning (OPT_Wattributes, message, name);
> > > +  *no_add_attrs = true;
> > > +}
> > > +
> > > +  return NULL_TREE;
> > > +}
> >
> > This might cause us to clear DECL_COMMON even when rejecting the
> > attribute.  Also, the first message should win over the others
> > (e.g. for a function in a particular section).
> >
> > So I think the "/* Check that it's possible ..." should be an else-if
> > and the DECL_COMMON stuff should be specific to !message.
>
> You are right, thanks.
>
> Jozef, this is true for msp430_data_attr() too. I'll let you update it.
>
> >
> > Since this is specific to ELF targets, I think we should also
> > warn if !targetm.have_switchable_bss_sections.
> >
> OK
>
> > > @@ -2338,6 +2336,8 @@ msp430_select_section (tree decl, int reloc, 
> > > unsigned HOST_WIDE_INT align)
> > >  {
> > >if (TREE_CODE (decl) == FUNCTION_DECL)
> > >   return text_section;
> > > +  /* FIXME: ATTR_NOINIT is handled generically in
> > > +  default_elf_select_section.  */
> > >else if (has_attr (ATTR_NOINIT, decl))
> > >   return noinit_section;
> > >else if (has_attr (ATTR_PERSIST, decl))
> >
> > I guess adding a FIXME is OK.  It's very tempting to remove
> > the noinit stuff and use default_elf_select_section instead of
> > default_select_section, but I realise that'd be difficult to test.
>
> I added that as suggested by Jozef, it's best if he handles the
> change you propose, I guess he can do proper testing.
>
>
> > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> > > index f2619e1..850153e 100644
> > > --- a/gcc/doc/extend.texi
> > > +++ b/gcc/doc/extend.texi
> > > @@ -7129,6 +7129,12 @@ The @code{visibility} attribute is described in
> > >  The @code{weak} attribute is described in
> > >  @ref{Common Function Attributes}.
> > >
> > > +@item noinit
> > > +@cindex @code{noinit} variable attribute
> > > +Any data with the @code{noinit} attribute will not be initialized by
> > > +the C runtime startup code, or the program loader.  Not initializing
> > > +data in this way can reduce program startup times.
> > > +
> > >  @end table
> > >
> > >  @node ARC Variable Attributes
> >
> > Would be good to mention that the attribute is specific to ELF targets.
> > (Yeah, we don't seem to do that consistently for other attributes.)
> > It might also be worth saying that it requires specific linker support.
> OK, done.
>
> >
> > > diff --git a/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c 
> > > b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c
> > > new file mode 100644
> > > index 000..f33c0d0
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.c-torture/execute/noinit-attribute.c
> > > @@ -0,0 +1,59 @@
> > > +/* { dg-do run } */
> > > +/* { dg-require-effective-target noinit } */
> > > +/* { dg-options "-O2" } */
> > > +
> > > +/* This test checks that noinit data is handled correctly.  */
> > > +
> > > +extern void _start (void) __attribute__ ((noreturn));
> > > +extern void abort (void) __attribute__ ((noreturn));
> > > +extern void exit (int) 

Re: [PR other/79543] Fix GNU ld --version scanning to conform to the GNU Coding Standards

2019-08-13 Thread Joseph Myers
On Thu, 4 Jul 2019, Chung-Lin Tang wrote:

> Bringing back this issue, as this is still bothering our OpenACC toolchains.
> 
> If the main variance in format was the 2007 ' ' to '.' change for non-release
> binutils builds, then is the attached patch okay?
> 
> What the patch does is to first look for an 8-digit part at the end
> (preceded by either a space or '.'), chop it off if it exists, and then take
> the last
> space-preceded string.

This patch is using \+ with sed, which is a GNU extension rather than part 
of POSIX sed.  I don't think we require GNU sed to build GCC.

The patch is OK for trunk with \+ changed to \{1,\} throughout (presuming 
that does indeed work).
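For anyone making the same change elsewhere: \{1,\} is the POSIX BRE interval spelling of "one or more", so it behaves like GNU sed's \+ while staying portable to non-GNU sed. A small sketch:

```shell
# POSIX-portable equivalent of GNU sed's \+ ("one or more"):
# \{1,\} is a standard BRE interval, accepted by non-GNU sed too.
printf 'GNU ld (GNU Binutils) 2.32\n' | sed -e 's/[0-9]\{1,\}/N/g'
# each run of digits collapses to a single N
```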

However, I strongly encourage a followup to refactor this code 
(*_CHECK_LINKER_FEATURES and *_ENABLE_SYMVERS that use it, not just the 
fragment that determines the linker version number), which is evidently 
duplicated far too much in different target library directories, into 
common macros in a .m4 file in the top-level config/ directory, so it's 
more maintainable in future.  (Note 1: I don't know what differences there 
might be between the versions in different directories; that would need 
investigating as part of such a refactoring; differences need not be 
deliberate, they could easily have arisen by accident.  Note 2: although 
libffi is maintained outside of GCC, I think such a refactoring should 
still be applied to the libffi directory along with the others; standalone 
libffi would simply need its own copy of the relevant .m4 file.  Note 3: 
it should be possible to do such a refactoring bit by bit if that's more 
approachable, rather than necessarily doing a complete refactoring of all 
the definitions of all these macros at once.)

> (seeking approval for trunk and all active release branches)

For release branches you should at least wait a few weeks to see if any 
issues show up on trunk.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Add missing popcount simplifications (PR90693)

2019-08-13 Thread Andrew Pinski
On Tue, Aug 13, 2019 at 8:50 AM Wilco Dijkstra  wrote:
>
> Add simplifications for popcount (x) > 1 to (x & (x-1)) != 0 and
> popcount (x) == 1 into (x-1) < (-x & x).  Only handle single-use
> cases and support an optional convert.  A microbenchmark
> shows a speedup of 2-2.5x on both x64 and AArch64.
>
> Bootstrap OK, OK for commit?

I think this should be in expand stage where there could be comparison
of the cost of the RTLs.
The only reason why it is faster for AARCH64 is the requirement of
moving between the GPRs and the SIMD registers.

Thanks,
Andrew Pinski

>
> ChangeLog:
> 2019-08-13  Wilco Dijkstra  
>
> gcc/
> PR middle-end/90693
> * match.pd: Add popcount simplifications.
>
> testsuite/
> PR middle-end/90693
> * gcc.dg/fold-popcount-5.c: Add new test.
>
> ---
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 
> 0317bc704f771f626ab72189b3a54de00087ad5a..bf4351a330f45f3a1424d9792cefc3da6267597d
>  100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -5356,7 +5356,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> rep (eq eq ne ne)
>  (simplify
>(cmp (popcount @0) integer_zerop)
> -  (rep @0 { build_zero_cst (TREE_TYPE (@0)); }
> +  (rep @0 { build_zero_cst (TREE_TYPE (@0)); })))
> +  /* popcount(X) == 1 -> (X-1) < (-X & X).  */
> +  (for cmp (eq ne)
> +   rep (lt ge)
> +(simplify
> +  (cmp (convert? (popcount:s @0)) integer_onep)
> +  (with {
> + tree utype = unsigned_type_for (TREE_TYPE (@0));
> + tree a0 = fold_convert (utype, @0); }
> +   (rep (plus { a0; } { build_minus_one_cst (utype); })
> +(bit_and (negate { a0; }) { a0; })
> +  /* popcount(X) > 1 -> (X & (X-1)) != 0.  */
> +  (for cmp (gt le)
> +   rep (ne eq)
> +(simplify
> +  (cmp (convert? (popcount:s @0)) integer_onep)
> +  (rep (bit_and (plus @0 { build_minus_one_cst (TREE_TYPE (@0)); }) @0)
> +  { build_zero_cst (TREE_TYPE (@0)); }
>
>  /* Simplify:
>
> diff --git a/gcc/testsuite/gcc.dg/fold-popcount-5.c 
> b/gcc/testsuite/gcc.dg/fold-popcount-5.c
> new file mode 100644
> index 
> ..fcf3910587caacb8e39cf437dc3971df892f405a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/fold-popcount-5.c
> @@ -0,0 +1,69 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> +/* Test popcount (x) > 1 -> (x & (x-1)) != 0.  */
> +
> +int test_1 (long x)
> +{
> +  return __builtin_popcountl (x) >= 2;
> +}
> +
> +int test_2 (int x)
> +{
> +  return (unsigned) __builtin_popcount (x) <= 1u;
> +}
> +
> +int test_3 (unsigned x)
> +{
> +  return __builtin_popcount (x) > 1u;
> +}
> +
> +int test_4 (unsigned long x)
> +{
> +  return (unsigned char) __builtin_popcountl (x) > 1;
> +}
> +
> +int test_5 (unsigned long x)
> +{
> +  return (signed char) __builtin_popcountl (x) <= (signed char)1;
> +}
> +
> +int test_6 (unsigned long long x)
> +{
> +  return 2u <= __builtin_popcountll (x);
> +}
> +
> +/* Test popcount (x) == 1 -> (x-1) < (-x & x).  */
> +
> +int test_7 (unsigned long x)
> +{
> +  return __builtin_popcountl (x) != 1;
> +}
> +
> +int test_8 (long long x)
> +{
> +  return (unsigned) __builtin_popcountll (x) == 1u;
> +}
> +
> +int test_9 (int x)
> +{
> +  return (unsigned char) __builtin_popcount (x) != 1u;
> +}
> +
> +int test_10 (unsigned x)
> +{
> +  return (unsigned char) __builtin_popcount (x) == 1;
> +}
> +
> +int test_11 (long x)
> +{
> +  return (signed char) __builtin_popcountl (x) == 1;
> +}
> +
> +int test_12 (long x)
> +{
> +  return 1u == __builtin_popcountl (x);
> +}
> +
> +/* { dg-final { scan-tree-dump-times "popcount" 0 "optimized" } } */
> +


Re: [PATCH] Add missing popcount simplifications (PR90693)

2019-08-13 Thread Marc Glisse

On Tue, 13 Aug 2019, Wilco Dijkstra wrote:


Add simplifications for popcount (x) > 1 to (x & (x-1)) != 0 and
popcount (x) == 1 into (x-1) < (-x & x).

Is that true even on targets that have a popcount instruction? (-mpopcnt 
for x64)



diff --git a/gcc/match.pd b/gcc/match.pd
index 
0317bc704f771f626ab72189b3a54de00087ad5a..bf4351a330f45f3a1424d9792cefc3da6267597d
 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5356,7 +5356,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   rep (eq eq ne ne)
(simplify
  (cmp (popcount @0) integer_zerop)
-  (rep @0 { build_zero_cst (TREE_TYPE (@0)); }
+  (rep @0 { build_zero_cst (TREE_TYPE (@0)); })))
+  /* popcount(X) == 1 -> (X-1) < (-X & X).  */

That doesn't seem right for a gimple transformation. I assume you didn't 
write (convert:utype @0) in the output because you want to avoid doing it 
3 times? IIRC you are allowed to write (convert:utype@1 @0) in the output 
and reuse @1 several times.



+   (rep (plus { a0; } { build_minus_one_cst (utype); })
+(bit_and (negate { a0; }) { a0; })
+  /* popcount(X) > 1 -> (X & (X-1)) != 0.  */
+  (for cmp (gt le)
+   rep (ne eq)
+(simplify
+  (cmp (convert? (popcount:s @0)) integer_onep)
+  (rep (bit_and (plus @0 { build_minus_one_cst (TREE_TYPE (@0)); }) @0)
+  { build_zero_cst (TREE_TYPE (@0)); }


Are there any types where this could be a problem? Say if you cast to a 
1-bit type. Actually, even converting popcnt(__uint128_t(-1)) to signed 
char may be problematic.


--
Marc Glisse


[PATCH] Add missing popcount simplifications (PR90693)

2019-08-13 Thread Wilco Dijkstra
Add simplifications for popcount (x) > 1 to (x & (x-1)) != 0 and
popcount (x) == 1 into (x-1) < (-x & x).  Only handle single-use
cases and support an optional convert.  A microbenchmark shows a
speedup of 2-2.5x on both x64 and AArch64.

gcc/
PR middle-end/90693
* match.pd: Add popcount simplifications.

testsuite/
PR middle-end/90693
* gcc.dg/fold-popcount-5.c: Add new test.

---

diff --git a/gcc/match.pd b/gcc/match.pd
index 
0317bc704f771f626ab72189b3a54de00087ad5a..bf4351a330f45f3a1424d9792cefc3da6267597d
 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5356,7 +5356,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
rep (eq eq ne ne)
 (simplify
   (cmp (popcount @0) integer_zerop)
-  (rep @0 { build_zero_cst (TREE_TYPE (@0)); }
+  (rep @0 { build_zero_cst (TREE_TYPE (@0)); })))
+  /* popcount(X) == 1 -> (X-1) < (-X & X).  */
+  (for cmp (eq ne)
+   rep (lt ge)
+    (simplify
+      (cmp (convert? (popcount:s @0)) integer_onep)
+      (with {
+ tree utype = unsigned_type_for (TREE_TYPE (@0));
+ tree a0 = fold_convert (utype, @0); }
+       (rep (plus { a0; } { build_minus_one_cst (utype); })
+    (bit_and (negate { a0; }) { a0; })
+  /* popcount(X) > 1 -> (X & (X-1)) != 0.  */
+  (for cmp (gt le)
+   rep (ne eq)
+(simplify
+  (cmp (convert? (popcount:s @0)) integer_onep)
+  (rep (bit_and (plus @0 { build_minus_one_cst (TREE_TYPE (@0)); }) @0)
+  { build_zero_cst (TREE_TYPE (@0)); }
 
 /* Simplify:
 
diff --git a/gcc/testsuite/gcc.dg/fold-popcount-5.c 
b/gcc/testsuite/gcc.dg/fold-popcount-5.c
new file mode 100644
index 
..fcf3910587caacb8e39cf437dc3971df892f405a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-popcount-5.c
@@ -0,0 +1,69 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+/* Test popcount (x) > 1 -> (x & (x-1)) != 0.  */
+
+int test_1 (long x)
+{
+  return __builtin_popcountl (x) >= 2;
+}
+
+int test_2 (int x)
+{
+  return (unsigned) __builtin_popcount (x) <= 1u;
+}
+
+int test_3 (unsigned x)
+{
+  return __builtin_popcount (x) > 1u;
+}
+
+int test_4 (unsigned long x)
+{
+  return (unsigned char) __builtin_popcountl (x) > 1;
+}
+
+int test_5 (unsigned long x)
+{
+  return (signed char) __builtin_popcountl (x) <= (signed char)1;
+}
+
+int test_6 (unsigned long long x)
+{
+  return 2u <= __builtin_popcountll (x);
+}
+
+/* Test popcount (x) == 1 -> (x-1) < (-x & x).  */

Re: [PATCH][RFC][x86] Fix PR91154, add SImode smax, allow SImode add in SSE regs

2019-08-13 Thread Jeff Law
On 8/12/19 6:27 AM, Richard Biener wrote:
> On Fri, 9 Aug 2019, Uros Bizjak wrote:
> 
>> On Fri, Aug 9, 2019 at 3:00 PM Richard Biener  wrote:
>>
 (define_mode_iterator MAXMIN_IMODE [SI "TARGET_SSE4_1"] [DI 
 "TARGET_AVX512F"])

 and then we need to split DImode for 32bits, too.
>>>
>>> For now, please add "TARGET_64BIT && TARGET_AVX512F" for DImode
>>> condition, I'll provide _doubleword splitter later.
>>
>> Shouldn't that be TARGET_AVX512VL instead?  Or does the insn use %g0 
>> etc.
>> to force use of %zmmN?
>
> It generates V4SI mode, so - yes, AVX512VL.

 case SMAX:
 case SMIN:
 case UMAX:
 case UMIN:
   if ((mode == DImode && (!TARGET_64BIT || !TARGET_AVX512VL))
   || (mode == SImode && !TARGET_SSE4_1))
 return false;

 so there's no way to use AVX512VL for 32bit?
>>>
>>> There is a way, but on 32bit targets, we need to split DImode
>>> operation to a sequence of SImode operations for unconverted pattern.
>>> This is of course doable, but somehow more complex than simply
>>> emitting a DImode compare + DImode cmove, which is what current
>>> splitter does. So, a follow-up task.
>>
>> Please find attached the complete .md part that enables SImode for
>> TARGET_SSE4_1 and DImode for TARGET_AVX512VL for both, 32bit and 64bit
>> targets. The patterns also allows for memory operand 2, so STV has
>> chance to create the vector pattern with implicit load. In case STV
>> fails, the memory operand 2 is loaded to the register first;  operand
>> 2 is used in compare and cmove instruction, so pre-loading of the
>> operand should be beneficial.
>
> Thanks.
>
>> Also note, that splitting should happen rarely. Due to the cost
>> function, STV should effectively always convert minmax to a vector
>> insn.
>
> I've analyzed the 464.h264ref slowdown on Haswell and it is due to
> this kind of "simple" conversion:
>
>   5.50 │1d0:   test   %esi,%esi
>   0.07 │   mov$0x0,%eax
>│   cmovs  %eax,%esi
>   5.84 │   imul   %r8d,%esi
>
> to
>
>   0.65 │1e0:   vpxor  %xmm0,%xmm0,%xmm0
>   0.32 │   vpmaxsd -0x10(%rsp),%xmm0,%xmm0
>  40.45 │   vmovd  %xmm0,%eax
>   2.45 │   imul   %r8d,%eax
>
> which looks like a RA artifact in the end.  We spill %esi only
> with -mstv here as STV introduces a (subreg:V4SI ...) use
> of a pseudo ultimatively set from di.  STV creates an additional
> pseudo for this (copy-in) but it places that copy next to the
> original def rather than next to the start of the chain it
> converts which is probably the issue why we spill.  And this
> is because it inserts those at each definition of the pseudo
> rather than just at the reaching definition(s) or at the
> uses of the pseudo in the chain (that because there may be
> defs of that pseudo in the chain itself).  Note that STV emits
> such "conversion" copies as simple reg-reg moves:
>
> (insn 1094 3 4 2 (set (reg:SI 777)
> (reg/v:SI 438 [ y ])) "refbuf.i":4:1 -1
>  (nil))
>
> but those do not prevail very long (this one gets removed by CSE2).
> So IRA just sees the (subreg:V4SI (reg/v:SI 438 [ y ]) 0) use
> and computes
>
> r438: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
> a297(r438,l0) costs: SSE_REGS:5628,5628 MEM:3618,3618
>
> so I wonder if STV shouldn't instead emit gpr->xmm moves
> here (but I guess nothing again prevents RTL optimizers from
> combining that with the single-use in the max instruction...).
>
> So this boils down to STV splitting live-ranges but other
> passes undoing that and then RA not considering splitting
> live-ranges here, arriving at unoptimal allocation.
>
> A testcase showing this issue is (simplified from 464.h264ref
> UMVLine16Y_11):
>
> unsigned short
> UMVLine16Y_11 (short unsigned int * Pic, int y, int width)
> {
>   if (y != width)
> {
>   y = y < 0 ? 0 : y;
>   return Pic[y * width];
> }
>   return Pic[y];
> }
>
> where the condition and the Pic[y] load mimics the other use of y.
> Different, even worse spilling is generated by
>
> unsigned short
> UMVLine16Y_11 (short unsigned int * Pic, int y, int width)
> {
>   y = y < 0 ? 0 : y;
>   return Pic[y * width] + y;
> }
>
> I guess this all shows that STVs "trick" of simply wrapping
> integer mode pseudos in (subreg:vector-mode ...) is bad?
>
> I've added a (failing) testcase to reflect the above.

 Experimenting a bit with just for the conversion insns using
 V4SImode 

Re: [PATCH] Add --with-static-standard-libraries to the top level

2019-08-13 Thread Tom Tromey
Jonathan> What I don't understand is why GDB crashes. It should still be able to
Jonathan> catch exceptions from a shared library even if linked to libstdc++.a,
Jonathan> unless the static libstdc++.a is somehow incompatible with the shared
Jonathan> libstdc++.so the shared lib linked to.

Jonathan> Is this on GNU/Linux, or something with a different linking model?

GNU/Linux, Fedora 29 in particular.  I didn't look into why it fails but
the gcc docs specifically mention this problem:

'-static-libgcc'
[...]
 There are several situations in which an application should use the
 shared 'libgcc' instead of the static version.  The most common of
 these is when the application wishes to throw and catch exceptions
 across different shared libraries.  In that case, each of the
 libraries as well as the application itself should use the shared
 'libgcc'.


I was able to reproduce it with a simple test program:

$ cd /tmp
$ cat t.cc 
void thrower() {
  throw 23;
}
$ g++ -fPIC -shared -o libt.so t.cc
$ cat a.cc
extern void thrower ();

int main () {
  try {
thrower ();
  } catch (...) {
  }
  return 0;
}
$ g++ -o a -static-libgcc -static-libstdc++ a.cc -L$(pwd) -lt
$ LD_LIBRARY_PATH=$(pwd) ./a
Aborted (core dumped)

thanks,
Tom


Re: [PATCH][RFC][x86] Fix PR91154, add SImode smax, allow SImode add in SSE regs

2019-08-13 Thread Jeff Law
On 8/9/19 7:00 AM, Richard Biener wrote:
> On Fri, 9 Aug 2019, Richard Biener wrote:
> 
>> On Fri, 9 Aug 2019, Richard Biener wrote:
>>
>>> On Fri, 9 Aug 2019, Uros Bizjak wrote:
>>>
 On Mon, Aug 5, 2019 at 3:09 PM Uros Bizjak  wrote:

>> (define_mode_iterator MAXMIN_IMODE [SI "TARGET_SSE4_1"] [DI "TARGET_AVX512F"])
>>
>> and then we need to split DImode for 32bits, too.
>
> For now, please add "TARGET_64BIT && TARGET_AVX512F" for DImode
> condition, I'll provide _doubleword splitter later.

 Shouldn't that be TARGET_AVX512VL instead?  Or does the insn use %g0 etc.
 to force use of %zmmN?
>>>
>>> It generates V4SI mode, so - yes, AVX512VL.
>>
>> case SMAX:
>> case SMIN:
>> case UMAX:
>> case UMIN:
>>   if ((mode == DImode && (!TARGET_64BIT || !TARGET_AVX512VL))
>>   || (mode == SImode && !TARGET_SSE4_1))
>> return false;
>>
>> so there's no way to use AVX512VL for 32bit?
>
> There is a way, but on 32bit targets, we need to split DImode
> operation to a sequence of SImode operations for unconverted pattern.
> This is of course doable, but somehow more complex than simply
> emitting a DImode compare + DImode cmove, which is what current
> splitter does. So, a follow-up task.

 Please find attached the complete .md part that enables SImode for
 TARGET_SSE4_1 and DImode for TARGET_AVX512VL for both, 32bit and 64bit
 targets. The patterns also allows for memory operand 2, so STV has
 chance to create the vector pattern with implicit load. In case STV
 fails, the memory operand 2 is loaded to the register first;  operand
 2 is used in compare and cmove instruction, so pre-loading of the
 operand should be beneficial.
>>>
>>> Thanks.
>>>
 Also note, that splitting should happen rarely. Due to the cost
 function, STV should effectively always convert minmax to a vector
 insn.
>>>
>>> I've analyzed the 464.h264ref slowdown on Haswell and it is due to
>>> this kind of "simple" conversion:
>>>
>>>   5.50 │1d0:   test   %esi,%esi
>>>   0.07 │   mov    $0x0,%eax
>>>│   cmovs  %eax,%esi
>>>   5.84 │   imul   %r8d,%esi
>>>
>>> to
>>>
>>>   0.65 │1e0:   vpxor  %xmm0,%xmm0,%xmm0
>>>   0.32 │   vpmaxsd -0x10(%rsp),%xmm0,%xmm0
>>>  40.45 │   vmovd  %xmm0,%eax
>>>   2.45 │   imul   %r8d,%eax
>>>
>>> which looks like a RA artifact in the end.  We spill %esi only
>>> with -mstv here as STV introduces a (subreg:V4SI ...) use
>>> of a pseudo ultimately set from di.  STV creates an additional
>>> pseudo for this (copy-in) but it places that copy next to the
>>> original def rather than next to the start of the chain it
>>> converts which is probably the issue why we spill.  And this
>>> is because it inserts those at each definition of the pseudo
>>> rather than just at the reaching definition(s) or at the
>>> uses of the pseudo in the chain (that because there may be
>>> defs of that pseudo in the chain itself).  Note that STV emits
>>> such "conversion" copies as simple reg-reg moves:
>>>
>>> (insn 1094 3 4 2 (set (reg:SI 777)
>>> (reg/v:SI 438 [ y ])) "refbuf.i":4:1 -1
>>>  (nil))
>>>
>>> but those do not prevail very long (this one gets removed by CSE2).
>>> So IRA just sees the (subreg:V4SI (reg/v:SI 438 [ y ]) 0) use
>>> and computes
>>>
>>> r438: preferred NO_REGS, alternative NO_REGS, allocno NO_REGS
>>> a297(r438,l0) costs: SSE_REGS:5628,5628 MEM:3618,3618
>>>
>>> so I wonder if STV shouldn't instead emit gpr->xmm moves
>>> here (but I guess nothing again prevents RTL optimizers from
>>> combining that with the single-use in the max instruction...).
>>>
>>> So this boils down to STV splitting live-ranges but other
>>> passes undoing that and then RA not considering splitting
>>> live-ranges here, arriving at suboptimal allocation.
>>>
>>> A testcase showing this issue is (simplified from 464.h264ref
>>> UMVLine16Y_11):
>>>
>>> unsigned short
>>> UMVLine16Y_11 (short unsigned int * Pic, int y, int width)
>>> {
>>>   if (y != width)
>>> {
>>>   y = y < 0 ? 0 : y;
>>>   return Pic[y * width];
>>> }
>>>   return Pic[y];
>>> }
>>>
>>> where the condition and the Pic[y] load mimics the other use of y.
>>> Different, even worse spilling is generated by
>>>
>>> unsigned short
>>> UMVLine16Y_11 (short unsigned int * Pic, int y, int width)
>>> {
>>>   y = y < 0 ? 0 : y;
>>>   return Pic[y * width] + y;
>>> }
>>>
>>> I guess this all shows that STV's "trick" of simply wrapping
>>> integer mode pseudos in (subreg:vector-mode ...) is bad?
>>>
>>> I've added a (failing) testcase to reflect the above.
>>
>> Experimenting a bit with just for the conversion insns using
>> V4SImode pseudos we end up preserving those moves (but I
>> do have to use a lowpart set, using reg:V4SI = subreg:V4SI SImode-reg
>> ends up using 

Re: [PATCH] Improve DSE to handle redundant zero initializations.

2019-08-13 Thread Matthew Beliveau
Hello,

This should have the changes you all asked for! Let me know if I
missed anything!

Thank you,
Matthew Beliveau

On Tue, Aug 13, 2019 at 5:15 AM Richard Biener
 wrote:
>
> On Tue, Aug 13, 2019 at 10:18 AM Richard Sandiford
>  wrote:
> >
> > Thanks for doing this.
> >
> > Matthew Beliveau  writes:
> > > diff --git gcc/tree-ssa-dse.c gcc/tree-ssa-dse.c
> > > index 5b7c4fc6d1a..dcaeb8edbfe 100644
> > > --- gcc/tree-ssa-dse.c
> > > +++ gcc/tree-ssa-dse.c
> > > @@ -628,11 +628,8 @@ dse_optimize_redundant_stores (gimple *stmt)
> > >tree fndecl;
> > >if ((is_gimple_assign (use_stmt)
> > >  && gimple_vdef (use_stmt)
> > > -&& ((gimple_assign_rhs_code (use_stmt) == CONSTRUCTOR
> > > - && CONSTRUCTOR_NELTS (gimple_assign_rhs1 (use_stmt)) == 0
> > > - && !gimple_clobber_p (stmt))
> > > -|| (gimple_assign_rhs_code (use_stmt) == INTEGER_CST
> > > -&& integer_zerop (gimple_assign_rhs1 (use_stmt)
> > > +&& initializer_zerop (gimple_op (use_stmt, 1), NULL)
> > > +&& !gimple_clobber_p (stmt))
> > > || (gimple_call_builtin_p (use_stmt, BUILT_IN_NORMAL)
> > > && (fndecl = gimple_call_fndecl (use_stmt)) != NULL
> > > && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MEMSET
> > > @@ -1027,15 +1024,13 @@ dse_dom_walker::dse_optimize_stmt (gimple_stmt_iterator *gsi)
> > >  {
> > >bool by_clobber_p = false;
> > >
> > > -  /* First see if this store is a CONSTRUCTOR and if there
> > > -  are subsequent CONSTRUCTOR stores which are totally
> > > -  subsumed by this statement.  If so remove the subsequent
> > > -  CONSTRUCTOR store.
> > > +  /* Check if this store initializes zero, or some aggregate of zeros,
> > > +  and check if there are subsequent stores which are subsumed by this
> > > +  statement.  If so, remove the subsequent store.
> > >
> > >This will tend to make fewer calls into memset with longer
> > >arguments.  */
> > > -  if (gimple_assign_rhs_code (stmt) == CONSTRUCTOR
> > > -   && CONSTRUCTOR_NELTS (gimple_assign_rhs1 (stmt)) == 0
> > > +  if (initializer_zerop (gimple_op (stmt, 1), NULL)
> > > && !gimple_clobber_p (stmt))
> > >   dse_optimize_redundant_stores (stmt);
> > >
> >
> > In addition to Jeff's comment, the original choice of gimple_assign_rhs1
> > is the preferred way to write this (applies to both hunks).
>
> And the !gimple_clobber_p test is now redundant.
>
> > Richard
Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-08-13  Matthew Beliveau  

	* tree-ssa-dse.c (dse_optimize_redundant_stores): Improved check to
	catch more redundant zero initialization cases.
	(dse_dom_walker::dse_optimize_stmt): Likewise.
	
	* gcc.dg/tree-ssa/redundant-assign-zero-1.c: New test.
	* gcc.dg/tree-ssa/redundant-assign-zero-2.c: New test.

diff --git gcc/testsuite/gcc.dg/tree-ssa/redundant-assign-zero-1.c gcc/testsuite/gcc.dg/tree-ssa/redundant-assign-zero-1.c
new file mode 100644
index 000..b8d01d1644b
--- /dev/null
+++ gcc/testsuite/gcc.dg/tree-ssa/redundant-assign-zero-1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse-details" } */
+
+void blah (char *);
+
+void bar ()
+{
+  char a[256] = "";
+  a[3] = 0; 
+  blah (a);
+}
+
+/* { dg-final { scan-tree-dump-times "Deleted redundant store" 1 "dse1"} } */
diff --git gcc/testsuite/gcc.dg/tree-ssa/redundant-assign-zero-2.c gcc/testsuite/gcc.dg/tree-ssa/redundant-assign-zero-2.c
new file mode 100644
index 000..8cefa6f0cb7
--- /dev/null
+++ gcc/testsuite/gcc.dg/tree-ssa/redundant-assign-zero-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-dse-details" } */
+
+#include <string.h>
+
+void blahd (double *);
+
+void fubar ()
+{
+  double d;
+  double *x = &d;
+
+  memset (&d, 0 , sizeof d);
+  *x = 0.0;
+  blahd (x);
+}
+
+/* { dg-final { scan-tree-dump-times "Deleted redundant store" 1 "dse1"} } */
diff --git gcc/tree-ssa-dse.c gcc/tree-ssa-dse.c
index 5b7c4fc6d1a..14b66228e1e 100644
--- gcc/tree-ssa-dse.c
+++ gcc/tree-ssa-dse.c
@@ -628,11 +628,7 @@ dse_optimize_redundant_stores (gimple *stmt)
   tree fndecl;
   if ((is_gimple_assign (use_stmt)
 	   && gimple_vdef (use_stmt)
-	   && ((gimple_assign_rhs_code (use_stmt) == CONSTRUCTOR
-	&& CONSTRUCTOR_NELTS (gimple_assign_rhs1 (use_stmt)) == 0
-	&& !gimple_clobber_p (stmt))
-	   || (gimple_assign_rhs_code (use_stmt) == INTEGER_CST
-		   && integer_zerop (gimple_assign_rhs1 (use_stmt)
+	   && initializer_zerop (gimple_assign_rhs1 (use_stmt)))
 	  || (gimple_call_builtin_p (use_stmt, BUILT_IN_NORMAL)
 	  && (fndecl = gimple_call_fndecl (use_stmt)) != NULL
 	  && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MEMSET
@@ -1027,16 +1023,10 @@ dse_dom_walker::dse_optimize_stmt (gimple_stmt_iterator *gsi)
 {
   bool by_clobber_p = false;
 
-  /* First see if this store is a CONSTRUCTOR and if there

Re: [PATCH, contrib]: Do not escape "=".

2019-08-13 Thread Jeff Law
On 8/13/19 2:55 AM, Uros Bizjak wrote:
> Attached patch fixes the following gawk-5.0.1 warning:
> 
> gawk: cmd. line:36: warning: regexp escape sequence `\=' is not a
> known regexp operator
> 
> 2019-08-13  Uros Bizjak  
> 
> * test_summary: Do not escape "=".
> 
> Tested by building mail-report.log with gawk-5.0.1 and gawk-4.2.1.
> 
> OK for mainline?
> 
> Uros.
> 
OK
jeff


Re: Protect some checks of DECL_FUNCTION_CODE

2019-08-13 Thread Jeff Law
On 8/13/19 3:34 AM, Richard Sandiford wrote:
> This patch protects various uses of DECL_FUNCTION_CODE that didn't
> obviously check for BUILT_IN_NORMAL first (either directly or in callers).
> They could therefore trigger for functions that either aren't built-ins
> or are a different kind of built-in.
> 
> Also, the patch removes a redundant GIMPLE_CALL check from
> optimize_stdarg_builtin, since it gave the impression that the stmt
> was less well checked than it actually is.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu (--enable-languages=all
> --enable-host-shared).  OK to install?
> 
> Richard
> 
> 
> 2019-08-13  Richard Sandiford  
> 
> gcc/
>   PR middle-end/91421
>   * attribs.c (decl_attributes): Check the DECL_BUILT_IN_CLASS
>   before the DECL_FUNCTION_CODE.
>   * calls.c (maybe_warn_alloc_args_overflow): Use fndecl_built_in_p
>   to check for a BUILT_IN_ALLOCA call.
>   * ipa-cp.c (ipa_get_indirect_edge_target_1): Likewise for
>   BUILT_IN_UNREACHABLE.  Don't check for a FUNCTION_TYPE.
>   * ipa-devirt.c (possible_polymorphic_call_target_p): Likewise.
>   * ipa-prop.c (try_make_edge_direct_virtual_call): Likewise.
>   * gimple-ssa-isolate-paths.c (is_addr_local): Check specifically
>   for BUILT_IN_NORMAL functions.
>   * trans-mem.c (expand_block_edges): Use gimple_call_builtin_p to
>   test for BUILT_IN_TM_ABORT.
>   * tree-ssa-ccp.c (optimize_stack_restore): Use fndecl_built_in_p
>   to check for a BUILT_IN_STACK_RESTORE call.
>   (optimize_stdarg_builtin): Remove redundant check for GIMPLE_CALL.
>   * tree-ssa-threadedge.c
>   (record_temporary_equivalences_from_stmts_at_dest): Check for a
>   BUILT_IN_NORMAL decl before checking its DECL_FUNCTION_CODE.
>   * tree-vect-patterns.c (vect_recog_pow_pattern): Use a positive
>   test for a BUILT_IN_NORMAL call instead of a negative test for
>   an internal function call.
> 
> gcc/c/
>   PR middle-end/91421
>   * c-decl.c (header_for_builtin_fn): Take a FUNCTION_DECL instead
>   of a built_in_function.
>   (diagnose_mismatched_decls, implicitly_declare): Update accordingly.
OK
jeff


Re: [PATCH] Improve documentation of target hooks for libc functions

2019-08-13 Thread Jeff Law
On 8/13/19 7:24 AM, Jonathan Wakely wrote:
> * target.def (libc_has_function, libc_has_fast_function): Improve
> documentation strings.
> * doc/tm.texi: Regenerate.
> 
> Bootstrapped x86_64-linux.
> 
> OK for trunk?
> 
> 
OK
jeff


Re: [C++ Patch] Improve grok_array_decl location

2019-08-13 Thread Jason Merrill

On 8/13/19 5:22 AM, Paolo Carlini wrote:

Hi,

for this error message about the types involved in array subscripting I 
think we can exploit the available location argument and point to the 
open square bracket instead of simply using input_location. Tested 
x86_64-linux.


Thanks, Paolo Carlini.





OK.

Jason



Re: C++ PATCH for c++/91416 - GC during late parsing collects live data

2019-08-13 Thread Jason Merrill

On 8/12/19 4:37 AM, Richard Biener wrote:

On Sun, Aug 11, 2019 at 7:12 PM Marek Polacek  wrote:


This is a crash that points to a GC problem.  Consider this test:

   __attribute__ ((unused)) struct S {
 S() { }
   } s;

We're parsing a simple-declaration.  While parsing the decl specs, we parse the
attribute, which means creating a TREE_LIST using ggc_alloc_*.

A function body is a complete-class context so when parsing the
member-specification of this class-specifier, we parse the bodies of the
functions we'd queued in cp_parser_late_parsing_for_member.  This then leads to
this call chain:
cp_parser_function_definition_after_declarator -> expand_or_defer_fn ->
expand_or_defer_fn_1 -> maybe_clone_body -> expand_or_defer_fn ->
cgraph_node::finalize_function -> ggc_collect.

In this test, the ggc_collect call collects the TREE_LIST we had allocated, and
a crash duly ensues.  We can't avoid late parsing of members in this context,
so my fix is to bump function_depth, exactly following cp_parser_lambda_body.
Since we are performing late parsing, we know we have to be nested in a class.
(We still ggc_collect later, in c_parse_final_cleanups.)


So the struct S itself is properly referenced by a GC root?  If so why not
attach the attribute list to the tentative struct instead?  Or do we
fear we have
other non-rooted data live at the point we collect?  If so shouldn't we instead
bump function_depth when parsing a declaration in general?


It's already a significant issue for C++ that we only collect between 
function declarations, I'm concerned that this change will cause more 
memory problems.


Perhaps if we start parsing a class-specifier we could push the 
decl_specifiers onto a vec that is a GC root?


Jason


But here's the thing.  This goes back to ancient r104500, at least.  How has
this not broken before?  All you need to trigger it is to enable GC checking
and have a class with a ctor/member function, that has an attribute.


...and is defined in the declaration of something else?  Does it happen 
without declaring the variable 's'?


Jason


Re: [PATCH 0/3] Libsanitizer: merge from trunk

2019-08-13 Thread Jeff Law
On 8/13/19 7:07 AM, Martin Liska wrote:
> Hi.
> 
> For this year, I decided to make a first merge now and the
> next (much smaller) at the end of October.
> 
> The biggest change is rename of many files from .cc to .cpp.
> 
> I bootstrapped the patch set on x86_64-linux-gnu and run
> asan/ubsan/tsan tests on x86_64, ppc64le (power8) and
> aarch64.
> 
> Libasan SONAME has been already bumped compared to GCC 9.
> 
> For other libraries, I don't see a reason for library bumping:
> 
> $ abidiff /usr/lib64/libubsan.so.1.0.0 ./x86_64-pc-linux-gnu/libsanitizer/ubsan/.libs/libubsan.so.1.0.0 --stat
> Functions changes summary: 0 Removed, 0 Changed, 4 Added functions
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> Function symbols changes summary: 3 Removed, 0 Added function symbols not referenced by debug info
> Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info
> 
> $ abidiff /usr/lib64/libtsan.so.0.0.0 ./x86_64-pc-linux-gnu/libsanitizer/tsan/.libs/libtsan.so.0.0.0 --stat
> Functions changes summary: 0 Removed, 0 Changed, 47 Added functions
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> Function symbols changes summary: 1 Removed, 2 Added function symbols not referenced by debug info
> Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info
> 
> Ready to be installed?
ISTM that a sanitizer merge during stage1 should be able to move forward
without ACKs.  Similarly for other runtimes where we pull from some
upstream master.

I'd be slightly concerned about the function removals, but I don't think
we've really tried to be ABI stable for the sanitizer runtimes.

jeff


Re: C++ PATCH for more checking of DECL_MUTABLE_P

2019-08-13 Thread Jason Merrill

On 8/10/19 2:57 PM, Marek Polacek wrote:

Some preparatory work before adding constinit.  mutable is only appropriate for
FIELD_DECLs (when they're not static), but we've never made sure that we're not
setting DECL_MUTABLE_P on a different _DECL.  I mean to use DECL_LANG_FLAG_0 in
a VAR_DECL for DECL_DECLARED_CONSTINIT_P.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-08-10  Marek Polacek  

* cp-tree.h (DECL_MUTABLE_P): Use FIELD_DECL_CHECK.


OK.

Jason



Re: [C++ Patch] Improve start_function and grokmethod locations

2019-08-13 Thread Jason Merrill

On 8/9/19 7:46 AM, Paolo Carlini wrote:

+   error_at (id_loc,
+ "typedef may not be a function definition");


We might use the location of the typedef keyword.  OK either way.

Jason



Re: C++ PATCH for c++/90473 - wrong code with nullptr in default argument

2019-08-13 Thread Jason Merrill

On 8/8/19 8:56 PM, Marek Polacek wrote:

This is a wrong-code bug where we are throwing away the function call in
the default argument here:

   void fn1 (void* p = (f(), nullptr)) { }

and thus dropping the possible side-effects of that call to the floor.

That happens because check_default_argument returns nullptr_node when it
sees a null_ptr_cst_p argument.  In this case the argument is a COMPOUND_EXPR
"f(), nullptr" and null_ptr_cst_p returns true for it.  And so the effects of
the LHS expression of the compound expr are forgotten.  Fixed as below.

It's tempting to change null_ptr_cst_p to return false when it sees something
that has TREE_SIDE_EFFECTS, but that would be wrong I think: [conv.ptr] says
that a null pointer constant is an integer literal with value zero or a prvalue
of type std::nullptr_t.  An expression "a, b", the built-in comma expression,
is a prvalue if 'b' is an rvalue.  And so "f(), nullptr" is a prvalue of type
std::nullptr_t.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-08-08  Marek Polacek  

PR c++/90473 - wrong code with nullptr in default argument.
* call.c (null_ptr_cst_p): Update quote from the standard.
* decl.c (check_default_argument): Don't return nullptr when the arg
has side-effects.


OK.

Jason



Re: [PATCH][2/2][i386] STV changes, reg-copy as vector

2019-08-13 Thread Richard Biener
On Tue, 13 Aug 2019, Uros Bizjak wrote:

> On Tue, Aug 13, 2019 at 2:33 PM Richard Biener  wrote:
> >
> >
> > The following splits out the change that makes the live-range split /
> > mode change copy pseudos vector mode pseudos.  I've tried reproducing
> > the issue I faced with SImode chains but failed likely because we're
> > never using simple moves in the end while for SImode we can end up
> > with
> >
> > (set (subreg:V4SI (reg:SI 42) 0) (subreg:V4SI (reg:SI 41) 0))
> >
> > which we happily propagate out.  So pursuing this patch independently
> > doesn't make much sense.  Still the main change is in
> > dimode_scalar_chain::make_vector_copies where 'vreg' is now V2DImode
> > instead of DImode and the change to emit_move_insn (vreg, tmp)
> > hints at the change done to the above problematic insn.
> >
> > The rest of the changes deal with some regs already being in
> > the appropriate vector mode.
> >
> > I realize I might have avoided my original issue by emitting not
> > the above reg-reg move but
> >
> >  (set (subreg:V4SI (reg:SI 42) 0)
> >(vec_merge:V4SI
> >   (vec_duplicate:V4SI (reg:SI 41))
> >   (const_vec [0 0 0 0])
> >   (const_1)))
> >
> > but I didn't try (the full patch contained a testcase for the
> > issue).  Still it seems odd to use DImode pseudos when all uses
> > are wrapped in (subreg:V2DI ...).
> 
> It looks to me that the above is the way to go. make_vector_copies
> creates a scalar pseudo, and all other support functions expect a
> scalar that will be processed with "replace_with_subreg". I think that
> the safest way is to emit the code above for SImode for the
> problematic move insn, and
> 
> > +   emit_move_insn (gen_rtx_SUBREG (V2DImode, vreg, 0),
> > +   gen_rtx_VEC_MERGE (V2DImode,
> > +  gen_rtx_VEC_DUPLICATE (V2DImode,
> > + tmp),
> > +  CONST0_RTX (V2DImode),
> > +  GEN_INT (HOST_WIDE_INT_1U)));
> 
> for DImode. This way, you won't need other changes (conditional
> generation of SUBREGs), the above should be enough.
> 
> Oh, and don't use emit_move_insn, emit insn with gen_rtx_SET
> (gen_rtx_SUBREG (...)) should do the trick.

OK, that seems to work on the testcases, which makes splitting out this
patch pointless.  Integrated the change with the full patch.

Thanks,
Richard.


Re: [PATCH][2/2][i386] STV changes, reg-copy as vector

2019-08-13 Thread Uros Bizjak
On Tue, Aug 13, 2019 at 2:33 PM Richard Biener  wrote:
>
>
> The following splits out the change that makes the live-range split /
> mode change copy pseudos vector mode pseudos.  I've tried reproducing
> the issue I faced with SImode chains but failed likely because we're
> never using simple moves in the end while for SImode we can end up
> with
>
> (set (subreg:V4SI (reg:SI 42) 0) (subreg:V4SI (reg:SI 41) 0))
>
> which we happily propagate out.  So pursuing this patch independently
> doesn't make much sense.  Still the main change is in
> dimode_scalar_chain::make_vector_copies where 'vreg' is now V2DImode
> instead of DImode and the change to emit_move_insn (vreg, tmp)
> hints at the change done to the above problematic insn.
>
> The rest of the changes deal with some regs already being in
> the appropriate vector mode.
>
> I realize I might have avoided my original issue by emitting not
> the above reg-reg move but
>
>  (set (subreg:V4SI (reg:SI 42) 0)
>(vec_merge:V4SI
>   (vec_duplicate:V4SI (reg:SI 41))
>   (const_vec [0 0 0 0])
>   (const_1)))
>
> but I didn't try (the full patch contained a testcase for the
> issue).  Still it seems odd to use DImode pseudos when all uses
> are wrapped in (subreg:V2DI ...).

It looks to me that the above is the way to go. make_vector_copies
creates a scalar pseudo, and all other support functions expect a
scalar that will be processed with "replace_with_subreg". I think that
the safest way is to emit the code above for SImode for the
problematic move insn, and

> +   emit_move_insn (gen_rtx_SUBREG (V2DImode, vreg, 0),
> +   gen_rtx_VEC_MERGE (V2DImode,
> +  gen_rtx_VEC_DUPLICATE (V2DImode,
> + tmp),
> +  CONST0_RTX (V2DImode),
> +  GEN_INT (HOST_WIDE_INT_1U)));

for DImode. This way, you won't need other changes (conditional
generation of SUBREGs), the above should be enough.

Oh, and don't use emit_move_insn, emit insn with gen_rtx_SET
(gen_rtx_SUBREG (...)) should do the trick.

Uros.

> As (unrelated) change I propose to error with fatal_insn_not_found
> in case recog doesn't recognize the altered insns which make it
> easier to spot errors early.
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu ontop of [1/2], OK
> for trunk if that succeeds?
>
> Thanks,
> Richard.
>
> 2019-08-13  Richard Biener  
>
> * config/i386/i386-features.c
> (dimode_scalar_chain::replace_with_subreg): Elide the subreg if the
> reg is already vector.
> (dimode_scalar_chain::convert_op): Likewise.
> (dimode_scalar_chain::make_vector_copies): Use a vector-mode pseudo as
> destination to avoid later passes copy-propagating it out.
> (dimode_chain::convert_insn): Convert the source of a
> memory op in case it is not of the appropriate scalar mode.
> Raise fatal_insn_not_found if the converted insn is not recognized.
>
> Index: gcc/config/i386/i386-features.c
> ===
> --- gcc/config/i386/i386-features.c (revision 274328)
> +++ gcc/config/i386/i386-features.c (working copy)
> @@ -573,7 +573,8 @@ rtx
>  dimode_scalar_chain::replace_with_subreg (rtx x, rtx reg, rtx new_reg)
>  {
>if (x == reg)
> -return gen_rtx_SUBREG (V2DImode, new_reg, 0);
> +return (GET_MODE (new_reg) == V2DImode
> +   ? new_reg : gen_rtx_SUBREG (V2DImode, new_reg, 0));
>
>const char *fmt = GET_RTX_FORMAT (GET_CODE (x));
>int i, j;
> @@ -627,7 +628,7 @@ void
>  dimode_scalar_chain::make_vector_copies (unsigned regno)
>  {
>rtx reg = regno_reg_rtx[regno];
> -  rtx vreg = gen_reg_rtx (DImode);
> +  rtx vreg = gen_reg_rtx (V2DImode);
>df_ref ref;
>
>for (ref = DF_REG_DEF_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
> @@ -641,7 +642,12 @@ dimode_scalar_chain::make_vector_copies
> gen_rtx_SUBREG (SImode, reg, 0));
> emit_move_insn (adjust_address (tmp, SImode, 4),
> gen_rtx_SUBREG (SImode, reg, 4));
> -   emit_move_insn (vreg, tmp);
> +   emit_move_insn (vreg,
> +   gen_rtx_VEC_MERGE (V2DImode,
> +  gen_rtx_VEC_DUPLICATE (V2DImode,
> + tmp),
> +  CONST0_RTX (V2DImode),
> +  GEN_INT (HOST_WIDE_INT_1U)));
>   }
> else if (TARGET_SSE4_1)
>   {
> @@ -841,7 +847,8 @@ dimode_scalar_chain::convert_op (rtx *op
> gcc_assert (!DF_REF_CHAIN (ref));
> break;
>   }
> -  *op = gen_rtx_SUBREG (V2DImode, *op, 0);
> +  if (GET_MODE (*op) 

[PATCH] Update to MAINTAINERS

2019-08-13 Thread Mark Eggleston

I have added myself to the MAINTAINERS file.

--
https://www.codethink.co.uk/privacy.html

Index: MAINTAINERS
===
--- MAINTAINERS	(revision 274379)
+++ MAINTAINERS	(working copy)
@@ -371,6 +371,7 @@
 Jason Eckhardt	
 Bernd Edlinger	
 Phil Edwards	
+Mark Eggleston	
 Steve Ellcey	
 Mohan Embar	
 Revital Eres	


Re: [patch, fortran] Fix PR 90561

2019-08-13 Thread Paul Richard Thomas
Hi Thomas,

Yes that's good to apply to 9 and 10 branches. I am certain that there
are other places where the new function will be very helpful.

Thanks for the patch.

Paul

On Tue, 13 Aug 2019 at 13:59, Thomas König  wrote:
>
> Hello world,
>
> this patch fixes a 9/10 regression by placing the variable used to
> hold a string length at function scope.
>
> I chose to implement this version of gfc_evaluate_now as a separate
> function because I have a sneaking suspicion this may not be the
> last time we are going to encounter something like that - better
> have the function there for future use.
>
> Regression-tested. OK for trunk and gcc-9?
>
> Regards
>
> Thomas



-- 
"If you can't explain it simply, you don't understand it well enough"
- Albert Einstein


[PATCH 2/2] Add more entries to the C++ get_std_name_hint array

2019-08-13 Thread Jonathan Wakely

This adds some commonly-used C++11/14 names, and some new C++17/20
names. The latter aren't available when using the -std=gnu++14
default, so the fix-it suggesting to use a newer dialect is helpful.

* name-lookup.c (get_std_name_hint): Add more entries.

Tested x86_64-linux. OK for trunk?

commit 0c12d092e5659689951b5e134b84c5b506b543d5
Author: Jonathan Wakely 
Date:   Tue Aug 13 13:42:40 2019 +0100

Add more entries to the C++ get_std_name_hint array

* name-lookup.c (get_std_name_hint): Add more entries.

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 96b2d90540d..68ebd7655c7 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -5550,9 +5550,11 @@ get_std_name_hint (const char *name)
 {"make_any", "<any>", cxx17},
 /* <array>.  */
 {"array", "<array>", cxx11},
+{"to_array", "<array>", cxx2a},
 /* <atomic>.  */
 {"atomic", "<atomic>", cxx11},
 {"atomic_flag", "<atomic>", cxx11},
+{"atomic_ref", "<atomic>", cxx2a},
 /* <bitset>.  */
 {"bitset", "<bitset>", cxx11},
 /* <complex>.  */
@@ -5575,9 +5577,17 @@ get_std_name_hint (const char *name)
 {"ofstream", "<fstream>", cxx98},
 /* <functional>.  */
 {"bind", "<functional>", cxx11},
+{"bind_front", "<functional>", cxx2a},
 {"function", "<functional>", cxx11},
 {"hash", "<functional>", cxx11},
+{"invoke", "<functional>", cxx17},
 {"mem_fn", "<functional>", cxx11},
+{"not_fn", "<functional>", cxx17},
+{"reference_wrapper", "<functional>", cxx11},
+{"unwrap_reference", "<functional>", cxx2a},
+{"unwrap_reference_t", "<functional>", cxx2a},
+{"unwrap_ref_decay", "<functional>", cxx2a},
+{"unwrap_ref_decay_t", "<functional>", cxx2a},
 /* <future>. */
 {"async", "<future>", cxx11},
 {"future", "<future>", cxx11},
@@ -5618,11 +5628,16 @@ get_std_name_hint (const char *name)
 {"map", "<map>", cxx98},
 {"multimap", "<map>", cxx98},
 /* <memory>.  */
+{"allocate_shared", "<memory>", cxx11},
+{"allocator", "<memory>", cxx98},
+{"allocator_traits", "<memory>", cxx11},
 {"make_shared", "<memory>", cxx11},
 {"make_unique", "<memory>", cxx14},
 {"shared_ptr", "<memory>", cxx11},
 {"unique_ptr", "<memory>", cxx11},
 {"weak_ptr", "<memory>", cxx11},
+/* <memory_resource>.  */
+{"pmr", "<memory_resource>", cxx17},
 /* <mutex>.  */
 {"mutex", "<mutex>", cxx11},
 {"timed_mutex", "<mutex>", cxx11},
@@ -5672,14 +5687,39 @@ get_std_name_hint (const char *name)
 {"u16string", "<string>", cxx11},
 {"u32string", "<string>", cxx11},
 /* <string_view>.  */
+{"basic_string_view", "<string_view>", cxx17},
 {"string_view", "<string_view>", cxx17},
 /* <thread>.  */
 {"thread", "<thread>", cxx11},
+{"this_thread", "<thread>", cxx11},
 /* <tuple>.  */
+{"apply", "<tuple>", cxx17},
+{"forward_as_tuple", "<tuple>", cxx11},
+{"make_from_tuple", "<tuple>", cxx17},
 {"make_tuple", "<tuple>", cxx11},
+{"tie", "<tuple>", cxx11},
 {"tuple", "<tuple>", cxx11},
+{"tuple_cat", "<tuple>", cxx11},
 {"tuple_element", "<tuple>", cxx11},
+{"tuple_element_t", "<tuple>", cxx14},
 {"tuple_size", "<tuple>", cxx11},
+{"tuple_size_v", "<tuple>", cxx17},
+/* <type_traits>.  */
+{"enable_if", "<type_traits>", cxx11},
+{"enable_if_t", "<type_traits>", cxx14},
+{"invoke_result", "<type_traits>", cxx17},
+{"invoke_result_t", "<type_traits>", cxx17},
+{"remove_cvref", "<type_traits>", cxx17},
+{"remove_cvref_t", "<type_traits>", cxx17},
+{"type_identity", "<type_traits>", cxx2a},
+{"type_identity_t", "<type_traits>", cxx2a},
+{"void_t", "<type_traits>", cxx17},
+{"conjunction", "<type_traits>", cxx17},
+{"conjunction_v", "<type_traits>", cxx17},
+{"disjunction", "<type_traits>", cxx17},
+{"disjunction_v", "<type_traits>", cxx17},
+{"negation", "<type_traits>", cxx17},
+{"negation_v", "<type_traits>", cxx17},
 /* <unordered_map>.  */
 {"unordered_map", "<unordered_map>", cxx11},
 {"unordered_multimap", "<unordered_map>", cxx11},


[PATCH 1/2] PR c++/91436 fix C++ dialect for std::make_unique fix-it hint

2019-08-13 Thread Jonathan Wakely

The std::make_unique function wasn't added until C++14, and neither was
the std::complex_literals namespace.

gcc/cp:

PR c++/91436
* name-lookup.c (get_std_name_hint): Fix min_dialect field for
complex_literals and make_unique entries.

gcc/testsuite:

PR c++/91436
* g++.dg/lookup/missing-std-include-5.C: Limit test to C++14 and up.
* g++.dg/lookup/missing-std-include-6.C: Don't check make_unique in
test that runs for C++11.


Tested x86_64-linux. OK for trunk?


commit b58e771fe21ae380200fdec00aa899d4b15b73d5
Author: Jonathan Wakely 
Date:   Tue Aug 13 13:25:39 2019 +0100

PR c++/91436 fix C++ dialect for std::make_unique fix-it hint

The std::make_unique function wasn't added until C++14, and neither was
the std::complex_literals namespace.

gcc/cp:

PR c++/91436
* name-lookup.c (get_std_name_hint): Fix min_dialect field for
complex_literals and make_unique entries.

gcc/testsuite:

PR c++/91436
* g++.dg/lookup/missing-std-include-5.C: Limit test to C++14 and up.
* g++.dg/lookup/missing-std-include-6.C: Don't check make_unique in
test that runs for C++11.

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 9f278220df3..96b2d90540d 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -5557,7 +5557,7 @@ get_std_name_hint (const char *name)
 {"bitset", "<bitset>", cxx11},
 /* <complex>.  */
 {"complex", "<complex>", cxx98},
-{"complex_literals", "<complex>", cxx98},
+{"complex_literals", "<complex>", cxx14},
 /* <condition_variable>. */
 {"condition_variable", "<condition_variable>", cxx11},
 {"condition_variable_any", "<condition_variable>", cxx11},
@@ -5619,7 +5619,7 @@ get_std_name_hint (const char *name)
 {"multimap", "<map>", cxx98},
 /* <memory>.  */
 {"make_shared", "<memory>", cxx11},
-{"make_unique", "<memory>", cxx11},
+{"make_unique", "<memory>", cxx14},
 {"shared_ptr", "<memory>", cxx11},
 {"unique_ptr", "<memory>", cxx11},
 {"weak_ptr", "<memory>", cxx11},
diff --git a/gcc/testsuite/g++.dg/lookup/missing-std-include-5.C b/gcc/testsuite/g++.dg/lookup/missing-std-include-5.C
index fe880a6263b..3ec9abd9316 100644
--- a/gcc/testsuite/g++.dg/lookup/missing-std-include-5.C
+++ b/gcc/testsuite/g++.dg/lookup/missing-std-include-5.C
@@ -1,2 +1,3 @@
+// { dg-do compile { target c++14 } }
 using namespace std::complex_literals; // { dg-error "" }
 // { dg-message "#include <complex>" "" { target *-*-* } .-1 }
diff --git a/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C b/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C
index d9eeb4284e8..a8f27473e6d 100644
--- a/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C
+++ b/gcc/testsuite/g++.dg/lookup/missing-std-include-6.C
@@ -11,15 +11,6 @@ void test_make_shared ()
   // { dg-error "expected primary-expression before '\\)' token" "" { target *-*-* } .-3 }
 }
 
-template<typename T>
-void test_make_unique ()
-{
-  auto p = std::make_unique<T>(); // { dg-error "'make_unique' is not a member of 'std'" }
-  // { dg-message "'#include <memory>'" "" { target *-*-* } .-1 }
-  // { dg-error "expected primary-expression before '>' token" "" { target *-*-* } .-2 }
-  // { dg-error "expected primary-expression before '\\)' token" "" { target *-*-* } .-3 }
-}
-
 std::shared_ptr<int> test_shared_ptr; // { dg-error "'shared_ptr' in namespace 'std' does not name a template type" }
 // { dg-message "'#include <memory>'" "" { target *-*-* } .-1 }
 


[PATCH] Improve documentation of target hooks for libc functions

2019-08-13 Thread Jonathan Wakely

* target.def (libc_has_function, libc_has_fast_function): Improve
documentation strings.
* doc/tm.texi: Regenerate.

Bootstrapped x86_64-linux.

OK for trunk?


commit c3645cae6c78896218c60135349056a5bf943b8a
Author: Jonathan Wakely 
Date:   Tue Aug 13 14:01:01 2019 +0100

Improve documentation of target hooks for libc functions

* target.def (libc_has_function, libc_has_fast_function): Improve
documentation strings.
* doc/tm.texi: Regenerate.

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8e5b01c9383..89990cbc871 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5591,12 +5591,12 @@ macro, a reasonable default is used.
 
 @deftypefn {Target Hook} bool TARGET_LIBC_HAS_FUNCTION (enum function_class 
@var{fn_class})
 This hook determines whether a function from a class of functions
-@var{fn_class} is present at the runtime.
+@var{fn_class} is present in the target C library.
 @end deftypefn
 
 @deftypefn {Target Hook} bool TARGET_LIBC_HAS_FAST_FUNCTION (int @var{fcode})
 This hook determines whether a function from a class of functions
-@var{fn_class} has a fast implementation.
+@code{(enum function_class)}@var{fcode} has a fast implementation.
 @end deftypefn
 
 @defmac NEXT_OBJC_RUNTIME
diff --git a/gcc/target.def b/gcc/target.def
index 7cc0f37a0d1..73334e0e8fc 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2545,14 +2545,14 @@ set via @code{__attribute__}.",
 DEFHOOK
 (libc_has_function,
  "This hook determines whether a function from a class of functions\n\
-@var{fn_class} is present at the runtime.",
+@var{fn_class} is present in the target C library.",
  bool, (enum function_class fn_class),
  default_libc_has_function)
 
 DEFHOOK
 (libc_has_fast_function,
  "This hook determines whether a function from a class of functions\n\
-@var{fn_class} has a fast implementation.",
+@code{(enum function_class)}@var{fcode} has a fast implementation.",
  bool, (int fcode),
  default_libc_has_fast_function)
 


[PATCH 0/3] Libsanitizer: merge from trunk

2019-08-13 Thread Martin Liska
Hi.

For this year, I decided to make a first merge now and the
next (much smaller) at the end of October.

The biggest change is rename of many files from .cc to .cpp.

I bootstrapped the patch set on x86_64-linux-gnu and run
asan/ubsan/tsan tests on x86_64, ppc64le (power8) and
aarch64.

Libasan's SONAME has already been bumped compared to GCC 9.

For the other libraries, I don't see a reason for a SONAME bump:

$ abidiff /usr/lib64/libubsan.so.1.0.0 
./x86_64-pc-linux-gnu/libsanitizer/ubsan/.libs/libubsan.so.1.0.0 --stat
Functions changes summary: 0 Removed, 0 Changed, 4 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
Function symbols changes summary: 3 Removed, 0 Added function symbols not 
referenced by debug info
Variable symbols changes summary: 0 Removed, 0 Added variable symbol not 
referenced by debug info

$ abidiff /usr/lib64/libtsan.so.0.0.0  
./x86_64-pc-linux-gnu/libsanitizer/tsan/.libs/libtsan.so.0.0.0 --stat
Functions changes summary: 0 Removed, 0 Changed, 47 Added functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
Function symbols changes summary: 1 Removed, 2 Added function symbols not 
referenced by debug info
Variable symbols changes summary: 0 Removed, 0 Added variable symbol not 
referenced by debug info

Ready to be installed?
Thanks,
Martin

Martin Liska (3):
  Libsanitizer merge from trunk r368656.
  Reapply all revisions mentioned in LOCAL_PATCHES.
  Fix a test-case scan pattern.

 gcc/testsuite/c-c++-common/asan/memcmp-1.c|4 +-
 libsanitizer/MERGE|2 +-
 libsanitizer/Makefile.in  |8 +-
 libsanitizer/aclocal.m4   |  191 +-
 libsanitizer/asan/Makefile.am |   63 +-
 libsanitizer/asan/Makefile.in |  323 +-
 ...asan_activation.cc => asan_activation.cpp} |7 +-
 libsanitizer/asan/asan_activation.h   |5 +-
 libsanitizer/asan/asan_activation_flags.inc   |5 +-
 .../{asan_allocator.cc => asan_allocator.cpp} |   18 +-
 libsanitizer/asan/asan_allocator.h|   44 +-
 .../{asan_debugging.cc => asan_debugging.cpp} |7 +-
 ..._descriptions.cc => asan_descriptions.cpp} |7 +-
 libsanitizer/asan/asan_descriptions.h |7 +-
 .../asan/{asan_errors.cc => asan_errors.cpp}  |   22 +-
 libsanitizer/asan/asan_errors.h   |   27 +-
 ...asan_fake_stack.cc => asan_fake_stack.cpp} |7 +-
 libsanitizer/asan/asan_fake_stack.h   |7 +-
 .../asan/{asan_flags.cc => asan_flags.cpp}|   13 +-
 libsanitizer/asan/asan_flags.h|5 +-
 libsanitizer/asan/asan_flags.inc  |9 +-
 .../{asan_fuchsia.cc => asan_fuchsia.cpp} |   20 +-
 .../{asan_globals.cc => asan_globals.cpp} |   31 +-
 ...an_globals_win.cc => asan_globals_win.cpp} |7 +-
 libsanitizer/asan/asan_init_version.h |5 +-
 ..._interceptors.cc => asan_interceptors.cpp} |   18 +-
 libsanitizer/asan/asan_interceptors.h |   39 +-
 ...cc => asan_interceptors_memintrinsics.cpp} |7 +-
 .../asan/asan_interceptors_memintrinsics.h|7 +-
 libsanitizer/asan/asan_interceptors_vfork.S   |   12 +
 libsanitizer/asan/asan_interface.inc  |6 +-
 libsanitizer/asan/asan_interface_internal.h   |7 +-
 libsanitizer/asan/asan_internal.h |   22 +-
 .../asan/{asan_linux.cc => asan_linux.cpp}|   14 +-
 .../asan/{asan_mac.cc => asan_mac.cpp}|   11 +-
 ..._malloc_linux.cc => asan_malloc_linux.cpp} |   17 +-
 libsanitizer/asan/asan_malloc_local.h |   30 +-
 ...asan_malloc_mac.cc => asan_malloc_mac.cpp} |   46 +-
 libsanitizer/asan/asan_malloc_win.cc  |  259 --
 libsanitizer/asan/asan_malloc_win.cpp |  553 
 libsanitizer/asan/asan_mapping.h  |   25 +-
 libsanitizer/asan/asan_mapping_myriad.h   |5 +-
 libsanitizer/asan/asan_mapping_sparc64.h  |5 +-
 ...ory_profile.cc => asan_memory_profile.cpp} |7 +-
 ...asan_new_delete.cc => asan_new_delete.cpp} |   37 +-
 .../{asan_poisoning.cc => asan_poisoning.cpp} |7 +-
 libsanitizer/asan/asan_poisoning.h|   15 +-
 .../asan/{asan_posix.cc => asan_posix.cpp}|   53 +-
 .../{asan_preinit.cc => asan_preinit.cpp} |7 +-
 ...remap_shadow.cc => asan_premap_shadow.cpp} |7 +-
 libsanitizer/asan/asan_premap_shadow.h|5 +-
 .../asan/{asan_report.cc => asan_report.cpp}  |   16 +-
 libsanitizer/asan/asan_report.h   |7 +-
 .../asan/{asan_rtems.cc => asan_rtems.cpp}|   17 +-
 .../asan/{asan_rtl.cc => asan_rtl.cpp}|   45 +-
 libsanitizer/asan/asan_scariness_score.h  |5 +-
 ..._shadow_setup.cc => asan_shadow_setup.cpp} |   11 +-
 libsanitizer/asan/asan_stack.cc   |   38 -
 libsanitizer/asan/asan_stack.cpp  |   88 +
 libsanitizer/asan/asan_stack.h|   47 +-
 .../asan/{asan_stats.cc => asan_stats.cpp}|7 +-
 libsanitizer/asan/asan_stats.h  

[PATCH 2/3] Reapply all revisions mentioned in LOCAL_PATCHES.

2019-08-13 Thread Martin Liska

libsanitizer/ChangeLog:

2019-08-13  Martin Liska  

* asan/asan_globals.cpp (CheckODRViolationViaIndicator): Reapply
patch from trunk.
(CheckODRViolationViaPoisoning): Likewise.
(RegisterGlobal): Likewise.
* asan/asan_mapping.h: Likewise.
* sanitizer_common/sanitizer_linux_libcdep.cpp (defined): Likewise.
* sanitizer_common/sanitizer_mac.cpp (defined): Likewise.
* sanitizer_common/sanitizer_platform_limits_linux.cpp (defined): 
Likewise.
* sanitizer_common/sanitizer_platform_limits_posix.h (defined): 
Likewise.
* sanitizer_common/sanitizer_stacktrace.cpp (GetCanonicFrame): Likewise.
* ubsan/ubsan_handlers.cpp (__ubsan::__ubsan_handle_cfi_bad_icall): 
Likewise.
(__ubsan::__ubsan_handle_cfi_bad_icall_abort): Likewise.
* ubsan/ubsan_handlers.h (struct CFIBadIcallData): Likewise.
(struct CFICheckFailData): Likewise.
(RECOVERABLE): Likewise.
* ubsan/ubsan_platform.h: Likewise.
---
 libsanitizer/asan/asan_globals.cpp| 19 ---
 libsanitizer/asan/asan_mapping.h  |  2 +-
 .../sanitizer_linux_libcdep.cpp   |  4 
 .../sanitizer_common/sanitizer_mac.cpp|  2 +-
 .../sanitizer_platform_limits_linux.cpp   |  7 +--
 .../sanitizer_platform_limits_posix.h |  2 +-
 .../sanitizer_common/sanitizer_stacktrace.cpp | 17 -
 libsanitizer/ubsan/ubsan_handlers.cpp | 15 +++
 libsanitizer/ubsan/ubsan_handlers.h   |  8 
 libsanitizer/ubsan/ubsan_platform.h   |  2 ++
 10 files changed, 49 insertions(+), 29 deletions(-)

diff --git a/libsanitizer/asan/asan_globals.cpp b/libsanitizer/asan/asan_globals.cpp
index 54e75f3cee7..c77e5357bf9 100644
--- a/libsanitizer/asan/asan_globals.cpp
+++ b/libsanitizer/asan/asan_globals.cpp
@@ -154,23 +154,6 @@ static void CheckODRViolationViaIndicator(const Global *g) {
   }
 }
 
-// Check ODR violation for given global G by checking if it's already poisoned.
-// We use this method in case compiler doesn't use private aliases for global
-// variables.
-static void CheckODRViolationViaPoisoning(const Global *g) {
-  if (__asan_region_is_poisoned(g->beg, g->size_with_redzone)) {
-// This check may not be enough: if the first global is much larger
-// the entire redzone of the second global may be within the first global.
-for (ListOfGlobals *l = list_of_all_globals; l; l = l->next) {
-  if (g->beg == l->g->beg &&
-  (flags()->detect_odr_violation >= 2 || g->size != l->g->size) &&
-  !IsODRViolationSuppressed(g->name))
-ReportODRViolation(g, FindRegistrationSite(g),
-   l->g, FindRegistrationSite(l->g));
-}
-  }
-}
-
 // Clang provides two different ways for global variables protection:
 // it can poison the global itself or its private alias. In former
 // case we may poison same symbol multiple times, that can help us to
@@ -216,8 +199,6 @@ static void RegisterGlobal(const Global *g) {
 // where two globals with the same name are defined in different modules.
 if (UseODRIndicator(g))
   CheckODRViolationViaIndicator(g);
-else
-  CheckODRViolationViaPoisoning(g);
   }
   if (CanPoisonMemory())
 PoisonRedZones(*g);
diff --git a/libsanitizer/asan/asan_mapping.h b/libsanitizer/asan/asan_mapping.h
index 41fb49ee46d..09be904270c 100644
--- a/libsanitizer/asan/asan_mapping.h
+++ b/libsanitizer/asan/asan_mapping.h
@@ -163,7 +163,7 @@ static const u64 kDefaultShort64bitShadowOffset =
 static const u64 kAArch64_ShadowOffset64 = 1ULL << 36;
 static const u64 kMIPS32_ShadowOffset32 = 0x0aaa;
 static const u64 kMIPS64_ShadowOffset64 = 1ULL << 37;
-static const u64 kPPC64_ShadowOffset64 = 1ULL << 44;
+static const u64 kPPC64_ShadowOffset64 = 1ULL << 41;
 static const u64 kSystemZ_ShadowOffset64 = 1ULL << 52;
 static const u64 kSPARC64_ShadowOffset64 = 1ULL << 43;  // 0x800
 static const u64 kFreeBSD_ShadowOffset32 = 1ULL << 30;  // 0x4000
diff --git a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
index 1f584a2add6..7dc38a0b703 100644
--- a/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp
@@ -701,9 +701,13 @@ u32 GetNumberOfCPUs() {
 #elif SANITIZER_SOLARIS
   return sysconf(_SC_NPROCESSORS_ONLN);
 #else
+#if defined(CPU_COUNT)
   cpu_set_t CPUs;
   CHECK_EQ(sched_getaffinity(0, sizeof(cpu_set_t), &CPUs), 0);
   return CPU_COUNT(&CPUs);
+#else
+  return 1;
+#endif
 #endif
 }
 
diff --git a/libsanitizer/sanitizer_common/sanitizer_mac.cpp b/libsanitizer/sanitizer_common/sanitizer_mac.cpp
index bd6301aebad..7552b7aa965 100644
--- a/libsanitizer/sanitizer_common/sanitizer_mac.cpp
+++ b/libsanitizer/sanitizer_common/sanitizer_mac.cpp
@@ -36,7 +36,7 @@
 extern char **environ;
 #endif
 
-#if defined(__has_include) && __has_include(<crt_externs.h>)

[PATCH 3/3] Fix a test-case scan pattern.

2019-08-13 Thread Martin Liska

gcc/testsuite/ChangeLog:

2019-08-13  Martin Liska  

* c-c++-common/asan/memcmp-1.c: There's a new function at the top
of the stack trace, so shift the expected frame numbers in the
scan patterns.
---
 gcc/testsuite/c-c++-common/asan/memcmp-1.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/asan/memcmp-1.c b/gcc/testsuite/c-c++-common/asan/memcmp-1.c
index 5915988be5b..0a513c05ee1 100644
--- a/gcc/testsuite/c-c++-common/asan/memcmp-1.c
+++ b/gcc/testsuite/c-c++-common/asan/memcmp-1.c
@@ -16,5 +16,5 @@ main ()
 }
 
 /* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow.*(\n|\r\n|\r)" } */
-/* { dg-output "#0 0x\[0-9a-f\]+ +(in _*(interceptor_|wrap_|)memcmp|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
-/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*main|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#1 0x\[0-9a-f\]+ +(in _*(interceptor_|wrap_|)memcmp|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "#2 0x\[0-9a-f\]+ +(in _*main|\[(\])\[^\n\r]*(\n|\r\n|\r)" } */


Re: [PATCH v3] Missed function specialization + partial devirtualization

2019-08-13 Thread Martin Liška
And I would also suggest coming up with a parameter that will drive
the minimum probability, the maximum number of promotions and maybe
the minimal number of edge executions needed to consider a speculation.

Clang provides all these:
https://github.com/microsoft/llvm-1/blob/master/lib/Analysis/IndirectCallPromotionAnalysis.cpp#L37

Martin


[patch, fortran] Fix PR 90561

2019-08-13 Thread Thomas König

Hello world,

this patch fixes a 9/10 regression by placing the variable used to
hold a string length at function scope.

I chose to implement this version of gfc_evaluate_now as a separate
function because I have a sneaking suspicion this may not be the
last time we are going to encounter something like that - better
have the function there for future use.

Regression-tested. OK for trunk and gcc-9?

Regards

Thomas
Index: trans.h
===
--- trans.h	(Revision 274370)
+++ trans.h	(Arbeitskopie)
@@ -507,6 +507,7 @@ void gfc_conv_label_variable (gfc_se * se, gfc_exp
 /* If the value is not constant, Create a temporary and copy the value.  */
 tree gfc_evaluate_now_loc (location_t, tree, stmtblock_t *);
 tree gfc_evaluate_now (tree, stmtblock_t *);
+tree gfc_evaluate_now_function_scope (tree, stmtblock_t *);
 
 /* Find the appropriate variant of a math intrinsic.  */
 tree gfc_builtin_decl_for_float_kind (enum built_in_function, int);
Index: trans.c
===
--- trans.c	(Revision 274370)
+++ trans.c	(Arbeitskopie)
@@ -118,7 +118,20 @@ gfc_evaluate_now (tree expr, stmtblock_t * pblock)
   return gfc_evaluate_now_loc (input_location, expr, pblock);
 }
 
+/* Like gfc_evaluate_now, but add the created variable to the
+   function scope.  */
 
+tree
+gfc_evaluate_now_function_scope (tree expr, stmtblock_t * pblock)
+{
+  tree var;
+  var = gfc_create_var_np (TREE_TYPE (expr), NULL);
+  gfc_add_decl_to_function (var);
+  gfc_add_modify (pblock, var, expr);
+
+  return var;
+}
+
 /* Build a MODIFY_EXPR node and add it to a given statement block PBLOCK.
A MODIFY_EXPR is an assignment:
LHS <- RHS.  */
Index: trans-expr.c
===
--- trans-expr.c	(Revision 274370)
+++ trans-expr.c	(Arbeitskopie)
@@ -10796,7 +10796,8 @@ gfc_trans_assignment_1 (gfc_expr * expr1, gfc_expr
   if (expr1->ts.deferred
 	  && gfc_expr_attr (expr1).allocatable
 	  && gfc_check_dependency (expr1, expr2, true))
-	rse.string_length = gfc_evaluate_now (rse.string_length, &rse.pre);
+	rse.string_length =
+	  gfc_evaluate_now_function_scope (rse.string_length, &rse.pre);
   string_length = rse.string_length;
 }
   else
! { dg-do run }
! PR fortran/90561
! This used to ICE.
! Original test case by Gerhard Steinmetz.
program p
   character(:), allocatable :: z(:)
   z = [character(2):: 'ab', 'xy']
   z = z(2)
   if (any(z /= 'xy')) stop 1
end


Re: [PATCH v3] Missed function specialization + partial devirtualization

2019-08-13 Thread Martin Liška
On 7/31/19 7:29 AM, luoxhu wrote:
> This patch aims to fix PR69678 caused by PGO indirect call profiling
> performance issues.
> The bug that profiling data is never working was fixed by Martin's pull
> back of topN patches, performance got GEOMEAN ~1% improvement.
> Still, currently the default profile only generates SINGLE indirect target
> that called more than 75%.  This patch leverages MULTIPLE indirect
> targets use in LTO-WPA and LTO-LTRANS stage, as a result, function
> specialization, profiling, partial devirtualization, inlining and
> cloning could be done successfully based on it.
> Performance can get improved from 0.70 sec to 0.38 sec on simple tests.
> Details are:
>   1.  PGO with topn is enabled by default now, but only one indirect
>   target edge will be generated in ipa-profile pass, so add variables to 
> enable
>   multiple speculative edges through passes, speculative_id will record the 
> direct edge
>   index bind to the indirect edge, num_of_ics records how many direct edges 
> owned by
>   the indirect edge, postpone gimple_ic to ipa-profile like default as inline
>   pass will decide whether it is benefit to transform indirect call.
>   2.  Enable LTO WPA/LTRANS stage multiple indirect call targets analysis for
>   profile full support in ipa passes and cgraph_edge functions.  
> speculative_id
>   can be set by make_speculative id when multiple targets are bound to
>   one indirect edge, and cloned if new edge is cloned.  speculative_id
>   is streamed out and streamed in by LTO like lto_stmt_uid.
>   3.  Add 1 in module testcase and 2 cross module testcases.
>   4.  Bootstrap and regression test passed on Power8-LE.
> 
> v3 Changes:
>  1. Rebase to trunk.
>  2. Use speculative_id to track and search the reference node matched
>  with the direct edge's callee for multiple targets.  This could
>  eliminate the workaround strstr before.  Actually, it is the caller's
>  responsibility to handle the direct edges mapped to the same indirect edge.
>  speculative_call_info will still return one of the direct edge
>  specified, this will leverage current IPA edge process framework mostly.

Thank you for the next version of the patch. Inline comments follow:

> 
> gcc/ChangeLog
> 
>   2019-07-31  Xiong Hu Luo  
> 
>   PR ipa/69678
>   * cgraph.c (symbol_table::create_edge): Init speculative_id.
>   (cgraph_edge::make_speculative): Add param for setting speculative_id.
>   (cgraph_edge::speculative_call_info): Find reference by
>   speculative_id for multiple indirect targets.
>   (cgraph_edge::resolve_speculation): Decrease the speculations
>   for the indirect edge; drop its speculative flag if no direct
>   target is left.
>   (cgraph_edge::redirect_call_stmt_to_callee): Likewise.
>   (cgraph_node::verify_node): Don't report an error if a
>   speculative edge does not include a statement.
>   * cgraph.h (struct indirect_target_info): New struct.
>   (indirect_call_targets): New vector variable.
>   (num_of_ics): New variable.
>   (make_speculative): Add param for setting speculative_id.
>   (speculative_id): New variable.
>   * cgraphclones.c (cgraph_node::create_clone): Clone speculative_id.
>   * ipa-inline.c (inline_small_functions): Add iterator update.
>   * ipa-profile.c (ipa_profile_generate_summary): Add indirect
>   multiple targets logic.
>   (ipa_profile): Likewise.
>   * ipa-ref.h (speculative_id): New variable.
>   * ipa.c (process_references): Fix typo.
>   * lto-cgraph.c (lto_output_edge): Add indirect multiple targets
>   logic.  Stream out speculative_id.
>   (input_edge): Likewise.
>   * predict.c (dump_prediction): Remove the edge count assert to be
>   precise.
>   * symtab.c (symtab_node::create_reference): Init speculative_id.
>   (symtab_node::clone_references): Clone speculative_id.
>   (symtab_node::clone_referring): Clone speculative_id.
>   (symtab_node::clone_reference): Clone speculative_id.
>   (symtab_node::clear_stmts_in_references): Clear speculative_id.
>   * tree-inline.c (copy_bb): Duplicate all the speculative edges
>   if indirect call contains multiple speculative targets.
>   * tree-profile.c (gimple_gen_ic_profiler): Use the new variable
>   __gcov_indirect_call.counters and __gcov_indirect_call.callee.
>   (gimple_gen_ic_func_profiler): Likewise.
>   (pass_ipa_tree_profile::gate): Fix comment typos.
>   * value-prof.c  (gimple_ic_transform): Handle topn case.
>   Fix comment typos.
> 
> gcc/testsuite/ChangeLog
> 
>   2019-07-31  Xiong Hu Luo  
> 
>   PR ipa/69678
>   * gcc.dg/tree-prof/indir-call-prof-topn.c: New testcase.
>   * gcc.dg/tree-prof/crossmodule-indir-call-topn-1.c: New testcase.
>   * gcc.dg/tree-prof/crossmodule-indir-call-topn-1a.c: New testcase.
>   * gcc.dg/tree-prof/crossmodule-indir-call-topn-2.c: New testcase.
> ---
>  gcc/cgraph.c  |  70 

Re: [PATCH][testsuite] Fix PR91419

2019-08-13 Thread Hans-Peter Nilsson
> From: Richard Biener 
> Date: Tue, 13 Aug 2019 09:50:34 +0200

> 2019-08-13  Richard Biener  
> 
>   PR testsuite/91419
>   * lib/target-supports.exp (natural_alignment_32): Amend target
>   list based on BIGGEST_ALIGNMENT.
>   (natural_alignment_64): Targets not natural_alignment_32 cannot
>   be natural_alignment_64.
>   * gcc.dg/tree-ssa/pr91091-2.c: XFAIL for !natural_alignment_32.
>   * gcc.dg/tree-ssa/ssa-fre-77.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-fre-61.c: Require natural_alignment_32.

LGTM, thanks.  (Not tested myself but my cris-elf autotester will pick it up.)

brgds, H-p


[PATCH][2/2][i386] STV changes, reg-copy as vector

2019-08-13 Thread Richard Biener


The following splits out the change that makes the live-range split /
mode change copy pseudos vector mode pseudos.  I've tried reproducing
the issue I faced with SImode chains but failed, likely because we
never use simple moves in the end, while for SImode we can end up
with

(set (subreg:V4SI (reg:SI 42) 0) (subreg:V4SI (reg:SI 41) 0))

which we happily propagate out.  So pursuing this patch independently
doesn't make much sense.  Still the main change is in
dimode_scalar_chain::make_vector_copies where 'vreg' is now V2DImode
instead of DImode and the change to emit_move_insn (vreg, tmp)
hints at the change done to the above problematic insn.

The rest of the changes deal with some regs already being in
the appropriate vector mode.

I realize I might have avoided my original issue by emitting not
the above reg-reg move but

 (set (subreg:V4SI (reg:SI 42) 0)
   (vec_merge:V4SI
  (vec_duplicate:V4SI (reg:SI 41))
   (const_vector:V4SI [0 0 0 0])
   (const_int 1)))

but I didn't try (the full patch contained a testcase for the
issue).  Still it seems odd to use DImode pseudos when all uses
are wrapped in (subreg:V2DI ...).

As an (unrelated) change I propose to error out with fatal_insn_not_found
in case recog doesn't recognize the altered insns, which makes it
easier to spot errors early.

Bootstrap / regtest running on x86_64-unknown-linux-gnu ontop of [1/2], OK 
for trunk if that succeeds?

Thanks,
Richard.

2019-08-13  Richard Biener  

* config/i386/i386-features.c
(dimode_scalar_chain::replace_with_subreg): Elide the subreg if the
reg is already vector.
(dimode_scalar_chain::convert_op): Likewise.
(dimode_scalar_chain::make_vector_copies): Use a vector-mode pseudo as
destination to avoid later passes copy-propagating it out.
(dimode_scalar_chain::convert_insn): Convert the source of a
memory op in case it is not of the appropriate scalar mode.
Raise fatal_insn_not_found if the converted insn is not recognized.

Index: gcc/config/i386/i386-features.c
===
--- gcc/config/i386/i386-features.c (revision 274328)
+++ gcc/config/i386/i386-features.c (working copy)
@@ -573,7 +573,8 @@ rtx
 dimode_scalar_chain::replace_with_subreg (rtx x, rtx reg, rtx new_reg)
 {
   if (x == reg)
-return gen_rtx_SUBREG (V2DImode, new_reg, 0);
+return (GET_MODE (new_reg) == V2DImode
+   ? new_reg : gen_rtx_SUBREG (V2DImode, new_reg, 0));
 
   const char *fmt = GET_RTX_FORMAT (GET_CODE (x));
   int i, j;
@@ -627,7 +628,7 @@ void
 dimode_scalar_chain::make_vector_copies (unsigned regno)
 {
   rtx reg = regno_reg_rtx[regno];
-  rtx vreg = gen_reg_rtx (DImode);
+  rtx vreg = gen_reg_rtx (V2DImode);
   df_ref ref;
 
   for (ref = DF_REG_DEF_CHAIN (regno); ref; ref = DF_REF_NEXT_REG (ref))
@@ -641,7 +642,12 @@ dimode_scalar_chain::make_vector_copies
gen_rtx_SUBREG (SImode, reg, 0));
emit_move_insn (adjust_address (tmp, SImode, 4),
gen_rtx_SUBREG (SImode, reg, 4));
-   emit_move_insn (vreg, tmp);
+   emit_move_insn (vreg,
+   gen_rtx_VEC_MERGE (V2DImode,
+  gen_rtx_VEC_DUPLICATE (V2DImode,
+ tmp),
+  CONST0_RTX (V2DImode),
+  GEN_INT (HOST_WIDE_INT_1U)));
  }
else if (TARGET_SSE4_1)
  {
@@ -841,7 +847,8 @@ dimode_scalar_chain::convert_op (rtx *op
gcc_assert (!DF_REF_CHAIN (ref));
break;
  }
-  *op = gen_rtx_SUBREG (V2DImode, *op, 0);
+  if (GET_MODE (*op) != V2DImode)
+   *op = gen_rtx_SUBREG (V2DImode, *op, 0);
 }
   else if (CONST_INT_P (*op))
 {
@@ -931,6 +938,8 @@ dimode_scalar_chain::convert_insn (rtx_i
 case MEM:
   if (!REG_P (dst))
convert_op (, insn);
+  else if (GET_MODE (src) != DImode)
+   src = gen_rtx_SUBREG (DImode, src, 0);
   break;
 
 case REG:
@@ -977,7 +986,9 @@ dimode_scalar_chain::convert_insn (rtx_i
   PATTERN (insn) = def_set;
 
   INSN_CODE (insn) = -1;
-  recog_memoized (insn);
+  int patt = recog_memoized (insn);
+  if (patt == -1)
+fatal_insn_not_found (insn);
   df_insn_rescan (insn);
 }
 


[PATCH][1/2][i386] STV changes, DImode chain cost

2019-08-13 Thread Richard Biener


The following splits out the DImode chain cost changes from the
patch adding SImode chain handling.  There are two main parts:

1) fix REG_P (src) && REG_P (dst) costing which currently favors
   SSE because we use COSTS_N_INSNS based costs for GPR but move
   costs for SSE.

2) Use ix86_cost->sse_op as cost of the SSE op rather than just
   computing gain as one GPR insn cost.  The vectorizer already
   compares GPR insn costs and SSE insn costs so this shouldn't be
   apples and oranges.  Also when handling SImode chains we'd
   be left with a zero gain everywhere (but the minmax special-casing).

Besides that, for debugging I found it useful to dump the per-stmt
gain if not zero (and I spotted the reg-reg move issue that way).

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

OK for trunk?

Thanks,
Richard.

2019-08-13  Richard Biener  

* config/i386/i386-features.c
(dimode_scalar_chain::compute_convert_gain): Compute and dump
individual instruction gain.  Fix reg-reg copy GPR cost.  Use
ix86_cost->sse_op for vector instruction costs.

Index: gcc/config/i386/i386-features.c
===
--- gcc/config/i386/i386-features.c (revision 274328)
+++ gcc/config/i386/i386-features.c (working copy)
@@ -497,22 +497,23 @@ dimode_scalar_chain::compute_convert_gai
   rtx def_set = single_set (insn);
   rtx src = SET_SRC (def_set);
   rtx dst = SET_DEST (def_set);
+  int igain = 0;
 
   if (REG_P (src) && REG_P (dst))
-   gain += COSTS_N_INSNS (2) - ix86_cost->xmm_move;
+   igain += 2 - ix86_cost->xmm_move;
   else if (REG_P (src) && MEM_P (dst))
-   gain += 2 * ix86_cost->int_store[2] - ix86_cost->sse_store[1];
+   igain += 2 * ix86_cost->int_store[2] - ix86_cost->sse_store[1];
   else if (MEM_P (src) && REG_P (dst))
-   gain += 2 * ix86_cost->int_load[2] - ix86_cost->sse_load[1];
+   igain += 2 * ix86_cost->int_load[2] - ix86_cost->sse_load[1];
   else if (GET_CODE (src) == ASHIFT
   || GET_CODE (src) == ASHIFTRT
   || GET_CODE (src) == LSHIFTRT)
{
  if (CONST_INT_P (XEXP (src, 0)))
-   gain -= vector_const_cost (XEXP (src, 0));
- gain += ix86_cost->shift_const;
+   igain -= vector_const_cost (XEXP (src, 0));
+ igain += 2 * ix86_cost->shift_const - ix86_cost->sse_op;
  if (INTVAL (XEXP (src, 1)) >= 32)
-   gain -= COSTS_N_INSNS (1);
+   igain -= COSTS_N_INSNS (1);
}
   else if (GET_CODE (src) == PLUS
   || GET_CODE (src) == MINUS
@@ -520,20 +521,20 @@ dimode_scalar_chain::compute_convert_gai
   || GET_CODE (src) == XOR
   || GET_CODE (src) == AND)
{
- gain += ix86_cost->add;
+ igain += 2 * ix86_cost->add - ix86_cost->sse_op;
  /* Additional gain for andnot for targets without BMI.  */
  if (GET_CODE (XEXP (src, 0)) == NOT
  && !TARGET_BMI)
-   gain += 2 * ix86_cost->add;
+   igain += 2 * ix86_cost->add;
 
  if (CONST_INT_P (XEXP (src, 0)))
-   gain -= vector_const_cost (XEXP (src, 0));
+   igain -= vector_const_cost (XEXP (src, 0));
  if (CONST_INT_P (XEXP (src, 1)))
-   gain -= vector_const_cost (XEXP (src, 1));
+   igain -= vector_const_cost (XEXP (src, 1));
}
   else if (GET_CODE (src) == NEG
   || GET_CODE (src) == NOT)
-   gain += ix86_cost->add - COSTS_N_INSNS (1);
+   igain += 2 * ix86_cost->add - ix86_cost->sse_op - COSTS_N_INSNS (1);
   else if (GET_CODE (src) == COMPARE)
{
  /* Assume comparison cost is the same.  */
@@ -541,13 +542,20 @@ dimode_scalar_chain::compute_convert_gai
   else if (CONST_INT_P (src))
{
  if (REG_P (dst))
-   gain += COSTS_N_INSNS (2);
+   igain += 2 * COSTS_N_INSNS (1);
  else if (MEM_P (dst))
-   gain += 2 * ix86_cost->int_store[2] - ix86_cost->sse_store[1];
- gain -= vector_const_cost (src);
+   igain += 2 * ix86_cost->int_store[2] - ix86_cost->sse_store[1];
+ igain -= vector_const_cost (src);
}
   else
gcc_unreachable ();
+
+  if (igain != 0 && dump_file)
+   {
+ fprintf (dump_file, "  Instruction gain %d for ", igain);
+ dump_insn_slim (dump_file, insn);
+   }
+  gain += igain;
 }
 
   if (dump_file)


Re: [PATCH] Come up with json::integer_number and use it in GCOV.

2019-08-13 Thread Martin Liška
On 8/2/19 2:40 PM, David Malcolm wrote:
> Something that occurred to me reading the updated patch: maybe it would
> make things easier to have utility member functions of json::object to
> implicitly make the child, e.g.:
> 
> void
> json::object::set (const char *key, long v)
> {
>set (key, new json::integer_number (v));
> }
> 
> so that all those calls can be just:
> 
>   obj->set ("line", exploc.line);
>   obj->set ("column", exploc.column);
> 
> etc (assuming overloading is unambiguous).
> 
> But that's probably orthogonal to this patch.

Looks good to me. It's a candidate for a follow up patch.

> 
> 
>> And I changed all occurrences of float_number with integer_number
>> as you suggested.
> Thanks.
> 
>> I'm currently testing the updated patch.
>> Martin
> The updated patch looks good to me, but technically I'm not a reviewer
> for these files.

Sure, I hope @Jakub or @Richi can approve that.
Thanks,
Martin

> 
> Dave



Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Robin Dapp
> I have become rather wary of INTEGRAL_TYPE_P recently because it
> includes enum types, which with -fstrict-enums can have surprising
> behavior. If I have
> enum E { A, B, C };
> and e has type enum E, with -fstrict-enums, do your tests manage to
> prevent (long)e+1 from becoming (long)(e+1) with an unsafe operation
> done in the enum type?

Ah, I didn't specifically check for this.  I'm going to change the
checks to TREE_CODE (type) == INTEGER_TYPE instead.

Regards
 Robin



Re: [PATCH 1/3] Perform fold when propagating.

2019-08-13 Thread Robin Dapp
> May I suggest to add a parameter to the substitute-and-fold engine
> so we can do the folding on all stmts only when enabled and enable
> it just for VRP?  That also avoids the testsuite noise.

Would something along these lines do?

diff --git a/gcc/tree-ssa-propagate.c b/gcc/tree-ssa-propagate.c
index 7a8f1e037b0..6c0d743b823 100644
--- a/gcc/tree-ssa-propagate.c
+++ b/gcc/tree-ssa-propagate.c
@@ -814,7 +814,6 @@ ssa_propagation_engine::ssa_propagate (void)
   ssa_prop_fini ();
 }

-
 /* Return true if STMT is of the form 'mem_ref = RHS', where 'mem_ref'
is a non-volatile pointer dereference, a structure reference or a
reference to a single _DECL.  Ignore volatile memory references
@@ -1064,11 +1063,10 @@ substitute_and_fold_dom_walker::before_dom_children (basic_block bb)
   /* Replace real uses in the statement.  */
   did_replace |= substitute_and_fold_engine->replace_uses_in (stmt);

-  if (did_replace)
- gimple_set_modified (stmt, true);
-
-  if (fold_stmt (&i, follow_single_use_edges))
+  /* If we made a replacement, fold the statement.  */
+  if (did_replace ||
+  substitute_and_fold_engine->should_fold_all_stmts ())
{
+ fold_stmt (&i, follow_single_use_edges);
  did_replace = true;
  stmt = gsi_stmt (i);
  gimple_set_modified (stmt, true);
diff --git a/gcc/tree-ssa-propagate.h b/gcc/tree-ssa-propagate.h
index 81b635e0787..939680f487c 100644
--- a/gcc/tree-ssa-propagate.h
+++ b/gcc/tree-ssa-propagate.h
@@ -107,6 +107,13 @@ class substitute_and_fold_engine
   bool substitute_and_fold (basic_block = NULL);
   bool replace_uses_in (gimple *);
   bool replace_phi_args_in (gphi *);
+
+  /* Users like VRP can overwrite this when they want to perform
+ folding for every propagation.  */
+  virtual bool should_fold_all_stmts (void)
+{
+  return false;
+}
 };

 #endif /* _TREE_SSA_PROPAGATE_H  */
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index e2850682da2..8c8fa6f2bec 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -6271,6 +6271,9 @@ class vrp_folder : public substitute_and_fold_engine
 { return vr_values->simplify_stmt_using_ranges (gsi); }
  tree op_with_constant_singleton_value_range (tree op)
 { return vr_values->op_with_constant_singleton_value_range (op); }
+
+  /* Enable aggressive folding in every propagation.  */
+  bool should_fold_all_stmts (void) { return true; }
 };

 /* If the statement pointed by SI has a predicate whose value can be


> I think it's also only necessary to fold a stmt when a (indirect) use
> after substitution has either been folded or has (new) SSA name
> info (range/known-bits) set?

Where would this need to be changed?

Regards
 Robin



Re: [patch] handle casesi dispatch insns in create_trace_edges

2019-08-13 Thread Olivier Hainque
Hi Jeff,

> On 12 Aug 2019, at 16:48, Jeff Law  wrote:

> I think part of the reason I settled on rtl.c was because we're
> really just looking at the form, not doing any real "analysis".  But I
> think either location for the implementation is fine.

The references to pc_rtx and single_set turned out to be
problematic in rtl.c, as they end up unresolved in helpers
such as genpreds.

I didn't want to start messing with those, so I have finally
stuck the definition in rtlanal, next to tablejump_p.

This is rev 274377, ChangeLog below and patch attached for
the list records. rev 274378 follows to fix the placement of
the prototype in rtl.h, very minor.

Thanks again for your feedback,

With Kind Regards,

Olivier

2019-08-13  Olivier Hainque  

* rtlanal.c (tablejump_casesi_pattern): New function, to
determine if a tablejump insn is a casesi dispatcher. Extracted
from patch_jump_insn.
* rtl.h (tablejump_casesi_pattern): Declare.
* cfgrtl.c (patch_jump_insn): Use it.
* dwarf2cfi.c (create_trace_edges): Use it.

testsuite/

* gnat.dg/casesi.ad[bs], test_casesi.adb: New test.



casesi.diff
Description: Binary data


Re: [PATCH 2/3] C++20 constexpr lib part 2/3 - swappish functions.

2019-08-13 Thread Jonathan Wakely

On 01/08/19 13:16 -0400, Ed Smith-Rowland via libstdc++ wrote:

Greetings,

Here is a patch for C++20 p0879 - Constexpr for swap and swap related 
functions.


This essentially constexprifies the rest of .

Built and tested with C++20 (and pre-c++20) on x86_64-linux.

Ok?

Regards,

Ed Smith-Rowland





2019-08-01  Edward Smith-Rowland  <3dw...@verizon.net>

Implement C++20 p0879 - Constexpr for swap and swap related functions.
* include/bits/algorithmfwd.h (__cpp_lib_constexpr_swap_algorithms):
New macro. (iter_swap, make_heap, next_permutation, partial_sort_copy,


There should be a newline after "New macro." and before the next
parenthesized list of identifiers.

The parenthesized lists should not span multiple lines, so close and
reopen the parens, i.e.

   Implement C++20 p0879 - Constexpr for swap and swap related functions.
   * include/bits/algorithmfwd.h (__cpp_lib_constexpr_swap_algorithms):
   New macro.
   (iter_swap, make_heap, next_permutation, partial_sort_copy, pop_heap)
   (prev_permutation, push_heap, reverse, rotate, sort_heap, swap)
   (swap_ranges, nth_element, partial_sort, sort): Add constexpr.


@@ -193,6 +193,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION

#if __cplusplus > 201703L
#  define __cpp_lib_constexpr_algorithms 201711L
+#  define __cpp_lib_constexpr_swap_algorithms 201712L


Should this value be 201806L?

The new macro also needs to be added to .



Optimise constant IFN_WHILE_ULTs

2019-08-13 Thread Richard Sandiford
This patch is a combination of two changes that have to be committed as
a single unit, one target-independent and one target-specific:

(1) Try to fold IFN_WHILE_ULTs with constant arguments to a VECTOR_CST
(which is always possible for fixed-length vectors but is not
necessarily so for variable-length vectors)

(2) Make the SVE port recognise constants that map to PTRUE VLn,
which includes those generated by the new fold.

(2) can't be tested without (1) and (1) would be a significant
pessimisation without (2).

The target-specific parts also start moving towards doing predicate
manipulation in a canonical VNx16BImode form, using rtx_vector_builders.

Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf and
x86_64-linux-gnu.  OK for the generic bits (= the first three files
in the diff)?

Thanks,
Richard


2019-08-13  Richard Sandiford  

gcc/
* tree.h (build_vector_a_then_b): Declare.
* tree.c (build_vector_a_then_b): New function.
* fold-const-call.c (fold_while_ult): Likewise.
(fold_const_call): Use it to handle IFN_WHILE_ULT.
* config/aarch64/aarch64-protos.h (AARCH64_FOR_SVPATTERN): New macro.
(aarch64_svpattern): New enum.
* config/aarch64/aarch64-sve.md (mov): Pass
constants through aarch64_expand_mov_immediate.
(*aarch64_sve_mov): Use aarch64_mov_operand rather
than general_operand as the predicate for operand 1.
(while_ult): Add a '@' marker.
* config/aarch64/aarch64.c (simd_immediate_info::PTRUE): New
insn_type.
(simd_immediate_info::simd_immediate_info): New overload that
takes a scalar_int_mode and an svpattern.
(simd_immediate_info::u): Add a "pattern" field.
(svpattern_token): New function.
(aarch64_get_sve_pred_bits, aarch64_widest_sve_pred_elt_size)
(aarch64_partial_ptrue_length, aarch64_svpattern_for_vl)
(aarch64_sve_move_pred_via_while): New functions.
(aarch64_expand_mov_immediate): Try using
aarch64_sve_move_pred_via_while for predicates that contain N ones
followed by M zeros but that do not correspond to a VLnnn pattern.
(aarch64_sve_pred_valid_immediate): New function.
(aarch64_simd_valid_immediate): Use it instead of dealing directly
with PTRUE and PFALSE.
(aarch64_output_sve_mov_immediate): Handle new simd_immediate_info
forms.

gcc/testsuite/
* gcc.target/aarch64/sve/spill_2.c: Increase iteration counts
beyond the range of a PTRUE.
* gcc.target/aarch64/sve/while_6.c: New test.
* gcc.target/aarch64/sve/while_7.c: Likewise.
* gcc.target/aarch64/sve/while_8.c: Likewise.
* gcc.target/aarch64/sve/while_9.c: Likewise.
* gcc.target/aarch64/sve/while_10.c: Likewise.

Index: gcc/tree.h
===
--- gcc/tree.h  2019-08-13 10:38:29.003944679 +0100
+++ gcc/tree.h  2019-08-13 11:46:02.110720978 +0100
@@ -4314,6 +4314,7 @@ extern tree build_vector_from_val (tree,
 extern tree build_uniform_cst (tree, tree);
 extern tree build_vec_series (tree, tree, tree);
 extern tree build_index_vector (tree, poly_uint64, poly_uint64);
+extern tree build_vector_a_then_b (tree, unsigned int, tree, tree);
 extern void recompute_constructor_flags (tree);
 extern void verify_constructor_flags (tree);
extern tree build_constructor (tree, vec<constructor_elt, va_gc> * 
CXX_MEM_STAT_INFO);
Index: gcc/tree.c
===
--- gcc/tree.c  2019-08-13 10:38:29.003944679 +0100
+++ gcc/tree.c  2019-08-13 11:46:02.110720978 +0100
@@ -1981,6 +1981,23 @@ build_index_vector (tree vec_type, poly_
   return v.build ();
 }
 
+/* Return a VECTOR_CST of type VEC_TYPE in which the first NUM_A
+   elements are A and the rest are B.  */
+
+tree
+build_vector_a_then_b (tree vec_type, unsigned int num_a, tree a, tree b)
+{
+  gcc_assert (known_le (num_a, TYPE_VECTOR_SUBPARTS (vec_type)));
+  unsigned int count = constant_lower_bound (TYPE_VECTOR_SUBPARTS (vec_type));
+  /* Optimize the constant case.  */
+  if ((count & 1) == 0 && TYPE_VECTOR_SUBPARTS (vec_type).is_constant ())
+count /= 2;
+  tree_vector_builder builder (vec_type, count, 2);
+  for (unsigned int i = 0; i < count * 2; ++i)
+builder.quick_push (i < num_a ? a : b);
+  return builder.build ();
+}
+
 /* Something has messed with the elements of CONSTRUCTOR C after it was built;
calculate TREE_CONSTANT and TREE_SIDE_EFFECTS.  */
 
Index: gcc/fold-const-call.c
===
--- gcc/fold-const-call.c   2019-03-08 18:15:31.292760907 +
+++ gcc/fold-const-call.c   2019-08-13 11:46:02.106721009 +0100
@@ -691,6 +691,36 @@ fold_const_vec_convert (tree ret_type, t
 
 /* Try to evaluate:
 
+  IFN_WHILE_ULT (ARG0, ARG1, (TYPE) { ... })
+
+   Return the value on success and null on failure.  */
+
+static tree

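The semantics being folded here — element I of the result predicate is true iff ARG0 + I < ARG1 — can be sketched standalone as follows. This is an illustrative model in plain C++ (no GCC internals), valid for small values and ignoring modular wraparound; the constant result has exactly the N-ones-then-zeros shape that build_vector_a_then_b produces:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Model of folding IFN_WHILE_ULT (ARG0, ARG1, ...) to a constant
// predicate: lane I is active iff ARG0 + I < ARG1 (unsigned compare),
// so the result is a run of true lanes followed by false lanes.
static std::vector<bool>
while_ult (uint64_t arg0, uint64_t arg1, unsigned int nelts)
{
  std::vector<bool> pred;
  for (unsigned int i = 0; i < nelts; ++i)
    pred.push_back (arg0 + i < arg1);
  return pred;
}
```

For fixed-length vectors this fold is always possible; for variable-length vectors it only succeeds when the active-lane count is known, which is why the result feeds the a-then-b vector constructor above.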
[committed][AArch64] Improve SVE constant moves

2019-08-13 Thread Richard Sandiford
If there's no SVE instruction to load a given constant directly, this
patch instead tries to use an Advanced SIMD constant move and then
duplicates the constant to fill an SVE vector.  The main use of this
is to support constants in which each byte is in { 0, 0xff }.

Also, the patch prefers a simple integer move followed by a duplicate
over a load from memory, like we already do for Advanced SIMD.  This is
a useful option to have and would be easy to turn off via a tuning
parameter if necessary.

The patch also extends the handling of wide LD1Rs to big endian,
whereas previously we punted to a full LD1RQ.
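As an illustration of the main constant class mentioned above — each byte in { 0, 0xff } — here is a standalone sketch of the byte test (not the backend's actual validation code, which works on the encoded immediate forms):

```cpp
#include <cassert>
#include <cstdint>

// Sketch: true iff every byte of V is 0x00 or 0xff, i.e. V has the shape
// an Advanced SIMD byte-mask immediate can produce, which can then be
// duplicated to fill an SVE vector.
static bool
each_byte_0_or_ff (uint64_t v)
{
  for (int i = 0; i < 8; ++i)
    {
      uint8_t byte = (v >> (i * 8)) & 0xff;
      if (byte != 0x00 && byte != 0xff)
	return false;
    }
  return true;
}
```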

Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf.
Applied as r274375 (treating the machmode.h bit as obvious).

Richard


2019-08-13  Richard Sandiford  

gcc/
* machmode.h (opt_mode::else_mode): New function.
(opt_mode::else_blk): Use it.
* config/aarch64/aarch64-protos.h (aarch64_vq_mode): Declare.
(aarch64_full_sve_mode, aarch64_sve_ld1rq_operand_p): Likewise.
(aarch64_gen_stepped_int_parallel): Likewise.
(aarch64_stepped_int_parallel_p): Likewise.
(aarch64_expand_mov_immediate): Remove the optional gen_vec_duplicate
argument.
* config/aarch64/aarch64.c
(aarch64_expand_sve_widened_duplicate): Delete.
(aarch64_expand_sve_dupq, aarch64_expand_sve_ld1rq): New functions.
(aarch64_expand_sve_const_vector): Rewrite to handle more cases.
(aarch64_expand_mov_immediate): Remove the optional gen_vec_duplicate
argument.  Use early returns in the !CONST_INT_P handling.
Pass all SVE data vectors to aarch64_expand_sve_const_vector rather
than handling some inline.
(aarch64_full_sve_mode, aarch64_vq_mode): New functions, split out
from...
(aarch64_simd_container_mode): ...here.
(aarch64_gen_stepped_int_parallel, aarch64_stepped_int_parallel_p)
(aarch64_sve_ld1rq_operand_p): New functions.
* config/aarch64/predicates.md (descending_int_parallel)
(aarch64_sve_ld1rq_operand): New predicates.
* config/aarch64/constraints.md (UtQ): New constraint.
* config/aarch64/aarch64.md (UNSPEC_REINTERPRET): New unspec.
* config/aarch64/aarch64-sve.md (mov): Remove the
gen_vec_duplicate from call to aarch64_expand_mov_immediate.
(@aarch64_sve_reinterpret): New expander.
(*aarch64_sve_reinterpret): New pattern.
(@aarch64_vec_duplicate_vq_le): New pattern.
(@aarch64_vec_duplicate_vq_be): Likewise.
(*sve_ld1rq): Replace with...
(@aarch64_sve_ld1rq): ...this new pattern.

gcc/testsuite/
* gcc.target/aarch64/sve/init_2.c: Expect ld1rd to be used
instead of a full vector load.
* gcc.target/aarch64/sve/init_4.c: Likewise.
* gcc.target/aarch64/sve/ld1r_2.c: Remove constants that no longer
need to be loaded from memory.
* gcc.target/aarch64/sve/slp_2.c: Expect the same output for
big and little endian.
* gcc.target/aarch64/sve/slp_3.c: Likewise.  Expect 3 of the
doubles to be moved via integer registers rather than loaded
from memory.
* gcc.target/aarch64/sve/slp_4.c: Likewise but for 4 doubles.
* gcc.target/aarch64/sve/spill_4.c: Expect 16-bit constants to be
loaded via an integer register rather than from memory.
* gcc.target/aarch64/sve/const_1.c: New test.
* gcc.target/aarch64/sve/const_2.c: Likewise.
* gcc.target/aarch64/sve/const_3.c: Likewise.

Index: gcc/machmode.h
===
--- gcc/machmode.h  2019-03-08 18:14:25.849009683 +
+++ gcc/machmode.h  2019-08-13 11:28:59.890111027 +0100
@@ -251,7 +251,8 @@ #define CLASS_HAS_WIDER_MODES_P(CLASS)
   ALWAYS_INLINE opt_mode (from_int m) : m_mode (machine_mode (m)) {}
 
   machine_mode else_void () const;
-  machine_mode else_blk () const;
+  machine_mode else_blk () const { return else_mode (BLKmode); }
+  machine_mode else_mode (machine_mode) const;
   T require () const;
 
   bool exists () const;
@@ -271,13 +272,13 @@ opt_mode::else_void () const
   return m_mode;
 }
 
-/* If the T exists, return its enum value, otherwise return E_BLKmode.  */
+/* If the T exists, return its enum value, otherwise return FALLBACK.  */
 
 template
 inline machine_mode
-opt_mode::else_blk () const
+opt_mode::else_mode (machine_mode fallback) const
 {
-  return m_mode == E_VOIDmode ? E_BLKmode : m_mode;
+  return m_mode == E_VOIDmode ? fallback : m_mode;
 }
 
 /* Assert that the object contains a T and return it.  */
Index: gcc/config/aarch64/aarch64-protos.h
===
--- gcc/config/aarch64/aarch64-protos.h 2019-08-13 11:21:04.501550177 +0100
+++ gcc/config/aarch64/aarch64-protos.h 2019-08-13 11:28:59.882111087 +0100
@@ -416,6 +416,8 @@ unsigned HOST_WIDE_INT aarch64_and_split
 

Re: [PATCH] Add --with-static-standard-libraries to the top level

2019-08-13 Thread Jonathan Wakely

On 08/08/19 14:53 -0600, Jeff Law wrote:

On 8/5/19 12:02 PM, Tom Tromey wrote:

gdb should normally not be linked with -static-libstdc++.  Currently
this has not caused problems, but it's incompatible with catching an
exception thrown from a shared library -- and a subsequent patch
changes gdb to do just this.

This patch adds a new --with-static-standard-libraries flag to the
top-level configure.  It defaults to "auto", which means enabled if
gcc is being built, and disabled otherwise.

Tom

2019-07-27  Tom Tromey  

* configure: Rebuild.
* configure.ac: Add --with-static-standard-libraries.

Deferring to Jon.

It might be worth reviewing:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=56750

And this thread:


https://gcc.gnu.org/ml/gcc-patches/2018-02/msg00403.html


While I NAK'd Aldy's patch in the email thread, if Jon thinks we should
have this capability I won't object.


The new option (and its default value) seem reasonable to me. I don't
see why GDB should be forced to link to libstdc++.a just because GCC
wants to.

What I don't understand is why GDB crashes. It should still be able to
catch exceptions from a shared library even if linked to libstdc++.a,
unless the static libstdc++.a is somehow incompatible with the shared
libstdc++.so that the shared lib was linked against.

Is this on GNU/Linux, or something with a different linking model?



Re: [patch, fortran] Some corrections for DO loop index warnings

2019-08-13 Thread Thomas Koenig

Hi Steve,


Ok.  If it regression tests cleanly on gcc9, go ahead and commit there as well.


Committed to both (after doing a regression test on gcc 9 and also
waiting for gcc-testresults containing the revision).

Thanks for the review!


[committed][AArch64] Use simd_immediate_info for SVE predicate constants

2019-08-13 Thread Richard Sandiford
This patch makes predicate constants use the normal simd_immediate_info
machinery, rather than treating PFALSE and PTRUE as special cases.
This makes it easier to add other types of predicate constant later.

Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf.
Applied as r274372.

Richard


2019-08-13  Richard Sandiford  

gcc/
* config/aarch64/aarch64-protos.h (aarch64_output_ptrue): Delete.
* config/aarch64/aarch64-sve.md (*aarch64_sve_mov):
Use a single Dn alternative instead of separate Dz and Dm
alternatives.  Use aarch64_output_sve_move_immediate.
* config/aarch64/aarch64.c (aarch64_sve_element_int_mode): New
function.
(aarch64_simd_valid_immediate): Fill in the simd_immediate_info
for predicates too.
(aarch64_output_sve_mov_immediate): Handle predicate modes.
(aarch64_output_ptrue): Delete.

Index: gcc/config/aarch64/aarch64-protos.h
===
--- gcc/config/aarch64/aarch64-protos.h 2019-08-08 18:11:22.0 +0100
+++ gcc/config/aarch64/aarch64-protos.h 2019-08-13 11:20:41.965713252 +0100
@@ -462,7 +462,6 @@ char *aarch64_output_scalar_simd_mov_imm
 char *aarch64_output_simd_mov_immediate (rtx, unsigned,
enum simd_immediate_check w = AARCH64_CHECK_MOV);
 char *aarch64_output_sve_mov_immediate (rtx);
-char *aarch64_output_ptrue (machine_mode, char);
 bool aarch64_pad_reg_upward (machine_mode, const_tree, bool);
 bool aarch64_regno_ok_for_base_p (int, bool);
 bool aarch64_regno_ok_for_index_p (int, bool);
Index: gcc/config/aarch64/aarch64-sve.md
===
--- gcc/config/aarch64/aarch64-sve.md   2019-08-13 10:38:35.963894971 +0100
+++ gcc/config/aarch64/aarch64-sve.md   2019-08-13 11:20:41.965713252 +0100
@@ -453,8 +453,8 @@ (define_expand "mov"
 )
 
 (define_insn "*aarch64_sve_mov"
-  [(set (match_operand:PRED_ALL 0 "nonimmediate_operand" "=Upa, m, Upa, Upa, Upa")
-   (match_operand:PRED_ALL 1 "general_operand" "Upa, Upa, m, Dz, Dm"))]
+  [(set (match_operand:PRED_ALL 0 "nonimmediate_operand" "=Upa, m, Upa, Upa")
+   (match_operand:PRED_ALL 1 "general_operand" "Upa, Upa, m, Dn"))]
   "TARGET_SVE
&& (register_operand (operands[0], mode)
|| register_operand (operands[1], mode))"
@@ -462,8 +462,7 @@ (define_insn "*aarch64_sve_mov"
mov\t%0.b, %1.b
str\t%1, %0
ldr\t%0, %1
-   pfalse\t%0.b
-   * return aarch64_output_ptrue (mode, '');"
+   * return aarch64_output_sve_mov_immediate (operands[1]);"
 )
 
 ;; =
Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c2019-08-13 11:18:06.762836453 +0100
+++ gcc/config/aarch64/aarch64.c2019-08-13 11:20:41.969713225 +0100
@@ -1635,6 +1635,16 @@ aarch64_get_mask_mode (poly_uint64 nunit
   return default_get_mask_mode (nunits, nbytes);
 }
 
+/* Return the integer element mode associated with SVE mode MODE.  */
+
+static scalar_int_mode
+aarch64_sve_element_int_mode (machine_mode mode)
+{
+  unsigned int elt_bits = vector_element_size (BITS_PER_SVE_VECTOR,
+  GET_MODE_NUNITS (mode));
+  return int_mode_for_size (elt_bits, 0).require ();
+}
+
 /* Implement TARGET_PREFERRED_ELSE_VALUE.  For binary operations,
prefer to use the first arithmetic operand as the else value if
the else value doesn't matter, since that exactly matches the SVE
@@ -14700,8 +14710,18 @@ aarch64_simd_valid_immediate (rtx op, si
 
   /* Handle PFALSE and PTRUE.  */
   if (vec_flags & VEC_SVE_PRED)
-return (op == CONST0_RTX (mode)
-   || op == CONSTM1_RTX (mode));
+{
+  if (op == CONST0_RTX (mode) || op == CONSTM1_RTX (mode))
+   {
+ if (info)
+   {
+ scalar_int_mode int_mode = aarch64_sve_element_int_mode (mode);
+ *info = simd_immediate_info (int_mode, op == CONSTM1_RTX (mode));
+   }
+ return true;
+   }
+  return false;
+}
 
   scalar_float_mode elt_float_mode;
   if (n_elts == 1
@@ -16393,6 +16413,21 @@ aarch64_output_sve_mov_immediate (rtx co
 
   element_char = sizetochar (GET_MODE_BITSIZE (info.elt_mode));
 
+  machine_mode vec_mode = GET_MODE (const_vector);
+  if (aarch64_sve_pred_mode_p (vec_mode))
+{
+  static char buf[sizeof ("ptrue\t%0.N, vlN")];
+  unsigned int total_bytes;
+  if (info.u.mov.value == const0_rtx)
+   snprintf (buf, sizeof (buf), "pfalse\t%%0.b");
+  else if (BYTES_PER_SVE_VECTOR.is_constant (_bytes))
+   snprintf (buf, sizeof (buf), "ptrue\t%%0.%c, vl%d", element_char,
+ total_bytes / GET_MODE_SIZE (info.elt_mode));
+  else
+   snprintf (buf, sizeof (buf), "ptrue\t%%0.%c, all", element_char);
+  return buf;
+}
+
   if 

[committed][AArch64] Make simd_immediate_info INDEX explicit

2019-08-13 Thread Richard Sandiford
This patch tweaks the representation of SVE INDEX instructions in
simd_immediate_info so that it's easier to add new types of
constant on top.

Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf.
Applied as r274371.

Richard


2019-08-13  Richard Sandiford  

gcc/
* config/aarch64/aarch64.c (simd_immediate_info::insn_type): Add
INDEX.
(simd_immediate_info::value, simd_immediate_info::step)
(simd_immediate_info::modifier, simd_immediate_info::shift): Replace
with...
(simd_immediate_info::u): ...this new union.
(simd_immediate_info::simd_immediate_info): Update accordingly.
(aarch64_output_simd_mov_immediate): Likewise.
(aarch64_output_sve_mov_immediate): Likewise.

Index: gcc/config/aarch64/aarch64.c
===
--- gcc/config/aarch64/aarch64.c2019-08-13 10:50:31.050777017 +0100
+++ gcc/config/aarch64/aarch64.c2019-08-13 11:16:57.723336150 +0100
@@ -83,7 +83,7 @@ #define POINTER_BYTES (POINTER_SIZE / BI
 /* Information about a legitimate vector immediate operand.  */
 struct simd_immediate_info
 {
-  enum insn_type { MOV, MVN };
+  enum insn_type { MOV, MVN, INDEX };
   enum modifier_type { LSL, MSL };
 
   simd_immediate_info () {}
@@ -96,29 +96,43 @@ struct simd_immediate_info
   /* The mode of the elements.  */
   scalar_mode elt_mode;
 
-  /* The value of each element if all elements are the same, or the
- first value if the constant is a series.  */
-  rtx value;
-
-  /* The value of the step if the constant is a series, null otherwise.  */
-  rtx step;
-
   /* The instruction to use to move the immediate into a vector.  */
   insn_type insn;
 
-  /* The kind of shift modifier to use, and the number of bits to shift.
- This is (LSL, 0) if no shift is needed.  */
-  modifier_type modifier;
-  unsigned int shift;
+  union
+  {
+/* For MOV and MVN.  */
+struct
+{
+  /* The value of each element.  */
+  rtx value;
+
+  /* The kind of shift modifier to use, and the number of bits to shift.
+This is (LSL, 0) if no shift is needed.  */
+  modifier_type modifier;
+  unsigned int shift;
+} mov;
+
+/* For INDEX.  */
+struct
+{
+  /* The value of the first element and the step to be added for each
+subsequent element.  */
+  rtx base, step;
+} index;
+  } u;
 };
 
 /* Construct a floating-point immediate in which each element has mode
ELT_MODE_IN and value VALUE_IN.  */
 inline simd_immediate_info
 ::simd_immediate_info (scalar_float_mode elt_mode_in, rtx value_in)
-  : elt_mode (elt_mode_in), value (value_in), step (NULL_RTX), insn (MOV),
-modifier (LSL), shift (0)
-{}
+  : elt_mode (elt_mode_in), insn (MOV)
+{
+  u.mov.value = value_in;
+  u.mov.modifier = LSL;
+  u.mov.shift = 0;
+}
 
 /* Construct an integer immediate in which each element has mode ELT_MODE_IN
and value VALUE_IN.  The other parameters are as for the structure
@@ -128,17 +142,22 @@ struct simd_immediate_info
   unsigned HOST_WIDE_INT value_in,
   insn_type insn_in, modifier_type modifier_in,
   unsigned int shift_in)
-  : elt_mode (elt_mode_in), value (gen_int_mode (value_in, elt_mode_in)),
-step (NULL_RTX), insn (insn_in), modifier (modifier_in), shift (shift_in)
-{}
+  : elt_mode (elt_mode_in), insn (insn_in)
+{
+  u.mov.value = gen_int_mode (value_in, elt_mode_in);
+  u.mov.modifier = modifier_in;
+  u.mov.shift = shift_in;
+}
 
 /* Construct an integer immediate in which each element has mode ELT_MODE_IN
-   and where element I is equal to VALUE_IN + I * STEP_IN.  */
+   and where element I is equal to BASE_IN + I * STEP_IN.  */
 inline simd_immediate_info
-::simd_immediate_info (scalar_mode elt_mode_in, rtx value_in, rtx step_in)
-  : elt_mode (elt_mode_in), value (value_in), step (step_in), insn (MOV),
-modifier (LSL), shift (0)
-{}
+::simd_immediate_info (scalar_mode elt_mode_in, rtx base_in, rtx step_in)
+  : elt_mode (elt_mode_in), insn (INDEX)
+{
+  u.index.base = base_in;
+  u.index.step = step_in;
+}
 
 /* The current code model.  */
 enum aarch64_code_model aarch64_cmodel;
@@ -16275,17 +16294,18 @@ aarch64_output_simd_mov_immediate (rtx c
 
   if (GET_MODE_CLASS (info.elt_mode) == MODE_FLOAT)
 {
-  gcc_assert (info.shift == 0 && info.insn == simd_immediate_info::MOV);
+  gcc_assert (info.insn == simd_immediate_info::MOV
+ && info.u.mov.shift == 0);
   /* For FP zero change it to a CONST_INT 0 and use the integer SIMD
 move immediate path.  */
-  if (aarch64_float_const_zero_rtx_p (info.value))
-info.value = GEN_INT (0);
+  if (aarch64_float_const_zero_rtx_p (info.u.mov.value))
+info.u.mov.value = GEN_INT (0);
   else
{
  const unsigned int buf_size = 20;
  char float_buf[buf_size] = {'\0'};
  

Re: [PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Marc Glisse

On Tue, 13 Aug 2019, Robin Dapp wrote:


+/* ((T)(A)) + CST -> (T)(A + CST)  */
+#if GIMPLE
+  (simplify
+   (plus (convert SSA_NAME@0) INTEGER_CST@1)
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+ && INTEGRAL_TYPE_P (type)


I have become rather wary of INTEGRAL_TYPE_P recently because it includes
enum types, which with -fstrict-enums can have surprising behavior. If I
have

enum E { A, B, C };
and e has type enum E, with -fstrict-enums, do your tests manage to prevent
(long)e+1 from becoming (long)(e+1) with an unsafe operation done in the
enum type?



+ && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0))
+ && int_fits_type_p (@1, TREE_TYPE (@0)))
+ /* Perform binary operation inside the cast if the constant fits
+and (A + CST)'s range does not overflow.  */
+ (with
+  {
+   wi::overflow_type min_ovf = wi::OVF_OVERFLOW,
+ max_ovf = wi::OVF_OVERFLOW;
+tree inner_type = TREE_TYPE (@0);
+
+wide_int w1 = w1.from (wi::to_wide (@1), TYPE_PRECISION (inner_type),
+   TYPE_SIGN (inner_type));
+
+wide_int wmin0, wmax0;
+if (get_range_info (@0, &wmin0, &wmax0) == VR_RANGE)
+  {
+wi::add (wmin0, w1, TYPE_SIGN (inner_type), &min_ovf);
+wi::add (wmax0, w1, TYPE_SIGN (inner_type), &max_ovf);
+  }
+  }
+ (if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
+  (convert (plus @0 { {wide_int_to_tree (TREE_TYPE (@0), w1)}; })))
+ )))
+#endif
+
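A standalone model of the safety condition this pattern checks may make the question concrete. Plain fixed-width signed integers stand in for wide_int and the recorded SSA range info, and the function name is made up for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Model of the check above: folding (T)a + cst into (T)(a + cst) is only
// safe when CST fits in a's narrower type (the int_fits_type_p test) and
// neither bound of a's known range [MIN0, MAX0] overflows when CST is
// added in that type (the two wi::add overflow tests).  The narrow type
// is modelled as a signed WIDTH-bit integer.
static bool
narrow_plus_is_safe (int64_t min0, int64_t max0, int64_t cst, unsigned width)
{
  int64_t lo = -(int64_t{1} << (width - 1));
  int64_t hi = (int64_t{1} << (width - 1)) - 1;
  if (cst < lo || cst > hi)	// int_fits_type_p analogue.
    return false;
  // min_ovf/max_ovf analogue: both shifted bounds must stay in range.
  return min0 + cst >= lo && min0 + cst <= hi
	 && max0 + cst >= lo && max0 + cst <= hi;
}
```

The enum concern is precisely that "fits in the narrower type" is a different question for an enum under -fstrict-enums than for a plain integer of the same precision.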


--
Marc Glisse


Re: [MSP430][PATCH 2/2] Read MCU data from external file

2019-08-13 Thread Jozef Lawrynowicz
On Mon, 12 Aug 2019 14:34:39 -0600
Jeff Law  wrote:

> On 8/8/19 6:17 AM, Jozef Lawrynowicz wrote:
> > This patch extends the MCU data handling so that MCU data can be provided
> > in an external file (devices.csv). This means the compiler doesn't have to
> > be updated and rebuilt to support new devices when they are released.
> > 
> > TI distribute devices.csv with other support files (header files, linker
> > scripts) on their website. I have also tested that a program builds
> > successfully for every MCU and the correct hwmult library is passed to
> > the linker.
> > I also built the msp430 cross compiler using a trunk build of native GCC to
> > ensure all the error and warning messages added by my code are correct, as
> > checked by -Wformat-diag.
> > 
> > 
> > 0002-MSP430-Devices-2-Read-MCU-data-from-external-devices.patch
> > 
> > From 6f67cdd282f2351d7450e343314beeaa745f0159 Mon Sep 17 00:00:00 2001
> > From: Jozef Lawrynowicz 
> > Date: Tue, 6 Aug 2019 10:52:54 +0100
> > Subject: [PATCH 2/2] MSP430: Devices [2]: Read MCU data from external
> >  devices.csv file, if it exists
> > 
> > gcc/ChangeLog:
> > 
> > 2019-08-XX  Jozef Lawrynowicz  
> > 
> > * gcc/config/msp430/driver-msp430.c (msp430_set_driver_var): New.
> > * gcc/config/msp430/msp430-devices.c (canonicalize_path_dirsep): New.
> > (msp430_check_path_for_devices): New.
> > (parse_devices_csv_1): New.
> > (parse_devices_csv): New.
> > (msp430_extract_mcu_data): Try to find devices.csv and search for the
> > MCU data in devices.csv before using the hard-coded data.
> > Warn if devices.csv isn't found and the MCU wasn't found in the
> > hard-coded data either.
> > * gcc/config/msp430/msp430.h (DRIVER_SELF_SPECS): Call
> > msp430_set_driver_var for -mno-warn-devices-csv and -mdevices-csv-loc.
> > Search for devices.csv on -I and -L paths.
> > (EXTRA_SPEC_FUNCTIONS): Add msp430_check_path_for_devices and
> > msp430_set_driver_var.
> > * gcc/config/msp430/msp430.opt: Add -mwarn-devices-csv and
> > -mdevices-csv-loc=.
> > * gcc/doc/invoke.texi (-mmcu): Document that -I and -L paths are
> > searched for devices.csv.
> > (mwarn-devices-csv): Document option.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > 2019-08-XX  Jozef Lawrynowicz  
> > 
> > * gcc.target/msp430/msp430.exp (msp430_device_permutations_runtest):
> > Handle csv-* and bad-devices-* tests.
> > * gcc.target/msp430/devices/README: Document how bad-devices-* tests
> > work.
> > * gcc.target/msp430/devices/bad-devices-1.c: New test.
> > * gcc.target/msp430/devices/bad-devices-2.c: Likewise.
> > * gcc.target/msp430/devices/bad-devices-3.c: Likewise.
> > * gcc.target/msp430/devices/bad-devices-4.c: Likewise.
> > * gcc.target/msp430/devices/bad-devices-5.c: Likewise.
> > * gcc.target/msp430/devices/bad-devices-6.c: Likewise.
> > * gcc.target/msp430/devices/csv-device-order.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_00.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_01.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_02.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_04.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_08.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_10.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_11.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_12.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_14.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_18.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_20.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_21.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_22.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_24.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430_28.c: Likewise.
> > * gcc.target/msp430/devices/csv-msp430fr5969.c: Likewise.
> > * gcc.target/msp430/devices/hard-foo.c: Likewise.
> > * gcc.target/msp430/devices/bad-devices-1.csv: New test support file.
> > * gcc.target/msp430/devices/bad-devices-2.csv: Likewise.
> > * gcc.target/msp430/devices/bad-devices-3.csv: Likewise.
> > * gcc.target/msp430/devices/bad-devices-4.csv: Likewise.
> > * gcc.target/msp430/devices/bad-devices-5.csv: Likewise.
> > * gcc.target/msp430/devices/bad-devices-6.csv: Likewise.
> > * gcc.target/msp430/devices/devices.csv: Likewise.
> > ---  
> So it's good we don't have to do updates when a new devices.csv is
> released -- except when we devices require new multilibs, but I don't
> offhand see a sensible way to deal with that.

For a given device, the only choice to make regarding multilibs at the moment
is whether to use a 430 or 430X multilib. The large memory model multilib will
only be used if a user passes -mlarge for a 430X device.
So if a device is added which uses a new extension to the ISA (which is not

Re: [AArch64] Add a "y" constraint for V0-V7

2019-08-13 Thread Richard Sandiford
James Greenhalgh  writes:
> On Wed, Aug 07, 2019 at 07:19:12PM +0100, Richard Sandiford wrote:
>> Some indexed SVE FCMLA operations have a 3-bit register field that
>> requires one of Z0-Z7.  This patch adds a public "y" constraint for that.
>> 
>> The patch also documents "x", which is again intended to be a public
>> constraint.
>> 
>> Tested on aarch64-linux-gnu (with and without SVE) and aarch64_be-elf.
>> OK to install?
>
>
> I had the vague recollection that 'y' already meant something... I'm
> guessing you already checked, but just in case, please check.

Yeah, it's definitely unused at the moment.  It might be this usage that
you're remembering, since we unofficially "reserved" "y" for it a few
years back.  (It was a nice coincidence that the w/x/y sequence was still
free to mean 32/16/8 FP registers. :-))

Thanks,
Richard

> Otherwise, this is OK.
>
> Thanks,
> James
>
>
>> 
>> Richard
>> 
>> 
>> 2019-08-07  Richard Sandiford  
>> 
>> gcc/
>>  * doc/md.texi: Document the x and y constraints for AArch64.
>>  * config/aarch64/aarch64.h (FP_LO8_REGNUM_P): New macro.
>>  (FP_LO8_REGS): New reg_class.
>>  (REG_CLASS_NAMES, REG_CLASS_CONTENTS): Add an entry for FP_LO8_REGS.
>>  * config/aarch64/aarch64.c (aarch64_hard_regno_nregs)
>>  (aarch64_regno_regclass, aarch64_class_max_nregs): Handle FP_LO8_REGS.
>>  * config/aarch64/predicates.md (aarch64_simd_register): Use
>>  FP_REGNUM_P instead of checking the classes manually.
>>  * config/aarch64/constraints.md (y): New constraint.
>> 
>> gcc/testsuite/
>>  * gcc.target/aarch64/asm-x-constraint-1.c: New test.
>>  * gcc.target/aarch64/asm-y-constraint-1.c: Likewise.
>> 


Use checking forms of DECL_FUNCTION_CODE (PR 91421)

2019-08-13 Thread Richard Sandiford
We were shoe-horning all built-in enumerations (including frontend
and target-specific ones) into a field of type built_in_function.  This
was accessed as either an lvalue or an rvalue using DECL_FUNCTION_CODE.

The obvious danger with this (as was noted by several ??? comments)
is that the ranges have nothing to do with each other, and targets can
easily have more built-in functions than generic code.  But my patch to
make the field bigger was the straw that finally made the problem visible.

This patch therefore:

- replaces the field with a plain unsigned int

- turns DECL_FUNCTION_CODE into an rvalue-only accessor that checks
  that the function really is BUILT_IN_NORMAL

- adds corresponding DECL_MD_FUNCTION_CODE and DECL_FE_FUNCTION_CODE
  accessors for BUILT_IN_MD and BUILT_IN_FRONTEND respectively

- adds DECL_UNCHECKED_FUNCTION_CODE for places that need to access the
  underlying field (should be low-level code only)

- adds new helpers for setting the built-in class and function code

- makes DECL_BUILT_IN_CLASS an rvalue-only accessor too, since all
  assignments should go through the new helpers

Tested on aarch64-linux-gnu and x86_64-linux-gnu (--enable-languages=all
--enable-host-shared).  OK to install?

Richard


2019-08-13  Richard Sandiford  

gcc/
PR middle-end/91421
* tree-core.h (function_decl::function_code): Change type to
unsigned int.
* tree.h (DECL_FUNCTION_CODE): Rename old definition to...
(DECL_UNCHECKED_FUNCTION_CODE): ...this.
(DECL_BUILT_IN_CLASS): Make an rvalue macro only.
(DECL_FUNCTION_CODE): New function.  Assert that the built-in class
is BUILT_IN_NORMAL.
(DECL_MD_FUNCTION_CODE, DECL_FE_FUNCTION_CODE): New functions.
(set_decl_built_in_function, copy_decl_built_in_function): Likewise.
(fndecl_built_in_p): Change the type of the "name" argument to
unsigned int.
* builtins.c (expand_builtin): Move DECL_FUNCTION_CODE use
after check for DECL_BUILT_IN_CLASS.
* cgraphclones.c (build_function_decl_skip_args): Use
set_decl_built_in_function.
* ipa-param-manipulation.c (ipa_modify_formal_parameters): Likewise.
* ipa-split.c (split_function): Likewise.
* langhooks.c (add_builtin_function_common): Likewise.
* omp-simd-clone.c (simd_clone_create): Likewise.
* tree-streamer-in.c (unpack_ts_function_decl_value_fields): Likewise.
* config/darwin.c (darwin_init_cfstring_builtins): Likewise.
(darwin_fold_builtin): Use DECL_MD_FUNCTION_CODE instead of
DECL_FUNCTION_CODE.
* fold-const.c (operand_equal_p): Compare DECL_UNCHECKED_FUNCTION_CODE
instead of DECL_FUNCTION_CODE.
* lto-streamer-out.c (hash_tree): Use DECL_UNCHECKED_FUNCTION_CODE
instead of DECL_FUNCTION_CODE.
* tree-streamer-out.c (pack_ts_function_decl_value_fields): Likewise.
* print-tree.c (print_node): Use DECL_MD_FUNCTION_CODE when
printing DECL_BUILT_IN_MD.  Handle DECL_BUILT_IN_FRONTEND.
* config/aarch64/aarch64-builtins.c (aarch64_expand_builtin)
(aarch64_fold_builtin, aarch64_gimple_fold_builtin): Use
DECL_MD_FUNCTION_CODE instead of DECL_FUNCTION_CODE.
* config/aarch64/aarch64.c (aarch64_builtin_reciprocal): Likewise.
* config/alpha/alpha.c (alpha_expand_builtin, alpha_fold_builtin):
(alpha_gimple_fold_builtin): Likewise.
* config/arc/arc.c (arc_expand_builtin): Likewise.
* config/arm/arm-builtins.c (arm_expand_builtin): Likewise.
* config/avr/avr-c.c (avr_resolve_overloaded_builtin): Likewise.
* config/avr/avr.c (avr_expand_builtin, avr_fold_builtin): Likewise.
* config/bfin/bfin.c (bfin_expand_builtin): Likewise.
* config/c6x/c6x.c (c6x_expand_builtin): Likewise.
* config/frv/frv.c (frv_expand_builtin): Likewise.
* config/gcn/gcn.c (gcn_expand_builtin_1): Likewise.
(gcn_expand_builtin): Likewise.
* config/i386/i386-builtins.c (ix86_builtin_reciprocal): Likewise.
(fold_builtin_cpu): Likewise.
* config/i386/i386-expand.c (ix86_expand_builtin): Likewise.
* config/i386/i386.c (ix86_fold_builtin): Likewise.
(ix86_gimple_fold_builtin): Likewise.
* config/ia64/ia64.c (ia64_fold_builtin): Likewise.
(ia64_expand_builtin): Likewise.
* config/iq2000/iq2000.c (iq2000_expand_builtin): Likewise.
* config/mips/mips.c (mips_expand_builtin): Likewise.
* config/msp430/msp430.c (msp430_expand_builtin): Likewise.
* config/nds32/nds32-intrinsic.c (nds32_expand_builtin_impl): Likewise.
* config/nios2/nios2.c (nios2_expand_builtin): Likewise.
* config/nvptx/nvptx.c (nvptx_expand_builtin): Likewise.
* config/pa/pa.c (pa_expand_builtin): Likewise.
* config/pru/pru.c (pru_expand_builtin): Likewise.
* config/riscv/riscv-builtins.c (riscv_expand_builtin): 

Protect some checks of DECL_FUNCTION_CODE

2019-08-13 Thread Richard Sandiford
This patch protects various uses of DECL_FUNCTION_CODE that didn't
obviously check for BUILT_IN_NORMAL first (either directly or in callers).
They could therefore trigger for functions that either aren't built-ins
or are a different kind of built-in.

Also, the patch removes a redundant GIMPLE_CALL check from
optimize_stdarg_builtin, since it gave the impression that the stmt
was less well checked than it actually is.

Tested on aarch64-linux-gnu and x86_64-linux-gnu (--enable-languages=all
--enable-host-shared).  OK to install?

Richard


2019-08-13  Richard Sandiford  

gcc/
PR middle-end/91421
* attribs.c (decl_attributes): Check the DECL_BUILT_IN_CLASS
before the DECL_FUNCTION_CODE.
* calls.c (maybe_warn_alloc_args_overflow): Use fndecl_built_in_p
to check for a BUILT_IN_ALLOCA call.
* ipa-cp.c (ipa_get_indirect_edge_target_1): Likewise for
BUILT_IN_UNREACHABLE.  Don't check for a FUNCTION_TYPE.
* ipa-devirt.c (possible_polymorphic_call_target_p): Likewise.
* ipa-prop.c (try_make_edge_direct_virtual_call): Likewise.
* gimple-ssa-isolate-paths.c (is_addr_local): Check specifically
for BUILT_IN_NORMAL functions.
* trans-mem.c (expand_block_edges): Use gimple_call_builtin_p to
test for BUILT_IN_TM_ABORT.
* tree-ssa-ccp.c (optimize_stack_restore): Use fndecl_built_in_p
to check for a BUILT_IN_STACK_RESTORE call.
(optimize_stdarg_builtin): Remove redundant check for GIMPLE_CALL.
* tree-ssa-threadedge.c
(record_temporary_equivalences_from_stmts_at_dest): Check for a
BUILT_IN_NORMAL decl before checking its DECL_FUNCTION_CODE.
* tree-vect-patterns.c (vect_recog_pow_pattern): Use a positive
test for a BUILT_IN_NORMAL call instead of a negative test for
an internal function call.

gcc/c/
PR middle-end/91421
* c-decl.c (header_for_builtin_fn): Take a FUNCTION_DECL instead
of a built_in_function.
(diagnose_mismatched_decls, implicitly_declare): Update accordingly.

Index: gcc/attribs.c
===================================================================
--- gcc/attribs.c   2019-07-18 09:22:13.409771785 +0100
+++ gcc/attribs.c   2019-08-13 10:31:18.211019996 +0100
@@ -691,6 +691,7 @@ decl_attributes (tree *node, tree attrib
 
  if (!built_in
  || !DECL_P (*anode)
+ || DECL_BUILT_IN_CLASS (*anode) != BUILT_IN_NORMAL
  || (DECL_FUNCTION_CODE (*anode) != BUILT_IN_UNREACHABLE
  && (DECL_FUNCTION_CODE (*anode)
  != BUILT_IN_UBSAN_HANDLE_BUILTIN_UNREACHABLE)))
Index: gcc/calls.c
===================================================================
--- gcc/calls.c 2019-07-29 09:39:49.750165306 +0100
+++ gcc/calls.c 2019-08-13 10:31:18.215019968 +0100
@@ -1350,7 +1350,6 @@ maybe_warn_alloc_args_overflow (tree fn,
   location_t loc = EXPR_LOCATION (exp);
 
   tree fntype = fn ? TREE_TYPE (fn) : TREE_TYPE (TREE_TYPE (exp));
-  built_in_function fncode = fn ? DECL_FUNCTION_CODE (fn) : BUILT_IN_NONE;
   bool warned = false;
 
   /* Validate each argument individually.  */
@@ -1376,11 +1375,10 @@ maybe_warn_alloc_args_overflow (tree fn,
 friends.
 Also avoid issuing the warning for calls to function named
 "alloca".  */
- if ((fncode == BUILT_IN_ALLOCA
-  && IDENTIFIER_LENGTH (DECL_NAME (fn)) != 6)
- || (fncode != BUILT_IN_ALLOCA
- && !lookup_attribute ("returns_nonnull",
-   TYPE_ATTRIBUTES (fntype
+ if (fn && fndecl_built_in_p (fn, BUILT_IN_ALLOCA)
+ ? IDENTIFIER_LENGTH (DECL_NAME (fn)) != 6
+ : !lookup_attribute ("returns_nonnull",
+  TYPE_ATTRIBUTES (fntype)))
warned = warning_at (loc, OPT_Walloc_zero,
 "%Kargument %i value is zero",
 exp, idx[i] + 1);
Index: gcc/ipa-cp.c
===================================================================
--- gcc/ipa-cp.c2019-07-16 09:11:05.561423510 +0100
+++ gcc/ipa-cp.c2019-08-13 10:31:18.219019937 +0100
@@ -2436,8 +2436,7 @@ ipa_get_indirect_edge_target_1 (struct c
  if (can_refer)
{
  if (!target
- || (TREE_CODE (TREE_TYPE (target)) == FUNCTION_TYPE
- && DECL_FUNCTION_CODE (target) == BUILT_IN_UNREACHABLE)
+ || fndecl_built_in_p (target, BUILT_IN_UNREACHABLE)
  || !possible_polymorphic_call_target_p
   (ie, cgraph_node::get (target)))
{
Index: gcc/ipa-devirt.c
===================================================================
--- gcc/ipa-devirt.c2019-08-05 17:46:20.173727579 +0100
+++ gcc/ipa-devirt.c2019-08-13 

Re: [PATCH 1/3] Perform fold when propagating.

2019-08-13 Thread Richard Biener
On Tue, Aug 13, 2019 at 10:36 AM Robin Dapp  wrote:
>
> This patch performs more aggressive folding in order for the
> match.pd changes to kick in later.
>
> Some test cases rely on VRP doing something which now already
> happens during CCP so adjust them accordingly.
>
> Also, the loop versioning pass was missing one case when
> deconstructing addresses that would only be triggered after
> this patch for me:
> It could handle addition and subsequent convert/nop but not
> a convert/nop directly.  This would cause the hash to be
> calculated differently and, in turn, cause the pass to miss
> a versioning opportunity.  Fixed this by adding the missing
> case.
> ---
>  gcc/fortran/trans-intrinsic.c  | 5 +++--
>  gcc/gimple-loop-versioning.cc  | 6 ++
>  gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C  | 4 +++-
>  gcc/testsuite/gcc.dg/pr35691-1.c   | 4 ++--
>  gcc/testsuite/gcc.dg/pr35691-2.c   | 4 ++--
>  gcc/testsuite/gcc.dg/pr35691-3.c   | 4 ++--
>  gcc/testsuite/gcc.dg/pr35691-4.c   | 4 ++--
>  gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c | 4 ++--
>  gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c  | 4 ++--
>  gcc/testsuite/gcc.dg/tree-ssa/forwprop-16.c| 4 ++--
>  gcc/testsuite/gcc.dg/tree-ssa/loop-15.c| 2 +-
>  gcc/testsuite/gcc.dg/tree-ssa/pr23744.c| 4 ++--
>  gcc/testsuite/gcc.dg/tree-ssa/pr52631.c| 4 ++--
>  gcc/testsuite/gcc.dg/vect/vect-over-widen-23.c | 2 +-
>  gcc/tree-ssa-propagate.c   | 6 --
>  15 files changed, 36 insertions(+), 25 deletions(-)
>
> diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
> index a6e33833680..99ec5f34319 100644
> --- a/gcc/fortran/trans-intrinsic.c
> +++ b/gcc/fortran/trans-intrinsic.c
> @@ -5428,8 +5428,9 @@ gfc_conv_intrinsic_findloc (gfc_se *se, gfc_expr *expr)
>tree type;
>tree tmp;
>tree found;
> -  tree forward_branch;
> -  tree back_branch;
> +  /* Initialize here to avoid 'maybe used uninitialized'.  */
> +  tree forward_branch = NULL_TREE;
> +  tree back_branch = NULL_TREE;
>gfc_loopinfo loop;
>gfc_ss *arrayss;
>gfc_ss *maskss;
> diff --git a/gcc/gimple-loop-versioning.cc b/gcc/gimple-loop-versioning.cc
> index 8fa19488490..36c70543402 100644
> --- a/gcc/gimple-loop-versioning.cc
> +++ b/gcc/gimple-loop-versioning.cc
> @@ -1266,6 +1266,12 @@ loop_versioning::record_address_fragment (gimple *stmt,
>   continue;
> }
> }
> + if (code == NOP_EXPR)

CONVERT_EXPR_CODE_P (code)

> +   {
> + tree op1 = gimple_assign_rhs1 (assign);
> + address->terms[i].expr = strip_casts (op1);
> + continue;
> +   }
> }
>i += 1;
>  }
> diff --git a/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C 
> b/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C
> index ca7e18f66a9..83a8f0668df 100644
> --- a/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C
> +++ b/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-dse2-details -Wno-return-type" } */
> +/* { dg-options "-O2 -fdump-tree-dse2-details -fdump-tree-vrp1-details 
> -Wno-return-type" } */
>
>  typedef __SIZE_TYPE__ size_t;
>  extern "C"
> @@ -55,3 +55,5 @@ fill_vec_av_set (av_set_t av)
>
>  /* { dg-final { scan-tree-dump-not "Trimming statement .head = -" "dse2" } } 
> */
>  /* { dg-final { scan-tree-dump-not "mem\[^\r\n\]*, 0\\);" "dse2" } } */
> +/* { dg-final { scan-tree-dump "Folded into: GIMPLE_NOP" "vrp1" } } */
> +/* { dg-final { scan-tree-dump-not "memmove" "dse2" } } */  }
> diff --git a/gcc/testsuite/gcc.dg/pr35691-1.c 
> b/gcc/testsuite/gcc.dg/pr35691-1.c
> index 34dc02ab560..2e1d8bd0f43 100644
> --- a/gcc/testsuite/gcc.dg/pr35691-1.c
> +++ b/gcc/testsuite/gcc.dg/pr35691-1.c
> @@ -1,5 +1,5 @@
>  /* { dg-do compile } */
> -/* { dg-options "-O2 -fdump-tree-forwprop1-details" } */
> +/* { dg-options "-O2 -fdump-tree-ccp1-details" } */

Hmm.  So this shows that while CCP doesn't produce anything
interesting in the lattice we still substitute (and now fold) everything.
The immediate next pass is forwprop so we're doing everything twice.

In total we have 3 CCP passes, 5 copy-prop and two VRP passes.

So we're re-folding everything an additional 10 times with the patch.
I know I suggested this - eh - but the number looks a bit excessive.
Mostly due to the fact we have so many of these passes.

May I suggest to add a parameter to the substitute-and-fold engine
so we can do the folding on all stmts only when enabled and enable
it just for VRP?  That also avoids the testsuite noise.

I think it's also only necessary to fold a stmt when a (indirect) use
after substitution has either been folded or has (new) SSA name
info (range/known-bits) set?

Thanks,
Richard.

>  int foo(int z0, unsigned z1)
>  {
> @@ -9,4 +9,4 @@ int foo(int z0, unsigned z1)
>return t2;
>  }
>
> -/* { dg-final { 

Re: [MSP430][PATCH 1/2] Consolidate handling of hard-coded MCU data

2019-08-13 Thread Jozef Lawrynowicz
On Mon, 12 Aug 2019 14:30:06 -0600
Jeff Law  wrote:

> On 8/8/19 6:14 AM, Jozef Lawrynowicz wrote:
> > This patch improves the handling of MCU data by consolidating multiple
> > copies of hard-coded MCU data into a single location, and adds a new 
> > function
> > to be used as a single entry point for the extraction of MCU data for the
> > selected MCU.
> > 
> > This ensures the data is only extracted once per invocation of the
> > driver/compiler, whereas previously the data for the MCU was extracted
> > each time it was needed.
> > 
> > Some notes:
> > - The GNU assembler doesn't do anything with the -mmcu option beyond 
> > setting up
> >   the CPU ISA, so if the GCC driver passes it the -mcpu option, which it 
> > will
> >   always do if -mmcu is specified, then it is redundant to also pass it 
> > -mmcu.
> > - The indenting in some places (e.g. msp430_select_hwmult_lib) looks wrong 
> > in
> >   the patched file, but to make the diff a lot easier to read I have kept 
> > the
> >   indenting the same as it was before. I can fix this after the patch is
> >   accepted.  
> FWIW, another way to address the indentation issue is with the diff -b
> option which says to ignore whitespace differences.

Ah right, thanks for the tip.

> 
> Regardless, yes, please clean up any indentation issues.
> 
> 
> > 
> > 
> > 0001-MSP430-Devices-1-Consolidate-handling-of-hard-coded-.patch
> > 
> > From cd131b07e0447d104c99317e7ac37c2420c1bf6e Mon Sep 17 00:00:00 2001
> > From: Jozef Lawrynowicz 
> > Date: Wed, 31 Jul 2019 22:53:50 +0100
> > Subject: [PATCH 1/2] MSP430: Devices [1]: Consolidate handling of hard-coded
> >  MCU data
> > 
> > gcc/ChangeLog:
> > 
> > 2019-08-XX  Jozef Lawrynowicz  
> > 
> > * gcc/config.gcc (msp430*-*-*): Add msp430-devices.o to extra_objs and
> > extra_gcc_objs.
> > * gcc/config/msp430/driver-msp430.c: Remove msp430_mcu_data.
> > (msp430_select_cpu): New spec function.
> > (msp430_select_hwmult_lib): Use msp430_extract_mcu_data to extract
> > MCU data.
> > * gcc/config/msp430/msp430-devices.c: New file.
> > * gcc/config/msp430/msp430-devices.h: New file.
> > * gcc/config/msp430/msp430.c: Remove msp430_mcu_data.
> > (msp430_option_override): Use msp430_extract_mcu_data to extract
> > MCU data.
> > (msp430_use_f5_series_hwmult): Likewise.
> > (use_32bit_hwmult): Likewise.
> > (msp430_no_hwmult): Likewise.
> > * gcc/config/msp430/msp430.h (ASM_SPEC): Don't pass -mmcu to the
> > assembler.
> > (DRIVER_SELF_SPECS): Call msp430_select_cpu if -mmcu is used without
> > and -mcpu option.
> > (EXTRA_SPEC_FUNCTIONS): Add msp430_select_cpu.
> > * gcc/config/msp430/t-msp430: Add rule to build msp430-devices.o.
> > Remove hard-coded MCU multilib data.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > 2019-08-XX  Jozef Lawrynowicz  
> > 
> > * gcc.target/msp430/msp430.exp
> > (check_effective_target_msp430_430_selected): New.
> > (check_effective_target_msp430_430x_selected): New.
> > (check_effective_target_msp430_mlarge_selected): New.
> > (check_effective_target_msp430_hwmul_not_none): New.
> > (check_effective_target_msp430_hwmul_not_16bit): New.
> > (check_effective_target_msp430_hwmul_not_32bit): New.
> > (check_effective_target_msp430_hwmul_not_f5): New.
> > (msp430_get_opts): New.
> > (msp430_device_permutations_runtest): New.
> > * gcc.target/msp430/devices/README: New file.
> > * gcc.target/msp430/devices-main.c: New test.
> > * gcc.target/msp430/devices/hard-cc430f5123.c: Likewise.
> > * gcc.target/msp430/devices/hard-foo.c: Likewise.
> > * gcc.target/msp430/devices/hard-msp430afe253.c: Likewise.
> > * gcc.target/msp430/devices/hard-msp430cg4616.c: Likewise.
> > * gcc.target/msp430/devices/hard-msp430f4783.c: Likewise.
> > * gcc.target/msp430/devices/hard-rf430frl154h_rom.c: Likewise.  
> Presumably we aren't supporting switching the selected mcu via
> attributes on a per-function basis?

No that's not supported at the moment.

Given that 430X devices support 430 and 430X ISA instructions, I suppose this is
something that could be added in the future. Similar to (but not as
complicated as) ARM arm/thumb interworking.
 
> 
> > +/* Main entry point to load the MCU data for the given -mmcu into
> > +   extracted_mcu_data.  hard_msp430_mcu_data (initialized at the bottom of 
> > this
> > +   file) is searched for the MCU name.
> > +   This function only needs to be executed once, but it can be first called
> > +   from a number of different locations.  */
> > +void
> > +msp430_extract_mcu_data (const char * mcu_name)
> > +{
> > +  static int executed = 0;
> > +  int i;
> > +  if (mcu_name == NULL || executed == 1)
> > +return;
> > +  executed = 1;
> > +  /* FIXME: This array is alpha sorted - we could use a binary search.  */
> > +  for (i = ARRAY_SIZE (hard_msp430_mcu_data); i--;)
> > +if (strcasecmp (mcu_name, hard_msp430_mcu_data[i].name) 

[C++ Patch] Improve grok_array_decl location

2019-08-13 Thread Paolo Carlini

Hi,

for this error message about the types involved in array subscripting I 
think we can exploit the available location argument and point to the 
open square bracket instead of simply using input_location. Tested 
x86_64-linux.


Thanks, Paolo Carlini.



/cp
2019-08-13  Paolo Carlini  

* decl2.c (grok_array_decl): Use the location of the open square
bracket in error message about invalid types.

/testsuite
2019-08-13  Paolo Carlini  

* g++.dg/conversion/simd4.C: Test locations.
Index: cp/decl2.c
===================================================================
--- cp/decl2.c  (revision 274330)
+++ cp/decl2.c  (working copy)
@@ -434,8 +434,8 @@ grok_array_decl (location_t loc, tree array_expr,
array_expr = p2, index_exp = i1;
   else
{
- error ("invalid types %<%T[%T]%> for array subscript",
-type, TREE_TYPE (index_exp));
+ error_at (loc, "invalid types %<%T[%T]%> for array subscript",
+   type, TREE_TYPE (index_exp));
  return error_mark_node;
}
 
Index: testsuite/g++.dg/conversion/simd4.C
===
--- testsuite/g++.dg/conversion/simd4.C (revision 274330)
+++ testsuite/g++.dg/conversion/simd4.C (working copy)
@@ -12,13 +12,13 @@ void
 foo ()
 {
   b[t];
-  b[u];// { dg-error "invalid types" }
-  b[v];// { dg-error "invalid types" }
-  b[w];// { dg-error "invalid types" }
+  b[u];// { dg-error "4:invalid types" }
+  b[v];// { dg-error "4:invalid types" }
+  b[w];// { dg-error "4:invalid types" }
   t[b];
-  u[b];// { dg-error "invalid types" }
-  v[b];// { dg-error "invalid types" }
-  w[b];// { dg-error "invalid types" }
+  u[b];// { dg-error "4:invalid types" }
+  v[b];// { dg-error "4:invalid types" }
+  w[b];// { dg-error "4:invalid types" }
   new int[t];
   new int[u];  // { dg-error "new-declarator must have integral" }
   new int[v];  // { dg-error "new-declarator must have integral" }


Re: [PATCH] Improve DSE to handle redundant zero initializations.

2019-08-13 Thread Richard Biener
On Tue, Aug 13, 2019 at 10:18 AM Richard Sandiford
 wrote:
>
> Thanks for doing this.
>
> Matthew Beliveau  writes:
> > diff --git gcc/tree-ssa-dse.c gcc/tree-ssa-dse.c
> > index 5b7c4fc6d1a..dcaeb8edbfe 100644
> > --- gcc/tree-ssa-dse.c
> > +++ gcc/tree-ssa-dse.c
> > @@ -628,11 +628,8 @@ dse_optimize_redundant_stores (gimple *stmt)
> >tree fndecl;
> >if ((is_gimple_assign (use_stmt)
> >  && gimple_vdef (use_stmt)
> > -&& ((gimple_assign_rhs_code (use_stmt) == CONSTRUCTOR
> > - && CONSTRUCTOR_NELTS (gimple_assign_rhs1 (use_stmt)) == 0
> > - && !gimple_clobber_p (stmt))
> > -|| (gimple_assign_rhs_code (use_stmt) == INTEGER_CST
> > -&& integer_zerop (gimple_assign_rhs1 (use_stmt)
> > +&& initializer_zerop (gimple_op (use_stmt, 1), NULL)
> > +&& !gimple_clobber_p (stmt))
> > || (gimple_call_builtin_p (use_stmt, BUILT_IN_NORMAL)
> > && (fndecl = gimple_call_fndecl (use_stmt)) != NULL
> > && (DECL_FUNCTION_CODE (fndecl) == BUILT_IN_MEMSET
> > @@ -1027,15 +1024,13 @@ dse_dom_walker::dse_optimize_stmt 
> > (gimple_stmt_iterator *gsi)
> >  {
> >bool by_clobber_p = false;
> >
> > -  /* First see if this store is a CONSTRUCTOR and if there
> > -  are subsequent CONSTRUCTOR stores which are totally
> > -  subsumed by this statement.  If so remove the subsequent
> > -  CONSTRUCTOR store.
> > +  /* Check if this store initializes to zero, or some aggregate of zeros,
> > +  and check if there are subsequent stores which are subsumed by this
> > +  statement.  If so, remove the subsequent store.
> >
> >This will tend to make fewer calls into memset with longer
> >arguments.  */
> > -  if (gimple_assign_rhs_code (stmt) == CONSTRUCTOR
> > -   && CONSTRUCTOR_NELTS (gimple_assign_rhs1 (stmt)) == 0
> > +  if (initializer_zerop (gimple_op (stmt, 1), NULL)
> > && !gimple_clobber_p (stmt))
> >   dse_optimize_redundant_stores (stmt);
> >
>
> In addition to Jeff's comment, the original choice of gimple_assign_rhs1
> is the preferred way to write this (applies to both hunks).

And the !gimple_clobber_p test is now redundant.

> Richard


Re: [Patch v2] Enable math functions linking with static library for LTO

2019-08-13 Thread Richard Biener
On Tue, Aug 13, 2019 at 4:22 AM luoxhu  wrote:
>
> Hi Richard,
>
> On 2019/8/12 16:51, Richard Biener wrote:
> > On Mon, Aug 12, 2019 at 8:50 AM luoxhu  wrote:
> >>
> >> Hi Richard,
> >> Thanks for your comments, updated the v2 patch as below:
> >> 1. Define and use builtin_with_linkage_p.
> >> 2. Add comments.
> >> 3. Add a testcase.
> >>
> >> In LTO mode, if a static library and a dynamic library contain the same
> >> function and both libraries are passed as arguments, the linker will pick
> >> the function from the dynamic library regardless of the order.  This patch
> >> outputs the LTO symbol node as UNDEF if the BUILT_IN_NORMAL function FNDECL
> >> is a math function, so that the function in the static library is linked
> >> first when it appears before the dynamic library.
> >
> > Comments below
> >
> >> gcc/ChangeLog
> >>
> >>  2019-08-12  Xiong Hu Luo  
> >>
> >>  PR lto/91287
> >>  * builtins.c (builtin_with_linkage_p): New function.
> >>  * builtins.h (builtin_with_linkage_p): New function.
> >>  * symtab.c (write_symbol): Use builtin_with_linkage_p.
> >>  * lto-streamer-out.c (symtab_node::output_to_lto_symbol_table_p):
> >>  Likewise.
> >>
> >> gcc/testsuite/ChangeLog
> >>
> >>  2019-08-12  Xiong Hu Luo  
> >>
> >>  PR lto/91287
> >>  * gcc.dg/pr91287.c: New testcase.
> >> ---
> >>   gcc/builtins.c | 89 ++
> >>   gcc/builtins.h |  2 +
> >>   gcc/lto-streamer-out.c |  4 +-
> >>   gcc/symtab.c   | 13 -
> >>   gcc/testsuite/gcc.dg/pr91287.c | 40 +++
> >>   5 files changed, 145 insertions(+), 3 deletions(-)
> >>   create mode 100644 gcc/testsuite/gcc.dg/pr91287.c
> >>
> >> diff --git a/gcc/builtins.c b/gcc/builtins.c
> >> index 695a9d191af..f4dea941a27 100644
> >> --- a/gcc/builtins.c
> >> +++ b/gcc/builtins.c
> >> @@ -11244,3 +11244,92 @@ target_char_cst_p (tree t, char *p)
> >> *p = (char)tree_to_uhwi (t);
> >> return true;
> >>   }
> >> +
> >> +/* Return true if DECL is a specified builtin math function.  These 
> >> functions
> >> +   should have symbol in symbol table to provide linkage with faster 
> >> version of
> >> +   libraries.  */
> >
> > The comment should read like
> >
> > /* Return true if the builtin DECL is implemented in a standard
> > library.  Otherwise
> > returns false which doesn't guarantee it is not (thus the list of
> > handled builtins
> > below may be incomplete).  */
> >
> >> +bool
> >> +builtin_with_linkage_p (tree decl)
> >> +{
> >> +  if (!decl)
> >> +return false;
> >
> > Omit this check please.
> >
> >> +  if (DECL_BUILT_IN_CLASS (decl) == BUILT_IN_NORMAL)
> >> +switch (DECL_FUNCTION_CODE (decl))
> >> +{
> >> +  CASE_FLT_FN (BUILT_IN_ACOS):
> >> +  CASE_FLT_FN (BUILT_IN_ACOSH):
> >> +  CASE_FLT_FN (BUILT_IN_ASIN):
> >> +  CASE_FLT_FN (BUILT_IN_ASINH):
> >> +  CASE_FLT_FN (BUILT_IN_ATAN):
> >> +  CASE_FLT_FN (BUILT_IN_ATANH):
> >> +  CASE_FLT_FN (BUILT_IN_ATAN2):
> >> +  CASE_FLT_FN (BUILT_IN_CBRT):
> >> +  CASE_FLT_FN (BUILT_IN_CEIL):
> >> +  CASE_FLT_FN_FLOATN_NX (BUILT_IN_CEIL):
> >> +  CASE_FLT_FN (BUILT_IN_COPYSIGN):
> >> +  CASE_FLT_FN_FLOATN_NX (BUILT_IN_COPYSIGN):
> >> +  CASE_FLT_FN (BUILT_IN_COS):
> >> +  CASE_FLT_FN (BUILT_IN_COSH):
> >> +  CASE_FLT_FN (BUILT_IN_ERF):
> >> +  CASE_FLT_FN (BUILT_IN_ERFC):
> >> +  CASE_FLT_FN (BUILT_IN_EXP):
> >> +  CASE_FLT_FN (BUILT_IN_EXP2):
> >> +  CASE_FLT_FN (BUILT_IN_EXPM1):
> >> +  CASE_FLT_FN (BUILT_IN_FABS):
> >> +  CASE_FLT_FN_FLOATN_NX (BUILT_IN_FABS):
> >> +  CASE_FLT_FN (BUILT_IN_FDIM):
> >> +  CASE_FLT_FN (BUILT_IN_FLOOR):
> >> +  CASE_FLT_FN_FLOATN_NX (BUILT_IN_FLOOR):
> >> +  CASE_FLT_FN (BUILT_IN_FMA):
> >> +  CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMA):
> >> +  CASE_FLT_FN (BUILT_IN_FMAX):
> >> +  CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMAX):
> >> +  CASE_FLT_FN (BUILT_IN_FMIN):
> >> +  CASE_FLT_FN_FLOATN_NX (BUILT_IN_FMIN):
> >> +  CASE_FLT_FN (BUILT_IN_FMOD):
> >> +  CASE_FLT_FN (BUILT_IN_FREXP):
> >> +  CASE_FLT_FN (BUILT_IN_HYPOT):
> >> +  CASE_FLT_FN (BUILT_IN_ILOGB):
> >> +  CASE_FLT_FN (BUILT_IN_LDEXP):
> >> +  CASE_FLT_FN (BUILT_IN_LGAMMA):
> >> +  CASE_FLT_FN (BUILT_IN_LLRINT):
> >> +  CASE_FLT_FN (BUILT_IN_LLROUND):
> >> +  CASE_FLT_FN (BUILT_IN_LOG):
> >> +  CASE_FLT_FN (BUILT_IN_LOG10):
> >> +  CASE_FLT_FN (BUILT_IN_LOG1P):
> >> +  CASE_FLT_FN (BUILT_IN_LOG2):
> >> +  CASE_FLT_FN (BUILT_IN_LOGB):
> >> +  CASE_FLT_FN (BUILT_IN_LRINT):
> >> +  CASE_FLT_FN (BUILT_IN_LROUND):
> >> +  CASE_FLT_FN (BUILT_IN_MODF):
> >> +  CASE_FLT_FN (BUILT_IN_NAN):
> >> +  CASE_FLT_FN (BUILT_IN_NEARBYINT):
> >> +  CASE_FLT_FN_FLOATN_NX (BUILT_IN_NEARBYINT):
> >> +  CASE_FLT_FN (BUILT_IN_NEXTAFTER):
> >> +  CASE_FLT_FN 

Re: [PATCH] PR fortran/91414: Improved PRNG

2019-08-13 Thread Janne Blomqvist
On Mon, Aug 12, 2019 at 11:23 PM Steve Kargl
 wrote:
>
> On Sun, Aug 11, 2019 at 12:37:49PM +0300, Janne Blomqvist wrote:
> > Update the PRNG from xorshift1024* to xoshiro256** by the same
> > author. For details see
> >
> > http://prng.di.unimi.it/
> >
> > and the paper at
> >
> > https://arxiv.org/abs/1805.01407
> >
> > Also the seeding is slightly improved, by reading only 8 bytes from
> > the operating system and using the simple splitmix64 PRNG to fill in
> > the rest of the PRNG state (as recommended by the xoshiro author),
> > instead of reading the entire state from the OS.
> >
> > Regtested on x86_64-pc-linux-gnu, Ok for trunk?
> >
>
> Looks good to me.

Thanks, committed to trunk (with a minor bugfix I noticed). I
backported the initialization changes to the gcc9 and gcc8 branches as
well with the attached patch.



-- 
Janne Blomqvist
From 16abae420bcff75950868a30327bf7b242e0388e Mon Sep 17 00:00:00 2001
From: Janne Blomqvist 
Date: Tue, 13 Aug 2019 10:35:05 +0300
Subject: [PATCH] PR fortran/91414 Improve initialization of PRNG

As part of PR 91414 an improved PRNG was contributed to trunk. This is
a partial backport of some related changes to the PRNG. Namely when
seeding the PRNG, it needs only 8 bytes of randomness from the OS, and
uses a simple splitmix64 PRNG to fill in the rest of the state,
instead of getting all the state from the OS. This can be useful for
operating systems that can run out of entropy.

libgfortran/ChangeLog:

2019-08-13  Janne Blomqvist  

	Partial backport from trunk
	PR fortran/91414
	* intrinsics/random.c (lcg_parkmiller): Replace with splitmix64.
	(splitmix64): New function.
	(getosrandom): Fix return value, simplify.
	(init_rand_state): Use getosrandom only to get 8 bytes, splitmix64
	to fill rest of state.
---
 libgfortran/intrinsics/random.c | 51 ++---
 1 file changed, 21 insertions(+), 30 deletions(-)

diff --git a/libgfortran/intrinsics/random.c b/libgfortran/intrinsics/random.c
index 7476439647c..9283477482f 100644
--- a/libgfortran/intrinsics/random.c
+++ b/libgfortran/intrinsics/random.c
@@ -275,30 +275,19 @@ jump (xorshift1024star_state* rs)
 }
 
 
-/* Super-simple LCG generator used in getosrandom () if /dev/urandom
-   doesn't exist.  */
+/* Splitmix64 recommended by xorshift author for initializing.  After
+   getting one uint64_t value from the OS, this is used to fill in the
+   rest of the state.  */
 
-#define M 2147483647 /* 2^31 - 1 (A large prime number) */
-#define A 16807  /* Prime root of M, passes statistical tests and produces a full cycle */
-#define Q 127773 /* M / A (To avoid overflow on A * seed) */
-#define R 2836   /* M % A (To avoid overflow on A * seed) */
-
-__attribute__((unused)) static uint32_t
-lcg_parkmiller(uint32_t seed)
+static uint64_t
+splitmix64 (uint64_t x)
 {
-uint32_t hi = seed / Q;
-uint32_t lo = seed % Q;
-int32_t test = A * lo - R * hi;
-if (test <= 0)
-test += M;
-return test;
+  uint64_t z = (x += 0x9e3779b97f4a7c15);
+  z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9;
+  z = (z ^ (z >> 27)) * 0x94d049bb133111eb;
+  return z ^ (z >> 31);
 }
 
-#undef M
-#undef A
-#undef Q
-#undef R
-
 
 /* Get some random bytes from the operating system in order to seed
the PRNG.  */
@@ -315,7 +304,7 @@ getosrandom (void *buf, size_t buflen)
 #else
 #ifdef HAVE_GETENTROPY
   if (getentropy (buf, buflen) == 0)
-return 0;
+return buflen;
 #endif
   int flags = O_RDONLY;
 #ifdef O_CLOEXEC
@@ -328,7 +317,7 @@ getosrandom (void *buf, size_t buflen)
   close (fd);
   return res;
 }
-  uint32_t seed = 1234567890;
+  uint64_t seed = 0x047f7684e9fc949dULL;
   time_t secs;
   long usecs;
   if (gf_gettime (&secs, &usecs) == 0)
@@ -340,13 +329,9 @@ getosrandom (void *buf, size_t buflen)
   pid_t pid = getpid();
   seed ^= pid;
 #endif
-  uint32_t* ub = buf;
-  for (size_t i = 0; i < buflen / sizeof (uint32_t); i++)
-{
-  ub[i] = seed;
-  seed = lcg_parkmiller (seed);
-}
-  return buflen;
+  size_t size = buflen < sizeof (uint64_t) ? buflen : sizeof (uint64_t);
+  memcpy (buf, &seed, size);
+  return size;
 #endif /* __MINGW64_VERSION_MAJOR  */
 }
 
@@ -361,7 +346,13 @@ init_rand_state (xorshift1024star_state* rs, const bool locked)
    __gthread_mutex_lock (&random_lock);
   if (!master_init)
 {
-  getosrandom (master_state, sizeof (master_state));
+  uint64_t os_seed;
+  getosrandom (&os_seed, sizeof (os_seed));
+  for (uint64_t i = 0; i < sizeof (master_state) / sizeof (uint64_t); i++)
+	{
+  os_seed = splitmix64 (os_seed);
+  master_state[i] = os_seed;
+}
   njumps = 0;
   master_init = true;
 }
-- 
2.17.1



[PATCH, contrib]: Do not escape "=".

2019-08-13 Thread Uros Bizjak
Attached patch fixes the following gawk-5.0.1 warning:

gawk: cmd. line:36: warning: regexp escape sequence `\=' is not a
known regexp operator

2019-08-13  Uros Bizjak  

* test_summary: Do not escape "=".

Tested by building mail-report.log with gawk-5.0.1 and gawk-4.2.1.

OK for mainline?

Uros.
diff --git a/contrib/test_summary b/contrib/test_summary
index 3560a64c4f19..5760b053ec27 100755
--- a/contrib/test_summary
+++ b/contrib/test_summary
@@ -127,7 +127,7 @@ NR == 1 {
   if (lang == "") lang = " "$2" "; else lang = " ";
 }
 $2 == "version" { save = $0; $1 = ""; $2 = ""; version = $0; gsub(/^ */, "", 
version); gsub(/\r$/, "", version); $0 = save; }
-/\===.*Summary/ { print ""; print; blanks=1; }
+/===.*Summary/ { print ""; print; blanks=1; }
 /tests ===/ || /^(Target|Host|Native)/ || $2 == "version" { print; blanks=1; }
 /^(XPASS|FAIL|UNRESOLVED|WARNING|ERROR|# of )/ { sub ("\r", ""); print; }
 /^using:/ { print ""; print; print ""; }


[Ada] Do not remove side-effects in an others_clause with function calls

2019-08-13 Thread Pierre-Marie de Rodat
An aggregate can be handled by the backend if it consists of static
constants of an elementary type, or null. If a component is a type
conversion we must preanalyze and resolve it to determine whether the
ultimate value is in one of these categories.  Previously we did a full
analysis and resolution of the expression for the component, which could
lead to a removal of side-effects, which is semantically incorrect if
the expression includes functions with side-effects (e.g. a call to a
random generator).

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Ed Schonberg  

gcc/ada/

* exp_aggr.adb (Aggr_Assignment_OK_For_Backend):  Preanalyze
expression, rather do a full analysis, to prevent unwanted
removal of side effects which mask the intent of the expression.

gcc/testsuite/

* gnat.dg/aggr27.adb: New testcase.--- gcc/ada/exp_aggr.adb
+++ gcc/ada/exp_aggr.adb
@@ -5321,6 +5321,16 @@ package body Exp_Aggr is
 return False;
  end if;
 
+ --  If the expression has side effects (e.g. contains calls with
+ --  potential side effects) reject as well. We only preanalyze the
+ --  expression to prevent the removal of intended side effects.
+
+ Preanalyze_And_Resolve (Expr, Ctyp);
+
+ if not Side_Effect_Free (Expr) then
+return False;
+ end if;
+
  --  The expression needs to be analyzed if True is returned
 
  Analyze_And_Resolve (Expr, Ctyp);

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/aggr27.adb
@@ -0,0 +1,26 @@
+--  { dg-do run }
+--  { dg-options "-gnatws -gnata" }
+
+with GNAT.Random_Numbers;
+
+procedure Aggr27 is
+
+   Gen: GNAT.Random_Numbers.Generator;
+
+   function Random return Long_Long_Integer is
+  Rand : Integer := GNAT.Random_Numbers.Random(Gen);
+   begin
+  return Long_Long_Integer(Rand);
+   end Random;
+
+   type Values is range 1 .. 4;
+
+   Seq_LLI : array (Values) of Long_Long_Integer := (others => Random);
+   Seq_I   : array (Values) of Integer   := (others => Integer(Random));
+
+begin
+   --  Verify that there are at least two different entries in each.
+
+   pragma Assert (For some E of Seq_LLI => E /= Seq_LLI (Values'First));
+   pragma Assert (For some E of Seq_I => E /= Seq_I (Values'First));
+end Aggr27;



[Ada] Legality rule on ancestors of type extensions in generic bodies

2019-08-13 Thread Pierre-Marie de Rodat
This patch adds an RM reference for the rule that in a generic body a
type extension cannot have ancestors that are generic formal types. The
patch also extends the check to interface progenitors that may appear in
a derived type declaration or private extension declaration.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Ed Schonberg  

gcc/ada/

* sem_ch3.adb (Check_Generic_Ancestor): New subprogram,
subsidiary to Build_Derived_Record_Type, to enforce the rule
that a type extension declared in a generic body cannot have an
ancestor that is a generic formal (RM 3.9.1 (4/2)). The rule
applies to all ancestors of the type, including interface
progenitors.

gcc/testsuite/

* gnat.dg/tagged4.adb: New testcase.--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -8574,6 +8574,84 @@ package body Sem_Ch3 is
   --  An empty Discs list means that there were no constraints in the
   --  subtype indication or that there was an error processing it.
 
+  procedure Check_Generic_Ancestors;
+  --  In Ada 2005 (AI-344), the restriction that a derived tagged type
+  --  cannot be declared at a deeper level than its parent type is
+  --  removed. The check on derivation within a generic body is also
+  --  relaxed, but there's a restriction that a derived tagged type
+  --  cannot be declared in a generic body if it's derived directly
+  --  or indirectly from a formal type of that generic. This applies
+  --  to progenitors as well.
+
+  -----------------------------
+  -- Check_Generic_Ancestors --
+  -----------------------------
+
+  procedure Check_Generic_Ancestors is
+ Ancestor_Type : Entity_Id;
+ Intf_List : List_Id;
+ Intf_Name : Node_Id;
+
+ procedure Check_Ancestor;
+ --  For parent and progenitors.
+
+ --------------------
+ -- Check_Ancestor --
+ --------------------
+
+ procedure Check_Ancestor is
+ begin
+--  If the derived type does have a formal type as an ancestor
+--  then it's an error if the derived type is declared within
+--  the body of the generic unit that declares the formal type
+--  in its generic formal part. It's sufficient to check whether
+--  the ancestor type is declared inside the same generic body
+--  as the derived type (such as within a nested generic spec),
+--  in which case the derivation is legal. If the formal type is
+--  declared outside of that generic body, then it's certain
+--  that the derived type is declared within the generic body
+--  of the generic unit declaring the formal type.
+
+if Is_Generic_Type (Ancestor_Type)
+  and then Enclosing_Generic_Body (Ancestor_Type) /=
+ Enclosing_Generic_Body (Derived_Type)
+then
+   Error_Msg_NE
+ ("ancestor type& is formal type of enclosing"
+& " generic unit (RM 3.9.1 (4/2))",
+  Indic, Ancestor_Type);
+end if;
+ end Check_Ancestor;
+
+  begin
+ if Nkind (N) = N_Private_Extension_Declaration then
+Intf_List := Interface_List (N);
+ else
+Intf_List := Interface_List (Type_Definition (N));
+ end if;
+
+ if Present (Enclosing_Generic_Body (Derived_Type)) then
+Ancestor_Type := Parent_Type;
+
+while not Is_Generic_Type (Ancestor_Type)
+  and then Etype (Ancestor_Type) /= Ancestor_Type
+loop
+   Ancestor_Type := Etype (Ancestor_Type);
+end loop;
+
+Check_Ancestor;
+
+if Present (Intf_List) then
+   Intf_Name := First (Intf_List);
+   while Present (Intf_Name) loop
+  Ancestor_Type := Entity (Intf_Name);
+  Check_Ancestor;
+  Next (Intf_Name);
+   end loop;
+end if;
+ end if;
+  end Check_Generic_Ancestors;
+
begin
   if Ekind (Parent_Type) = E_Record_Type_With_Private
 and then Present (Full_View (Parent_Type))
@@ -8680,7 +8758,8 @@ package body Sem_Ch3 is
 
   --  Indic can either be an N_Identifier if the subtype indication
   --  contains no constraint or an N_Subtype_Indication if the subtype
-  --  indication has a constraint.
+  --  indication has a constraint. In either case it can include an
+  --  interface list.
 
   Indic := Subtype_Indication (Type_Def);
   Constraint_Present := (Nkind (Indic) = N_Subtype_Indication);
@@ -8909,52 +8988,8 @@ package body Sem_Ch3 is
 Freeze_Before (N, Parent_Type);
  end if;
 
- --  In Ada 2005 (AI-344), the restriction that a derived tagged type
- --  cannot be declared at a 

[PATCH 2/3] Add simplify rules for wrapped binary operations.

2019-08-13 Thread Robin Dapp
We would like to simplify code like
 (larger_type)(var + const1) + const2
to
 (larger_type)(var + combined_const1_const2)
when we know that no overflow happens.
---
 gcc/match.pd | 101 +++
 1 file changed, 101 insertions(+)

diff --git a/gcc/match.pd b/gcc/match.pd
index 0317bc704f7..94400529ad8 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2020,6 +2020,107 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (if (cst && !TREE_OVERFLOW (cst))
  (plus { cst; } @0
 
+/* ((T)(A + CST1)) + CST2 -> (T)(A) + CST  */
+#if GIMPLE
+  (simplify
+(plus (convert (plus @0 INTEGER_CST@1)) INTEGER_CST@2)
+  (if (INTEGRAL_TYPE_P (type)
+   && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0)))
+   /* We actually want to perform two simplifications here:
+ (1) (T)(A + CST1) + CST2  --> (T)(A) + (T)(CST1)
+ If for (A + CST1) we either do not care about overflow (e.g.
+ for a signed inner type) or the overflow is ok for an unsigned
+ inner type.
+ (2) (T)(A) + (T)(CST1) + CST2 --> (T)(A) + (T)(CST1 + CST2)
+ If (CST1 + CST2) does not overflow and we do care about overflow
+ (for a signed outer type) or we do not care about overflow in an
+ unsigned outer type.  */
+   (with
+   {
+ tree inner_type = TREE_TYPE (@0);
+ wide_int wmin0, wmax0;
+ wide_int cst1 = wi::to_wide (@1);
+
+wi::overflow_type min_ovf = wi::OVF_OVERFLOW,
+  max_ovf = wi::OVF_OVERFLOW;
+
+/* Get overflow behavior.  */
+ bool ovf_undef_inner = TYPE_OVERFLOW_UNDEFINED (inner_type);
+ bool ovf_undef_outer = TYPE_OVERFLOW_UNDEFINED (type);
+
+/* Get value range of A.  */
+ enum value_range_kind vr0 = get_range_info (@0, &wmin0, &wmax0);
+
+/* If we have a proper range, determine min and max overflow
+   of (A + CST1).
+   ??? We might also want to handle anti ranges.  */
+if (vr0 == VR_RANGE)
+  {
+wi::add (wmin0, cst1, TYPE_SIGN (inner_type), &min_ovf);
+wi::add (wmax0, cst1, TYPE_SIGN (inner_type), &max_ovf);
+  }
+
+/* Inner overflow does not matter in this case.  */
+if (ovf_undef_inner)
+  {
+min_ovf = wi::OVF_NONE;
+max_ovf = wi::OVF_NONE;
+  }
+
+/* Extend CST from INNER_TYPE to TYPE.  */
+cst1 = cst1.from (cst1, TYPE_PRECISION (type), TYPE_SIGN (inner_type));
+
+/* Check for overflow of (TYPE)(CST1 + CST2).  */
+wi::overflow_type outer_ovf = wi::OVF_OVERFLOW;
+wide_int cst = wi::add (cst1, wi::to_wide (@2), TYPE_SIGN (type),
+&outer_ovf);
+
+/* We *do* care about an overflow here as we do not want to introduce
+   new undefined behavior that was not there before.  */
+if (ovf_undef_outer && outer_ovf)
+  {
+/* Set these here to prevent the final conversion below
+   to take place instead of introducing a new guard variable.  */
+min_ovf = wi::OVF_OVERFLOW;
+max_ovf = wi::OVF_OVERFLOW;
+  }
+   }
+   (if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
+(plus (convert @0) { wide_int_to_tree (type, cst); }
+ )
+#endif
+
+/* ((T)(A)) + CST -> (T)(A + CST)  */
+#if GIMPLE
+  (simplify
+   (plus (convert SSA_NAME@0) INTEGER_CST@1)
+(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
+ && INTEGRAL_TYPE_P (type)
+ && TYPE_PRECISION (type) > TYPE_PRECISION (TREE_TYPE (@0))
+ && int_fits_type_p (@1, TREE_TYPE (@0)))
+ /* Perform binary operation inside the cast if the constant fits
+and (A + CST)'s range does not overflow.  */
+ (with
+  {
+   wi::overflow_type min_ovf = wi::OVF_OVERFLOW,
+ max_ovf = wi::OVF_OVERFLOW;
+tree inner_type = TREE_TYPE (@0);
+
+wide_int w1 = w1.from (wi::to_wide (@1), TYPE_PRECISION (inner_type),
+   TYPE_SIGN (inner_type));
+
+wide_int wmin0, wmax0;
+if (get_range_info (@0, &wmin0, &wmax0) == VR_RANGE)
+  {
+wi::add (wmin0, w1, TYPE_SIGN (inner_type), &min_ovf);
+wi::add (wmax0, w1, TYPE_SIGN (inner_type), &max_ovf);
+  }
+  }
+ (if (min_ovf == wi::OVF_NONE && max_ovf == wi::OVF_NONE)
+  (convert (plus @0 { wide_int_to_tree (TREE_TYPE (@0), w1); })))
+ )))
+#endif
+
   /* ~A + A -> -1 */
   (simplify
(plus:c (bit_not @0) @0)
-- 
2.17.0



[PATCH 3/3] Add new test cases for wrapped binop simplification.

2019-08-13 Thread Robin Dapp
---
 .../gcc.dg/tree-ssa/copy-headers-5.c  |  2 +-
 .../gcc.dg/tree-ssa/copy-headers-7.c  |  2 +-
 .../gcc.dg/wrapped-binop-simplify-run.c   | 52 
 .../gcc.dg/wrapped-binop-simplify-signed-1.c  | 60 +++
 .../wrapped-binop-simplify-unsigned-1.c   | 36 +++
 5 files changed, 150 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/wrapped-binop-simplify-run.c
 create mode 100644 gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c
 create mode 100644 gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-5.c 
b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-5.c
index 3d9940558cb..7f2e4dc2e4d 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-5.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-5.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-ch2-details" } */
+/* { dg-options "-O2 -fno-tree-ccp -fno-tree-vrp -fdump-tree-ch2-details" } */
 
 int is_sorted(int *a, int n)
 {
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c 
b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c
index a0a6e6a9b57..12fca8fd2c4 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/copy-headers-7.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-ch2-details --param 
logical-op-non-short-circuit=0" } */
+/* { dg-options "-O2 -fdump-tree-ch2-details -fno-tree-ccp -fno-tree-vrp 
--param logical-op-non-short-circuit=0" } */
 
 int is_sorted(int *a, int n, int m, int k)
 {
diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-run.c 
b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-run.c
new file mode 100644
index 000..06bf348f1c6
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-run.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#include 
+#include 
+
+unsigned int a = 3;
+int aa = 3;
+int bb = 1;
+int cc = 4;
+
+__attribute__((noinline))
+long foo(int a)
+{
+  if (a >= -30 && a <= -20)
+return (long)(a + 10) + LONG_MAX;
+}
+
+int main()
+{
+  volatile unsigned long b = (unsigned long)(UINT_MAX + 1) - 1;
+  assert (b == 18446744073709551615ul);
+
+  volatile unsigned long c = (unsigned long)(a - 4) + 1;
+  assert (c == 4294967296);
+
+  volatile unsigned long d = (unsigned long)(a + UINT_MAX - 4) + 2;
+  assert (d == 4294967296);
+
+  volatile unsigned long e = (unsigned long)(a - UINT_MAX) + UINT_MAX;
+  assert (e == 4294967299);
+
+  volatile unsigned long f = (unsigned long)(a + UINT_MAX) - UINT_MAX;
+  assert (f == 18446744069414584323ul);
+
+  volatile long g = (long)(a - 4) + 1;
+  assert (g == 4294967296);
+
+  volatile long h = (long)(aa + UINT_MAX) + 1;
+  assert (h == 3);
+
+  /* Zero-extend.  */
+  volatile unsigned long i = (unsigned long)(bb + 4294967294u) + 50;
+  assert (i == 4294967345);
+
+  /* Sign-extend.  */
+  volatile unsigned long j = (unsigned long)(cc + 4294967294u) + 80;
+  assert (j == 82);
+
+  volatile unsigned long k = foo (-25);
+  assert (k == 9223372036854775792l);
+}
diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c 
b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c
new file mode 100644
index 000..19b787b61b2
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c
@@ -0,0 +1,60 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1-details" } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to" 11 "ccp1" } } */
+
+#include 
+
+long foo(int a)
+{
+  return (long)(a - 2) + 1;
+}
+
+long bar(int a)
+{
+  return (long)(a + 3) - 1;
+}
+
+long baz(int a)
+{
+  return (long)(a - 1) + 2;
+}
+
+long baf(int a)
+{
+  return (long)(a + 1) - 2;
+}
+
+long bak(int a)
+{
+  return (long)(a + 1) + 3;
+}
+
+long bal(int a)
+{
+  return (long)(a - 7) - 4;
+}
+
+long bam(int a)
+{
+  return (long)(a - 1) - INT_MAX;
+}
+
+long bam2(int a)
+{
+  return (long)(a + 1) + INT_MAX;
+}
+
+long ban(int a)
+{
+  return (long)(a - 1) + INT_MIN;
+}
+
+long ban2(int a)
+{
+  return (long)(a + 1) - INT_MIN;
+}
+
+unsigned long baq(int a)
+{
+  return (unsigned long)(a + 1) - 1;
+}
diff --git a/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c 
b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c
new file mode 100644
index 000..71ab0807d7e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c
@@ -0,0 +1,36 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-ccp1-details -fdump-tree-evrp-details 
-fdump-tree-vrp1-details" } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to" 1 "ccp1" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to" 2 "evrp" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to" 2 "vrp1" } } */
+
+#include 
+
+unsigned long foo(unsigned int a)
+{
+  if (a > 3 && a < INT_MAX - 100)
+return (unsigned long)(a - 1) + 1;
+}
+
+unsigned 

[Ada] Wrong dispatching call in type with aspect Implicit_Dereference

2019-08-13 Thread Pierre-Marie de Rodat
When a record type with an access to class-wide type discriminant
has aspect Implicit_Dereference, and the discriminant is used as the
controlling argument of a dispatching call, the compiler may generate
wrong code to dispatch the call.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Javier Miranda  

gcc/ada/

* sem_res.adb (Resolve_Selected_Component): When the type of the
component is an access to a class-wide type and the type of the
context is an access to a tagged type the relevant type is that
of the component (since in such case we may need to generate
implicit type conversions or dispatching calls).

gcc/testsuite/

* gnat.dg/tagged3.adb, gnat.dg/tagged3_pkg.adb,
gnat.dg/tagged3_pkg.ads: New testcase.--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -10598,6 +10598,10 @@ package body Sem_Res is
 
  pragma Assert (Found);
  Resolve (P, It1.Typ);
+
+ --  In general the expected type is the type of the context, not the
+ --  type of the candidate selected component.
+
  Set_Etype (N, Typ);
  Set_Entity_With_Checks (S, Comp1);
 
@@ -10610,6 +10614,17 @@ package body Sem_Res is
 
  if Ekind (Typ) = E_Anonymous_Access_Subprogram_Type then
 Set_Etype (N, Etype (Comp1));
+
+ --  When the type of the component is an access to a class-wide type
+ --  the relevant type is that of the component (since in such case we
+ --  may need to generate implicit type conversions or dispatching
+ --  calls).
+
+ elsif Is_Access_Type (Typ)
+   and then not Is_Class_Wide_Type (Designated_Type (Typ))
+   and then Is_Class_Wide_Type (Designated_Type (Etype (Comp1)))
+ then
+Set_Etype (N, Etype (Comp1));
  end if;
 
   else

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/tagged3.adb
@@ -0,0 +1,42 @@
+--  { dg-do run }
+
+with Tagged3_Pkg; use Tagged3_Pkg;
+procedure Tagged3 is
+   package SP is
+  type Ref is tagged private;
+
+  procedure Set (Self : in out Ref'Class; Data : Parent'Class);
+
+  type Reference_Type (Element : access Parent'Class)
+ is limited null record with Implicit_Dereference => Element;
+
+  function Get (Self : Ref'Class) return Reference_Type;
+
+   private
+  type Element_Access is access all Parent'Class;
+  type Ref is tagged record
+ Data : Element_Access;
+  end record;
+   end;
+
+   package body SP is
+  procedure Set (Self : in out Ref'Class; Data : Parent'Class) is
+  begin
+ Self.Data := new Parent'Class'(Data);
+  end;
+
+  function Get (Self : Ref'Class) return Reference_Type is
+  begin
+ return Reference_Type'(Element => Self.Data);
+  end;
+   end;
+
+   DC : Child;
+   RC : SP.Ref;
+begin
+   RC.Set (DC);
+   Prim1 (RC.Get.Element); -- Test
+   if not Tagged3_Pkg.Child_Prim1_Called then
+ raise Program_Error;
+   end if;
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/tagged3_pkg.adb
@@ -0,0 +1,12 @@
+with Ada.Text_IO; use Ada.Text_IO;
+package body Tagged3_Pkg is
+   procedure Prim1 (Self : access Parent) is
+   begin
+  raise Program_Error;
+   end;
+
+   procedure Prim1 (Self : access Child) is
+   begin
+ Child_Prim1_Called := True;
+   end;
+end;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/tagged3_pkg.ads
@@ -0,0 +1,9 @@
+package Tagged3_Pkg is
+   type Parent is tagged null record;
+   procedure Prim1 (Self : access Parent);
+
+   type Child is new Parent with null record;
+   procedure Prim1 (Self : access Child);
+
+   Child_Prim1_Called : Boolean := False;
+end;



[PATCH 0/3] Simplify wrapped binops.

2019-08-13 Thread Robin Dapp
Hi,

after a long time I remembered to dig out this patch again.
It deals with simplifying additions wrapped inside a cast
like
 (larger_type)(var + const1) + const2
to
 (larger_type)(var + combined_const1_const2).

The original pattern is created quite frequently on S/390 
(see PR 69526) and manifests similarly to
 add    r1,-1
 extend r1,r1
 add    r1,1
where the adds could be avoided entirely.

This is the tree part of the fix, it will still be necessary to
correct rtl code generation in doloop later.

Bootstrapped and regtested on s390x, x86 running.

Regards
 Robin

--

Robin Dapp (3):
  Perform fold when propagating.
  Add simplify rules for wrapped binary operations.
  Add new test cases for wrapped binop simplification.

 gcc/fortran/trans-intrinsic.c |   5 +-
 gcc/gimple-loop-versioning.cc |   6 ++
 gcc/match.pd  | 101 ++
 gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C |   4 +-
 gcc/testsuite/gcc.dg/pr35691-1.c  |   4 +-
 gcc/testsuite/gcc.dg/pr35691-2.c  |   4 +-
 gcc/testsuite/gcc.dg/pr35691-3.c  |   4 +-
 gcc/testsuite/gcc.dg/pr35691-4.c  |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c|   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c |   4 +-
 .../gcc.dg/tree-ssa/copy-headers-5.c  |   2 +-
 .../gcc.dg/tree-ssa/copy-headers-7.c  |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/forwprop-16.c   |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/loop-15.c   |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr23744.c   |   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr52631.c   |   4 +-
 .../gcc.dg/vect/vect-over-widen-23.c  |   2 +-
 .../gcc.dg/wrapped-binop-simplify-run.c   |  52 +
 .../gcc.dg/wrapped-binop-simplify-signed-1.c  |  60 +++
 .../wrapped-binop-simplify-unsigned-1.c   |  36 +++
 gcc/tree-ssa-propagate.c  |   6 +-
 21 files changed, 287 insertions(+), 27 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/wrapped-binop-simplify-run.c
 create mode 100644 gcc/testsuite/gcc.dg/wrapped-binop-simplify-signed-1.c
 create mode 100644 gcc/testsuite/gcc.dg/wrapped-binop-simplify-unsigned-1.c

-- 
2.17.0



[PATCH 1/3] Perform fold when propagating.

2019-08-13 Thread Robin Dapp
This patch performs more aggressive folding in order for the
match.pd changes to kick in later.

Some test cases rely on VRP doing something which now already
happens during CCP so adjust them accordingly.

Also, the loop versioning pass was missing one case when
deconstructing addresses that would only be triggered after
this patch for me:
It could handle addition and subsequent convert/nop but not
a convert/nop directly.  This would cause the hash to be
calculated differently and, in turn, cause the pass to miss
a versioning opportunity.  Fixed this by adding the missing
case.
---
 gcc/fortran/trans-intrinsic.c  | 5 +++--
 gcc/gimple-loop-versioning.cc  | 6 ++
 gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C  | 4 +++-
 gcc/testsuite/gcc.dg/pr35691-1.c   | 4 ++--
 gcc/testsuite/gcc.dg/pr35691-2.c   | 4 ++--
 gcc/testsuite/gcc.dg/pr35691-3.c   | 4 ++--
 gcc/testsuite/gcc.dg/pr35691-4.c   | 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/20040305-1.c | 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/bit-assoc.c  | 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/forwprop-16.c| 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/loop-15.c| 2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr23744.c| 4 ++--
 gcc/testsuite/gcc.dg/tree-ssa/pr52631.c| 4 ++--
 gcc/testsuite/gcc.dg/vect/vect-over-widen-23.c | 2 +-
 gcc/tree-ssa-propagate.c   | 6 --
 15 files changed, 36 insertions(+), 25 deletions(-)

diff --git a/gcc/fortran/trans-intrinsic.c b/gcc/fortran/trans-intrinsic.c
index a6e33833680..99ec5f34319 100644
--- a/gcc/fortran/trans-intrinsic.c
+++ b/gcc/fortran/trans-intrinsic.c
@@ -5428,8 +5428,9 @@ gfc_conv_intrinsic_findloc (gfc_se *se, gfc_expr *expr)
   tree type;
   tree tmp;
   tree found;
-  tree forward_branch;
-  tree back_branch;
+  /* Initialize here to avoid 'may be used uninitialized'.  */
+  tree forward_branch = NULL_TREE;
+  tree back_branch = NULL_TREE;
   gfc_loopinfo loop;
   gfc_ss *arrayss;
   gfc_ss *maskss;
diff --git a/gcc/gimple-loop-versioning.cc b/gcc/gimple-loop-versioning.cc
index 8fa19488490..36c70543402 100644
--- a/gcc/gimple-loop-versioning.cc
+++ b/gcc/gimple-loop-versioning.cc
@@ -1266,6 +1266,12 @@ loop_versioning::record_address_fragment (gimple *stmt,
  continue;
}
}
+ if (code == NOP_EXPR)
+   {
+ tree op1 = gimple_assign_rhs1 (assign);
+ address->terms[i].expr = strip_casts (op1);
+ continue;
+   }
}
   i += 1;
 }
diff --git a/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C 
b/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C
index ca7e18f66a9..83a8f0668df 100644
--- a/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C
+++ b/gcc/testsuite/g++.dg/tree-ssa/ssa-dse-2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dse2-details -Wno-return-type" } */
+/* { dg-options "-O2 -fdump-tree-dse2-details -fdump-tree-vrp1-details 
-Wno-return-type" } */
 
 typedef __SIZE_TYPE__ size_t;
 extern "C"
@@ -55,3 +55,5 @@ fill_vec_av_set (av_set_t av)
 
 /* { dg-final { scan-tree-dump-not "Trimming statement .head = -" "dse2" } } */
 /* { dg-final { scan-tree-dump-not "mem\[^\r\n\]*, 0\\);" "dse2" } } */
+/* { dg-final { scan-tree-dump "Folded into: GIMPLE_NOP" "vrp1" } } */
+/* { dg-final { scan-tree-dump-not "memmove" "dse2" } } */
diff --git a/gcc/testsuite/gcc.dg/pr35691-1.c b/gcc/testsuite/gcc.dg/pr35691-1.c
index 34dc02ab560..2e1d8bd0f43 100644
--- a/gcc/testsuite/gcc.dg/pr35691-1.c
+++ b/gcc/testsuite/gcc.dg/pr35691-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-forwprop1-details" } */
+/* { dg-options "-O2 -fdump-tree-ccp1-details" } */
 
 int foo(int z0, unsigned z1)
 {
@@ -9,4 +9,4 @@ int foo(int z0, unsigned z1)
   return t2;
 }
 
-/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) 
z1_\[0-9\]*\\(D\\);" "forwprop1" } } */
+/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) 
z1_\[0-9\]*\\(D\\);" "ccp1" } } */
diff --git a/gcc/testsuite/gcc.dg/pr35691-2.c b/gcc/testsuite/gcc.dg/pr35691-2.c
index b89ce481c6c..d5e335bb72a 100644
--- a/gcc/testsuite/gcc.dg/pr35691-2.c
+++ b/gcc/testsuite/gcc.dg/pr35691-2.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-forwprop1-details" } */
+/* { dg-options "-O2 -fdump-tree-ccp1-details" } */
 
 int foo(int z0, unsigned z1)
 {
@@ -9,4 +9,4 @@ int foo(int z0, unsigned z1)
   return t2;
 }
 
-/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) 
z1_\[0-9\]*\\(D\\);" "forwprop1" } } */
+/* { dg-final { scan-tree-dump "gimple_simplified to _\[0-9\]* = \\(int\\) 
z1_\[0-9\]*\\(D\\);" "ccp1" } } */
diff --git a/gcc/testsuite/gcc.dg/pr35691-3.c b/gcc/testsuite/gcc.dg/pr35691-3.c
index 75b49a6e2ca..a1896e47b59 100644
--- a/gcc/testsuite/gcc.dg/pr35691-3.c
+++ b/gcc/testsuite/gcc.dg/pr35691-3.c
@@ -1,5 +1,5 @@
 /* { dg-do 

[Ada] Build full derivation for private concurrent type

2019-08-13 Thread Pierre-Marie de Rodat
This extends the processing done for the derivation of private
discriminated types to concurrent types, which is now required because
this derivation is no longer redone when a subtype of the derived
concurrent type is built.

This increases the number of entities generated internally in the
compiler but this case is sufficiently rare as not to be a real concern.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Eric Botcazou  

gcc/ada/

* sem_ch3.adb (Build_Derived_Concurrent_Type): Add a couple of
local variables and use them.  When the derived type fully
constrains the parent type, rewrite it as a subtype of an
implicit (unconstrained) derived type instead of the other way
around.
(Copy_And_Build): Deal with concurrent types and use predicates.
(Build_Derived_Private_Type): Build the full derivation if
needed for concurrent types too.
(Build_Derived_Record_Type): Add marker comment.
(Complete_Private_Subtype): Use predicates.

gcc/testsuite/

* gnat.dg/discr56.adb, gnat.dg/discr56.ads,
gnat.dg/discr56_pkg1.adb, gnat.dg/discr56_pkg1.ads,
gnat.dg/discr56_pkg2.ads: New testcase.--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -6831,7 +6831,9 @@ package body Sem_Ch3 is
   Parent_Type  : Entity_Id;
   Derived_Type : Entity_Id)
is
-  Loc : constant Source_Ptr := Sloc (N);
+  Loc   : constant Source_Ptr := Sloc (N);
+  Def   : constant Node_Id:= Type_Definition (N);
+  Indic : constant Node_Id:= Subtype_Indication (Def);
 
   Corr_Record  : constant Entity_Id := Make_Temporary (Loc, 'C');
   Corr_Decl: Node_Id;
@@ -6842,8 +6844,7 @@ package body Sem_Ch3 is
   --  this case.
 
   Constraint_Present : constant Boolean :=
- Nkind (Subtype_Indication (Type_Definition (N))) =
-  N_Subtype_Indication;
+  Nkind (Indic) = N_Subtype_Indication;
 
   D_Constraint   : Node_Id;
   New_Constraint : Elist_Id := No_Elist;
@@ -6918,36 +6919,50 @@ package body Sem_Ch3 is
   Expand_To_Stored_Constraint
 (Parent_Type,
  Build_Discriminant_Constraints
-   (Parent_Type,
-Subtype_Indication (Type_Definition (N)), True));
+   (Parent_Type, Indic, True));
  end if;
 
  End_Scope;
 
   elsif Constraint_Present then
 
- --  Build constrained subtype, copying the constraint, and derive
- --  from it to create a derived constrained type.
+ --  Build an unconstrained derived type and rewrite the derived type
+ --  as a subtype of this new base type.
 
  declare
-Loc  : constant Source_Ptr := Sloc (N);
-Anon : constant Entity_Id :=
- Make_Defining_Identifier (Loc,
-   Chars => New_External_Name (Chars (Derived_Type), 'T'));
-Decl : Node_Id;
+Parent_Base : constant Entity_Id := Base_Type (Parent_Type);
+New_Base: Entity_Id;
+New_Decl: Node_Id;
+New_Indic   : Node_Id;
 
  begin
-Decl :=
+New_Base :=
+ Create_Itype (Ekind (Derived_Type), N, Derived_Type, 'B');
+
+New_Decl :=
+  Make_Full_Type_Declaration (Loc,
+ Defining_Identifier => New_Base,
+ Type_Definition =>
+   Make_Derived_Type_Definition (Loc,
+ Abstract_Present  => Abstract_Present (Def),
+ Limited_Present   => Limited_Present (Def),
+ Subtype_Indication=>
+   New_Occurrence_Of (Parent_Base, Loc)));
+
+Mark_Rewrite_Insertion (New_Decl);
+Insert_Before (N, New_Decl);
+Analyze (New_Decl);
+
+New_Indic :=
+  Make_Subtype_Indication (Loc,
+Subtype_Mark => New_Occurrence_Of (New_Base, Loc),
+Constraint   => Relocate_Node (Constraint (Indic)));
+
+Rewrite (N,
   Make_Subtype_Declaration (Loc,
-Defining_Identifier => Anon,
-Subtype_Indication =>
-  New_Copy_Tree (Subtype_Indication (Type_Definition (N))));
-Insert_Before (N, Decl);
-Analyze (Decl);
+Defining_Identifier => Derived_Type,
+Subtype_Indication  => New_Indic));
 
-Rewrite (Subtype_Indication (Type_Definition (N)),
-  New_Occurrence_Of (Anon, Loc));
-Set_Analyzed (Derived_Type, False);
 Analyze (N);
 return;
  end;
@@ -6978,10 +6993,7 @@ package body Sem_Ch3 is
 
 --  Verify that new discriminants are used to constrain old 

[Ada] Show Bit_Order and Scalar_Storage_Order in -gnatR4 output

2019-08-13 Thread Pierre-Marie de Rodat
This patch modifies the behavior of -gnatR4 so that representation
information for bit and scalar storage order gets displayed in all cases
and not just when defaults are overridden.


------------
-- Source --
------------


--  pkg.ads

package Pkg is
   type Root is tagged record
 Data0 : Integer;
   end record;

   type Derived is new Root with record
  Data1 : Integer;
   end record;
end Pkg;

-- Compilation --

$ gnatmake -gnatR4 pkg.ads

Representation information for unit Pkg (spec)
--

for Root'Size use 128;
for Root'Alignment use 8;
for Root use record
   Data0 at 8 range  0 .. 31;
end record;
for Root'Bit_Order use System.Low_Order_First;
for Root'Scalar_Storage_Order use System.Low_Order_First;

for Trootc'Size use 0;
for Trootc'Alignment use 0;
for Trootc use record
end record;
for Trootc'Bit_Order use System.Low_Order_First;
for Trootc'Scalar_Storage_Order use System.Low_Order_First;

for Derived'Size use 192;
for Derived'Alignment use 8;
for Derived use record
   Data0 at  8 range  0 .. 31;
   Data1 at 16 range  0 .. 31;
end record;
for Derived'Bit_Order use System.Low_Order_First;
for Derived'Scalar_Storage_Order use System.Low_Order_First;

for Tderivedc'Size use 0;
for Tderivedc'Alignment use 0;
for Tderivedc use record
   Data0 at  8 range  0 .. 31;
   Data1 at 16 range  0 .. 31;
end record;
for Tderivedc'Bit_Order use System.Low_Order_First;
for Tderivedc'Scalar_Storage_Order use System.Low_Order_First;

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Justin Squirek  

gcc/ada/

* repinfo.adb (List_Scalar_Storage_Order): Modify conditionals
for displaying ordering to always be triggered when -gnatR4 is
in effect.

--- gcc/ada/repinfo.adb
+++ gcc/ada/repinfo.adb
@@ -1816,8 +1816,15 @@ package body Repinfo is
begin
   --  For record types, list Bit_Order if not default, or if SSO is shown
 
+  --  Also, when -gnatR4 is in effect always list bit order and scalar
+  --  storage order explicitly, so that you don't need to know the native
+  --  endianness of the target for which the output was produced in order
+  --  to interpret it.
+
   if Is_Record_Type (Ent)
-and then (List_SSO or else Reverse_Bit_Order (Ent))
+and then (List_SSO
+   or else Reverse_Bit_Order (Ent)
+   or else List_Representation_Info = 4)
   then
  List_Attr ("Bit_Order", Reverse_Bit_Order (Ent));
   end if;
@@ -1825,7 +1832,7 @@ package body Repinfo is
   --  List SSO if required. If not, then storage is supposed to be in
   --  native order.
 
-  if List_SSO then
+  if List_SSO or else List_Representation_Info = 4 then
  List_Attr ("Scalar_Storage_Order", Reverse_Storage_Order (Ent));
   else
  pragma Assert (not Reverse_Storage_Order (Ent));



[Ada] Add conformance check on actual subp. in instance of child unit

2019-08-13 Thread Pierre-Marie de Rodat
This patch properly diagnoses a conformance error between a formal
subprogram and the corresponding actual, when the instance is that of a
child unit that is instantiated as a compilation unit. The error is
normally suppressed on an instantiation nested within another generic,
given that analysis of the enclosing generic will have performed the
conformance check on the nested instance already.  In the case of a
child unit, its instantiation requires an explicit check if it is a
compilation unit, because it has not been analyzed in the context of the
parent generic.

Compiling test.adb must yield:

  container-list.ads:3:01: instantiation error at new_container_g-list_g.ads:12
  container-list.ads:3:01: not mode conformant with declaration at types.ads:5
  container-list.ads:3:01: mode of "Self" does not match


with New_Container_G.List_G;
pragma Elaborate_All (New_Container_G.List_G);
package Container.List is new Container.List_G (Init => Types.Init_Object);
with Types;
with Erreur;
with New_Container_G;

pragma Elaborate_All (New_Container_G);

package Container is new New_Container_G (
  Element_T  => Types.Integer_T,
  Pos_Range_T=> Types.Integer_Idx_T,
  Container_Name => Erreur.None);
package Erreur is
   type Container_Name_T is (None, Everything);
end;

package body New_Container_G.List_G is

  function Get_Element_At_Pos
   (Self : access List_T;
Pos  : in Index_Range_T)
return Element_Ptr is
  begin
if not Self.T_Status (Pos) then
  Erreur.Treat_Container_Error
   (Error_Name => Erreur.Element_Not_Valid,
Container_Name => Container_Name,
Procedure_Name => Erreur.Get_Element_Ptr_At_Pos,
Context=> Erreur.Null_Context_C);
end if;
return Pos;

  end Get_Element_At_Pos;

  function Get_Element_At_Pos
   (Self : in List_T;
Pos  : in Index_Range_T)
return Element_T is
  begin
if not Self.T_Status (Pos) then
  Erreur.Treat_Container_Error
   (Error_Name => Erreur.Element_Not_Valid,
Container_Name => Container_Name,
Procedure_Name => Erreur.Get_Element_At_Pos,
Context=> Erreur.Null_Context_C);
end if;
return Self.Data (Pos);
  end Get_Element_At_Pos;

  procedure Add_New
   (Self : in out List_T;
Pos  :out Pos_Range_T) is

Free_Found : Boolean := False;
  begin
if Self.First_Free = Rbc_Constants.Null_Pos then
  Pos := Rbc_Constants.Null_Pos;
else
  Self.Size   := Self.Size + 1;
  Self.T_Status (Self.First_Free) := True;
  Pos := Self.First_Free;
  Init (Self.Data (Pos));
  if Self.First_Relevant not in
 Rbc_Constants.Null_Pos + 1 .. Self.First_Free
  then
Self.First_Relevant := Self.First_Free;
  end if;

  while not (Free_Found or Self.First_Free = Rbc_Constants.Null_Pos) loop
if Self.First_Free = Pos_Range_T'Last then
  Self.First_Free := Rbc_Constants.Null_Pos;
else
  Self.First_Free := Self.First_Free + 1;
  if not Self.T_Status (Self.First_Free) then
Free_Found := True;
  end if;
end if;
  end loop;
end if;
  end Add_New;

  procedure Add_New_At_Pos
   (Self : in out List_T;
Pos  : in out Index_Range_T) is

Free_Found : Boolean := False;
  begin
if Self.T_Status (Pos) then
  Erreur.Treat_Container_Error
   (Error_Name => Erreur.Element_Not_Valid,
Container_Name => Container_Name,
Procedure_Name => Erreur.Add_New_At_Pos,
Context=> Erreur.Null_Context_C);
else
  Self.Size   := Self.Size + 1;
  Self.T_Status (Pos) := True;
  Init (Self.Data (Pos));
  if Self.First_Relevant = Rbc_Constants.Null_Pos
or Pos < Self.First_Relevant
  then
Self.First_Relevant := Pos;
  end if;
  if Self.First_Free = Pos
  then
-- Look for a new First_Free
while not (Free_Found or Self.First_Free = Rbc_Constants.Null_Pos) loop
  if Self.First_Free = Pos_Range_T'Last then
Self.First_Free := Rbc_Constants.Null_Pos;
  else
Self.First_Free := Self.First_Free + 1;
if not Self.T_Status (Self.First_Free) then
  Free_Found := True;
end if;
  end if;
end loop;
  end if;
  -
end if;
  end Add_New_At_Pos;

  procedure Clear (Self : out List_T) is
  begin
Self.T_Status   := (others => False);
Self.First_Free := Init_First_Free;
Self.First_Relevant := Rbc_Constants.Null_Pos;
Self.Size   := Empty;
  end Clear;

  procedure Remove_Element_At_Pos
   (Self : in out List_T;
Pos  : in Index_Range_T) is

Relevant_Found : Boolean := False;
  begin
-- REMOVE ITEM IF VALID -
if not Self.T_Status (Pos) then
  

[Ada] Implement pragma Max_Entry_Queue_Length

2019-08-13 Thread Pierre-Marie de Rodat
This patch implements AI12-0164-1 for the aspect/pragma
Max_Entry_Queue_Length.  Previously, the GNAT-specific pragma
Max_Queue_Length fulfilled this role, but was not named to match the
standard and thus was insufficient.


-- Source --


--  pass.ads

with System;
package Pass is

   SOMETHING : constant Integer := 5;
   Variable : Boolean := False;

   protected type Protected_Example is

  entry A (Item : Integer)
 with Max_Entry_Queue_Length => 2;--  OK

  entry B (Item : Integer);
  pragma Max_Entry_Queue_Length (SOMETHING);  --  OK

  entry C (Item : Integer);  --  OK

  entry D (Item : Integer)
 with Max_Entry_Queue_Length => 4;--  OK

  entry D (Item : Integer; Item_B : Integer)
 with Max_Entry_Queue_Length => Float'Digits; --  OK

  entry E (Item : Integer);
  pragma Max_Entry_Queue_Length (SOMETHING * 2);  --  OK

  entry E (Item : Integer; Item_B : Integer);
  pragma Max_Entry_Queue_Length (11); --  OK

  entry F (Item : Integer; Item_B : Integer);
  pragma Pre (Variable = True);
  pragma Max_Entry_Queue_Length (11); --  OK

  entry G (Item : Integer; Item_B : Integer)
 with Pre => (Variable = True),
  Max_Entry_Queue_Length => 11;   --  OK

   private
  Data : Boolean := True;
   end Protected_Example;

   Prot_Ex  : Protected_Example;

end Pass;

--  fail.ads

package Fail is

   --  Not near entry

   pragma Max_Entry_Queue_Length (40);--  ERROR

   --  Task type

   task type Task_Example is

  entry Insert (Item : in Integer)
 with Max_Entry_Queue_Length => 10;   --  ERROR

  -- Entry family in task type

  entry A (Positive) (Item : in Integer)
 with Max_Entry_Queue_Length => 10;   --  ERROR

   end Task_Example;

   Task_Ex : Task_Example;

   --  Aspect applied to protected type

   protected type Protected_Failure_0
  with Max_Entry_Queue_Length => 50 is--  ERROR

  entry A (Item : Integer);
   private
  Data : Integer := 0;
   end Protected_Failure_0;

   Protected_Failure_0_Ex : Protected_Failure_0;

   protected type Protected_Failure is
  pragma Max_Entry_Queue_Length (10); --  ERROR

  --  Duplicates

  entry A (Item : Integer)
 with Max_Entry_Queue_Length => 10;   --  OK
  pragma Max_Entry_Queue_Length (4);  --  ERROR

  entry B (Item : Integer);
  pragma Max_Entry_Queue_Length (40); --  OK
  pragma Max_Entry_Queue_Length (4);  --  ERROR

  entry C (Item : Integer)
 with Max_Entry_Queue_Length => 10,   --  OK
  Max_Entry_Queue_Length => 40;   --  ERROR

  -- Duplicates with the same value

  entry AA (Item : Integer)
 with Max_Entry_Queue_Length => 10;   --  OK
  pragma Max_Entry_Queue_Length (10); --  ERROR

  entry BB (Item : Integer);
  pragma Max_Entry_Queue_Length (40); --  OK
  pragma Max_Entry_Queue_Length (40); --  ERROR

  entry CC (Item : Integer)
 with Max_Entry_Queue_Length => 10,   --  OK
  Max_Entry_Queue_Length => 10;   --  ERROR

  --  On subprogram

  procedure D (Item : Integer)
 with Max_Entry_Queue_Length => 10;   --  ERROR

  procedure E (Item : Integer);
  pragma Max_Entry_Queue_Length (4);  --  ERROR

  function F (Item : Integer) return Integer
 with Max_Entry_Queue_Length => 10;   --  ERROR

  function G (Item : Integer) return Integer;
  pragma Max_Entry_Queue_Length (4);  --  ERROR

  --  Bad parameters

  entry H (Item : Integer)
 with Max_Entry_Queue_Length => 0;--  ERROR

  entry I (Item : Integer)
 with Max_Entry_Queue_Length => -1;   --  ERROR

  entry J (Item : Integer)
 with Max_Entry_Queue_Length => 16#____#; --  ERROR

  entry K (Item : Integer)
 with Max_Entry_Queue_Length => False;--  ERROR

  entry L (Item : Integer)
 with Max_Entry_Queue_Length => "JUNK";   --  ERROR

  entry M (Item : Integer)
 with Max_Entry_Queue_Length => 1.0;  --  ERROR

  entry N (Item : Integer)
 with Max_Entry_Queue_Length => Long_Integer'(3); --  ERROR

  -- Entry family

  entry 

[Ada] Fix spurious instantiation error on private record type

2019-08-13 Thread Pierre-Marie de Rodat
This change was initially aimed at fixing a spurious instantiation error
due to a disambiguation issue which happens when a generic unit with two
formal type parameters is instantiated on a single actual type that is
private.

The compiler internally sets the Is_Generic_Actual_Type flag on the
actual subtypes built for the instantiation in order to ease the
disambiguation, but it would fail to set it on the full view if the
subtypes are private.  The change makes it so that the flag is properly
set and reset on the full view in this case.

But this uncovered an issue in Subtypes_Statically_Match, which was
relying on a stale Is_Generic_Actual_Type flag set on a full view
outside of the instantiation to return a positive answer.  This bypass
was meant to solve an issue arising with a private discriminated record
type whose completion is a discriminated record type itself derived from
a private discriminated record type, which is used as actual type in an
instantiation in another unit, and the instantiation is used in a child
unit of the original unit.  In this case, the private and full views of
the generic actual type are swapped in the child unit, but there was a
mismatch between the chain of full and underlying full views of the
private discriminated record type and that of the generic actual type.

This secondary issue is solved by not skipping the full view in the
preparation of the completion of the private subtype and by directly
constraining the underlying full view of the full view of the base type
instead of building an underlying full view from scratch.
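
For readers without the testcase at hand, the triggering pattern looks
roughly like the following reduced sketch.  All names here are made up
for illustration and are not taken from generic_inst10; each library
unit would of course live in its own file.

```ada
--  Reduced sketch of the triggering pattern (illustrative names only).
--  A generic unit with two formal private types is instantiated with a
--  single actual type that is private at the point of instantiation.

package Types is
   type Priv is private;
private
   type Priv is record
      Value : Integer := 0;
   end record;
end Types;

generic
   type T1 is private;
   type T2 is private;
package Gen is
   function Convert (X : T1) return T2;
end Gen;

with Types, Gen;
package Inst is new Gen (T1 => Types.Priv, T2 => Types.Priv);
--  The compiler builds actual subtypes for T1 and T2 and marks them
--  with Is_Generic_Actual_Type to ease disambiguating T1 from T2; the
--  fix also sets (and later resets) the flag on their full views.
```

The point is only that both formals are instantiated on one private
actual, which is where the flag used to be missed on the full view.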

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Eric Botcazou  

gcc/ada/

* sem_ch3.adb (Build_Underlying_Full_View): Delete.
(Complete_Private_Subtype): Do not set the full view on the
private subtype here.  If the full base is itself derived from
private, do not re-derive the parent type but instead constrain
an existing underlying full view.
(Prepare_Private_Subtype_Completion): Do not get to the
underlying full view, if any.  Set the full view on the private
subtype here.
(Process_Full_View): Likewise.
* sem_ch12.adb (Check_Generic_Actuals): Also set
Is_Generic_Actual_Type on the full view if the type of the
actual is private.
(Restore_Private_Views): Also reset Is_Generic_Actual_Type on
the full view if the type of the actual is private.
* sem_eval.adb (Subtypes_Statically_Match): Remove bypass for
generic actual types.

gcc/testsuite/

* gnat.dg/generic_inst10.adb, gnat.dg/generic_inst10_pkg.ads:
New testcase.

--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -6804,7 +6804,12 @@ package body Sem_Ch12 is
Check_Private_View (Subtype_Indication (Parent (E)));
 end if;
 
-Set_Is_Generic_Actual_Type (E, True);
+Set_Is_Generic_Actual_Type (E);
+
+if Is_Private_Type (E) and then Present (Full_View (E)) then
+   Set_Is_Generic_Actual_Type (Full_View (E));
+end if;
+
 Set_Is_Hidden (E, False);
 Set_Is_Potentially_Use_Visible (E, In_Use (Instance));
 
@@ -14603,6 +14608,10 @@ package body Sem_Ch12 is
null;
 else
Set_Is_Generic_Actual_Type (E, False);
+
+   if Is_Private_Type (E) and then Present (Full_View (E)) then
+  Set_Is_Generic_Actual_Type (Full_View (E), False);
+   end if;
 end if;
 
 --  An unusual case of aliasing: the actual may also be directly

--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -232,18 +232,6 @@ package body Sem_Ch3 is
--  Needs a more complete spec--what are the parameters exactly, and what
--  exactly is the returned value, and how is Bound affected???
 
-   procedure Build_Underlying_Full_View
- (N   : Node_Id;
-  Typ : Entity_Id;
-  Par : Entity_Id);
-   --  If the completion of a private type is itself derived from a private
-   --  type, or if the full view of a private subtype is itself private, the
-   --  back-end has no way to compute the actual size of this type. We build
-   --  an internal subtype declaration of the proper parent type to convey
-   --  this information. This extra mechanism is needed because a full
-   --  view cannot itself have a full view (it would get clobbered during
-   --  view exchanges).
-
procedure Check_Access_Discriminant_Requires_Limited
  (D   : Node_Id;
   Loc : Node_Id);
@@ -10447,111 +10435,6 @@ package body Sem_Ch3 is
   return New_Bound;
end Build_Scalar_Bound;
 
-   
-   -- Build_Underlying_Full_View --
-   
-
-   procedure Build_Underlying_Full_View
- (N   : Node_Id;
-  Typ : Entity_Id;
-  Par : Entity_Id)
-   is
-  Loc  : constant Source_Ptr := Sloc (N);
-  Subt : 

[Ada] Small cleanup and improvement in inlining machinery

2019-08-13 Thread Pierre-Marie de Rodat
This is a small cleanup in the inlining machinery of the front-end
dealing with back-end inlining.  It should save a few cycles at -O0 by
stopping it from doing useless work.  No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Eric Botcazou  

gcc/ada/

* exp_ch6.adb: Remove with and use clauses for Sem_Ch12.
(Expand_Call_Helper): Swap the back-end inlining case and the
special front-end expansion case.  In back-end inlining mode, do
not invoke Add_Inlined_Body unless the call may be inlined.
* inline.ads (Add_Pending_Instantiation): New function moved
from...
* inline.adb (Add_Inlined_Body): Simplify comment.  Turn test on
the enclosing unit into assertion.
(Add_Pending_Instantiation): New function moved from...
* sem_ch12.ads (Add_Pending_Instantiation): ...here.
* sem_ch12.adb (Add_Pending_Instantiation): ...here.

--- gcc/ada/exp_ch6.adb
+++ gcc/ada/exp_ch6.adb
@@ -60,7 +60,6 @@ with Sem;   use Sem;
 with Sem_Aux;   use Sem_Aux;
 with Sem_Ch6;   use Sem_Ch6;
 with Sem_Ch8;   use Sem_Ch8;
-with Sem_Ch12;  use Sem_Ch12;
 with Sem_Ch13;  use Sem_Ch13;
 with Sem_Dim;   use Sem_Dim;
 with Sem_Disp;  use Sem_Disp;
@@ -4316,15 +4315,15 @@ package body Exp_Ch6 is
  if not Is_Inlined (Subp) then
 null;
 
- --  Frontend inlining of expression functions (performed also when
- --  backend inlining is enabled).
+ --  Front-end inlining of expression functions (performed also when
+ --  back-end inlining is enabled).
 
  elsif Is_Inlinable_Expression_Function (Subp) then
 Rewrite (N, New_Copy (Expression_Of_Expression_Function (Subp)));
 Analyze (N);
 return;
 
- --  Handle frontend inlining
+ --  Handle front-end inlining
 
  elsif not Back_End_Inlining then
 Inlined_Subprogram : declare
@@ -4420,27 +4419,36 @@ package body Exp_Ch6 is
end if;
 end Inlined_Subprogram;
 
- --  Back end inlining: let the back end handle it
+ --  Front-end expansion of simple functions returning unconstrained
+ --  types (see Check_And_Split_Unconstrained_Function). Note that the
+ --  case of a simple renaming (Body_To_Inline in N_Entity below, see
+ --  also Build_Renamed_Body) cannot be expanded here because this may
+ --  give rise to order-of-elaboration issues for the types of the
+ --  parameters of the subprogram, if any.
+
+ elsif Present (Unit_Declaration_Node (Subp))
+   and then Nkind (Unit_Declaration_Node (Subp)) =
+   N_Subprogram_Declaration
+   and then Present (Body_To_Inline (Unit_Declaration_Node (Subp)))
+   and then
+ Nkind (Body_To_Inline (Unit_Declaration_Node (Subp))) not in
+   N_Entity
+ then
+Expand_Inlined_Call (Call_Node, Subp, Orig_Subp);
+
+ --  Back-end inlining either if optimization is enabled or the call is
+ --  required to be inlined.
 
- elsif No (Unit_Declaration_Node (Subp))
-   or else Nkind (Unit_Declaration_Node (Subp)) /=
- N_Subprogram_Declaration
-   or else No (Body_To_Inline (Unit_Declaration_Node (Subp)))
-   or else Nkind (Body_To_Inline (Unit_Declaration_Node (Subp))) in
-  N_Entity
+ elsif Optimization_Level > 0
+   or else Has_Pragma_Inline_Always (Subp)
  then
 Add_Inlined_Body (Subp, Call_Node);
 
---  If the inlined call appears within an instantiation and either
---  is required to be inlined or optimization is enabled, ensure
+--  If the inlined call appears within an instance, then ensure
 --  that the enclosing instance body is available so the back end
 --  can actually perform the inlining.
 
-if In_Instance
-  and then Comes_From_Source (Subp)
-  and then (Has_Pragma_Inline_Always (Subp)
- or else Optimization_Level > 0)
-then
+if In_Instance and then Comes_From_Source (Subp) then
declare
   Decl  : Node_Id;
   Inst  : Entity_Id;
@@ -4491,16 +4499,6 @@ package body Exp_Ch6 is
   end if;
end;
 end if;
-
- --  Front end expansion of simple functions returning unconstrained
- --  types (see Check_And_Split_Unconstrained_Function). Note that the
- --  case of a simple renaming (Body_To_Inline in N_Entity above, see
- --  also Build_Renamed_Body) cannot be expanded here because this may
-   

[Ada] Avoid crash in GNATprove_Mode on allocator inside type

2019-08-13 Thread Pierre-Marie de Rodat
In the special mode for GNATprove, subtypes should be declared for
allocators when constraints are used. This was done previously but it
does not work inside spec expressions, as the declaration is not
inserted and analyzed in the AST in that case, leading to a later crash
on an incomplete entity. Thus, no declaration should be created in such
a case, letting GNATprove later reject such code due to the use of an
allocator in an interfering context.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Yannick Moy  

gcc/ada/

* sem_ch4.adb (Analyze_Allocator): Do not insert subtype
declaration for allocator inside a spec expression.

gcc/testsuite/

* gnat.dg/allocator2.adb, gnat.dg/allocator2.ads: New testcase.--- gcc/ada/sem_ch4.adb
+++ gcc/ada/sem_ch4.adb
@@ -676,9 +676,15 @@ package body Sem_Ch4 is
 
--  In GNATprove mode we need to preserve the link between
--  the original subtype indication and the anonymous subtype,
-   --  to extend proofs to constrained acccess types.
-
-   if Expander_Active or else GNATprove_Mode then
+   --  to extend proofs to constrained acccess types. We only do
+   --  that outside of spec expressions, otherwise the declaration
+   --  cannot be inserted and analyzed. In such a case, GNATprove
+   --  later rejects the allocator as it is not used here in
+   --  a non-interfering context (SPARK 4.8(2) and 7.1.3(12)).
+
+   if Expander_Active
+ or else (GNATprove_Mode and then not In_Spec_Expression)
+   then
   Def_Id := Make_Temporary (Loc, 'S');
 
   Insert_Action (E,

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/allocator2.adb
@@ -0,0 +1,6 @@
+--  { dg-do compile }
+--  { dg-options "-gnatd.F" }
+
+package body Allocator2 is
+   procedure Dummy is null;
+end Allocator2;

--- /dev/null
new file mode 100644
+++ gcc/testsuite/gnat.dg/allocator2.ads
@@ -0,0 +1,15 @@
+pragma SPARK_Mode;
+package Allocator2 is
+type Nat_Array is array (Positive range <>) of Natural with
+  Default_Component_Value => 0;
+type Nat_Stack (Max : Natural) is record
+   Content : Nat_Array (1 .. Max);
+end record;
+type Stack_Acc is access Nat_Stack;
+type My_Rec is private;
+private
+type My_Rec is record
+   My_Stack : Stack_Acc := new Nat_Stack (Max => 10);
+end record;
+   procedure Dummy;
+end Allocator2;



[Ada] Avoid spurious errors on dimensionality checking in GNATprove

2019-08-13 Thread Pierre-Marie de Rodat
Complete the partial treatment that was started in r273405.  Instead of
checking for the special case of nodes inside inlined bodies at the call
site, check for this condition inside the top-level procedures called
for dimensionality checking.

There is no impact on compilation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Yannick Moy  

gcc/ada/

* sem_dim.adb (Analyze_Dimension,
Analyze_Dimension_Array_Aggregate, Analyze_Dimension_Call,
Analyze_Dimension_Extension_Or_Record_Aggregate): Return
immediately when inside an inlined body.
* sem_res.adb (Resolve_Call): Remove special checking now done
inside Analyze_Dimension_Call.

--- gcc/ada/sem_dim.adb
+++ gcc/ada/sem_dim.adb
@@ -1142,6 +1142,11 @@ package body Sem_Dim is
   if Ada_Version < Ada_2012 then
  return;
 
+  --  Inlined bodies have already been checked for dimensionality
+
+  elsif In_Inlined_Body then
+ return;
+
   elsif not Comes_From_Source (N) then
  if Nkind_In (N, N_Explicit_Dereference,
  N_Identifier,
@@ -1245,10 +1250,13 @@ package body Sem_Dim is
   --  Aspect is an Ada 2012 feature. Nothing to do here if the component
   --  base type is not a dimensioned type.
 
+  --  Inlined bodies have already been checked for dimensionality.
+
   --  Note that here the original node must come from source since the
   --  original array aggregate may not have been entirely decorated.
 
   if Ada_Version < Ada_2012
+or else In_Inlined_Body
 or else not Comes_From_Source (Original_Node (N))
 or else not Has_Dimension_System (Base_Type (Comp_Typ))
   then
@@ -1634,10 +1642,11 @@ package body Sem_Dim is
 
begin
   --  Aspect is an Ada 2012 feature. Note that there is no need to check
-  --  dimensions for calls that don't come from source, or those that may
-  --  have semantic errors.
+  --  dimensions for calls in inlined bodies, or calls that don't come
+  --  from source, or those that may have semantic errors.
 
   if Ada_Version < Ada_2012
+or else In_Inlined_Body
 or else not Comes_From_Source (N)
 or else Error_Posted (N)
   then
@@ -1966,11 +1975,12 @@ package body Sem_Dim is
 
begin
   --  Aspect is an Ada 2012 feature. Note that there is no need to check
-  --  dimensions for aggregates that don't come from source, or if we are
-  --  within an initialization procedure, whose expressions have been
-  --  checked at the point of record declaration.
+  --  dimensions in inlined bodies, or for aggregates that don't come
+  --  from source, or if we are within an initialization procedure, whose
+  --  expressions have been checked at the point of record declaration.
 
   if Ada_Version < Ada_2012
+or else In_Inlined_Body
 or else not Comes_From_Source (N)
 or else Inside_Init_Proc
   then

--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -6952,9 +6952,7 @@ package body Sem_Res is
   --  Check the dimensions of the actuals in the call. For function calls,
   --  propagate the dimensions from the returned type to N.
 
-  if not In_Inlined_Body then
- Analyze_Dimension_Call (N, Nam);
-  end if;
+  Analyze_Dimension_Call (N, Nam);
 
   --  All done, evaluate call and deal with elaboration issues
 



[Ada] Spurious error on nested instantiation

2019-08-13 Thread Pierre-Marie de Rodat
This fixes a spurious error given by the compiler for a call to a
subprogram which is the formal subprogram parameter of a generic
package, if the generic package is instantiated in the body of an
enclosing generic package with two formal types and two formal
subprogram parameter homonyms taking them, and this instantiation takes
one of the two formal types as actual, and the enclosing generic package is
instantiated on the same actual type with a single actual subprogram
parameter, and the aforementioned call is overloaded.

In this case, the renaming generated for the actual subprogram parameter
in the nested instantiation is ambiguous and must be disambiguated using
the corresponding formal parameter of the enclosing instantiation,
otherwise a (sub)type mismatch is created and later subprogram
disambiguation is not really possible.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Eric Botcazou  

gcc/ada/

* sem_ch4.adb (Analyze_One_Call): Remove bypass for type
mismatch in nested instantiations.
* sem_ch8.adb (Find_Nearer_Entity): New function.
(Find_Renamed_Entity): Use it to disambiguate the candidates for
the renaming generated for an instantiation when it is
ambiguous.

gcc/testsuite/

* gnat.dg/generic_inst9.adb, gnat.dg/generic_inst9.ads,
gnat.dg/generic_inst9_pkg1-operator.ads,
gnat.dg/generic_inst9_pkg1.ads, gnat.dg/generic_inst9_pkg2.adb,
gnat.dg/generic_inst9_pkg2.ads: New testcase.

--- gcc/ada/sem_ch4.adb
+++ gcc/ada/sem_ch4.adb
@@ -3619,59 +3619,6 @@ package body Sem_Ch4 is
   Next_Actual (Actual);
   Next_Formal (Formal);
 
-   --  In a complex case where an enclosing generic and a nested
-   --  generic package, both declared with partially parameterized
-   --  formal subprograms with the same names, are instantiated
-   --  with the same type, the types of the actual parameter and
-   --  that of the formal may appear incompatible at first sight.
-
-   --   generic
-   --  type Outer_T is private;
-   --  with function Func (Formal : Outer_T)
-   -- return ... is <>;
-
-   --   package Outer_Gen is
-   --  generic
-   -- type Inner_T is private;
-   -- with function Func (Formal : Inner_T)   --  (1)
-   --   return ... is <>;
-
-   --  package Inner_Gen is
-   -- function Inner_Func (Formal : Inner_T)  --  (2)
-   --   return ... is (Func (Formal));
-   --  end Inner_Gen;
-   --   end Outer_Generic;
-
-   --   package Outer_Inst is new Outer_Gen (Actual_T);
-   --   package Inner_Inst is new Outer_Inst.Inner_Gen (Actual_T);
-
-   --  In the example above, the type of parameter
-   --  Inner_Func.Formal at (2) is incompatible with the type of
-   --  Func.Formal at (1) in the context of instantiations
-   --  Outer_Inst and Inner_Inst. In reality both types are generic
-   --  actual subtypes renaming base type Actual_T as part of the
-   --  generic prologues for the instantiations.
-
-   --  Recognize this case and add a type conversion to allow this
-   --  kind of generic actual subtype conformance. Note that this
-   --  is done only when the call is non-overloaded because the
-   --  resolution mechanism already has the means to disambiguate
-   --  similar cases.
-
-   elsif not Is_Overloaded (Name (N))
- and then Is_Type (Etype (Actual))
- and then Is_Type (Etype (Formal))
- and then Is_Generic_Actual_Type (Etype (Actual))
- and then Is_Generic_Actual_Type (Etype (Formal))
- and then Base_Type (Etype (Actual)) =
-  Base_Type (Etype (Formal))
-   then
-  Rewrite (Actual,
-Convert_To (Etype (Formal), Relocate_Node (Actual)));
-  Analyze_And_Resolve (Actual, Etype (Formal));
-  Next_Actual (Actual);
-  Next_Formal (Formal);
-
--  Handle failed type check
 
else

--- gcc/ada/sem_ch8.adb
+++ gcc/ada/sem_ch8.adb
@@ -6721,6 +6721,15 @@ package body Sem_Ch8 is
   Old_S : Entity_Id;
   Inst  : Entity_Id;
 
+  function Find_Nearer_Entity
+(New_S  : Entity_Id;
+ Old1_S : Entity_Id;
+ Old2_S : Entity_Id) return Entity_Id;
+  --  Determine whether one of Old_S1 and Old_S2 is nearer to New_S than
+  --  the other, and return it if so. Return Empty otherwise. We use this
+  --  in conjunction with Inherit_Renamed_Profile to 

[Ada] Avoid crash in GNATprove due to inlining inside type

2019-08-13 Thread Pierre-Marie de Rodat
The special inlining for GNATprove should not inline calls inside record
types, used for the constraints of components.

There is no impact on compilation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2019-08-13  Yannick Moy  

gcc/ada/

* sem_res.adb (Resolve_Call): Do not inline calls inside record
types.

--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -7062,6 +7062,15 @@ package body Sem_Res is
  end if;
   end if;
 
+   --  Cannot inline a call inside the definition of a record type,
+   --  typically inside the constraints of the type. Calls in
+   --  default expressions are also not inlined, but this is
+   --  filtered out above when testing In_Default_Expr.
+
+   elsif Is_Record_Type (Current_Scope) then
+  Cannot_Inline
+("cannot inline & (inside record type)?", N, Nam_UA);
+
--  With the one-pass inlining technique, a call cannot be
--  inlined if the corresponding body has not been seen yet.
 


