[PING^5] [PATCH V3] PR88497 - Extend reassoc for vector bit_field_ref

2019-06-25 Thread Kewen.Lin
Hi all,

Gentle ping for this patch:

https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00966.html

on 2019/6/11 10:46 AM, Kewen.Lin wrote:
> Hi,
> 
> Gentle ping again.  Thanks!
> 
> Kewen
> 
> on 2019/5/21 10:02 AM, Kewen.Lin wrote:
>> Hi,
>>
>> Gentle ping again.  Thanks!
>>
>>
>> Kewen
>>
>> on 2019/5/5 2:15 PM, Kewen.Lin wrote:
>>> Hi,
>>>
>>> I'd like to send a gentle ping for this patch:
>>> https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00966.html
>>>
>>> OK for trunk now?
>>>
>>> Thanks!
>>>
 on 2019/3/20 11:14 AM, Kewen.Lin wrote:
 Hi,

 Please refer to below link for previous threads.
 https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00348.html

 Compared with patch v2, I've moved the vector operation target
 check up together with the vector type target check.  Besides, I
 ran bootstrap and regtest on powerpc64-linux-gnu (BE) and updated
 the test cases' requirements and options for robustness.

 Is it OK for GCC10?


 gcc/ChangeLog

 2019-03-20  Kewen Lin  

PR target/88497
* tree-ssa-reassoc.c (reassociate_bb): Swap the positions of 
GIMPLE_BINARY_RHS check and gimple_visited_p check, call new 
function undistribute_bitref_for_vector.
(undistribute_bitref_for_vector): New function.
(cleanup_vinfo_map): Likewise.
(unsigned_cmp): Likewise.

 gcc/testsuite/ChangeLog

 2019-03-20  Kewen Lin  

* gcc.dg/tree-ssa/pr88497-1.c: New test.
* gcc.dg/tree-ssa/pr88497-2.c: Likewise.
* gcc.dg/tree-ssa/pr88497-3.c: Likewise.
* gcc.dg/tree-ssa/pr88497-4.c: Likewise.
* gcc.dg/tree-ssa/pr88497-5.c: Likewise.

 ---
  gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c |  44 +
  gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c |  33 
  gcc/testsuite/gcc.dg/tree-ssa/pr88497-3.c |  33 
  gcc/testsuite/gcc.dg/tree-ssa/pr88497-4.c |  33 
  gcc/testsuite/gcc.dg/tree-ssa/pr88497-5.c |  33 
  gcc/tree-ssa-reassoc.c| 306 
 +-
  6 files changed, 477 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-3.c
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-4.c
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr88497-5.c

 diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c 
 b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
 new file mode 100644
 index 000..99c9af8
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-1.c
 @@ -0,0 +1,44 @@
 +/* { dg-do compile } */
 +/* { dg-require-effective-target vect_double } */
 +/* { dg-require-effective-target powerpc_vsx_ok { target { powerpc*-*-* } 
 } } */
 +/* { dg-options "-O2 -ffast-math" } */
 +/* { dg-options "-O2 -ffast-math -mvsx -fdump-tree-reassoc1" { target { 
 powerpc*-*-* } } } */
 +
 +/* To test reassoc can undistribute vector bit_field_ref summation.
 +
 +   arg1 and arg2 are two arrays whose elements of type vector double.
 +   Assuming:
 + A0 = arg1[0], A1 = arg1[1], A2 = arg1[2], A3 = arg1[3],
 + B0 = arg2[0], B1 = arg2[1], B2 = arg2[2], B3 = arg2[3],
 +
 +   Then:
 + V0 = A0 * B0, V1 = A1 * B1, V2 = A2 * B2, V3 = A3 * B3,
 +
 +   reassoc transforms
 +
 + accumulator += V0[0] + V0[1] + V1[0] + V1[1] + V2[0] + V2[1]
 ++ V3[0] + V3[1];
 +
 +   into:
 +
 + T = V0 + V1 + V2 + V3
 + accumulator += T[0] + T[1];
 +
 +   Fewer bit_field_refs, only two for 128 or more bits vector.  */
 +
 +typedef double v2df __attribute__ ((vector_size (16)));
 +double
 +test (double accumulator, v2df arg1[], v2df arg2[])
 +{
 +  v2df temp;
 +  temp = arg1[0] * arg2[0];
 +  accumulator += temp[0] + temp[1];
 +  temp = arg1[1] * arg2[1];
 +  accumulator += temp[0] + temp[1];
 +  temp = arg1[2] * arg2[2];
 +  accumulator += temp[0] + temp[1];
 +  temp = arg1[3] * arg2[3];
 +  accumulator += temp[0] + temp[1];
 +  return accumulator;
 +}
 +/* { dg-final { scan-tree-dump-times "BIT_FIELD_REF" 2 "reassoc1" { 
 target { powerpc*-*-* } } } } */
 diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c 
 b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
 new file mode 100644
 index 000..61ed0bf5
 --- /dev/null
 +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr88497-2.c
 @@ -0,0 +1,33 @@
 +/* { dg-do compile } */
 +/* { dg-require-effective-target vect_float } */
 +/* { dg-require-effective-target powerpc_altivec_ok { target { 
 powerpc*-*-* } } } */
 +/* { dg-options "-O2 -ffast-math" } */
 +/* { dg-options "-O2 -ffast-math -maltivec -fdump-tree-reassoc1" { 

Re: [PATCH] [RS6000] Change maddld match_operand from DI to GPR

2019-06-25 Thread Li Jia He




On 2019/6/24 3:38 PM, Segher Boessenkool wrote:

Hi Lijia,

On Mon, Jun 24, 2019 at 01:00:05AM -0500, Li Jia He wrote:

Per PowerPC ISA 3.0, the description of `maddld RT, RA, RB, RC` is as follows:
the 64-bit RA and RB are multiplied, RC is sign-extended to 128 bits,
and the two are added together.

Currently we only apply it in 64-bit mode (DI) when implementing maddld.  However,
if we can guarantee that the result of the maddld operation is limited to 32-bit
mode (SI), we can still apply it in 32-bit mode (SI).
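
To make the intent concrete, a minimal hand-written sketch (the function
names are made up, not from the posted testcase); whether the SI-mode form
actually matches depends on the conditions checked by the patch:

/* Both functions compute a multiply-add.  The DI-mode form already maps to
   maddld; the point of the patch is that the SI-mode form can use maddld as
   well when the result is known to stay within 32 bits.  */
long long
madd_di (long long a, long long b, long long c)
{
  return a * b + c;
}

int
madd_si (int a, int b, int c)
{
  return a * b + c;
}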


Great :-)  Just some testcase comments:


diff --git a/gcc/testsuite/gcc.target/powerpc/maddld-1.c 
b/gcc/testsuite/gcc.target/powerpc/maddld-1.c
new file mode 100644
index 000..06f5f5774d4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/maddld-1.c
@@ -0,0 +1,19 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */


powerpc* is the default in gcc.target/powerpc, so you can leave it out:

/* { dg-do compile } */

(and that is default itself, but it is good documentation for the target
tests, many of those are run tests).


+/* { dg-require-effective-target powerpc_p9modulo_ok } */


You don't need this line, it tests if the assembler supports p9.


+/* { dg-final { scan-assembler-times "maddld " 2 } } */
+/* { dg-final { scan-assembler-not   "mulld "} } */
+/* { dg-final { scan-assembler-not   "add "  } } */


You can write this more easily using \m and \M, a bit more exactly even:

/* { dg-final { scan-assembler-times {\mmaddld\M} 2 } } */

As the file name is maddld-1.c, the resulting assembly file contains


   .file   "maddld-1.c"

This will cause the test case to fail.

I will replace it with the following statement
/* { dg-final { scan-assembler-times {\mmaddld\s} 2 } } */

/* { dg-final { scan-assembler-not   {\mmul} } } */
/* { dg-final { scan-assembler-not   {\madd} } } */

Which allows only the exact mnemonic "maddld", and disallows anything
starting with "mul" or "add".

Okay for trunk, with the testcase improvements please.  Thanks!


Segher





[C++ PATCH] PR c++/70462 - unnecessary base ctor variant with final.

2019-06-25 Thread Jason Merrill
As pointed out in the PR, we don't need base 'tor variants for a final
class, since it can never be a base.  I tried also dropping complete
variants for abstract classes, but that runs into ABI compatibility problems
with older releases that refer to those symbols.

Tested x86_64-pc-linux-gnu, applying to trunk.

* optimize.c (populate_clone_array): Skip base variant if
CLASSTYPE_FINAL.
(maybe_clone_body): We don't need an alias if we are only defining
one clone.
---
 gcc/cp/optimize.c   | 12 
 gcc/testsuite/g++.dg/other/final8.C |  9 +
 gcc/cp/ChangeLog|  6 ++
 3 files changed, 23 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/other/final8.C

diff --git a/gcc/cp/optimize.c b/gcc/cp/optimize.c
index aace7dea684..0774857f503 100644
--- a/gcc/cp/optimize.c
+++ b/gcc/cp/optimize.c
@@ -247,15 +247,19 @@ populate_clone_array (tree fn, tree *fns)
   fns[1] = NULL_TREE;
   fns[2] = NULL_TREE;
 
-  /* Look for the complete destructor which may be used to build the
- delete destructor.  */
+  tree ctx = DECL_CONTEXT (fn);
+
   FOR_EACH_CLONE (clone, fn)
 if (DECL_NAME (clone) == complete_dtor_identifier
|| DECL_NAME (clone) == complete_ctor_identifier)
   fns[1] = clone;
 else if (DECL_NAME (clone) == base_dtor_identifier
 || DECL_NAME (clone) == base_ctor_identifier)
-  fns[0] = clone;
+  {
+   /* We don't need to define the base variants for a final class.  */
+   if (!CLASSTYPE_FINAL (ctx))
+ fns[0] = clone;
+  }
 else if (DECL_NAME (clone) == deleting_dtor_identifier)
   fns[2] = clone;
 else
@@ -480,7 +484,7 @@ maybe_clone_body (tree fn)
 
   /* Remember if we can't have multiple clones for some reason.  We need to
  check this before we remap local static initializers in clone_body.  */
-  if (!tree_versionable_function_p (fn))
+  if (!tree_versionable_function_p (fn) && fns[0] && fns[1])
 need_alias = true;
 
   /* We know that any clones immediately follow FN in the TYPE_FIELDS
diff --git a/gcc/testsuite/g++.dg/other/final8.C 
b/gcc/testsuite/g++.dg/other/final8.C
new file mode 100644
index 000..f90f94e9ea0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/other/final8.C
@@ -0,0 +1,9 @@
+// { dg-do compile { target c++11 } }
+// { dg-final { scan-assembler-not "_ZN1BC2Ev" } }
+// { dg-final { scan-assembler-not "_ZN1BD2Ev" } }
+
+struct A { int i; A(); virtual ~A() = 0; };
+struct B final: public virtual A { int j; B(); ~B(); };
+
+B::B() {}
+B::~B() {}
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index dee69b87eba..9d89ccb4ab9 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,5 +1,11 @@
 2019-06-25  Jason Merrill  
 
+   PR c++/70462 - unnecessary base ctor variant with final.
+   * optimize.c (populate_clone_array): Skip base variant if
+   CLASSTYPE_FINAL.
+   (maybe_clone_body): We don't need an alias if we are only defining
+   one clone.
+
* class.c (resolves_to_fixed_type_p): Check CLASSTYPE_FINAL.
 
 2019-06-25  Jakub Jelinek  

base-commit: 92c7d5c6805b9cdad3acf325d0569d563d5ea278
-- 
2.20.1



Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-25 Thread Hongtao Liu
On Wed, Jun 26, 2019 at 1:13 AM Uros Bizjak  wrote:
>
> On Tue, Jun 25, 2019 at 4:44 AM Hongtao Liu  wrote:
> >
> > On Sat, Jun 22, 2019 at 3:38 PM Uros Bizjak  wrote:
> > >
> > > On Fri, Jun 21, 2019 at 8:38 PM H.J. Lu  wrote:
> > >
> > > > > > > > > > > > >> > > +/* Register pair.  */
> > > > > > > > > > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 2); /* P2QI */
> > > > > > > > > > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 4); /* P2HI 
> > > > > > > > > > > > >> > > P4QI */
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > I think
> > > > > > > > > > > > >> > >
> > > > > > > > > > > > >> > > INT_MODE (P2QI, 16);
> > > > > > > > > > > > >> > > INT_MODE (P2HI, 32);
> > > > > > > Why does P2QI need 16 bytes and not 2 bytes?
> > > > > > > Same question for P2HI.
> > > > > >
> > > > > > Because we made a mistake. It should be 2 and 4, since these 
> > > > > > arguments
> > > > > Then it will run into an internal compiler error when building libgcc.
> > > > > I'm still investigating it.
> > > > > > are bytes, not bits.
> > > >
> > > > I don't think we can have 2 integer modes with the same number of bytes 
> > > > since
> > > > it breaks things like
> > > >
> > > > scalar_int_mode wider_mode = GET_MODE_WIDER_MODE (mode).require ();
> > > >
> > > > We can get
> > > >
> > > > (gdb) p mode
> > > > $2 = {m_mode = E_SImode}
> > > > (gdb) p wider_mode
> > > > $3 = {m_mode = E_P2HImode}
> > > > (gdb)
> > > >
> > > > Neither middle-end nor backend support it.
> > >
> > > Ouch... It looks like we hit a limitation of the middle end (which should
> > > at least warn/error out if two modes of the same width are declared).
> > >
> > > OTOH, we can't solve this problem by using two HI/QImode registers,
> > > since a consecutive register pair has to be allocated.  It is also not
> > > possible to overload the existing SI/HImode with different
> > > requirements w.r.t. register pair allocation (e.g. sometimes the whole
> > > register is allocated, and sometimes a register pair is allocated).
> > >
> > > I think we have to invent something like SPECIAL_INT_MODE, which would
> > > avoid mode promotion functionality (basically, it should not be listed
> > > in mode_wider and similar arrays). This would prevent mode promotion
> > > issues, while still allowing a mode that has the same width
> > > as an existing mode, but with special properties.
> > >
> > > I'm adding Jeff and Jakub to the discussion about SPECIAL_INT_MODE.
> > >
> > > Uros.
> >
> > Patch from H.J using PARTIAL_INT_MODE fixed this issue.
> >
> > +/* Register pair.  */
> > +PARTIAL_INT_MODE (HI, 16, P2QI);
> > +PARTIAL_INT_MODE (SI, 32, P2HI);
> > +
> >
> > Here is updated patch.
>
> OK for mainline, but please add the comment about the reason to use
> PARTIAL_INT_MODE.
>
Done.
> Thanks,
> Uros.

Committed in r272668.


--
BR,
Hongtao


[Committed] PR fortran/90988 -- Fix.

2019-06-25 Thread Steve Kargl
Committed as obvious.

The patch fixes PR fortran/90988, where a less than
informative error is emitted for invalid code with
a PUBLIC specification statement.  As a bonus, it fixes
parsing errors for PRIVATE and PROTECTED as pointed
out in https://gcc.gnu.org/ml/fortran/2019-06/msg00144.html

2019-06-24  Steven G. Kargl  

PR Fortran/90988
* decl.c (access_attr_decl): Use temporary variable to reduce
unreadability of code.  Normalize jumping to return.
(gfc_match_protected): Fix parsing error.  Add comments to 
explain code.  Remove dead code.
(gfc_match_private): Use temporary variable to reduce unreadability 
of code. Fix parsing error.  Move code to test for blank PRIVATE.
Remove dead code.
(gfc_match_public): Move code to test for blank PUBLIC.  Fix
parsing error.  Remove dead code.

2019-06-24  Steven G. Kargl  

PR Fortran/90988
* gfortran.dg/pr90988_1.f90: New test.
* gfortran.dg/pr90988_2.f90: Ditto.
* gfortran.dg/pr90988_3.f90: Ditto.

-- 
Steve
Index: gcc/fortran/decl.c
===
--- gcc/fortran/decl.c	(revision 272666)
+++ gcc/fortran/decl.c	(working copy)
@@ -8788,6 +8788,7 @@ access_attr_decl (gfc_statement st)
   gfc_symbol *sym, *dt_sym;
   gfc_intrinsic_op op;
   match m;
+  gfc_access access = (st == ST_PUBLIC) ? ACCESS_PUBLIC : ACCESS_PRIVATE;
 
   if (gfc_match (" ::") == MATCH_NO && gfc_match_space () == MATCH_NO)
 goto done;
@@ -8798,7 +8799,7 @@ access_attr_decl (gfc_statement st)
   if (m == MATCH_NO)
 	goto syntax;
   if (m == MATCH_ERROR)
-	return MATCH_ERROR;
+	goto done;
 
   switch (type)
 	{
@@ -8818,18 +8819,12 @@ access_attr_decl (gfc_statement st)
 	  && sym->attr.flavor == FL_UNKNOWN)
 	sym->attr.flavor = FL_PROCEDURE;
 
-	  if (!gfc_add_access (&sym->attr,
-			   (st == ST_PUBLIC)
-			   ? ACCESS_PUBLIC : ACCESS_PRIVATE,
-			   sym->name, NULL))
-	return MATCH_ERROR;
+	  if (!gfc_add_access (&sym->attr, access, sym->name, NULL))
+	goto done;
 
 	  if (sym->attr.generic && (dt_sym = gfc_find_dt_in_generic (sym))
-	  && !gfc_add_access (&dt_sym->attr,
-  (st == ST_PUBLIC)
-  ? ACCESS_PUBLIC : ACCESS_PRIVATE,
-  sym->name, NULL))
-	return MATCH_ERROR;
+	  && !gfc_add_access (&dt_sym->attr, access, sym->name, NULL))
+	goto done;
 
 	  break;
 
@@ -8838,17 +8833,14 @@ access_attr_decl (gfc_statement st)
 	{
 	  gfc_intrinsic_op other_op;
 
-	  gfc_current_ns->operator_access[op] =
-		(st == ST_PUBLIC) ? ACCESS_PUBLIC : ACCESS_PRIVATE;
+	  gfc_current_ns->operator_access[op] = access;
 
 	  /* Handle the case if there is another op with the same
 		 function, for INTRINSIC_EQ vs. INTRINSIC_EQ_OS and so on.  */
 	  other_op = gfc_equivalent_op (op);
 
 	  if (other_op != INTRINSIC_NONE)
-		gfc_current_ns->operator_access[other_op] =
-		  (st == ST_PUBLIC) ? ACCESS_PUBLIC : ACCESS_PRIVATE;
-
+		gfc_current_ns->operator_access[other_op] = access;
 	}
 	  else
 	{
@@ -8864,8 +8856,7 @@ access_attr_decl (gfc_statement st)
 
 	  if (uop->access == ACCESS_UNKNOWN)
 	{
-	  uop->access = (st == ST_PUBLIC)
-			  ? ACCESS_PUBLIC : ACCESS_PRIVATE;
+	  uop->access = access;
 	}
 	  else
 	{
@@ -8898,7 +8889,14 @@ gfc_match_protected (void)
 {
   gfc_symbol *sym;
   match m;
+  char c;
 
+  /* PROTECTED has already been seen, but must be followed by whitespace
+ or ::.  */
+  c = gfc_peek_ascii_char ();
+  if (!gfc_is_whitespace (c) && c != ':')
+return MATCH_NO;
+
   if (!gfc_current_ns->proc_name
   || gfc_current_ns->proc_name->attr.flavor != FL_MODULE)
 {
@@ -8908,14 +8906,12 @@ gfc_match_protected (void)
 
 }
 
+  gfc_match (" ::");
+
   if (!gfc_notify_std (GFC_STD_F2003, "PROTECTED statement at %C"))
 return MATCH_ERROR;
 
-  if (gfc_match (" ::") == MATCH_NO && gfc_match_space () == MATCH_NO)
-{
-  return MATCH_ERROR;
-}
-
+  /* PROTECTED has an entity-list.  */
   if (gfc_match_eos () == MATCH_YES)
 goto syntax;
 
@@ -8958,41 +8954,48 @@ syntax:
 match
 gfc_match_private (gfc_statement *st)
 {
+  gfc_state_data *prev;
+  char c;
 
   if (gfc_match ("private") != MATCH_YES)
 return MATCH_NO;
 
+  /* Try matching PRIVATE without an access-list.  */
+  if (gfc_match_eos () == MATCH_YES)
+{
+  prev = gfc_state_stack->previous;
+  if (gfc_current_state () != COMP_MODULE
+	  && !(gfc_current_state () == COMP_DERIVED
+		&& prev && prev->state == COMP_MODULE)
+	  && !(gfc_current_state () == COMP_DERIVED_CONTAINS
+		&& prev->previous && prev->previous->state == COMP_MODULE))
+	{
+	  gfc_error ("PRIVATE statement at %C is only allowed in the "
+		 "specification part of a module");
+	  return MATCH_ERROR;
+	}
+
+  *st = ST_PRIVATE;
+  return MATCH_YES;
+}
+
+  /* At this point, PRIVATE must be followed by whitespace or ::.  */
+  c = 

[RFA][tree-optimization/90883] Improve DSE to handle redundant calls

2019-06-25 Thread Jeff Law
So based on the conversation in the BZ I cobbled together a patch to
extend tree-ssa-dse.c to also detect redundant stores.

To be clear, given two stores, the first store is dead if the later
store overwrites all the live bytes set by the first store.   In this
case we delete the first store.  If the first store is partially dead we
may trim it.

Given two stores, the second store is redundant if it stores _the same
value_ into locations set by the first store.  In this case we delete
the second store.


We prefer to remove redundant stores over removing dead or trimming
partially dead stores.

First, if we detect a redundant store, we can always remove it.  We may
not always be able to trim a partially dead store.  So removing the
redundant store wins in this case.

But even if the redundant store occurs at the head or tail of the prior
store, removing the redundant store is better than trimming the
partially dead store because we end up with fewer calls to memset with
the same number of total bytes written.

We only look for redundant stores in a few cases.  The first store must
be a memset, empty constructor or calloc call -- ie things which
initialize multiple memory locations to zero.  Subsequent stores can
occur via memset, empty constructors or simple memory assignments.
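
As a concrete illustration of the redundant-store case (hand-written, not
taken from the new tests, and assuming the usual calloc/memset semantics):

#include <stdlib.h>
#include <string.h>

struct s { int a[16]; };

struct s *
make (void)
{
  /* calloc is the first store: it zero-initializes the whole object...  */
  struct s *p = calloc (1, sizeof *p);
  if (!p)
    return NULL;
  /* ...so this memset writes the same value (zero) into bytes calloc has
     already covered.  It is the second, redundant store, and it is the one
     the extended DSE pass should delete.  */
  memset (p->a, 0, sizeof p->a);
  return p;
}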

The change to tree-ssa-alias.c deserves a quick note.

When we're trying to determine if we have a redundant store, we create
an AO_REF for the *second* store, then ask the alias system if the first
store would kill the AO_REF.

So while a calloc wouldn't normally kill anything in
execution order, we're not asking about things in execution order.  We
really just want to know if the calloc is going to write into the
entirety of the AO_REF of the subsequent store.  So we compute the size
of the allocation and we know the destination from the LHS of the calloc
call and everything "just works".

This patch also includes a hunk I apparently left out from yesterday's
submission which just adds _CHK cases to all the existing BUILT_IN_MEM*
cases.  That's what I get for writing this part first, then adding the
_CHK stuff afterwards, then reversing the order of submission.

This includes a slightly reduced testcase from the BZ in g++.dg -- it's
actually a good way to capture when one empty constructor causes another
empty constructor to be redundant.  The gcc.dg cases capture other
scenarios.

This has been bootstrapped and regression tested on x86-64, i686, ppc64,
ppc64le, sparc64 & aarch64.  It's also bootstrapped on various arm
targets, alpha, m68k, mips, riscv64, sh4.  It's been built and tested on
a variety of *-elf targets as well as various *-linux-gnu targets as
crosses.  And just for giggles it was tested before the changes to add
the _CHK support, so it works with and without that as well.

OK for the trunk?

Jeff

* tree-ssa-alias.c (stmt_kills_ref_p): Handle BUILT_IN_CALLOC.
* tree-ssa-dse.c: Update various comments to distinguish between
dead and redundant stores.
(initialize_ao_ref_for_dse): Handle BUILT_IN_CALLOC.
(dse_optimize_redundant_stores): New function.
(delete_dead_or_redundant_call): Renamed from delete_dead_call.
Distinguish between dead and redundant calls in dump output.  All
callers updated.
(delete_dead_or_redundant_assignment): Similarly for assignments.
(dse_optimize_stmt): Handle _CHK variants.  For statements which
store 0 into multiple memory locations, try to prove a subsequent
store is redundant.

* g++.dg/tree-ssa/pr90883.C: New test.
* gcc.dg/tree-ssa/ssa-dse-36.c: New test.

diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index d9307390e4c..2ec35499e21 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -2848,13 +2848,30 @@ stmt_kills_ref_p (gimple *stmt, ao_ref *ref)
  case BUILT_IN_MEMSET_CHK:
  case BUILT_IN_STRNCPY:
  case BUILT_IN_STPNCPY:
+ case BUILT_IN_CALLOC:
{
  /* For a must-alias check we need to be able to constrain
 the access properly.  */
  if (!ref->max_size_known_p ())
return false;
- tree dest = gimple_call_arg (stmt, 0);
- tree len = gimple_call_arg (stmt, 2);
+ tree dest;
+ tree len;
+ if (DECL_FUNCTION_CODE (callee) == BUILT_IN_CALLOC)
+   {
+ tree arg0 = gimple_call_arg (stmt, 0);
+ tree arg1 = gimple_call_arg (stmt, 1);
+ if (TREE_CODE (arg0) != INTEGER_CST
+ || TREE_CODE (arg1) != INTEGER_CST)
+   return false;
+
+ dest = gimple_call_lhs (stmt);
+ len = fold_build2 (MULT_EXPR, TREE_TYPE (arg0), arg0, arg1);
+   }
+ else
+   {
+ dest = gimple_call_arg (stmt, 0);
+ len = 

Re: [PATCH V2, RFC] Fix PR62147 by passing finiteness information to RTL phase

2019-06-25 Thread Kewen.Lin
Hi Jeff,

on 2019/6/26 5:49 AM, Jeff Law wrote:
> On 6/25/19 3:41 AM, Kewen.Lin wrote:
>> Hi Richard,
>>
>> Thanks a lot for review comments. 
>>
>> on 2019/6/25 3:23 PM, Richard Biener wrote:
>>> On Tue, 25 Jun 2019, Kewen.Lin wrote:
>>>
 Hi all,


 It's based on two observations:
   1) the loop structure for one specific loop is shared between middle-end 
 and 
  back-end.
   2) for one specific loop, if it's finite then never become infinite 
 itself.

 As one gcc newbie, I'm not sure whether these two observations are true in 
 all
 cases.  Please feel free to correct me if anything missing.
>>>
>>> I think 2) is not true with -ffinite-loops.
>>
>> I just looked at the patch for this option; I don't fully understand how it
>> can affect 2).  It takes a loop with any normal exit as finite; can some loop
>> with this assertion turn infinite later due to some other analysis?
>>
>>>
 btw, I also took a look at how the loop constraint LOOP_C_FINITE is used, 
 I think
 it's not suitable for this purpose, it's mainly set by vectorizer and tell 
 niter 
 and scev to take one loop as finite.  The original patch has the words 
 "constraint flag is mainly set by consumers and affects certain semantics 
 of 
 niter analyzer APIs".

 Bootstrapped and regression testing passed on 
 powerpc64le-unknown-linux-gnu.
>>>
>>> Did you consider to simply use finite_loop_p () from doloop.c?  That
>>> would be a much simpler patch.
>>
>> Good suggestion!  I took it for granted that the function could only be
>> effective in the middle end, but actually some information, like the
>> any_upper_bound bit, could be kept for RTL.
>>
>>>
>>> For the testcase in question -ffinite-loops would provide this guarantee
>>> even on RTL, so would the upper bound that may be still set.
>>>
>>> Richard.
>>>
>>
>> The new version with Richard's suggestion listed below.
>> Regression testing is ongoing.
>>
>>
>> Thanks,
>> Kewen
>>
>> ---
>>
>> gcc/ChangeLog
>>
>> 2019-06-25  Kewen Lin  
>>
>> PR target/62147
>>  * gcc/loop-iv.c (find_simple_exit): Call finite_loop_p to update 
>> finiteness.
>>
>> gcc/testsuite/ChangeLog
>>
>> 2019-06-25  Kewen Lin  
>>
>>  PR target/62147
>>  * gcc.target/powerpc/pr62147.c: New test.
> This is fine assuming regression testing was OK.
> 

Thanks Jeff!  Bootstrapped and regression testing passed on 
powerpc64le-unknown-linux-gnu.

> One might argue that "finite_loop_p" belongs elsewhere since it's not
> really querying tree/gimple structures.

I guess it does something gimple-specific (estimate_numbers_of_iterations)
when it can, so it was placed there.


Thanks,
Kewen

> 
> jeff
> 



Re: [PATCH 1/2] PR c/65403 - Ignore -Wno-error=

2019-06-25 Thread Alex Henrie
On Wed, Jun 19, 2019 at 11:52 AM Jeff Law  wrote:
>
> On 3/18/19 8:46 PM, Alex Henrie wrote:
> > From: Manuel López-Ibáñez 
> >
> > * opts.c: Ignore -Wno-error= except if there are
> > other diagnostics.
> That's not a complete ChangeLog entry.  Each file/function changed
> should be mentioned.  Something like this:
>
> * opts-common.c (ignored_wnoerror_options): New global variable.
> * opts-global.c (print_ignored_options): Ignore
> -Wno-error= except if there are other
> diagnostics.
> * opts.c (enable_warning_as_error): Record ignored -Wno-error
> options.
> opts.h (ignored_wnoerror_options): Declare.

Thanks!

> If HINT is set, do we still want to potentially push the argument onto
> the ignored_wnoerror_options list?  My inclination is yes since the hint
> is just that, a fuzzily matched hint.  That hint may be appropriate
> (user typo'd or somesuch) or it may be totally offbase (user was trying
> to turn off some future warning).

I don't think we need to support hints in the case of
-Wno-error= because we don't support hints for
-Wno-, and as you pointed out, hints are less
likely to be helpful here because the warning may be perfectly valid
in a newer version of GCC.

I'll send rebased patches later tonight. Thanks for the feedback!

-Alex


libgo patch committed: Ignore symbols with a leading dot in test script

2019-06-25 Thread Ian Lance Taylor
This libgo patch by Clément Chigot changes the libgo testsuite script
to ignore symbols with a leading dot.  On AIX, a function has two
symbols, a text symbol (with a leading dot) and a data one (without
it).  As the tests must be run only once, only the data symbol can be
used to retrieve the final go symbol.  Therefore, all symbols
beginning with a dot are ignored by symtogo.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 272661)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-1d6578a20a9a2ee599a07f03cf7f8e7797d72b9c
+d3d0f3c5bbe9d272178d55bdb907b07c188800e1
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/testsuite/gotest
===
--- libgo/testsuite/gotest  (revision 272608)
+++ libgo/testsuite/gotest  (working copy)
@@ -501,6 +501,13 @@ localname() {
 symtogo() {
   result=""
   for tp in $*; do
+# Discard symbols with a leading dot.
+# On AIX, this will remove function text symbols (with a leading dot).
+# Therefore, only function descriptor symbols (without this leading dot)
+# will be used to retrieve the go symbols, avoiding duplication.
+if expr "$tp" : '^\.' >/dev/null 2>&1; then
+  continue
+fi
 s=$(echo "$tp" | sed -e 's/\.\.z2f/%/g' | sed -e 's/.*%//')
 # Screen out methods (X.Y.Z).
 if ! expr "$s" : '^[^.]*\.[^.]*$' >/dev/null 2>&1; then


libgo patch committed: Silence ar with D flag failures

2019-06-25 Thread Ian Lance Taylor
This patch by Clément Chigot modifies the go tool to not display
failures when invoking ar with the D flag if the flag is not
supported.  The corresponding Go toolchain patch is
https://golang.org/cl/182077.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 272633)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-a857aad2f3994e6fa42a6fc65330e65d209597a0
+1d6578a20a9a2ee599a07f03cf7f8e7797d72b9c
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/cmd/go/internal/work/gccgo.go
===
--- libgo/go/cmd/go/internal/work/gccgo.go  (revision 272608)
+++ libgo/go/cmd/go/internal/work/gccgo.go  (working copy)
@@ -209,9 +209,16 @@ func (tools gccgoToolchain) pack(b *Buil
}
absAfile := mkAbs(objdir, afile)
// Try with D modifier first, then without if that fails.
-   if b.run(a, p.Dir, p.ImportPath, nil, tools.ar(), arArgs, "rcD", 
absAfile, absOfiles) != nil {
+   output, err := b.runOut(p.Dir, nil, tools.ar(), arArgs, "rcD", 
absAfile, absOfiles)
+   if err != nil {
return b.run(a, p.Dir, p.ImportPath, nil, tools.ar(), arArgs, 
"rc", absAfile, absOfiles)
}
+
+   if len(output) > 0 {
+   // Show the output if there is any even without errors.
+   b.showOutput(a, p.Dir, p.ImportPath, b.processOutput(output))
+   }
+
return nil
 }
 


Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-25 Thread Martin Sebor

On 6/25/19 3:38 PM, Jeff Law wrote:

On 6/24/19 6:47 PM, Martin Sebor wrote:

On 6/24/19 5:59 PM, Jeff Law wrote:

On 6/24/19 5:50 PM, Martin Sebor wrote:

The strlen enhancement committed in r263018 to handle multi-character
assignments extended the handle_char_store() function to handle such
stores via MEM_REFs.  Prior to that the function only dealt with
single-char stores.  The enhancement neglected to constrain a case
in the function that assumed the function's previous constraint.
As a result, when the original optimization takes place with
a multi-character store, the function computes the wrong string
length.

The attached patch adds the missing constraint.

Martin

gcc-90989.diff

PR tree-optimization - incorrect strlen result after second strcpy into
the same destination

gcc/ChangeLog:

 * tree-ssa-strlen.c (handle_char_store): Constrain a single
character
 optimization to just single character stores.

gcc/testsuite/ChangeLog:

 * gcc.dg/strlenopt-26.c: Exit with test result status.
 * gcc.dg/strlenopt-67.c: New test.

Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c    (revision 272618)
+++ gcc/tree-ssa-strlen.c    (working copy)
@@ -3462,34 +3462,38 @@ handle_char_store (gimple_stmt_iterator *gsi)
     return false;
   }
   }
-  /* If si->nonzero_chars > OFFSET, we aren't overwriting '\0',
- and if we aren't storing '\0', we know that the length of the
- string and any other zero terminated string in memory remains
- the same.  In that case we move to the next gimple statement and
- return to signal the caller that it shouldn't invalidate anything.
   - This is benefical for cases like:
+  if (cmp > 0
+  && storing_nonzero_p
+  && TREE_CODE (TREE_TYPE (rhs)) == INTEGER_TYPE)

I'm not sure I follow why checking for TREE_CODE (TREE_TYPE (rhs)) ==
INTEGER_TYPE helps here.  If you need to check that we're storing bytes,
then don't you need to check the size, not just the TREE_CODE of the
type?


handle_char_store is only called for single-character assignments
or MEM_REF assignments to/from arrays.  The type of the RHS is only
integer when storing a single character.

I don't see any requirement here that INTEGER_TYPE implies a single byte
though.  That seems to be true in simple tests I've tried, but what's to
stop us from using something like 0x31323300 on the RHS for a big endian
machine to store "123"?


The caller ensures that handle_char_store is only called for stores
to arrays (MEM_REF) or single elements as wide as char.

What you describe sounds like

  char a[N];
  *(int*)a = 0x31323300;

which is represented as

  MEM[(int *)] = 825373440;

The LHS type of that is int so the function doesn't get called.



And if the NUL byte in the original was at byte offset 2, then didn't we
just change the length by overwriting where the NUL is?


No, because cmp is the result of compare_nonzero_chars and cmp > 0
means:

  1 if SI is known to start with more than OFF nonzero characters

i.e., the character is being stored before the terminating nul.
This is the basis of the original optimization:

  /* If si->nonzero_chars > OFFSET, we aren't overwriting '\0',
 and if we aren't storing '\0', we know that the length of the
 string and any other zero terminated string in memory remains
 the same.


ISTM you actually have to look at contents of the RHS object, not just
its type.


That's already been done earlier by calling initializer_zerop.
The function sets storing_nonzero_p when the sequence of
characters in RHS is not all zeros.  For single integers it
calls integer_zerop.  Since the type of the RHS is an integer
we know the RHS is a single non-zero byte.
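
For reference, a minimal sketch of the scenario in the PR title (my own
illustration, not the committed strlenopt-67.c test; whether this exact
shape reaches the affected code path is an assumption):

#include <string.h>

char buf[8];

unsigned
f (void)
{
  strcpy (buf, "12345");   /* the pass records strlen (buf) == 5 */
  strcpy (buf, "12");      /* multi-character store into the same destination;
                              without the added constraint a stale length
                              could be kept */
  return strlen (buf);     /* must evaluate to 2 */
}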

Martin


Re: [PATCH] RISC-V: Add -malign-data= option.

2019-06-25 Thread Andrew Pinski
On Tue, Jun 25, 2019 at 3:46 PM Ilia Diachkov
 wrote:
>
> Hello,
>
> This patch adds a new machine-specific option, -malign-data={word,abi}, to
> the RISC-V port. The option switches the alignment of global variables and
> constants of array/record/union types. The default value
> (-malign-data=word) keeps the existing way of computing alignment. The
> other value (-malign-data=abi) uses natural data alignment, which avoids
> extra padding between objects and reduces code size. The measured code
> size reduction is about 0.4% at -Os on EEMBC automotive 1.1 tests and
> SPEC2006 C/C++ benchmarks. The patch was tested with dejagnu in
> riscv-gnu-toolchain.
>
> Please check the patch into the trunk.

Hmm, may I suggest using "natural" rather than "abi" and 32bit or 64bit
rather than "word"; it is not obvious what abi means, and it is not
obvious what word means here; it could be either 32bit or 64bit
depending on the option.
Also, my other suggestion is to create a new macro where you pass
riscv_align_data_type == riscv_align_data_type_word for the "(ALIGN) <
BITS_PER_WORD) " check to reduce the code duplication.

Thanks,
Andrew Pinski

>
> Best regards,
> Ilia.
>
> gcc/
> * config/riscv/riscv-opts.h (struct riscv_align_data): Added.
> * config/riscv/riscv.c (riscv_constant_alignment): Use
> riscv_align_data_type.
> * config/riscv/riscv.h (DATA_ALIGNMENT): Use riscv_align_data_type.
> (LOCAL_ALIGNMENT): Set to old DATA_ALIGNMENT value.
> * config/riscv/riscv.opt (malign-data): New.
> * doc/invoke.texi (RISC-V Options): Document -malign-data=.


[PATCH] RISC-V: Add -malign-data= option.

2019-06-25 Thread Ilia Diachkov

Hello,

This patch adds a new machine-specific option, -malign-data={word,abi}, to
the RISC-V port. The option switches the alignment of global variables and
constants of array/record/union types. The default value
(-malign-data=word) keeps the existing way of computing alignment. The
other value (-malign-data=abi) uses natural data alignment, which avoids
extra padding between objects and reduces code size. The measured code
size reduction is about 0.4% at -Os on EEMBC automotive 1.1 tests and
SPEC2006 C/C++ benchmarks. The patch was tested with dejagnu in
riscv-gnu-toolchain.
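
As a small illustration (mine, not part of the patch) of what the option
affects, consider a global like the one below; under the default
-malign-data=word an array/record object whose natural alignment is below
the word size gets padded up to word alignment, while -malign-data=abi
keeps the natural alignment:

/* 5-byte constant array: word-aligned (with padding up to the word size)
   under -malign-data=word, naturally 1-byte aligned under -malign-data=abi.  */
const char msg[5] = "abcd";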


Please check the patch into the trunk.

Best regards,
Ilia.

gcc/
* config/riscv/riscv-opts.h (struct riscv_align_data): Added.
	* config/riscv/riscv.c (riscv_constant_alignment): Use 
riscv_align_data_type.

* config/riscv/riscv.h (DATA_ALIGNMENT): Use riscv_align_data_type.
(LOCAL_ALIGNMENT): Set to old DATA_ALIGNMENT value.
* config/riscv/riscv.opt (malign-data): New.
* doc/invoke.texi (RISC-V Options): Document -malign-data=.From c183fbefb9b7b53eb066cbfdaa907b6087271029 Mon Sep 17 00:00:00 2001
From: Ilia Diachkov 
Date: Wed, 26 Jun 2019 01:33:20 +0300
Subject: [PATCH] RISC-V: Add -malign-data= option.

---
 gcc/config/riscv/riscv-opts.h |  5 +
 gcc/config/riscv/riscv.c  |  3 ++-
 gcc/config/riscv/riscv.h  | 10 +++---
 gcc/config/riscv/riscv.opt| 14 ++
 gcc/doc/invoke.texi   | 10 +-
 5 files changed, 37 insertions(+), 5 deletions(-)

diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index f3031f2..c1f7fa1 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -46,4 +46,9 @@ enum riscv_microarchitecture_type {
 };
 extern enum riscv_microarchitecture_type riscv_microarchitecture;
 
+enum riscv_align_data {
+  riscv_align_data_type_word,
+  riscv_align_data_type_abi
+};
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index d61455f..08418ce 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -4904,7 +4904,8 @@ riscv_can_change_mode_class (machine_mode, machine_mode, reg_class_t rclass)
 static HOST_WIDE_INT
 riscv_constant_alignment (const_tree exp, HOST_WIDE_INT align)
 {
-  if (TREE_CODE (exp) == STRING_CST || TREE_CODE (exp) == CONSTRUCTOR)
+  if ((TREE_CODE (exp) == STRING_CST || TREE_CODE (exp) == CONSTRUCTOR)
+  && (riscv_align_data_type == riscv_align_data_type_word))
 return MAX (align, BITS_PER_WORD);
   return align;
 }
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 8856cee..bace9d9 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -181,7 +181,8 @@ along with GCC; see the file COPYING3.  If not see
that copy constants to character arrays can be done inline.  */
 
 #define DATA_ALIGNMENT(TYPE, ALIGN)	\
-  ((((ALIGN) < BITS_PER_WORD)		\
+  (((riscv_align_data_type == riscv_align_data_type_word)		\
+&& ((ALIGN) < BITS_PER_WORD)	\
 && (TREE_CODE (TYPE) == ARRAY_TYPE	\
 	|| TREE_CODE (TYPE) == UNION_TYPE\
 	|| TREE_CODE (TYPE) == RECORD_TYPE)) ? BITS_PER_WORD : (ALIGN))
@@ -190,8 +191,11 @@ along with GCC; see the file COPYING3.  If not see
character arrays to be word-aligned so that `strcpy' calls that copy
constants to character arrays can be done inline, and 'strcmp' can be
optimised to use word loads. */
-#define LOCAL_ALIGNMENT(TYPE, ALIGN) \
-  DATA_ALIGNMENT (TYPE, ALIGN)
+#define LOCAL_ALIGNMENT(TYPE, ALIGN)	\
+  ((((ALIGN) < BITS_PER_WORD)	\
+&& (TREE_CODE (TYPE) == ARRAY_TYPE	\
+	|| TREE_CODE (TYPE) == UNION_TYPE\
+	|| TREE_CODE (TYPE) == RECORD_TYPE)) ? BITS_PER_WORD : (ALIGN))
 
 /* Define if operations between registers always perform the operation
on the full register even if a narrower mode is specified.  */
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 3b25f9a..a9b8ab5 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -131,3 +131,17 @@ Mask(RVE)
 mriscv-attribute
 Target Report Var(riscv_emit_attribute_p) Init(-1)
 Emit RISC-V ELF attribute.
+
+malign-data=
+Target RejectNegative Joined Var(riscv_align_data_type) Enum(riscv_align_data) Init(riscv_align_data_type_word)
+Use the given data alignment.
+
+Enum
+Name(riscv_align_data) Type(enum riscv_align_data)
+Known data alignment choices (for use with the -malign-data= option):
+
+EnumValue
+Enum(riscv_align_data) String(word) Value(riscv_align_data_type_word)
+
+EnumValue
+Enum(riscv_align_data) String(abi) Value(riscv_align_data_type_abi)
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 50e50e3..55c08b3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1059,7 +1059,8 @@ See RS/6000 and PowerPC Options.
 -mcmodel=medlow  -mcmodel=medany @gol
 -mexplicit-relocs  -mno-explicit-relocs @gol
 -mrelax  -mno-relax @gol
--mriscv-attribute  -mno-riscv-attribute}

Re: [PATCH V2, RFC] Fix PR62147 by passing finiteness information to RTL phase

2019-06-25 Thread Jeff Law
On 6/25/19 3:41 AM, Kewen.Lin wrote:
> Hi Richard,
> 
> Thanks a lot for review comments. 
> 
> on 2019/6/25 3:23 PM, Richard Biener wrote:
>> On Tue, 25 Jun 2019, Kewen.Lin wrote:
>>
>>> Hi all,
>>>
>>>
>>> It's based on two observations:
>>>   1) the loop structure for one specific loop is shared between middle-end 
>>> and 
>>>  back-end.
>>>   2) for one specific loop, if it's finite then never become infinite 
>>> itself.
>>>
>>> As one gcc newbie, I'm not sure whether these two observations are true in 
>>> all
>>> cases.  Please feel free to correct me if anything missing.
>>
>> I think 2) is not true with -ffinite-loops.
> 
> I just looked at the patch for this option; I don't fully understand how it
> can affect 2).  It takes a loop with any normal exit as finite; can some loop
> with this assertion turn infinite later due to some other analysis?
> 
>>
>>> btw, I also took a look at how the loop constraint LOOP_C_FINITE is used, I 
>>> think
>>> it's not suitable for this purpose, it's mainly set by vectorizer and tell 
>>> niter 
>>> and scev to take one loop as finite.  The original patch has the words 
>>> "constraint flag is mainly set by consumers and affects certain semantics 
>>> of 
>>> niter analyzer APIs".
>>>
>>> Bootstrapped and regression testing passed on powerpc64le-unknown-linux-gnu.
>>
>> Did you consider to simply use finite_loop_p () from doloop.c?  That
>> would be a much simpler patch.
> 
> Good suggestion!  I took it for granted that the function could only be
> effective in the middle end, but actually some information, like the
> any_upper_bound bit, could be kept for RTL.
> 
>>
>> For the testcase in question -ffinite-loops would provide this guarantee
>> even on RTL, so would the upper bound that may be still set.
>>
>> Richard.
>>
> 
> The new version with Richard's suggestion listed below.
> Regression testing is ongoing.
> 
> 
> Thanks,
> Kewen
> 
> ---
> 
> gcc/ChangeLog
> 
> 2019-06-25  Kewen Lin  
> 
> PR target/62147
>   * gcc/loop-iv.c (find_simple_exit): Call finite_loop_p to update 
> finiteness.
> 
> gcc/testsuite/ChangeLog
> 
> 2019-06-25  Kewen Lin  
> 
>   PR target/62147
>   * gcc.target/powerpc/pr62147.c: New test.
This is fine assuming regression testing was OK.

One might argue that "finite_loop_p" belongs elsewhere since it's not
really querying tree/gimple structures.

jeff


Re: [PATCH 01/30] Changes to machine independent code

2019-06-25 Thread Jeff Law
On 6/25/19 2:22 PM, acsaw...@linux.ibm.com wrote:
> From: Aaron Sawdey 
> 
>   * builtins.c (get_memory_rtx): Fix comment.
>   * optabs.def (movmem_optab): Change to cpymem_optab.
>   * expr.c (emit_block_move_via_cpymem): Change movmem to cpymem.
>   (emit_block_move_hints): Change movmem to cpymem.
>   * defaults.h: Change movmem to cpymem.
>   * targhooks.c (get_move_ratio): Change movmem to cpymem.
>   (default_use_by_pieces_infrastructure_p): Ditto.
So I think you're missing an update to the RTL/MD documentation.  This
is also likely to cause problems for any out-of-tree ports, so it's
probably worth a mention in the gcc-10 changes, which will need to be
created (in CVS no less, ugh).

I think the stuff posted to-date is fine, but it shouldn't go in without
the corresponding docs and gcc-10 changes updates.

jeff


Re: [PATCH] constrain one character optimization to one character stores (PR 90989)

2019-06-25 Thread Jeff Law
On 6/24/19 6:47 PM, Martin Sebor wrote:
> On 6/24/19 5:59 PM, Jeff Law wrote:
>> On 6/24/19 5:50 PM, Martin Sebor wrote:
>>> The strlen enhancement committed in r263018 to handle multi-character
>>> assignments extended the handle_char_store() function to handle such
>>> stores via MEM_REFs.  Prior to that the function only dealt with
>>> single-char stores.  The enhancement neglected to constrain a case
>>> in the function that assumed the function's previous constraint.
>>> As a result, when the original optimization takes place with
>>> a multi-character store, the function computes the wrong string
>>> length.
>>>
>>> The attached patch adds the missing constraint.
>>>
>>> Martin
>>>
>>> gcc-90989.diff
>>>
>>> PR tree-optimization - incorrect strlen result after second strcpy into
>>> the same destination
>>>
>>> gcc/ChangeLog:
>>>
>>> * tree-ssa-strlen.c (handle_char_store): Constrain a single
>>> character
>>> optimization to just single character stores.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.dg/strlenopt-26.c: Exit with test result status.
>>> * gcc.dg/strlenopt-67.c: New test.
>>>
>>> Index: gcc/tree-ssa-strlen.c
>>> ===
>>> --- gcc/tree-ssa-strlen.c    (revision 272618)
>>> +++ gcc/tree-ssa-strlen.c    (working copy)
>>> @@ -3462,34 +3462,38 @@ handle_char_store (gimple_stmt_iterator *gsi)
>>>     return false;
>>>   }
>>>   }
>>> -  /* If si->nonzero_chars > OFFSET, we aren't overwriting '\0',
>>> - and if we aren't storing '\0', we know that the length of the
>>> - string and any other zero terminated string in memory remains
>>> - the same.  In that case we move to the next gimple statement and
>>> - return to signal the caller that it shouldn't invalidate anything.
>>>   - This is benefical for cases like:
>>> +  if (cmp > 0
>>> +  && storing_nonzero_p
>>> +  && TREE_CODE (TREE_TYPE (rhs)) == INTEGER_TYPE)
>> I'm not sure I follow why checking for TREE_CODE (TREE_TYPE (rhs)) ==
>> INTEGER_TYPE helps here.  If you need to check that we're storing bytes,
>> then don't you need to check the size, not just the TREE_CODE of the
>> type?
> 
> handle_char_store is only called for single-character assignments
> or MEM_REF assignments to/from arrays.  The type of the RHS is only
> integer when storing a single character.
I don't see any requirement here that INTEGER_TYPE implies a single byte
though.  That seems to be true in simple tests I've tried, but what's to
stop us from using something like 0x31323300 on the RHS for a big endian
machine to store "123"?

And if the NUL byte in the original was at byte offset 2, then didn't we
just change the length by overwriting where the NUL is?

ISTM you actually have to look at contents of the RHS object, not just
its type.


Jeff


Re: [PATCH 17/30] Changes to microblaze

2019-06-25 Thread Michael Eager

OK

On 6/25/19 1:22 PM, acsaw...@linux.ibm.com wrote:

From: Aaron Sawdey 

* config/microblaze/microblaze.c: Change movmem to cpymem in comment.
* config/microblaze/microblaze.md (movmemsi): Change name to cpymemsi.
---
  gcc/config/microblaze/microblaze.c  | 2 +-
  gcc/config/microblaze/microblaze.md | 2 +-
  2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/microblaze/microblaze.c 
b/gcc/config/microblaze/microblaze.c
index 947eef8..c2cbe3b 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -1250,7 +1250,7 @@ microblaze_block_move_loop (rtx dest, rtx src, 
HOST_WIDE_INT length)
  microblaze_block_move_straight (dest, src, leftover);
  }
  
-/* Expand a movmemsi instruction.  */

+/* Expand a cpymemsi instruction.  */
  
  bool

  microblaze_expand_block_move (rtx dest, rtx src, rtx length, rtx align_rtx)
diff --git a/gcc/config/microblaze/microblaze.md 
b/gcc/config/microblaze/microblaze.md
index 183afff..1509e43 100644
--- a/gcc/config/microblaze/microblaze.md
+++ b/gcc/config/microblaze/microblaze.md
@@ -1144,7 +1144,7 @@
  ;; Argument 2 is the length
  ;; Argument 3 is the alignment
   
-(define_expand "movmemsi"

+(define_expand "cpymemsi"
[(parallel [(set (match_operand:BLK 0 "general_operand")
   (match_operand:BLK 1 "general_operand"))
  (use (match_operand:SI 2 ""))



--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306


[PATCH 30/30] Changes to xtensa

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/xtensa/xtensa.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/xtensa/xtensa.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 362e5ff..d1448a0 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -1026,7 +1026,7 @@
 
 ;; Block moves
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "" "")
   (match_operand:BLK 1 "" ""))
  (use (match_operand:SI 2 "arith_operand" ""))
-- 
2.7.4



Re: [PATCH 30/30] Changes to xtensa

2019-06-25 Thread augustine.sterl...@gmail.com
On Tue, Jun 25, 2019 at 1:41 PM  wrote:
>
> From: Aaron Sawdey 
>
> * config/xtensa/xtensa.md (movmemsi): Change name to cpymemsi.
> ---
>  gcc/config/xtensa/xtensa.md | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

OK for xtensa.


[PATCH 29/30] Changes to visium

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/visium/visium.h: Change movmem to cpymem in comment.
* config/visium/visium.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/visium/visium.h  | 4 ++--
 gcc/config/visium/visium.md | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/config/visium/visium.h b/gcc/config/visium/visium.h
index 817e7dc..c9376b2 100644
--- a/gcc/config/visium/visium.h
+++ b/gcc/config/visium/visium.h
@@ -1138,8 +1138,8 @@ do
\
always make code faster, but eventually incurs high cost in
increased code size.
 
-   Since we have a movmemsi pattern, the default MOVE_RATIO is 2, which
-   is too low given that movmemsi will invoke a libcall.  */
+   Since we have a cpymemsi pattern, the default MOVE_RATIO is 2, which
+   is too low given that cpymemsi will invoke a libcall.  */
 #define MOVE_RATIO(speed) ((speed) ? 9 : 3)
 
 /* `CLEAR_RATIO (SPEED)`
diff --git a/gcc/config/visium/visium.md b/gcc/config/visium/visium.md
index f535441..e146b89 100644
--- a/gcc/config/visium/visium.md
+++ b/gcc/config/visium/visium.md
@@ -3006,7 +3006,7 @@
 ;; Argument 2 is the length
 ;; Argument 3 is the alignment
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
   (match_operand:BLK 1 "memory_operand" ""))
  (use (match_operand:SI  2 "general_operand" ""))
-- 
2.7.4



[PATCH 28/30] Changes to vax

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/vax/vax-protos.h (vax_output_movmemsi): Remove prototype
for nonexistent function.
* config/vax/vax.h: Change movmem to cpymem in comment.
* config/vax/vax.md (movmemhi, movmemhi1): Change movmem to cpymem.
---
 gcc/config/vax/vax-protos.h | 1 -
 gcc/config/vax/vax.h| 2 +-
 gcc/config/vax/vax.md   | 8 
 3 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/config/vax/vax-protos.h b/gcc/config/vax/vax-protos.h
index a76cf02..a85cf36 100644
--- a/gcc/config/vax/vax-protos.h
+++ b/gcc/config/vax/vax-protos.h
@@ -31,7 +31,6 @@ extern void vax_expand_addsub_di_operands (rtx *, enum 
rtx_code);
 extern const char * vax_output_int_move (rtx, rtx *, machine_mode);
 extern const char * vax_output_int_add (rtx_insn *, rtx *, machine_mode);
 extern const char * vax_output_int_subtract (rtx_insn *, rtx *, machine_mode);
-extern const char * vax_output_movmemsi (rtx, rtx *);
 #endif /* RTX_CODE */
 
 #ifdef REAL_VALUE_TYPE
diff --git a/gcc/config/vax/vax.h b/gcc/config/vax/vax.h
index a6a8227..e7137dc 100644
--- a/gcc/config/vax/vax.h
+++ b/gcc/config/vax/vax.h
@@ -430,7 +430,7 @@ enum reg_class { NO_REGS, ALL_REGS, LIM_REG_CLASSES };
 #define MOVE_MAX 8
 
 /* If a memory-to-memory move would take MOVE_RATIO or more simple
-   move-instruction pairs, we will do a movmem or libcall instead.  */
+   move-instruction pairs, we will do a cpymem or libcall instead.  */
 #define MOVE_RATIO(speed) ((speed) ? 6 : 3)
 #define CLEAR_RATIO(speed) ((speed) ? 6 : 2)
 
diff --git a/gcc/config/vax/vax.md b/gcc/config/vax/vax.md
index bfeae7f..298f339 100644
--- a/gcc/config/vax/vax.md
+++ b/gcc/config/vax/vax.md
@@ -206,8 +206,8 @@
 }")
 
 ;; This is here to accept 4 arguments and pass the first 3 along
-;; to the movmemhi1 pattern that really does the work.
-(define_expand "movmemhi"
+;; to the cpymemhi1 pattern that really does the work.
+(define_expand "cpymemhi"
   [(set (match_operand:BLK 0 "general_operand" "=g")
(match_operand:BLK 1 "general_operand" "g"))
(use (match_operand:HI 2 "general_operand" "g"))
@@ -215,7 +215,7 @@
   ""
   "
 {
-  emit_insn (gen_movmemhi1 (operands[0], operands[1], operands[2]));
+  emit_insn (gen_cpymemhi1 (operands[0], operands[1], operands[2]));
   DONE;
 }")
 
@@ -224,7 +224,7 @@
 ;; that anything generated as this insn will be recognized as one
 ;; and that it won't successfully combine with anything.
 
-(define_insn "movmemhi1"
+(define_insn "cpymemhi1"
   [(set (match_operand:BLK 0 "memory_operand" "=o")
(match_operand:BLK 1 "memory_operand" "o"))
(use (match_operand:HI 2 "general_operand" "g"))
-- 
2.7.4



[PATCH 26/30] Changes to sh

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/sh/sh.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/sh/sh.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/sh/sh.md b/gcc/config/sh/sh.md
index 8354377..ed70e34 100644
--- a/gcc/config/sh/sh.md
+++ b/gcc/config/sh/sh.md
@@ -8906,7 +8906,7 @@
 
 ;; String/block move insn.
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (mem:BLK (match_operand:BLK 0))
   (mem:BLK (match_operand:BLK 1)))
  (use (match_operand:SI 2 "nonmemory_operand"))
-- 
2.7.4



[PATCH 27/30] Changes to sparc

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/sparc/sparc.h: Change movmem to cpymem in comment.
---
 gcc/config/sparc/sparc.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/sparc/sparc.h b/gcc/config/sparc/sparc.h
index 015065f..2dd765b 100644
--- a/gcc/config/sparc/sparc.h
+++ b/gcc/config/sparc/sparc.h
@@ -1412,7 +1412,7 @@ do {  
   \
 #define MOVE_MAX 8
 
 /* If a memory-to-memory move would take MOVE_RATIO or more simple
-   move-instruction pairs, we will do a movmem or libcall instead.  */
+   move-instruction pairs, we will do a cpymem or libcall instead.  */
 
 #define MOVE_RATIO(speed) ((speed) ? 8 : 3)
 
-- 
2.7.4



[PATCH 25/30] Changes to s390

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/s390/s390-protos.h: Change movmem to cpymem.
* config/s390/s390.c (s390_expand_movmem, s390_expand_setmem,
s390_expand_insv): Change movmem to cpymem.
* config/s390/s390.md (movmem, movmem_short, *movmem_short,
movmem_long, *movmem_long, *movmem_long_31z): Change movmem to cpymem.
---
 gcc/config/s390/s390-protos.h |  2 +-
 gcc/config/s390/s390.c| 18 +-
 gcc/config/s390/s390.md   | 16 
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/config/s390/s390-protos.h b/gcc/config/s390/s390-protos.h
index aa04479..b162b26 100644
--- a/gcc/config/s390/s390-protos.h
+++ b/gcc/config/s390/s390-protos.h
@@ -104,7 +104,7 @@ extern void s390_reload_symref_address (rtx , rtx , rtx , 
bool);
 extern void s390_expand_plus_operand (rtx, rtx, rtx);
 extern void emit_symbolic_move (rtx *);
 extern void s390_load_address (rtx, rtx);
-extern bool s390_expand_movmem (rtx, rtx, rtx);
+extern bool s390_expand_cpymem (rtx, rtx, rtx);
 extern void s390_expand_setmem (rtx, rtx, rtx);
 extern bool s390_expand_cmpmem (rtx, rtx, rtx, rtx);
 extern void s390_expand_vec_strlen (rtx, rtx, rtx);
diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 3ae1219..5ec26a059 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -5394,7 +5394,7 @@ legitimize_reload_address (rtx ad, machine_mode mode 
ATTRIBUTE_UNUSED,
 /* Emit code to move LEN bytes from DST to SRC.  */
 
 bool
-s390_expand_movmem (rtx dst, rtx src, rtx len)
+s390_expand_cpymem (rtx dst, rtx src, rtx len)
 {
   /* When tuning for z10 or higher we rely on the Glibc functions to
  do the right thing. Only for constant lengths below 64k we will
@@ -5419,14 +5419,14 @@ s390_expand_movmem (rtx dst, rtx src, rtx len)
{
  rtx newdst = adjust_address (dst, BLKmode, o);
  rtx newsrc = adjust_address (src, BLKmode, o);
- emit_insn (gen_movmem_short (newdst, newsrc,
+ emit_insn (gen_cpymem_short (newdst, newsrc,
   GEN_INT (l > 256 ? 255 : l - 1)));
}
 }
 
   else if (TARGET_MVCLE)
 {
-  emit_insn (gen_movmem_long (dst, src, convert_to_mode (Pmode, len, 1)));
+  emit_insn (gen_cpymem_long (dst, src, convert_to_mode (Pmode, len, 1)));
 }
 
   else
@@ -5488,7 +5488,7 @@ s390_expand_movmem (rtx dst, rtx src, rtx len)
  emit_insn (prefetch);
}
 
-  emit_insn (gen_movmem_short (dst, src, GEN_INT (255)));
+  emit_insn (gen_cpymem_short (dst, src, GEN_INT (255)));
   s390_load_address (dst_addr,
 gen_rtx_PLUS (Pmode, dst_addr, GEN_INT (256)));
   s390_load_address (src_addr,
@@ -5505,7 +5505,7 @@ s390_expand_movmem (rtx dst, rtx src, rtx len)
   emit_jump (loop_start_label);
   emit_label (loop_end_label);
 
-  emit_insn (gen_movmem_short (dst, src,
+  emit_insn (gen_cpymem_short (dst, src,
   convert_to_mode (Pmode, count, 1)));
   emit_label (end_label);
 }
@@ -5557,7 +5557,7 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
if (l > 1)
  {
rtx newdstp1 = adjust_address (dst, BLKmode, o + 1);
-   emit_insn (gen_movmem_short (newdstp1, newdst,
+   emit_insn (gen_cpymem_short (newdstp1, newdst,
 GEN_INT (l > 257 ? 255 : l - 2)));
  }
  }
@@ -5664,7 +5664,7 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
  /* Set the first byte in the block to the value and use an
 overlapping mvc for the block.  */
  emit_move_insn (adjust_address (dst, QImode, 0), val);
- emit_insn (gen_movmem_short (dstp1, dst, GEN_INT (254)));
+ emit_insn (gen_cpymem_short (dstp1, dst, GEN_INT (254)));
}
   s390_load_address (dst_addr,
 gen_rtx_PLUS (Pmode, dst_addr, GEN_INT (256)));
@@ -5688,7 +5688,7 @@ s390_expand_setmem (rtx dst, rtx len, rtx val)
  emit_move_insn (adjust_address (dst, QImode, 0), val);
  /* execute only uses the lowest 8 bits of count that's
 exactly what we need here.  */
- emit_insn (gen_movmem_short (dstp1, dst,
+ emit_insn (gen_cpymem_short (dstp1, dst,
   convert_to_mode (Pmode, count, 1)));
}
 
@@ -6330,7 +6330,7 @@ s390_expand_insv (rtx dest, rtx op1, rtx op2, rtx src)
 
  dest = adjust_address (dest, BLKmode, 0);
  set_mem_size (dest, size);
- s390_expand_movmem (dest, src_mem, GEN_INT (size));
+ s390_expand_cpymem (dest, src_mem, GEN_INT (size));
  return true;
}
 
diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index 714d8b0..d06aea9 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -3196,17 +3196,17 @@
 
 
 ;
-; movmemM instruction pattern(s).
+; cpymemM instruction pattern(s).
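
The mvc-based path in s390_expand_cpymem above copies in chunks of at most 256 bytes, and the hardware encodes each chunk length as length minus one, which is why the code passes GEN_INT (l > 256 ? 255 : l - 1).  A standalone C sketch of that chunking follows; it is not the s390 implementation, and emit_copy_chunk is a made-up stand-in for a single hardware copy instruction.

  /* Standalone sketch of the 256-byte chunking used above.  The encoding
     parameter enc is "bytes to copy minus one" (0..255), mirroring the
     GEN_INT (l > 256 ? 255 : l - 1) arguments in s390_expand_cpymem.  */
  #include <stddef.h>
  #include <string.h>

  /* Hypothetical stand-in for emitting one mvc-style copy of enc + 1 bytes.  */
  static void
  emit_copy_chunk (char *dst, const char *src, unsigned char enc)
  {
    memcpy (dst, src, (size_t) enc + 1);
  }

  static void
  copy_constant_length (char *dst, const char *src, size_t len)
  {
    for (size_t o = 0; o < len; o += 256)
      {
        size_t l = len - o;
        emit_copy_chunk (dst + o, src + o, l > 256 ? 255 : (unsigned char) (l - 1));
      }
    /* A real expander would also honor the "constant lengths below 64k"
       limit mentioned in the comment above and defer larger or variable
       copies to the loop or library path.  */
  }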

[PATCH 24/30] Changes to rx

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/rx/rx.md (UNSPEC_MOVMEM, movmemsi, rx_movmem): Change
movmem to cpymem.
---
 gcc/config/rx/rx.md | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/rx/rx.md b/gcc/config/rx/rx.md
index 2790882..9df73e6 100644
--- a/gcc/config/rx/rx.md
+++ b/gcc/config/rx/rx.md
@@ -46,7 +46,7 @@
(UNSPEC_CONST   13)

(UNSPEC_MOVSTR  20)
-   (UNSPEC_MOVMEM  21)
+   (UNSPEC_CPYMEM  21)
(UNSPEC_SETMEM  22)
(UNSPEC_STRLEN  23)
(UNSPEC_CMPSTRN 24)
@@ -2449,13 +2449,13 @@
(set_attr "timings" "")] ;; The timing is a guesstimate.
 )
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel
 [(set (match_operand:BLK 0 "memory_operand");; Dest
  (match_operand:BLK 1 "memory_operand"))   ;; Source
  (use (match_operand:SI  2 "register_operand")) ;; Length in bytes
  (match_operand  3 "immediate_operand") ;; Align
- (unspec_volatile:BLK [(reg:SI 1) (reg:SI 2) (reg:SI 3)] UNSPEC_MOVMEM)]
+ (unspec_volatile:BLK [(reg:SI 1) (reg:SI 2) (reg:SI 3)] UNSPEC_CPYMEM)]
 )]
   "rx_allow_string_insns"
   {
@@ -2486,16 +2486,16 @@
 emit_move_insn (len, force_operand (operands[2], NULL_RTX));
 operands[0] = replace_equiv_address_nv (operands[0], addr1);
 operands[1] = replace_equiv_address_nv (operands[1], addr2);
-emit_insn (gen_rx_movmem ());
+emit_insn (gen_rx_cpymem ());
 DONE;
   }
 )
 
-(define_insn "rx_movmem"
+(define_insn "rx_cpymem"
   [(set (mem:BLK (reg:SI 1))
(mem:BLK (reg:SI 2)))
(use (reg:SI 3))
-   (unspec_volatile:BLK [(reg:SI 1) (reg:SI 2) (reg:SI 3)] UNSPEC_MOVMEM)
+   (unspec_volatile:BLK [(reg:SI 1) (reg:SI 2) (reg:SI 3)] UNSPEC_CPYMEM)
(clobber (reg:SI 1))
(clobber (reg:SI 2))
(clobber (reg:SI 3))]
-- 
2.7.4



[PATCH 23/30] Changes to rs6000

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/rs6000/rs6000.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/rs6000/rs6000.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index b04c7055..c3087e5 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -9113,7 +9113,7 @@
 ;; Argument 2 is the length
 ;; Argument 3 is the alignment
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "")
   (match_operand:BLK 1 ""))
  (use (match_operand:SI 2 ""))
-- 
2.7.4



[PATCH 22/30] Changes to riscv

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/riscv/riscv.c: Change movmem to cpymem in comment.
* config/riscv/riscv.h: Change movmem to cpymem.
* config/riscv/riscv.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/riscv/riscv.c  | 2 +-
 gcc/config/riscv/riscv.h  | 8 
 gcc/config/riscv/riscv.md | 2 +-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index d61455f..8ac09f2 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3050,7 +3050,7 @@ riscv_block_move_loop (rtx dest, rtx src, HOST_WIDE_INT length,
 emit_insn(gen_nop ());
 }
 
-/* Expand a movmemsi instruction, which copies LENGTH bytes from
+/* Expand a cpymemsi instruction, which copies LENGTH bytes from
memory reference SRC to memory reference DEST.  */
 
 bool
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 8856cee..4509d73 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -840,20 +840,20 @@ while (0)
 #undef PTRDIFF_TYPE
 #define PTRDIFF_TYPE (POINTER_SIZE == 64 ? "long int" : "int")
 
-/* The maximum number of bytes copied by one iteration of a movmemsi loop.  */
+/* The maximum number of bytes copied by one iteration of a cpymemsi loop.  */
 
 #define RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER (UNITS_PER_WORD * 4)
 
 /* The maximum number of bytes that can be copied by a straight-line
-   movmemsi implementation.  */
+   cpymemsi implementation.  */
 
 #define RISCV_MAX_MOVE_BYTES_STRAIGHT (RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER * 3)
 
 /* If a memory-to-memory move would take MOVE_RATIO or more simple
-   move-instruction pairs, we will do a movmem or libcall instead.
+   move-instruction pairs, we will do a cpymem or libcall instead.
Do not use move_by_pieces at all when strict alignment is not
in effect but the target has slow unaligned accesses; in this
-   case, movmem or libcall is more efficient.  */
+   case, cpymem or libcall is more efficient.  */
 
 #define MOVE_RATIO(speed)  \
   (!STRICT_ALIGNMENT && riscv_slow_unaligned_access_p ? 1 :\
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 8b21c19..309c109 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1498,7 +1498,7 @@
   DONE;
 })
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "general_operand")
   (match_operand:BLK 1 "general_operand"))
  (use (match_operand:SI 2 ""))
-- 
2.7.4



[PATCH 21/30] Changes to pdp11

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/pdp11/pdp11.md (movmemhi, movmemhi1,
movmemhi_nocc, UNSPEC_MOVMEM): Change movmem to cpymem.
---
 gcc/config/pdp11/pdp11.md | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/pdp11/pdp11.md b/gcc/config/pdp11/pdp11.md
index ce781db..be5ddc4 100644
--- a/gcc/config/pdp11/pdp11.md
+++ b/gcc/config/pdp11/pdp11.md
@@ -26,7 +26,7 @@
 UNSPECV_BLOCKAGE
 UNSPECV_SETD
 UNSPECV_SETI
-UNSPECV_MOVMEM
+UNSPECV_CPYMEM
   ])
 
 (define_constants
@@ -664,8 +664,8 @@
   [(set_attr "length" "2,2,4,4,2")])
 
 ;; Expand a block move.  We turn this into a move loop.
-(define_expand "movmemhi"
-  [(parallel [(unspec_volatile [(const_int 0)] UNSPECV_MOVMEM)
+(define_expand "cpymemhi"
+  [(parallel [(unspec_volatile [(const_int 0)] UNSPECV_CPYMEM)
  (match_operand:BLK 0 "general_operand" "=g")
  (match_operand:BLK 1 "general_operand" "g")
  (match_operand:HI 2 "immediate_operand" "i")
@@ -694,8 +694,8 @@
 }")
 
 ;; Expand a block move.  We turn this into a move loop.
-(define_insn_and_split "movmemhi1"
-  [(unspec_volatile [(const_int 0)] UNSPECV_MOVMEM)
+(define_insn_and_split "cpymemhi1"
+  [(unspec_volatile [(const_int 0)] UNSPECV_CPYMEM)
(match_operand:HI 0 "register_operand" "+r")
(match_operand:HI 1 "register_operand" "+r")
(match_operand:HI 2 "register_operand" "+r")
@@ -707,7 +707,7 @@
   ""
   "#"
   "reload_completed"
-  [(parallel [(unspec_volatile [(const_int 0)] UNSPECV_MOVMEM)
+  [(parallel [(unspec_volatile [(const_int 0)] UNSPECV_CPYMEM)
  (match_dup 0)
  (match_dup 1)
  (match_dup 2)
@@ -719,8 +719,8 @@
  (clobber (reg:CC CC_REGNUM))])]
   "")
 
-(define_insn "movmemhi_nocc"
-  [(unspec_volatile [(const_int 0)] UNSPECV_MOVMEM)
+(define_insn "cpymemhi_nocc"
+  [(unspec_volatile [(const_int 0)] UNSPECV_CPYMEM)
(match_operand:HI 0 "register_operand" "+r")
(match_operand:HI 1 "register_operand" "+r")
(match_operand:HI 2 "register_operand" "+r")
-- 
2.7.4



[PATCH 20/30] Changes to pa

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/pa/pa.c (compute_movmem_length): Change movmem to cpymem.
(pa_adjust_insn_length): Change call to compute_movmem_length.
* config/pa/pa.md (movmemsi, movmemsi_prereload, movmemsi_postreload,
movmemdi, movmemdi_prereload,
movmemdi_postreload): Change movmem to cpymem.
---
 gcc/config/pa/pa.c  |  6 +++---
 gcc/config/pa/pa.md | 14 +++---
 2 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/config/pa/pa.c b/gcc/config/pa/pa.c
index deb2d43..0d00bf6 100644
--- a/gcc/config/pa/pa.c
+++ b/gcc/config/pa/pa.c
@@ -107,7 +107,7 @@ static int pa_can_combine_p (rtx_insn *, rtx_insn *, rtx_insn *, int, rtx,
 static bool forward_branch_p (rtx_insn *);
 static void compute_zdepwi_operands (unsigned HOST_WIDE_INT, unsigned *);
 static void compute_zdepdi_operands (unsigned HOST_WIDE_INT, unsigned *);
-static int compute_movmem_length (rtx_insn *);
+static int compute_cpymem_length (rtx_insn *);
 static int compute_clrmem_length (rtx_insn *);
 static bool pa_assemble_integer (rtx, unsigned int, int);
 static void remove_useless_addtr_insns (int);
@@ -2985,7 +2985,7 @@ pa_output_block_move (rtx *operands, int size_is_constant ATTRIBUTE_UNUSED)
count insns rather than emit them.  */
 
 static int
-compute_movmem_length (rtx_insn *insn)
+compute_cpymem_length (rtx_insn *insn)
 {
   rtx pat = PATTERN (insn);
   unsigned int align = INTVAL (XEXP (XVECEXP (pat, 0, 7), 0));
@@ -5060,7 +5060,7 @@ pa_adjust_insn_length (rtx_insn *insn, int length)
   && GET_CODE (XEXP (XVECEXP (pat, 0, 0), 1)) == MEM
   && GET_MODE (XEXP (XVECEXP (pat, 0, 0), 0)) == BLKmode
   && GET_MODE (XEXP (XVECEXP (pat, 0, 0), 1)) == BLKmode)
-length += compute_movmem_length (insn) - 4;
+length += compute_cpymem_length (insn) - 4;
   /* Block clear pattern.  */
   else if (NONJUMP_INSN_P (insn)
   && GET_CODE (pat) == PARALLEL
diff --git a/gcc/config/pa/pa.md b/gcc/config/pa/pa.md
index a568e79..809a7b7 100644
--- a/gcc/config/pa/pa.md
+++ b/gcc/config/pa/pa.md
@@ -3162,9 +3162,9 @@
 
 ;; The definition of this insn does not really explain what it does,
 ;; but it should suffice that anything generated as this insn will be
-;; recognized as a movmemsi operation, and that it will not successfully
+;; recognized as a cpymemsi operation, and that it will not successfully
 ;; combine with anything.
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "" "")
   (match_operand:BLK 1 "" ""))
  (clobber (match_dup 4))
@@ -3244,7 +3244,7 @@
 ;; operands 0 and 1 are both equivalent to symbolic MEMs.  Thus, we are
 ;; forced to internally copy operands 0 and 1 to operands 7 and 8,
 ;; respectively.  We then split or peephole optimize after reload.
-(define_insn "movmemsi_prereload"
+(define_insn "cpymemsi_prereload"
   [(set (mem:BLK (match_operand:SI 0 "register_operand" "r,r"))
(mem:BLK (match_operand:SI 1 "register_operand" "r,r")))
(clobber (match_operand:SI 2 "register_operand" "=,"))  ;loop cnt/tmp
@@ -3337,7 +3337,7 @@
 }
 }")
 
-(define_insn "movmemsi_postreload"
+(define_insn "cpymemsi_postreload"
   [(set (mem:BLK (match_operand:SI 0 "register_operand" "+r,r"))
(mem:BLK (match_operand:SI 1 "register_operand" "+r,r")))
(clobber (match_operand:SI 2 "register_operand" "=,"))  ;loop cnt/tmp
@@ -3352,7 +3352,7 @@
   "* return pa_output_block_move (operands, !which_alternative);"
   [(set_attr "type" "multi,multi")])
 
-(define_expand "movmemdi"
+(define_expand "cpymemdi"
   [(parallel [(set (match_operand:BLK 0 "" "")
   (match_operand:BLK 1 "" ""))
  (clobber (match_dup 4))
@@ -3432,7 +3432,7 @@
 ;; operands 0 and 1 are both equivalent to symbolic MEMs.  Thus, we are
 ;; forced to internally copy operands 0 and 1 to operands 7 and 8,
 ;; respectively.  We then split or peephole optimize after reload.
-(define_insn "movmemdi_prereload"
+(define_insn "cpymemdi_prereload"
   [(set (mem:BLK (match_operand:DI 0 "register_operand" "r,r"))
(mem:BLK (match_operand:DI 1 "register_operand" "r,r")))
(clobber (match_operand:DI 2 "register_operand" "=,"))  ;loop cnt/tmp
@@ -3525,7 +3525,7 @@
 }
 }")
 
-(define_insn "movmemdi_postreload"
+(define_insn "cpymemdi_postreload"
   [(set (mem:BLK (match_operand:DI 0 "register_operand" "+r,r"))
(mem:BLK (match_operand:DI 1 "register_operand" "+r,r")))
(clobber (match_operand:DI 2 "register_operand" "=,"))  ;loop cnt/tmp
-- 
2.7.4
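
compute_cpymem_length above predicts how many instructions a block copy will need instead of emitting them, so pa_adjust_insn_length can account for the sequence when measuring branch distances.  A toy version of the same idea follows; the formula and names are invented for illustration and are not the PA ones.

  /* Toy instruction-count estimate for a block copy: one load plus one
     store per aligned piece.  Invented formula, for illustration only.  */
  #include <stddef.h>

  static int
  count_copy_insns (size_t n_bytes, size_t align)
  {
    size_t unit = (align >= 4) ? 4 : (align >= 2) ? 2 : 1;
    size_t pieces = (n_bytes + unit - 1) / unit;
    return (int) (2 * pieces);
  }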



[PATCH 19/30] Changes to nds32

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/nds32/nds32-memory-manipulation.c
(nds32_expand_movmemsi_loop_unknown_size,
nds32_expand_movmemsi_loop_known_size, nds32_expand_movmemsi_loop,
nds32_expand_movmemsi_unroll,
nds32_expand_movmemsi): Change movmem to cpymem.
* config/nds32/nds32-multiple.md (movmemsi): Change name to cpymemsi.
* config/nds32/nds32-protos.h: Change movmem to cpymem.
---
 gcc/config/nds32/nds32-memory-manipulation.c | 30 ++--
 gcc/config/nds32/nds32-multiple.md   |  4 ++--
 gcc/config/nds32/nds32-protos.h  |  2 +-
 3 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/config/nds32/nds32-memory-manipulation.c b/gcc/config/nds32/nds32-memory-manipulation.c
index 71b75dc..b3f2cd6 100644
--- a/gcc/config/nds32/nds32-memory-manipulation.c
+++ b/gcc/config/nds32/nds32-memory-manipulation.c
@@ -1,4 +1,4 @@
-/* Auxiliary functions for expand movmem, setmem, cmpmem, load_multiple
+/* Auxiliary functions for expand cpymem, setmem, cmpmem, load_multiple
and store_multiple pattern of Andes NDS32 cpu for GNU compiler
Copyright (C) 2012-2019 Free Software Foundation, Inc.
Contributed by Andes Technology Corporation.
@@ -120,14 +120,14 @@ nds32_emit_mem_move_block (int base_regno, int count,
 
 /*  */
 
-/* Auxiliary function for expand movmem pattern.  */
+/* Auxiliary function for expand cpymem pattern.  */
 
 static bool
-nds32_expand_movmemsi_loop_unknown_size (rtx dstmem, rtx srcmem,
+nds32_expand_cpymemsi_loop_unknown_size (rtx dstmem, rtx srcmem,
 rtx size,
 rtx alignment)
 {
-  /* Emit loop version of movmem.
+  /* Emit loop version of cpymem.
 
andi$size_least_3_bit, $size, #~7
add $dst_end, $dst, $size
@@ -254,7 +254,7 @@ nds32_expand_movmemsi_loop_unknown_size (rtx dstmem, rtx srcmem,
 }
 
 static bool
-nds32_expand_movmemsi_loop_known_size (rtx dstmem, rtx srcmem,
+nds32_expand_cpymemsi_loop_known_size (rtx dstmem, rtx srcmem,
   rtx size, rtx alignment)
 {
   rtx dst_base_reg, src_base_reg;
@@ -288,7 +288,7 @@ nds32_expand_movmemsi_loop_known_size (rtx dstmem, rtx srcmem,
 
   if (total_bytes < 8)
 {
-  /* Emit total_bytes less than 8 loop version of movmem.
+  /* Emit total_bytes less than 8 loop version of cpymem.
add $dst_end, $dst, $size
move$dst_itr, $dst
.Lbyte_mode_loop:
@@ -321,7 +321,7 @@ nds32_expand_movmemsi_loop_known_size (rtx dstmem, rtx srcmem,
 }
   else if (total_bytes % 8 == 0)
 {
-  /* Emit multiple of 8 loop version of movmem.
+  /* Emit multiple of 8 loop version of cpymem.
 
 add $dst_end, $dst, $size
 move$dst_itr, $dst
@@ -370,7 +370,7 @@ nds32_expand_movmemsi_loop_known_size (rtx dstmem, rtx srcmem,
   else
 {
   /* Handle size greater than 8, and not a multiple of 8.  */
-  return nds32_expand_movmemsi_loop_unknown_size (dstmem, srcmem,
+  return nds32_expand_cpymemsi_loop_unknown_size (dstmem, srcmem,
  size, alignment);
 }
 
@@ -378,19 +378,19 @@ nds32_expand_movmemsi_loop_known_size (rtx dstmem, rtx srcmem,
 }
 
 static bool
-nds32_expand_movmemsi_loop (rtx dstmem, rtx srcmem,
+nds32_expand_cpymemsi_loop (rtx dstmem, rtx srcmem,
rtx size, rtx alignment)
 {
   if (CONST_INT_P (size))
-return nds32_expand_movmemsi_loop_known_size (dstmem, srcmem,
+return nds32_expand_cpymemsi_loop_known_size (dstmem, srcmem,
  size, alignment);
   else
-return nds32_expand_movmemsi_loop_unknown_size (dstmem, srcmem,
+return nds32_expand_cpymemsi_loop_unknown_size (dstmem, srcmem,
size, alignment);
 }
 
 static bool
-nds32_expand_movmemsi_unroll (rtx dstmem, rtx srcmem,
+nds32_expand_cpymemsi_unroll (rtx dstmem, rtx srcmem,
  rtx total_bytes, rtx alignment)
 {
   rtx dst_base_reg, src_base_reg;
@@ -533,13 +533,13 @@ nds32_expand_movmemsi_unroll (rtx dstmem, rtx srcmem,
This is auxiliary extern function to help create rtx template.
Check nds32-multiple.md file for the patterns.  */
 bool
-nds32_expand_movmemsi (rtx dstmem, rtx srcmem, rtx total_bytes, rtx alignment)
+nds32_expand_cpymemsi (rtx dstmem, rtx srcmem, rtx total_bytes, rtx alignment)
 {
-  if (nds32_expand_movmemsi_unroll (dstmem, srcmem, total_bytes, alignment))
+  if (nds32_expand_cpymemsi_unroll (dstmem, srcmem, total_bytes, alignment))
 return true;
 
   if (!optimize_size && optimize > 2)
-return nds32_expand_movmemsi_loop (dstmem, srcmem, total_bytes, alignment);
+return nds32_expand_cpymemsi_loop (dstmem, srcmem, total_bytes, alignment);
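
The pseudo-assembly comments in this file (the andi $size_least_3_bit, $size, #~7 prologue, the .Lbyte_mode_loop, and the double-word loop) describe the two loop shapes being renamed: bulk copies in 8-byte steps plus a byte-mode loop for the remainder.  A C-level sketch of that shape follows; the names are hypothetical and this is not the nds32 code.

  /* Hypothetical C-level sketch of the loop shape described above:
     move the bulk of the block in 8-byte steps, then finish the tail
     byte by byte.  */
  #include <stdint.h>
  #include <stddef.h>
  #include <string.h>

  static void
  copy_dwords_then_bytes (uint8_t *dst, const uint8_t *src, size_t size)
  {
    size_t bulk = size & ~(size_t) 7;   /* like "andi ..., #~7" above */
    size_t i = 0;

    for (; i < bulk; i += 8)            /* double-word loop */
      memcpy (dst + i, src + i, 8);     /* stands in for an 8-byte load/store pair */

    for (; i < size; i++)               /* byte-mode loop for the leftover */
      dst[i] = src[i];
  }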
 
   

[PATCH 18/30] Changes to mips

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/mips/mips.c (mips_use_by_pieces_infrastructure_p):
Change movmem to cpymem.
* config/mips/mips.h: Change movmem to cpymem.
* config/mips/mips.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/mips/mips.c  | 10 +-
 gcc/config/mips/mips.h  | 10 +-
 gcc/config/mips/mips.md |  2 +-
 3 files changed, 11 insertions(+), 11 deletions(-)

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 0e1a68a..cbebb45 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -7938,15 +7938,15 @@ mips_use_by_pieces_infrastructure_p (unsigned HOST_WIDE_INT size,
 {
   if (op == STORE_BY_PIECES)
 return mips_store_by_pieces_p (size, align);
-  if (op == MOVE_BY_PIECES && HAVE_movmemsi)
+  if (op == MOVE_BY_PIECES && HAVE_cpymemsi)
 {
-  /* movmemsi is meant to generate code that is at least as good as
-move_by_pieces.  However, movmemsi effectively uses a by-pieces
+  /* cpymemsi is meant to generate code that is at least as good as
+move_by_pieces.  However, cpymemsi effectively uses a by-pieces
 implementation both for moves smaller than a word and for
 word-aligned moves of no more than MIPS_MAX_MOVE_BYTES_STRAIGHT
 bytes.  We should allow the tree-level optimisers to do such
 moves by pieces, as it often exposes other optimization
-opportunities.  We might as well continue to use movmemsi at
+opportunities.  We might as well continue to use cpymemsi at
 the rtl level though, as it produces better code when
 scheduling is disabled (such as at -O).  */
   if (currently_expanding_to_rtl)
@@ -8165,7 +8165,7 @@ mips_block_move_loop (rtx dest, rtx src, HOST_WIDE_INT length,
 emit_insn (gen_nop ());
 }
 
-/* Expand a movmemsi instruction, which copies LENGTH bytes from
+/* Expand a cpymemsi instruction, which copies LENGTH bytes from
memory reference SRC to memory reference DEST.  */
 
 bool
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 953d82e..a5be7fa3 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -3099,12 +3099,12 @@ while (0)
 #define MIPS_MIN_MOVE_MEM_ALIGN 16
 
 /* The maximum number of bytes that can be copied by one iteration of
-   a movmemsi loop; see mips_block_move_loop.  */
+   a cpymemsi loop; see mips_block_move_loop.  */
 #define MIPS_MAX_MOVE_BYTES_PER_LOOP_ITER \
   (UNITS_PER_WORD * 4)
 
 /* The maximum number of bytes that can be copied by a straight-line
-   implementation of movmemsi; see mips_block_move_straight.  We want
+   implementation of cpymemsi; see mips_block_move_straight.  We want
to make sure that any loop-based implementation will iterate at
least twice.  */
 #define MIPS_MAX_MOVE_BYTES_STRAIGHT \
@@ -3119,11 +3119,11 @@ while (0)
 
 #define MIPS_CALL_RATIO 8
 
-/* Any loop-based implementation of movmemsi will have at least
+/* Any loop-based implementation of cpymemsi will have at least
MIPS_MAX_MOVE_BYTES_STRAIGHT / UNITS_PER_WORD memory-to-memory
moves, so allow individual copies of fewer elements.
 
-   When movmemsi is not available, use a value approximating
+   When cpymemsi is not available, use a value approximating
the length of a memcpy call sequence, so that move_by_pieces
will generate inline code if it is shorter than a function call.
Since move_by_pieces_ninsns counts memory-to-memory moves, but
@@ -3131,7 +3131,7 @@ while (0)
value of MIPS_CALL_RATIO to take that into account.  */
 
 #define MOVE_RATIO(speed)  \
-  (HAVE_movmemsi   \
+  (HAVE_cpymemsi   \
? MIPS_MAX_MOVE_BYTES_STRAIGHT / MOVE_MAX   \
: MIPS_CALL_RATIO / 2)
 
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index 2ae1f7e..d260cf9 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -5638,7 +5638,7 @@
 ;; Argument 2 is the length
 ;; Argument 3 is the alignment
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "general_operand")
   (match_operand:BLK 1 "general_operand"))
  (use (match_operand:SI 2 ""))
-- 
2.7.4
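
As a rough illustration of the trade-off the MOVE_RATIO comment in mips.h above describes, here is a self-contained C sketch; the names and the threshold are made up and this is not the MIPS implementation.  The idea: open-code the copy while the estimated number of word moves stays under the ratio, otherwise defer to the library.

  /* Rough sketch (not GCC code) of the decision MOVE_RATIO encodes:
     estimate how many word-sized moves a copy needs and fall back to a
     library call once that estimate reaches the threshold.  The names
     and the threshold value below are invented for illustration.  */
  #include <stddef.h>
  #include <string.h>

  #define WORD_BYTES (sizeof (long))
  #define MOVE_RATIO_SKETCH 8     /* hypothetical threshold */

  static void
  copy_by_pieces_or_call (char *dst, const char *src, size_t n)
  {
    size_t word_moves = (n + WORD_BYTES - 1) / WORD_BYTES;

    if (word_moves < MOVE_RATIO_SKETCH)
      {
        /* "By pieces": open-code a handful of simple moves.  */
        for (size_t i = 0; i < n; i++)
          dst[i] = src[i];
      }
    else
      /* Too big to open-code profitably: let the library do it, much as
         the middle end emits a memcpy call when no profitable cpymem
         pattern is available.  */
      memcpy (dst, src, n);
  }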



[PATCH 17/30] Changes to microblaze

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/microblaze/microblaze.c: Change movmem to cpymem in comment.
* config/microblaze/microblaze.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/microblaze/microblaze.c  | 2 +-
 gcc/config/microblaze/microblaze.md | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/microblaze/microblaze.c b/gcc/config/microblaze/microblaze.c
index 947eef8..c2cbe3b 100644
--- a/gcc/config/microblaze/microblaze.c
+++ b/gcc/config/microblaze/microblaze.c
@@ -1250,7 +1250,7 @@ microblaze_block_move_loop (rtx dest, rtx src, HOST_WIDE_INT length)
 microblaze_block_move_straight (dest, src, leftover);
 }
 
-/* Expand a movmemsi instruction.  */
+/* Expand a cpymemsi instruction.  */
 
 bool
 microblaze_expand_block_move (rtx dest, rtx src, rtx length, rtx align_rtx)
diff --git a/gcc/config/microblaze/microblaze.md b/gcc/config/microblaze/microblaze.md
index 183afff..1509e43 100644
--- a/gcc/config/microblaze/microblaze.md
+++ b/gcc/config/microblaze/microblaze.md
@@ -1144,7 +1144,7 @@
 ;; Argument 2 is the length
 ;; Argument 3 is the alignment
  
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "general_operand")
   (match_operand:BLK 1 "general_operand"))
  (use (match_operand:SI 2 ""))
-- 
2.7.4



[PATCH 16/30] Changes to mcore

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/mcore/mcore.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/mcore/mcore.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/mcore/mcore.md b/gcc/config/mcore/mcore.md
index cc84e34..c689351 100644
--- a/gcc/config/mcore/mcore.md
+++ b/gcc/config/mcore/mcore.md
@@ -2552,7 +2552,7 @@
 ;; Block move - adapted from m88k.md
 ;; 
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (mem:BLK (match_operand:BLK 0 "" ""))
   (mem:BLK (match_operand:BLK 1 "" "")))
  (use (match_operand:SI 2 "general_operand" ""))
-- 
2.7.4



[PATCH 15/30] Changes to m32r

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/m32r/m32r.c (m32r_expand_block_move): Change movmem to cpymem.
* config/m32r/m32r.md (movmemsi, movmemsi_internal): Change movmem
to cpymem.
---
 gcc/config/m32r/m32r.c  | 4 ++--
 gcc/config/m32r/m32r.md | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/m32r/m32r.c b/gcc/config/m32r/m32r.c
index 6e79b2a..ac18aa2 100644
--- a/gcc/config/m32r/m32r.c
+++ b/gcc/config/m32r/m32r.c
@@ -2598,7 +2598,7 @@ m32r_expand_block_move (rtx operands[])
 to the word after the end of the source block, and dst_reg to point
 to the last word of the destination block, provided that the block
 is MAX_MOVE_BYTES long.  */
-  emit_insn (gen_movmemsi_internal (dst_reg, src_reg, at_a_time,
+  emit_insn (gen_cpymemsi_internal (dst_reg, src_reg, at_a_time,
new_dst_reg, new_src_reg));
   emit_move_insn (dst_reg, new_dst_reg);
   emit_move_insn (src_reg, new_src_reg);
@@ -2612,7 +2612,7 @@ m32r_expand_block_move (rtx operands[])
 }
 
   if (leftover)
-emit_insn (gen_movmemsi_internal (dst_reg, src_reg, GEN_INT (leftover),
+emit_insn (gen_cpymemsi_internal (dst_reg, src_reg, GEN_INT (leftover),
  gen_reg_rtx (SImode),
  gen_reg_rtx (SImode)));
   return 1;
diff --git a/gcc/config/m32r/m32r.md b/gcc/config/m32r/m32r.md
index be57397..e944363 100644
--- a/gcc/config/m32r/m32r.md
+++ b/gcc/config/m32r/m32r.md
@@ -2195,7 +2195,7 @@
 ;; Argument 2 is the length
 ;; Argument 3 is the alignment
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "general_operand" "")
   (match_operand:BLK 1 "general_operand" ""))
  (use (match_operand:SI  2 "immediate_operand" ""))
@@ -2214,7 +2214,7 @@
 
 ;; Insn generated by block moves
 
-(define_insn "movmemsi_internal"
+(define_insn "cpymemsi_internal"
   [(set (mem:BLK (match_operand:SI 0 "register_operand" "r"))  ;; destination
(mem:BLK (match_operand:SI 1 "register_operand" "r")))  ;; source
(use (match_operand:SI 2 "m32r_block_immediate_operand" "J"));; # bytes to move
-- 
2.7.4



[PATCH 01/30] Changes to machine independent code

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* builtins.c (get_memory_rtx): Fix comment.
* optabs.def (movmem_optab): Change to cpymem_optab.
* expr.c (emit_block_move_via_cpymem): Change movmem to cpymem.
(emit_block_move_hints): Change movmem to cpymem.
* defaults.h: Change movmem to cpymem.
* targhooks.c (get_move_ratio): Change movmem to cpymem.
(default_use_by_pieces_infrastructure_p): Ditto.
---
 gcc/builtins.c  |  2 +-
 gcc/defaults.h  |  6 +++---
 gcc/expr.c  | 10 +-
 gcc/optabs.def  |  2 +-
 gcc/targhooks.c |  6 +++---
 5 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 4ecfd49..40afd03 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -1416,7 +1416,7 @@ expand_builtin_prefetch (tree exp)
 }
 
 /* Get a MEM rtx for expression EXP which is the address of an operand
-   to be used in a string instruction (cmpstrsi, movmemsi, ..).  LEN is
+   to be used in a string instruction (cmpstrsi, cpymemsi, ..).  LEN is
the maximum length of the block of memory that might be accessed or
NULL if unknown.  */
 
diff --git a/gcc/defaults.h b/gcc/defaults.h
index b753425..af7ea18 100644
--- a/gcc/defaults.h
+++ b/gcc/defaults.h
@@ -1318,10 +1318,10 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #endif
 
 /* If a memory-to-memory move would take MOVE_RATIO or more simple
-   move-instruction sequences, we will do a movmem or libcall instead.  */
+   move-instruction sequences, we will do a cpymem or libcall instead.  */
 
 #ifndef MOVE_RATIO
-#if defined (HAVE_movmemqi) || defined (HAVE_movmemhi) || defined (HAVE_movmemsi) || defined (HAVE_movmemdi) || defined (HAVE_movmemti)
+#if defined (HAVE_cpymemqi) || defined (HAVE_cpymemhi) || defined (HAVE_cpymemsi) || defined (HAVE_cpymemdi) || defined (HAVE_cpymemti)
 #define MOVE_RATIO(speed) 2
 #else
 /* If we are optimizing for space (-Os), cut down the default move ratio.  */
@@ -1342,7 +1342,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #endif
 
 /* If a memory set (to value other than zero) operation would take
-   SET_RATIO or more simple move-instruction sequences, we will do a movmem
+   SET_RATIO or more simple move-instruction sequences, we will do a setmem
or libcall instead.  */
 #ifndef SET_RATIO
 #define SET_RATIO(speed) MOVE_RATIO (speed)
diff --git a/gcc/expr.c b/gcc/expr.c
index c78bc74..4d39569 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -73,7 +73,7 @@ along with GCC; see the file COPYING3.  If not see
 int cse_not_expected;
 
 static bool block_move_libcall_safe_for_call_parm (void);
-static bool emit_block_move_via_movmem (rtx, rtx, rtx, unsigned, unsigned, HOST_WIDE_INT,
+static bool emit_block_move_via_cpymem (rtx, rtx, rtx, unsigned, unsigned, HOST_WIDE_INT,
unsigned HOST_WIDE_INT, unsigned HOST_WIDE_INT,
unsigned HOST_WIDE_INT);
 static void emit_block_move_via_loop (rtx, rtx, rtx, unsigned);
@@ -1624,7 +1624,7 @@ emit_block_move_hints (rtx x, rtx y, rtx size, enum block_op_methods method,
 
   if (CONST_INT_P (size) && can_move_by_pieces (INTVAL (size), align))
 move_by_pieces (x, y, INTVAL (size), align, RETURN_BEGIN);
-  else if (emit_block_move_via_movmem (x, y, size, align,
+  else if (emit_block_move_via_cpymem (x, y, size, align,
   expected_align, expected_size,
   min_size, max_size, probable_max_size))
 ;
@@ -1722,11 +1722,11 @@ block_move_libcall_safe_for_call_parm (void)
   return true;
 }
 
-/* A subroutine of emit_block_move.  Expand a movmem pattern;
+/* A subroutine of emit_block_move.  Expand a cpymem pattern;
return true if successful.  */
 
 static bool
-emit_block_move_via_movmem (rtx x, rtx y, rtx size, unsigned int align,
+emit_block_move_via_cpymem (rtx x, rtx y, rtx size, unsigned int align,
unsigned int expected_align, HOST_WIDE_INT expected_size,
unsigned HOST_WIDE_INT min_size,
unsigned HOST_WIDE_INT max_size,
@@ -1755,7 +1755,7 @@ emit_block_move_via_movmem (rtx x, rtx y, rtx size, unsigned int align,
   FOR_EACH_MODE_IN_CLASS (mode_iter, MODE_INT)
 {
   scalar_int_mode mode = mode_iter.require ();
-  enum insn_code code = direct_optab_handler (movmem_optab, mode);
+  enum insn_code code = direct_optab_handler (cpymem_optab, mode);
 
   if (code != CODE_FOR_nothing
  /* We don't need MODE to be narrower than BITS_PER_HOST_WIDE_INT
diff --git a/gcc/optabs.def b/gcc/optabs.def
index feee96f..03a08da 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -256,7 +256,7 @@ OPTAB_D (umul_highpart_optab, "umul$a3_highpart")
 OPTAB_D (cmpmem_optab, "cmpmem$a")
 OPTAB_D (cmpstr_optab, "cmpstr$a")
 OPTAB_D (cmpstrn_optab, "cmpstrn$a")
-OPTAB_D (movmem_optab, "movmem$a")
+OPTAB_D (cpymem_optab, "cpymem$a")
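
A note on why the spelling matters: these expansions implement memcpy-style block copies, where source and destination are assumed not to overlap, and the cpymem name makes that explicit, presumably leaving movmem free for a genuinely overlap-safe pattern later.  A small self-contained C example of the copy-versus-move distinction, independent of GCC internals:

  /* Why "copy" and "move" are different operations: memmove must behave
     as if it copied through a temporary, so overlapping regions are fine;
     memcpy (and therefore a cpymem expansion) may assume no overlap.  */
  #include <stdio.h>
  #include <string.h>

  int
  main (void)
  {
    char buf[16] = "abcdefgh";

    /* Overlapping regions: well defined for memmove.  */
    memmove (buf + 2, buf, 6);
    printf ("%s\n", buf);         /* prints "ababcdef" */

    /* memcpy (buf + 2, buf, 6) would be undefined behavior here, which is
       exactly the freedom a cpymem pattern is allowed to exploit.  */
    return 0;
  }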

[PATCH 13/30] Changes to lm32

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/lm32/lm32.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/lm32/lm32.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/lm32/lm32.md b/gcc/config/lm32/lm32.md
index c09052c..91a5fe1 100644
--- a/gcc/config/lm32/lm32.md
+++ b/gcc/config/lm32/lm32.md
@@ -216,7 +216,7 @@
 }
 }")
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "general_operand" "")
   (match_operand:BLK 1 "general_operand" ""))
  (use (match_operand:SI 2 "" ""))
-- 
2.7.4



[PATCH 14/30] Changes to m32c

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/m32c/blkmov.md (movmemhi, movmemhi_bhi_op, movmemhi_bpsi_op,
movmemhi_whi_op, movmemhi_wpsi_op): Change movmem to cpymem.
* config/m32c/m32c-protos.h: Change movmem to cpymem.
* config/m32c/m32c.c (m32c_expand_movmemhi): Change movmem to cpymem.
---
 gcc/config/m32c/blkmov.md | 12 ++--
 gcc/config/m32c/m32c-protos.h |  2 +-
 gcc/config/m32c/m32c.c| 10 +-
 3 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/config/m32c/blkmov.md b/gcc/config/m32c/blkmov.md
index d7da439..e5cdc80 100644
--- a/gcc/config/m32c/blkmov.md
+++ b/gcc/config/m32c/blkmov.md
@@ -40,14 +40,14 @@
 ;; 1 = source (mem:BLK ...)
 ;; 2 = count
 ;; 3 = alignment
-(define_expand "movmemhi"
+(define_expand "cpymemhi"
   [(match_operand 0 "ap_operand" "")
(match_operand 1 "ap_operand" "")
(match_operand 2 "m32c_r3_operand" "")
(match_operand 3 "" "")
]
   ""
-  "if (m32c_expand_movmemhi(operands)) DONE; FAIL;"
+  "if (m32c_expand_cpymemhi(operands)) DONE; FAIL;"
   )
 
 ;; We can't use mode iterators for these because M16C uses r1h to extend
@@ -60,7 +60,7 @@
 ;; 3 = dest (in)
 ;; 4 = src (in)
 ;; 5 = count (in)
-(define_insn "movmemhi_bhi_op"
+(define_insn "cpymemhi_bhi_op"
   [(set (mem:QI (match_operand:HI 3 "ap_operand" "0"))
(mem:QI (match_operand:HI 4 "ap_operand" "1")))
(set (match_operand:HI 2 "m32c_r3_operand" "=R3w")
@@ -75,7 +75,7 @@
   "TARGET_A16"
   "mov.b:q\t#0,r1h\n\tsmovf.b\t; %0[0..%2-1]=r1h%1[]"
   )
-(define_insn "movmemhi_bpsi_op"
+(define_insn "cpymemhi_bpsi_op"
   [(set (mem:QI (match_operand:PSI 3 "ap_operand" "0"))
(mem:QI (match_operand:PSI 4 "ap_operand" "1")))
(set (match_operand:HI 2 "m32c_r3_operand" "=R3w")
@@ -89,7 +89,7 @@
   "TARGET_A24"
   "smovf.b\t; %0[0..%2-1]=%1[]"
   )
-(define_insn "movmemhi_whi_op"
+(define_insn "cpymemhi_whi_op"
   [(set (mem:HI (match_operand:HI 3 "ap_operand" "0"))
(mem:HI (match_operand:HI 4 "ap_operand" "1")))
(set (match_operand:HI 2 "m32c_r3_operand" "=R3w")
@@ -104,7 +104,7 @@
   "TARGET_A16"
   "mov.b:q\t#0,r1h\n\tsmovf.w\t; %0[0..%2-1]=r1h%1[]"
   )
-(define_insn "movmemhi_wpsi_op"
+(define_insn "cpymemhi_wpsi_op"
   [(set (mem:HI (match_operand:PSI 3 "ap_operand" "0"))
(mem:HI (match_operand:PSI 4 "ap_operand" "1")))
(set (match_operand:HI 2 "m32c_r3_operand" "=R3w")
diff --git a/gcc/config/m32c/m32c-protos.h b/gcc/config/m32c/m32c-protos.h
index 7d4d478..fe926fd 100644
--- a/gcc/config/m32c/m32c-protos.h
+++ b/gcc/config/m32c/m32c-protos.h
@@ -43,7 +43,7 @@ void m32c_emit_eh_epilogue (rtx);
 int  m32c_expand_cmpstr (rtx *);
 int  m32c_expand_insv (rtx *);
 int  m32c_expand_movcc (rtx *);
-int  m32c_expand_movmemhi (rtx *);
+int  m32c_expand_cpymemhi (rtx *);
 int  m32c_expand_movstr (rtx *);
 void m32c_expand_neg_mulpsi3 (rtx *);
 int  m32c_expand_setmemhi (rtx *);
diff --git a/gcc/config/m32c/m32c.c b/gcc/config/m32c/m32c.c
index 1a0d0c68..d0d24bb 100644
--- a/gcc/config/m32c/m32c.c
+++ b/gcc/config/m32c/m32c.c
@@ -3592,7 +3592,7 @@ m32c_expand_setmemhi(rtx *operands)
addresses, not [mem] syntax.  $0 is the destination (MEM:BLK), $1
is the source (MEM:BLK), and $2 the count (HI).  */
 int
-m32c_expand_movmemhi(rtx *operands)
+m32c_expand_cpymemhi(rtx *operands)
 {
   rtx desta, srca, count;
   rtx desto, srco, counto;
@@ -3620,9 +3620,9 @@ m32c_expand_movmemhi(rtx *operands)
 {
   count = copy_to_mode_reg (HImode, GEN_INT (INTVAL (count) / 2));
   if (TARGET_A16)
-   emit_insn (gen_movmemhi_whi_op (desto, srco, counto, desta, srca, count));
+   emit_insn (gen_cpymemhi_whi_op (desto, srco, counto, desta, srca, count));
   else
-   emit_insn (gen_movmemhi_wpsi_op (desto, srco, counto, desta, srca, count));
+   emit_insn (gen_cpymemhi_wpsi_op (desto, srco, counto, desta, srca, count));
   return 1;
 }
 
@@ -3632,9 +3632,9 @@ m32c_expand_movmemhi(rtx *operands)
 count = copy_to_mode_reg (HImode, count);
 
   if (TARGET_A16)
-emit_insn (gen_movmemhi_bhi_op (desto, srco, counto, desta, srca, count));
+emit_insn (gen_cpymemhi_bhi_op (desto, srco, counto, desta, srca, count));
   else
-emit_insn (gen_movmemhi_bpsi_op (desto, srco, counto, desta, srca, count));
+emit_insn (gen_cpymemhi_bpsi_op (desto, srco, counto, desta, srca, count));
 
   return 1;
 }
-- 
2.7.4



[PATCH 12/30] Changes to i386

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/i386/i386-expand.c (expand_set_or_movmem_via_loop,
expand_set_or_movmem_via_rep, expand_movmem_epilogue,
expand_setmem_epilogue_via_loop, expand_set_or_cpymem_prologue,
expand_small_cpymem_or_setmem,
expand_set_or_cpymem_prologue_epilogue_by_misaligned_moves,
expand_set_or_cpymem_constant_prologue,
ix86_expand_set_or_cpymem): Change movmem to cpymem.
* config/i386/i386-protos.h: Change movmem to cpymem.
* config/i386/i386.h: Change movmem to cpymem in comment.
* config/i386/i386.md (movmem): Change name to cpymem.
(setmem): Change expansion function name.
---
 gcc/config/i386/i386-expand.c | 36 ++--
 gcc/config/i386/i386-protos.h |  2 +-
 gcc/config/i386/i386.h|  2 +-
 gcc/config/i386/i386.md   |  6 +++---
 4 files changed, 23 insertions(+), 23 deletions(-)

diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 72be1df..ae1fe2a9 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -5801,7 +5801,7 @@ counter_mode (rtx count_exp)
 
 
 static void
-expand_set_or_movmem_via_loop (rtx destmem, rtx srcmem,
+expand_set_or_cpymem_via_loop (rtx destmem, rtx srcmem,
   rtx destptr, rtx srcptr, rtx value,
   rtx count, machine_mode mode, int unroll,
   int expected_size, bool issetmem)
@@ -5954,7 +5954,7 @@ scale_counter (rtx countreg, int scale)
Other arguments have same meaning as for previous function.  */
 
 static void
-expand_set_or_movmem_via_rep (rtx destmem, rtx srcmem,
+expand_set_or_cpymem_via_rep (rtx destmem, rtx srcmem,
   rtx destptr, rtx srcptr, rtx value, rtx orig_value,
   rtx count,
   machine_mode mode, bool issetmem)
@@ -6121,7 +6121,7 @@ ix86_expand_aligntest (rtx variable, int value, bool epilogue)
/* Output code to copy at most count & (max_size - 1) bytes from SRC to DEST.  */
 
 static void
-expand_movmem_epilogue (rtx destmem, rtx srcmem,
+expand_cpymem_epilogue (rtx destmem, rtx srcmem,
rtx destptr, rtx srcptr, rtx count, int max_size)
 {
   rtx src, dest;
@@ -6146,7 +6146,7 @@ expand_movmem_epilogue (rtx destmem, rtx srcmem,
 {
  count = expand_simple_binop (GET_MODE (count), AND, count, GEN_INT (max_size - 1),
count, 1, OPTAB_DIRECT);
-  expand_set_or_movmem_via_loop (destmem, srcmem, destptr, srcptr, NULL,
+  expand_set_or_cpymem_via_loop (destmem, srcmem, destptr, srcptr, NULL,
 count, QImode, 1, 4, false);
   return;
 }
@@ -6295,7 +6295,7 @@ expand_setmem_epilogue_via_loop (rtx destmem, rtx destptr, rtx value,
 {
   count = expand_simple_binop (counter_mode (count), AND, count,
   GEN_INT (max_size - 1), count, 1, OPTAB_DIRECT);
-  expand_set_or_movmem_via_loop (destmem, NULL, destptr, NULL,
+  expand_set_or_cpymem_via_loop (destmem, NULL, destptr, NULL,
 gen_lowpart (QImode, value), count, QImode,
 1, max_size / 2, true);
 }
@@ -6416,7 +6416,7 @@ ix86_adjust_counter (rtx countreg, HOST_WIDE_INT value)
Return value is updated DESTMEM.  */
 
 static rtx
-expand_set_or_movmem_prologue (rtx destmem, rtx srcmem,
+expand_set_or_cpymem_prologue (rtx destmem, rtx srcmem,
  rtx destptr, rtx srcptr, rtx value,
  rtx vec_value, rtx count, int align,
  int desired_alignment, bool issetmem)
@@ -6449,7 +6449,7 @@ expand_set_or_movmem_prologue (rtx destmem, rtx srcmem,
or setmem sequence that is valid for SIZE..2*SIZE-1 bytes
and jump to DONE_LABEL.  */
 static void
-expand_small_movmem_or_setmem (rtx destmem, rtx srcmem,
+expand_small_cpymem_or_setmem (rtx destmem, rtx srcmem,
   rtx destptr, rtx srcptr,
   rtx value, rtx vec_value,
   rtx count, int size,
@@ -6575,7 +6575,7 @@ expand_small_movmem_or_setmem (rtx destmem, rtx srcmem,
done_label:
   */
 static void
-expand_set_or_movmem_prologue_epilogue_by_misaligned_moves (rtx destmem, rtx srcmem,
+expand_set_or_cpymem_prologue_epilogue_by_misaligned_moves (rtx destmem, rtx srcmem,
rtx *destptr, rtx *srcptr,
machine_mode mode,
rtx value, rtx vec_value,
@@ -6616,7 +6616,7 @@ expand_set_or_movmem_prologue_epilogue_by_misaligned_moves (rtx destmem, rtx src
 
   /* Handle sizes > 3.  */
   for (;size2 > 2; size2 >>= 1)
-   expand_small_movmem_or_setmem (destmem, srcmem,
+   expand_small_cpymem_or_setmem (destmem, srcmem,
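
The prologue/epilogue helpers renamed above follow the usual split for unknown or misaligned sizes: copy a few bytes until the destination reaches the desired alignment, move the bulk with wide accesses, then handle the count & (max_size - 1) tail.  A self-contained sketch of that structure follows; it is hypothetical and not the i386 implementation.

  /* Hypothetical prologue/body/epilogue copy: align the destination,
     move the bulk in 16-byte chunks, then mop up the remainder.  */
  #include <stdint.h>
  #include <stddef.h>
  #include <string.h>

  static void
  copy_prologue_body_epilogue (uint8_t *dst, const uint8_t *src, size_t count)
  {
    /* Prologue: byte copies until dst is 16-byte aligned.  */
    while (count && ((uintptr_t) dst & 15))
      {
        *dst++ = *src++;
        count--;
      }

    /* Body: aligned 16-byte chunks, standing in for vector moves.  */
    for (; count >= 16; dst += 16, src += 16, count -= 16)
      memcpy (dst, src, 16);

    /* Epilogue: the remaining count & 15 bytes.  */
    while (count--)
      *dst++ = *src++;
  }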

[PATCH 10/30] Changes to ft32

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/ft32/ft32.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/ft32/ft32.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/ft32/ft32.md b/gcc/config/ft32/ft32.md
index de23946..9e31f2c 100644
--- a/gcc/config/ft32/ft32.md
+++ b/gcc/config/ft32/ft32.md
@@ -851,7 +851,7 @@
 "stpcpy %b1,%b2 # %0 %b1 %b2"
 )
 
-(define_insn "movmemsi"
+(define_insn "cpymemsi"
   [(set (match_operand:BLK 0 "memory_operand" "=W,BW")
 (match_operand:BLK 1 "memory_operand" "W,BW"))
 (use (match_operand:SI 2 "ft32_imm_operand" "KA,KA"))
-- 
2.7.4



[PATCH 11/30] Changes to h8300

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/h8300/h8300.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/h8300/h8300.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/h8300/h8300.md b/gcc/config/h8300/h8300.md
index eb0ae83..42610fd 100644
--- a/gcc/config/h8300/h8300.md
+++ b/gcc/config/h8300/h8300.md
@@ -474,11 +474,11 @@
(set_attr "length_table" "*,movl")
(set_attr "cc" "set_zn,set_znv")])
 
-;; Implement block moves using movmd.  Defining movmemsi allows the full
+;; Implement block copies using movmd.  Defining cpymemsi allows the full
 ;; range of constant lengths (up to 0x4 bytes when using movmd.l).
 ;; See h8sx_emit_movmd for details.
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(use (match_operand:BLK 0 "memory_operand" ""))
(use (match_operand:BLK 1 "memory_operand" ""))
(use (match_operand:SI 2 "" ""))
-- 
2.7.4



[PATCH 09/30] Changes to frv

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/frv/frv.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/frv/frv.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/frv/frv.md b/gcc/config/frv/frv.md
index 064bf53..6e8db59 100644
--- a/gcc/config/frv/frv.md
+++ b/gcc/config/frv/frv.md
@@ -1887,7 +1887,7 @@
 ;; Argument 2 is the length
 ;; Argument 3 is the alignment
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(parallel [(set (match_operand:BLK 0 "" "")
   (match_operand:BLK 1 "" ""))
  (use (match_operand:SI 2 "" ""))
-- 
2.7.4



[PATCH 07/30] Changes to bfin

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/bfin/bfin-protos.h: Change movmem to cpymem.
* config/bfin/bfin.c (single_move_for_movmem, bfin_expand_movmem):
Change movmem to cpymem.
* config/bfin/bfin.h: Change movmem to cpymem in comment.
* config/bfin/bfin.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/bfin/bfin-protos.h |  2 +-
 gcc/config/bfin/bfin.c| 12 ++--
 gcc/config/bfin/bfin.h|  2 +-
 gcc/config/bfin/bfin.md   |  4 ++--
 4 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/gcc/config/bfin/bfin-protos.h b/gcc/config/bfin/bfin-protos.h
index 64a1842..7d0f705 100644
--- a/gcc/config/bfin/bfin-protos.h
+++ b/gcc/config/bfin/bfin-protos.h
@@ -81,7 +81,7 @@ extern bool expand_move (rtx *, machine_mode);
 extern void bfin_expand_call (rtx, rtx, rtx, rtx, int);
 extern bool bfin_longcall_p (rtx, int);
 extern bool bfin_dsp_memref_p (rtx);
-extern bool bfin_expand_movmem (rtx, rtx, rtx, rtx);
+extern bool bfin_expand_cpymem (rtx, rtx, rtx, rtx);
 
 extern enum reg_class secondary_input_reload_class (enum reg_class,
machine_mode,
diff --git a/gcc/config/bfin/bfin.c b/gcc/config/bfin/bfin.c
index e520115..319d7e2 100644
--- a/gcc/config/bfin/bfin.c
+++ b/gcc/config/bfin/bfin.c
@@ -3208,7 +3208,7 @@ output_pop_multiple (rtx insn, rtx *operands)
 /* Adjust DST and SRC by OFFSET bytes, and generate one move in mode MODE.  */
 
 static void
-single_move_for_movmem (rtx dst, rtx src, machine_mode mode, HOST_WIDE_INT offset)
+single_move_for_cpymem (rtx dst, rtx src, machine_mode mode, HOST_WIDE_INT offset)
 {
   rtx scratch = gen_reg_rtx (mode);
   rtx srcmem, dstmem;
@@ -3224,7 +3224,7 @@ single_move_for_movmem (rtx dst, rtx src, machine_mode mode, HOST_WIDE_INT offse
back on a different method.  */
 
 bool
-bfin_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp)
+bfin_expand_cpymem (rtx dst, rtx src, rtx count_exp, rtx align_exp)
 {
   rtx srcreg, destreg, countreg;
   HOST_WIDE_INT align = 0;
@@ -3269,7 +3269,7 @@ bfin_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp)
{
  if ((count & ~3) == 4)
{
- single_move_for_movmem (dst, src, SImode, offset);
+ single_move_for_cpymem (dst, src, SImode, offset);
  offset = 4;
}
  else if (count & ~3)
@@ -3282,7 +3282,7 @@ bfin_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp)
}
  if (count & 2)
{
- single_move_for_movmem (dst, src, HImode, offset);
+ single_move_for_cpymem (dst, src, HImode, offset);
  offset += 2;
}
}
@@ -3290,7 +3290,7 @@ bfin_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp)
{
  if ((count & ~1) == 2)
{
- single_move_for_movmem (dst, src, HImode, offset);
+ single_move_for_cpymem (dst, src, HImode, offset);
  offset = 2;
}
  else if (count & ~1)
@@ -3304,7 +3304,7 @@ bfin_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp)
}
   if (count & 1)
{
- single_move_for_movmem (dst, src, QImode, offset);
+ single_move_for_cpymem (dst, src, QImode, offset);
}
   return true;
 }
diff --git a/gcc/config/bfin/bfin.h b/gcc/config/bfin/bfin.h
index 19b7f81..4aba596 100644
--- a/gcc/config/bfin/bfin.h
+++ b/gcc/config/bfin/bfin.h
@@ -793,7 +793,7 @@ typedef struct {
 #define MOVE_MAX UNITS_PER_WORD
 
 /* If a memory-to-memory move would take MOVE_RATIO or more simple
-   move-instruction pairs, we will do a movmem or libcall instead.  */
+   move-instruction pairs, we will do a cpymem or libcall instead.  */
 
 #define MOVE_RATIO(speed) 5
 
diff --git a/gcc/config/bfin/bfin.md b/gcc/config/bfin/bfin.md
index ac58924..6ac208d 100644
--- a/gcc/config/bfin/bfin.md
+++ b/gcc/config/bfin/bfin.md
@@ -2316,14 +2316,14 @@
(set_attr "length" "16")
(set_attr "seq_insns" "multi")])
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(match_operand:BLK 0 "general_operand" "")
(match_operand:BLK 1 "general_operand" "")
(match_operand:SI 2 "const_int_operand" "")
(match_operand:SI 3 "const_int_operand" "")]
   ""
 {
-  if (bfin_expand_movmem (operands[0], operands[1], operands[2], operands[3]))
+  if (bfin_expand_cpymem (operands[0], operands[1], operands[2], operands[3]))
 DONE;
   FAIL;
 })
-- 
2.7.4
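
The cpymemsi expander above keeps the usual contract: when bfin_expand_cpymem returns true the expansion is DONE, otherwise FAIL makes the middle end fall back to its generic block-move handling (a library call or a by-pieces copy).  A standalone C analogy of that try-then-fall-back shape, not GCC code:

  /* Standalone analogy for the DONE/FAIL contract: a target-specific
     routine may decline, and the caller then uses the generic path.  */
  #include <stdbool.h>
  #include <stddef.h>
  #include <string.h>

  /* Hypothetical fast path that only handles small copies.  */
  static bool
  try_fast_copy (void *dst, const void *src, size_t n)
  {
    if (n > 32)
      return false;               /* like FAIL: let the caller handle it */
    memcpy (dst, src, n);         /* like DONE: expansion succeeded */
    return true;
  }

  static void
  copy_block (void *dst, const void *src, size_t n)
  {
    if (!try_fast_copy (dst, src, n))
      memcpy (dst, src, n);       /* generic fallback */
  }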



[PATCH 08/30] Changes to c6x

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/c6x/c6x-protos.h: Change movmem to cpymem.
* config/c6x/c6x.c (c6x_expand_movmem): Change movmem to cpymem.
* config/c6x/c6x.md (movmemsi): Change name to cpymemsi.
---
 gcc/config/c6x/c6x-protos.h | 2 +-
 gcc/config/c6x/c6x.c| 4 ++--
 gcc/config/c6x/c6x.md   | 4 ++--
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/config/c6x/c6x-protos.h b/gcc/config/c6x/c6x-protos.h
index a657969..8c04c31 100644
--- a/gcc/config/c6x/c6x-protos.h
+++ b/gcc/config/c6x/c6x-protos.h
@@ -35,7 +35,7 @@ extern bool c6x_long_call_p (rtx);
 extern void c6x_expand_call (rtx, rtx, bool);
 extern rtx c6x_expand_compare (rtx, machine_mode);
 extern bool c6x_force_op_for_comparison_p (enum rtx_code, rtx);
-extern bool c6x_expand_movmem (rtx, rtx, rtx, rtx, rtx, rtx);
+extern bool c6x_expand_cpymem (rtx, rtx, rtx, rtx, rtx, rtx);
 
 extern rtx c6x_subword (rtx, bool);
 extern void split_di (rtx *, int, rtx *, rtx *);
diff --git a/gcc/config/c6x/c6x.c b/gcc/config/c6x/c6x.c
index 742c54b..93841e4 100644
--- a/gcc/config/c6x/c6x.c
+++ b/gcc/config/c6x/c6x.c
@@ -1686,10 +1686,10 @@ c6x_valid_mask_p (HOST_WIDE_INT val)
   return true;
 }
 
-/* Expand a block move for a movmemM pattern.  */
+/* Expand a block move for a cpymemM pattern.  */
 
 bool
-c6x_expand_movmem (rtx dst, rtx src, rtx count_exp, rtx align_exp,
+c6x_expand_cpymem (rtx dst, rtx src, rtx count_exp, rtx align_exp,
   rtx expected_align_exp ATTRIBUTE_UNUSED,
   rtx expected_size_exp ATTRIBUTE_UNUSED)
 {
diff --git a/gcc/config/c6x/c6x.md b/gcc/config/c6x/c6x.md
index 8218e1d..f9bf9ba 100644
--- a/gcc/config/c6x/c6x.md
+++ b/gcc/config/c6x/c6x.md
@@ -2844,7 +2844,7 @@
 ;; Block moves
 ;; -
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(use (match_operand:BLK 0 "memory_operand" ""))
(use (match_operand:BLK 1 "memory_operand" ""))
(use (match_operand:SI 2 "nonmemory_operand" ""))
@@ -2853,7 +2853,7 @@
(use (match_operand:SI 5 "const_int_operand" ""))]
   ""
 {
- if (c6x_expand_movmem (operands[0], operands[1], operands[2], operands[3],
+ if (c6x_expand_cpymem (operands[0], operands[1], operands[2], operands[3],
operands[4], operands[5]))
DONE;
  else
-- 
2.7.4



[PATCH 06/30] Changes to avr

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/avr/avr-protos.h: Change movmem to cpymem.
* config/avr/avr.c (avr_adjust_insn_length, avr_emit_movmemhi,
avr_out_movmem): Change movmem to cpymem.
* config/avr/avr.md (movmemhi, movmem_, movmemx_):
Change movmem to cpymem.
---
 gcc/config/avr/avr-protos.h |  4 ++--
 gcc/config/avr/avr.c| 14 +++---
 gcc/config/avr/avr.md   | 32 
 3 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h
index dd0babb..31fe3a6 100644
--- a/gcc/config/avr/avr-protos.h
+++ b/gcc/config/avr/avr-protos.h
@@ -82,7 +82,7 @@ extern rtx avr_to_int_mode (rtx);
 
 extern void avr_expand_prologue (void);
 extern void avr_expand_epilogue (bool);
-extern bool avr_emit_movmemhi (rtx*);
+extern bool avr_emit_cpymemhi (rtx*);
 extern int avr_epilogue_uses (int regno);
 
 extern void avr_output_addr_vec (rtx_insn*, rtx);
@@ -92,7 +92,7 @@ extern const char* avr_out_plus (rtx, rtx*, int* =NULL, int* =NULL, bool =true);
 extern const char* avr_out_round (rtx_insn *, rtx*, int* =NULL);
 extern const char* avr_out_addto_sp (rtx*, int*);
 extern const char* avr_out_xload (rtx_insn *, rtx*, int*);
-extern const char* avr_out_movmem (rtx_insn *, rtx*, int*);
+extern const char* avr_out_cpymem (rtx_insn *, rtx*, int*);
 extern const char* avr_out_insert_bits (rtx*, int*);
 extern bool avr_popcount_each_byte (rtx, int, int);
 extern bool avr_has_nibble_0xf (rtx);
diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c
index 873a9da..b97faaf 100644
--- a/gcc/config/avr/avr.c
+++ b/gcc/config/avr/avr.c
@@ -9404,7 +9404,7 @@ avr_adjust_insn_length (rtx_insn *insn, int len)
 case ADJUST_LEN_MOV16: output_movhi (insn, op, ); break;
 case ADJUST_LEN_MOV24: avr_out_movpsi (insn, op, ); break;
 case ADJUST_LEN_MOV32: output_movsisf (insn, op, ); break;
-case ADJUST_LEN_MOVMEM: avr_out_movmem (insn, op, ); break;
+case ADJUST_LEN_CPYMEM: avr_out_cpymem (insn, op, ); break;
 case ADJUST_LEN_XLOAD: avr_out_xload (insn, op, ); break;
 case ADJUST_LEN_SEXT: avr_out_sign_extend (insn, op, ); break;
 
@@ -13321,7 +13321,7 @@ avr_emit3_fix_outputs (rtx (*gen)(rtx,rtx,rtx), rtx *op,
 }
 
 
-/* Worker function for movmemhi expander.
+/* Worker function for cpymemhi expander.
XOP[0]  Destination as MEM:BLK
XOP[1]  Source  " "
XOP[2]  # Bytes to copy
@@ -13330,7 +13330,7 @@ avr_emit3_fix_outputs (rtx (*gen)(rtx,rtx,rtx), rtx *op,
Return FALSE if the operand compination is not supported.  */
 
 bool
-avr_emit_movmemhi (rtx *xop)
+avr_emit_cpymemhi (rtx *xop)
 {
   HOST_WIDE_INT count;
   machine_mode loop_mode;
@@ -13407,14 +13407,14 @@ avr_emit_movmemhi (rtx *xop)
  Do the copy-loop inline.  */
 
   rtx (*fun) (rtx, rtx, rtx)
-= QImode == loop_mode ? gen_movmem_qi : gen_movmem_hi;
+= QImode == loop_mode ? gen_cpymem_qi : gen_cpymem_hi;
 
   insn = fun (xas, loop_reg, loop_reg);
 }
   else
 {
   rtx (*fun) (rtx, rtx)
-= QImode == loop_mode ? gen_movmemx_qi : gen_movmemx_hi;
+= QImode == loop_mode ? gen_cpymemx_qi : gen_cpymemx_hi;
 
   emit_move_insn (gen_rtx_REG (QImode, 23), a_hi8);
 
@@ -13428,7 +13428,7 @@ avr_emit_movmemhi (rtx *xop)
 }
 
 
-/* Print assembler for movmem_qi, movmem_hi insns...
+/* Print assembler for cpymem_qi, cpymem_hi insns...
$0 : Address Space
$1, $2 : Loop register
Z  : Source address
@@ -13436,7 +13436,7 @@ avr_emit_movmemhi (rtx *xop)
 */
 
 const char*
-avr_out_movmem (rtx_insn *insn ATTRIBUTE_UNUSED, rtx *op, int *plen)
+avr_out_cpymem (rtx_insn *insn ATTRIBUTE_UNUSED, rtx *op, int *plen)
 {
   addr_space_t as = (addr_space_t) INTVAL (op[0]);
   machine_mode loop_mode = GET_MODE (op[1]);
diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md
index f263b69..e85bf49 100644
--- a/gcc/config/avr/avr.md
+++ b/gcc/config/avr/avr.md
@@ -70,7 +70,7 @@
 
 (define_c_enum "unspec"
   [UNSPEC_STRLEN
-   UNSPEC_MOVMEM
+   UNSPEC_CPYMEM
UNSPEC_INDEX_JMP
UNSPEC_FMUL
UNSPEC_FMULS
@@ -158,7 +158,7 @@
tsthi, tstpsi, tstsi, compare, compare64, call,
mov8, mov16, mov24, mov32, reload_in16, reload_in24, reload_in32,
ufract, sfract, round,
-   xload, movmem,
+   xload, cpymem,
ashlqi, ashrqi, lshrqi,
ashlhi, ashrhi, lshrhi,
ashlsi, ashrsi, lshrsi,
@@ -992,20 +992,20 @@
 ;;=
 ;; move string (like memcpy)
 
-(define_expand "movmemhi"
+(define_expand "cpymemhi"
   [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
(match_operand:BLK 1 "memory_operand" ""))
   (use (match_operand:HI 2 "const_int_operand" ""))
   (use (match_operand:HI 3 "const_int_operand" ""))])]
   ""
   {
-if (avr_emit_movmemhi (operands))
+if (avr_emit_cpymemhi (operands))
   DONE;
 
 

[PATCH 05/30] Changes to arm

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/arm/arm-protos.h: Change movmem to cpymem in names.
* config/arm/arm.c (arm_movmemqi_unaligned, arm_gen_movmemqi,
gen_movmem_ldrd_strd, thumb_expand_movmemqi): Change movmem to cpymem.
* config/arm/arm.md (movmemqi): Change movmem to cpymem.
---
 gcc/config/arm/arm-protos.h |  6 +++---
 gcc/config/arm/arm.c| 18 +-
 gcc/config/arm/arm.md   |  8 
 gcc/config/arm/thumb1.md|  4 ++--
 4 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 485bc68..bf2bf1c 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -126,8 +126,8 @@ extern bool offset_ok_for_ldrd_strd (HOST_WIDE_INT);
 extern bool operands_ok_ldrd_strd (rtx, rtx, rtx, HOST_WIDE_INT, bool, bool);
 extern bool gen_operands_ldrd_strd (rtx *, bool, bool, bool);
 extern bool valid_operands_ldrd_strd (rtx *, bool);
-extern int arm_gen_movmemqi (rtx *);
-extern bool gen_movmem_ldrd_strd (rtx *);
+extern int arm_gen_cpymemqi (rtx *);
+extern bool gen_cpymem_ldrd_strd (rtx *);
 extern machine_mode arm_select_cc_mode (RTX_CODE, rtx, rtx);
 extern machine_mode arm_select_dominance_cc_mode (rtx, rtx,
   HOST_WIDE_INT);
@@ -203,7 +203,7 @@ extern void thumb2_final_prescan_insn (rtx_insn *);
 extern const char *thumb_load_double_from_address (rtx *);
 extern const char *thumb_output_move_mem_multiple (int, rtx *);
 extern const char *thumb_call_via_reg (rtx);
-extern void thumb_expand_movmemqi (rtx *);
+extern void thumb_expand_cpymemqi (rtx *);
 extern rtx arm_return_addr (int, rtx);
 extern void thumb_reload_out_hi (rtx *);
 extern void thumb_set_return_address (rtx, rtx);
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e3e71ea..820502a 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -14385,7 +14385,7 @@ arm_block_move_unaligned_loop (rtx dest, rtx src, HOST_WIDE_INT length,
core type, optimize_size setting, etc.  */
 
 static int
-arm_movmemqi_unaligned (rtx *operands)
+arm_cpymemqi_unaligned (rtx *operands)
 {
   HOST_WIDE_INT length = INTVAL (operands[2]);
   
@@ -14422,7 +14422,7 @@ arm_movmemqi_unaligned (rtx *operands)
 }
 
 int
-arm_gen_movmemqi (rtx *operands)
+arm_gen_cpymemqi (rtx *operands)
 {
   HOST_WIDE_INT in_words_to_go, out_words_to_go, last_bytes;
   HOST_WIDE_INT srcoffset, dstoffset;
@@ -14436,7 +14436,7 @@ arm_gen_movmemqi (rtx *operands)
 return 0;
 
   if (unaligned_access && (INTVAL (operands[3]) & 3) != 0)
-return arm_movmemqi_unaligned (operands);
+return arm_cpymemqi_unaligned (operands);
 
   if (INTVAL (operands[3]) & 3)
 return 0;
@@ -14570,7 +14570,7 @@ arm_gen_movmemqi (rtx *operands)
   return 1;
 }
 
-/* Helper for gen_movmem_ldrd_strd. Increase the address of memory rtx
+/* Helper for gen_cpymem_ldrd_strd. Increase the address of memory rtx
 by mode size.  */
 inline static rtx
 next_consecutive_mem (rtx mem)
@@ -14585,7 +14585,7 @@ next_consecutive_mem (rtx mem)
 /* Copy using LDRD/STRD instructions whenever possible.
Returns true upon success. */
 bool
-gen_movmem_ldrd_strd (rtx *operands)
+gen_cpymem_ldrd_strd (rtx *operands)
 {
   unsigned HOST_WIDE_INT len;
   HOST_WIDE_INT align;
@@ -14629,7 +14629,7 @@ gen_movmem_ldrd_strd (rtx *operands)
 
   /* If we cannot generate any LDRD/STRD, try to generate LDM/STM.  */
   if (!(dst_aligned || src_aligned))
-return arm_gen_movmemqi (operands);
+return arm_gen_cpymemqi (operands);
 
   /* If the either src or dst is unaligned we'll be accessing it as pairs
  of unaligned SImode accesses.  Otherwise we can generate DImode
@@ -26395,7 +26395,7 @@ thumb_call_via_reg (rtx reg)
 
 /* Routines for generating rtl.  */
 void
-thumb_expand_movmemqi (rtx *operands)
+thumb_expand_cpymemqi (rtx *operands)
 {
   rtx out = copy_to_mode_reg (SImode, XEXP (operands[0], 0));
   rtx in  = copy_to_mode_reg (SImode, XEXP (operands[1], 0));
@@ -26404,13 +26404,13 @@ thumb_expand_movmemqi (rtx *operands)
 
   while (len >= 12)
 {
-  emit_insn (gen_movmem12b (out, in, out, in));
+  emit_insn (gen_cpymem12b (out, in, out, in));
   len -= 12;
 }
 
   if (len >= 8)
 {
-  emit_insn (gen_movmem8b (out, in, out, in));
+  emit_insn (gen_cpymem8b (out, in, out, in));
   len -= 8;
 }
 
diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index ae58217..a7fa410 100644
--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -7250,7 +7250,7 @@
 ;; We could let this apply for blocks of less than this, but it clobbers so
 ;; many registers that there is then probably a better way.
 
-(define_expand "movmemqi"
+(define_expand "cpymemqi"
   [(match_operand:BLK 0 "general_operand" "")
(match_operand:BLK 1 "general_operand" "")
(match_operand:SI 2 "const_int_operand" "")
@@ -7262,12 +7262,12 @@
   if (TARGET_LDRD && current_tune->prefer_ldrd_strd
   

[PATCH 04/30] Changes to arc

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/arc/arc-protos.h: Change movmem to cpymem.
* config/arc/arc.c (arc_expand_movmem): Change movmem to cpymem.
* config/arc/arc.h: Change movmem to cpymem in comment.
* config/arc/arc.md (movmemsi): Change movmem to cpymem.
---
 gcc/config/arc/arc-protos.h | 2 +-
 gcc/config/arc/arc.c| 6 +++---
 gcc/config/arc/arc.h| 2 +-
 gcc/config/arc/arc.md   | 4 ++--
 4 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index f501bc3..74e5247 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -35,7 +35,7 @@ extern void arc_final_prescan_insn (rtx_insn *, rtx *, int);
 extern const char *arc_output_libcall (const char *);
 extern int arc_output_addsi (rtx *operands, bool, bool);
 extern int arc_output_commutative_cond_exec (rtx *operands, bool);
-extern bool arc_expand_movmem (rtx *operands);
+extern bool arc_expand_cpymem (rtx *operands);
 extern bool prepare_move_operands (rtx *operands, machine_mode mode);
 extern void emit_shift (enum rtx_code, rtx, rtx, rtx);
 extern void arc_expand_atomic_op (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 89f69c79..23171d2 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -8883,7 +8883,7 @@ arc_output_commutative_cond_exec (rtx *operands, bool 
output_p)
   return 8;
 }
 
-/* Helper function of arc_expand_movmem.  ADDR points to a chunk of memory.
+/* Helper function of arc_expand_cpymem.  ADDR points to a chunk of memory.
Emit code and return an potentially modified address such that offsets
up to SIZE are can be added to yield a legitimate address.
if REUSE is set, ADDR is a register that may be modified.  */
@@ -8917,7 +8917,7 @@ force_offsettable (rtx addr, HOST_WIDE_INT size, bool 
reuse)
offset ranges.  Return true on success.  */
 
 bool
-arc_expand_movmem (rtx *operands)
+arc_expand_cpymem (rtx *operands)
 {
   rtx dst = operands[0];
   rtx src = operands[1];
@@ -10427,7 +10427,7 @@ arc_use_by_pieces_infrastructure_p (unsigned 
HOST_WIDE_INT size,
enum by_pieces_operation op,
bool speed_p)
 {
-  /* Let the movmem expander handle small block moves.  */
+  /* Let the cpymem expander handle small block moves.  */
   if (op == MOVE_BY_PIECES)
 return false;
 
diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index 80dead9..4a9dd07 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -1423,7 +1423,7 @@ do { \
in one reasonably fast instruction.  */
 #define MOVE_MAX 4
 
-/* Undo the effects of the movmem pattern presence on STORE_BY_PIECES_P .  */
+/* Undo the effects of the cpymem pattern presence on STORE_BY_PIECES_P .  */
 #define MOVE_RATIO(SPEED) ((SPEED) ? 15 : 3)
 
 /* Define this to be nonzero if shift instructions ignore all but the
diff --git a/gcc/config/arc/arc.md b/gcc/config/arc/arc.md
index 528e344..ba595dd 100644
--- a/gcc/config/arc/arc.md
+++ b/gcc/config/arc/arc.md
@@ -5122,13 +5122,13 @@ core_3, archs4x, archs4xd, archs4xd_slow"
(set_attr "type" "loop_end")
(set_attr "length" "4,20")])
 
-(define_expand "movmemsi"
+(define_expand "cpymemsi"
   [(match_operand:BLK 0 "" "")
(match_operand:BLK 1 "" "")
(match_operand:SI 2 "nonmemory_operand" "")
(match_operand 3 "immediate_operand" "")]
   ""
-  "if (arc_expand_movmem (operands)) DONE; else FAIL;")
+  "if (arc_expand_cpymem (operands)) DONE; else FAIL;")
 
 ;; Close http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35803 if this works
 ;; to the point that we can generate cmove instructions.
-- 
2.7.4



[PATCH 03/30] Changes for alpha

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/alpha/alpha.h: Change movmem to cpymem in comment.
* config/alpha/alpha.md (movmemqi, movmemdi, *movmemdi_1): Change
movmem to cpymem.
---
 gcc/config/alpha/alpha.h  | 2 +-
 gcc/config/alpha/alpha.md | 6 +++---
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/alpha/alpha.h b/gcc/config/alpha/alpha.h
index e200820..68eafe1 100644
--- a/gcc/config/alpha/alpha.h
+++ b/gcc/config/alpha/alpha.h
@@ -759,7 +759,7 @@ do {
 \
 #define MOVE_MAX 8
 
 /* If a memory-to-memory move would take MOVE_RATIO or more simple
-   move-instruction pairs, we will do a movmem or libcall instead.
+   move-instruction pairs, we will do a cpymem or libcall instead.
 
Without byte/word accesses, we want no more than four instructions;
with, several single byte accesses are better.  */
diff --git a/gcc/config/alpha/alpha.md b/gcc/config/alpha/alpha.md
index d15295d..b195eb9 100644
--- a/gcc/config/alpha/alpha.md
+++ b/gcc/config/alpha/alpha.md
@@ -4673,7 +4673,7 @@
 ;; Argument 2 is the length
 ;; Argument 3 is the alignment
 
-(define_expand "movmemqi"
+(define_expand "cpymemqi"
   [(parallel [(set (match_operand:BLK 0 "memory_operand")
   (match_operand:BLK 1 "memory_operand"))
  (use (match_operand:DI 2 "immediate_operand"))
@@ -4686,7 +4686,7 @@
 FAIL;
 })
 
-(define_expand "movmemdi"
+(define_expand "cpymemdi"
   [(parallel [(set (match_operand:BLK 0 "memory_operand")
   (match_operand:BLK 1 "memory_operand"))
  (use (match_operand:DI 2 "immediate_operand"))
@@ -4703,7 +4703,7 @@
   "TARGET_ABI_OPEN_VMS"
   "operands[4] = gen_rtx_SYMBOL_REF (Pmode, \"OTS$MOVE\");")
 
-(define_insn "*movmemdi_1"
+(define_insn "*cpymemdi_1"
   [(set (match_operand:BLK 0 "memory_operand" "=m,m")
(match_operand:BLK 1 "memory_operand" "m,m"))
(use (match_operand:DI 2 "nonmemory_operand" "r,i"))
-- 
2.7.4



[PATCH 02/30] Changes for aarch64

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

* config/aarch64/aarch64-protos.h: Change movmem to cpymem.
* config/aarch64/aarch64.c (aarch64_expand_movmem): Change movmem
to cpymem.
* config/aarch64/aarch64.h: Change movmem to cpymem.
* config/aarch64/aarch64.md (movmemdi): Change name to cpymemdi.
---
 gcc/config/aarch64/aarch64-protos.h | 4 ++--
 gcc/config/aarch64/aarch64.c| 4 ++--
 gcc/config/aarch64/aarch64.h| 2 +-
 gcc/config/aarch64/aarch64.md   | 6 +++---
 4 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
index 4b20796..e2f4cc1 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -424,12 +424,12 @@ bool aarch64_constant_address_p (rtx);
 bool aarch64_emit_approx_div (rtx, rtx, rtx);
 bool aarch64_emit_approx_sqrt (rtx, rtx, bool);
 void aarch64_expand_call (rtx, rtx, bool);
-bool aarch64_expand_movmem (rtx *);
+bool aarch64_expand_cpymem (rtx *);
 bool aarch64_float_const_zero_rtx_p (rtx);
 bool aarch64_float_const_rtx_p (rtx);
 bool aarch64_function_arg_regno_p (unsigned);
 bool aarch64_fusion_enabled_p (enum aarch64_fusion_pairs);
-bool aarch64_gen_movmemqi (rtx *);
+bool aarch64_gen_cpymemqi (rtx *);
 bool aarch64_gimple_fold_builtin (gimple_stmt_iterator *);
 bool aarch64_is_extend_from_extract (scalar_int_mode, rtx, rtx);
 bool aarch64_is_long_call_p (rtx);
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 285ae1c..5a923ca 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -17386,11 +17386,11 @@ aarch64_copy_one_block_and_progress_pointers (rtx 
*src, rtx *dst,
   *dst = aarch64_progress_pointer (*dst);
 }
 
-/* Expand movmem, as if from a __builtin_memcpy.  Return true if
+/* Expand cpymem, as if from a __builtin_memcpy.  Return true if
we succeed, otherwise return false.  */
 
 bool
-aarch64_expand_movmem (rtx *operands)
+aarch64_expand_cpymem (rtx *operands)
 {
   int n, mode_bits;
   rtx dst = operands[0];
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index bf06caa..92e38a8 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -855,7 +855,7 @@ typedef struct
 /* MOVE_RATIO dictates when we will use the move_by_pieces infrastructure.
move_by_pieces will continually copy the largest safe chunks.  So a
7-byte copy is a 4-byte + 2-byte + byte copy.  This proves inefficient
-   for both size and speed of copy, so we will instead use the "movmem"
+   for both size and speed of copy, so we will instead use the "cpymem"
standard name to implement the copy.  This logic does not apply when
targeting -mstrict-align, so keep a sensible default in that case.  */
 #define MOVE_RATIO(speed) \
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 91e46cf..7026b3a 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1375,17 +1375,17 @@
 
 ;; 0 is dst
 ;; 1 is src
-;; 2 is size of move in bytes
+;; 2 is size of copy in bytes
 ;; 3 is alignment
 
-(define_expand "movmemdi"
+(define_expand "cpymemdi"
   [(match_operand:BLK 0 "memory_operand")
(match_operand:BLK 1 "memory_operand")
(match_operand:DI 2 "immediate_operand")
(match_operand:DI 3 "immediate_operand")]
"!STRICT_ALIGNMENT"
 {
-  if (aarch64_expand_movmem (operands))
+  if (aarch64_expand_cpymem (operands))
 DONE;
   FAIL;
 }
-- 
2.7.4



[PATCH 00/30] Rename movmem pattern to cpymem

2019-06-25 Thread acsawdey
From: Aaron Sawdey 

As we discussed on gcc-list back in mid-May, this is the first set of patches
to unscramble things so we can have sensible inline expansion of both memcpy()
and memmove().

This patch renames the movmem optab entry and all uses of it to cpymem to 
reflect the fact that this pattern is used to expand memcpy() and expects
that the source and destination blocks do not overlap.
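
As a quick illustration (mine, not part of the patch series) of why that
no-overlap guarantee matters: overlapping copies are only well-defined for
memmove, so an expander written for memcpy semantics must not be reused to
expand memmove.

/* Illustration only (not from the patch series): memcpy requires that the
   source and destination do not overlap, while memmove must handle overlap,
   which is why a cpymem pattern cannot simply be reused for memmove.  */
#include <cstring>
#include <cstdio>

int
main ()
{
  char buf[8] = "abcdef";
  /* Overlapping copy: well-defined for memmove, undefined for memcpy.  */
  std::memmove (buf + 1, buf, 6);
  std::printf ("%s\n", buf);   /* prints "aabcdef" */
  return 0;
}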

I have split this out into the machine independent piece and the changes for
each target directory to make review easier. The patches will all need to be
committed 

I changed the pattern names in the .md files and also functions that used
"movmem" in the function name to keep everything consistent. I did not 
change function names like "*_block_move_*". 

A couple targets have support functions with the name movmem in libgcc
but I did not change any of that because I wasn't sure if that was going
to cause backward/forward compatibility issues.

Bootstrap/regtest passes on i386, x86_64, aarch64, and ppc64le.

Using Segher's cross compile build scripts I was able to build a
cross-compiler and use it to build the linux kernel on the following
targets:

alpha arc arm64 armhf arm csky h8300 i386 ia64 m68k microblaze mips64
nds32 openrisc parisc64 parisc powerpc64le powerpc64 riscv32 riscv64
s390 sh sparc64 sparc x86_64 xtensa

On these targets I was able to build a cross-compiler but could not
build the linux kernel because of unrelated errors or because linux
did not support that target:

avr blackfin c6x frv ft32 lm32 m32r mcore mips nios2 pdp11 powerpc rx
vax visium

OK for trunk?

Thanks!
Aaron



Aaron Sawdey (30):
  Changes to machine independent code
  Changes for aarch64
  Changes for alpha
  Changes to arc
  Changes to arm
  Changes to avr
  Changes to bfin
  Changes to c6x
  Changes to frv
  Changes to ft32
  Changes to h8300
  Changes to i386
  Changes to lm32
  Changes to m32c
  Changes to m32r
  Changes to mcore
  Changes to microblaze
  Changes to mips
  Changes to nds32
  Changes to pa
  Changes to pdp11
  Changes to riscv
  Changes to rs6000
  Changes to rx
  Changes to s390
  Changes to sh
  Changes to sparc
  Changes to vax
  Changes to visium
  Changes to xtensa

 gcc/builtins.c   |  2 +-
 gcc/config/aarch64/aarch64-protos.h  |  4 ++--
 gcc/config/aarch64/aarch64.c |  4 ++--
 gcc/config/aarch64/aarch64.h |  2 +-
 gcc/config/aarch64/aarch64.md|  6 ++---
 gcc/config/alpha/alpha.h |  2 +-
 gcc/config/alpha/alpha.md|  6 ++---
 gcc/config/arc/arc-protos.h  |  2 +-
 gcc/config/arc/arc.c |  6 ++---
 gcc/config/arc/arc.h |  2 +-
 gcc/config/arc/arc.md|  4 ++--
 gcc/config/arm/arm-protos.h  |  6 ++---
 gcc/config/arm/arm.c | 18 +++---
 gcc/config/arm/arm.md|  8 +++
 gcc/config/arm/thumb1.md |  4 ++--
 gcc/config/avr/avr-protos.h  |  4 ++--
 gcc/config/avr/avr.c | 14 +--
 gcc/config/avr/avr.md| 32 -
 gcc/config/bfin/bfin-protos.h|  2 +-
 gcc/config/bfin/bfin.c   | 12 +-
 gcc/config/bfin/bfin.h   |  2 +-
 gcc/config/bfin/bfin.md  |  4 ++--
 gcc/config/c6x/c6x-protos.h  |  2 +-
 gcc/config/c6x/c6x.c |  4 ++--
 gcc/config/c6x/c6x.md|  4 ++--
 gcc/config/frv/frv.md|  2 +-
 gcc/config/ft32/ft32.md  |  2 +-
 gcc/config/h8300/h8300.md|  4 ++--
 gcc/config/i386/i386-expand.c| 36 ++--
 gcc/config/i386/i386-protos.h|  2 +-
 gcc/config/i386/i386.h   |  2 +-
 gcc/config/i386/i386.md  |  6 ++---
 gcc/config/lm32/lm32.md  |  2 +-
 gcc/config/m32c/blkmov.md| 12 +-
 gcc/config/m32c/m32c-protos.h|  2 +-
 gcc/config/m32c/m32c.c   | 10 
 gcc/config/m32r/m32r.c   |  4 ++--
 gcc/config/m32r/m32r.md  |  4 ++--
 gcc/config/mcore/mcore.md|  2 +-
 gcc/config/microblaze/microblaze.c   |  2 +-
 gcc/config/microblaze/microblaze.md  |  2 +-
 gcc/config/mips/mips.c   | 10 
 gcc/config/mips/mips.h   | 10 
 gcc/config/mips/mips.md  |  2 +-
 gcc/config/nds32/nds32-memory-manipulation.c | 30 +++
 gcc/config/nds32/nds32-multiple.md   |  4 ++--
 gcc/config/nds32/nds32-protos.h  |  2 +-
 gcc/config/pa/pa.c   |  6 ++---
 

[Darwin, PPC, committed] Move the out of line register save/restore to an endfile.

2019-06-25 Thread Iain Sandoe
We have been including the out of line save/restore functions in libgcc,
which means that we have to append -lgcc even when using shared
libgcc.  In preparation for revision of libgcc for Darwin, split this into an
endfile.

tested on powerpc-darwin9,
applied to mainline
thanks
Iain

gcc/
2019-06-25  Iain Sandoe  

* config/rs6000/darwin.h (ENDFILE_SPEC): New.

libgcc/
2019-06-25  Iain Sandoe  

* config.host: Add libef_ppc.a to the extra files for powerpc-darwin.
* config/rs6000/t-darwin: (PPC_ENDFILE_SRC, PPC_ENDFILE_OBJS): New.
Build objects for the out of line save/restore register functions
so that they can be used for any supported Darwin version.
* config/t-darwin: Default the build Darwin version to Darwin8
(MacOS 10.4).


diff --git a/gcc/config/rs6000/darwin.h b/gcc/config/rs6000/darwin.h
index 705dd7f..fcc4354 100644
--- a/gcc/config/rs6000/darwin.h
+++ b/gcc/config/rs6000/darwin.h
@@ -132,6 +132,11 @@ extern int darwin_emit_picsym_stub;
 #define DARWIN_CRT2_SPEC \
   "%{!m64:%:version-compare(!> 10.4 mmacosx-version-min= crt2.o%s)}"
 
+/* The PPC regs save/restore functions are leaves and could, conceivably
+   be used by the tm destructor.  */
+#undef ENDFILE_SPEC
+#define ENDFILE_SPEC TM_DESTRUCTOR "-lef_ppc"
+
 #undef SUBTARGET_EXTRA_SPECS
 #define SUBTARGET_EXTRA_SPECS  \
   DARWIN_EXTRA_SPECS\
diff --git a/libgcc/ChangeLog b/libgcc/ChangeLog
index a589615..a6e8c20 100644

diff --git a/libgcc/config.host b/libgcc/config.host
index e6a834b..cf52b27 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1091,11 +1091,11 @@ powerpc-*-darwin*)
  ;;
esac
tmake_file="$tmake_file rs6000/t-ibm-ldouble"
-   extra_parts="$extra_parts crt2.o"
+   extra_parts="$extra_parts crt2.o libef_ppc.a"
;;
 powerpc64-*-darwin*)
tmake_file="$tmake_file rs6000/t-darwin64 rs6000/t-ibm-ldouble"
-   extra_parts="$extra_parts crt2.o"
+   extra_parts="$extra_parts crt2.o libef_ppc.a"
;;
 powerpc*-*-freebsd*)
tmake_file="${tmake_file} rs6000/t-ppccomm rs6000/t-savresfgpr 
rs6000/t-crtstuff rs6000/t-freebsd t-softfp-sfdf t-softfp-excl t-softfp"
diff --git a/libgcc/config/rs6000/t-darwin b/libgcc/config/rs6000/t-darwin
index 61da0bd..0c238b7 100644
--- a/libgcc/config/rs6000/t-darwin
+++ b/libgcc/config/rs6000/t-darwin
@@ -3,23 +3,49 @@ DARWIN_EXTRA_CRT_BUILD_CFLAGS = -mlongcall 
-mmacosx-version-min=10.4
 crt2.o: $(srcdir)/config/rs6000/darwin-crt2.c
$(crt_compile) $(DARWIN_EXTRA_CRT_BUILD_CFLAGS) -c $<
 
+# The outlined register save/restore functions need to run anywhere, and
+# they must be leaf functions suitable for use in an endfile.
+
+PPC_ENDFILE_SRC = \
+  $(srcdir)/config/rs6000/darwin-gpsave.S \
+  $(srcdir)/config/rs6000/darwin-fpsave.S \
+  $(srcdir)/config/rs6000/darwin-vecsave.S
+
+PPC_ENDFILE_OBJS = \
+ darwin-gpsave.o \
+ darwin-fpsave.o \
+ darwin-vecsave.o
+
+darwin-gpsave.o: $(srcdir)/config/rs6000/darwin-gpsave.S
+   $(crt_compile) -mmacosx-version-min=10.1 -c $<
+
+darwin-fpsave.o: $(srcdir)/config/rs6000/darwin-fpsave.S
+   $(crt_compile) -mmacosx-version-min=10.1 -c $<
+
+darwin-vecsave.o: $(srcdir)/config/rs6000/darwin-vecsave.S
+   $(crt_compile) -mmacosx-version-min=10.1 -c $<
+
+# We build these into a library, so that they are only linked as needed and not
+# forced into every object.
+
+libef_ppc.a: $(PPC_ENDFILE_OBJS)
+   $(AR_CREATE_FOR_TARGET) $@ $(PPC_ENDFILE_OBJS)
+   $(RANLIB_FOR_TARGET) $@
+
 LIB2ADD = $(srcdir)/config/rs6000/darwin-tramp.S \
  $(srcdir)/config/darwin-64.c \
- $(srcdir)/config/rs6000/darwin-fpsave.S  \
- $(srcdir)/config/rs6000/darwin-gpsave.S  \
  $(srcdir)/config/rs6000/darwin-world.S \
  $(srcdir)/config/rs6000/ppc64-fp.c
 
-LIB2ADD_ST = \
- $(srcdir)/config/rs6000/darwin-vecsave.S
-
 # The .S files above are designed to run on all processors, even though
 # they use AltiVec instructions.
 # -Wa is used because -force_cpusubtype_ALL doesn't work with -dynamiclib.
-# -mmacosx-version-min=10.4 is used to provide compatibility for code from
-# earlier OSX versions.
-HOST_LIBGCC2_CFLAGS += -Wa,-force_cpusubtype_ALL -mmacosx-version-min=10.4
 
+HOST_LIBGCC2_CFLAGS += -Wa,-force_cpusubtype_ALL
+
+# Although the default for 10.4 is G3, we need the unwinder to be built
+# with vector support so that the "save/rest_world" outlined functions are
+# correctly invoked.
 unwind-dw2_s.o: HOST_LIBGCC2_CFLAGS += -maltivec
 unwind-dw2.o: HOST_LIBGCC2_CFLAGS += -maltivec
 
diff --git a/libgcc/config/t-darwin b/libgcc/config/t-darwin
index 8340ea2..2fcb712 100644
--- a/libgcc/config/t-darwin
+++ b/libgcc/config/t-darwin
@@ -1,6 +1,6 @@
 # Set this as a minimum (unless overriden by arch t-files) since it's a
 # reasonable lowest common denominator that works for all our archs.
-HOST_LIBGCC2_CFLAGS += -mmacosx-version-min=10.5

Re: [PATCH] Fix missing else keyword seen with clang-static-analyzer:

2019-06-25 Thread Jeff Law
On 6/24/19 7:13 AM, Martin Liška wrote:
> 
> Hi.
> 
> The patch is fixing following clang-static-analyzer error:
> /home/marxin/Programming/gcc/gcc/bb-reorder.c:1031:2: warning: Value stored 
> to 'is_better_edge' is never read
> is_better_edge = true;
> ^
> /home/marxin/Programming/gcc/gcc/bb-reorder.c:1034:2: warning: Value stored 
> to 'is_better_edge' is never read
> is_better_edge = false;
> ^~
> 
> It seems to me a missing else branch.
> Honza?
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-06-24  Martin Liska  
> 
>   * bb-reorder.c (connect_better_edge_p): Add missing else
>   statement in the middle of if-else statements.
Seems reasonable.  Essentially we'll be using counts the vast majority
of the time to determine which is the better edge -- which roughly
matches the comments in the code.
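
For readers without the source at hand, here is a minimal, self-contained
sketch (names and the tie-breaking heuristic are invented; this is not the
actual connect_better_edge_p code) of the control-flow shape involved:

/* Sketch only.  The fix turns the later comparison into an "else if";
   without that "else", the earlier stores to is_better_edge were always
   overwritten by a later unconditional branch and were therefore dead,
   which is exactly what clang-static-analyzer reported.  */
static bool
better_edge_sketch (long e_count, long best_count, bool tie_break)
{
  bool is_better_edge;
  if (e_count > best_count)
    is_better_edge = true;
  else if (e_count < best_count)
    is_better_edge = false;
  else
    is_better_edge = tie_break;
  return is_better_edge;
}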

OK for the trunk.
jeff


Re: [PATCH] Define midpoint and lerp functions for C++20 (P0811R3)

2019-06-25 Thread Rainer Orth
Hi Jonathan,

>>Doh, I looked in  and saw that we get std::abs(double) from the
>>Solaris headers, and then forgot and used it anyway.
>>
>>I'll replace that right away, thanks.
>
> Should be fixed at r272653.
>
> Tested x86_64-linux, committed to trunk.

it did indeed.

Thanks a lot for the super-quick fix.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH, rs6000] Split up rs6000.c

2019-06-25 Thread Segher Boessenkool
On Mon, Jun 24, 2019 at 05:52:36PM -0500, Bill Seurer wrote:
> [PATCH, rs6000] Split up rs6000.c.
> 
> The source file rs6000.c has grown to unreasonable size

1.2MB, 40k lines.  With various includes of tables.

> and is being
> split up into several smaller source files.  This should improve
> compilation speed for building gcc.
> 
> This is the first of several patches to do this and moves most of the
> prologue/epilogue code to a new source file.

Thanks for doing this.


>   rs6000_emit_probe_stack_range_stack_clash, 

(trailing space here)

>   interesting_frame_related_regno, 

(and here)

>   emit_vrsave_prologue, emit_split_stack_prologue, 

(and here).

>   rs6000_split_stack_space_check, rs6000_save_toc_in_prologue_p): Moved
>   to rs6000-logue.c.

What a nasty name.  I like it :-)

>   rs6000_emit_probe_stack_range_stack_clash, 

(trailing space)

>   interesting_frame_related_regno, 

(yup)

>   emit_vrsave_prologue, emit_split_stack_prologue, 

(and here)

>   * config/rs6000/rs6000.h (machine_function): Moved to here from 

(last one)

> --- gcc/config/rs6000/rs6000-internal.h   (nonexistent)
> +++ gcc/config/rs6000/rs6000-internal.h   (working copy)
> @@ -0,0 +1,113 @@
> +/* Internal to rs6000 type and variable declarations and definitons 

Trailing space; typo ("definitions").

Definitions do not really belong in .h files, but I didn't see any anyway?
(Except the one static inline, which is more like a macro really).  So
maybe just say declarations?

> -static gty(()) section *toc_section;
>  
> +extern gty(()) section *toc_section;
> +section *toc_section = 0;

You probably shouldn't call it extern if it isn't ;-)  (Or is that needed
for the gty magic?)


Okay for trunk.  Thanks!


Segher


Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-25 Thread Uros Bizjak
On Tue, Jun 25, 2019 at 4:44 AM Hongtao Liu  wrote:
>
> On Sat, Jun 22, 2019 at 3:38 PM Uros Bizjak  wrote:
> >
> > On Fri, Jun 21, 2019 at 8:38 PM H.J. Lu  wrote:
> >
> > > > > > > > > > > >> > > +/* Register pair.  */
> > > > > > > > > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 2); /* P2QI */
> > > > > > > > > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 4); /* P2HI 
> > > > > > > > > > > >> > > P4QI */
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > I think
> > > > > > > > > > > >> > >
> > > > > > > > > > > >> > > INT_MODE (P2QI, 16);
> > > > > > > > > > > >> > > INT_MODE (P2HI, 32);
> > > > > > Why P2QI need 16 bytes but not 2 bytes?
> > > > > > Same question with P2HI.
> > > > >
> > > > > Because we made a mistake. It should be 2 and 4, since these arguments
> > > > Then it will run into internal compiler error when building libgcc.
> > > > I'm still investigating it.
> > > > > are bytes, not bits.
> > >
> > > I don't think we can have 2 integer modes with the same number of bytes 
> > > since
> > > it breaks things like
> > >
> > > scalar_int_mode wider_mode = GET_MODE_WIDER_MODE (mode).require ();
> > >
> > > We can get
> > >
> > > (gdb) p mode
> > > $2 = {m_mode = E_SImode}
> > > (gdb) p wider_mode
> > > $3 = {m_mode = E_P2HImode}
> > > (gdb)
> > >
> > > Neither middle-end nor backend support it.
> >
> > Ouch... It looks we hit the limitation of the middle end (which should
> > at least warn/error out if two modes of the same width are declared).
> >
> > OTOH, we can't solve this problem by using two HI/QImode registers,
> > since a consecutive register pair has to be allocated.  It is also not
> > possible to overload existing SI/HImode mode with different
> > requirements w.r.t register pair allocation (e.g. sometimes the whole
> > register is allocated, and sometimes a register pair is allocated).
> >
> > I think we have to invent something like SPECIAL_INT_MODE, which would
> > avoid mode promotion functionality (basically, it should not be listed
> > in mode_wider and similar arrays). This would prevent mode promotion
> > issues, while it would still allow to have mode, having the same width
> > as existing mode, but with special properties.
> >
> > I'm adding Jeff and Jakub to the discussion about SPECIAL_INT_MODE.
> >
> > Uros.
>
> Patch from H.J using PARTIAL_INT_MODE fixed this issue.
>
> +/* Register pair.  */
> +PARTIAL_INT_MODE (HI, 16, P2QI);
> +PARTIAL_INT_MODE (SI, 32, P2HI);
> +
>
> Here is updated patch.

OK for mainline, but please add the comment about the reason to use
PARTIAL_INT_MODE.

Thanks,
Uros.


[patch][aarch64]: fix frame pointer setup before tlsdesc call

2019-06-25 Thread Sylvia Taylor
Greetings,

This patch fixes a bug with TLS in which the frame pointer is not
established until after the tlsdesc call, thus not conforming to
the aarch64 procedure call standard.

Changed the tlsdesc instruction patterns to set a dependency on the
x29 frame pointer. This helps the instruction scheduler to arrange
the tlsdesc call after the frame pointer is set.

Example of frame pointer (x29) set incorrectly after tlsdesc call:

stp x29, x30, [sp, -16]!
adrp x0, :tlsdesc:.LANCHOR0
ldr x2, [x0, #:tlsdesc_lo12:.LANCHOR0]
add x0, x0, :tlsdesc_lo12:.LANCHOR0
.tlsdesccall .LANCHOR0
blr x2
...
mov x29, sp
...

After introducing dependency on x29, the scheduler does the frame
pointer setup before tlsdesc:

stp x29, x30, [sp, -16]!
mov x29, sp
adrp x0, :tlsdesc:.LANCHOR0
ldr x2, [x0, #:tlsdesc_lo12:.LANCHOR0]
add x0, x0, :tlsdesc_lo12:.LANCHOR0
.tlsdesccall .LANCHOR0
blr x2
...

Testcase used with -O2 -fpic:

void foo()
{
  static __thread int x = 0;
  bar ();
}

I am not sure what would classify as an effective check for this
testcase. The only idea I received so far would be to write a regexp
inside a scan-assembler-not that would potentially look for this pattern:


.tlsdesccall 
blr 

[mov x29, sp] OR [add x29, sp, 0]


(similar to what was attempted in gcc/testsuite/gcc.target/arm/pr85434.c)

I would like maintainers' input on whether such a testcase should be added
and if there are better ways of checking for the instruction order.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk? If yes, I don't have any commit rights, so can someone please
commit it on my behalf.

Cheers,
Syl

gcc/ChangeLog:

2019-06-25  Sylvia Taylor  

* config/aarch64/aarch64.md
(tlsdesc_small_advsimd_<mode>): Update.
(tlsdesc_small_sve_<mode>): Likewise.
(FP_REGNUM): New.
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 
ff83974aeb0b1bf46415c29ba47ada74a79d7586..099cad54336ccaf2b658fbe9fd7a4a84b3abc6e0
 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -120,6 +120,7 @@
 ;; Scratch registers used in frame layout.
 (IP0_REGNUM 16)
 (IP1_REGNUM 17)
+(FP_REGNUM 29)
 (LR_REGNUM  30)
   ]
 )
@@ -6617,7 +6618,8 @@
UNSPEC_TLSDESC))
(clobber (reg:DI LR_REGNUM))
(clobber (reg:CC CC_REGNUM))
-   (clobber (match_scratch:DI 1 "=r"))]
+   (clobber (match_scratch:DI 1 "=r"))
+   (use (reg:DI FP_REGNUM))]
   "TARGET_TLS_DESC && !TARGET_SVE"
   "adrp\\tx0, %A0\;ldr\\t%1, [x0, #%L0]\;add\\t0, 0, 
%L0\;.tlsdesccall\\t%0\;blr\\t%1"
   [(set_attr "type" "call")
@@ -6680,7 +6682,8 @@
(clobber (reg:VNx2BI P13_REGNUM))
(clobber (reg:VNx2BI P14_REGNUM))
(clobber (reg:VNx2BI P15_REGNUM))
-   (clobber (match_scratch:DI 1 "=r"))]
+   (clobber (match_scratch:DI 1 "=r"))
+   (use (reg:DI FP_REGNUM))]
   "TARGET_TLS_DESC && TARGET_SVE"
   "adrp\\tx0, %A0\;ldr\\t%1, [x0, #%L0]\;add\\t0, 0, 
%L0\;.tlsdesccall\\t%0\;blr\\t%1"
   [(set_attr "type" "call")


[C++ PATCH] * class.c (resolves_to_fixed_type_p): Check CLASSTYPE_FINAL.

2019-06-25 Thread Jason Merrill
If we have a pointer to a final class, we know the dynamic type of the object
must be that class, because it can't have any derived classes.
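
A minimal illustration (mine, separate from the patch's own testcase; the
class and function names are made up) of the property being exploited: a
pointer to a final class can only point at an object of exactly that class,
so calls through it can be resolved statically.

// Illustration only: D is final, so a D* cannot point at a more-derived
// object, and the virtual call below may be devirtualized (no vtable load).
struct B { virtual int f () const { return 1; } };
struct D final : B { int f () const override { return 2; } };

int
call (const D *p)
{
  return p->f ();   // resolvable statically to D::f
}
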

Tested x86_64-pc-linux-gnu, applying to trunk.
---
 gcc/cp/class.c | 6 --
 gcc/testsuite/g++.dg/tree-ssa/final1.C | 8 
 gcc/cp/ChangeLog   | 4 
 3 files changed, 16 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/tree-ssa/final1.C

diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index e0df9ef2b20..a679e651bbe 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -7477,10 +7477,12 @@ resolves_to_fixed_type_p (tree instance, int* nonnull)
 }
 
   fixed = fixed_type_or_null (instance, nonnull, &cdtorp);
-  if (fixed == NULL_TREE)
-return 0;
   if (INDIRECT_TYPE_P (t))
 t = TREE_TYPE (t);
+  if (CLASS_TYPE_P (t) && CLASSTYPE_FINAL (t))
+return 1;
+  if (fixed == NULL_TREE)
+return 0;
   if (!same_type_ignoring_top_level_qualifiers_p (t, fixed))
 return 0;
   return cdtorp ? -1 : 1;
diff --git a/gcc/testsuite/g++.dg/tree-ssa/final1.C 
b/gcc/testsuite/g++.dg/tree-ssa/final1.C
new file mode 100644
index 000..43407f09675
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/final1.C
@@ -0,0 +1,8 @@
+// { dg-do compile { target c++11 } }
+// { dg-additional-options -fdump-tree-gimple }
+// { dg-final { scan-tree-dump-not "vptr" gimple } }
+
+struct A { int i; };
+struct B final: public virtual A { int j; };
+
+int f(B* b) { return b->i; }
diff --git a/gcc/cp/ChangeLog b/gcc/cp/ChangeLog
index 3459ad7718b..0ad67f6bc1b 100644
--- a/gcc/cp/ChangeLog
+++ b/gcc/cp/ChangeLog
@@ -1,3 +1,7 @@
+2019-06-25  Jason Merrill  
+
+   * class.c (resolves_to_fixed_type_p): Check CLASSTYPE_FINAL.
+
 2019-06-24  Jan Hubicka  
 
* lex.c (cxx_make_type): Set TYPE_CXX_ODR_P.

base-commit: 14462095a3d89c8dccb0da790e30a13b59a02a9c
-- 
2.20.1



Re: [PATCH] Properly sum costs in tree-vect-loop.c (PR tree-optimization/90973).

2019-06-25 Thread Martin Liška
On 6/25/19 4:18 PM, Richard Biener wrote:
> On Tue, Jun 25, 2019 at 10:50 AM David Malcolm  wrote:
>>
>> On Tue, 2019-06-25 at 10:16 +0200, Martin Liška wrote:
>>> Hi.
>>>
>>> That's a thinko that's pre-approved by Richi.
>>>
>>> Patch can bootstrap on x86_64-linux-gnu and survives regression
>>> tests.
>>>
>>> Thanks,
>>> Martin
>>>
>>> gcc/ChangeLog:
>>>
>>> 2019-06-24  Martin Liska  
>>>
>>>   PR tree-optimization/90973
>>>   * tree-vect-loop.c (vect_get_known_peeling_cost): Sum retval
>>>   of prologue and epilogue.
>>> ---
>>>  gcc/tree-vect-loop.c | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>>> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
>>> index d3facf67bf9..489bee65397 100644
>>> --- a/gcc/tree-vect-loop.c
>>> +++ b/gcc/tree-vect-loop.c
>>> @@ -3405,8 +3405,8 @@ vect_get_known_peeling_cost (loop_vec_info 
>>> loop_vinfo, int peel_iters_prologue,
>>>   iterations are unknown, count a taken branch per peeled loop.  */
>>>retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
>>>NULL, 0, vect_prologue);
>>> -  retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
>>> -  NULL, 0, vect_epilogue);
>>> +  retval += record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
>>  ^^
>> Should this be epilogue_cost_vec?
> 
> I think so.
> 
>>> +   NULL, 0, vect_epilogue);
>>
>> (caveat: I'm purely going by symmetry here)

I've got a patch that I've been testing. I'll install it if it survives
regression tests.

Thanks,
Martin
From 59a69485c9f7531e5318c835e09a317f0b5690a8 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 25 Jun 2019 17:02:50 +0200
Subject: [PATCH] Fix one another thinko in tree-vect-loop.c (PR
 tree-optimization/90973).

gcc/ChangeLog:

2019-06-25  Martin Liska  

	PR tree-optimization/90973
	* tree-vect-loop.c (vect_get_known_peeling_cost): Use
	epilogue_cost_vec instead of prologue_cost_vec for
	a epilogue cost.
---
 gcc/tree-vect-loop.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 489bee65397..b37bf6f427d 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -3405,7 +3405,7 @@ vect_get_known_peeling_cost (loop_vec_info loop_vinfo, int peel_iters_prologue,
  iterations are unknown, count a taken branch per peeled loop.  */
   retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
  NULL, 0, vect_prologue);
-  retval += record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
+  retval += record_stmt_cost (epilogue_cost_vec, 1, cond_branch_taken,
   NULL, 0, vect_epilogue);
 }
   else
-- 
2.21.0



Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-25 Thread H.J. Lu
On Tue, Jun 25, 2019 at 7:55 AM Jeff Law  wrote:
>
> On 6/25/19 8:34 AM, H.J. Lu wrote:
> > On Tue, Jun 25, 2019 at 12:58 AM Uros Bizjak  wrote:
> >>
> >> On 6/25/19, Hongtao Liu  wrote:
> >>> On Sat, Jun 22, 2019 at 3:38 PM Uros Bizjak  wrote:
> 
>  On Fri, Jun 21, 2019 at 8:38 PM H.J. Lu  wrote:
> 
> > +/* Register pair.  */
> > +VECTOR_MODES_WITH_PREFIX (P, INT, 2); /* P2QI
> > */
> > +VECTOR_MODES_WITH_PREFIX (P, INT, 4); /* P2HI
> > P4QI */
> >
> > I think
> >
> > INT_MODE (P2QI, 16);
> > INT_MODE (P2HI, 32);
>  Why P2QI need 16 bytes but not 2 bytes?
>  Same question with P2HI.
> >>>
> >>> Because we made a mistake. It should be 2 and 4, since these
> >>> arguments
> >> Then it will run into internal compiler error when building libgcc.
> >> I'm still investigating it.
> >>> are bytes, not bits.
> >
> > I don't think we can have 2 integer modes with the same number of bytes
> > since
> > it breaks things like
> >
> > scalar_int_mode wider_mode = GET_MODE_WIDER_MODE (mode).require ();
> >
> > We can get
> >
> > (gdb) p mode
> > $2 = {m_mode = E_SImode}
> > (gdb) p wider_mode
> > $3 = {m_mode = E_P2HImode}
> > (gdb)
> >
> > Neither middle-end nor backend support it.
> 
>  Ouch... It looks we hit the limitation of the middle end (which should
>  at least warn/error out if two modes of the same width are declared).
> 
>  OTOH, we can't solve this problem by using two HI/QImode registers,
 since a consecutive register pair has to be allocated.  It is also not
>  possible to overload existing SI/HImode mode with different
>  requirements w.r.t register pair allocation (e.g. sometimes the whole
>  register is allocated, and sometimes a register pair is allocated).
> 
>  I think we have to invent something like SPECIAL_INT_MODE, which would
>  avoid mode promotion functionality (basically, it should not be listed
>  in mode_wider and similar arrays). This would prevent mode promotion
>  issues, while it would still allow to have mode, having the same width
>  as existing mode, but with special properties.
> 
>  I'm adding Jeff and Jakub to the discussion about SPECIAL_INT_MODE.
> 
>  Uros.
> >>>
> >>> Patch from H.J using PARTIAL_INT_MODE fixed this issue.
> >>>
> >>> +/* Register pair.  */
> >>> +PARTIAL_INT_MODE (HI, 16, P2QI);
> >>> +PARTIAL_INT_MODE (SI, 32, P2HI);
> >>> +
> >>
> >> I don't think this approach is correct (the mode is not partial), and
> >> it could work by chance. The documentation is very brief with the
> >> details of different mode types, so let's ask middle-end and RTL
> >> experts.
> >>
> >
> > It is used by powerpc backend for similar purpose:
> >
> > :/* Replacement for TImode that only is allowed in GPRs.  We also use 
> > PTImode
> >for quad memory atomic operations to force getting an even/odd register
> >combination.  */
> > PARTIAL_INT_MODE (TI, 128, PTI);
> The partial modes were designed to handle things like targets with
> register sizes that aren't 2**n bits in size.  A port can certainly
> support something like SImode and PSImode side by side and they can have
> the same underlying size.
>
> Essentially the partial modes represent a mode where the compiler does
> not necessarily know the exact size, but instead knows a maximum size of
> the object.  You'll have to define suitable movXX patterns and any other
> operations you might want to perform.  The compiler will generally not
> convert between the partial mode and any other modes without an explicit
> conversion (again it can't because it doesn't know how big the partial
> mode really is).

These are all we need here.  We generate an instruction to set a
P2HI/P2QI register and immediately extract it to HI/QI registers.  No other
operations in P2HI/P2QI modes are generated or needed.

[hjl@gnu-cfl-1 vp2intersect]$ cat 2.i
typedef int __v16si __attribute__ ((__vector_size__ (64)));

typedef unsigned char  __mmask8;
typedef unsigned short __mmask16;

__mmask16
foo (__v16si x, __v16si y, __mmask16 *b)
{
  __mmask16 a;
  __builtin_ia32_2intersectd512 (&a, b, x, y);
  return a;
}
[hjl@gnu-cfl-1 vp2intersect]$ make 2.s
/export/build/gnu/tools-build/gcc-intel/build-x86_64-linux/gcc/xgcc
-B/export/build/gnu/tools-build/gcc-intel/build-x86_64-linux/gcc/
-mavx512vp2intersect -O2 -S 2.i
[hjl@gnu-cfl-1 vp2intersect]$ cat 2.s
.file "2.i"
.text
.p2align 4
.globl foo
.type foo, @function
foo:
.LFB0:
.cfi_startproc
vp2intersectd %zmm1, %zmm0, %k0
kmovw %k0, %eax
kmovw %k1, (%rdi)
ret
.cfi_endproc
.LFE0:
.size foo, .-foo
.ident "GCC: (GNU) 10.0.0 20190620 (experimental)"
.section .note.GNU-stack,"",@progbits
[hjl@gnu-cfl-1 vp2intersect]$


> I don't see anything inherently 

Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-25 Thread Jeff Law
On 6/25/19 8:34 AM, H.J. Lu wrote:
> On Tue, Jun 25, 2019 at 12:58 AM Uros Bizjak  wrote:
>>
>> On 6/25/19, Hongtao Liu  wrote:
>>> On Sat, Jun 22, 2019 at 3:38 PM Uros Bizjak  wrote:

 On Fri, Jun 21, 2019 at 8:38 PM H.J. Lu  wrote:

> +/* Register pair.  */
> +VECTOR_MODES_WITH_PREFIX (P, INT, 2); /* P2QI
> */
> +VECTOR_MODES_WITH_PREFIX (P, INT, 4); /* P2HI
> P4QI */
>
> I think
>
> INT_MODE (P2QI, 16);
> INT_MODE (P2HI, 32);
 Why P2QI need 16 bytes but not 2 bytes?
 Same question with P2HI.
>>>
>>> Because we made a mistake. It should be 2 and 4, since these
>>> arguments
>> Then it will run into internal compiler error when building libgcc.
>> I'm still investigating it.
>>> are bytes, not bits.
>
> I don't think we can have 2 integer modes with the same number of bytes
> since
> it breaks things like
>
> scalar_int_mode wider_mode = GET_MODE_WIDER_MODE (mode).require ();
>
> We can get
>
> (gdb) p mode
> $2 = {m_mode = E_SImode}
> (gdb) p wider_mode
> $3 = {m_mode = E_P2HImode}
> (gdb)
>
> Neither middle-end nor backend support it.

 Ouch... It looks we hit the limitation of the middle end (which should
 at least warn/error out if two modes of the same width are declared).

 OTOH, we can't solve this problem by using two HI/QImode registers,
 since a consecutive register pair has to be allocated.  It is also not
 possible to overload existing SI/HImode mode with different
 requirements w.r.t register pair allocation (e.g. sometimes the whole
 register is allocated, and sometimes a register pair is allocated).

 I think we have to invent something like SPECIAL_INT_MODE, which would
 avoid mode promotion functionality (basically, it should not be listed
 in mode_wider and similar arrays). This would prevent mode promotion
 issues, while it would still allow to have mode, having the same width
 as existing mode, but with special properties.

 I'm adding Jeff and Jakub to the discussion about SPECIAL_INT_MODE.

 Uros.
>>>
>>> Patch from H.J using PARTIAL_INT_MODE fixed this issue.
>>>
>>> +/* Register pair.  */
>>> +PARTIAL_INT_MODE (HI, 16, P2QI);
>>> +PARTIAL_INT_MODE (SI, 32, P2HI);
>>> +
>>
>> I don't think this approach is correct (the mode is not partial), and
>> it could work by chance. The documentation is very brief with the
>> details of different mode types, so let's ask middle-end and RTL
>> experts.
>>
> 
> It is used by powerpc backend for similar purpose:
> 
> :/* Replacement for TImode that only is allowed in GPRs.  We also use PTImode
>for quad memory atomic operations to force getting an even/odd register
>combination.  */
> PARTIAL_INT_MODE (TI, 128, PTI);
The partial modes were designed to handle things like targets with
register sizes that aren't 2**n bits in size.  A port can certainly
support something like SImode and PSImode side by side and they can have
the same underlying size.

Essentially the partial modes represent a mode where the compiler does
not necessarily know the exact size, but instead knows a maximum size of
the object.  You'll have to define suitable movXX patterns and any other
operations you might want to perform.  The compiler will generally not
convert between the partial mode and any other modes without an explicit
conversion (again it can't because it doesn't know how big the partial
mode really is).

I don't see anything inherently wrong with using the partial modes, but
we need to be aware that they're not stressed all that hard and we could
well run into under-specified cases and missed optimizations.
Jeff


Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-25 Thread Richard Sandiford
"H.J. Lu"  writes:
> On Tue, Jun 25, 2019 at 12:58 AM Uros Bizjak  wrote:
>>
>> On 6/25/19, Hongtao Liu  wrote:
>> > On Sat, Jun 22, 2019 at 3:38 PM Uros Bizjak  wrote:
>> >>
>> >> On Fri, Jun 21, 2019 at 8:38 PM H.J. Lu  wrote:
>> >>
>> >> > > > > > > > > > >> > > +/* Register pair.  */
>> >> > > > > > > > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 2); /* P2QI
>> >> > > > > > > > > > >> > > */
>> >> > > > > > > > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 4); /* P2HI
>> >> > > > > > > > > > >> > > P4QI */
>> >> > > > > > > > > > >> > >
>> >> > > > > > > > > > >> > > I think
>> >> > > > > > > > > > >> > >
>> >> > > > > > > > > > >> > > INT_MODE (P2QI, 16);
>> >> > > > > > > > > > >> > > INT_MODE (P2HI, 32);
>> >> > > > > Why P2QI need 16 bytes but not 2 bytes?
>> >> > > > > Same question with P2HI.
>> >> > > >
>> >> > > > Because we made a mistake. It should be 2 and 4, since these
>> >> > > > arguments
>> >> > > Then it will run into internal compiler error when building libgcc.
>> >> > > I'm still investigating it.
>> >> > > > are bytes, not bits.
>> >> >
>> >> > I don't think we can have 2 integer modes with the same number of bytes
>> >> > since
>> >> > it breaks things like
>> >> >
>> >> > scalar_int_mode wider_mode = GET_MODE_WIDER_MODE (mode).require ();
>> >> >
>> >> > We can get
>> >> >
>> >> > (gdb) p mode
>> >> > $2 = {m_mode = E_SImode}
>> >> > (gdb) p wider_mode
>> >> > $3 = {m_mode = E_P2HImode}
>> >> > (gdb)
>> >> >
>> >> > Neither middle-end nor backend support it.
>> >>
>> >> Ouch... It looks we hit the limitation of the middle end (which should
>> >> at least warn/error out if two modes of the same width are declared).
>> >>
>> >> OTOH, we can't solve this problem by using two HI/QImode registers,
>> >> since a consecutive register pair has to be allocated.  It is also not
>> >> possible to overload existing SI/HImode mode with different
>> >> requirements w.r.t register pair allocation (e.g. sometimes the whole
>> >> register is allocated, and sometimes a register pair is allocated).
>> >>
>> >> I think we have to invent something like SPECIAL_INT_MODE, which would
>> >> avoid mode promotion functionality (basically, it should not be listed
>> >> in mode_wider and similar arrays). This would prevent mode promotion
>> >> issues, while it would still allow to have mode, having the same width
>> >> as existing mode, but with special properties.
>> >>
>> >> I'm adding Jeff and Jakub to the discussion about SPECIAL_INT_MODE.
>> >>
>> >> Uros.
>> >
>> > Patch from H.J using PARTIAL_INT_MODE fixed this issue.
>> >
>> > +/* Register pair.  */
>> > +PARTIAL_INT_MODE (HI, 16, P2QI);
>> > +PARTIAL_INT_MODE (SI, 32, P2HI);
>> > +
>>
>> I don't think this approach is correct (the mode is not partial), and
>> it could work by chance. The documentation is very brief with the
>> details of different mode types, so let's ask middle-end and RTL
>> experts.

Agree your SPECIAL_INT_MODE sounds cleaner FWIW.  Having PARTIAL_INT_MODEs
that aren't actually partial seems pretty grim, but...

> It is used by powerpc backend for similar purpose:
>
> :/* Replacement for TImode that only is allowed in GPRs.  We also use PTImode
>for quad memory atomic operations to force getting an even/odd register
>combination.  */
> PARTIAL_INT_MODE (TI, 128, PTI);

...I guess this means that it's correct through usage.

Richard


Re: [patch] Add NetBSD/hppa target

2019-06-25 Thread Jeff Law
On 6/25/19 1:57 AM, co...@sdf.org wrote:
> On Fri, Jun 14, 2019 at 01:32:11PM -0400, John David Anglin wrote:
 +hppa*-*-netbsd*)
 +  target_cpu_default="MASK_PA_11|MASK_NO_SPACE_REGS"
>>> Any reason to not use the PA 2.0 ISA?   I'm virtually certain we
>>> supported the 32bit ABI running on PA 2.0 hardware in hpbsd (which is
>>> where the netbsd PA code is ultimately derived from).   I'd be really
>>> surprised if there's any PA1.1 hardware running anywhere, though there's
>>> certainly some PA2.0 hardware out in the wild.
>> You might also consider adding MASK_CALLER_COPIES as libgomp is broken for 
>> callee
>> copies.  This is an ABI choice so ideally you should do it now or not at all.
> 
> 
> Hi Jeff, Dave,
> 
> I've spoken to the authority of NetBSD/hppa (that's Nick Hudson), and he
> said he'd rather keep the ABI as it is for the purpose of upstreaming.
> He might switch ABIs eventually, but would rather do it with the local
> copy of GCC first.
> (And he has several PA1.1 machines :))
WRT PA1.1 vs PA2.0, as long as it's an informed decision (sounds like it
is), I'm not going to object.

WRT MASK_CALLER_COPIES, I concur with John, it's a now or never choice.
 Flipping it after the fact has far more impacts than libgomp --
essentially it changes who is responsible for copying structures that
are passed by invisible reference -- which happens in far more places
than libgomp.
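
As a rough, generic illustration (mine, not from this thread; struct and
function names are made up) of the kind of call site affected: when an
aggregate argument is passed by invisible reference, caller-copies and
callee-copies ABIs disagree about which side materializes the temporary
copy, so every object in a program has to be built with the same choice.

/* Illustration only: an aggregate passed by value may be implemented as an
   "invisible reference" (a hidden pointer to a temporary copy).  Whether the
   caller or the callee makes that copy is the ABI choice controlled by
   MASK_CALLER_COPIES, so flipping it later breaks mixing old and new code.  */
struct big { int data[64]; };

static int
use_by_value (struct big b)   /* may actually receive a hidden pointer */
{
  b.data[0] = 42;             /* must modify only the temporary copy */
  return b.data[0] + b.data[63];
}

int
caller (struct big *p)
{
  return use_by_value (*p);   /* the copy is made on one side of this call */
}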

Jeff


Re: [SVE] [fwprop] PR88833 - Redundant moves for WHILELO-based loops

2019-06-25 Thread Richard Sandiford
Prathamesh Kulkarni  writes:
> On Mon, 24 Jun 2019 at 21:41, Prathamesh Kulkarni
>  wrote:
>>
>> On Mon, 24 Jun 2019 at 19:51, Richard Sandiford
>>  wrote:
>> >
>> > Prathamesh Kulkarni  writes:
>> > > @@ -1415,6 +1460,19 @@ forward_propagate_into (df_ref use)
>> > >if (!def_set)
>> > >  return false;
>> > >
>> > > +  if (reg_prop_only
>> > > +  && !REG_P (SET_SRC (def_set))
>> > > +  && !REG_P (SET_DEST (def_set)))
>> > > +return false;
>> >
>> > This should be:
>> >
>> >   if (reg_prop_only
>> >   && (!REG_P (SET_SRC (def_set)) || !REG_P (SET_DEST (def_set
>> > return false;
>> >
>> > so that we return false if either operand isn't a register.
>> Oops, sorry about that  -:(
>> >
>> > > +
>> > > +  /* Allow propagations into a loop only for reg-to-reg copies, since
>> > > + replacing one register by another shouldn't increase the cost.  */
>> > > +
>> > > +  if (DF_REF_BB (def)->loop_father != DF_REF_BB (use)->loop_father
>> > > +  && !REG_P (SET_SRC (def_set))
>> > > +  && !REG_P (SET_DEST (def_set)))
>> > > +return false;
>> >
>> > Same here.
>> >
>> > OK with that change, thanks.
>> Thanks for the review, will make the changes and commit the patch
>> after re-testing.
> Hi,
> Testing the patch showed following failures on 32-bit x86:
>
>   Executed from: g++.target/i386/i386.exp
> g++:g++.target/i386/pr88152.C   scan-assembler-not vpcmpgt|vpcmpeq|vpsra
>   Executed from: gcc.target/i386/i386.exp
> gcc:gcc.target/i386/pr66768.c scan-assembler add*.[ \t]%gs:
> gcc:gcc.target/i386/pr90178.c scan-assembler-times xorl[\\t
> ]*\\%eax,[\\t ]*%eax 1
>
> The failure of pr88152.C is also seen on x86_64.
>
> For pr66768.c, and pr90178.c, forwprop replaces register which is
> volatile and frame related respectively.
> To avoid that, the attached patch, makes a stronger constraint that
> src and dest should be a register
> and not have frame_related or volatil flags set, which is checked in
> usable_reg_p().
> Which avoids the failures for both the cases.
> Does it look OK ?

That's not the reason it's a bad transform.  In both cases we're
propagating r2 <- r1 even though

(a) r1 dies in the copy and
(b) fwprop can't replace all uses of r2, because some have multiple
definitions

This has the effect of making both values live in cases where only one
was previously.

In the case of pr66768.c, fwprop2 is undoing the effect of
cse.c:canon_reg, which tries to pick the best register to use
(see cse.c:make_regs_eqv).  fwprop1 makes the same change,
and made it even before the patch, but the cse.c choice should win.

A (hopefully) conservative fix would be to propagate the copy only if
both registers have a single definition, which you can test with:

  (DF_REG_DEF_COUNT (regno) == 1
   && !bitmap_bit_p (DF_LR_OUT (ENTRY_BLOCK_PTR_FOR_FN (m_fn)), regno))

In that case, fwprop should see all uses of the destination, and should
be able to replace it in all cases with the source.

> For g++.target/i386/pr88152.C, the issue is that after the patch,
> forwprop1 does following propagation (in f10) which wasn't done
> before:
>
> In insn 10, replacing
>  (unspec:SI [
> (reg:V2DF 91)
> ] UNSPEC_MOVMSK)
>  with (unspec:SI [
> (subreg:V2DF (reg:V2DI 90) 0)
> ] UNSPEC_MOVMSK)
>
> This later defeats combine because insn 9 gets deleted.
> Without patch, the following combination takes place:
>
> Trying 7 -> 9:
> 7: r90:V2DI=r89:V2DI>r93:V2DI
>   REG_DEAD r93:V2DI
>   REG_DEAD r89:V2DI
> 9: r91:V2DF=r90:V2DI#0
>   REG_DEAD r90:V2DI
> Successfully matched this instruction:
> (set (subreg:V2DI (reg:V2DF 91) 0)
> (gt:V2DI (reg:V2DI 89)
> (reg:V2DI 93)))
> allowing combination of insns 7 and 9
>
> and then:
> Trying 6, 9 -> 10:
> 6: r89:V2DI=const_vector
> 9: r91:V2DF#0=r89:V2DI>r93:V2DI
>   REG_DEAD r89:V2DI
>   REG_DEAD r93:V2DI
>10: r87:SI=unspec[r91:V2DF] 43
>   REG_DEAD r91:V2DF
> Successfully matched this instruction:
> (set (reg:SI 87)
> (unspec:SI [
> (lt:V2DF (reg:V2DI 93)
> (const_vector:V2DI [
> (const_int 0 [0]) repeated x2
> ]))
> ] UNSPEC_MOVMSK))

Eh?  lt:*V2DF*?  Does that mean that it's 0 for false and an all-1 NaN
for true?

Looks like a bug that we manage to fold to that, and manage to match it.

Thanks,
Richard

> allowing combination of insns 6, 9 and 10
> original costs 4 + 8 + 4 = 16
> replacement cost 12
> deferring deletion of insn with uid = 9.
> deferring deletion of insn with uid = 6.
> which deletes insns 2, 3, 6, 7, 9.
>
> With patch, it fails to combine 7->10:
> Trying 7 -> 10:
> 7: r90:V2DI=r89:V2DI>r93:V2DI
>   REG_DEAD r93:V2DI
>   REG_DEAD r89:V2DI
>10: r87:SI=unspec[r90:V2DI#0] 43
>   REG_DEAD r90:V2DI
> Failed to match this instruction:
> (set (reg:SI 87)
> (unspec:SI [
> (subreg:V2DF (gt:V2DI (reg:V2DI 89)
>

Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-25 Thread H.J. Lu
On Tue, Jun 25, 2019 at 12:58 AM Uros Bizjak  wrote:
>
> On 6/25/19, Hongtao Liu  wrote:
> > On Sat, Jun 22, 2019 at 3:38 PM Uros Bizjak  wrote:
> >>
> >> On Fri, Jun 21, 2019 at 8:38 PM H.J. Lu  wrote:
> >>
> >> > > > > > > > > > >> > > +/* Register pair.  */
> >> > > > > > > > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 2); /* P2QI
> >> > > > > > > > > > >> > > */
> >> > > > > > > > > > >> > > +VECTOR_MODES_WITH_PREFIX (P, INT, 4); /* P2HI
> >> > > > > > > > > > >> > > P4QI */
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > I think
> >> > > > > > > > > > >> > >
> >> > > > > > > > > > >> > > INT_MODE (P2QI, 16);
> >> > > > > > > > > > >> > > INT_MODE (P2HI, 32);
> >> > > > > Why P2QI need 16 bytes but not 2 bytes?
> >> > > > > Same question with P2HI.
> >> > > >
> >> > > > Because we made a mistake. It should be 2 and 4, since these
> >> > > > arguments
> >> > > Then it will run into internal compiler error when building libgcc.
> >> > > I'm still investigating it.
> >> > > > are bytes, not bits.
> >> >
> >> > I don't think we can have 2 integer modes with the same number of bytes
> >> > since
> >> > it breaks things like
> >> >
> >> > scalar_int_mode wider_mode = GET_MODE_WIDER_MODE (mode).require ();
> >> >
> >> > We can get
> >> >
> >> > (gdb) p mode
> >> > $2 = {m_mode = E_SImode}
> >> > (gdb) p wider_mode
> >> > $3 = {m_mode = E_P2HImode}
> >> > (gdb)
> >> >
> >> > Neither middle-end nor backend support it.
> >>
> >> Ouch... It looks we hit the limitation of the middle end (which should
> >> at least warn/error out if two modes of the same width are declared).
> >>
> >> OTOH, we can't solve this problem by using two HI/QImode registers,
> >> since a consecutive register pair has to be allocated.  It is also not
> >> possible to overload existing SI/HImode mode with different
> >> requirements w.r.t register pair allocation (e.g. sometimes the whole
> >> register is allocated, and sometimes a register pair is allocated).
> >>
> >> I think we have to invent something like SPECIAL_INT_MODE, which would
> >> avoid mode promotion functionality (basically, it should not be listed
> >> in mode_wider and similar arrays). This would prevent mode promotion
> >> issues, while it would still allow to have mode, having the same width
> >> as existing mode, but with special properties.
> >>
> >> I'm adding Jeff and Jakub to the discussion about SPECIAL_INT_MODE.
> >>
> >> Uros.
> >
> > Patch from H.J using PARTIAL_INT_MODE fixed this issue.
> >
> > +/* Register pair.  */
> > +PARTIAL_INT_MODE (HI, 16, P2QI);
> > +PARTIAL_INT_MODE (SI, 32, P2HI);
> > +
>
> I don't think this approach is correct (the mode is not partial), and
> it could work by chance. The documentation is very brief with the
> details of different mode types, so let's ask middle-end and RTL
> experts.
>

It is used by powerpc backend for similar purpose:

:/* Replacement for TImode that only is allowed in GPRs.  We also use PTImode
   for quad memory atomic operations to force getting an even/odd register
   combination.  */
PARTIAL_INT_MODE (TI, 128, PTI);


-- 
H.J.


Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2019-06-25 Thread Richard Biener
On Tue, Jun 25, 2019 at 12:25 PM Martin Liška  wrote:
>
> On 6/24/19 4:09 PM, Richard Biener wrote:
> > You still get one instance in each TU ...
>
> Right, fixed in attached patch.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

Yes.

Thanks,
Richard.

> Thanks,
> Martin


Re: [PATCH] Properly sum costs in tree-vect-loop.c (PR tree-optimization/90973).

2019-06-25 Thread Richard Biener
On Tue, Jun 25, 2019 at 10:50 AM David Malcolm  wrote:
>
> On Tue, 2019-06-25 at 10:16 +0200, Martin Liška wrote:
> > Hi.
> >
> > That's a thinko that's pre-approved by Richi.
> >
> > Patch can bootstrap on x86_64-linux-gnu and survives regression
> > tests.
> >
> > Thanks,
> > Martin
> >
> > gcc/ChangeLog:
> >
> > 2019-06-24  Martin Liska  
> >
> >   PR tree-optimization/90973
> >   * tree-vect-loop.c (vect_get_known_peeling_cost): Sum retval
> >   of prologue and epilogue.
> > ---
> >  gcc/tree-vect-loop.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
>
> > diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> > index d3facf67bf9..489bee65397 100644
> > --- a/gcc/tree-vect-loop.c
> > +++ b/gcc/tree-vect-loop.c
> > @@ -3405,8 +3405,8 @@ vect_get_known_peeling_cost (loop_vec_info 
> > loop_vinfo, int peel_iters_prologue,
> >   iterations are unknown, count a taken branch per peeled loop.  */
> >retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
> >NULL, 0, vect_prologue);
> > -  retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
> > -  NULL, 0, vect_epilogue);
> > +  retval += record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
>  ^^
> Should this be epilogue_cost_vec?

I think so.

> > +   NULL, 0, vect_epilogue);
>
> (caveat: I'm purely going by symmetry here)


Re: [PATCH] Add .gnu.lto_.meta section.

2019-06-25 Thread Richard Biener
On Tue, Jun 25, 2019 at 10:14 AM Martin Liška  wrote:
>
> On 6/24/19 8:05 PM, Richard Biener wrote:
> > On Mon, Jun 24, 2019 at 3:31 PM Martin Liška  wrote:
> >>
> >> On 6/24/19 2:44 PM, Richard Biener wrote:
> >>> On Mon, Jun 24, 2019 at 2:12 PM Martin Liška  wrote:
> 
>  On 6/24/19 2:02 PM, Richard Biener wrote:
> > On Fri, Jun 21, 2019 at 4:01 PM Martin Liška  wrote:
> >>
> >> On 6/21/19 2:57 PM, Jan Hubicka wrote:
> >>> This looks like good step (and please stream it in host independent
> >>> way). I suppose all these issues can be done one-by-one.
> >>
> >> So there's a working patch for that. However one will see following 
> >> errors
> >> when using an older compiler or older LTO bytecode:
> >>
> >> $ gcc main9.o -flto
> >> lto1: fatal error: bytecode stream in file ‘main9.o’ generated with 
> >> LTO version -25480.4493 instead of the expected 9.0
> >>
> >> $ gcc main.o
> >> lto1: internal compiler error: compressed stream: data error
> >
> > This is because of your change to bitfields or because with the old
> > scheme the header with the
> > version is compressed (is it?).
> 
>  Because currently also the header is compressed.
> >>>
> >>> That was it, yeah :/  Stupid decisions in the past.
> >>>
> >>> I guess we have to bite the bullet and do this kind of incompatible
> >>> change, accepting
> >>> the odd error message above.
> >>>
> > I'd simply avoid any layout changes
> > in the version check range.
> 
>  Well, then we have to find out how to distinguish between compression 
>  algorithms.
> 
> >
> >> To be honest, I would prefer the new .gnu.lto_.meta section.
> >> Richi why is that so ugly?
> >
> > Because it's a change in the wrong direction and doesn't solve the
> > issue we already
> > have (cannot determine if a section is compressed or not).
> 
>  That's not true, the .gnu.lto_.meta section will be always uncompressed 
>  and we can
>  also backport changes to older compiler that can read it and print a 
>  proper error
>  message about LTO bytecode version mismatch.
> >>>
> >>> We can always backport changes, yes, but I don't see why we have to.
> >>
> >> I'm fine with the backward compatibility break. But we should also 
> >> consider lto-plugin.c
> >> that is parsing following 2 sections:
> >>
> >> 91  #define LTO_SECTION_PREFIX  ".gnu.lto_.symtab"
> >> 92  #define LTO_SECTION_PREFIX_LEN  (sizeof (LTO_SECTION_PREFIX) - 1)
> >> 93  #define OFFLOAD_SECTION ".gnu.offload_lto_.opts"
> >> 94  #define OFFLOAD_SECTION_LEN (sizeof (OFFLOAD_SECTION) - 1)
> >
> > Yeah, I know.  And BFD and gold hard-coded those __gnu_lto_{v1,slim} 
> > symbols...
>
> Yep, they do, 'nm' is also using that.
>
> >
> >>>
> > ELF section overhead
> > is quite big if you have lots of small functions.
> 
>  My patch is actually shrinking space as I'm suggesting to add _one_ 
>  extra ELF section
>  and remove the section header from all other LTO sections. That will 
>  save space
>  for all function sections.
> >>>
> >>> But we want the header there to at least say if the section is
> >>> compressed or not.
> >>> The fact that we have so many ELF sections means we have the redundant 
> >>> version
> >>> info everywhere.
> >>>
> >>> We should have a single .gnu.lto_ section (and also get rid of those
> >>> __gnu_lto_v1 and __gnu_lto_slim COMMON symbols - checking for
> >>> existence of a symbol is more expensive compared to existence
> >>> of a section).
> >>
> >> I like removal of the 2 aforementioned sections. To be honest I would 
> >> recommend to
> >> add a new .gnu.lto_.meta section.
> >
> > Why .meta?  Why not just .gnu.lto_?
>
> Works for me.
>
> >
> >> We can use it instead of __gnu_lto_v1 and we can
> >> have a flag there instead of __gnu_lto_slim. As a second step, I'm willing 
> >> to concatenate all
> >>
> >>   LTO_section_function_body,
> >>   LTO_section_static_initializer
> >>
> >> sections into a single one. That will require an index that will have to 
> >> be created. I can discuss
> >> that with Honza as he suggested using something smarter than function 
> >> names.
> >
> > I think the index belongs to symtab?
> >
> > Let's properly do it if we want to change it.  Removing of
> > __gnu_lto_v1/slim is going to be
> > the most intrusive change btw. and orthogonal to the section changes.
>
> I'm fine with a proper change. So do I understand that correctly that:
> - we'll come up with .gnu.lto_ section that will be used by bfd, gold and nm
>   to detect LTO objects
> - for some time, we'll keep __gnu_lto_v1 and __gnu_lto_slim for backward
>   compatibility with older binutils tools
> - in a couple of years, the legacy support will be removed

Yep.

Richard.

> ?
>
> Martin
>
> >
> > Richard.
> >
> >>
> >> Thoughts?
> >> Martin
> >>
> >>>
> >>> Richard.
> >>>
>  Martin
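
To make the idea concrete, a purely hypothetical layout for such a single
uncompressed meta section could look like the following (the struct and field
names are illustrative only and are not taken from any posted patch):

  #include <stdint.h>

  /* Sketch of a .gnu.lto_ meta header.  Kept uncompressed so that even an
     older reader can diagnose a version mismatch before touching the rest.  */
  struct gnu_lto_meta
  {
    uint32_t magic;          /* identifies LTO bytecode */
    uint16_t major_version;  /* checked first */
    uint16_t minor_version;
    uint8_t  slim_object;    /* would replace the __gnu_lto_slim symbol */
    uint8_t  compression;    /* none/zlib/zstd, instead of per-section guessing */
  };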

Re: value_range and irange unification

2019-06-25 Thread Richard Biener
On Tue, Jun 25, 2019 at 10:05 AM Aldy Hernandez  wrote:
>
>
>
> On 6/24/19 9:24 AM, Richard Biener wrote:
> > On Fri, Jun 21, 2019 at 1:41 PM Aldy Hernandez  wrote:
> >>
> >> Hi Richard.  Hi folks.
> >>
> >> In order to unify the APIs for value_range and irange, we'd like to make
> >> some minor changes to value_range.  We believe most of these changes
> >> could go in now, and would prefer so, to get broader testing and
> >> minimize the plethora of changes we drag around on our branch.
> >>
> >> First, introduce a type for VR_VARYING and VR_UNDEFINED.
> >> 
> >>
> >> irange utilizes 0 or more sub-ranges to represent a range, and VARYING
> >> is simply one subrange [MIN, MAX].  value_range represents this with
> >> VR_VARYING, and since there is no type associated with it, we cannot
> >> calculate the lower and upper bounds for the range.  There is also a
> >> lack of canonicalness in value range in that VR_VARYING and [MIN, MAX]
> >> are two different representations of the same value.
> >>
> >> We tried to adjust irange to not associate a type with the empty range
> >> [] (representing undefined), but found we were unable to perform all
> >> operations properly.  In particular, we cannot invert an empty range.
> >> i.e. invert ( [] ) should produce [MIN, MAX].  Again, we need to have a
> >> type associated with this empty range.
> >>
> >> We'd like to tweak value_range so that set_varying() and set_undefined()
> >> both take a type, and then always set the min/max fields based on that
> >> type.  This takes no additional memory in the structure, and is
> >> virtually transparent to all the existing uses of value_range.
> >>
> >> This allows:
> >> 1)  invert to be implemented properly for both VARYING and UNDEFINED
> >> by simply changing one to the other.
> >> 2)  the type() method to always work without any special casing by
> >> simply returning TREE_TYPE(min)
> >> 3)  the new incoming bounds() routines to work trivially for these
> >> cases as well (lbound/ubound, num_pairs(), etc).
> >>
> >> This functionality is provided in the first attached patch.
> >>
> >> Note, the current implementation sets min/max to TREE_TYPE, not to
> >> TYPE_MIN/MAX_VALUE.  We can fix this if preferred.
> >
> > How does this work with
> >
> > value_range *
> > vr_values::get_value_range (const_tree var)
> > {
> >static const value_range vr_const_varying (VR_VARYING, NULL, NULL);
> > ...
> >/* If we query the range for a new SSA name return an unmodifiable 
> > VARYING.
> >   We should get here at most from the substitute-and-fold stage which
> >   will never try to change values.  */
> >if (ver >= num_vr_values)
> >  return CONST_CAST (value_range *, &vr_const_varying);
> >
> > ?
>
> Good question.  This glaring omission came about after a full round of
> tests on our branch immediately after posting :).
>
> I am currently just allocating a new one each time:
>
> >if (ver >= num_vr_values)
> > -return CONST_CAST (value_range *, &vr_const_varying);
> > +{
> > +  /* ?? At some point we should find a way to cache varying ranges
> > +by type.  In the tree type itself?  */
> > +  vr = vrp_value_range_pool.allocate ();
> > +  vr->set_varying (type);
> > +  return vr;
> > +}
>
> but we should discuss alternatives.  Ideally, we had batted around the
> idea of keeping the range for varying, cached in the type itself,
> because of its prevalence.  I think Andrew mentioned it would increase
> the size of type nodes by 4%.  Are there that many types that this would
> incur a significant penalty?  Another alternative would be a cache on
> the side.  What are your thoughts?

It's not that the static const varying isn't a hack -- it's done to avoid
growing the lattice dynamically as substitution / folding allocates
new SSA names (because it's a waste of time at that point).

One possibility is to simply return NULL from ::get_value_range
and treat that as VARYING in all callers.  But back in time that
was error-prone so I settled on the convenient global VARYING.

I don't like any kind of "caching" of VARYINGs per type too much.
If necessary then it should be done on the side, definitely _not_
in tree_type.
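
A stand-alone sketch of such an on-the-side cache (using std::unordered_map and
placeholder types purely for illustration; the real thing would presumably use
GCC's own hash_map and value_range):

  #include <unordered_map>

  struct Type;                                   // stands in for a 'tree' type node
  struct Range { const Type *type; bool varying; };

  // One cached VARYING range per type, kept outside the type node itself.
  const Range *
  varying_for (const Type *t)
  {
    static std::unordered_map<const Type *, Range> cache;
    auto it = cache.find (t);
    if (it == cache.end ())
      it = cache.emplace (t, Range{t, true}).first;
    return &it->second;
  }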

> >
> >> Second, enforce canonicalization at value_range build time.
> >> ---
> >>
> >> As discussed above, value_range has multiple representations for the
> >> same range.  For instance, ~[0,0] is the same as [1,MAX] in unsigned and
> >> [MIN, MAX] is really varying, among others.  We found it quite difficult
> >> to make things work, with multiple representations for a given range.
> >> Canonicalizing at build time solves this, as well as removing explicit
> >> set_and_canonicalize() calls throughout.  Furthermore, it avoids some
> >> special casing in VRP.
> >>
> >> Along with canonicalizing, we also enforce the existing value_range API
> >> 

Re: [C++ PATCH] Fix ICE in constexpr evaluation of ADDR_EXPR of ARRAY_REF of a vector (PR c++/90969)

2019-06-25 Thread Jason Merrill
On Tue, Jun 25, 2019 at 4:41 AM Jakub Jelinek  wrote:
> As mentioned in the PR, the following testcase ICEs starting with
> r272430.  The problem is that if cxx_eval_array_reference is called with
> lval true, we just want to constant evaluate the index and array, but need
> to keep the ARRAY_REF or possibly new one with updated operands in the IL.
> The code to look through VCEs from VECTOR_TYPEs has been previously in the
> !lval section, but has been made unconditional in that change.
> For !lval, we want that, we don't reconstruct ARRAY_REF, but want to fold it
> into a constant.  For lval, we only use ary in:
>   if (lval && ary == oldary && index == oldidx)
> return t;
>   else if (lval)
> return build4 (ARRAY_REF, TREE_TYPE (t), ary, index, NULL, NULL);
> though, so if we look through the VCE, then for vectors we can never reuse
> t and always build a new ARRAY_REF, and furthermore build a wrong one,
> as ARRAY_REF should always apply to an object with ARRAY_TYPE, not
> VECTOR_TYPE.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

Jason


Re: [PATCH] Enable use of #pragma omp simd reduction(inscan,...) even for GCC10+ in PSTL

2019-06-25 Thread Jonathan Wakely

On 24/06/19 15:18 -0700, Thomas Rodgers wrote:


Ok for trunk.


Yup, this is OK then, thanks for checking it, Thomas.



Can you push it into upstream PSTL?


Yes.

Thanks,
Tom.

Jakub Jelinek writes:


Hi!

Now that GCC supports inclusive/exclusive scans (like ICC 19.0 so far in
simd constructs only), we can enable it in PSTL as well.

Bootstrapped/regtested on x86_64-linux and i686-linux, additionally tested
with
#include 
#include 

auto
foo (std::vector<int> &ca, std::vector<int> &co)
{
  return std::inclusive_scan(std::execution::unseq, ca.begin(), ca.end(), 
co.begin());
}

auto
bar (std::vector<int> &ca, std::vector<int> &co)
{
  return std::exclusive_scan(std::execution::unseq, ca.begin(), ca.end(), 
co.begin(), 0);
}
and verifying with -O2 -fopenmp-simd it is vectorized.  Ok for trunk?
Can you push it into upstream PSTL?

2019-06-21  Jakub Jelinek  

* include/pstl/pstl_config.h (_PSTL_PRAGMA_SIMD_SCAN,
_PSTL_PRAGMA_SIMD_INCLUSIVE_SCAN, _PSTL_PRAGMA_SIMD_EXCLUSIVE_SCAN):
Define to OpenMP 5.0 pragmas even for GCC 10.0+.
(_PSTL_UDS_PRESENT): Define to 1 for GCC 10.0+.

--- libstdc++-v3/include/pstl/pstl_config.h.jj  2019-06-10 18:18:01.551191212 
+0200
+++ libstdc++-v3/include/pstl/pstl_config.h 2019-06-20 17:03:31.466367344 
+0200
@@ -70,7 +70,7 @@
 #define _PSTL_PRAGMA_FORCEINLINE
 #endif

-#if (__INTEL_COMPILER >= 1900)
+#if (__INTEL_COMPILER >= 1900) || (_PSTL_GCC_VERSION >= 10)
 #define _PSTL_PRAGMA_SIMD_SCAN(PRM) _PSTL_PRAGMA(omp simd 
reduction(inscan, PRM))
 #define _PSTL_PRAGMA_SIMD_INCLUSIVE_SCAN(PRM) _PSTL_PRAGMA(omp scan 
inclusive(PRM))
 #define _PSTL_PRAGMA_SIMD_EXCLUSIVE_SCAN(PRM) _PSTL_PRAGMA(omp scan 
exclusive(PRM))
@@ -100,7 +100,11 @@
 #define _PSTL_UDR_PRESENT 0
 #endif

-#define _PSTL_UDS_PRESENT (__INTEL_COMPILER >= 1900 && 
__INTEL_COMPILER_BUILD_DATE >= 20180626)
+#if ((__INTEL_COMPILER >= 1900 && __INTEL_COMPILER_BUILD_DATE >= 20180626) || 
_PSTL_GCC_VERSION >= 10)
+#define _PSTL_UDS_PRESENT 1
+#else
+#define _PSTL_UDS_PRESENT 0
+#endif

 #if _PSTL_EARLYEXIT_PRESENT
 #define _PSTL_PRAGMA_SIMD_EARLYEXIT _PSTL_PRAGMA(omp simd early_exit)

Jakub
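
For context, the OpenMP 5.0 scan construct these macros expand to is used along
these lines (an illustrative example, not part of the patch):

  /* Inclusive prefix sum using the OpenMP 5.0 inscan reduction.  */
  void
  prefix_sum (const int *in, int *out, int n)
  {
    int sum = 0;
  #pragma omp simd reduction(inscan, +:sum)
    for (int i = 0; i < n; i++)
      {
        sum += in[i];
  #pragma omp scan inclusive(sum)
        out[i] = sum;
      }
  }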




Re: [PATCH] Define midpoint and lerp functions for C++20 (P0811R3)

2019-06-25 Thread Jonathan Wakely

On 25/06/19 10:12 +0100, Jonathan Wakely wrote:

On 25/06/19 11:06 +0200, Rainer Orth wrote:

Hi Jonathan,


On 12/03/19 23:04 +, Jonathan Wakely wrote:

On 12/03/19 22:49 +, Joseph Myers wrote:

On Tue, 5 Mar 2019, Jonathan Wakely wrote:


The midpoint and lerp functions for floating point types come straight
from the P0811R3 proposal, with no attempt at optimization.


I don't know whether P0811R3 states different requirements from the public
P0811R2, but the implementation of midpoint using isnormal does *not*
satisfy "at most one inexact operation occurs" and is *not* correctly
rounded, contrary to the claims made in P0811R2.


I did wonder how the implementation in the paper was meant to meet the
stated requirements, but I didn't wonder too hard.


Consider e.g. midpoint(DBL_MIN + DBL_TRUE_MIN, DBL_MIN + DBL_TRUE_MIN).
The value DBL_MIN + DBL_TRUE_MIN is normal, but dividing it by 2 is
inexact (and so that midpoint implementation would produce DBL_MIN as
result, so failing to satisfy midpoint(x, x) == x).

Replacing isnormal(x) by something like isgreaterequal(fabs(x), MIN*2)
would avoid those inexact divisions, but there would still be spurious
overflows in non-default rounding modes for e.g. midpoint(DBL_MAX,
DBL_TRUE_MIN) in FE_UPWARD mode, so failing "No overflow occurs" if that's
meant to apply in all rounding modes.


Thanks for this review, and the useful cases to test. Ed is working on
adding some more tests, so maybe he can also look at improving the
code :-)


I've committed r272616 to make this case work. This is the proposal
author's most recent suggestion for the implementation.

Tested x86_64-linux, committed to trunk.


the 26_numerics/midpoint/floating.cc test now FAILs on Solaris (sparc
and x86, 32 and 64-bit):

+FAIL: 26_numerics/midpoint/floating.cc (test for excess errors)
+UNRESOLVED: 26_numerics/midpoint/floating.cc compilation failed to produce 
executable

Excess errors:
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:65:
 error: non-constant condition for static assertion
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:65: error: 'constexpr std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, std::__not_ > >, _Tp> 
std::midpoint(_Tp, _Tp) [with _Tp = double; std::enable_if_t<__and_v, std::is_same >::type, _Tp>, std::__not_ > >, _Tp> = double]' called in a constant expression
/var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/numeric:199:
 error: call to non-'constexpr' function 'double std::abs(double)'
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:68:
 error: non-constant condition for static assertion
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:68: error: 'constexpr std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, std::__not_ > >, _Tp> 
std::midpoint(_Tp, _Tp) [with _Tp = float; std::enable_if_t<__and_v, std::is_same >::type, _Tp>, std::__not_ > >, _Tp> = float]' called in a constant expression
/var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/numeric:199:
 error: call to non-'constexpr' function 'float std::abs(float)'
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:71:
 error: non-constant condition for static assertion
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:71: error: 'constexpr std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, std::__not_ > >, _Tp> 
std::midpoint(_Tp, _Tp) [with _Tp = long double; std::enable_if_t<__and_v, std::is_same >::type, _Tp>, std::__not_ > >, _Tp> = long double]' called in a constant expression
/var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/numeric:199:
 error: call to non-'constexpr' function 'long double std::abs(long double)'


Doh, I looked in  and saw that we get std::abs(double) from the
Solaris headers, and then forgot and used it anyway.

I'll replace that right away, thanks.


Should be fixed at r272653.

Tested x86_64-linux, committed to trunk.
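
An illustrative check for the property Joseph raised earlier in the thread
(midpoint(x, x) == x for a value whose halving is inexact), assuming a C++2a
compiler and a libstdc++ containing the fix:

  #include <numeric>
  #include <cfloat>
  #include <cassert>

  int
  main ()
  {
    const double x = DBL_MIN + DBL_TRUE_MIN;   /* halving this is inexact */
    assert (std::midpoint (x, x) == x);
    return 0;
  }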




Re: [PATCH] Automatics in equivalence statements

2019-06-25 Thread Mark Eggleston



On 25/06/2019 00:17, Jeff Law wrote:

On 6/24/19 2:19 AM, Bernhard Reutner-Fischer wrote:

On Fri, 21 Jun 2019 07:10:11 -0700
Steve Kargl  wrote:


On Fri, Jun 21, 2019 at 02:31:51PM +0100, Mark Eggleston wrote:

Currently variables with the AUTOMATIC attribute can not appear in an
EQUIVALENCE statement. However its counterpart, STATIC, can be used in
an EQUIVALENCE statement.

Where there is a clear conflict in the attributes of variables in an
EQUIVALENCE statement an error message will be issued as is currently
the case.

If there is no conflict, e.g. a variable with an AUTOMATIC attribute and a
variable(s) without attributes all variables in the EQUIVALENCE will
become AUTOMATIC.

Note: most of this patch was written by Jeff Law 

Please review.

ChangeLogs:

gcc/fortran

      Jeff Law  
      Mark Eggleston  

      * gfortran.h: Add check_conflict declaration.

This is wrong.  By convention a routine that is not static
has the gfc_ prefix.


Furthermore doesn't this export indicate that you're committing a
layering violation somehow?

Possibly.  I'm the original author, but my experience in our fortran
front-end is minimal.  I fully expected this patch to need some tweaking.

We certainly don't want to recreate all the checking that's done in
check_conflict.  We just need to defer it to a later point --
find_equivalence seemed like a good point since we've got the full
equivalence list handy and can accumulate the attributes across the
entire list, then check for conflicts.

If there's a concrete place where you think we should be doing this, I'm
all ears.


Any suggestions will be appreciate.

      * symbol.c (check_conflict): Remove automatic in equivalence conflict
      check.
      * symbol.c (save_symbol): Add check for in equivalence to stop the
      the save attribute being added.
      * trans-common.c (build_equiv_decl): Add is_auto parameter and
      add !is_auto to condition where TREE_STATIC (decl) is set.
      * trans-common.c (build_equiv_decl): Add local variable is_auto,
      set it true if an atomatic attribute is encountered in the variable

atomatic? I read atomic but you mean automatic.

Yes.


      list.  Call build_equiv_decl with is_auto as an additional parameter.
      flag_dec_format_defaults is enabled.
      * trans-common.c (accumulate_equivalence_attributes) : New subroutine.
      * trans-common.c (find_equivalence) : New local variable dummy_symbol,
      accumulated equivalence attributes from each symbol then check for
      conflicts.

I'm just curious why you don't gfc_copy_attr for the most part of 
accumulate_equivalence_attributes?
thanks,

Simply didn't know about it.  It could probably significantly simplify
the accumulation of attributes step.
Using gfc_copy_attr causes a great many "Duplicate DIMENSION attribute 
specified at (1)" errors.  This is because it does a great deal of 
checking, whereas simply keeping track of the attributes used is all 
that is required to determine whether there is a conflict in the 
equivalence statement.


Also, the final section of accumulate_equivalence_attributes involving 
SAVE, INTENT and ACCESS looks suspect to me.  I'll check and update the 
patch if necessary.



Jeff




--
https://www.codethink.co.uk/privacy.html



[PATCH] Try fix PR90911

2019-06-25 Thread Richard Biener


PR90911 reports a slowdown of 456.hmmer with the recent introduction
of vectorizer versioning of outer loops, more specifically the case
of re-using if-conversion created versions.

The patch below fixes things up to adjust the edge probability
and scale the loop bodies in two steps, delaying scalar_loop
scaling until all peeling is done.  This restores profile-mismatches
to the same state as it was on the GCC 9 branch and seems to
fix the observed slowdown of 456.hmmer.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Honza, does this look OK?

Thanks,
Richard.

2019-06-25  Richard Biener  

* tree-vectorizer.h (_loop_vec_info::scalar_loop_scaling): New field.
(LOOP_VINFO_SCALAR_LOOP_SCALING): new.
* tree-vect-loop.c (_loop_vec_info::_loop_vec_info): Initialize
scalar_loop_scaling.
(vect_transform_loop): Scale scalar loop profile if needed.
* tree-vect-loop-manip.c (vect_loop_versioning): When re-using
the loop copy from if-conversion adjust edge probabilities
and scale the vectorized loop body profile, queue the scalar
profile for updating after peeling.

Index: gcc/tree-vectorizer.h
===
--- gcc/tree-vectorizer.h   (revision 272636)
+++ gcc/tree-vectorizer.h   (working copy)
@@ -548,6 +548,9 @@ typedef struct _loop_vec_info : public v
   /* Mark loops having masked stores.  */
   bool has_mask_store;
 
+  /* Queued scaling factor for the scalar loop.  */
+  profile_probability scalar_loop_scaling;
+
   /* If if-conversion versioned this loop before conversion, this is the
  loop version without if-conversion.  */
   struct loop *scalar_loop;
@@ -603,6 +606,7 @@ typedef struct _loop_vec_info : public v
 #define LOOP_VINFO_PEELING_FOR_NITER(L)(L)->peeling_for_niter
 #define LOOP_VINFO_NO_DATA_DEPENDENCIES(L) (L)->no_data_dependencies
 #define LOOP_VINFO_SCALAR_LOOP(L) (L)->scalar_loop
+#define LOOP_VINFO_SCALAR_LOOP_SCALING(L)  (L)->scalar_loop_scaling
 #define LOOP_VINFO_HAS_MASK_STORE(L)   (L)->has_mask_store
 #define LOOP_VINFO_SCALAR_ITERATION_COST(L) (L)->scalar_cost_vec
 #define LOOP_VINFO_SINGLE_SCALAR_ITERATION_COST(L) 
(L)->single_scalar_iteration_cost
Index: gcc/tree-vect-loop.c
===
--- gcc/tree-vect-loop.c(revision 272636)
+++ gcc/tree-vect-loop.c(working copy)
@@ -835,6 +835,7 @@ _loop_vec_info::_loop_vec_info (struct l
 operands_swapped (false),
 no_data_dependencies (false),
 has_mask_store (false),
+scalar_loop_scaling (profile_probability::uninitialized ()),
 scalar_loop (NULL),
 orig_loop_info (NULL)
 {
@@ -8562,6 +8563,10 @@ vect_transform_loop (loop_vec_info loop_
   epilogue = vect_do_peeling (loop_vinfo, niters, nitersm1, _vector,
  _vector, _vector_mult_vf, th,
  check_profitability, niters_no_overflow);
+  if (LOOP_VINFO_SCALAR_LOOP (loop_vinfo)
+  && LOOP_VINFO_SCALAR_LOOP_SCALING (loop_vinfo).initialized_p ())
+scale_loop_frequencies (LOOP_VINFO_SCALAR_LOOP (loop_vinfo),
+   LOOP_VINFO_SCALAR_LOOP_SCALING (loop_vinfo));
 
   if (niters_vector == NULL_TREE)
 {
Index: gcc/tree-vect-loop-manip.c
===
--- gcc/tree-vect-loop-manip.c  (revision 272636)
+++ gcc/tree-vect-loop-manip.c  (working copy)
@@ -3114,8 +3114,17 @@ vect_loop_versioning (loop_vec_info loop
 GSI_SAME_STMT);
}
 
-  /* ???  if-conversion uses profile_probability::always () but
- prob below is profile_probability::likely ().  */
+  /* if-conversion uses profile_probability::always () for both paths,
+reset the paths probabilities appropriately.  */
+  edge te, fe;
+  extract_true_false_edges_from_block (condition_bb, &te, &fe);
+  te->probability = prob;
+  fe->probability = prob.invert ();
+  /* We can scale loops counts immediately but have to postpone
+ scaling the scalar loop because we re-use it during peeling.  */
+  scale_loop_frequencies (loop_to_version, prob);
+  LOOP_VINFO_SCALAR_LOOP_SCALING (loop_vinfo) = prob.invert ();
+
   nloop = scalar_loop;
   if (dump_enabled_p ())
dump_printf_loc (MSG_NOTE, vect_location,


Re: Use ODR for canonical types construction in LTO

2019-06-25 Thread Christophe Lyon
Hi,


On Tue, 25 Jun 2019 at 10:20, Jan Hubicka  wrote:
>
> > > * gcc-interface/decl.c (gnat_to_gnu_entity): Check that
> > > type is array or integer prior checking string flag.
> >
> > The test for array is superfluous here.
> >
> > > * gcc-interface/gigi.h (gnat_signed_type_for,
> > > maybe_character_value): Likewise.
> >
> > Wrong ChangeLog, the first modified function is maybe_character_type.
> >
> > I have installed the attached patchlet after testing it on x86-64/Linux.
> >
> >
> >   * gcc-interface/decl.c (gnat_to_gnu_entity): Remove superfluous test 
> > in
> >   previous change.
> >   * gcc-interface/gigi.h (maybe_character_type): Fix formatting.
> >   (maybe_character_value): Likewise.
>
> Thanks a lot. I was not quite sure if ARRAY_TYPEs can happen there
> and I should have added you to the CC.
>

After the main commit (r272628), I have noticed regressions on arm and aarch64:

g++.dg/lto/pr60336 cp_lto_pr60336_0.o-cp_lto_pr60336_0.o link, -O0
-flto -flto-partition=1to1 -fno-use-linker-plugin  (internal compiler
error)
g++.dg/lto/pr60336 cp_lto_pr60336_0.o-cp_lto_pr60336_0.o link, -O0
-flto -flto-partition=none -fuse-linker-plugin (internal compiler
error)
g++.dg/lto/pr60336 cp_lto_pr60336_0.o-cp_lto_pr60336_0.o link, -O0
-flto -fuse-linker-plugin -fno-fat-lto-objects  (internal compiler
error)
g++.dg/lto/pr60336 cp_lto_pr60336_0.o-cp_lto_pr60336_0.o link, -O2
-flto -flto-partition=1to1 -fno-use-linker-plugin  (internal compiler
error)
g++.dg/lto/pr60336 cp_lto_pr60336_0.o-cp_lto_pr60336_0.o link, -O2
-flto -flto-partition=none -fuse-linker-plugin -fno-fat-lto-objects
(internal compiler error)
g++.dg/lto/pr60336 cp_lto_pr60336_0.o-cp_lto_pr60336_0.o link, -O2
-flto -fuse-linker-plugin (internal compiler error)
g++.dg/torture/pr45843.C   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  (internal compiler error)
g++.dg/torture/pr45843.C   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  (internal compiler error)
g++.dg/torture/stackalign/eh-vararg-1.C   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  (internal compiler error)
g++.dg/torture/stackalign/eh-vararg-1.C   -O2 -flto
-fno-use-linker-plugin -flto-partition=none -fpic (internal compiler
error)
g++.dg/torture/stackalign/eh-vararg-1.C   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
g++.dg/torture/stackalign/eh-vararg-1.C   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects -fpic (internal compiler
error)
g++.dg/torture/stackalign/eh-vararg-2.C   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  (internal compiler error)
g++.dg/torture/stackalign/eh-vararg-2.C   -O2 -flto
-fno-use-linker-plugin -flto-partition=none -fpic (internal compiler
error)
g++.dg/torture/stackalign/eh-vararg-2.C   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  (internal compiler error)
g++.dg/torture/stackalign/eh-vararg-2.C   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects -fpic (internal compiler
error)

A sample ICE:
lto1: error: type variant differs by TYPE_CXX_ODR_P
  constant 256>
unit-size  constant 32>
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x2b3d78275dc8
fields 
public unsigned DI
size 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1
structural-equality
pointer_to_this >
unsigned DI :0:0 size  unit-size 
align:64 warn_if_not_align:0 offset_align 128
offset 
bit-offset  context

chain 
unsigned DI :0:0 size  unit-size 
align:64 warn_if_not_align:0 offset_align 128 offset
 bit-offset  context  chain >>
reference_to_this  chain >
  constant 256>
unit-size  constant 32>
align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type
0x2b3d78275dc8
fields 
public unsigned DI
size 
unit-size 
align:64 warn_if_not_align:0 symtab:0 alias-set -1
structural-equality
pointer_to_this >
unsigned DI :0:0 size  unit-size 
align:64 warn_if_not_align:0 offset_align 128
offset 
bit-offset  context

chain 
unsigned DI :0:0 size  unit-size 
align:64 warn_if_not_align:0 offset_align 128 offset
 bit-offset  context  chain >>
pointer_to_this >
lto1: internal compiler error: 'verify_type' failed
0xe667b0 verify_type(tree_node const*)
/gcc/tree.c:14650
0x632cd7 lto_fixup_state
/gcc/lto/lto-common.c:2429
0x63f459 lto_fixup_decls
/gcc/lto/lto-common.c:2460
0x63f459 read_cgraph_and_symbols(unsigned int, char const**)
/gcc/lto/lto-common.c:2693
0x620fa2 lto_main()
/gcc/lto/lto.c:616
Please submit a full bug report,

Christophe

> Honza
> >
> > --
> > Eric Botcazou
>
> > Index: gcc-interface/decl.c
> > ===
> > --- 

[PATCH] Transform filter-rtags-warnings to filter-clang-warnings.

2019-06-25 Thread Martin Liška
Hi.

I've stopped using rtags, but I'm still interested in the clang warnings produced for 
GCC.
Thus I'm transforming the script a bit. Current output:

/home/marxin/Programming/gcc/gcc/config/i386/i386.c:10364:6: warning: use of 
logical '||' with constant operand [-Wconstant-logical-operand]
/home/marxin/Programming/gcc/gcc/cp/lex.c:169:45: warning: result of comparison 
of constant 64 with expression of type 'enum ovl_op_code' is always true 
[-Wtautological-constant-out-of-range-compare]
/home/marxin/Programming/gcc/gcc/dwarf2out.c:5150:1: warning: unused function 
'add_AT_vms_delta' [-Wunused-function]
/home/marxin/Programming/gcc/gcc/edit-context.c:1642:23: warning: empty 
parentheses interpreted as a function declaration [-Wvexing-parse]
/home/marxin/Programming/gcc/gcc/edit-context.c:1673:23: warning: empty 
parentheses interpreted as a function declaration [-Wvexing-parse]
/home/marxin/Programming/gcc/gcc/fortran/gfortran.texi:1791: warning: @node 
name should not contain `,': Default widths for F, G and I format descriptors
/home/marxin/Programming/gcc/gcc/fortran/gfortran.texi:2777: warning: @ref node 
name should not contain `:'
/home/marxin/Programming/gcc/gcc/genconditions.c:126:58: warning: cast from 
'void **' to 'const struct c_test **' must have all intermediate pointers const 
qualified to be safe [-Wcast-qual]
/home/marxin/Programming/gcc/gcc/ggc-page.c:946:60: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
/home/marxin/Programming/gcc/gcc/ggc-page.c:947:7: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
/home/marxin/Programming/gcc/gcc/ggc-page.c:980:20: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
/home/marxin/Programming/gcc/gcc/ggc-page.c:980:7: warning: format specifies 
type 'void *' but the argument has type 'char *' [-Wformat-pedantic]
/home/marxin/Programming/gcc/gcc/omp-grid.c:1069:7: warning: comparison of two 
values with different enumeration types in switch statement ('enum tree_code' 
and 'omp_clause_code') [-Wenum-compare-switch]
/home/marxin/Programming/gcc/gcc/omp-grid.c:1080:7: warning: comparison of two 
values with different enumeration types in switch statement ('enum tree_code' 
and 'omp_clause_code') [-Wenum-compare-switch]
/home/marxin/Programming/gcc/gcc/omp-grid.c:1081:7: warning: comparison of two 
values with different enumeration types in switch statement ('enum tree_code' 
and 'omp_clause_code') [-Wenum-compare-switch]
/home/marxin/Programming/gcc/gcc/omp-grid.c:1082:7: warning: comparison of two 
values with different enumeration types in switch statement ('enum tree_code' 
and 'omp_clause_code') [-Wenum-compare-switch]
/home/marxin/Programming/gcc/gcc/print-rtl.h:72:22: warning: private field 
'm_rtx_reuse_manager' is not used [-Wunused-private-field]
/home/marxin/Programming/gcc/gcc/reload1.c:3530:32: warning: unknown warning 
group '-Wmaybe-uninitialized', ignored [-Wunknown-warning-option]
/home/marxin/Programming/gcc/gcc/tree.c:13462:16: warning: result of comparison 
of constant 42405 with expression of type 'enum tree_code' is always false 
[-Wtautological-constant-out-of-range-compare]
/home/marxin/Programming/gcc/gcc/tree.c:13840:28: warning: use of logical '&&' 
with constant operand [-Wconstant-logical-operand]
/home/marxin/Programming/gcc/libcpp/include/cpplib.h:897:14: warning: private 
field 'm_line_table' is not used [-Wunused-private-field]
/home/marxin/Programming/gcc/libcpp/include/cpplib.h:897:14: warning: private 
field 'm_line_table' is not used [-Wunused-private-field]
libtool: install: warning: remember to run `libtool --finish 
/home/marxin/bin/gcc/lib/gcc/x86_64-pc-linux-gnu/10.0.0'

Apart from that, it noticed 2 warnings that I'm going to address in a separate 
patch.

I'm going to install it.

Martin

ChangeLog:

contrib/filter-clang-warnings.py: Transform from
filter-rtags-warnings.py.
---
 ...s-warnings.py => filter-clang-warnings.py} | 47 ++-
 1 file changed, 24 insertions(+), 23 deletions(-)
 rename contrib/{filter-rtags-warnings.py => filter-clang-warnings.py} (66%)


diff --git a/contrib/filter-rtags-warnings.py b/contrib/filter-clang-warnings.py
similarity index 66%
rename from contrib/filter-rtags-warnings.py
rename to contrib/filter-clang-warnings.py
index ee27e7c8942..15cca5ff2df 100755
--- a/contrib/filter-rtags-warnings.py
+++ b/contrib/filter-clang-warnings.py
@@ -1,7 +1,6 @@
 #!/usr/bin/env python3
 #
-# Script to analyze warnings produced by rtags command (using LLVM):
-# rc --diagnose-all --synchronous-diagnostics --json
+# Script to analyze warnings produced by clang.
 #
 # This file is part of GCC.
 #
@@ -23,26 +22,26 @@
 #
 
 import sys
-import json
 import argparse
 
-def skip_warning(filename, warning):
+def skip_warning(filename, message):
 ignores = {
 '': ['-Warray-bounds', '-Wmismatched-tags', 'gcc_gfc: 

Re: [SVE] [fwprop] PR88833 - Redundant moves for WHILELO-based loops

2019-06-25 Thread Prathamesh Kulkarni
On Mon, 24 Jun 2019 at 21:41, Prathamesh Kulkarni
 wrote:
>
> On Mon, 24 Jun 2019 at 19:51, Richard Sandiford
>  wrote:
> >
> > Prathamesh Kulkarni  writes:
> > > @@ -1415,6 +1460,19 @@ forward_propagate_into (df_ref use)
> > >if (!def_set)
> > >  return false;
> > >
> > > +  if (reg_prop_only
> > > +  && !REG_P (SET_SRC (def_set))
> > > +  && !REG_P (SET_DEST (def_set)))
> > > +return false;
> >
> > This should be:
> >
> >   if (reg_prop_only
> >   && (!REG_P (SET_SRC (def_set)) || !REG_P (SET_DEST (def_set))))
> > return false;
> >
> > so that we return false if either operand isn't a register.
> Oops, sorry about that  -:(
> >
> > > +
> > > +  /* Allow propagations into a loop only for reg-to-reg copies, since
> > > + replacing one register by another shouldn't increase the cost.  */
> > > +
> > > +  if (DF_REF_BB (def)->loop_father != DF_REF_BB (use)->loop_father
> > > +  && !REG_P (SET_SRC (def_set))
> > > +  && !REG_P (SET_DEST (def_set)))
> > > +return false;
> >
> > Same here.
> >
> > OK with that change, thanks.
> Thanks for the review, will make the changes and commit the patch
> after re-testing.
Hi,
Testing the patch showed the following failures on 32-bit x86:

  Executed from: g++.target/i386/i386.exp
g++:g++.target/i386/pr88152.C   scan-assembler-not vpcmpgt|vpcmpeq|vpsra
  Executed from: gcc.target/i386/i386.exp
gcc:gcc.target/i386/pr66768.c scan-assembler add*.[ \t]%gs:
gcc:gcc.target/i386/pr90178.c scan-assembler-times xorl[\\t
]*\\%eax,[\\t ]*%eax 1

The failure of pr88152.C is also seen on x86_64.

For pr66768.c and pr90178.c, forwprop replaces a register that is
volatile or frame-related respectively.
To avoid that, the attached patch makes a stronger constraint: src and
dest must both be registers that do not have the frame_related or
volatil flags set, which is checked in usable_reg_p().
That avoids the failures for both cases.
Does it look OK?

For g++.target/i386/pr88152.C, the issue is that after the patch,
forwprop1 does the following propagation (in f10), which wasn't done
before:

In insn 10, replacing
 (unspec:SI [
(reg:V2DF 91)
] UNSPEC_MOVMSK)
 with (unspec:SI [
(subreg:V2DF (reg:V2DI 90) 0)
] UNSPEC_MOVMSK)

This later defeats combine because insn 9 gets deleted.
Without the patch, the following combination takes place:

Trying 7 -> 9:
7: r90:V2DI=r89:V2DI>r93:V2DI
  REG_DEAD r93:V2DI
  REG_DEAD r89:V2DI
9: r91:V2DF=r90:V2DI#0
  REG_DEAD r90:V2DI
Successfully matched this instruction:
(set (subreg:V2DI (reg:V2DF 91) 0)
(gt:V2DI (reg:V2DI 89)
(reg:V2DI 93)))
allowing combination of insns 7 and 9

and then:
Trying 6, 9 -> 10:
6: r89:V2DI=const_vector
9: r91:V2DF#0=r89:V2DI>r93:V2DI
  REG_DEAD r89:V2DI
  REG_DEAD r93:V2DI
   10: r87:SI=unspec[r91:V2DF] 43
  REG_DEAD r91:V2DF
Successfully matched this instruction:
(set (reg:SI 87)
(unspec:SI [
(lt:V2DF (reg:V2DI 93)
(const_vector:V2DI [
(const_int 0 [0]) repeated x2
]))
] UNSPEC_MOVMSK))
allowing combination of insns 6, 9 and 10
original costs 4 + 8 + 4 = 16
replacement cost 12
deferring deletion of insn with uid = 9.
deferring deletion of insn with uid = 6.
which deletes insns 2, 3, 6, 7, 9.

With the patch, it fails to combine 7->10:
Trying 7 -> 10:
7: r90:V2DI=r89:V2DI>r93:V2DI
  REG_DEAD r93:V2DI
  REG_DEAD r89:V2DI
   10: r87:SI=unspec[r90:V2DI#0] 43
  REG_DEAD r90:V2DI
Failed to match this instruction:
(set (reg:SI 87)
(unspec:SI [
(subreg:V2DF (gt:V2DI (reg:V2DI 89)
(reg:V2DI 93)) 0)
] UNSPEC_MOVMSK))

and subsequently 6, 7 -> 10
(attached combine dumps before and after patch).

So IIUC, the issue is that the target does not have a pattern that can
match the above insn?
I tried a simple workaround to "pessimize" the else condition in
propagate_rtx_1 in the patch, requiring that old_rtx and new_rtx have
the same rtx_code.  That at least works for this test case, but I'm not
sure whether it's the correct approach.
Could you suggest how to proceed?

Thanks,
Prathamesh
>
> Thanks,
> Prathamesh
> >
> > Richard


before.combine
Description: Binary data


after.combine
Description: Binary data
diff --git a/gcc/fwprop.c b/gcc/fwprop.c
index 45703fe5f01..fd4e4eb2816 100644
--- a/gcc/fwprop.c
+++ b/gcc/fwprop.c
@@ -448,6 +448,22 @@ enum {
   PR_OPTIMIZE_FOR_SPEED = 4
 };
 
+/* Avoid propagating X if it is a reg with one of the below flags set.  */
+
+static bool
+usable_reg_p (rtx x)
+{
+  if (!REG_P (x))
+return false;
+
+  if (RTX_FLAG (x, frame_related))
+return false;
+
+  else if (RTX_FLAG (x, volatil))
+return false;
+
+  return true;
+}
 
 /* Replace all occurrences of OLD in *PX with NEW and try to simplify the
resulting expression.  Replace *PX with a new RTL expression if an
@@ -547,6 +563,54 @@ propagate_rtx_1 (rtx *px, rtx old_rtx, rtx new_rtx, 

Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2019-06-25 Thread Martin Liška
On 6/25/19 12:25 PM, Martin Liška wrote:
> On 6/24/19 4:09 PM, Richard Biener wrote:
>> You still get one instance in each TU ...
> 
> Right, fixed in attached patch.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 

Btw. clang provides a reasonable warning for that:

/home/marxin/Programming/gcc/gcc/hash-table.h:1017:1: warning: 'static' 
function 'hashtab_chk_error' declared in header file should be declared 'static 
inline' [-Wunneeded-internal-declaration]

Martin
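
A minimal illustration of what the move fixes (a sketch, not GCC's code): a
static function defined in a header becomes a separate internal-linkage copy in
every translation unit that includes it, which is what the clang warning points
at, whereas a declaration in the header plus one out-of-line definition is
emitted exactly once:

  // --- report.h, before: one copy per including TU ---
  // static void report_failure () { __builtin_trap (); }

  // --- report.h, after: declaration only ---
  extern void report_failure ();

  // --- report.cc, after: the single definition ---
  void
  report_failure ()
  {
    __builtin_trap ();
  }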


Re: [PATCH] Remove dead code in df-scan.c (PR tree-optimization/90978).

2019-06-25 Thread Richard Sandiford
Martin Liška  writes:
> Hi.
>
> The patch is about removal of unreachable code.  That has been proved
> by an accidental gcc_unreachable placed there 9 years ago.

Heh.

> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

Yes, thanks.

Richard

> Thanks,
> Martin
>
> gcc/ChangeLog:
>
> 2019-06-25  Martin Liska  
>
>   PR tree-optimization/90978
>   * df-scan.c (df_update_entry_block_defs): Remove dead else
>   branch.
>   (df_update_exit_block_uses): Likewise.


Re: C++ PATCH for c++/83820 - excessive attribute arguments not detected

2019-06-25 Thread Iain Sandoe
Hi Dominique,

> On 25 Jun 2019, at 12:10, Dominique d'Humières  wrote:
> 
> On darwin* I see
> 
> FAIL: g++.dg/cpp0x/gen-attrs-67.C  -std=c++14 (test for excess errors)
> FAIL: g++.dg/cpp0x/gen-attrs-67.C  -std=c++17 (test for excess errors)
> 
> This is caused by the additional error
> 
> /opt/gcc/_clean/gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C:11:34: error: 
> constructor priorities are not supported
>   11 | [[gnu::constructor(101)]] int f7();
> 
> and it is fixed by the following patch
> 
> --- ../_clean/gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C   2019-06-17 
> 20:33:15.0 +0200
> +++ gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C 2019-06-20 18:13:13.0 
> +0200
> @@ -8,4 +8,4 @@
> [[nodiscard()]] int f4(); // { dg-error ".nodiscard. attribute does not take 
> any arguments" }
> [[gnu::noinline()]] int f5(); // { dg-error ".noinline. attribute does not 
> take any arguments" }
> [[gnu::constructor]] int f6();
> -[[gnu::constructor(101)]] int f7();
> +[[gnu::constructor(101)]] int f7(); // { dg-error "constructor priorities 
> are not supported" { target *-*-darwin* } }

I think this needs to be:

+[[gnu::constructor(101)]] int f7(); // { dg-error "constructor priorities are 
not supported" "" { target *-*-darwin* } . }

or it will fail on other targets,

OK from a Darwin pov with that change.
Iain




Re: C++ PATCH for c++/83820 - excessive attribute arguments not detected

2019-06-25 Thread Dominique d'Humières
On darwin* I see

FAIL: g++.dg/cpp0x/gen-attrs-67.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/gen-attrs-67.C  -std=c++17 (test for excess errors)

This is caused by the additional error

/opt/gcc/_clean/gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C:11:34: error: 
constructor priorities are not supported
   11 | [[gnu::constructor(101)]] int f7();

and it is fixed by the following patch

--- ../_clean/gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C 2019-06-17 
20:33:15.0 +0200
+++ gcc/testsuite/g++.dg/cpp0x/gen-attrs-67.C   2019-06-20 18:13:13.0 
+0200
@@ -8,4 +8,4 @@
 [[nodiscard()]] int f4(); // { dg-error ".nodiscard. attribute does not take 
any arguments" }
 [[gnu::noinline()]] int f5(); // { dg-error ".noinline. attribute does not 
take any arguments" }
 [[gnu::constructor]] int f6();
-[[gnu::constructor(101)]] int f7();
+[[gnu::constructor(101)]] int f7(); // { dg-error "constructor priorities are 
not supported" { target *-*-darwin* } }

Is it OK to commit it to trunk?

TIA

Dominique

PING: [PATCH, Modula-2 (C/C++/D/F/Go/Jit)] (Register spec fn) (v2)

2019-06-25 Thread Gaius Mulley


Just wanted to politely ping

https://gcc.gnu.org/ml/gcc-patches/2019-06/msg00832.html

(which was a rewrite of:
https://gcc.gnu.org/ml/gcc-patches/2013-11/msg02620.html
to set the context of the above approach)

OK for trunk?


regards,
Gaius


Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.

2019-06-25 Thread Martin Liška
On 6/24/19 4:09 PM, Richard Biener wrote:
> You still get one instance in each TU ...

Right, fixed in attached patch.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
From aa5ea14a8665b14aa60245c42bd4c9809d0bf81a Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 25 Jun 2019 10:33:39 +0200
Subject: [PATCH] Put hashtab_chk_error into hash-table.c.

gcc/ChangeLog:

2019-06-25  Martin Liska  

	* hash-table.c (hashtab_chk_error): Move here from ...
	* hash-table.h (hashtab_chk_error): ... here.
---
 gcc/hash-table.c | 12 
 gcc/hash-table.h | 14 ++
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/gcc/hash-table.c b/gcc/hash-table.c
index 8e86fffa36f..e3b5d3da09e 100644
--- a/gcc/hash-table.c
+++ b/gcc/hash-table.c
@@ -124,3 +124,15 @@ void dump_hash_table_loc_statistics (void)
   hash_table_usage ().dump (origin);
 }
 }
+
+/* Report a hash table checking error.  */
+
+ATTRIBUTE_NORETURN ATTRIBUTE_COLD
+void
+hashtab_chk_error ()
+{
+  fprintf (stderr, "hash table checking failed: "
+	   "equal operator returns true for a pair "
+	   "of values with a different hash value\n");
+  gcc_unreachable ();
+}
diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 4f5e150a0ac..a39fb942158 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -303,6 +303,8 @@ extern unsigned int hash_table_sanitize_eq_limit;
 extern unsigned int hash_table_higher_prime_index (unsigned long n)
ATTRIBUTE_PURE;
 
+extern ATTRIBUTE_NORETURN ATTRIBUTE_COLD void hashtab_chk_error ();
+
 /* Return X % Y using multiplicative inverse values INV and SHIFT.
 
The multiplicative inverses computed above are for 32-bit types,
@@ -1010,18 +1012,6 @@ hash_table
   return &m_entries[index];
 }
 
-/* Report a hash table checking error.  */
-
-ATTRIBUTE_NORETURN ATTRIBUTE_COLD
-static void
-hashtab_chk_error ()
-{
-  fprintf (stderr, "hash table checking failed: "
-	   "equal operator returns true for a pair "
-	   "of values with a different hash value\n");
-  gcc_unreachable ();
-}
-
 /* Verify that all existing elements in th hash table which are
equal to COMPARABLE have an equal HASH value provided as argument.  */
 
-- 
2.21.0



[PATCH] PR90930 followup

2019-06-25 Thread Richard Biener


This avoids going back-and-forth from linearized to widened
expressions in reassoc twice and only rewrites expressions
according to reassoc-width in the last reassoc pass.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2019-06-25  Richard Biener  

PR tree-optimization/90930
* tree-ssa-reassoc.c (reassociate_bb): Only rewrite expression
into parallel form in the last pass instance.

* gcc.dg/tree-ssa/reassoc-24.c: Adjust.
* gcc.dg/tree-ssa/reassoc-25.c: Likewise.

Index: gcc/tree-ssa-reassoc.c
===
--- gcc/tree-ssa-reassoc.c  (revision 272636)
+++ gcc/tree-ssa-reassoc.c  (working copy)
@@ -6013,12 +6013,7 @@ reassociate_bb (basic_block bb)
{
  machine_mode mode = TYPE_MODE (TREE_TYPE (lhs));
  int ops_num = ops.length ();
- int width = get_reassociation_width (ops_num, rhs_code, mode);
-
- if (dump_file && (dump_flags & TDF_DETAILS))
-   fprintf (dump_file,
-"Width = %d was chosen for reassociation\n", 
width);
-
+ int width;
 
  /* For binary bit operations, if there are at least 3
 operands and the last last operand in OPS is a constant,
@@ -6032,10 +6027,21 @@ reassociate_bb (basic_block bb)
  && TREE_CODE (ops.last ()->op) == INTEGER_CST)
std::swap (*ops[0], *ops[ops_num - 1]);
 
- if (width > 1
- && ops.length () > 3)
-   rewrite_expr_tree_parallel (as_a  (stmt),
-   width, ops);
+ /* Only rewrite the expression tree to parallel in the
+last reassoc pass to avoid useless work back-and-forth
+with initial linearization.  */
+ if (!reassoc_insert_powi_p
+ && ops.length () > 3
+ && (width = get_reassociation_width (ops_num, rhs_code,
+  mode)) > 1)
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file,
+"Width = %d was chosen for reassociation\n",
+width);
+ rewrite_expr_tree_parallel (as_a  (stmt),
+ width, ops);
+   }
  else
 {
   /* When there are three operands left, we want
Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c  (revision 272636)
+++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-24.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 --param tree-reassoc-width=2 -fdump-tree-reassoc1" } */
+/* { dg-options "-O2 --param tree-reassoc-width=2 -fdump-tree-reassoc2" } */
 
 unsigned int
 foo (void)
@@ -21,4 +21,4 @@ foo (void)
 
 /* Verify there are two pairs of __asm__ statements with no
intervening stmts.  */
-/* { dg-final { scan-tree-dump-times "__asm__\[^;\n]*;\n *__asm__" 2 
"reassoc1"} } */
+/* { dg-final { scan-tree-dump-times "__asm__\[^;\n]*;\n *__asm__" 2 
"reassoc2"} } */
Index: gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c
===
--- gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c  (revision 272636)
+++ gcc/testsuite/gcc.dg/tree-ssa/reassoc-25.c  (working copy)
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 --param tree-reassoc-width=3 
-fdump-tree-reassoc1-details" } */
+/* { dg-options "-O2 --param tree-reassoc-width=3 
-fdump-tree-reassoc2-details" } */
 
 unsigned int
 foo (int a, int b, int c, int d)
@@ -15,4 +15,4 @@ foo (int a, int b, int c, int d)
 }
 
 /* Verify reassociation width was chosen to be 2.  */
-/* { dg-final { scan-tree-dump-times "Width = 2" 1 "reassoc1"} } */
+/* { dg-final { scan-tree-dump-times "Width = 2" 1 "reassoc2"} } */


Re: [PATCH][MSP430] Implement alternate "__intN__" form of "__intN" type

2019-06-25 Thread Jozef Lawrynowicz
On Mon, 24 Jun 2019 17:53:48 -0600
Jeff Law  wrote:

> On 6/24/19 4:25 AM, Jozef Lawrynowicz wrote:
> > 
> > diff --git a/gcc/brig/brig-lang.c b/gcc/brig/brig-lang.c
> > index 91c7cfa35da..be853ccbc02 100644
> > --- a/gcc/brig/brig-lang.c
> > +++ b/gcc/brig/brig-lang.c
> > @@ -864,10 +864,12 @@ brig_build_c_type_nodes (void)
> >for (i = 0; i < NUM_INT_N_ENTS; i++)
> > if (int_n_enabled_p[i])
> >   {
> > -   char name[50];
> > +   char name[25], altname[25];
> > sprintf (name, "__int%d unsigned", int_n_data[i].bitsize);
> > +   sprintf (altname, "__int%d__ unsigned", int_n_data[i].bitsize);  
> So isn't this going to cause problems with targets where a plain int is
> 64 bits and the sprintf format checking patches?  In that case it's
> going to have to assume the numeric part is 20 characters  + 5 more for
> the __int and you've overflowed.
> 
> Why not just keep the size at 50 bytes?
> 
> Similarly in a few other places where you made similar changes.

Thanks, yes that was just an oversight where I thought I could save some space
for "free".
> 
> It looks fine with that fixed.

Fixed and applied.

Thanks,
Jozef
> 
> jeff
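
To make the sizing concern concrete (illustrative arithmetic only): "__int" is 5
characters, the 20 digits Jeff mentions cover a bitsize printed as a 64-bit int,
and "__ unsigned" plus the terminating NUL add 12 more, i.e. up to 37 bytes,
which is more than the trimmed 25-byte buffers but comfortably inside the
original 50, as in the hunk quoted above:

  /* Worst case assumed by the format checker: 5 + 20 + 11 + 1 = 37 bytes,
     so keep the buffers at 50.  */
  char name[50], altname[50];
  sprintf (name, "__int%d unsigned", int_n_data[i].bitsize);
  sprintf (altname, "__int%d__ unsigned", int_n_data[i].bitsize);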



Re: [PATCH] [ARC] Fix PR89838

2019-06-25 Thread Claudiu Zissulescu
Thank you guys for your review. Patch pushed to master.

Claudiu


[PATCH V2, RFC] Fix PR62147 by passing finiteness information to RTL phase

2019-06-25 Thread Kewen.Lin
Hi Richard,

Thanks a lot for review comments. 

on 2019/6/25 下午3:23, Richard Biener wrote:
> On Tue, 25 Jun 2019, Kewen.Lin wrote:
> 
>> Hi all,
>>
>>
>> It's based on two observations:
>>   1) the loop structure for one specific loop is shared between middle-end 
>> and 
>>  back-end.
>>   2) for one specific loop, if it's finite then never become infinite itself.
>>   2) for one specific loop, if it is finite, it never later becomes infinite.
>> As one gcc newbie, I'm not sure whether these two observations are true in 
>> all
>> cases.  Please feel free to correct me if anything missing.
> 
> I think 2) is not true with -ffinite-loops.

I just looked at the patch for this option, but I don't fully understand how it
can affect 2).  It treats a loop with any normal exit as finite; can a loop
carrying that assertion later turn into an infinite one via some other analysis?

> 
>> btw, I also took a look at how the loop constraint LOOP_C_FINITE is used, I 
>> think
>> it's not suitable for this purpose, it's mainly set by vectorizer and tell 
>> niter 
>> and scev to take one loop as finite.  The original patch has the words 
>> "constraint flag is mainly set by consumers and affects certain semantics of 
>> niter analyzer APIs".
>>
>> Bootstrapped and regression testing passed on powerpc64le-unknown-linux-gnu.
> 
> Did you consider to simply use finite_loop_p () from doloop.c?  That
> would be a much simpler patch.

Good suggestion!  I had taken it for granted that the function could only be
effective in the middle end, but in fact some information, like the
any_upper_bound bit, can be carried through to RTL.

> 
> For the testcase in question -ffinite-loops would provide this guarantee
> even on RTL, so would the upper bound that may be still set.
> 
> Richard.
> 

The new version with Richard's suggestion is listed below.
Regression testing is ongoing.


Thanks,
Kewen

---

gcc/ChangeLog

2019-06-25  Kewen Lin  

PR target/62147
* gcc/loop-iv.c (find_simple_exit): Call finite_loop_p to update 
finiteness.

gcc/testsuite/ChangeLog

2019-06-25  Kewen Lin  

PR target/62147
* gcc.target/powerpc/pr62147.c: New test.


diff --git a/gcc/loop-iv.c b/gcc/loop-iv.c
index 82b4bdb1523..36f9856f5f6 100644
--- a/gcc/loop-iv.c
+++ b/gcc/loop-iv.c
@@ -61,6 +61,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "intl.h"
 #include "dumpfile.h"
 #include "rtl-iter.h"
+#include "tree-ssa-loop-niter.h"

 /* Possible return values of iv_get_reaching_def.  */

@@ -2997,6 +2998,19 @@ find_simple_exit (struct loop *loop, struct niter_desc 
*desc)
fprintf (dump_file, "Loop %d is not simple.\n", loop->num);
 }

+  /* Fix up the finiteness if possible.  We can only do it for single exit,
+ since the loop is finite, but it's possible that we predicate one loop
+ exit to be finite which can not be determined as finite in middle-end as
+ well.  It results in incorrect predicate information on the exit condition
+ expression.  For example, if says [(int) _1 + -8, + , -8] != 0 finite,
+ it means _1 can exactly divide -8.  */
+  if (single_exit (loop) && finite_loop_p (loop))
+{
+  desc->infinite = NULL_RTX;
+  if (dump_file)
+   fprintf (dump_file, "  infinite updated to finite.\n");
+}
+
   free (body);
 }

diff --git a/gcc/testsuite/gcc.target/powerpc/pr62147.c 
b/gcc/testsuite/gcc.target/powerpc/pr62147.c
new file mode 100644
index 000..635c73711da
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr62147.c
@@ -0,0 +1,24 @@
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-options "-O2 -fno-tree-loop-distribute-patterns" } */
+
+/* Note that it's required to disable loop-distribute-patterns, otherwise the
+   loop will be optimized to memset.  */
+
+/* Expect loop_iv can know the loop is finite so the doloop_optimize
+   can perform the doloop transformation.  */
+
+typedef struct {
+  int l;
+  int b[258];
+} S;
+
+void clear (S* s )
+{
+  int i;
+  int len = s->l + 1;
+
+  for (i = 0; i <= len; i++)
+s->b[i] = 0;
+}
+
+/* { dg-final { scan-assembler {\mbdnz\M} } } */



Re: [committed, amdgcn] Wait for exit value to write before exiting.

2019-06-25 Thread Andrew Stubbs

On 24/05/2019 16:31, Andrew Stubbs wrote:
This patch fixes a bug in which GCN5 devices often fail to return an 
exit value because it's not yet been written to memory when the program 
exits. The fix is simply to wait for it properly. GCN3 devices did not 
demonstrate the problem, but it was technically wrong there also. The 
bug was introduced when we stopped waiting for all writes to complete.


I've also taken the opportunity to adjust gcn-run such that a similar 
issue can't go unnoticed for so long, in future.


Now backported to gcc-9-branch.

Andrew


Re: [committed, amdgcn] Fix stack initialization bug

2019-06-25 Thread Andrew Stubbs

On 24/05/2019 12:12, Andrew Stubbs wrote:
This patch fixes a 64-bit arithmetic bug in which the wrong instruction 
was used for the lo-part resulting in an incorrect calculation for the 
hi-part (signed vs. unsigned add). This causes a Memory Access Fault 
whenever the launcher happens to choose a problematic address for the 
stack allocation.


This problem never occurred on GCN3 because the launcher always chose 
addresses in the 32-bit range. It seems to happen more frequently on 
GCN5 devices since a recent ROCm update.


Now backported to gcc-9-branch.

Andrew


[PATCH] Change std::ceil2 to be undefined if the result can't be represented

2019-06-25 Thread Jonathan Wakely

* include/std/bit (__ceil2): Make unrepresentable results undefined,
as per P1355R2. Add debug assertion. Perform one left shift, not two,
so that out of range values cause undefined behaviour. Ensure that
shift will still be undefined if left operand is promoted.
* testsuite/26_numerics/bit/bit.pow.two/ceil2.cc: Replace checks for
unrepresentable values with checks that they are not core constant
expressions.
* testsuite/26_numerics/bit/bit.pow.two/ceil2_neg.cc: New test.

I'm not committing this yet, because P1355 hasn't been accepted into
the draft, but here's a patch to implement it (this reverses the
changes in r263986, and adds special handling for types that undergo
integer promotion).

The goal is that undefined shifts are detectable in three ways, even
if the type is promoted:

* In constant expressions they make the program ill-formed.
* At runtime they cause UBSan errors.
* At runtime they abort when _GLIBCXX_ASSERTIONS is defined.
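
As a concrete illustration, assuming <bit> provides std::ceil2 as patched here:
for unsigned char, _Nd is 8 and the operand promotes to int, so __extra_exp is
sizeof(int) / sizeof(unsigned char) / 2 == 2 and a would-be shift of 8 becomes
8 | (8 << 2) == 40, which is undefined for the promoted type as intended.  In a
constant expression that now surfaces as a hard error:

  #include <bit>

  constexpr unsigned char ok = std::ceil2 ((unsigned char) 100);  // 128, representable
  // constexpr unsigned char bad = std::ceil2 ((unsigned char) 200);
  //   ill-formed with this change: the mathematical result (256) is not
  //   representable in unsigned char, so evaluation hits undefined behaviour.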

commit fd8d9b7898083c8806d2cd300f78739d2afc3503
Author: Jonathan Wakely 
Date:   Fri Jun 14 13:32:39 2019 +0100

Change std::ceil2 to be undefined if the result can't be represented

* include/std/bit (__ceil2): Make unrepresentable results undefined,
as per P1355R2. Add debug assertion. Perform one left shift, not 
two,
so that out of range values cause undefined behaviour. Ensure that
shift will still be undefined if left operand is promoted.
* testsuite/26_numerics/bit/bit.pow.two/ceil2.cc: Replace checks for
unrepresentable values with checks that they are not core constant
expressions.
* testsuite/26_numerics/bit/bit.pow.two/ceil2_neg.cc: New test.

diff --git a/libstdc++-v3/include/std/bit b/libstdc++-v3/include/std/bit
index e0c53e53756..eb0a7578b8d 100644
--- a/libstdc++-v3/include/std/bit
+++ b/libstdc++-v3/include/std/bit
@@ -197,9 +197,27 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr auto _Nd = numeric_limits<_Tp>::digits;
   if (__x == 0 || __x == 1)
 return 1;
-  const unsigned __n = _Nd - std::__countl_zero((_Tp)(__x - 1u));
-  const _Tp __y_2 = (_Tp)1u << (__n - 1u);
-  return __y_2 << 1u;
+  auto __shift_exponent = _Nd - std::__countl_zero((_Tp)(__x - 1u));
+  // If the shift exponent equals _Nd then the correct result is not
+  // representable as a value of _Tp, and so the result is undefined.
+  // Want that undefined behaviour to be detected in constant expressions,
+  // by UBSan, and by debug assertions.
+#ifdef _GLIBCXX_HAVE_BUILTIN_IS_CONSTANT_EVALUATED
+  if (!__builtin_is_constant_evaluated())
+   __glibcxx_assert( __shift_exponent != numeric_limits<_Tp>::digits );
+#endif
+  using __promoted_type = decltype(__x << 1);
+  if _GLIBCXX17_CONSTEXPR (!is_same<__promoted_type, _Tp>::value)
+   {
+ // If __x undergoes integral promotion then shifting by _Nd is
+ // not undefined. In order to make the shift undefined, so that
+ // it is diagnosed in constant expressions and by UBsan, we also
+ // need to "promote" the shift exponent to be too large for the
+ // promoted type.
+ const int __extra_exp = sizeof(__promoted_type) / sizeof(_Tp) / 2;
+ __shift_exponent |= (__shift_exponent & _Nd) << __extra_exp;
+   }
+  return (_Tp)1u << __shift_exponent;
 }
 
   template
diff --git a/libstdc++-v3/testsuite/26_numerics/bit/bit.pow.two/ceil2.cc 
b/libstdc++-v3/testsuite/26_numerics/bit/bit.pow.two/ceil2.cc
index 6ffb5f70edb..788c008129e 100644
--- a/libstdc++-v3/testsuite/26_numerics/bit/bit.pow.two/ceil2.cc
+++ b/libstdc++-v3/testsuite/26_numerics/bit/bit.pow.two/ceil2.cc
@@ -20,6 +20,21 @@
 
 #include <bit>
 
+template<typename T>
+  constexpr T max = std::numeric_limits<T>::max();
+// Largest representable power of two (i.e. has most significant bit set)
+template<typename T>
+  constexpr T maxpow2 = T(1) << (std::numeric_limits<T>::digits - 1);
+
+// Detect whether std::ceil2(N) is a constant expression.
+template
+  struct ceil2_valid
+  : std::false_type { };
+
+template
+  struct ceil2_valid>
+  : std::true_type { };
+
 template<typename UInt>
 constexpr auto
 test(UInt x)
@@ -55,13 +70,18 @@ test(UInt x)
 static_assert( std::ceil2(UInt(3) << 64) == (UInt(4) << 64) );
   }
 
-  constexpr UInt msb = UInt(1) << (std::numeric_limits<UInt>::digits - 1);
+  constexpr UInt msb = maxpow2<UInt>;
+  static_assert( ceil2_valid() );
   static_assert( std::ceil2( msb ) == msb );
-  // Larger values cannot be represented so the return value is unspecified,
-  // but must still be valid in constant expressions, i.e. not undefined.
-  static_assert( std::ceil2( UInt(msb + 1) ) != 77 );
-  static_assert( std::ceil2( UInt(msb + 2) ) != 77 );
-  static_assert( std::ceil2( UInt(msb + 77) ) != 77 );
+  static_assert( std::ceil2( UInt(msb - 1) ) == msb );
+  static_assert( std::ceil2( UInt(msb - 2) ) == msb );
+  

[PATCH] Remove dead code in df-scan.c (PR tree-optimization/90978).

2019-06-25 Thread Martin Liška
Hi.

The patch removes unreachable code.  That the code is unreachable was
proved by a gcc_unreachable accidentally placed there 9 years ago.
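
A stand-alone sketch of the shape of the change (names made up; the real
change is in the diff below):

  #include <cassert>

  struct cache { int *bits; };

  // Before: a defensive else arm that can never run (the real code even
  // had a gcc_unreachable there that never fired).
  void update_before (cache *c)
  {
    if (c->bits)
      *c->bits = 1;
    else
      assert (false);
  }

  // After: assert the invariant once and keep only the live path.
  void update_after (cache *c)
  {
    assert (c->bits);
    *c->bits = 1;
  }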

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-06-25  Martin Liska  

PR tree-optimization/90978
* df-scan.c (df_update_entry_block_defs): Remove dead else
branch.
(df_update_exit_block_uses): Likewise.
---
 gcc/df-scan.c | 44 
 1 file changed, 12 insertions(+), 32 deletions(-)


diff --git a/gcc/df-scan.c b/gcc/df-scan.c
index 08d7af33371..2eea149e458 100644
--- a/gcc/df-scan.c
+++ b/gcc/df-scan.c
@@ -3601,23 +3601,13 @@ df_update_entry_block_defs (void)
 
   auto_bitmap refs (&df_bitmap_obstack);
   df_get_entry_block_def_set (refs);
-  if (df->entry_block_defs)
+  gcc_assert (df->entry_block_defs);
+  if (!bitmap_equal_p (df->entry_block_defs, refs))
 {
-  if (!bitmap_equal_p (df->entry_block_defs, refs))
-	{
-	  struct df_scan_bb_info *bb_info = df_scan_get_bb_info (ENTRY_BLOCK);
-	  df_ref_chain_delete_du_chain (bb_info->artificial_defs);
-	  df_ref_chain_delete (bb_info->artificial_defs);
-	  bb_info->artificial_defs = NULL;
-	  changed = true;
-	}
-}
-  else
-{
-  struct df_scan_problem_data *problem_data
-	= (struct df_scan_problem_data *) df_scan->problem_data;
-	gcc_unreachable ();
-  df->entry_block_defs = BITMAP_ALLOC (&problem_data->reg_bitmaps);
+  struct df_scan_bb_info *bb_info = df_scan_get_bb_info (ENTRY_BLOCK);
+  df_ref_chain_delete_du_chain (bb_info->artificial_defs);
+  df_ref_chain_delete (bb_info->artificial_defs);
+  bb_info->artificial_defs = NULL;
   changed = true;
 }
 
@@ -3775,23 +3765,13 @@ df_update_exit_block_uses (void)
 
   auto_bitmap refs (&df_bitmap_obstack);
   df_get_exit_block_use_set (refs);
-  if (df->exit_block_uses)
+  gcc_assert (df->exit_block_uses);
+  if (!bitmap_equal_p (df->exit_block_uses, refs))
 {
-  if (!bitmap_equal_p (df->exit_block_uses, refs))
-	{
-	  struct df_scan_bb_info *bb_info = df_scan_get_bb_info (EXIT_BLOCK);
-	  df_ref_chain_delete_du_chain (bb_info->artificial_uses);
-	  df_ref_chain_delete (bb_info->artificial_uses);
-	  bb_info->artificial_uses = NULL;
-	  changed = true;
-	}
-}
-  else
-{
-  struct df_scan_problem_data *problem_data
-	= (struct df_scan_problem_data *) df_scan->problem_data;
-	gcc_unreachable ();
-  df->exit_block_uses = BITMAP_ALLOC (&problem_data->reg_bitmaps);
+  struct df_scan_bb_info *bb_info = df_scan_get_bb_info (EXIT_BLOCK);
+  df_ref_chain_delete_du_chain (bb_info->artificial_uses);
+  df_ref_chain_delete (bb_info->artificial_uses);
+  bb_info->artificial_uses = NULL;
   changed = true;
 }
 



Re: [PATCH] Define midpoint and lerp functions for C++20 (P0811R3)

2019-06-25 Thread Jonathan Wakely

On 25/06/19 11:06 +0200, Rainer Orth wrote:

Hi Jonathan,


On 12/03/19 23:04 +, Jonathan Wakely wrote:

On 12/03/19 22:49 +, Joseph Myers wrote:

On Tue, 5 Mar 2019, Jonathan Wakely wrote:


The midpoint and lerp functions for floating point types come straight
from the P0811R3 proposal, with no attempt at optimization.


I don't know whether P0811R3 states different requirements from the public
P0811R2, but the implementation of midpoint using isnormal does *not*
satisfy "at most one inexact operation occurs" and is *not* correctly
rounded, contrary to the claims made in P0811R2.


I did wonder how the implementation in the paper was meant to meet the
stated requirements, but I didn't wonder too hard.


Consider e.g. midpoint(DBL_MIN + DBL_TRUE_MIN, DBL_MIN + DBL_TRUE_MIN).
The value DBL_MIN + DBL_TRUE_MIN is normal, but dividing it by 2 is
inexact (and so that midpoint implementation would produce DBL_MIN as
result, so failing to satisfy midpoint(x, x) == x).

Replacing isnormal(x) by something like isgreaterequal(fabs(x), MIN*2)
would avoid those inexact divisions, but there would still be spurious
overflows in non-default rounding modes for e.g. midpoint(DBL_MAX,
DBL_TRUE_MIN) in FE_UPWARD mode, so failing "No overflow occurs" if that's
meant to apply in all rounding modes.


Thanks for this review, and the useful cases to test. Ed is working on
adding some more tests, so maybe he can also look at improving the
code :-)


I've committed r272616 to make this case work. This is the proposal
author's most recent suggestion for the implementation.

Tested x86_64-linux, committed to trunk.


the 26_numerics/midpoint/floating.cc test now FAILs on Solaris (sparc
and x86, 32 and 64-bit):

+FAIL: 26_numerics/midpoint/floating.cc (test for excess errors)
+UNRESOLVED: 26_numerics/midpoint/floating.cc compilation failed to produce 
executable

Excess errors:
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:65:
 error: non-constant condition for static assertion
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:65: error: 'constexpr std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, std::__not_ > >, _Tp> 
std::midpoint(_Tp, _Tp) [with _Tp = double; std::enable_if_t<__and_v, std::is_same >::type, _Tp>, std::__not_ > >, _Tp> = double]' called in a constant expression
/var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/numeric:199:
 error: call to non-'constexpr' function 'double std::abs(double)'
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:68:
 error: non-constant condition for static assertion
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:68: error: 'constexpr std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, std::__not_ > >, _Tp> 
std::midpoint(_Tp, _Tp) [with _Tp = float; std::enable_if_t<__and_v, std::is_same >::type, _Tp>, std::__not_ > >, _Tp> = float]' called in a constant expression
/var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/numeric:199:
 error: call to non-'constexpr' function 'float std::abs(float)'
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:71:
 error: non-constant condition for static assertion
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:71: error: 'constexpr std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, std::__not_ > >, _Tp> 
std::midpoint(_Tp, _Tp) [with _Tp = long double; std::enable_if_t<__and_v, std::is_same >::type, _Tp>, std::__not_ > >, _Tp> = long double]' called in a constant expression
/var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/numeric:199:
 error: call to non-'constexpr' function 'long double std::abs(long double)'


Doh, I looked in  and saw that we get std::abs(double) from the
Solaris headers, and then forgot and used it anyway.

I'll replace that right away, thanks.
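
For reference, Joseph's DBL_MIN + DBL_TRUE_MIN case quoted above can be
reproduced with a stand-alone sketch like this (assuming strict IEEE double
arithmetic; this is not part of any patch in this thread):

  #include <cfloat>
  #include <cmath>
  #include <cstdio>

  // Naive P0811R2-style midpoint, for demonstration only.
  static double naive_midpoint (double a, double b)
  {
    if (std::isnormal (a) && std::isnormal (b))
      return a / 2 + b / 2;   // each division can be inexact
    return (a + b) / 2;
  }

  int main ()
  {
    // x is normal, but x / 2 is inexact: its lowest bit falls below the
    // subnormal spacing and is rounded away.
    const double x = DBL_MIN + DBL_TRUE_MIN;
    std::printf ("%d\n", naive_midpoint (x, x) == x);   // prints 0
    return 0;
  }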




Re: [PATCH] Define midpoint and lerp functions for C++20 (P0811R3)

2019-06-25 Thread Rainer Orth
Hi Jonathan,

> On 12/03/19 23:04 +, Jonathan Wakely wrote:
>>On 12/03/19 22:49 +, Joseph Myers wrote:
>>>On Tue, 5 Mar 2019, Jonathan Wakely wrote:
>>>
The midpoint and lerp functions for floating point types come straight
from the P0811R3 proposal, with no attempt at optimization.
>>>
>>>I don't know whether P0811R3 states different requirements from the public
>>>P0811R2, but the implementation of midpoint using isnormal does *not*
>>>satisfy "at most one inexact operation occurs" and is *not* correctly
>>>rounded, contrary to the claims made in P0811R2.
>>
>>I did wonder how the implementation in the paper was meant to meet the
>>stated requirements, but I didn't wonder too hard.
>>
>>>Consider e.g. midpoint(DBL_MIN + DBL_TRUE_MIN, DBL_MIN + DBL_TRUE_MIN).
>>>The value DBL_MIN + DBL_TRUE_MIN is normal, but dividing it by 2 is
>>>inexact (and so that midpoint implementation would produce DBL_MIN as
>>>result, so failing to satisfy midpoint(x, x) == x).
>>>
>>>Replacing isnormal(x) by something like isgreaterequal(fabs(x), MIN*2)
>>>would avoid those inexact divisions, but there would still be spurious
>>>overflows in non-default rounding modes for e.g. midpoint(DBL_MAX,
>>>DBL_TRUE_MIN) in FE_UPWARD mode, so failing "No overflow occurs" if that's
>>>meant to apply in all rounding modes.
>>
>>Thanks for this review, and the useful cases to test. Ed is working on
>>adding some more tests, so maybe he can also look at improving the
>>code :-)
>
> I've committed r272616 to make this case work. This is the proposal
> author's most recent suggestion for the implementation.
>
> Tested x86_64-linux, committed to trunk.

the 26_numerics/midpoint/floating.cc test now FAILs on Solaris (sparc
and x86, 32 and 64-bit):

+FAIL: 26_numerics/midpoint/floating.cc (test for excess errors)
+UNRESOLVED: 26_numerics/midpoint/floating.cc compilation failed to produce 
executable

Excess errors:
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:65:
 error: non-constant condition for static assertion
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:65:
 error: 'constexpr std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, 
std::__not_ > >, _Tp> std::midpoint(_Tp, _Tp) [with _Tp 
= double; std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, 
std::__not_ > >, _Tp> = double]' called in a constant 
expression
/var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/numeric:199:
 error: call to non-'constexpr' function 'double std::abs(double)'
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:68:
 error: non-constant condition for static assertion
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:68:
 error: 'constexpr std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, 
std::__not_ > >, _Tp> std::midpoint(_Tp, _Tp) [with _Tp 
= float; std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, 
std::__not_ > >, _Tp> = float]' called in a constant 
expression
/var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/numeric:199:
 error: call to non-'constexpr' function 'float std::abs(float)'
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:71:
 error: non-constant condition for static assertion
/vol/gcc/src/hg/trunk/local/libstdc++-v3/testsuite/26_numerics/midpoint/floating.cc:71:
 error: 'constexpr std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, 
std::__not_ > >, _Tp> std::midpoint(_Tp, _Tp) [with _Tp 
= long double; std::enable_if_t<__and_v, 
std::is_same >::type, _Tp>, 
std::__not_ > >, _Tp> = long double]' called in a 
constant expression
/var/gcc/regression/trunk/11-gcc/build/i386-pc-solaris2.11/libstdc++-v3/include/numeric:199:
 error: call to non-'constexpr' function 'long double std::abs(long double)'

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


RE: [GCC][middle-end] Add rules to strip away unneeded type casts in expressions (2nd patch)

2019-06-25 Thread Richard Biener
On Tue, 25 Jun 2019, Tamar Christina wrote:

> Adding some maintainers
> 
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org  On 
> > Behalf Of Tamar Christina
> > Sent: Tuesday, June 25, 2019 09:31
> > To: gcc-patches@gcc.gnu.org
> > Cc: nd 
> > Subject: [GCC][middle-end] Add rules to strip away unneeded type casts 
> > in expressions (2nd patch)
> > 
> > Hi All,
> > 
> > This is an updated version of my GCC-9 patch which moves part of the 
> > type conversion code from convert.c to match.pd because match.pd is 
> > able to apply this transformation in the presence of intermediate 
> > temporary variables.
> > 
> > The previous patch was only regtested on aarch64-none-linux-gnu and I 
> > hadn't done a regression on x86_64-pc-linux-gnu only a bootstrap.  The 
> > previous patch was approved
> > 
> > here https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00116.html
> > but before committing I ran a x86_64-pc-linux-gnu regtest to be sure 
> > and this showed an issue with a DFP test. I have fixed this by 
> > removing the offending convert.
> > The convert was just saying "keep the type as is" but match.pd looped 
> > here as it thinks the match did something and would try other 
> > patterns, causing it to match itself again.
> > 
> > Instead when there's nothing to update, I just don't do anything.
> > 
> > The second change was to merge this with the existing pattern for 
> > integer conversion in order to silence a warning from match.pd which 
> > though that the two patterns overlaps because their match conditions 
> > are similar (they have different conditions inside the ifs but 
> > match.pd doesn't check those of course.).
> > 
> > Regtested and bootstrapped on aarch64-none-linux-gnu and x86_64-pc- 
> > linux-gnu and no issues.
> > 
> > Ok for trunk?

This looks like a literal 1:1 translation plus merging with the
existing pattern around integers.  You changed
(op:s@0 (convert@3 @1) (convert?@4 @2)) to
(op:s@0 (convert1?@3 @1) (convert2?@4 @2)) where this now also
matches if there are no inner conversions at all - was that a
desired change or did you merely want to catch when the first
operand is not a conversion but the second is, possibly only
for the RDIV_EXPR case?
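
Purely as illustration (these snippets are not from the patch), with both
inner conversions optional the outer (convert (op ...)) would now also
match e.g.

  /* Only the second operand carries an inner conversion.  */
  float f1 (double d, float x) { return (float) (d * (double) x); }

  /* No inner conversion at all.  */
  float f2 (double d1, double d2) { return (float) (d1 * d2); }

whereas the old form required the first operand to be a conversion.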

+(for op (plus minus mult rdiv)
+ (simplify
+   (convert (op:s@0 (convert1?@3 @1) (convert2?@4 @2)))
+   (with { tree arg0 = strip_float_extensions (@1);
+  tree arg1 = strip_float_extensions (@2);
+  tree itype = TREE_TYPE (@0);

you now unconditionally call strip_float_extensions on each operand
even for the integer case, please sink stuff only used in one
case arm.  I guess keeping the integer case first via

  (if (INTEGRAL_TYPE_P (type)
...
   (with { tree arg0 = strip_float_extensions (@1);
...

should work (with the 'with' being in the ifs else position).

+  (if (code == REAL_TYPE)
+   /* Ignore the conversion if we don't need to store intermediate
+  results and neither type is a decimal float.  */
+ (if (!(flag_float_store
+  || DECIMAL_FLOAT_TYPE_P (type)
+  || DECIMAL_FLOAT_TYPE_P (itype))
+ && types_match (ty1, ty2))
+   (convert (op (convert:ty1 @1) (convert:ty2 @2)

this looks prone to the same recursion issue you described above.

'code' is used exactly once, using SCALAR_FLOAT_TYPE_P (itype)
in the above test would be clearer.  Also both ifs can be combined.
The snipped above also doesn't appear in the convert.c code you
remove and the original one is

  switch (TREE_CODE (TREE_TYPE (expr)))
{
case REAL_TYPE:
  /* Ignore the conversion if we don't need to store intermediate
 results and neither type is a decimal float.  */
  return build1_loc (loc,
 (flag_float_store
  || DECIMAL_FLOAT_TYPE_P (type)
  || DECIMAL_FLOAT_TYPE_P (itype))
 ? CONVERT_EXPR : NOP_EXPR, type, expr);

which as far as I can see doesn't do anything besides
exchanging CONVERT_EXPR for NOP_EXPR which is a noop to the IL.
So it appears this shouldn't be moved to match.pd at all?
It's also not a 1:1 move since you are changing 'expr'.

Thanks,
Richard.

> > Thanks,
> > Tamar
> > 
> > Concretely it makes both these cases behave the same
> > 
> >   float e = (float)a * (float)b;
> >   *c = (_Float16)e;
> > 
> > and
> > 
> >   *c = (_Float16)((float)a * (float)b);
> > 
> > Thanks,
> > Tamar
> > 
> > gcc/ChangeLog:
> > 
> > 2019-06-25  Tamar Christina  
> > 
> > * convert.c (convert_to_real_1): Move part of conversion code...
> > * match.pd: ...To here.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > 2019-06-25  Tamar Christina  
> > 
> > * gcc.dg/type-convert-var.c: New test.
> > 
> > --
> 

-- 
Richard Biener 
SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany;
GF: Felix Imendörffer, Mary Higgins, Sri Rasiah; HRB 21284 (AG Nürnberg)

Re: [PATCH] Properly sum costs in tree-vect-loop.c (PR tree-optimization/90973).

2019-06-25 Thread David Malcolm
On Tue, 2019-06-25 at 10:16 +0200, Martin Liška wrote:
> Hi.
> 
> That's a thinko that's pre-approved by Richi.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression
> tests.
> 
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-06-24  Martin Liska  
> 
>   PR tree-optimization/90973
>   * tree-vect-loop.c (vect_get_known_peeling_cost): Sum retval
>   of prologue and epilogue.
> ---
>  gcc/tree-vect-loop.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

> diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
> index d3facf67bf9..489bee65397 100644
> --- a/gcc/tree-vect-loop.c
> +++ b/gcc/tree-vect-loop.c
> @@ -3405,8 +3405,8 @@ vect_get_known_peeling_cost (loop_vec_info loop_vinfo, 
> int peel_iters_prologue,
>   iterations are unknown, count a taken branch per peeled loop.  */
>retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
>NULL, 0, vect_prologue);
> -  retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
> -  NULL, 0, vect_epilogue);
> +  retval += record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
 ^^
Should this be epilogue_cost_vec?

> +   NULL, 0, vect_epilogue);

(caveat: I'm purely going by symmetry here)


[C++ PATCH] Fix ICE in constexpr evaluation of ADDR_EXPR of ARRAY_REF of a vector (PR c++/90969)

2019-06-25 Thread Jakub Jelinek
Hi!

As mentioned in the PR, the following testcase ICEs starting with
r272430.  The problem is that if cxx_eval_array_reference is called with
lval true, we just want to constant-evaluate the index and array, but need
to keep the ARRAY_REF, or possibly a new one with updated operands, in the
IL.  The code to look through VCEs from VECTOR_TYPEs used to be in the
!lval section, but has been made unconditional in that change.
For !lval that is what we want: we don't reconstruct the ARRAY_REF but fold
it into a constant.  For lval, though, we only use ary in:
  if (lval && ary == oldary && index == oldidx)
    return t;
  else if (lval)
    return build4 (ARRAY_REF, TREE_TYPE (t), ary, index, NULL, NULL);
so if we look through the VCE, then for vectors we can never reuse t and
always build a new ARRAY_REF, and furthermore build a wrong one, as an
ARRAY_REF should always apply to an object with ARRAY_TYPE, not
VECTOR_TYPE.
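
At the source level the two situations look like this (an illustrative
sketch in the spirit of the new testcase below; the identifiers are made up):

  typedef int v4si __attribute__ ((vector_size (16)));
  v4si v;

  /* !lval: just fold the element access to a constant if possible.  */
  int val = v[0];

  /* lval: an ARRAY_REF has to stay in the IL, so we must not look through
     the VIEW_CONVERT_EXPR around the vector here.  */
  int &ref = v[0];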

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2019-06-25  Jakub Jelinek  

PR c++/90969
* constexpr.c (cxx_eval_array_reference): Don't look through VCE from
vector type if lval.

* g++.dg/ext/vector38.C: New test.

--- gcc/cp/constexpr.c.jj   2019-06-19 10:04:24.0 +0200
+++ gcc/cp/constexpr.c  2019-06-24 11:01:57.535816915 +0200
@@ -2616,7 +2616,8 @@ cxx_eval_array_reference (const constexp
   non_constant_p, overflow_p);
   if (*non_constant_p)
 return t;
-  if (TREE_CODE (ary) == VIEW_CONVERT_EXPR
+  if (!lval
+  && TREE_CODE (ary) == VIEW_CONVERT_EXPR
   && VECTOR_TYPE_P (TREE_TYPE (TREE_OPERAND (ary, 0)))
   && TREE_TYPE (t) == TREE_TYPE (TREE_TYPE (TREE_OPERAND (ary, 0))))
 ary = TREE_OPERAND (ary, 0);
--- gcc/testsuite/g++.dg/ext/vector38.C.jj  2019-06-24 11:17:26.303987110 
+0200
+++ gcc/testsuite/g++.dg/ext/vector38.C 2019-06-24 11:08:01.603004478 +0200
@@ -0,0 +1,5 @@
+// PR c++/90969
+// { dg-do compile }
+
+__attribute__ ((__vector_size__ (4))) int v;
+int  = v[0];

Jakub


RE: [GCC][middle-end] Add rules to strip away unneeded type casts in expressions (2nd patch)

2019-06-25 Thread Tamar Christina
Adding some maintainers

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org  On 
> Behalf Of Tamar Christina
> Sent: Tuesday, June 25, 2019 09:31
> To: gcc-patches@gcc.gnu.org
> Cc: nd 
> Subject: [GCC][middle-end] Add rules to strip away unneeded type casts 
> in expressions (2nd patch)
> 
> Hi All,
> 
> This is an updated version of my GCC-9 patch which moves part of the 
> type conversion code from convert.c to match.pd because match.pd is 
> able to apply this transformation in the presence of intermediate 
> temporary variables.
> 
> The previous patch was only regtested on aarch64-none-linux-gnu and I 
> hadn't done a regression on x86_64-pc-linux-gnu only a bootstrap.  The 
> previous patch was approved
> 
> here https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00116.html
> but before committing I ran a x86_64-pc-linux-gnu regtest to be sure 
> and this showed an issue with a DFP test. I have fixed this by 
> removing the offending convert.
> The convert was just saying "keep the type as is" but match.pd looped 
> here as it thinks the match did something and would try other 
> patterns, causing it to match itself again.
> 
> Instead when there's nothing to update, I just don't do anything.
> 
> The second change was to merge this with the existing pattern for 
> integer conversion in order to silence a warning from match.pd which 
> thought that the two patterns overlap because their match conditions 
> are similar (they have different conditions inside the ifs but 
> match.pd doesn't check those of course.).
> 
> Regtested and bootstrapped on aarch64-none-linux-gnu and x86_64-pc- 
> linux-gnu and no issues.
> 
> Ok for trunk?
> 
> Thanks,
> Tamar
> 
> Concretely it makes both these cases behave the same
> 
>   float e = (float)a * (float)b;
>   *c = (_Float16)e;
> 
> and
> 
>   *c = (_Float16)((float)a * (float)b);
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 2019-06-25  Tamar Christina  
> 
>   * convert.c (convert_to_real_1): Move part of conversion code...
>   * match.pd: ...To here.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-06-25  Tamar Christina  
> 
>   * gcc.dg/type-convert-var.c: New test.
> 
> --


[GCC][middle-end] Add rules to strip away unneeded type casts in expressions (2nd patch)

2019-06-25 Thread Tamar Christina
Hi All,

This is an updated version of my GCC-9 patch which moves part of the type 
conversion code
from convert.c to match.pd because match.pd is able to apply this 
transformation in the
presence of intermediate temporary variables.

The previous patch was only regtested on aarch64-none-linux-gnu; I hadn't
done a regression run on x86_64-pc-linux-gnu, only a bootstrap.  The
previous patch was approved

here https://gcc.gnu.org/ml/gcc-patches/2018-12/msg00116.html
but before committing I ran an x86_64-pc-linux-gnu regtest to be sure, and
this showed an issue with a DFP test.  I have fixed this by removing the
offending convert.
The convert was just saying "keep the type as is", but match.pd looped here
because it thinks the match did something and would try other patterns,
causing it to match itself again.

Instead when there's nothing to update, I just don't do anything.

The second change was to merge this with the existing pattern for integer
conversion in order to silence a warning from match.pd, which thought that
the two patterns overlap because their match conditions are similar (they
have different conditions inside the ifs, but match.pd doesn't check those,
of course).

Regtested and bootstrapped on aarch64-none-linux-gnu and x86_64-pc-linux-gnu 
and no issues.

Ok for trunk?

Thanks,
Tamar

Concretely it makes both these cases behave the same

  float e = (float)a * (float)b;
  *c = (_Float16)e;

and 

  *c = (_Float16)((float)a * (float)b);
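
As a plain float/double analogue (an illustrative sketch only, not the new
gcc.dg/type-convert-var.c test), both of these should now end up as a
single float multiply, since double carries more than twice float's
precision:

  float mul_via_temp (float a, float b)
  {
    double e = (double) a * (double) b;
    return (float) e;
  }

  float mul_direct (float a, float b)
  {
    return (float) ((double) a * (double) b);
  }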

Thanks,
Tamar

gcc/ChangeLog:

2019-06-25  Tamar Christina  

* convert.c (convert_to_real_1): Move part of conversion code...
* match.pd: ...To here.

gcc/testsuite/ChangeLog:

2019-06-25  Tamar Christina  

* gcc.dg/type-convert-var.c: New test.

-- 
diff --git a/gcc/convert.c b/gcc/convert.c
index d5aa07b510e0e7831e8d121b383e42e5c6e89321..923eb70366e6c05141fb1580ba6f85e354aa3f76 100644
--- a/gcc/convert.c
+++ b/gcc/convert.c
@@ -298,92 +298,6 @@ convert_to_real_1 (tree type, tree expr, bool fold_p)
 	  return build1 (TREE_CODE (expr), type, arg);
 	}
 	  break;
-	/* Convert (outertype)((innertype0)a+(innertype1)b)
-	   into ((newtype)a+(newtype)b) where newtype
-	   is the widest mode from all of these.  */
-	case PLUS_EXPR:
-	case MINUS_EXPR:
-	case MULT_EXPR:
-	case RDIV_EXPR:
-	   {
-	 tree arg0 = strip_float_extensions (TREE_OPERAND (expr, 0));
-	 tree arg1 = strip_float_extensions (TREE_OPERAND (expr, 1));
-
-	 if (FLOAT_TYPE_P (TREE_TYPE (arg0))
-		 && FLOAT_TYPE_P (TREE_TYPE (arg1))
-		 && DECIMAL_FLOAT_TYPE_P (itype) == DECIMAL_FLOAT_TYPE_P (type))
-	   {
-		  tree newtype = type;
-
-		  if (TYPE_MODE (TREE_TYPE (arg0)) == SDmode
-		  || TYPE_MODE (TREE_TYPE (arg1)) == SDmode
-		  || TYPE_MODE (type) == SDmode)
-		newtype = dfloat32_type_node;
-		  if (TYPE_MODE (TREE_TYPE (arg0)) == DDmode
-		  || TYPE_MODE (TREE_TYPE (arg1)) == DDmode
-		  || TYPE_MODE (type) == DDmode)
-		newtype = dfloat64_type_node;
-		  if (TYPE_MODE (TREE_TYPE (arg0)) == TDmode
-		  || TYPE_MODE (TREE_TYPE (arg1)) == TDmode
-		  || TYPE_MODE (type) == TDmode)
-newtype = dfloat128_type_node;
-		  if (newtype == dfloat32_type_node
-		  || newtype == dfloat64_type_node
-		  || newtype == dfloat128_type_node)
-		{
-		  expr = build2 (TREE_CODE (expr), newtype,
- convert_to_real_1 (newtype, arg0,
-			fold_p),
- convert_to_real_1 (newtype, arg1,
-			fold_p));
-		  if (newtype == type)
-			return expr;
-		  break;
-		}
-
-		  if (TYPE_PRECISION (TREE_TYPE (arg0)) > TYPE_PRECISION (newtype))
-		newtype = TREE_TYPE (arg0);
-		  if (TYPE_PRECISION (TREE_TYPE (arg1)) > TYPE_PRECISION (newtype))
-		newtype = TREE_TYPE (arg1);
-		  /* Sometimes this transformation is safe (cannot
-		 change results through affecting double rounding
-		 cases) and sometimes it is not.  If NEWTYPE is
-		 wider than TYPE, e.g. (float)((long double)double
-		 + (long double)double) converted to
-		 (float)(double + double), the transformation is
-		 unsafe regardless of the details of the types
-		 involved; double rounding can arise if the result
-		 of NEWTYPE arithmetic is a NEWTYPE value half way
-		 between two representable TYPE values but the
-		 exact value is sufficiently different (in the
-		 right direction) for this difference to be
-		 visible in ITYPE arithmetic.  If NEWTYPE is the
-		 same as TYPE, however, the transformation may be
-		 safe depending on the types involved: it is safe
-		 if the ITYPE has strictly more than twice as many
-		 mantissa bits as TYPE, can represent infinities
-		 and NaNs if the TYPE can, and has sufficient
-		 exponent range for the product or ratio of two
-		 values representable in the TYPE to be within the
-		 range of normal values of ITYPE.  */
-		  if (TYPE_PRECISION (newtype) < TYPE_PRECISION (itype)
-		  && 

RE: [PATCH][GCC][AArch64] Make processing less fragile in config.gcc

2019-06-25 Thread Tamar Christina
Hi All,

This is an update to the patch, rebased to apply after the SVE2 options
have been merged.

Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.

Ok for trunk?

Thanks,
Tamar

> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org 
> On Behalf Of Tamar Christina
> Sent: Tuesday, May 21, 2019 18:00
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; James Greenhalgh ;
> Richard Earnshaw ; Marcus Shawcroft
> 
> Subject: [PATCH][GCC][AArch64] Make processing less fragile in config.gcc
> 
> Hi All,
> 
> Due to the way config.gcc parses them, all the options need to be on one
> line, because the grep invocations would select only the first line of the
> option.
> 
> This causes it not to select the right bits on options that are spread over
> multiple lines when the --with-arch configure option is used.  The issue
> happens silently and you just get a compiler with an incorrect set of default
> flags.
> 
> The current rules are quite rigid:
> 
>1) No space between the AARCH64_OPT_EXTENSION and the opening (.
>2) No space between the opening ( and the extension name.
>3) No space after the extension name before the ,.
>4) Spaces are only allowed after a , and around |.
> 
> This patch makes this a lot less fragile by using the C pre-processor to
> flatten the list and then using a much more flexible regex with group
> matching to process the options instead of string replacement.  This removes
> all the restrictions above and makes the code a bit more readable.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for trunk? and for eventual backport?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 2019-05-21  Tamar Christina  
> 
>   PR target/89517
>   * config.gcc: Relax parsing of AARCH64_OPT_EXTENSION.
>   * config/aarch64/aarch64-option-extensions.def: Add new
> comments
>   and restore easier to read options.
> 
> --
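
For illustration, with the opt_macro definition from the patch a
hypothetical multi-line entry such as

  AARCH64_OPT_EXTENSION("foo", AARCH64_FL_FOO,
                        AARCH64_FL_FP, 0, false, "foo")

comes out of the pre-processor as the single line

  "foo", AARCH64_FL_FOO, AARCH64_FL_FP, 0, false, "foo"

which the grep on ^\"$base_ext\" and the six-element sed pattern can then
take apart field by field.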
diff --git a/gcc/config.gcc b/gcc/config.gcc
index 7122c8ed1c89fdf4c79d9a2e27d8e81a882632c1..a6ae9fefe9c3086e2a2e9b310ef51098aa559691 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -3830,32 +3830,40 @@ case "${target}" in
   sed -e 's/,.*$//'`
 			  fi
 
+			  # Use the pre-processor to strip flatten the options.
+			  # This makes the format less rigid than if we use
+			  # grep and sed directly here.
+			  opt_macro="AARCH64_OPT_EXTENSION(A, B, C, D, E, F)=A, B, C, D, E, F"
+			  options_parsed="`$ac_cv_prog_CPP -D"$opt_macro" -x c \
+${srcdir}/config/aarch64/aarch64-option-extensions.def`"
+
+			  # Match one element inside AARCH64_OPT_EXTENSION, we
+			  # consume anything that's not a ,.
+			  elem="[ 	]*\([^,]\+\)[ 	]*"
+
+			  # Repeat the pattern for the number of entries in the
+			  # AARCH64_OPT_EXTENSION, currently 6 times.
+			  sed_patt="^$elem,$elem,$elem,$elem,$elem,$elem"
+
 			  while [ x"$ext_val" != x ]
 			  do
 ext_val=`echo $ext_val | sed -e 's/\+//'`
 ext=`echo $ext_val | sed -e 's/\+.*//'`
 base_ext=`echo $ext | sed -e 's/^no//'`
+opt_line=`echo -e "$options_parsed" | \
+	grep "^\"$base_ext\""`
 
 if [ x"$base_ext" = x ] \
-|| grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
-${srcdir}/config/aarch64/aarch64-option-extensions.def \
-> /dev/null; then
-
-  ext_canon=`grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
-	${srcdir}/config/aarch64/aarch64-option-extensions.def | \
-	sed -e 's/^[^,]*,[ 	]*//' | \
-	sed -e 's/,.*$//'`
-  ext_on=`grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
-	${srcdir}/config/aarch64/aarch64-option-extensions.def | \
-	sed -e 's/^[^,]*,[ 	]*[^,]*,[ 	]*//' | \
-	sed -e 's/,.*$//' | \
-	sed -e 's/).*$//'`
-  ext_off=`grep "^AARCH64_OPT_EXTENSION(\"$base_ext\"," \
-	${srcdir}/config/aarch64/aarch64-option-extensions.def | \
-	sed -e 's/^[^,]*,[ 	]*[^,]*,[ 	]*[^,]*,[ 	]*//' | \
-	sed -e 's/,.*$//' | \
-	sed -e 's/).*$//'`
-
+|| [[ -n $opt_line ]]; then
+
+  # These regexp extract the elements based on
+  # their group match index in the regexp.
+  ext_canon=`echo -e "$opt_line" | \
+	sed -e "s/$sed_patt/\2/"`
+  ext_on=`echo -e "$opt_line" | \
+	sed -e "s/$sed_patt/\3/"`
+  ext_off=`echo -e "$opt_line" | \
+	sed -e "s/$sed_patt/\4/"`
 
   if [ $ext = $base_ext ]; then
 	# Adding extension
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index 4b10c62d20401a66374eb68e36531d73df300af1..f9f3d930d821ba20df2001ed9afda62a74d82299 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -45,32 +45,43 @@
  entries: aes, pmull, sha1, sha2 being present).  In that case this field
  should contain a space (" ") separated list of the strings in 'Features'
  that are required.  Their order is not important.  An empty string means
- do not detect this feature during auto detection.  */
+ do not detect this 

Re: Use ODR for canonical types construction in LTO

2019-06-25 Thread Jan Hubicka
> > * gcc-interface/decl.c (gnat_to_gnu_entity): Check that
> > type is array or integer prior checking string flag.
> 
> The test for array is superfluous here.
> 
> > * gcc-interface/gigi.h (gnat_signed_type_for,
> > maybe_character_value): Likewise.
> 
> Wrong ChangeLog, the first modified function is maybe_character_type.
> 
> I have installed the attached patchlet after testing it on x86-64/Linux.
> 
> 
>   * gcc-interface/decl.c (gnat_to_gnu_entity): Remove superfluous test in
>   previous change.
>   * gcc-interface/gigi.h (maybe_character_type): Fix formatting.
>   (maybe_character_value): Likewise.

Thanks a lot. I was not quite sure if ARRAY_TYPEs can happen there
and I should have added you to the CC.

Honza
> 
> -- 
> Eric Botcazou

> Index: gcc-interface/decl.c
> ===
> --- gcc-interface/decl.c  (revision 272633)
> +++ gcc-interface/decl.c  (working copy)
> @@ -1855,8 +1855,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
> = Has_Biased_Representation (gnat_entity);
>  
>/* Do the same processing for Character subtypes as for types.  */
> -  if ((TREE_CODE (TREE_TYPE (gnu_type)) == INTEGER_TYPE
> -|| TREE_CODE (TREE_TYPE (gnu_type)) == ARRAY_TYPE)
> +  if (TREE_CODE (TREE_TYPE (gnu_type)) == INTEGER_TYPE
> && TYPE_STRING_FLAG (TREE_TYPE (gnu_type)))
>   {
> TYPE_NAME (gnu_type) = gnu_entity_name;
> Index: gcc-interface/gigi.h
> ===
> --- gcc-interface/gigi.h  (revision 272633)
> +++ gcc-interface/gigi.h  (working copy)
> @@ -1139,7 +1139,8 @@ static inline tree
>  maybe_character_type (tree type)
>  {
>if (TREE_CODE (type) == INTEGER_TYPE
> -  && TYPE_STRING_FLAG (type) && !TYPE_UNSIGNED (type))
> +  && TYPE_STRING_FLAG (type)
> +  && !TYPE_UNSIGNED (type))
>  type = gnat_unsigned_type_for (type);
>  
>return type;
> @@ -1153,7 +1154,8 @@ maybe_character_value (tree expr)
>tree type = TREE_TYPE (expr);
>  
>if (TREE_CODE (type) == INTEGER_TYPE
> -  && TYPE_STRING_FLAG (type) && !TYPE_UNSIGNED (type))
> +  && TYPE_STRING_FLAG (type)
> +  && !TYPE_UNSIGNED (type))
>  {
>type = gnat_unsigned_type_for (type);
>expr = convert (type, expr);



Re: Use ODR for canonical types construction in LTO

2019-06-25 Thread Eric Botcazou
>   * gcc-interface/decl.c (gnat_to_gnu_entity): Check that
>   type is array or integer prior checking string flag.

The test for array is superfluous here.

>   * gcc-interface/gigi.h (gnat_signed_type_for,
>   maybe_character_value): Likewise.

Wrong ChangeLog, the first modified function is maybe_character_type.

I have installed the attached patchlet after testing it on x86-64/Linux.


* gcc-interface/decl.c (gnat_to_gnu_entity): Remove superfluous test in
previous change.
* gcc-interface/gigi.h (maybe_character_type): Fix formatting.
(maybe_character_value): Likewise.

-- 
Eric Botcazou

Index: gcc-interface/decl.c
===
--- gcc-interface/decl.c	(revision 272633)
+++ gcc-interface/decl.c	(working copy)
@@ -1855,8 +1855,7 @@ gnat_to_gnu_entity (Entity_Id gnat_entit
 	  = Has_Biased_Representation (gnat_entity);
 
   /* Do the same processing for Character subtypes as for types.  */
-  if ((TREE_CODE (TREE_TYPE (gnu_type)) == INTEGER_TYPE
-	   || TREE_CODE (TREE_TYPE (gnu_type)) == ARRAY_TYPE)
+  if (TREE_CODE (TREE_TYPE (gnu_type)) == INTEGER_TYPE
 	  && TYPE_STRING_FLAG (TREE_TYPE (gnu_type)))
 	{
 	  TYPE_NAME (gnu_type) = gnu_entity_name;
Index: gcc-interface/gigi.h
===
--- gcc-interface/gigi.h	(revision 272633)
+++ gcc-interface/gigi.h	(working copy)
@@ -1139,7 +1139,8 @@ static inline tree
 maybe_character_type (tree type)
 {
   if (TREE_CODE (type) == INTEGER_TYPE
-  && TYPE_STRING_FLAG (type) && !TYPE_UNSIGNED (type))
+  && TYPE_STRING_FLAG (type)
+  && !TYPE_UNSIGNED (type))
 type = gnat_unsigned_type_for (type);
 
   return type;
@@ -1153,7 +1154,8 @@ maybe_character_value (tree expr)
   tree type = TREE_TYPE (expr);
 
   if (TREE_CODE (type) == INTEGER_TYPE
-  && TYPE_STRING_FLAG (type) && !TYPE_UNSIGNED (type))
+  && TYPE_STRING_FLAG (type)
+  && !TYPE_UNSIGNED (type))
 {
   type = gnat_unsigned_type_for (type);
   expr = convert (type, expr);


[PATCH] Properly sum costs in tree-vect-loop.c (PR tree-optimization/90973).

2019-06-25 Thread Martin Liška
Hi.

That's a thinko that's pre-approved by Richi.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin

gcc/ChangeLog:

2019-06-24  Martin Liska  

PR tree-optimization/90973
* tree-vect-loop.c (vect_get_known_peeling_cost): Sum retval
of prologue and epilogue.
---
 gcc/tree-vect-loop.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index d3facf67bf9..489bee65397 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -3405,8 +3405,8 @@ vect_get_known_peeling_cost (loop_vec_info loop_vinfo, int peel_iters_prologue,
  iterations are unknown, count a taken branch per peeled loop.  */
   retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
  NULL, 0, vect_prologue);
-  retval = record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
- NULL, 0, vect_epilogue);
+  retval += record_stmt_cost (prologue_cost_vec, 1, cond_branch_taken,
+  NULL, 0, vect_epilogue);
 }
   else
 {


