date:20200116

Re: limit on emails for merge commits.

2020-01-16 Thread Joel Brobecker

Hello,

> You should include Joel on such questions as the expert on the hooks.
> 
> I don't know whether there's something to put in the commit message to say 
> "allow this merge of more than 100 commits".  I don't think a squashed 
> merge is the right workaround, supposing you do want the git ancestry 
> information to reflect what got merged.  The simple workaround is to do 
> three successive merges, each of under 100 commits and each pushed 
> separately.  Alternatively, you could check out refs/meta/config, push a 
> change that either increases hooks.max-commit-emails or sets 
> hooks.no-emails to prevent mails for this branch, then push the merge, 
> then push a reversion of the change to refs/meta/config.

Joseph pretty much nailed it in terms of options.

-- 
Joel

[COMMITTED] contrib/gcc_update: Insert "tformat:" for git log --pretty=tformat:%p:%t:%H

2020-01-16 Thread Hans-Peter Nilsson

Really old git versions (like 1.6.0) require
"git log --pretty=tformat:%p:%t:%H"
or else we see:

Updating GIT tree
Current branch master is up to date.
fatal: invalid --pretty format: %p:%t:%H
Adjusting file timestamps
Touching gcc/config.in...
Touching gcc/config/arm/arm-tune.md...

...and an empty revision in LAST_UPDATED and gcc/REVISION.
In its absence, for newer git versions, "tformat" is the default
qualifier, documented as such default for at least git-2.11.0.

Committed as obvious.

diff --git a/contrib/ChangeLog b/contrib/ChangeLog
index 16d0667..4e89b8d 100644
--- a/contrib/ChangeLog
+++ b/contrib/ChangeLog
@@ -1,3 +1,8 @@
+2020-01-17  Hans-Peter Nilsson  
+
+   * gcc_update : Use git log "--pretty=tformat:%p:%t:%H",
+   not "--pretty=%p:%t:%H".
+
 2020-01-16  Andreas Schwab  
 
* gcc-git-customization.sh: Avoid double expansion.
diff --git a/contrib/gcc_update b/contrib/gcc_update
index 5df3297..8c980b1 100755
--- a/contrib/gcc_update
+++ b/contrib/gcc_update
@@ -330,7 +330,7 @@ case $vcs_type in
 exit 1
fi
 
-   revision=`$GCC_GIT log -n1 --pretty=%p:%t:%H`
+   revision=`$GCC_GIT log -n1 --pretty=tformat:%p:%t:%H`
branch=`$GCC_GIT name-rev --name-only HEAD || :`
;;
 
brgds, H-P

Re: [PATCH] vect: Fix ICE in vectorizable_comparison PR93292

2020-01-16 Thread Richard Biener

On Fri, 17 Jan 2020, Jakub Jelinek wrote:

> Hi!
> 
> The following testcase ICEs on powerpc64le-linux.  The problem is that
> get_vectype_for_scalar_type returns NULL, and while most places in
> tree-vect-stmts.c handle that case, this spot doesn't and punts only
> if it is non-NULL, but with different number of elts than expected.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

Richard.

> 2020-01-16  Jakub Jelinek  
> 
>   PR tree-optimization/93292
>   * tree-vect-stmts.c (vectorizable_comparison): Punt also if
>   get_vectype_for_scalar_type returns NULL.
> 
>   * g++.dg/opt/pr93292.C: New test.
> 
> --- gcc/tree-vect-stmts.c.jj  2020-01-12 11:54:38.522381590 +0100
> +++ gcc/tree-vect-stmts.c 2020-01-16 19:42:30.60882 +0100
> @@ -10492,7 +10492,7 @@ vectorizable_comparison (stmt_vec_info s
>  {
>vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
>slp_node);
> -  if (maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
> +  if (!vectype || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
>   return false;
>  }
>else if (maybe_ne (nunits, TYPE_VECTOR_SUBPARTS (vectype)))
> --- gcc/testsuite/g++.dg/opt/pr93292.C.jj 2020-01-16 19:48:51.110144613 
> +0100
> +++ gcc/testsuite/g++.dg/opt/pr93292.C2020-01-16 19:47:57.351956177 
> +0100
> @@ -0,0 +1,18 @@
> +// PR tree-optimization/93292
> +// { dg-do compile }
> +// { dg-options "-O3 -w" }
> +
> +struct A {
> +  static int foo (float x) { static int b; b = x ? x + 0.5 : 0; return b; }
> +};
> +
> +void
> +bar (int *d, float e)
> +{
> +  float g;
> +  for (int h = 0; h < 64; h++)
> +{
> +  d[h] += A::foo (g < 0 ? : g > 5 ? : g);
> +  A::foo (e);
> +}
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [PATCH] analyzer: fix handling of negative byte offsets (v2) (PR 93281)

2020-01-16 Thread Richard Biener

On Thu, 16 Jan 2020, David Malcolm wrote:

> On Thu, 2020-01-16 at 09:14 +0100, Jakub Jelinek wrote:
> > On Wed, Jan 15, 2020 at 06:56:48PM -0500, David Malcolm wrote:
> > > gcc/analyzer/ChangeLog:
> > >   PR analyzer/93281
> > >   * region-model.cc
> > >   (region_model::convert_byte_offset_to_array_index): Convert
> > >   offset_cst to ssizetype before dividing by byte_size.
> > > ---
> > >  gcc/analyzer/region-model.cc | 5 -
> > >  1 file changed, 4 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
> > > model.cc
> > > index 5c39be4fd7f..b62ddf82a40 100644
> > > --- a/gcc/analyzer/region-model.cc
> > > +++ b/gcc/analyzer/region-model.cc
> > > @@ -6414,9 +6414,12 @@
> > > region_model::convert_byte_offset_to_array_index (tree ptr_type,
> > >/* This might not be a constant.  */
> > >tree byte_size = size_in_bytes (elem_type);
> > >  
> > > +  /* Ensure we're in a signed representation before doing the
> > > division.  */
> > > +  tree signed_offset_cst = fold_convert (ssizetype,
> > > offset_cst);
> > > +
> > >tree index
> > >   = fold_build2 (TRUNC_DIV_EXPR, integer_type_node,
> > > -offset_cst, byte_size);
> > > +signed_offset_cst, byte_size);
> > 
> > I'd say you want to fold_convert byte_size to ssizetype too.
> > Another thing is the integer_type_node, that looks wrong when you are
> > dividing ssizetype by ssizetype.  The fold-const.c folders are
> > sensitive to
> > using incorrect types and type mismatches.
> > And, I think fold_build2 is wasteful, you only care if you can
> > simplify it
> > to a constant (just constant or INTEGER_CST only?).
> > So, either use fold_binary (TRUNC_DIV_EXPR, ssizetype, offset_cst,
> > fold_convert (ssizetype, byte_size))
> > or, if you have checked that both arguments are INTEGER_CSTs, perhaps
> > int_const_binop or so.
> > 
> > >if (CONSTANT_CLASS_P (index))
> > 
> > And this would need to be if (index && TREE_CODE (index) ==
> > INTEGER_CST)
> > (or if you can handle other constants (index && CONSTANT_CLASS_P
> > (index)).
> > 
> > >   return get_or_create_constant_svalue (index);
> > 
> > Jakub
> 
> Thanks.
> 
> Here's an updated version of the patch which fold_converts both
> inputs to the division, and uses fold_binary rather than fold_build2.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
> verified with -m32 and -m64.
> 
> OK for master?

OK.

> gcc/analyzer/ChangeLog:
>   PR analyzer/93281
>   * region-model.cc
>   (region_model::convert_byte_offset_to_array_index): Convert to
>   ssizetype before dividing by byte_size.  Use fold_binary rather
>   than fold_build2 to avoid needlessly constructing a tree for the
>   non-const case.
> ---
>  gcc/analyzer/region-model.cc | 10 ++
>  1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
> index 5c39be4fd7f..f67572e2d45 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -6414,11 +6414,13 @@ region_model::convert_byte_offset_to_array_index 
> (tree ptr_type,
>/* This might not be a constant.  */
>tree byte_size = size_in_bytes (elem_type);
>  
> +  /* Try to get a constant by dividing, ensuring that we're in a
> +  signed representation first.  */
>tree index
> - = fold_build2 (TRUNC_DIV_EXPR, integer_type_node,
> -offset_cst, byte_size);
> -
> -  if (CONSTANT_CLASS_P (index))
> + = fold_binary (TRUNC_DIV_EXPR, ssizetype,
> +fold_convert (ssizetype, offset_cst),
> +fold_convert (ssizetype, byte_size));
> +  if (index && TREE_CODE (index) == INTEGER_CST)
>   return get_or_create_constant_svalue (index);
>  }
>  
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: Deque rotate on current node

2020-01-16 Thread François Dumont

After the recent optimization that went in despite in stage 4 I wonder 
why I never got any feedback for this one.


https://gcc.gnu.org/ml/libstdc++/2019-05/msg00166.html

I forgot to note that this is the simplest reply to an old Paolo's 
request to recycle std::deque's "nodes". And this one is abi compatible !


François

On 5/20/19 7:39 AM, François Dumont wrote:

Hi

  std::deque is supposed to be like a double-queue that you can attack 
from front and back. But currrently its implementation makes it behave 
differently when starting from a fresh deque. If push_back then all 
goes well, it is copy/move to the current internal 'node'. If 
push_front then a new 'node' is created and on next pop_back the 
initial node will be deallocated without having ever be used.


  This patch changes this behavior. As long as deque is empty we can 
play with its state to make it push_back- or push_front-ready. It will 
also benefit to the usage of deque in the std::stack.


  More generally it could really improve scenario in which deque is 
used as queue of elements exchanged between different threads. As you 
need to make sure that consumers are faster than producers then your 
deque should remain empty most of the time and so this proposed patch 
should avoid nodes to be allocated/deallocated all the time.


    * include/bits/deque.tcc (deque<>::_M_push_back_aux):
    Rotate on current node if deque is empty.
    (deue<>::_M_push_front_aux): Likewise.

Tested under Linux x86_64, ok to commit ?

François

Re: [patch, fortran] Fix PR 44960, accepts invalid reference on function

2020-01-16 Thread Steve Kargl

On Thu, Jan 16, 2020 at 11:34:43PM +0100, Thomas Koenig wrote:
> diff --git a/gcc/testsuite/gfortran.dg/function_reference_1.f90 
> b/gcc/testsuite/gfortran.dg/function_reference_1.f90
> new file mode 100644
> index 000..78c92a6f20d
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/function_reference_1.f90
> @@ -0,0 +1,11 @@
> +! { dg-do compile }
> +! PR 44960 - this was erroneusly accepted.
> +! Original test case by Daniel Franke.
> +
> +type t
> +  integer :: a
> +end type t
> +type(t) :: foo
> +print *, foo(1)%a ! { dg-error "Function call can not contain a reference" }
> +end
> +

I do not understand this error message, and find it to be confusing.
foo(1)%a looks like an invalid array reference.  That is, foo is scalar
and foo(1) is an array element.

PS: s/can not/cannot

-- 
Steve

[C++ PATCH] PR c++/93286 - ICE with __is_constructible and variadic template.

2020-01-16 Thread Jason Merrill

Here we had been recursing in tsubst_copy_and_build if type2 was a TREE_LIST
because that function knew how to deal with pack expansions, and tsubst
didn't.  But tsubst_copy_and_build expects to be dealing with expressions,
so we crash when trying to convert_from_reference a type.

Tested x86_64-pc-linux-gnu, applying to trunk.

* pt.c (tsubst) [TREE_LIST]: Handle pack expansion.
(tsubst_copy_and_build) [TRAIT_EXPR]: Always use tsubst for type2.
---
 gcc/cp/pt.c  | 74 ++--
 gcc/testsuite/g++.dg/ext/is_constructible4.C | 18 +
 2 files changed, 85 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ext/is_constructible4.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 9bb8cc13e5f..872f8ff8f52 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -15350,6 +15350,71 @@ tsubst (tree t, tree args, tsubst_flags_t complain, 
tree in_decl)
if (t == void_list_node)
  return t;
 
+   if ((TREE_PURPOSE (t) && PACK_EXPANSION_P (TREE_PURPOSE (t)))
+   || (TREE_VALUE (t) && PACK_EXPANSION_P (TREE_VALUE (t
+ {
+   /* We have pack expansions, so expand those and
+  create a new list out of it.  */
+
+   /* Expand the argument expressions.  */
+   tree purposevec = NULL_TREE;
+   if (TREE_PURPOSE (t))
+ purposevec = tsubst_pack_expansion (TREE_PURPOSE (t), args,
+ complain, in_decl);
+   if (purposevec == error_mark_node)
+ return error_mark_node;
+
+   tree valuevec = NULL_TREE;
+   if (TREE_VALUE (t))
+ valuevec = tsubst_pack_expansion (TREE_VALUE (t), args,
+   complain, in_decl);
+   if (valuevec == error_mark_node)
+ return error_mark_node;
+
+   /* Build the rest of the list.  */
+   tree chain = TREE_CHAIN (t);
+   if (chain && chain != void_type_node)
+ chain = tsubst (chain, args, complain, in_decl);
+   if (chain == error_mark_node)
+ return error_mark_node;
+
+   /* Determine the number of arguments.  */
+   int len = -1;
+   if (purposevec && TREE_CODE (purposevec) == TREE_VEC)
+ {
+   len = TREE_VEC_LENGTH (purposevec);
+   gcc_assert (!valuevec || len == TREE_VEC_LENGTH (valuevec));
+ }
+   else if (TREE_CODE (valuevec) == TREE_VEC)
+ len = TREE_VEC_LENGTH (valuevec);
+   else
+ {
+   /* Since we only performed a partial substitution into
+  the argument pack, we only RETURN (a single list
+  node.  */
+   if (purposevec == TREE_PURPOSE (t)
+   && valuevec == TREE_VALUE (t)
+   && chain == TREE_CHAIN (t))
+ return t;
+
+   return tree_cons (purposevec, valuevec, chain);
+ }
+
+   /* Convert the argument vectors into a TREE_LIST.  */
+   for (int i = len; i-- > 0; )
+ {
+   purpose = (purposevec ? TREE_VEC_ELT (purposevec, i)
+  : NULL_TREE);
+   value = (valuevec ? TREE_VEC_ELT (valuevec, i)
+: NULL_TREE);
+
+   /* Build the list (backwards).  */
+   chain = hash_tree_cons (purpose, value, chain);
+ }
+
+   return chain;
+ }
+
purpose = TREE_PURPOSE (t);
if (purpose)
  {
@@ -20158,13 +20223,8 @@ tsubst_copy_and_build (tree t,
   {
tree type1 = tsubst (TRAIT_EXPR_TYPE1 (t), args,
 complain, in_decl);
-
-   tree type2 = TRAIT_EXPR_TYPE2 (t);
-   if (type2 && TREE_CODE (type2) == TREE_LIST)
- type2 = RECUR (type2);
-   else if (type2)
- type2 = tsubst (type2, args, complain, in_decl);
-
+   tree type2 = tsubst (TRAIT_EXPR_TYPE2 (t), args,
+complain, in_decl);
RETURN (finish_trait_expr (TRAIT_EXPR_LOCATION (t),
   TRAIT_EXPR_KIND (t), type1, type2));
   }
diff --git a/gcc/testsuite/g++.dg/ext/is_constructible4.C 
b/gcc/testsuite/g++.dg/ext/is_constructible4.C
new file mode 100644
index 000..6dfe3c01661
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/is_constructible4.C
@@ -0,0 +1,18 @@
+// PR c++/93286
+// { dg-do compile { target c++14 } }
+
+struct A { static const bool value = true; };
+template  using __bool_constant = A;
+template 
+struct B : __bool_constant<__is_constructible(int, _Args...)> {};
+template  using enable_if_t = int;
+template  bool is_constructible_v = B<_Args...>::value;
+class C {
+  template >>
+  C(_Tp &&);
+};
+using Effect_t = C;
+void fn1(Effect_t effect) {
+  int i;
+  [](int ) {}(i);
+}

base-commit: 801f5b96775288e55193a66a746caab1ddd56f4a
--

Re: limit on emails for merge commits.

2020-01-16 Thread Joseph Myers

On Fri, 17 Jan 2020, Iain Sandoe wrote:

> So I know that the policy is under review (and agree, from my PoV, that
> something that represents these in a similar way to the SVN single mail
> containing all the changes is probably optimum).
> 
> However, right now I’m stuck - and really want to get a short-term resolution
> so that i can complete the merge of coroutines onto trunk.
> 
> I guess I could do a squashed merge from trunk - presumably, that would only
> generate 1 email?

You should include Joel on such questions as the expert on the hooks.

I don't know whether there's something to put in the commit message to say 
"allow this merge of more than 100 commits".  I don't think a squashed 
merge is the right workaround, supposing you do want the git ancestry 
information to reflect what got merged.  The simple workaround is to do 
three successive merges, each of under 100 commits and each pushed 
separately.  Alternatively, you could check out refs/meta/config, push a 
change that either increases hooks.max-commit-emails or sets 
hooks.no-emails to prevent mails for this branch, then push the merge, 
then push a reversion of the change to refs/meta/config.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] analyzer: fix handling of negative byte offsets (v2) (PR 93281)

2020-01-16 Thread David Malcolm

On Thu, 2020-01-16 at 09:14 +0100, Jakub Jelinek wrote:
> On Wed, Jan 15, 2020 at 06:56:48PM -0500, David Malcolm wrote:
> > gcc/analyzer/ChangeLog:
> > PR analyzer/93281
> > * region-model.cc
> > (region_model::convert_byte_offset_to_array_index): Convert
> > offset_cst to ssizetype before dividing by byte_size.
> > ---
> >  gcc/analyzer/region-model.cc | 5 -
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
> > model.cc
> > index 5c39be4fd7f..b62ddf82a40 100644
> > --- a/gcc/analyzer/region-model.cc
> > +++ b/gcc/analyzer/region-model.cc
> > @@ -6414,9 +6414,12 @@
> > region_model::convert_byte_offset_to_array_index (tree ptr_type,
> >/* This might not be a constant.  */
> >tree byte_size = size_in_bytes (elem_type);
> >  
> > +  /* Ensure we're in a signed representation before doing the
> > division.  */
> > +  tree signed_offset_cst = fold_convert (ssizetype,
> > offset_cst);
> > +
> >tree index
> > = fold_build2 (TRUNC_DIV_EXPR, integer_type_node,
> > -  offset_cst, byte_size);
> > +  signed_offset_cst, byte_size);
> 
> I'd say you want to fold_convert byte_size to ssizetype too.
> Another thing is the integer_type_node, that looks wrong when you are
> dividing ssizetype by ssizetype.  The fold-const.c folders are
> sensitive to
> using incorrect types and type mismatches.
> And, I think fold_build2 is wasteful, you only care if you can
> simplify it
> to a constant (just constant or INTEGER_CST only?).
> So, either use fold_binary (TRUNC_DIV_EXPR, ssizetype, offset_cst,
>   fold_convert (ssizetype, byte_size))
> or, if you have checked that both arguments are INTEGER_CSTs, perhaps
> int_const_binop or so.
> 
> >if (CONSTANT_CLASS_P (index))
> 
> And this would need to be if (index && TREE_CODE (index) ==
> INTEGER_CST)
> (or if you can handle other constants (index && CONSTANT_CLASS_P
> (index)).
> 
> > return get_or_create_constant_svalue (index);
> 
>   Jakub

Thanks.

Here's an updated version of the patch which fold_converts both
inputs to the division, and uses fold_binary rather than fold_build2.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
verified with -m32 and -m64.

OK for master?

gcc/analyzer/ChangeLog:
PR analyzer/93281
* region-model.cc
(region_model::convert_byte_offset_to_array_index): Convert to
ssizetype before dividing by byte_size.  Use fold_binary rather
than fold_build2 to avoid needlessly constructing a tree for the
non-const case.
---
 gcc/analyzer/region-model.cc | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 5c39be4fd7f..f67572e2d45 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -6414,11 +6414,13 @@ region_model::convert_byte_offset_to_array_index (tree 
ptr_type,
   /* This might not be a constant.  */
   tree byte_size = size_in_bytes (elem_type);
 
+  /* Try to get a constant by dividing, ensuring that we're in a
+signed representation first.  */
   tree index
-   = fold_build2 (TRUNC_DIV_EXPR, integer_type_node,
-  offset_cst, byte_size);
-
-  if (CONSTANT_CLASS_P (index))
+   = fold_binary (TRUNC_DIV_EXPR, ssizetype,
+  fold_convert (ssizetype, offset_cst),
+  fold_convert (ssizetype, byte_size));
+  if (index && TREE_CODE (index) == INTEGER_CST)
return get_or_create_constant_svalue (index);
 }
 
-- 
2.21.0

limit on emails for merge commits.

2020-01-16 Thread Iain Sandoe

I have a merge commit to devel/c++-coroutines, which is intended to refect  
the state at the point of commit to trunk.


However, I cannot push that commit because:

remote: *** This update introduces too many new commits (284), which would
remote: *** trigger as many emails, exceeding the current limit (100).
remote: *** Contact your repository adminstrator if you really meant
remote: *** to generate this many commit emails.
remote: error: hook declined to update refs/heads/devel/c++-coroutines
To git+ssh://gcc.gnu.org/git/gcc.git
 ! [remote rejected] devel/c++-coroutines -> devel/c++-coroutines (hook 
declined)

in fact, I usually merged trunk into the branch weekly - this one got  
slightly delayed because it was intended to be the final one.


284 isn’t a lot of trunk commits for an interval between merges into a  
devel branch, so this seems likely to fire quite often.


——

So I know that the policy is under review (and agree, from my PoV, that  
something that represents these in a similar way to the SVN single mail  
containing all the changes is probably optimum).


However, right now I’m stuck - and really want to get a short-term  
resolution so that i can complete the merge of coroutines onto trunk.


I guess I could do a squashed merge from trunk - presumably, that would  
only generate 1 email?


cheers
Iain

Re: [PATCH][AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

2020-01-16 Thread Andrew Pinski

On Thu, Jan 16, 2020 at 5:51 PM Andrew Pinski  wrote:
>
> On Thu, Jan 16, 2020 at 5:14 AM Richard Sandiford
>  wrote:
> >
> > Wilco Dijkstra  writes:
> > > The separate shrinkwrapping pass may insert stores in the middle
> > > of atomics loops which can cause issues on some implementations.
> > > Avoid this by delaying splitting of atomic patterns until after
> > > prolog/epilog generation.
> > >
> > > Bootstrap completed, no test regressions on AArch64.
> > >
> > > Andrew, can you verify this fixes the failure you were getting?
> > >
> > > ChangeLog:
> > > 2020-01-16  Wilco Dijkstra  
> > >
> > > PR target/92692
> > > * config/aarch64/aarch64.c (aarch64_split_compare_and_swap)
> > > Add assert to ensure prolog has been emitted.
> > > (aarch64_split_atomic_op): Likewise.
> > > * config/aarch64/atomics.md (aarch64_compare_and_swap)
> > > Use epilogue_completed rather than reload_completed.
> > > (aarch64_atomic_exchange): Likewise.
> > > (aarch64_atomic_): Likewise.
> > > (atomic_nand): Likewise.
> > > (aarch64_atomic_fetch_): Likewise.
> > > (atomic_fetch_nand): Likewise.
> > > (aarch64_atomic__fetch): Likewise.
> > > (atomic_nand_fetch): Likewise.
> >
> > OK if Andrew confirms it works, thanks.
>
> Yes this fixes the issue for me.
Here is the new assembly showing it worked:

d390:   f9000bf3str x19, [sp, #16]
d394:   885ffdc8ldaxr   w8, [x14]
d398:   6b01011fcmp w8, w1
d39c:   5461b.ned3a8
  // b.any
d3a0:   88137dc5stxrw19, w5, [x14]

Thanks,
Andrew Pinski


>
> Thanks,
> Andrew
>
> >
> > Richard
> >
> > > ---
> > >
> > > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> > > index 
> > > ac89cc1f9c938455d33d8850d9ebfc0473cb73dc..cd9d813f2ac4990971f6435fdb28b0f94ae10309
> > >  100644
> > > --- a/gcc/config/aarch64/aarch64.c
> > > +++ b/gcc/config/aarch64/aarch64.c
> > > @@ -18375,6 +18375,9 @@ aarch64_emit_post_barrier (enum memmodel model)
> > >  void
> > >  aarch64_split_compare_and_swap (rtx operands[])
> > >  {
> > > +  /* Split after prolog/epilog to avoid interactions with 
> > > shrinkwrapping.  */
> > > +  gcc_assert (epilogue_completed);
> > > +
> > >rtx rval, mem, oldval, newval, scratch, x, model_rtx;
> > >machine_mode mode;
> > >bool is_weak;
> > > @@ -18469,6 +18472,9 @@ void
> > >  aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, 
> > > rtx mem,
> > >   rtx value, rtx model_rtx, rtx cond)
> > >  {
> > > +  /* Split after prolog/epilog to avoid interactions with 
> > > shrinkwrapping.  */
> > > +  gcc_assert (epilogue_completed);
> > > +
> > >machine_mode mode = GET_MODE (mem);
> > >machine_mode wmode = (mode == DImode ? DImode : SImode);
> > >const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
> > > diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
> > > index 
> > > c2bcabd0c3c2627b7222dcbc1af9c2e6b7ce6a76..996947799b5ef8445e9786b94e1ce62fd16e5b5c
> > >  100644
> > > --- a/gcc/config/aarch64/atomics.md
> > > +++ b/gcc/config/aarch64/atomics.md
> > > @@ -56,7 +56,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
> > > (clobber (match_scratch:SI 7 "="))]
> > >""
> > >"#"
> > > -  "&& reload_completed"
> > > +  "&& epilogue_completed"
> > >[(const_int 0)]
> > >{
> > >  aarch64_split_compare_and_swap (operands);
> > > @@ -80,7 +80,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
> > > (clobber (match_scratch:SI 7 "="))]
> > >""
> > >"#"
> > > -  "&& reload_completed"
> > > +  "&& epilogue_completed"
> > >[(const_int 0)]
> > >{
> > >  aarch64_split_compare_and_swap (operands);
> > > @@ -104,7 +104,7 @@ (define_insn_and_split 
> > > "@aarch64_compare_and_swap"
> > > (clobber (match_scratch:SI 7 "="))]
> > >""
> > >"#"
> > > -  "&& reload_completed"
> > > +  "&& epilogue_completed"
> > >[(const_int 0)]
> > >{
> > >  aarch64_split_compare_and_swap (operands);
> > > @@ -223,7 +223,7 @@ (define_insn_and_split "aarch64_atomic_exchange"
> > > (clobber (match_scratch:SI 4 "="))]
> > >""
> > >"#"
> > > -  "&& reload_completed"
> > > +  "&& epilogue_completed"
> > >[(const_int 0)]
> > >{
> > >  aarch64_split_atomic_op (SET, operands[0], NULL, operands[1],
> > > @@ -344,7 +344,7 @@ (define_insn_and_split 
> > > "aarch64_atomic_"
> > >(clobber (match_scratch:SI 4 "="))]
> > >""
> > >"#"
> > > -  "&& reload_completed"
> > > +  "&& epilogue_completed"
> > >[(const_int 0)]
> > >{
> > >  aarch64_split_atomic_op (, NULL, operands[3], operands[0],
> > > @@ -400,7 +400,7 @@ (define_insn_and_split "atomic_nand"
> > > (clobber (match_scratch:SI 4 "="))]
> > >""
> > >"#"
> > > -  "&& reload_completed"
> > > +  "&& epilogue_completed"
> > >[(const_int 0)]
> > >{
> > >   aarch64_split_atomic_op (NOT, NULL, operands[3], operands[0],
> > > @@ -504,7 +504,7

Re: [PATCH][AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

2020-01-16 Thread Andrew Pinski

On Thu, Jan 16, 2020 at 5:14 AM Richard Sandiford
 wrote:
>
> Wilco Dijkstra  writes:
> > The separate shrinkwrapping pass may insert stores in the middle
> > of atomics loops which can cause issues on some implementations.
> > Avoid this by delaying splitting of atomic patterns until after
> > prolog/epilog generation.
> >
> > Bootstrap completed, no test regressions on AArch64.
> >
> > Andrew, can you verify this fixes the failure you were getting?
> >
> > ChangeLog:
> > 2020-01-16  Wilco Dijkstra  
> >
> > PR target/92692
> > * config/aarch64/aarch64.c (aarch64_split_compare_and_swap)
> > Add assert to ensure prolog has been emitted.
> > (aarch64_split_atomic_op): Likewise.
> > * config/aarch64/atomics.md (aarch64_compare_and_swap)
> > Use epilogue_completed rather than reload_completed.
> > (aarch64_atomic_exchange): Likewise.
> > (aarch64_atomic_): Likewise.
> > (atomic_nand): Likewise.
> > (aarch64_atomic_fetch_): Likewise.
> > (atomic_fetch_nand): Likewise.
> > (aarch64_atomic__fetch): Likewise.
> > (atomic_nand_fetch): Likewise.
>
> OK if Andrew confirms it works, thanks.

Yes this fixes the issue for me.

Thanks,
Andrew

>
> Richard
>
> > ---
> >
> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> > index 
> > ac89cc1f9c938455d33d8850d9ebfc0473cb73dc..cd9d813f2ac4990971f6435fdb28b0f94ae10309
> >  100644
> > --- a/gcc/config/aarch64/aarch64.c
> > +++ b/gcc/config/aarch64/aarch64.c
> > @@ -18375,6 +18375,9 @@ aarch64_emit_post_barrier (enum memmodel model)
> >  void
> >  aarch64_split_compare_and_swap (rtx operands[])
> >  {
> > +  /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  
> > */
> > +  gcc_assert (epilogue_completed);
> > +
> >rtx rval, mem, oldval, newval, scratch, x, model_rtx;
> >machine_mode mode;
> >bool is_weak;
> > @@ -18469,6 +18472,9 @@ void
> >  aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx 
> > mem,
> >   rtx value, rtx model_rtx, rtx cond)
> >  {
> > +  /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  
> > */
> > +  gcc_assert (epilogue_completed);
> > +
> >machine_mode mode = GET_MODE (mem);
> >machine_mode wmode = (mode == DImode ? DImode : SImode);
> >const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
> > diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
> > index 
> > c2bcabd0c3c2627b7222dcbc1af9c2e6b7ce6a76..996947799b5ef8445e9786b94e1ce62fd16e5b5c
> >  100644
> > --- a/gcc/config/aarch64/atomics.md
> > +++ b/gcc/config/aarch64/atomics.md
> > @@ -56,7 +56,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
> > (clobber (match_scratch:SI 7 "="))]
> >""
> >"#"
> > -  "&& reload_completed"
> > +  "&& epilogue_completed"
> >[(const_int 0)]
> >{
> >  aarch64_split_compare_and_swap (operands);
> > @@ -80,7 +80,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
> > (clobber (match_scratch:SI 7 "="))]
> >""
> >"#"
> > -  "&& reload_completed"
> > +  "&& epilogue_completed"
> >[(const_int 0)]
> >{
> >  aarch64_split_compare_and_swap (operands);
> > @@ -104,7 +104,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
> > (clobber (match_scratch:SI 7 "="))]
> >""
> >"#"
> > -  "&& reload_completed"
> > +  "&& epilogue_completed"
> >[(const_int 0)]
> >{
> >  aarch64_split_compare_and_swap (operands);
> > @@ -223,7 +223,7 @@ (define_insn_and_split "aarch64_atomic_exchange"
> > (clobber (match_scratch:SI 4 "="))]
> >""
> >"#"
> > -  "&& reload_completed"
> > +  "&& epilogue_completed"
> >[(const_int 0)]
> >{
> >  aarch64_split_atomic_op (SET, operands[0], NULL, operands[1],
> > @@ -344,7 +344,7 @@ (define_insn_and_split 
> > "aarch64_atomic_"
> >(clobber (match_scratch:SI 4 "="))]
> >""
> >"#"
> > -  "&& reload_completed"
> > +  "&& epilogue_completed"
> >[(const_int 0)]
> >{
> >  aarch64_split_atomic_op (, NULL, operands[3], operands[0],
> > @@ -400,7 +400,7 @@ (define_insn_and_split "atomic_nand"
> > (clobber (match_scratch:SI 4 "="))]
> >""
> >"#"
> > -  "&& reload_completed"
> > +  "&& epilogue_completed"
> >[(const_int 0)]
> >{
> >   aarch64_split_atomic_op (NOT, NULL, operands[3], operands[0],
> > @@ -504,7 +504,7 @@ (define_insn_and_split 
> > "aarch64_atomic_fetch_"
> > (clobber (match_scratch:SI 5 "="))]
> >""
> >"#"
> > -  "&& reload_completed"
> > +  "&& epilogue_completed"
> >[(const_int 0)]
> >{
> >  aarch64_split_atomic_op (, operands[0], operands[4], operands[1],
> > @@ -551,7 +551,7 @@ (define_insn_and_split "atomic_fetch_nand"
> > (clobber (match_scratch:SI 5 "="))]
> >""
> >"#"
> > -  "&& reload_completed"
> > +  "&& epilogue_completed"
> >[(const_int 0)]
> >{
> >  aarch64_split_atomic_op (NOT, operands[0], operands[4], operands[1],
> > @@ -604,7 +604,7 @@ (define_insn_and_split 
> >

Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-01-16 Thread Joseph Myers

On Thu, 16 Jan 2020, Alexandre Oliva wrote:

> And here's a followup that fixes a limitation (bug?) in libiberty that
> was hit when I attempted a last-minute simplification in lto-wrapper.
> 
> Regstrapped separately on x86_64-linux-gnu.  Ok to install?
> 
> 
> [libiberty] output empty args as a pair of quotes

This libiberty / lto-wrapper patch is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Fix ICE caused by swallowing a token in c_parser_consume_token

2020-01-16 Thread Jeff Law

On Thu, 2020-01-16 at 23:43 +, Joseph Myers wrote:
> Thanks, patch committed.
Beat me to it :-)  I spun it yesterday, but didn't get around to
committing it.
jeff

Re: Copy list of development branches from svn.html into git.html

2020-01-16 Thread Joseph Myers

On Fri, 17 Jan 2020, Jakub Jelinek wrote:

> On Thu, Jan 16, 2020 at 11:22:15PM +, Joseph Myers wrote:
> > Committed.  I think the main remaining things that need work in our git 
> > documentation on the website are:
> 
> Shouldn't we move some of the branches from Active Development to Inactive?

Yes.  I only reviewed the ones under "General Infrastructure" (and one of 
the ones left there should probably be considered inactive as well, being 
a git-mirror-only branch).  If it's ended up in refs/dead/heads/ it can be 
presumed inactive and some vendor branches are likely inactive as well.

(Quite possibly some branches in devel/ are also inactive, but there are 
few enough at present it doesn't seem a priority to figure out a suitably 
smooth process for moving such branches into refs/dead/ - at present it 
would be necessary to create the refs/dead/ branch and remove the devel/ 
one manually on the server.  It may be better to leave any such moves out 
of the fetched-by-default branches into not-fetched-by-default refs until 
after the new sourceware is in use, since the new system should have 
enough resources for fully repacking the live repository on sourceware to 
be reasonable, and such a full repack is appropriate after moving anything 
out of the fetched-by-default refs in order to update the delta islands 
used in the packing.)

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] vect: Fix ICE in vectorizable_comparison PR93292

2020-01-16 Thread Jakub Jelinek

Hi!

The following testcase ICEs on powerpc64le-linux.  The problem is that
get_vectype_for_scalar_type returns NULL, and while most places in
tree-vect-stmts.c handle that case, this spot doesn't and punts only
if it is non-NULL, but with different number of elts than expected.

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2020-01-16  Jakub Jelinek  

PR tree-optimization/93292
* tree-vect-stmts.c (vectorizable_comparison): Punt also if
get_vectype_for_scalar_type returns NULL.

* g++.dg/opt/pr93292.C: New test.

--- gcc/tree-vect-stmts.c.jj2020-01-12 11:54:38.522381590 +0100
+++ gcc/tree-vect-stmts.c   2020-01-16 19:42:30.60882 +0100
@@ -10492,7 +10492,7 @@ vectorizable_comparison (stmt_vec_info s
 {
   vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (rhs1),
 slp_node);
-  if (maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
+  if (!vectype || maybe_ne (TYPE_VECTOR_SUBPARTS (vectype), nunits))
return false;
 }
   else if (maybe_ne (nunits, TYPE_VECTOR_SUBPARTS (vectype)))
--- gcc/testsuite/g++.dg/opt/pr93292.C.jj   2020-01-16 19:48:51.110144613 
+0100
+++ gcc/testsuite/g++.dg/opt/pr93292.C  2020-01-16 19:47:57.351956177 +0100
@@ -0,0 +1,18 @@
+// PR tree-optimization/93292
+// { dg-do compile }
+// { dg-options "-O3 -w" }
+
+struct A {
+  static int foo (float x) { static int b; b = x ? x + 0.5 : 0; return b; }
+};
+
+void
+bar (int *d, float e)
+{
+  float g;
+  for (int h = 0; h < 64; h++)
+{
+  d[h] += A::foo (g < 0 ? : g > 5 ? : g);
+  A::foo (e);
+}
+}

Jakub

Re: [PATCH] Fix ICE caused by swallowing a token in c_parser_consume_token

2020-01-16 Thread Joseph Myers

Thanks, patch committed.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: Copy list of development branches from svn.html into git.html

2020-01-16 Thread Jakub Jelinek

On Thu, Jan 16, 2020 at 11:22:15PM +, Joseph Myers wrote:
> Committed.  I think the main remaining things that need work in our git 
> documentation on the website are:

Shouldn't we move some of the branches from Active Development to Inactive?
I mean, some branches haven't been touched since 2013, others since 2014 or
2011 (e.g. avx{512,512vlbwdq,2}), but various others are similarly inactive.
I guess I should post a patch for the redhat/ branches, because the ones
listed in there are certainly all inactive, maybe it is better to use
a wildcard like it was used for linaro.

Jakub

Copy list of development branches from svn.html into git.html

2020-01-16 Thread Joseph Myers

This patch makes a start on making the branch information from
svn.html available in git.html.  The list is copied, some notes added
on devel/, refs/dead/heads/ and refs/vendors/ naming, and inactive
branches from the general section (ones not present in
refs/heads/devel/, automatically moved to refs/dead/heads/ as part of
the conversion, because of lack of commits since at least December
2017) are moved into the section of inactive branches that might or
might not have been merged.  Significant further work on the list is
needed to ensure that all inactive branches are moved to inactive
sections, that active branches are properly listed in active sections,
and that it is clear for each branch where it appears in git
(including vendor branches, and git-mirror-only branches that are now
in refs/git-old/heads).

Committed.  I think the main remaining things that need work in our git 
documentation on the website are:

* More details on branch/tag naming in general, and how to use 
not-fetched-by-default branches.

* More work on the list of branches, as discussed above.

* Something about the gcc-git-customization.sh script.

* A realistic example commit session like the one in svnwrite.html.

* Documentation of rebasing a branch that was originally created in the 
git-svn mirror.

* Something about appropriate formatting and content of commit messages.

* At an appropriate point svn.html and svnwrite.html should be redirected 
to git.html and gitwrite.html.  Until then they should probably have 
warnings at the top that the SVN repository is now read-only and 
superseded by the Git repository.

diff --git a/htdocs/git.html b/htdocs/git.html
index 725bd81c..6100146e 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -197,5 +197,1079 @@ To create a worktree for a new project branch based on 
master, do
 git worktree add -b project ../project master
 
 
+Active Development Branches
+
+
+General
+Architecture
+Target
+Language
+Distribution
+Merged
+Inactive
+ 
+
+  
+General Infrastructure
+
+Active development branches have names starting devel/
+in Git.
+
+
+
+  ira-select
+  This branch is for work on a new algorithm of calculations of
+  pseudo register classes. The new algorithm is based on choosing an
+  insn alternative first and then calculation of pseudo reg class
+  costs knowing the alternative.  The branch is maintained by Vladimir
+  Makarov mailto:vmaka...@redhat.com;>vmaka...@redhat.com.
+  
+
+  https://gcc.gnu.org/wiki/OpenACC;>openacc-gcc-9-branch
+  This https://gcc.gnu.org/wiki/GitMirror;>Git-only branch is
+  used for collaborative development
+  of https://gcc.gnu.org/wiki/OpenACC;>OpenACC support and related
+  functionality, such
+  as https://gcc.gnu.org/wiki/Offloading;>offloading support.  The
+  branch is based on gcc-9-branch.  Find it
+  at git://gcc.gnu.org/git/gcc.git,
+  https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-9-branch;>https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/openacc-gcc-9-branch,
+  or
+  https://github.com/gcc-mirror/gcc/tree/openacc-gcc-9-branch;>https://github.com/gcc-mirror/gcc/tree/openacc-gcc-9-branch.
+  Please send patch emails with a short-hand [og9] tag in the
+  subject line, and use ChangeLog.openacc files.
+
+  unified-autovect
+  This branch is for work on improving effectiveness and generality of 
GCC's
+  autovectorization by performing target-aware reordering instruction selection
+  using unified representation. This branch is maintained by Sameera Deshpande
+  mailto:sameera.deshpa...@imgtec.com;>sameera.deshpa...@imgtec.com.
+
+
+
+Architecture-specific
+
+
+  https://gcc.gnu.org/git/?p=gcc.git;a=shortlog;h=refs/heads/aarch64/sve-acle-branch;>aarch64/sve-acle-branch
+  This https://gcc.gnu.org/wiki/GitMirror;>Git-only branch is
+  used for collaborative development of the AArch64 SVE ACLE implementation.
+  The branch is based off and merged with trunk.  Please send patches to
+  gcc-patches with an [SVE ACLE] tag in the subject line.
+  There's no need to use ChangeLogs; the ChangeLogs will instead be
+  written when the work is ready to be merged into trunk.  The branch is
+  maintained by Richard Sandiford.
+
+  arc-20081210-branch
+  The goal of this branch is to make the port to the ARCompact
+  architecture available.  This branch is maintained by Joern Rennecke
+  during spring 2009, and is expected to be unmaintained thereafter.
+
+  avx512
+  The goal of this branch is to implement Intel AVX-512 and SHA
+  Programming Reference.
+  The branch is maintained by Yukhin Kirill mailto:kirill.yuk...@intel.com;>kirill.yuk...@intel.com.
+  Patches should be marked with the tag [AVX512] in the subject
+  line.
+
+  avx-512vlbwdq
+  The goal of this branch is to implement the Intel AVX-512{VL,BW,DQ}
+  Programming Reference
+  (https://software.intel.com/sites/default/files/managed/39/c5/325462-sdm-vol-1-2abcd-3abcd.pdf;>link).
+  The branch is maintained by Yukhin Kirill

Make profile estimation more precise

2020-01-16 Thread Jan Hubicka

Hi,
while analyzing code size regression in SPEC2k GCC binary I noticed that we
perform some inline decisions because we think that number of executions are
very high.
In particular there was inline decision inlining gen_rtx_fmt_ee to find_reloads
believing that it is called 4 billion times.  This turned out to be cummulation
of roundoff errors in propagate_freq which was bit mechanically updated from
original sreals to C++ sreals and later to new probabilities.

This led us to estimate that a loopback edge is reached with probability 2.3
which was capped to 1-1/1 and since this happened in nested loop it quickly
escalated to large values.

Originally capping to REG_BR_PROB_BASE avoided such problems but now we have
much higher range.

This patch avoids going from probabilites to REG_BR_PROB_BASE so precision is
kept.  In addition it makes the propagation to not estimate more than
param-max-predicted-loop-iterations.  The first change makes the cap to not
be triggered on the gcc build, but it is still better to be safe than sorry.

2020-01-16  Jan Hubicka  

* ipa-fnsummary.c (estimate_calls_size_and_time): Fix formating of
dump.
* params.opt: (max-predicted-iterations): Set bounds.
* predict.c (real_almost_one, real_br_prob_base,
real_inv_br_prob_base, real_one_half, real_bb_freq_max): Remove.
(propagate_freq): Add max_cyclic_prob parameter; cap cyclic
probabilities; do not truncate to reg_br_prob_bases.
(estimate_loops_at_level): Pass max_cyclic_prob.
(estimate_loops): Compute max_cyclic_prob.
(estimate_bb_frequencies): Do not initialize real_*; update calculation
of back edge prob.
* profile-count.c (profile_probability::to_sreal): New.
* profile-count.h (class sreal): Move up in file.
(profile_probability::to_sreal): Declare.

diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
index 4e1c587b101..dbd53f12a40 100644
--- a/gcc/ipa-fnsummary.c
+++ b/gcc/ipa-fnsummary.c
@@ -3258,7 +3258,7 @@ estimate_calls_size_and_time (struct cgraph_node *node, 
int *size,
  gcc_assert (*size == old_size);
  if (time && (*time - old_time > 1 || *time - old_time < -1)
  && dump_file)
-   fprintf (dump_file, "Time mismatch in call summary %f!=%f",
+   fprintf (dump_file, "Time mismatch in call summary %f!=%f\n",
 old_time.to_double (),
 time->to_double ());
}
diff --git a/gcc/params.opt b/gcc/params.opt
index 31cc20031b1..f02c769d0e3 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -555,7 +555,7 @@ Common Joined UInteger Var(param_max_pow_sqrt_depth) 
Init(5) IntegerRange(1, 32)
 Maximum depth of sqrt chains to use when synthesizing exponentiation by a real 
constant.
 
 -param=max-predicted-iterations=
-Common Joined UInteger Var(param_max_predicted_iterations) Init(100) Param 
Optimization
+Common Joined UInteger Var(param_max_predicted_iterations) Init(100) 
IntegerRange(1, 65536) Param Optimization
 The maximum number of loop iterations we predict statically.
 
 -param=max-reload-search-insns=
diff --git a/gcc/predict.c b/gcc/predict.c
index 02c5fe0667d..c3aed9ed854 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -76,10 +76,6 @@ enum predictor_reason
 static const char *reason_messages[] = {"", " (ignored)",
 " (single edge duplicate)", " (edge pair duplicate)"};
 
-/* real constants: 0, 1, 1-1/REG_BR_PROB_BASE, REG_BR_PROB_BASE,
-  1/REG_BR_PROB_BASE, 0.5, BB_FREQ_MAX.  */
-static sreal real_almost_one, real_br_prob_base,
-real_inv_br_prob_base, real_one_half, real_bb_freq_max;
 
 static void combine_predictions_for_insn (rtx_insn *, basic_block);
 static void dump_prediction (FILE *, enum br_predictor, int, basic_block,
@@ -3266,7 +3262,8 @@ public:
TOVISIT, starting in HEAD.  */
 
 static void
-propagate_freq (basic_block head, bitmap tovisit)
+propagate_freq (basic_block head, bitmap tovisit,
+   sreal max_cyclic_prob)
 {
   basic_block bb;
   basic_block last;
@@ -3322,22 +3319,14 @@ propagate_freq (basic_block head, bitmap tovisit)
 
  FOR_EACH_EDGE (e, ei, bb->preds)
if (EDGE_INFO (e)->back_edge)
- {
-   cyclic_probability += EDGE_INFO (e)->back_edge_prob;
- }
+ cyclic_probability += EDGE_INFO (e)->back_edge_prob;
else if (!(e->flags & EDGE_DFS_BACK))
  {
-   /*  frequency += (e->probability
- * BLOCK_INFO (e->src)->frequency /
- REG_BR_PROB_BASE);  */
-
/* FIXME: Graphite is producing edges with no profile. Once
   this is fixed, drop this.  */
sreal tmp = e->probability.initialized_p () ?
-   e->probability.to_reg_br_prob_base () : 0;
-   tmp *= BLOCK_INFO (e->src)->frequency;
-   tmp *=

[wwwdocs] Remove reference to SVN and our svn.html page from projects/.

2020-01-16 Thread Gerald Pfeifer

...usually by simplifying and removing the link.

(And add a blank linke between the  and  tags where 
there was none.)

Pushed.

Gerald

---
 htdocs/projects/beginner.html | 4 ++--
 htdocs/projects/cfo.html  | 3 +--
 htdocs/projects/cli.html  | 3 +--
 htdocs/projects/cxx-reflection/index.html | 2 +-
 htdocs/projects/gomp/index.html   | 5 ++---
 htdocs/projects/gupc.html | 5 ++---
 htdocs/projects/sched-treegion.html   | 4 ++--
 htdocs/projects/tree-ssa/index.html   | 3 ++-
 8 files changed, 13 insertions(+), 16 deletions(-)

diff --git a/htdocs/projects/beginner.html b/htdocs/projects/beginner.html
index c4b7214c..0a705dbf 100644
--- a/htdocs/projects/beginner.html
+++ b/htdocs/projects/beginner.html
@@ -24,8 +24,8 @@ individual task seems daunting; there's probably an easier 
one.  If
 you have no programming skills, we can still use your help with documentation or our bug tracker.
 
-We assume that you already know how to get the
-latest sources, configure and build the compiler, and run the test
+We assume that you already know how to get the
+latest sources, configure and build the compiler, and run the test
 suite.  You should also familiarize yourself with the
 requirements for contributions to GCC.
 
diff --git a/htdocs/projects/cfo.html b/htdocs/projects/cfo.html
index 4ec8c82b..8f4b4230 100644
--- a/htdocs/projects/cfo.html
+++ b/htdocs/projects/cfo.html
@@ -47,8 +47,7 @@ algorithms). The implementation currently resides on the 
branch.
 
 Contributing
 
-Checkout the cfo-branch branch from
-our respository.
+Checkout the cfo-branch branch from our respository.
 
 When posting to the development lists, please mark messages and patches with
 [cfo] in the subject. The usual contribution and testing rules apply. This
diff --git a/htdocs/projects/cli.html b/htdocs/projects/cli.html
index fcbcbaed..f6258fc1 100644
--- a/htdocs/projects/cli.html
+++ b/htdocs/projects/cli.html
@@ -134,8 +134,7 @@ People currently involved in the development or that had 
worked on it in the pas
 
 Contributing
 
-Check out st/cli-be branch following the instructions found in 
the
-SVN documentation.
+Check out st/cli-be branch.
 
 Being this a branch, the usual maintainer rules do not apply.
 The branch is being maintained by
diff --git a/htdocs/projects/cxx-reflection/index.html 
b/htdocs/projects/cxx-reflection/index.html
index 2f5dd953..276327aa 100644
--- a/htdocs/projects/cxx-reflection/index.html
+++ b/htdocs/projects/cxx-reflection/index.html
@@ -25,7 +25,7 @@ constrained genericity in C++.
 Contributing
 
 Checkout the cxx-reflection-branch branch
-in our respository.
+in our respository.
 
 When posting to the development lists, please mark messages and patches with
 [cxx-reflection] in the subject.  As this is a branch, and not
diff --git a/htdocs/projects/gomp/index.html b/htdocs/projects/gomp/index.html
index b96ce0d4..39c8d3ad 100644
--- a/htdocs/projects/gomp/index.html
+++ b/htdocs/projects/gomp/index.html
@@ -1,5 +1,6 @@
 
 
+
 
 
 GOMP  An OpenMP implementation for GCC
@@ -50,9 +51,7 @@ the generation of efficient and small code for OpenMP 
applications.
 Contributing
 We encourage everyone to
 contribute changes
-and help test GOMP.  GOMP has been merged into the SVN trunk.  We
-provide read access to our development sources for everybody with
-anonymous SVN.
+and help test GOMP.  GOMP has been merged into mainline GCC.
 
 Reporting Bugs
 Please add "openmp" to the keywords field when filing a bug report.
diff --git a/htdocs/projects/gupc.html b/htdocs/projects/gupc.html
index 9d3aeda0..3b822ca3 100644
--- a/htdocs/projects/gupc.html
+++ b/htdocs/projects/gupc.html
@@ -1,5 +1,6 @@
 
 
+
 
 
 GUPC  A UPC implementation for GCC
@@ -117,9 +118,7 @@ in the body of the message.
 
 We encourage everyone to
 contribute changes
-and help test GUPC.  GUPC is currently an SVN branch.  We
-provide read access to our development sources for everybody with
-anonymous SVN.
+and help test GUPC.  GUPC is currently on a development branch.
 
 
 Reporting Bugs
diff --git a/htdocs/projects/sched-treegion.html 
b/htdocs/projects/sched-treegion.html
index b906bcf0..d421d87b 100644
--- a/htdocs/projects/sched-treegion.html
+++ b/htdocs/projects/sched-treegion.html
@@ -1,5 +1,6 @@
 
 
+
 
 
 Treegion Scheduling
@@ -151,8 +152,7 @@ speculation for instructions dominated by this basic block.
 
 Contributing
 
-Checkout the sched-treegion branch following the instructions found in the 
-SVN documentation.
+Checkout the sched-treegion branch.
 
 When posting to the development lists, please mark messages and patches 
 with [sched-treegion] in the subject. The usual contribution and testing 
diff --git a/htdocs/projects/tree-ssa/index.html 
b/htdocs/projects/tree-ssa/index.html
index 0b2e01d5..a15d0f32 100644
--- a/htdocs/projects/tree-ssa/index.html
+++ b/htdocs/projects/tree-ssa/index.html
@@ -1,5 +1,6 @@
 
 
+
 
 
 SSA for Trees
@@ -80,7 +81,7 @@

[C++ PATCH] PR c++/93280 - ICE with aggregate assignment and DMI.

2020-01-16 Thread Jason Merrill

I recently added an assert to cp-gimplify to catch any
TARGET_EXPR_DIRECT_INIT_P being expanded without a target object, and this
testcase found one.  We started out with a TARGET_EXPR around the
CONSTRUCTOR, which would normally mean that the member initializer would be
used to directly initialize the appropriate member of whatever object the
TARGET_EXPR ends up initializing.  But then gimplify_modify_expr_rhs
stripped the TARGET_EXPR in order to assign directly from the elements of
the CONSTRUCTOR, leaving no object for the TARGET_EXPR_DIRECT_INIT_P to
initialize.  I considered setting CONSTRUCTOR_PLACEHOLDER_BOUNDARY in that
case, which implies TARGET_EXPR_NO_ELIDE, but decided that there's no
particular reason the A initializer needs to initialize a member of a B
rather than a distinct A object, so let's only set TARGET_EXPR_DIRECT_INIT_P
when we're using the DMI in a constructor.

Tested x86_64-pc-linux-gnu, applying to trunk.

* init.c (get_nsdmi): Set TARGET_EXPR_DIRECT_INIT_P here.
* typeck2.c (digest_nsdmi_init): Not here.
---
 gcc/cp/init.c| 4 
 gcc/cp/typeck2.c | 3 ---
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index ba80474e6ac..543d127abcd 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -655,6 +655,10 @@ get_nsdmi (tree member, bool in_ctor, tsubst_flags_t 
complain)
   if (simple_target)
 init = TARGET_EXPR_INITIAL (init);
   init = break_out_target_exprs (init, /*loc*/true);
+  if (in_ctor && init && TREE_CODE (init) == TARGET_EXPR)
+/* This expresses the full initialization, prevent perform_member_init from
+   calling another constructor (58162).  */
+TARGET_EXPR_DIRECT_INIT_P (init) = true;
   if (simple_target && TREE_CODE (init) != CONSTRUCTOR)
 /* Now put it back so C++17 copy elision works.  */
 init = get_target_expr (init);
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index f36a5649ae6..371b203c29b 100644
--- a/gcc/cp/typeck2.c
+++ b/gcc/cp/typeck2.c
@@ -1369,9 +1369,6 @@ digest_nsdmi_init (tree decl, tree init, tsubst_flags_t 
complain)
   && CP_AGGREGATE_TYPE_P (type))
 init = reshape_init (type, init, complain);
   init = digest_init_flags (type, init, flags, complain);
-  if (TREE_CODE (init) == TARGET_EXPR)
-/* This represents the whole initialization.  */
-TARGET_EXPR_DIRECT_INIT_P (init) = true;
   return init;
 }
 

base-commit: 98d56ea8900fdcff8f1987cf2bf499a5b7399857
-- 
2.18.1

[patch, fortran] Fix PR 44960, accepts invalid reference on function

2020-01-16 Thread Thomas Koenig


Hello world,

here is my first patch made from the git world.  It certainly took
enough time to work out how to to this, but I think I have it
figured out now...

Anyway, the fix is rather straightforward. We cannot have a reference
on a function. If this slipped through on parsing, let's issue the
error during resolution.

Regression-tested. OK for trunk?

Regards

Thomas

2020-01-16  Thomas König  

PR fortran/44960
* resolve.c (resolve_function): Issue error when a
function call contains a reference.

2020-01-16  Thomas König  

PR fortran/44960
* gfortran.dg/function_reference_1.f90: New test.
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 6f2a4c4d65a..1525c00ea4c 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -3129,6 +3129,13 @@ resolve_function (gfc_expr *expr)
 	  || sym->intmod_sym_id == GFC_ISYM_CAF_SEND))
 return true;
 
+  if (expr->ref)
+{
+  gfc_error ("Function call can not contain a reference at %L",
+		 >where);
+  return false;
+}
+
   if (sym && sym->attr.intrinsic
   && !gfc_resolve_intrinsic (sym, >where))
 return false;
diff --git a/gcc/testsuite/gfortran.dg/function_reference_1.f90 b/gcc/testsuite/gfortran.dg/function_reference_1.f90
new file mode 100644
index 000..78c92a6f20d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/function_reference_1.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! PR 44960 - this was erroneusly accepted.
+! Original test case by Daniel Franke.
+
+type t
+  integer :: a
+end type t
+type(t) :: foo
+print *, foo(1)%a ! { dg-error "Function call can not contain a reference" }
+end
+
! { dg-do compile }
! PR 44960 - this was erroneusly accepted.
! Original test case by Daniel Franke.

type t
  integer :: a
end type t
type(t) :: foo
print *, foo(1)%a ! { dg-error "Function call can not contain a reference" }
end

[wwwdocs] Promote Git on our home page.

2020-01-16 Thread Gerald Pfeifer

Pushed.

Gerald
---
 htdocs/index.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/index.html b/htdocs/index.html
index 89bcfcca..41bcfe18 100644
--- a/htdocs/index.html
+++ b/htdocs/index.html
@@ -32,7 +32,7 @@ of native and cross targets (including GNU/Linux), and 
encourage everyone
 to contribute changes or help
 testing GCC.
 Our sources are readily and freely available via
-SVN and weekly
+Git and weekly
 snapshots.
 
 Major decisions about GCC are made by the 
-- 
2.24.1

Re: [PATCH] libstdc++/91223 Improve unordered containers == operator

2020-01-16 Thread François Dumont


On 1/16/20 5:01 PM, Jonathan Wakely wrote:

On 16/01/20 13:25 +, Jonathan Wakely wrote:

On 16/01/20 07:42 +0100, François Dumont wrote:

On 1/15/20 10:52 PM, Jonathan Wakely wrote:

On 15/01/20 21:48 +, Jonathan Wakely wrote:

On 14/01/20 22:25 +0100, François Dumont wrote:

On 1/13/20 10:53 PM, Jonathan Wakely wrote:

On 13/01/20 22:41 +0100, François Dumont wrote:


For the multi-keys we could still avoid redundant comparisons 
when _Equal is just doing == on the key type. On 
unordered_multiset we could just avoids the call to 
is_permuation and on the unordered_multimap we could check the 
is_permutation only on the associated value rather than on the 
std::pair.



I don't think that's necessary, or helpful.

The idea of 
https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is

that you shouldn't be using _Equal at all, and therefore it doesn't
matter whether it's std::equal_to or not.



And it was indeed possible.


Nice!


    PR libstdc++/91223
    * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> 
friend.

    * include/bits/hashtable_policy.h: Include .
    (_Equality_base): Remove.
    (_Equality<>::_M_equal): Review implementation. Use 
std::is_permutation.

    * testsuite/23_containers/unordered_multiset/operators/1.cc
    (Hash, Equal, test02, test03): New.
    * testsuite/23_containers/unordered_set/operators/1.cc
    (Hash, Equal, test02, test03): New.

Tested under Linux x86_64.

Ok to commit ?


Yes, OK for trunk (we're in stage4 but your patch was posted in 
stage3

and fixes a pretty nasty performance bug, so is OK now).

N.B. GCC has moved to Git instead of Subversion. If you don't have 
Git

access set up let me know and I can commit the patch for you.


I haven't done the move yet and won't be able to do it before the 
week-end. So please proceed to the commit for me, thanks.


No problem, I can do that.


Your patch is now committed to trunk. Thanks for the major
improvement.

I had a look at std::is_permutation and I think we can make some
simplifications to the 4-argument overload, and we can share most of
the code between the 3-arg and 4-arg overloads (once they've confirmed
the lengths are the same they do exactly the same thing). See the
attached patch. This should probably wait for stage1 though.

I also wanted to add some comments to the _Equality::_M_equal
specialiation for unordered_multimap/multiset to explain what the code
was doing, and had some more ideas. See patch again.

It looks like this loop can potentially visit every element of
__other, instead of stopping at the end of the bucket:

  typename __hashtable::const_iterator __ity(__y_n);
  for (auto __ity_end = __ity; __ity_end != __other.end(); ++__ity_end)
    if (--__x_count == 0)
  break;

Consider a case like this:

unordered_multiset a{1, 2, 3, 4};
for (int i = 0; i <1; ++i)
  a.insert(1);
unordered_multiset b{1, 2, 3, 4};
for (int i = 0; i <1; ++i)
  b.insert(2);

When doing a == b we'll find 1 elements in a with key '1',
and find one element in b with that key. But then we iterate through
every element in b after that one, even though they have different
keys and are probably in different buckets.

Instead of just iterating from __ity to __other.end(), can we use a
local iterator so we stop at the end of the bucket?

This seems to make the PR91263 example *very* slightly slower, but
makes the example above significantly faster.

What do you think?


The hashtable implementation is doing its best to provide good 
performances as long as the user does its part of the job. Mainly 
provide a good hash to distrubute elements smoothly throughout the 
buckets. But also avoid this kind of unordered_multiset.


If you check if you don't move out of bucket you'll have to pay for the 
bucket computation (subject of PR 68303) or perform a redundant _Equal 
to check when we left the range of equivalent elements like it used to 
be done. Current implementation leave it to the std::is_permutation to 
do that which in normal situation will be better I think.

Re: [PATCH] Fix noreorder symbol partitioning reversion.

2020-01-16 Thread Jan Hubicka

> Hi.
> 
> The patch is fixes a regression in libgcrypt package where
> we incorrectly forget to stream out a definition of a no-reorder symbol.
> It's caused by LTO balanced map reversion, where we do not revert
> also best_noreorder_pos.
> 
> It's pre-approved patch by Honza and I'm going to install it.
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Thanks,
> Martin
> 
> 
> gcc/lto/ChangeLog:
> 
> 2020-01-16  Martin Liska  
> 
>   * lto-partition.c (lto_balanced_map): Remember
>   best_noreorder_pos and then restore to it
>   when we revert.

Thanks,
also please backport it to release branches once it survives some
testing in mainline.

Honza

[PATCH] Fix noreorder symbol partitioning reversion.

2020-01-16 Thread Martin Liška


Hi.

The patch is fixes a regression in libgcrypt package where
we incorrectly forget to stream out a definition of a no-reorder symbol.
It's caused by LTO balanced map reversion, where we do not revert
also best_noreorder_pos.

It's pre-approved patch by Honza and I'm going to install it.
Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin


gcc/lto/ChangeLog:

2020-01-16  Martin Liska  

* lto-partition.c (lto_balanced_map): Remember
best_noreorder_pos and then restore to it
when we revert.
---
 gcc/lto/lto-partition.c | 3 +++
 1 file changed, 3 insertions(+)


diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
index 3a9990903c7..8e0488ab13e 100644
--- a/gcc/lto/lto-partition.c
+++ b/gcc/lto/lto-partition.c
@@ -471,6 +471,7 @@ void
 lto_balanced_map (int n_lto_partitions, int max_partition_size)
 {
   int n_varpool_nodes = 0, varpool_pos = 0, best_varpool_pos = 0;
+  int best_noreorder_pos = 0;
   auto_vec  order (symtab->cgraph_count);
   auto_vec noreorder;
   auto_vec varpool_order;
@@ -732,6 +733,7 @@ lto_balanced_map (int n_lto_partitions, int max_partition_size)
 	  best_i = i;
 	  best_n_nodes = lto_symtab_encoder_size (partition->encoder);
 	  best_varpool_pos = varpool_pos;
+	  best_noreorder_pos = noreorder_pos;
 	}
   if (dump_file)
 	fprintf (dump_file, "Step %i: added %s, size %i, "
@@ -752,6 +754,7 @@ lto_balanced_map (int n_lto_partitions, int max_partition_size)
 			 i - best_i, best_i);
 	  undo_partition (partition, best_n_nodes);
 	  varpool_pos = best_varpool_pos;
+	  noreorder_pos = best_noreorder_pos;
 	}
 	  gcc_assert (best_size == partition->insns);
 	  i = best_i;

[PATCH] Redirect cvswrite.html to gitwrite.html instead of svnwrite.html.

2020-01-16 Thread Gerald Pfeifer

I doubt there's many references to https://gcc.gnu.org/cvs.html 
left, but just in case - Git is the new SVN is the new CVS. :-)

Gerald
---
 htdocs/.htaccess | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/htdocs/.htaccess b/htdocs/.htaccess
index acaac093..a28af129 100644
--- a/htdocs/.htaccess
+++ b/htdocs/.htaccess
@@ -58,7 +58,7 @@ Redirect permanent /java/ 
https://gcc.gnu.org/
 
 Redirect permanent /bugs.html  https://gcc.gnu.org/bugs/
 Redirect permanent /c9xstatus.html 
https://gcc.gnu.org/c99status.html
-Redirect permanent /cvswrite.html  
https://gcc.gnu.org/svnwrite.html
+Redirect permanent /cvswrite.html  
https://gcc.gnu.org/gitwrite.html
 Redirect permanent /gnats.html https://gcc.gnu.org/bugs/
 Redirect permanent /proj-bp.html   
https://gcc.gnu.org/projects/bp/main.html
 Redirect permanent /proj-cpplib.html   
https://gcc.gnu.org/projects/cpplib.html
-- 
2.24.1

Re: [committed, amdgcn] Allow constants in vector extends and truncates

2020-01-16 Thread Andrew Stubbs


On 19/12/2019 17:39, Richard Sandiford wrote:

Andrew Stubbs  writes:

This patch changes the operand predicates such that vector constants are
permitted during compilation.  This prevents ICEs caused by the compiler
trying to emit such instructions without checking.


That sounds like a target-independent bug though.  Why didn't we apply
the predicates in the normal way?


Sorry, I've just got back to checking this.

I've now confirmed that this is not a target independent bug. The code 
generating the instructions is in expanders elsewhere in this machine 
description. They don't attempt to fold constants, so we get an ICE when 
the following vregs pass attempts to recognise the instructions.


By relaxing the predicates we can allow the compiler to fold the 
instructions in the normal way, and keep the other expanders' code 
straight-forward.


Andrew

Re: [PATCH][Arm] Only enable fsched-pressure with Ofast

2020-01-16 Thread Wilco Dijkstra



ping

The current pressure scheduler doesn't appear to correctly track register
pressure and avoid creating unnecessary spills when register pressure is high.
As a result disabling the early scheduler improves integer performance
considerably and reduces codesize as a bonus. Since scheduling floating point
code is generally beneficial (more registers and higher latencies), only enable
the pressure scheduler with -Ofast.

On Cortex-A57 this gives a 0.7% performance gain on SPECINT2006 as well
as a 0.2% codesize reduction.

Bootstrapped on armhf. OK for commit?

ChangeLog:

2019-11-06  Wilco Dijkstra  

* gcc/common/config/arm-common.c (arm_option_optimization_table):
Enable fsched_pressure with Ofast only.

--
diff --git a/gcc/common/config/arm/arm-common.c 
b/gcc/common/config/arm/arm-common.c
index 
41a920f6dc96833e778faa8dbcc19beac483734c..b761d3abd670a144a593c4b410b1e7fbdcb52f56
 100644
--- a/gcc/common/config/arm/arm-common.c
+++ b/gcc/common/config/arm/arm-common.c
@@ -38,7 +38,7 @@ static const struct default_options 
arm_option_optimization_table[] =
   {
 /* Enable section anchors by default at -O1 or higher.  */
 { OPT_LEVELS_1_PLUS, OPT_fsection_anchors, NULL, 1 },
-{ OPT_LEVELS_1_PLUS, OPT_fsched_pressure, NULL, 1 },
+{ OPT_LEVELS_FAST, OPT_fsched_pressure, NULL, 1 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };

Re: [PATCH] libstdcxx: Update ctype_base.h from NetBSD upstream

2020-01-16 Thread Jason Thorpe



> On Jan 16, 2020, at 9:24 AM, Jonathan Wakely  wrote:
> 
> Thanks to everyone for their feedback. I've pushed the patch to GCC
> master now (I think it's reasonable to consider "doesn't build on
> NetBSD anymore" as a regression and so fixable at this point in GCC's
> development).
> 
> Please let me know of any problems you find with the changes.

Great, thanks Jonathan!

-- thorpej

Re: [PATCH][AARCH64] Enable compare branch fusion

2020-01-16 Thread Wilco Dijkstra

ping


Enable the most basic form of compare-branch fusion since various CPUs
support it. This has no measurable effect on cores which don't support
branch fusion, but increases fusion opportunities on cores which do.

Bootstrapped on AArch64, OK for commit?

ChangeLog:
2019-12-24  Wilco Dijkstra  

* config/aarch64/aarch64.c (generic_tunings): Add branch fusion.
(neoversen1_tunings): Likewise.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
a3b18b381e1748f8fe5e522bdec4f7c850821fe8..1c32a3543bec4031cc9b641973101829c77296b5
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -726,7 +726,7 @@ static const struct tune_params generic_tunings =
   SVE_NOT_IMPLEMENTED, /* sve_width  */
   4, /* memmov_cost  */
   2, /* issue_rate  */
-  (AARCH64_FUSE_AES_AESMC), /* fusible_ops  */
+  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
   "16:12", /* function_align.  */
   "4", /* jump_align.  */
   "8", /* loop_align.  */
@@ -1130,7 +1130,7 @@ static const struct tune_params neoversen1_tunings =
   SVE_NOT_IMPLEMENTED, /* sve_width  */
   4, /* memmov_cost  */
   3, /* issue_rate  */
-  AARCH64_FUSE_AES_AESMC, /* fusible_ops  */
+  (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
   "32:16", /* function_align.  */
   "32:16", /* jump_align.  */
   "32:16", /* loop_align.  */

Re: [PATCH][AARCH64] Set jump-align=4 for neoversen1

2020-01-16 Thread Wilco Dijkstra

ping


Testing shows the setting of 32:16 for jump alignment has a significant codesize
cost, however it doesn't make a difference in performance. So set jump-align
to 4 to get 1.6% codesize improvement.

OK for commit?

ChangeLog
2019-12-24  Wilco Dijkstra  

* config/aarch64/aarch64.c (neoversen1_tunings): Set jump_align to 4.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
1646ed1d9a3de8ee2f0abff385a1ea145e234475..209ed8ebbe81104d9d8cff0df31946ab7704fb33
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1132,7 +1132,7 @@ static const struct tune_params neoversen1_tunings =
   3, /* issue_rate  */
   (AARCH64_FUSE_AES_AESMC | AARCH64_FUSE_CMP_BRANCH), /* fusible_ops  */
   "32:16", /* function_align.  */
-  "32:16", /* jump_align.  */
+  "4", /* jump_align.  */
   "32:16", /* loop_align.  */
   2,   /* int_reassoc_width.  */
   4,   /* fp_reassoc_width.  */

Re: [PATCH] testsuite: Unbreak compat.exp testing with alt compiler (PR testsuite/93294)

2020-01-16 Thread David Malcolm

On Thu, 2020-01-16 at 18:01 +0100, Jakub Jelinek wrote:
> Hi!
> 
> The PR87488 changes include:
>   * lib/prune.exp (TEST_ALWAYS_FLAGS): Add -fdiagnostics-
> urls=never.
> Unfortunately, this broke all the compat.exp etc. compatibility
> testing with
> alternate compilers, because those compilers (likely) don't
> understand this
> new option and thus all alt compilations FAIL and everything that
> needs it
> is then UNRESOLVED.

Sorry about that.

> Fixed thusly, by handling it like -fdiagnostics-color=never and other
> similar options.
> 
> Tested on x86_64-linux, ok for trunk?

Looks sane to me, by analogy with the "color" handling.

Dave

Re: [PATCH] libstdcxx: Update ctype_base.h from NetBSD upstream

2020-01-16 Thread Jonathan Wakely


On 14/01/20 08:56 -0800, Jason Thorpe wrote:




On Jan 14, 2020, at 8:49 AM, Kamil Rytarowski  wrote:

On 10.01.2020 17:11, Jonathan Wakely wrote:

On 07/01/20 12:44 -0800, Jason Thorpe wrote:



On Jan 7, 2020, at 7:43 AM, Jonathan Wakely  wrote:

For Jason and Krister's benefit, that last comment was referring to
an earlier suggestion to not try to support old NetBSD releases, see
https://gcc.gnu.org/ml/libstdc++/2020-01/msg00026.html


I think we need the netbsd target maintainers (CC'd) to decide whether
GCC should still support older releases or drop that support for GCC
10. Usually I'd say we need a period of deprecation, but if GCC
doesn't currently build on NetBSD then maybe that's unnecessary.


The affected NetBSD versions are NetBSD 6 and earlier, which are EOL
from the NetBSD perspective, so I think this is OK.


So is this patch OK then?



Looks good to me.


Now that we have Kamil's confirmation that it's working for him, it also gets a 
thumbs-up from me.

Thanks!


Thanks to everyone for their feedback. I've pushed the patch to GCC
master now (I think it's reasonable to consider "doesn't build on
NetBSD anymore" as a regression and so fixable at this point in GCC's
development).

Please let me know of any problems you find with the changes.

Re: [PATCH][AArch64] Enable CLI for Armv8.6-A f64mm

2020-01-16 Thread Richard Sandiford

Matthew Malcomson  writes:
> This patch is necessary for sve-ld1ro intrinsic I posted in
> https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00466.html .
>
> I had mistakenly thought this option was already enabled upstream.
>
> This provides the option +f64mm, that turns on the 64 bit floating point
> matrix multiply extension.  This extension is only available for
> AArch64.  Turning on this extension also turns on the SVE extension.
>
> This extension is optional and only available at Armv8.2-A and onward.
>
> We also add the ACLE defined macro for this extension.
>
> Tested on aarch64 cross compiler from x86.
>
> gcc/ChangeLog:
>
> 2020-01-13  Matthew Malcomson  
>
>   * config/aarch64/aarch64-c.c (_ARM_FEATURE_MATMUL_FLOAT64):
>   Introduce this ACLE specified predefined macro.
>   * config/aarch64/aarch64-option-extensions.def (f64mm): New.
>   (fp): Disabling this disables f64mm.
>   (simd): Disabling this disables f64mm.
>   (fp16): Disabling this disables f64mm.
>   (sve): Disabling this disables f64mm.
>   * config/aarch64/aarch64.h (AARCH64_FL_F64MM): New.
>   (AARCH64_ISA_F64MM): New.
>   (TARGET_F64MM): New.
>   * doc/invoke.texi (f64mm): Document new option.

Sorry for the slow reply, somehow missed this originally.

> diff --git a/gcc/config/aarch64/aarch64-option-extensions.def 
> b/gcc/config/aarch64/aarch64-option-extensions.def
> index 
> 5022a1b3552f35364e35b3955bf2c39a33ab0752..6057b33033a0f3a8be7d656dc1e459d4d93b2842
>  100644
> --- a/gcc/config/aarch64/aarch64-option-extensions.def
> +++ b/gcc/config/aarch64/aarch64-option-extensions.def
> @@ -53,26 +53,26 @@
>  /* Enabling "fp" just enables "fp".
> Disabling "fp" also disables "simd", "crypto", "fp16", "aes", "sha2",
> "sha3", sm3/sm4, "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4",
> -   "sve2-bitperm", "i8mm" and "bf16".  */
> +   "sve2-bitperm", "i8mm", "f64mm", and "bf16".  */
>  AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | \
> AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | \
> AARCH64_FL_SHA2 | AARCH64_FL_SHA3 | AARCH64_FL_SM4 | \
> AARCH64_FL_SVE | AARCH64_FL_SVE2 | AARCH64_FL_SVE2_AES | \
> AARCH64_FL_SVE2_SHA3 | AARCH64_FL_SVE2_SM4 | \
> AARCH64_FL_SVE2_BITPERM | AARCH64_FL_I8MM | \
> -   AARCH64_FL_BF16, false, "fp")
> +   AARCH64_FL_F64MM | AARCH64_FL_BF16, false, "fp")
>  
>  /* Enabling "simd" also enables "fp".
> Disabling "simd" also disables "crypto", "dotprod", "aes", "sha2", "sha3",
> "sm3/sm4", "sve", "sve2", "sve2-aes", "sve2-sha3", "sve2-sm4",
> -   "sve2-bitperm", and "i8mm".  */
> +   "sve2-bitperm", "i8mm", and "f64mm".  */
>  AARCH64_OPT_EXTENSION("simd", AARCH64_FL_SIMD, AARCH64_FL_FP, \
> AARCH64_FL_CRYPTO | AARCH64_FL_DOTPROD | \
> AARCH64_FL_AES | AARCH64_FL_SHA2 | AARCH64_FL_SHA3 | \
> AARCH64_FL_SM4 | AARCH64_FL_SVE | AARCH64_FL_SVE2 | \
> AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 | \
> AARCH64_FL_SVE2_SM4 | AARCH64_FL_SVE2_BITPERM | \
> -   AARCH64_FL_I8MM, false, \
> +   AARCH64_FL_I8MM | AARCH64_FL_F64MM, false, \
> "asimd")
>  
>  /* Enabling "crypto" also enables "fp", "simd", "aes" and "sha2".
> @@ -92,12 +92,13 @@ AARCH64_OPT_EXTENSION("crc", AARCH64_FL_CRC, 0, 0, false, 
> "crc32")
>  AARCH64_OPT_EXTENSION("lse", AARCH64_FL_LSE, 0, 0, false, "atomics")
>  
>  /* Enabling "fp16" also enables "fp".
> -   Disabling "fp16" disables "fp16", "fp16fml", "sve", "sve2", "sve2-aes",
> -   "sve2-sha3", "sve2-sm4", and "bitperm".  */
> +   Disabling "fp16" disables "fp16", "fp16fml", "sve", "f64mm", "sve2",
> +   "sve2-aes", "sve2-sha3", "sve2-sm4", and "bitperm".  */
>  AARCH64_OPT_EXTENSION("fp16", AARCH64_FL_F16, AARCH64_FL_FP, \
> -   AARCH64_FL_F16FML | AARCH64_FL_SVE | AARCH64_FL_SVE2 | \
> -   AARCH64_FL_SVE2_AES | AARCH64_FL_SVE2_SHA3 | \
> -   AARCH64_FL_SVE2_SM4 | AARCH64_FL_SVE2_BITPERM, false, \
> +   AARCH64_FL_F16FML | AARCH64_FL_SVE | AARCH64_FL_F64MM | \
> +   AARCH64_FL_SVE2 | AARCH64_FL_SVE2_AES | \
> +   AARCH64_FL_SVE2_SHA3 | AARCH64_FL_SVE2_SM4 | \
> +   AARCH64_FL_SVE2_BITPERM, false, \
> "fphp asimdhp")

Minor (of course), but it'd probably better to keep the order of the
options consistent.  Above "f64mm" came at the end, after "i8mm" and
the SVE options, so let's do the same here.  Could you fix
s/"bitperm"/"sve2-bitperm" while you're there?

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 
> 53df4b1fdf9884d0b7cb73341d717887e79fef9c..47668731efc25dbf32a3b6b38817d7762bbfa93f
>  100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -16130,7 +16130,7 @@ and the

contrib: New remotes structure for vendor and personal refs

2020-01-16 Thread Richard Earnshaw (lists)

The initial structure for vendor and personal branches makes use of the 
default remote (normally origin) for the upstream repository). 
Unfortunately, this causes some confusion, especially for personal 
branches because a push will not push to the correct upstream location. 
This can be 'fixed' by adding a push refspec for the remote, but that 
has the unfortunate consequence of breaking the push.default behaviour 
for git push, and it becomes too easy to accidentally commit something 
unintended to the main parts of the repository.


To work around this, this patch changes the configuration to use 
separate 'remotes' for these additional refs, with one remote for the 
personal space and another remote for each vendor's space.  The personal 
space is called after the user's preferred branch-space prefix (default 
'me'), the vendor spaces are called vendor-.


As far as possible, I've made the script automatically restructure any 
existing fetch or push lines that earlier versions of the scripts may 
have created - the gcc-git-customization.sh script will convert all 
vendor refs that it can find, so it is not necessary to re-add any 
vendors you've already added.


You might, however, want to run
  git remote prune 
after running to clean up any stale upstream-refs that might still be in 
your local repo, and then

  git fetch vendor-
or
  git fetch 
to re-populate the remotes/ structures.

Also, for any branch you already have that tracks a personal or vendor 
branch upstream, you might need to run

  git config branch..remote 

so that merges and pushes go to the right place (I haven't attempted to 
automate this last part).


Please be aware that if you have multiple personal branches set up, then

  git push 

will still consider all of them for pushing.  If you only want to push 
one branch, then either write

  git push  HEAD
or
  git push  /branch
as appropriate.

And don't forget '-n' (--dry-run) to see what would be done if this were 
not a dry run.


* gcc-git-customization.sh: New personal and vendor layout.
Convert any existing fetch/push rules to new layout.
* git-fetch-vendor.sh: New vendor layout.

I'll wait a bit before pushing this, just in case someone spots an issue.

R.
diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index 26f4389bcc8..b62397314c6 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -81,6 +81,13 @@ then
 upstream="origin"
 fi
 ask "Local name for upstream repository" "origin" upstream
+
+v=$(git config --get-all "remote.${upstream}.fetch")
+if [ "x$v" = "x" ]
+then
+echo "Remote $upstream does not seem to exist as a remote"
+exit 1
+fi
 git config "gcc-config.upstream" "$upstream"
 
 remote_id=$(git config --get "gcc-config.user")
@@ -113,22 +120,38 @@ echo "(local branches starting / can be pushed directly to your"
 ask "personal area on the gcc server)" $old_pfx new_pfx
 git config "gcc-config.userpfx" "$new_pfx"
 
-echo "Setting up tracking for personal namespace $remote_id in remotes/$upstream/${new_pfx}"
-git config --replace-all "remote.${upstream}.fetch" "+refs/users/${remote_id}/heads/*:refs/remotes/${upstream}/${new_pfx}/*" ":refs/remotes/${upstream}/${old_pfx}/"
-git config --replace-all "remote.${upstream}.fetch" "+refs/users/${remote_id}/tags/*:refs/tags/${new_pfx}/*" ":refs/tags/${old_pfx}/"
+# Scan the existing settings to see if there are any we need to rewrite.
+vendors=$(git config --get-all "remote.${upstream}.fetch" "refs/vendors/" | sed -r "s:.*refs/vendors/([^/]+)/.*:\1:" | sort | uniq)
+url=$(git config --get "remote.${upstream}.url")
+pushurl=$(git config --get "remote.${upstream}.pushurl")
+for v in $vendors
+do
+echo "Migrating vendor $v to new remote vendor-$v"
+git config --unset-all "remote.${upstream}.fetch" "refs/vendors/$v/"
+git config --unset-all "remote.${upstream}.push" "refs/vendors/$v/"
+git config "remote.vendor-${v}.url" "${url}"
+if [ "x$pushurl" != "x" ]
+then
+	git config "remote.vendor-${v}.pushurl" "${pushurl}"
+fi
+git config --add "remote.vendor-${v}.fetch" "+refs/vendors/$v/heads/*:refs/remotes/vendor-${v}/*"
+git config --add "remote.vendor-${v}.fetch" "+refs/vendors/$v/tags/*:refs/tags/vendor-${v}/*"
+done
+
+echo "Setting up tracking for personal namespace $remote_id in remotes/${new_pfx}"
+git config "remote.${new_pfx}.url" "${url}"
+if [ "x$pushurl" != "x" ]
+then
+git config "remote.${new_pfx}.pushurl" "${pushurl}"
+fi
+git config --replace-all "remote.${new_pfx}.fetch" "+refs/users/${remote_id}/heads/*:refs/remotes/${new_pfx}/*" ":refs/remotes/${old_pfx}/"
+git config --replace-all "remote.${new_pfx}.fetch" "+refs/users/${remote_id}/tags/*:refs/tags/${new_pfx}/*" ":refs/tags/${old_pfx}/"
+git config --replace-all "remote.${new_pfx}.push" "refs/heads/${new_pfx}/*:refs/users/${remote_id}/heads/*" ":refs/users/${remote_id}"
 
-push_rule=$(git config --get

[PATCH] testsuite: Unbreak compat.exp testing with alt compiler (PR testsuite/93294)

2020-01-16 Thread Jakub Jelinek

Hi!

The PR87488 changes include:
* lib/prune.exp (TEST_ALWAYS_FLAGS): Add -fdiagnostics-urls=never.
Unfortunately, this broke all the compat.exp etc. compatibility testing with
alternate compilers, because those compilers (likely) don't understand this
new option and thus all alt compilations FAIL and everything that needs it
is then UNRESOLVED.

Fixed thusly, by handling it like -fdiagnostics-color=never and other
similar options.

Tested on x86_64-linux, ok for trunk?

2020-01-16  Jakub Jelinek  

PR testsuite/93294
* lib/c-compat.exp (compat-use-alt-compiler): Handle
-fdiagnostics-urls=never similarly to -fdiagnostics-color=never.
(compat_setup_dfp): Likewise.

--- gcc/testsuite/lib/c-compat.exp.jj   2020-01-12 11:54:38.457382571 +0100
+++ gcc/testsuite/lib/c-compat.exp  2020-01-16 17:48:25.688227433 +0100
@@ -36,6 +36,7 @@ load_lib target-libpath.exp
 proc compat-use-alt-compiler { } {
 global GCC_UNDER_TEST ALT_CC_UNDER_TEST
 global compat_same_alt compat_alt_caret compat_alt_color compat_no_line_no
+global compat_alt_urls
 global TEST_ALWAYS_FLAGS
 
 # We don't need to do this if the alternate compiler is actually
@@ -48,6 +49,9 @@ proc compat-use-alt-compiler { } {
if { $compat_alt_color == 0 } then {
regsub -- "-fdiagnostics-color=never" $TEST_ALWAYS_FLAGS "" 
TEST_ALWAYS_FLAGS
}
+   if { $compat_alt_urls == 0 } then {
+   regsub -- "-fdiagnostics-urls=never" $TEST_ALWAYS_FLAGS "" 
TEST_ALWAYS_FLAGS
+   }
if { $compat_no_line_no == 0 } then {
regsub -- "-fno-diagnostics-show-line-numbers" $TEST_ALWAYS_FLAGS 
"" TEST_ALWAYS_FLAGS
}
@@ -80,11 +84,13 @@ proc compat_setup_dfp { } {
 global compat_have_dfp
 global compat_alt_caret
 global compat_alt_color
+global compat_alt_urls
 global compat_no_line_no
 global TEST_ALWAYS_FLAGS compat_save_TEST_ALWAYS_FLAGS
 
 set compat_alt_caret 0
 set compat_alt_color 0
+set compat_alt_urls 0
 set compat_no_line_no 0
 set compat_save_TEST_ALWAYS_FLAGS $TEST_ALWAYS_FLAGS
 
@@ -105,6 +111,10 @@ proc compat_setup_dfp { } {
int dummy; } "-fdiagnostics-color=never"] != 0 } {
set compat_alt_color 1
}
+   if { [check_no_compiler_messages_nocache compat_alt_has_urls object {
+   int dummy; } "-fdiagnostics-urls=never"] != 0 } {
+   set compat_alt_urls 1
+   }
if { [check_no_compiler_messages_nocache compat_alt_has_no_line_no 
object {
int dummy; } "-fno-diagnostics-show-line-numbers"] != 0 } {
set compat_no_line_no 1


Jakub

Re: [Patch, fortran][9/10 Regression] PR 93236 -fno-automatic and RECURSIVE

2020-01-16 Thread Tobias Burnus


On 1/16/20 3:47 PM, Mark Eggleston wrote:

Please find attached patch to fix this regression.
OK for master and 9 branch?


OK for both – thanks!

As note to others:

* Up to Fortran 2008, 'recursive' attribute is required if a proc is 
recursively used; since Fortran 2018 recursive is the default (and the 
non_recursive attribute exists as well). For this patch (and the near 
future), we mostly ignore this bit of F2018. See also PR91413.


* The PR is about a vendor extension – cf. test case pr93263_1.f90. The 
second test case should be standard conforming and ensures that neither 
this patch (nor a future patch) breaks it. [Namely, static memory is 
indeed used as requested.]


Thanks,

Tobias


Change logs:
gcc/fortran

    Mark Eggleston  

    PR fortran/93236
    * resolve.c (resolve_types): Declare boolean recursive and set 
with the

    value of the recursive attribute of namespace proc_name symbol
    structure if it exists.  Call gfc_save_all if both flag_automatic and
    recursive are false or ns->save_all is true.

gcc/testsuite

    Mark Eggleston  
    Tobias Burnus  

    * gfortran.dg/pr93263_1.f90: New test.
    * gfortran.dg/pr93263_2.f90: New test.

Re: Git: accessing vendors branches

2020-01-16 Thread Jakub Jelinek

On Thu, Jan 16, 2020 at 04:14:18PM +, Mark Eggleston wrote:
> Using git ls-remote, refs for vendor branches can be seen e.g.:
> 
> 1a7bff0abccf167830c6e0443f3591380b960cb0
> refs/vendors/redhat/heads/gcc-9-branch
> 
> Using git branch -r and there is no sign of that branch.
> 
> What needs to be done to access vendor branches?

contrib/gcc-git-customization.sh #if you haven't run it yet
contrib/git-fetch-vendor.sh redhat

Jakub

Git: accessing vendors branches

2020-01-16 Thread Mark Eggleston


Using git ls-remote, refs for vendor branches can be seen e.g.:

1a7bff0abccf167830c6e0443f3591380b960cb0 
refs/vendors/redhat/heads/gcc-9-branch


Using git branch -r and there is no sign of that branch.

What needs to be done to access vendor branches?

--
https://www.codethink.co.uk/privacy.html

Re: [PING][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)

2020-01-16 Thread Stam Markianos-Wright



On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:
> 
> 
> On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
>> Hi Stam,
>>
>> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
>>> Pinging with more correct maintainers this time :)
>>>
>>> Also would need to backport to gcc7,8,9, but need to get this approved
>>> first!
>>>
>>
>> Sorry for the delay.
> 
> Same here now! Sorry totally forget about this in the lead up to Xmas!
> 
> Done the changes marked below and also removed the unnecessary extra #defines 
> from the test.

Ping :)

Cheers,
Stam

> 
>>
>>
>>> Thank you,
>>> Stam
>>>
>>>
>>>  Forwarded Message 
>>> Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional
>>> branches in Thumb2 (PR91816)
>>> Date: Mon, 21 Oct 2019 10:37:09 +0100
>>> From: Stam Markianos-Wright 
>>> To: Ramana Radhakrishnan 
>>> CC: gcc-patches@gcc.gnu.org , nd ,
>>> James Greenhalgh , Richard Earnshaw
>>> 
>>>
>>>
>>>
>>> On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
>>> >>
>>> >> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
>>> >> however, on my native Aarch32 setup the test times out when run as part
>>> >> of a big "make check-gcc" regression, but not when run individually.
>>> >>
>>> >> 2019-10-11  Stamatis Markianos-Wright 
>>> >>
>>> >>   * config/arm/arm.md: Update b for Thumb2 range checks.
>>> >>   * config/arm/arm.c: New function arm_gen_far_branch.
>>> >>   * config/arm/arm-protos.h: New function arm_gen_far_branch
>>> >>   prototype.
>>> >>
>>> >> gcc/testsuite/ChangeLog:
>>> >>
>>> >> 2019-10-11  Stamatis Markianos-Wright 
>>> >>
>>> >>   * testsuite/gcc.target/arm/pr91816.c: New test.
>>> >
>>> >> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
>>> >> index f995974f9bb..1dce333d1c3 100644
>>> >> --- a/gcc/config/arm/arm-protos.h
>>> >> +++ b/gcc/config/arm/arm-protos.h
>>> >> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const 
>>> cpu_arch_option *,
>>> >>
>>> >>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
>>> >>
>>> >> +const char * arm_gen_far_branch (rtx *, int,const char * , const char 
>>> >> *);
>>> >> +
>>> >> +
>>> >
>>> > Lets get the nits out of the way.
>>> >
>>> > Unnecessary extra new line, need a space between int and const above.
>>> >
>>> >
>>>
>>> .Fixed!
>>>
>>> >>   #endif /* ! GCC_ARM_PROTOS_H */
>>> >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
>>> >> index 39e1a1ef9a2..1a693d2ddca 100644
>>> >> --- a/gcc/config/arm/arm.c
>>> >> +++ b/gcc/config/arm/arm.c
>>> >> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
>>> >>   }
>>> >>   } /* Namespace selftest.  */
>>> >>
>>> >> +
>>> >> +/* Generate code to enable conditional branches in functions over 1 
>>> >> MiB.  */
>>> >> +const char *
>>> >> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
>>> >> +    const char * branch_format)
>>> >
>>> > Not sure if this is some munging from the attachment but check
>>> > vertical alignment of parameters.
>>> >
>>>
>>> .Fixed!
>>>
>>> >> +{
>>> >> +  rtx_code_label * tmp_label = gen_label_rtx ();
>>> >> +  char label_buf[256];
>>> >> +  char buffer[128];
>>> >> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
>>> >> +    CODE_LABEL_NUMBER (tmp_label));
>>> >> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
>>> >> +  rtx dest_label = operands[pos_label];
>>> >> +  operands[pos_label] = tmp_label;
>>> >> +
>>> >> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , label_ptr);
>>> >> +  output_asm_insn (buffer, operands);
>>> >> +
>>> >> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, 
>>> label_ptr);
>>> >> +  operands[pos_label] = dest_label;
>>> >> +  output_asm_insn (buffer, operands);
>>> >> +  return "";
>>> >> +}
>>> >> +
>>> >> +
>>> >
>>> > Unnecessary extra newline.
>>> >
>>>
>>> .Fixed!
>>>
>>> >>   #undef TARGET_RUN_TARGET_SELFTESTS
>>> >>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
>>> >>   #endif /* CHECKING_P */
>>> >> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
>>> >> index f861c72ccfc..634fd0a59da 100644
>>> >> --- a/gcc/config/arm/arm.md
>>> >> +++ b/gcc/config/arm/arm.md
>>> >> @@ -6686,9 +6686,16 @@
>>> >>   ;; And for backward branches we have
>>> >>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) + 
>>> >>4).
>>> >>   ;;
>>> >> +;; In 16-bit Thumb these ranges are:
>>> >>   ;; For a 'b'   pos_range = 2046, neg_range = -2048 giving 
>>> (-2040->2048).
>>> >>   ;; For a 'b' pos_range = 254, neg_range = -256  giving (-250 
>>> >>->256).
>>> >>
>>> >> +;; In 32-bit Thumb these ranges are:
>>> >> +;; For a 'b'   +/- 16MB is not checked for.
>>> >> +;; For a 'b' pos_range = 1048574, neg_range = -1048576  giving
>>> >> +;; (-1048568 -> 1048576).
>>> >> +
>>> >> +
>>> >
>>> > Unnecessary extra newline.
>>> >
>>>
>>> .Fixed!
>>>
>>> >>   (define_expand "cbranchsi4"
>>> >> [(set

[Pingx2][GCC][PATCH][ARM]Add ACLE intrinsics for dot product (vusdot - vector, vdot - by element) for AArch32 AdvSIMD ARMv8.6 Extension

2020-01-16 Thread Stam Markianos-Wright



On 1/10/20 6:48 PM, Stam Markianos-Wright wrote:
> 
> 
> On 12/18/19 1:25 PM, Stam Markianos-Wright wrote:
>>
>>
>> On 12/13/19 10:22 AM, Stam Markianos-Wright wrote:
>>> Hi all,
>>>
>>> This patch adds the ARMv8.6 Extension ACLE intrinsics for dot product
>>> operations (vector/by element) to the ARM back-end.
>>>
>>> These are:
>>> usdot (vector), dot (by element).
>>>
>>> The functions are optional from ARMv8.2-a as -march=armv8.2-a+i8mm and
>>> for ARM they remain optional as of ARMv8.6-a.
>>>
>>> The functions are declared in arm_neon.h, RTL patterns are defined to
>>> generate assembler and tests are added to verify and perform adequate 
>>> checks.
>>>
>>> Regression testing on arm-none-eabi passed successfully.
>>>
>>> This patch depends on:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02195.html
>>>
>>> for ARM CLI updates, and on:
>>>
>>> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00857.html
>>>
>>> for testsuite effective_target update.
>>>
>>> Ok for trunk?
>>
>> .Ping :)
>>
> Ping :)
> 
> New diff addressing review comments from Aarch64 version of the patch.
> 
> _Change of order of operands in RTL patterns.
> _Change tests to use check-function-bodies, compile with optimisation and 
> check 
> for exact registers.
> _Rename tests to remove "-compile-" in filename.
> 

Ping!

Cheers,
Stam

>>>
>>> Cheers,
>>> Stam
>>>
>>>
>>> ACLE documents are at https://developer.arm.com/docs/101028/latest
>>> ISA documents are at https://developer.arm.com/docs/ddi0596/latest
>>>
>>> PS. I don't have commit rights, so if someone could commit on my behalf,
>>> that would be great :)
>>>
>>>
>>> gcc/ChangeLog:
>>>
>>> 2019-11-28  Stam Markianos-Wright  
>>>
>>>  * config/arm/arm-builtins.c (enum arm_type_qualifiers):
>>>  (USTERNOP_QUALIFIERS): New define.
>>>  (USMAC_LANE_QUADTUP_QUALIFIERS): New define.
>>>  (SUMAC_LANE_QUADTUP_QUALIFIERS): New define.
>>>  (arm_expand_builtin_args):
>>>  Add case ARG_BUILTIN_LANE_QUADTUP_INDEX.
>>>  (arm_expand_builtin_1): Add qualifier_lane_quadtup_index.
>>>  * config/arm/arm_neon.h (vusdot_s32): New.
>>>  (vusdot_lane_s32): New.
>>>  (vusdotq_lane_s32): New.
>>>  (vsudot_lane_s32): New.
>>>  (vsudotq_lane_s32): New.
>>>  * config/arm/arm_neon_builtins.def
>>>  (usdot,usdot_lane,sudot_lane): New.
>>>  * config/arm/iterators.md (DOTPROD_I8MM): New.
>>>  (sup, opsuffix): Add .
>>>     * config/arm/neon.md (neon_usdot, dot_lane: New.
>>>  * config/arm/unspecs.md (UNSPEC_DOT_US, UNSPEC_DOT_SU): New.
>>>
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2019-12-12  Stam Markianos-Wright  
>>>
>>>  * gcc.target/arm/simd/vdot-compile-2-1.c: New test.
>>>  * gcc.target/arm/simd/vdot-compile-2-2.c: New test.
>>>  * gcc.target/arm/simd/vdot-compile-2-3.c: New test.
>>>  * gcc.target/arm/simd/vdot-compile-2-4.c: New test.
>>>
>>>
>

Re: [GCC][PATCH][AArch64]Add ACLE intrinsics for dot product (usdot - vector, dot - by element) for AArch64 AdvSIMD ARMv8.6 Extension

2020-01-16 Thread Stam Markianos-Wright



On 1/9/20 3:48 PM, Richard Sandiford wrote:
> OK, thanks.
> 

Committed as r10-6004-g8c197c851e7528baba7cb837f34c05ba2242f705

Thank you!
Stam
> Richard
> 
> Stam Markianos-Wright  writes:
>> On 12/30/19 10:21 AM, Richard Sandiford wrote:
>>> Stam Markianos-Wright  writes:
 On 12/20/19 2:13 PM, Richard Sandiford wrote:
> Stam Markianos-Wright  writes:
>> +**...
>> +**ret
>> +*/
>> +int32x2_t ufoo (int32x2_t r, uint8x8_t x, int8x8_t y)
>> +{
>> +  return vusdot_s32 (r, x, y);
>> +}
>> +
>
> If we're using check-function-bodies anyway, it might be slightly more
> robust to compile at -O and check for the exact RA.  E.g.:
>
> /*
> **ufoo:
> **usdotv0\.2s, (v1\.8b, v2\.8b|v2\.8b, v1\.8b)
> **ret
> */
>
> Just a suggestion though -- either way is fine.

 done this too and as per our internal discussion also added one
 xx_untied tests for usdot and one for usdot_lane

 That's one xx_untied test for each of the RTL pattern types added in
 aarch64-simd.md. Lmk if this is ok!

 Also I found that the way we were using check-function-bodies wasn't
 actually checking the assembler correctly, so I've changed that to:
 +/* { dg-final { check-function-bodies "**" "" "" } } */
 +/* { dg-skip-if "" { *-*-* } { "-fno-fat-lto-objects" } } */
 which seems to perform more checks
>>>
>>> Ah, OK, hadn't realised that we were cycling through optimisation
>>> options already.  In that case, it might be better to leave out the
>>> -O from the dg-options and instead use:
>>>
>>> /* { dg-skip-if "" { *-*-* } { { "-fno-fat-lto-objects" } { "-O0" } } } */
>>>
>>> (untested).
>>>
>>> It's unfortunate that we're skipping this for -O0 though.  Ideally we'd
>>> still compile the code and just skip the dg-final.  Does it work if you do:
>>>
>>> /* { dg-final { check-function-bodies "**" "" {-O[^0]} } } */
>>> /* { dg-skip-if "" { *-*-* } { { "-fno-fat-lto-objects" } } } */
>>>
>>> ?  Make sure that we actually still run the check-function-bodies when
>>> optimisation is enabled. :-)
>>
>> This works!
>> Now we are only doing the following for O0:
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O0  (test 
>> for
>> excess errors)
>>
>> whereas for other optimisation levels do all the checks:
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1  (test 
>> for
>> excess errors)
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies ufoo
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies ufooq
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies ufoo_lane
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies ufoo_laneq
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies ufooq_lane
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies ufooq_laneq
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies sfoo_lane
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies sfoo_laneq
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies sfooq_lane
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies sfooq_laneq
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies ufoo_untied
>> PASS: gcc.target/aarch64/advsimd-intrinsics/vdot-compile-3-2.c   -O1
>> check-function-bodies ufooq_laneq_untied
>>
>>>
>>> Also, I'm an idiot.  The reason I'd used (...|...) in the regexps was
>>> that "dot product is commutative".  But of course that's not true for
>>> these mixed-sign ops, so the string must be:
>>>
>>> usdot  v0\.2s, v1\.8b, v2\.8b
>>>
>>> The patch copied the (...|...) regexps above to the lane tests, but those
>>> wouldn't be commutative even if the operands had the same type.
>>
>> Ahh, makes sense now. Done :)
>>
>> Cheers,
>> Stam
>>
>>>
>>> Thanks,
>>> Richard
>>>
>>
>>
>> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
>> b/gcc/config/aarch64/aarch64-builtins.c
>> index 
>> 1bd2640a1ced352de232fed1cf134b46c69b80f7..702b317d94d2fc6ebe59609727ad853f3f5cc652
>>  100644
>> --- a/gcc/config/aarch64/aarch64-builtins.c
>> +++ b/gcc/config/aarch64/aarch64-builtins.c
>> @@ -107,6 +107,9 @@ enum aarch64_type_qualifiers
>> /* Lane indices selected in pairs. - must be in range, and flipped for
>>bigendian.  */
>> qualifier_lane_pair_index = 0x800,
>> +  /* Lane indices selected in quadtuplets. - must be in range, and flipped 
>> for
>> + bigendian.  */
>> +  qualifier_lane_quadtup_index = 0x1000,
>>   };
>>   
>>   typedef struct
>> @@ -173,6 +176,10 @@ 
>>

Re: [GCC][PATCH][AArch64]Add ACLE intrinsics for bfdot for ARMv8.6 Extension

2020-01-16 Thread Stam Markianos-Wright



On 1/9/20 3:54 PM, Richard Sandiford wrote:
> Please update the names of the testsuite files to match the ones
> in the bfloat16_t patch.  (Same for the usdot/sudot patch -- sorry
> for forgetting there.)
> 
> OK with that change, thanks.
> 

Done and committed as r10-6006-gf275d73a57f1e5a07fbd4978f4b4457a5eaa1e39

Thank you!
Stam

> Richard
> 
> Stam Markianos-Wright  writes:
>> On 12/30/19 10:29 AM, Richard Sandiford wrote:
>>> Stam Markianos-Wright  writes:
 diff --git a/gcc/config/aarch64/aarch64-simd.md 
 b/gcc/config/aarch64/aarch64-simd.md
 index 
 adfda96f077075ad53d4bea2919c4d3b326e49f5..7587bc46ba1c80389ea49fa83a0e6f8a489711e9
  100644
 --- a/gcc/config/aarch64/aarch64-simd.md
 +++ b/gcc/config/aarch64/aarch64-simd.md
 @@ -7028,3 +7028,36 @@
  "xtn\t%0., %1."
  [(set_attr "type" "neon_shift_imm_narrow_q")]
)
 +
 +(define_insn "aarch64_bfdot"
 +  [(set (match_operand:VDQSF 0 "register_operand" "=w")
 +  (plus:VDQSF
 +(unspec:VDQSF
 + [(match_operand: 2 "register_operand" "w")
 +  (match_operand: 3 "register_operand" "w")]
 +  UNSPEC_BFDOT)
 +(match_operand:VDQSF 1 "register_operand" "0")))]
 +  "TARGET_BF16_SIMD"
 +  "bfdot\t%0., %2., %3."
 +  [(set_attr "type" "neon_dot")]
 +)
 +
 +
 +(define_insn "aarch64_bfdot_lane"
>>>
>>> Too many blank lines.
>>
>> Fixed, sorry I hadn't noticed!
>>
>>>
 +  [(set (match_operand:VDQSF 0 "register_operand" "=w")
 +  (plus:VDQSF
 +(unspec:VDQSF
 + [(match_operand: 2 "register_operand" "w")
 +  (match_operand:VBF 3 "register_operand" "w")
 +  (match_operand:SI 4 "const_int_operand" "n")]
 +  UNSPEC_BFDOT)
 +(match_operand:VDQSF 1 "register_operand" "0")))]
 +  "TARGET_BF16_SIMD"
 +{
 +  int nunits = GET_MODE_NUNITS (mode).to_constant ();
 +  int lane = INTVAL (operands[4]);
 +  operands[4] = gen_int_mode (ENDIAN_LANE_N (nunits / 2, lane), SImode);
 +  return "bfdot\t%0., %2., %3.2h[%4]";
 +}
 +  [(set_attr "type" "neon_dot")]
 +)
 [...]
 diff --git 
 a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-compile-1.c 
 b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-compile-1.c
 new file mode 100644
 index 
 ..c575dcd3901172a52fa9403c9179d58eea44eb72
 --- /dev/null
 +++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/bfdot-compile-1.c
 @@ -0,0 +1,91 @@
 +/* { dg-do assemble { target { aarch64*-*-* } } } */
 +/* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
 +/* { dg-add-options arm_v8_2a_bf16_neon }  */
 +/* { dg-additional-options "-O -save-temps" } */
 +/* { dg-final { check-function-bodies "**" "" } } */
 +/* { dg-skip-if "" { *-*-* } { "-fno-fat-lto-objects" } } */
>>>
>>> Same comment as for USDOT/SUDOT regarding the dg- markup.
>>
>> Done!
>>>
 +
 +#include 
 +
 +/*
 +**ufoo:
 +**bfdot   v0.2s, (v1.4h, v2.4h|v2.4h, v1.4h)
 +**ret
 +*/
 +float32x2_t ufoo(float32x2_t r, bfloat16x4_t x, bfloat16x4_t y)
 +{
 +  return vbfdot_f32 (r, x, y);
 +}
 +
 +/*
 +**ufooq:
 +**bfdot   v0.4s, (v1.8h, v2.8h|v2.8h, v1.8h)
 +**ret
 +*/
 +float32x4_t ufooq(float32x4_t r, bfloat16x8_t x, bfloat16x8_t y)
 +{
 +  return vbfdotq_f32 (r, x, y);
 +}
>>>
>>> The (...|...)s here are correct.
>> Yep.
>>>
 +
 +/*
 +**ufoo_lane:
 +**bfdot   v0.2s, (v1.4h, v2.2h\[0\]|v2.4h, v1.2h\[0\])
 +**ret
 +*/
 +float32x2_t ufoo_lane(float32x2_t r, bfloat16x4_t x, bfloat16x4_t y)
 +{
 +  return vbfdot_lane_f32 (r, x, y, 0);
 +}
 +
 +/*
 +**ufooq_laneq:
 +**bfdot   v0.4s, (v1.8h, v2.2h\[2\]|v2.8h, v1.2h\[2\])
 +**ret
 +*/
 +float32x4_t ufooq_laneq(float32x4_t r, bfloat16x8_t x, bfloat16x8_t y)
 +{
 +  return vbfdotq_laneq_f32 (r, x, y, 2);
 +}
 +
 +/*
 +**ufoo_laneq:
 +**bfdot   v0.2s, (v1.4h, v2.2h\[3\]|v2.4h, v1.2h\[3\])
 +**ret
 +*/
 +float32x2_t ufoo_laneq(float32x2_t r, bfloat16x4_t x, bfloat16x8_t y)
 +{
 +  return vbfdot_laneq_f32 (r, x, y, 3);
 +}
 +
 +/*
 +**ufooq_lane:
 +**bfdot   v0.4s, (v1.8h, v2.2h\[1\]|v2.8h, v1.2h\[1\])
 +**ret
 +*/
 +float32x4_t ufooq_lane(float32x4_t r, bfloat16x8_t x, bfloat16x4_t y)
 +{
 +  return vbfdotq_lane_f32 (r, x, y, 1);
 +}
>>>
>>> But these aren't, since the operands must be in the order given.
>> Yep.
>>>
 +
 +/*
 +**ufoo_untied:
 +**mov v0.8b, v1.8b
 +**bfdot   v0.2s, (v2.4h, v3.4h|v3.4h, v2.4h)
 +**ret
 +*/
 +float32x2_t ufoo_untied(float32x4_t unused, float32x2_t r, bfloat16x4_t 
 x,

Re: [PATCH] Add OpenACC 2.6 `acc_get_property' support

2020-01-16 Thread Thomas Schwinge

Hi Frederik!

On 2019-12-20T17:46:57+0100, "Harwath, Frederik"  
wrote:
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-2.c

I suggest to rename this one to 'acc_get_property-nvptx.c'.

> @@ -0,0 +1,68 @@
> +/* Test the `acc_get_property' and '`acc_get_property_string' library
> +   functions on Nvidia devices by comparing property values with
> +   those obtained through the CUDA API. */
> +/* { dg-additional-sources acc_get_property-aux.c } */
> +/* { dg-additional-options "-lcuda -lcudart" } */
> +/* { dg-do run { target openacc_nvidia_accel_selected } } */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +void expect_device_properties
> +(acc_device_t dev_type, int dev_num,
> + int expected_total_mem, int expected_free_mem,
> + const char* expected_vendor, const char* expected_name,
> + const char* expected_driver);
> +
> +int main ()
> +{
> +  int dev_count;
> +  cudaGetDeviceCount (_count);
> +
> +  for (int dev_num = 0; dev_num < dev_count; ++dev_num)
> +{
> +  if (cudaSetDevice (dev_num) != cudaSuccess)
> + {
> +   fprintf (stderr, "cudaSetDevice failed.\n");
> +   abort ();
> + }
> +
> +  printf("Checking device %d\n", dev_num);
> +
> +  const char *vendor = "Nvidia";
> +  size_t free_mem;
> +  size_t total_mem;
> +  if (cudaMemGetInfo(_mem, _mem) != cudaSuccess)
> + {
> +   fprintf (stderr, "cudaMemGetInfo failed.\n");
> +   abort ();
> + }
> +
> +  struct cudaDeviceProp p;
> +  if (cudaGetDeviceProperties(, dev_num) != cudaSuccess)
> + {
> +   fprintf (stderr, "cudaGetDeviceProperties failed.\n");
> +   abort ();
> + }
> +
> +  int driver_version;
> +  if (cudaDriverGetVersion(_version) != cudaSuccess)
> + {
> +   fprintf (stderr, "cudaDriverGetVersion failed.\n");
> +   abort ();
> + }
> +  /* The version string should contain the version of the CUDA Toolkit
> +  in the same MAJOR.MINOR format that is used by Nvidia.
> +  The format string below is the same that is used by the deviceQuery
> +  program, which belongs to Nvidia's CUDA samples, to print the version. 
> */
> +  char driver[30];
> +  snprintf (driver, sizeof driver, "CUDA Driver %u.%u",
> + driver_version / 1000, driver_version % 1000 / 10);
> +
> +  expect_device_properties(acc_device_nvidia, dev_num,

This assumes that the 'cuda*' interfaces and OpenACC/libgomp interfaces
handle/order device numbers in the same way -- which it seems they do,
but just noting this in case this becomes an issue at some point.

> +total_mem, free_mem, vendor, p.name, driver);
> +}
> +}

So I just witnessed a FAIL here, because:

Expected acc_property_free_memory to equal -929226752, but was -928956416.

Aside from improper data types being used for storing/printing the memory
information, we have to expect 'acc_property_free_memory' to change
between two invocations.  ;-)

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-3.c

I suggest to rename this one to 'acc_get_property-host.c'.

> @@ -0,0 +1,19 @@
> +/* Test the `acc_get_property' and '`acc_get_property_string' library
> +   functions for the host device. */
> +/* { dg-additional-sources acc_get_property-aux.c } */
> +/* { dg-do run } */
> +
> +#include 
> +#include 
> +
> +void expect_device_properties
> +(acc_device_t dev_type, int dev_num,
> + int expected_total_mem, int expected_free_mem,
> + const char* expected_vendor, const char* expected_name,
> + const char* expected_driver);
> +
> +int main()
> +{
> +  printf ("Checking acc_device_host device properties\n");
> +  expect_device_properties (acc_device_host, 0, 0, 0, "GNU", "GOMP", "1.0");
> +}

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/acc_get_property-aux.c
> @@ -0,0 +1,80 @@
> +/* Auxiliary functions for acc_get_property tests */
> +/* { dg-do compile  { target skip-all-targets } } */
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +void expect_device_properties
> +(acc_device_t dev_type, int dev_num,
> + int expected_total_mem, int expected_free_mem,
> + const char* expected_vendor, const char* expected_name,
> + const char* expected_driver)
> +{
> +  const char *vendor = acc_get_property_string (dev_num, dev_type,
> + acc_property_vendor);
> +  if (strcmp (vendor, expected_vendor))
> +{
> +  fprintf (stderr, "Expected acc_property_vendor to equal \"%s\", "
> +"but was \"%s\".\n", expected_vendor, vendor);
> +  abort ();
> +}
> +
> +  int total_mem = acc_get_property (dev_num, dev_type,
> + acc_property_memory);
> +  if (total_mem != expected_total_mem)
> +{
> +  fprintf (stderr, "Expected acc_property_memory to equal %d, "
> +"but was %d.\n", expected_total_mem, total_mem);
> +  abort ();
>

Re: [PATCH] libstdc++/91223 Improve unordered containers == operator

2020-01-16 Thread Jonathan Wakely


On 16/01/20 13:25 +, Jonathan Wakely wrote:

On 16/01/20 07:42 +0100, François Dumont wrote:

On 1/15/20 10:52 PM, Jonathan Wakely wrote:

On 15/01/20 21:48 +, Jonathan Wakely wrote:

On 14/01/20 22:25 +0100, François Dumont wrote:

On 1/13/20 10:53 PM, Jonathan Wakely wrote:

On 13/01/20 22:41 +0100, François Dumont wrote:


For the multi-keys we could still avoid redundant 
comparisons when _Equal is just doing == on the key type. 
On unordered_multiset we could just avoids the call to 
is_permuation and on the unordered_multimap we could check 
the is_permutation only on the associated value rather 
than on the std::pair.



I don't think that's necessary, or helpful.

The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
that you shouldn't be using _Equal at all, and therefore it doesn't
matter whether it's std::equal_to or not.



And it was indeed possible.


Nice!


    PR libstdc++/91223
    * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> friend.
    * include/bits/hashtable_policy.h: Include .
    (_Equality_base): Remove.
    (_Equality<>::_M_equal): Review implementation. Use 
std::is_permutation.

    * testsuite/23_containers/unordered_multiset/operators/1.cc
    (Hash, Equal, test02, test03): New.
    * testsuite/23_containers/unordered_set/operators/1.cc
    (Hash, Equal, test02, test03): New.

Tested under Linux x86_64.

Ok to commit ?


Yes, OK for trunk (we're in stage4 but your patch was posted in stage3
and fixes a pretty nasty performance bug, so is OK now).

N.B. GCC has moved to Git instead of Subversion. If you don't have Git
access set up let me know and I can commit the patch for you.


I haven't done the move yet and won't be able to do it before the 
week-end. So please proceed to the commit for me, thanks.


No problem, I can do that.


Your patch is now committed to trunk. Thanks for the major
improvement.

I had a look at std::is_permutation and I think we can make some
simplifications to the 4-argument overload, and we can share most of
the code between the 3-arg and 4-arg overloads (once they've confirmed
the lengths are the same they do exactly the same thing). See the
attached patch. This should probably wait for stage1 though.

I also wanted to add some comments to the _Equality::_M_equal
specialiation for unordered_multimap/multiset to explain what the code
was doing, and had some more ideas. See patch again.

It looks like this loop can potentially visit every element of
__other, instead of stopping at the end of the bucket:

  typename __hashtable::const_iterator __ity(__y_n);
  for (auto __ity_end = __ity; __ity_end != __other.end(); ++__ity_end)
if (--__x_count == 0)
  break;

Consider a case like this:

unordered_multiset a{1, 2, 3, 4};
for (int i = 0; i <1; ++i)
  a.insert(1);
unordered_multiset b{1, 2, 3, 4};
for (int i = 0; i <1; ++i)
  b.insert(2);

When doing a == b we'll find 1 elements in a with key '1',
and find one element in b with that key. But then we iterate through
every element in b after that one, even though they have different
keys and are probably in different buckets.

Instead of just iterating from __ity to __other.end(), can we use a
local iterator so we stop at the end of the bucket?

This seems to make the PR91263 example *very* slightly slower, but
makes the example above significantly faster.

What do you think?


diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index 6503d1518d3..6d5d3b2fced 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -3588,6 +3588,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   return std::make_pair(*__p.first, *__p.second);
 }
 
+  // precondition: distance(__first1, __last1) == distance(__first2, __last2).
+  template
+_GLIBCXX20_CONSTEXPR
+bool
+__is_permutation_impl(_FwdIter1 __first1, _FwdIter1 __last1,
+			  _FwdIter2 __first2, _FwdIter2 __last2,
+			  _BinaryPredicate __pred = {})
+{
+  for (auto __scan = __first1; __scan != __last1; ++__scan)
+	{
+	  if (__scan != std::__find_if(__first1, __scan,
+			  __gnu_cxx::__ops::__iter_comp_iter(__pred, __scan)))
+	continue; // We've seen this one before.
+
+	  auto __matches
+	= std::__count_if(__first2, __last2,
+			__gnu_cxx::__ops::__iter_comp_iter(__pred, __scan));
+	  if (0 == __matches ||
+	  std::__count_if(__scan, __last1,
+			__gnu_cxx::__ops::__iter_comp_iter(__pred, __scan))
+	  != __matches)
+	return false;
+	}
+  return true;
+}
+
   template
 _GLIBCXX20_CONSTEXPR
@@ -3604,26 +3631,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   if (__first1 == __last1)
 	return true;
 
-  // Establish __last2 assuming equal ranges by iterating over the
-  // rest of the list.
-  _ForwardIterator2 __last2 = __first2;
-  std::advance(__last2, std::distance(__first1, __last1));
-  for (_ForwardIterator1 __scan = __first1; __scan != __last1; ++__scan)
-	{
-

Re: [GCC][PATCH][ARM] Add Bfloat16_t scalar type, vector types and machine modes to ARM back-end [2/2]

2020-01-16 Thread Stam Markianos-Wright



On 1/13/20 10:43 AM, Kyrill Tkachov wrote:
> Hi Stam,
> 
> On 1/10/20 6:47 PM, Stam Markianos-Wright wrote:
>> Hi all,
>>
>> This patch is part 2 of Bfloat16_t enablement in the ARM back-end.
>>
>> This new type is constrained using target hooks TARGET_INVALID_CONVERSION,
>> TARGET_INVALID_UNARY_OP, TARGET_INVALID_BINARY_OP so that it may only be used
>> through ACLE intrinsics (will be provided in later patches).
>>
>> Regression testing on arm-none-eabi passed successfully.
>>
>> Ok for trunk?
> 
> 
> Ok.
> 
> Thanks,
> 
> Kyrill

Committed as r10-6021-g3ea9140170b8a511822b1a873dea1227093f3ccf

Thank you!
Stam
> 
> 
>>
>> Cheers,
>> Stam
>>
>>
>> ACLE documents are at https://developer.arm.com/docs/101028/latest
>> ISA documents are at https://developer.arm.com/docs/ddi0596/latest
>>
>> Details on ARM Bfloat can be found here:
>> https://community.arm.com/developer/ip-products/processors/b/ml-ip-blog/posts/bfloat16-processing-for-neural-networks-on-armv8_2d00_a
>>  
>>
>>
>>
>>
>> gcc/ChangeLog:
>>
>> 2020-01-10  Stam Markianos-Wright 
>>
>>     * config/arm/arm.c
>>     (arm_invalid_conversion): New function for target hook.
>>     (arm_invalid_unary_op): New function for target hook.
>>     (arm_invalid_binary_op): New function for target hook.
>>
>> 2020-01-10  Stam Markianos-Wright 
>>
>>     * gcc.target/arm/bfloat16_scalar_typecheck.c: New test.
>>     * gcc.target/arm/bfloat16_vector_typecheck_1.c: New test.
>>     * gcc.target/arm/bfloat16_vector_typecheck_2.c: New test.
>>
>>

Re: [GCC][PATCH][ARM] Add Bfloat16_t scalar type, vector types and machine modes to ARM back-end [1/2]

2020-01-16 Thread Stam Markianos-Wright



On 1/13/20 10:05 AM, Kyrill Tkachov wrote:
> Hi Stam,
> 
> On 1/10/20 6:45 PM, Stam Markianos-Wright wrote:
>> Hi all,
>>
>> This is a respin of patch:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01448.html
>>
>> which has now been split into two (similar to the Aarch64 version).
>>
>> This is patch 1 of 2 and adds Bfloat type support to the ARM back-end.
>> It also adds a new machine_mode (BFmode) for this type and accompanying 
>> Vector
>> modes V4BFmode and V8BFmode.
>>
>> The second patch in this series uses existing target hooks to restrict type 
>> use.
>>
>> Regression testing on arm-none-eabi passed successfully.
>>
>> This patch depends on:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00857.html
>>
>> for test suite effective_target update.
>>
>> Ok for trunk?
> 
> This is ok, thanks.
> 
> You can commit it once the git conversion goes through :)

Committed as r10-6020-g2e87b2f4121fe1d39edb76f4e492dfe327be6a1b

Thank you!
Stam
> 
> Kyrill
> 
> 
>>
>> Cheers,
>> Stam
>>
>>
>> ACLE documents are at https://developer.arm.com/docs/101028/latest
>> ISA documents are at https://developer.arm.com/docs/ddi0596/latest
>>
>> Details on ARM Bfloat can be found here:
>> https://community.arm.com/developer/ip-products/processors/b/ml-ip-blog/posts/bfloat16-processing-for-neural-networks-on-armv8_2d00_a
>>  
>>
>>
>>
>>
>> gcc/ChangeLog:
>>
>> 2020-01-10  Stam Markianos-Wright 
>>
>>     * config.gcc: Add arm_bf16.h.
>>     * config/arm/arm-builtins.c (arm_mangle_builtin_type):  Fix comment.
>>     (arm_simd_builtin_std_type): Add BFmode.
>>     (arm_init_simd_builtin_types): Define element types for vector types.
>>     (arm_init_bf16_types):  New function.
>>     (arm_init_builtins): Add arm_init_bf16_types function call.
>>     * config/arm/arm-modes.def: Add BFmode and V4BF, V8BF vector modes.
>>     * config/arm/arm-simd-builtin-types.def: Add V4BF, V8BF.
>>     * config/arm/arm.c (aapcs_vfp_sub_candidate):  Add BFmode.
>>     (arm_hard_regno_mode_ok): Add BFmode and tidy up statements.
>>     (arm_vector_mode_supported_p): Add V4BF, V8BF.
>>     (arm_mangle_type):
>>     * config/arm/arm.h: Add V4BF, V8BF to VALID_NEON_DREG_MODE,
>>   VALID_NEON_QREG_MODE respectively. Add export arm_bf16_type_node,
>>   arm_bf16_ptr_type_node.
>>     * config/arm/arm.md: New enabled_for_bfmode_scalar,
>>   enabled_for_bfmode_vector attributes. Add BFmode to movhf expand.
>>   pattern and define_split between ARM registers.
>>     * config/arm/arm_bf16.h: New file.
>>     * config/arm/arm_neon.h: Add arm_bf16.h and Bfloat vector types.
>>     * config/arm/iterators.md (ANY64_BF, VDXMOV, VHFBF, HFBF, fporbf): 
>> New.
>>   (VQXMOV): Add V8BF.
>>     * config/arm/neon.md: Add BF vector types to NEON move patterns.
>>     * config/arm/vfp.md: Add BFmode to movhf patterns.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2020-01-10  Stam Markianos-Wright 
>>
>>     * g++.dg/abi/mangle-neon.C: Add Bfloat vector types.
>>     * g++.dg/ext/arm-bf16/bf16-mangle-1.C: New test.
>>     * gcc.target/arm/bfloat16_scalar_1_1.c: New test.
>>     * gcc.target/arm/bfloat16_scalar_1_2.c: New test.
>>     * gcc.target/arm/bfloat16_scalar_2_1.c: New test.
>>     * gcc.target/arm/bfloat16_scalar_2_2.c: New test.
>>     * gcc.target/arm/bfloat16_scalar_3_1.c: New test.
>>     * gcc.target/arm/bfloat16_scalar_3_2.c: New test.
>>     * gcc.target/arm/bfloat16_scalar_4.c: New test.
>>     * gcc.target/arm/bfloat16_simd_1_1.c: New test.
>>     * gcc.target/arm/bfloat16_simd_1_2.c: New test.
>>     * gcc.target/arm/bfloat16_simd_2_1.c: New test.
>>     * gcc.target/arm/bfloat16_simd_2_2.c: New test.
>>     * gcc.target/arm/bfloat16_simd_3_1.c: New test.
>>     * gcc.target/arm/bfloat16_simd_3_2.c: New test.
>>
>>
>>

Re: [PATCH][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-16 Thread Richard Sandiford

Szabolcs Nagy  writes:
> this affects the linux kernel and technically a wrong code bug
> so this fix tries to be backportable (fixing all issues with
> -fpatchable-function-entry=N,M will likely require new option).

Even for the backportable version, I think it would be better
not to duplicate so much varasm stuff.

Perhaps instead we could:

(a) set a cfun->machine flag in aarch64_declare_function_name
to say that we've assembled the label

(b) override print_patchable_function_entry so that it prints a BTI if
this flag is set and the usual other conditions are true

As discussed off-list, it'd be good to avoid a second BTI after the
nops if possible.  print_patchable_function_entry should be able
to find the first instruction using get_insns and next_real_insn,
then remove it if it turns out to be a BTI.

Thanks,
Richard

>
> gcc/ChangeLog:
>
> 2020-01-16  Szabolcs Nagy  
>
>   PR target/92424
>   * config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
>   if the function label is followed by a patch area.
>
> From ac2a46bab60ecc80a453328b4749a951908c02c5 Mon Sep 17 00:00:00 2001
> From: Szabolcs Nagy 
> Date: Wed, 15 Jan 2020 12:23:40 +
> Subject: [PATCH] [AArch64] PR92424: Fix -fpatchable-function-entry=N,M with
>  BTI
>
> This is a minimal workaround that emits another BTI after the function
> label if that is followed by a patch area.
>
> So before this commit -fpatchable-function-entry=3,1 with bti generates
>
> .section __patchable_function_entries
> .8byte .LPFE
> .text
>   .LPFE:
> nop
>   foo:
> nop
> nop
> bti c // or paciasp
> ...
>
> and after this commit
>
> .section __patchable_function_entries
> .8byte .LPFE
> .text
>   .LPFE:
> nop
>   foo:
> bti c
> nop
> nop
> bti c // or paciasp
> ...
>
> There is a new bti insn in the middle of the patchable area unless M=0
> (patch area is after the new bti) or M=N (patch area is before the
> label, no new bti).
>
> Note that .cfi_startproc and the asynchronous unwind table entry label
> comes after the patch area, but whatever effect that has on the newly
> inserted bti c, it already affected the insns in the patch area.
>
> Tested on aarch64-none-linux-gnu.
>
> gcc/ChangeLog:
>
>   PR target/92424
>   * config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
>   if the function label is followed by a patch area.
> ---
>  gcc/config/aarch64/aarch64.c | 33 +
>  1 file changed, 33 insertions(+)
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 66e20becaf2..0394c274330 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -18079,6 +18079,39 @@ aarch64_declare_function_name (FILE *stream, const 
> char* name,
>/* Don't forget the type directive for ELF.  */
>ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
>ASM_OUTPUT_LABEL (stream, name);
> +
> +  if (!aarch64_bti_enabled ()
> +  || cgraph_node::get (fndecl)->only_called_directly_p ())
> +return;
> +
> +  /* Copy logic from varasm.c assemble_start_function
> + to handle -fpatchable-function-entry=N,M with BTI.  */
> +  unsigned HOST_WIDE_INT patch_area_size = function_entry_patch_area_size;
> +  unsigned HOST_WIDE_INT patch_area_entry = function_entry_patch_area_start;
> +
> +  tree patchable_function_entry_attr
> += lookup_attribute ("patchable_function_entry", DECL_ATTRIBUTES 
> (fndecl));
> +  if (patchable_function_entry_attr)
> +{
> +  tree pp_val = TREE_VALUE (patchable_function_entry_attr);
> +  tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
> +
> +  patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
> +  patch_area_entry = 0;
> +  if (TREE_CHAIN (pp_val) != NULL_TREE)
> + {
> +   tree patchable_function_entry_value2
> + = TREE_VALUE (TREE_CHAIN (pp_val));
> +   patch_area_entry = tree_to_uhwi (patchable_function_entry_value2);
> + }
> +}
> +
> +  if (patch_area_entry > patch_area_size)
> +patch_area_entry = 0;
> +
> +  /* Emit a BTI c if a patch area comes after the function label.  */
> +  if (patch_area_size > patch_area_entry)
> +asm_fprintf (stream, "\thint\t34 // bti c\n");
>  }
>  
>  /* Implement ASM_OUTPUT_DEF_FROM_DECLS.  Output .variant_pcs for aliases.  */

Re: [patch] contrib: Don't add push rules for personal and vendor spaces.

2020-01-16 Thread Jakub Jelinek

On Wed, Jan 15, 2020 at 10:48:23PM +, Joseph Myers wrote:
> A reasonable replacement for those push rules might be command aliases 
> (e.g. "git upush " to push the user branch ).

Would we have then separate aliases for pushing to one vendor branch and
another vendor branch?
Couldn't one alias determine from the branch remote and merge and
perhaps the fetch rules figure out the right git push arguments
given the setup (e.g. treat $(git config --get gcc-config.userpfx)
and $(git config --get gcc-config.user) as push to user branch,
devel/*, master, releases/* as normal git push and the rest as vendors
(or let git-fetch-vendor.sh record list of all used vendors in some gcc-config
setting)?

Jakub

Re: [patch] contrib: Don't add push rules for personal and vendor spaces.

2020-01-16 Thread Jason Merrill

On Wed, Jan 15, 2020 at 5:56 PM Joseph Myers 
wrote:

> A reasonable replacement for those push rules might be command aliases
> (e.g. "git upush " to push the user branch ).
>

And/or git mpush for push origin HEAD:master

Jason

Re: contrib: Check and if needed set user.name and user.email in gcc-git-customization.sh

2020-01-16 Thread Andreas Schwab

On Jan 16 2020, Richard Earnshaw (lists) wrote:

> diff --git a/contrib/gcc-git-customization.sh 
> b/contrib/gcc-git-customization.sh
> index dae2c35bb57..1cde6fd8224 100755
> --- a/contrib/gcc-git-customization.sh
> +++ b/contrib/gcc-git-customization.sh
> @@ -11,9 +11,9 @@ ask () {
>  read answer
>  if [ "x$answer" = "x" ]
>  then
> - eval $var=$default
> + eval $var=\"$default\"
>  else
> - eval $var=$answer
> + eval $var=\"$answer\"

This still isn't the safe way to do indirect assignment.  The expansion
of the rhs needs to be delayed, so that the result isn't subject to
further expansions.

Andreas.

Subject: [PATCH] gcc-git-customization.sh: avoid double expansion

---
 contrib/gcc-git-customization.sh | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index b7e4ce308062..26f4389bcc8a 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -11,9 +11,9 @@ ask () {
 read answer
 if [ "x$answer" = "x" ]
 then
-   eval $var=\"$default\"
+   eval $var=\$default
 else
-   eval $var=\"$answer\"
+   eval $var=\$answer
 fi
 }
 
-- 
2.25.0


-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

[Patch, fortran][9/10 Regression] PR 93236 -fno-automatic and RECURSIVE

2020-01-16 Thread Mark Eggleston


Please find attached patch to fix this regression.

OK for master and 9 branch?

Change logs:

gcc/fortran

    Mark Eggleston  

    PR fortran/93236
    * resolve.c (resolve_types): Declare boolean recursive and set with the
    value of the recursive attribute of namespace proc_name symbol
    structure if it exists.  Call gfc_save_all if both flag_automatic and
    recursive are false or ns->save_all is true.

gcc/testsuite

    Mark Eggleston  
    Tobias Burnus  

    * gfortran.dg/pr93263_1.f90: New test.
    * gfortran.dg/pr93263_2.f90: New test.

--
https://www.codethink.co.uk/privacy.html

>From aefc0be25bc7ea1a216ac1950a856d5b654d5493 Mon Sep 17 00:00:00 2001
From: Mark Eggleston 
Date: Thu, 16 Jan 2020 14:08:11 +
Subject: [PATCH] Fortran: PR93263 -fno-automatic and RECURSIVE

The use of -fno-automatic should not affect the save attribute of a
recursive procedure. The first test case checks unsaved variables
and the second checks saved variables.
---
 gcc/fortran/resolve.c   |  3 ++-
 gcc/testsuite/gfortran.dg/pr93263_1.f90 | 29 +
 gcc/testsuite/gfortran.dg/pr93263_2.f90 | 24 
 3 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr93263_1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/pr93263_2.f90

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 6f2a4c4d65a..bddab39d023 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -17079,6 +17079,7 @@ resolve_types (gfc_namespace *ns)
   gfc_data *d;
   gfc_equiv *eq;
   gfc_namespace* old_ns = gfc_current_ns;
+  bool recursive = ns->proc_name && ns->proc_name->attr.recursive;
 
   if (ns->types_resolved)
 return;
@@ -17132,7 +17133,7 @@ resolve_types (gfc_namespace *ns)
 
   gfc_traverse_ns (ns, resolve_values);
 
-  if (ns->save_all || !flag_automatic)
+  if (ns->save_all || (!flag_automatic && !recursive))
 gfc_save_all (ns);
 
   iter_stack = NULL;
diff --git a/gcc/testsuite/gfortran.dg/pr93263_1.f90 b/gcc/testsuite/gfortran.dg/pr93263_1.f90
new file mode 100644
index 000..f96b3589411
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93263_1.f90
@@ -0,0 +1,29 @@
+! { dg-do compile }
+! { dg-options "-fno-automatic -fdump-tree-original" }
+!
+! Test contributed by Mark Eggleston  
+
+program main
+  implicit none
+  call check(2)
+end 
+
+recursive subroutine check(n)
+  implicit none
+  integer n, a
+  a = 10
+  print*,"n=",n
+  if (n==1) then
+a=a-1
+print*,"assigning a=",a
+  else
+a=a-2
+print*,"assigning a=",a
+call check(n-1)
+  endif
+  print*,"a=",a
+end 
+
+! { dg-final { scan-tree-dump-not "static integer\\(kind=4\\) a" "original" } }
+! { dg-final { scan-tree-dump-not "integer\\(kind=4\\) a" "original" } }
+
diff --git a/gcc/testsuite/gfortran.dg/pr93263_2.f90 b/gcc/testsuite/gfortran.dg/pr93263_2.f90
new file mode 100644
index 000..fd353c6b548
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93263_2.f90
@@ -0,0 +1,24 @@
+! { dg-do run }
+!
+! Test contributed by Tobias Burnus  
+
+  integer :: cnt
+  cnt = 0
+  call sub()
+  if (cnt /= 5) stop 1
+contains
+  recursive subroutine sub()
+save
+logical :: first = .true.
+integer :: i
+cnt = cnt + 1
+if (first) then
+  first = .false.
+  i = 1
+end if
+print *, "Hello", i
+i = i + 1
+if (i <= 5) call sub()
+  end subroutine
+end
+
-- 
2.11.0

Re: Prevent all uses of DFP when unsupported (PR c/91985)

2020-01-16 Thread Jakub Jelinek

On Thu, Jan 16, 2020 at 02:42:29PM +, Szabolcs Nagy wrote:
> what is the expected way to fix this issue?
> 
> i see hppa-linux-gnu baseline was updated to remove
> the decimal rtti symbols, but other targets were not.
> 
> is it better to update the baseline or wait for a
> generic fix?

My preference is a generic fix and we have a regression PR about this.

Jakub

Re: Prevent all uses of DFP when unsupported (PR c/91985)

2020-01-16 Thread Szabolcs Nagy

On 27/11/2019 18:00, Jakub Jelinek wrote:
> On Wed, Nov 27, 2019 at 05:48:21PM +, Joseph Myers wrote:
 On 26/11/19 00:57 +, Joseph Myers wrote:
> On Mon, 25 Nov 2019, Rainer Orth wrote:
>
>> and a few more, all DFP related.  They used to be emitted by g++ for
>> __fundamental_type_info in libsupc++/fundamental_type_info.cc and lived
>> in the CXXABI_1.3.4 version.  However, since Solaris *does* lack DFP
>> support, that's no longer the case.  I'm uncertain how best to deal with
>> this, however.
>
> As I understand it, _GLIBCXX_USE_DECIMAL_FLOAT should already have been
> undefined for this target, and so std::decimal::decimal32 etc. should not
> have been usable (both the header not working without that define, and the
> mode attributes in the header being rejected by the front end when DFP is
> unsupported).  I.e. such defines in libsupc++ would never have been usable
> on this target, so I think they are something it should be safe to remove
 >from the ABI baseline.

 If it's actually impossible that any real program could have depended
 on those symbols, then I agree.
>>>
>>> this is exactly what I've got no way of telling, that's why I was asking
>>> for guidance.  Just removing the DFP symbols from the baselines works,
>>> of course.
>>
>> I don't think any real program could have used those symbols; it would 
>> have required using __typeof (__builtin_fabsd32 (0)) or similar to access 
>> types that weren't normally available for those targets (and by accessing 
>> the types using builtins like that, you're getting a completely undefined 
>> function-calling ABI for them anyway).
> 
> I think various tools we use to check ABI will be unhappy about removal
> of symbols.  Can't we on targets that do support aliases and don't support
> decimal float e.g. alias the DFP rtti symbols to void rtti symbols?

what is the expected way to fix this issue?

i see hppa-linux-gnu baseline was updated to remove
the decimal rtti symbols, but other targets were not.

is it better to update the baseline or wait for a
generic fix?

Re: contrib: Check and if needed set user.name and user.email in gcc-git-customization.sh

2020-01-16 Thread Richard Earnshaw (lists)


On 15/01/2020 16:59, Richard Earnshaw (lists) wrote:
As discussed on IRC, this adds a couple more checks in the customization 
setup for git.  If the variables user.name and user.email are not set 
anywhere in the git config hierarchy, we set some local values.  We 
always ask about the values we detect and if the user gives an answer 
that is new, we save that in the local config: this gives the 
opportunity to use different values to those configured for the global 
space.


I've also cleaned up a couple of minor niggles, such as using $() rather 
than `` for subshells and some quoting issues when using eval.


 * gcc-git-customization.sh: Use $() instead of ``, and fix variable
 quoting with eval.  Set user.name and user.email if not found in the
 local or config.


I'm not sure what I did wrong yesterday when I (thought I wrote)

git config --get "user.name"

and found it not looking across all config scopes, but clearly I did 
something sill as the above does 'just work'.  So this is what I've 
checked in.


R.
diff --git a/contrib/gcc-git-customization.sh b/contrib/gcc-git-customization.sh
index dae2c35bb57..1cde6fd8224 100755
--- a/contrib/gcc-git-customization.sh
+++ b/contrib/gcc-git-customization.sh
@@ -11,9 +11,9 @@ ask () {
 read answer
 if [ "x$answer" = "x" ]
 then
-	eval $var=$default
+	eval $var=\"$default\"
 else
-	eval $var=$answer
+	eval $var=\"$answer\"
 fi
 }
 
@@ -30,7 +30,52 @@ git config alias.gcc-undescr \!"f() { o=\$(git config --get gcc-config.upstream)
 # *.mddiff=md
 git config diff.md.xfuncname '^\(define.*$'
 
-upstream=`git config --get "gcc-config.upstream"`
+set_user=$(git config --get "user.name")
+set_email=$(git config --get "user.email")
+
+if [ "x$set_user" = "x" ]
+then
+# Try to guess the user's name by looking it up in the password file
+new_user=$(getent passwd $(whoami) | awk -F: '{ print $5 }')
+if [ "x$new_user" = "x" ]
+then
+   new_user="(no default)"
+fi
+else
+new_user=$set_user
+fi
+ask "Your name" "${new_user}" new_user
+if [ "x$new_user" = "x(no default)" ]
+then
+echo "Cannot continue, git needs to record your name against commits"
+exit 1
+fi
+
+if [ "x$set_email" = "x" ]
+then
+new_email="(no_default)"
+else
+new_email=$set_email
+fi
+
+ask "Your email address (for git commits)" "${new_email}" new_email
+if [ "x$new_email" = "x(no default)" ]
+then
+echo "Cannot continue, git needs to record your email address against commits"
+exit 1
+fi
+
+if [ "x$set_user" != "x$new_user" ]
+then
+git config "user.name" "$new_user"
+fi
+
+if [ "x$set_email" != "x$new_email" ]
+then
+git config "user.email" "$new_email"
+fi
+
+upstream=$(git config --get "gcc-config.upstream")
 if [ "x$upstream" = "x" ]
 then
 upstream="origin"
@@ -38,27 +83,27 @@ fi
 ask "Local name for upstream repository" "origin" upstream
 git config "gcc-config.upstream" "$upstream"
 
-remote_id=`git config --get "gcc-config.user"`
+remote_id=$(git config --get "gcc-config.user")
 if [ "x$remote_id" = "x" ]
 then
 # See if the url specifies the remote user name.
-url=`git config --get "remote.$upstream.url"`
+url=$(git config --get "remote.$upstream.url")
 if [ "x$url" = "x" ]
 then
 	# This is a pure guess, but for many people it might be OK.
-	remote_id=`whoami`
+	remote_id=$(whoami)
 else
-	remote_id=`echo $url | sed -r "s|^.*ssh://(.+)@gcc.gnu.org.*$|\1|"`
+	remote_id=$(echo $url | sed -r "s|^.*ssh://(.+)@gcc.gnu.org.*$|\1|")
 	if [ x$remote_id = x$url ]
 	then
-	remote_id=`whoami`
+	remote_id=$(whoami)
 	fi
 fi
 fi
 ask "Account name on gcc.gnu.org (for your personal branches area)" $remote_id remote_id
 git config "gcc-config.user" "$remote_id"
 
-old_pfx=`git config --get "gcc-config.userpfx"`
+old_pfx=$(git config --get "gcc-config.userpfx")
 if [ "x$old_pfx" = "x" ]
 then
 old_pfx="me"
@@ -72,7 +117,7 @@ echo "Setting up tracking for personal namespace $remote_id in remotes/$upstream
 git config --replace-all "remote.${upstream}.fetch" "+refs/users/${remote_id}/heads/*:refs/remotes/${upstream}/${new_pfx}/*" ":refs/remotes/${upstream}/${old_pfx}/"
 git config --replace-all "remote.${upstream}.fetch" "+refs/users/${remote_id}/tags/*:refs/tags/${new_pfx}/*" ":refs/tags/${old_pfx}/"
 
-push_rule=`git config --get "remote.${upstream}.push"`
+push_rule=$(git config --get "remote.${upstream}.push")
 if [ "x$push_rule" != "x" ]
 then
 echo "***"

Re: [patch, openacc] Fix ICE verifying gimple

2020-01-16 Thread Andrew Stubbs


Ping.

On 22/11/2019 11:06, Andrew Stubbs wrote:

This test case causes an ICE (reformatted for email):

   void test(int k)
   {
     unsigned int x = 1;
   #pragma acc parallel loop async(x)
     for (int i = 0; i < k; i++) { }
   }

   t.c: In function 'test':
   t.c:4:9: error: invalid argument to gimple call
   4 | #pragma acc parallel loop async(x)
     | ^~~
   (int) x
   __builtin_GOACC_parallel_keyed (-1, test._omp_fn.0, 1,
   &.omp_data_arr.3, &.omp_data_sizes.4,
   &.omp_data_kinds.5, 536936447,
  (int) x, 0);
   during GIMPLE pass: ompexp
   dump file: t.c.013t.ompexp
   t.c:4:9: internal compiler error: verify_gimple failed

The problem is that "x" needs to be cast to "int" (from "unsigned int") 
before calling the function, and that's not valid in a gimple call.


The attached patch assigns the "(int) x" to a temporary and passes that 
to the function instead.


OK to commit?

[PATCH] Fix spacing in a dump in value-prof.c.

2020-01-16 Thread Martin Liška


Hi.

One obvious dump fix that I've just tested.
I'm going to install it.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin

gcc/ChangeLog:

2020-01-15  Martin Liska  

* value-prof.c (dump_histogram_value): Fix
obvious spacing issue.
---
 gcc/value-prof.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)


diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index 9a2c46252c6..b7c7d7eaea5 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -262,10 +262,10 @@ dump_histogram_value (FILE *dump_file, histogram_value hist)
 	{
 	  fprintf (dump_file,
 		   (hist->type == HIST_TYPE_TOPN_VALUES
-		? "Top N value counter " : "Indirect call counter"));
+		? "Top N value counter" : "Indirect call counter"));
 	  if (hist->hvalue.counters)
 	{
-	  fprintf (dump_file, "all: %" PRId64 ", values: ",
+	  fprintf (dump_file, " all: %" PRId64 ", values: ",
 		   (int64_t) hist->hvalue.counters[0]);
 	  for (unsigned i = 0; i < GCOV_TOPN_VALUES; i++)
 		{

Re: [PATCH] libstdc++/91223 Improve unordered containers == operator

2020-01-16 Thread Jonathan Wakely


On 16/01/20 07:42 +0100, François Dumont wrote:

On 1/15/20 10:52 PM, Jonathan Wakely wrote:

On 15/01/20 21:48 +, Jonathan Wakely wrote:

On 14/01/20 22:25 +0100, François Dumont wrote:

On 1/13/20 10:53 PM, Jonathan Wakely wrote:

On 13/01/20 22:41 +0100, François Dumont wrote:


For the multi-keys we could still avoid redundant 
comparisons when _Equal is just doing == on the key type. On 
unordered_multiset we could just avoids the call to 
is_permuation and on the unordered_multimap we could check 
the is_permutation only on the associated value rather than 
on the std::pair.



I don't think that's necessary, or helpful.

The idea of https://gcc.gnu.org/ml/libstdc++/2020-01/msg00070.html is
that you shouldn't be using _Equal at all, and therefore it doesn't
matter whether it's std::equal_to or not.



And it was indeed possible.


Nice!


    PR libstdc++/91223
    * include/bits/hashtable.h (_Hashtable<>): Make _Equality<> friend.
    * include/bits/hashtable_policy.h: Include .
    (_Equality_base): Remove.
    (_Equality<>::_M_equal): Review implementation. Use 
std::is_permutation.

    * testsuite/23_containers/unordered_multiset/operators/1.cc
    (Hash, Equal, test02, test03): New.
    * testsuite/23_containers/unordered_set/operators/1.cc
    (Hash, Equal, test02, test03): New.

Tested under Linux x86_64.

Ok to commit ?


Yes, OK for trunk (we're in stage4 but your patch was posted in stage3
and fixes a pretty nasty performance bug, so is OK now).

N.B. GCC has moved to Git instead of Subversion. If you don't have Git
access set up let me know and I can commit the patch for you.


I haven't done the move yet and won't be able to do it before the 
week-end. So please proceed to the commit for me, thanks.


No problem, I can do that.

Re: [PATCH][AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

2020-01-16 Thread Richard Sandiford

Wilco Dijkstra  writes:
> The separate shrinkwrapping pass may insert stores in the middle
> of atomics loops which can cause issues on some implementations.
> Avoid this by delaying splitting of atomic patterns until after
> prolog/epilog generation.
>
> Bootstrap completed, no test regressions on AArch64.
>
> Andrew, can you verify this fixes the failure you were getting?
>
> ChangeLog:
> 2020-01-16  Wilco Dijkstra  
>
> PR target/92692
> * config/aarch64/aarch64.c (aarch64_split_compare_and_swap)
> Add assert to ensure prolog has been emitted.
> (aarch64_split_atomic_op): Likewise.
> * config/aarch64/atomics.md (aarch64_compare_and_swap)
> Use epilogue_completed rather than reload_completed.
> (aarch64_atomic_exchange): Likewise.
> (aarch64_atomic_): Likewise.
> (atomic_nand): Likewise.
> (aarch64_atomic_fetch_): Likewise.
> (atomic_fetch_nand): Likewise.
> (aarch64_atomic__fetch): Likewise.
> (atomic_nand_fetch): Likewise.

OK if Andrew confirms it works, thanks.

Richard

> ---
>
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> ac89cc1f9c938455d33d8850d9ebfc0473cb73dc..cd9d813f2ac4990971f6435fdb28b0f94ae10309
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -18375,6 +18375,9 @@ aarch64_emit_post_barrier (enum memmodel model)
>  void
>  aarch64_split_compare_and_swap (rtx operands[])
>  {
> +  /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
> +  gcc_assert (epilogue_completed);
> +
>rtx rval, mem, oldval, newval, scratch, x, model_rtx;
>machine_mode mode;
>bool is_weak;
> @@ -18469,6 +18472,9 @@ void
>  aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx 
> mem,
>   rtx value, rtx model_rtx, rtx cond)
>  {
> +  /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
> +  gcc_assert (epilogue_completed);
> +
>machine_mode mode = GET_MODE (mem);
>machine_mode wmode = (mode == DImode ? DImode : SImode);
>const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
> diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
> index 
> c2bcabd0c3c2627b7222dcbc1af9c2e6b7ce6a76..996947799b5ef8445e9786b94e1ce62fd16e5b5c
>  100644
> --- a/gcc/config/aarch64/atomics.md
> +++ b/gcc/config/aarch64/atomics.md
> @@ -56,7 +56,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
> (clobber (match_scratch:SI 7 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>[(const_int 0)]
>{
>  aarch64_split_compare_and_swap (operands);
> @@ -80,7 +80,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
> (clobber (match_scratch:SI 7 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>[(const_int 0)]
>{
>  aarch64_split_compare_and_swap (operands);
> @@ -104,7 +104,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
> (clobber (match_scratch:SI 7 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>[(const_int 0)]
>{
>  aarch64_split_compare_and_swap (operands);
> @@ -223,7 +223,7 @@ (define_insn_and_split "aarch64_atomic_exchange"
> (clobber (match_scratch:SI 4 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>[(const_int 0)]
>{
>  aarch64_split_atomic_op (SET, operands[0], NULL, operands[1],
> @@ -344,7 +344,7 @@ (define_insn_and_split 
> "aarch64_atomic_"
>(clobber (match_scratch:SI 4 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>[(const_int 0)]
>{
>  aarch64_split_atomic_op (, NULL, operands[3], operands[0],
> @@ -400,7 +400,7 @@ (define_insn_and_split "atomic_nand"
> (clobber (match_scratch:SI 4 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>[(const_int 0)]
>{
>   aarch64_split_atomic_op (NOT, NULL, operands[3], operands[0],
> @@ -504,7 +504,7 @@ (define_insn_and_split 
> "aarch64_atomic_fetch_"
> (clobber (match_scratch:SI 5 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>[(const_int 0)]
>{
>  aarch64_split_atomic_op (, operands[0], operands[4], operands[1],
> @@ -551,7 +551,7 @@ (define_insn_and_split "atomic_fetch_nand"
> (clobber (match_scratch:SI 5 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>[(const_int 0)]
>{
>  aarch64_split_atomic_op (NOT, operands[0], operands[4], operands[1],
> @@ -604,7 +604,7 @@ (define_insn_and_split 
> "aarch64_atomic__fetch"
> (clobber (match_scratch:SI 4 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>[(const_int 0)]
>{
>  aarch64_split_atomic_op (, NULL, operands[0], operands[1],
> @@ -628,7 +628,7 @@ (define_insn_and_split "atomic_nand_fetch"
> (clobber (match_scratch:SI 4 "="))]
>""
>"#"
> -  "&& reload_completed"
> +  "&& epilogue_completed"
>

[PATCH][AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

2020-01-16 Thread Wilco Dijkstra

The separate shrinkwrapping pass may insert stores in the middle
of atomics loops which can cause issues on some implementations.
Avoid this by delaying splitting of atomic patterns until after
prolog/epilog generation.

Bootstrap completed, no test regressions on AArch64. 

Andrew, can you verify this fixes the failure you were getting?

ChangeLog:
2020-01-16  Wilco Dijkstra  

PR target/92692
* config/aarch64/aarch64.c (aarch64_split_compare_and_swap)
Add assert to ensure prolog has been emitted.
(aarch64_split_atomic_op): Likewise.
* config/aarch64/atomics.md (aarch64_compare_and_swap)
Use epilogue_completed rather than reload_completed.
(aarch64_atomic_exchange): Likewise.
(aarch64_atomic_): Likewise.
(atomic_nand): Likewise.
(aarch64_atomic_fetch_): Likewise.
(atomic_fetch_nand): Likewise.
(aarch64_atomic__fetch): Likewise.
(atomic_nand_fetch): Likewise.
---

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
ac89cc1f9c938455d33d8850d9ebfc0473cb73dc..cd9d813f2ac4990971f6435fdb28b0f94ae10309
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -18375,6 +18375,9 @@ aarch64_emit_post_barrier (enum memmodel model)
 void
 aarch64_split_compare_and_swap (rtx operands[])
 {
+  /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
+  gcc_assert (epilogue_completed);
+
   rtx rval, mem, oldval, newval, scratch, x, model_rtx;
   machine_mode mode;
   bool is_weak;
@@ -18469,6 +18472,9 @@ void
 aarch64_split_atomic_op (enum rtx_code code, rtx old_out, rtx new_out, rtx mem,
 rtx value, rtx model_rtx, rtx cond)
 {
+  /* Split after prolog/epilog to avoid interactions with shrinkwrapping.  */
+  gcc_assert (epilogue_completed);
+
   machine_mode mode = GET_MODE (mem);
   machine_mode wmode = (mode == DImode ? DImode : SImode);
   const enum memmodel model = memmodel_from_int (INTVAL (model_rtx));
diff --git a/gcc/config/aarch64/atomics.md b/gcc/config/aarch64/atomics.md
index 
c2bcabd0c3c2627b7222dcbc1af9c2e6b7ce6a76..996947799b5ef8445e9786b94e1ce62fd16e5b5c
 100644
--- a/gcc/config/aarch64/atomics.md
+++ b/gcc/config/aarch64/atomics.md
@@ -56,7 +56,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
(clobber (match_scratch:SI 7 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_compare_and_swap (operands);
@@ -80,7 +80,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
(clobber (match_scratch:SI 7 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_compare_and_swap (operands);
@@ -104,7 +104,7 @@ (define_insn_and_split "@aarch64_compare_and_swap"
(clobber (match_scratch:SI 7 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_compare_and_swap (operands);
@@ -223,7 +223,7 @@ (define_insn_and_split "aarch64_atomic_exchange"
(clobber (match_scratch:SI 4 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_atomic_op (SET, operands[0], NULL, operands[1],
@@ -344,7 +344,7 @@ (define_insn_and_split "aarch64_atomic_"
   (clobber (match_scratch:SI 4 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_atomic_op (, NULL, operands[3], operands[0],
@@ -400,7 +400,7 @@ (define_insn_and_split "atomic_nand"
(clobber (match_scratch:SI 4 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
  aarch64_split_atomic_op (NOT, NULL, operands[3], operands[0],
@@ -504,7 +504,7 @@ (define_insn_and_split 
"aarch64_atomic_fetch_"
(clobber (match_scratch:SI 5 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_atomic_op (, operands[0], operands[4], operands[1],
@@ -551,7 +551,7 @@ (define_insn_and_split "atomic_fetch_nand"
(clobber (match_scratch:SI 5 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_atomic_op (NOT, operands[0], operands[4], operands[1],
@@ -604,7 +604,7 @@ (define_insn_and_split 
"aarch64_atomic__fetch"
(clobber (match_scratch:SI 4 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_atomic_op (, NULL, operands[0], operands[1],
@@ -628,7 +628,7 @@ (define_insn_and_split "atomic_nand_fetch"
(clobber (match_scratch:SI 4 "="))]
   ""
   "#"
-  "&& reload_completed"
+  "&& epilogue_completed"
   [(const_int 0)]
   {
 aarch64_split_atomic_op (NOT, NULL, operands[0], operands[1],

Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-01-16 Thread Jakub Jelinek

On Wed, Jan 15, 2020 at 09:52:57PM +0100, Jakub Jelinek wrote:
> This looks wrong.  For one, this function is used for two purposes now and
> you tweak it for one, but more importantly, whether he initial stmt
> you see is a PHI or not can't make a difference, how is that case e.g.
> different from _1 = PHI <_3, _4>; _2 = _1 + 1; and asking about _2?
> For _1, you'd use (correctly) the maximum, but if called on _2, you'd ask
> (wrongly) for minimum instead of maximum.

And now with testcases.  strlenopt-95.c shows the above.

> This also looks like a hack to shut up the particular testcases instead of
> really playing with what the IL provides.  Instead of the unions, consider
> e.g. C++ placement new, have a pointer to a buffer into which you placement
> new one structure, take address of some member in it, pass it to something,
> if it doesn't have a destructor do a C++ placement new into the same buffer
> but with different structure, take address of a different member with the
> same address as the first member, do the str*cmp on it that invokes this
> stuff.  SCCVN will (likely) find out that the values of those two pointers
> are the same and just use the former pointer in the latter case.

And strlenopt-93.C shows the latter.  strlenopt-94.C is similar, just to
show that it breaks equally badly with non-PODs that will be constructed by
placement new and destructed later.

Jakub
__attribute__((noipa)) int
barrier_copy (char *x, int y)
{
  asm volatile ("" : : "g" (x), "g" (y) : "memory");
  if (y == 0)
__builtin_strcpy (x, "abcd");
  return y;
}

__attribute__((noipa)) char *
test_2 (int x)
{
  char *p;
  if (x)
p = __builtin_malloc (4);
  else
p = __builtin_calloc (16, 1);
  char *q = p + 2;
  if (barrier_copy (q, x))
return p;
  if (__builtin_strcmp (q, "abcd") != 0)
__builtin_abort ();
  return p;
}

int
main ()
{
  __builtin_free (test_2 (0));
  __builtin_free (test_2 (1));
  return 0;
}#include 

struct S1 { char a[2]; char b[2]; char c[2]; };
struct S2 { char d[6]; };

__attribute__((noipa)) void
foo (char *b)
{
  b[0] = 1;
  b[1] = 2;
  asm volatile ("" : : "g" (b) : "memory");
}

__attribute__((noipa)) void
bar (char *d)
{
  __builtin_memcpy (d, "cde", 4);
  asm volatile ("" : : "g" (d) : "memory");
}

__attribute__((noipa)) void
baz (char *buf)
{
  S1 *s1 = new (buf) S1;
  char *p = (char *) >b;
  foo (p);
  S2 *s2 = new (buf) S2;
  char *q = (char *) >d[2];
  bar (q);
  if (__builtin_strcmp (q, "cde"))
__builtin_abort ();
}

int
main ()
{
  union U { S1 s1; S2 s2; char buf[sizeof (S1) > sizeof (S2) ? sizeof (S1) : 
sizeof (S2)]; } u;
  baz (u.buf);
  return 0;
}
#include 

struct S1 { char a[2]; char b[2]; char c[2]; S1 () { a[0] = 0; b[0] = 0; c[0] = 
0; }; ~S1 () {} };
struct S2 { char d[6]; S2 () { d[0] = 0; d[2] = 0; } ~S2 () {} };

__attribute__((noipa)) void
foo (char *b)
{
  b[0] = 1;
  b[1] = 2;
  asm volatile ("" : : "g" (b) : "memory");
}

__attribute__((noipa)) void
bar (char *d)
{
  __builtin_memcpy (d, "cde", 4);
  asm volatile ("" : : "g" (d) : "memory");
}

__attribute__((noipa)) void
baz (char *buf)
{
  S1 *s1 = new (buf) S1 ();
  char *p = (char *) >b;
  foo (p);
  s1->~S1 ();
  S2 *s2 = new (buf) S2 ();
  char *q = (char *) >d[2];
  bar (q);
  if (__builtin_strcmp (q, "cde"))
__builtin_abort ();
  s2->~S2 ();
}

int
main ()
{
  char buf[sizeof (S1) > sizeof (S2) ? sizeof (S1) : sizeof (S2)];
  baz (buf);
  return 0;
}

[PATCH] gimplifier: handle POLY_INT_CST-sized TARGET_EXPRs

2020-01-16 Thread Richard Sandiford

If a TARGET_EXPR has poly-int size, the gimplifier would treat it
like a VLA and use gimplify_vla_decl.  gimplify_vla_decl in turn
would use an alloca and expect all references to be gimplified
via the DECL_VALUE_EXPR.  This caused confusion later in
gimplify_var_or_parm_decl_1 when we (correctly) had direct rather
than indirect references.

For completeness, the patch also fixes similar tests in the RETURN_EXPR
handling and OpenMP depend clauses.

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  I guess this isn't
really a regression, but instead a bug in a new feature.  It should
be very low risk for non-SVE targets though, so OK anyway?

Richard


2020-01-16  Richard Sandiford  

gcc/
* gimplify.c (gimplify_return_expr): Use poly_int_tree_p rather
than testing directly for INTEGER_CST.
(gimplify_target_expr, gimplify_omp_depend): Likewise.

gcc/testsuite/
* g++.target/aarch64/sve/acle/general-c++/gimplify_1.C: New test.
---
 gcc/gimplify.c  | 6 +++---
 .../g++.target/aarch64/sve/acle/general-c++/gimplify_1.C| 4 
 2 files changed, 7 insertions(+), 3 deletions(-)
 create mode 100644 
gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/gimplify_1.C

diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 05d7922116b..e50be5706c7 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -1632,7 +1632,7 @@ gimplify_return_expr (tree stmt, gimple_seq *pre_p)
 result = NULL_TREE;
   else if (aggregate_value_p (result_decl, TREE_TYPE (current_function_decl)))
 {
-  if (TREE_CODE (DECL_SIZE (result_decl)) != INTEGER_CST)
+  if (!poly_int_tree_p (DECL_SIZE (result_decl)))
{
  if (!TYPE_SIZES_GIMPLIFIED (TREE_TYPE (result_decl)))
gimplify_type_sizes (TREE_TYPE (result_decl), pre_p);
@@ -6714,7 +6714,7 @@ gimplify_target_expr (tree *expr_p, gimple_seq *pre_p, 
gimple_seq *post_p)
 
   /* TARGET_EXPR temps aren't part of the enclosing block, so add it
 to the temps list.  Handle also variable length TARGET_EXPRs.  */
-  if (TREE_CODE (DECL_SIZE (temp)) != INTEGER_CST)
+  if (!poly_int_tree_p (DECL_SIZE (temp)))
{
  if (!TYPE_SIZES_GIMPLIFIED (TREE_TYPE (temp)))
gimplify_type_sizes (TREE_TYPE (temp), pre_p);
@@ -7921,7 +7921,7 @@ gimplify_omp_depend (tree *list_p, gimple_seq *pre_p)
   tree type = build_array_type (ptr_type_node, build_index_type (totalpx));
   tree array = create_tmp_var_raw (type);
   TREE_ADDRESSABLE (array) = 1;
-  if (TREE_CODE (totalpx) != INTEGER_CST)
+  if (!poly_int_tree_p (totalpx))
 {
   if (!TYPE_SIZES_GIMPLIFIED (TREE_TYPE (array)))
gimplify_type_sizes (TREE_TYPE (array), pre_p);
diff --git a/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/gimplify_1.C 
b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/gimplify_1.C
new file mode 100644
index 000..06ee7eae029
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve/acle/general-c++/gimplify_1.C
@@ -0,0 +1,4 @@
+/* { dg-additional-options "-flax-vector-conversions" } */
+
+inline void foo (const __SVInt32_t ) {}
+void bar (__SVUint32_t x) { foo(x); }

[committed] contrib: Verify the id to be printed is ancestor of the corresponding remote release branch (or master), otherwise print nothing.

2020-01-16 Thread Jakub Jelinek

Hi!

The monotonically increasing revision ids need to be globally unique, so they 
should
only identify commits that were committed to the upstream repo to its master or
releases/gcc-N branches.  The alias could print something even for private 
branches
or vendor branches etc., but if such an identifier is then used publicly, it 
will
refer to something else.

Committed to trunk.

Martin also suggested for git gcc-descr --full to also include date (commit
date) in the string, so instead of
r10-5998-g2588197b6c2175d7ef9f05a8636f58dfeab12b1d
print
r10-5998-20200116-g2588197b6c2175d7ef9f05a8636f58dfeab12b1d

What do others think?  The r10-5998 part in there is still needed to be the
unique monotonically increasing identifier, the rest is something that can
be derived from it, but e.g. in bugzilla the date could provide extra
helpful information (which can be seen if one follows the link obviously).
In addition to adjusting gcc-git-customization.sh, git-hooks we'd need to
update the redirects and ask Frederic to update bugzilla URLization too.

2020-01-16  Jakub Jelinek  

* gcc-git-customization.sh: Verify the id to be printed is ancestor of
the corresponding remote release branch (or master), otherwise print
nothing.

--- contrib/gcc-git-customization.sh.jj 2020-01-15 14:28:26.996758773 +0100
+++ contrib/gcc-git-customization.sh2020-01-16 12:27:46.734690164 +0100
@@ -22,7 +22,7 @@ git config alias.svn-rev '!f() { rev=$1;
 
 # Add git commands to convert git commit to monotonically increasing revision 
number
 # and vice versa
-git config alias.gcc-descr \!"f() { if test \${1:-no} = --full; then r=\$(git 
describe --all --abbrev=40 --match 'basepoints/gcc-[0-9]*' \${2:-master} | sed 
-n 's,^\\(tags/\\)\\?basepoints/gcc-,r,p'); expr match \${r:-no} '^r[0-9]\\+\$' 
>/dev/null && r=\${r}-0-g\$(git rev-parse \${2:-master}); test -n \$r && echo 
\${r}; else git describe --all --match 'basepoints/gcc-[0-9]*' \${1:-master} | 
sed -n 
's,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)-\\([0-9]\\+\\)-g[0-9a-f]*\$,r\\2-\\3,p;s,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)\$,r\\2-0,p';
 fi; }; f"
+git config alias.gcc-descr \!"f() { if test \${1:-no} = --full; then 
c=\${2:-master}; r=\$(git describe --all --abbrev=40 --match 
'basepoints/gcc-[0-9]*' \$c | sed -n 's,^\\(tags/\\)\\?basepoints/gcc-,r,p'); 
expr match \${r:-no} '^r[0-9]\\+\$' >/dev/null && r=\${r}-0-g\$(git rev-parse 
\${2:-master}); else c=\${1:-master}; r=\$(git describe --all --match 
'basepoints/gcc-[0-9]*' \$c | sed -n 
's,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)-\\([0-9]\\+\\)-g[0-9a-f]*\$,r\\2-\\3,p;s,^\\(tags/\\)\\?basepoints/gcc-\\([0-9]\\+\\)\$,r\\2-0,p');
 fi; if test -n \$r; then o=\$(git config --get gcc-config.upstream); 
rr=\$(echo \$r | sed -n 
's,^r\\([0-9]\\+\\)-[0-9]\\+\\(-g[0-9a-f]\\+\\)\\?\$,\\1,p'); if git rev-parse 
--verify --quiet \${o:-origin}/releases/gcc-\$rr >/dev/null; then 
m=releases/gcc-\$rr; else m=master; fi; git merge-base --is-ancestor \$c 
\${o:-origin}/\$m && \echo \${r}; fi; }; f"
 git config alias.gcc-undescr \!"f() { o=\$(git config --get 
gcc-config.upstream); r=\$(echo \$1 | sed -n 
's,^r\\([0-9]\\+\\)-[0-9]\\+\$,\\1,p'); n=\$(echo \$1 | sed -n 
's,^r[0-9]\\+-\\([0-9]\\+\\)\$,\\1,p'); test -z \$r && echo Invalid id \$1 && 
exit 1; h=\$(git rev-parse --verify --quiet \${o:-origin}/releases/gcc-\$r); 
test -z \$h && h=\$(git rev-parse --verify --quiet \${o:-origin}/master); 
p=\$(git describe --all --match 'basepoints/gcc-'\$r \$h | sed -n 
's,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+-\\([0-9]\\+\\)-g[0-9a-f]*\$,\\2,p;s,^\\(tags/\\)\\?basepoints/gcc-[0-9]\\+\$,0,p');
 git rev-parse --verify \$h~\$(expr \$p - \$n); }; f"
 
 # Make diff on MD files use "(define" as a function marker.

 
Jakub

[PATCH][AArch64] PR92424: Fix -fpatchable-function-entry=N,M with BTI

2020-01-16 Thread Szabolcs Nagy

this affects the linux kernel and technically a wrong code bug
so this fix tries to be backportable (fixing all issues with
-fpatchable-function-entry=N,M will likely require new option).

gcc/ChangeLog:

2020-01-16  Szabolcs Nagy  

PR target/92424
* config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
if the function label is followed by a patch area.
From ac2a46bab60ecc80a453328b4749a951908c02c5 Mon Sep 17 00:00:00 2001
From: Szabolcs Nagy 
Date: Wed, 15 Jan 2020 12:23:40 +
Subject: [PATCH] [AArch64] PR92424: Fix -fpatchable-function-entry=N,M with
 BTI

This is a minimal workaround that emits another BTI after the function
label if that is followed by a patch area.

So before this commit -fpatchable-function-entry=3,1 with bti generates

.section __patchable_function_entries
.8byte .LPFE
.text
  .LPFE:
nop
  foo:
nop
nop
bti c // or paciasp
...

and after this commit

.section __patchable_function_entries
.8byte .LPFE
.text
  .LPFE:
nop
  foo:
bti c
nop
nop
bti c // or paciasp
...

There is a new bti insn in the middle of the patchable area unless M=0
(patch area is after the new bti) or M=N (patch area is before the
label, no new bti).

Note that .cfi_startproc and the asynchronous unwind table entry label
comes after the patch area, but whatever effect that has on the newly
inserted bti c, it already affected the insns in the patch area.

Tested on aarch64-none-linux-gnu.

gcc/ChangeLog:

	PR target/92424
	* config/aarch64/aarch64.c (aarch64_declare_function_name): Emit BTI c
	if the function label is followed by a patch area.
---
 gcc/config/aarch64/aarch64.c | 33 +
 1 file changed, 33 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 66e20becaf2..0394c274330 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -18079,6 +18079,39 @@ aarch64_declare_function_name (FILE *stream, const char* name,
   /* Don't forget the type directive for ELF.  */
   ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function");
   ASM_OUTPUT_LABEL (stream, name);
+
+  if (!aarch64_bti_enabled ()
+  || cgraph_node::get (fndecl)->only_called_directly_p ())
+return;
+
+  /* Copy logic from varasm.c assemble_start_function
+ to handle -fpatchable-function-entry=N,M with BTI.  */
+  unsigned HOST_WIDE_INT patch_area_size = function_entry_patch_area_size;
+  unsigned HOST_WIDE_INT patch_area_entry = function_entry_patch_area_start;
+
+  tree patchable_function_entry_attr
+= lookup_attribute ("patchable_function_entry", DECL_ATTRIBUTES (fndecl));
+  if (patchable_function_entry_attr)
+{
+  tree pp_val = TREE_VALUE (patchable_function_entry_attr);
+  tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
+
+  patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
+  patch_area_entry = 0;
+  if (TREE_CHAIN (pp_val) != NULL_TREE)
+	{
+	  tree patchable_function_entry_value2
+	= TREE_VALUE (TREE_CHAIN (pp_val));
+	  patch_area_entry = tree_to_uhwi (patchable_function_entry_value2);
+	}
+}
+
+  if (patch_area_entry > patch_area_size)
+patch_area_entry = 0;
+
+  /* Emit a BTI c if a patch area comes after the function label.  */
+  if (patch_area_size > patch_area_entry)
+asm_fprintf (stream, "\thint\t34 // bti c\n");
 }
 
 /* Implement ASM_OUTPUT_DEF_FROM_DECLS.  Output .variant_pcs for aliases.  */
-- 
2.17.1

Re: [PATCH][gcc] libgccjit: introduce version entry points

2020-01-16 Thread Andrea Corallo

Hi, second version of the patch here cleaning up an unnecessary change.

Does not introduce regressions with make check-jit.

Andrea

gcc/jit/ChangeLog
2020-??-??  Andrea Corallo  

* docs/topics/compatibility.rst (LIBGCCJIT_ABI_13): New ABI tag
plus add version paragraph.
* libgccjit++.h (namespace gccjit::version): Add new namespace.
* libgccjit.c (gcc_jit_version_major, gcc_jit_version_minor)
(gcc_jit_version_patchlevel): New functions.
* libgccjit.h (LIBGCCJIT_HAVE_gcc_jit_version): New macro.
(gcc_jit_version_major, gcc_jit_version_minor)
(gcc_jit_version_patchlevel): New functions.
* libgccjit.map (LIBGCCJIT_ABI_13) New ABI tag.

gcc/testsuite/ChangeLog
2020-??-??  Andrea Corallo  

* jit.dg/test-version.c: New testcase.

diff --git a/gcc/jit/docs/topics/compatibility.rst b/gcc/jit/docs/topics/compatibility.rst
index a6faee0810e..0c0ce070d72 100644
--- a/gcc/jit/docs/topics/compatibility.rst
+++ b/gcc/jit/docs/topics/compatibility.rst
@@ -61,6 +61,28 @@ You can see the symbol tags provided by libgccjit.so using ``objdump``:
LIBGCCJIT_ABI_0
[...snip...]
 
+Programmatically checking version
+***
+
+Client code can programmatically check libgccjit version using:
+
+.. function::  int gcc_jit_version_major (void)
+
+   Return libgccjit major version.  This is analogous to __GNUC__ in C code.
+
+.. function::  int gcc_jit_version_minor (void)
+
+   Return libgccjit minor version.  This is analogous to
+   __GNUC_MINOR__ in C code.
+
+.. function::  int gcc_jit_version_patchlevel (void)
+
+   Return libgccjit patchlevel version.  This is analogous to
+   __GNUC_PATCHLEVEL__ in C code.
+
+.. note:: These entry points has been added with ``LIBGCCJIT_ABI_13``
+  (see below).
+
 ABI symbol tags
 ***
 
@@ -182,3 +204,14 @@ entrypoints:
 
 ``LIBGCCJIT_ABI_12`` covers the addition of
 :func:`gcc_jit_context_new_bitfield`
+
+``LIBGCCJIT_ABI_13``
+
+``LIBGCCJIT_ABI_13`` covers the addition of version functions via API
+entrypoints:
+
+  * :func:`gcc_jit_version_major`
+
+  * :func:`gcc_jit_version_minor`
+
+  * :func:`gcc_jit_version_patchlevel`
diff --git a/gcc/jit/libgccjit++.h b/gcc/jit/libgccjit++.h
index 82a62d614c5..afb92194c28 100644
--- a/gcc/jit/libgccjit++.h
+++ b/gcc/jit/libgccjit++.h
@@ -49,6 +49,8 @@ namespace gccjit
   class timer;
   class auto_time;
 
+  namespace version {};
+
   /* Errors within the API become C++ exceptions of this class.  */
   class error
   {
@@ -1913,6 +1915,26 @@ auto_time::~auto_time ()
   m_timer.pop (m_item_name);
 }
 
+namespace version
+{
+inline int
+major ()
+{
+  return gcc_jit_version_major ();
+}
+
+inline int
+minor ()
+{
+  return gcc_jit_version_minor ();
+}
+
+inline int
+patchlevel ()
+{
+  return gcc_jit_version_patchlevel ();
+}
+} // namespace version
 } // namespace gccjit
 
 #endif /* #ifndef LIBGCCJIT_PLUS_PLUS_H */
diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
index 21a0dc09b03..1c5a12e9c01 100644
--- a/gcc/jit/libgccjit.h
+++ b/gcc/jit/libgccjit.h
@@ -1487,6 +1487,22 @@ gcc_jit_context_new_rvalue_from_vector (gcc_jit_context *ctxt,
 	size_t num_elements,
 	gcc_jit_rvalue **elements);
 
+#define LIBGCCJIT_HAVE_gcc_jit_version
+
+/* Functions to retrive libgccjit version.
+   Analogous to __GNUC__, __GNUC_MINOR__, __GNUC_PATCHLEVEL__ in C code.
+
+   These API entrypoints were added in LIBGCCJIT_ABI_13; you can test for their
+   presence using
+ #ifdef LIBGCCJIT_HAVE_gcc_jit_version
+ */
+extern int
+gcc_jit_version_major (void);
+extern int
+gcc_jit_version_minor (void);
+extern int
+gcc_jit_version_patchlevel (void);
+
 #ifdef __cplusplus
 }
 #endif /* __cplusplus */
diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
index 83055fc297b..572c82f053c 100644
--- a/gcc/jit/libgccjit.c
+++ b/gcc/jit/libgccjit.c
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "timevar.h"
 #include "typed-splay-tree.h"
+#include "cppbuiltin.h"
 
 #include "libgccjit.h"
 #include "jit-recording.h"
@@ -3175,3 +3176,27 @@ gcc_jit_context_new_rvalue_from_vector (gcc_jit_context *ctxt,
  as_vec_type,
  (gcc::jit::recording::rvalue **)elements);
 }
+
+extern int
+gcc_jit_version_major (void)
+{
+  int major, minor, patchlevel;
+  parse_basever (, , );
+  return major;
+}
+
+extern int
+gcc_jit_version_minor (void)
+{
+  int major, minor, patchlevel;
+  parse_basever (, , );
+  return minor;
+}
+
+extern int
+gcc_jit_version_patchlevel (void)
+{
+  int major, minor, patchlevel;
+  parse_basever (, , );
+  return patchlevel;
+}
diff --git a/gcc/jit/libgccjit.map b/gcc/jit/libgccjit.map
index 4514bd3aa33..6137dd4b4b0 100644
--- a/gcc/jit/libgccjit.map
+++ b/gcc/jit/libgccjit.map
@@ -179,4 +179,11 @@ LIBGCCJIT_ABI_11 {
 LIBGCCJIT_ABI_12 {
   global:
 gcc_jit_context_new_bitfield;
-} LIBGCCJIT_ABI_11;
\ No newline at end of file

Re: [PATCH] doc: Replace references to SVN with those to Git

2020-01-16 Thread Martin Jambor

Hi,

On Thu, Jan 16 2020, Gerald Pfeifer wrote:
> On Wed, 15 Jan 2020, Martin Jambor wrote:
>> when going over stuff linked from the SummerOfCode wiki page, 
>> I found out that doc/install.texi still refers to Subversion.
>
> We've got a fair number of references left in various places;
> working through that slowly and appreciate your help!
>
>> The following patch replaces SVN to Git almost mechanically.
>
> I'm wondering whether we can/should see where to use a more
> generic term in a few cases; let me propose some such cases.
>
>> OK for trunk? And then perhaps for the opened release branches too?
>
> Yes (considering the notes below), and yes, please!

It was brought to my attention that there already is Eric's
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00719.html which addresses
the same thing and better, so I don't plan to commit my patch any more.

However, it would be great if the (already approved?) patch would be
comitted soon.

Thanks,

Martin

Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-01-16 Thread Alexandre Oliva

On Jan 16, 2020, Alexandre Oliva  wrote:

> On Jan  9, 2020, Alexandre Oliva  wrote:
>> On Jan  9, 2020, Richard Biener  wrote:
>>> Did I miss the actual (non-documentation) patch?

>> No, I didn't post it.  It's kind of big, and only yesterday did I get it
>> to work as expected and now extensively documented, passing all of the
>> extensive testsuite I wrote for it.

> Here it is, at last, regstrapped on x86_64-linux-gnu.  Ok to install?

And here's a followup that fixes a limitation (bug?) in libiberty that
was hit when I attempted a last-minute simplification in lto-wrapper.

Regstrapped separately on x86_64-linux-gnu.  Ok to install?


[libiberty] output empty args as a pair of quotes

From: Alexandre Oliva 

writeargv writes out empty arguments in a way that expandargv skips
them instead of preserving them.  Fixed by writing out a pair of
quotes for them.

This enables lto-wrapper to pass down an empty string, as desired.


for  libiberty/ChangeLog

* argv.c (writeargv): Output empty args as "".

for  gcc/ChangeLog

* lto-wrapper.c (run_gcc): Use an empty string for -dumpdir.
---
 gcc/lto-wrapper.c |5 +
 libiberty/argv.c  |8 
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/gcc/lto-wrapper.c b/gcc/lto-wrapper.c
index ed076e3..aa71f1e 100644
--- a/gcc/lto-wrapper.c
+++ b/gcc/lto-wrapper.c
@@ -1482,10 +1482,7 @@ run_gcc (unsigned argc, char *argv[])
   if (!incoming_dumppfx)
 {
   obstack_ptr_grow (_obstack, "-dumpdir");
-  /* An empty string would do, if only writeargv would write it
-out in a way that would not be skipped by expandargv and
-buildargv.  */
-  obstack_ptr_grow (_obstack, current_dir);
+  obstack_ptr_grow (_obstack, "");
 }
   obstack_ptr_grow (_obstack, "-dumpbase");
 
diff --git a/libiberty/argv.c b/libiberty/argv.c
index 8c9794db..6a72208 100644
--- a/libiberty/argv.c
+++ b/libiberty/argv.c
@@ -327,6 +327,14 @@ writeargv (char * const *argv, FILE *f)
   arg++;
 }
 
+  /* Write out a pair of quotes for an empty argument.  */
+  if (arg == *argv)
+   if (EOF == fputs ("\"\"", f))
+ {
+   status = 1;
+   goto done;
+ }
+
   if (EOF == fputc ('\n', f))
 {
   status = 1;


-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist   Stallman was right, but he's left :(
GNU Toolchain EngineerFSMatrix: It was he who freed the first of us
FSF & FSFLA board memberThe Savior shall return (true);

[committed] aarch64: Fix BE SVE mode punning involving floats

2020-01-16 Thread Richard Sandiford

The patterns used by aarch64_split_sve_subreg_move only support
integer modes, so if the widest mode is a float, we should get
its integer equivalent.

Fixes gcc.target/aarch64/sel_3.c for big-endian targets.

Tested on aarch64-linux-gnu and aarch64_be-none-elf.

Richard


2020-01-16  Richard Sandiford  

gcc/
* config/aarch64/aarch64.c (aarch64_split_sve_subreg_move): Apply
aarch64_sve_int_mode to each mode.
---
 gcc/config/aarch64/aarch64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index ac89cc1f9c9..600a238c1f4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -4827,8 +4827,8 @@ aarch64_split_sve_subreg_move (rtx dest, rtx ptrue, rtx 
src)
   /* Decide which REV operation we need.  The mode with wider elements
  determines the mode of the operands and the mode with the narrower
  elements determines the reverse width.  */
-  machine_mode mode_with_wider_elts = GET_MODE (dest);
-  machine_mode mode_with_narrower_elts = GET_MODE (src);
+  machine_mode mode_with_wider_elts = aarch64_sve_int_mode (GET_MODE (dest));
+  machine_mode mode_with_narrower_elts = aarch64_sve_int_mode (GET_MODE (src));
   if (GET_MODE_UNIT_SIZE (mode_with_wider_elts)
   < GET_MODE_UNIT_SIZE (mode_with_narrower_elts))
 std::swap (mode_with_wider_elts, mode_with_narrower_elts);

[PATCH 4/4 GCC11] rs6000: P9 D-form test cases

2020-01-16 Thread Kewen.Lin


gcc/testsuite/ChangeLog

2020-01-16  Kelvin Nilsen  
Kewen Lin  

* gcc.target/powerpc/p9-dform-0.c: New test.
* gcc.target/powerpc/p9-dform-1.c: New test.
* gcc.target/powerpc/p9-dform-2.c: New test.
* gcc.target/powerpc/p9-dform-3.c: New test.
* gcc.target/powerpc/p9-dform-4.c: New test.
* gcc.target/powerpc/p9-dform-generic.h: New test.
 gcc/testsuite/gcc.target/powerpc/p9-dform-0.c  | 43 +
 gcc/testsuite/gcc.target/powerpc/p9-dform-1.c  | 55 ++
 gcc/testsuite/gcc.target/powerpc/p9-dform-2.c  | 12 +
 gcc/testsuite/gcc.target/powerpc/p9-dform-3.c  | 15 ++
 gcc/testsuite/gcc.target/powerpc/p9-dform-4.c  | 12 +
 .../gcc.target/powerpc/p9-dform-generic.h  | 34 +
 6 files changed, 171 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-dform-0.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-dform-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-dform-2.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-dform-3.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-dform-4.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/p9-dform-generic.h

diff --git a/gcc/testsuite/gcc.target/powerpc/p9-dform-0.c 
b/gcc/testsuite/gcc.target/powerpc/p9-dform-0.c
new file mode 100644
index 000..01f8b69
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p9-dform-0.c
@@ -0,0 +1,43 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops" } */
+
+/* This test confirms that the dform instructions are selected in the
+   translation of this main program.  */
+
+extern void first_dummy ();
+extern void dummy (double sacc, int n);
+extern void other_dummy ();
+
+extern float opt_value;
+extern char *opt_desc;
+
+#define M 128
+#define N 512
+
+double x [N];
+double y [N];
+
+int main (int argc, char *argv []) {
+  double sacc;
+
+  first_dummy ();
+  for (int j = 0; j < M; j++) {
+
+sacc = 0.00;
+for (unsigned long long int i = 0; i < N; i++) {
+  sacc += x[i] * y[i];
+}
+dummy (sacc, N);
+  }
+  opt_value = ((float) N) * 2 * ((float) M);
+  opt_desc = "flops";
+  other_dummy ();
+}
+
+/* At time the dform optimization pass was merged with trunk, 12
+   lxv instructions were emitted in place of the same number of lxvx
+   instructions.  No need to require exactly this number, as it may
+   change when other optimization passes evolve.  */
+
+/* { dg-final { scan-assembler {\mlxv\M} } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-dform-1.c 
b/gcc/testsuite/gcc.target/powerpc/p9-dform-1.c
new file mode 100644
index 000..c6f1d76
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p9-dform-1.c
@@ -0,0 +1,55 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops" } */
+
+/* This test confirms that the dform instructions are selected in the
+   translation of this main program.  */
+
+extern void first_dummy ();
+extern void dummy (double sacc, int n);
+extern void other_dummy ();
+
+extern float opt_value;
+extern char *opt_desc;
+
+#define M 128
+#define N 512
+
+double x [N];
+double y [N];
+double z [N];
+
+int main (int argc, char *argv []) {
+  double sacc;
+
+  first_dummy ();
+  for (int j = 0; j < M; j++) {
+
+sacc = 0.00;
+for (unsigned long long int i = 0; i < N; i++) {
+  z[i] = x[i] * y[i];
+  sacc += z[i];
+}
+dummy (sacc, N);
+  }
+  opt_value = ((float) N) * 2 * ((float) M);
+  opt_desc = "flops";
+  other_dummy ();
+}
+
+
+
+/* At time the dform optimization pass was merged with trunk, 12
+   lxv instructions were emitted in place of the same number of lxvx
+   instructions.  No need to require exactly this number, as it may
+   change when other optimization passes evolve.  */
+
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+
+/* At time the dform optimization pass was merged with trunk, 6
+   stxv instructions were emitted in place of the same number of stxvx
+   instructions.  No need to require exactly this number, as it may
+   change when other optimization passes evolve.  */
+
+/* { dg-final { scan-assembler {\mstxv\M} } } */
+
diff --git a/gcc/testsuite/gcc.target/powerpc/p9-dform-2.c 
b/gcc/testsuite/gcc.target/powerpc/p9-dform-2.c
new file mode 100644
index 000..8752f3d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/p9-dform-2.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops" } */
+
+#define TYPE int
+#include "p9-dform-generic.h"
+
+/* The precise number of lxv and stxv instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* {

[PATCH 3/4 GCC11] IVOPTs Consider cost_step on different forms during unrolling

2020-01-16 Thread Kewen.Lin

gcc/ChangeLog

2020-01-16  Kewen Lin  

* tree-ssa-loop-ivopts.c (struct iv_group): New field dform_p.
(struct iv_cand): New field dform_p.
(struct ivopts_data): New field mark_dform_p.
(record_group): Initialize dform_p.
(mark_dform_groups): New function.
(find_interesting_uses): Call mark_dform_groups.
(add_candidate_1): Update dform_p if derived from dform_p group.
(determine_iv_cost): Increase cost by considering unroll factor.
(tree_ssa_iv_optimize_loop): Call estimate_unroll_factor, update 
mark_dform_p.

 gcc/tree-ssa-loop-ivopts.c | 84 +-
 1 file changed, 83 insertions(+), 1 deletion(-)

diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index ab52cbe..a0d29bb 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -429,6 +429,8 @@ struct iv_group
   struct iv_cand *selected;
   /* To indicate this is a doloop use group.  */
   bool doloop_p;
+  /* To indicate this group is D-form preferred.  */
+  bool dform_p;
   /* Uses in the group.  */
   vec vuses;
 };
@@ -470,6 +472,7 @@ struct iv_cand
   struct iv *orig_iv;  /* The original iv if this cand is added from biv with
   smaller type.  */
   bool doloop_p;   /* Whether this is a doloop candidate.  */
+  bool dform_p;/* Derived from one D-form preferred group.  */
 };
 
 /* Hashtable entry for common candidate derived from iv uses.  */
@@ -650,6 +653,10 @@ struct ivopts_data
 
   /* Whether the loop has doloop comparison use.  */
   bool doloop_use_p;
+
+  /* Whether the loop is likely to unroll and need to check and mark
+ D-form group for better step cost modeling.  */
+  bool mark_dform_p;
 };
 
 /* An assignment of iv candidates to uses.  */
@@ -1575,6 +1582,7 @@ record_group (struct ivopts_data *data, enum use_type 
type)
   group->related_cands = BITMAP_ALLOC (NULL);
   group->vuses.create (1);
   group->doloop_p = false;
+  group->dform_p = false;
 
   data->vgroups.safe_push (group);
   return group;
@@ -2724,6 +2732,59 @@ split_address_groups (struct ivopts_data *data)
 }
 }
 
+/* Go through all address type groups, check and mark D-form preferred.  */
+static void
+mark_dform_groups (struct ivopts_data *data)
+{
+  if (!data->mark_dform_p)
+return;
+
+  class loop *loop = data->current_loop;
+  bool dump_details = (dump_file && (dump_flags & TDF_DETAILS));
+  for (unsigned i = 0; i < data->vgroups.length (); i++)
+{
+  struct iv_group *group = data->vgroups[i];
+  if (address_p (group->type))
+   {
+ bool found = true;
+ for (unsigned j = 0; j < group->vuses.length (); j++)
+   {
+ struct iv_use *use = group->vuses[j];
+ gcc_assert (use->mem_type);
+ /* Ensure the step fit into D-form field.  */
+ if (TREE_CODE (use->iv->step) != INTEGER_CST
+ || !tree_fits_shwi_p (use->iv->step))
+   {
+ found = false;
+ if (dump_details)
+   fprintf (dump_file,
+" Group use %u.%u doesn't"
+"have constant step for D-form.\n",
+i, j);
+ break;
+   }
+ bool is_store
+   = TREE_CODE (gimple_assign_lhs (use->stmt)) == SSA_NAME;
+ if (!targetm.stride_dform_valid_p (TYPE_MODE (use->mem_type),
+tree_to_shwi (use->iv->step),
+TYPE_UNSIGNED (use->mem_type),
+is_store, loop->estimated_uf))
+   {
+ found = false;
+ if (dump_details)
+   fprintf (dump_file,
+" Group use %u.%u isn't"
+"suitable for D-form.\n",
+i, j);
+ break;
+   }
+   }
+ if (found)
+   group->dform_p = true;
+   }
+}
+}
+
 /* Finds uses of the induction variables that are interesting.  */
 
 static void
@@ -2755,6 +2816,8 @@ find_interesting_uses (struct ivopts_data *data)
 
   split_address_groups (data);
 
+  mark_dform_groups (data);
+
   if (dump_file && (dump_flags & TDF_DETAILS))
 {
   fprintf (dump_file, "\n:\n");
@@ -3137,6 +3200,7 @@ add_candidate_1 (struct ivopts_data *data, tree base, 
tree step, bool important,
   cand->important = important;
   cand->incremented_at = incremented_at;
   cand->doloop_p = doloop;
+  cand->dform_p = false;
   data->vcands.safe_push (cand);
 
   if (!poly_int_tree_p (step))
@@ -3173,7 +3237,11 @@ add_candidate_1 (struct ivopts_data *data, tree base, 
tree step, bool important,
 
   /* Relate candidate to the group for which it is added.  */
   if (use)
-

[PATCH 2/4 GCC11] Add target hook stride_dform_valid_p

2020-01-16 Thread Kewen.Lin


gcc/ChangeLog

2020-01-16  Kewen Lin  

* config/rs6000/rs6000.c (TARGET_STRIDE_DFORM_VALID_P): New macro.
(rs6000_stride_dform_valid_p): New function.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_STRIDE_DFORM_VALID_P): New hook.
* target.def (stride_dform_valid_p): New hook.

 gcc/config/rs6000/rs6000.c | 40 
 gcc/doc/tm.texi|  8 
 gcc/doc/tm.texi.in |  2 ++
 gcc/target.def | 13 -
 4 files changed, 62 insertions(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 0dabaa6..1e41fcf 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1657,6 +1657,9 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 #undef TARGET_PREDICT_DOLOOP_P
 #define TARGET_PREDICT_DOLOOP_P rs6000_predict_doloop_p
 
+#undef TARGET_STRIDE_DFORM_VALID_P
+#define TARGET_STRIDE_DFORM_VALID_P rs6000_stride_dform_valid_p
+
 #undef TARGET_HAVE_COUNT_REG_DECR_P
 #define TARGET_HAVE_COUNT_REG_DECR_P true
 
@@ -26272,6 +26275,43 @@ rs6000_predict_doloop_p (struct loop *loop)
   return true;
 }
 
+/* Return true if the memory access with mode MODE, signedness SIGNED_P and
+   store STORE_P with offset from 0 to (NUNROLL-1) * STRIDE are valid with
+   D-form instructions.  */
+
+static bool
+rs6000_stride_dform_valid_p (machine_mode mode, signed HOST_WIDE_INT stride,
+ bool signed_p, bool store_p, unsigned nunroll)
+{
+  static const HOST_WIDE_INT max_bound = 0x7fff;
+  static const HOST_WIDE_INT min_bound = -0x8000;
+
+  if (!IN_RANGE ((nunroll - 1) * stride, min_bound, max_bound))
+return false;
+
+  /* Check DQ-form for vector mode or float128 mode.  */
+  if (VECTOR_MODE_P (mode) || FLOAT128_VECTOR_P (mode))
+{
+  if (mode_supports_dq_form (mode) && !(stride & 0xF))
+   return true;
+  else
+   return false;
+}
+
+  /* Simply consider non VSX instructions.  */
+  if (mode == QImode || mode == HImode || mode == SFmode || mode == DFmode)
+return true;
+
+  /* lwz/stw is D-form, but lwa is DS-form.  */
+  if (mode == SImode && (!signed_p || store_p || !(stride & 0x03)))
+return true;
+
+  if (mode == DImode && !(stride & 0x03))
+return true;
+
+  return false;
+}
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-rs6000.h"
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 86ad278..0b8bc7c 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11669,6 +11669,14 @@ function version at run-time for a given set of 
function versions.
 body must be generated.
 @end deftypefn
 
+@deftypefn {Target Hook} bool TARGET_STRIDE_DFORM_VALID_P (machine_mode 
@var{mode}, signed HOST_WIDE_INT @var{stride}, bool @var{signed_p}, bool 
@var{store_p}, unsigned @var{nunroll})
+For a given memory access, check whether it is valid to put 0, @var{stride}
+, 2 * @var{stride}, ... , (@var{nunroll} - 1) to the instruction D-form
+displacement, with mode @var{mode}, signedness @var{signed_p} and store
+@var{store_p}.  Return true if valid.
+The default version of this hook returns false.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_PREDICT_DOLOOP_P (class loop *@var{loop})
 Return true if we can predict it is possible to use a low-overhead loop
 for a particular loop.  The parameter @var{loop} is a pointer to the loop.
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index fd9769e..e90d020 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7953,6 +7953,8 @@ to by @var{ce_info}.
 
 @hook TARGET_GENERATE_VERSION_DISPATCHER_BODY
 
+@hook TARGET_STRIDE_DFORM_VALID_P
+
 @hook TARGET_PREDICT_DOLOOP_P
 
 @hook TARGET_HAVE_COUNT_REG_DECR_P
diff --git a/gcc/target.def b/gcc/target.def
index f61c831..ee19a8d 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -4300,7 +4300,18 @@ DEFHOOK
  emits a @code{speculation_barrier} instruction if that is defined.",
 rtx, (machine_mode mode, rtx result, rtx val, rtx failval),
  default_speculation_safe_value)
- 
+
+DEFHOOK
+(stride_dform_valid_p,
+ "For a given memory access, check whether it is valid to put 0, 
@var{stride}\n\
+, 2 * @var{stride}, ... , (@var{nunroll} - 1) to the instruction D-form\n\
+displacement, with mode @var{mode}, signedness @var{signed_p} and store\n\
+@var{store_p}.  Return true if valid.\n\
+The default version of this hook returns false.",
+ bool, (machine_mode mode, signed HOST_WIDE_INT stride, bool signed_p,
+ bool store_p, unsigned nunroll),
+ NULL)
+
 DEFHOOK
 (predict_doloop_p,
  "Return true if we can predict it is possible to use a low-overhead loop\n\
-- 
2.7.4

[PATCH 1/4 GCC11] Add middle-end unroll factor estimation

2020-01-16 Thread Kewen.Lin

gcc/ChangeLog

2020-01-16  Kewen Lin  

* cfgloop.h (struct loop): New field estimated_uf.
* config/rs6000/rs6000.c (TARGET_LOOP_UNROLL_ADJUST_TREE): New macro.
(rs6000_loop_unroll_adjust_tree): New function.
* doc/tm.texi: Regenerate.
* doc/tm.texi.in (TARGET_LOOP_UNROLL_ADJUST_TREE): New hook.
* target.def (loop_unroll_adjust_tree): New hook.
* tree-ssa-loop-manip.c (decide_uf_const_iter): New function.
(decide_uf_runtime_iter): Likewise.
(decide_uf_stupid): Likewise.
(estimate_unroll_factor): Likewise.
* tree-ssa-loop-manip.h (estimate_unroll_factor): New declare.
* tree-ssa-loop.c (tree_average_num_loop_insns): New function.
* tree-ssa-loop.h (tree_average_num_loop_insns): New declare.
 gcc/cfgloop.h  |   3 +
 gcc/config/rs6000/rs6000.c |  16 ++-
 gcc/doc/tm.texi|   6 ++
 gcc/doc/tm.texi.in |   2 +
 gcc/target.def |   8 ++
 gcc/tree-ssa-loop-manip.c  | 254 +
 gcc/tree-ssa-loop-manip.h  |   3 +-
 gcc/tree-ssa-loop.c|  33 ++
 gcc/tree-ssa-loop.h|   2 +
 9 files changed, 324 insertions(+), 3 deletions(-)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index e3590d7..feceed6 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -232,6 +232,9 @@ public:
  Other values means unroll with the given unrolling factor.  */
   unsigned short unroll;
 
+  /* Like unroll field above, but it's estimated in middle-end.  */
+  unsigned short estimated_uf;
+
   /* If this loop was inlined the main clique of the callee which does
  not need remapping when copying the loop body.  */
   unsigned short owned_clique;
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 2995348..0dabaa6 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -1431,6 +1431,9 @@ static const struct attribute_spec 
rs6000_attribute_table[] =
 #undef TARGET_LOOP_UNROLL_ADJUST
 #define TARGET_LOOP_UNROLL_ADJUST rs6000_loop_unroll_adjust
 
+#undef TARGET_LOOP_UNROLL_ADJUST_TREE
+#define TARGET_LOOP_UNROLL_ADJUST_TREE rs6000_loop_unroll_adjust_tree
+
 #undef TARGET_INIT_BUILTINS
 #define TARGET_INIT_BUILTINS rs6000_init_builtins
 #undef TARGET_BUILTIN_DECL
@@ -5090,7 +5093,8 @@ rs6000_destroy_cost_data (void *data)
   free (data);
 }
 
-/* Implement targetm.loop_unroll_adjust.  */
+/* Implement targetm.loop_unroll_adjust.  Don't forget to update
+   loop_unroll_adjust_tree for any changes.  */
 
 static unsigned
 rs6000_loop_unroll_adjust (unsigned nunroll, struct loop *loop)
@@ -5109,6 +5113,16 @@ rs6000_loop_unroll_adjust (unsigned nunroll, struct loop 
*loop)
   return nunroll;
 }
 
+/* Implement targetm.loop_unroll_adjust_tree, strictly refers to
+   targetm.loop_unroll_adjust.  */
+
+static unsigned
+rs6000_loop_unroll_adjust_tree (unsigned nunroll, struct loop *loop)
+{
+  /* For now loop_unroll_adjust is simple, just invoke directly.  */
+  return rs6000_loop_unroll_adjust (nunroll, loop);
+}
+
 /* Handler for the Mathematical Acceleration Subsystem (mass) interface to a
library with vectorized intrinsics.  */
 
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 2244df4..86ad278 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11875,6 +11875,12 @@ is required only when the target has special 
constraints like maximum
 number of memory accesses.
 @end deftypefn
 
+@deftypefn {Target Hook} unsigned TARGET_LOOP_UNROLL_ADJUST_TREE (unsigned 
@var{nunroll}, class loop *@var{loop})
+This target hook is the same as @code{loop_unroll_adjust}, but it's for
+middle-end unroll factor estimation computation. See
+@code{loop_unroll_adjust} for the function description.
+@end deftypefn
+
 @defmac POWI_MAX_MULTS
 If defined, this macro is interpreted as a signed integer C expression
 that specifies the maximum number of floating point multiplications
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 52cd603..fd9769e 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -8008,6 +8008,8 @@ lists.
 
 @hook TARGET_LOOP_UNROLL_ADJUST
 
+@hook TARGET_LOOP_UNROLL_ADJUST_TREE
+
 @defmac POWI_MAX_MULTS
 If defined, this macro is interpreted as a signed integer C expression
 that specifies the maximum number of floating point multiplications
diff --git a/gcc/target.def b/gcc/target.def
index e705c5d..f61c831 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -2725,6 +2725,14 @@ number of memory accesses.",
  unsigned, (unsigned nunroll, class loop *loop),
  NULL)
 
+DEFHOOK
+(loop_unroll_adjust_tree,
+ "This target hook is the same as @code{loop_unroll_adjust}, but it's for\n\
+middle-end unroll factor estimation computation. See\n\
+@code{loop_unroll_adjust} for the function description.",
+ unsigned, (unsigned nunroll, class loop *loop),
+ NULL)
+
 /* True if X is a legitimate MODE-mode immediate operand.  */
 DEFHOOK
 (legitimate_constant_p,
diff --git a/gcc/tree-ssa-loop-manip.c

[PATCH 0/4 GCC11] IVOPTs consider step cost for different forms when unrolling

2020-01-16 Thread Kewen.Lin

Hi,

As we discussed in the thread
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00196.html
Original: https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00104.html,
I'm working to teach IVOPTs to consider D-form group access during unrolling.
The difference on D-form and other forms during unrolling is we can put the
stride into displacement field to avoid additional step increment. eg:

With X-form (uf step increment):
  ...
  LD A = baseA, X
  LD B = baseB, X
  ST C = baseC, X
  X = X + stride
  LD A = baseA, X
  LD B = baseB, X
  ST C = baseC, X
  X = X + stride
  LD A = baseA, X
  LD B = baseB, X
  ST C = baseC, X
  X = X + stride
  ...

With D-form (one step increment for each base):
  ...
  LD A = baseA, OFF
  LD B = baseB, OFF
  ST C = baseC, OFF
  LD A = baseA, OFF+stride
  LD B = baseB, OFF+stride
  ST C = baseC, OFF+stride
  LD A = baseA, OFF+2*stride
  LD B = baseB, OFF+2*stride
  ST C = baseC, OFF+2*stride
  ...
  baseA += stride * uf
  baseB += stride * uf
  baseC += stride * uf

Imagining that if the loop get unrolled by 8 times, then 3 step updates with
D-form vs. 8 step updates with X-form. Here we only need to check stride
meet D-form field requirement, since if OFF doesn't meet, we can construct
baseA' with baseA + OFF.

This patch set consists four parts:
 
  [PATCH 1/4 GCC11] Add middle-end unroll factor estimation

 Add unroll factor estimation in middle-end. It mainly refers to current
 RTL unroll factor determination in function decide_unrolling and its
 sub calls.  As Richard B. suggested, we probably can force unroll factor
 with this and avoid duplicate unroll factor calculation, but I think it
 need more benchmarking work and should be handled separately.

  [PATCH 2/4 GCC11] Add target hook stride_dform_valid_p 

 Add one target hook to determine whether the current memory access with
 the given mode, stride and other flags have available D-form supports.
 
  [PATCH 3/4 GCC11] IVOPTs Consider cost_step on different forms during 
unrolling

 Teach IVOPTs to identify address type iv group with D-form preferred,
 and flag dform_p of their derived iv cands.  Considering unroll factor,
 increase iv cost with (uf - 1) * cost_step if it's not a dform iv cand. 
 
  [PATCH 4/4 GCC11] rs6000: P9 D-form test cases

 Add some test cases, mainly copied from Kelvin's patch.

Bootstrapped and regress tested on powerpc64le-linux-gnu.
I'll take two weeks leave soon, please expect late responses.
Thanks a lot in advance!

BR,
Kewen



 gcc/cfgloop.h   |   3 +
 gcc/config/rs6000/rs6000.c  |  56 -
 gcc/doc/tm.texi |  14 +
 gcc/doc/tm.texi.in  |   4 ++
 gcc/target.def  |  21 ++-
 gcc/testsuite/gcc.target/powerpc/p9-dform-0.c   |  43 +
 gcc/testsuite/gcc.target/powerpc/p9-dform-1.c   |  55 +
 gcc/testsuite/gcc.target/powerpc/p9-dform-2.c   |  12 
 gcc/testsuite/gcc.target/powerpc/p9-dform-3.c   |  15 +
 gcc/testsuite/gcc.target/powerpc/p9-dform-4.c   |  12 
 gcc/testsuite/gcc.target/powerpc/p9-dform-generic.h |  34 +++
 gcc/tree-ssa-loop-ivopts.c  |  84 
+-
 gcc/tree-ssa-loop-manip.c   | 254 
+
 gcc/tree-ssa-loop-manip.h   |   3 +-
 gcc/tree-ssa-loop.c |  33 ++
 gcc/tree-ssa-loop.h |   2 +
 16 files changed, 640 insertions(+), 5 deletions(-)

Re: [PATCH 2/2] Uninitialized padding in struct _dep.

2020-01-16 Thread Richard Biener

On January 16, 2020 6:08:49 AM GMT+01:00, apin...@marvell.com wrote:
>From: Andrew Pinski 
>
>In struct _dep, there is an implicit padding of 4bits.  This
>bit-field padding is uninitialized when init_dep_1 is being called.
>This means we access uninitialized memory but never use it for
>anything.  Adding an unused bit-field field and initializing it
>in init_dep_1 will improve code generation also as we initialize
>the whole 32bits now rather than just part of it.
>
>OK?  Bootstrapped and tested on x86_64-linux-gnu.

OK. 

Richard. 

>Thanks,
>Andrew Pinski
>
>ChangeLog:
>* sched-int.h (_dep): Add unused bit-field field for the padding.
>* sched-deps.c (init_dep_1): Init unused field.
>
>Change-Id: I27000323e728f8a73189426e0b9a98c5235b8c55
>---
> gcc/sched-deps.c | 1 +
> gcc/sched-int.h  | 2 ++
> 2 files changed, 3 insertions(+)
>
>diff --git a/gcc/sched-deps.c b/gcc/sched-deps.c
>index 9182aba5588..331af5ffdb3 100644
>--- a/gcc/sched-deps.c
>+++ b/gcc/sched-deps.c
>@@ -101,6 +101,7 @@ init_dep_1 (dep_t dep, rtx_insn *pro, rtx_insn
>*con, enum reg_note type, ds_t ds
>   DEP_NONREG (dep) = 0;
>   DEP_MULTIPLE (dep) = 0;
>   DEP_REPLACE (dep) = NULL;
>+  dep->unused = 0;
> }
> 
> /* Init DEP with the arguments.
>diff --git a/gcc/sched-int.h b/gcc/sched-int.h
>index 833b552a340..a847f876e65 100644
>--- a/gcc/sched-int.h
>+++ b/gcc/sched-int.h
>@@ -238,6 +238,8 @@ struct _dep
>/* Cached cost of the dependency.  Make sure to update UNKNOWN_DEP_COST
>  when changing the size of this field.  */
>   int cost:20;
>+
>+  unsigned unused:4;
> };
> 
> #define UNKNOWN_DEP_COST ((int) ((unsigned int) -1 << 19))

Re: [PATCH 1/2] Fix uninitialized field in expand_operand.

2020-01-16 Thread Richard Biener

On January 16, 2020 6:08:48 AM GMT+01:00, apin...@marvell.com wrote:
>From: Andrew Pinski 
>
>Commit g:f96bf49a0 added the target field to expand_operand.
>But it leaves it uninitialized when doing a full initialization
>inside create_expand_operand.  This fixes the problem and improves
>the code generation inside create_expand_operand too.
>
>OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK. 

Richard. 

>ChangeLog:
>* optabs.h (create_expand_operand): Initialize target field also.
>
>Change-Id: Ib653fbfbb2b0709970db87fb94de14b59758bc6c
>---
> gcc/optabs.h | 1 +
> 1 file changed, 1 insertion(+)
>
>diff --git a/gcc/optabs.h b/gcc/optabs.h
>index 07bdc56586e..5bd19503a0a 100644
>--- a/gcc/optabs.h
>+++ b/gcc/optabs.h
>@@ -78,6 +78,7 @@ create_expand_operand (class expand_operand *op,
> {
>   op->type = type;
>   op->unsigned_p = unsigned_p;
>+  op->target = 0;
>   op->unused = 0;
>   op->mode = mode;
>   op->value = value;

Re: [PATCH] Fix value numbering dealing with reverse byte order

2020-01-16 Thread Richard Biener

On January 16, 2020 9:13:46 AM GMT+01:00, apin...@marvell.com wrote:
>From: Andrew Pinski 
>
>Hi,
>  While working on bit-field lowering pass, I came across this bug.
>The IR looks like:
>  VIEW_CONVERT_EXPR(var1) = _12;
>  _1 = BIT_FIELD_REF ;
>
>Where the BIT_FIELD_REF has REF_REVERSE_STORAGE_ORDER set on it
>and var1's type has TYPE_REVERSE_STORAGE_ORDER set on it.
>PRE/FRE would decided to prop _12 into the BFR statement
>which would produce wrong code.
>And yes _12 has the correct byte order already; bit-field lowering
>removes the implicit byte swaps in the IR and adds the explicity
>to make it easier optimize later on.
>
>This patch adds a check for storage_order_barrier_p on the lhs tree
>which returns true in the case where we had a reverse order with a VCE.
>
>gcc.c-torture/execute/pr86659-1.c was the testcase which showed the
>issue.
>
>OK?  Bootstrapped and tested on x86_64-linux-gnu with no regression.

OK. 

Richard. 

>Thanks,
>Andrew Pinski
>
>ChangeLog:
>* tree-ssa-sccvn.c(vn_reference_lookup_3): Check lhs for
>!storage_order_barrier_p.
>
>Change-Id: I7810de6fc4ff01e431033fa7f7f7b3ec95f67644
>---
> gcc/tree-ssa-sccvn.c | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
>index 3b27c50..4d13015 100644
>--- a/gcc/tree-ssa-sccvn.c
>+++ b/gcc/tree-ssa-sccvn.c
>@@ -2593,6 +2593,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse,
>void *data_,
>, , , );
>   if (base2
> && !reverse
>+&& !storage_order_barrier_p (lhs)
> && known_eq (maxsize2, size2)
> && multiple_p (size2, BITS_PER_UNIT)
> && multiple_p (offset2, BITS_PER_UNIT)
>@@ -2695,6 +2696,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse,
>void *data_,
>, , , );
>   tree def_rhs = gimple_assign_rhs1 (def_stmt);
>   if (!reverse
>+&& !storage_order_barrier_p (lhs)
> && known_size_p (maxsize2)
> && known_eq (maxsize2, size2)
> && adjust_offsets_for_equal_base_address (base, ,

Re: [PATCH] analyzer: fix handling of negative byte offsets (PR 93281)

2020-01-16 Thread Jakub Jelinek

On Wed, Jan 15, 2020 at 06:56:48PM -0500, David Malcolm wrote:
> gcc/analyzer/ChangeLog:
>   PR analyzer/93281
>   * region-model.cc
>   (region_model::convert_byte_offset_to_array_index): Convert
>   offset_cst to ssizetype before dividing by byte_size.
> ---
>  gcc/analyzer/region-model.cc | 5 -
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
> index 5c39be4fd7f..b62ddf82a40 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -6414,9 +6414,12 @@ region_model::convert_byte_offset_to_array_index (tree 
> ptr_type,
>/* This might not be a constant.  */
>tree byte_size = size_in_bytes (elem_type);
>  
> +  /* Ensure we're in a signed representation before doing the division.  
> */
> +  tree signed_offset_cst = fold_convert (ssizetype, offset_cst);
> +
>tree index
>   = fold_build2 (TRUNC_DIV_EXPR, integer_type_node,
> -offset_cst, byte_size);
> +signed_offset_cst, byte_size);

I'd say you want to fold_convert byte_size to ssizetype too.
Another thing is the integer_type_node, that looks wrong when you are
dividing ssizetype by ssizetype.  The fold-const.c folders are sensitive to
using incorrect types and type mismatches.
And, I think fold_build2 is wasteful, you only care if you can simplify it
to a constant (just constant or INTEGER_CST only?).
So, either use fold_binary (TRUNC_DIV_EXPR, ssizetype, offset_cst,
fold_convert (ssizetype, byte_size))
or, if you have checked that both arguments are INTEGER_CSTs, perhaps
int_const_binop or so.

>if (CONSTANT_CLASS_P (index))

And this would need to be if (index && TREE_CODE (index) == INTEGER_CST)
(or if you can handle other constants (index && CONSTANT_CLASS_P (index)).

>   return get_or_create_constant_svalue (index);

Jakub

[PATCH] Fix value numbering dealing with reverse byte order

2020-01-16 Thread apinski

From: Andrew Pinski 

Hi,
  While working on bit-field lowering pass, I came across this bug.
The IR looks like:
  VIEW_CONVERT_EXPR(var1) = _12;
  _1 = BIT_FIELD_REF ;

Where the BIT_FIELD_REF has REF_REVERSE_STORAGE_ORDER set on it
and var1's type has TYPE_REVERSE_STORAGE_ORDER set on it.
PRE/FRE would decided to prop _12 into the BFR statement
which would produce wrong code.
And yes _12 has the correct byte order already; bit-field lowering
removes the implicit byte swaps in the IR and adds the explicity
to make it easier optimize later on.

This patch adds a check for storage_order_barrier_p on the lhs tree
which returns true in the case where we had a reverse order with a VCE.

gcc.c-torture/execute/pr86659-1.c was the testcase which showed the issue.

OK?  Bootstrapped and tested on x86_64-linux-gnu with no regression.

Thanks,
Andrew Pinski

ChangeLog:
* tree-ssa-sccvn.c(vn_reference_lookup_3): Check lhs for
!storage_order_barrier_p.

Change-Id: I7810de6fc4ff01e431033fa7f7f7b3ec95f67644
---
 gcc/tree-ssa-sccvn.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 3b27c50..4d13015 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -2593,6 +2593,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
*data_,
 , , , );
   if (base2
  && !reverse
+ && !storage_order_barrier_p (lhs)
  && known_eq (maxsize2, size2)
  && multiple_p (size2, BITS_PER_UNIT)
  && multiple_p (offset2, BITS_PER_UNIT)
@@ -2695,6 +2696,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
*data_,
 , , , );
   tree def_rhs = gimple_assign_rhs1 (def_stmt);
   if (!reverse
+ && !storage_order_barrier_p (lhs)
  && known_size_p (maxsize2)
  && known_eq (maxsize2, size2)
  && adjust_offsets_for_equal_base_address (base, ,
-- 
1.8.3.1

Re: [C++ coroutines 4/6] Middle end expanders and transforms.

2020-01-16 Thread Iain Sandoe

Bin.Cheng  wrote:

>>> +   gassign *get_res = gimple_build_assign (lhs, done);
>>> +   gsi_replace (gsi, get_res, true);
>>> +   *handled_ops_p = true;
>>> + }
>>> + break;
>>> +   }
>>> +}
>>> +  return NULL_TREE;
>>> +}
>>> +
>>> +/* Main entry point for lowering coroutine FE builtins.  */
>>> +
>>> +static unsigned int
>>> +execute_lower_coro_builtins (void)
>>> +{
>>> +  struct walk_stmt_info wi;
>>> +  gimple_seq body;
>>> +
>>> +  body = gimple_body (current_function_decl);
>>> +  memset (, 0, sizeof (wi));
>>> +  walk_gimple_seq_mod (, lower_coro_builtin, NULL, );
>>> +  gimple_set_body (current_function_decl, body);
>> 
>> it would be nice to elide the function walk for functions not
>> containing any CO* stuff (you noted that below for other parts).
>> We can spend a bit in struct function noting functions with
>> coroutine code inside and set the bit from frontends or from
>> the gimplifier for example.  Since it's behind the flag_coroutines
>> paywall this can be addressed as followup.
> 
> Yes, this bit flag is necessary for following optimization passes, I
> wonder which bit you would suggest?

it’s implemented in the patches (that should be merged soon, just retesing
after the git transition).
Uses a flag bit in the function struct. 

see:

https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00435.html

diff --git a/gcc/function.h b/gcc/function.h
index 496d3f728c..1ee8ed3de5 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -418,6 +418,9 @@ struct GTY(()) function {
   /* Set when the function was compiled with generation of debug
  (begin stmt, inline entry, ...) markers enabled.  */
   unsigned int debug_nonbind_markers : 1;
+
+  /* Set if this is a coroutine-related function.  */
+  unsigned int coroutine_component : 1;
 };

81 matches

Mail list logo