[PATCH] sim: depend on gnulib

2021-05-07 Thread Mike Frysinger via Gcc-patches
We're going to start using gnulib in the sim, so make sure it exists.

ChangeLog:

* Makefile.def: Add configure-sim dependency on all-gnulib.
* Makefile.in: Regenerated.
---
 Makefile.def | 1 +
 Makefile.in  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/Makefile.def b/Makefile.def
index df8ccfb24c3d..2029ff3a72af 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -541,6 +541,7 @@ dependencies = { module=install-strip-sid; 
on=install-strip-tcl; };
 dependencies = { module=install-sid; on=install-tk; };
 dependencies = { module=install-strip-sid; on=install-strip-tk; };
 
+dependencies = { module=configure-sim; on=all-gnulib; };
 dependencies = { module=configure-sim; on=configure-intl; };
 dependencies = { module=all-sim; on=all-intl; };
 dependencies = { module=all-sim; on=all-libiberty; };
diff --git a/Makefile.in b/Makefile.in
index 047be0255e26..1a08f3bd376a 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -61521,6 +61521,7 @@ install-sid: maybe-install-tcl
 install-strip-sid: maybe-install-strip-tcl
 install-sid: maybe-install-tk
 install-strip-sid: maybe-install-strip-tk
+configure-sim: maybe-all-gnulib
 all-sim: maybe-all-readline
 all-fastjar: maybe-all-build-texinfo
 all-libctf: all-libiberty
-- 
2.31.1



Re: [PATCH/RFC] Add a new memory gathering optimization for loop (PR98598)

2021-05-07 Thread Bin.Cheng via Gcc-patches
On Fri, Apr 30, 2021 at 1:20 PM Feng Xue OS via Gcc-patches
 wrote:
>
> >> This patch implements a new loop optimization according to the proposal
> >> in RFC given at
> >> https://gcc.gnu.org/pipermail/gcc/2021-January/234682.html.
> >> So do not repeat the idea in this mail. Hope your comments on it.
> >
> > With the caveat that I'm not an optimization expert (but no one else
> > seems to have replied), here are some thoughts.
> >
> > [...snip...]
> >
> >> Subject: [PATCH 1/3] mgo: Add a new memory gathering optimization for loop
> >>  [PR98598]
> >
> > BTW, did you mean to also post patches 2 and 3?
> >
>
> Not yet, but they are ready. Since this is a rather special optimization that
> uses the heap as temporary storage, which is not a common technique in gcc, we
> do not know the community's basic attitude towards it. So only the first patch
> was sent out for initial comments, since it implements a generic MGO framework
> and is complete and self-contained. The other 2 patches just add some
> enhancements for a specific code pattern and a dynamic alias check. If this
> proposal is accepted in principle, we will submit the other 2 for review.
>
> >
> >> In nested loops, if scattered memory accesses inside inner loop remain
> >> unchanged in outer loop, we can sequentialize these loads by caching
> >> their values into a temporary memory region at the first time, and
> >> reuse the caching data in following iterations. This way can improve
> >> efficiency of cpu cache subsystem by reducing its unpredictable activities.
> >
> > I don't think you've cited any performance numbers so far.  Does the
> > optimization show a measurable gain on some benchmark(s)?  e.g. is this
> > ready to run SPEC yet, and how does it do?
>
> Yes, we have done that. Some real applications gain a minor improvement of
> several percentage points. To be specific, we also get a major improvement
> of more than 30% for a certain benchmark in SPEC2017.
Hi Feng Xue,
Could you help point out which bench it is?  I failed to observe
improvement in spec2017 on local x86 machine.  I am running with O3
level.

Thanks,
bin
>
> >
> >> To illustrate what the optimization will do, two pieces of pseudo code,
> >> before and after transformation, are given. Suppose all loads and
> >> "iter_count" are invariant in outer loop.
> >>
> >> From:
> >>
> >>   outer-loop ()
> >> {
> >>   inner-loop (iter, iter_count)
> >> {
> >>   Type1 v1 = LOAD (iter);
> >>   Type2 v2 = LOAD (v1);
> >>   Type3 v3 = LOAD (v2);
> >>   ...
> >>   iter = NEXT (iter);
> >> }
> >> }
> >>
> >> To:
> >>
> >>   typedef struct cache_elem
> >> {
> >>   bool   init;
> >>   Type1  c_v1;
> >>   Type2  c_v2;
> >>   Type3  c_v3;
> >
> > Putting the "bool init;" at the front made me think "what about
> > packing?" but presumably the idea is that every element is accessed in
> > order, so it presumably benefits speed to have "init" at the top of the
> > element, right?
>
> Yes, the layout of the struct could be optimized in terms of size by
> some means, such as:
>   o. packing "init" into a padding hole after a certain field
>   o. if a certain field is a pointer type, that field can take the role of
>   "init" (Non-NULL implies "initialized")
> For now this simple scheme is straightforward, and it could be enhanced
> in various aspects later.
>
> >> } cache_elem;
> >>
> >>   cache_elem *cache_arr = calloc (iter_count, sizeof (cache_elem));
>
> > What if the allocation fails at runtime?  Do we keep an unoptimized
> > copy of the nested loops around as a fallback and have an unlikely
> > branch to that copy?
>
> Yes, we should, but in a different way: a flag is added to the original
> nested loop to switch at runtime between optimized and unoptimized
> execution. This definitely incurs a runtime cost, but avoids possible
> code size bloat. A better approach, left as a TODO, is to apply the
> dynamic switch for large loops and loop cloning for small ones.
>
> > I notice that you're using calloc, presumably to clear all of the
> > "init" flags (and the whole buffer).
> >
> > FWIW, this feels like a case where it would be nice to have a thread-
> > local heap allocation, perhaps something like an obstack implemented in
> > the standard library - but that's obviously scope creep for this.
>
> Yes, that's good, especially for many-thread applications.
>
> > Could it make sense to use alloca for small allocations?  (or is that
> > scope creep?)
>
> We did consider using alloca as you said.  But if we cannot determine an
> upper limit for a non-constant size, we have to place the alloca inside a loop
> that encloses the nested loop. Without a corresponding free operation, this
> kind of alloca-in-loop might cause stack overflow. So it becomes another
> TODO.
>
> >>   outer-loop ()
> >> {
> >>   size_t cache_idx = 0;
> >>
> >>   inner-loop (iter, iter_count)
> >> {
> >>   if 
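
The quoted pseudo code is cut off at this point. As a rough, self-contained
C++ sketch of the idea discussed above (not the code the patch generates),
the following caches the result of a chain of dependent loads on the first
outer iteration and reads it back sequentially afterwards. The types Value,
Record and CacheElem and the functions sum_plain and sum_gathered are
hypothetical names chosen for illustration. The sketch assumes that the loads
and the inner trip count are invariant in the outer loop, and it falls back
to the plain loop if calloc fails.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical data layout: each item points at a record, which points at a value.
struct Value { long payload; };
struct Record { const Value *val; };

// Unoptimized form: every outer iteration repeats the dependent (scattered) loads.
long sum_plain(const Record *const *items, std::size_t iter_count, int outer_count) {
  long total = 0;
  for (int i = 0; i < outer_count; ++i)
    for (std::size_t j = 0; j < iter_count; ++j)
      total += items[j]->val->payload;
  return total;
}

// One cache slot per inner iteration; "init" records whether the slot is filled.
struct CacheElem {
  bool init;
  long c_payload;
};

// Gathered form: the first visit fills the slot, later visits read it sequentially.
long sum_gathered(const Record *const *items, std::size_t iter_count, int outer_count) {
  // calloc zeroes the buffer, so every "init" flag starts out false.
  CacheElem *cache =
      static_cast<CacheElem *>(std::calloc(iter_count, sizeof(CacheElem)));
  if (!cache)
    return sum_plain(items, iter_count, outer_count);  // allocation failed: fall back

  long total = 0;
  for (int i = 0; i < outer_count; ++i) {
    for (std::size_t j = 0; j < iter_count; ++j) {
      CacheElem &e = cache[j];
      if (!e.init) {
        e.c_payload = items[j]->val->payload;  // scattered loads happen only once
        e.init = true;
      }
      total += e.c_payload;
    }
  }
  std::free(cache);
  return total;
}
```

Here the fallback is a separate copy of the loop, which is the loop-clone
variant mentioned in the thread; the patch as described uses a runtime flag
inside the original loop instead.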

Re: PowerPC64 ELFv1 -fpatchable-function-entry

2021-05-07 Thread Segher Boessenkool
Hi!

On Fri, May 07, 2021 at 12:19:50PM +0930, Alan Modra wrote:
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -6866,6 +6866,26 @@ default_elf_asm_named_section (const char *name, 
> unsigned int flags,
>*f = '\0';
>  }
>  
> +  char func_label[256];
> +  if (flags & SECTION_LINK_ORDER)
> +{
> +  static int recur;
> +  if (recur)
> + gcc_unreachable ();

That is written
  gcc_assert (!recur);
and no "else" after it please.

> + {
> +   ++recur;
> +   section *save_section = in_section;
> +   static int func_code_labelno;
> +   switch_to_section (function_section (decl));
> +   ++func_code_labelno;
> +   ASM_GENERATE_INTERNAL_LABEL (func_label, "LPFC", func_code_labelno);
> +   ASM_OUTPUT_LABEL (asm_out_file, func_label);
> +   switch_to_section (save_section);
> +   --recur;
> + }

See the other mail.  You could just write
  recur = true;
etc.?  That avoids unintentionally overflowing as well.
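
A small stand-alone C++ sketch of the shape suggested here: a boolean guard
checked with an assertion, no "else" branch, and a postfix increment for the
label counter. Plain assert stands in for gcc_assert, and
emit_function_label, the emitted vector and the ".LPFC<n>" string format are
hypothetical stand-ins for the section-switching and label-output code in the
hunk.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stands in for the assembler output stream in this sketch.
static std::vector<std::string> emitted;

// Hypothetical stand-in for the label-emitting part of the quoted hunk.
void emit_function_label() {
  static bool recur = false;        // a flag cannot overflow, unlike a counter
  static int func_code_labelno = 0;

  assert(!recur);                   // plays the role of gcc_assert (!recur)
  recur = true;

  func_code_labelno++;              // bump before anything that might re-enter
  emitted.push_back(".LPFC" + std::to_string(func_code_labelno));

  recur = false;
}
```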

>   {
> -   tree id = DECL_ASSEMBLER_NAME (decl);
> -   ultimate_transparent_alias_target ();
> -   const char *name = IDENTIFIER_POINTER (id);
> -   name = targetm.strip_name_encoding (name);
> -   fprintf (asm_out_file, ",%s", name);
> +   fputc (',', asm_out_file);

Please don't use fputc and friends, just use fprintf.  The compiler will
make that fputc if that is a good idea.

Looks good with those things tweaked.


Segher


Re: PowerPC64 ELFv1 -fpatchable-function-entry

2021-05-07 Thread Segher Boessenkool
On Fri, May 07, 2021 at 08:47:02AM -0500, will schmidt wrote:
> On Fri, 2021-05-07 at 12:19 +0930, Alan Modra via Gcc-patches wrote:
> > --- a/gcc/varasm.c
> > +++ b/gcc/varasm.c
> > @@ -6866,6 +6866,26 @@ default_elf_asm_named_section (const char
> > *name, unsigned int flags,
> >*f = '\0';
> >  }
> > 
> > +  char func_label[256];
> > +  if (flags & SECTION_LINK_ORDER)
> > +{
> > +  static int recur;
> > +  if (recur)
> > +   gcc_unreachable ();
> 
> Interesting..   Is there any anticipation of re-entry or parallel runs
> through this function that requires the recur lock/protection?

Not parallel runs :-)  But:

> > +  else
> > +   {
> > + ++recur;
> > + section *save_section = in_section;
> > + static int func_code_labelno;
> > + switch_to_section (function_section (decl));

This could in theory call us again.  That should not be a problem, if

> > + ++func_code_labelno;

(Please use postfix increments btw)

...this is done *before* the call, so that we get two different labels.


Segher


Re: Revert "rs6000: Avoid -fpatchable-function-entry* regressions on powerpc64 be [PR98125]"

2021-05-07 Thread Segher Boessenkool
On Fri, May 07, 2021 at 12:19:51PM +0930, Alan Modra wrote:
> This reverts commit b680b9049737198d010e49cf434704c6a6ed2b3f now
> that the PowerPC64 ELFv1 regression is fixed properly.

This is okay when the rest goes in.  Do it in a bisectable order if
possible?  If that is easy :-)


Segher


Re: PR98125, PowerPC64 -fpatchable-function-entry

2021-05-07 Thread Segher Boessenkool
On Fri, May 07, 2021 at 12:19:49PM +0930, Alan Modra wrote:
> This series of patches fixes -fpatchable-function-entry on PowerPC64
> ELFv1 so that SECTION_LINK_ORDER (.section 'o' arg) is now supported,
> and on PowerPC64 ELFv2 to not break the global entry code.
> 
> Bootstrapped powerpc64le-linux and x86_64-linux all langs.  I did see
> one regression on both targets, libgo runtime/pprof.  It's unclear to
> me what that means.

Probably nothing?  It is one of the tests that fail regularly.


Segher


Re: [PATCH, rs6000] Add ALTIVEC_REGS as pressure class

2021-05-07 Thread Segher Boessenkool
Hi!

On Fri, May 07, 2021 at 10:53:31AM -0500, Pat Haugen wrote:
> Code that has heavy register pressure on Altivec registers can suffer from
> over-aggressive scheduling during sched1, which then leads to increased
> register spill. This is due to the fact that registers that prefer
> ALTIVEC_REGS are currently assigned an allocno class of VSX_REGS. This then
> misleads the scheduler to think there are 64 regs available, when in reality
> there are only 32 Altivec regs. This patch fixes the problem by assigning an
> allocno class of ALTIVEC_REGS and adding ALTIVEC_REGS as a pressure class.

> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/fold-vec-insert-float-p9.c: Adjust instruction 
> counts.

(line too long)

> +case VSX_REGS:
> +  if (best_class == ALTIVEC_REGS)
> + return ALTIVEC_REGS;

Should this be under just this case, or should we do this always?
Maybe change the big switch to be on best_class instead of on
allocno_class?

> --- a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
> +++ b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
> @@ -62,6 +62,6 @@ rlnm_test_2 (vector unsigned long long x, vector unsigned 
> long long y,
>  /* { dg-final { scan-assembler-times "vextsb2d" 1 } } */
>  /* { dg-final { scan-assembler-times "vslw" 1 } } */
>  /* { dg-final { scan-assembler-times "vsld" 1 } } */
> -/* { dg-final { scan-assembler-times "xxlor" 3 } } */
> +/* { dg-final { scan-assembler-times "xxlor" 2 } } */
>  /* { dg-final { scan-assembler-times "vrlwnm" 2 } } */
>  /* { dg-final { scan-assembler-times "vrldnm" 2 } } */

So what is this replaced with?  Was it an "xxlmr" and it is just
unnecessary now?

The patch is okay for trunk.  Thanks!


Segher


Re: [committed] amdgcn: disable TImode

2021-05-07 Thread Andrew Stubbs

Indeed, libgomp fails to build:

configure: error: unsupported system, cannot find Fortran int kind=16, 
needed for omp_depend_kind


I've reverted the patch for now. We'll just have to put up with a lot of 
new test failures in the stand-alone toolchain so that the offload 
toolchain will at least build.


I suspect we'll see some real failures here soon though.

Andrew

On 07/05/2021 23:45, Andrew Stubbs wrote:

On 07/05/2021 18:11, Tobias Burnus wrote:

On 07.05.21 18:35, Andrew Stubbs wrote:


TImode has always been a problem on amdgcn, and now it is causing many
new test failures, so I'm disabling it.


Does this still work with libgomp?

The patch sounds as if it might cause problems, but on the other hand,
I assume you did test it? To recall:

 >

The problem is that OpenMP's depobj as implemented in GCC has
sizeof() = 2*sizeof(void*) and is implemented as a two-element struct 
in C/C++.

But the OpenMP spec mandates that it is an integer type in Fortran, i.e.
integer(kind=omp_depend_kind).


I was focussing on getting the raw amdgcn toolchain to work 
again. I had forgotten about that little detail with Fortran. :-(


Is there another way we can fix this without implementing all of TImode 
support or tracking down every place in the middle-end that wants to use 
TImode without checking the optab?



Combining the impl choice and the type requirements that means that
on 64bit systems, this requires __int128 support, cf. commit
https://gcc.gnu.org/g:8d0b2b33748014ee57973c1d7bc9fd7706bb3da9
and https://gcc.gnu.org/PR96306

(Side note: The definition in OpenMP is bad - it should have been
some opaque derived type but that's a mistake done in OpenMP 5.0.)


:-(

Andrew




Re: [committed] libstdc++: Implement LWG 1203 for rvalue iostreams

2021-05-07 Thread Jonathan Wakely via Gcc-patches

On 06/05/21 18:28 +0100, Jonathan Wakely wrote:

On 06/05/21 18:09 +0100, Jonathan Wakely wrote:

On 06/05/21 17:55 +0200, Stephan Bergmann wrote:

On 30/04/2021 15:48, Jonathan Wakely via Libstdc++ wrote:

This implements the resolution of LWG 1203 so that the constraints for
rvalue stream insertion/extraction are simpler, and the return type is
the original rvalue stream type not its base class.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/std/istream (operator>>(Istream&&, x&)): Simplify, as
per LWG 1203.
* include/std/ostream (operator<<(Ostream&&, const x&)):
Likewise.
* 
testsuite/27_io/basic_istream/extractors_character/char/lwg2499_neg.cc:
Adjust dg-error pattern.
* 
testsuite/27_io/basic_istream/extractors_character/wchar_t/lwg2499_neg.cc:
Likewise.
* testsuite/27_io/basic_istream/extractors_other/char/4.cc: Define
is_extractable trait to replace std::__is_extractable. Make it
work with rvalue streams as well as lvalues, to replace f() and
g() helper functions.
* testsuite/27_io/basic_istream/extractors_other/wchar_t/4.cc:
Likewise.
* testsuite/27_io/basic_ostream/inserters_other/char/6.cc:
Define is_insertable trait to replace std::__is_insertable. Make
it work with rvalue streams as well as lvalues, to replace f()
and g() helper functions.
* testsuite/27_io/basic_ostream/inserters_other/wchar_t/6.cc:
Likewise.
* testsuite/27_io/filesystem/path/io/dr2989.cc: Prune additional
errors from new constraints.
* testsuite/27_io/rvalue_streams-2.cc: Remove PR 80675 checks,
which are no longer expected to compile.
* testsuite/27_io/rvalue_streams.cc: Adjust existing test.
Verify LWG 1203 changes.

Tested powerpc64le-linux. Committed to trunk.


FWIW, it looks like this is causing issues for Clang (at least 
Clang 11 and recent Clang 13 trunk):



$ cat test.cc
#include 
int i = 1 << std::ios::erase_event;


(i.e., using an enum in namespace std),


$ clang++ --gcc-toolchain=~/gcc/trunk/inst -fsyntax-only test.cc
In file included from test.cc:1:
~/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/12.0.0/../../../../include/c++/12.0.0/ostream:727:33:
 error: cannot initialize a parameter of type 'std::ios_base *' with an rvalue 
of type 'int *'
 __rval_streamable(ios_base* = (_Tp*)nullptr);
 ^ ~
~/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/12.0.0/../../../../include/c++/12.0.0/ostream:733:25:
 note: in instantiation of default function argument expression for 
'__rval_streamable' required here
typename = decltype(std::__rval_streamable<_Os>()
^
~/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/12.0.0/../../../../include/c++/12.0.0/ostream:748:12:
 note: in instantiation of default argument for '__rvalue_stream_insertion_t' required here
 inline __rvalue_stream_insertion_t<_Ostream, _Tp>
^~
test.cc:2:11: note: while substituting deduced template arguments into function 
template 'operator<<' [with _Ostream = int, _Tp = std::ios_base::event]
int i = 1 << std::ios::erase_event;
   ^
~/gcc/trunk/inst/lib/gcc/x86_64-pc-linux-gnu/12.0.0/../../../../include/c++/12.0.0/ostream:727:33:
 note: passing argument to parameter here
 __rval_streamable(ios_base* = (_Tp*)nullptr);
 ^
1 error generated.


It looks like the failed conversion with the default argument is not
in the immediate context, so is an error not a substitution failure.
Clang is probably right, so I'll change it.
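
To illustrate the distinction, here is a stand-alone C++11 sketch (not the
libstdc++ code) in which the convertibility check sits in the template
parameter list, so a failed check is an ordinary substitution failure rather
than a hard error. The names rval_streamable, is_rval_streamable and
void_type are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <type_traits>

// Local void_t equivalent so the sketch does not depend on C++17.
template <typename...> struct make_void { typedef void type; };
template <typename... Ts> using void_type = typename make_void<Ts...>::type;

// The constraint is a default template argument, i.e. part of the immediate
// context: if it fails, the declaration is silently discarded.
template <typename T,
          typename = typename std::enable_if<
              !std::is_lvalue_reference<T>::value &&
              std::is_convertible<T *, std::ios_base *>::value>::type>
T &rval_streamable();

// Detection trait: true only when rval_streamable<T>() is a valid expression.
template <typename T, typename = void>
struct is_rval_streamable : std::false_type {};

template <typename T>
struct is_rval_streamable<T, void_type<decltype(rval_streamable<T>())>>
    : std::true_type {};

// A stream type passes, while int is rejected without any diagnostic.
static_assert(is_rval_streamable<std::ostringstream>::value, "stream accepted");
static_assert(!is_rval_streamable<int>::value, "int rejected quietly");
```

By contrast, a conversion that fails inside a default function argument, as
in the declaration quoted in the Clang output above, is only checked when
that argument is instantiated, which is outside the immediate context.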

The reason I did it that way was to save instantiating
std::is_convertible but also because it seemed like an easy way to
avoid confusing diagnostics that say:

error: forming pointer to reference type 'std::basic_ostream&'

for overload resolution failures for operator<< (because those
diagnostics are already hundreds of lines long and confusing enough
already).

This seems to work:

--- a/libstdc++-v3/include/std/ostream
+++ b/libstdc++-v3/include/std/ostream
@@ -722,9 +722,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __rval_streamable();
#else
 template>>>
+  typename = _Require<__not_>,
+  is_convertible<_Tp*, ios_base*>>>
   _Tp&
-__rval_streamable(ios_base* = (_Tp*)nullptr);
+__rval_streamable();
#endif

 // SFINAE helper to check constraints for operator<<(Ostream&&, const T&).


I'll finish testing that.


Actually, if the function parameter can't be used for the convertible
check then there's no benefit to that function at all. It's simpler to
just put all the constraints directly on the alias template that also
checks the operator<< expression:

 // SFINAE helper to check constraints for operator<<(Ostream&&, const T&).
 // If the constraints are satisfied, it is an alias for Ostream&&.
#if __cpp_lib_concepts
 // Use 

Re: [committed] amdgcn: disable TImode

2021-05-07 Thread Andrew Stubbs

On 07/05/2021 18:11, Tobias Burnus wrote:

On 07.05.21 18:35, Andrew Stubbs wrote:


TImode has always been a problem on amdgcn, and now it is causing many
new test failures, so I'm disabling it.


Does this still work with libgomp?

The patch sounds as if it might cause problems, but on the other hand,
I assume you did test it? To recall:

>

The problem is that OpenMP's depobj as implemented in GCC has
sizeof() = 2*sizeof(void*) and is implemented as a two-element struct in 
C/C++.

But the OpenMP spec mandates that it is an integer type in Fortran, i.e.
integer(kind=omp_depend_kind).


I was focussing on getting the raw amdgcn toolchain to work 
again. I had forgotten about that little detail with Fortran. :-(


Is there another way we can fix this without implementing all of TImode 
support or tracking down every place in the middle-end that wants to use 
TImode without checking the optab?



Combining the impl choice and the type requirements that means that
on 64bit systems, this requires __int128 support, cf. commit
https://gcc.gnu.org/g:8d0b2b33748014ee57973c1d7bc9fd7706bb3da9
and https://gcc.gnu.org/PR96306

(Side note: The definition in OpenMP is bad - it should have been
some opaque derived type but that's a mistake done in OpenMP 5.0.)


:-(

Andrew


[PATCH] Use the dominator tree when checking for non-null.

2021-05-07 Thread Andrew MacLeod via Gcc-patches
The current non-null processing only sets a bit for the block which 
contains a non-null setting event.


We should check the dom tree to see if a predecessor dom block sets 
non-null, otherwise we can miss it.  We don't need to do this within the 
propagation engines like the on-entry cache.  They will propagate the 
range as appropriate without looking at the dom tree. So it is just at the 
client-level query that this is needed; make it the default.
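
A stand-alone C++ sketch of the lookup described above, using a miniature CFG
instead of GCC's data structures. MiniCfg and its fields are hypothetical:
idom[i] holds the immediate dominator of block i (-1 for the entry block) and
non_null[i] is the per-block bit for a single name. The walk stops at the
immediate dominator of the defining block, or at the top of the tree,
whichever comes first.

```cpp
#include <cassert>
#include <vector>

// Hypothetical miniature CFG for one SSA name.
struct MiniCfg {
  std::vector<int> idom;       // immediate dominator per block, -1 for entry
  std::vector<bool> non_null;  // true if the block contains a non-null event
};

// Return true if the name is known non-null in block bb.  With search_dom
// false, only bb's own bit is consulted, as the propagation engines want.
bool non_null_deref_p(const MiniCfg &cfg, int bb, int def_bb,
                      bool search_dom = true) {
  if (cfg.non_null[bb])
    return true;
  if (!search_dom)
    return false;

  // Walk up the dominator tree until we pass the definition block.
  const int stop = def_bb >= 0 ? cfg.idom[def_bb] : -1;
  for (int b = bb; b >= 0 && b != stop; b = cfg.idom[b])
    if (cfg.non_null[b])
      return true;
  return false;
}
```

For example, with blocks 0 -> 1 -> {2, 3} and a dereference only in block 1,
a query for block 2 succeeds with the dominator search and fails without it.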


Later, the non-null processing will be replaced, but at least get it 
right for now.


Bootstraps on  x86_64-pc-linux-gnu with no testsuite regressions.

Pushed.

Andrew

commit 37935ca2f4b5a89322a24620fbec3f3c9f2845fd
Author: Andrew MacLeod 
Date:   Tue Apr 27 08:44:46 2021 -0400

When searching for non-null, check the dominator tree.

The non-null bitmap only indicates which blocks non-null setting occurs.
Generalized queries need to search the dom tree, whereas propagation
engines only need to know the current block.  Add a flag for this purpose.

* gimple-range-cache.cc (non_null_ref::non_null_deref_p): Search
dominator tree if available and requested.
(ranger_cache::ssa_range_in_bb): Don't search dom tree here.
(ranger_cache::fill_block_cache): Don't search dom tree here either.
* gimple-range-cache.h (non_null_deref_p): Add dom_search param.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 38e4fe1c7c0..9b401927bd6 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -48,9 +48,10 @@ non_null_ref::~non_null_ref ()
 
 // Return true if NAME has a non-null dereference in block bb.  If this is the
 // first query for NAME, calculate the summary first.
+// If SEARCH_DOM is true, the search the dominator tree as well.
 
 bool
-non_null_ref::non_null_deref_p (tree name, basic_block bb)
+non_null_ref::non_null_deref_p (tree name, basic_block bb, bool search_dom)
 {
   if (!POINTER_TYPE_P (TREE_TYPE (name)))
 return false;
@@ -59,7 +60,24 @@ non_null_ref::non_null_deref_p (tree name, basic_block bb)
   if (!m_nn[v])
 process_name (name);
 
-  return bitmap_bit_p (m_nn[v], bb->index);
+  if (bitmap_bit_p (m_nn[v], bb->index))
+return true;
+
+  // See if any dominator has set non-zero.
+  if (search_dom && dom_info_available_p (CDI_DOMINATORS))
+{
+  // Search back to the Def block, or the top, whichever is closer.
+  basic_block def_bb = gimple_bb (SSA_NAME_DEF_STMT (name));
+  basic_block def_dom = def_bb
+			? get_immediate_dominator (CDI_DOMINATORS, def_bb)
+			: NULL;
+  for ( ;
+	bb && bb != def_dom;
+	bb = get_immediate_dominator (CDI_DOMINATORS, bb))
+	if (bitmap_bit_p (m_nn[v], bb->index))
+	  return true;
+}
+  return false;
 }
 
 // Allocate an populate the bitmap for NAME.  An ON bit for a block
@@ -800,7 +818,7 @@ ranger_cache::ssa_range_in_bb (irange , tree name, basic_block bb)
   // Check if pointers have any non-null dereferences.  Non-call
   // exceptions mean we could throw in the middle of the block, so just
   // punt for now on those.
-  if (r.varying_p () && m_non_null.non_null_deref_p (name, bb) &&
+  if (r.varying_p () && m_non_null.non_null_deref_p (name, bb, false) &&
   !cfun->can_throw_non_call_exceptions)
 r = range_nonzero (TREE_TYPE (name));
 }
@@ -1066,7 +1084,8 @@ ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb)
 
 	  // Regardless of whether we have visited pred or not, if the
 	  // pred has a non-null reference, revisit this block.
-	  if (m_non_null.non_null_deref_p (name, pred))
+	  // Don't search the DOM tree.
+	  if (m_non_null.non_null_deref_p (name, pred, false))
 	{
 	  if (DEBUG_RANGE_CACHE)
 		fprintf (dump_file, "nonnull: update ");
diff --git a/gcc/gimple-range-cache.h b/gcc/gimple-range-cache.h
index 2b36a02654b..986a68a9e06 100644
--- a/gcc/gimple-range-cache.h
+++ b/gcc/gimple-range-cache.h
@@ -33,7 +33,7 @@ class non_null_ref
 public:
   non_null_ref ();
   ~non_null_ref ();
-  bool non_null_deref_p (tree name, basic_block bb);
+  bool non_null_deref_p (tree name, basic_block bb, bool search_dom = true);
 private:
   vec  m_nn;
   void process_name (tree name);


[PATCH] abstract the on entry cache.

2021-05-07 Thread Andrew MacLeod via Gcc-patches
This patch cleans up the ssa_block_ranges class which represents the bulk 
of the on-entry cache storage.


PR 100299 shows that we need better control over the cache, and different 
approaches can be useful for different situations. In preparation for 
fixing that PR, this patch:


1) Abstracts ssa_block_ranges into a pure virtual API, removing the 
unused set_bb_varying routine. Now it is simply set, get, and query whether 
there is a range.


2) Moves the original vector code into a derived class sbr_vector, while 
making some modest improvements such as caching a single varying and 
undefined range.


3) Gives the irange_allocator obstack the ability to hand out memory 
hunks.  This allows the on-entry cache to be completely allocated from 
within the one obstack, and eliminates any need for looping during 
destruction time; we can just throw the entire obstack away and be done.
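
The three points can be sketched in stand-alone C++ as follows. This is an
illustration of the design only, not the patch: Range stands in for irange,
Arena for the obstack-backed allocator, and block_ranges / vector_ranges for
ssa_block_ranges / sbr_vector; all of these names and signatures are
hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

struct Range { int lo, hi; };  // hypothetical stand-in for irange

// Stand-in for the obstack: hands out storage and frees it all at once
// when the arena itself is destroyed.
class Arena {
public:
  Range *alloc(const Range &r) {
    pool_.push_back(r);        // deque keeps earlier elements at stable addresses
    return &pool_.back();
  }
  std::size_t size() const { return pool_.size(); }
private:
  std::deque<Range> pool_;
};

// 1) The abstract API: set, get, and query whether a range is present.
class block_ranges {
public:
  virtual ~block_ranges() = default;
  virtual void set_bb_range(std::size_t bb, const Range &r) = 0;
  virtual bool get_bb_range(Range &r, std::size_t bb) const = 0;
  virtual bool bb_range_p(std::size_t bb) const = 0;
};

// 2) The vector implementation, sharing one cached "varying" range.
class vector_ranges : public block_ranges {
public:
  vector_ranges(std::size_t num_blocks, const Range &varying, Arena &arena)
      : tab_(num_blocks, nullptr), arena_(arena),
        varying_(arena.alloc(varying)) {}

  void set_bb_range(std::size_t bb, const Range &r) override {
    // Point at the shared varying range instead of allocating a new copy.
    if (r.lo == varying_->lo && r.hi == varying_->hi)
      tab_[bb] = varying_;
    else
      tab_[bb] = arena_.alloc(r);  // 3) all storage comes from the arena
  }

  bool get_bb_range(Range &r, std::size_t bb) const override {
    if (!tab_[bb])
      return false;
    r = *tab_[bb];
    return true;
  }

  bool bb_range_p(std::size_t bb) const override { return tab_[bb] != nullptr; }

private:
  std::vector<Range *> tab_;  // one slot per basic block, null if unset
  Arena &arena_;
  Range *varying_;
};
```

Because every Range lives in the arena, vector_ranges needs no destructor
loop; dropping the arena releases everything.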


Even though this makes the class virtual, the end result is an overall 
improvement of about 0.4% in the performance of the pass (according to 
callgrind). This is mostly due to the tweaks to the vector 
implementation.


This also paves the way for providing an alternate implementation when 
the CFG is very big, or other conditions.  I have a follow-up patch for 
later which addresses the large CFG issues and fixes that PR.  I just 
wanted to get this foundation in before other restructuring changes 
interfere with the ability to easily apply it to the gcc 11 branch.


Bootstraps on  x86_64-pc-linux-gnu with no testsuite regressions.

Pushed.

Andrew



commit 18f0495c021084fc98fd252798accada955c60dc
Author: Andrew MacLeod 
Date:   Fri May 7 12:03:01 2021 -0400

Clean up and virtualize the on-entry cache interface.

Cleanup/Virtualize the ssa_block_range class, and implement the current
vector approach as a derived class.
Allow memory allocation from the irange allocator obstack for easy freeing.

* gimple-range-cache.cc (ssa_block_ranges): Virtualize.
(sbr_vector): Renamed from ssa_block_cache.
(sbr_vector::sbr_vector): Allocate from obstack and initialize.
(ssa_block_ranges::~ssa_block_ranges): Remove.
(sbr_vector::set_bb_range): Use varying and undefined cached values.
(ssa_block_ranges::set_bb_varying): Remove.
(sbr_vector::get_bb_range): Adjust assert.
(sbr_vector::bb_range_p): Adjust assert.
(~block_range_cache): No freeing loop required.
(block_range_cache::get_block_ranges): Remove.
(block_range_cache::set_bb_range): Inline get_block_ranges.
(block_range_cache::set_bb_varying): Remove.
* gimple-range-cache.h (set_bb_varying): Remove prototype.
* value-range.h (irange_allocator::get_memory): New.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 9b401927bd6..60e5d66c52d 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -125,29 +125,53 @@ non_null_ref::process_name (tree name)
 
 // -
 
-// This class implements a cache of ranges indexed by basic block.  It
-// represents all that is known about an SSA_NAME on entry to each
-// block.  It caches a range-for-type varying range so it doesn't need
-// to be reformed all the time.  If a range is ever always associated
-// with a type, we can use that instead.  Whenever varying is being
-// set for a block, the cache simply points to this cached one rather
-// than create a new one each time.
+// This class represents the API into a cache of ranges for an SSA_NAME.
+// Routines must be implemented to set, get, and query if a value is set.
 
 class ssa_block_ranges
 {
 public:
-  ssa_block_ranges (tree t, irange_allocator *allocator);
-  ~ssa_block_ranges ();
-
-  void set_bb_range (const basic_block bb, const irange );
-  void set_bb_varying (const basic_block bb);
-  bool get_bb_range (irange , const basic_block bb);
-  bool bb_range_p (const basic_block bb);
+  virtual void set_bb_range (const basic_block bb, const irange ) = 0;
+  virtual bool get_bb_range (irange , const basic_block bb) = 0;
+  virtual bool bb_range_p (const basic_block bb) = 0;
 
   void dump(FILE *f);
-private:
-  vec m_tab;
-  irange *m_type_range;
+};
+
+// Print the list of known ranges for file F in a nice format.
+
+void
+ssa_block_ranges::dump (FILE *f)
+{
+  basic_block bb;
+  int_range_max r;
+
+  FOR_EACH_BB_FN (bb, cfun)
+if (get_bb_range (r, bb))
+  {
+	fprintf (f, "BB%d  -> ", bb->index);
+	r.dump (f);
+	fprintf (f, "\n");
+  }
+}
+
+// This class implements the range cache as a linear vector, indexed by BB.
+// It caches a varying and undefined range which are used instead of
+// allocating new ones each time.
+
+class sbr_vector : public ssa_block_ranges
+{
+public:
+  sbr_vector (tree t, irange_allocator *allocator);
+
+  virtual void set_bb_range (const basic_block 

[PATCH] tweak range-on-exit.

2021-05-07 Thread Andrew MacLeod via Gcc-patches
Range on exit was not doing the right thing when a basic block contained 
no statements other than a PHI node.


Rather than returning the value of the PHI, it was instead asking for 
range-on-entry for the PHI, which would trigger a walk back to the top 
of the CFG looking for the definition. When it didn't find it, it would 
then default to the global-value calculation. So it ended up with the 
right value, but it did a lot of unnecessary work and put entries in 
the on-entry cache that don't need to be there.


Bootstraps on  x86_64-pc-linux-gnu with no testsuite regressions.

Pushed.

Andrew


commit c0c25d1052950cecbf4488b7f76d41952672414a
Author: Andrew MacLeod 
Date:   Mon Apr 26 19:23:25 2021 -0400

Fix range_on_exit for PHI stmts when there are no other stmts in the block.

last_stmt(bb) returns NULL for blocks which only have PHI stmts, and
range_on_exit would trigger a cache fill all the way to the top of the
program for the SSA_NAME.

* gimple-range.cc (gimple_ranger::range_on_exit): Handle block with
only PHI nodes better.

diff --git a/gcc/gimple-range.cc b/gcc/gimple-range.cc
index 6158a754dd6..e94bb355de3 100644
--- a/gcc/gimple-range.cc
+++ b/gcc/gimple-range.cc
@@ -1003,14 +1003,23 @@ gimple_ranger::range_on_exit (irange , basic_block bb, tree name)
   gcc_checking_assert (bb != EXIT_BLOCK_PTR_FOR_FN (cfun));
   gcc_checking_assert (gimple_range_ssa_p (name));
 
-  gimple *s = last_stmt (bb);
-  // If there is no statement in the block and this isn't the entry
-  // block, go get the range_on_entry for this block.  For the entry
-  // block, a NULL stmt will return the global value for NAME.
-  if (!s && bb != ENTRY_BLOCK_PTR_FOR_FN (cfun))
-range_on_entry (r, bb, name);
-  else
+  gimple *s = SSA_NAME_DEF_STMT (name);
+  basic_block def_bb = gimple_bb (s);
+  // If this is not the definition block, get the range on the last stmt in
+  // the block... if there is one.
+  if (def_bb != bb)
+s = last_stmt (bb);
+  // If there is no statement provided, get the range_on_entry for this block.
+  if (s)
 range_of_expr (r, name, s);
+  else
+{
+  range_on_entry (r, bb, name);
+  // See if there was a deref in this block, if applicable
+  if (!cfun->can_throw_non_call_exceptions && r.varying_p () &&
+	  m_cache.m_non_null.non_null_deref_p (name, bb))
+	r = range_nonzero (TREE_TYPE (name));
+}
   gcc_checking_assert (r.undefined_p ()
 		   || range_compatible_p (r.type (), TREE_TYPE (name)));
 }


[PATCH] Export gcond edge range routine.

2021-05-07 Thread Andrew MacLeod via Gcc-patches
Rename the outgoing_range class to gimple_outgoing_range, and provide an 
extern entry point for convenience to return a range for the TRUE and 
FALSE edges of a gcond.
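
A stand-alone C++ sketch of what such an entry point does: given an edge
leaving a conditional, it returns the single-value boolean range implied by
taking that edge. Edge, BoolRange, EDGE_TRUE and EDGE_FALSE are hypothetical
stand-ins for GCC's edge, irange and EDGE_TRUE_VALUE / EDGE_FALSE_VALUE.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's edge flags and edge type.
enum edge_flag : unsigned { EDGE_TRUE = 1u << 0, EDGE_FALSE = 1u << 1 };
struct Edge { unsigned flags; };

// A boolean range [lo, hi]; both bounds are equal for a known value.
struct BoolRange { bool lo, hi; };

// Return the range of the condition along edge e of a conditional branch.
BoolRange gcond_edge_range(const Edge &e) {
  // The edge must be one of the two outgoing edges of the condition.
  assert(e.flags & (EDGE_TRUE | EDGE_FALSE));
  if (e.flags & EDGE_TRUE)
    return BoolRange{true, true};
  return BoolRange{false, false};
}
```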


Bootstraps on  x86_64-pc-linux-gnu with no testsuite regressions.

Pushed.

Andrew

commit 42a146ac12f48422f5a26b5ab9a804639cbdeea3
Author: Andrew MacLeod 
Date:   Mon Apr 26 18:14:15 2021 -0400

Make TRUE/FALSE edge calculation available without the outgoing edge class.

Rename class to gimple_outgoing_range and provide a non-class routine for
the outgoing edge of a gcond.

* gimple-range-edge.h (gimple_outgoing_range): Rename from
outgoing_range.
(gcond_edge_range): Export prototype.
* gimple-range-edge.cc (gcond_edge_range): New.
(gimple_outgoing_range::edge_range_p): Use gcond_edge_range.
* gimple-range-gori.h (gori_compute): Use gimple_outgoing_range.

diff --git a/gcc/gimple-range-edge.cc b/gcc/gimple-range-edge.cc
index 4d4cb97bbec..d11153e677e 100644
--- a/gcc/gimple-range-edge.cc
+++ b/gcc/gimple-range-edge.cc
@@ -52,12 +52,26 @@ gimple_outgoing_range_stmt_p (basic_block bb)
 }
 
 
-outgoing_range::outgoing_range ()
+// Return a TRUE or FALSE range representing the edge value of a GCOND.
+
+void
+gcond_edge_range (irange &r, edge e)
+{
+  gcc_checking_assert (e->flags & (EDGE_TRUE_VALUE | EDGE_FALSE_VALUE));
+  if (e->flags & EDGE_TRUE_VALUE)
+r = int_range<2> (boolean_true_node, boolean_true_node);
+  else
+r = int_range<2> (boolean_false_node, boolean_false_node);
+}
+
+
+gimple_outgoing_range::gimple_outgoing_range ()
 {
   m_edge_table = NULL;
 }
 
-outgoing_range::~outgoing_range ()
+
+gimple_outgoing_range::~gimple_outgoing_range ()
 {
   if (m_edge_table)
 delete m_edge_table;
@@ -68,7 +82,7 @@ outgoing_range::~outgoing_range ()
 // Use a cached value if it exists, or calculate it if not.
 
 bool
-outgoing_range::get_edge_range (irange &r, gimple *s, edge e)
+gimple_outgoing_range::get_edge_range (irange &r, gimple *s, edge e)
 {
   gcc_checking_assert (is_a<gswitch *> (s));
   gswitch *sw = as_a<gswitch *> (s);
@@ -100,7 +114,7 @@ outgoing_range::get_edge_range (irange &r, gimple *s, edge e)
 // Calculate all switch edges from SW and cache them in the hash table.
 
 void
-outgoing_range::calc_switch_ranges (gswitch *sw)
+gimple_outgoing_range::calc_switch_ranges (gswitch *sw)
 {
   bool existed;
   unsigned x, lim;
@@ -165,7 +179,7 @@ outgoing_range::calc_switch_ranges (gswitch *sw)
 // return NULL
 
 gimple *
-outgoing_range::edge_range_p (irange &r, edge e)
+gimple_outgoing_range::edge_range_p (irange &r, edge e)
 {
   // Determine if there is an outgoing edge.
   gimple *s = gimple_outgoing_range_stmt_p (e->src);
@@ -174,12 +188,7 @@ outgoing_range::edge_range_p (irange &r, edge e)
 
   if (is_a<gcond *> (s))
 {
-  if (e->flags & EDGE_TRUE_VALUE)
-	r = int_range<2> (boolean_true_node, boolean_true_node);
-  else if (e->flags & EDGE_FALSE_VALUE)
-	r = int_range<2> (boolean_false_node, boolean_false_node);
-  else
-	gcc_unreachable ();
+  gcond_edge_range (r, e);
   return s;
 }
 
diff --git a/gcc/gimple-range-edge.h b/gcc/gimple-range-edge.h
index 8970c9e14d6..87b4124d01d 100644
--- a/gcc/gimple-range-edge.h
+++ b/gcc/gimple-range-edge.h
@@ -35,11 +35,11 @@ along with GCC; see the file COPYING3.  If not see
 // The return value is NULL for no range, or the branch statement which the
 // edge gets the range from, along with the range.
 
-class outgoing_range
+class gimple_outgoing_range
 {
 public:
-  outgoing_range ();
-  ~outgoing_range ();
+  gimple_outgoing_range ();
+  ~gimple_outgoing_range ();
   gimple *edge_range_p (irange &r, edge e);
 private:
   void calc_switch_ranges (gswitch *sw);
@@ -47,9 +47,11 @@ private:
 
   hash_map *m_edge_table;
   irange_allocator m_range_allocator;
-}; 
+};
 
-// If there is a range control statment at the end of block BB, return it.
+// If there is a range control statement at the end of block BB, return it.
 gimple *gimple_outgoing_range_stmt_p (basic_block bb);
+// Return the range on edge E if it is from a GCOND.  Either TRUE or FALSE.
+void gcond_edge_range (irange &r, edge e);
 
 #endif  // GIMPLE_RANGE_EDGE_H
diff --git a/gcc/gimple-range-gori.h b/gcc/gimple-range-gori.h
index 48c746d1f37..7bb18a9baf1 100644
--- a/gcc/gimple-range-gori.h
+++ b/gcc/gimple-range-gori.h
@@ -108,7 +108,7 @@ private:
 	const irange , tree name);
 
   class gori_map *m_gori_map;
-  outgoing_range outgoing;	// Edge values for COND_EXPR & SWITCH_EXPR.
+  gimple_outgoing_range outgoing;	// Edge values for COND_EXPR & SWITCH_EXPR.
 };
 
 


[PATCH] Don't over-allocate switch default clauses.

2021-05-07 Thread Andrew MacLeod via Gcc-patches
We were always allocating the 255 max ranges for the default 
condition. Instead, use int_range_max to build the default range, 
then allocate and store only what is needed.
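
The idea can be sketched outside GCC as follows. This is a toy model under stated assumptions: `toy_interval`, `toy_subtract` and `toy_default_range` are invented names, and a `std::vector` stands in for both the scratch int_range_max and the allocator-backed result. It starts from the full range of the type, removes each case range, and only then copies the surviving sub-ranges into storage of the size actually needed.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Closed interval [first, second]; a toy stand-in for one sub-range.
using toy_interval = std::pair<long, long>;

// Remove [lo, hi] from a sorted list of disjoint intervals.
std::vector<toy_interval>
toy_subtract (const std::vector<toy_interval> &in, long lo, long hi)
{
  std::vector<toy_interval> out;
  for (const toy_interval &iv : in)
    {
      if (iv.second < lo || iv.first > hi)
	{
	  // No overlap with the removed interval: keep as is.
	  out.push_back (iv);
	  continue;
	}
      if (iv.first < lo)
	out.push_back ({iv.first, lo - 1});	// part left of the cut
      if (iv.second > hi)
	out.push_back ({hi + 1, iv.second});	// part right of the cut
    }
  return out;
}

// Build the range for the default label: everything in
// [type_min, type_max] that no case label covers.
std::vector<toy_interval>
toy_default_range (long type_min, long type_max,
		   const std::vector<toy_interval> &cases)
{
  // Scratch object, analogous to computing into a temporary.
  std::vector<toy_interval> tmp;
  tmp.push_back ({type_min, type_max});	// start with "varying"
  for (const toy_interval &c : cases)
    tmp = toy_subtract (tmp, c.first, c.second);
  // Store only the sub-ranges that were actually produced.
  return std::vector<toy_interval> (tmp.begin (), tmp.end ());
}
```

For an 8-bit unsigned switch with cases 1 and 5..9, the default range in this model ends up with three sub-ranges, far fewer than a fixed 255-entry allocation would reserve.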


Bootstraps on x86_64-pc-linux-gnu with no testsuite regressions.

Pushed.

Andrew


commit 57090307da6ff4378c84f768e2f2717095f712c1
Author: Andrew MacLeod 
Date:   Mon Apr 26 17:50:18 2021 -0400

Don't over-allocate switch default range object.

We were always allocating the 255 max ranges for the default condition.
Instead, use int_range_max to build the default range, then allocate and
store only what is needed.

* gimple-range-edge.cc (outgoing_range::calc_switch_ranges): Compute
default range into a temp and allocate only what is needed.

diff --git a/gcc/gimple-range-edge.cc b/gcc/gimple-range-edge.cc
index d2c7d384dff..4d4cb97bbec 100644
--- a/gcc/gimple-range-edge.cc
+++ b/gcc/gimple-range-edge.cc
@@ -112,8 +112,7 @@ outgoing_range::calc_switch_ranges (gswitch *sw)
   //
   // Allocate an int_range_max for the default range case, start with
   // varying and intersect each other case from it.
-  irange *default_range = m_range_allocator.allocate (255);
-  default_range->set_varying (type);
+  int_range_max default_range (type);
 
   for (x = 1; x < lim; x++)
 {
@@ -132,7 +131,7 @@ outgoing_range::calc_switch_ranges (gswitch *sw)
   int_range_max def_range (low, high);
   range_cast (def_range, type);
   def_range.invert ();
-  default_range->intersect (def_range);
+  default_range.intersect (def_range);
 
   // Create/union this case with anything on else on the edge.
   int_range_max case_range (low, high);
@@ -156,7 +155,8 @@ outgoing_range::calc_switch_ranges (gswitch *sw)
   irange *&slot = m_edge_table->get_or_insert (default_edge, &existed);
   // This should be the first call into this switch.
   gcc_checking_assert (!existed);
-  slot = default_range;
+  irange *dr = m_range_allocator.allocate (default_range);
+  slot = dr;
 }
 
 


[PATCH] Change x mod 0 to undefined.

2021-05-07 Thread Andrew MacLeod via Gcc-patches

We were setting x % 0 to varying...  change it to undefined.
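
A small toy model of the change, not the real range-op code. `toy_mod_range` and `toy_fold_mod` are invented names, and the sketch only handles non-negative operands. It assumes that an UNDEFINED result behaves like an empty set, whereas VARYING would claim every value of the type is possible.

```cpp
#include <cassert>

// Result of folding LHS % RHS in this toy model.
struct toy_mod_range
{
  bool undefined;	// true: no value is possible (empty set)
  long lo;
  long hi;
};

// Fold [0, lh_ub] % [rh_lb, rh_ub], assuming all bounds are >= 0.
// If the divisor can only be zero, the result is UNDEFINED.
// Otherwise the remainder is at most min (lh_ub, rh_ub - 1).
constexpr toy_mod_range
toy_fold_mod (long lh_ub, long rh_lb, long rh_ub)
{
  return (rh_lb == 0 && rh_ub == 0)
	 ? toy_mod_range{true, 0, 0}
	 : toy_mod_range{false, 0,
			 (lh_ub < rh_ub - 1 ? lh_ub : rh_ub - 1)};
}

static_assert (toy_fold_mod (100, 0, 0).undefined,
	       "x % 0 yields UNDEFINED in this model");
static_assert (toy_fold_mod (100, 1, 8).hi == 7,
	       "x % [1, 8] is bounded by 7");
```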

Bootstraps on x86_64-pc-linux-gnu with no testsuite regressions.

Pushed.

Andrew

commit 2262206b82da0ab4050fe6168a7d09be4e4b3d0f
Author: Andrew MacLeod 
Date:   Mon Apr 26 17:46:31 2021 -0400

Change x mod 0 to produce UNDEFINED rather than VARYING.

* range-op.cc (operator_trunc_mod::wi_fold): x % 0 is UNDEFINED.

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 0027b3e1427..ab8f4e211ac 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -2689,7 +2689,7 @@ operator_trunc_mod::wi_fold (irange &r, tree type,
   // Mod 0 is undefined.
   if (wi_zero_p (type, rh_lb, rh_ub))
 {
-  r.set_varying (type);
+  r.set_undefined ();
   return;
 }
 


[PATCH] Enhance initial global ranges.

2021-05-07 Thread Andrew MacLeod via Gcc-patches

Add some tweaks to gimple_range_global () that:

1) incorporate code from vr-values to pick up initial parameter values 
from IPA, and


2) make used-before-defined locals start with UNDEFINED instead of VARYING.  
This makes a big difference when folding PHI nodes.
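
To see why the second point matters for PHI nodes, consider this toy lattice. It is a sketch, not ranger code: `toy_value`, `toy_kind` and `toy_union` are invented names, and the lattice tracks a single constant instead of a real range. A PHI result is the union of its arguments. If an argument is UNDEFINED it contributes nothing, so the other argument survives; if it is VARYING, the whole result degrades to VARYING.

```cpp
#include <cassert>

enum class toy_kind { undefined, constant, varying };

struct toy_value
{
  toy_kind kind;
  long value;	// only meaningful when kind == constant
};

// Union of two lattice values, as a PHI node would combine them.
constexpr toy_value
toy_union (toy_value a, toy_value b)
{
  return a.kind == toy_kind::undefined ? b
       : b.kind == toy_kind::undefined ? a
       : (a.kind == toy_kind::constant
	  && b.kind == toy_kind::constant
	  && a.value == b.value) ? a
       : toy_value{toy_kind::varying, 0};
}

constexpr toy_value toy_undef{toy_kind::undefined, 0};
constexpr toy_value toy_vary{toy_kind::varying, 0};
constexpr toy_value toy_five{toy_kind::constant, 5};

// PHI <uninitialized local, 5> folds to 5 when the local is UNDEFINED ...
static_assert (toy_union (toy_undef, toy_five).kind == toy_kind::constant,
	       "UNDEFINED does not pollute the union");
// ... but stays unknown when the local starts as VARYING.
static_assert (toy_union (toy_vary, toy_five).kind == toy_kind::varying,
	       "VARYING absorbs everything");
```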


Bootstraps on x86_64-pc-linux-gnu with no testsuite regressions.

Pushed.

Andrew


commit ee9495f95f74ee68df62831a8a55a5e1597025d7
Author: Andrew MacLeod 
Date:   Mon Apr 26 17:41:22 2021 -0400

Enhance initial global value setting.

Incorporate code from vr_values to get safe initial parameter values.
If this is a local automatic which is used before defined, use UNDEFINED.

* gimple-range.h (gimple_range_global): Pick up parameter initial
values, and use-before defined locals are UNDEFINED.

diff --git a/gcc/gimple-range.h b/gcc/gimple-range.h
index 5751b0937a0..f33156181bf 100644
--- a/gcc/gimple-range.h
+++ b/gcc/gimple-range.h
@@ -138,22 +138,39 @@ gimple_range_global (tree name)
 {
   gcc_checking_assert (gimple_range_ssa_p (name));
   tree type = TREE_TYPE (name);
-#if 0
-  // Reenable picking up global ranges when we are OK failing tests that look
-  // for builtin_unreachable in the code, like
-  // RUNTESTFLAGS=dg.exp=pr61034.C check-g++
-  // pre-optimizations (inlining) set a global range which causes the ranger
-  // to remove the condition which leads to builtin_unreachable.
-  if (!POINTER_TYPE_P (type) && SSA_NAME_RANGE_INFO (name))
+
+  if (SSA_NAME_IS_DEFAULT_DEF (name))
 {
-  // Return a range from an SSA_NAME's available range.
-  wide_int min, max;
-  enum value_range_kind kind = get_range_info (name, , );
-  return value_range (type, min, max, kind);
-}
-#endif
- // Otherwise return range for the type.
- return value_range (type);
+  tree sym = SSA_NAME_VAR (name);
+  // Adapted from vr_values::get_lattice_entry().
+  // Use a range from an SSA_NAME's available range.
+  if (TREE_CODE (sym) == PARM_DECL)
+	{
+	  // Try to use the "nonnull" attribute to create ~[0, 0]
+	  // anti-ranges for pointers.  Note that this is only valid with
+	  // default definitions of PARM_DECLs.
+	  if (POINTER_TYPE_P (type)
+	  && (nonnull_arg_p (sym) || get_ptr_nonnull (name)))
+	{
+	  value_range r;
+	  r.set_nonzero (type);
+	  return r;
+	}
+	  else if (INTEGRAL_TYPE_P (type))
+	{
+	  value_range r;
+	  get_range_info (name, r);
+	  if (r.undefined_p ())
+		r.set_varying (type);
+	  return r;
+	}
+	}
+  // If this is a local automatic with no definition, use undefined.
+  else if (TREE_CODE (sym) != RESULT_DECL)
+	return value_range ();
+   }
+  // Otherwise return range for the type.
+  return value_range (type);
 }
 
 


Re: [patch] Fix incorrect array bounds with -fgnat-encodings=minimal in DWARF

2021-05-07 Thread Jeff Law via Gcc-patches



On 5/7/2021 5:09 AM, Eric Botcazou wrote:

Hi,

this makes add_subscript_info query the get_array_descr_info hook for the
actual information when it is defined.

Tested on x86-64/Linux, OK for mainline?


2021-05-07  Eric Botcazou  

* dwarf2out.c (add_subscript_info): Retrieve the bounds and the index
type by means of the get_array_descr_info langhook, if it is set and
returns true.  Remove obsolete code dealing with unnamed subtypes.


2021-05-07  Eric Botcazou  

* gnat.dg/debug18.adb: New test.


OK

jeff




Re: [PATCH] builtins.c: Ensure emit_move_insn operands are valid (PR100418)

2021-05-07 Thread Jeff Law via Gcc-patches



On 5/7/2021 10:26 AM, Andrew Stubbs wrote:
A recent patch from Alexandre added new calls to emit_move_insn with 
PLUS expressions in the operands. Apparently this works fine on (at 
least) x86_64, but fails on (at least) amdgcn, where the adddi3 pattern 
has clobbers that the movdi3 does not. This results in ICEs in recog.


This patch inserts force_operand around the problem cases so that it 
only creates valid move instructions.


I've done a regression test on amdgcn and everything works again [*].

OK to commit?

Andrew

[*] Well, once I fix a new, unrelated TImode issue it does anyway.

210507-fix-try-store.patch

Ensure emit_move_insn operands are valid

Some architectures are fine with PLUS in move instructions, but others
are not (amdgcn is the motivating example).

gcc/ChangeLog:

PR target/100418
* builtins.c (try_store_by_multiple_pieces): Use force_operand for
emit_move_insn operands.


OK.  I've had the equivalent here, but hadn't submitted it yet.

jeff



Re: [PATCH] PR libstdc++/71579 assert that type traits are not misused with an incomplete type

2021-05-07 Thread Antony Polukhin via Gcc-patches
Rebased the patch on current master. Please review.

Changelog stays the same:

New std::common_type assertions attempt to give a proper 'required from
here' hint for user code, do not bring many changes to the
implementation and check all the template parameters for completeness.
In some cases the type could be checked for completeness more than
once. This seems to be unsolvable due to the fact that
std::common_type could be specialized by the user, so we have to call
std::common_type recursively, potentially repeating the check for the
first type.

std::common_reference assertions make sure that we detect incomplete
types even if the user specialized the std::basic_common_reference.
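
For readers unfamiliar with how such an assertion can detect an incomplete type at all, here is a minimal sketch of the general technique. It is not the libstdc++ helper: `toy_identity` and `toy_is_complete` are names invented for this example, and the sketch ignores the cases the real check has to handle (void, functions, references, arrays of unknown bound). It relies on `sizeof (T)` being a substitution failure when T is incomplete.

```cpp
#include <cassert>
#include <cstddef>

// Wrapper that is itself always complete, even when T is not.
template <typename T>
struct toy_identity
{
  using type = T;
};

// Preferred overload: only viable when sizeof (T) is well-formed,
// i.e. when T is complete at this point.
template <typename T, std::size_t = sizeof (T)>
constexpr bool
toy_is_complete (toy_identity<T>, int)
{
  return true;
}

// Fallback overload: chosen when the one above is removed by SFINAE.
template <typename T>
constexpr bool
toy_is_complete (toy_identity<T>, long)
{
  return false;
}

struct toy_complete { };
struct toy_incomplete;	// declared, never defined

static_assert (toy_is_complete (toy_identity<toy_complete>{}, 0),
	       "a defined class is complete");
static_assert (!toy_is_complete (toy_identity<toy_incomplete>{}, 0),
	       "a forward-declared class is not");
```

Placing a static_assert on such a predicate inside the trait, as the patch does with the library's own helper, turns silent misuse into a diagnostic that points at the offending instantiation.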

2020-11-12  Antony Polukhin  
PR libstdc++/71579
* include/std/type_traits (is_convertible, is_nothrow_convertible)
(common_type, common_reference): Add static_asserts
to make sure that the arguments of the type traits are not misused
with incomplete types.
* testsuite/20_util/common_reference/incomplete_basic_common_neg.cc:
New test.
* testsuite/20_util/common_reference/incomplete_neg.cc: New test.
* testsuite/20_util/common_type/incomplete_neg.cc: New test.
* testsuite/20_util/common_type/requirements/sfinae_friendly_1.cc: Remove
SFINAE tests on incomplete types.
* testsuite/20_util/is_convertible/incomplete_neg.cc: New test.
* testsuite/20_util/is_nothrow_convertible/incomplete_neg.cc: New test.

On Fri, Jan 8, 2021 at 20:28, Antony Polukhin wrote:
>
>
> On Thu, Nov 12, 2020, 21:55 Antony Polukhin  wrote:
>>
>> Final bits for libstdc++/71579
>
>
> Gentle reminder on last patch



-- 
Best regards,
Antony Polukhin
diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index eaf06fc..a95d327 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -1406,12 +1406,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _From, typename _To>
 struct is_convertible
 : public __is_convertible_helper<_From, _To>::type
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_From>{}),
+   "first template argument must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_To>{}),
+   "second template argument must be a complete class or an unbounded array");
+};
 
   // helper trait for unique_ptr, shared_ptr, and span
   template
 using __is_array_convertible
-  = is_convertible<_FromElementType(*)[], _ToElementType(*)[]>;
+  = typename __is_convertible_helper<
+   _FromElementType(*)[], _ToElementType(*)[]>::type;
 
   template, is_function<_To>,
@@ -1454,7 +1460,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _From, typename _To>
 struct is_nothrow_convertible
 : public __is_nt_convertible_helper<_From, _To>::type
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_From>{}),
+   "first template argument must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_To>{}),
+   "second template argument must be a complete class or an unbounded array");
+};
 
   /// is_nothrow_convertible_v
   template
@@ -2239,7 +2250,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp1, typename _Tp2>
 struct common_type<_Tp1, _Tp2>
 : public __common_type_impl<_Tp1, _Tp2>::type
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp1>{}),
+   "each argument type must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp2>{}),
+   "each argument type must be a complete class or an unbounded array");
+};
 
   template
 struct __common_type_pack
@@ -2253,7 +2269,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 struct common_type<_Tp1, _Tp2, _Rp...>
 : public __common_type_fold,
__common_type_pack<_Rp...>>
-{ };
+{
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp1>{}),
+   "first argument type must be a complete class or an unbounded array");
+  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp2>{}),
+   "second argument type must be a complete class or an unbounded array");
+#ifdef __cpp_fold_expressions
+  static_assert((std::__is_complete_or_unbounded(
+   __type_identity<_Rp>{}) && ...),
+   "each argument type must be a complete class or an unbounded array");
+#endif
+};
 
   // Let C denote the same type, if any, as common_type_t.
   // If there is such a type C, type shall denote the same type, if any,
@@ -3352,9 +3378,10 @@ template 
 
   // If A and B are both rvalue reference types, ...
   template
-struct __common_ref_impl<_Xp&&, _Yp&&,
-  _Require>,
-  is_convertible<_Yp&&, __common_ref_C<_Xp, _Yp
+struct __common_ref_impl<_Xp&&, _Yp&&, _Require<
+  typename __is_convertible_helper<_Xp&&, 

Re: [PATCH V7 4/7] CTF/BTF debug formats

2021-05-07 Thread Indu Bhagat via Gcc-patches

On 4/29/21 11:17 PM, Richard Biener wrote:

On Thu, 29 Apr 2021, Indu Bhagat wrote:


Hello,

On 4/29/21 5:10 AM, Richard Biener wrote:

On Thu, 29 Apr 2021, Jose E. Marchesi wrote:


This commit introduces support for generating CTF debugging
information and BTF debugging information from GCC.


Comments inline.


Thanks for your reviews.

My responses and questions inline at respective points.

Indu
 +#ifndef GCC_CTFC_H

+#define GCC_CTFC_H 1
+
+#include "config.h"
+#include "system.h"
+#include "tree.h"
+#include "fold-const.h"
+#include "dwarf2ctf.h"
+#include "ctf.h"
+#include "btf.h"
+
+/* Invalid CTF type ID definition.  */
+
+#define CTF_NULL_TYPEID 0
+
+/* Value to start generating the CTF type ID from.  */
+
+#define CTF_INIT_TYPEID 1
+
+/* CTF type ID.  */
+
+typedef unsigned long ctf_id_t;
+
+/* CTF string table element (list node).  */
+
+typedef struct GTY ((chain_next ("%h.cts_next"))) ctf_string


I know that DWARF takes the lead here but do all these have to
live in GC memory?  The reason the DWARF bits do is that they
point to 'tree' and that trees point to DIEs.



Not entirely sure what you mean here. Do you mean to not tag it as a GC root
and avoid traversal for GC marking the individual strings?


Basically think of what part of the CTF data structures can live on the
heap (since you should know lifetime pretty well).



OK. I have added code to release the memory for CTF strings for now when 
the CTF container is finalized. This change will be included in the next 
 patch series.



+{
+  dw_die_ref type_die = get_AT_ref (die, DW_AT_type);
+  return (type_die ? type_die : ctf_void_die);
+}
+
+/* Some data member DIEs have location specified as a DWARF expression
+   (specifically in DWARF2).  Luckily, the expression is a simple
+   DW_OP_plus_uconst with one operand set to zero.
+
+   Sometimes the data member location may also be negative.  In yet some other
+   cases (specifically union data members), the data member location is
+   non-existent.  Handle all these scenarios here to abstract this.  */
+
+static HOST_WIDE_INT ctf_get_AT_data_member_location (dw_die_ref die)


likewise.


+{
+  HOST_WIDE_INT field_location = 0;
+  dw_attr_node * attr;
+
+  /* The field location (in bits) can be determined from
+ either a DW_AT_data_member_location attribute or a
+ DW_AT_data_bit_offset attribute.  */
+  if (get_AT (die, DW_AT_data_bit_offset))
+field_location = get_AT_unsigned (die, DW_AT_data_bit_offset);
+  else
+{
+  attr = get_AT (die, DW_AT_data_member_location);
+  if (attr && AT_class (attr) == dw_val_class_loc)
+   {
+ dw_loc_descr_ref descr = AT_loc (attr);
+ /* Operand 2 must be zero; the structure is assumed to be on the
+stack in DWARF2.  */
+ gcc_assert (!descr->dw_loc_oprnd2.v.val_unsigned);
+ gcc_assert (descr->dw_loc_oprnd2.val_class
+ == dw_val_class_unsigned_const);
+ field_location = descr->dw_loc_oprnd1.v.val_unsigned;
+   }
+  else
+   {
+ attr = get_AT (die, DW_AT_data_member_location);
+ if (attr && AT_class (attr) == dw_val_class_const)
+   field_location = AT_int (attr);
+ else
+   field_location = (get_AT_unsigned (die,
+  DW_AT_data_member_location)
+ * 8);
+   }
+}


so when neither of the above we return 0?  Maybe we should ICE here
instead.  Ada for example has variable location fields.



Yes, adding gcc_unreachable is sensible. Will do.



Hmm... I have to correct myself and say that we should not ICE here when 
neither the DW_AT_data_bit_offset nor the DW_AT_data_member_location 
attribute is available. There are valid C constructs where we will hit 
that case. For these cases, we piggyback on the get_AT_unsigned () API, as 
it returns 0 when the requested attribute is NULL.


union c
{
  int c1;
  int c2;
} my_u_c;

DIE0: DW_TAG_union_type (0x70f49190)
  abbrev id: 0 offset: 0 mark: 0
  DW_AT_name: "c"
  DW_AT_byte_size: 4
  DW_AT_decl_file: "test-union-1.c" (0)
  DW_AT_decl_line: 8
  DW_AT_decl_column: 7
DIE0: DW_TAG_member (0x70f491e0)
  abbrev id: 0 offset: 0 mark: 0
  DW_AT_name: "c1"
  DW_AT_decl_file: "test-union-1.c" (0)
  DW_AT_decl_line: 10
  DW_AT_decl_column: 7
  DW_AT_type: die -> 0 (0x70f49230)
DIE0: DW_TAG_member (0x70f49280)
  abbrev id: 0 offset: 0 mark: 0
  DW_AT_name: "c2"
  DW_AT_decl_file: "test-union-1.c" (0)
  DW_AT_decl_line: 11
  DW_AT_decl_column: 7
  DW_AT_type: die -> 0 (0x70f49230)

As for Ada, CTF is not supported.

So I think we are OK here.
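
A quick way to convince oneself that falling back to 0 is right for the union case is sketched below. The union is the one from the example above; `toy_member_bit_offset` is a hypothetical helper written only for this illustration (it models "attribute absent means offset 0", it is not the dwarf2ctf code).

```cpp
#include <cassert>
#include <cstddef>

union c
{
  int c1;
  int c2;
};

// Every union member starts at offset 0, which is why a consumer can
// safely treat a missing member-location attribute as 0 here.
static_assert (offsetof (c, c1) == 0, "c1 starts at offset 0");
static_assert (offsetof (c, c2) == 0, "c2 starts at offset 0");

// Hypothetical model of the fallback: a null pointer stands for
// "attribute not present"; otherwise the byte offset is scaled to bits.
constexpr unsigned long
toy_member_bit_offset (const unsigned long *byte_offset_attr)
{
  return byte_offset_attr ? *byte_offset_attr * 8 : 0;
}
```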



Overall I think this is fine with the suggested changes.  You may want
to refactor the debug info kind into a flag based one (I've seen you
suggested that on IRC).

Richard.



Thanks again for reviewing. Yes, I have started tinkering around to make the
write_symbols into a 

Re: [committed] amdgcn: disable TImode

2021-05-07 Thread Tobias Burnus

On 07.05.21 18:35, Andrew Stubbs wrote:


TImode has always been a problem on amdgcn, and now it is causing many
new test failures, so I'm disabling it.


Does this still work with libgomp?

The patch sounds as if it might cause problems, but on the other hand,
I assume you did test it? To recall:

The problem is that OpenMP's depobj as implemented in GCC has
sizeof() = 2*sizeof(void*) and is implemented as a two-element struct in C/C++.
But the OpenMP spec mandates that it is an integer type in Fortran, i.e.
integer(kind=omp_depend_kind).

Combining the implementation choice and the type requirement means that
on 64-bit systems, this requires __int128 support, cf. commit
https://gcc.gnu.org/g:8d0b2b33748014ee57973c1d7bc9fd7706bb3da9
and https://gcc.gnu.org/PR96306
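
The size argument can be checked with a small sketch. `toy_depobj` and its field names are invented for illustration and are not libgomp's definition; the only facts taken from the text above are "two pointers wide" and "must map to an integer kind of the same size". The `__int128` check is guarded because that type is a compiler extension.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical two-pointer layout standing in for the depobj struct.
struct toy_depobj
{
  void *first;
  void *second;
};

static_assert (sizeof (toy_depobj) == 2 * sizeof (void *),
	       "two pointers, no padding expected");

// Size in bytes that the Fortran integer kind would have to provide.
constexpr std::size_t
toy_depend_kind_bytes ()
{
  return sizeof (toy_depobj);
}

#if defined (__SIZEOF_INT128__) && defined (__SIZEOF_POINTER__)
# if __SIZEOF_POINTER__ == 8
// With 64-bit pointers only a 128-bit integer is wide enough.
static_assert (sizeof (__int128) == sizeof (toy_depobj),
	       "needs a 16-byte integer type");
# endif
#endif
```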

(Side note: The definition in OpenMP is bad - it should have been
some opaque derived type but that's a mistake done in OpenMP 5.0.)

Tobias


The mode only has move instructions defined, which was enough for SLP,
but any other code trying to use it without checking the optabs is a
problem.

The mode remains available for use within the backend, which is
important because at least one hardware instruction uses a TImode
value with two DImode values packed inside.



[PATCH] tsan: fix false positive for pthread_cond_clockwait

2021-05-07 Thread Michael de Lang via Gcc-patches
pthread_cond_clockwait isn't added to TSAN_INTERCEPTORS which leads to
false positives regarding double locking of a mutex. This was
uncovered by a user reporting an issue to the google sanitizer github:
https://github.com/google/sanitizers/issues/1259

This patch copies code from the fix made in llvm:
https://github.com/llvm/llvm-project/commit/16eb853ffdd1a1ad7c95455b7795c5f004402e46

However, because the tsan related source code hasn't been kept in sync
with llvm, I had to make some modifications.
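
Judging from the diff below, the modification boils down to routing every wait variant through one `int (*)(void *)` callback and packing the per-variant arguments into a context struct. A stripped-down sketch of that pattern follows; all `toy_*` names are invented, and plain ints stand in for the condition variable, mutex, clock and deadline.

```cpp
#include <cassert>

// Everything any wait variant might need, bundled in one struct.
struct toy_wait_ctx
{
  int cond;
  int mutex;
  int clock;
  int deadline;
};

// The single callback shape the common helper accepts.
using toy_wait_fn = int (*) (void *);

// Common helper: knows nothing about which variant it is running.
int
toy_run_wait (toy_wait_fn fn, void *ctx)
{
  return fn (ctx);
}

// Stand-ins for the real functions, with different arities.
int toy_real_wait (int c, int m) { return c + m; }
int toy_real_clockwait (int c, int m, int clock, int deadline)
{
  return c + m + clock + deadline;
}

int
toy_wait (int c, int m)
{
  toy_wait_ctx ctx = {c, m, 0, 0};
  // A captureless lambda converts to a plain function pointer.
  return toy_run_wait ([] (void *a) {
    toy_wait_ctx *arg = static_cast<toy_wait_ctx *> (a);
    return toy_real_wait (arg->cond, arg->mutex);
  }, &ctx);
}

int
toy_clockwait (int c, int m, int clock, int deadline)
{
  toy_wait_ctx ctx = {c, m, clock, deadline};
  return toy_run_wait ([] (void *a) {
    toy_wait_ctx *arg = static_cast<toy_wait_ctx *> (a);
    return toy_real_clockwait (arg->cond, arg->mutex, arg->clock,
			       arg->deadline);
  }, &ctx);
}
```

Adding a new variant with yet another signature then only needs a new lambda, not another change to the shared helper's prototype.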

Given that this is my first contribution to gcc, let me know if I've
missed anything.

Kind regards,
Michael de Lang

+++ b/gcc/testsuite/g++.dg/tsan/pthread_cond_clockwait.C
@@ -0,0 +1,31 @@
+// Test pthread_cond_clockwait not generating false positives with tsan
+// { dg-do run { target { { *-*-linux* *-*-gnu* *-*-uclinux* } && pthread } } }
+// { dg-options "-fsanitize=thread -lpthread" }
+
+#include <pthread.h>
+
+pthread_cond_t cv;
+pthread_mutex_t mtx;
+
+void *fn(void *vp) {
+pthread_mutex_lock(&mtx);
+pthread_cond_signal(&cv);
+pthread_mutex_unlock(&mtx);
+return NULL;
+}
+
+int main() {
+pthread_mutex_lock(&mtx);
+
+pthread_t tid;
+pthread_create(&tid, NULL, fn, NULL);
+
+struct timespec ts;
+clock_gettime(CLOCK_MONOTONIC, &ts);
+ts.tv_sec += 10;
+pthread_cond_clockwait(&cv, &mtx, CLOCK_MONOTONIC, &ts);
+pthread_mutex_unlock(&mtx);
+
+pthread_join(tid, NULL);
+return 0;
+}
diff --git a/libsanitizer/tsan/tsan_interceptors_posix.cpp b/libsanitizer/tsan/tsan_interceptors_posix.cpp
index aa04d8dfb67..7b3d0a917de 100644
--- a/libsanitizer/tsan/tsan_interceptors_posix.cpp
+++ b/libsanitizer/tsan/tsan_interceptors_posix.cpp
@@ -1126,7 +1126,10 @@ struct CondMutexUnlockCtx {
   ScopedInterceptor *si;
   ThreadState *thr;
   uptr pc;
+  void *c;
   void *m;
+  void *abstime;
+  __sanitizer_clockid_t clock;
 };

 static void cond_mutex_unlock(CondMutexUnlockCtx *arg) {
@@ -1152,19 +1155,18 @@ INTERCEPTOR(int, pthread_cond_init, void *c, void *a) {
 }

 static int cond_wait(ThreadState *thr, uptr pc, ScopedInterceptor *si,
- int (*fn)(void *c, void *m, void *abstime), void *c,
- void *m, void *t) {
+ int (*fn)(void *arg), void *c,
+ void *m, void *t, __sanitizer_clockid_t clock) {
   MemoryAccessRange(thr, pc, (uptr)c, sizeof(uptr), false);
   MutexUnlock(thr, pc, (uptr)m);
-  CondMutexUnlockCtx arg = {si, thr, pc, m};
+  CondMutexUnlockCtx arg = {si, thr, pc, c, m, t, clock};
   int res = 0;
   // This ensures that we handle mutex lock even in case of pthread_cancel.
   // See test/tsan/cond_cancel.cpp.
   {
 // Enable signal delivery while the thread is blocked.
 BlockingCall bc(thr);
-res = call_pthread_cancel_with_cleanup(
-fn, c, m, t, (void (*)(void *arg))cond_mutex_unlock, &arg);
+res = call_pthread_cancel_with_cleanup(fn, (void (*)(void *arg))cond_mutex_unlock, &arg);
   }
   if (res == errno_EOWNERDEAD) MutexRepair(thr, pc, (uptr)m);
   MutexPostLock(thr, pc, (uptr)m, MutexFlagDoPreLockOnPostLock);
@@ -1174,25 +1176,34 @@ static int cond_wait(ThreadState *thr, uptr pc, ScopedInterceptor *si,
 INTERCEPTOR(int, pthread_cond_wait, void *c, void *m) {
   void *cond = init_cond(c);
   SCOPED_TSAN_INTERCEPTOR(pthread_cond_wait, cond, m);
-  return cond_wait(thr, pc, &si, (int (*)(void *c, void *m, void *abstime))REAL(
- pthread_cond_wait),
-   cond, m, 0);
+  return cond_wait(thr, pc, &si, [](void *a) { CondMutexUnlockCtx *arg = (CondMutexUnlockCtx*)a; return REAL(pthread_cond_wait)(arg->c, arg->m); },
+   cond, m, 0, 0);
 }

 INTERCEPTOR(int, pthread_cond_timedwait, void *c, void *m, void *abstime) {
   void *cond = init_cond(c);
   SCOPED_TSAN_INTERCEPTOR(pthread_cond_timedwait, cond, m, abstime);
-  return cond_wait(thr, pc, &si, REAL(pthread_cond_timedwait), cond, m,
-   abstime);
+  return cond_wait(thr, pc, &si, [](void *a) { CondMutexUnlockCtx *arg = (CondMutexUnlockCtx*)a; return REAL(pthread_cond_timedwait)(arg->c, arg->m, arg->abstime); }, cond, m,
+   abstime, 0);
 }

+#if SANITIZER_LINUX
+INTERCEPTOR(int, pthread_cond_clockwait, void *c, void *m, __sanitizer_clockid_t clock, void *abstime) {
+  void *cond = init_cond(c);
+  SCOPED_TSAN_INTERCEPTOR(pthread_cond_clockwait, cond, m, clock, abstime);
+  return cond_wait(thr, pc, &si,
+   [](void *a) { CondMutexUnlockCtx *arg = (CondMutexUnlockCtx*)a; return REAL(pthread_cond_clockwait)(arg->c, arg->m, arg->clock, arg->abstime); },
+   cond, m, abstime, clock);
+}
+#endif
+
 #if SANITIZER_MAC
 INTERCEPTOR(int, pthread_cond_timedwait_relative_np, void *c, void *m,
 void *reltime) {
   void *cond = init_cond(c);
   SCOPED_TSAN_INTERCEPTOR(pthread_cond_timedwait_relative_np, cond, m, reltime);
-  return cond_wait(thr, pc, &si, REAL(pthread_cond_timedwait_relative_np), cond,
-   m, reltime);
+  return 

[committed] amdgcn: disable TImode

2021-05-07 Thread Andrew Stubbs
TImode has always been a problem on amdgcn, and now it is causing many 
new test failures, so I'm disabling it.


The mode only has move instructions defined, which was enough for SLP, 
but any other code trying to use it without checking the optabs is a 
problem.


The mode remains available for use within the backend, which is 
important because at least one hardware instruction uses a TImode value 
with two DImode values packed inside.


Andrew
amdgcn: disable TImode

The TImode support works for moves only, which has worked in most cases up
to now, but no longer.

We still need TImode to exist for the instructions that take two DImode
values packed together, but we don't need to advertise this to the middle-end.

gcc/ChangeLog:

* config/gcn/gcn.c (gcn_scalar_mode_supported_p): Disable TImode.

diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 9660ca6eaa4..2baf91d2f1f 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -361,7 +361,7 @@ gcn_scalar_mode_supported_p (scalar_mode mode)
  || mode == HImode /* || mode == HFmode  */
  || mode == SImode || mode == SFmode
  || mode == DImode || mode == DFmode
- || mode == TImode);
+ /*|| mode == TImode*/); /* TI is used for back-end purposes only.  */
 }
 
 /* Implement TARGET_CLASS_MAX_NREGS.


[PATCH] c++: argument pack expansion inside constraint [PR100138]

2021-05-07 Thread Patrick Palka via Gcc-patches
This PR is about CTAD but the underlying problems are more general;
CTAD is a good trigger for them because of the necessary substitution
into constraints that deduction guide generation entails.

In the testcase below, when generating the implicit deduction guide for
the constrained constructor template for A, we substitute the generic
flattening map 'tsubst_args' into the constructor's constraints.  During
this substitution, tsubst_pack_expansion returns a rebuilt pack
expansion for sizeof...(xs), but it's neglecting to carry over the
PACK_EXPANSION_LOCAL_P (and PACK_EXPANSION_SIZEOF_P) flag from the
original tree to the rebuilt one.  The flag is otherwise unset on the
original tree[1] but set for the rebuilt tree from make_pack_expansion
only because we're doing the CTAD at function scope (inside main).  This
leads us to crash when substituting into the pack expansion during
satisfaction because we don't have local_specializations set up (it'd be
set up for us if PACK_EXPANSION_LOCAL_P is unset).

Similarly, when substituting into a constraint we need to set
cp_unevaluated since constraints are unevaluated operands.  This avoids
a crash during CTAD for C below.
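
In the diff below, cp_unevaluated is used as a scoped object (`cp_unevaluated u;`), so the "unevaluated" state lasts exactly as long as the substitution. A generic sketch of that scoped-sentinel pattern is shown here; `toy_unevaluated`, `toy_unevaluated_depth` and `toy_substitute_constraint` are invented names and do not reflect the front end's actual data structures.

```cpp
#include <cassert>

// Hypothetical global: how many unevaluated contexts are active.
int toy_unevaluated_depth = 0;

// RAII sentinel: entering the scope marks the context as unevaluated,
// leaving it (normally or via an exception) restores the old state.
struct toy_unevaluated
{
  toy_unevaluated () { ++toy_unevaluated_depth; }
  ~toy_unevaluated () { --toy_unevaluated_depth; }
  toy_unevaluated (const toy_unevaluated &) = delete;
  toy_unevaluated &operator= (const toy_unevaluated &) = delete;
};

// Stand-in for a substitution routine; returns the depth it observed
// while "substituting" so callers can see the sentinel was active.
int
toy_substitute_constraint ()
{
  toy_unevaluated u;
  return toy_unevaluated_depth;
}
```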

[1]: Although the original pack expansion is in a function context, I
guess it makes sense that PACK_EXPANSION_LOCAL_P is unset for it because
we can't rely on local specializations (which are formed when
substituting into the function declaration) during satisfaction.

Bootstrapped and regtested on x86_64-pc-linux-gnu, also tested on
cmcstl2 and range-v3, does this look OK for trunk?

gcc/cp/ChangeLog:

PR c++/100138
* constraint.cc (tsubst_constraint): Set up cp_unevaluated.
(satisfy_atom): Set up iloc_sentinel before calling
cxx_constant_value.
* pt.c (tsubst_pack_expansion): When returning a rebuilt pack
expansion, carry over PACK_EXPANSION_LOCAL_P and
PACK_EXPANSION_SIZEOF_P from the original pack expansion.

gcc/testsuite/ChangeLog:

PR c++/100138
* g++.dg/cpp2a/concepts-ctad4.C: New test.
---
 gcc/cp/constraint.cc|  6 -
 gcc/cp/pt.c |  2 ++
 gcc/testsuite/g++.dg/cpp2a/concepts-ctad4.C | 25 +
 3 files changed, 32 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-ctad4.C

diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
index 0709695fd08..30fccc46678 100644
--- a/gcc/cp/constraint.cc
+++ b/gcc/cp/constraint.cc
@@ -2747,6 +2747,7 @@ tsubst_constraint (tree t, tree args, tsubst_flags_t complain, tree in_decl)
   /* We also don't want to evaluate concept-checks when substituting the
  constraint-expressions of a declaration.  */
   processing_constraint_expression_sentinel s;
+  cp_unevaluated u;
   tree expr = tsubst_expr (t, args, complain, in_decl, false);
   return expr;
 }
@@ -3005,7 +3006,10 @@ satisfy_atom (tree t, tree args, sat_info info)
 
   /* Compute the value of the constraint.  */
   if (info.noisy ())
-result = cxx_constant_value (result);
+{
+  iloc_sentinel ils (EXPR_LOCATION (result));
+  result = cxx_constant_value (result);
+}
   else
 {
   result = maybe_constant_value (result, NULL_TREE,
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 36a8cb5df5d..0d27dd1af65 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13203,6 +13203,8 @@ tsubst_pack_expansion (tree t, tree args, tsubst_flags_t complain,
   else
result = tsubst (pattern, args, complain, in_decl);
   result = make_pack_expansion (result, complain);
+  PACK_EXPANSION_LOCAL_P (result) = PACK_EXPANSION_LOCAL_P (t);
+  PACK_EXPANSION_SIZEOF_P (result) = PACK_EXPANSION_SIZEOF_P (t);
   if (PACK_EXPANSION_AUTO_P (t))
{
  /* This is a fake auto... pack expansion created in add_capture with
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-ctad4.C b/gcc/testsuite/g++.dg/cpp2a/concepts-ctad4.C
new file mode 100644
index 000..95a3a22dd04
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-ctad4.C
@@ -0,0 +1,25 @@
+// PR c++/100138
+// { dg-do compile { target c++20 } }
+
+template 
+struct A {
+  A(T, auto... xs) requires (sizeof...(xs) != 0) { }
+};
+
+constexpr bool f(...) { return true; }
+
+template 
+struct B {
+  B(T, auto... xs) requires (f(xs...)); // { dg-error "constant expression" }
+};
+
+template 
+struct C {
+  C(T, auto x) requires (f(x)); // { dg-error "constant expression" }
+};
+
+int main() {
+  A x{1, 2}; // { dg-bogus "" }
+  B y{1, 2}; // { dg-error "deduction|no match" }
+  C z{1, 2}; // { dg-error "deduction|no match" }
+}
-- 
2.31.1.442.g7e39198978



[PATCH] builtins.c: Ensure emit_move_insn operands are valid (PR100418)

2021-05-07 Thread Andrew Stubbs
A recent patch from Alexandre added new calls to emit_move_insn with 
PLUS expressions in the operands. Apparently this works fine on (at 
least) x86_64, but fails on (at least) amdgcn, where the adddi3 pattern 
has clobbers that the movdi3 does not. This results in ICEs in recog.


This patch inserts force_operand around the problem cases so that it 
only creates valid move instructions.


I've done a regression test on amdgcn and everything works again [*].

OK to commit?

Andrew

[*] Well, once I fix a new, unrelated TImode issue it does anyway.

Ensure emit_move_insn operands are valid

Some architectures are fine with PLUS in move instructions, but others
are not (amdgcn is the motivating example).

gcc/ChangeLog:

PR target/100418
* builtins.c (try_store_by_multiple_pieces): Use force_operand for
emit_move_insn operands.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 0db4090c434..ef8852418af 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -6773,9 +6773,10 @@ try_store_by_multiple_pieces (rtx to, rtx len, unsigned 
int ctz_len,
 
   /* Adjust PTR, TO and REM.  Since TO's address is likely
 PTR+offset, we have to replace it.  */
-  emit_move_insn (ptr, XEXP (to, 0));
+  emit_move_insn (ptr, force_operand (XEXP (to, 0), NULL_RTX));
   to = replace_equiv_address (to, ptr);
-  emit_move_insn (rem, plus_constant (ptr_mode, rem, -blksize));
+  rtx rem_minus_blksize = plus_constant (ptr_mode, rem, -blksize);
+  emit_move_insn (rem, force_operand (rem_minus_blksize, NULL_RTX));
 }
 
   /* Iterate over power-of-two block sizes from the maximum length to
@@ -6809,9 +6810,10 @@ try_store_by_multiple_pieces (rtx to, rtx len, unsigned 
int ctz_len,
   /* Adjust REM and PTR, unless this is the last iteration.  */
   if (i != sctz_len)
{
- emit_move_insn (ptr, XEXP (to, 0));
+ emit_move_insn (ptr, force_operand (XEXP (to, 0), NULL_RTX));
  to = replace_equiv_address (to, ptr);
- emit_move_insn (rem, plus_constant (ptr_mode, rem, -blksize));
+ rtx rem_minus_blksize = plus_constant (ptr_mode, rem, -blksize);
+ emit_move_insn (rem, force_operand (rem_minus_blksize, NULL_RTX));
}
 
   if (label)


[pushed] c++: reject class lvalues in 'rvalue'

2021-05-07 Thread Jason Merrill via Gcc-patches
Wrapping a class lvalue in NON_LVALUE_EXPR is not sufficient to make it a
usable prvalue; callers must use force_rvalue instead.
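
As a source-level illustration of the three value categories involved (a
sketch in user-level C++, not GCC internals): a named class object is an
lvalue, std::move of it is an xvalue, and a copy made from it is a prvalue.
The type S and the helper category_of are hypothetical names used only here.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical class type used only for illustration.
struct S { int v; };

enum class Category { Lvalue, Xvalue, Prvalue };

// decltype((expr)) yields T& for an lvalue, T&& for an xvalue and plain T
// for a prvalue, so the category can be read off the resulting type.
template <typename T>
constexpr Category category_of() {
  return std::is_lvalue_reference<T>::value ? Category::Lvalue
       : std::is_rvalue_reference<T>::value ? Category::Xvalue
       : Category::Prvalue;
}

static S s_obj{1};

// The object itself: an lvalue.
constexpr Category named = category_of<decltype((s_obj))>();
// Treating the lvalue as an xvalue (what "move" does in the comment above).
constexpr Category moved = category_of<decltype((std::move(s_obj)))>();
// Making a prvalue copy (the source-level analogue of force_rvalue).
constexpr Category copied = category_of<decltype((S(s_obj)))>();
```

All three results are computed at compile time, so they can be checked with
static_assert without running anything.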

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* tree.c (rvalue): Assert expr is not a class lvalue.
---
 gcc/cp/tree.c | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 4ccd7a314f5..7f148b4b158 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -940,7 +940,12 @@ rvalue (tree expr)
   /* We need to do this for rvalue refs as well to get the right answer
  from decltype; see c++/36628.  */
   if (!processing_template_decl && glvalue_p (expr))
-expr = build1 (NON_LVALUE_EXPR, type, expr);
+{
+  /* But don't use this function for class lvalues; use move (to treat an
+lvalue as an xvalue) or force_rvalue (to make a prvalue copy).  */
+  gcc_checking_assert (!CLASS_TYPE_P (type));
+  expr = build1 (NON_LVALUE_EXPR, type, expr);
+}
   else if (type != TREE_TYPE (expr))
 expr = build_nop (type, expr);
 

base-commit: fc178519771db508c03611cff4a1466cf67fce1d
-- 
2.27.0



[pushed] c++: avoid non-TARGET_EXPR class prvalues

2021-05-07 Thread Jason Merrill via Gcc-patches
Around PR98469 I asked Jakub to wrap a class BIT_CAST_EXPR in TARGET_EXPR;
SPACESHIP_EXPR needs the same thing.  The dummy CAST_EXPR created in
can_convert is another instance of a non-TARGET_EXPR prvalue, so let's use
the declval-like build_stub_object there instead.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* cp-tree.h (build_stub_object): Declare.
* method.c (build_stub_object): No longer static.
* call.c (can_convert): Use it.
* tree.c (build_dummy_object): Adjust comment.
* typeck.c (cp_build_binary_op): Wrap SPACESHIP_EXPR in a
TARGET_EXPR.
---
 gcc/cp/cp-tree.h | 1 +
 gcc/cp/call.c| 2 +-
 gcc/cp/method.c  | 2 +-
 gcc/cp/tree.c| 3 ++-
 gcc/cp/typeck.c  | 2 ++
 5 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a08867aea62..122dadf976f 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6968,6 +6968,7 @@ extern tree get_copy_ctor (tree, 
tsubst_flags_t);
 extern tree get_copy_assign(tree);
 extern tree get_default_ctor   (tree);
 extern tree get_dtor   (tree, tsubst_flags_t);
+extern tree build_stub_object  (tree);
 extern tree strip_inheriting_ctors (tree);
 extern tree inherited_ctor_binfo   (tree);
 extern bool base_ctor_omit_inherited_parms (tree);
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 8e455e59909..d2908b3b0cd 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -12175,7 +12175,7 @@ can_convert (tree to, tree from, tsubst_flags_t 
complain)
   /* implicit_conversion only considers user-defined conversions
  if it has an expression for the call argument list.  */
   if (CLASS_TYPE_P (from) || CLASS_TYPE_P (to))
-arg = build1 (CAST_EXPR, from, NULL_TREE);
+arg = build_stub_object (from);
   return can_convert_arg (to, from, arg, LOOKUP_IMPLICIT, complain);
 }
 
diff --git a/gcc/cp/method.c b/gcc/cp/method.c
index 0f416bec35b..f8c9456d720 100644
--- a/gcc/cp/method.c
+++ b/gcc/cp/method.c
@@ -1793,7 +1793,7 @@ build_stub_type (tree type, int quals, bool rvalue)
 /* Build a dummy glvalue from dereferencing a dummy reference of type
REFTYPE.  */
 
-static tree
+tree
 build_stub_object (tree reftype)
 {
   if (!TYPE_REF_P (reftype))
diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 3a20cd33fdc..4ccd7a314f5 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -4175,7 +4175,8 @@ member_p (const_tree decl)
 }
 
 /* Create a placeholder for member access where we don't actually have an
-   object that the access is against.  */
+   object that the access is against.  For a general declval equivalent,
+   use build_stub_object instead.  */
 
 tree
 build_dummy_object (tree type)
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 50d0f1e6a62..5af47ce89a9 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -5931,6 +5931,8 @@ cp_build_binary_op (const op_location_t ,
 
   if (!processing_template_decl)
 {
+  if (resultcode == SPACESHIP_EXPR)
+   result = get_target_expr_sfinae (result, complain);
   op0 = cp_fully_fold (op0);
   /* Only consider the second argument if the first isn't overflowed.  */
   if (!CONSTANT_CLASS_P (op0) || TREE_OVERFLOW_P (op0))

base-commit: 14ed21f8749ae359690d9c4a69ca38cc45d0d1b0
prerequisite-patch-id: bc368a9ce91fa5c1dcacbcaa3feb2c608a13570a
-- 
2.27.0



[PATCH] c: don't drop typedef information in casts

2021-05-07 Thread David Lamparter

The TYPE_MAIN_VARIANT() here was, for casts to a typedef'd type name,
resulting in all information about the typedef's involvement getting
lost.  This drops necessary information for warnings and can make them
confusing or even misleading.  It also makes specialized warnings for
unspecified-size system types (pid_t, uid_t, ...) impossible.

gcc/c/ChangeLog:
2021-03-09  David Lamparter  

PR c/99526
* c-typeck.c (build_c_cast): retain (unqualified) typedefs in
  casts rather than stripping down to basic type.
---
 gcc/c/c-typeck.c| 39 ++---
 gcc/testsuite/gcc.dg/cast-typedef.c | 35 ++
 2 files changed, 71 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/cast-typedef.c
---

Hi GCC hackers,


now that gcc 12 is open for development, I'd like to submit this patch
for reconsideration.  I've already gone through a bit of feedback while
the gcc 11 release was happening, cf. here:
https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566513.html

I've repeated my testing (full bootstrap & make check on x86_64) and
found nothing changed.


Cheers,

-David

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index fdc7bb6125c2..ba6014726a4b 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -5876,6 +5876,7 @@ c_safe_function_type_cast_p (tree t1, tree t2)
 tree
 build_c_cast (location_t loc, tree type, tree expr)
 {
+  tree res_type, walk_type;
   tree value;
 
   bool int_operands = EXPR_INT_CONST_OPERANDS (expr);
@@ -5896,7 +5897,39 @@ build_c_cast (location_t loc, tree type, tree expr)
   if (objc_is_object_ptr (type) && objc_is_object_ptr (TREE_TYPE (expr)))
 return build1 (NOP_EXPR, type, expr);
 
+  /* Want to keep typedef information, but at the same time we need to strip
+ qualifiers for a proper rvalue.  Unfortunately, we don't know if any
+ qualifiers on a typedef are part of the typedef or were locally supplied.
+ So grab the original typedef and use that only if it has no qualifiers.
+ cf. cast-typedef testcase */
+
+  res_type = NULL;
+
+  for (walk_type = type;
+   walk_type && TYPE_NAME (walk_type)
+	 && TREE_CODE (TYPE_NAME (walk_type)) == TYPE_DECL;
+   walk_type = DECL_ORIGINAL_TYPE (TYPE_NAME (walk_type)))
+{
+  tree walk_unqual, orig_type, orig_decl;
+
+  walk_unqual = build_qualified_type (walk_type, TYPE_UNQUALIFIED);
+
+  orig_decl = lookup_name (TYPE_IDENTIFIER (walk_type));
+  if (!orig_decl || TREE_CODE (orig_decl) != TYPE_DECL)
+	continue;
+  orig_type = TREE_TYPE (orig_decl);
+
+  if (walk_unqual == orig_type)
+	{
+	  res_type = walk_unqual;
+	  break;
+	}
+}
+
   type = TYPE_MAIN_VARIANT (type);
+  if (!res_type)
+res_type = type;
+  gcc_assert (TYPE_MAIN_VARIANT (res_type) == type);
 
   if (TREE_CODE (type) == ARRAY_TYPE)
 {
@@ -5924,7 +5957,7 @@ build_c_cast (location_t loc, tree type, tree expr)
 		 "ISO C forbids casting nonscalar to the same type");
 
   /* Convert to remove any qualifiers from VALUE's type.  */
-  value = convert (type, value);
+  value = convert (res_type, value);
 }
   else if (TREE_CODE (type) == UNION_TYPE)
 {
@@ -6078,7 +6111,7 @@ build_c_cast (location_t loc, tree type, tree expr)
 		" from %qT to %qT", otype, type);
 
   ovalue = value;
-  value = convert (type, value);
+  value = convert (res_type, value);
 
   /* Ignore any integer overflow caused by the cast.  */
   if (TREE_CODE (value) == INTEGER_CST && !FLOAT_TYPE_P (otype))
@@ -6114,7 +6147,7 @@ build_c_cast (location_t loc, tree type, tree expr)
 		&& INTEGRAL_TYPE_P (TREE_TYPE (expr)))
 	   || TREE_CODE (expr) == REAL_CST
 	   || TREE_CODE (expr) == COMPLEX_CST)))
-  value = build1 (NOP_EXPR, type, value);
+  value = build1 (NOP_EXPR, res_type, value);
 
   /* If the expression has integer operands and so can occur in an
  unevaluated part of an integer constant expression, ensure the
diff --git a/gcc/testsuite/gcc.dg/cast-typedef.c b/gcc/testsuite/gcc.dg/cast-typedef.c
new file mode 100644
index ..3058e5a0b190
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cast-typedef.c
@@ -0,0 +1,35 @@
+/* Test cast <> typedef interactions */
+/* Origin: David Lamparter  */
+/* { dg-do compile } */
+/* { dg-options "-Wconversion" } */
+
+typedef int typedefname;
+typedef volatile int qual1;
+typedef volatile typedefname qual2;
+
+extern int val;
+extern void f2(unsigned char arg);
+
+void
+f (void)
+{
+  /* -Wconversion just used to print out the actual type */
+
+  f2 ((typedefname) val); /* { dg-warning "typedefname" } */
+  f2 ((volatile typedefname) val); /* { dg-warning "typedefname" } */
+  f2 ((qual1) val); /* { dg-warning "int" } */
+  f2 ((qual2) val); /* { dg-warning "typedefname" } */
+
+  /* { dg-bogus "volatile" "qualifiers should be stripped" { target { "*-*-*" } } 19  } */
+  /* { dg-bogus "volatile" "qualifiers should be stripped" { target { 

[pushed] c++: tweak prvalue test [PR98469]

2021-05-07 Thread Jason Merrill via Gcc-patches
Discussing the 98469 patch and class prvalues with Jakub also inspired me to
change the place that was mishandling BIT_CAST_EXPR and one other to use the
lvalue_kind machinery to decide whether something is a prvalue, instead of
looking specifically for a TARGET_EXPR.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* call.c (build_special_member_call): Use !glvalue_p rather
than specific tree codes to test for prvalue.
(conv_is_prvalue): Likewise.
(implicit_conversion): Check CLASS_TYPE_P first.
---
 gcc/cp/call.c | 10 --
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 57bac05fe70..8e455e59909 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -2133,8 +2133,8 @@ implicit_conversion (tree to, tree from, tree expr, bool 
c_cast_p,
flags, complain);
   if (!conv || conv->bad_p)
 return conv;
-  if (conv_is_prvalue (conv)
-  && CLASS_TYPE_P (conv->type)
+  if (CLASS_TYPE_P (conv->type)
+  && conv_is_prvalue (conv)
   && CLASSTYPE_PURE_VIRTUALS (conv->type))
 conv->bad_p = true;
   return conv;
@@ -8733,8 +8733,7 @@ conv_is_prvalue (conversion *c)
 return true;
   if (c->kind == ck_user && !TYPE_REF_P (c->type))
 return true;
-  if (c->kind == ck_identity && c->u.expr
-  && TREE_CODE (c->u.expr) == TARGET_EXPR)
+  if (c->kind == ck_identity && c->u.expr && !glvalue_p (c->u.expr))
 return true;
 
   return false;
@@ -10192,8 +10191,7 @@ build_special_member_call (tree instance, tree name, 
vec **args,
  && CONSTRUCTOR_NELTS (arg) == 1)
arg = CONSTRUCTOR_ELT (arg, 0)->value;
 
-  if ((TREE_CODE (arg) == TARGET_EXPR
-  || TREE_CODE (arg) == CONSTRUCTOR)
+  if (!glvalue_p (arg)
  && (same_type_ignoring_top_level_qualifiers_p
  (class_type, TREE_TYPE (arg
{

base-commit: 14ed21f8749ae359690d9c4a69ca38cc45d0d1b0
-- 
2.27.0



[PATCH, rs6000] Add ALTIVEC_REGS as pressure class

2021-05-07 Thread Pat Haugen via Gcc-patches
Add ALTIVEC_REGS as pressure class.

Code that has heavy register pressure on Altivec registers can suffer from
over-aggressive scheduling during sched1, which then leads to increased
register spill. This is due to the fact that registers that prefer
ALTIVEC_REGS are currently assigned an allocno class of VSX_REGS. This then
misleads the scheduler to think there are 64 regs available, when in reality
there are only 32 Altivec regs. This patch fixes the problem by assigning an
allocno class of ALTIVEC_REGS and adding ALTIVEC_REGS as a pressure class.
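
The arithmetic behind the misjudged pressure can be sketched as follows.
This is a toy model, not the scheduler's actual computation: the helper
excess_pressure and the figure of 40 live values are assumptions made for
illustration; only the register counts 64 and 32 come from the text above.

```cpp
// Register counts taken from the description above.
constexpr int kVsxRegs = 64;
constexpr int kAltivecRegs = 32;

// Hypothetical helper: how many simultaneously live values exceed the
// number of registers the scheduler believes are available.
constexpr int excess_pressure(int live, int available) {
  return live > available ? live - available : 0;
}

// Hypothetical workload: 40 live values that can only use Altivec registers.
constexpr int kLive = 40;

// Accounted against VSX_REGS, the scheduler sees no excess at all ...
constexpr int seen_as_vsx = excess_pressure(kLive, kVsxRegs);
// ... while against ALTIVEC_REGS eight values would have to be spilled.
constexpr int seen_as_altivec = excess_pressure(kLive, kAltivecRegs);
```

With the wider class the excess is hidden, which is the situation the patch
is meant to avoid by tracking ALTIVEC_REGS as its own pressure class.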

Bootstrap/regtest on powerpc64/powerpc64le with no new regressions. Testing
on CPU2017 showed no significant differences. Ok for trunk?

-Pat


2021-05-07  Pat Haugen  

gcc/ChangeLog:

* config/rs6000/rs6000.c (rs6000_ira_change_pseudo_allocno_class):
Return ALTIVEC_REGS if that is best_class.
(rs6000_compute_pressure_classes): Add ALTIVEC_REGS.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/fold-vec-insert-float-p9.c: Adjust instruction 
counts.
* gcc.target/powerpc/vec-rlmi-rlnm.c: Likewise.



diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 844fee8..fee4eef 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -22487,11 +22487,14 @@ rs6000_ira_change_pseudo_allocno_class (int regno 
ATTRIBUTE_UNUSED,
 of allocno class.  */
   if (best_class == BASE_REGS)
return GENERAL_REGS;
-  if (TARGET_VSX
- && (best_class == FLOAT_REGS || best_class == ALTIVEC_REGS))
+  if (TARGET_VSX && best_class == FLOAT_REGS)
return VSX_REGS;
   return best_class;
 
+case VSX_REGS:
+  if (best_class == ALTIVEC_REGS)
+   return ALTIVEC_REGS;
+
 default:
   break;
 }
@@ -23609,12 +23612,12 @@ rs6000_compute_pressure_classes (enum reg_class 
*pressure_classes)
 
   n = 0;
   pressure_classes[n++] = GENERAL_REGS;
+  if (TARGET_ALTIVEC)
+pressure_classes[n++] = ALTIVEC_REGS;
   if (TARGET_VSX)
 pressure_classes[n++] = VSX_REGS;
   else
 {
-  if (TARGET_ALTIVEC)
-   pressure_classes[n++] = ALTIVEC_REGS;
   if (TARGET_HARD_FLOAT)
pressure_classes[n++] = FLOAT_REGS;
 }
diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c
index 1c57672..4541768 100644
--- a/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c
+++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-insert-float-p9.c
@@ -31,5 +31,5 @@ testf_cst (float f, vector float vf)
 /* { dg-final { scan-assembler-times {\mstfs\M} 2 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {\mlxv\M} 2 { target ilp32 } } } */
 /* { dg-final { scan-assembler-times {\mlvewx\M} 1 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times {\mvperm\M} 1 { target ilp32 } } } */
-/* { dg-final { scan-assembler-times {\mxxperm\M} 2 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times {\mvperm\M} 2 { target ilp32 } } } */
+/* { dg-final { scan-assembler-times {\mxxperm\M} 1 { target ilp32 } } } */
diff --git a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c 
b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
index 1e7d739..5512c0f 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-rlmi-rlnm.c
@@ -62,6 +62,6 @@ rlnm_test_2 (vector unsigned long long x, vector unsigned 
long long y,
 /* { dg-final { scan-assembler-times "vextsb2d" 1 } } */
 /* { dg-final { scan-assembler-times "vslw" 1 } } */
 /* { dg-final { scan-assembler-times "vsld" 1 } } */
-/* { dg-final { scan-assembler-times "xxlor" 3 } } */
+/* { dg-final { scan-assembler-times "xxlor" 2 } } */
 /* { dg-final { scan-assembler-times "vrlwnm" 2 } } */
 /* { dg-final { scan-assembler-times "vrldnm" 2 } } */


Re: [PATCH] tree-optimization/79333 - fold stmts following SSA edges in VN

2021-05-07 Thread Richard Biener
On May 7, 2021 4:12:02 PM GMT+02:00, Christophe Lyon 
 wrote:
>On Wed, 5 May 2021 at 09:56, Richard Biener  wrote:
>>
>> This makes sure to follow SSA edges when folding eliminated stmts.
>> This reaps the same benefit as forwprop folding all stmts, not
>> waiting for one to produce copysign in the new testcase.
>>
>> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>>
>> 2021-05-04  Richard Biener  
>>
>> PR tree-optimization/79333
>> * tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt):
>> Fold stmt following SSA edges.
>>
>> * gcc.dg/tree-ssa/ssa-fre-94.c: New testcase.
>> * gcc.dg/graphite/fuse-1.c: Adjust.
>> * gcc.dg/pr43864-4.c: Likewise.
>> ---
>>  gcc/testsuite/gcc.dg/graphite/fuse-1.c |  4 ++--
>>  gcc/testsuite/gcc.dg/pr43864-4.c   |  6 +++---
>>  gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c | 16 
>>  gcc/tree-ssa-sccvn.c   |  2 +-
>>  4 files changed, 22 insertions(+), 6 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
>>
>> diff --git a/gcc/testsuite/gcc.dg/graphite/fuse-1.c
>b/gcc/testsuite/gcc.dg/graphite/fuse-1.c
>> index 204d3b20703..527b6e5c415 100644
>> --- a/gcc/testsuite/gcc.dg/graphite/fuse-1.c
>> +++ b/gcc/testsuite/gcc.dg/graphite/fuse-1.c
>> @@ -1,6 +1,6 @@
>>  /* Check that the two loops are fused and that we manage to fold the
>two xor
>> operations.  */
>> -/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop-all
>-fdump-tree-graphite-all" } */
>> +/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop4
>-fdump-tree-graphite-all" } */
>>
>>  /* Make sure we fuse the loops like this:
>>  AST generated by isl:
>> @@ -12,7 +12,7 @@ for (int c0 = 0; c0 <= 99; c0 += 1) {
>>  /* { dg-final { scan-tree-dump-times "AST generated by isl:.*for
>\\(int c0 = 0; c0 <= 99; c0 \\+= 1\\)
>\\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } }
>*/
>>
>>  /* Check that after fusing the loops, the scalar computation is also
>fused.  */
>> -/* { dg-final { scan-tree-dump-times "gimple_simplified
>to\[^\\n\]*\\^ 12" 1 "forwprop4" } } */
>> +/* { dg-final { scan-tree-dump-times " \\^ 12;" 2 "forwprop4" } } */
>>
>>  #define MAX 100
>>  int A[MAX];
>> diff --git a/gcc/testsuite/gcc.dg/pr43864-4.c
>b/gcc/testsuite/gcc.dg/pr43864-4.c
>> index 3c6cc50c5b8..8a25b0fd8ef 100644
>> --- a/gcc/testsuite/gcc.dg/pr43864-4.c
>> +++ b/gcc/testsuite/gcc.dg/pr43864-4.c
>> @@ -22,7 +22,7 @@ int f(int c, int b, int d)
>>return r - r2;
>>  }
>>
>> -/* { dg-final { scan-tree-dump-times "if " 0 "pre"} } */
>> -/* { dg-final { scan-tree-dump-times "(?n)_.*\\+.*_" 1 "pre"} } */
>> -/* { dg-final { scan-tree-dump-times "(?n)_.*-.*_" 2 "pre"} } */
>> +/* During PRE elimination we should simplify this to return b * 2. 
>*/
>> +/* { dg-final { scan-tree-dump-times "if " 0 "pre" } } */
>> +/* { dg-final { scan-tree-dump "_\[0-9\]+ = b_\[0-9\]+\\(D\\) \\*
>2;\[\\r\\n\]\[^\\r\\n\]*return _\[0-9\]+;" "pre" } } */
>>  /* { dg-final { scan-tree-dump-not "Invalid sum" "pre"} } */
>> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
>b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
>> new file mode 100644
>> index 000..92eebf636c6
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
>> @@ -0,0 +1,16 @@
>> +/* PR tree-optimization/79333 */
>> +/* { dg-do compile } */
>> +/* { dg-options "-O -ffinite-math-only -fdump-tree-fre1" } */
>> +
>> +extern __inline __attribute__ ((__always_inline__,__gnu_inline__))
>> +double __attribute__ ((__nothrow__ , __leaf__))
>> +fabs (double __x) { return __builtin_fabs (__x); }
>> +
>> +double f(float f)
>> +{
>> +  double t1 = fabs(f);
>> +  double t2 = f / t1;
>> +  return t2;
>> +}
>> +
>> +/* { dg-final { scan-tree-dump "copysign" "fre1" } } */
>
>This new testcase fails on aarch64-elf / arm-eabi with newlib.
>
>Is that OK:

Yes, OK. 

Thanks, 
Richard. 

>===
>diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
>b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
>index 92eebf636c6..99c737562bb 100644
>--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
>+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
>@@ -1,5 +1,6 @@
> /* PR tree-optimization/79333 */
> /* { dg-do compile } */
>+/* { dg-require-effective-target c99_runtime } */
> /* { dg-options "-O -ffinite-math-only -fdump-tree-fre1" } */
>
> extern __inline __attribute__ ((__always_inline__,__gnu_inline__))
>===
>
>Thanks,
>
>Christophe
>
>
>
>
>> diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
>> index ca0974d72b8..e54a0c9065c 100644
>> --- a/gcc/tree-ssa-sccvn.c
>> +++ b/gcc/tree-ssa-sccvn.c
>> @@ -6362,7 +6362,7 @@ eliminate_dom_walker::eliminate_stmt
>(basic_block b, gimple_stmt_iterator *gsi)
>> recompute_tree_invariant_for_addr_expr (gimple_assign_rhs1
>(stmt));
>>gimple_stmt_iterator prev = *gsi;
>>gsi_prev ();
>> -  if (fold_stmt (gsi))
>> +  

[PATCH] i386: Implement mmx_pblendv to optimize SSE conditional moves [PR98218]

2021-05-07 Thread Uros Bizjak via Gcc-patches
Implement mmx_pblendv to optimize V8QI, V4HI and V2SI mode
conditional moves for SSE4.1 targets.

2021-05-07  Uroš Bizjak  

gcc/
PR target/98218
* config/i386/i386-expand.c (ix86_expand_sse_movcc):
Handle V8QI, V4HI and V2SI modes.
* config/i386/mmx.md (mmx_pblendvb): New insn pattern.
* config/i386/sse.md (unspec): Move UNSPEC_BLENDV ...
* config/i386/i386.md (unspec): ... here.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 61b2f921f41..e9f11bca78a 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3702,6 +3702,19 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
  op_true = force_reg (mode, op_true);
}
   break;
+case E_V8QImode:
+case E_V4HImode:
+case E_V2SImode:
+  if (TARGET_SSE4_1)
+   {
+ gen = gen_mmx_pblendvb;
+ if (mode != V8QImode)
+   d = gen_reg_rtx (V8QImode);
+ op_false = gen_lowpart (V8QImode, op_false);
+ op_true = gen_lowpart (V8QImode, op_true);
+ cmp = gen_lowpart (V8QImode, cmp);
+   }
+  break;
 case E_V16QImode:
 case E_V8HImode:
 case E_V4SImode:
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index f79fd122f56..74e924f3c04 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -118,6 +118,7 @@ (define_c_enum "unspec" [
   UNSPEC_FIX_NOTRUNC
   UNSPEC_MASKMOV
   UNSPEC_MOVMSK
+  UNSPEC_BLENDV
   UNSPEC_RCP
   UNSPEC_RSQRT
   UNSPEC_PSADBW
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 295501dec2f..f08570856f9 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -1700,6 +1700,26 @@ (define_expand "vcond_mask_"
   DONE;
 })
 
+(define_insn "mmx_pblendvb"
+  [(set (match_operand:V8QI 0 "register_operand" "=Yr,*x,x")
+   (unspec:V8QI
+ [(match_operand:V8QI 1 "register_operand" "0,0,x")
+  (match_operand:V8QI 2 "register_operand" "Yr,*x,x")
+  (match_operand:V8QI 3 "register_operand" "Yz,Yz,x")]
+ UNSPEC_BLENDV))]
+  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
+  "@
+   pblendvb\t{%3, %2, %0|%0, %2, %3}
+   pblendvb\t{%3, %2, %0|%0, %2, %3}
+   vpblendvb\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "*,*,1")
+   (set_attr "prefix" "orig,orig,vex")
+   (set_attr "btver2_decode" "vector")
+   (set_attr "mode" "TI")])
+
 ;; XOP parallel XMM conditional moves
 (define_insn "*xop_pcmov_"
   [(set (match_operand:MMXMODEI 0 "register_operand" "=x")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 897cf3eaea9..244fb13e97a 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -39,7 +39,6 @@ (define_c_enum "unspec" [
   UNSPEC_INSERTQ
 
   ;; For SSE4.1 support
-  UNSPEC_BLENDV
   UNSPEC_INSERTPS
   UNSPEC_DP
   UNSPEC_MOVNTDQA


Re: [PATCH] libcpp: Fix up pragma preprocessing [PR100450]

2021-05-07 Thread Marek Polacek via Gcc-patches
On Fri, May 07, 2021 at 09:53:29AM +0200, Jakub Jelinek wrote:
> Hi!
> 
> Since the r0-85991-ga25a8f3be322fe0f838947b679f73d6efc2a412c
> https://gcc.gnu.org/legacy-ml/gcc-patches/2008-02/msg01329.html
> changes, so that we handle macros inside of pragmas that should expand
> macros, during preprocessing we print those pragmas token by token,
> with CPP_PRAGMA printed as
>   fputs ("#pragma ", print.outf);
>   if (space)
> fprintf (print.outf, "%s %s", space, name);
>   else
> fprintf (print.outf, "%s", name);
> where name is some identifier (so e.g. print
> #pragma omp parallel
> or
> #pragma omp for
> etc.).  Because it ends in an identifier, we need to handle it like
> an identifier (i.e. CPP_NAME) for the decision whether a space needs
> to be emitted in between that #pragma whatever or #pragma whatever whatever
> and following token, otherwise the attached testcase is preprocessed as
> #pragma omp forreduction(+:red)
> rather than
> #pragma omp for reduction(+:red)
> The cpp_avoid_paste function is only called for this purpose.

Nice explanation, it helped me to understand this.
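
The spacing decision described in the quoted text can be modeled roughly like
this. It is a deliberately reduced sketch, not the libcpp code: the enum Tok
and the functions ends_like_identifier and needs_space are hypothetical, and
only the identifier-followed-by-identifier case is covered.

```cpp
// Toy token kinds; the real libcpp token enum has many more members.
enum class Tok { Name, Pragma, OpenParen, Other };

// A printed "#pragma omp for" ends in an identifier, so for spacing
// purposes it behaves like a name token.
constexpr bool ends_like_identifier(Tok a) {
  return a == Tok::Name || a == Tok::Pragma;
}

// Would printing token b directly after token a fuse the two spellings
// into one identifier?  If so, a space has to be emitted between them.
constexpr bool needs_space(Tok a, Tok b) {
  return ends_like_identifier(a) && b == Tok::Name;
}
```

In this model, a pragma token followed by a name such as "reduction" asks for
a space, while a pragma token followed by an opening parenthesis does not.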
 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
> and release branches (in particular 8 which freezes later today)?

OK.  I don't think we'd ever want to actually concatenate in this context.
 
> 2021-05-07  Jakub Jelinek  
> 
>   PR c/100450
>   * lex.c (cpp_avoid_paste): Handle token1 CPP_PRAGMA like CPP_NAME.
> 
>   * c-c++-common/gomp/pr100450.c: New test.
> 
> --- libcpp/lex.c.jj   2021-05-04 21:02:13.633917100 +0200
> +++ libcpp/lex.c  2021-05-06 20:32:07.695035739 +0200
> @@ -3709,6 +3709,7 @@ cpp_avoid_paste (cpp_reader *pfile, cons
>  case CPP_DEREF:  return c == '*';
>  case CPP_DOT:return c == '.' || c == '%' || b == CPP_NUMBER;
>  case CPP_HASH:   return c == '#' || c == '%'; /* Digraph form.  */
> +case CPP_PRAGMA:
>  case CPP_NAME:   return ((b == CPP_NUMBER
>&& name_p (pfile, >val.str))
>   || b == CPP_NAME
> --- gcc/testsuite/c-c++-common/gomp/pr100450.c.jj 2021-05-06 
> 20:33:45.302961055 +0200
> +++ gcc/testsuite/c-c++-common/gomp/pr100450.c2021-05-06 
> 20:33:39.882020738 +0200
> @@ -0,0 +1,20 @@
> +/* PR c/100450 */
> +/* { dg-do compile } */
> +/* { dg-options "-fopenmp -save-temps -Wunknown-pragmas" } */
> +
> +#define TEST(T) { \
> + {T} \
> +}
> +#define CLAUSES reduction(+:red)
> +#define PARALLEL_FOR(X) TEST({ \
> +_Pragma("omp for CLAUSES") \
> +X \
> +})
> +
> +void foo()
> +{
> +  int red = 0;
> +  int A[3] = {};
> +  #pragma omp parallel shared(red)
> +  PARALLEL_FOR( for(int i=0; i < 3; i++) red += A[i]; )
> +}
> 
>   Jakub
> 

Marek



Re: [PATCH][_GLIBCXX_DEBUG] libbacktrace integration

2021-05-07 Thread Jonathan Wakely via Gcc-patches

On 05/05/21 12:33 +0100, Jonathan Wakely wrote:

On 24/04/21 15:46 +0200, François Dumont via Libstdc++ wrote:

Hi

    Here is the patch to add backtrace generation on _GLIBCXX_DEBUG 
assertions thanks to libbacktrace.


Ville pointed out that we'll need to use libbacktrace for
std::stacktrace anyway, and it would be
useful if/when we add support for C++ Contracts to the library.

So let's integrate libbacktrace into libstdc++ properly. Jakub
suggested doing it how libsanitizer does it, which is to rebuild the
libbacktrace sources as part of the libsanitizer build, using the
preprocessor to rename the symbols so that they use reserved names.
e.g. rename backtrace_full to __glibcxx_backtrace_full or something
like that.
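
A minimal sketch of the renaming idea, assuming it is done with a macro that
is in effect while the sources are compiled. The function body and its
one-argument signature are stand-ins invented for this example (the real
backtrace_full takes different arguments), and __glibcxx_backtrace_full is
only the name floated above, not a settled choice.

```cpp
// Map the public name onto a reserved, library-private one before the
// definition is seen by the compiler.
#define backtrace_full __glibcxx_backtrace_full

// Stand-in definition with a toy signature (hypothetical); in the scheme
// described above this would be the unmodified libbacktrace source.
constexpr int backtrace_full(int skip) { return skip + 1; }

#undef backtrace_full

// After the rename, the function exists only under the private name, so
// it cannot clash with a libbacktrace the user links in separately.
constexpr int private_result = __glibcxx_backtrace_full(2);
```

The point of the design is that the sources stay untouched; only the names
seen by the compiler change.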

I'll work on getting it building as part of libstdc++ (or maybe as a
separate static library for now, as we do for libstdc++fs.a) and then
you can rework your Debug Mode patch to depend on the private version
of libbacktrace included with libstdc++ (instead of expecting users to
provide it themselves).



Re: [PATCH] tree-optimization/79333 - fold stmts following SSA edges in VN

2021-05-07 Thread Christophe Lyon via Gcc-patches
On Wed, 5 May 2021 at 09:56, Richard Biener  wrote:
>
> This makes sure to follow SSA edges when folding eliminated stmts.
> This reaps the same benefit as forwprop folding all stmts, not
> waiting for one to produce copysign in the new testcase.
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> 2021-05-04  Richard Biener  
>
> PR tree-optimization/79333
> * tree-ssa-sccvn.c (eliminate_dom_walker::eliminate_stmt):
> Fold stmt following SSA edges.
>
> * gcc.dg/tree-ssa/ssa-fre-94.c: New testcase.
> * gcc.dg/graphite/fuse-1.c: Adjust.
> * gcc.dg/pr43864-4.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/graphite/fuse-1.c |  4 ++--
>  gcc/testsuite/gcc.dg/pr43864-4.c   |  6 +++---
>  gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c | 16 
>  gcc/tree-ssa-sccvn.c   |  2 +-
>  4 files changed, 22 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
>
> diff --git a/gcc/testsuite/gcc.dg/graphite/fuse-1.c 
> b/gcc/testsuite/gcc.dg/graphite/fuse-1.c
> index 204d3b20703..527b6e5c415 100644
> --- a/gcc/testsuite/gcc.dg/graphite/fuse-1.c
> +++ b/gcc/testsuite/gcc.dg/graphite/fuse-1.c
> @@ -1,6 +1,6 @@
>  /* Check that the two loops are fused and that we manage to fold the two xor
> operations.  */
> -/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop-all 
> -fdump-tree-graphite-all" } */
> +/* { dg-options "-O2 -floop-nest-optimize -fdump-tree-forwprop4 
> -fdump-tree-graphite-all" } */
>
>  /* Make sure we fuse the loops like this:
>  AST generated by isl:
> @@ -12,7 +12,7 @@ for (int c0 = 0; c0 <= 99; c0 += 1) {
>  /* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 
> 0; c0 <= 99; c0 \\+= 1\\) 
> \\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } } */
>
>  /* Check that after fusing the loops, the scalar computation is also fused.  
> */
> -/* { dg-final { scan-tree-dump-times "gimple_simplified to\[^\\n\]*\\^ 12" 1 
> "forwprop4" } } */
> +/* { dg-final { scan-tree-dump-times " \\^ 12;" 2 "forwprop4" } } */
>
>  #define MAX 100
>  int A[MAX];
> diff --git a/gcc/testsuite/gcc.dg/pr43864-4.c 
> b/gcc/testsuite/gcc.dg/pr43864-4.c
> index 3c6cc50c5b8..8a25b0fd8ef 100644
> --- a/gcc/testsuite/gcc.dg/pr43864-4.c
> +++ b/gcc/testsuite/gcc.dg/pr43864-4.c
> @@ -22,7 +22,7 @@ int f(int c, int b, int d)
>return r - r2;
>  }
>
> -/* { dg-final { scan-tree-dump-times "if " 0 "pre"} } */
> -/* { dg-final { scan-tree-dump-times "(?n)_.*\\+.*_" 1 "pre"} } */
> -/* { dg-final { scan-tree-dump-times "(?n)_.*-.*_" 2 "pre"} } */
> +/* During PRE elimination we should simplify this to return b * 2.  */
> +/* { dg-final { scan-tree-dump-times "if " 0 "pre" } } */
> +/* { dg-final { scan-tree-dump "_\[0-9\]+ = b_\[0-9\]+\\(D\\) \\* 
> 2;\[\\r\\n\]\[^\\r\\n\]*return _\[0-9\]+;" "pre" } } */
>  /* { dg-final { scan-tree-dump-not "Invalid sum" "pre"} } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
> new file mode 100644
> index 000..92eebf636c6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
> @@ -0,0 +1,16 @@
> +/* PR tree-optimization/79333 */
> +/* { dg-do compile } */
> +/* { dg-options "-O -ffinite-math-only -fdump-tree-fre1" } */
> +
> +extern __inline __attribute__ ((__always_inline__,__gnu_inline__))
> +double __attribute__ ((__nothrow__ , __leaf__))
> +fabs (double __x) { return __builtin_fabs (__x); }
> +
> +double f(float f)
> +{
> +  double t1 = fabs(f);
> +  double t2 = f / t1;
> +  return t2;
> +}
> +
> +/* { dg-final { scan-tree-dump "copysign" "fre1" } } */

This new testcase fails on aarch64-elf / arm-eabi with newlib.

Is the following fix OK?
===
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
index 92eebf636c6..99c737562bb 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-fre-94.c
@@ -1,5 +1,6 @@
 /* PR tree-optimization/79333 */
 /* { dg-do compile } */
+/* { dg-require-effective-target c99_runtime } */
 /* { dg-options "-O -ffinite-math-only -fdump-tree-fre1" } */

 extern __inline __attribute__ ((__always_inline__,__gnu_inline__))
===

Thanks,

Christophe
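
For readers following along, the fold the new testcase scans for can be illustrated outside the compiler. As far as I understand it, with -ffinite-math-only the expression f / fabs (f) may be rewritten as copysign (1.0, f). The sketch below is only an illustration: the function names ratio_form and copysign_form are made up for this note and are not part of the patch, and the two forms are assumed to agree only for finite, non-zero inputs.

```c
#include <math.h>

/* The form written in the testcase: f / fabs (f).  */
double ratio_form(float f)
{
    double t1 = fabs(f);
    return f / t1;
}

/* The form the fre1 dump is expected to mention.  Assumed equivalent
   to ratio_form only for finite, non-zero inputs.  */
double copysign_form(float f)
{
    return copysign(1.0, f);
}
```

Both helpers give 1.0 for positive and -1.0 for negative finite inputs; they differ for zero, NaN and infinities, which is why the fold depends on the finite-math flag.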




> diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
> index ca0974d72b8..e54a0c9065c 100644
> --- a/gcc/tree-ssa-sccvn.c
> +++ b/gcc/tree-ssa-sccvn.c
> @@ -6362,7 +6362,7 @@ eliminate_dom_walker::eliminate_stmt (basic_block b, 
> gimple_stmt_iterator *gsi)
> recompute_tree_invariant_for_addr_expr (gimple_assign_rhs1 (stmt));
>gimple_stmt_iterator prev = *gsi;
>gsi_prev ();
> -  if (fold_stmt (gsi))
> +  if (fold_stmt (gsi, follow_all_ssa_edges))
> {
>   /* fold_stmt may have created new stmts inbetween
>  the previous stmt and the folded stmt.  Mark
> --
> 2.26.2


Re: PowerPC64 ELFv1 -fpatchable-function-entry

2021-05-07 Thread will schmidt via Gcc-patches
On Fri, 2021-05-07 at 12:19 +0930, Alan Modra via Gcc-patches wrote:
> On PowerPC64 ELFv1 function symbols are defined on function
> descriptors in an .opd section rather than in the function code.
> .opd is not split up by the PowerPC64 backend for comdat groups or
> other situations where per-function sections are required.  Thus
> SECTION_LINK_ORDER can't use the function name to reference a
> suitable
> section for ordering:  The .opd section might contain many other
> function descriptors and they may be in a different order to the
> final
> function code placement.  This patch arranges to use a code label
> instead of the function name symbol.
> 
> I chose to emit the label inside default_elf_asm_named_section,
> immediately before the .section directive using the label, and in
> case
> someone uses .previous or the like, need to save and restore the
> current section when switching to the function code section to emit
> the label.  That requires a tweak to switch_to_section in order to
> get
> the current section.  I checked all the TARGET_ASM_NAMED_SECTION
> functions and unnamed.callback functions and it appears none will be
> affected by that tweak.


Hi,

good description.  thanks :-)


> 
>   PR target/98125
>   * varasm.c (default_elf_asm_named_section): Use a function
>   code label rather than the function symbol as the "o" argument.
>   (switch_to_section): Don't set in_section until section
>   directive has been emitted.
> 
> diff --git a/gcc/varasm.c b/gcc/varasm.c
> index 97c1e6fff25..5f95f8cfa75 100644
> --- a/gcc/varasm.c
> +++ b/gcc/varasm.c
> @@ -6866,6 +6866,26 @@ default_elf_asm_named_section (const char
> *name, unsigned int flags,
>*f = '\0';
>  }
> 
> +  char func_label[256];
> +  if (flags & SECTION_LINK_ORDER)
> +{
> +  static int recur;
> +  if (recur)
> + gcc_unreachable ();

Interesting..   Is there any anticipation of re-entry or parallel runs
through this function that requires the recur lock/protection?


> +  else
> + {
> +   ++recur;
> +   section *save_section = in_section;
> +   static int func_code_labelno;
> +   switch_to_section (function_section (decl));
> +   ++func_code_labelno;
> +   ASM_GENERATE_INTERNAL_LABEL (func_label, "LPFC",
> func_code_labelno);
> +   ASM_OUTPUT_LABEL (asm_out_file, func_label);
> +   switch_to_section (save_section);
> +   --recur;
> + }
> +}


ok

> +
>fprintf (asm_out_file, "\t.section\t%s,\"%s\"", name, flagchars);
> 
>/* default_section_type_flags (above) knows which flags need
> special
> @@ -6893,11 +6913,8 @@ default_elf_asm_named_section (const char
> *name, unsigned int flags,
>   fprintf (asm_out_file, ",%d", flags & SECTION_ENTSIZE);
>if (flags & SECTION_LINK_ORDER)
>   {
> -   tree id = DECL_ASSEMBLER_NAME (decl);
> -   ultimate_transparent_alias_target ();
> -   const char *name = IDENTIFIER_POINTER (id);
> -   name = targetm.strip_name_encoding (name);
> -   fprintf (asm_out_file, ",%s", name);
> +   fputc (',', asm_out_file);
> +   assemble_name_raw (asm_out_file, func_label);


ok as far as I can tell :-)  assemble_name_raw is an if/else that
outputs 'name' or a LABELREF based on the file & name.  It's not an
obvious analog to the ultimate_transparent_alias_target() and name
processing that is being replaced, but seems to fit the changes as
described.


>   }
>if (HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
>   {
> @@ -7821,11 +7838,6 @@ switch_to_section (section *new_section, tree
> decl)
>else if (in_section == new_section)
>  return;
> 
> -  if (new_section->common.flags & SECTION_FORGET)
> -in_section = NULL;
> -  else
> -in_section = new_section;
> -
>switch (SECTION_STYLE (new_section))
>  {
>  case SECTION_NAMED:
> @@ -7843,6 +7855,11 @@ switch_to_section (section *new_section, tree
> decl)
>break;
>  }
> 
> +  if (new_section->common.flags & SECTION_FORGET)
> +in_section = NULL;
> +  else
> +in_section = new_section;
> +
>new_section->common.flags |= SECTION_DECLARED;


OK. 
lgtm, thx
-Will

>  }
> 
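
The interaction between the save/restore of the current section and the delayed in_section update can be sketched in isolation. This is a toy model under stated assumptions, not GCC code: sections are plain strings, asm_out is a string buffer standing in for asm_out_file, the function section is always ".text", the "awo" flag string is illustrative, and assert stands in for gcc_unreachable ().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-ins for GCC's in_section and asm_out_file.  */
static const char *in_section = ".data";
static char asm_out[512];

static void switch_to_section(const char *new_section);

/* Emit a .section directive.  For a link-order section, first place a
   code label in the function section (".text" in this model) and refer
   to that label instead of the function symbol.  */
static void emit_named_section(const char *name, int link_order)
{
    char label[32] = "";

    if (link_order) {
        static int recur;    /* guards against re-entry */
        static int labelno;

        assert(!recur);      /* the patch calls gcc_unreachable () here */
        ++recur;
        const char *save = in_section;   /* still the previous section */
        switch_to_section(".text");
        snprintf(label, sizeof label, ".LPFC%d", ++labelno);
        strcat(asm_out, label);
        strcat(asm_out, ":\n");
        switch_to_section(save);
        --recur;
    }

    strcat(asm_out, "\t.section\t");
    strcat(asm_out, name);
    if (link_order) {
        strcat(asm_out, ",\"awo\",@progbits,");
        strcat(asm_out, label);
    }
    strcat(asm_out, "\n");
}

static void switch_to_section(const char *new_section)
{
    if (in_section && strcmp(in_section, new_section) == 0)
        return;
    emit_named_section(new_section,
                       strcmp(new_section,
                              "__patchable_function_entries") == 0);
    /* Updated only after the directive has been emitted.  */
    in_section = new_section;
}
```

In this model, moving the in_section assignment ahead of the emit call makes the saved section equal to the new section, so the restore re-enters emit_named_section and trips the guard. That seems to be the situation the switch_to_section tweak avoids.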



Re: Revert "rs6000: Avoid -fpatchable-function-entry* regressions on powerpc64 be [PR98125]"

2021-05-07 Thread will schmidt via Gcc-patches
On Fri, 2021-05-07 at 12:19 +0930, Alan Modra via Gcc-patches wrote:
> This reverts commit b680b9049737198d010e49cf434704c6a6ed2b3f now
> that the PowerPC64 ELFv1 regression is fixed properly.
> 
Hi,

Ok.  looks like that was initially handled by Jakub, on copy, good. :-)

Contents below appear to match that commit, reversed.
lgtm,
thanks
-Will


>   PR testsuite/98125
>   * targhooks.h (default_print_patchable_function_entry_1): Delete.
>   * targhooks.c (default_print_patchable_function_entry_1): Delete.
>   (default_print_patchable_function_entry): Expand above.
>   * config/rs6000/rs6000.c (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY):
>   Don't define.
>   (rs6000_print_patchable_function_entry): Delete.
>   * testsuite/g++.dg/pr93195a.C: Revert 2021-04-03 change.
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 663eed4f055..d43c36e7f1a 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -1362,10 +1362,6 @@ static const struct attribute_spec 
> rs6000_attribute_table[] =
>  #define TARGET_ASM_ASSEMBLE_VISIBILITY rs6000_assemble_visibility
>  #endif
> 
> -#undef TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY
> -#define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY \
> -  rs6000_print_patchable_function_entry
> -
>  #undef TARGET_SET_UP_BY_PROLOGUE
>  #define TARGET_SET_UP_BY_PROLOGUE rs6000_set_up_by_prologue
> 
> @@ -14957,30 +14953,6 @@ rs6000_assemble_visibility (tree decl, int vis)
>  }
>  #endif
>  
> -/* Write PATCH_AREA_SIZE NOPs into the asm outfile FILE around a function
> -   entry.  If RECORD_P is true and the target supports named sections,
> -   the location of the NOPs will be recorded in a special object section
> -   called "__patchable_function_entries".  This routine may be called
> -   twice per function to put NOPs before and after the function
> -   entry.  */
> -
> -void
> -rs6000_print_patchable_function_entry (FILE *file,
> -unsigned HOST_WIDE_INT patch_area_size,
> -bool record_p)
> -{
> -  unsigned int flags = SECTION_WRITE | SECTION_RELRO;
> -  /* When .opd section is emitted, the function symbol
> - default_print_patchable_function_entry_1 is emitted into the .opd 
> section
> - while the patchable area is emitted into the function section.
> - Don't use SECTION_LINK_ORDER in that case.  */
> -  if (!(TARGET_64BIT && DEFAULT_ABI != ABI_ELFv2)
> -  && HAVE_GAS_SECTION_LINK_ORDER)
> -flags |= SECTION_LINK_ORDER;
> -  default_print_patchable_function_entry_1 (file, patch_area_size, record_p,
> - flags);
> -}
> -
>  enum rtx_code
>  rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
>  {
> diff --git a/gcc/targhooks.c b/gcc/targhooks.c
> index 952fad422eb..d69c9a2d819 100644
> --- a/gcc/targhooks.c
> +++ b/gcc/targhooks.c
> @@ -1832,15 +1832,17 @@ default_compare_by_pieces_branch_ratio (machine_mode)
>return 1;
>  }
> 
> -/* Helper for default_print_patchable_function_entry and other
> -   print_patchable_function_entry hook implementations.  */
> +/* Write PATCH_AREA_SIZE NOPs into the asm outfile FILE around a function
> +   entry.  If RECORD_P is true and the target supports named sections,
> +   the location of the NOPs will be recorded in a special object section
> +   called "__patchable_function_entries".  This routine may be called
> +   twice per function to put NOPs before and after the function
> +   entry.  */
> 
>  void
> -default_print_patchable_function_entry_1 (FILE *file,
> -   unsigned HOST_WIDE_INT
> -   patch_area_size,
> -   bool record_p,
> -   unsigned int flags)
> +default_print_patchable_function_entry (FILE *file,
> + unsigned HOST_WIDE_INT patch_area_size,
> + bool record_p)
>  {
>const char *nop_templ = 0;
>int code_num;
> @@ -1862,6 +1864,9 @@ default_print_patchable_function_entry_1 (FILE *file,
>patch_area_number++;
>ASM_GENERATE_INTERNAL_LABEL (buf, "LPFE", patch_area_number);
> 
> +  unsigned int flags = SECTION_WRITE | SECTION_RELRO;
> +  if (HAVE_GAS_SECTION_LINK_ORDER)
> + flags |= SECTION_LINK_ORDER;
>switch_to_section (get_section ("__patchable_function_entries",
> flags, current_function_decl));
>assemble_align (POINTER_SIZE);
> @@ -1878,25 +1883,6 @@ default_print_patchable_function_entry_1 (FILE *file,
>  output_asm_insn (nop_templ, NULL);
>  }
> 
> -/* Write PATCH_AREA_SIZE NOPs into the asm outfile FILE around a function
> -   entry.  If RECORD_P is true and the target supports named sections,
> -   the location of the NOPs will be recorded in a special object section
> -   called 

Re: PowerPC64 ELFv2 -fpatchable-function-entry

2021-05-07 Thread will schmidt via Gcc-patches
On Fri, 2021-05-07 at 12:19 +0930, Alan Modra via Gcc-patches wrote:
> PowerPC64 ELFv2 dual entry point functions have a couple of problems
> with -fpatchable-function-entry.  One is that the nops added after the
> global entry land in the global entry code which is constrained to be
> a power of two number of instructions, and zero global entry code has
> special meaning for linkage.  The other is that the global entry code
> isn't always used on function entry, and some uses of
> -fpatchable-function-entry might want to affect all entries to the
> function.  So this patch arranges to put one batch of nops before the
> global entry, and the other after the local entry point.
> 

Hi,

Description good.  :-)

>   PR target/98125
>   * config/rs6000/rs6000.c (rs6000_print_patchable_function_entry): New
>   function.
>   (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): Define.
>   * config/rs6000/rs6000-logue.c: Include targhooks.h.
>   (rs6000_output_function_prologue): Handle nops for
>   -fpatchable-function-entry after local entry point.
> 
> diff --git a/gcc/config/rs6000/rs6000-logue.c 
> b/gcc/config/rs6000/rs6000-logue.c
> index b0ac183ceff..ffa3bb3dcf1 100644
> --- a/gcc/config/rs6000/rs6000-logue.c
> +++ b/gcc/config/rs6000/rs6000-logue.c
> @@ -51,6 +51,7 @@
>  #include "gstab.h"  /* for N_SLINE */
>  #include "dbxout.h" /* dbxout_ */
>  #endif
> +#include "targhooks.h"
> 
>  static int rs6000_ra_ever_killed (void);
>  static void is_altivec_return_reg (rtx, void *);
> @@ -3991,6 +3992,10 @@ rs6000_output_function_prologue (FILE *file)
>fputs (",1\n", file);
>  }
> 
> +  int nops_after_entry = crtl->patch_area_size - crtl->patch_area_entry;
> +  if (nops_after_entry > 0)
> +default_print_patchable_function_entry (file, nops_after_entry, false);
> +
>/* Output -mprofile-kernel code.  This needs to be done here instead of
>   in output_function_profile since it must go after the ELFv2 ABI
>   local entry point.  */
ok


> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index d43c36e7f1a..97f1b3e0674 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -1404,6 +1404,10 @@ static const struct attribute_spec 
> rs6000_attribute_table[] =
>  #undef TARGET_ASM_FUNCTION_EPILOGUE
>  #define TARGET_ASM_FUNCTION_EPILOGUE rs6000_output_function_epilogue
> 
> +#undef TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY
> +#define TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY \
> +  rs6000_print_patchable_function_entry
> +

ok

>  #undef TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA
>  #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA rs6000_output_addr_const_extra
> 
> @@ -14953,6 +14957,33 @@ rs6000_assemble_visibility (tree decl, int vis)
>  }
>  #endif
>  
> +/* Write NOPs into the asm outfile FILE around a function entry.  This
> +   routine may be called twice per function to put NOPs before and after
> +   the function entry.  If RECORD_P is true the location of the NOPs will
> +   be recorded by default_print_patchable_function_entry in a special
> +   object section called "__patchable_function_entries".  Disable output
> +   of any NOPs for the second call.  Those, if any, are output by
> +   rs6000_output_function_prologue.  This means that for ELFv2 any NOPs
> +   after the function entry are placed after the local entry point, not
> +   the global entry point.  NOPs after the entry may be found at
> +   record_loc + nops_before * 4 + local_entry_offset.  This holds true
> +   when nops_before is zero.  */
> +
> +static void
> +rs6000_print_patchable_function_entry (FILE *file,
> +unsigned HOST_WIDE_INT patch_area_size 
> ATTRIBUTE_UNUSED,
> +bool record_p)
> +{
> +  /* Always call default_print_patchable_function_entry when RECORD_P in

when RECORD_P is true?  (implied, but I like to be specific..)
> + order to output the location of the NOPs, but use the size of the
> + area before the entry on both possible calls.  If RECORD_P is true
> + on the second call then the area before the entry was zero size and
> + thus no NOPs will be output.  */
> +  if (record_p)
> +default_print_patchable_function_entry (file, crtl->patch_area_entry,
> + record_p);
> +}

ok.

lgtm,thx
-Will

> +
>  enum rtx_code
>  rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
>  {
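
The arithmetic described in the patch comment can be written down as a small standalone sketch. The struct and function names below are hypothetical and exist only for this note; size and entry are assumed to correspond to crtl->patch_area_size and crtl->patch_area_entry, and each NOP is assumed to be 4 bytes as in the comment.

```c
/* Hypothetical model of -fpatchable-function-entry=size,entry.  */
struct patch_area {
    unsigned size;   /* total number of NOPs requested */
    unsigned entry;  /* how many of them go before the entry point */
};

/* NOPs emitted before the global entry point.  */
unsigned nops_before_entry(struct patch_area pa)
{
    return pa.entry;
}

/* NOPs emitted after the local entry point; zero if none remain.  */
unsigned nops_after_entry(struct patch_area pa)
{
    return pa.size > pa.entry ? pa.size - pa.entry : 0;
}

/* Address of the NOPs after entry, following the formula in the patch
   comment: record_loc + nops_before * 4 + local_entry_offset.  */
unsigned long nops_after_addr(unsigned long record_loc,
                              struct patch_area pa,
                              unsigned local_entry_offset)
{
    return record_loc + nops_before_entry(pa) * 4UL + local_entry_offset;
}
```

For example, with size 5 and entry 2 there are 3 NOPs after the local entry point, and with a recorded location of 0x1000 and an assumed local entry offset of 8 they would start at 0x1010.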



RE: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-07 Thread Tamar Christina via Gcc-patches
Hi Richi,

> -Original Message-
> From: Richard Biener 
> Sent: Friday, May 7, 2021 12:46 PM
> To: Tamar Christina 
> Cc: gcc-patches@gcc.gnu.org; nd 
> Subject: Re: [PATCH 1/4]middle-end Vect: Add support for dot-product
> where the sign for the multiplicant changes.
> 
> On Wed, 5 May 2021, Tamar Christina wrote:
> 
> > Hi All,
> >
> > This patch adds support for a dot product where the sign of the
> > multiplication arguments differ. i.e. one is signed and one is
> > unsigned but the precisions are the same.
> >
> > #define N 480
> > #define SIGNEDNESS_1 unsigned
> > #define SIGNEDNESS_2 signed
> > #define SIGNEDNESS_3 signed
> > #define SIGNEDNESS_4 unsigned
> >
> > SIGNEDNESS_1 int __attribute__ ((noipa)) f (SIGNEDNESS_1 int res,
> > SIGNEDNESS_3 char *restrict a,
> >SIGNEDNESS_4 char *restrict b)
> > {
> >   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> > {
> >   int av = a[i];
> >   int bv = b[i];
> >   SIGNEDNESS_2 short mult = av * bv;
> >   res += mult;
> > }
> >   return res;
> > }
> >
> > The operations are performed as if the operands were extended to a 32-bit
> value.
> > As such this operation isn't valid if there is an intermediate
> > conversion to an unsigned value. i.e.  if SIGNEDNESS_2 is unsigned.
> >
> > more over if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are flipped
> > the same optab is used but the operands are flipped in the optab
> expansion.
> >
> > To support this the patch extends the dot-product detection to
> > optionally ignore operands with different signs and stores this
> > information in the optab subtype which is now made a bitfield.
> >
> > The subtype can now additionally controls which optab an EXPR can expand
> to.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master?
> >
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > * optabs.def (usdot_prod_optab): New.
> > * doc/md.texi: Document it.
> > * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
> > * optabs-tree.h (enum optab_subtype): Likewise.
> > * optabs.c (expand_widen_pattern_expr): Likewise.
> > * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
> > * tree-vect-loop.c (vect_determine_dot_kind): New.
> > (vectorizable_reduction): Query dot-product kind.
> > * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take
> optional
> > optab subtype.
> > (vect_joust_widened_type, vect_widened_op_tree): Optionally
> ignore
> > mismatch types.
> > (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> >
> > --- inline copy of patch --
> > diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index
> >
> d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fd
> f2
> > e66bc80d7d23 100644
> > --- a/gcc/doc/md.texi
> > +++ b/gcc/doc/md.texi
> > @@ -5440,11 +5440,13 @@ Like @samp{fold_left_plus_@var{m}}, but
> takes
> > an additional mask operand  @item @samp{sdot_prod@var{m}}  @cindex
> > @code{udot_prod@var{m}} instruction pattern  @itemx
> > @samp{udot_prod@var{m}}
> > +@cindex @code{usdot_prod@var{m}} instruction pattern @itemx
> > +@samp{usdot_prod@var{m}}
> >  Compute the sum of the products of two signed/unsigned elements.
> > -Operand 1 and operand 2 are of the same mode. Their product, which is
> > of a -wider mode, is computed and added to operand 3. Operand 3 is of
> > a mode equal or -wider than the mode of the product. The result is
> > placed in operand 0, which -is of the same mode as operand 3.
> > +Operand 1 and operand 2 are of the same mode but may differ in signs.
> > +Their product, which is of a wider mode, is computed and added to
> operand 3.
> > +Operand 3 is of a mode equal or wider than the mode of the product.
> > +The result is placed in operand 0, which is of the same mode as operand 3.
> 
> This doesn't really say what the 's', 'u' and 'us' specify.  Since we're 
> doing a
> widen multiplication and then a non-widening addition we only need to
> know the effective sign of the multiplication so I think the existing 's' and 
> 'u'
> are enough to cover all cases?

The existing 's' and 'u' enforce that both operands of the multiplication are
of the same sign.  So for e.g. 'u' both operands must be unsigned.

In the `us` case one can be signed and one unsigned.  Operationally, the signed
value is sign extended to the wider type and the unsigned value is zero
extended first; the operation then converts to unsigned to perform the
unsigned multiplication, conforming to the C promotion rules.
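
A scalar reference for the mixed-sign case may make the difference from the same-sign patterns concrete. This is only a sketch of the semantics as described above, not the optab expansion: the function name usdot_ref and its argument order are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Reference semantics: one operand is zero extended, the other sign
   extended, the products are formed in 32 bits and accumulated.  */
int32_t usdot_ref(int32_t acc, const uint8_t *a, const int8_t *b, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        int32_t av = a[i];   /* unsigned char: zero extended */
        int32_t bv = b[i];   /* signed char: sign extended */
        acc += av * bv;
    }
    return acc;
}
```

With a = {255, 2} and b = {-1, 3} the products are -255 and 6. Treating both inputs as signed would instead read 255 as -1 and give 1 for the first product, which is why the sign of each operand has to be known at expansion time.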

TL;DR: Without a new optab I can't tell during expansion which semantics the
operation had at the gimple/C level, as modes don't carry signs.

Long version:

The problem with using the existing patterns is that, because they enforce
that `av` and `bv` have the same sign, we can't remove the explicit sign
extensions, but the multiplication must be done on the sign/zero extended char
input in the same sign.

Which means (unless I am 

Re: [PATCH, OG10, OpenMP 5.0, committed] Implement relaxation of implicit map vs. existing device mappings

2021-05-07 Thread Thomas Schwinge
Hi Chung-Lin!

On 2021-05-05T23:17:25+0800, Chung-Lin Tang via Gcc-patches 
 wrote:
> This patch implements relaxing the requirements when a map with the implicit 
> attribute encounters
> an overlapping existing map.  [...]

Oh, oh, these data mapping interfaces/semantics are getting more and
more "convoluted"...  %-\ (Not your fault, of course.)

Haven't looked in too much detail at the patch/implementation (I'm not
very well-versed in the exact OpenMP semantics anyway), but I suppose we
should do similar things for OpenACC, too.  I think we even currently do
have a gimplification-level "hack" to replicate data clauses' array
bounds for implicit data clauses on compute constructs, if the default
"complete" mapping is going to clash with a "limited" mapping that's
specified in an outer OpenACC 'data' directive.  (That, of course,
doesn't work for the general case of non-lexical scoping, or dynamic
OpenACC 'enter data', etc., I suppose.)  I suppose your method could easily
replace and improve that; we shall look into that later.

That said, in your patch, is this current implementation (explicitly)
meant or not meant to be active for OpenACC, too, or just OpenMP (I
couldn't quickly tell), and/or is it (implicitly?) a no-op for OpenACC?

> As the OpenMP 5.0 spec describes on page 320, lines 18-27 (and 5.1 spec,
> page 352, lines 13-22):
>
> "If a single contiguous part of the original storage of a list item with an 
> implicit data-mapping
>   attribute has corresponding storage in the device data environment prior to 
> a task encountering the
>   construct that is associated with the map clause, only that part of the 
> original storage will have
>   corresponding storage in the device data environment as a result of the map 
> clause."
>
> Also tracked in the OpenMP spec context as issue #1463:
> https://github.com/OpenMP/spec/issues/1463
>
> The implementation inside the compiler is to of course, tag the implicitly 
> created maps with some
> indication of "implicit". I've done this with a OMP_CLAUSE_MAP_IMPLICIT_P 
> macro, using
> 'base.deprecated_flag' underneath.
>
> There is an encoding of this as GOMP_MAP_IMPLICIT == 
> GOMP_MAP_FLAG_SPECIAL_3|GOMP_MAP_FLAG_SPECIAL_4
> in include/gomp-constants.h for the runtime, but I've intentionally avoided 
> exploding the entire
> gimplify/omp-low with a new set of GOMP_MAP_IMPLICIT_TO/FROM/etc. symbols, 
> instead adding in the new
> flag bits only at the final runtime call generation during omp-lowering.
>
> The rest is libgomp mapping taking care of the implicit case: allowing map 
> success if an existing
> map is a proper subset of the new map, if the new map is implicit. 
> Straightforward enough I think.

Seems so -- based on my very quick look.  ;-)
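
To make the described rule concrete, here is a minimal sketch of the containment check as I read the description. It is not the libgomp implementation: the struct, the function names and the treatment of partial overlap (assumed to still be rejected) are all assumptions for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Half-open host address range [start, end).  */
struct range {
    uintptr_t start;
    uintptr_t end;
};

/* True if INNER lies entirely within OUTER.  */
static bool contains(struct range outer, struct range inner)
{
    return outer.start <= inner.start && inner.end <= outer.end;
}

/* Decide whether a new map request can reuse an existing mapping.
   Explicit maps must lie within the existing mapping.  Implicit maps
   are additionally accepted when the existing mapping lies within the
   requested range; only the already mapped part is then used.  */
bool map_request_ok(struct range existing, struct range requested,
                    bool implicit)
{
    if (contains(existing, requested))
        return true;
    return implicit && contains(requested, existing);
}
```

For instance, with an existing mapping of [100, 200), an explicit request for [0, 400) is rejected while the same request marked implicit is accepted.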

> There are also some additions to print the implicit attribute during tree 
> pretty-printing, for that
> reason some scan tests were updated.

ACK, thanks.

> Also, another adjustment in this patch is how implicitly created clauses are 
> added to the current
> clause list in gimplify_adjust_omp_clauses(). Instead of simply appending the 
> new clauses to the end,
> this patch adds them at the position "after initial non-map clauses, but 
> right before any existing
> map clauses".

Probably you haven't been testing such a configuration; I've just pushed
"Fix up 'c-c++-common/goacc/firstprivate-mappings-1.c' for C, non-LP64"
to devel/omp/gcc-10 branch in commit
c51cc3b96f0b562deaffcfbcc51043aed216801a, see attached.

> The reason for this is: when combined with other map clauses, for example:
>
>#pragma omp target map(rec.ptr[:N])
>for (int i = 0; i < N; i++)
>  rec.ptr[i] += 1;
>
> There will be an implicit map created for map(rec), because of the access 
> inside the target region.
> The expectation is that 'rec' is implicitly mapped, and then the pointed 
> array-section part by 'rec.ptr'
> will be mapped, and then attachment to the 'rec.ptr' field of the mapped 
> 'rec' (in that order).
>
> If the implicit 'map(rec)' is appended to the end, instead of placed before 
> other maps, the attachment
> operation will not find anything to attach to, and the entire region will 
> fail.

But that doesn't (negatively) affect user-visible semantics (OpenMP, and
also OpenACC, if applicable), in that more/bigger objects then get mapped
than were before?  (I suppose not?)

Please make sure to put any rationale (like you've posted above) into
source code comments ('gcc/gimplify.c:gimplify_adjust_omp_clauses', I
suppose), or even GCC internals (?) manual, if applicable.


Grüße
 Thomas


> Note: this touches a bit on another issue which I will be sending a patch for 
> later:
> per the discussion on omp-lang, an array section list item should *not* be 
> mapping its base-pointer
> (although an attachment attempt should exist), while in current GCC behavior, 
> for struct member pointers
> like 'rec.ptr' above, we do map it (which should be deemed incorrect).
>
> This means that as of right now, this 

Re: [RFC] ldist: Recognize rawmemchr loop patterns

2021-05-07 Thread Stefan Schulze Frielinghaus via Gcc-patches
On Wed, May 05, 2021 at 11:36:41AM +0200, Richard Biener wrote:
> On Tue, Mar 16, 2021 at 6:13 PM Stefan Schulze Frielinghaus
>  wrote:
> >
> > [snip]
> >
> > Please find attached a new version of the patch.  A major change compared to
> > the previous patch is that I created a separate pass which hopefully makes
> > reviewing also easier since it is almost self-contained.  After realizing
> > that detecting loops which mimic the behavior of rawmemchr/strlen functions
> > does not really fit into the topic of loop distribution, I created a
> > separate pass.
> 
> It's true that these reduction-like patterns are more difficult than
> the existing
> memcpy/memset cases.
> 
> >  Due
> > to this I was also able to play around a bit and schedule the pass at 
> > different
> > times.  Currently it is scheduled right before loop distribution where loop
> > header copying already took place which leads to the following effect.
> 
> In fact I'd schedule it after loop distribution so there's the chance that 
> loop
> distribution can expose a loop that fits the new pattern.
> 
> >  Running
> > this setup over
> >
> > char *t (char *p)
> > {
> >   for (; *p; ++p);
> >   return p;
> > }
> >
> > the new pass transforms
> >
> > char * t (char * p)
> > {
> >   char _1;
> >   char _7;
> >
> >[local count: 118111600]:
> >   _7 = *p_3(D);
> >   if (_7 != 0)
> > goto ; [89.00%]
> >   else
> > goto ; [11.00%]
> >
> >[local count: 105119324]:
> >
> >[local count: 955630225]:
> >   # p_8 = PHI 
> >   p_6 = p_8 + 1;
> >   _1 = *p_6;
> >   if (_1 != 0)
> > goto ; [89.00%]
> >   else
> > goto ; [11.00%]
> >
> >[local count: 105119324]:
> >   # p_2 = PHI 
> >   goto ; [100.00%]
> >
> >[local count: 850510901]:
> >   goto ; [100.00%]
> >
> >[local count: 12992276]:
> >
> >[local count: 118111600]:
> >   # p_9 = PHI 
> >   return p_9;
> >
> > }
> >
> > into
> >
> > char * t (char * p)
> > {
> >   char * _5;
> >   char _7;
> >
> >[local count: 118111600]:
> >   _7 = *p_3(D);
> >   if (_7 != 0)
> > goto ; [89.00%]
> >   else
> > goto ; [11.00%]
> >
> >[local count: 105119324]:
> >   _5 = p_3(D) + 1;
> >   p_10 = .RAWMEMCHR (_5, 0);
> >
> >[local count: 118111600]:
> >   # p_9 = PHI 
> >   return p_9;
> >
> > }
> >
> > which is fine so far.  However, I haven't made up my mind so far whether it 
> > is
> > worthwhile to spend more time in order to also eliminate the "first 
> > unrolling"
> > of the loop.
> 
> Might be a phiopt transform ;)  Might apply to quite some set of
> builtins.  I wonder how the strlen case looks like though.
> 
> > I gave it a shot by scheduling the pass prior pass copy header
> > and ended up with:
> >
> > char * t (char * p)
> > {
> >[local count: 118111600]:
> >   p_5 = .RAWMEMCHR (p_3(D), 0);
> >   return p_5;
> >
> > }
> >
> > which seems optimal to me.  The downside of this is that I have to 
> > initialize
> > scalar evolution analysis which might be undesired that early.
> >
> > All this brings me to the question: where do you see this piece of code
> > running?
> > If in a separate pass when would you schedule it?  If in an existing pass,
> > which one would you choose?
> 
> I think it still fits loop distribution.  If you manage to detect it
> with your pass
> standalone then you should be able to detect it in loop distribution.

If a loop is distributed only because one of the partitions matches a
rawmemchr/strlen-like loop pattern, then we have at least two partitions
which walk over the same memory region.  Since a rawmemchr/strlen-like
loop has no body (neglecting expression-3 of a for-loop where just an
increment happens) it is governed by the memory accesses in the loop
condition.  Therefore, in such a case loop distribution would result in
performance degradation.  This is why I think that it does not fit
conceptually into ldist pass.  However, since I make use of a couple of
helper functions from ldist pass, it may still fit technically.
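
For reference, the loop shape under discussion and the two results shown earlier in the thread can be written as plain C. This is a sketch only: rawmemchr_ref is a hypothetical stand-in for the internal function .RAWMEMCHR, and t_after_pass and t_ideal are names invented here for the two dumps quoted above.

```c
/* The loop from the mail: no body, governed only by the load in the
   loop condition.  */
char *t(char *p)
{
    for (; *p; ++p)
        ;
    return p;
}

/* Hypothetical stand-in for .RAWMEMCHR: scan until C is found, with
   no length bound.  */
static char *rawmemchr_ref(char *s, int c)
{
    while (*(unsigned char *)s != (unsigned char)c)
        ++s;
    return s;
}

/* Shape after the pass when run after loop header copying: the first
   iteration stays peeled in front of the call.  */
char *t_after_pass(char *p)
{
    if (*p != 0)
        return rawmemchr_ref(p + 1, 0);
    return p;
}

/* Shape when the pass runs before header copying.  */
char *t_ideal(char *p)
{
    return rawmemchr_ref(p, 0);
}
```

All three functions return a pointer to the terminating NUL of the input string.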

Since currently all ldist optimizations operate over loops where niters
is known and for rawmemchr/strlen-like loops this is not the case, it is
not possible that those optimizations expose a loop which is suitable
for rawmemchr/strlen optimization.  Therefore, what do you think about
scheduling rawmemchr/strlen optimization right between those
if-statements of function loop_distribution::execute?

   if (nb_generated_loops + nb_generated_calls > 0)
 {
   changed = true;
   if (dump_enabled_p ())
 dump_printf_loc (MSG_OPTIMIZED_LOCATIONS,
  loc, "Loop%s %d distributed: split to %d loops "
  "and %d library calls.\n", str, loop->num,
  nb_generated_loops, nb_generated_calls);

   break;
 }

   // rawmemchr/strlen like loops

   if (dump_file && (dump_flags & TDF_DETAILS))
 fprintf (dump_file, "Loop%s %d not distributed.\n", str, loop->num);

> Can you
> explain what part is "easier" as 

Re: [RFC] Using main loop's updated IV as base_address for epilogue vectorization

2021-05-07 Thread Richard Biener
On Wed, 5 May 2021, Andre Vieira (lists) wrote:

> 
> On 05/05/2021 13:34, Richard Biener wrote:
> > On Wed, 5 May 2021, Andre Vieira (lists) wrote:
> >
> >> I tried to see what IVOPTs would make of this and it is able to analyze the
> >> IVs but it doesn't realize (not even sure it tries) that one IV's end (loop
> >> 1) could be used as the base for the other (loop 2). I don't know if this is
> >> where you'd want such optimizations to be made, on one side I think it
> >> would be great as it would also help with non-vectorized loops as you
> >> alluded to.
> > Hmm, OK.  So there's the first loop that has a looparound jump and thus
> > we do not always enter the 2nd loop with the first loop final value of the
> > IV.  But yes, IVOPTs does not try to allocate IVs across multiple loops.
> > And for a followup transform to catch this it would need to compute
> > the final value of the IV and then match this up with the initial
> > value computation.  I suppose FRE could be taught to do this, at
> > least for very simple cases.
> I will admit I am not at all familiar with how FRE works. I know it exists, as
> running it often breaks my vector patches :P But that's about
> all I know.
> I will have a look and see if it makes sense from my perspective to address it
> there, because ...
> >
> >> Anyway, I digress. Back to the main question of this patch. How do you
> >> suggest I go about this? Is there a way to make IVOPTS aware of the
> >> 'iterate-once' IVs in the epilogue(s) (both vector and scalar!) and then
> >> teach it to merge IVs if one ends where the other begins?
> > I don't think we will make that work easily.  So indeed attacking this
> > in the vectorizer sounds most promising.
> 
> The problem I found with my approach is that it only tackles the vectorized
> epilogues, and that leads to regressions. I don't have the example at hand,
> but what I saw happening was that increased register pressure led to a spill
> in the hot path. I believe this was caused by the epilogue loop using the
> updated pointers as the base for its DRs (in this case there were three DRs:
> two loads and one store) while the scalar epilogue still used the original
> base + niters, since this data_reference approach only changes the
> vectorized epilogues.

Yeah, this issue obviously extends to the scalar prologue and epilogue loops...

So ideally we'd produce IL (mainly for the IV setup code I guess)
that will be handled well by the following passes but then IVOPTs
is not multi-loop aware ...

That said, in the end we should be able to code-generate the scalar
loops as well (my plan is to add that, at least for the vector
loop, to be able to support partly vectorized loops with unvectorizable
stmts simply replicated as scalar ops).  In that case we can use
the same IVs again.
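
To make the "same IVs" point concrete, here is a rough scalar model (not from the thread, names invented) in which the epilogue simply continues from the main loop's final pointer value instead of recomputing base + niters. The four-element inner loop only stands in for a vector iteration.

```c
#include <stddef.h>

void
sketch_add_one (int *p, size_t n)
{
  int *q = p;
  int *main_end = p + (n - n % 4);

  /* "Main" loop, standing in for the vector loop: four elements per
     step.  */
  for (; q != main_end; q += 4)
    for (int l = 0; l < 4; ++l)
      q[l] += 1;

  /* Epilogue: starts from the final value of Q, so it needs no
     separately computed start address.  */
  for (int *end = p + n; q != end; ++q)
    *q += 1;
}
```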

> >   I'll note there's also
> > the issue of epilogue vectorization and reductions where we seem
> > to not re-use partially reduced reduction vectors but instead
> > reduce to a scalar in each step.  That's a related issue - we're
> > not able to carry forward a (reduction) IV we generated for the
> > main vector loop to the epilogue loops.  Like for
> >
> > double foo (double *a, int n)
> > {
> >double sum = 0.;
> >for (int i = 0; i < n; ++i)
> >  sum += a[i];
> >return sum;
> > }
> >
> > with AVX512 we get three reductions to scalars instead of
> > a partial reduction from zmm to ymm before the first vectorized
> > epilogue followed by a reduction from ymm to xmm before the second
> > (the jump around for the epilogues need to jump to the further
> > reduction piece obviously).
> >
> > So I think we want to record IVs we generate (the reduction IVs
> > are already nicely associated with the stmt-infos), one might
> > consider to refer to them from the dr_vec_info for example.
> >
> > It's just going to be "interesting" to wire everything up
> > correctly with all the jump-arounds we have ...
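
As an illustration only (not code from the thread), the staged reduction described above can be modelled in plain C, with arrays of 8, 4 and 2 doubles standing in for zmm, ymm and xmm accumulators. The function name is hypothetical, and the jump-arounds between epilogues are ignored. Note that reassociating a floating-point sum like this can change rounding in general.

```c
#include <stddef.h>

double
sketch_sum_staged (const double *a, size_t n)
{
  double v8[8] = { 0 }, v4[4], v2[2];
  size_t i = 0;

  /* Main loop: 8 lanes (zmm-sized accumulator).  */
  for (; i + 8 <= n; i += 8)
    for (int l = 0; l < 8; ++l)
      v8[l] += a[i + l];

  /* Partial reduction 8 -> 4 lanes, carried into the first epilogue.  */
  for (int l = 0; l < 4; ++l)
    v4[l] = v8[l] + v8[l + 4];
  for (; i + 4 <= n; i += 4)
    for (int l = 0; l < 4; ++l)
      v4[l] += a[i + l];

  /* Partial reduction 4 -> 2 lanes, carried into the second epilogue.  */
  for (int l = 0; l < 2; ++l)
    v2[l] = v4[l] + v4[l + 2];
  for (; i + 2 <= n; i += 2)
    for (int l = 0; l < 2; ++l)
      v2[l] += a[i + l];

  /* Only now reduce to a scalar, then finish with the scalar tail.  */
  double sum = v2[0] + v2[1];
  for (; i < n; ++i)
    sum += a[i];
  return sum;
}
```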
> I have a downstream hack for the reductions, but it only worked for
> partial-vector-usage as there you have the guarantee it's the same
> vector-mode, so you don't need to pfaff around with half and full vectors.
> Obviously what you are suggesting has much wider applications and not
> surprisingly I think Richard Sandiford also pointed out to me that these are
> somewhat related and we might be able to reuse the IV-creation to manage it
> all. But I feel like I am currently light years away from that.
> 
> I had started to look at removing the data_reference updating we have now and
> dealing with this in the 'create_iv' calls from 'vect_create_data_ref_ptr'
> inside 'vectorizable_{load,store}' but then I thought it would be good to
> discuss it with you first. This will require keeping track of the 'end-value'
> of the IV, which for loops where we can skip the previous loop means we will
> need to construct a phi-node containing the updated pointer and the initial
> base. But I'm not 

Re: [PATCH] Vect: Remove restrictions on dotprod signedness

2021-05-07 Thread Richard Biener
On Wed, 5 May 2021, Tamar Christina wrote:

> Hi All,
> 
> There's no reason that the signs of the operands of a dot-product all have to
> be the same.  The only real restriction is that the signs of the multiplicands
> are the same; the signs of the multiplier and the accumulator need
> not be the same.
> 
> The type of the overall operations should be determined by the sign of the
> multiplicand which is already being done by optabs-tree.c.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?

OK if the rest of the series is.

Richard.

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * tree-vect-patterns.c (vect_recog_dot_prod_pattern): Remove sign check.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.dg/vect/vect-reduc-dot-2.c: Expect to pass.
>   * gcc.dg/vect/vect-reduc-dot-3.c: Likewise.
>   * gcc.dg/vect/vect-reduc-dot-6.c: Likewise.
>   * gcc.dg/vect/vect-reduc-dot-7.c: Likewise.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-2.c 
> b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-2.c
> index 
> 25757d2b6713b53a325979b96f89396dbf4675b8..2ebe98887a6072b9e674846af1df38cdc94258dd
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-2.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-2.c
> @@ -6,5 +6,5 @@
>  
>  #include "vect-reduc-dot-1.c"
>  
> -/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" 
> "vect" } } */
> +/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" 
> "vect" } } */
>  
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-3.c 
> b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-3.c
> index 
> b1deb64e186da99ef42cb687d107445c0b800bd8..6a6679d522350ab4c19836f5537119122f0e654e
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-3.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-3.c
> @@ -6,5 +6,5 @@
>  
>  #include "vect-reduc-dot-1.c"
>  
> -/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" 
> "vect" } } */
> +/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" 
> "vect" } } */
>  
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-6.c 
> b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-6.c
> index 
> b690c9f2eb18b34f4b147d779bb3da582e285399..0cd4b823643bd4fadd529b2fe4e1d664aa1159ad
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-6.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-6.c
> @@ -6,5 +6,5 @@
>  
>  #include "vect-reduc-dot-1.c"
>  
> -/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" 
> "vect" } } */
> +/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" 
> "vect" } } */
>  
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-7.c 
> b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-7.c
> index 
> 29e442e8bbf7176cf861518dc171a83d82967764..eefee2e2ca27d749cd3af2238723aeae4e60a429
>  100644
> --- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-7.c
> +++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-7.c
> @@ -6,5 +6,5 @@
>  
>  #include "vect-reduc-dot-1.c"
>  
> -/* { dg-final { scan-tree-dump-not "vect_recog_dot_prod_pattern: detected" 
> "vect" } } */
> +/* { dg-final { scan-tree-dump "vect_recog_dot_prod_pattern: detected" 
> "vect" } } */
>  
> diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> index 
> 803de3fc287371fa202610a55b17e2c8934672f3..441d6cd28c4eaded7abd756164890dbcffd2f3b8
>  100644
> --- a/gcc/tree-vect-patterns.c
> +++ b/gcc/tree-vect-patterns.c
> @@ -946,7 +946,8 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
>   In which
>   - DX is double the size of X
>   - DY is double the size of Y
> - - DX, DY, DPROD all have the same type
> + - DX, DY, DPROD all have the same type but the sign
> +   between DX, DY and DPROD can differ.
>   - sum is the same size of DPROD or bigger
>   - sum has been recognized as a reduction variable.
>  
> @@ -988,12 +989,6 @@ vect_recog_dot_prod_pattern (vec_info *vinfo,
>false, 2, unprom0, &half_type))
>  return NULL;
>  
> -  /* If there are two widening operations, make sure they agree on
> - the sign of the extension.  */
> -  if (TYPE_PRECISION (unprom_mult.type) != TYPE_PRECISION (type)
> -  && TYPE_SIGN (unprom_mult.type) != TYPE_SIGN (half_type))
> -return NULL;
> -
>vect_pattern_detected ("vect_recog_dot_prod_pattern", last_stmt);
>  
>tree half_vectype;
> 
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [PATCH 1/4]middle-end Vect: Add support for dot-product where the sign for the multiplicant changes.

2021-05-07 Thread Richard Biener
On Wed, 5 May 2021, Tamar Christina wrote:

> Hi All,
> 
> This patch adds support for a dot product where the signs of the
> multiplication arguments differ, i.e. one is signed and one is unsigned but
> the precisions are the same.
> 
> #define N 480
> #define SIGNEDNESS_1 unsigned
> #define SIGNEDNESS_2 signed
> #define SIGNEDNESS_3 signed
> #define SIGNEDNESS_4 unsigned
> 
> SIGNEDNESS_1 int __attribute__ ((noipa))
> f (SIGNEDNESS_1 int res, SIGNEDNESS_3 char *restrict a,
>SIGNEDNESS_4 char *restrict b)
> {
>   for (__INTPTR_TYPE__ i = 0; i < N; ++i)
> {
>   int av = a[i];
>   int bv = b[i];
>   SIGNEDNESS_2 short mult = av * bv;
>   res += mult;
> }
>   return res;
> }
> 
> The operations are performed as if the operands were extended to a 32-bit
> value.  As such this operation isn't valid if there is an intermediate
> conversion to an unsigned value, i.e. if SIGNEDNESS_2 is unsigned.
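
A small worked example (not part of the patch; function names are invented, and a 16-bit unsigned short is assumed) of why the intermediate type matters. A signed char times an unsigned char always fits in a signed short, so narrowing through signed short is harmless, whereas narrowing through unsigned short changes negative products.

```c
/* Reference semantics: both inputs are widened to int before the
   multiply and the product is accumulated unchanged.  */
int
sketch_usdot (int acc, const signed char *a, const unsigned char *b, int n)
{
  for (int i = 0; i < n; ++i)
    acc += (int) a[i] * (int) b[i];
  return acc;
}

/* Same loop, but the product is first narrowed to unsigned short
   (the SIGNEDNESS_2 == unsigned case).  */
int
sketch_usdot_unsigned_mid (int acc, const signed char *a,
			   const unsigned char *b, int n)
{
  for (int i = 0; i < n; ++i)
    {
      unsigned short mult = (unsigned short) ((int) a[i] * (int) b[i]);
      acc += mult;
    }
  return acc;
}
```

With a[0] = -1 and b[0] = 255 the first function accumulates -255, while the second accumulates 65281, so the two are not interchangeable.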
> 
> Moreover, if the signs of SIGNEDNESS_3 and SIGNEDNESS_4 are flipped the same
> optab is used but the operands are flipped in the optab expansion.
> 
> To support this the patch extends the dot-product detection to optionally
> ignore operands with different signs and stores this information in the optab
> subtype which is now made a bitfield.
> 
> The subtype can now additionally control which optab an EXPR can expand to.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master?
> 
> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
>   * optabs.def (usdot_prod_optab): New.
>   * doc/md.texi: Document it.
>   * optabs-tree.c (optab_for_tree_code): Support usdot_prod_optab.
>   * optabs-tree.h (enum optab_subtype): Likewise.
>   * optabs.c (expand_widen_pattern_expr): Likewise.
>   * tree-cfg.c (verify_gimple_assign_ternary): Likewise.
>   * tree-vect-loop.c (vect_determine_dot_kind): New.
>   (vectorizable_reduction): Query dot-product kind.
>   * tree-vect-patterns.c (vect_supportable_direct_optab_p): Take optional
>   optab subtype.
>   (vect_joust_widened_type, vect_widened_op_tree): Optionally ignore
>   mismatch types.
>   (vect_recog_dot_prod_pattern): Support usdot_prod_optab.
> 
> --- inline copy of patch -- 
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 
> d166a0debedf4d8edf55c842bcf4ff4690b3e9ce..baf20416e63745097825fc30fdf2e66bc80d7d23
>  100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5440,11 +5440,13 @@ Like @samp{fold_left_plus_@var{m}}, but takes an 
> additional mask operand
>  @item @samp{sdot_prod@var{m}}
>  @cindex @code{udot_prod@var{m}} instruction pattern
>  @itemx @samp{udot_prod@var{m}}
> +@cindex @code{usdot_prod@var{m}} instruction pattern
> +@itemx @samp{usdot_prod@var{m}}
>  Compute the sum of the products of two signed/unsigned elements.
> -Operand 1 and operand 2 are of the same mode. Their product, which is of a
> -wider mode, is computed and added to operand 3. Operand 3 is of a mode equal 
> or
> -wider than the mode of the product. The result is placed in operand 0, which
> -is of the same mode as operand 3.
> +Operand 1 and operand 2 are of the same mode but may differ in signs. Their
> +product, which is of a wider mode, is computed and added to operand 3.
> +Operand 3 is of a mode equal or wider than the mode of the product. The
> +result is placed in operand 0, which is of the same mode as operand 3.

This doesn't really say what the 's', 'u' and 'us' specify.  Since
we're doing a widen multiplication and then a non-widening addition
we only need to know the effective sign of the multiplication so
I think the existing 's' and 'u' are enough to cover all cases?

The tree.def docs say the sum is also possibly widening but I don't see
this covered by the optab so we should eventually remove this
feature from the tree side.  In fact the tree-cfg.c verifier requires
the addition to be not widening - thus only tree.def needs adjustment.

>  @cindex @code{ssad@var{m}} instruction pattern
>  @item @samp{ssad@var{m}}
> diff --git a/gcc/optabs-tree.h b/gcc/optabs-tree.h
> index 
> c3aaa1a416991e856d3e24da45968a92ebada82c..ebc23ac86fe99057f375781c2f1990e0548ba08d
>  100644
> --- a/gcc/optabs-tree.h
> +++ b/gcc/optabs-tree.h
> @@ -27,11 +27,29 @@ along with GCC; see the file COPYING3.  If not see
> shift amount vs. machines that take a vector for the shift amount.  */
>  enum optab_subtype
>  {
> -  optab_default,
> -  optab_scalar,
> -  optab_vector
> +  optab_default = 1 << 0,
> +  optab_scalar = 1 << 1,
> +  optab_vector = 1 << 2,
> +  optab_signed_to_unsigned = 1 << 3,
> +  optab_unsigned_to_signed = 1 << 4
>  };
>  
> +/* Override the OrEqual-operator so we can use optab_subtype as a bit flag.  
> */
> +inline enum optab_subtype&
> +operator |= (enum optab_subtype& a, enum optab_subtype b)
> +{
> +return a = static_cast(static_cast(a)
> +   | static_cast(b));
> +}
> +
> +/* Override the Or-operator so we can 

Re: [PATCH] rs6000: Make density_test only for vector version

2021-05-07 Thread will schmidt via Gcc-patches
On Fri, 2021-05-07 at 10:28 +0800, Kewen.Lin via Gcc-patches wrote:
> Hi,
> 
> When I was investigating density_test heuristics, I noticed that
> the current rs6000_density_test could be used for single scalar
> iteration cost calculation, through the call trace:
>   vect_compute_single_scalar_iteration_cost
> -> rs6000_finish_cost
>  -> rs6000_density_test
> 
> It looks unexpected given its descriptive function comments, and Bill
> helped to confirm this needs to be fixed (thanks!).
> 
> So this patch checks the passed data: if it's the same as the one in
> loop_vinfo, we are working on the vector version cost calculation;
> otherwise just return early.
> 
> Bootstrapped/regtested on powerpc64le-linux-gnu P9.
> 
> Nothing remarkable was observed with SPEC2017 Power9 full run.
> 
> Is it ok for trunk?
> 
> BR,
> Kewen
> --
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.c (rs6000_density_test): Early return if
>   calculating single scalar iteration cost.



Ok, so data is passed in.. 
  static void
  rs6000_density_test (rs6000_cost_data *data)
  {
  ...
and loop_vinfo is calculated via...
  loop_vec_info loop_vinfo = loop_vec_info_for_loop (data->loop_info);
which is
  static inline loop_vec_info
  loop_vec_info_for_loop (class loop *loop)
  {
return (loop_vec_info) loop->aux;
  }


> +  /* Only care about cost of vector version, so exclude scalar
> +     version here.  */
> +  if (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) != (void *) data)
> +    return;
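
A simplified model of the identity check in that hunk, as I read the patch description; the struct and function names below are hypothetical stand-ins, not the rs6000 or vectorizer types. The idea is that the scalar-iteration costing uses its own cost object, so comparing pointers tells the two callers apart.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the target cost data and loop_vinfo.  */
struct sketch_cost_data { unsigned vect_insns; };
struct sketch_loop_vinfo { void *target_cost_data; };

/* Nonzero only when DATA is the cost object registered in VINFO,
   i.e. the one used for costing the vector version of the loop.  */
int
sketch_is_vector_cost_data (const struct sketch_loop_vinfo *vinfo,
			    const struct sketch_cost_data *data)
{
  return vinfo->target_cost_data == (const void *) data;
}
```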

Can the loop contain both vector and scalar parts?  Comments in the
function now mention a percentage of vector instructions within the
loop.  So..  this is meant to return early if there are no(?) vector
instructions?

I'm admittedly not clear on what 'scalar version' means here.  Would it
be accurate or clearer to update the comment to something like
/* Return early if the loop_vinfo value indicates there are no vector
instructions within this loop. */ ?

thanks
-Will




Re: [vect] Support min/max + index pattern

2021-05-07 Thread Richard Biener
On Wed, 5 May 2021, Joel Hutton wrote:

> Hi all,
> 
> looking for some feedback on this, one thing I would like to draw 
> attention to is the fact that this pattern requires 2 separate dependent 
> reductions in the epilogue. The accumulator vector for the 
> maximum/minimum elements can be reduced to a scalar result trivially 
> with a min/max, but getting the index from accumulator vector of indices 
> is more complex and requires using the position of the maximum/minimum 
> scalar result value within the accumulator vector to create a mask.
> 
> The given solution works but it's slightly messy. 
> vect_create_epilogue_for_reduction creates the epilogue for one 
> vectorized scalar stmt at a time. This modification makes one invocation 
> create the epilogue for both related stmts and marks the other as 
> 'done'. Alternate suggestions are welcome.
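
A scalar model of the two dependent epilogue reductions described above, offered as a sketch rather than what the patch generates. Four array lanes stand in for a vector, the names are invented, and the initial values (INT_MAX, index -1) are assumptions of this example.

```c
#include <limits.h>

#define SKETCH_LANES 4

/* Returns the index of the first minimum of DATA[0..N-1] (or -1 if no
   element is below INT_MAX) and stores the minimum in *BEST_OUT.  */
int
sketch_min_index (const int *data, int n, int *best_out)
{
  int best[SKETCH_LANES], idx[SKETCH_LANES];
  int i = 0;

  for (int l = 0; l < SKETCH_LANES; ++l)
    {
      best[l] = INT_MAX;
      idx[l] = -1;
    }

  /* "Vector" loop: per-lane running minimum and its index.  */
  for (; i + SKETCH_LANES <= n; i += SKETCH_LANES)
    for (int l = 0; l < SKETCH_LANES; ++l)
      if (data[i + l] < best[l])
	{
	  best[l] = data[i + l];
	  idx[l] = i + l;
	}

  /* First reduction: plain min over the value accumulator.  */
  int m = best[0];
  for (int l = 1; l < SKETCH_LANES; ++l)
    if (best[l] < m)
      m = best[l];

  /* Second, dependent reduction: mask the lanes holding M and take the
     smallest index among them, matching the scalar loop's
     first-occurrence behaviour.  */
  int mi = INT_MAX;
  for (int l = 0; l < SKETCH_LANES; ++l)
    if (best[l] == m && idx[l] < mi)
      mi = idx[l];

  /* Scalar tail.  */
  for (; i < n; ++i)
    if (data[i] < m)
      {
	m = data[i];
	mi = i;
      }

  *best_out = m;
  return mi;
}
```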

So I'm not looking at the very details in the patch but I think
a concept of "dependent" reductions might be useful (there might
be related patterns that could be vectorized).  So during
reduction discovery detect unhandled multi-uses and queue those
for later re-evaluation (in case we analyze the value reduction
first).  Discover the index reduction as regular reduction.
Upon re-analyzing the value reduction allow multi-uses that are
reductions themselves and note the dependence in some new stmt_vinfo
field.

Then during reduction analysis we can verify if we handle
a particular dependence and during code-gen we can ensure
to only code-gen one epilogue and have access to the other part.

That said, I'd make the thing a bit more generic in appearance
but otherwise yes, we do need to handle this together somehow.

+  /* For handling multi phi reductions.  */
+  tree scalar_result;
+  stmt_vec_info reduc_multi_use_related_stmt;
+  gimple* reduc_multi_use_result;
+  tree induc_val;
+  tree initial_def;
+  bool is_minmax_index;
+  bool is_minmax;
+  bool epilog_finished;

which should ideally reduce to a single field then.

Oh, and make sure an SLP variant is supported.

  for (int i = 0; i < n; i++) {
if (data[2*i] < best1) {
  best1 = data[2*i];
  best_i1 = 2*i;
}
if (data[2*i+1] < best2) {
  best2 = data[2*i+1];
  best_i2 = 2*i;
}
  }

Richard.

> Joel
> 
> [vect] Support min/max + index pattern
> 
> Add the capability to vect-loop to support the following pattern.
> 
> for (int i = 0; i < n; i ++)
> {
> if (data[i] < best)
> {
> best = data[i];
> best_i = i;
> }
> }
> 
> gcc/ChangeLog:
> 
>   * tree-vect-loop.c (vect_reassociating_reduction_simple_p): New
>   function.
>   (vect_recog_minmax_index_pattern): New function.
>   (vect_is_simple_reduction): Add multi_use_reduction case.
>   (vect_create_epilog_for_reduction): Add minmax+index epilogue handling.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


Re: [patch] Do not apply scalar storage order to pointer fields

2021-05-07 Thread Richard Biener via Gcc-patches
On Fri, May 7, 2021 at 12:42 PM Eric Botcazou  wrote:
>
> Hi,
>
> I didn't really think of pointer fields (nor of vector fields originally) when
> implementing the scalar_storage_order attribute, so they are swapped as well.
> As Ulrich pointed out, this is problematic to describe in DWARF and probably
> not very useful in any case, so the attached patch pulls them out.
>
> Tested on x86-64/Linux, OK for mainline?

OK.

Richard.

>
> 2021-05-07  Eric Botcazou  
>
> * doc/extend.texi (scalar_storage_order): Mention effect on pointer
> and vector fields.
> * tree.h (reverse_storage_order_for_component_p): Return false if
> the type is a pointer.
> c/
> * c-typeck.c (build_unary_op) : Do not issue an error
> on the address of a pointer field in a record with reverse SSO.
>
>
> 2021-05-07  Eric Botcazou  
>
> * gcc.dg/sso-12.c: New test.
>
> --
> Eric Botcazou


[patch] Fix incorrect array bounds with -fgnat-encodings=minimal in DWARF

2021-05-07 Thread Eric Botcazou
Hi,

this makes add_subscript_info query the get_array_descr_info hook for the
actual information when it is defined.

Tested on x86-64/Linux, OK for mainline?


2021-05-07  Eric Botcazou  

* dwarf2out.c (add_subscript_info): Retrieve the bounds and the index
type by means of the get_array_descr_info langhook, if it is set and
returns true.  Remove obsolete code dealing with unnamed subtypes.


2021-05-07  Eric Botcazou  

* gnat.dg/debug18.adb: New test.

-- 
Eric Botcazou

diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 5b819ab1a92..ad948d94da9 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -21223,8 +21223,6 @@ add_bound_info (dw_die_ref subrange_die, enum dwarf_attribute bound_attr,
 
 /* Add subscript info to TYPE_DIE, describing an array TYPE, collapsing
possibly nested array subscripts in a flat sequence if COLLAPSE_P is true.
-   Note that the block of subscript information for an array type also
-   includes information about the element type of the given array type.
 
This function reuses previously set type and bound information if
available.  */
@@ -21232,9 +21230,21 @@ add_bound_info (dw_die_ref subrange_die, enum dwarf_attribute bound_attr,
 static void
 add_subscript_info (dw_die_ref type_die, tree type, bool collapse_p)
 {
-  unsigned dimension_number;
-  tree lower, upper;
   dw_die_ref child = type_die->die_child;
+  struct array_descr_info info;
+  int dimension_number;
+
+  if (lang_hooks.types.get_array_descr_info)
+{
+  memset (&info, 0, sizeof (info));
+  if (lang_hooks.types.get_array_descr_info (type, &info))
+	/* Fortran sometimes emits array types with no dimension.  */
+	gcc_assert (info.ndimensions >= 0
+		&& info.ndimensions
+		   <= DWARF2OUT_ARRAY_DESCR_INFO_MAX_DIMEN);
+}
+  else
+info.ndimensions = 0;
 
   for (dimension_number = 0;
TREE_CODE (type) == ARRAY_TYPE && (dimension_number == 0 || collapse_p);
@@ -21282,26 +21292,22 @@ add_subscript_info (dw_die_ref type_die, tree type, bool collapse_p)
   if (domain)
 	{
 	  /* We have an array type with specified bounds.  */
-	  lower = TYPE_MIN_VALUE (domain);
-	  upper = TYPE_MAX_VALUE (domain);
+	  tree lower = TYPE_MIN_VALUE (domain);
+	  tree upper = TYPE_MAX_VALUE (domain);
+	  tree index_type = TREE_TYPE (domain);
 
-	  /* Define the index type.  */
-	  if (TREE_TYPE (domain)
-	  && !get_AT (subrange_die, DW_AT_type))
+	  if (dimension_number <= info.ndimensions - 1)
 	{
-	  /* ??? This is probably an Ada unnamed subrange type.  Ignore the
-		 TREE_TYPE field.  We can't emit debug info for this
-		 because it is an unnamed integral type.  */
-	  if (TREE_CODE (domain) == INTEGER_TYPE
-		  && TYPE_NAME (domain) == NULL_TREE
-		  && TREE_CODE (TREE_TYPE (domain)) == INTEGER_TYPE
-		  && TYPE_NAME (TREE_TYPE (domain)) == NULL_TREE)
-		;
-	  else
-		add_type_attribute (subrange_die, TREE_TYPE (domain),
-TYPE_UNQUALIFIED, false, type_die);
+	  lower = info.dimen[dimension_number].lower_bound;
+	  upper = info.dimen[dimension_number].upper_bound;
+	  index_type = info.dimen[dimension_number].bounds_type;
 	}
 
+	  /* Define the index type.  */
+	  if (index_type && !get_AT (subrange_die, DW_AT_type))
+	add_type_attribute (subrange_die, index_type, TYPE_UNQUALIFIED,
+false, type_die);
+
 	  /* ??? If upper is NULL, the array has unspecified length,
 	 but it does have a lower bound.  This happens with Fortran
 	   dimension arr(N:*)
@@ -21309,8 +21315,9 @@ add_subscript_info (dw_die_ref type_die, tree type, bool collapse_p)
 	 to produce useful results, go ahead and output the lower
 	 bound solo, and hope the debugger can cope.  */
 
-	  if (!get_AT (subrange_die, DW_AT_lower_bound))
+	  if (lower && !get_AT (subrange_die, DW_AT_lower_bound))
 	add_bound_info (subrange_die, DW_AT_lower_bound, lower, NULL);
+
 	  if (!get_AT (subrange_die, DW_AT_upper_bound)
 	  && !get_AT (subrange_die, DW_AT_count))
 	{
@@ -22039,6 +22046,7 @@ decl_start_label (tree decl)
 /* For variable-length arrays that have been previously generated, but
may be incomplete due to missing subscript info, fill the subscript
info.  Return TRUE if this is one of those cases.  */
+
 static bool
 fill_variable_array_bounds (tree type)
 {
-- { dg-do compile }
-- { dg-skip-if "No Dwarf" { { hppa*-*-hpux* } && { ! lp64 } } }
-- { dg-options "-cargs -O0 -g -dA -fgnat-encodings=minimal -margs" }

procedure Debug18 is

   procedure Check (Size : Integer) is
  type Bit_Array_Type is array (1 .. Size) of boolean;
  pragma Pack (Bit_Array_Type);

  Bits : Bit_Array_Type := (others => False);
   begin
  Bits (Bits'First) := True;
   end;
  
begin
   Check (Size => 9);
end;

-- { dg-final { scan-assembler-not "DW_AT_lower_bound" } }


[PATCH][GCC 8] aarch64: PR target/99037 Fix RTL represntation in move_lo_quad patterns

2021-05-07 Thread Kyrylo Tkachov via Gcc-patches
This is a GCC 8 backport

This patch fixes the RTL representation of the move_lo_quad patterns to use 
aarch64_simd_or_scalar_imm_zero
for the zero part rather than a vec_duplicate of zero or a const_int 0.
The expander that generates them is also adjusted so that we use and match 
the correct const_vector forms throughout.

Bootstrapped and tested on aarch64-none-linux-gnu.

Co-Authored-By: Jakub Jelinek 
gcc/ChangeLog:

PR target/99037
PR target/100441
* config/aarch64/aarch64-simd.md (move_lo_quad_internal_): Use
aarch64_simd_or_scalar_imm_zero to match zeroes.  Remove pattern
matching const_int 0.
(move_lo_quad_internal_be_): Likewise.
(move_lo_quad_): Update for the above.
* config/aarch64/iterators.md (VQ_2E): Delete.

gcc/testsuite/ChangeLog:

PR target/99808
* gcc.target/aarch64/pr99808.c: New test.


gcc-8.patch
Description: gcc-8.patch


[PATCH][GCC-9]aarch64: PR target/99037 Fix RTL represntation in move_lo_quad patterns

2021-05-07 Thread Kyrylo Tkachov via Gcc-patches
This is a GCC 9 backport

This patch fixes the RTL representation of the move_lo_quad patterns to use 
aarch64_simd_or_scalar_imm_zero
for the zero part rather than a vec_duplicate of zero or a const_int 0.
The expander that generates them is also adjusted so that we use and match 
the correct const_vector forms throughout.

Bootstrapped and tested on aarch64-none-linux-gnu

Co-Authored-By: Jakub Jelinek 
gcc/ChangeLog:

PR target/99037
PR target/100441
* config/aarch64/aarch64-simd.md (move_lo_quad_internal_): Use
aarch64_simd_or_scalar_imm_zero to match zeroes.  Remove pattern
matching const_int 0.
(move_lo_quad_internal_be_): Likewise.
(move_lo_quad_): Update for the above.
* config/aarch64/iterators.md (VQ_2E): Delete.

gcc/testsuite/ChangeLog:

PR target/99808
* gcc.target/aarch64/pr99808.c: New test.


gcc-9.patch
Description: gcc-9.patch


[Patch] contrib/gcc-changelog: Detect if same file appears twice

2021-05-07 Thread Tobias Burnus

Test for a copied-but-not-fully-edited error.

OK?

Tobias

contrib/gcc-changelog: Detect if same file appears twice

contrib/ChangeLog:

	* gcc-changelog/git_commit.py (Error.__repr__): Add space after the colon.
	(GitCommit.check_mentioned_files): Check whether the same file has been
	specified multiple times.
	* gcc-changelog/test_email.py (TestGccChangelog.test_multi_same_file): New.
	* gcc-changelog/test_patches.txt (0001-OpenMP-Fix-SIMT): New test.

diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py
index b28f7deac23..d9332cb0c38 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -200,7 +200,7 @@ class Error:
 def __repr__(self):
 s = self.message
 if self.line:
-s += ':"%s"' % self.line
+s += ': "%s"' % self.line
 return s
 
 
@@ -629,7 +629,12 @@ class GitCommit:
 assert not entry.folder.endswith('/')
 for file in entry.files:
 if not self.is_changelog_filename(file):
-mentioned_files.add(os.path.join(entry.folder, file))
+item = os.path.join(entry.folder, file)
+if item in mentioned_files:
+msg = 'same file specified multiple times'
+self.errors.append(Error(msg, file))
+else:
+mentioned_files.add(item)
 for pattern in entry.file_patterns:
 mentioned_patterns.append(os.path.join(entry.folder, pattern))
 
diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py
index 8abf5c37487..d66bf5be4eb 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -424,3 +424,7 @@ class TestGccChangelog(unittest.TestCase):
 def test_long_filenames(self):
 email = self.from_patch_glob('0001-long-filenames')
 assert not email.errors
+
+def test_multi_same_file(self):
+email = self.from_patch_glob('0001-OpenMP-Fix-SIMT')
+assert email.errors[0].message == 'same file specified multiple times'
diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt
index 3f9806dc076..7e4a4b01081 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -3546,3 +3546,32 @@ index 5ad82db1def..53b15f32516 100644
 @@ -1 +1,2 @@
  
 +
+
+=== 0001-OpenMP-Fix-SIMT ===
+From 33b647956caa977d1ae489f9baed9cef70b4f382 Mon Sep 17 00:00:00 2001
+From: Tobias Burnus 
+Date: Fri, 7 May 2021 12:11:51 +0200
+Subject: [PATCH] OpenMP: Fix SIMT for complex/float reduction with && and ||
+
+libgomp/ChangeLog:
+
+	* testsuite/libgomp.c-c++-common/reduction-5.c: New test, testing
+	complex/floating-point || + && reduction with 'omp target'.
+	* testsuite/libgomp.c-c++-common/reduction-5.c: Likewise.
+---
+diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-5.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-5.c
+new file mode 100644
+index 000..21540512e23
+--- /dev/null
++++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-5.c
+@@ -0,0 +1,1 @@
++
+diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-6.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-6.c
+new file mode 100644
+index 000..21540512e23
+--- /dev/null
++++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-6.c
+@@ -0,0 +1,1 @@
++
+-- 
+2.25.1


Re: [PATCH] middle-end/100464 - avoid spurious TREE_ADDRESSABLE in folding debug stmts

2021-05-07 Thread Richard Biener via Gcc-patches
On Fri, May 7, 2021 at 12:17 PM Richard Biener  wrote:
>
> canonicalize_constructor_val was setting TREE_ADDRESSABLE on bases
> of ADDR_EXPRs but that's futile when we're dealing with CTOR values
> in debug stmts.  This rips out the code which was added for Java
> and should have been an assertion when we didn't have debug stmts.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages
> which revealed PR100468 for which I added the cp/class.c hunk below.
> Re-testing with that in progress.
>
> OK for trunk and branch?  It looks like this C++ code is new in GCC 11.

I misread; the code is old.

This hunk also breaks (or fixes) g++.dg/tree-ssa/array-temp1.C where
the gimplifier previously passed the

   && (flag_merge_constants >= 2 || !TREE_ADDRESSABLE (object))

check guarding it against unifying addresses of different instances
of variables.  Clearly in the case of the testcase there are addresses to
this variable as part of the initializer list construction.  So the hunk fixes
wrong-code, but it breaks the testcase.

Any comments?  I can of course change the testcase accordingly.

Thanks,
Richard.

> Thanks,
> Richard.
>
> 2021-05-07  Richard Biener  
>
> PR middle-end/100464
> PR c++/100468
> gcc/
> * gimple-fold.c (canonicalize_constructor_val): Do not set
> TREE_ADDRESSABLE.
>
> gcc/cp/
> * call.c (set_up_extended_ref_temp): Mark the temporary
> addressable if the TARGET_EXPR was.
>
> gcc/testsuite/
> * gcc.dg/pr100464.c: New testcase.
> ---
>  gcc/cp/call.c   |  2 ++
>  gcc/gimple-fold.c   |  4 +++-
>  gcc/testsuite/gcc.dg/pr100464.c | 16 
>  3 files changed, 21 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/pr100464.c
>
> diff --git a/gcc/cp/call.c b/gcc/cp/call.c
> index 57bac05fe70..ea97be22f07 100644
> --- a/gcc/cp/call.c
> +++ b/gcc/cp/call.c
> @@ -12478,6 +12478,8 @@ set_up_extended_ref_temp (tree decl, tree expr, 
> vec **cleanups,
>   VAR.  */
>if (TREE_CODE (expr) != TARGET_EXPR)
>  expr = get_target_expr (expr);
> +  else if (TREE_ADDRESSABLE (expr))
> +TREE_ADDRESSABLE (var) = 1;
>
>if (TREE_CODE (decl) == FIELD_DECL
>&& extra_warnings && !TREE_NO_WARNING (decl))
> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index aa33779b753..768ef89d876 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -245,7 +245,9 @@ canonicalize_constructor_val (tree cval, tree from_decl)
>if (TREE_TYPE (base) == error_mark_node)
> return NULL_TREE;
>if (VAR_P (base))
> -   TREE_ADDRESSABLE (base) = 1;
> +   /* ???  We should be able to assert that TREE_ADDRESSABLE is set,
> +  but since the use can be in a debug stmt we can't.  */
> +   ;
>else if (TREE_CODE (base) == FUNCTION_DECL)
> {
>   /* Make sure we create a cgraph node for functions we'll reference.
> diff --git a/gcc/testsuite/gcc.dg/pr100464.c b/gcc/testsuite/gcc.dg/pr100464.c
> new file mode 100644
> index 000..46cc37dff54
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr100464.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fcompare-debug" } */
> +
> +int *a;
> +static int b, c, d, e, g, h;
> +int f;
> +void i() {
> +  int *j[] = {, , , , , , , , , , ,
> +  , , , , , , , , , , ,
> +  , , , , , , , , , , };
> +  int **k = [5];
> +  for (; f;)
> +b |= *a;
> +  *k = 
> +}
> +int main() {}
> --
> 2.26.2


Re: [Patch] + [nvptx RFH/RFC]: OpenMP: Fix SIMT for complex/float reduction with && and ||

2021-05-07 Thread Tom de Vries
On 5/7/21 12:05 PM, Tobias Burnus wrote:
> On 06.05.21 16:32, Jakub Jelinek wrote:
> 
>> s/recution/reduction/
> Fixed.
>> This comment needs to be adjusted to talk about non-integral types.
> Fixed.
>> Is this hunk still needed when the first hunk is in?
> 
> No - and now removed.
> 
> Updated code attached.
> 


> libgomp/ChangeLog:
> 
>   * testsuite/libgomp.c-c++-common/reduction-5.c: New test, testing
>   complex/floating-point || + && reduction with 'omp target'.
>   * testsuite/libgomp.c-c++-common/reduction-5.c: Likewise.

5 -> 6.

Otherwise, LGTM.

Thanks,
- Tom


Re: [Patch] + [nvptx RFH/RFC]: OpenMP: Fix SIMT for complex/float reduction with && and ||

2021-05-07 Thread Jakub Jelinek via Gcc-patches
On Fri, May 07, 2021 at 12:05:11PM +0200, Tobias Burnus wrote:
> 2021-05-07  Tobias Burnus  
>   Tom de Vries  
> 
> gcc/ChangeLog:
> 
>   * omp-low.c (lower_rec_simd_input_clauses): Set max_vf = 1 if
>   a truth_value_p reduction variable is nonintegral.
> 
> libgomp/ChangeLog:
> 
>   * testsuite/libgomp.c-c++-common/reduction-5.c: New test, testing
>   complex/floating-point || + && reduction with 'omp target'.
>   * testsuite/libgomp.c-c++-common/reduction-5.c: Likewise.
> 
>  gcc/omp-low.c  |  28 ++-
>  .../testsuite/libgomp.c-c++-common/reduction-5.c   | 193 
>  .../testsuite/libgomp.c-c++-common/reduction-6.c   | 196 
> +
>  3 files changed, 410 insertions(+), 7 deletions(-)

Ok, thanks.

Jakub



Re: [Patch] + [nvptx RFH/RFC]: OpenMP: Fix SIMT for complex/float reduction with && and ||

2021-05-07 Thread Tobias Burnus

On 06.05.21 16:32, Jakub Jelinek wrote:

> s/recution/reduction/

Fixed.

> This comment needs to be adjusted to talk about non-integral types.

Fixed.

> Is this hunk still needed when the first hunk is in?

No - and now removed.

Updated code attached.

Tobias


OpenMP: Fix SIMT for complex/float reduction with && and ||

2021-05-07  Tobias Burnus  
	Tom de Vries  

gcc/ChangeLog:

	* omp-low.c (lower_rec_simd_input_clauses): Set max_vf = 1 if
	a truth_value_p reduction variable is nonintegral.

libgomp/ChangeLog:

	* testsuite/libgomp.c-c++-common/reduction-5.c: New test, testing
	complex/floating-point || + && reduction with 'omp target'.
	* testsuite/libgomp.c-c++-common/reduction-5.c: Likewise.

 gcc/omp-low.c  |  28 ++-
 .../testsuite/libgomp.c-c++-common/reduction-5.c   | 193 
 .../testsuite/libgomp.c-c++-common/reduction-6.c   | 196 +
 3 files changed, 410 insertions(+), 7 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 26ceaf74b2d..2325cfcfc34 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -4389,14 +4389,28 @@ lower_rec_simd_input_clauses (tree new_var, omp_context *ctx,
 	{
 	  for (tree c = gimple_omp_for_clauses (ctx->stmt); c;
 	   c = OMP_CLAUSE_CHAIN (c))
-	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION
-		&& OMP_CLAUSE_REDUCTION_PLACEHOLDER (c))
-	  {
-		/* UDR reductions are not supported yet for SIMT, disable
-		   SIMT.  */
-		sctx->max_vf = 1;
-		break;
+	{
+	  if (OMP_CLAUSE_CODE (c) != OMP_CLAUSE_REDUCTION)
+		continue;
+
+	  if (OMP_CLAUSE_REDUCTION_PLACEHOLDER (c))
+		{
+		  /* UDR reductions are not supported yet for SIMT, disable
+		 SIMT.  */
+		  sctx->max_vf = 1;
+		  break;
+		}
+
+	  if (truth_value_p (OMP_CLAUSE_REDUCTION_CODE (c))
+		  && !INTEGRAL_TYPE_P (TREE_TYPE (new_var)))
+		{
+		  /* Doing boolean operations on non-integral types is
+		 for conformance only, it's not worth supporting this
+		 for SIMT.  */
+		  sctx->max_vf = 1;
+		  break;
 	  }
+	}
 	}
   if (maybe_gt (sctx->max_vf, 1U))
 	{
diff --git a/libgomp/testsuite/libgomp.c-c++-common/reduction-5.c b/libgomp/testsuite/libgomp.c-c++-common/reduction-5.c
new file mode 100644
index 000..21540512e23
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c-c++-common/reduction-5.c
@@ -0,0 +1,193 @@
+/* { dg-additional-options "-foffload=-latomic" { target { offload_target_nvptx } } } */
+/* C / C++'s logical AND and OR operators take any scalar argument
+   which compares (un)equal to 0 - the result 1 or 0 and of type int.
+
+   In this testcase, the int result is again converted to a floating-point
+   or complex type.
+
+   While having a floating-point/complex array element with || and && can make
+   sense, having a non-integer/non-bool reduction variable is odd but valid.
+
+   Test: FP reduction variable + FP array - as reduction-1.c but with target  */
+
+#define N 1024
+_Complex float rcf[N];
+_Complex double rcd[N];
+float rf[N];
+double rd[N];
+
+int
+reduction_or ()
+{
+  float orf = 0;
+  double ord = 0;
+  _Complex float orfc = 0;
+  _Complex double ordc = 0;
+
+  #pragma omp target parallel reduction(||: orf) map(orf)
+  for (int i=0; i < N; ++i)
+orf = orf || rf[i];
+
+  #pragma omp target parallel for reduction(||: ord) map(ord)
+  for (int i=0; i < N; ++i)
+ord = ord || rcd[i];
+
+  #pragma omp target parallel for simd reduction(||: orfc) map(orfc)
+  for (int i=0; i < N; ++i)
+orfc = orfc || rcf[i];
+
+  #pragma omp target parallel loop reduction(||: ordc) map(ordc)
+  for (int i=0; i < N; ++i)
+ordc = ordc || rcd[i];
+
+  return orf + ord + __real__ orfc + __real__ ordc;
+}
+
+int
+reduction_or_teams ()
+{
+  float orf = 0;
+  double ord = 0;
+  _Complex float orfc = 0;
+  _Complex double ordc = 0;
+
+  #pragma omp target teams distribute parallel for reduction(||: orf) map(orf)
+  for (int i=0; i < N; ++i)
+orf = orf || rf[i];
+
+  #pragma omp target teams distribute parallel for simd reduction(||: ord) map(ord)
+  for (int i=0; i < N; ++i)
+ord = ord || rcd[i];
+
+  #pragma omp target teams distribute parallel for reduction(||: orfc) map(orfc)
+  for (int i=0; i < N; ++i)
+orfc = orfc || rcf[i];
+
+  #pragma omp target teams distribute parallel for simd reduction(||: ordc) map(ordc)
+  for (int i=0; i < N; ++i)
+ordc = ordc || rcd[i];
+
+  return orf + ord + __real__ orfc + __real__ ordc;
+}
+
+int
+reduction_and ()
+{
+  float andf = 1;
+  double andd = 1;
+  _Complex float andfc = 1;
+  _Complex double anddc = 1;
+
+  #pragma omp target parallel reduction(&&: andf) map(andf)
+  for (int i=0; i < N; ++i)
+andf = andf && rf[i];
+
+  #pragma omp target parallel for reduction(&&: andd) map(andd)
+  for (int i=0; i < N; ++i)
+andd = andd && 
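
The semantics the test comment describes can be sketched in plain C without
any OpenMP directives.  This is only an illustration under that assumption;
the helper names reduce_or and reduce_and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* `acc || v[i]` yields an int (0 or 1), which is then converted back
   to float when stored into the accumulator.  */
float reduce_or(const float *v, size_t n)
{
    float acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc = acc || v[i];
    return acc;
}

/* Same idea for `&&`, starting from the neutral element 1.  */
float reduce_and(const float *v, size_t n)
{
    float acc = 1;
    for (size_t i = 0; i < n; ++i)
        acc = acc && v[i];
    return acc;
}
```

Whatever the magnitude of the inputs, the result is exactly 0 or 1, which is
why a floating-point reduction variable here is unusual but still valid.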

Re: [PATCH] split loop for NE condition.

2021-05-07 Thread Richard Biener
On Fri, 7 May 2021, guojiufu wrote:

> On 2021-05-06 16:27, Richard Biener wrote:
> > On Thu, 6 May 2021, guojiufu wrote:
> > 
> >> On 2021-05-03 20:18, Richard Biener wrote:
> >> > On Thu, 29 Apr 2021, Jiufu Guo wrote:
> >> >
> >> >> When there is the possibility that overflow may happen on the loop
> >> >> index,
> >> >> a few optimizations would not happen. For example code:
> >> >>
> >> >> foo (int *a, int *b, unsigned k, unsigned n)
> >> >> {
> >> >>   while (++k != n)
> >> >> a[k] = b[k]  + 1;
> >> >> }
> >> >>
> >> >> For this code, if "k > n", overflow may happen.  If "k < n" at the beginning,
> >> >> it could be optimized (e.g. vectorization).
> >> >>
> >> >> We can split the loop into two loops:
> >> >>
> >> >>   while (++k > n)
> >> >> a[k] = b[k]  + 1;
> >> >>   while (k++ < n)
> >> >> a[k] = b[k]  + 1;
> >> >>
> >> >> then for the second loop, it could be optimized.
> >> >>
> >> >> This patch is splitting this kind of small loop to achieve better
> >> >> performance.
> >> >>
> >> >> Bootstrap and regtest pass on ppc64le.  Is this ok for trunk?
> >> >
> >> > Do you have any statistics on how often this splits a loop during
> >> > bootstrap (use --with-build-config=bootstrap-O3)?  Or alternatively
> >> > on SPEC?
> >> 
> >> In SPEC2017, ~240 loops are split, and I saw some performance
> >> improvement on xz.
> >> I will try bootstrap-O3 (I encountered an ICE).
> Without this patch, the ICE is also there when building with bootstrap-O3 on
> ppc64le.
> 
> >> 
> >> >
> >> > Actual comments on the patch inline.
> >> >
> >> >> Thanks!
> >> >>
> >> >> Jiufu Guo.
> >> >>
> >> >> gcc/ChangeLog:
> >> >>
> >> >> 2021-04-29  Jiufu Guo  
> >> >>
> >> >>  * params.opt (max-insns-ne-cond-split): New.
> >> >>  * tree-ssa-loop-split.c (connect_loop_phis): Add new param.
> >> >>  (get_ne_cond_branch): New function.
> >> >>  (split_ne_loop): New function.
> >> >>  (split_loop_on_ne_cond): New function.
> >> >>  (tree_ssa_split_loops): Use split_loop_on_ne_cond.
> >> >>
> >> >> gcc/testsuite/ChangeLog:
> >> >> 2021-04-29  Jiufu Guo  
> >> >>
> >> >>  * gcc.dg/loop-split1.c: New test.
> >> >>
> >> >> ---
> >> >>  gcc/params.opt |   4 +
> >> >>  gcc/testsuite/gcc.dg/loop-split1.c |  28 
> >> >>  gcc/tree-ssa-loop-split.c  | 219
> >> >> -
> >> >>  3 files changed, 247 insertions(+), 4 deletions(-)
> >> >>  create mode 100644 gcc/testsuite/gcc.dg/loop-split1.c
> >> >>
> >> >> diff --git a/gcc/params.opt b/gcc/params.opt
> >> >> index 2e4cbdd7a71..900b59b5136 100644
> >> >> --- a/gcc/params.opt
> >> >> +++ b/gcc/params.opt
> >> >> @@ -766,6 +766,10 @@ Min. ratio of insns to prefetches to enable
> >> >> prefetching for a loop with an unkno
> >> >> Common Joined UInteger Var(param_min_loop_cond_split_prob) Init(30)
> >> >> IntegerRange(0, 100) Param Optimization
> >> >> The minimum threshold for probability of semi-invariant condition
> >> >> statement
> >> >> to trigger loop split.
> >> >>
> >> >> +-param=max-insns-ne-cond-split=
> >> >> +Common Joined UInteger Var(param_max_insn_ne_cond_split) Init(64) Param
> >> >> Optimization
> >> >> +The maximum number of instructions in a loop with an ne
> >> >> condition for the loop to be split.
> >> >> +
> >> >>  -param=min-nondebug-insn-uid=
> >> >>  Common Joined UInteger Var(param_min_nondebug_insn_uid) Param
> >> >>  The minimum UID to be used for a nondebug insn.
> >> >> diff --git a/gcc/testsuite/gcc.dg/loop-split1.c
> >> >> b/gcc/testsuite/gcc.dg/loop-split1.c
> >> >> new file mode 100644
> >> >> index 000..4c466aa9f54
> >> >> --- /dev/null
> >> >> +++ b/gcc/testsuite/gcc.dg/loop-split1.c
> >> >> @@ -0,0 +1,28 @@
> >> >> +/* { dg-do compile } */
> >> >> +/* { dg-options "-O2 -fsplit-loops -fdump-tree-lsplit-details" } */
> >> >> +
> >> >> +void
> >> >> +foo (int *a, int *b, unsigned l, unsigned n)
> >> >> +{
> >> >> +  while (++l != n)
> >> >> +a[l] = b[l]  + 1;
> >> >> +}
> >> >> +
> >> >> +void
> >> >> +foo1 (int *a, int *b, unsigned l, unsigned n)
> >> >> +{
> >> >> +  while (l++ != n)
> >> >> +a[l] = b[l]  + 1;
> >> >> +}
> >> >> +
> >> >> +unsigned
> >> >> +foo2 (char *a, char *b, unsigned l, unsigned n)
> >> >> +{
> >> >> +  while (++l != n)
> >> >> +if (a[l] != b[l])
> >> >> +  break;
> >> >> +
> >> >> +  return l;
> >> >> +}
> >> >> +
> >> >> +/* { dg-final { scan-tree-dump-times "Loop split" 3 "lsplit" } } */
> >> >> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
> >> >> index b80b6a75e62..a6d28078e5e 100644
> >> >> --- a/gcc/tree-ssa-loop-split.c
> >> >> +++ b/gcc/tree-ssa-loop-split.c
> >> >> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
> >> >>  #include "cfghooks.h"
> >> >>  #include "gimple-fold.h"
> >> >>  #include "gimplify-me.h"
> >> >> +#include "tree-ssa-loop-ivopts.h"
> >> >>
> >> >>  /* This file implements two kinds of loop splitting.
> >> >>
> >> >> @@ -233,7 +234,8 @@ easy_exit_values (class loop *loop)
> >> >> 
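
The loop split described in the quoted patch description can be sketched by
hand as below.  This is an illustration, not the transformation the patch
emits: it assumes an 8-bit index (uint8_t) so that wraparound is reachable
with small arrays, and the names fill_ne, fill_split and same_result are made
up.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { SIZE = 256 };   /* every uint8_t value is a valid index */

/* Original form: the index may wrap around before it reaches n.  */
void fill_ne(int *a, const int *b, uint8_t k, uint8_t n)
{
    while (++k != n)
        a[k] = b[k] + 1;
}

/* Hand-split form: the first loop covers the part above n and ends
   when the index wraps to 0; the second loop counts up to n with a
   plain "<" condition and cannot wrap.  */
void fill_split(int *a, const int *b, uint8_t k, uint8_t n)
{
    ++k;
    for (; k > n; ++k)
        a[k] = b[k] + 1;
    for (; k < n; ++k)
        a[k] = b[k] + 1;
}

/* Run both forms on identical input and compare the results.  */
int same_result(uint8_t k, uint8_t n)
{
    int b[SIZE], x[SIZE], y[SIZE];
    for (int i = 0; i < SIZE; ++i) {
        b[i] = i;
        x[i] = y[i] = -1;
    }
    fill_ne(x, b, k, n);
    fill_split(y, b, k, n);
    return memcmp(x, y, sizeof x) == 0;
}
```

Only the second loop has a trip count that is easy to compute, which is the
property later passes such as the vectorizer rely on.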

Re: [PATCH] run early sprintf warning after SSA (PR 100325)

2021-05-07 Thread Aldy Hernandez via Gcc-patches




On 5/7/21 11:34 AM, Richard Biener wrote:

On Fri, May 7, 2021 at 2:12 AM Martin Sebor  wrote:


On 5/6/21 8:32 AM, Aldy Hernandez wrote:



On 5/5/21 9:26 AM, Richard Biener wrote:

On Wed, May 5, 2021 at 1:32 AM Martin Sebor via Gcc-patches
 wrote:


With no optimization, -Wformat-overflow and -Wformat-truncation
run early to detect a subset of simple bugs.  But as it turns out,
the pass runs just a tad too early, before SSA.  That causes it to
miss a class of problems that can easily be detected once code is
in SSA form, and I would expect might also cause false positives.

The attached change moves the sprintf pass just after pass_build_ssa,
similar to other early flow-sensitive warnings (-Wnonnull-compare and
-Wuninitialized).


Makes sense.  I suppose walloca might also benefit from SSA - it seems
to do range queries which won't work quite well w/o SSA?


The early Walloca pass that runs without optimization doesn't do much,
as we've never had ranges so early.  All it does is diagnose _every_
call to alloca(), if -Walloca is passed:

// The first time this pass is called, it is called before
// optimizations have been run and range information is unavailable,
// so we can only perform strict alloca checking.
if (first_time_p)
  return warn_alloca != 0;

Though, I suppose we could move the first alloca pass after SSA is
available and make it the one and only pass, since ranger only needs
SSA.  However, I don't know how well this would work without value
numbering or CSE.  For example, for gcc.dg/Walloca-4.c the gimple is:

 :
_1 = rear_ptr_9(D) - w_10(D);
_2 = (long unsigned int) _1;
if (_2 <= 4095)
  goto ; [INV]
else
  goto ; [INV]

 :
_3 = rear_ptr_9(D) - w_10(D);
_4 = (long unsigned int) _3;
src_16 = __builtin_alloca (_4);
goto ; [INV]

No ranges can be determined for _4.  However, if either FRE or DOM run,
as they do value numbering and CSE respectively, we could easily
determine a range as the above would become:

:
_1 = rear_ptr_9(D) - w_10(D);
_2 = (long unsigned int) _1;
if (_2 <= 4095)
  goto ; [INV]
else
  goto ; [INV]

 :
src_16 = __builtin_alloca (_2);
goto ; [INV]

I'm inclined to leave the first alloca pass before SSA runs, as it
doesn't do anything with ranges.  If anyone's open to a simple -O0 CSE
type pass, it would be a different story.  Thoughts?


Improving the analysis at -O0 and getting better warnings that are
more consistent with what is issued with optimization would be very
helpful (as long as it doesn't compromise the debugging experience
of course).


I agree.  It shouldn't be too difficult to, for example, run the VN
propagation part without doing actual elimination and keep
value-numbers for consumption.  do_rpo_vn (not exported)
might even already support iterate = false, eliminate = false,
it would just need factoring out the init/deinit somewhat.


Interesting.  This could give good ranges at -O0 and make it possible to 
move all these pesky range needy passes early in the pipeline.



Of course it will be a lot more expensive to do since it cannot
do "on-demand" value-numbering of interesting SSA names.
I'm not sure that would be possible anyhow.  Though for
the alloca case quickly scanning the function whether there's
any would of course be faster than throwing VN at it.


That's exactly what we do for strict -Walloca warnings.  For
-Walloca-larger-than=, you need ranges though, so your VN idea would fit 
the bill.


Aldy
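
A source-level analogue of the two GIMPLE dumps discussed above, as a hedged
sketch: the function and variable names are invented, and the claim about
which form gives the range machinery a single value to reason about is an
assumption drawn from the discussion, not something verified against the
Walloca pass.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

const char demo_text[] = "hello";   /* sample buffer for the usage below */

/* The size is recomputed after the guard, mirroring the first dump:
   the guarded value and the alloca argument are separate computations,
   so the bound from the guard is not attached to the argument.  */
size_t stage_recomputed(const char *w, const char *rear_ptr)
{
    if ((size_t)(rear_ptr - w) <= 4095) {
        char *src = __builtin_alloca((size_t)(rear_ptr - w));
        memcpy(src, w, (size_t)(rear_ptr - w));
        return (size_t)(rear_ptr - w);
    }
    return 0;
}

/* The size is computed once and reused, mirroring the dump after value
   numbering or CSE: the guard and the alloca argument share one value.  */
size_t stage_reused(const char *w, const char *rear_ptr)
{
    size_t len = (size_t)(rear_ptr - w);
    if (len <= 4095) {
        char *src = __builtin_alloca(len);
        memcpy(src, w, len);
        return len;
    }
    return 0;
}
```

Both functions behave identically at run time; the difference only matters
to an analysis that tracks ranges per value.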



Re: [PATCH] rs6000: Make density_test only for vector version

2021-05-07 Thread Richard Biener via Gcc-patches
On Fri, May 7, 2021 at 5:30 AM Kewen.Lin via Gcc-patches
 wrote:
>
> Hi,
>
> When I was investigating density_test heuristics, I noticed that
> the current rs6000_density_test could be used for single scalar
> iteration cost calculation, through the call trace:
>   vect_compute_single_scalar_iteration_cost
> -> rs6000_finish_cost
>  -> rs6000_density_test
>
> This looks unexpected given its descriptive function comments, and Bill
> helped to confirm this needs to be fixed (thanks!).
>
> So this patch is to check the passed data, if it's the same as
> the one in loop_vinfo, it indicates it's working on vector version
> cost calculation, otherwise just early return.
>
> Bootstrapped/regtested on powerpc64le-linux-gnu P9.
>
> Nothing remarkable was observed with SPEC2017 Power9 full run.
>
> Is it ok for trunk?

+  /* Only care about cost of vector version, so exclude scalar
version here.  */
+  if (LOOP_VINFO_TARGET_COST_DATA (loop_vinfo) != (void *) data)
+return;

Hmm, looks like a quite "random" test to me.  What about adding a
parameter to finish_cost () (or init_cost?) indicating the cost kind?

OTOH we already pass scalar_stmt to individual add_stmt_cost,
so not sure whether the context really matters.  That said,
the density test looks "interesting" ... the intent was that finish_cost
might look at gathered data from add_stmt, not that it looks at
the GIMPLE IL ... so why are you not counting vector_stmt vs.
scalar_stmt entries in vect_body and using that for this metric?

Richard.

> BR,
> Kewen
> --
> gcc/ChangeLog:
>
> * config/rs6000/rs6000.c (rs6000_density_test): Early return if
> calculating single scalar iteration cost.
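
The alternative Richard suggests above, counting scalar and vector statement
entries as they are added and deriving the metric when the cost is
finalized, might look roughly like the sketch below.  All names here
(body_cost, record_stmt, vector_density_percent, demo_density) are
hypothetical and are not the rs6000 or vectorizer cost API.

```c
#include <assert.h>

enum stmt_kind { SCALAR_STMT, VECTOR_STMT };   /* hypothetical kinds */

/* Data gathered while statements are added to the loop body cost.  */
struct body_cost {
    unsigned scalar_stmts;
    unsigned vector_stmts;
    unsigned total_cost;
};

/* Called once per costed statement group.  */
void record_stmt(struct body_cost *bc, enum stmt_kind kind,
                 unsigned count, unsigned unit_cost)
{
    if (kind == VECTOR_STMT)
        bc->vector_stmts += count;
    else
        bc->scalar_stmts += count;
    bc->total_cost += count * unit_cost;
}

/* Computed at finish time from the gathered counts only, without
   walking the IL again.  */
unsigned vector_density_percent(const struct body_cost *bc)
{
    unsigned all = bc->scalar_stmts + bc->vector_stmts;
    return all ? bc->vector_stmts * 100 / all : 0;
}

/* Small driver used for the usage example.  */
unsigned demo_density(unsigned n_scalar, unsigned n_vector)
{
    struct body_cost bc = {0, 0, 0};
    record_stmt(&bc, SCALAR_STMT, n_scalar, 1);
    record_stmt(&bc, VECTOR_STMT, n_vector, 2);
    return vector_density_percent(&bc);
}
```

With one scalar and three vector statements, demo_density(1, 3) gives 75.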


[Ada] Restore nnd capability

2021-05-07 Thread Pierre-Marie de Rodat
Move the nnd capability from Atree to Sinfo.Utils, because Atree is now
compiled with "pragma Assertion_Policy (Ignore);", which disables
pragma Debug.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* atree.adb: Move nnd-related code from here, and leave a
comment pointing to sinfo-utils.adb.
* sinfo-utils.ads, sinfo-utils.adb: Move nnd-related code to
here.

diff --git a/gcc/ada/atree.adb b/gcc/ada/atree.adb
--- a/gcc/ada/atree.adb
+++ b/gcc/ada/atree.adb
@@ -43,11 +43,17 @@ with Opt; use Opt;
 with Output;  use Output;
 with Seinfo;  use Seinfo;
 with Sinfo.Utils; use Sinfo.Utils;
-with Sinput;  use Sinput;
 with System.Storage_Elements;
 
 package body Atree is
 
+   ---------------
+   -- Debugging --
+   ---------------
+
+   --  Suppose you find that node 12345 is messed up. You might want to find
+   --  the code that created that node. See sinfo-utils.adb for how to do that.
+
Ignored_Ghost_Recording_Proc : Ignored_Ghost_Record_Proc := null;
--  This soft link captures the procedure invoked during the creation of an
--  ignored Ghost node or entity.
@@ -64,57 +70,6 @@ package body Atree is
Rewriting_Proc : Rewrite_Proc := null;
--  This soft link captures the procedure invoked during a node rewrite
 
-   ---------------
-   -- Debugging --
-   ---------------
-
-   --  Suppose you find that node 12345 is messed up. You might want to find
-   --  the code that created that node. There are two ways to do this:
-
-   --  One way is to set a conditional breakpoint on New_Node_Debugging_Output
-   --  (nickname "nnd"):
-   -- break nnd if n = 12345
-   --  and run gnat1 again from the beginning.
-
-   --  The other way is to set a breakpoint near the beginning (e.g. on
-   --  gnat1drv), and run. Then set Watch_Node (nickname "ww") to 12345 in gdb:
-   -- ww := 12345
-   --  and set a breakpoint on New_Node_Breakpoint (nickname "nn"). Continue.
-
-   --  Either way, gnat1 will stop when node 12345 is created, or certain other
-   --  interesting operations are performed, such as Rewrite. To see exactly
-   --  which operations, search for "pragma Debug" below.
-
-   --  The second method is much faster if the amount of Ada code being
-   --  compiled is large.
-
-   ww : Node_Id'Base := Node_Id'First - 1;
-   pragma Export (Ada, ww);
-   Watch_Node : Node_Id'Base renames ww;
-   --  Node to "watch"; that is, whenever a node is created, we check if it
-   --  is equal to Watch_Node, and if so, call New_Node_Breakpoint. You have
-   --  presumably set a breakpoint on New_Node_Breakpoint. Note that the
-   --  initial value of Node_Id'First - 1 ensures that by default, no node
-   --  will be equal to Watch_Node.
-
-   procedure nn;
-   pragma Export (Ada, nn);
-   procedure New_Node_Breakpoint renames nn;
-   --  This doesn't do anything interesting; it's just for setting breakpoint
-   --  on as explained above.
-
-   procedure nnd (N : Node_Id);
-   pragma Export (Ada, nnd);
-   procedure New_Node_Debugging_Output (N : Node_Id) renames nnd;
-   --  For debugging. If debugging is turned on, New_Node and New_Entity call
-   --  this. If debug flag N is turned on, this prints out the new node.
-   --
-   --  If Node = Watch_Node, this prints out the new node and calls
-   --  New_Node_Breakpoint. Otherwise, does nothing.
-
-   procedure Node_Debug_Output (Op : String; N : Node_Id);
-   --  Called by nnd; writes Op followed by information about N
-
-
-- Local Objects and Types --
-
@@ -1103,9 +1058,6 @@ package body Atree is
---
 
procedure Copy_Node (Source, Destination : Node_Or_Entity_Id) is
-  pragma Debug (New_Node_Debugging_Output (Source));
-  pragma Debug (New_Node_Debugging_Output (Destination));
-
   pragma Assert (Source /= Destination);
 
   Save_In_List : constant Boolean  := In_List (Destination);
@@ -1115,6 +1067,9 @@ package body Atree is
   D_Size : constant Field_Offset := Size_In_Slots_To_Alloc (Destination);
 
begin
+  New_Node_Debugging_Output (Source);
+  New_Node_Debugging_Output (Destination);
+
   --  Currently all entities are allocated the same number of slots.
   --  Hopefully that won't always be the case, but if it is, the following
   --  is suboptimal if D_Size < S_Size, because in fact the Destination was
@@ -1335,9 +1290,6 @@ package body Atree is
---
 
procedure Exchange_Entities (E1 : Entity_Id; E2 : Entity_Id) is
-  pragma Debug (New_Node_Debugging_Output (E1));
-  pragma Debug (New_Node_Debugging_Output (E2));
-
   pragma Debug (Validate_Node_Write (E1));
   pragma Debug (Validate_Node_Write (E2));
   pragma Assert
@@ -1363,6 +1315,9 @@ package body Atree is
  Set_Defining_Identifier (Parent (E1), E1);
  Set_Defining_Identifier (Parent (E2), E2);
   end if;
+
+  
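
The watch-node technique described in the comment removed from atree.adb can
be sketched in C as follows.  This is an analogue for illustration only: the
C names are invented, node ids are plain integers, and the printing that the
Ada routine performs is left out.

```c
#include <assert.h>

long watch_node = -1;          /* default matches no node id */
unsigned breakpoint_hits;      /* stands in for "the debugger stopped" */
static long next_node_id;

/* Does nothing interesting; exists so a breakpoint can be set on it.  */
void new_node_breakpoint(void)
{
    ++breakpoint_hits;
}

/* Called for every created node; triggers only for the watched id.  */
void new_node_debugging_output(long n)
{
    if (n == watch_node)
        new_node_breakpoint();
}

long new_node(void)
{
    long n = next_node_id++;
    new_node_debugging_output(n);
    return n;
}

/* Create `count` nodes while watching id `watched`; return the number
   of times the breakpoint routine was reached.  */
unsigned hits_while_creating(long watched, int count)
{
    watch_node = watched;
    breakpoint_hits = 0;
    next_node_id = 0;
    for (int i = 0; i < count; ++i)
        (void)new_node();
    return breakpoint_hits;
}
```

Setting the watched id from the debugger and breaking on the empty routine
avoids evaluating a conditional breakpoint on every node creation, which is
why the comment calls the second method much faster.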

[Ada] Robust detection of access-to-subprogram and access-to-object types

2021-05-07 Thread Pierre-Marie de Rodat
Routines Is_Access_Object_Type and Is_Access_Subprogram_Type were
arbitrarily categorizing E_Access_Subtype as an access-to-object, even
though it could represent an access-to-subprogram.

Now those routines examine not just the Ekind, but also the
Designated_Type of an access (sub)type, which is more reliable.

Only the handling of Can_Use_Internal_Rep and Convention flags need to
be adjusted, because they are set before the Designated_Type. However,
those flags are only set at base type anyway, so there is no problem
with E_Access_Subtype being wrongly recognized and we can safely rely on
the Ekind to detect access-to-subprograms.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* einfo-utils.adb (Is_Access_Object_Type): Use
Directly_Designated_Type.
(Is_Access_Subprogram_Type): Use Directly_Designated_Type.
(Set_Convention): Use plain Ekind.
* gen_il-gen-gen_entities.adb (Type_Kind): Use plain Ekind.
* sem_ch3.adb (Access_Type_Declaration): When seeing an illegal
completion with an access type don't attempt to decorate the
completion entity; previously the entity had its Ekind set to
E_General_Access_Type or E_Access_Type, but its Designated_Type
was empty, which caused a crash in freezing. (Actually, the
error recovery in the surrounding context is still incomplete,
e.g. we will crash when the illegal completion is an access to
an unknown identifier).

diff --git a/gcc/ada/einfo-utils.adb b/gcc/ada/einfo-utils.adb
--- a/gcc/ada/einfo-utils.adb
+++ b/gcc/ada/einfo-utils.adb
@@ -101,7 +101,8 @@ package body Einfo.Utils is
 
function Is_Access_Object_Type   (Id : E) return B is
begin
-  return Is_Access_Type (Id) and then not Is_Access_Subprogram_Type (Id);
+  return Is_Access_Type (Id)
+and then Ekind (Directly_Designated_Type (Id)) /= E_Subprogram_Type;
end Is_Access_Object_Type;
 
function Is_Access_Type  (Id : E) return B is
@@ -116,7 +117,8 @@ package body Einfo.Utils is
 
function Is_Access_Subprogram_Type   (Id : E) return B is
begin
-  return Ekind (Id) in Access_Subprogram_Kind;
+  return Is_Access_Type (Id)
+and then Ekind (Directly_Designated_Type (Id)) = E_Subprogram_Type;
end Is_Access_Subprogram_Type;
 
function Is_Aggregate_Type   (Id : E) return B is
@@ -2672,8 +2674,7 @@ package body Einfo.Utils is
begin
   Set_Basic_Convention (E, Val);
 
-  if Is_Type (E)
-and then Is_Access_Subprogram_Type (Base_Type (E))
+  if Ekind (E) in Access_Subprogram_Kind
 and then Has_Foreign_Convention (E)
   then
  Set_Can_Use_Internal_Rep (E, False);


diff --git a/gcc/ada/gen_il-gen-gen_entities.adb b/gcc/ada/gen_il-gen-gen_entities.adb
--- a/gcc/ada/gen_il-gen-gen_entities.adb
+++ b/gcc/ada/gen_il-gen-gen_entities.adb
@@ -480,7 +480,7 @@ begin -- Gen_IL.Gen.Gen_Entities
(Sm (Alignment, Uint),
 Sm (Associated_Node_For_Itype, Node_Id),
 Sm (Can_Use_Internal_Rep, Flag, Base_Type_Only,
-Pre => "Is_Access_Subprogram_Type (Base_Type (N))"),
+Pre => "Ekind (Base_Type (N)) in Access_Subprogram_Kind"),
 Sm (Class_Wide_Type, Node_Id),
 Sm (Contract, Node_Id),
 Sm (Current_Use_Clause, Node_Id),


diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -1354,6 +1354,7 @@ package body Sem_Ch3 is
 
 else
pragma Assert (Error_Posted (T));
+   return;
 end if;
 
 --  If the designated type is a limited view, we cannot tell if
@@ -6725,7 +6726,9 @@ package body Sem_Ch3 is
   Has_Private_Component (Derived_Type));
   Conditional_Delay  (Derived_Type, Subt);
 
-  if Is_Access_Subprogram_Type (Derived_Type) then
+  if Is_Access_Subprogram_Type (Derived_Type)
+and then Is_Base_Type (Derived_Type)
+  then
  Set_Can_Use_Internal_Rep
(Derived_Type, Can_Use_Internal_Rep (Parent_Type));
   end if;




[Ada] Reinitialize Private_Dependents when it is vanishing

2021-05-07 Thread Pierre-Marie de Rodat
We call Set_Ekind on a E_Incomplete_Subtype entity, and the
Private_Dependents field vanishes. This patch resets it to zero, as
required for vanishing fields.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch3.adb (Process_Incomplete_Dependents): Reset
Private_Dependents field to zero before calling Set_Ekind.  Also
move Set_Etype to after Set_Ekind, because it's always best to
set the Ekind as early as possible.
* atree.adb: Improve debugging facilities for vanishing fields.

diff --git a/gcc/ada/atree.adb b/gcc/ada/atree.adb
--- a/gcc/ada/atree.adb
+++ b/gcc/ada/atree.adb
@@ -29,6 +29,8 @@
 
 --  Checks and assertions in this package are too slow, and are mostly needed
 --  when working on this package itself, or on gen_il, so we disable them.
+--  To debug low-level bugs in this area, comment out the following pragmas,
+--  and run with -gnatd_v.
 
 pragma Suppress (All_Checks);
 pragma Assertion_Policy (Ignore);
@@ -868,7 +870,9 @@ package body Atree is
   Old_Kind : constant Entity_Kind := Ekind (Old_N);
 
   --  If this fails, it means you need to call Reinit_Field_To_Zero before
-  --  calling Set_Ekind.
+  --  calling Set_Ekind. But we have many cases where vanishing fields are
+  --  expected to reappear after converting to/from E_Void. Other cases are
+  --  more problematic; set a breakpoint on "(non-E_Void case)" below.
 
begin
   for J in Entity_Field_Table (Old_Kind)'Range loop
@@ -882,12 +886,14 @@ package body Atree is
   Write_Str (New_Kind'Img);
   Write_Str (" Nonzero field ");
   Write_Str (F'Img);
-  Write_Str (" is vanishing");
+  Write_Str (" is vanishing ");
   Write_Eol;
 
-  pragma Assert (New_Kind = E_Void or else Old_Kind = E_Void);
-
-  raise Program_Error;
+  if New_Kind = E_Void or else Old_Kind = E_Void then
+ Write_Line ("(E_Void case)");
+  else
+ Write_Line ("(non-E_Void case)");
+  end if;
end if;
 end if;
  end;


diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -21299,8 +21299,11 @@ package body Sem_Ch3 is
  then
 Set_Subtype_Indication
   (Parent (Priv_Dep), New_Occurrence_Of (Full_T, Sloc (Priv_Dep)));
-Set_Etype (Priv_Dep, Full_T);
+Reinit_Field_To_Zero
+  (Priv_Dep, Private_Dependents,
+   Old_Ekind => E_Incomplete_Subtype);
 Set_Ekind (Priv_Dep, Subtype_Kind (Ekind (Full_T)));
+Set_Etype (Priv_Dep, Full_T);
 Set_Analyzed (Parent (Priv_Dep), False);
 
 --  Reanalyze the declaration, suppressing the call to Enter_Name




[Ada] Fix type mismatch warnings during LTO bootstrap #4

2021-05-07 Thread Pierre-Marie de Rodat
There are 3 views of the exception record type in an Ada program: the
master is declared as Exception_Data in System.Standard_Library, the
compiler view is built by Cstand at the beginning of the compilation,
and the C view is declared in the raise.h header file.  These views must
be sufficiently alike in order for the LTO compiler to merge them into a
single type.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-stalib.ads (Exception_Data): Mark components as aliased.
* stand.ads (Standard_Entity_Type): Enhance comments.
* cstand.adb (Make_Component): Rename into...
(Make_Aliased_Component): ...this; set Is_Aliased and Is_Independent
flags on the component.
(Create_Standard): Adjust the types of the component of the record
Standard_Exception_Type and mark them as aliased.
* exp_ch11.adb (Expand_N_Exception_Declaration): Use OK
conversion to Standard_Address for Full_Name component, except
in CodePeer_Mode (set it to 0).
* exp_prag.adb (Expand_Pragma_Import_Or_Interface): Likewise.
* raise.h (struct Exception_Data): Change the type of Full_Name,
HTable_Ptr and Foreign_Data.

diff --git a/gcc/ada/cstand.adb b/gcc/ada/cstand.adb
--- a/gcc/ada/cstand.adb
+++ b/gcc/ada/cstand.adb
@@ -133,12 +133,12 @@ package body CStand is
--  Returns an identifier node with the same name as the defining identifier
--  corresponding to the given Standard_Entity_Type value.
 
-   procedure Make_Component
+   procedure Make_Aliased_Component
  (Rec : Entity_Id;
   Typ : Entity_Id;
   Nam : String);
-   --  Build a record component with the given type and name, and append to
-   --  the list of components of Rec.
+   --  Build an aliased record component with the given type and name,
+   --  and append to the list of components of Rec.
 
function Make_Formal (Typ : Entity_Id; Nam : String) return Entity_Id;
--  Construct entity for subprogram formal with given name and type
@@ -1495,38 +1495,40 @@ package body CStand is
   --  known by the run-time. Components of the record are documented in
   --  the declaration in System.Standard_Library.
 
-  Standard_Exception_Type := New_Standard_Entity ("exception");
-  Set_Ekind   (Standard_Exception_Type, E_Record_Type);
-  Set_Etype   (Standard_Exception_Type, Standard_Exception_Type);
-  Set_Scope   (Standard_Exception_Type, Standard_Standard);
-  Set_Stored_Constraint
-  (Standard_Exception_Type, No_Elist);
-  Init_Size_Align (Standard_Exception_Type);
-  Set_Size_Known_At_Compile_Time
-  (Standard_Exception_Type, True);
-
-  Make_Component
-(Standard_Exception_Type, Standard_Boolean,   "Not_Handled_By_Others");
-  Make_Component
-(Standard_Exception_Type, Standard_Character, "Lang");
-  Make_Component
-(Standard_Exception_Type, Standard_Natural,   "Name_Length");
-  Make_Component
-(Standard_Exception_Type, Standard_A_Char,"Full_Name");
-  Make_Component
-(Standard_Exception_Type, Standard_A_Char,"HTable_Ptr");
-  Make_Component
-(Standard_Exception_Type, Standard_A_Char,"Foreign_Data");
-  Make_Component
-(Standard_Exception_Type, Standard_A_Char,"Raise_Hook");
-
-  --  Build tree for record declaration, for use by the back-end
-
-  declare
- Comp_List : List_Id;
- Comp  : Entity_Id;
+  Build_Exception_Type : declare
+ Comp_List  : List_Id;
+ Comp   : Entity_Id;
 
   begin
+ Standard_Exception_Type := New_Standard_Entity ("exception");
+ Set_Ekind   (Standard_Exception_Type, E_Record_Type);
+ Set_Etype   (Standard_Exception_Type, Standard_Exception_Type);
+ Set_Scope   (Standard_Exception_Type, Standard_Standard);
+ Set_Stored_Constraint
+ (Standard_Exception_Type, No_Elist);
+ Init_Size_Align (Standard_Exception_Type);
+ Set_Size_Known_At_Compile_Time
+ (Standard_Exception_Type, True);
+
+ Make_Aliased_Component (Standard_Exception_Type, Standard_Boolean,
+ "Not_Handled_By_Others");
+ Make_Aliased_Component (Standard_Exception_Type, Standard_Character,
+ "Lang");
+ Make_Aliased_Component (Standard_Exception_Type, Standard_Natural,
+ "Name_Length");
+ Make_Aliased_Component (Standard_Exception_Type, Standard_Address,
+ "Full_Name");
+ Make_Aliased_Component (Standard_Exception_Type, Standard_A_Char,
+ "HTable_Ptr");
+ Make_Aliased_Component (Standard_Exception_Type, Standard_Address,
+ "Foreign_Data");
+ Make_Aliased_Component (Standard_Exception_Type, Standard_A_Char,
+  

[Ada] Variable-sized node types -- cleanup

2021-05-07 Thread Pierre-Marie de Rodat
Fix incorrect comments. Clean up ??? marks. Rename Set_Ekind to
Mutate_Ekind to match Mutate_Nkind.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* atree.ads, atree.adb, gen_il-gen.ads: Fix comments and clean
up ???  marks.  Rename Set_Ekind to be Mutate_Ekind.
* einfo.ads, sinfo.ads: Likewise.  Change "definitive
definition" to "official definition", because the former sounds
redundant.  Rename Set_Ekind to be Mutate_Ekind.
* checks.adb, contracts.adb, cstand.adb, exp_aggr.adb,
exp_attr.adb, exp_ch11.adb, exp_ch3.adb, exp_ch5.adb,
exp_ch6.adb, exp_ch7.adb, exp_ch9.adb, exp_disp.adb,
exp_dist.adb, exp_imgv.adb, exp_intr.adb, exp_prag.adb,
exp_unst.adb, exp_util.adb, gen_il-gen.adb, inline.adb,
lib-writ.adb, lib-xref-spark_specific.adb, sem_aggr.adb,
sem_ch10.adb, sem_ch11.adb, sem_ch12.adb, sem_ch13.adb,
sem_ch3.adb, sem_ch5.adb, sem_ch6.adb, sem_ch7.adb, sem_ch8.adb,
sem_ch9.adb, sem_dist.adb, sem_elab.adb, sem_prag.adb,
sem_util.adb: Rename Set_Ekind to be Mutate_Ekind.

patch.diff.gz
Description: application/gzip


[Ada] Fix link from body protected entry implementation to source code

2021-05-07 Thread Pierre-Marie de Rodat
CodePeer needs to recognize internally generated procedures that
implement protected entries. Previously this was done with an extra
field in the procedure entity; now it is done with an extra field in the
procedure body.

The new field bypasses the trouble with the procedure entity changing
its type from E_Void to E_Procedure to E_Subprogram_Body. Also, it is
closer to similar flags like Is_Protected_Subprogram_Body and
Is_Task_Body_Procedure.

Finally, the new field links bodies just like the old field linked
entities.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* einfo.ads: Move Corresponding_Protected_Entry...
* sinfo.ads: ... here.
* exp_ch9.adb (Build_Entry_Body): Link procedure and entry
bodies.
* gen_il-fields.ads (Opt_Field_Enum): Add
Corresponding_Entry_Body field to nodes; remove
Corresponding_Protected_Entry field from entities.
* gen_il-gen-gen_entities.adb (Gen_Entities): Remove
Corresponding_Protected_Entry field from E_Void and
E_Subprogram_Body.
* gen_il-gen-gen_nodes.adb (Gen_Nodes): Add
Corresponding_Entry_Body field to N_Subprogram_Body.
* sem_ch6.adb (Analyze_Subprogram_Specification): Remove
manipulation of Ekind and Corresponding_Protected_Entry added as
part of the support for varsize-nodes.

diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads
--- a/gcc/ada/einfo.ads
+++ b/gcc/ada/einfo.ads
@@ -786,10 +786,6 @@ package Einfo is
 --   Modify_Tree_For_C is set. Denotes the internally built procedure
 --   with an extra out parameter created for it.
 
---Corresponding_Protected_Entry (Node18)
---   Defined in subprogram bodies. Set for subprogram bodies that implement
---   a protected type entry to point to the entity for the entry.
-
 --Corresponding_Record_Component (Node21)
 --   Defined in components of a derived untagged record type, including
 --   discriminants. For a regular component or a girder discriminant,


diff --git a/gcc/ada/exp_ch9.adb b/gcc/ada/exp_ch9.adb
--- a/gcc/ada/exp_ch9.adb
+++ b/gcc/ada/exp_ch9.adb
@@ -3779,10 +3779,6 @@ package body Exp_Ch9 is
raise Program_Error;
  end case;
 
- --  Establish link between subprogram body entity and source entry
-
- Set_Corresponding_Protected_Entry (Bod_Id, Ent);
-
  --  Create body of entry procedure. The renaming declarations are
  --  placed ahead of the block that contains the actual entry body.
 
@@ -3816,6 +3812,10 @@ package body Exp_Ch9 is
New_Occurrence_Of
  (RTE (RE_Get_GNAT_Exception), Loc);
 
+ --  Establish link between subprogram body and source entry body
+
+ Set_Corresponding_Entry_Body (Proc_Body, N);
+
  Reset_Scopes_To (Proc_Body, Protected_Body_Subprogram (Ent));
  return Proc_Body;
   end if;


diff --git a/gcc/ada/gen_il-fields.ads b/gcc/ada/gen_il-fields.ads
--- a/gcc/ada/gen_il-fields.ads
+++ b/gcc/ada/gen_il-fields.ads
@@ -113,6 +113,7 @@ package Gen_IL.Fields is
   Convert_To_Return_False,
   Corresponding_Aspect,
   Corresponding_Body,
+  Corresponding_Entry_Body,
   Corresponding_Formal_Spec,
   Corresponding_Generic_Association,
   Corresponding_Integer_Value,
@@ -464,7 +465,6 @@ package Gen_IL.Fields is
   Corresponding_Equality,
   Corresponding_Function,
   Corresponding_Procedure,
-  Corresponding_Protected_Entry,
   Corresponding_Record_Component,
   Corresponding_Record_Type,
   Corresponding_Remote_Type,


diff --git a/gcc/ada/gen_il-gen-gen_entities.adb b/gcc/ada/gen_il-gen-gen_entities.adb
--- a/gcc/ada/gen_il-gen-gen_entities.adb
+++ b/gcc/ada/gen_il-gen-gen_entities.adb
@@ -244,7 +244,6 @@ begin -- Gen_IL.Gen.Gen_Entities
 Sm (Scope_Depth_Value, Uint),
 Sm (SPARK_Pragma, Node_Id),
 Sm (SPARK_Pragma_Inherited, Flag),
-Sm (Corresponding_Protected_Entry, Node_Id), -- setter only
 Sm (Current_Value, Node_Id), -- setter only
 Sm (Has_Predicates, Flag), -- setter only
 Sm (Initialization_Statements, Node_Id), -- setter only
@@ -1245,7 +1244,6 @@ begin -- Gen_IL.Gen.Gen_Entities
Cc (E_Subprogram_Body, Entity_Kind,
(Sm (Anonymous_Masters, Elist_Id),
 Sm (Contract, Node_Id),
-Sm (Corresponding_Protected_Entry, Node_Id),
 Sm (Extra_Formals, Node_Id),
 Sm (First_Entity, Node_Id),
 Sm (Ignore_SPARK_Mode_Pragmas, Flag),


diff --git a/gcc/ada/gen_il-gen-gen_nodes.adb b/gcc/ada/gen_il-gen-gen_nodes.adb
--- a/gcc/ada/gen_il-gen-gen_nodes.adb
+++ b/gcc/ada/gen_il-gen-gen_nodes.adb
@@ -790,6 +790,7 @@ begin -- Gen_IL.Gen.Gen_Nodes
 Sy (Bad_Is_Detected, Flag),
 Sm (Activation_Chain_Entity, Node_Id),
 Sm (Acts_As_Spec, Flag),
+Sm (Corresponding_Entry_Body, Node_Id),
 Sm 

[Ada] Replace packed records with integers in low-level implementation

2021-05-07 Thread Pierre-Marie de Rodat
Accessing components in packed record types turns out to be too slow
for practical use, so this replaces them with integer types.
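
As a rough illustration of the shift-and-mask scheme, here is a minimal
C sketch, not the committed code. All sketch_* names are hypothetical;
the getter mirrors the arithmetic visible in the atree.adb hunk (L
fields per slot, shift V), while the setter is an assumption about how
the Set_n_Bit_Val side could look.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; only the shift-and-mask arithmetic follows the
   patch.  A slot is a plain 32-bit modular integer.  */
#define SKETCH_SLOT_SIZE 32u

typedef uint32_t sketch_slot;

/* Read the field of WIDTH bits (1, 2, 4 or 8) at index OFFSET, counting
   fields of that width from the start of SLOTS.  */
uint32_t
sketch_get_field (const sketch_slot *slots, unsigned offset, unsigned width)
{
  assert (width == 1 || width == 2 || width == 4 || width == 8);

  const unsigned per_slot = SKETCH_SLOT_SIZE / width;  /* L in the patch */
  const unsigned shift = (offset % per_slot) * width;  /* V in the patch */
  const uint32_t mask = (UINT32_C (1) << width) - 1;

  return (slots[offset / per_slot] >> shift) & mask;
}

/* Write VALUE into the same field; assumed counterpart of the getter.  */
void
sketch_set_field (sketch_slot *slots, unsigned offset, unsigned width,
                  uint32_t value)
{
  assert (width == 1 || width == 2 || width == 4 || width == 8);

  const unsigned per_slot = SKETCH_SLOT_SIZE / width;
  const unsigned shift = (offset % per_slot) * width;
  const uint32_t mask = (UINT32_C (1) << width) - 1;
  sketch_slot *s = &slots[offset / per_slot];

  assert (value <= mask);

  /* Clear the old bits of the field, then merge in the new ones.  */
  *s = (*s & ~(mask << shift)) | (value << shift);
}
```

The point of the design is that every field width shares one
computation instead of a 32-way case statement over record components.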

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* atree.ads (Slot): Change to modular type.
(Slot_1_Bit): Delete.
(Slot_2_Bit): Likewise.
(Slot_4_Bit): Likewise.
(Slot_8_Bit): Likewise.
(Slot_32_Bit): Likewise.
* atree.adb (Get_1_Bit_Val): Adjust to above change.
(Get_2_Bit_Val): Likewise.
(Get_4_Bit_Val): Likewise.
(Get_8_Bit_Val): Likewise.
(Get_32_Bit_Val): Likewise.
(Set_1_Bit_Val): Likewise.
(Set_2_Bit_Val): Likewise.
(Set_4_Bit_Val): Likewise.
(Set_8_Bit_Val): Likewise.
(Set_32_Bit_Val): Likewise.
(Print_Atree_Info): Likewise.
(Zero): Likewise.
* atree.h (Get_1_Bit_Field): Likewise.
(Get_2_Bit_Field): Likewise.
(Get_4_Bit_Field): Likewise.
(Get_8_Bit_Field): Likewise.
(Get_32_Bit_Field): Likewise.
(Get_32_Bit_Field_With_Default): Likewise.
* types.h (slot_1_bit): Delete.
(slot_2_bit): Likewise.
(slot_4_bit): Likewise.
(slot_8_bit): Likewise.
(slot_32_bit): Likewise.
(any_slot): Change to unsigned int.
(Slot_Size): New macro.

diff --git a/gcc/ada/atree.adb b/gcc/ada/atree.adb
--- a/gcc/ada/atree.adb
+++ b/gcc/ada/atree.adb
@@ -599,119 +599,55 @@ package body Atree is
 (N : Node_Or_Entity_Id; Offset : Field_Offset) return Field_1_Bit
   is
  --  We wish we were using packed arrays, but instead we're simulating
- --  packed arrays using packed records. L here (and elsewhere) is the
- --  'Length of that array.
- L : constant Field_Offset := 32;
+ --  them with modular integers. L here (and elsewhere) is the 'Length
+ --  of that simulated array.
+ L : constant Field_Offset := Slot_Size / 1;
 
  pragma Debug (Validate_Node_And_Offset (N, Offset / L));
 
- subtype Offset_In_Slot is Field_Offset range 0 .. L - 1;
  S : Slot renames Slots.Table (Node_Offsets.Table (N) + Offset / L);
+ V : constant Integer := Integer ((Offset mod L) * (Slot_Size / L));
   begin
- case Offset_In_Slot'(Offset mod L) is
-when 0 => return S.Slot_1.F0;
-when 1 => return S.Slot_1.F1;
-when 2 => return S.Slot_1.F2;
-when 3 => return S.Slot_1.F3;
-when 4 => return S.Slot_1.F4;
-when 5 => return S.Slot_1.F5;
-when 6 => return S.Slot_1.F6;
-when 7 => return S.Slot_1.F7;
-when 8 => return S.Slot_1.F8;
-when 9 => return S.Slot_1.F9;
-when 10 => return S.Slot_1.F10;
-when 11 => return S.Slot_1.F11;
-when 12 => return S.Slot_1.F12;
-when 13 => return S.Slot_1.F13;
-when 14 => return S.Slot_1.F14;
-when 15 => return S.Slot_1.F15;
-when 16 => return S.Slot_1.F16;
-when 17 => return S.Slot_1.F17;
-when 18 => return S.Slot_1.F18;
-when 19 => return S.Slot_1.F19;
-when 20 => return S.Slot_1.F20;
-when 21 => return S.Slot_1.F21;
-when 22 => return S.Slot_1.F22;
-when 23 => return S.Slot_1.F23;
-when 24 => return S.Slot_1.F24;
-when 25 => return S.Slot_1.F25;
-when 26 => return S.Slot_1.F26;
-when 27 => return S.Slot_1.F27;
-when 28 => return S.Slot_1.F28;
-when 29 => return S.Slot_1.F29;
-when 30 => return S.Slot_1.F30;
-when 31 => return S.Slot_1.F31;
- end case;
+ return Field_1_Bit (Shift_Right (S, V) and 1);
   end Get_1_Bit_Val;
 
   function Get_2_Bit_Val
 (N : Node_Or_Entity_Id; Offset : Field_Offset) return Field_2_Bit
   is
- L : constant Field_Offset := 16;
+ L : constant Field_Offset := Slot_Size / 2;
 
  pragma Debug (Validate_Node_And_Offset (N, Offset / L));
 
- subtype Offset_In_Slot is Field_Offset range 0 .. L - 1;
  S : Slot renames Slots.Table (Node_Offsets.Table (N) + Offset / L);
+ V : constant Integer := Integer ((Offset mod L) * (Slot_Size / L));
   begin
- case Offset_In_Slot'(Offset mod L) is
-when 0 => return S.Slot_2.F0;
-when 1 => return S.Slot_2.F1;
-when 2 => return S.Slot_2.F2;
-when 3 => return S.Slot_2.F3;
-when 4 => return S.Slot_2.F4;
-when 5 => return S.Slot_2.F5;
-when 6 => return S.Slot_2.F6;
-when 7 => return S.Slot_2.F7;
-when 8 => return S.Slot_2.F8;
-when 9 => return S.Slot_2.F9;
-when 10 => return S.Slot_2.F10;
-when 11 => return S.Slot_2.F11;
-when 12 => return S.Slot_2.F12;
- 

[Ada] Fix type mismatch warnings during LTO bootstrap #3

2021-05-07 Thread Pierre-Marie de Rodat
This changes a type name to avoid a violation of the C++ ODR with LTO,
sets convention C on another enumeration type and declares a matching
enumeration type in C, fixes a blatant type mismatch, marks components
as aliased in a couple of record types and tweaks one of them a bit more.
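
One of the type fixes is the way the Uints and Udigits tables are
reached from C: the cuintp.c hunk indexes through a pointer to the
table and subtracts Uint_Table_Start explicitly, instead of relying on
a pointer whose origin was shifted. A minimal sketch of that access
pattern, with hypothetical sketch_* names and made-up constants:

```c
#include <assert.h>

/* Hypothetical constants; the real table bounds come from the front end.  */
#define SKETCH_TABLE_START 1000
#define SKETCH_TABLE_LEN 4

struct sketch_uint_entry
{
  int length;
  int loc;
};

typedef struct sketch_uint_entry sketch_uint_table[SKETCH_TABLE_LEN];

/* Pointer to the whole table, rather than a pointer to an element that
   was pre-shifted so that SKETCH_TABLE_START lands on the first entry.  */
sketch_uint_table *sketch_uints_ptr;

/* Return the length stored for identifier ID, removing the bias at the
   point of use.  */
int
sketch_uint_length (int id)
{
  assert (sketch_uints_ptr != 0);
  assert (id >= SKETCH_TABLE_START
          && id < SKETCH_TABLE_START + SKETCH_TABLE_LEN);

  return (*sketch_uints_ptr)[id - SKETCH_TABLE_START].length;
}
```

With the bias applied at each access, the pointer always designates a
real object of the declared type, which is what the LTO type checks
compare.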

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* atree.h (Slots_Ptr): Change pointed-to type to any_slot.
* fe.h (Get_RT_Exception_Name): Change type of parameter.
* namet.ads (Name_Entry): Mark non-boolean components as aliased,
reorder the boolean components and add an explicit Spare component.
* namet.adb (Name_Enter): Adjust aggregate accordingly.
(Name_Find): Likewise.
(Reinitialize): Likewise.
* namet.h (struct Name_Entry): Adjust accordingly.
(Names_Ptr): Use correct type.
(Name_Chars_Ptr): Likewise.
(Get_Name_String): Fix declaration and adjust to above changes.
* types.ads (RT_Exception_Code): Add pragma Convention C.
* types.h (Column_Number_Type): Fix original type.
(slot): Rename union type to...
(any_slot): ...this and adjust assertion accordingly.
(RT_Exception_Code): New enumeration type.
* uintp.ads (Uint_Entry): Mark components as aliased.
* uintp.h (Uints_Ptr):  Use correct type.
(Udigits_Ptr): Likewise.
* gcc-interface/gigi.h (gigi): Adjust name and type of parameter.
* gcc-interface/cuintp.c (UI_To_gnu): Adjust references to Uints_Ptr
and Udigits_Ptr.
* gcc-interface/trans.c (Slots_Ptr): Adjust pointed-to type.
(gigi): Adjust type of parameter.
(build_raise_check): Add cast in call to Get_RT_Exception_Name.

diff --git a/gcc/ada/atree.h b/gcc/ada/atree.h
--- a/gcc/ada/atree.h
+++ b/gcc/ada/atree.h
@@ -70,7 +70,7 @@ extern Node_Id Current_Error_Node;
these even-lower-level getters.  */
 
 extern Field_Offset *Node_Offsets_Ptr;
-extern slot *Slots_Ptr;
+extern any_slot *Slots_Ptr;
 
 INLINE Union_Id Get_1_Bit_Field (Node_Id N, Field_Offset Offset);
 INLINE Union_Id Get_2_Bit_Field (Node_Id N, Field_Offset Offset);


diff --git a/gcc/ada/fe.h b/gcc/ada/fe.h
--- a/gcc/ada/fe.h
+++ b/gcc/ada/fe.h
@@ -122,7 +122,7 @@ extern Uint Error_Msg_Uint_2;
 
 extern Entity_Id Get_Local_Raise_Call_Entity	(void);
 extern Entity_Id Get_RT_Exception_Entity	(int);
-extern void Get_RT_Exception_Name		(int);
+extern void Get_RT_Exception_Name		(enum RT_Exception_Code);
 extern void Warn_If_No_Local_Raise		(int);
 
 /* exp_code:  */


diff --git a/gcc/ada/gcc-interface/cuintp.c b/gcc/ada/gcc-interface/cuintp.c
--- a/gcc/ada/gcc-interface/cuintp.c
+++ b/gcc/ada/gcc-interface/cuintp.c
@@ -49,8 +49,7 @@
 
For efficiency, this method is used only for integer values larger than the
constant Uint_Bias.  If a Uint is less than this constant, then it contains
-   the integer value itself.  The origin of the Uints_Ptr table is adjusted so
-   that a Uint value of Uint_Bias indexes the first element.
+   the integer value itself.
 
First define a utility function that is build_int_cst for integral types and
does a conversion for floating-point types.  */
@@ -85,9 +84,9 @@ UI_To_gnu (Uint Input, tree type)
 gnu_ret = build_cst_from_int (comp_type, Input - Uint_Direct_Bias);
   else
 {
-  Int Idx = Uints_Ptr[Input].Loc;
-  Pos Length = Uints_Ptr[Input].Length;
-  Int First = Udigits_Ptr[Idx];
+  Int Idx = (*Uints_Ptr)[Input - Uint_Table_Start].Loc;
+  Pos Length = (*Uints_Ptr)[Input - Uint_Table_Start].Length;
+  Int First = (*Udigits_Ptr)[Idx];
   tree gnu_base;
 
   gcc_assert (Length > 0);
@@ -109,14 +108,14 @@ UI_To_gnu (Uint Input, tree type)
  fold_build2 (MULT_EXPR, comp_type,
 	  gnu_ret, gnu_base),
  build_cst_from_int (comp_type,
-		 Udigits_Ptr[Idx]));
+		 (*Udigits_Ptr)[Idx]));
   else
 	for (Idx++, Length--; Length; Idx++, Length--)
 	  gnu_ret = fold_build2 (PLUS_EXPR, comp_type,
  fold_build2 (MULT_EXPR, comp_type,
 	  gnu_ret, gnu_base),
  build_cst_from_int (comp_type,
-		 Udigits_Ptr[Idx]));
+		 (*Udigits_Ptr)[Idx]));
 }
 
   gnu_ret = convert (type, gnu_ret);


diff --git a/gcc/ada/gcc-interface/gigi.h b/gcc/ada/gcc-interface/gigi.h
--- a/gcc/ada/gcc-interface/gigi.h
+++ b/gcc/ada/gcc-interface/gigi.h
@@ -235,7 +235,7 @@ extern void gigi (Node_Id gnat_root,
 	  int max_gnat_node,
 		  int number_name,
 		  Field_Offset *node_offsets_ptr,
-		  slot *Slots,
+		  any_slot *slots_ptr,
 		  Node_Id *next_node_ptr,
 		  Node_Id *prev_node_ptr,
 		  struct Elist_Header *elists_ptr,


diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -76,7 +76,7 @@
 
 /* Pointers to front-end tables accessed through macros.  */
 Field_Offset *Node_Offsets_Ptr;
-slot *Slots_Ptr;
+any_slot *Slots_Ptr;
 

[Ada] Fix type mismatch warnings during LTO bootstrap #5

2021-05-07 Thread Pierre-Marie de Rodat
This changes the C interface to Ada.Exceptions.Exception_Propagation from
using the opaque _Unwind_Ptr to using the explicit Exception_Id, which is
the C view of the Exception_Data_Ptr declared in System.Standard_Library.
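
A minimal sketch of what a typed exception id buys on the C side,
using hypothetical sketch_* names and a two-field stand-in for struct
Exception_Data. The matching rules below are a simplification assumed
for illustration, not a copy of is_handled_by. Note also that the
patch keeps the sentinels as char objects and casts their addresses,
whereas this sketch declares them with the record type to stay within
strict C alignment rules.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the run-time exception record.  */
struct sketch_exception_data
{
  bool not_handled_by_others;
  char lang;
};

/* Typed id: a pointer to the record, not an opaque integer-like value.  */
typedef struct sketch_exception_data *sketch_exception_id;

/* Sentinel objects; only their addresses matter.  */
struct sketch_exception_data sketch_others_value;
struct sketch_exception_data sketch_all_others_value;

#define SKETCH_OTHERS (&sketch_others_value)
#define SKETCH_ALL_OTHERS (&sketch_all_others_value)

bool
sketch_is_handled_by_others (sketch_exception_id eid)
{
  assert (eid != 0);
  return !eid->not_handled_by_others;
}

/* Simplified matching: "all others" matches everything, "others" matches
   unless the exception opts out, otherwise ids must be identical.  */
bool
sketch_choice_matches (sketch_exception_id choice, sketch_exception_id raised)
{
  if (choice == SKETCH_ALL_OTHERS)
    return true;

  if (choice == SKETCH_OTHERS)
    return sketch_is_handled_by_others (raised);

  return choice == raised;
}
```

Because every function takes the same pointer type, the declarations
seen by C and the ones exported by the Ada body describe the same
thing, and no integer-to-pointer conversion is needed at the calls.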

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* raise-gcc.c (__gnat_others_value): Remove const qualifier.
(__gnat_all_others_value): Likewise.
(__gnat_unhandled_others_value): Likewise.
(GNAT_OTHERS): Cast to Exception_Id instead of _Unwind_Ptr.
(GNAT_ALL_OTHERS): Likewise.
(GNAT_UNHANDLED_OTHERS): Likewise.
(Is_Handled_By_Others): Change parameter type to Exception_Id.
(Language_For): Likewise.
(Foreign_Data_For): Likewise.
(is_handled_by): Likewise.  Adjust throughout, remove redundant
line and fix indentation.
* libgnat/a-exexpr.adb (Is_Handled_By_Others): Remove pragma and
useless qualification from parameter type.
(Foreign_Data_For): Likewise.
(Language_For): Likewise.

diff --git a/gcc/ada/libgnat/a-exexpr.adb b/gcc/ada/libgnat/a-exexpr.adb
--- a/gcc/ada/libgnat/a-exexpr.adb
+++ b/gcc/ada/libgnat/a-exexpr.adb
@@ -282,7 +282,6 @@ package body Exception_Propagation is
 
function Is_Handled_By_Others (E : Exception_Data_Ptr) return bool;
pragma Export (C, Is_Handled_By_Others, "__gnat_is_handled_by_others");
-   pragma Warnings (Off, Is_Handled_By_Others);
 
function Language_For (E : Exception_Data_Ptr) return Character;
pragma Export (C, Language_For, "__gnat_language_for");
@@ -688,7 +687,7 @@ package body Exception_Propagation is
-- Foreign_Data_For --
--
 
-   function Foreign_Data_For (E : SSL.Exception_Data_Ptr) return Address is
+   function Foreign_Data_For (E : Exception_Data_Ptr) return Address is
begin
   return E.Foreign_Data;
end Foreign_Data_For;
@@ -697,7 +696,7 @@ package body Exception_Propagation is
-- Is_Handled_By_Others --
--
 
-   function Is_Handled_By_Others (E : SSL.Exception_Data_Ptr) return bool is
+   function Is_Handled_By_Others (E : Exception_Data_Ptr) return bool is
begin
   return not bool (E.all.Not_Handled_By_Others);
end Is_Handled_By_Others;
@@ -706,7 +705,7 @@ package body Exception_Propagation is
-- Language_For --
--
 
-   function Language_For (E : SSL.Exception_Data_Ptr) return Character is
+   function Language_For (E : Exception_Data_Ptr) return Character is
begin
   return E.all.Lang;
end Language_For;


diff --git a/gcc/ada/raise-gcc.c b/gcc/ada/raise-gcc.c
--- a/gcc/ada/raise-gcc.c
+++ b/gcc/ada/raise-gcc.c
@@ -545,14 +545,14 @@ typedef struct
 /* The three constants below are specific ttype identifiers for special
exception ids.  Their type should match what a-exexpr exports.  */
 
-extern const char __gnat_others_value;
-#define GNAT_OTHERS ((_Unwind_Ptr) &__gnat_others_value)
+extern char __gnat_others_value;
+#define GNAT_OTHERS ((Exception_Id) &__gnat_others_value)
 
-extern const char __gnat_all_others_value;
-#define GNAT_ALL_OTHERS ((_Unwind_Ptr) &__gnat_all_others_value)
+extern char __gnat_all_others_value;
+#define GNAT_ALL_OTHERS ((Exception_Id) &__gnat_all_others_value)
 
-extern const char __gnat_unhandled_others_value;
-#define GNAT_UNHANDLED_OTHERS ((_Unwind_Ptr) &__gnat_unhandled_others_value)
+extern char __gnat_unhandled_others_value;
+#define GNAT_UNHANDLED_OTHERS ((Exception_Id) &__gnat_unhandled_others_value)
 
 /* Describe the useful region data associated with an unwind context.  */
 
@@ -902,12 +902,10 @@ get_call_site_action_for (_Unwind_Ptr ip,
 #define Foreign_Data_For  __gnat_foreign_data_for
 #define EID_For   __gnat_eid_for
 
-extern bool Is_Handled_By_Others (_Unwind_Ptr eid);
-extern char Language_For (_Unwind_Ptr eid);
-
-extern void *Foreign_Data_For (_Unwind_Ptr eid);
-
-extern Exception_Id EID_For (_GNAT_Exception * e);
+extern bool Is_Handled_By_Others (Exception_Id eid);
+extern char Language_For (Exception_Id eid);
+extern void *Foreign_Data_For(Exception_Id eid);
+extern Exception_Id EID_For  (_GNAT_Exception *e);
 
 #define Foreign_Exception system__exceptions__foreign_exception
 extern struct Exception_Data Foreign_Exception;
@@ -928,7 +926,7 @@ exception_class_eq (const _GNAT_Exception *except,
 /* Return how CHOICE matches PROPAGATED_EXCEPTION.  */
 
 static enum action_kind
-is_handled_by (_Unwind_Ptr choice, _GNAT_Exception *propagated_exception)
+is_handled_by (Exception_Id choice, _GNAT_Exception *propagated_exception)
 {
   /* All others choice match everything.  */
   if (choice == GNAT_ALL_OTHERS)
@@ -937,14 +935,10 @@ is_handled_by (_Unwind_Ptr choice, _GNAT_Exception *propagated_exception)
   /* GNAT exception occurrence.  */
   if (exception_class_eq (propagated_exception, GNAT_EXCEPTION_CLASS))
 {
-  /* Pointer to the GNAT exception data corresponding to the propagated
-

[Ada] Fix type mismatch warnings during LTO bootstrap #2

2021-05-07 Thread Pierre-Marie de Rodat
This fixes the type of parameters and variables in the C code, changes
the convention of Raise_From_Signal_Handler to C and uses a compatible
boolean type for Is_Handled_By_Others.
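
To illustrate the shape of the new C declaration (one prototype after
the alias macro, message passed as const void *, marked no-return),
here is a small self-contained sketch. The sketch_* names are
hypothetical, SKETCH_NORETURN stands in for ATTRIBUTE_NORETURN, and
longjmp merely simulates a routine that never returns to its caller.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Stand-in for ATTRIBUTE_NORETURN; empty on compilers without the
   GNU attribute syntax.  */
#if defined (__GNUC__)
#define SKETCH_NORETURN __attribute__ ((noreturn))
#else
#define SKETCH_NORETURN
#endif

/* Hypothetical stand-in for struct Exception_Data.  */
struct sketch_exception_data
{
  char lang;
};

jmp_buf sketch_handler;
const void *sketch_last_message;

/* Single declaration: the message is an untyped address, matching an
   Ada parameter of type System.Address.  */
extern void sketch_raise_from_signal_handler (struct sketch_exception_data *,
                                              const void *)
  SKETCH_NORETURN;

void
sketch_raise_from_signal_handler (struct sketch_exception_data *e,
                                  const void *msg)
{
  assert (e != NULL && msg != NULL);
  sketch_last_message = msg;
  longjmp (sketch_handler, 1);  /* Never returns to the caller.  */
}

/* Return 1 if control came back through the handler with a message.  */
int
sketch_run (void)
{
  static struct sketch_exception_data program_error = { 'A' };

  if (setjmp (sketch_handler) == 0)
    sketch_raise_from_signal_handler (&program_error, "boom");

  return sketch_last_message != NULL;
}
```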

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* init.c (__gnat_raise_program_error): Fix parameter type.
(Raise_From_Signal_Handler): Likewise and mark as no-return.
* raise-gcc.c (__gnat_others_value): Fix type.
(__gnat_all_others_value): Likewise.
(__gnat_unhandled_others_value): Likewise.
* seh_init.c (Raise_From_Signal_Handler): Fix parameter type.
* libgnat/a-except.ads (Raise_From_Signal_Handler): Use convention C
and new symbol name, move declaration to...
(Raise_From_Controlled_Operation): Minor tweak.
* libgnat/a-except.adb (Raise_From_Signal_Handler): ...here.
* libgnat/a-exexpr.adb (bool): New C compatible boolean type.
(Is_Handled_By_Others): Use it as return type for the function.

diff --git a/gcc/ada/init.c b/gcc/ada/init.c
--- a/gcc/ada/init.c
+++ b/gcc/ada/init.c
@@ -78,7 +78,7 @@
 extern "C" {
 #endif
 
-extern void __gnat_raise_program_error (const char *, int);
+extern void __gnat_raise_program_error (const void *, int);
 
 /* Addresses of exception data blocks for predefined exceptions.  Tasking_Error
is not used in this unit, and the abort signal is only used on IRIX.
@@ -89,17 +89,16 @@ extern struct Exception_Data program_error;
 extern struct Exception_Data storage_error;
 
 /* For the Cert run time we use the regular raise exception routine because
-   Raise_From_Signal_Handler is not available.  */
+   __gnat_raise_from_signal_handler is not available.  */
 #ifdef CERT
-#define Raise_From_Signal_Handler \
-  __gnat_raise_exception
-extern void Raise_From_Signal_Handler (struct Exception_Data *, const char *);
+#define Raise_From_Signal_Handler __gnat_raise_exception
 #else
-#define Raise_From_Signal_Handler \
-  ada__exceptions__raise_from_signal_handler
-extern void Raise_From_Signal_Handler (struct Exception_Data *, const char *);
+#define Raise_From_Signal_Handler __gnat_raise_from_signal_handler
 #endif
 
+extern void Raise_From_Signal_Handler (struct Exception_Data *, const void *)
+  ATTRIBUTE_NORETURN;
+
 /* Global values computed by the binder.  Note that these variables are
declared here, not in the binder file, to avoid having unresolved
references in the shared libgnat.  */


diff --git a/gcc/ada/libgnat/a-except.adb b/gcc/ada/libgnat/a-except.adb
--- a/gcc/ada/libgnat/a-except.adb
+++ b/gcc/ada/libgnat/a-except.adb
@@ -279,6 +279,23 @@ package body Ada.Exceptions is
pragma No_Return (Raise_Exception_No_Defer);
--  Similar to Raise_Exception, but with no abort deferral
 
+   procedure Raise_From_Signal_Handler
+ (E : Exception_Id;
+  M : System.Address);
+   pragma Export
+ (C, Raise_From_Signal_Handler, "__gnat_raise_from_signal_handler");
+   pragma No_Return (Raise_From_Signal_Handler);
+   --  This routine is used to raise an exception from a signal handler. The
+   --  signal handler has already stored the machine state (i.e. the state that
+   --  corresponds to the location at which the signal was raised). E is the
+   --  Exception_Id specifying what exception is being raised, and M is a
+   --  pointer to a null-terminated string which is the message to be raised.
+   --  Note that this routine never returns, so it is permissible to simply
+   --  jump to this routine, rather than call it. This may be appropriate for
+   --  systems where the right way to get out of signal handler is to alter the
+   --  PC value in the machine state or in some other way ask the operating
+   --  system to return here rather than to the original location.
+
procedure Raise_With_Msg (E : Exception_Id);
pragma No_Return (Raise_With_Msg);
pragma Export (C, Raise_With_Msg, "__gnat_raise_with_msg");


diff --git a/gcc/ada/libgnat/a-except.ads b/gcc/ada/libgnat/a-except.ads
--- a/gcc/ada/libgnat/a-except.ads
+++ b/gcc/ada/libgnat/a-except.ads
@@ -184,26 +184,7 @@ private
--  Raise_Exception_Always if it can determine this is the case. The Export
--  allows this routine to be accessed from Pure units.
 
-   procedure Raise_From_Signal_Handler
- (E : Exception_Id;
-  M : System.Address);
-   pragma Export
- (Ada, Raise_From_Signal_Handler,
-   "ada__exceptions__raise_from_signal_handler");
-   pragma No_Return (Raise_From_Signal_Handler);
-   --  This routine is used to raise an exception from a signal handler. The
-   --  signal handler has already stored the machine state (i.e. the state that
-   --  corresponds to the location at which the signal was raised). E is the
-   --  Exception_Id specifying what exception is being raised, and M is a
-   --  pointer to a null-terminated string which is the message to be raised.
-   --  Note that this routine never returns, so it is permissible 

[Ada] Small cleanup in C header file

2021-05-07 Thread Pierre-Marie de Rodat
This fixes a few deviations from the GCC Coding Conventions, mismatches
between function declarations and definitions, and other minor things.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* atree.h (Parent): Remove duplicate declaration.
(Get_1_Bit_Field): Also use INLINE specifier in the declaration,
fix formatting and use gcc_unreachable for the default case.
(Get_2_Bit_Field): Likewise.
(Get_4_Bit_Field): Likewise.
(Get_8_Bit_Field): Likewise.
(Get_32_Bit_Field): Likewise.
(Get_32_Bit_Field_With_Default): Likewise.

diff --git a/gcc/ada/atree.h b/gcc/ada/atree.h
--- a/gcc/ada/atree.h
+++ b/gcc/ada/atree.h
@@ -41,14 +41,14 @@ extern Node_Id Parent (Node_Id);
 #define Original_Node atree__original_node
 extern Node_Id Original_Node (Node_Id);
 
-/* Type used for union of Node_Id, List_Id, Elist_Id. */
+/* Type used for union of Node_Id, List_Id, Elist_Id.  */
 typedef Int Tree_Id;
 
 /* These two functions can only be used for Node_Id and List_Id values and
they work in the C version because Empty = No_List = 0.  */
 
-static Boolean No	(Tree_Id);
-static Boolean Present	(Tree_Id);
+INLINE Boolean No (Tree_Id);
+INLINE Boolean Present (Tree_Id);
 
 INLINE Boolean
 No (Tree_Id N)
@@ -62,33 +62,32 @@ Present (Tree_Id N)
   return !No (N);
 }
 
-extern Node_Id Parent		(Tree_Id);
-
 #define Current_Error_Node atree__current_error_node
 extern Node_Id Current_Error_Node;
 
-// The following code corresponds to the Get_n_Bit_Field functions (for
-// various n) in package Atree. The low-level getters in sinfo.h call
-// these even-lower-level getters.
+/* The following code corresponds to the Get_n_Bit_Field functions (for
+   various n) in package Atree.  The low-level getters in sinfo.h call
+   these even-lower-level getters.  */
 
 extern Field_Offset *Node_Offsets_Ptr;
-extern slot* Slots_Ptr;
+extern slot *Slots_Ptr;
 
-static Union_Id Get_1_Bit_Field(Node_Id N, Field_Offset Offset);
-static Union_Id Get_2_Bit_Field(Node_Id N, Field_Offset Offset);
-static Union_Id Get_4_Bit_Field(Node_Id N, Field_Offset Offset);
-static Union_Id Get_8_Bit_Field(Node_Id N, Field_Offset Offset);
-static Union_Id Get_32_Bit_Field(Node_Id N, Field_Offset Offset);
-static Union_Id Get_32_Bit_Field_With_Default
-(Node_Id N, Field_Offset Offset, Union_Id Default_Value);
+INLINE Union_Id Get_1_Bit_Field (Node_Id N, Field_Offset Offset);
+INLINE Union_Id Get_2_Bit_Field (Node_Id N, Field_Offset Offset);
+INLINE Union_Id Get_4_Bit_Field (Node_Id N, Field_Offset Offset);
+INLINE Union_Id Get_8_Bit_Field (Node_Id N, Field_Offset Offset);
+INLINE Union_Id Get_32_Bit_Field (Node_Id N, Field_Offset Offset);
+INLINE Union_Id Get_32_Bit_Field_With_Default (Node_Id N, Field_Offset Offset,
+	   Union_Id Default_Value);
 
 INLINE Union_Id
-Get_1_Bit_Field(Node_Id N, Field_Offset Offset)
+Get_1_Bit_Field (Node_Id N, Field_Offset Offset)
 {
-const Field_Offset L = 32;
-slot_1_bit slot = (Slots_Ptr + (Node_Offsets_Ptr[N] + Offset/L))->slot_1;
+  const Field_Offset L = 32;
+
+  slot_1_bit slot = (Slots_Ptr + (Node_Offsets_Ptr[N] + Offset / L))->slot_1;
 
-switch (Offset%L)
+  switch (Offset % L)
 {
 case 0: return slot.f0;
 case 1: return slot.f1;
@@ -122,17 +121,18 @@ Get_1_Bit_Field(Node_Id N, Field_Offset Offset)
 case 29: return slot.f29;
 case 30: return slot.f30;
 case 31: return slot.f31;
-default: gcc_assert(false);
+default: gcc_unreachable ();
 }
 }
 
 INLINE Union_Id
-Get_2_Bit_Field(Node_Id N, Field_Offset Offset)
+Get_2_Bit_Field (Node_Id N, Field_Offset Offset)
 {
-const Field_Offset L = 16;
-slot_2_bit slot = (Slots_Ptr + (Node_Offsets_Ptr[N] + Offset/L))->slot_2;
+  const Field_Offset L = 16;
+
+  slot_2_bit slot = (Slots_Ptr + (Node_Offsets_Ptr[N] + Offset / L))->slot_2;
 
-switch (Offset%L)
+  switch (Offset % L)
 {
 case 0: return slot.f0;
 case 1: return slot.f1;
@@ -150,17 +150,18 @@ Get_2_Bit_Field(Node_Id N, Field_Offset Offset)
 case 13: return slot.f13;
 case 14: return slot.f14;
 case 15: return slot.f15;
-default: gcc_assert(false);
+default: gcc_unreachable ();
 }
 }
 
 INLINE Union_Id
-Get_4_Bit_Field(Node_Id N, Field_Offset Offset)
+Get_4_Bit_Field (Node_Id N, Field_Offset Offset)
 {
-const Field_Offset L = 8;
-slot_4_bit slot = (Slots_Ptr + (Node_Offsets_Ptr[N] + Offset/L))->slot_4;
+  const Field_Offset L = 8;
 
-switch (Offset%L)
+  slot_4_bit slot = (Slots_Ptr + (Node_Offsets_Ptr[N] + Offset / L))->slot_4;
+
+  switch (Offset % L)
 {
 case 0: return slot.f0;
 case 1: return slot.f1;
@@ -170,46 +171,46 @@ Get_4_Bit_Field(Node_Id N, Field_Offset Offset)
 case 5: return slot.f5;
 case 6: return slot.f6;
 case 7: return slot.f7;
-default: gcc_assert(false);
+default: gcc_unreachable ();
 }
 }
 
 INLINE Union_Id
-Get_8_Bit_Field(Node_Id N, Field_Offset Offset)
+Get_8_Bit_Field 

[Ada] Fix type mismatch warnings during LTO bootstrap #1

2021-05-07 Thread Pierre-Marie de Rodat
This sets convention C on enumeration types and function declarations
involving System.Address, and makes adjustments to fe.h accordingly.
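
For instance, a C view of Rounding_Mode could look like the sketch
below. The enumerator values follow the representation clause in
eval_fat.ads (0, 1, 2, 3); the sketch_* names and the toy rounding
function are hypothetical and only show a C consumer switching on the
shared enumeration. Its rounding rules are assumptions made for the
example and are not taken from Eval_Fat.

```c
#include <assert.h>

/* Values mirror "for Rounding_Mode use (0, 1, 2, 3)" on the Ada side;
   the C names are hypothetical.  */
enum sketch_rounding_mode
{
  SKETCH_FLOOR = 0,
  SKETCH_CEILING = 1,
  SKETCH_ROUND = 2,
  SKETCH_ROUND_EVEN = 3
};

/* Toy consumer: round QUARTERS / 4 to an integer under MODE.  */
unsigned
sketch_round_quarters (unsigned quarters, enum sketch_rounding_mode mode)
{
  const unsigned base = quarters / 4;
  const unsigned rem = quarters % 4;

  switch (mode)
    {
    case SKETCH_FLOOR:
      return base;

    case SKETCH_CEILING:
      return base + (rem != 0);

    case SKETCH_ROUND:
      /* Ties go up.  */
      return base + (rem >= 2);

    case SKETCH_ROUND_EVEN:
      /* Ties go to the even neighbour.  */
      if (rem == 2)
        return base + (base & 1);
      return base + (rem > 2);
    }

  assert (!"invalid rounding mode");
  return base;
}
```

Declaring the enumeration on both sides with the same convention means
the link-time type comparison sees matching types rather than an Ada
enumeration on one side and a plain int on the other.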

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* errout.ads (Set_Identifier_Casing): Add pragma Convention C.
* eval_fat.ads (Rounding_Mode): Likewise.
(Machine): Add WARNING comment line.
* exp_code.ads (Clobber_Get_Next): Add pragma Convention C.
* fe.h (Compiler_Abort): Fix return type.
(Set_Identifier_Casing): Change type of parameters.
(Clobber_Get_Next): Change return type.
* gcc-interface/trans.c (gnat_to_gnu) : Add cast.

diff --git a/gcc/ada/errout.ads b/gcc/ada/errout.ads
--- a/gcc/ada/errout.ads
+++ b/gcc/ada/errout.ads
@@ -987,6 +987,7 @@ package Errout is
procedure Set_Identifier_Casing
  (Identifier_Name : System.Address;
   File_Name   : System.Address);
+   pragma Convention (C, Set_Identifier_Casing);
--  This subprogram can be used by the back end for the purposes of
--  concocting error messages that are not output via Errout, e.g.
--  the messages generated by the gcc back end.


diff --git a/gcc/ada/eval_fat.ads b/gcc/ada/eval_fat.ads
--- a/gcc/ada/eval_fat.ads
+++ b/gcc/ada/eval_fat.ads
@@ -85,8 +85,8 @@ package Eval_Fat is
 
type Rounding_Mode is (Floor, Ceiling, Round, Round_Even);
for Rounding_Mode use (0, 1, 2, 3);
+   pragma Convention (C, Rounding_Mode);
--  Used to indicate rounding mode for Machine attribute
-   --  Note that C code in gigi knows that Round_Even is 3
 
--  The Machine attribute is special, in that it takes an extra argument
--  indicating the rounding mode, and also an argument Enode that is a
@@ -99,6 +99,8 @@ package Eval_Fat is
   Mode  : Rounding_Mode;
   Enode : Node_Id) return T;
 
+   --  WARNING: There is a matching C declaration of this function in urealp.h
+
procedure Decompose_Int
  (RT   : R;
   X: T;


diff --git a/gcc/ada/exp_code.ads b/gcc/ada/exp_code.ads
--- a/gcc/ada/exp_code.ads
+++ b/gcc/ada/exp_code.ads
@@ -53,6 +53,7 @@ package Exp_Code is
--  with subsequent calls to Clobber_Get_Next.
 
function Clobber_Get_Next return System.Address;
+   pragma Convention (C, Clobber_Get_Next);
--  Can only be called after a previous call to Clobber_Setup. The
--  returned value is a pointer to a null terminated (C format) string
--  for the next register argument. Null_Address is returned when there


diff --git a/gcc/ada/fe.h b/gcc/ada/fe.h
--- a/gcc/ada/fe.h
+++ b/gcc/ada/fe.h
@@ -55,7 +55,7 @@ extern Nat Serious_Errors_Detected;
 
 #define Compiler_Abort		comperr__compiler_abort
 
-extern int Compiler_Abort (String_Pointer, String_Pointer, Boolean) ATTRIBUTE_NORETURN;
+extern void Compiler_Abort (String_Pointer, String_Pointer, Boolean) ATTRIBUTE_NORETURN;
 
 /* debug: */
 
@@ -103,7 +103,7 @@ extern Node_Id Get_Attribute_Definition_Clause (Entity_Id, unsigned char);
 
 extern void Error_Msg_N			(String_Pointer, Node_Id);
 extern void Error_Msg_NE		(String_Pointer, Node_Id, Entity_Id);
-extern void Set_Identifier_Casing	(Char *, const Char *);
+extern void Set_Identifier_Casing	(void *, const void *);
 
 /* err_vars: */
 
@@ -145,7 +145,7 @@ extern Node_Id Asm_Input_Value		(void);
 extern Node_Id Asm_Output_Constraint	(void);
 extern Node_Id Asm_Output_Variable	(void);
 extern Node_Id Asm_Template		(Node_Id);
-extern char *Clobber_Get_Next		(void);
+extern void *Clobber_Get_Next		(void);
 extern void Clobber_Setup		(Node_Id);
 extern Boolean Is_Asm_Volatile		(Node_Id);
 extern void Next_Asm_Input		(void);


diff --git a/gcc/ada/gcc-interface/trans.c b/gcc/ada/gcc-interface/trans.c
--- a/gcc/ada/gcc-interface/trans.c
+++ b/gcc/ada/gcc-interface/trans.c
@@ -7993,7 +7993,7 @@ gnat_to_gnu (Node_Id gnat_node)
 	}
 
 	  Clobber_Setup (gnat_node);
-	  while ((clobber = Clobber_Get_Next ()))
+	  while ((clobber = (char *) Clobber_Get_Next ()))
 	gnu_clobbers
 	  = tree_cons (NULL_TREE,
 			   build_string (strlen (clobber) + 1, clobber),




[Ada] Implement aspect No_Controlled_Parts

2021-05-07 Thread Pierre-Marie de Rodat
This patch implements the No_Controlled_Parts aspect defined in
AI12-0256. When the aspect is specified for a type or one of its
ancestors, the compiler verifies that the type contains no controlled
components.
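
As a rough illustration of the check described above, here is a sketch
in Python rather than Ada. Everything in it is an assumption made for
the example: the helper names only echo Collect_Types_In_Hierarchy from
the ChangeLog below, the hierarchy is a plain dictionary, and the real
freeze.adb code handles generic formal types and aspect values that are
ignored here.

```python
def collect_types_in_hierarchy(typ, parent_of, predicate):
    """Return typ and those of its ancestors that satisfy predicate.

    parent_of maps a type name to its parent type name (or None at the
    root). The result is ordered from typ outwards.
    """
    found = []
    current = typ
    while current is not None:
        if predicate(current):
            found.append(current)
        current = parent_of.get(current)
    return found


def violates_no_controlled_parts(typ, parent_of, has_aspect, has_controlled_part):
    """True if typ is governed by the aspect (on itself or an ancestor)
    and nevertheless has a controlled component."""
    governed_by = collect_types_in_hierarchy(
        typ, parent_of, lambda t: t in has_aspect
    )
    return bool(governed_by) and typ in has_controlled_part
```

With a hypothetical hierarchy where Root specifies the aspect and Child
derives from Root, a controlled component in Child would be reported,
while the same component in an unrelated type would not.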

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* aspects.ads: Add entries to register
Aspect_No_Controlled_Parts.
* freeze.adb (Check_No_Controlled_Parts_Violations): Added to
check requirements of aspect No_Controlled_Parts after a type
has been frozen.
(Freeze_Entity): Add call to
Check_No_Controlled_Parts_Violations.
(Find_Aspect_No_Controlled_Parts): Created to obtain the aspect
specification for No_Controlled_Parts on a given type when
present.
(Find_Aspect_No_Controlled_Parts_Value): Protect against invalid
value.
(Has_Aspect_No_Controlled_Parts): Created as a predicate function
to check if No_Controlled_Parts has been specified on a type for
Get_Anacestor_Types_With_Specification.
(Get_Aspect_No_Controlled_Parts_Value): Created to obtain the
value of the aspect No_Controlled_Parts when specified on a
given type.
(Get_Generic_Formal_Types_In_Hierarchy): Created to collect
formal types in a given type's hierarchy.
(Get_Types_With_Aspect_In_Hierarchy): Created to collect types
in a given type's hierarchy with No_Controlled_Parts specified.
* sem_ch13.adb (Analyze_One_Aspect): Add processing for
No_Controlled_Parts, and fix error in check for allowed pragmas
for formal types.
(Check_Expr_Is_OK_Static_Expression): Created to enforce
checking of static expressions in the same vein as
Analyze_Pragma.Check_Expr_OK_Static_Expression.
* sem_util.adb (Collect_Types_In_Hierarchy): Created to collect
types in a given type's hierarchy that match a given predicate
function.
* sem_util.ads: Fix typo.
* snames.ads-tmpl: Add entry for No_Controlled_Parts.

diff --git a/gcc/ada/aspects.ads b/gcc/ada/aspects.ads
--- a/gcc/ada/aspects.ads
+++ b/gcc/ada/aspects.ads
@@ -116,6 +116,7 @@ package Aspects is
   Aspect_Max_Entry_Queue_Length,
   Aspect_Max_Queue_Length,  -- GNAT
   Aspect_No_Caching,-- GNAT
+  Aspect_No_Controlled_Parts,
   Aspect_Object_Size,   -- GNAT
   Aspect_Obsolescent,   -- GNAT
   Aspect_Output,
@@ -403,6 +404,7 @@ package Aspects is
   Aspect_Max_Entry_Queue_Length => Expression,
   Aspect_Max_Queue_Length   => Expression,
   Aspect_No_Caching => Optional_Expression,
+  Aspect_No_Controlled_Parts=> Optional_Expression,
   Aspect_Object_Size=> Expression,
   Aspect_Obsolescent=> Optional_Expression,
   Aspect_Output => Name,
@@ -505,6 +507,7 @@ package Aspects is
   Aspect_Max_Entry_Queue_Length   => False,
   Aspect_Max_Queue_Length => False,
   Aspect_No_Caching   => False,
+  Aspect_No_Controlled_Parts  => False,
   Aspect_Object_Size  => True,
   Aspect_Obsolescent  => False,
   Aspect_Output   => False,
@@ -666,6 +669,7 @@ package Aspects is
   Aspect_Max_Entry_Queue_Length   => Name_Max_Entry_Queue_Length,
   Aspect_Max_Queue_Length => Name_Max_Queue_Length,
   Aspect_No_Caching   => Name_No_Caching,
+  Aspect_No_Controlled_Parts  => Name_No_Controlled_Parts,
   Aspect_No_Elaboration_Code_All  => Name_No_Elaboration_Code_All,
   Aspect_No_Inline=> Name_No_Inline,
   Aspect_No_Return=> Name_No_Return,
@@ -960,6 +964,7 @@ package Aspects is
   Aspect_Max_Entry_Queue_Length   => Never_Delay,
   Aspect_Max_Queue_Length => Never_Delay,
   Aspect_No_Caching   => Never_Delay,
+  Aspect_No_Controlled_Parts  => Never_Delay,
   Aspect_No_Elaboration_Code_All  => Never_Delay,
   Aspect_No_Tagged_Streams=> Never_Delay,
   Aspect_Obsolescent  => Never_Delay,


diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -2192,6 +2192,11 @@ package body Freeze is
   --  which is the current instance type can only be applied when the type
   --  is limited.
 
+  procedure Check_No_Controlled_Parts_Violations (Typ : Entity_Id);
+  --  Check that Typ does not violate the semantics of aspect
+  --  No_Controlled_Parts when it is specified on Typ or one of its
+  --  ancestors.
+
   procedure Check_Suspicious_Convention (Rec_Type : Entity_Id);
   --  Give a warning for pragma Convention with language C or C++ applied
   --  to a 

[Ada] Spurious error with component of unchecked_union type

2021-05-07 Thread Pierre-Marie de Rodat
The compiler rejects an equality operation on a record type when the nominal
subtype of a component is a constrained subtype of an Unchecked_Union
type, and that subtype is declared outside of the enclosing record
declaration.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Unconstrained_UU_In_Component_Declaration): A
component declaration whose subtype indication is an entity name
without an explicit constraint is an Unchecked_Union type only
if the entity has an unconstrained nominal subtype (record type
or private type) whose parent type is an Unchecked_Union.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -8143,11 +8143,16 @@ package body Exp_Ch4 is
 Sindic : constant Node_Id :=
Subtype_Indication (Component_Definition (N));
  begin
---  Unconstrained nominal type. In the case of a constraint
---  present, the node kind would have been N_Subtype_Indication.
+--  If the component declaration includes a subtype indication
+--  it is not an unchecked_union. Otherwise verify that it carries
+--  the Unchecked_Union flag and is either a record or a private
+--  type. A Record_Subtype declared elsewhere does not qualify,
+--  even if its parent type carries the flag.
 
 return Nkind (Sindic) in N_Expanded_Name | N_Identifier
-  and then Is_Unchecked_Union (Base_Type (Etype (Sindic)));
+  and then Is_Unchecked_Union (Base_Type (Etype (Sindic)))
+  and then (Ekind (Entity (Sindic)) in
+ E_Private_Type | E_Record_Type);
  end Unconstrained_UU_In_Component_Declaration;
 
  -




[Ada] Generate warning for negative literal of a modular type

2021-05-07 Thread Pierre-Marie de Rodat
A negative literal of a modular type is interpreted with wrap-around as
a large positive number. Warn if this value is not enclosed in a type
qualification or type conversion explicitly.
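
The wrap-around can be reproduced outside the compiler. The following
is a small Python sketch, not GNAT code: wrapped_value and should_warn
are hypothetical helper names that only mirror the arithmetic
"(-V) mod Modulus" and the "greater than 1" threshold visible in the
sem_res.adb hunk of this patch.

```python
def wrapped_value(literal: int, modulus: int) -> int:
    """Value that -literal denotes in a modular type with this modulus."""
    # Python's % yields a non-negative result for a positive modulus,
    # which matches the wrap-around described in the patch.
    return (-literal) % modulus


def should_warn(literal: int, in_qualification_or_conversion: bool) -> bool:
    """Mirror the guard in the patch: -1 is exempt, and so is a literal
    enclosed directly in a type qualification or a type conversion."""
    return literal > 1 and not in_qualification_or_conversion


# For a type declared as "mod 2**8", writing -2 really means 254.
print(wrapped_value(2, 2**8))
```

Under these assumptions, -1 on a 32-bit modular type stays silent (it
is the common idiom for "all bits set"), while -2 triggers the warning
unless it is written inside a qualification or conversion.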

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* opt.ads: Update comment for Warn_On_Suspicious_Modulus_Value.
* sem_res.adb (Resolve_Unary_Op): Generate warning.
* usage.adb: Refine doc for -gnatw.m/M switch.
* doc/gnat_ugn/building_executable_programs_with_gnat.rst:
Update doc on -gnatw.m switch.
* gnat_ugn.texi: Regenerate.

diff --git a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
--- a/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
+++ b/gcc/ada/doc/gnat_ugn/building_executable_programs_with_gnat.rst
@@ -3424,7 +3424,10 @@ of the pragma in the :title:`GNAT_Reference_manual`).
   with no size clause. The guess in both cases is that 2**x was intended
   rather than x. In addition expressions of the form 2*x for small x
   generate a warning (the almost certainly accurate guess being that
-  2**x was intended). The default is that these warnings are given.
+  2**x was intended). This switch also activates warnings for negative
+  literal values of a modular type, which are interpreted as large positive
+  integers after wrap-around. The default is that these warnings are given.
+
 
 
 .. index:: -gnatw.M  (gcc)


diff --git a/gcc/ada/gnat_ugn.texi b/gcc/ada/gnat_ugn.texi
--- a/gcc/ada/gnat_ugn.texi
+++ b/gcc/ada/gnat_ugn.texi
@@ -11616,7 +11616,9 @@ a modulus of 7 with a size of 7 bits), and modulus values of 32 or 64
 with no size clause. The guess in both cases is that 2**x was intended
 rather than x. In addition expressions of the form 2*x for small x
 generate a warning (the almost certainly accurate guess being that
-2**x was intended). The default is that these warnings are given.
+2**x was intended). This switch also activates warnings for negative
+literal values of a modular type, which are interpreted as large positive
+integers after wrap-around. The default is that these warnings are given.
 @end table
 
 @geindex -gnatw.M (gcc)


diff --git a/gcc/ada/opt.ads b/gcc/ada/opt.ads
--- a/gcc/ada/opt.ads
+++ b/gcc/ada/opt.ads
@@ -1885,8 +1885,9 @@ package Opt is
 
Warn_On_Suspicious_Modulus_Value : Boolean := True;
--  GNAT
-   --  Set to True to generate warnings for suspicious modulus values. The
-   --  default is that this warning is enabled. Modified by -gnatw.m/.M.
+   --  Set to True to generate warnings for suspicious modulus values, as well
+   --  as negative literals of a modular type. The default is that this warning
+   --  is enabled. Modified by -gnatw.m/.M.
 
Warn_On_Unchecked_Conversion : Boolean := True;
--  GNAT


diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -12096,6 +12096,28 @@ package body Sem_Res is
   Set_Etype (N, B_Typ);
   Resolve (R, B_Typ);
 
+  --  Generate warning for negative literal of a modular type, unless it is
+  --  enclosed directly in a type qualification or a type conversion, as it
+  --  is likely not what the user intended. We don't issue the warning for
+  --  the common use of -1 to denote Ox_...
+
+  if Warn_On_Suspicious_Modulus_Value
+and then Nkind (N) = N_Op_Minus
+and then Nkind (R) = N_Integer_Literal
+and then Is_Modular_Integer_Type (B_Typ)
+and then Nkind (Parent (N)) not in N_Qualified_Expression
+ | N_Type_Conversion
+and then Expr_Value (R) > Uint_1
+  then
+ Error_Msg_N
+   ("?M?negative literal of modular type is in fact positive", N);
+ Error_Msg_Uint_1 := (-Expr_Value (R)) mod Modulus (B_Typ);
+ Error_Msg_Uint_2 := Expr_Value (R);
+ Error_Msg_N ("\do you really mean^ when writing -^ '?", N);
+ Error_Msg_N
+   ("\if you do, use qualification to avoid this warning", N);
+  end if;
+
   --  Generate warning for expressions like abs (x mod 2)
 
   if Warn_On_Redundant_Constructs


diff --git a/gcc/ada/usage.adb b/gcc/ada/usage.adb
--- a/gcc/ada/usage.adb
+++ b/gcc/ada/usage.adb
@@ -532,8 +532,10 @@ begin
   "but not read");
Write_Line ("M*   turn off warnings for variable assigned " &
   "but not read");
-   Write_Line (".m*+ turn on warnings for suspicious modulus value");
-   Write_Line (".M   turn off warnings for suspicious modulus value");
+   Write_Line (".m*+ turn on warnings for suspicious usage " &
+  "of modular type");
+   Write_Line (".M   turn off warnings for suspicious usage " &
+

[Ada] Remove End_Interp_List from the overloaded resolution API

2021-05-07 Thread Pierre-Marie de Rodat
Routine End_Interp_List was part of the API for overloaded resolution
from the very beginning. However, it quickly became unnecessary, because
both adding and removing interpretations maintain a No_Interp marker at
the end of the interpretation list.

The only effect of this routine was that it prevented duplicated
interpretation entries from appearing on the interpretation list, but it
was only because the explicit guard for preventing such duplicates was
not always working.

In particular, this guard didn't work when an overloaded expression of a
generic unit parameter was first preanalyzed (when checking the legality
of the instantiation) and then analyzed (when actually instantiating the
generic unit).

This patch fixes protection against duplicate entries on the lists with
overloaded interpretations and removes the End_Interp_List routine.
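
To make the reordering concrete, here is a deliberately simplified
Python sketch. It is not the Sem_Type implementation: add_entry and the
(name, base type) tuples are stand-ins chosen for the example, and the
hiding rules that Add_Entry also applies are left out. The only point
is that the duplicate check runs first for every existing entry, so
analyzing the same expression twice cannot add the same interpretation
twice.

```python
def add_entry(interps, name, base_type):
    """Append (name, base_type) unless an identical entry already exists."""
    for existing_name, existing_type in interps:
        # Duplicate guard comes first, before any other condition that
        # might otherwise leave the loop early.
        if existing_name == name and existing_type == base_type:
            return
    interps.append((name, base_type))


# Simulate preanalysis followed by analysis of the same overloaded name.
interps = []
for _ in range(2):
    add_entry(interps, "F", "Integer")
    add_entry(interps, "F", "Float")
print(len(interps))
```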

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch4.adb (Analyze_Call): Remove call to End_Interp_List.
(Process_Overloaded_Indexed_Component): Remove call to
End_Interp_List.
* sem_util.adb (Insert_Explicit_Dereference): Remove call to
End_Interp_List.
* sem_type.ads (End_Interp_List): Remove.
* sem_type.adb (Add_Entry): The guard against duplicate entries
is now checked before other conditions, so that EXIT statements
do not bypass this guard.
(End_Interp_List): Remove.

diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -1461,8 +1461,6 @@ package body Sem_Ch4 is
  else
 Remove_Abstract_Operations (N);
  end if;
-
- End_Interp_List;
   end if;
 
   --  Check the accessibility level for actuals for explicitly aliased
@@ -2790,8 +2788,6 @@ package body Sem_Ch4 is
 Error_Msg_N ("no legal interpretation for indexed component", N);
 Set_Is_Overloaded (N, False);
  end if;
-
- End_Interp_List;
   end Process_Overloaded_Indexed_Component;
 
--  Start of processing for Analyze_Indexed_Component_Form


diff --git a/gcc/ada/sem_type.adb b/gcc/ada/sem_type.adb
--- a/gcc/ada/sem_type.adb
+++ b/gcc/ada/sem_type.adb
@@ -239,6 +239,13 @@ package body Sem_Type is
  Get_First_Interp (N, I, It);
  while Present (It.Nam) loop
 
+--  Avoid making duplicate entries in overloads
+
+if Name = It.Nam
+  and then Base_Type (It.Typ) = Base_Type (T)
+then
+   return;
+
 --  A user-defined subprogram hides another declared at an outer
 --  level, or one that is use-visible. So return if previous
 --  definition hides new one (which is either in an outer
@@ -248,7 +255,7 @@ package body Sem_Type is
 --  If this is a universal operation, retain the operator in case
 --  preference rule applies.
 
-if (((Ekind (Name) = E_Function or else Ekind (Name) = E_Procedure)
+elsif ((Ekind (Name) in E_Function | E_Procedure
and then Ekind (Name) = Ekind (It.Nam))
  or else (Ekind (Name) = E_Operator
and then Ekind (It.Nam) = E_Function))
@@ -292,13 +299,6 @@ package body Sem_Type is
   return;
end if;
 
---  Avoid making duplicate entries in overloads
-
-elsif Name = It.Nam
-  and then Base_Type (It.Typ) = Base_Type (T)
-then
-   return;
-
 --  Otherwise keep going
 
 else
@@ -2227,16 +2227,6 @@ package body Sem_Type is
   end if;
end Disambiguate;
 
-   -
-   -- End_Interp_List --
-   -
-
-   procedure End_Interp_List is
-   begin
-  All_Interp.Table (All_Interp.Last) := No_Interp;
-  All_Interp.Increment_Last;
-   end End_Interp_List;
-
-
-- Entity_Matches_Spec --
-


diff --git a/gcc/ada/sem_type.ads b/gcc/ada/sem_type.ads
--- a/gcc/ada/sem_type.ads
+++ b/gcc/ada/sem_type.ads
@@ -130,9 +130,6 @@ package Sem_Type is
--  always Boolean, and we use Opnd_Type, which is a candidate type for one
--  of the operands of N, to check visibility.
 
-   procedure End_Interp_List;
-   --  End the list of interpretations of current node
-
procedure Get_First_Interp
  (N  : Node_Id;
   I  : out Interp_Index;


diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -15044,8 +15044,6 @@ package body Sem_Util is
 Get_Next_Interp (I, It);
  end loop;
 
- End_Interp_List;
-
   else
  --  Prefix is unambiguous: mark the original prefix (which might
  --  Come_From_Source) as a reference, since the new (relocated) one




[Ada] Attribute Address is not an interfering context in SPARK

2021-05-07 Thread Pierre-Marie de Rodat
Allow taking the address of a volatile object in SPARK, as it doesn't
cause problems related to interfering contexts.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_res.adb (Flag_Object): Ignore prefixes of attribute
Address.

diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -3749,6 +3749,19 @@ package body Sem_Res is
 
  begin
 case Nkind (N) is
+
+   --  Do not consider object name appearing in the prefix of
+   --  attribute Address as a read.
+
+   when N_Attribute_Reference =>
+
+  --  Prefix of attribute Address denotes an object, program
+  --  unit, or label; none of them needs to be flagged here.
+
+  if Attribute_Name (N) = Name_Address then
+ return Skip;
+  end if;
+
--  Do not consider nested function calls because they have
--  already been processed during their own resolution.
 




[Ada] Crash on imported object with deep initialization and No_Aborts

2021-05-07 Thread Pierre-Marie de Rodat
The compiler aborts on an object declaration without an expression when
all of the following hold: the type of the object includes controlled
components and thus requires deep initialization, restrictions in
effect forbid Abort statements, and a later Import pragma applies to
the object being declared.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_util.adb (Remove_Init_Call): If a simple initialization
call is present, and the next statement is an initialization
block (that contains a call to a Deep_Initialize routine),
remove the block as well, and insert the first initialization
call in it, in case it is needed for later relocation.

diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -11382,6 +11382,26 @@ package body Exp_Util is
   end if;
 
   if Present (Init_Call) then
+ --  If restrictions have forbidden Aborts, the initialization call
+ --  for objects that require deep initialization has not been wrapped
+ --  into the following block (see Exp_Ch3, Default_Initialize_Object)
+ --  so if present remove it as well, and include the IP call in it,
+ --  in the rare case the caller may need to simply displace the
+ --  initialization, as is done for a later address specification.
+
+ if Nkind (Next (Init_Call)) = N_Block_Statement
+   and then Is_Initialization_Block (Next (Init_Call))
+ then
+declare
+   IP_Call : constant Node_Id := Init_Call;
+begin
+   Init_Call := Next (IP_Call);
+   Remove (IP_Call);
+   Prepend (IP_Call,
+ Statements (Handled_Statement_Sequence (Init_Call)));
+end;
+ end if;
+
  Remove (Init_Call);
   end if;
 




[Ada] Computation of Shift_Left and large signed values

2021-05-07 Thread Pierre-Marie de Rodat
The computation of Shift_Left on signed values might wrongly overflow
instead of generating a negative value.
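
The corrected folding rule can be checked by hand with a short Python
sketch. This is an illustration under stated assumptions, not the
compiler code: fold_shift_left and ada_rem are hypothetical names, the
caller passes the modulus directly (2**RM_Size for a signed type), and
the logic simply follows the new Fold_Shift hunk in this patch.

```python
def ada_rem(a: int, b: int) -> int:
    """Ada-style rem: the result takes the sign of the dividend."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def fold_shift_left(left: int, right: int, modulus: int,
                    modular: bool = False) -> int:
    """Fold Shift_Left (left, right) as (left * 2**right) rem modulus,
    then map the upper half of the range to negative values when the
    type is signed."""
    val = ada_rem(left * 2 ** right, modulus)
    if modular or val < modulus // 2:
        return val
    return val - modulus


# Signed 8-bit: shifting 1 left by 7 sets the sign bit.
print(fold_shift_left(1, 7, 2**8))
```

For a signed 8-bit type, shifting 1 left by 7 therefore folds to -128
rather than to the out-of-range value 128, while the same shift on a
modular type with modulus 256 still folds to 128.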

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_eval.adb (Fold_Shift): Fix computation of Shift_Left
resulting in negative signed values.

diff --git a/gcc/ada/sem_eval.adb b/gcc/ada/sem_eval.adb
--- a/gcc/ada/sem_eval.adb
+++ b/gcc/ada/sem_eval.adb
@@ -4983,7 +4983,7 @@ package body Sem_Eval is
  end if;
   end Check_Elab_Call;
 
-  Modulus : Uint;
+  Modulus, Val : Uint;
 
begin
   if Compile_Time_Known_Value (Left)
@@ -4994,23 +4994,25 @@ package body Sem_Eval is
  if Op = N_Op_Shift_Left then
 Check_Elab_Call;
 
-declare
-   Modulus : Uint;
-begin
-   if Is_Modular_Integer_Type (Typ) then
-  Modulus := Einfo.Modulus (Typ);
-   else
-  Modulus := Uint_2 ** RM_Size (Typ);
-   end if;
+if Is_Modular_Integer_Type (Typ) then
+   Modulus := Einfo.Modulus (Typ);
+else
+   Modulus := Uint_2 ** RM_Size (Typ);
+end if;
 
-   --  Fold Shift_Left (X, Y) by computing (X * 2**Y) rem modulus
+--  Fold Shift_Left (X, Y) by computing
+--  (X * 2**Y) rem modulus [- Modulus]
 
-   Fold_Uint
- (N,
-  (Expr_Value (Left) * (Uint_2 ** Expr_Value (Right)))
-rem Modulus,
-  Static => Static);
-end;
+Val := (Expr_Value (Left) * (Uint_2 ** Expr_Value (Right)))
+ rem Modulus;
+
+if Is_Modular_Integer_Type (Typ)
+  or else Val < Modulus / Uint_2
+then
+   Fold_Uint (N, Val, Static => Static);
+else
+   Fold_Uint (N, Val - Modulus, Static => Static);
+end if;
 
  elsif Op = N_Op_Shift_Right then
 Check_Elab_Call;
@@ -5042,7 +5044,7 @@ package body Sem_Eval is
 Check_Elab_Call;
 
 declare
-   Two_Y   : constant Uint := Uint_2 ** Expr_Value (Right);
+   Two_Y : constant Uint := Uint_2 ** Expr_Value (Right);
 begin
if Is_Modular_Integer_Type (Typ) then
   Modulus := Einfo.Modulus (Typ);




[Ada] Fix signature mismatch for Defining_Entity

2021-05-07 Thread Pierre-Marie de Rodat
This fixes the signature mismatch recently introduced for Defining_Entity
between the front-end proper and gigi.
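
The shape of the fix is a common one: a strict function with a single
parameter, layered on a lenient variant that reports "nothing found"
instead of failing. A minimal Python sketch of that pattern follows.
The dictionary-based nodes, the EMPTY constant, and the use of
RuntimeError in place of Program_Error are assumptions made purely for
the example.

```python
EMPTY = None  # stand-in for the Empty entity


def defining_entity_or_empty(node: dict):
    """Lenient variant: return the defining entity, or EMPTY if none."""
    return node.get("defining_entity", EMPTY)


def defining_entity(node: dict):
    """Strict variant with a single parameter: fail if there is no entity."""
    ent = defining_entity_or_empty(node)
    if ent is EMPTY:
        raise RuntimeError("Program_Error: node has no defining entity")
    return ent
```

Callers that can tolerate a missing entity, such as the loop in
Innermost_Master_Scope_Depth, use the lenient variant and test the
result, and the strict function keeps the one-argument profile.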

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.ads (Defining_Entity): Remove Empty_On_Errors parameter.
(Defining_Entity_Or_Empty): New function.
* sem_util.adb (Defining_Entity): Move bulk of implementation to...
(Defining_Entity_Or_Empty): ...here.  Do not raise Program_Error.
(Innermost_Master_Scope_Depth): Call Defining_Entity_Or_Empty.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -269,14 +269,12 @@ package body Sem_Util is
   --  Construct an integer literal representing an accessibility level
   --  with its type set to Natural.
 
-  function Innermost_Master_Scope_Depth
-(N : Node_Id) return Uint;
+  function Innermost_Master_Scope_Depth (N : Node_Id) return Uint;
   --  Returns the scope depth of the given node's innermost
   --  enclosing dynamic scope (effectively the accessibility
   --  level of the innermost enclosing master).
 
-  function Function_Call_Or_Allocator_Level
-(N : Node_Id) return Node_Id;
+  function Function_Call_Or_Allocator_Level (N : Node_Id) return Node_Id;
   --  Centralized processing of subprogram calls which may appear in
   --  prefix notation.
 
@@ -284,10 +282,9 @@ package body Sem_Util is
   -- Innermost_Master_Scope_Depth --
   --
 
-  function Innermost_Master_Scope_Depth
-(N : Node_Id) return Uint
-  is
+  function Innermost_Master_Scope_Depth (N : Node_Id) return Uint is
  Encl_Scop   : Entity_Id;
+ Ent : Entity_Id;
  Node_Par: Node_Id := Parent (N);
  Master_Lvl_Modifier : Int := 0;
 
@@ -301,12 +298,10 @@ package body Sem_Util is
  --  among other things. These cases are detected properly ???
 
  while Present (Node_Par) loop
+Ent := Defining_Entity_Or_Empty (Node_Par);
 
-if Present (Defining_Entity
- (Node_Par, Empty_On_Errors => True))
-then
-   Encl_Scop := Nearest_Dynamic_Scope
-  (Defining_Entity (Node_Par));
+if Present (Ent) then
+   Encl_Scop := Nearest_Dynamic_Scope (Ent);
 
--  Ignore transient scopes made during expansion
 
@@ -7076,10 +7071,23 @@ package body Sem_Util is
-- Defining_Entity --
-
 
-   function Defining_Entity
- (N   : Node_Id;
-  Empty_On_Errors : Boolean := False) return Entity_Id
-   is
+   function Defining_Entity (N : Node_Id) return Entity_Id is
+  Ent : constant Entity_Id := Defining_Entity_Or_Empty (N);
+
+   begin
+  if Present (Ent) then
+ return Ent;
+
+  else
+ raise Program_Error;
+  end if;
+   end Defining_Entity;
+
+   --
+   -- Defining_Entity_Or_Empty --
+   --
+
+   function Defining_Entity_Or_Empty (N : Node_Id) return Entity_Id is
begin
   case Nkind (N) is
  when N_Abstract_Subprogram_Declaration
@@ -7178,13 +7186,9 @@ package body Sem_Util is
 return Entity (Identifier (N));
 
  when others =>
-if Empty_On_Errors then
-   return Empty;
-end if;
-
-raise Program_Error;
+return Empty;
   end case;
-   end Defining_Entity;
+   end Defining_Entity_Or_Empty;
 
--
-- Denotes_Discriminant --


diff --git a/gcc/ada/sem_util.ads b/gcc/ada/sem_util.ads
--- a/gcc/ada/sem_util.ads
+++ b/gcc/ada/sem_util.ads
@@ -662,9 +662,7 @@ package Sem_Util is
--  in the case of a descendant of a generic formal type (returns Int'Last
--  instead of 0).
 
-   function Defining_Entity
- (N   : Node_Id;
-  Empty_On_Errors : Boolean := False) return Entity_Id;
+   function Defining_Entity (N : Node_Id) return Entity_Id;
--  Given a declaration N, returns the associated defining entity. If the
--  declaration has a specification, the entity is obtained from the
--  specification. If the declaration has a defining unit name, then the
@@ -675,19 +673,13 @@ package Sem_Util is
--  local entities declared during loop expansion. These entities need
--  debugging information, generated through Qualify_Entity_Names, and
--  the loop declaration must be placed in the table Name_Qualify_Units.
-   --
-   --  Set flag Empty_On_Errors to change the behavior of this routine as
-   --  follows:
-   --
-   --* True  - A declaration that lacks a defining entity returns Empty.
-   --  A node that does not allow for a defining entity returns Empty.
-   --
-   --* False - A declaration that lacks a defining entity is given a new
-   --  

[Ada] Minor efficiency improvement in containers

2021-05-07 Thread Pierre-Marie de Rodat
Move an assertion to be conditional on T_Check, so pragma Suppress
(Tampering_Checks) will suppress it.  Note that the Lock component being
checked has the Atomic aspect. This is not a bug fix.
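
The control-flow change is small enough to sketch in a few lines of
Python. This is only a model of the new TC_Check body: TamperCounts and
tc_check are hypothetical names, t_check stands for the T_Check
constant, and RuntimeError stands in for Program_Error.

```python
from dataclasses import dataclass


@dataclass
class TamperCounts:
    busy: int = 0
    lock: int = 0


def tc_check(counts: TamperCounts, t_check: bool = True) -> None:
    """Model of the patched TC_Check: when tampering checks are off,
    neither counter is read at all."""
    if t_check:
        if counts.busy > 0:
            raise RuntimeError("attempt to tamper with cursors")
        # Representation invariant: busy = 0 implies lock = 0.
        assert counts.lock == 0
```

With t_check set to False the function returns immediately, which is
the efficiency gain the patch is after: the assertion no longer forces
a read of the Lock component.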

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-conhel.adb (TC_Check): Move the Assert into the
'if'.

diff --git a/gcc/ada/libgnat/a-conhel.adb b/gcc/ada/libgnat/a-conhel.adb
--- a/gcc/ada/libgnat/a-conhel.adb
+++ b/gcc/ada/libgnat/a-conhel.adb
@@ -122,17 +122,20 @@ package body Ada.Containers.Helpers is
 
   procedure TC_Check (T_Counts : Tamper_Counts) is
   begin
- if T_Check and then T_Counts.Busy > 0 then
-raise Program_Error with
-  "attempt to tamper with cursors";
+ if T_Check then
+if T_Counts.Busy > 0 then
+   raise Program_Error with
+ "attempt to tamper with cursors";
+end if;
+
+--  The lock status (which monitors "element tampering") always
+--  implies that the busy status (which monitors "cursor
+--  tampering") is set too; this is a representation invariant.
+--  Thus if the busy count is zero, then the lock count
+--  must also be zero.
+
+pragma Assert (T_Counts.Lock = 0);
  end if;
-
- --  The lock status (which monitors "element tampering") always
- --  implies that the busy status (which monitors "cursor tampering")
- --  is set too; this is a representation invariant. Thus if the busy
- --  bit is not set, then the lock bit must not be set either.
-
- pragma Assert (T_Counts.Lock = 0);
   end TC_Check;
 
   --




[Ada] sigtramp: fix powerpc64 against -fPIC

2021-05-07 Thread Pierre-Marie de Rodat
Use a local label to set the TOC location on powerpc64 to prevent
DT_TEXTREL, which is not supported by the VxWorks loader for shared
libraries.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sigtramp-vxworks-target.inc: Use a local label for the TOC.

diff --git a/gcc/ada/sigtramp-vxworks-target.inc b/gcc/ada/sigtramp-vxworks-target.inc
--- a/gcc/ada/sigtramp-vxworks-target.inc
+++ b/gcc/ada/sigtramp-vxworks-target.inc
@@ -319,9 +319,9 @@ TCR("blr")
 #else
 #define SIGTRAMP_BODY \
 CR("") \
-TCR("0:") \
-TCR("addis 2,12,.TOC.-0@ha") \
-TCR("addi 2,2,.TOC.-0@l") \
+TCR(".LOC_SIGTMP_COM_0:") \
+TCR("addis 2,12,.TOC.-.LOC_SIGTMP_COM_0@ha") \
+TCR("addi 2,2,.TOC.-.LOC_SIGTMP_COM_0@l") \
 TCR(".localentry	__gnat_sigtramp_common,.-__gnat_sigtramp_common") \
 TCR("# Allocate frame and save the non-volatile") \
 TCR("# registers we're going to modify") \




[Ada] Raise Constraint_Error for Compose and Scaling if Machine_Overflows

2021-05-07 Thread Pierre-Marie de Rodat
This is an optional behavior specified by the RM and it makes sense to
do it when T'Machine_Overflows is True for the sake of consistency.
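
The two behaviors in the overflow case can be modelled as follows. This
Python sketch covers only the branch added by the patch: the function
name overflow_result is hypothetical, OverflowError stands in for
Constraint_Error, and the exponent test that decides whether overflow
occurs is not reproduced.

```python
import math


def overflow_result(minus: bool, machine_overflows: bool) -> float:
    """Result of Scaling once overflow has been detected."""
    if machine_overflows:
        # Optional behavior: signal the overflow instead of saturating.
        raise OverflowError("Too large exponent")
    # Otherwise produce an infinity of the appropriate sign.
    return -math.inf if minus else math.inf
```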

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-fatgen.adb (Scaling): Raise Constraint_Error in the
overflow case when T'Machine_Overflows is True.

diff --git a/gcc/ada/libgnat/s-fatgen.adb b/gcc/ada/libgnat/s-fatgen.adb
--- a/gcc/ada/libgnat/s-fatgen.adb
+++ b/gcc/ada/libgnat/s-fatgen.adb
@@ -771,12 +771,19 @@ package body System.Fat_Gen is
  --  Check for overflow
 
  if Adjustment > IEEE_Emax - Exp then
-XX := 0.0;
-return (if Minus then -1.0 / XX else 1.0 / XX);
-pragma Annotate
-  (CodePeer, Intentional, "overflow check", "Infinity produced");
-pragma Annotate
-  (CodePeer, Intentional, "divide by zero", "Infinity produced");
+--  Optionally raise Constraint_Error as per RM A.5.3(29)
+
+if T'Machine_Overflows then
+   raise Constraint_Error with "Too large exponent";
+
+else
+   XX := 0.0;
+   return (if Minus then -1.0 / XX else 1.0 / XX);
+   pragma Annotate (CodePeer, Intentional, "overflow check",
+"Infinity produced");
+   pragma Annotate (CodePeer, Intentional, "divide by zero",
+"Infinity produced");
+end if;
 
  --  Check for underflow
 




[Ada] Move Has_Inferable_Discriminants to Sem_Util

2021-05-07 Thread Pierre-Marie de Rodat
Move the Has_Inferable_Discriminants utility to Sem_Util so that it can
be reused inside GNATprove.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Has_Inferable_Discriminants): Moved to Sem_Util.
* sem_util.ads, sem_util.adb (Has_Inferable_Discriminants):
Moved from Exp_Ch4.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -176,17 +176,6 @@ package body Exp_Ch4 is
--  Return the size of a small signed integer type covering Lo .. Hi, the
--  main goal being to return a size lower than that of standard types.
 
-   function Has_Inferable_Discriminants (N : Node_Id) return Boolean;
-   --  Ada 2005 (AI-216): A view of an Unchecked_Union object has inferable
-   --  discriminants if it has a constrained nominal type, unless the object
-   --  is a component of an enclosing Unchecked_Union object that is subject
-   --  to a per-object constraint and the enclosing object lacks inferable
-   --  discriminants.
-   --
-   --  An expression of an Unchecked_Union type has inferable discriminants
-   --  if it is either a name of an object with inferable discriminants or a
-   --  qualified expression whose subtype mark denotes a constrained subtype.
-
procedure Insert_Dereference_Action (N : Node_Id);
--  N is an expression whose type is an access. When the type of the
--  associated storage pool is derived from Checked_Pool, generate a
@@ -13358,84 +13347,6 @@ package body Exp_Ch4 is
   end if;
end Get_Size_For_Range;
 
-   ---------------------------------
-   -- Has_Inferable_Discriminants --
-   ---------------------------------
-
-   function Has_Inferable_Discriminants (N : Node_Id) return Boolean is
-
-      function Prefix_Is_Formal_Parameter (N : Node_Id) return Boolean;
-      --  Determines whether the left-most prefix of a selected component is a
-      --  formal parameter in a subprogram. Assumes N is a selected component.
-
-      --------------------------------
-      -- Prefix_Is_Formal_Parameter --
-      --------------------------------
-
-      function Prefix_Is_Formal_Parameter (N : Node_Id) return Boolean is
-         Sel_Comp : Node_Id;
-
-      begin
-         --  Move to the left-most prefix by climbing up the tree
-
-         Sel_Comp := N;
-         while Present (Parent (Sel_Comp))
-           and then Nkind (Parent (Sel_Comp)) = N_Selected_Component
-         loop
-            Sel_Comp := Parent (Sel_Comp);
-         end loop;
-
-         return Is_Formal (Entity (Prefix (Sel_Comp)));
-      end Prefix_Is_Formal_Parameter;
-
-   --  Start of processing for Has_Inferable_Discriminants
-
-   begin
-      --  For selected components, the subtype of the selector must be a
-      --  constrained Unchecked_Union. If the component is subject to a
-      --  per-object constraint, then the enclosing object must have inferable
-      --  discriminants.
-
-      if Nkind (N) = N_Selected_Component then
-         if Has_Per_Object_Constraint (Entity (Selector_Name (N))) then
-
-            --  A small hack. If we have a per-object constrained selected
-            --  component of a formal parameter, return True since we do not
-            --  know the actual parameter association yet.
-
-            if Prefix_Is_Formal_Parameter (N) then
-               return True;
-
-            --  Otherwise, check the enclosing object and the selector
-
-            else
-               return Has_Inferable_Discriminants (Prefix (N))
-                 and then Has_Inferable_Discriminants (Selector_Name (N));
-            end if;
-
-         --  The call to Has_Inferable_Discriminants will determine whether
-         --  the selector has a constrained Unchecked_Union nominal type.
-
-         else
-            return Has_Inferable_Discriminants (Selector_Name (N));
-         end if;
-
-      --  A qualified expression has inferable discriminants if its subtype
-      --  mark is a constrained Unchecked_Union subtype.
-
-      elsif Nkind (N) = N_Qualified_Expression then
-         return Is_Unchecked_Union (Etype (Subtype_Mark (N)))
-           and then Is_Constrained (Etype (Subtype_Mark (N)));
-
-      --  For all other names, it is sufficient to have a constrained
-      --  Unchecked_Union nominal subtype.
-
-      else
-         return Is_Unchecked_Union (Base_Type (Etype (N)))
-           and then Is_Constrained (Etype (N));
-      end if;
-   end Has_Inferable_Discriminants;
-
    -------------------------------
    -- Insert_Dereference_Action --
    -------------------------------


diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -12435,6 +12435,84 @@ package body Sem_Util is
   return False;
end Has_Fully_Default_Initializing_DIC_Pragma;
 
+   ---------------------------------
+   -- Has_Inferable_Discriminants --
+   ---------------------------------
+
+   function 

[Ada] Spurious error on protected call in inherited postcondition

2021-05-07 Thread Pierre-Marie de Rodat
An inherited class-wide precondition of a primitive of a protected type
cannot include a call to a protected primitive of the type, as specified
in AI12-0166.  This patch adds a guard to verify that this legality rule
applies only to a precondition and not to a postcondition.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_util.adb (Build_Class_Wide_Expression, Replace_Entity):
Add guard to verify that the enclosing pragma is a precondition.

diff --git a/gcc/ada/exp_util.adb b/gcc/ada/exp_util.adb
--- a/gcc/ada/exp_util.adb
+++ b/gcc/ada/exp_util.adb
@@ -1327,6 +1327,7 @@ package body Exp_Util is
  and then Is_Primitive_Wrapper (New_E)
  and then Is_Primitive_Wrapper (Subp)
  and then Scope (Subp) = Scope (New_E)
+ and then Chars (Pragma_Identifier (Prag)) = Name_Precondition
then
   Error_Msg_Node_2 := Wrapped_Entity (Subp);
   Error_Msg_NE




Re: [PATCH] run early sprintf warning after SSA (PR 100325)

2021-05-07 Thread Richard Biener via Gcc-patches
On Fri, May 7, 2021 at 2:12 AM Martin Sebor  wrote:
>
> On 5/6/21 8:32 AM, Aldy Hernandez wrote:
> >
> >
> > On 5/5/21 9:26 AM, Richard Biener wrote:
> >> On Wed, May 5, 2021 at 1:32 AM Martin Sebor via Gcc-patches
> >>  wrote:
> >>>
> >>> With no optimization, -Wformat-overflow and -Wformat-truncation
> >>> run early to detect a subset of simple bugs.  But as it turns out,
> >>> the pass runs just a tad too early, before SSA.  That causes it to
> >>> miss a class of problems that can easily be detected once code is
> >>> in SSA form, and I would expect might also cause false positives.
> >>>
> >>> The attached change moves the sprintf pass just after pass_build_ssa,
> >>> similar to other early flow-sensitive warnings (-Wnonnull-compare and
> >>> -Wuninitialized).
> >>
> >> Makes sense.  I suppose walloca might also benefit from SSA - it seems
> >> to do range queries which won't work quite well w/o SSA?
> >
> > The early Walloca pass that runs without optimization doesn't do much,
> > as we've never had ranges so early.  All it does is diagnose _every_
> > call to alloca(), if -Walloca is passed:
> >
> >// The first time this pass is called, it is called before
> >// optimizations have been run and range information is unavailable,
> >// so we can only perform strict alloca checking.
> >if (first_time_p)
> >  return warn_alloca != 0;
> >
> > Though, I suppose we could move the first alloca pass after SSA is
> > available and make it the one and only pass, since ranger only needs
> > SSA.  However, I don't know how well this would work without value
> > numbering or CSE.  For example, for gcc.dg/Walloca-4.c the gimple is:
> >
> > :
> >_1 = rear_ptr_9(D) - w_10(D);
> >_2 = (long unsigned int) _1;
> >if (_2 <= 4095)
> >  goto ; [INV]
> >else
> >  goto ; [INV]
> >
> > :
> >_3 = rear_ptr_9(D) - w_10(D);
> >_4 = (long unsigned int) _3;
> >src_16 = __builtin_alloca (_4);
> >goto ; [INV]
> >
> > No ranges can be determined for _4.  However, if either FRE or DOM run,
> > as they do value numbering and CSE respectively, we could easily
> > determine a range as the above would become:
> >
> >:
> >_1 = rear_ptr_9(D) - w_10(D);
> >_2 = (long unsigned int) _1;
> >if (_2 <= 4095)
> >  goto ; [INV]
> >else
> >  goto ; [INV]
> >
> > :
> >src_16 = __builtin_alloca (_2);
> >goto ; [INV]
> >
> > I'm inclined to leave the first alloca pass before SSA runs, as it
> > doesn't do anything with ranges.  If anyone's open to a simple -O0 CSE
> > type pass, it would be a different story.  Thoughts?
>
> Improving the analysis at -O0 and getting better warnings that are
> more consistent with what is issued with optimization would be very
> helpful (as long as it doesn't compromise debugging experience
> of course).

I agree.  It shouldn't be too difficult to for example run the VN
propagation part without doing actual elimination and keep
value-numbers for consumption.  do_rpo_vn (not exported)
might even already support iterate = false, eliminate = false,
it would just need factoring out the init/deinit somewhat.

Of course it will be a lot more expensive to do since it cannot
do "on-demand" value-numbering of interesting SSA names.
I'm not sure that would be possible anyhow.  Though for
the alloca case quickly scanning the function whether there's
any would of course be faster than throwing VN at it.

Oh, and no - we don't want to perform CSE at -O0 (I mean
affecting generated code).

Richard.

> Martin
>
> >
> > Aldy
> >
>
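
For reference, source along the following lines would give GIMPLE of the
shape quoted above: the size expression is written once in the guard and
once in the alloca call, so without value numbering the two computations
stay separate SSA names.  This is only a sketch; the function name
sum_bytes and the malloc fallback are assumptions made for illustration,
not the contents of gcc.dg/Walloca-4.c.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the bytes in [w, rear_ptr) into a scratch
   buffer and return their sum.  The buffer lives on the stack only when
   the guard proves the size is at most 4095 bytes.  */
size_t sum_bytes(const char *w, const char *rear_ptr)
{
  char *src;
  int on_stack = 0;

  /* The size is deliberately spelled out twice, as in the quoted GIMPLE:
     the range established by the comparison applies to the first
     computation, and only CSE/value numbering ties it to the second.  */
  if ((size_t)(rear_ptr - w) <= 4095) {
    src = __builtin_alloca((size_t)(rear_ptr - w));
    on_stack = 1;
  } else {
    src = malloc((size_t)(rear_ptr - w));
    if (src == NULL)
      return 0;
  }

  size_t len = (size_t)(rear_ptr - w);
  memcpy(src, w, len);

  size_t sum = 0;
  for (size_t i = 0; i < len; i++)
    sum += (unsigned char)src[i];

  if (!on_stack)
    free(src);
  return sum;
}
```

Whether a given GCC version can bound the alloca argument here at -O0
depends on the points discussed in the thread, so no particular warning
output is claimed.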


[patch] Do not apply scalar storage order to pointer fields

2021-05-07 Thread Eric Botcazou
Hi,

I didn't really think of pointer fields (nor of vector fields originally) when 
implementing the scalar_storage_order attribute, so they are swapped as well.
As Ulrich pointed out, this is problematic to describe in DWARF and probably 
not very useful in any case, so the attached patch pulls them out.
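
A small sketch of what this means for user code, assuming a GCC that
supports the attribute.  The struct and function names (rec,
fields_round_trip, pointer_bytes_are_native) are made up for
illustration.  Reading a field back always yields the stored value; what
changes is the in-memory byte layout of the pointer field, which with the
patch should match a plain pointer.

```c
#include <stddef.h>
#include <string.h>

/* Pick the storage order opposite to the host's, as the new test does.  */
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#define OPPOSITE_ORDER "big-endian"
#else
#define OPPOSITE_ORDER "little-endian"
#endif

struct __attribute__((scalar_storage_order(OPPOSITE_ORDER))) rec {
  unsigned int value;  /* scalar field: stored byte-swapped */
  int *ptr;            /* pointer field: left alone once the patch is in */
};

/* Returns 1 when both fields read back exactly what was stored.
   This holds whether or not the pointer bytes are swapped in memory.  */
int fields_round_trip(void)
{
  int target = 0;
  struct rec r;
  r.value = 0x11223344u;
  r.ptr = &target;
  return r.value == 0x11223344u && r.ptr == &target;
}

/* Returns 1 when the stored bytes of r.ptr equal those of a plain
   pointer.  Expected to be 1 with the patch applied; a compiler without
   it may return 0.  The bytes are reached through the address of the
   whole struct, which is allowed even for reverse-order records.  */
int pointer_bytes_are_native(void)
{
  int target = 0;
  int *plain = &target;
  struct rec r;
  r.value = 0;
  r.ptr = &target;
  return memcmp((const unsigned char *)&r + offsetof(struct rec, ptr),
                &plain, sizeof plain) == 0;
}
```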

Tested on x86-64/Linux, OK for mainline?


2021-05-07  Eric Botcazou  

* doc/extend.texi (scalar_storage_order): Mention effect on pointer
and vector fields.
* tree.h (reverse_storage_order_for_component_p): Return false if
the type is a pointer.
c/
* c-typeck.c (build_unary_op) <ADDR_EXPR>: Do not issue an error
on the address of a pointer field in a record with reverse SSO.


2021-05-07  Eric Botcazou  

* gcc.dg/sso-12.c: New test.

-- 
Eric Botcazou

diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index fdc7bb6125c..5bdc673d03a 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -4866,6 +4866,7 @@ build_unary_op (location_t location, enum tree_code code, tree xarg,
 	  if (TYPE_REVERSE_STORAGE_ORDER (TREE_TYPE (TREE_OPERAND (arg, 0))))
 	{
 	  if (!AGGREGATE_TYPE_P (TREE_TYPE (arg))
+		  && !POINTER_TYPE_P (TREE_TYPE (arg))
 		  && !VECTOR_TYPE_P (TREE_TYPE (arg)))
 		{
 		  error_at (location, "cannot take address of scalar with "
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index c8caf36f293..fd9175d1b3b 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8551,6 +8551,9 @@ or an array whose component is a @code{union} or a @code{struct}, and it is
 possible for these fields to have a different scalar storage order than the
 enclosing type.
 
+Note that neither pointer nor vector fields are considered scalar fields in
+this context, so the attribute has no effects on these fields.
+
 This attribute is supported only for targets that use a uniform default
 scalar storage order (fortunately, most of them), i.e.@: targets that store
 the scalars either all in big-endian or all in little-endian.
diff --git a/gcc/tree.h b/gcc/tree.h
index 6d3cfc4c588..784452ca490 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4989,7 +4989,9 @@ static inline bool
 reverse_storage_order_for_component_p (tree t)
 {
   /* The storage order only applies to scalar components.  */
-  if (AGGREGATE_TYPE_P (TREE_TYPE (t)) || VECTOR_TYPE_P (TREE_TYPE (t)))
+  if (AGGREGATE_TYPE_P (TREE_TYPE (t))
+  || POINTER_TYPE_P (TREE_TYPE (t))
+  || VECTOR_TYPE_P (TREE_TYPE (t)))
 return false;
 
   if (TREE_CODE (t) == REALPART_EXPR || TREE_CODE (t) == IMAGPART_EXPR)
/* Test scalar_storage_order attribute and pointer fields */

/* { dg-do run } */
/* { dg-options "-Wno-pedantic" } */

#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
struct __attribute__((scalar_storage_order("big-endian"))) Rec
{
  int *p;
};
#else
struct __attribute__((scalar_storage_order("little-endian"))) Rec
{
  int *p;
};
#endif

int main (int argc)
{
  struct Rec r = { &argc };
  int *p = &argc;

  if (__builtin_memcmp (&r.p, &p, sizeof (int *)) != 0)
__builtin_abort ();

  return 0;
}


[Patch] contrib/gcc-changelog: Add/improve --help

2021-05-07 Thread Tobias Burnus

Hi all, hi Martin,

when running the scripts manually, I tend to get confused
which one is which.  --help helps a bit :-)

OK?

Tobias

contrib/gcc-changelog: Add/improve --help

contrib/ChangeLog:

	* gcc-changelog/git_check_commit.py (__Main__): State in --help
	the default value for 'revisions'.
	* gcc-changelog/git_email.py (show_help): Add.
	(__main__): Handle -h and --help.

diff --git a/contrib/gcc-changelog/git_check_commit.py b/contrib/gcc-changelog/git_check_commit.py
index 935425ef813..246e9735c1d 100755
--- a/contrib/gcc-changelog/git_check_commit.py
+++ b/contrib/gcc-changelog/git_check_commit.py
@@ -23,7 +23,8 @@ from git_repository import parse_git_revisions
 parser = argparse.ArgumentParser(description='Check git ChangeLog format '
  'of a commit')
 parser.add_argument('revisions', default='HEAD', nargs='?',
-help='Git revisions (e.g. hash~5..hash or just hash)')
+help='Git revisions (e.g. hash~5..hash or just hash) - '
+'if not specified: HEAD')
 parser.add_argument('-g', '--git-path', default='.',
 help='Path to git repository')
 parser.add_argument('-p', '--print-changelog', action='store_true',
diff --git a/contrib/gcc-changelog/git_email.py b/contrib/gcc-changelog/git_email.py
index b0547b363aa..a79d2c7ba86 100755
--- a/contrib/gcc-changelog/git_email.py
+++ b/contrib/gcc-changelog/git_email.py
@@ -72,10 +72,23 @@ class GitEmail(GitCommit):
  commit_to_info_hook=lambda x: None)
 
 
-# With zero arguments, process every patch file in the ./patches directory.
-# With one argument, process the named patch file.
-# Patch files must be in 'git format-patch' format.
+def show_help():
+print("usage: git_email.py [--help] [patch file ...]\n"
+  "\n"
+  "Check git ChangeLog format of a patch\n"
+  "\n"
+  "With zero arguments, process every patch file in the "
+  "./patches directory.\n"
+  "With one argument, process the named patch file.\n"
+  "\n"
+  "Patch files must be in 'git format-patch' format.\n\n")
+sys.exit(0)
+
+
 if __name__ == '__main__':
+if len(sys.argv) == 2 and (sys.argv[1] == '-h' or sys.argv[1] == '--help'):
+show_help()
+
 if len(sys.argv) == 1:
 allfiles = []
 for root, _dirs, files in os.walk('patches'):


[PATCH] i386: Do not emit mask compares for mode sizes < 16 [PR100445]

2021-05-07 Thread Uros Bizjak via Gcc-patches
Recent addition of v*cond* patterns for MMXMODEI modes allows 64bit MMX
modes to enter ix86_expand_sse_cmp. ix86_use_mask_cmp_p was not prepared
to reject mode sizes < 16, resulting in ICE due to unavailability of 64bit
masked PCOM instructions.
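
The size check added by the patch can be sketched on its own as below.
The enum and function name are hypothetical; the real
ix86_use_mask_cmp_p returns bool and goes on to inspect the operands for
sizes other than 64, which is modelled here only as a "check further"
result.

```c
/* Hypothetical three-way result standing in for the early returns
   of the real predicate.  */
enum mask_cmp_decision {
  MASK_CMP_NO,            /* never use a mask compare */
  MASK_CMP_YES,           /* always use a mask compare */
  MASK_CMP_CHECK_FURTHER  /* the remaining (unshown) checks decide */
};

/* Sketch of the size-based part of the decision only.  */
enum mask_cmp_decision mask_cmp_size_check(int vector_size)
{
  if (vector_size < 16)   /* 64-bit MMX modes: no masked compares exist */
    return MASK_CMP_NO;
  if (vector_size == 64)  /* 512-bit vectors */
    return MASK_CMP_YES;
  return MASK_CMP_CHECK_FURTHER;
}
```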

2021-05-07  Uroš Bizjak  

gcc/
PR target/100445
* config/i386/i386-expand.c (ix86_use_mask_cmp_p):
Return false for mode sizes < 16.

gcc/testsuite/

PR target/100445
* gcc.target/i386/pr100445-1.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 4dfe7d6c282..61b2f921f41 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3490,7 +3490,11 @@ static bool
 ix86_use_mask_cmp_p (machine_mode mode, machine_mode cmp_mode,
 rtx op_true, rtx op_false)
 {
-  if (GET_MODE_SIZE (mode) == 64)
+  int vector_size = GET_MODE_SIZE (mode);
+
+  if (vector_size < 16)
+return false;
+  else if (vector_size == 64)
 return true;
 
   /* When op_true is NULL, op_false must be NULL, or vice versa.  */
diff --git a/gcc/testsuite/gcc.target/i386/pr100445-1.c b/gcc/testsuite/gcc.target/i386/pr100445-1.c
new file mode 100644
index 000..a1c18aff0f9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr100445-1.c
@@ -0,0 +1,5 @@
+/* PR target/100445 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -mavx512vl" } */
+
+#include "pr96827.c"


[PATCH] middle-end/100464 - avoid spurious TREE_ADDRESSABLE in folding debug stmts

2021-05-07 Thread Richard Biener
canonicalize_constructor_val was setting TREE_ADDRESSABLE on bases
of ADDR_EXPRs but that's futile when we're dealing with CTOR values
in debug stmts.  This rips out the code which was added for Java
and should have been an assertion when we didn't have debug stmts.

Bootstrapped and tested on x86_64-unknown-linux-gnu for all languages
which revealed PR100468 for which I added the cp/class.c hunk below.
Re-testing with that in progress.

OK for trunk and branch?  It looks like this C++ code is new in GCC 11.

Thanks,
Richard.

2021-05-07  Richard Biener  

PR middle-end/100464
PR c++/100468
gcc/
* gimple-fold.c (canonicalize_constructor_val): Do not set
TREE_ADDRESSABLE.

gcc/cp/
* call.c (set_up_extended_ref_temp): Mark the temporary
addressable if the TARGET_EXPR was.

gcc/testsuite/
* gcc.dg/pr100464.c: New testcase.
---
 gcc/cp/call.c   |  2 ++
 gcc/gimple-fold.c   |  4 +++-
 gcc/testsuite/gcc.dg/pr100464.c | 16 
 3 files changed, 21 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr100464.c

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 57bac05fe70..ea97be22f07 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -12478,6 +12478,8 @@ set_up_extended_ref_temp (tree decl, tree expr, 
vec **cleanups,
  VAR.  */
   if (TREE_CODE (expr) != TARGET_EXPR)
 expr = get_target_expr (expr);
+  else if (TREE_ADDRESSABLE (expr))
+TREE_ADDRESSABLE (var) = 1;
 
   if (TREE_CODE (decl) == FIELD_DECL
   && extra_warnings && !TREE_NO_WARNING (decl))
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index aa33779b753..768ef89d876 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -245,7 +245,9 @@ canonicalize_constructor_val (tree cval, tree from_decl)
   if (TREE_TYPE (base) == error_mark_node)
return NULL_TREE;
   if (VAR_P (base))
-   TREE_ADDRESSABLE (base) = 1;
+   /* ???  We should be able to assert that TREE_ADDRESSABLE is set,
+  but since the use can be in a debug stmt we can't.  */
+   ;
   else if (TREE_CODE (base) == FUNCTION_DECL)
{
  /* Make sure we create a cgraph node for functions we'll reference.
diff --git a/gcc/testsuite/gcc.dg/pr100464.c b/gcc/testsuite/gcc.dg/pr100464.c
new file mode 100644
index 000..46cc37dff54
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr100464.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fcompare-debug" } */
+
+int *a;
+static int b, c, d, e, g, h;
+int f;
+void i() {
+  int *j[] = {, , , , , , , , , , ,
+  , , , , , , , , , , ,
+  , , , , , , , , , , };
+  int **k = [5];
+  for (; f;)
+b |= *a;
+  *k = 
+}
+int main() {}
-- 
2.26.2


Re: [PATCH] split loop for NE condition.

2021-05-07 Thread guojiufu via Gcc-patches

On 2021-05-06 16:27, Richard Biener wrote:

On Thu, 6 May 2021, guojiufu wrote:


On 2021-05-03 20:18, Richard Biener wrote:
> On Thu, 29 Apr 2021, Jiufu Guo wrote:
>
>> When there is the possibility that overflow may happen on the loop index,
>> a few optimizations would not happen. For example code:
>>
>> foo (int *a, int *b, unsigned l, unsigned n)
>> {
>>   while (++l != n)
>>     a[l] = b[l]  + 1;
>> }
>>
>> For this code, if "l > n", overflow may happen.  If "l < n" at the beginning,
>> it could be optimized (e.g. vectorization).
>>
>> We can split the loop into two loops:
>>
>>   while (++l > n)
>>     a[l] = b[l]  + 1;
>>   while (l++ < n)
>>     a[l] = b[l]  + 1;
>>
>> then for the second loop, it could be optimized.
>>
>> This patch is splitting this kind of small loop to achieve better
>> performance.
>>
>> Bootstrap and regtest pass on ppc64le.  Is this ok for trunk?
>
> Do you have any statistics on how often this splits a loop during
> bootstrap (use --with-build-config=bootstrap-O3)?  Or alternatively
> on SPEC?

In SPEC2017, about 240 loops are split, and I saw some performance
improvement on xz.
I will try bootstrap-O3 (it currently hits an ICE).
The ICE is also there without this patch when building with
bootstrap-O3 on ppc64le.
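
To make the split concrete, here is a hand-written sketch (not what the
patch generates) that checks the two forms against each other.  It uses
an 8-bit index so that wrap-around is cheap to exercise exhaustively; the
function names are made up.  Note that the second loop is written as a
plain counted loop, which is what makes it a candidate for vectorization,
and its exact shape differs slightly from the two-line summary in the
patch description.

```c
#include <stdint.h>
#include <string.h>

/* Original form: the exit test is "!=", so the index may wrap around
   before it reaches n.  */
void fill_original(int *a, const int *b, uint8_t l, uint8_t n)
{
  while (++l != n)
    a[l] = b[l] + 1;
}

/* Split form.  */
void fill_split(int *a, const int *b, uint8_t l, uint8_t n)
{
  /* Part 1: runs only while the incremented index is above n, i.e. up
     to the point where it wraps to 0.  Exits at once if l + 1 <= n.  */
  while (++l > n)
    a[l] = b[l] + 1;

  /* Part 2: here l <= n, so the index climbs to n without wrapping.  */
  for (; l < n; l++)
    a[l] = b[l] + 1;
}

/* Compare both forms for every (l, n) pair; returns 1 if they always
   write the same elements.  */
int split_matches_original(void)
{
  int b[256], a1[256], a2[256];

  for (int i = 0; i < 256; i++)
    b[i] = i * 3;

  for (int l = 0; l < 256; l++)
    for (int n = 0; n < 256; n++) {
      memset(a1, 0, sizeof a1);
      memset(a2, 0, sizeof a2);
      fill_original(a1, b, (uint8_t)l, (uint8_t)n);
      fill_split(a2, b, (uint8_t)l, (uint8_t)n);
      if (memcmp(a1, a2, sizeof a1) != 0)
        return 0;
    }
  return 1;
}
```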




>
> Actual comments on the patch inline.
>
>> Thanks!
>>
>> Jiufu Guo.
>>
>> gcc/ChangeLog:
>>
>> 2021-04-29  Jiufu Guo  
>>
>>  * params.opt (max-insns-ne-cond-split): New.
>>  * tree-ssa-loop-split.c (connect_loop_phis): Add new param.
>>  (get_ne_cond_branch): New function.
>>  (split_ne_loop): New function.
>>  (split_loop_on_ne_cond): New function.
>>  (tree_ssa_split_loops): Use split_loop_on_ne_cond.
>>
>> gcc/testsuite/ChangeLog:
>> 2021-04-29  Jiufu Guo  
>>
>>  * gcc.dg/loop-split1.c: New test.
>>
>> ---
>>  gcc/params.opt |   4 +
>>  gcc/testsuite/gcc.dg/loop-split1.c |  28 
>>  gcc/tree-ssa-loop-split.c  | 219
>> -
>>  3 files changed, 247 insertions(+), 4 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.dg/loop-split1.c
>>
>> diff --git a/gcc/params.opt b/gcc/params.opt
>> index 2e4cbdd7a71..900b59b5136 100644
>> --- a/gcc/params.opt
>> +++ b/gcc/params.opt
>> @@ -766,6 +766,10 @@ Min. ratio of insns to prefetches to enable
>> prefetching for a loop with an unkno
>> Common Joined UInteger Var(param_min_loop_cond_split_prob) Init(30)
>> IntegerRange(0, 100) Param Optimization
>> The minimum threshold for probability of semi-invariant condition statement
>> to trigger loop split.
>>
>> +-param=max-insns-ne-cond-split=
>> +Common Joined UInteger Var(param_max_insn_ne_cond_split) Init(64) Param
>> Optimization
>> +The maximum threshold for instructions number of a loop with ne
>> condition to split.
>> +
>>  -param=min-nondebug-insn-uid=
>>  Common Joined UInteger Var(param_min_nondebug_insn_uid) Param
>>  The minimum UID to be used for a nondebug insn.
>> diff --git a/gcc/testsuite/gcc.dg/loop-split1.c
>> b/gcc/testsuite/gcc.dg/loop-split1.c
>> new file mode 100644
>> index 000..4c466aa9f54
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/loop-split1.c
>> @@ -0,0 +1,28 @@
>> +/* { dg-do compile } */
>> +/* { dg-options "-O2 -fsplit-loops -fdump-tree-lsplit-details" } */
>> +
>> +void
>> +foo (int *a, int *b, unsigned l, unsigned n)
>> +{
>> +  while (++l != n)
>> +a[l] = b[l]  + 1;
>> +}
>> +
>> +void
>> +foo1 (int *a, int *b, unsigned l, unsigned n)
>> +{
>> +  while (l++ != n)
>> +a[l] = b[l]  + 1;
>> +}
>> +
>> +unsigned
>> +foo2 (char *a, char *b, unsigned l, unsigned n)
>> +{
>> +  while (++l != n)
>> +if (a[l] != b[l])
>> +  break;
>> +
>> +  return l;
>> +}
>> +
>> +/* { dg-final { scan-tree-dump-times "Loop split" 3 "lsplit" } } */
>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
>> index b80b6a75e62..a6d28078e5e 100644
>> --- a/gcc/tree-ssa-loop-split.c
>> +++ b/gcc/tree-ssa-loop-split.c
>> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "cfghooks.h"
>>  #include "gimple-fold.h"
>>  #include "gimplify-me.h"
>> +#include "tree-ssa-loop-ivopts.h"
>>
>>  /* This file implements two kinds of loop splitting.
>>
>> @@ -233,7 +234,8 @@ easy_exit_values (class loop *loop)
>> this.  The loops need to fulfill easy_exit_values().  */
>>
>> static void
>> -connect_loop_phis (class loop *loop1, class loop *loop2, edge new_e)
>> +connect_loop_phis (class loop *loop1, class loop *loop2, edge new_e,
>> + bool use_prev = false)
>>  {
>>basic_block rest = loop_preheader_edge (loop2)->src;
>>gcc_assert (new_e->dest == rest);
>> @@ -248,13 +250,14 @@ connect_loop_phis (class loop *loop1, class loop
>> *loop2, edge new_e)
>> !gsi_end_p (psi_first);
>> gsi_next (&psi_first), gsi_next (&psi_second))
>>  {
>> -  tree init, next, new_init;
>> +  tree init, next, new_init, prev;
>>use_operand_p op;
>>gphi *phi_first = psi_first.phi ();
>>gphi *phi_second = 

Re: [PATCH] i386: Fix up 8-byte vcond* with -mxop [PR100445]

2021-05-07 Thread Uros Bizjak via Gcc-patches
On Fri, May 7, 2021 at 9:57 AM Jakub Jelinek  wrote:
>
> Hi!
>
> ix86_expand_sse_movcc has special TARGET_XOP handling and the recent
> addition of support of v*cond* patterns for MMXMODEI modes results in
> ICEs because the expected pattern doesn't exist.  We can handle it
> using 128-bit vpcmov (if we ignore the upper 64 bits like we ignore in
> other TARGET_MMX_WITH_SSE support).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux (admittedly
> on a CPU without XOP support), ok for trunk?
>
> 2021-05-07  Jakub Jelinek  
>
> PR target/100445
> * config/i386/mmx.md (*xop_pcmov_): New define_insn.
>
> * gcc.target/i386/pr100445.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/mmx.md.jj   2021-05-06 10:14:55.508063058 +0200
> +++ gcc/config/i386/mmx.md  2021-05-06 14:43:19.731486156 +0200
> @@ -1700,6 +1700,17 @@ (define_expand "vcond_mask_"
>DONE;
>  })
>
> +;; XOP parallel XMM conditional moves
> +(define_insn "*xop_pcmov_"
> +  [(set (match_operand:MMXMODEI 0 "register_operand" "=x")
> +(if_then_else:MMXMODEI
> +  (match_operand:MMXMODEI 3 "register_operand" "x")
> +  (match_operand:MMXMODEI 1 "register_operand" "x")
> +  (match_operand:MMXMODEI 2 "register_operand" "x")))]
> +  "TARGET_XOP && TARGET_MMX_WITH_SSE"
> +  "vpcmov\t{%3, %2, %1, %0|%0, %1, %2, %3}"
> +  [(set_attr "type" "sse4arg")])
> +
>  ;
>  ;;
>  ;; Parallel integral logical operations
> --- gcc/testsuite/gcc.target/i386/pr100445.c.jj 2021-05-06 14:46:35.936327593 +0200
> +++ gcc/testsuite/gcc.target/i386/pr100445.c    2021-05-06 14:46:10.259609909 +0200
> @@ -0,0 +1,12 @@
> +/* PR target/100445 */
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -mxop" } */
> +
> +int a, b[3];
> +
> +void
> +foo (void)
> +{
> +  for (; a < 3; a++)
> +b[a] = (a - 1) / 2;
> +}
>
> Jakub
>


[PATCH] i386: Fix up 8-byte vcond* with -mxop [PR100445]

2021-05-07 Thread Jakub Jelinek via Gcc-patches
Hi!

ix86_expand_sse_movcc has special TARGET_XOP handling and the recent
addition of support of v*cond* patterns for MMXMODEI modes results in
ICEs because the expected pattern doesn't exist.  We can handle it
using 128-bit vpcmov (if we ignore the upper 64 bits like we ignore in
other TARGET_MMX_WITH_SSE support).

Bootstrapped/regtested on x86_64-linux and i686-linux (admittedly
on a CPU without XOP support), ok for trunk?

2021-05-07  Jakub Jelinek  

PR target/100445
* config/i386/mmx.md (*xop_pcmov_): New define_insn.

* gcc.target/i386/pr100445.c: New test.

--- gcc/config/i386/mmx.md.jj   2021-05-06 10:14:55.508063058 +0200
+++ gcc/config/i386/mmx.md  2021-05-06 14:43:19.731486156 +0200
@@ -1700,6 +1700,17 @@ (define_expand "vcond_mask_"
   DONE;
 })
 
+;; XOP parallel XMM conditional moves
+(define_insn "*xop_pcmov_"
+  [(set (match_operand:MMXMODEI 0 "register_operand" "=x")
+(if_then_else:MMXMODEI
+  (match_operand:MMXMODEI 3 "register_operand" "x")
+  (match_operand:MMXMODEI 1 "register_operand" "x")
+  (match_operand:MMXMODEI 2 "register_operand" "x")))]
+  "TARGET_XOP && TARGET_MMX_WITH_SSE"
+  "vpcmov\t{%3, %2, %1, %0|%0, %1, %2, %3}"
+  [(set_attr "type" "sse4arg")])
+
 ;
 ;;
 ;; Parallel integral logical operations
--- gcc/testsuite/gcc.target/i386/pr100445.c.jj 2021-05-06 14:46:35.936327593 +0200
+++ gcc/testsuite/gcc.target/i386/pr100445.c    2021-05-06 14:46:10.259609909 +0200
@@ -0,0 +1,12 @@
+/* PR target/100445 */
+/* { dg-do compile } */
+/* { dg-options "-O3 -mxop" } */
+
+int a, b[3];
+
+void
+foo (void)
+{
+  for (; a < 3; a++)
+b[a] = (a - 1) / 2;
+}

Jakub



[PATCH] libcpp: Fix up pragma preprocessing [PR100450]

2021-05-07 Thread Jakub Jelinek via Gcc-patches
Hi!

Since the r0-85991-ga25a8f3be322fe0f838947b679f73d6efc2a412c
https://gcc.gnu.org/legacy-ml/gcc-patches/2008-02/msg01329.html
changes, so that we handle macros inside of pragmas that should expand
macros, during preprocessing we print those pragmas token by token,
with CPP_PRAGMA printed as
  fputs ("#pragma ", print.outf);
  if (space)
fprintf (print.outf, "%s %s", space, name);
  else
fprintf (print.outf, "%s", name);
where name is some identifier (so e.g. print
#pragma omp parallel
or
#pragma omp for
etc.).  Because it ends in an identifier, we need to handle it like
an identifier (i.e. CPP_NAME) for the decision whether a space needs
to be emitted in between that #pragma whatever or #pragma whatever whatever
and following token, otherwise the attached testcase is preprocessed as
#pragma omp forreduction(+:red)
rather than
#pragma omp for reduction(+:red)
The cpp_avoid_paste function is only called for this purpose.
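
A toy model of the effect, for illustration only: the token kinds, the
struct and both functions below are hypothetical and far simpler than
libcpp, and the name rule is reduced to "a name follows".  The flag
pragma_like_name switches between the old behaviour and the patched one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token kinds; the real libcpp enum has many more.  */
enum tok_kind { TOK_PRAGMA, TOK_NAME, TOK_PUNCT };

struct tok {
  enum tok_kind kind;
  const char *text;
};

/* Return true if a space is needed between a token of kind A and a
   following token of kind B.  Only the name-after-name rule is
   modelled; PRAGMA_LIKE_NAME selects the patched behaviour.  */
bool avoid_paste_sketch(enum tok_kind a, enum tok_kind b,
                        bool pragma_like_name)
{
  switch (a) {
  case TOK_PRAGMA:
    if (!pragma_like_name)
      return false;
    /* Patched behaviour: treat like a name.  Fall through.  */
  case TOK_NAME:
    return b == TOK_NAME;
  default:
    return false;
  }
}

/* Print N tokens into OUT, inserting a space only where the predicate
   above asks for one.  */
void print_tokens(const struct tok *toks, size_t n, bool pragma_like_name,
                  char *out, size_t out_size)
{
  out[0] = '\0';
  for (size_t i = 0; i < n; i++) {
    if (i > 0 && avoid_paste_sketch(toks[i - 1].kind, toks[i].kind,
                                    pragma_like_name))
      strncat(out, " ", out_size - strlen(out) - 1);
    strncat(out, toks[i].text, out_size - strlen(out) - 1);
  }
}
```

Feeding it the three tokens "#pragma omp for", "reduction" and "(+:red)"
gives the pasted form with the flag off and the separated form with it
on, matching the two outputs quoted above.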

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk
and release branches (in particular 8 which freezes later today)?

2021-05-07  Jakub Jelinek  

PR c/100450
* lex.c (cpp_avoid_paste): Handle token1 CPP_PRAGMA like CPP_NAME.

* c-c++-common/gomp/pr100450.c: New test.

--- libcpp/lex.c.jj 2021-05-04 21:02:13.633917100 +0200
+++ libcpp/lex.c    2021-05-06 20:32:07.695035739 +0200
@@ -3709,6 +3709,7 @@ cpp_avoid_paste (cpp_reader *pfile, cons
 case CPP_DEREF:return c == '*';
 case CPP_DOT:  return c == '.' || c == '%' || b == CPP_NUMBER;
 case CPP_HASH: return c == '#' || c == '%'; /* Digraph form.  */
+case CPP_PRAGMA:
 case CPP_NAME: return ((b == CPP_NUMBER
 && name_p (pfile, &token2->val.str))
|| b == CPP_NAME
--- gcc/testsuite/c-c++-common/gomp/pr100450.c.jj   2021-05-06 20:33:45.302961055 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr100450.c  2021-05-06 20:33:39.882020738 +0200
@@ -0,0 +1,20 @@
+/* PR c/100450 */
+/* { dg-do compile } */
+/* { dg-options "-fopenmp -save-temps -Wunknown-pragmas" } */
+
+#define TEST(T) { \
+ {T} \
+}
+#define CLAUSES reduction(+:red)
+#define PARALLEL_FOR(X) TEST({ \
+_Pragma("omp for CLAUSES") \
+X \
+})
+
+void foo()
+{
+  int red = 0;
+  int A[3] = {};
+  #pragma omp parallel shared(red)
+  PARALLEL_FOR( for(int i=0; i < 3; i++) red += A[i]; )
+}

Jakub