Re: [RFA] Attach MEM_EXPR information when flushing BLKmode args to the stack

2021-07-05 Thread Jeff Law via Gcc-patches




On 7/5/2021 5:17 AM, Richard Biener via Gcc-patches wrote:

On Fri, Jul 2, 2021 at 6:13 PM Jeff Law  wrote:


This is a minor missed optimization we found with our internal port.

Given this code:

typedef struct {short a; short b;} T;

extern void g1();

void f(T s)
{
  if (s.a < 0)
    g1();
}


"s" is passed in a register, but it's still a BLKmode object because the
alignment of T is smaller than the alignment that an integer of the same
size would have (16 bits vs 32 bits).
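
The size and alignment relationship described above can be checked with a
small C sketch.  This is only an illustration, assuming a typical target
where short is 2 bytes and int is 4 bytes; the helper name is made up for
the example and is not part of GCC.

```c
#include <stddef.h>

typedef struct { short a; short b; } T;

/* Hypothetical helper: returns nonzero when T is exactly as large as an
   int but is less strictly aligned than one.  That mismatch is the
   situation the message describes as leaving "s" in BLKmode.  */
static int same_size_weaker_alignment(void)
{
  return sizeof(T) == sizeof(int) && _Alignof(T) < _Alignof(int);
}
```

On such a target the struct inherits the alignment of its short members,
so it cannot simply be treated as an integer of the same size.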


Because "s" is BLKmode, function.c is going to store it into a stack slot
and we'll load it from that slot for each reference.  So on the v850
(just to pick a port that likely has the same behavior we're seeing) we
have this RTL from CSE2:


(insn 2 4 3 2 (set (mem/c:SI (plus:SI (reg/f:SI 34 .fp)
  (const_int -4 [0xfffc])) [2 S4 A32])
  (reg:SI 6 r6)) "j.c":6:1 7 {*movsi_internal}
   (expr_list:REG_DEAD (reg:SI 6 r6)
  (nil)))
(note 3 2 8 2 NOTE_INSN_FUNCTION_BEG)
(insn 8 3 9 2 (set (reg:HI 44 [ s.a ])
  (mem/c:HI (plus:SI (reg/f:SI 34 .fp)
  (const_int -4 [0xfffc])) [1 s.a+0 S2 A32]))
"j.c":7:5 3 {*movhi_internal}
   (nil))
(insn 9 8 10 2 (parallel [
  (set (reg:SI 45)
  (ashift:SI (subreg:SI (reg:HI 44 [ s.a ]) 0)
  (const_int 16 [0x10])))
  (clobber (reg:CC 32 psw))
  ]) "j.c":7:5 94 {ashlsi3_clobber_flags}
   (expr_list:REG_DEAD (reg:HI 44 [ s.a ])
  (expr_list:REG_UNUSED (reg:CC 32 psw)
  (nil))))
(insn 10 9 11 2 (parallel [
  (set (reg:SI 43)
  (ashiftrt:SI (reg:SI 45)
  (const_int 16 [0x10])))
  (clobber (reg:CC 32 psw))
  ]) "j.c":7:5 104 {ashrsi3_clobber_flags}
   (expr_list:REG_DEAD (reg:SI 45)
  (expr_list:REG_UNUSED (reg:CC 32 psw)
  (nil))))


Insn 2 is the store into the stack.  Insn 8 is the load for s.a in the
conditional.  DSE1 replaces the MEM in insn 8 with (reg 6) since (reg 6)
has the value we want.  After that the store at insn 2 is dead.  Sadly
DSE never removes the store.

The problem is that RTL DSE considers a store with no MEM_EXPR as escaping,
which keeps the MEM live.  The lack of a MEM_EXPR is due to the call to
change_address that twiddles the mode on the MEM for the store at insn 2.
It should be safe to copy the MEM_EXPR (which should always be a
PARM_DECL) from the original memory to the memory returned by
change_address.  Doing so results in DSE1 removing the store at insn 2.

It would be nice to remove the stack setup/teardown.   I'm not offhand
aware of mechanisms to remove the setup/teardown after we've already
allocated a slot, even if the slot is no longer used.

Bootstrapped and regression tested on x86, though I don't think that's a
particularly useful test.  So I also ran it through my tester across
those pesky embedded targets without regressions as well.

I didn't include a test simply because I didn't want to have an insane
target selector.  I guess if we really wanted a test we could look after
DSE1 is done and verify there aren't any MEMs left at all.  Willing to
try that if the consensus is we want this tested.

OK for the trunk?

I wonder why the code doesn't use adjust_address instead?  That
handles most cases already and the code doesn't change the
address but just the mode (and access size)?

No idea.  It should be easy enough to try that approach though.

jeff



[r12-2036 Regression] FAIL: gcc.dg/pr96573.c scan-tree-dump optimized "__builtin_bswap|VEC_PERM_EXPR[^\n\r]*7, 6, 5, 4, 3, 2, 1, 0" on Linux/x86_64

2021-07-05 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

be8749f939a933bca6de19d9cf1a510d5954c2fa is the first bad commit
commit be8749f939a933bca6de19d9cf1a510d5954c2fa
Author: Uros Bizjak 
Date:   Mon Jul 5 21:05:10 2021 +0200

i386: Implement 4-byte vector (V4QI/V2HI) constant permutations

caused

FAIL: gcc.dg/pr96573.c scan-tree-dump optimized 
"__builtin_bswap|VEC_PERM_EXPR[^\n\r]*7, 6, 5, 4, 3, 2, 1, 0"

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-2036/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/pr96573.c 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email.  For questions about this report, contact
me at skpgkp2 at gmail dot com)


[PATCH, rs6000] fix failure test cases caused by disabling mode promotion for pseudos [PR100952]

2021-07-05 Thread HAO CHEN GUI via Gcc-patches

Hi

   The patch changes the matching conditions in pr81348.c and pr56605.c.
The original conditions fail to match because mode promotion is disabled.


   The attachments are the patch diff and change log file.

   Bootstrapped and tested on powerpc64le-linux with no regressions. Is 
this okay for trunk? Any recommendations? Thanks a lot.


PR target/100952
* gcc/testsuite/gcc.target/powerpc/pr56605.c: Change matching
conditions.
* gcc/testsuite/gcc.target/powerpc/pr81348.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/powerpc/pr56605.c 
b/gcc/testsuite/gcc.target/powerpc/pr56605.c
index 29efd815adc..2b7ddbd7410 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr56605.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr56605.c
@@ -11,5 +11,5 @@ void foo (short* __restrict sb, int* __restrict ia)
 ia[i] = (int) sb[i];
 }
 
-/* { dg-final { scan-rtl-dump-times "\\\(compare:CC \\\((?:and|zero_extend):DI 
\\\(reg:\[SD\]I" 1 "combine" } } */
+/* { dg-final { scan-rtl-dump-times "\\\(compare:CC \\\((?:and|zero_extend):SI 
\\\(subreg:SI \\\(reg:\[SD\]I" 1 "combine" } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr81348.c 
b/gcc/testsuite/gcc.target/powerpc/pr81348.c
index 7037acf0c22..8043d06bcde 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr81348.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr81348.c
@@ -19,5 +19,5 @@ void d(void)
 ***c = e;
 }
 
-/* { dg-final { scan-assembler {\mlxsihzx\M}  } } */
-/* { dg-final { scan-assembler {\mvextsh2d\M} } } */
+/* { dg-final { scan-assembler {\mlha\M}  } } */
+/* { dg-final { scan-assembler {\mmtvsrwa\M} } } */


[PATCH, rs6000] fix execution failure of parity_1.f90 on P10 [PR100952]

2021-07-05 Thread HAO CHEN GUI via Gcc-patches

Hi,

   The patch fixes the wrong "if" fall-through in the "cstore4"
expander, which causes the comparison pattern to be expanded twice on P10.
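
The fall-through can be illustrated with a small stand-alone C sketch.
It is not the rs6000.md code: the enum, the emit() stand-in and the
handled_early flag are all hypothetical, and only the control-flow shape
(an "if" that should have been an "else if") follows the description.

```c
#include <stdio.h>

/* Hypothetical comparison codes; these are not GCC's rtx codes.  */
enum cmp_code { CMP_EQ, CMP_NE, CMP_OTHER };

static int emit_count;

/* Stand-in for emit_insn: records that one sequence was emitted.  */
static void emit(const char *what)
{
  emit_count++;
  printf("emit %s\n", what);
}

/* Buggy shape: the first branch is not chained to the rest, so when it
   is taken the following if/else chain still runs and emits again.  */
static int expand_without_else(enum cmp_code code, int handled_early)
{
  emit_count = 0;
  if (handled_early)
    emit("early comparison");
  if (code == CMP_EQ)
    emit("eq");
  else if (code == CMP_NE)
    emit("ne");
  else
    emit("generic");
  return emit_count;
}

/* Fixed shape: "else if" makes the branches mutually exclusive.  */
static int expand_with_else(enum cmp_code code, int handled_early)
{
  emit_count = 0;
  if (handled_early)
    emit("early comparison");
  else if (code == CMP_EQ)
    emit("eq");
  else if (code == CMP_NE)
    emit("ne");
  else
    emit("generic");
  return emit_count;
}
```

With the missing "else", an input that takes the first branch is expanded
a second time by the chain that follows it.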


   The attachments are the patch diff and change log file.

    Bootstrapped and tested on powerpc64le-linux with no regressions. 
Is this okay for trunk? Any recommendations? Thanks a lot.


PR target/100952
* config/rs6000/rs6000.md (cstore4): Fix wrong fall through.
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 3f59b544f6a..3ae7aa29c1d 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -11627,7 +11627,7 @@ (define_expand "cstore4"
 
   /* Expanding EQ and NE directly to some machine instructions does not help
  but does hurt combine.  So don't.  */
-  if (GET_CODE (operands[1]) == EQ)
+  else if (GET_CODE (operands[1]) == EQ)
 emit_insn (gen_eq3 (operands[0], operands[2], operands[3]));
   else if (mode == Pmode
   && GET_CODE (operands[1]) == NE)


Re: [PATCH] Add FMADDSUB and FMSUBADD SLP vectorization patterns and optabs

2021-07-05 Thread Hongtao Liu via Gcc-patches
On Mon, Jul 5, 2021 at 10:09 PM Richard Biener  wrote:
>
> This adds named expanders for vec_fmaddsub4 and
> vec_fmsubadd4 which map to x86 vfmaddsubXXXp{ds} and
> vfmsubaddXXXp{ds} instructions.  This complements the previous
> addition of ADDSUB support.
>
> x86 lacks SUBADD and the negate variants of FMA with mixed
> plus minus so I did not add optabs or patterns for those but
> it would not be difficult if there's a target that has them.
> Maybe one of the complex fma patterns match those variants?
>
> I did not dare to rewrite the numerous patterns to the new
> canonical name but instead added two new expanders.  Note I
> did not cover AVX512 since the existing patterns are separated
> and I have no easy way to test things there.  Handling AVX512
> should be easy as followup though.
>
> Bootstrap and testing on x86_64-unknown-linux-gnu in progress.
>
> Any comments?
>
> Thanks,
> Richard.
>
> 2021-07-05  Richard Biener  
>
> * doc/md.texi (vec_fmaddsub4): Document.
> (vec_fmsubadd4): Likewise.
> * optabs.def (vec_fmaddsub$a4): Add.
> (vec_fmsubadd$a4): Likewise.
> * internal-fn.def (IFN_VEC_FMADDSUB): Add.
> (IFN_VEC_FMSUBADD): Likewise.
> * tree-vect-slp-patterns.c (addsub_pattern::recognize):
> Refactor to handle IFN_VEC_FMADDSUB and IFN_VEC_FMSUBADD.
> (addsub_pattern::build): Likewise.
> * tree-vect-slp.c (vect_optimize_slp): CFN_VEC_FMADDSUB
> and CFN_VEC_FMSUBADD are not transparent for permutes.
> * config/i386/sse.md (vec_fmaddsub4): New expander.
> (vec_fmsubadd4): Likewise.
>
> * gcc.target/i386/vect-fmaddsubXXXpd.c: New testcase.
> * gcc.target/i386/vect-fmaddsubXXXps.c: Likewise.
> * gcc.target/i386/vect-fmsubaddXXXpd.c: Likewise.
> * gcc.target/i386/vect-fmsubaddXXXps.c: Likewise.
> ---
>  gcc/config/i386/sse.md|  19 ++
>  gcc/doc/md.texi   |  14 ++
>  gcc/internal-fn.def   |   3 +-
>  gcc/optabs.def|   2 +
>  .../gcc.target/i386/vect-fmaddsubXXXpd.c  |  34 
>  .../gcc.target/i386/vect-fmaddsubXXXps.c  |  34 
>  .../gcc.target/i386/vect-fmsubaddXXXpd.c  |  34 
>  .../gcc.target/i386/vect-fmsubaddXXXps.c  |  34 
>  gcc/tree-vect-slp-patterns.c  | 192 +-
>  gcc/tree-vect-slp.c   |   2 +
>  10 files changed, 311 insertions(+), 57 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmaddsubXXXpd.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmaddsubXXXps.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmsubaddXXXpd.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmsubaddXXXps.c
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index bcf1605d147..6fc13c184bf 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -4644,6 +4644,25 @@
>  ;;
>  ;; But this doesn't seem useful in practice.
>
> +(define_expand "vec_fmaddsub4"
> +  [(set (match_operand:VF 0 "register_operand")
> +   (unspec:VF
> + [(match_operand:VF 1 "nonimmediate_operand")
> +  (match_operand:VF 2 "nonimmediate_operand")
> +  (match_operand:VF 3 "nonimmediate_operand")]
> + UNSPEC_FMADDSUB))]
> +  "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F")
> +
> +(define_expand "vec_fmsubadd4"
> +  [(set (match_operand:VF 0 "register_operand")
> +   (unspec:VF
> + [(match_operand:VF 1 "nonimmediate_operand")
> +  (match_operand:VF 2 "nonimmediate_operand")
> +  (neg:VF
> +(match_operand:VF 3 "nonimmediate_operand"))]
> + UNSPEC_FMADDSUB))]
> +  "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F")
> +

With a condition like
  "TARGET_FMA || TARGET_FMA4
   || ( == 64 || TARGET_AVX512VL)"?

The original expander "fmaddsub_" is only used by builtins, which
have their own guard for AVX512VL, so it doesn't matter that it lacks
TARGET_AVX512VL:
BDESC (OPTION_MASK_ISA_AVX512VL, 0,
CODE_FOR_avx512vl_fmaddsub_v4df_mask,
"__builtin_ia32_vfmaddsubpd256_mask",
IX86_BUILTIN_VFMADDSUBPD256_MASK, UNKNOWN, (int)
V4DF_FTYPE_V4DF_V4DF_V4DF_UQI)
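
For readers following the thread, the lane behaviour that the quoted
md.texi text gives for vec_fmaddsub (even lanes subtract, odd lanes add)
can be sketched in scalar C.  This is a reference model only, with
made-up function names; the fmsubadd variant mirrors the quoted expander
by negating the third operand, which is an assumption about intent
rather than something the patch states in prose.

```c
#include <stddef.h>

/* Scalar model of fmaddsub: even lanes compute a*b - c, odd lanes
   compute a*b + c.  */
static void fmaddsub_ref(const double *a, const double *b, const double *c,
                         double *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (i % 2 == 0) ? a[i] * b[i] - c[i] : a[i] * b[i] + c[i];
}

/* Scalar model of fmsubadd, written as fmaddsub applied to a negated
   third operand: even lanes end up a*b + c, odd lanes a*b - c.  */
static void fmsubadd_ref(const double *a, const double *b, const double *c,
                         double *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      double neg_c = -c[i];
      out[i] = (i % 2 == 0) ? a[i] * b[i] - neg_c : a[i] * b[i] + neg_c;
    }
}
```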

>  (define_expand "fmaddsub_"
>[(set (match_operand:VF 0 "register_operand")
> (unspec:VF
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 1b918144330..cc92ebd26aa 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5688,6 +5688,20 @@ Alternating subtract, add with even lanes doing 
> subtract and odd
>  lanes doing addition.  Operands 1 and 2 and the outout operand are vectors
>  with mode @var{m}.
>
> +@cindex @code{vec_fmaddsub@var{m}4} instruction pattern
> +@item @samp{vec_fmaddsub@var{m}4}
> +Alternating multiply subtract, add with even lanes doing subtract and odd
> +lanes doing addition of the third operand to the multiplication result
> +of the first two operands.  Operands 1, 2 and 3 and the outout 

Re: PING: [PATCH] mips: check MSA support for vector modes [PR100760, PR100761, PR100762]

2021-07-05 Thread Paul Hua via Gcc-patches
Looks good to me,  but I have no right to approve.



On Wed, Jun 30, 2021 at 9:17 PM Xi Ruoyao  wrote:
>
> Ping patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573213.html
>
> Status update: bootstrapped with BOOT_CFLAGS="-O3 -mmsa -mloongson-mmi"
> (it failed without the patch), and regtested on mips64el-linux-gnu with
> no new regression.
>
> On Sat, 2021-06-19 at 15:34 +0800, Xi Ruoyao wrote:
> > Check if the vector mode is really supported by MSA in certain cases,
> > instead of testing ISA_HAS_MSA.  Simply testing ISA_HAS_MSA can cause
> > ICE when MSA is enabled besides other MIPS SIMD extensions (notably,
> > Loongson MMI).
> >
> > Bootstrapped and tested on mips64el-linux-gnu.  OK to commit?
> >
> > gcc/
> >
> > * config/mips/mips.c (mips_const_insns): Use
> > MSA_SUPPORTED_MODE_P
> > instead of ISA_HAS_MSA.
> > (mips_expand_vec_unpack): Likewise.
> > (mips_expand_vector_init): Likewise.
> >
> > gcc/testsuite/
> >
> > * testsuite/gcc.target/mips/pr100760.c: New test.
> > * testsuite/gcc.target/mips/pr100761.c: New test.
> > * testsuite/gcc.target/mips/pr100762.c: New test.
> --
> Xi Ruoyao 
>


Re: [PATCH 1/2] CALL_INSN may not be a real function call.

2021-07-05 Thread Hongtao Liu via Gcc-patches
On Tue, Jul 6, 2021 at 8:03 AM Jeff Law via Gcc-patches
 wrote:
>
>
>
> On 7/5/2021 5:30 PM, Segher Boessenkool wrote:
> > Hi!
> >
> > I ran into this in shrink-wrap.c today.
> >
> > On Thu, Jun 03, 2021 at 02:54:07PM +0800, liuhongt via Gcc-patches wrote:
> >> Use "used" flag for CALL_INSN to indicate it's a fake call. If it's a
> >> fake call, it won't have its own function stack.
> > Could you document somewhere what a "fake call" *is*?  Including what
> > that means to RTL, how this is expected to be used, etc.?  In rtl.h is
> > fine with me, but as it is, no one can know when to use this.  What does
> > "its own function stack" mean in the description here?  You can only put
> > FAKE_CALL on functions that do not have a stack frame?  But that is
> > never true on x86, so that cannot be it, unless there isn't a call
> > instruction at all?  But then, why use an RTL call insn for this?
> >
> > Other targets simply do not use an RTL "call" when they want to hide
> > such an instruction, why can't you do that here, wouldn't that work much
> > better?  There are many more insns that you may want to hide.  The
> > traditional solution is to use unspecs, which very directly hides all
> > details.
> It reminds me a bit of millicode calls on the PA or calls to special
> routines in libgcc.  They're calls to functions, but those functions do
> not follow the standard ABI.  I'd like to remove
> INSN_REFERENCES_ARE_DELAYED and instead use the new fake call mechanism,
> but I haven't tried it or even looked at the fake call bits enough to
> know if that's possible.
Fake call is used for TARGET_INSN_CALLEE_ABI, which is used for
vzeroupper in i386.
vzeroupper clobbers the high part of the ymm registers but leaves the low
part unchanged.  Defining it as a call_insn with a special callee ABI lets
RA/CSE know that this instruction kills the high parts of the ymm
registers while still allowing optimization of the low parts.
I didn't handle FAKE_CALL_P thoroughly in the RTL, but only changed
the necessary parts so that my patch could survive the regression
tests (it also fixes some optimization issues I observed).
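
As a rough illustration of the partial clobber described above, the
effect of vzeroupper on the register file can be modelled in plain C.
The struct, the register count and the function name are assumptions
made for this sketch; it models only the architectural effect (upper
128 bits zeroed, lower 128 bits preserved), not how GCC represents it.

```c
#include <stdint.h>

#define MODEL_NUM_YMM 16  /* assumed register count for the sketch */

/* lane[0..1] hold the low 128 bits, lane[2..3] the high 128 bits.  */
typedef struct { uint64_t lane[4]; } ymm_model;

/* Model of vzeroupper: every register loses its high half, while the
   low half keeps whatever value it had.  */
static void model_vzeroupper(ymm_model regs[MODEL_NUM_YMM])
{
  for (int r = 0; r < MODEL_NUM_YMM; r++)
    {
      regs[r].lane[2] = 0;
      regs[r].lane[3] = 0;
    }
}
```

A value that lives only in the low half therefore survives the
instruction, which is the property a full clobber would hide from RA/CSE.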
> jeff



-- 
BR,
Hongtao


Re: [PATCH 1/2] CALL_INSN may not be a real function call.

2021-07-05 Thread Hongtao Liu via Gcc-patches
On Tue, Jul 6, 2021 at 7:31 AM Segher Boessenkool
 wrote:
>
> Hi!
>
> I ran into this in shrink-wrap.c today.
>
> On Thu, Jun 03, 2021 at 02:54:07PM +0800, liuhongt via Gcc-patches wrote:
> > Use "used" flag for CALL_INSN to indicate it's a fake call. If it's a
> > fake call, it won't have its own function stack.
>
> Could you document somewhere what a "fake call" *is*?  Including what
> that means to RTL, how this is expected to be used, etc.?  In rtl.h is
Fake call is used for TARGET_INSN_CALLEE_ABI; I'll add comments for
#define FAKE_CALL_P(RTX) in rtl.h.
> fine with me, but as it is, no one can know when to use this.  What does
> "its own function stack" mean in the description here?  You can only put
> FAKE_CALL on functions that do not have a stack frame?  But that is
> never true on x86, so that cannot be it, unless there isn't a call
> instruction at all?  But then, why use an RTL call insn for this?
>
> Other targets simply do not use an RTL "call" when they want to hide
> such an instruction, why can't you do that here, wouldn't that work much
> better?  There are many more insns that you may want to hide.  The
> traditional solution is to use unspecs, which very directly hides all
> details.

It's explained here,
> >> Yeah.  Initially clobber_high seemed like the best approach for
> >> handling the tlsdesc thing, but in practice it was too difficult
> >> to shoe-horn the concept in after the fact, when so much rtl
> >> infrastructure wasn't prepared to deal with it.  The old support
> >> didn't handle all cases and passes correctly, and handled others
> >> suboptimally.
> >>
> >> I think it would be worth using the same approach as
> >> https://gcc.gnu.org/legacy-ml/gcc-patches/2019-09/msg01466.html for
> >> vzeroupper: represent the instructions as call_insns in which the
> >> call has a special vzeroupper ABI.  I think that's likely to lead
> >> to better code than clobber_high would (or at least, it did for tlsdesc).

refer to [1] for more details
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570265.html
>
>
> Segher



-- 
BR,
Hongtao


[PATCH] dwarf2ctf: the unit of sou field location is bits [PR101283]

2021-07-05 Thread Indu Bhagat via Gcc-patches
If the value of the DW_AT_data_member_location attribute is constant, the
associated unit is bytes. This patch amends incorrect behaviour which was being
exercised with -gdwarf-2. This caused some of the failures as noted in PR
debug/101283 (specifically the BTF tests involving btm_offset).

The testcase ctf-struct-array-2.c was erroneously checking for the value of
ctm_offset in number of bytes.

The patch fixes the calculation of the field location value for a struct member
in dwarf2ctf and adjusts the testcase. This patch also fixes some of the
failing tests as noted in PR debug/101283.
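
The unit change can be checked by hand against the testcase's struct.
The sketch below is only an illustration with a made-up helper name; it
applies the same multiply-by-8 conversion the patch uses and assumes a
4-byte int, as in the adjusted test expectation.

```c
#include <stddef.h>

/* Same layout as the struct in ctf-struct-array-2.c.  */
struct ranges { int from, to; };

/* Hypothetical helper: convert a constant member location given in
   bytes into the bit offset that CTF records for a struct member.  */
static unsigned long member_offset_bits(unsigned long byte_offset)
{
  return byte_offset * 8;
}
```

With a 4-byte int, "to" sits at byte offset 4, so its offset in bits is
32 (0x20), which matches the value the adjusted testcase now scans for
in place of 0x4.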

2021-07-05  Indu Bhagat  

  PR debug/101283 - Several tests fail on Darwin with -gctf/gbtf

gcc/ChangeLog:

PR debug/101283
* dwarf2ctf.c (ctf_get_AT_data_member_location): Multiply by 8 to get
number of bits.

gcc/testsuite/ChangeLog:

PR debug/101283
* gcc.dg/debug/ctf/ctf-struct-array-2.c: Adjust the value in the 
testcase.
---
 gcc/dwarf2ctf.c | 4 ++--
 gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-array-2.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/gcc/dwarf2ctf.c b/gcc/dwarf2ctf.c
index 08e1252..5e8a725 100644
--- a/gcc/dwarf2ctf.c
+++ b/gcc/dwarf2ctf.c
@@ -100,13 +100,13 @@ ctf_get_AT_data_member_location (dw_die_ref die)
  gcc_assert (!descr->dw_loc_oprnd2.v.val_unsigned);
  gcc_assert (descr->dw_loc_oprnd2.val_class
  == dw_val_class_unsigned_const);
- field_location = descr->dw_loc_oprnd1.v.val_unsigned;
+ field_location = descr->dw_loc_oprnd1.v.val_unsigned * 8;
}
   else
{
  attr = get_AT (die, DW_AT_data_member_location);
  if (attr && AT_class (attr) == dw_val_class_const)
-   field_location = AT_int (attr);
+   field_location = AT_int (attr) * 8;
  else
field_location = (get_AT_unsigned (die,
   DW_AT_data_member_location)
diff --git a/gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-array-2.c 
b/gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-array-2.c
index 9e698fd..37094b5 100644
--- a/gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-array-2.c
+++ b/gcc/testsuite/gcc.dg/debug/ctf/ctf-struct-array-2.c
@@ -10,6 +10,6 @@
 /* { dg-final { scan-assembler-times "0x1200\[\t \]+\[^\n\]*ctt_info" 1 } 
} */
 /* { dg-final { scan-assembler-times "\[\t \]0x4\[\t \]+\[^\n\]*cta_nelems" 1 
} } */
 /* { dg-final { scan-assembler-times "\[\t \]0\[\t \]+\[^\n\]*ctm_offset" 1 } 
} */
-/* { dg-final { scan-assembler-times "\[\t \]0x4\[\t \]+\[^\n\]*ctm_offset" 1 
} } */
+/* { dg-final { scan-assembler-times "\[\t \]0x20\[\t \]+\[^\n\]*ctm_offset" 1 
} } */
 
 static struct ranges {int from, to;} lim_regs[] = {{ 16, 7}, { 16, 6}, { 20, 
7},{ 20, 6}};
-- 
1.8.3.1



[PATCH] Fix 101256: Wrong code due to range incorrect from PHI-OPT

2021-07-05 Thread apinski--- via Gcc-patches
From: Andrew Pinski 

So the problem here is that replace_phi_edge_with_variable
will copy range information to an already (not newly) defined
ssa name.  This causes wrong code later on.
This fixes the problem by requiring the new ssa name to
be defined in the same bb as the conditional that is
about to be deleted.

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

Changes from v1:
* this is a simplification of what was trying to be done before.

gcc/ChangeLog:

PR tree-optimization/101256
* dbgcnt.def (phiopt_edge_range): New counter.
* tree-ssa-phiopt.c (replace_phi_edge_with_variable):

gcc/testsuite/ChangeLog:

PR tree-optimization/101256
* g++.dg/torture/pr101256.C: New test.
---
 gcc/dbgcnt.def  |  1 +
 gcc/testsuite/g++.dg/torture/pr101256.C | 28 +
 gcc/tree-ssa-phiopt.c   | 16 --
 3 files changed, 39 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr101256.C

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index 93e7b4fd30e..2345899ba68 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -183,6 +183,7 @@ DEBUG_COUNTER (lim)
 DEBUG_COUNTER (local_alloc_for_sched)
 DEBUG_COUNTER (match)
 DEBUG_COUNTER (merged_ipa_icf)
+DEBUG_COUNTER (phiopt_edge_range)
 DEBUG_COUNTER (postreload_cse)
 DEBUG_COUNTER (pre)
 DEBUG_COUNTER (pre_insn)
diff --git a/gcc/testsuite/g++.dg/torture/pr101256.C 
b/gcc/testsuite/g++.dg/torture/pr101256.C
new file mode 100644
index 000..973a8b4caf3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr101256.C
@@ -0,0 +1,28 @@
+// { dg-do run }
+
+template 
+const T& max(const T& a, const T& b)
+{
+return (a < b) ? b : a;
+}
+
+signed char var_5 = -128;
+unsigned int var_11 = 2144479212U;
+unsigned long long int arr [22];
+
+void
+__attribute__((noipa))
+test(signed char var_5, unsigned var_11) {
+  for (short i_61 = 0; i_61 < var_5 + 149; i_61 += 1)
+arr[i_61] = max((signed char)0, var_5) ? max((signed char)1, var_5) : 
var_11;
+}
+
+int main() {
+  for (int i_0 = 0; i_0 < 22; ++i_0) 
+  arr [i_0] = 11834725929543695741ULL;
+
+  test(var_5, var_11);
+  if (arr [0] != 2144479212ULL && arr [0] != 11834725929543695741ULL)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
index ab63bf699e3..8b60ee81082 100644
--- a/gcc/tree-ssa-phiopt.c
+++ b/gcc/tree-ssa-phiopt.c
@@ -51,6 +51,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "internal-fn.h"
 #include "gimple-range.h"
 #include "gimple-match.h"
+#include "dbgcnt.h"
 
 static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
 static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
@@ -390,7 +391,7 @@ replace_phi_edge_with_variable (basic_block cond_block,
   gimple_stmt_iterator gsi;
   tree phi_result = PHI_RESULT (phi);
 
-  /* Duplicate range info if we're the only things setting the target PHI.
+  /* Duplicate range info if they are the only things setting the target PHI.
  This is needed as later on, the new_tree will be replacing
  The assignement of the PHI.
  For an example:
@@ -398,19 +399,22 @@ replace_phi_edge_with_variable (basic_block cond_block,
  _4 = min
  goto bb2
 
- range<-INF,255>
+ # RANGE [-INF, 255]
  a_3 = PHI<_4(1)>
  bb3:
 
  use(a_3)
- And _4 gets prograted into the use of a_3 and losing the range info.
- This can't be done for more than 2 incoming edges as the progration
- won't happen.  */
+ And _4 gets propagated into the use of a_3 and losing the range info.
+ This can't be done for more than 2 incoming edges as the propagation
+ won't happen.
+ The new_tree needs to be defined in the same basic block as the 
conditional.  */
   if (TREE_CODE (new_tree) == SSA_NAME
   && EDGE_COUNT (gimple_bb (phi)->preds) == 2
   && INTEGRAL_TYPE_P (TREE_TYPE (phi_result))
   && !SSA_NAME_RANGE_INFO (new_tree)
-  && SSA_NAME_RANGE_INFO (phi_result))
+  && SSA_NAME_RANGE_INFO (phi_result)
+  && gimple_bb (SSA_NAME_DEF_STMT (new_tree)) == cond_block
+  && dbg_cnt (phiopt_edge_range))
 duplicate_ssa_name_range_info (new_tree,
   SSA_NAME_RANGE_TYPE (phi_result),
   SSA_NAME_RANGE_INFO (phi_result));
-- 
2.27.0



[PATCH] libffi/x86: Always check __x86_64__ for x86 hosts

2021-07-05 Thread H.J. Lu via Gcc-patches
Since -m32 generates i386 code on gnux32 hosts, always check __x86_64__
for x86 hosts.
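
The idea behind the check can be sketched in C: the choice follows the
compiler's predefined macro for the flags in use rather than the host
triplet.  The function name is hypothetical, and configure.host itself
does this with a compile test in shell, not with code like this.

```c
/* Hypothetical helper: report which libffi TARGET the current compiler
   flags would select, based only on the predefined macro.  */
static const char *x86_target_from_macro(void)
{
#ifdef __x86_64__
  return "X86_64";   /* x86-64 code, including the x32 ABI */
#else
  return "X86";      /* plain i386 code, e.g. when built with -m32 */
#endif
}
```

Because the macro reflects the code actually being generated, a gnux32
host compiling with -m32 would pick X86 instead of being forced to X86_64.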

PR libffi/101336
* configure.host: Always check __x86_64__ for x86 hosts.
---
 libffi/configure.host | 21 +++--
 1 file changed, 7 insertions(+), 14 deletions(-)

diff --git a/libffi/configure.host b/libffi/configure.host
index 786b32c5bb0..7248acb7458 100644
--- a/libffi/configure.host
+++ b/libffi/configure.host
@@ -95,20 +95,13 @@ case "${host}" in
   i?86-*-* | x86_64-*-* | amd64-*)
TARGETDIR=x86
if test $ac_cv_sizeof_size_t = 4; then
- case "$host" in
-   *-gnux32)
- TARGET=X86_64
- ;;
-   *)
- echo 'int foo (void) { return __x86_64__; }' > conftest.c
- if $CC $CFLAGS -Werror -S conftest.c -o conftest.s > /dev/null 
2>&1; then
-   TARGET=X86_64;
- else
-   TARGET=X86;
- fi
- rm -f conftest.*
- ;;
-  esac
+ echo 'int foo (void) { return __x86_64__; }' > conftest.c
+ if $CC $CFLAGS -Werror -S conftest.c -o conftest.s > /dev/null 2>&1; 
then
+   TARGET=X86_64;
+ else
+   TARGET=X86;
+  fi
+  rm -f conftest.*
else
  TARGET=X86_64;
fi
-- 
2.31.1



[COMMITTED] CTF, BTF testsuite: Use -gdwarf-4 for restrict type qualifier [PR101283]

2021-07-05 Thread Indu Bhagat via Gcc-patches
[Committed as obvious.]

DWARF DIEs do not contain DW_TAG_restrict_type when DWARF version is 2. CTF/BTF
generation feeds off DWARF DIEs, and as such, CTF records of kind
CTF_K_RESTRICT cease to be generated when DWARF version is 2.

This patch fixes the failure of these testcases on Darwin by using an explicit
-gdwarf-4 in the dg-options. This keeps the CTF record generation for restrict
type qualifier tested.

  PR debug/101283 - Several tests fail on Darwin with -gctf/gbtf

2021-07-05  Indu Bhagat  

gcc/testsuite/ChangeLog:

PR debug/101283
* gcc.dg/debug/btf/btf-cvr-quals-1.c: Use -gdwarf-4 on Darwin targets.
* gcc.dg/debug/ctf/ctf-cvr-quals-1.c: Likewise.
---
 gcc/testsuite/gcc.dg/debug/btf/btf-cvr-quals-1.c | 1 +
 gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-1.c | 1 +
 2 files changed, 2 insertions(+)

diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-cvr-quals-1.c 
b/gcc/testsuite/gcc.dg/debug/btf/btf-cvr-quals-1.c
index 79e9f52..33e2f64 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-cvr-quals-1.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-cvr-quals-1.c
@@ -23,6 +23,7 @@
 
 /* { dg-do compile } */
 /* { dg-options "-O0 -gbtf -dA" } */
+/* { dg-options "-O0 -gbtf -gdwarf-4 -dA" { target { *-*-darwin* } } } */
 
 /* { dg-final { scan-assembler-times "ascii \"int.0\"\[\t 
\]+\[^\n\]*btf_string" 1 } } */
 
diff --git a/gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-1.c 
b/gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-1.c
index 9368d47..0137e9d 100644
--- a/gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-1.c
+++ b/gcc/testsuite/gcc.dg/debug/ctf/ctf-cvr-quals-1.c
@@ -31,6 +31,7 @@
 
 /* { dg-do compile )  */
 /* { dg-options "-O0 -gctf -dA" } */
+/* { dg-options "-O0 -gctf -gdwarf-4 -dA" { target { *-*-darwin* } } } */
 
 /* { dg-final { scan-assembler-times "ascii \"int.0\"\[\t 
\]+\[^\n\]*ctf_string" 1 } } */
 /* { dg-final { scan-assembler-times "\[\t \]0\[\t \]+\[^\n\]*ctt_name" 7 } } 
*/
-- 
1.8.3.1



Re: [PATCH 1/2] CALL_INSN may not be a real function call.

2021-07-05 Thread Jeff Law via Gcc-patches




On 7/5/2021 5:30 PM, Segher Boessenkool wrote:

Hi!

I ran into this in shrink-wrap.c today.

On Thu, Jun 03, 2021 at 02:54:07PM +0800, liuhongt via Gcc-patches wrote:

Use "used" flag for CALL_INSN to indicate it's a fake call. If it's a
fake call, it won't have its own function stack.

Could you document somewhere what a "fake call" *is*?  Including what
that means to RTL, how this is expected to be used, etc.?  In rtl.h is
fine with me, but as it is, no one can know when to use this.  What does
"its own function stack" mean in the description here?  You can only put
FAKE_CALL on functions that do not have a stack frame?  But that is
never true on x86, so that cannot be it, unless there isn't a call
instruction at all?  But then, why use an RTL call insn for this?

Other targets simply do not use an RTL "call" when they want to hide
such an instruction, why can't you do that here, wouldn't that work much
better?  There are many more insns that you may want to hide.  The
traditional solution is to use unspecs, which very directly hides all
details.
It reminds me a bit of millicode calls on the PA or calls to special 
routines in libgcc.  They're calls to functions, but those functions do 
not follow the standard ABI.  I'd like to remove 
INSN_REFERENCES_ARE_DELAYED and instead use the new fake call mechanism, 
but I haven't tried it or even looked at the fake call bits enough to 
know if that's possible.


jeff


Re: [PATCH 1/2] CALL_INSN may not be a real function call.

2021-07-05 Thread Segher Boessenkool
Hi!

I ran into this in shrink-wrap.c today.

On Thu, Jun 03, 2021 at 02:54:07PM +0800, liuhongt via Gcc-patches wrote:
> Use "used" flag for CALL_INSN to indicate it's a fake call. If it's a
> fake call, it won't have its own function stack.

Could you document somewhere what a "fake call" *is*?  Including what
that means to RTL, how this is expected to be used, etc.?  In rtl.h is
fine with me, but as it is, no one can know when to use this.  What does
"its own function stack" mean in the description here?  You can only put
FAKE_CALL on functions that do not have a stack frame?  But that is
never true on x86, so that cannot be it, unless there isn't a call
instruction at all?  But then, why use an RTL call insn for this?

Other targets simply do not use an RTL "call" when they want to hide
such an instruction, why can't you do that here, wouldn't that work much
better?  There are many more insns that you may want to hide.  The
traditional solution is to use unspecs, which very directly hides all
details.


Segher


Re: [PATCH] build: Implement --with-multilib-list for avr target

2021-07-05 Thread Matt Jacobson via Gcc-patches



> On Jun 7, 2021, at 3:30 AM, Matt Jacobson  wrote:
> 
> The AVR target builds a lot of multilib variants of target libraries by 
> default,
> and I found myself wanting to use the --with-multilib-list argument to limit
> what I was building, to shorten build times.  This patch implements that 
> option
> for the AVR target.
> 
> Tested by configuring and building an AVR compiler and target libs on macOS.
> 
> I don't have commit access, so if this patch is suitable, I'd need someone 
> else
> to commit it for me.  Thanks.

Ping.  (Please let me know if I’ve made some process error here; this is my 
first change to GCC.)

Original mail:


Thanks.

Re: [PATCH 1/5] Fix 101256: Wrong code due to range incorrect from PHI-OPT

2021-07-05 Thread Andrew Pinski via Gcc-patches
On Mon, Jul 5, 2021 at 4:26 AM Richard Biener via Gcc-patches
 wrote:
>
> On Sun, Jul 4, 2021 at 8:40 PM apinski--- via Gcc-patches
>  wrote:
> >
> > From: Andrew Pinski 
> >
> > So the problem here is that replace_phi_edge_with_variable
> > will copy range information to a already (not newly) defined
> > ssa name.  This causes wrong code later on.
>
> That's a bit too conservative I guess?  Shouldn't it work for at least
> all defs defined in the same block as the original conditional (and
> thus then applying to the seq inserted there by the callers)?

Yes, that simplifies the change even further and still provides the
needed ranges.
I should have a patch to submit later today.

Thanks,
Andrew Pinski

>
> I realize it's wrong for, say
>
>   _1 = ..
>  if (_1 != 0)
>{
>  ...
> if (..)
>;
>  # _2 = PHI <_1, 1>
> ...
>}
>
> with _2 having range [1, +INF] but clearly not _1 at the point of its
> definition.
>
> Richard.
>
> > This patch fixes the problem by requiring there to be statements
> > that are to be placed before the conditional to be able to
> > copy the range info; this assumes the statements will define
> > the ssa name.
> >
> > gcc/ChangeLog:
> >
> > PR tree-optimization/101256
> > * dbgcnt.def (phiopt_edge_range): New counter.
> > * tree-ssa-phiopt.c (replace_phi_edge_with_variable):
> > Add optional sequence which will be added before the old
> > conditional. Check sequence for non-null if we want to
> > update the range.
> > (two_value_replacement): Instead of inserting the sequence,
> > update the call to replace_phi_edge_with_variable.
> > (match_simplify_replacement): Likewise.
> > (minmax_replacement): Likewise.
> > (value_replacement): Create a sequence of statements
> > which would have defined the ssa name.  Update call
> > to replace_phi_edge_with_variable.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR tree-optimization/101256
> > * g++.dg/torture/pr101256.C: New test.
> > ---
> >  gcc/dbgcnt.def  |  1 +
> >  gcc/testsuite/g++.dg/torture/pr101256.C | 28 +
> >  gcc/tree-ssa-phiopt.c   | 52 ++---
> >  3 files changed, 59 insertions(+), 22 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/torture/pr101256.C
> >
> > diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
> > index 93e7b4fd30e..2345899ba68 100644
> > --- a/gcc/dbgcnt.def
> > +++ b/gcc/dbgcnt.def
> > @@ -183,6 +183,7 @@ DEBUG_COUNTER (lim)
> >  DEBUG_COUNTER (local_alloc_for_sched)
> >  DEBUG_COUNTER (match)
> >  DEBUG_COUNTER (merged_ipa_icf)
> > +DEBUG_COUNTER (phiopt_edge_range)
> >  DEBUG_COUNTER (postreload_cse)
> >  DEBUG_COUNTER (pre)
> >  DEBUG_COUNTER (pre_insn)
> > diff --git a/gcc/testsuite/g++.dg/torture/pr101256.C 
> > b/gcc/testsuite/g++.dg/torture/pr101256.C
> > new file mode 100644
> > index 000..973a8b4caf3
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/torture/pr101256.C
> > @@ -0,0 +1,28 @@
> > +// { dg-do run }
> > +
> > > +template<typename T>
> > +const T& max(const T& a, const T& b)
> > +{
> > +return (a < b) ? b : a;
> > +}
> > +
> > +signed char var_5 = -128;
> > +unsigned int var_11 = 2144479212U;
> > +unsigned long long int arr [22];
> > +
> > +void
> > +__attribute__((noipa))
> > +test(signed char var_5, unsigned var_11) {
> > +  for (short i_61 = 0; i_61 < var_5 + 149; i_61 += 1)
> > +arr[i_61] = max((signed char)0, var_5) ? max((signed char)1, var_5) : 
> > var_11;
> > +}
> > +
> > +int main() {
> > +  for (int i_0 = 0; i_0 < 22; ++i_0)
> > +  arr [i_0] = 11834725929543695741ULL;
> > +
> > +  test(var_5, var_11);
> > +  if (arr [0] != 2144479212ULL && arr [0] != 11834725929543695741ULL)
> > +__builtin_abort ();
> > +  return 0;
> > +}
> > diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> > index ab12e85569d..71f0019d877 100644
> > --- a/gcc/tree-ssa-phiopt.c
> > +++ b/gcc/tree-ssa-phiopt.c
> > @@ -50,6 +50,7 @@ along with GCC; see the file COPYING3.  If not see
> >  #include "gimple-fold.h"
> >  #include "internal-fn.h"
> >  #include "gimple-range.h"
> > +#include "dbgcnt.h"
> >
> >  static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
> >  static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
> > @@ -73,7 +74,8 @@ static bool cond_store_replacement (basic_block, 
> > basic_block, edge, edge,
> > hash_set *);
> >  static bool cond_if_else_store_replacement (basic_block, basic_block, 
> > basic_block);
> >  static hash_set * get_non_trapping ();
> > -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, 
> > tree);
> > +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, 
> > tree,
> > +   gimple_seq = NULL);
> >  static void hoist_adjacent_loads (basic_block, basic_block,
> >   

[committed] Remove redundant compare in shift loop on H8

2021-07-05 Thread Jeff Law via Gcc-patches


As I've mentioned elsewhere, the H8, particularly the early models, has
very limited shift capabilities, including no inherent support for shift
by a variable amount. Naturally GCC accommodates this by emitting a
suitable loop.


The shift loop has a typical sequence.  Shift, decrement counter, test 
counter, branch to top of loop if counter hasn't reached zero.  These 
are emitted as RTL fairly late in the pipeline (after prologue/epilogue 
generation).   Prior to removal of cc0 the existing machinery was 
capable of removing the test of the counter and instead relying on the 
condition codes set by the decrement.


Because this expansion happens so late (after cmpelim) we were failing 
to remove the unnecessary test.  My first attempt moved the expansion to 
a slightly earlier point, but the point where we want it doesn't 
necessarily have register lifetime information, which the patterns will 
try to exploit to generate better code.


This is a slightly different approach.  It leaves the expansion where it
is, but generates condition code aware RTL.  So the expansion is now:
shift, decrement counter & update condition codes, branch to top of loop
if counter hasn't reached zero.  That is, it natively handles the
condition codes.


This eliminates one cmp insn in every shift-by-variable-amount loop.
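
For readers who don't have the H8 patterns in their head, here is a rough C model of the two loop shapes.  It is only a sketch, not the RTL the port emits: the function names and the explicit zero_flag variable are made up for illustration, with zero_flag standing in for the Z condition code bit set by the decrement.  The sketch also assumes the count is at least 1.

```c
#include <stdint.h>

/* Old shape: shift, decrement, then a separate test of the counter.
   Sketch assumption: count >= 1, so the body runs at least once.  */
uint16_t shift_left_with_test(uint16_t value, uint8_t count)
{
    do {
        value = (uint16_t)(value << 1); /* shift by one bit */
        count = (uint8_t)(count - 1);   /* decrement the counter */
    } while (count != 0);               /* separate compare against zero */
    return value;
}

/* New shape: the decrement itself yields the "is zero" flag and the
   branch consumes that flag, so no separate compare is needed.  */
uint16_t shift_left_flag_aware(uint16_t value, uint8_t count)
{
    int zero_flag;
    do {
        value = (uint16_t)(value << 1);
        count = (uint8_t)(count - 1);
        zero_flag = (count == 0);       /* models CC set by the decrement */
    } while (!zero_flag);               /* branch on the flag directly */
    return value;
}
```

Both functions compute the same result; the only difference is where the zero test comes from, which is the compare the commit removes.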

Committed to the trunk after the usual testing.

Jeff
commit 1562c7987be115311a75b1074c3768a1b006adb6
Author: Jeff Law 
Date:   Mon Jul 5 17:23:43 2021 -0400

Remove redundant compare in shift loop on H8

gcc/ChangeLog

* config/h8300/shiftrotate.md (shift-by-variable patterns): Update to
generate condition code aware RTL directly.

diff --git a/gcc/config/h8300/shiftrotate.md b/gcc/config/h8300/shiftrotate.md
index 0476324bf22..485303cb906 100644
--- a/gcc/config/h8300/shiftrotate.md
+++ b/gcc/config/h8300/shiftrotate.md
@@ -385,10 +385,15 @@
(parallel
  [(set (match_dup 0)
   (match_op_dup 2 [(match_dup 0) (const_int 1)]))
-  (clobber (scratch:QI))])
-   (set (match_dup 1) (plus:QI (match_dup 1) (const_int -1)))
+  (clobber (reg:CC CC_REG))])
+   (parallel
+ [(set (reg:CCZN CC_REG)
+  (compare:CCZN
+(plus:QI (match_dup 1) (const_int -1))
+(const_int 0)))
+  (set (match_dup 1) (plus:QI (match_dup 1) (const_int -1)))])
(set (pc)
-(if_then_else (ne (match_dup 1) (const_int 0))
+(if_then_else (ne (reg:CCZN CC_REG) (const_int 0))
  (label_ref (match_dup 4))
  (pc)))
(match_dup 5)]
@@ -416,10 +421,15 @@
(parallel
  [(set (match_dup 0)
   (match_op_dup 2 [(match_dup 0) (const_int 1)]))
-  (clobber (scratch:QI))])
-   (set (match_dup 3) (plus:QI (match_dup 3) (const_int -1)))
+  (clobber (reg:CC CC_REG))])
+   (parallel
+ [(set (reg:CCZN CC_REG)
+  (compare:CCZN
+(plus:QI (match_dup 3) (const_int -1))
+(const_int 0)))
+  (set (match_dup 3) (plus:QI (match_dup 3) (const_int -1)))])
(set (pc)
-(if_then_else (ne (match_dup 3) (const_int 0))
+(if_then_else (ne (reg:CCZN CC_REG) (const_int 0))
  (label_ref (match_dup 4))
  (pc)))
(match_dup 5)]


Re: [PATCH] Darwin, configury : Allow for specification and detection of dsymutil.

2021-07-05 Thread Joseph Myers
On Mon, 5 Jul 2021, Iain Sandoe wrote:

> Hello Joseph,
> 
> > On 5 Jul 2021, at 21:21, Joseph Myers  wrote:
> > 
> > On Sun, 4 Jul 2021, Iain Sandoe wrote:
> > 
> >>* configure.ac: Handle --with-dsymutil in the same way as we
> >>do for the assembler and linker.  (DEFAULT_DSYMUTIL): New.
> >>Extract the type and version for the dsymutil configured or
> >>found by the default searches.
> > 
> > This is missing documentation of --with-dsymutil in install.texi.
> 
> oops, sorry.
> 
> following the style of the other entries would this be suitable?
> 
> @item --with-dsymutil=@var{pathname}
> Same as @uref{#with-as,,@option{--with-as}}
> but for the debug linker (only used on Darwin platforms so far).

Yes.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Darwin, configury : Allow for specification and detection of dsymutil.

2021-07-05 Thread Iain Sandoe
Hello Joseph,

> On 5 Jul 2021, at 21:21, Joseph Myers  wrote:
> 
> On Sun, 4 Jul 2021, Iain Sandoe wrote:
> 
>>  * configure.ac: Handle --with-dsymutil in the same way as we
>>  do for the assembler and linker.  (DEFAULT_DSYMUTIL): New.
>>  Extract the type and version for the dsymutil configured or
>>  found by the default searches.
> 
> This is missing documentation of --with-dsymutil in install.texi.

oops, sorry.

following the style of the other entries would this be suitable?

@item --with-dsymutil=@var{pathname}
Same as @uref{#with-as,,@option{--with-as}}
but for the debug linker (only used on Darwin platforms so far).


thanks
Iain



Re: [PATCH] Darwin, configury : Allow for specification and detection of dsymutil.

2021-07-05 Thread Joseph Myers
On Sun, 4 Jul 2021, Iain Sandoe wrote:

>   * configure.ac: Handle --with-dsymutil in the same way as we
>   do for the assembler and linker.  (DEFAULT_DSYMUTIL): New.
>   Extract the type and version for the dsymutil configured or
>   found by the default searches.

This is missing documentation of --with-dsymutil in install.texi.

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] i386: Implement 4-byte vector (V4QI/V2HI) constant permutations [PR100637]

2021-07-05 Thread Uros Bizjak via Gcc-patches
2021-07-05  Uroš Bizjak  

gcc/
PR target/100637
* config/i386/i386-expand.c (ix86_split_mmx_punpck):
Handle V4QI and V2HI modes.
(expand_vec_perm_blend): Allow 4-byte vector modes with TARGET_SSE4_1.
Handle V4QI mode. Emit mmx_pblendvb32 for 4-byte modes.
(expand_vec_perm_pshufb): Rewrite to use switch statements.
Handle 4-byte dual operands with TARGET_XOP and single operands
with TARGET_SSSE3.  Emit mmx_ppermv32 for TARGET_XOP and
mmx_pshufbv4qi3 for TARGET_SSSE3.
(expand_vec_perm_pblendv): Allow 4-byte vector modes with TARGET_SSE4_1.
(expand_vec_perm_interleave2): Allow 4-byte vector modes.
(expand_vec_perm_pshufb2): Allow 4-byte vector modes with TARGET_SSSE3.
(expand_vec_perm_even_odd_1): Handle V4QI mode.
(expand_vec_perm_broadcast_1): Handle V4QI mode.
(ix86_vectorize_vec_perm_const): Handle V4QI mode.
* config/i386/mmx.md (mmx_ppermv32): New insn pattern.
(mmx_pshufbv4qi3): Ditto.
(*mmx_pblendw32): Ditto.
(*mmx_pblendw64): Rename from *mmx_pblendw.
(mmx_punpckhbw_low): New insn_and_split pattern.
(mmx_punpcklbw_low): Ditto.

All permutations are already checked in gcc.target/i386/vperm-v4qi.c.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index b37642e35ee..7f74653722c 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -933,6 +933,7 @@ ix86_split_mmx_punpck (rtx operands[], bool high_p)
 
   switch (mode)
 {
+case E_V4QImode:
 case E_V8QImode:
   sse_mode = V16QImode;
   double_sse_mode = V32QImode;
@@ -949,6 +950,7 @@ ix86_split_mmx_punpck (rtx operands[], bool high_p)
   break;
 
 case E_V4HImode:
+case E_V2HImode:
   sse_mode = V8HImode;
   double_sse_mode = V16HImode;
   mask = gen_rtx_PARALLEL (VOIDmode,
@@ -991,7 +993,7 @@ ix86_split_mmx_punpck (rtx operands[], bool high_p)
   rtx insn = gen_rtx_SET (dest, op2);
   emit_insn (insn);
 
-  /* Move bits 64:127 to bits 0:63.  */
+  /* Move high bits to low bits.  */
   if (high_p)
 {
   if (sse_mode == V4SFmode)
@@ -1004,9 +1006,19 @@ ix86_split_mmx_punpck (rtx operands[], bool high_p)
}
   else
{
- mask = gen_rtx_PARALLEL (VOIDmode,
-  gen_rtvec (4, GEN_INT (2), GEN_INT (3),
- GEN_INT (0), GEN_INT (1)));
+ int sz = GET_MODE_SIZE (mode);
+
+ if (sz == 4)
+   mask = gen_rtx_PARALLEL (VOIDmode,
+gen_rtvec (4, GEN_INT (1), GEN_INT (0),
+   GEN_INT (0), GEN_INT (1)));
+ else if (sz == 8)
+   mask = gen_rtx_PARALLEL (VOIDmode,
+gen_rtvec (4, GEN_INT (2), GEN_INT (3),
+   GEN_INT (0), GEN_INT (1)));
+ else
+   gcc_unreachable ();
+
  dest = lowpart_subreg (V4SImode, dest, GET_MODE (dest));
  op1 = gen_rtx_VEC_SELECT (V4SImode, dest, mask);
}
@@ -17331,7 +17343,8 @@ expand_vec_perm_blend (struct expand_vec_perm_d *d)
   else if (TARGET_AVX && (vmode == V4DFmode || vmode == V8SFmode))
 ;
   else if (TARGET_SSE4_1 && (GET_MODE_SIZE (vmode) == 16
-|| GET_MODE_SIZE (vmode) == 8))
+|| GET_MODE_SIZE (vmode) == 8
+|| GET_MODE_SIZE (vmode) == 4))
 ;
   else
 return false;
@@ -17408,7 +17421,9 @@ expand_vec_perm_blend (struct expand_vec_perm_d *d)
vperm = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, rperm));
vperm = force_reg (vmode, vperm);
 
-   if (GET_MODE_SIZE (vmode) == 8)
+   if (GET_MODE_SIZE (vmode) == 4)
+ emit_insn (gen_mmx_pblendvb32 (target, op0, op1, vperm));
+   else if (GET_MODE_SIZE (vmode) == 8)
  emit_insn (gen_mmx_pblendvb64 (target, op0, op1, vperm));
else if (GET_MODE_SIZE (vmode) == 16)
  emit_insn (gen_sse4_1_pblendvb (target, op0, op1, vperm));
@@ -17440,6 +17455,16 @@ expand_vec_perm_blend (struct expand_vec_perm_d *d)
   vmode = V4HImode;
   goto do_subreg;
 
+case E_V4QImode:
+  for (i = 0; i < 4; i += 2)
+   if (d->perm[i] + 1 != d->perm[i + 1])
+ goto use_pblendvb;
+
+  for (i = 0; i < 2; ++i)
+   mask |= (d->perm[i * 2] >= 4) << i;
+  vmode = V2HImode;
+  goto do_subreg;
+
 case E_V32QImode:
   /* See if bytes move in pairs.  If not, vpblendvb must be used.  */
   for (i = 0; i < 32; i += 2)
@@ -17697,163 +17722,176 @@ expand_vec_perm_pshufb (struct expand_vec_perm_d *d)
   nelt = d->nelt;
 
   if (!d->one_operand_p)
-{
-  if (GET_MODE_SIZE (d->vmode) == 8)
-   {
- if (!TARGET_XOP)
-   return false;
- vmode = V8QImode;
-   }

contracts library support (was Re: [PATCH] PING implement pre-c++20 contracts)

2021-07-05 Thread Jason Merrill via Gcc-patches

On 6/26/21 10:23 AM, Andrew Sutton wrote:


I ended up taking over this work from Jeff (CC'd on his existing email
address). I scraped all the contracts changes into one big patch
against master. See attached. The ChangeLog.contracts files list the
sum of changes for the patch, not the full history of the work.


Jonathan, can you advise where the library support should go?

In N4820  was part of the language-support clause, which makes 
sense, but it uses string_view, which brings in a lot of the rest of the 
library.  Did LWG talk about this when contracts went in?  How are 
freestanding implementations expected to support contracts?


I imagine the header should be  for now.

You've previously mentioned that various current experimental features 
don't appear in libstdc++.so; that is not true of the current patch.


I see that https://github.com/arcosuc3m/clang-contracts takes the
approach of teaching the compiler about std::contract_violation,
building up an object, and passing it to the handler directly, much like
we do for initializer_list.  Their equivalent of __on_contract_violation 
is an internal function emitted in each translation unit that needs it, 
so it doesn't need to affect the library ABI.  These both seem like 
improvements to me.


More complicated is the question of the default violation handler: the 
lock3 implementation calls it "handle_contract_violation" in the global 
namespace, and overriding it is done with ELF symbol interposition, much 
like the replaceable allocation functions.  That approach seems 
reasonable, but I'd think we should use a reserved name, e.g. 
::__handle_contract_violation or __cxxabiv1::__contract_violation_handler.


The clang implementation above involves specifying the name of the 
handler on the compiler command line, which seems problematic, as it 
would tend to mean multiple independent violation handlers active at the 
same time.  Their default handler is std::terminate, which does avoid 
needing to add the default handler to the library.


Jason



Re: [patch, fortran] Fix PR 100227, write with implied DO loop

2021-07-05 Thread Jerry D via Gcc-patches

Looks OK Thomas,

Good for backport as well.

Regards,

Jerry

On 7/4/21 9:09 AM, Thomas Koenig via Fortran wrote:

Hello world,

after a bit of an absence, I am now back, at least for some regression
fixing (and for reviewing patches, if that is called for).

So, here's a regression fix to start with.

OK for trunk and affected branches (down to 9)?

Best regards

Thomas

Do not replace variable op variable in I/O implied DO loop replacement.

This PR came about because index expressions of the form k+k in
implied DO loops in I/O statements were considered for replacement
by array slices.

Fixed by only doing the transformation if the expression is of the
form expr OP constant.
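
To make the condition concrete, here is a minimal C sketch of such a check.  It is not the code in frontend-passes.c: the struct, the enum and the function name are hypothetical and much simpler than the real gfc_expr.  It reads "expr OP constant" literally, i.e. the right-hand operand must be a constant; whether a constant on the left is also accepted is not stated above.

```c
#include <stddef.h>

/* Hypothetical, simplified expression node (not gfortran's gfc_expr).  */
enum expr_kind { EXPR_CONSTANT, EXPR_VARIABLE, EXPR_OP };

struct expr {
    enum expr_kind kind;
    const struct expr *lhs;  /* operands, only meaningful for EXPR_OP */
    const struct expr *rhs;
};

/* Returns 1 if e has the shape "expr OP constant", else 0.
   With this test k+1 qualifies, while k+k does not.  */
int is_expr_op_constant(const struct expr *e)
{
    if (e == NULL || e->kind != EXPR_OP)
        return 0;
    if (e->lhs == NULL || e->rhs == NULL)
        return 0;
    return e->rhs->kind == EXPR_CONSTANT;
}
```

An index such as k+k, the case from the PR, fails the test and is therefore left alone.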

gcc/fortran/ChangeLog:

    PR fortran/100227
    * frontend-passes.c (traverse_io_block): Adjust test for
when a variable is eligible for the transformation to
array slice.

gcc/testsuite/ChangeLog:

    PR fortran/100227
    * gfortran.dg/implied_do_io_7.f90: New test.




[PATCH] musl: always use 'lib' directory for all x86_64 ABIs [PR90077]

2021-07-05 Thread Sergei Trofimovich via Gcc-patches
From: Sergei Trofimovich 

The musl library intentionally does not support the glibc-style multilib
layout and usually assumes --libdir=lib (Gentoo and Alpine Linux both use it).

Before the change, a --disable-multilib x86_64-gentoo-linux-musl compiler returned:

$ gcc -print-multi-os-directory
../lib64
$ gcc -print-multi-os-directory -m32
../lib32
$ gcc -print-multi-os-directory -mx32
../libx32

After the change the layout is always the same:

$ gcc -print-multi-os-directory
../lib
$ gcc -print-multi-os-directory -m32
../lib
$ gcc -print-multi-os-directory -mx32
../lib

The discrepancy was noticed in the meson build system, which uses
-print-multi-os-directory to find the target directory.

Debian's multi-arch setup should not change.
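
The before/after listings above can be summarised as a small lookup.  The C sketch below is illustration only: the function name and the "patched" flag are made up, it covers only the musl target shown above, and it ignores multiarch suffixes.

```c
#include <string.h>

/* Hypothetical helper: the directory -print-multi-os-directory reports
   on an x86_64 musl target for a given ABI option, before (patched == 0)
   and after (patched != 0) the change.  */
const char *musl_multi_os_directory(const char *abi, int patched)
{
    if (patched)
        return "../lib";        /* after: the same for every ABI */
    if (strcmp(abi, "m32") == 0)
        return "../lib32";      /* before: -m32 */
    if (strcmp(abi, "mx32") == 0)
        return "../libx32";     /* before: -mx32 */
    return "../lib64";          /* before: default 64-bit ABI */
}
```

A build system that derives its libdir from this value gets three different answers before the change and a single one after it.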

PR target/90077

gcc/ChangeLog

* gcc/config.gcc: Special case musl to t-linux64-musl.

* gcc/config/i386/t-linux64-musl: New file based on t-linux64
that pins MULTILIB_OSDIRNAMES to lib.
---
 gcc/config.gcc | 10 --
 gcc/config/i386/t-linux64-musl | 28 
 2 files changed, 36 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/i386/t-linux64-musl

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 0230bb88861..a87a59c9403 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1923,7 +1923,10 @@ i[34567]86-*-linux* | i[34567]86-*-kfreebsd*-gnu | 
i[34567]86-*-gnu* | i[34567]8
if test x$enable_targets = xall; then
tm_file="${tm_file} i386/x86-64.h 
i386/gnu-user-common.h i386/gnu-user64.h i386/linux-common.h i386/linux64.h"
tm_defines="${tm_defines} TARGET_BI_ARCH=1"
-   tmake_file="${tmake_file} i386/t-linux64"
+   case $target in
+   *-*-*musl*) tmake_file="${tmake_file} 
i386/t-linux64-musl";;
+   *) tmake_file="${tmake_file} i386/t-linux64";;
+   esac
x86_multilibs="${with_multilib_list}"
if test "$x86_multilibs" = "default"; then
x86_multilibs="m64,m32"
@@ -1983,7 +1986,10 @@ x86_64-*-linux* | x86_64-*-kfreebsd*-gnu)
tm_file="${tm_file} kfreebsd-gnu.h i386/kfreebsd-gnu64.h"
;;
esac
-   tmake_file="${tmake_file} i386/t-linux64"
+   case $target in
+   *-*-*musl*) tmake_file="${tmake_file} i386/t-linux64-musl";;
+   *) tmake_file="${tmake_file} i386/t-linux64";;
+   esac
x86_multilibs="${with_multilib_list}"
if test "$x86_multilibs" = "default"; then
case ${with_abi} in
diff --git a/gcc/config/i386/t-linux64-musl b/gcc/config/i386/t-linux64-musl
new file mode 100644
index 000..58e23c3c7dc
--- /dev/null
+++ b/gcc/config/i386/t-linux64-musl
@@ -0,0 +1,28 @@
+# Copyright (C) 2002-2021 Free Software Foundation, Inc.
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GCC is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .
+
+# musl explicitly does not support lib/lib32/lib64 layouts and always
+# uses lib layout. On debian full arch suffix is used. Thus we populate
+# all the m32/m64/mx32 with the same lib and apply multiarch suffix.
+
+comma=,
+MULTILIB_OPTIONS= $(subst $(comma),/,$(TM_MULTILIB_CONFIG))
+MULTILIB_DIRNAMES   = $(patsubst m%, %, $(subst /, ,$(MULTILIB_OPTIONS)))
+MULTILIB_OSDIRNAMES = m64=../lib$(call if_multiarch,:x86_64-linux-gnu)
+MULTILIB_OSDIRNAMES+= m32=../lib$(call if_multiarch,:i386-linux-gnu)
+MULTILIB_OSDIRNAMES+= mx32=../lib$(call if_multiarch,:x86_64-linux-gnux32)
-- 
2.32.0



Re: [PATCH 2/2] Backwards jump threader rewrite with ranger.

2021-07-05 Thread Aldy Hernandez via Gcc-patches
PING.

Aldy



Re: Commit: Update libiberty sources

2021-07-05 Thread Nick Clifton via Gcc-patches

Hi H.J.


My patch is needed to build binutils with LTO.  I submitted a patch for GCC:

https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574405.html


Very well.  I have reapplied your patch to the mainline and 2.37 branch
sources.

Cheers
  Nick




Re: [PATCH] Add FMADDSUB and FMSUBADD SLP vectorization patterns and optabs

2021-07-05 Thread Richard Biener
On Mon, 5 Jul 2021, Richard Biener wrote:

> On Mon, Jul 5, 2021 at 4:09 PM Richard Biener  wrote:
> >
> > This adds named expanders for vec_fmaddsub4 and
> > vec_fmsubadd4 which map to x86 vfmaddsubXXXp{ds} and
> > vfmsubaddXXXp{ds} instructions.  This complements the previous
> > addition of ADDSUB support.
> >
> > x86 lacks SUBADD and the negate variants of FMA with mixed
> > plus minus so I did not add optabs or patterns for those but
> > it would not be difficult if there's a target that has them.
> > Maybe one of the complex fma patterns match those variants?
> >
> > I did not dare to rewrite the numerous patterns to the new
> > canonical name but instead added two new expanders.  Note I
> > did not cover AVX512 since the existing patterns are separated
> > and I have no easy way to test things there.  Handling AVX512
> > should be easy as followup though.
> >
> > Bootstrap and testing on x86_64-unknown-linux-gnu in progress.
> 
> FYI, building libgfortran matmul_c4 we hit
> 
> /home/rguenther/src/trunk/libgfortran/generated/matmul_c4.c:1781:1:
> error: unrecognizable insn:
>  1781 | }
>   | ^
> (insn 5408 5407 5409 213 (set (reg:V8SF 1454 [ vect__4368.5363 ])
> (unspec:V8SF [
> (reg:V8SF 4391)
> (reg:V8SF 4398)
> (reg:V8SF 4415 [ vect__2005.5362 ])
> ] UNSPEC_FMADDSUB)) -1
>  (nil))
> during RTL pass: vregs
> 
> so it looks like the existing fmaddsub_ expander cannot be
> simply re-purposed?

Ah, using the VF_128_256 iterator and removing the || TARGET_AVX512F
predication fixes it.  There's an avx512f-but-not-fma target variant
of matmul, which likely lacks avx512vl for the above.  So consider it
changed this way.  I'm not sure if there's a more appropriate iterator
that catches this case.

Richard.

> > Any comments?
> >
> > Thanks,
> > Richard.
> >
> > 2021-07-05  Richard Biener  
> >
> > * doc/md.texi (vec_fmaddsub4): Document.
> > (vec_fmsubadd4): Likewise.
> > * optabs.def (vec_fmaddsub$a4): Add.
> > (vec_fmsubadd$a4): Likewise.
> > * internal-fn.def (IFN_VEC_FMADDSUB): Add.
> > (IFN_VEC_FMSUBADD): Likewise.
> > * tree-vect-slp-patterns.c (addsub_pattern::recognize):
> > Refactor to handle IFN_VEC_FMADDSUB and IFN_VEC_FMSUBADD.
> > (addsub_pattern::build): Likewise.
> > * tree-vect-slp.c (vect_optimize_slp): CFN_VEC_FMADDSUB
> > and CFN_VEC_FMSUBADD are not transparent for permutes.
> > * config/i386/sse.md (vec_fmaddsub4): New expander.
> > (vec_fmsubadd4): Likewise.
> >
> > * gcc.target/i386/vect-fmaddsubXXXpd.c: New testcase.
> > * gcc.target/i386/vect-fmaddsubXXXps.c: Likewise.
> > * gcc.target/i386/vect-fmsubaddXXXpd.c: Likewise.
> > * gcc.target/i386/vect-fmsubaddXXXps.c: Likewise.
> > ---
> >  gcc/config/i386/sse.md|  19 ++
> >  gcc/doc/md.texi   |  14 ++
> >  gcc/internal-fn.def   |   3 +-
> >  gcc/optabs.def|   2 +
> >  .../gcc.target/i386/vect-fmaddsubXXXpd.c  |  34 
> >  .../gcc.target/i386/vect-fmaddsubXXXps.c  |  34 
> >  .../gcc.target/i386/vect-fmsubaddXXXpd.c  |  34 
> >  .../gcc.target/i386/vect-fmsubaddXXXps.c  |  34 
> >  gcc/tree-vect-slp-patterns.c  | 192 +-
> >  gcc/tree-vect-slp.c   |   2 +
> >  10 files changed, 311 insertions(+), 57 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmaddsubXXXpd.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmaddsubXXXps.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmsubaddXXXpd.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmsubaddXXXps.c
> >
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index bcf1605d147..6fc13c184bf 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -4644,6 +4644,25 @@
> >  ;;
> >  ;; But this doesn't seem useful in practice.
> >
> > +(define_expand "vec_fmaddsub4"
> > +  [(set (match_operand:VF 0 "register_operand")
> > +   (unspec:VF
> > + [(match_operand:VF 1 "nonimmediate_operand")
> > +  (match_operand:VF 2 "nonimmediate_operand")
> > +  (match_operand:VF 3 "nonimmediate_operand")]
> > + UNSPEC_FMADDSUB))]
> > +  "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F")
> > +
> > +(define_expand "vec_fmsubadd4"
> > +  [(set (match_operand:VF 0 "register_operand")
> > +   (unspec:VF
> > + [(match_operand:VF 1 "nonimmediate_operand")
> > +  (match_operand:VF 2 "nonimmediate_operand")
> > +  (neg:VF
> > +(match_operand:VF 3 "nonimmediate_operand"))]
> > + UNSPEC_FMADDSUB))]
> > +  "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F")
> > +
> >  (define_expand "fmaddsub_"
> >[(set (match_operand:VF 0 "register_operand")
> > 

Re: [PATCH] Add FMADDSUB and FMSUBADD SLP vectorization patterns and optabs

2021-07-05 Thread Richard Biener via Gcc-patches
On Mon, Jul 5, 2021 at 4:09 PM Richard Biener  wrote:
>
> This adds named expanders for vec_fmaddsub4 and
> vec_fmsubadd4 which map to x86 vfmaddsubXXXp{ds} and
> vfmsubaddXXXp{ds} instructions.  This complements the previous
> addition of ADDSUB support.
>
> x86 lacks SUBADD and the negate variants of FMA with mixed
> plus minus so I did not add optabs or patterns for those but
> it would not be difficult if there's a target that has them.
> Maybe one of the complex fma patterns match those variants?
>
> I did not dare to rewrite the numerous patterns to the new
> canonical name but instead added two new expanders.  Note I
> did not cover AVX512 since the existing patterns are separated
> and I have no easy way to test things there.  Handling AVX512
> should be easy as followup though.
>
> Bootstrap and testing on x86_64-unknown-linux-gnu in progress.

FYI, building libgfortran matmul_c4 we hit

/home/rguenther/src/trunk/libgfortran/generated/matmul_c4.c:1781:1:
error: unrecognizable insn:
 1781 | }
  | ^
(insn 5408 5407 5409 213 (set (reg:V8SF 1454 [ vect__4368.5363 ])
(unspec:V8SF [
(reg:V8SF 4391)
(reg:V8SF 4398)
(reg:V8SF 4415 [ vect__2005.5362 ])
] UNSPEC_FMADDSUB)) -1
 (nil))
during RTL pass: vregs

so it looks like the existing fmaddsub_ expander cannot be
simply re-purposed?

> Any comments?
>
> Thanks,
> Richard.
>
> 2021-07-05  Richard Biener  
>
> * doc/md.texi (vec_fmaddsub4): Document.
> (vec_fmsubadd4): Likewise.
> * optabs.def (vec_fmaddsub$a4): Add.
> (vec_fmsubadd$a4): Likewise.
> * internal-fn.def (IFN_VEC_FMADDSUB): Add.
> (IFN_VEC_FMSUBADD): Likewise.
> * tree-vect-slp-patterns.c (addsub_pattern::recognize):
> Refactor to handle IFN_VEC_FMADDSUB and IFN_VEC_FMSUBADD.
> (addsub_pattern::build): Likewise.
> * tree-vect-slp.c (vect_optimize_slp): CFN_VEC_FMADDSUB
> and CFN_VEC_FMSUBADD are not transparent for permutes.
> * config/i386/sse.md (vec_fmaddsub4): New expander.
> (vec_fmsubadd4): Likewise.
>
> * gcc.target/i386/vect-fmaddsubXXXpd.c: New testcase.
> * gcc.target/i386/vect-fmaddsubXXXps.c: Likewise.
> * gcc.target/i386/vect-fmsubaddXXXpd.c: Likewise.
> * gcc.target/i386/vect-fmsubaddXXXps.c: Likewise.
> ---
>  gcc/config/i386/sse.md|  19 ++
>  gcc/doc/md.texi   |  14 ++
>  gcc/internal-fn.def   |   3 +-
>  gcc/optabs.def|   2 +
>  .../gcc.target/i386/vect-fmaddsubXXXpd.c  |  34 
>  .../gcc.target/i386/vect-fmaddsubXXXps.c  |  34 
>  .../gcc.target/i386/vect-fmsubaddXXXpd.c  |  34 
>  .../gcc.target/i386/vect-fmsubaddXXXps.c  |  34 
>  gcc/tree-vect-slp-patterns.c  | 192 +-
>  gcc/tree-vect-slp.c   |   2 +
>  10 files changed, 311 insertions(+), 57 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmaddsubXXXpd.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmaddsubXXXps.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmsubaddXXXpd.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmsubaddXXXps.c
>
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index bcf1605d147..6fc13c184bf 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -4644,6 +4644,25 @@
>  ;;
>  ;; But this doesn't seem useful in practice.
>
> +(define_expand "vec_fmaddsub4"
> +  [(set (match_operand:VF 0 "register_operand")
> +   (unspec:VF
> + [(match_operand:VF 1 "nonimmediate_operand")
> +  (match_operand:VF 2 "nonimmediate_operand")
> +  (match_operand:VF 3 "nonimmediate_operand")]
> + UNSPEC_FMADDSUB))]
> +  "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F")
> +
> +(define_expand "vec_fmsubadd4"
> +  [(set (match_operand:VF 0 "register_operand")
> +   (unspec:VF
> + [(match_operand:VF 1 "nonimmediate_operand")
> +  (match_operand:VF 2 "nonimmediate_operand")
> +  (neg:VF
> +(match_operand:VF 3 "nonimmediate_operand"))]
> + UNSPEC_FMADDSUB))]
> +  "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F")
> +
>  (define_expand "fmaddsub_"
>[(set (match_operand:VF 0 "register_operand")
> (unspec:VF
> diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
> index 1b918144330..cc92ebd26aa 100644
> --- a/gcc/doc/md.texi
> +++ b/gcc/doc/md.texi
> @@ -5688,6 +5688,20 @@ Alternating subtract, add with even lanes doing 
> subtract and odd
>  lanes doing addition.  Operands 1 and 2 and the outout operand are vectors
>  with mode @var{m}.
>
> +@cindex @code{vec_fmaddsub@var{m}4} instruction pattern
> +@item @samp{vec_fmaddsub@var{m}4}
> +Alternating multiply subtract, add with even lanes doing subtract and odd
> +lanes doing addition of the third operand to the 

Re: [PATCH] Add gnu::diagnose_as attribute

2021-07-05 Thread Matthias Kretz
On Thursday, 1 July 2021 17:18:26 CEST Jason Merrill wrote:
> You probably want to adjust is_late_template_attribute to change that.

Right, I hacked is_late_template_attribute but now I only see a TYPE_DECL 
passed to my attribute handler (!DECL_ALIAS_TEMPLATE_P). I.e. I don't know how 
your previous comment is supposed to help me:

On Tuesday, 22 June 2021 22:12:42 CEST Jason Merrill wrote:
> Yes.  You can check that with get_underlying_template.

FWIW, I don't feel qualified to implement the diagnose_as attribute on alias 
templates. The trees I've seen while testing the following test case don't 
make sense to me. :(


// { dg-do compile { target c++11 } }
// { dg-options "-fdiagnostics-use-aliases -fpretty-templates" }

template  class A0 {};
template  using B0 [[gnu::diagnose_as]] = A0; // #1
template  using C0 [[gnu::diagnose_as]] = A0; // #2

template  class A1 {};
template  class A1 {};
template  using B1 [[gnu::diagnose_as]] = A1; // #3

void fn_1(int);

int main ()
{
  fn_1 (A0 ()); // { dg-error "cannot convert 'B0' to 'int'" }
  fn_1 (A1 ()); // { dg-error "cannot convert 'A1' to 'int'" }
  fn_1 (A1 ()); // { dg-error "cannot convert 'B1' to 'int'" }
}


On #1 I see !COMPLETE_TYPE_P (TREE_TYPE (*node)) while on #3 TREE_TYPE (*node) 
is a complete type. Like I said, I don't get to see the TEMPLATE_DECL of 
either #1, #2, or #3, only a TYPE_DECL whose TREE_TYPE is A0. I thus have no 
idea how to reject #2.

-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──


[PATCH] Add FMADDSUB and FMSUBADD SLP vectorization patterns and optabs

2021-07-05 Thread Richard Biener
This adds named expanders for vec_fmaddsub<mode>4 and
vec_fmsubadd<mode>4 which map to x86 vfmaddsubXXXp{ds} and
vfmsubaddXXXp{ds} instructions.  This complements the previous
addition of ADDSUB support.
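
As a rough illustration of the lane semantics described in the md.texi
text of this patch, the two operations can be modeled with scalar C
loops.  This is only a sketch: the function names fmaddsub_ref and
fmsubadd_ref are hypothetical and not part of the patch, and lanes are
assumed to be numbered from zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of vec_fmaddsub: even lanes compute
   a*b - c, odd lanes compute a*b + c.  */
static void
fmaddsub_ref (const double *a, const double *b, const double *c,
              double *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (i % 2 == 0) ? a[i] * b[i] - c[i] : a[i] * b[i] + c[i];
}

/* Hypothetical scalar model of vec_fmsubadd: even lanes compute
   a*b + c, odd lanes compute a*b - c.  */
static void
fmsubadd_ref (const double *a, const double *b, const double *c,
              double *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (i % 2 == 0) ? a[i] * b[i] + c[i] : a[i] * b[i] - c[i];
}
```

Negating c in fmaddsub_ref gives the same results as fmsubadd_ref,
which is how the vec_fmsubadd expander below reuses UNSPEC_FMADDSUB
with a negated third operand.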

x86 lacks SUBADD and the negate variants of FMA with mixed
plus minus so I did not add optabs or patterns for those but
it would not be difficult if there's a target that has them.
Maybe one of the complex fma patterns match those variants?

I did not dare to rewrite the numerous patterns to the new
canonical name but instead added two new expanders.  Note I
did not cover AVX512 since the existing patterns are separated
and I have no easy way to test things there.  Handling AVX512
should be easy as followup though.

Bootstrap and testing on x86_64-unknown-linux-gnu in progress.

Any comments?

Thanks,
Richard.

2021-07-05  Richard Biener  

* doc/md.texi (vec_fmaddsub<mode>4): Document.
(vec_fmsubadd<mode>4): Likewise.
* optabs.def (vec_fmaddsub$a4): Add.
(vec_fmsubadd$a4): Likewise.
* internal-fn.def (IFN_VEC_FMADDSUB): Add.
(IFN_VEC_FMSUBADD): Likewise.
* tree-vect-slp-patterns.c (addsub_pattern::recognize):
Refactor to handle IFN_VEC_FMADDSUB and IFN_VEC_FMSUBADD.
(addsub_pattern::build): Likewise.
* tree-vect-slp.c (vect_optimize_slp): CFN_VEC_FMADDSUB
and CFN_VEC_FMSUBADD are not transparent for permutes.
* config/i386/sse.md (vec_fmaddsub<mode>4): New expander.
(vec_fmsubadd<mode>4): Likewise.

* gcc.target/i386/vect-fmaddsubXXXpd.c: New testcase.
* gcc.target/i386/vect-fmaddsubXXXps.c: Likewise.
* gcc.target/i386/vect-fmsubaddXXXpd.c: Likewise.
* gcc.target/i386/vect-fmsubaddXXXps.c: Likewise.
---
 gcc/config/i386/sse.md|  19 ++
 gcc/doc/md.texi   |  14 ++
 gcc/internal-fn.def   |   3 +-
 gcc/optabs.def|   2 +
 .../gcc.target/i386/vect-fmaddsubXXXpd.c  |  34 
 .../gcc.target/i386/vect-fmaddsubXXXps.c  |  34 
 .../gcc.target/i386/vect-fmsubaddXXXpd.c  |  34 
 .../gcc.target/i386/vect-fmsubaddXXXps.c  |  34 
 gcc/tree-vect-slp-patterns.c  | 192 +-
 gcc/tree-vect-slp.c   |   2 +
 10 files changed, 311 insertions(+), 57 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmaddsubXXXpd.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmaddsubXXXps.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmsubaddXXXpd.c
 create mode 100644 gcc/testsuite/gcc.target/i386/vect-fmsubaddXXXps.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index bcf1605d147..6fc13c184bf 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4644,6 +4644,25 @@
 ;;
 ;; But this doesn't seem useful in practice.
 
+(define_expand "vec_fmaddsub<mode>4"
+  [(set (match_operand:VF 0 "register_operand")
+   (unspec:VF
+ [(match_operand:VF 1 "nonimmediate_operand")
+  (match_operand:VF 2 "nonimmediate_operand")
+  (match_operand:VF 3 "nonimmediate_operand")]
+ UNSPEC_FMADDSUB))]
+  "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F")
+
+(define_expand "vec_fmsubadd<mode>4"
+  [(set (match_operand:VF 0 "register_operand")
+   (unspec:VF
+ [(match_operand:VF 1 "nonimmediate_operand")
+  (match_operand:VF 2 "nonimmediate_operand")
+  (neg:VF
+(match_operand:VF 3 "nonimmediate_operand"))]
+ UNSPEC_FMADDSUB))]
+  "TARGET_FMA || TARGET_FMA4 || TARGET_AVX512F")
+
 (define_expand "fmaddsub_<mode>"
   [(set (match_operand:VF 0 "register_operand")
(unspec:VF
diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi
index 1b918144330..cc92ebd26aa 100644
--- a/gcc/doc/md.texi
+++ b/gcc/doc/md.texi
@@ -5688,6 +5688,20 @@ Alternating subtract, add with even lanes doing subtract and odd
 lanes doing addition.  Operands 1 and 2 and the outout operand are vectors
 with mode @var{m}.
 
+@cindex @code{vec_fmaddsub@var{m}4} instruction pattern
+@item @samp{vec_fmaddsub@var{m}4}
+Alternating multiply subtract, add with even lanes doing subtract and odd
+lanes doing addition of the third operand to the multiplication result
+of the first two operands.  Operands 1, 2 and 3 and the output operand are vectors
+with mode @var{m}.
+
+@cindex @code{vec_fmsubadd@var{m}4} instruction pattern
+@item @samp{vec_fmsubadd@var{m}4}
+Alternating multiply add, subtract with even lanes doing addition and odd
+lanes doing subtraction of the third operand from the multiplication result
+of the first two operands.  Operands 1, 2 and 3 and the output operand are vectors
+with mode @var{m}.
+
 These instructions are not allowed to @code{FAIL}.
 
 @cindex @code{mulhisi3} instruction pattern
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index c3b8e730960..a7003d5da8e 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -282,7 +282,8 @@ DEF_INTERNAL_OPTAB_FN 

[Ada] Add Reference and Constant_Reference functions to formal containers

2021-07-05 Thread Pierre-Marie de Rodat
Reference and Constant_Reference functions are added to all formal
containers types, returning an access to an element in the container.
This takes advantage of pointer support in SPARK.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-cfdlli.ads, libgnat/a-cfdlli.adb
libgnat/a-cfinve.ads, libgnat/a-cfinve.adb,
libgnat/a-cofove.ads, libgnat/a-cofove.adb,
libgnat/a-coboho.ads, libgnat/a-coboho.adb (Constant_Reference):
Get a read-only access to an element of the container.
(At_End): Ghost functions used to express pledges in the
postcondition of Reference.
(Reference): Get a read-write access to an element of the
container.
* libgnat/a-cfhama.ads, libgnat/a-cfhama.adb,
libgnat/a-cforma.ads, libgnat/a-cforma.adb: The full view of the
Map type is no longer a tagged type, but a wrapper over this
tagged type. This is to avoid issues with dispatching result in
At_End functions.
(Constant_Reference): Get a read-only access to an element of
the container.
(At_End): Ghost functions used to express pledges in the
postcondition of Reference.
(Reference): Get a read-write access to an element of the
container.

* libgnat/a-cfhase.ads, libgnat/a-cfhase.adb,
libgnat/a-cforse.ads, libgnat/a-cforse.adb: The full view of the
Map type is no longer a tagged type, but a wrapper over this
tagged type.
(Constant_Reference): Get a read-only access to an element of
the container.
* libgnat/a-cofuse.ads, libgnat/a-cofuve.ads (Copy_Element):
Expression function used to cause SPARK to make sure
Element_Type is copiable.
* libgnat/a-cofuma.ads (Copy_Key): Expression function used to
cause SPARK to make sure Key_Type is copiable.
(Copy_Element): Expression function used to cause SPARK to make
sure Element_Type is copiable.

patch.diff.gz
Description: application/gzip


[Ada] Remove Ada.Strings.Text_Output and child units

2021-07-05 Thread Pierre-Marie de Rodat
The GNAT-defined package Ada.Strings.Text_Output has been replaced by
the Ada-defined package Ada.Strings.Text_Buffers.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-stobbu.adb, libgnat/a-stobbu.ads,
libgnat/a-stobfi.adb, libgnat/a-stobfi.ads,
libgnat/a-stoubu.adb, libgnat/a-stoubu.ads,
libgnat/a-stoufi.adb, libgnat/a-stoufi.ads,
libgnat/a-stoufo.adb, libgnat/a-stoufo.ads,
libgnat/a-stouut.adb, libgnat/a-stouut.ads,
libgnat/a-stteou.ads: Delete files.
* Makefile.rtl, impunit.adb: Remove references to deleted files.

patch.diff.gz
Description: application/gzip


[Ada] INOX: prototype alternative accessibility model

2021-07-05 Thread Pierre-Marie de Rodat
This patch implements an experimental restriction
No_Dynamic_Accessibility_Checks which presents the two alternative
accessibility models.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* checks.adb (Accessibility_Checks_Suppressed): Add check
against restriction No_Dynamic_Accessibility_Checks.
(Apply_Accessibility_Check): Add assertion to check restriction
No_Dynamic_Accessibility_Checks is not active.
* debug.adb: Add documentation for new debugging switch to
control which accessibility model gets employed under
restriction No_Dynamic_Accessibility_Checks.
* exp_attr.adb (Expand_N_Attribute_Reference): Disable dynamic
accessibility check generation when
No_Dynamic_Accessibility_Checks is active.
* exp_ch4.adb (Apply_Accessibility_Check): Skip check generation
when restriction No_Dynamic_Accessibility_Checks is active.
(Expand_N_Allocator): Disable dynamic accessibility checks when
No_Dynamic_Accessibility_Checks is active.
(Expand_N_In): Disable dynamic accessibility checks when
No_Dynamic_Accessibility_Checks is active.
(Expand_N_Type_Conversion): Disable dynamic accessibility checks
when No_Dynamic_Accessibility_Checks is active.
* exp_ch5.adb (Expand_N_Assignment_Statement): Disable
alternative accessibility model calculations when computing a
dynamic level for a SAOAAT.
* exp_ch6.adb (Add_Call_By_Copy_Code): Disable dynamic
accessibility check generation when
No_Dynamic_Accessibility_Checks is active.
(Expand_Branch): Disable alternative accessibility model
calculations.
(Expand_Call_Helper): Disable alternative accessibility model
calculations.
* restrict.adb, restrict.ads: Add new restriction
No_Dynamic_Accessibility_Checks.
(No_Dynamic_Accessibility_Checks_Enabled): Created to test when
experimental features (which are generally incompatible with
standard Ada) can be enabled.
* sem_attr.adb (Safe_Value_Conversions): Add handling of new
accessibility model under the restriction
No_Dynamic_Accessibility_Checks.
* sem_prag.adb (Process_Restrictions_Or_Restriction_Warnings):
Disallow new restriction No_Dynamic_Accessibility_Checks from
being exclusively specified within a body or subunit without
being present in a specification.
* sem_res.adb (Check_Fully_Declared_Prefix): Minor comment
fixup.
(Valid_Conversion): Omit implicit conversion checks on anonymous
access types and perform static checking instead when
No_Dynamic_Accessibility_Checks is active.
* sem_util.adb, sem_util.ads (Accessibility_Level): Add special
handling of anonymous access objects, formal parameters,
anonymous access components, and function return objects.
(Deepest_Type_Access_Level): When
No_Dynamic_Accessibility_Checks is active employ an alternative
model. Add parameter Allow_Alt_Model to override the new behavior
in certain cases.
(Type_Access_Level): When No_Dynamic_Accessibility_Checks is
active employ an alternative model. Add parameter
Allow_Alt_Model to override the new behavior in certain cases.
(Typ_Access_Level): Created within Accessibility_Level for
convenience.
* libgnat/s-rident.ads, snames.ads-tmpl: Add handling for
No_Dynamic_Accessibility_Checks.

patch.diff.gz
Description: application/gzip


[Ada] Add Ada 2022 Image and Put_Image support for tagged types

2021-07-05 Thread Pierre-Marie de Rodat
GNAT's initial implementation of Ada 2022's Image and Put_Image
attributes did not include full support for tagged types. Improve that
level of support. This support is still disabled by default and is
enabled via the -gnatd_z switch because it generates additional
dispatching routines even for Ada 2022 programs that do not make
(explicit or implicit) use of the Image or Put_Image attributes of
tagged types.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* debug.adb: Remove comments about -gnatd_z switch.
* exp_ch3.adb (Make_Predefined_Primitive_Specs): A one-line fix
for a subtle bug that took some effort to debug. Append a new
Put_Image procedure for a type extension even if it seems to
already have one, just as is done for (for example) the
streaming-related Read procedure.
* exp_put_image.adb:
(Build_Record_Put_Image_Procedure.Make_Component_Attributes): Do
not treat _Parent component like just another component, for two
reasons.  1. If the _parent component's type has a
user-specified Put_Image procedure, then we want to generate a
call to that procedure and then generate extension aggregate
syntax.  2. Otherwise, we still don't want to see any mention of
"_parent" in the generated image text.
(Build_Record_Put_Image_Procedure.Make_Component_Name): Add
assertion that we are not generating a reference to an "_parent"
component.
(Build_Record_Put_Image_Procedure): Add special treatment for
null records.  Add call to Duplicate_Subexpr for image attribute
prefix in order to help with expansion needed in the class-wide
case (where the prefix is also referenced in the call to
Wide_Wide_Expanded_Name) if evaluation of the prefix has side
effects. Add new local helper function, Put_String_Exp.  Add
support for case where prefix type is class-wide.
(Enable_Put_Image, Preload_Root_Buffer_Type): Query Ada_Version
> Ada_2022 instead of (indirectly) querying -gnatd_z switch.
* freeze.adb (In_Expanded_Body): A one-line change to add
TSS_Put_Image to the list of subprograms that have
expander-created bodies.
* rtsfind.ads: Add support for accessing
Ada.Tags.Wide_Wide_Expanded_Name.
* sem_ch3.ads, sem_ch3.adb: Delete Is_Null_Extension function,
as part of moving it to Sem_Util.
* sem_ch13.adb
(Analyze_Put_Image_TSS_Definition.Has_Good_Profile): Improve
diagnostic messages in cases where the result is going to be
False and the Report parameter is True. Relax overly-restrictive
checks in order to implement mode conformance.
(Analyze_Stream_TSS_Definition.Has_Good_Profile): Add similar
relaxation of parameter subtype checking for the Stream
parameter of user-defined streaming subprograms.
* sem_disp.adb (Check_Dispatching_Operation): A one-line
change (and an accompanying comment change) to add TSS_Put_Image
to the list of compiler-generated dispatching primitive
operations.
* sem_util.ads, sem_util.adb: Add Ignore_Privacy Boolean
parameter to Is_Null_Record_Type function (typically the
parameter will be False when the function is being used in the
implementation of static semantics and True for dynamic
semantics; the parameter might make a difference in the case of,
for example, a private type that is implemented as a null record
type).  Add related new routines Is_Null_Extension (formerly
declared in Sem_Ch3), Is_Null_Extension_Of, and
Is_Null_Record_Definition.

patch.diff.gz
Description: application/gzip


[Ada] Clean up Get_Index_Bounds

2021-07-05 Thread Pierre-Marie de Rodat
Replace some calls to procedure Get_Index_Bounds with calls
to the function with the same name. Not all calls are replaced
(some seem clearer as procedure calls).

Change names to be more consistent with the RM.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* checks.adb, exp_aggr.adb, exp_ch5.adb, freeze.adb,
sem_util.adb, sem_util.ads: Change L and H to be First and Last,
to match the attributes in the RM. Change calls from procedure
to function where appropriate.

patch.diff.gz
Description: application/gzip


Re: [PATCH] X86: Provide a CTOR for stringop_algs [PR100246].

2021-07-05 Thread Richard Biener via Gcc-patches
On Mon, Jul 5, 2021 at 3:04 PM Iain Sandoe  wrote:
>
> Hi Richard,
>
> > On 5 Jul 2021, at 11:50, Richard Biener via Gcc-patches wrote:
> >
> > On Sun, Jul 4, 2021 at 10:04 PM Iain Sandoe  wrote:
>
> >> Several older compilers fail to build modern GCC because of missing
> >> or incomplete C++11 support.
> >>
> >> (although the PR mentions clang, specifically, this has also been reported
> >> for some GCC versions within the range that should be able to bootstrap
> >> GCC)
> >>
> >> There are several possible solutions proposed in the PR, this one seems
> >> the least invasive.
> >>
> >> The header is pulled into the gcov code that builds with C, so we have to
> >> make the CTOR conditional on C++.
> >>
> >> tested on Darwin12 with xcode-6, bootstrapped on x86_64-darwin and linux.
> >> OK for master / GCC-11?
> >
> > Hmm, what is specifically built with a C compiler?  gcov.c not, I think.
>
> any C compilation that includes tm.h
>
> well, libgcc2 fails too on a quick check here -  but ISTR there was something in
> libgcov and I checked with Martin that it was intentionally compiled with C compiler.
>
> > Instead of commenting the CTOR, does it work to comment the whole stringop_algs
> > type?
>
> I don’t think that will work because it’s in a header that’s transitively included by tm.h
> which is then included loads of places.
>
> >  Also it seems on trunk this CTOR is no more?
>
> The addition of the CTOR is the fix for the C++ compile fail in the PR, the conditional is
> only there because the same header is compiled by C and C++.

Whoops sorry - I was confused.  The patch looks OK to me if you add a comment
before the CTOR explaining why it was added (maybe quoting the error that happens).

Richard.
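
For readers following the thread, the idea of guarding a constructor
with __cplusplus in a header shared by C and C++ can be sketched as
below.  This is a simplified illustration only: the names demo_alg,
demo_strategy and demo_table are hypothetical stand-ins and not the
actual i386.h definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for the algorithm enumeration.  */
enum demo_alg { demo_libcall, demo_loop };

struct demo_strategy
{
#ifdef __cplusplus
  /* Seen only by C++ compilers; C translation units skip it and keep
     treating the struct as a plain aggregate.  */
  demo_strategy (int max_ = -1, enum demo_alg alg_ = demo_libcall,
                 int noalign_ = 0)
    : max (max_), alg (alg_), noalign (noalign_) {}
#endif
  const int max;
  const enum demo_alg alg;
  int noalign;
};

/* The same brace initializers are accepted when compiled as C
   (aggregate initialization).  */
static const struct demo_strategy demo_table[2] = {
  {256, demo_loop, 0},
  {-1, demo_libcall, 0}
};
```

Only the C++ view of the type gains a user-provided constructor, so
the data layout and the existing initializer tables stay unchanged for
both languages.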

> thanks
> Iain
> >
> >> thanks
> >> Iain
> >>
> >> Signed-off-by: Iain Sandoe 
> >>
> >> PR bootstrap/100246 - [11/12 Regression] GCC will not bootstrap with clang 3.4/3.5 [xcode 5/6, Darwin 12/13]
> >>
> >>PR bootstrap/100246
> >>
> >> gcc/ChangeLog:
> >>
> >>* config/i386/i386.h (struct stringop_algs): Define a CTOR for
> >>this type.
> >> ---
> >> gcc/config/i386/i386.h | 5 +
> >> 1 file changed, 5 insertions(+)
> >>
> >> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> >> index 6e0340a4b60..84151156999 100644
> >> --- a/gcc/config/i386/i386.h
> >> +++ b/gcc/config/i386/i386.h
> >> @@ -73,6 +73,11 @@ struct stringop_algs
> >> {
> >>   const enum stringop_alg unknown_size;
> >>   const struct stringop_strategy {
> >> +#ifdef __cplusplus
> >> +stringop_strategy(int _max = -1, enum stringop_alg _alg = libcall,
> >> + int _noalign = false)
> >> +  : max (_max), alg (_alg), noalign (_noalign) {}
> >> +#endif
> >> const int max;
> >> const enum stringop_alg alg;
> >> int noalign;
> >> --
> >> 2.24.1
>


[Ada] Simplify and reuse Is_Concurrent_Interface

2021-07-05 Thread Pierre-Marie de Rodat
Code cleanup; semantics is unaffected.
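
The reason the outer Is_Interface test is redundant can be shown with a
small model written in C.  This is a sketch under stated assumptions:
the entity structure and the predicate bodies are hypothetical, and the
only property taken from the ChangeLog below is that each of the three
specific predicates already checks Is_Interface itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an entity; not GNAT's data structures.  */
enum interface_kind
{ KIND_NONE, KIND_LIMITED, KIND_PROTECTED, KIND_SYNCHRONIZED, KIND_TASK };

struct entity
{
  bool is_interface;
  enum interface_kind kind;
};

static bool is_interface (const struct entity *e)
{ return e->is_interface; }

/* Each specific predicate tests is_interface first (assumption).  */
static bool is_protected_interface (const struct entity *e)
{ return is_interface (e) && e->kind == KIND_PROTECTED; }

static bool is_synchronized_interface (const struct entity *e)
{ return is_interface (e) && e->kind == KIND_SYNCHRONIZED; }

static bool is_task_interface (const struct entity *e)
{ return is_interface (e) && e->kind == KIND_TASK; }

/* Form before the cleanup: explicit outer is_interface test.  */
static bool is_concurrent_old (const struct entity *e)
{
  return is_interface (e)
         && (is_protected_interface (e)
             || is_synchronized_interface (e)
             || is_task_interface (e));
}

/* Form after the cleanup: the outer test is dropped.  */
static bool is_concurrent_new (const struct entity *e)
{
  return is_protected_interface (e)
         || is_synchronized_interface (e)
         || is_task_interface (e);
}

/* Exhaustively compare both forms over every modeled entity.  */
static bool forms_agree (void)
{
  for (int itf = 0; itf <= 1; itf++)
    for (int k = KIND_NONE; k <= KIND_TASK; k++)
      {
        struct entity e = { itf != 0, (enum interface_kind) k };
        if (is_concurrent_old (&e) != is_concurrent_new (&e))
          return false;
      }
  return true;
}
```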

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch3.adb (Stream_Operation_OK): Reuse
Is_Concurrent_Interface.
* sem_ch3.adb (Analyze_Interface_Declaration,
Build_Derived_Record_Type): Likewise.
* sem_ch6.adb (Check_Limited_Return): Likewise.
* sem_util.adb (Is_Concurrent_Interface): Don't call
Is_Interface because each of the Is_Protected_Interface,
Is_Synchronized_Interface and Is_Task_Interface calls it anyway.

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -11251,12 +11251,7 @@ package body Exp_Ch3 is
 or else not Is_Abstract_Type (Typ)
 or else not Is_Derived_Type (Typ))
 and then not Has_Unknown_Discriminants (Typ)
-and then not
-  (Is_Interface (Typ)
-and then
-  (Is_Task_Interface (Typ)
-or else Is_Protected_Interface (Typ)
-or else Is_Synchronized_Interface (Typ)))
+and then not Is_Concurrent_Interface (Typ)
 and then not Restriction_Active (No_Streams)
 and then not Restriction_Active (No_Dispatch)
 and then No (No_Tagged_Streams_Pragma (Typ))


diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -3493,9 +3493,7 @@ package body Sem_Ch3 is
 
   --  Check runtime support for synchronized interfaces
 
-  if (Is_Task_Interface (T)
-   or else Is_Protected_Interface (T)
-   or else Is_Synchronized_Interface (T))
+  if Is_Concurrent_Interface (T)
 and then not RTE_Available (RE_Select_Specific_Data)
   then
  Error_Msg_CRT ("synchronized interfaces", T);
@@ -9270,9 +9268,7 @@ package body Sem_Ch3 is
   and then Is_Limited_Record (Full_View (Parent_Type)))
   then
  if not Is_Interface (Parent_Type)
-   or else Is_Synchronized_Interface (Parent_Type)
-   or else Is_Protected_Interface (Parent_Type)
-   or else Is_Task_Interface (Parent_Type)
+   or else Is_Concurrent_Interface (Parent_Type)
  then
 Set_Is_Limited_Record (Derived_Type);
  end if;


diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -6999,10 +6999,7 @@ package body Sem_Ch6 is
   --  A limited interface that is not immutably limited is OK
 
   if Is_Limited_Interface (R_Type)
-and then
-  not (Is_Task_Interface (R_Type)
-or else Is_Protected_Interface (R_Type)
-or else Is_Synchronized_Interface (R_Type))
+and then not Is_Concurrent_Interface (R_Type)
   then
  null;
 


diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -16209,11 +16209,9 @@ package body Sem_Util is
 
function Is_Concurrent_Interface (T : Entity_Id) return Boolean is
begin
-  return Is_Interface (T)
-and then
-  (Is_Protected_Interface (T)
-or else Is_Synchronized_Interface (T)
-or else Is_Task_Interface (T));
+  return Is_Protected_Interface (T)
+or else Is_Synchronized_Interface (T)
+or else Is_Task_Interface (T);
end Is_Concurrent_Interface;
 
---




[Ada] Reject overlays in Global/Depends/Initializes contracts

2021-07-05 Thread Pierre-Marie de Rodat
Object overlays, i.e. objects with an Address clause that specify the
address of an overlaid object, are no longer allowed to appear in SPARK
data and dependency flow contracts. Also, they do not contribute to the
package state and so don't need to appear in Refined_State contracts.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_prag.adb (Analyze_Depends_In_Decl_Part): Reject overlays
in Depends and Refined_Depends contracts.
(Analyze_Global_In_Decl_Part): Likewise for Global and
Refined_Global.
(Analyze_Initializes_In_Decl_Part): Likewise for Initializes
(when appearing both as a single item and as an initialization
clause).
* sem_util.ads (Ultimate_Overlaid_Entity): New routine.
* sem_util.adb (Report_Unused_Body_States): Ignore overlays.
(Ultimate_Overlaid_Entity): New routine.

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -1139,6 +1139,17 @@ package body Sem_Prag is
  (State_Id => Item_Id,
   Ref  => Item);
 end if;
+
+ elsif Ekind (Item_Id) in E_Constant | E_Variable
+   and then Present (Ultimate_Overlaid_Entity (Item_Id))
+ then
+SPARK_Msg_NE
+  ("overlaying object & cannot appear in Depends",
+   Item, Item_Id);
+SPARK_Msg_NE
+  ("\use the overlaid object & instead",
+   Item, Ultimate_Overlaid_Entity (Item_Id));
+return;
  end if;
 
  --  When the item renames an entire object, replace the
@@ -2387,6 +2398,17 @@ package body Sem_Prag is
elsif Is_Formal_Object (Item_Id) then
   null;
 
+   elsif Ekind (Item_Id) in E_Constant | E_Variable
+ and then Present (Ultimate_Overlaid_Entity (Item_Id))
+   then
+  SPARK_Msg_NE
+("overlaying object & cannot appear in Global",
+ Item, Item_Id);
+  SPARK_Msg_NE
+("\use the overlaid object & instead",
+ Item, Ultimate_Overlaid_Entity (Item_Id));
+  return;
+
--  The only legal references are those to abstract states,
--  objects and various kinds of constants (SPARK RM 6.1.4(4)).
 
@@ -2984,6 +3006,16 @@ package body Sem_Prag is
if Item_Id = Any_Id then
   null;
 
+   elsif Ekind (Item_Id) in E_Constant | E_Variable
+ and then Present (Ultimate_Overlaid_Entity (Item_Id))
+   then
+  SPARK_Msg_NE
+("overlaying object & cannot appear in Initializes",
+ Item, Item_Id);
+  SPARK_Msg_NE
+("\use the overlaid object & instead",
+ Item, Ultimate_Overlaid_Entity (Item_Id));
+
--  The state or variable must be declared in the visible
--  declarations of the package (SPARK RM 7.1.5(7)).
 
@@ -3126,6 +3158,18 @@ package body Sem_Prag is
 end if;
  end if;
 
+ if Ekind (Input_Id) in E_Constant | E_Variable
+   and then Present (Ultimate_Overlaid_Entity (Input_Id))
+ then
+SPARK_Msg_NE
+  ("overlaying object & cannot appear in Initializes",
+   Input, Input_Id);
+SPARK_Msg_NE
+  ("\use the overlaid object & instead",
+   Input, Ultimate_Overlaid_Entity (Input_Id));
+return;
+ end if;
+
  --  Detect a duplicate use of the same input item
  --  (SPARK RM 7.1.5(5)).
 


diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -5708,6 +5708,13 @@ package body Sem_Util is
if Ekind (State_Id) = E_Constant then
   null;
 
+   --  Overlays do not contribute to package state
+
+   elsif Ekind (State_Id) = E_Variable
+ and then Present (Ultimate_Overlaid_Entity (State_Id))
+   then
+  null;
+
--  Generate an error message of the form:
 
--body of package ... has unused hidden states
@@ -29312,6 +29319,39 @@ package body Sem_Util is
   end if;
end Type_Without_Stream_Operation;
 
+   --
+   -- Ultimate_Overlaid_Entity --
+   --
+
+   function 

[Ada] Adapt SPARK RM rule on non-effectively volatile abstract state

2021-07-05 Thread Pierre-Marie de Rodat
SPARK RM 7.1.3(8) has been updated to reflect the fact that abstract
states which do not have Async_Writers or Effective_Reads cannot have as
constituents objects which are effectively volatile for reading, and
hence a function reading such an abstract state need not be marked as
a volatile function.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_prag.adb (Analyze_Global_Item): Adapt to updated SPARK RM
rule.

diff --git a/gcc/ada/sem_prag.adb b/gcc/ada/sem_prag.adb
--- a/gcc/ada/sem_prag.adb
+++ b/gcc/ada/sem_prag.adb
@@ -2433,10 +2433,13 @@ package body Sem_Prag is
  SPARK_Msg_N ("\use its constituents instead", Item);
  return;
 
-  --  An external state cannot appear as a global item of a
-  --  nonvolatile function (SPARK RM 7.1.3(8)).
+  --  An external state which has Async_Writers or
+  --  Effective_Reads enabled cannot appear as a global item
+  --  of a nonvolatile function (SPARK RM 7.1.3(8)).
 
   elsif Is_External_State (Item_Id)
+and then (Async_Writers_Enabled (Item_Id)
+   or else Effective_Reads_Enabled (Item_Id))
 and then Ekind (Spec_Id) in E_Function | E_Generic_Function
 and then not Is_Volatile_Function (Spec_Id)
   then




[Ada] Fix some "current instance" bugs

2021-07-05 Thread Pierre-Marie de Rodat
This started out as an Ada 2022 ticket, but work on Ada 2022 constructs
uncovered bugs that could affect pre-Ada 2022 code. Fix those bugs.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch3.adb (Build_Record_Init_Proc.Build_Assignment): When
building the assignment statement corresponding to the default
expression for a component, we make a copy of the expression.
When making that copy (and if we have seen a component that
requires late initialization), pass a Map parameter into the
call to New_Copy_Tree to redirect references to the type to
instead refer to the _Init formal parameter of the init proc.
This includes hoisting the declaration of Has_Late_Init_Comp out
one level so that it becomes available to Build_Assignment.
(Find_Current_Instance): Return True for other kinds of current
instance references, instead of just access-valued attribute
references such as T'Access.
* sem_util.adb (Is_Aliased_View): Return True for the _Init
formal parameter of an init procedure. The changes in
exp_ch3.adb can have the effect of replacing a "T'Access"
attribute reference in an init procedure with an "_Init'Access"
attribute reference. We want such an attribute reference to be
legal. However, we do not simply mark the formal parameter as
being aliased because that might impact callers.
(Is_Object_Image): Return True if Is_Current_Instance returns
True for the prefix of an Image (or related attribute) attribute
reference.

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -1926,6 +1926,7 @@ package body Exp_Ch3 is
   Proc_Id   : Entity_Id;
   Rec_Type  : Entity_Id;
   Set_Tag   : Entity_Id := Empty;
+  Has_Late_Init_Comp : Boolean := False; -- set in Build_Init_Statements
 
   function Build_Assignment
 (Id  : Entity_Id;
@@ -2021,35 +2022,27 @@ package body Exp_Ch3 is
  Selector_Name => New_Occurrence_Of (Id, Default_Loc));
  Set_Assignment_OK (Lhs);
 
- --  Case of an access attribute applied to the current instance.
- --  Replace the reference to the type by a reference to the actual
- --  object. (Note that this handles the case of the top level of
- --  the expression being given by such an attribute, but does not
- --  cover uses nested within an initial value expression. Nested
- --  uses are unlikely to occur in practice, but are theoretically
- --  possible.) It is not clear how to handle them without fully
- --  traversing the expression. ???
-
- if Kind = N_Attribute_Reference
-   and then Attribute_Name (Default) in Name_Unchecked_Access
-  | Name_Unrestricted_Access
-   and then Is_Entity_Name (Prefix (Default))
-   and then Is_Type (Entity (Prefix (Default)))
-   and then Entity (Prefix (Default)) = Rec_Type
- then
-Exp :=
-  Make_Attribute_Reference (Default_Loc,
-Prefix =>
-  Make_Identifier (Default_Loc, Name_uInit),
-Attribute_Name => Name_Unrestricted_Access);
- end if;
-
  --  Take a copy of Exp to ensure that later copies of this component
  --  declaration in derived types see the original tree, not a node
  --  rewritten during expansion of the init_proc. If the copy contains
  --  itypes, the scope of the new itypes is the init_proc being built.
 
- Exp := New_Copy_Tree (Exp, New_Scope => Proc_Id);
+ declare
+Map : Elist_Id := No_Elist;
+ begin
+if Has_Late_Init_Comp then
+   --  Map the type to the _Init parameter in order to
+   --  handle "current instance" references.
+
+   Map := New_Elmt_List
+(Elmt1 => Rec_Type,
+ Elmt2 => Defining_Identifier (First
+   (Parameter_Specifications
+  (Parent (Proc_Id);
+end if;
+
+Exp := New_Copy_Tree (Exp, New_Scope => Proc_Id, Map => Map);
+ end;
 
  Res := New_List (
Make_Assignment_Statement (Loc,
@@ -2981,7 +2974,6 @@ package body Exp_Ch3 is
  Counter_Id : Entity_Id:= Empty;
  Comp_Loc   : Source_Ptr;
  Decl   : Node_Id;
- Has_Late_Init_Comp : Boolean;
  Id : Entity_Id;
  Parent_Stmts   : List_Id;
  Stmts  : List_Id;
@@ -3097,10 +3089,9 @@ package body Exp_Ch3 is
 function Find_Current_Instance
   (N : Node_Id) return Traverse_Result is
 begin
-   if Nkind 

[Ada] Fix missing error messages when returning limited type

2021-07-05 Thread Pierre-Marie de Rodat
Check_Limited_Return originally used Comes_From_Source (N) in order to
decide whether N was a return statement created from an extended return
statement or not.

This was a problem because the return statements from expression
functions also have their Comes_From_Source flag set to False.
The solution is to use Comes_From_Extended_Return_Statement to decide
whether to post error messages or not.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Check_Limited_Return): Replace Comes_From_Source
with Comes_From_Extended_Return_Statement.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -7008,7 +7008,8 @@ package body Sem_Ch6 is
 
   elsif Is_Limited_Type (R_Type)
 and then not Is_Interface (R_Type)
-and then Comes_From_Source (N)
+and then not (Nkind (N) = N_Simple_Return_Statement
+  and then Comes_From_Extended_Return_Statement (N))
 and then not In_Instance_Body
 and then not OK_For_Limited_Init_In_05 (R_Type, Expr)
   then




[Ada] Fix excessive check for alignment of overlaying objects

2021-07-05 Thread Pierre-Marie de Rodat
When generating alignment checks for Address representation clauses we
optimized them away for clauses like:

  for X'Address use Arr (1)'Address;
  for X'Address use Rec.C'Address;

but not:

  for X'Address use Obj'Address;

even though the alignment of Obj is known.
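
To illustrate why such a check can be dropped, here is a small C model
of an address alignment test and of the static reasoning that makes it
unnecessary.  This is a sketch only: both function names are
hypothetical, and the model is not the logic of
Has_Compatible_Alignment_Internal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Run-time style check: is ADDR a multiple of ALIGNMENT?  */
static int
address_is_aligned (uintptr_t addr, size_t alignment)
{
  return addr % alignment == 0;
}

/* Static reasoning: if the overlaid object's known alignment is a
   multiple of the alignment required by the overlaying object, every
   address of the overlaid object passes the check above, so the
   run-time check adds nothing.  */
static int
check_is_redundant (size_t overlaid_alignment, size_t required_alignment)
{
  return overlaid_alignment % required_alignment == 0;
}
```

For example, an object known to be 8-byte aligned always satisfies a
4-byte requirement, while a 4-byte aligned object gives no such
guarantee for an 8-byte requirement.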

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.adb (Has_Compatible_Alignment_Internal): If the
prefix of the Address expression is an entire object with a
known alignment, then skip checks related to its size.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -11939,6 +11939,7 @@ package body Sem_Util is
   elsif Is_Entity_Name (Expr)
 and then Known_Alignment (Entity (Expr))
   then
+ Offs := Uint_0;
  ExpA := Alignment (Entity (Expr));
 
   --  Otherwise, we can use the alignment of the type of Expr
@@ -11961,9 +11962,9 @@ package body Sem_Util is
  Set_Result (Known_Incompatible);
   end if;
 
-  --  If Expr is not a piece of a larger object, see if size
-  --  is given. If so, check that it is not too small for the
-  --  required alignment.
+  --  If Expr is a component or an entire object with a known
+  --  alignment, then we are fine. Otherwise, if its size is
+  --  known, it must be big enough for the required alignment.
 
   if Offs /= No_Uint then
  null;
@@ -11982,7 +11983,7 @@ package body Sem_Util is
   end if;
 
   --  If we got a size, see if it is a multiple of the Obj
-  --  alignment, if not, then the alignment cannot be
+  --  alignment; if not, then the alignment cannot be
   --  acceptable, since the size is always a multiple of the
   --  alignment.
 




[Ada] The Unix Epochalypse of 2038 (Warn about time_t in the compiler)

2021-07-05 Thread Pierre-Marie de Rodat
Add some comments to warn about the use of time_t in the host based
tools since it's not practical to remove the declaration itself.
Rename some formal parameters in internal subprograms to reflect the
fact that OS_Time is the preferred interface type.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-os_lib.ads: Add some comments about time_t.
* libgnat/s-os_lib.adb (GM_Split/To_GM_Time): Rename formal to
P_OS_Time.
(GM_Time_Of/To_OS_Time): Likewise.

diff --git a/gcc/ada/libgnat/s-os_lib.adb b/gcc/ada/libgnat/s-os_lib.adb
--- a/gcc/ada/libgnat/s-os_lib.adb
+++ b/gcc/ada/libgnat/s-os_lib.adb
@@ -1347,13 +1347,13 @@ package body System.OS_Lib is
   Second : out Second_Type)
is
   procedure To_GM_Time
-(P_Time_T : Address;
- P_Year   : Address;
- P_Month  : Address;
- P_Day: Address;
- P_Hours  : Address;
- P_Mins   : Address;
- P_Secs   : Address);
+(P_OS_Time : Address;
+ P_Year: Address;
+ P_Month   : Address;
+ P_Day : Address;
+ P_Hours   : Address;
+ P_Mins: Address;
+ P_Secs: Address);
   pragma Import (C, To_GM_Time, "__gnat_to_gm_time");
 
   T  : OS_Time := Date;
@@ -1385,13 +1385,13 @@ package body System.OS_Lib is
   Locked_Processing : begin
  SSL.Lock_Task.all;
  To_GM_Time
-   (P_Time_T => T'Address,
-P_Year   => Y'Address,
-P_Month  => Mo'Address,
-P_Day=> D'Address,
-P_Hours  => H'Address,
-P_Mins   => Mn'Address,
-P_Secs   => S'Address);
+   (P_OS_Time => T'Address,
+P_Year=> Y'Address,
+P_Month   => Mo'Address,
+P_Day => D'Address,
+P_Hours   => H'Address,
+P_Mins=> Mn'Address,
+P_Secs=> S'Address);
  SSL.Unlock_Task.all;
 
   exception
@@ -1429,26 +1429,26 @@ package body System.OS_Lib is
   Second : Second_Type) return OS_Time
is
   procedure To_OS_Time
-(P_Time_T : Address;
- P_Year   : Integer;
- P_Month  : Integer;
- P_Day: Integer;
- P_Hours  : Integer;
- P_Mins   : Integer;
- P_Secs   : Integer);
+(P_OS_Time : Address;
+ P_Year: Integer;
+ P_Month   : Integer;
+ P_Day : Integer;
+ P_Hours   : Integer;
+ P_Mins: Integer;
+ P_Secs: Integer);
   pragma Import (C, To_OS_Time, "__gnat_to_os_time");
 
   Result : OS_Time;
 
begin
   To_OS_Time
-(P_Time_T => Result'Address,
- P_Year   => Year - 1900,
- P_Month  => Month - 1,
- P_Day=> Day,
- P_Hours  => Hour,
- P_Mins   => Minute,
- P_Secs   => Second);
+(P_OS_Time => Result'Address,
+ P_Year=> Year - 1900,
+ P_Month   => Month - 1,
+ P_Day => Day,
+ P_Hours   => Hour,
+ P_Mins=> Minute,
+ P_Secs=> Second);
   return Result;
end GM_Time_Of;
 


diff --git a/gcc/ada/libgnat/s-os_lib.ads b/gcc/ada/libgnat/s-os_lib.ads
--- a/gcc/ada/libgnat/s-os_lib.ads
+++ b/gcc/ada/libgnat/s-os_lib.ads
@@ -164,6 +164,14 @@ package System.OS_Lib is
--  component parts to be interpreted in the local time zone, and returns
--  an OS_Time. Returns Invalid_Time if the creation fails.
 
+   --
+   -- Time_t Stuff --
+   --
+
+   --  Note: Do not use time_t in the compiler and host based tools,
+   --  instead use OS_Time. These 3 declarations are indended for use only
+   --  by consumers of the GNAT.OS_Lib renaming of this package.
+
subtype time_t is Long_Integer;
--  C time_t type of the time representation
 




[Ada] The Unix Epochalypse of 2038 - OS_Time comparison

2021-07-05 Thread Pierre-Marie de Rodat
The comment in the public section of the spec says the comparison ops
are intrinsic, but that doesn't match the private part implementation
and comment. Fix this.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-os_lib.ads: Import OS_Time comparison ops as
intrinsic.
* libgnat/s-os_lib.adb: Remove OS_Time comparison ops
implementation.

diff --git a/gcc/ada/libgnat/s-os_lib.adb b/gcc/ada/libgnat/s-os_lib.adb
--- a/gcc/ada/libgnat/s-os_lib.adb
+++ b/gcc/ada/libgnat/s-os_lib.adb
@@ -133,42 +133,6 @@ package body System.OS_Lib is
--  Converts a C String to an Ada String. We could do this making use of
--  Interfaces.C.Strings but we prefer not to import that entire package
 
-   -
-   -- "<" --
-   -
-
-   function "<"  (X, Y : OS_Time) return Boolean is
-   begin
-  return Long_Long_Integer (X) < Long_Long_Integer (Y);
-   end "<";
-
-   --
-   -- "<=" --
-   --
-
-   function "<="  (X, Y : OS_Time) return Boolean is
-   begin
-  return Long_Long_Integer (X) <= Long_Long_Integer (Y);
-   end "<=";
-
-   -
-   -- ">" --
-   -
-
-   function ">"  (X, Y : OS_Time) return Boolean is
-   begin
-  return Long_Long_Integer (X) > Long_Long_Integer (Y);
-   end ">";
-
-   --
-   -- ">=" --
-   --
-
-   function ">="  (X, Y : OS_Time) return Boolean is
-   begin
-  return Long_Long_Integer (X) >= Long_Long_Integer (Y);
-   end ">=";
-
-
-- Args_Length --
-


diff --git a/gcc/ada/libgnat/s-os_lib.ads b/gcc/ada/libgnat/s-os_lib.ads
--- a/gcc/ada/libgnat/s-os_lib.ads
+++ b/gcc/ada/libgnat/s-os_lib.ads
@@ -,18 +,13 @@ private
--  time stamps, but may have a different representation than C's time_t.
--  This type needs to match the declaration of OS_Time in adaint.h.
 
-   --  Add pragma Inline statements for comparison operations on OS_Time. It
-   --  would actually be nice to use pragma Import (Intrinsic) here, but this
-   --  was not properly supported till GNAT 3.15a, so that would cause
-   --  bootstrap path problems. To be changed later ???
-
Invalid_Time : constant OS_Time := -1;
--  This value should match the return value from __gnat_file_time_*
 
-   pragma Inline ("<");
-   pragma Inline (">");
-   pragma Inline ("<=");
-   pragma Inline (">=");
+   pragma Import (Intrinsic, "<");
+   pragma Import (Intrinsic, ">");
+   pragma Import (Intrinsic, "<=");
+   pragma Import (Intrinsic, ">=");
pragma Inline (To_C);
pragma Inline (To_Ada);
 




[Ada] Fix missing minus sign in literal translation

2021-07-05 Thread Pierre-Marie de Rodat
When translating literals to big reals, the compiler would forget about
the minus sign and turn a negative number into a positive one.
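
The failure mode can be sketched in C with plain integers instead of the compiler's universal reals. This is only an analogy under that assumption: `literal_image` is a hypothetical helper, not a GNAT routine. The image is built from the magnitude alone, so the sign has to be stored explicitly, as the patch does with `Store_String_Chars ("-")`.

```c
#include <stdio.h>

/* Hypothetical helper: write the decimal image of VALUE into BUF.
   The digits come from the magnitude only, so a negative value needs
   its minus sign emitted separately.  */
static void literal_image (long long value, char *buf, size_t size)
{
  /* Compute the magnitude in unsigned arithmetic so that the most
     negative value does not overflow.  */
  unsigned long long magnitude =
    value < 0 ? 0ULL - (unsigned long long) value
              : (unsigned long long) value;

  /* Dropping the "-" here would turn -42 into 42.  */
  snprintf (buf, size, "%s%llu", value < 0 ? "-" : "", magnitude);
}
```

Calling `literal_image (-42, buf, sizeof buf)` leaves "-42" in `buf`; without the sign prefix it would leave "42".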

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_res.adb (Resolve): Insert minus sign if needed.

diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -2934,6 +2934,11 @@ package body Sem_Res is
  else
 UI_Image (Norm_Num (Expr_Value_R (Expr)), Decimal);
 Start_String;
+
+if UR_Is_Negative (Expr_Value_R (Expr)) then
+   Store_String_Chars ("-");
+end if;
+
 Store_String_Chars
   (UI_Image_Buffer (1 .. UI_Image_Length));
 Param1 := Make_String_Literal (Loc, End_String);




[Ada] Temporarily disable Ada 2022 Image and Put_Image support for tagged types

2021-07-05 Thread Pierre-Marie de Rodat
Revert to having support for Ada 2022's Image and Put_Image attributes
of tagged types conditional on the -gnatd_z switch (as opposed
to being enabled unconditionally). This is a temporary change, so the
comments in debug.adb about the -gnatd_z switch have not been restored.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_put_image.adb:
(Enable_Put_Image, Preload_Root_Buffer_Type): Revert to querying
the -gnatd_z switch, as opposed to testing whether Ada_Version >= 
Ada_2022.

diff --git a/gcc/ada/exp_put_image.adb b/gcc/ada/exp_put_image.adb
--- a/gcc/ada/exp_put_image.adb
+++ b/gcc/ada/exp_put_image.adb
@@ -26,6 +26,7 @@
 with Aspects;use Aspects;
 with Atree;  use Atree;
 with Csets;  use Csets;
+with Debug;  use Debug;
 with Einfo;  use Einfo;
 with Einfo.Entities; use Einfo.Entities;
 with Einfo.Utils;use Einfo.Utils;
@@ -50,6 +51,9 @@ with Uintp;  use Uintp;
 
 package body Exp_Put_Image is
 
+   Tagged_Put_Image_Enabled : Boolean renames Debug_Flag_Underscore_Z;
+   --  Temporary until we resolve mixing Ada 2012 and 2022 code
+
---
-- Local Subprograms --
---
@@ -933,6 +937,7 @@ package body Exp_Put_Image is
   if Ada_Version < Ada_2022
 or else Is_Remote_Types (Scope (Typ))
 or else (Is_Tagged_Type (Typ) and then In_Predefined_Unit (Typ))
+or else (Is_Tagged_Type (Typ) and then not Tagged_Put_Image_Enabled)
   then
  return False;
   end if;
@@ -1188,6 +1193,7 @@ package body Exp_Put_Image is
 
   if not In_Predefined_Unit (Compilation_Unit)
 and then Ada_Version >= Ada_2022
+and then Tagged_Put_Image_Enabled
 and then Tagged_Seen
 and then not No_Run_Time_Mode
 and then RTE_Available (RE_Root_Buffer_Type)




[Ada] The Unix Epochalypse of 2038 - Use OS_Time

2021-07-05 Thread Pierre-Marie de Rodat
__gnat_set_file_time_name is called from Ada with OS_Time, but the C
function argument is time_t.  This is a violation of the interface rule
that calls to the C parts use OS_Time.  It currently works by accident.
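
A minimal sketch of why this only works by accident, assuming OS_Time is the `long long` typedef from the companion OS_Time patch. `survives_time_t` is a hypothetical helper, not part of adaint.c. The value crosses the interface as OS_Time and is narrowed to time_t on the C side, which preserves it only while it fits in the platform's time_t.

```c
#include <time.h>

/* Assumption: matches the typedef in adaint.h after the OS_Time change.  */
typedef long long OS_Time;

/* Hypothetical helper: nonzero if STAMP keeps its value when narrowed
   to the platform's time_t, as the explicit cast in the patch does.  */
static int survives_time_t (OS_Time stamp)
{
  time_t narrowed = (time_t) stamp;

  return (OS_Time) narrowed == stamp;
}
```

On a target where time_t is 64 bits wide the check always succeeds; where it is 32 bits wide, timestamps past January 2038 do not survive the cast.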

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* adaint.h (__gnat_set_file_time_name): Use OS_Time.
* adaint.c (__gnat_set_file_time_name): Likewise.

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -1570,7 +1570,7 @@ extern long long __gnat_file_time(char* name)
 /* Set the file time stamp.  */
 
 void
-__gnat_set_file_time_name (char *name, time_t time_stamp)
+__gnat_set_file_time_name (char *name, OS_Time time_stamp)
 {
 #if defined (__vxworks)
 
@@ -1606,7 +1606,7 @@ __gnat_set_file_time_name (char *name, time_t time_stamp)
   time_t t;
 
   /* Set modification time to requested time.  */
-  utimbuf.modtime = time_stamp;
+  utimbuf.modtime = (time_t) time_stamp;
 
   /* Set access time to now in local time.  */
   t = time (NULL);


diff --git a/gcc/ada/adaint.h b/gcc/ada/adaint.h
--- a/gcc/ada/adaint.h
+++ b/gcc/ada/adaint.h
@@ -201,7 +201,7 @@ extern OS_Time __gnat_file_time_name(char *);
 extern OS_Time __gnat_file_time_fd  (int);
 /* return -1 in case of error */
 
-extern void   __gnat_set_file_time_name		   (char *, time_t);
+extern void   __gnat_set_file_time_name		   (char *, OS_Time);
 
 extern int__gnat_dup			(int);
 extern int__gnat_dup2			(int, int);




[Ada] The Unix Epochalypse of 2038 - OS_Time

2021-07-05 Thread Pierre-Marie de Rodat
OS_Time is used mainly in the interface to the "C" part of the GNAT RTL.
Change it to a 64-bit signed type on all targets.
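
To illustrate what the wider type buys, here is a small sketch using the same typedef as the adaint.h hunk in this patch. The two helper functions are hypothetical names introduced for the example only.

```c
#include <stdint.h>

/* Same typedef as in adaint.h after this patch.  */
typedef long long OS_Time;

/* Hypothetical helper: nonzero if T still fits in a signed 32-bit
   count of seconds since the Unix epoch.  */
static int fits_in_32_bits (OS_Time t)
{
  return t >= INT32_MIN && t <= INT32_MAX;
}

/* Hypothetical helper: the first second that a signed 32-bit counter
   cannot represent, 2**31 seconds after the epoch
   (2038-01-19 03:14:08 UTC).  */
static OS_Time first_second_past_2038 (void)
{
  return (OS_Time) INT32_MAX + 1;
}
```

With the 64-bit definition, `first_second_past_2038 ()` is an ordinary positive value, whereas a 32-bit signed counter cannot hold it.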

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* adaint.h (OS_Time): typedef as long long.
* osint.adb (Underlying_OS_Time): Declare as 64-bit signed type.
* libgnat/s-os_lib.adb ("<"): Compare OS_Time as
Long_Long_Integer.
("<="): Likewise.
(">"): Likewise.
(">="): Likewise.
* libgnat/s-os_lib.ads (OS_Time): Declare as 64-bit signed type.

diff --git a/gcc/ada/adaint.h b/gcc/ada/adaint.h
--- a/gcc/ada/adaint.h
+++ b/gcc/ada/adaint.h
@@ -101,11 +101,7 @@ extern "C" {
 #endif
 
 /* Type corresponding to GNAT.OS_Lib.OS_Time */
-#if defined (_WIN64)
 typedef long long OS_Time;
-#else
-typedef long OS_Time;
-#endif
 
 #define __int64 long long
 GNAT_STRUCT_STAT;


diff --git a/gcc/ada/libgnat/s-os_lib.adb b/gcc/ada/libgnat/s-os_lib.adb
--- a/gcc/ada/libgnat/s-os_lib.adb
+++ b/gcc/ada/libgnat/s-os_lib.adb
@@ -139,7 +139,7 @@ package body System.OS_Lib is
 
function "<"  (X, Y : OS_Time) return Boolean is
begin
-  return Long_Integer (X) < Long_Integer (Y);
+  return Long_Long_Integer (X) < Long_Long_Integer (Y);
end "<";
 
--
@@ -148,7 +148,7 @@ package body System.OS_Lib is
 
function "<="  (X, Y : OS_Time) return Boolean is
begin
-  return Long_Integer (X) <= Long_Integer (Y);
+  return Long_Long_Integer (X) <= Long_Long_Integer (Y);
end "<=";
 
-
@@ -157,7 +157,7 @@ package body System.OS_Lib is
 
function ">"  (X, Y : OS_Time) return Boolean is
begin
-  return Long_Integer (X) > Long_Integer (Y);
+  return Long_Long_Integer (X) > Long_Long_Integer (Y);
end ">";
 
--
@@ -166,7 +166,7 @@ package body System.OS_Lib is
 
function ">="  (X, Y : OS_Time) return Boolean is
begin
-  return Long_Integer (X) >= Long_Integer (Y);
+  return Long_Long_Integer (X) >= Long_Long_Integer (Y);
end ">=";
 
-


diff --git a/gcc/ada/libgnat/s-os_lib.ads b/gcc/ada/libgnat/s-os_lib.ads
--- a/gcc/ada/libgnat/s-os_lib.ads
+++ b/gcc/ada/libgnat/s-os_lib.ads
@@ -1098,8 +1098,7 @@ private
pragma Import (C, Current_Process_Id, "__gnat_current_process_id");
 
type OS_Time is
- range -(2 ** (Standard'Address_Size - Integer'(1))) ..
-   +(2 ** (Standard'Address_Size - Integer'(1)) - 1);
+ range -(2 ** 63) ..  +(2 ** 63 - 1);
--  Type used for timestamps in the compiler. This type is used to hold
--  time stamps, but may have a different representation than C's time_t.
--  This type needs to match the declaration of OS_Time in adaint.h.


diff --git a/gcc/ada/osint.adb b/gcc/ada/osint.adb
--- a/gcc/ada/osint.adb
+++ b/gcc/ada/osint.adb
@@ -2191,8 +2191,7 @@ package body Osint is
   GNAT_Time : Time_Stamp_Type;
 
   type Underlying_OS_Time is
-range -(2 ** (Standard'Address_Size - Integer'(1))) ..
-  +(2 ** (Standard'Address_Size - Integer'(1)) - 1);
+range -(2 ** 63) ..  +(2 ** 63 - 1);
   --  Underlying_OS_Time is a redeclaration of OS_Time to allow integer
   --  manipulation. Remove this in favor of To_Ada/To_C once newer
   --  GNAT releases are available with these functions.




[Ada] Fix comment about the debug flag for strict alignment

2021-07-05 Thread Pierre-Marie de Rodat
The "strict alignment" compilation mode is controlled by -gnatd.a,
as described in Adjust_Global_Switches.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* ttypes.ads (Target_Strict_Alignment): Fix comment.

diff --git a/gcc/ada/ttypes.ads b/gcc/ada/ttypes.ads
--- a/gcc/ada/ttypes.ads
+++ b/gcc/ada/ttypes.ads
@@ -210,7 +210,7 @@ package Ttypes is
Set_Targ.Strict_Alignment /= 0;
--  True if instructions will fail if data is misaligned. Note that this
--  is a variable rather than a constant since it can be modified (set to
-   --  True) if the debug flag -gnatd.A is used.
+   --  True) if the debug flag -gnatd.a is used.
 
Target_Double_Float_Alignment : constant Nat :=
  Set_Targ.Double_Float_Alignment;




[Ada] Adapt SPARK checking after change in rules regarding heap modeling

2021-07-05 Thread Pierre-Marie de Rodat
Rules in SPARK RM section 3.10 regarding modeling of heap through
dynamic (de)allocation have changed. As a result, the dependency contract
of Ada.Unchecked_Deallocation can be added directly in the sources,
and placement of allocators is not checked in the frontend anymore.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-uncdea.ads: Add Depends/Post to
Ada.Unchecked_Deallocation.
* sem_ch4.adb (Analyze_Allocator): Remove checking of allocator
placement.
* sem_res.adb (Flag_Object): Same.

diff --git a/gcc/ada/libgnat/a-uncdea.ads b/gcc/ada/libgnat/a-uncdea.ads
--- a/gcc/ada/libgnat/a-uncdea.ads
+++ b/gcc/ada/libgnat/a-uncdea.ads
@@ -17,7 +17,10 @@ generic
type Object (<>) is limited private;
type Name is access Object;
 
-procedure Ada.Unchecked_Deallocation (X : in out Name);
+procedure Ada.Unchecked_Deallocation (X : in out Name) with
+  Depends => (X=> null,  --  X on exit does not depend on its input value
+  null => X),--  X's input value has no effect
+  Post => X = null;  --  X's output value is null
 pragma Preelaborate (Unchecked_Deallocation);
 
 pragma Import (Intrinsic, Ada.Unchecked_Deallocation);


diff --git a/gcc/ada/sem_ch4.adb b/gcc/ada/sem_ch4.adb
--- a/gcc/ada/sem_ch4.adb
+++ b/gcc/ada/sem_ch4.adb
@@ -889,16 +889,6 @@ package body Sem_Ch4 is
  Check_Restriction (No_Local_Allocators, N);
   end if;
 
-  if SPARK_Mode = On
-and then Comes_From_Source (N)
-and then not Is_OK_Volatile_Context (Context   => Parent (N),
- Obj_Ref   => N,
- Check_Actuals => False)
-  then
- Error_Msg_N
-   ("allocator cannot appear in this context (SPARK RM 7.1.3(10))", N);
-  end if;
-
   if Serious_Errors_Detected > Sav_Errs then
  Set_Error_Posted (N);
  Set_Etype (N, Any_Type);


diff --git a/gcc/ada/sem_res.adb b/gcc/ada/sem_res.adb
--- a/gcc/ada/sem_res.adb
+++ b/gcc/ada/sem_res.adb
@@ -3753,18 +3753,6 @@ package body Sem_Res is
 
  begin
 case Nkind (N) is
-   when N_Allocator =>
-  if not Is_OK_Volatile_Context (Context   => Parent (N),
- Obj_Ref   => N,
- Check_Actuals => True)
-  then
- Error_Msg_N
-   ("allocator cannot appear in this context"
-& " (SPARK RM 7.1.3(10))", N);
-  end if;
-
-  return Skip;
-
--  Do not consider nested function calls because they have
--  already been processed during their own resolution.
 




[Ada] Move overriding rename error message from declaration to use

2021-07-05 Thread Pierre-Marie de Rodat
Posting the error message on the declaration of the renamed subprogram
is more confusing than posting the message on the name of the renamed
subprogram in the renaming.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch3.adb (Check_Abstract_Overriding): Post error message on
renaming node.

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -11156,7 +11156,8 @@ package body Sem_Ch3 is
 
 if Present (Renamed_Or_Alias (Subp)) then
if not No_Return (Renamed_Or_Alias (Subp)) then
-  Error_Msg_N ("subprogram & must be No_Return",
+  Error_Msg_NE ("subprogram & must be No_Return",
+Subp,
 Renamed_Or_Alias (Subp));
   Error_Msg_N ("\since renaming & overrides No_Return "
 & "subprogram (RM 6.5.1(6/2))",




[Ada] Turn GNAT_Annotate into its own pragma

2021-07-05 Thread Pierre-Marie de Rodat
GNAT_Annotate being an alias of Annotate rather than its own pragma
results in issues for tools that rely on snames to get a list of
available pragmas.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* aspects.ads: Add GNAT_Annotate aspect.
* gnat1drv.adb (Adjust_Global_Switches): Stop defining
Name_Gnat_Annotate as an alias of Name_Annotate.
* snames.ads-tmpl: Define Gnat_Annotate.
* par-prag.adb, sem_prag.ads: Add Pragma_Gnat_Annotate to list
of pragmas.
* lib-writ.adb, sem_ch13.adb, sem_prag.adb: Handle Gnat_Annotate
like Aspect_Annotate.

diff --git a/gcc/ada/aspects.ads b/gcc/ada/aspects.ads
--- a/gcc/ada/aspects.ads
+++ b/gcc/ada/aspects.ads
@@ -100,6 +100,7 @@ package Aspects is
   Aspect_External_Tag,
   Aspect_Ghost, -- GNAT
   Aspect_Global,-- GNAT
+  Aspect_GNAT_Annotate, -- GNAT
   Aspect_Implicit_Dereference,
   Aspect_Initial_Condition, -- GNAT
   Aspect_Initializes,   -- GNAT
@@ -269,6 +270,7 @@ package Aspects is
   Aspect_Favor_Top_Level=> True,
   Aspect_Ghost  => True,
   Aspect_Global => True,
+  Aspect_GNAT_Annotate  => True,
   Aspect_Inline_Always  => True,
   Aspect_Invariant  => True,
   Aspect_Lock_Free  => True,
@@ -318,9 +320,10 @@ package Aspects is
--  the same aspect attached to the same declaration are allowed.
 
No_Duplicates_Allowed : constant array (Aspect_Id) of Boolean :=
- (Aspect_Annotate  => False,
-  Aspect_Test_Case => False,
-  others   => True);
+ (Aspect_Annotate  => False,
+  Aspect_GNAT_Annotate => False,
+  Aspect_Test_Case => False,
+  others   => True);
 
--  The following subtype defines aspects corresponding to library unit
--  pragmas, these can only validly appear as aspects for library units,
@@ -387,6 +390,7 @@ package Aspects is
   Aspect_External_Tag   => Expression,
   Aspect_Ghost  => Optional_Expression,
   Aspect_Global => Expression,
+  Aspect_GNAT_Annotate  => Expression,
   Aspect_Implicit_Dereference   => Name,
   Aspect_Initial_Condition  => Expression,
   Aspect_Initializes=> Expression,
@@ -491,6 +495,7 @@ package Aspects is
   Aspect_External_Tag => False,
   Aspect_Ghost=> False,
   Aspect_Global   => False,
+  Aspect_GNAT_Annotate   => False,
   Aspect_Implicit_Dereference => False,
   Aspect_Initial_Condition=> False,
   Aspect_Initializes  => False,
@@ -647,6 +652,7 @@ package Aspects is
   Aspect_Full_Access_Only => Name_Full_Access_Only,
   Aspect_Ghost=> Name_Ghost,
   Aspect_Global   => Name_Global,
+  Aspect_GNAT_Annotate=> Name_GNAT_Annotate,
   Aspect_Implicit_Dereference => Name_Implicit_Dereference,
   Aspect_Import   => Name_Import,
   Aspect_Independent  => Name_Independent,
@@ -957,6 +963,7 @@ package Aspects is
   Aspect_Extensions_Visible   => Never_Delay,
   Aspect_Ghost=> Never_Delay,
   Aspect_Global   => Never_Delay,
+  Aspect_GNAT_Annotate=> Never_Delay,
   Aspect_Import   => Never_Delay,
   Aspect_Initial_Condition=> Never_Delay,
   Aspect_Initializes  => Never_Delay,


diff --git a/gcc/ada/gnat1drv.adb b/gcc/ada/gnat1drv.adb
--- a/gcc/ada/gnat1drv.adb
+++ b/gcc/ada/gnat1drv.adb
@@ -67,7 +67,6 @@ with Sem_Type;
 with Set_Targ;
 with Sinfo;  use Sinfo;
 with Sinfo.Nodes;use Sinfo.Nodes;
-with Sinfo.Utils;use Sinfo.Utils;
 with Sinput; use Sinput;
 with Sinput.L;   use Sinput.L;
 with Snames; use Snames;
@@ -146,12 +145,6 @@ procedure Gnat1drv is
--  Start of processing for Adjust_Global_Switches
 
begin
-  --  Define pragma GNAT_Annotate as an alias of pragma Annotate, to be
-  --  able to work around bootstrap limitations with the old syntax of
-  --  pragma Annotate, and use pragma GNAT_Annotate in compiler sources
-  --  when needed.
-
-  Map_Pragma_Name (From => Name_Gnat_Annotate, To => Name_Annotate);
 
   --  -gnatd_U disables prepending error messages with "error:"
 


diff --git a/gcc/ada/lib-writ.adb b/gcc/ada/lib-writ.adb
--- a/gcc/ada/lib-writ.adb
+++ b/gcc/ada/lib-writ.adb
@@ -709,7 +709,7 @@ package body Lib.Writ is
   Write_Info_Char (' ');
 
   case Pragma_Name (N) is
-  

[Ada] Cleanup checking for compatible alignment

2021-07-05 Thread Pierre-Marie de Rodat
Code cleanup only related to handling of Address clauses in GNATprove;
semantics is unaffected.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.adb (Has_Compatible_Alignment_Internal): Fix
indentation of ELSIF comments; remove explicit calls to
UI_To_Int; remove extra parens around the MOD operand.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -11820,22 +11820,22 @@ package body Sem_Util is
 Set_Result (Known_Incompatible);
  end if;
 
- --  See if Expr is an object with known alignment
+  --  See if Expr is an object with known alignment
 
   elsif Is_Entity_Name (Expr)
 and then Known_Alignment (Entity (Expr))
   then
  ExpA := Alignment (Entity (Expr));
 
- --  Otherwise, we can use the alignment of the type of
- --  Expr given that we already checked for
- --  discombobulating rep clauses for the cases of indexed
- --  and selected components above.
+  --  Otherwise, we can use the alignment of the type of Expr
+  --  given that we already checked for discombobulating rep
+  --  clauses for the cases of indexed and selected components
+  --  above.
 
   elsif Known_Alignment (Etype (Expr)) then
  ExpA := Alignment (Etype (Expr));
 
- --  Otherwise the alignment is unknown
+  --  Otherwise the alignment is unknown
 
   else
  Set_Result (Default);
@@ -11854,14 +11854,14 @@ package body Sem_Util is
   if Offs /= No_Uint then
  null;
 
- --  See if Expr is an object with known size
+  --  See if Expr is an object with known size
 
   elsif Is_Entity_Name (Expr)
 and then Known_Static_Esize (Entity (Expr))
   then
  SizA := Esize (Entity (Expr));
 
- --  Otherwise, we check the object size of the Expr type
+  --  Otherwise, we check the object size of the Expr type
 
   elsif Known_Static_Esize (Etype (Expr)) then
  SizA := Esize (Etype (Expr));
@@ -11906,25 +11906,24 @@ package body Sem_Util is
--  where we do not know the alignment of Obj.
 
if Known_Alignment (Entity (Expr))
- and then UI_To_Int (Alignment (Entity (Expr))) <
-Ttypes.Maximum_Alignment
+ and then Alignment (Entity (Expr)) < Ttypes.Maximum_Alignment
then
   Set_Result (Unknown);
 
-  --  Now check size of Expr object. Any size that is not an
-  --  even multiple of Maximum_Alignment is also worrisome
-  --  since it may cause the alignment of the object to be less
-  --  than the alignment of the type.
+   --  Now check size of Expr object. Any size that is not an even
+   --  multiple of Maximum_Alignment is also worrisome since it
+   --  may cause the alignment of the object to be less than the
+   --  alignment of the type.
 
elsif Known_Static_Esize (Entity (Expr))
  and then
-   (UI_To_Int (Esize (Entity (Expr))) mod
- (Ttypes.Maximum_Alignment * Ttypes.System_Storage_Unit))
+   Esize (Entity (Expr)) mod
+ (Ttypes.Maximum_Alignment * Ttypes.System_Storage_Unit)
 /= 0
then
   Set_Result (Unknown);
 
-  --  Otherwise same type is decisive
+   --  Otherwise same type is decisive
 
else
   Set_Result (Known_Compatible);




[Ada] Spurious error in instantiation with aggregate and private ancestor

2021-07-05 Thread Pierre-Marie de Rodat
Compiler rejects an instantiation but accepts the corresponding generic
unit, when it includes a declaration for a private type whose full view
is a record type with a controlled component, and the full view of an
object of the type is given by an aggregate with default-initialized
components.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_aggr.adb (Resolve_Record_Aggregate, Step_5): Do not check
for the need to use an extension aggregate for a given component
when within an instance and the type of the component has a
private ancestor: the instantiation is legal if the generic
compiles, and spurious errors may be generated otherwise.

diff --git a/gcc/ada/sem_aggr.adb b/gcc/ada/sem_aggr.adb
--- a/gcc/ada/sem_aggr.adb
+++ b/gcc/ada/sem_aggr.adb
@@ -5028,12 +5028,19 @@ package body Sem_Aggr is
Prepend_Elmt (Parent_Typ, To => Parent_Typ_List);
Parent_Typ := Etype (Parent_Typ);
 
+   --  Check whether a private parent requires the use of
+   --  an extension aggregate. This test does not apply in
+   --  an instantiation: if the generic unit is legal so is
+   --  the instance.
+
if Nkind (Parent (Base_Type (Parent_Typ))) =
 N_Private_Type_Declaration
  or else Nkind (Parent (Base_Type (Parent_Typ))) =
 N_Private_Extension_Declaration
then
-  if Nkind (N) /= N_Extension_Aggregate then
+  if Nkind (N) /= N_Extension_Aggregate
+and then not In_Instance
+  then
  Error_Msg_NE
("type of aggregate has private ancestor&!",
 N, Parent_Typ);




[Ada] Fix crash when printing error message

2021-07-05 Thread Pierre-Marie de Rodat
Missing Chars on N were causing a crash - the solution is to set the
Sloc from N and to use F_type's chars.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* freeze.adb (Freeze_Profile): Use N's Sloc, F_type's chars.

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -4141,9 +4141,10 @@ package body Freeze is
elsif not After_Last_Declaration
  and then not Freezing_Library_Level_Tagged_Type
then
-  Error_Msg_Node_1 := F_Type;
-  Error_Msg_N
-("type & must be fully defined before this point", N);
+  Error_Msg_NE
+("type & must be fully defined before this point",
+ N,
+ F_Type);
end if;
 end if;
 




[Ada] Print JSON continuation messages as separate messages

2021-07-05 Thread Pierre-Marie de Rodat
Printing continuation messages as a single JSON message was an error as
consumers of the JSON output can take advantage of different locations
of the continuation message.
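
The resulting shape can be sketched in C as follows. This is a simplified illustration, not the errout.adb code: `struct msg`, `append` and `emit_message` are hypothetical names, only the message text is emitted (no locations), and no JSON escaping is done.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified entry of the error table.  */
struct msg
{
  const char *text;
  int is_continuation;
};

/* Append S to the NUL-terminated string in OUT without overflowing it.  */
static void append (char *out, size_t size, const char *s)
{
  size_t used = strlen (out);

  if (used + 1 < size)
    strncat (out, s, size - used - 1);
}

/* Append the JSON object for TABLE[I] to OUT.  Continuations that follow
   a primary message become separate objects in its "children" array.
   Returns the index of the next message not yet printed.  */
static size_t emit_message (const struct msg *table, size_t count,
                            size_t i, char *out, size_t size)
{
  /* A continuation never prints further continuations as its children.  */
  int print_children = !table[i].is_continuation;

  append (out, size, "{\"message\":\"");
  append (out, size, table[i].text);
  append (out, size, "\"");
  i++;

  if (print_children && i < count && table[i].is_continuation)
    {
      append (out, size, ",\"children\":[");
      i = emit_message (table, count, i, out, size);

      while (i < count && table[i].is_continuation)
        {
          append (out, size, ",");
          i = emit_message (table, count, i, out, size);
        }

      append (out, size, "]");
    }

  append (out, size, "}");
  return i;
}
```

Each continuation is now a JSON object of its own, so in the real output it can carry its own location instead of being folded into the parent's text.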

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* errout.adb (Output_JSON_Message): Recursively call
Output_JSON_Message for continuation messages instead of
appending their content to the initial message.

diff --git a/gcc/ada/errout.adb b/gcc/ada/errout.adb
--- a/gcc/ada/errout.adb
+++ b/gcc/ada/errout.adb
@@ -2079,6 +2079,9 @@ package body Errout is
 
procedure Output_JSON_Message (Error_Id : Error_Msg_Id) is
 
+  function Is_Continuation (E : Error_Msg_Id) return Boolean;
+  --  Return True if E is a continuation message.
+
   procedure Write_JSON_Escaped_String (Str : String_Ptr);
   --  Write each character of Str, taking care of preceding each quote and
   --  backslash with a backslash. Note that this escaping differs from what
@@ -2099,6 +2102,15 @@ package body Errout is
   --  Span.Last are different from Span.Ptr, they will be printed as JSON
   --  locations under the names "start" and "finish".
 
+  ---
+  --  Is_Continuation  --
+  ---
+
+  function Is_Continuation (E : Error_Msg_Id) return Boolean is
+  begin
+ return E <= Last_Error_Msg and then Errors.Table (E).Msg_Cont;
+  end Is_Continuation;
+
   ---
   -- Write_JSON_Escaped_String --
   ---
@@ -2155,6 +2167,10 @@ package body Errout is
 
   E : Error_Msg_Id := Error_Id;
 
+  Print_Continuations : constant Boolean := not Is_Continuation (E);
+  --  Do not print continuations messages as children of the current
+  --  message if the current message is a continuation message.
+
--  Start of processing for Output_JSON_Message
 
begin
@@ -2186,18 +2202,27 @@ package body Errout is
 
   Write_Str ("],""message"":""");
   Write_JSON_Escaped_String (Errors.Table (E).Text);
-
-  --  Print message continuations if present
+  Write_Str ("""");
 
   E := E + 1;
 
-  while E <= Last_Error_Msg and then Errors.Table (E).Msg_Cont loop
- Write_Str (", ");
- Write_JSON_Escaped_String (Errors.Table (E).Text);
+  if Print_Continuations and then Is_Continuation (E) then
+
+ Write_Str (",""children"": [");
+ Output_JSON_Message (E);
  E := E + 1;
-  end loop;
 
-  Write_Str ("""}");
+ while Is_Continuation (E) loop
+Write_Str (", ");
+Output_JSON_Message (E);
+E := E + 1;
+ end loop;
+
+ Write_Str ("]");
+
+  end if;
+
+  Write_Str ("}");
end Output_JSON_Message;
 
-




[Ada] Refactoring related to Returns_By_Ref

2021-07-05 Thread Pierre-Marie de Rodat
Split out the computation of Returns_By_Ref, to make subsequent changes
easier. General cleanups.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.ads, sem_util.adb (Compute_Returns_By_Ref): New
procedure to compute Returns_By_Ref, to avoid some code
duplication. This will likely change soon, so it's good to have
the code in one place.
(CW_Or_Has_Controlled_Part): Move here from Exp_Ch7, because
it's called by Compute_Returns_By_Ref, and this is a better
place for it anyway.
(Needs_Finalization): Fix comment to be vague instead of wrong.
* exp_ch6.adb (Expand_N_Subprogram_Body, Freeze_Subprogram):
Call Compute_Returns_By_Ref.
* sem_ch6.adb (Check_Delayed_Subprogram): Call
Compute_Returns_By_Ref.
* exp_ch7.ads, exp_ch7.adb (CW_Or_Has_Controlled_Part): Move to
Sem_Util.
(Has_New_Controlled_Component): Remove unused function.

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -6431,18 +6431,7 @@ package body Exp_Ch6 is
   --  Returns_By_Ref flag is normally set when the subprogram is frozen but
   --  subprograms with no specs are not frozen.
 
-  declare
- Typ  : constant Entity_Id := Etype (Spec_Id);
- Utyp : constant Entity_Id := Underlying_Type (Typ);
-
-  begin
- if Is_Limited_View (Typ) then
-Set_Returns_By_Ref (Spec_Id);
-
- elsif Present (Utyp) and then CW_Or_Has_Controlled_Part (Utyp) then
-Set_Returns_By_Ref (Spec_Id);
- end if;
-  end;
+  Compute_Returns_By_Ref (Spec_Id);
 
   --  For a procedure, we add a return for all possible syntactic ends of
   --  the subprogram.
@@ -7851,18 +7840,7 @@ package body Exp_Ch6 is
   --  of the normal semantic analysis of the spec since the underlying
   --  returned type may not be known yet (for private types).
 
-  declare
- Typ  : constant Entity_Id := Etype (Subp);
- Utyp : constant Entity_Id := Underlying_Type (Typ);
-
-  begin
- if Is_Limited_View (Typ) then
-Set_Returns_By_Ref (Subp);
-
- elsif Present (Utyp) and then CW_Or_Has_Controlled_Part (Utyp) then
-Set_Returns_By_Ref (Subp);
- end if;
-  end;
+  Compute_Returns_By_Ref (Subp);
 
   --  Wnen freezing a null procedure, analyze its delayed aspects now
   --  because we may not have reached the end of the declarative list when


diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -5118,15 +5118,6 @@ package body Exp_Ch7 is
   end if;
end Convert_View;
 
-   -------------------------------
-   -- CW_Or_Has_Controlled_Part --
-   -------------------------------
-
-   function CW_Or_Has_Controlled_Part (T : Entity_Id) return Boolean is
-   begin
-  return Is_Class_Wide_Type (T) or else Needs_Finalization (T);
-   end CW_Or_Has_Controlled_Part;
-
    ------------------------
    -- Enclosing_Function --
    ------------------------
@@ -6130,37 +6121,6 @@ package body Exp_Ch7 is
   return Empty;
end Find_Transient_Context;
 
-   ----------------------------------
-   -- Has_New_Controlled_Component --
-   ----------------------------------
-
-   function Has_New_Controlled_Component (E : Entity_Id) return Boolean is
-  Comp : Entity_Id;
-
-   begin
-  if not Is_Tagged_Type (E) then
- return Has_Controlled_Component (E);
-  elsif not Is_Derived_Type (E) then
- return Has_Controlled_Component (E);
-  end if;
-
-  Comp := First_Component (E);
-  while Present (Comp) loop
- if Chars (Comp) = Name_uParent then
-null;
-
- elsif Scope (Original_Record_Component (Comp)) = E
-   and then Needs_Finalization (Etype (Comp))
- then
-return True;
- end if;
-
- Next_Component (Comp);
-  end loop;
-
-  return False;
-   end Has_New_Controlled_Component;
-
    ---------------------------------
    -- Has_Simple_Protected_Object --
    ---------------------------------


diff --git a/gcc/ada/exp_ch7.ads b/gcc/ada/exp_ch7.ads
--- a/gcc/ada/exp_ch7.ads
+++ b/gcc/ada/exp_ch7.ads
@@ -153,17 +153,6 @@ package Exp_Ch7 is
--  triggered by an abort, E_Id denotes the defining identifier of a local
--  exception occurrence, Raised_Id is the entity of a local boolean flag.
 
-   function CW_Or_Has_Controlled_Part (T : Entity_Id) return Boolean;
-   --  True if T is a class-wide type, or if it has controlled parts ("part"
-   --  means T or any of its subcomponents). Same as Needs_Finalization, except
-   --  when pragma Restrictions (No_Finalization) applies, in which case we
-   --  know that class-wide objects do not contain controlled parts.
-
-   function Has_New_Controlled_Component (E : Entity_Id) return Boolean;
-   --  E is a type entity. 

[Ada] Do not catch 'N rem -1' in CodePeer_Mode

2021-07-05 Thread Pierre-Marie de Rodat
The special case used for catching the 'rem -1' operation is not useful
to CodePeer, and in fact may be detrimental to its precision. Remove
it in CodePeer_Mode.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch4.adb (Expand_N_Op_Rem): Remove special case for rem -1
in CodePeer_Mode.

diff --git a/gcc/ada/exp_ch4.adb b/gcc/ada/exp_ch4.adb
--- a/gcc/ada/exp_ch4.adb
+++ b/gcc/ada/exp_ch4.adb
@@ -10393,7 +10393,9 @@ package body Exp_Ch4 is
   --  types and this is really marginal). We will just assume that we need
   --  the test if the left operand can be negative at all.
 
-  if Lneg and Rneg then
+  if (Lneg and Rneg)
+ and then not CodePeer_Mode
+  then
  Rewrite (N,
Make_If_Expression (Loc,
  Expressions => New_List (
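
For readers who do not speak Ada, here is a rough C++ sketch of what a guard for "rem -1" amounts to. It is not the GNAT expansion itself: `guarded_rem` is a hypothetical helper, and the sketch assumes the point of the special case is to keep the most negative integer away from a division by -1, which overflows on some targets.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper, not GNAT code: computes l rem r for r != 0 and
// short-circuits r == -1 so that INT_MIN is never divided by -1.
int guarded_rem (int l, int r)
{
  assert (r != 0);   // division by zero is out of scope for this sketch
  if (r == -1)
    return 0;        // l rem -1 is 0 for every l, including INT_MIN
  return l % r;      // C++ '%' truncates toward zero, like Ada "rem"
}
```

A static analyzer does not need the extra branch to reason about the result, which would be consistent with the message's point that the special case only costs CodePeer precision.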




[Ada] Fix overriding subprogram being incorrectly seen as returning

2021-07-05 Thread Pierre-Marie de Rodat
Before this commit, GNAT failed to notice that subprograms overriding
non-returning subprograms could be renamings of non-returning
subprograms and thus wrongfully emitted an error.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch3.adb (Check_Abstract_Overriding): Check for renamings.

diff --git a/gcc/ada/sem_ch3.adb b/gcc/ada/sem_ch3.adb
--- a/gcc/ada/sem_ch3.adb
+++ b/gcc/ada/sem_ch3.adb
@@ -11149,12 +11149,28 @@ package body Sem_Ch3 is
 
  if Present (Overridden_Operation (Subp))
and then No_Return (Overridden_Operation (Subp))
-   and then not No_Return (Subp)
  then
-Error_Msg_N ("overriding subprogram & must be No_Return", Subp);
-Error_Msg_N
-  ("\since overridden subprogram is No_Return (RM 6.5.1(6/2))",
-   Subp);
+
+--  If the subprogram is a renaming, check that the renamed
+--  subprogram is No_Return.
+
+if Present (Renamed_Or_Alias (Subp)) then
+   if not No_Return (Renamed_Or_Alias (Subp)) then
+  Error_Msg_N ("subprogram & must be No_Return",
+Renamed_Or_Alias (Subp));
+  Error_Msg_N ("\since renaming & overrides No_Return "
+& "subprogram (RM 6.5.1(6/2))",
+Subp);
+   end if;
+
+--  Make sure that the subprogram itself is No_Return.
+
+elsif not No_Return (Subp) then
+   Error_Msg_N ("overriding subprogram & must be No_Return", Subp);
+   Error_Msg_N
+ ("\since overridden subprogram is No_Return (RM 6.5.1(6/2))",
+  Subp);
+end if;
  end if;
 
  --  If the operation is a wrapper for a synchronized primitive, it




Re: [PATCH v4] ira: Support more matching constraint forms with param [PR100328]

2021-07-05 Thread Vladimir Makarov via Gcc-patches



On 2021-07-01 10:11 p.m., Kewen.Lin wrote:

Hi Vladimir,

on 2021/6/30 11:24 PM, Vladimir Makarov wrote:


Many thanks for your review!  I've updated the patch according to your comments 
and also polished some comments and document words a bit.  Does it look better 
to you?

Sorry for the delay with the answer.  The patch is better for me now and 
can be committed into the trunk.


Thanks again for working on this performance issue.




Re: [PATCH] X86: Provide a CTOR for stringop_algs [PR100246].

2021-07-05 Thread Iain Sandoe
Hi Richard,

> On 5 Jul 2021, at 11:50, Richard Biener via Gcc-patches 
>  wrote:
> 
> On Sun, Jul 4, 2021 at 10:04 PM Iain Sandoe  wrote:

>> Several older compilers fail to build modern GCC because of missing
>> or incomplete C++11 support.
>> 
>> (although the PR mentions clang, specifically, this has also been reported
>> for some GCC versions within the range that should be able to bootstrap
>> GCC)
>> 
>> There are several possible solutions proposed in the PR, this one seems
>> the least invasive.
>> 
>> The header is pulled into the gcov code that builds with C, so we have to
>> make the CTOR conditional on C++.
>> 
>> tested on Darwin12 with xcode-6, bootstrapped on x86_64-darwin and linux.
>> OK for master / GCC-11?
> 
> Hmm, what is specifically built with a C compiler?  gcov.c not, I think.

any C compilation that includes tm.h

well, libgcc2 fails too on a quick check here -  but ISTR there was something in
libgcov and I checked with Martin that it was intentionally compiled with C 
compiler.

> Instead of commenting the CTOR, does it work to comment the whole 
> stringop_algs
> type?

I don’t think that will work because it’s in a header that’s transitively 
included by tm.h
which is then included loads of places.

>  Also it seems on trunk this CTOR is no more?

The addition of the CTOR is the fix for the C++ compile fail in the PR, the 
conditional is
only there because the same header is compiled by C and C++.

thanks
Iain
> 
>> thanks
>> Iain
>> 
>> Signed-off-by: Iain Sandoe 
>> 
>> PR bootstrap/100246 - [11/12 Regression] GCC will not bootstrap with clang 
>> 3.4/3.5 [xcode 5/6, Darwin 12/13]
>> 
>>PR bootstrap/100246
>> 
>> gcc/ChangeLog:
>> 
>>* config/i386/i386.h (struct stringop_algs): Define a CTOR for
>>this type.
>> ---
>> gcc/config/i386/i386.h | 5 +
>> 1 file changed, 5 insertions(+)
>> 
>> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
>> index 6e0340a4b60..84151156999 100644
>> --- a/gcc/config/i386/i386.h
>> +++ b/gcc/config/i386/i386.h
>> @@ -73,6 +73,11 @@ struct stringop_algs
>> {
>>   const enum stringop_alg unknown_size;
>>   const struct stringop_strategy {
>> +#ifdef __cplusplus
>> +stringop_strategy(int _max = -1, enum stringop_alg _alg = libcall,
>> + int _noalign = false)
>> +  : max (_max), alg (_alg), noalign (_noalign) {}
>> +#endif
>> const int max;
>> const enum stringop_alg alg;
>> int noalign;
>> --
>> 2.24.1
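
To make the shape of the fix easier to see outside the diff, here is a small self-contained sketch. All names ending in `_sketch` are made up for illustration and are not the i386.h declarations; the assumption is simply that a nested struct with const members gets a constructor that only C++ translation units see, so that partially initialised tables still compile.

```cpp
#include <cassert>

enum alg_sketch { libcall_sketch, loop_sketch };

struct strategy_sketch
{
#ifdef __cplusplus
  // Hidden from C: a C compiler including the same header skips this.
  strategy_sketch (int max_ = -1, alg_sketch alg_ = libcall_sketch,
                   int noalign_ = 0)
    : max (max_), alg (alg_), noalign (noalign_) {}
#endif
  const int max;
  const enum alg_sketch alg;
  int noalign;
};

struct algs_sketch
{
  const enum alg_sketch unknown_size;
  const strategy_sketch size[4];
};

// Only the first table entry is spelled out; the remaining three are
// built by the constructor's default arguments.
static const algs_sketch example_algs
  = { loop_sketch, { { 16, loop_sketch, 0 } } };

inline void check_example_algs ()
{
  assert (example_algs.size[0].max == 16);
  assert (example_algs.size[1].max == -1);
}
```

The `#ifdef __cplusplus` guard is what lets one header serve both languages: C sees a plain struct, C++ sees the same layout plus a constructor.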



Re: [PATCH] Port GCC documentation to Sphinx

2021-07-05 Thread Richard Sandiford via Gcc-patches
Eli Zaretskii  writes:
>> Hans-Peter Nilsson  writes:
>> > I've read the discussion downthread, but I seem to miss (a recap
>> > of) the benefits of moving to Sphinx.  Maybe others have too and
>> > it'd be a good idea to repeat them?  Otherwise, the impression
>> > is not so good, as all I see is bits here and there getting lost
>> > in translation.
>> 
>> Better cross-referencing is one big feature.
>
> See below: the Info format has some features in addition to
> cross-references that can make this a much smaller issue.  HTML has
> just the cross-references, so "when you are a hammer, every problem
> looks like a nail".
>
>> IMO this subthread has demonstrated why the limitations of info
>> formatting have held back the amount of cross-referencing in the
>> online html.
>
> I disagree with this conclusion, see below.
>
>> (And based on empirical evidence, I get the impression that far more
>> people use the online html docs than the info docs.)
>
> HTML browsers currently lack some features that make Info the format
> of choice for me when I need to use the documentation efficiently.
> The most important feature I miss in HTML browsers is the index
> search.  A good manual usually has extensive index (or indices) which
> make it very easy to find a specific topic one is looking for,
> i.e. use the manual as a reference (as opposed to a first-time
> reading, when you read large portions of the manual in sequence).
>
> Another important feature is regexp search across multiple sections
> (with HTML you'd be forced to download the manual as a single large
> file for that, and then you'll probably miss regexps).
>
> Yet another feature which, when needed, is something to kill for, is
> the "info apropos" command, which can search all the manuals on your
> system and build a menu from the matching sections found in different
> manuals.  And there are a few more.
>
> (Texinfo folks are working on JavaScript code to add some missing
> capabilities to Web browsers, but that effort is not yet complete.)

Whether info or HTML is the better format isn't the issue though.  The
point is that we do have HTML output that is (empirically) widely used.
And at the moment it isn't as good as it could be.

The question that I was replying to was: what is the benefit of moving
to Sphinx?  And one of the answers is that it improves the HTML output.

>> E.g. quoting from Richard's recent patch:
>> 
>>   @item -fmove-loop-stores
>>   @opindex fmove-loop-stores
>>   Enables the loop store motion pass in the GIMPLE loop optimizer.  This
>>   moves invariant stores to after the end of the loop in exchange for
>>   carrying the stored value in a register across the iteration.
>>   Note for this option to have an effect @option{-ftree-loop-im} has to 
>>   be enabled as well.  Enabled at level @option{-O1} and higher, except 
>>   for @option{-Og}.
>> 
>> In the online docs, this will just be plain text.  Anyone who doesn't
>> know what -ftree-loop-im is will have to search for it manually.
>
> First, even if there are no cross-references, manual search is not the
> best way.  It is much easier to use index-search:
>
>   i ftree TAB
>
> will display a list of options that you could be after, and you can
> simply choose from the list, or type a bit more until you have a
> single match.

Here too I was talking about this being plain text in the online docs,
i.e. in the HTML.

In HTML the user-friendly way of letting users answer the question
“what on earth is -ftree-loop-im” is to have “-ftree-loop-im” be a
hyperlink that goes straight to the documentation of the option.
Same for PDF when viewed digitally.

One of the things that the move to Sphinx does is give us those
hyperlinks.

> Moreover, adding cross-references is easy:
>
>   @item -fmove-loop-stores
>   @opindex fmove-loop-stores
>   Enables the loop store motion pass in the GIMPLE loop optimizer.  This
>   moves invariant stores to after the end of the loop in exchange for
>   carrying the stored value in a register across the iteration.
>   Note for this option to have an effect @option{-ftree-loop-im}
>   (@pxref{Optimize Options, -ftree-loop-im}) 
>   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>   has to be enabled as well.  Enabled at level @option{-O1} and higher,
>   except for @option{-Og}.
>
> If this looks like too much work, a simple Texinfo macro (two, if you
> want an anchor where you point) will do.

But this would be redundant in HTML: “foo (see foo)”.

Also, the benefit of hyperlinks in HTML (not info) is that they can be
used outside of prose, such as in lists, without interrupting the flow.

>> Adding the extra references to the html (and pdf) output but dropping
>> them from the info sounds like a good compromise.
>
> But that's not what happens.

Not in the original patch, sure, but it's what I think Martin was
suggesting as a compromise (maybe I misunderstood).  The comment above
was supposed to be in support of doing that.

It sounds like those who use the 

Re: [PATCH] Port GCC documentation to Sphinx

2021-07-05 Thread Eli Zaretskii via Gcc-patches
> From: Richard Sandiford 
> Cc: Eli Zaretskii ,  g...@gcc.gnu.org,  
> gcc-patches@gcc.gnu.org,  jos...@codesourcery.com
> Date: Mon, 05 Jul 2021 10:17:38 +0100
> 
> Hans-Peter Nilsson  writes:
> > I've read the discussion downthread, but I seem to miss (a recap
> > of) the benefits of moving to Sphinx.  Maybe others have too and
> > it'd be a good idea to repeat them?  Otherwise, the impression
> > is not so good, as all I see is bits here and there getting lost
> > in translation.
> 
> Better cross-referencing is one big feature.

See below: the Info format has some features in addition to
cross-references that can make this a much smaller issue.  HTML has
just the cross-references, so "when you are a hammer, every problem
looks like a nail".

> IMO this subthread has demonstrated why the limitations of info
> formatting have held back the amount of cross-referencing in the
> online html.

I disagree with this conclusion, see below.

> (And based on empirical evidence, I get the impression that far more
> people use the online html docs than the info docs.)

HTML browsers currently lack some features that make Info the format
of choice for me when I need to use the documentation efficiently.
The most important feature I miss in HTML browsers is the index
search.  A good manual usually has extensive index (or indices) which
make it very easy to find a specific topic one is looking for,
i.e. use the manual as a reference (as opposed to a first-time
reading, when you read large portions of the manual in sequence).

Another important feature is regexp search across multiple sections
(with HTML you'd be forced to download the manual as a single large
file for that, and then you'll probably miss regexps).

Yet another feature which, when needed, is something to kill for, is
the "info apropos" command, which can search all the manuals on your
system and build a menu from the matching sections found in different
manuals.  And there are a few more.

(Texinfo folks are working on JavaScript code to add some missing
capabilities to Web browsers, but that effort is not yet complete.)

> E.g. quoting from Richard's recent patch:
> 
>   @item -fmove-loop-stores
>   @opindex fmove-loop-stores
>   Enables the loop store motion pass in the GIMPLE loop optimizer.  This
>   moves invariant stores to after the end of the loop in exchange for
>   carrying the stored value in a register across the iteration.
>   Note for this option to have an effect @option{-ftree-loop-im} has to 
>   be enabled as well.  Enabled at level @option{-O1} and higher, except 
>   for @option{-Og}.
> 
> In the online docs, this will just be plain text.  Anyone who doesn't
> know what -ftree-loop-im is will have to search for it manually.

First, even if there are no cross-references, manual search is not the
best way.  It is much easier to use index-search:

  i ftree TAB

will display a list of options that you could be after, and you can
simply choose from the list, or type a bit more until you have a
single match.

Moreover, adding cross-references is easy:

  @item -fmove-loop-stores
  @opindex fmove-loop-stores
  Enables the loop store motion pass in the GIMPLE loop optimizer.  This
  moves invariant stores to after the end of the loop in exchange for
  carrying the stored value in a register across the iteration.
  Note for this option to have an effect @option{-ftree-loop-im}
  (@pxref{Optimize Options, -ftree-loop-im}) 
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  has to be enabled as well.  Enabled at level @option{-O1} and higher,
  except for @option{-Og}.

If this looks like too much work, a simple Texinfo macro (two, if you
want an anchor where you point) will do.

> Adding the extra references to the html (and pdf) output but dropping
> them from the info sounds like a good compromise.

But that's not what happens.  And besides, how would you decide which
cross-references to drop and which to retain in Info?


[COMMITTED] [PATCH] testsuite: gcc.dg/debug/btf/btf-bitfields-3.c requires -fno-short-enums PR debug/101321

2021-07-05 Thread Christophe LYON - foss via Gcc-patches
arm-eabi uses -fshort-enums by default while arm-linux-gnueabi* do not,
like most (all?) other targets, but this test relies on -fno-short-enums.
Fix it by forcing -fno-short-enums.

2021-07-05  Christophe Lyon  

PR debug/101321
gcc/testsuite/
* gcc.dg/debug/btf/btf-bitfields-3.c: Add -fno-short-enums.
---
 gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c b/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c
index 440623c3b16..5e68416e2c2 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c
@@ -15,7 +15,7 @@
*/

 /* { dg-do compile } */
-/* { dg-options "-O0 -gbtf -dA" } */
+/* { dg-options "-O0 -gbtf -dA -fno-short-enums" } */

 /* Enum with 4 members.  */
 /* { dg-final { scan-assembler-times "\[\t \]0x6000004\[\t \]+\[^\n\]*btt_info" 1 } } */

From 0ea47850bbb38ea81a34c503533d4dd0f3391f19 Mon Sep 17 00:00:00 2001
From: Christophe Lyon 
Date: Mon, 5 Jul 2021 11:33:45 +
Subject: [PATCH] testsuite: gcc.dg/debug/btf/btf-bitfields-3.c requires
 -fno-short-enums PR debug/101321

arm-eabi uses -fshort-enums by default while arm-linux-gnueabi* do not,
like most (all?) other targets, but this test relies on -fno-short-enums.
Fix it by forcing -fno-short-enums.

2021-07-05  Christophe Lyon  

	PR debug/101321
	gcc/testsuite/
	* gcc.dg/debug/btf/btf-bitfields-3.c: Add -fno-short-enums.
---
 gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c b/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c
index 440623c3b16..5e68416e2c2 100644
--- a/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-bitfields-3.c
@@ -15,7 +15,7 @@
*/
 
 /* { dg-do compile } */
-/* { dg-options "-O0 -gbtf -dA" } */
+/* { dg-options "-O0 -gbtf -dA -fno-short-enums" } */
 
 /* Enum with 4 members.  */
 /* { dg-final { scan-assembler-times "\[\t \]0x6000004\[\t \]+\[^\n\]*btt_info" 1 } } */
-- 
2.25.1
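
As a quick illustration of why the option matters, the sketch below reports the storage size the compiler picks for a four-member enum. The names are made up for this example and are not taken from the test; the assumption is only that with -fshort-enums the enum shrinks to the smallest integer type that fits, while without it the enum is int-sized on typical targets.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical enum with four members.
enum four_members { M0, M1, M2, M3 };

// Number of bytes the compiler chose for the enum's storage.
std::size_t enum_storage_bytes () { return sizeof (four_members); }

inline void check_enum_storage ()
{
  // Holds under either setting: small with short enums, at most
  // int-sized otherwise.
  assert (enum_storage_bytes () >= 1);
  assert (enum_storage_bytes () <= sizeof (int));
}
```

Compiling this once with -fshort-enums and once with -fno-short-enums and comparing the value returned by `enum_storage_bytes` shows the difference that the test depends on.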



[PATCH] Do not set both LOOP_C_INFINITE and LOOP_C_FINITE on vectorized loop

2021-07-05 Thread Richard Biener
The setting is likely a typo and was meant to affect the scalar version
but even there LOOP_C_INFINITE is at most an optimization to the
niter analysis.  Clearly setting it on the vectorized loop which we
just versioned to be _not_ infinite is bogus so the following change
removes this.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2021-07-05  Richard Biener  

* tree-vect-loop-manip.c (vect_loop_versioning): Do not
set LOOP_C_INFINITE on the vectorized loop.
---
 gcc/tree-vect-loop-manip.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 012f48bd487..2909e8a0fc3 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -3597,8 +3597,6 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
 niter information which is copied from the original loop.  */
   gcc_assert (loop_constraint_set_p (loop, LOOP_C_FINITE));
   vect_free_loop_info_assumptions (nloop);
-  /* And set constraint LOOP_C_INFINITE for niter analyzer.  */
-  loop_constraint_set (loop, LOOP_C_INFINITE);
 }
 
   if (LOCATION_LOCUS (vect_location.get_location_t ()) != UNKNOWN_LOCATION
-- 
2.26.2


Re: [PATCH 5/5] Port most of the A CMP 0 ? A : -A to match

2021-07-05 Thread Richard Biener via Gcc-patches
On Sun, Jul 4, 2021 at 8:42 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> To improve phiopt and be able to remove abs_replacement, this ports
> most of "A CMP 0 ? A : -A" from fold_cond_expr_with_comparison to
> match.pd.  There is a few extra changes that are needed to remove
> the "A CMP 0 ? A : -A" part from fold_cond_expr_with_comparison:
>* Need to handle (A - B) case
>* Need to handle UN* comparisons.
>
> I will handle those in a different patch.
>
> Note phi-opt-15.c test needed to be updated as we get ABSU now
> instead of not getting ABS.  When ABSU was added phiopt was not
> updated even to use ABSU instead of not creating ABS.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK

> gcc/ChangeLog:
>
> PR tree-optimization/101039
> * match.pd (A CMP 0 ? A : -A): New patterns.
> * tree-ssa-phiopt.c (abs_replacement): Delete function.
> (tree_ssa_phiopt_worker): Don't call abs_replacement.
> Update comment about abs_replacement.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/101039
> * gcc.dg/tree-ssa/phi-opt-15.c: Update test to expect
> ABSU and still not expect ABS_EXPR.
> * gcc.dg/tree-ssa/phi-opt-23.c: New test.
> * gcc.dg/tree-ssa/phi-opt-24.c: New test.
> ---
>  gcc/match.pd   |  60 +
>  gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c |   4 +-
>  gcc/testsuite/gcc.dg/tree-ssa/phi-opt-23.c |  44 +++
>  gcc/testsuite/gcc.dg/tree-ssa/phi-opt-24.c |  44 +++
>  gcc/tree-ssa-phiopt.c  | 134 +
>  5 files changed, 152 insertions(+), 134 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-23.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/phi-opt-24.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 4e10d54383c..72860fbd448 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3976,6 +3976,66 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(cnd (logical_inverted_value truth_valued_p@0) @1 @2)
>(cnd @0 @2 @1)))
>
> +/* abs/negative simplifications moved from fold_cond_expr_with_comparison,
> +   Need to handle (A - B) case as fold_cond_expr_with_comparison does.
> +   Need to handle UN* comparisons.
> +
> +   None of these transformations work for modes with signed
> +   zeros.  If A is +/-0, the first two transformations will
> +   change the sign of the result (from +0 to -0, or vice
> +   versa).  The last four will fix the sign of the result,
> +   even though the original expressions could be positive or
> +   negative, depending on the sign of A.
> +
> +   Note that all these transformations are correct if A is
> +   NaN, since the two alternatives (A and -A) are also NaNs.  */
> +
> +(for cnd (cond vec_cond)
> + /* A == 0 ? A : -A    same as -A */
> + (for cmp (eq uneq)
> +  (simplify
> +   (cnd (cmp @0 zerop) @0 (negate@1 @0))
> +(if (!HONOR_SIGNED_ZEROS (type))
> + @1))
> +  (simplify
> +   (cnd (cmp @0 zerop) integer_zerop (negate@1 @0))
> +(if (!HONOR_SIGNED_ZEROS (type))
> + @1))
> + )
> + /* A != 0 ? A : -A    same as A */
> + (for cmp (ne ltgt)
> +  (simplify
> +   (cnd (cmp @0 zerop) @0 (negate @0))
> +(if (!HONOR_SIGNED_ZEROS (type))
> + @0))
> +  (simplify
> +   (cnd (cmp @0 zerop) @0 integer_zerop)
> +(if (!HONOR_SIGNED_ZEROS (type))
> + @0))
> + )
> + /* A >=/> 0 ? A : -A    same as abs (A) */
> + (for cmp (ge gt)
> +  (simplify
> +   (cnd (cmp @0 zerop) @0 (negate @0))
> +(if (!HONOR_SIGNED_ZEROS (type)
> +&& !TYPE_UNSIGNED (type))
> + (abs @0))))
> + /* A <=/< 0 ? A : -A    same as -abs (A) */
> + (for cmp (le lt)
> +  (simplify
> +   (cnd (cmp @0 zerop) @0 (negate @0))
> +(if (!HONOR_SIGNED_ZEROS (type)
> +&& !TYPE_UNSIGNED (type))
> + (if (ANY_INTEGRAL_TYPE_P (type)
> + && !TYPE_OVERFLOW_WRAPS (type))
> +  (with {
> +   tree utype = unsigned_type_for (type);
> +   }
> +   (convert (negate (absu:utype @0))))
> +   (negate (abs @0)))))
> + )
> +)
> +
>  /* -(type)!A -> (type)A - 1.  */
>  (simplify
>   (negate (convert?:s (logical_inverted_value:s @0)))
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c
> index ac3018ef533..6aec68961cf 100644
> --- a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-15.c
> @@ -9,4 +9,6 @@ foo (int i)
>return i;
>  }
>
> -/* { dg-final { scan-tree-dump-not "ABS" "optimized" } } */
> +/* We should not have ABS_EXPR but ABSU_EXPR instead. */
> +/* { dg-final { scan-tree-dump-not "ABS_EXPR" "optimized" } } */
> +/* { dg-final { scan-tree-dump "ABSU" "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-23.c b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-23.c
> new file mode 100644
> index 00000000000..ff658cd16a7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/phi-opt-23.c
> @@ -0,0 +1,44 @@
> +/* 
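
The identities in the quoted match.pd comment can be checked by hand with a small C++ sketch. This is an illustration rather than anything from the patch: the function names are made up, and the unsigned round trip is my reading of why the integer case goes through ABSU instead of ABS. The conversion back to int relies on modular behaviour, which GCC provides and C++20 guarantees.

```cpp
#include <cassert>
#include <climits>
#include <cmath>

// A <= 0 ? A : -A, written literally.  Defined for every int, INT_MIN
// included, because the negation only runs for positive A.
int select_neg_abs (int a) { return a <= 0 ? a : -a; }

// A naive -abs (A) would evaluate abs (INT_MIN), which overflows.
// Sketch of the ABSU route: take |A| in the unsigned type, negate
// there, then convert back to the signed type.
int neg_abs_via_unsigned (int a)
{
  unsigned ua = a < 0 ? 0u - static_cast<unsigned> (a)
                      : static_cast<unsigned> (a);
  return static_cast<int> (0u - ua);
}

// A == 0 ? A : -A on a floating-point type, written literally.
double select_eq_zero (double a) { return a == 0 ? a : -a; }

void check_small_range ()
{
  for (int a = -4; a <= 4; ++a)
    assert (select_neg_abs (a) == neg_abs_via_unsigned (a));
}
```

For +0.0 the literal select returns +0.0 while plain negation gives -0.0, which is the sign flip the comment warns about and the reason the patterns are guarded by !HONOR_SIGNED_ZEROS.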

Re: [PATCH 3/5] Allow match-and-simplified phiopt to run in early phiopt

2021-07-05 Thread Richard Biener via Gcc-patches
On Sun, Jul 4, 2021 at 8:41 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> To move a few things more to match-and-simplify from phiopt,
> we need to allow match_simplify_replacement to run in early
> phiopt. To do this we add a replacement for gimple_simplify
> that is explicitly for phiopt.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no
> regressions.

OK.

Richard.

> gcc/ChangeLog:
>
> * tree-ssa-phiopt.c (match_simplify_replacement):
> Add early_p argument. Call gimple_simplify_phiopt
> instead of gimple_simplify.
> (tree_ssa_phiopt_worker): Update call to
> match_simplify_replacement and allow unconditionally.
> (phiopt_early_allow): New function.
> (gimple_simplify_phiopt): New function.
> ---
>  gcc/tree-ssa-phiopt.c | 89 ++-
>  1 file changed, 70 insertions(+), 19 deletions(-)
>
> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> index 71f0019d877..d4449afcdca 100644
> --- a/gcc/tree-ssa-phiopt.c
> +++ b/gcc/tree-ssa-phiopt.c
> @@ -50,13 +50,14 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-fold.h"
>  #include "internal-fn.h"
>  #include "gimple-range.h"
> +#include "gimple-match.h"
>  #include "dbgcnt.h"
>
>  static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
>  static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
>tree, tree);
>  static bool match_simplify_replacement (basic_block, basic_block,
> -   edge, edge, gphi *, tree, tree);
> +   edge, edge, gphi *, tree, tree, bool);
>  static gphi *factor_out_conditional_conversion (edge, edge, gphi *, tree, 
> tree,
> gimple *);
>  static int value_replacement (basic_block, basic_block,
> @@ -347,9 +348,9 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool 
> do_hoist_loads, bool early_p)
>   /* Do the replacement of conditional if it can be done.  */
>   if (!early_p && two_value_replacement (bb, bb1, e2, phi, arg0, 
> arg1))
> cfgchanged = true;
> - else if (!early_p
> -  && match_simplify_replacement (bb, bb1, e1, e2, phi,
> - arg0, arg1))
> + else if (match_simplify_replacement (bb, bb1, e1, e2, phi,
> +  arg0, arg1,
> +  early_p))
> cfgchanged = true;
>   else if (abs_replacement (bb, bb1, e1, e2, phi, arg0, arg1))
> cfgchanged = true;
> @@ -819,6 +820,67 @@ two_value_replacement (basic_block cond_bb, basic_block 
> middle_bb,
>return true;
>  }
>
> +/* Return TRUE if CODE should be allowed during early phiopt.
> +   Currently this is to allow MIN/MAX and ABS/NEGATE.  */
> +static bool
> +phiopt_early_allow (enum tree_code code)
> +{
> +  switch (code)
> +{
> +  case MIN_EXPR:
> +  case MAX_EXPR:
> +  case ABS_EXPR:
> +  case ABSU_EXPR:
> +  case NEGATE_EXPR:
> +  case SSA_NAME:
> +   return true;
> +  default:
> +   return false;
> +}
> +}
> +
> +/* gimple_simplify_phiopt is like gimple_simplify but designed for PHIOPT.
> +   Return NULL if nothing can be simplified or the resulting simplified value
> +   with parts pushed if EARLY_P was true. Also rejects non allowed tree code
> +   if EARLY_P is set.
> +   Takes the comparison from COMP_STMT and two args, ARG0 and ARG1 and tries
> +   to simplify CMP ? ARG0 : ARG1.  */
> +static tree
> +gimple_simplify_phiopt (bool early_p, tree type, gimple *comp_stmt,
> +   tree arg0, tree arg1,
> +   gimple_seq *seq)
> +{
> +  tree result;
> +  enum tree_code comp_code = gimple_cond_code (comp_stmt);
> +  location_t loc = gimple_location (comp_stmt);
> +  tree cmp0 = gimple_cond_lhs (comp_stmt);
> +  tree cmp1 = gimple_cond_rhs (comp_stmt);
> +  /* To handle special cases like floating point comparison, it is easier and
> + less error-prone to build a tree and gimplify it on the fly though it is
> + less efficient.
> + Don't use fold_build2 here as that might create (bool)a instead of just
> + "a != 0".  */
> +  tree cond = build2_loc (loc, comp_code, boolean_type_node,
> + cmp0, cmp1);
> +  gimple_match_op op (gimple_match_cond::UNCOND,
> + COND_EXPR, type, cond, arg0, arg1);
> +
> +  if (op.resimplify (early_p ? NULL : seq, follow_all_ssa_edges))
> +{
> +  /* Early we want only to allow some generated tree codes. */
> +  if (!early_p
> + || op.code.is_tree_code ()
> + || phiopt_early_allow ((tree_code)op.code))
> +   {
> + result = maybe_push_res_to_seq (&op, seq);
> + if (result)
> +   return result;
> +   }
> +}
> +
> +  

Re: [PATCH 1/5] Fix 101256: Wrong code due to range incorrect from PHI-OPT

2021-07-05 Thread Richard Biener via Gcc-patches
On Sun, Jul 4, 2021 at 8:40 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> So the problem here is that replace_phi_edge_with_variable
> will copy range information to an already (not newly) defined
> ssa name.  This causes wrong code later on.

That's a bit too conservative I guess?  Shouldn't it work for at least
all defs defined in the same block as the original conditional (and
thus then applying to the seq inserted there by the callers)?

I realize it's wrong for, say

  _1 = ..
  if (_1 != 0)
    {
      ...
      if (..)
        ;
      # _2 = PHI <_1, 1>
      ...
    }

with _2 having range [1, +INF] but clearly not _1 at the point of its
definition.

Richard.

> This patch fixes the problem by requiring there to be statements
> that are to be placed before the conditional to be able to
> copy the range info; this assumes the statements will define
> the ssa name.
>
> gcc/ChangeLog:
>
> PR tree-optimization/101256
> * dbgcnt.def (phiopt_edge_range): New counter.
> * tree-ssa-phiopt.c (replace_phi_edge_with_variable):
> Add optional sequence which will be added before the old
> conditional. Check sequence for non-null if we want to
> update the range.
> (two_value_replacement): Instead of inserting the sequence,
> update the call to replace_phi_edge_with_variable.
> (match_simplify_replacement): Likewise.
> (minmax_replacement): Likewise.
> (value_replacement): Create a sequence of statements
> which would have defined the ssa name.  Update call
> to replace_phi_edge_with_variable.
>
> gcc/testsuite/ChangeLog:
>
> PR tree-optimization/101256
> * g++.dg/torture/pr101256.C: New test.
> ---
>  gcc/dbgcnt.def  |  1 +
>  gcc/testsuite/g++.dg/torture/pr101256.C | 28 +
>  gcc/tree-ssa-phiopt.c   | 52 ++---
>  3 files changed, 59 insertions(+), 22 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/torture/pr101256.C
>
> diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
> index 93e7b4fd30e..2345899ba68 100644
> --- a/gcc/dbgcnt.def
> +++ b/gcc/dbgcnt.def
> @@ -183,6 +183,7 @@ DEBUG_COUNTER (lim)
>  DEBUG_COUNTER (local_alloc_for_sched)
>  DEBUG_COUNTER (match)
>  DEBUG_COUNTER (merged_ipa_icf)
> +DEBUG_COUNTER (phiopt_edge_range)
>  DEBUG_COUNTER (postreload_cse)
>  DEBUG_COUNTER (pre)
>  DEBUG_COUNTER (pre_insn)
> diff --git a/gcc/testsuite/g++.dg/torture/pr101256.C b/gcc/testsuite/g++.dg/torture/pr101256.C
> new file mode 100644
> index 00000000000..973a8b4caf3
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/torture/pr101256.C
> @@ -0,0 +1,28 @@
> +// { dg-do run }
> +
> +template <typename T>
> +const T& max(const T& a, const T& b)
> +{
> +return (a < b) ? b : a;
> +}
> +
> +signed char var_5 = -128;
> +unsigned int var_11 = 2144479212U;
> +unsigned long long int arr [22];
> +
> +void
> +__attribute__((noipa))
> +test(signed char var_5, unsigned var_11) {
> +  for (short i_61 = 0; i_61 < var_5 + 149; i_61 += 1)
> +arr[i_61] = max((signed char)0, var_5) ? max((signed char)1, var_5) : var_11;
> +}
> +
> +int main() {
> +  for (int i_0 = 0; i_0 < 22; ++i_0)
> +  arr [i_0] = 11834725929543695741ULL;
> +
> +  test(var_5, var_11);
> +  if (arr [0] != 2144479212ULL && arr [0] != 11834725929543695741ULL)
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> index ab12e85569d..71f0019d877 100644
> --- a/gcc/tree-ssa-phiopt.c
> +++ b/gcc/tree-ssa-phiopt.c
> @@ -50,6 +50,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimple-fold.h"
>  #include "internal-fn.h"
>  #include "gimple-range.h"
> +#include "dbgcnt.h"
>
>  static unsigned int tree_ssa_phiopt_worker (bool, bool, bool);
>  static bool two_value_replacement (basic_block, basic_block, edge, gphi *,
> @@ -73,7 +74,8 @@ static bool cond_store_replacement (basic_block, 
> basic_block, edge, edge,
> hash_set *);
>  static bool cond_if_else_store_replacement (basic_block, basic_block, 
> basic_block);
>  static hash_set * get_non_trapping ();
> -static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree);
> +static void replace_phi_edge_with_variable (basic_block, edge, gphi *, tree,
> +   gimple_seq = NULL);
>  static void hoist_adjacent_loads (basic_block, basic_block,
>   basic_block, basic_block);
>  static bool gate_hoist_loads (void);
> @@ -382,18 +384,20 @@ tree_ssa_phiopt_worker (bool do_store_elim, bool 
> do_hoist_loads, bool early_p)
>
>  /* Replace PHI node element whose edge is E in block BB with variable NEW.
> Remove the edge from COND_BLOCK which does not lead to BB (COND_BLOCK
> -   is known to have two edges, one of which must reach BB).  */
> +   is known to have two edges, one of which must reach BB).
> +   Optionally insert stmts before the 

Re: [PATCH 2/5] Fix PR 101237: Remove element_type call when used with the functions from real

2021-07-05 Thread Richard Biener via Gcc-patches
On Sun, Jul 4, 2021 at 8:39 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> HONOR_SIGNED_ZEROS, HONOR_SIGN_DEPENDENT_ROUNDING, and HONOR_SNANS all
> have an overload for taking a tree type now, so we should do that instead.
>
> OK?  Bootstrapped and tested on x86_64-linux-gnu.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
> PR middle-end/101237
> * fold-const.c (negate_expr_p): Remove call to element_mode
> and TREE_MODE/TREE_TYPE when calling HONOR_SIGNED_ZEROS,
> HONOR_SIGN_DEPENDENT_ROUNDING, and HONOR_SNANS.
> (fold_negate_expr_1): Likewise.
> (const_unop): Likewise.
> (fold_cond_expr_with_comparison): Likewise.
> (fold_binary_loc): Likewise.
> (fold_ternary_loc): Likewise.
> (tree_call_nonnegative_warnv_p): Likewise.
> * match.pd (-(A + B) -> (-B) - A): Likewise.
> ---
>  gcc/fold-const.c | 46 +++---
>  gcc/match.pd |  4 ++--
>  2 files changed, 25 insertions(+), 25 deletions(-)
>
> diff --git a/gcc/fold-const.c b/gcc/fold-const.c
> index dfccbaec683..e0cdb75fb26 100644
> --- a/gcc/fold-const.c
> +++ b/gcc/fold-const.c
> @@ -432,8 +432,8 @@ negate_expr_p (tree t)
>return negate_expr_p (TREE_OPERAND (t, 0));
>
>  case PLUS_EXPR:
> -  if (HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
> - || HONOR_SIGNED_ZEROS (element_mode (type))
> +  if (HONOR_SIGN_DEPENDENT_ROUNDING (type)
> + || HONOR_SIGNED_ZEROS (type)
>   || (ANY_INTEGRAL_TYPE_P (type)
>   && ! TYPE_OVERFLOW_WRAPS (type)))
> return false;
> @@ -445,8 +445,8 @@ negate_expr_p (tree t)
>
>  case MINUS_EXPR:
>/* We can't turn -(A-B) into B-A when we honor signed zeros.  */
> -  return !HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
> -&& !HONOR_SIGNED_ZEROS (element_mode (type))
> +  return !HONOR_SIGN_DEPENDENT_ROUNDING (type)
> +&& !HONOR_SIGNED_ZEROS (type)
>  && (! ANY_INTEGRAL_TYPE_P (type)
>  || TYPE_OVERFLOW_WRAPS (type));
>
> @@ -468,7 +468,7 @@ negate_expr_p (tree t)
>/* Fall through.  */
>
>  case RDIV_EXPR:
> -  if (! HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (TREE_TYPE (t))))
> +  if (! HONOR_SIGN_DEPENDENT_ROUNDING (t))
> return negate_expr_p (TREE_OPERAND (t, 1))
>|| negate_expr_p (TREE_OPERAND (t, 0));
>break;
> @@ -605,8 +605,8 @@ fold_negate_expr_1 (location_t loc, tree t)
>break;
>
>  case PLUS_EXPR:
> -  if (!HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
> - && !HONOR_SIGNED_ZEROS (element_mode (type)))
> +  if (!HONOR_SIGN_DEPENDENT_ROUNDING (type)
> + && !HONOR_SIGNED_ZEROS (type))
> {
>   /* -(A + B) -> (-B) - A.  */
>   if (negate_expr_p (TREE_OPERAND (t, 1)))
> @@ -628,8 +628,8 @@ fold_negate_expr_1 (location_t loc, tree t)
>
>  case MINUS_EXPR:
>/* - (A - B) -> B - A  */
> -  if (!HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type))
> - && !HONOR_SIGNED_ZEROS (element_mode (type)))
> +  if (!HONOR_SIGN_DEPENDENT_ROUNDING (type)
> + && !HONOR_SIGNED_ZEROS (type))
> return fold_build2_loc (loc, MINUS_EXPR, type,
> TREE_OPERAND (t, 1), TREE_OPERAND (t, 0));
>break;
> @@ -641,7 +641,7 @@ fold_negate_expr_1 (location_t loc, tree t)
>/* Fall through.  */
>
>  case RDIV_EXPR:
> -  if (! HONOR_SIGN_DEPENDENT_ROUNDING (element_mode (type)))
> +  if (! HONOR_SIGN_DEPENDENT_ROUNDING (type))
> {
>   tem = TREE_OPERAND (t, 1);
>   if (negate_expr_p (tem))
> @@ -1725,7 +1725,7 @@ const_unop (enum tree_code code, tree type, tree arg0)
>/* Don't perform the operation, other than NEGATE and ABS, if
>   flag_signaling_nans is on and the operand is a signaling NaN.  */
>if (TREE_CODE (arg0) == REAL_CST
> -  && HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg0)))
> +  && HONOR_SNANS (arg0)
>&& REAL_VALUE_ISSIGNALING_NAN (TREE_REAL_CST (arg0))
>&& code != NEGATE_EXPR
>&& code != ABS_EXPR
> @@ -2135,7 +2135,7 @@ fold_convert_const_real_from_real (tree type, const_tree arg1)
>
>/* Don't perform the operation if flag_signaling_nans is on
>   and the operand is a signaling NaN.  */
> -  if (HONOR_SNANS (TYPE_MODE (TREE_TYPE (arg1)))
> +  if (HONOR_SNANS (arg1)
>&& REAL_VALUE_ISSIGNALING_NAN (TREE_REAL_CST (arg1)))
>  return NULL_TREE;
>
> @@ -5773,7 +5773,7 @@ fold_cond_expr_with_comparison (location_t loc, tree type,
>
>   Note that all these transformations are correct if A is
>   NaN, since the two alternatives (A and -A) are also NaNs.  */
> -  if (!HONOR_SIGNED_ZEROS (element_mode (type))
> +  if (!HONOR_SIGNED_ZEROS (type)
>&& (FLOAT_TYPE_P (TREE_TYPE (arg01))
>   ? real_zerop (arg01)
>   : integer_zerop (arg01))
> @@ 

Re: [PATCH 4/5] Try inverted comparison for match_simplify in phiopt

2021-07-05 Thread Richard Biener via Gcc-patches
On Sun, Jul 4, 2021 at 8:38 PM apinski--- via Gcc-patches
 wrote:
>
> From: Andrew Pinski 
>
> Since match and simplify does not have all of the inverted
> comparison patterns, it makes sense to just have
> phi-opt try to do the inversion and try match and simplify again.
>
> OK? Bootstrapped and tested on x86_64-linux-gnu.

OK.

> Thanks,
> Andrew Pinski
>
> gcc/ChangeLog:
>
> * tree-ssa-phiopt.c (gimple_simplify_phiopt):
> If "A ? B : C" fails to simplify, try "(!A) ? C : B".
> ---
>  gcc/tree-ssa-phiopt.c | 27 ++-
>  1 file changed, 26 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/tree-ssa-phiopt.c b/gcc/tree-ssa-phiopt.c
> index d4449afcdca..fec8c02c062 100644
> --- a/gcc/tree-ssa-phiopt.c
> +++ b/gcc/tree-ssa-phiopt.c
> @@ -844,7 +844,8 @@ phiopt_early_allow (enum tree_code code)
> with parts pushed if EARLY_P was true. Also rejects non allowed tree code
> if EARLY_P is set.
> Takes the comparison from COMP_STMT and two args, ARG0 and ARG1 and tries
> -   to simplify CMP ? ARG0 : ARG1.  */
> +   to simplify CMP ? ARG0 : ARG1.
> +   Also try to simplify (!CMP) ? ARG1 : ARG0 if the non-inverse failed.  */
>  static tree
>  gimple_simplify_phiopt (bool early_p, tree type, gimple *comp_stmt,
> tree arg0, tree arg1,
> @@ -877,6 +878,30 @@ gimple_simplify_phiopt (bool early_p, tree type, gimple *comp_stmt,
> return result;
> }
>  }
> +  /* Try the inverted comparison, that is !COMP ? ARG1 : ARG0. */
> +  comp_code = invert_tree_comparison (comp_code, HONOR_NANS (cmp0));
> +
> +  if (comp_code == ERROR_MARK)
> +return NULL;
> +
> +  cond = build2_loc (loc,
> +comp_code, boolean_type_node,
> +cmp0, cmp1);
> +  gimple_match_op op1 (gimple_match_cond::UNCOND,
> +  COND_EXPR, type, cond, arg1, arg0);
> +
> +  if (op1.resimplify (early_p ? NULL : seq, follow_all_ssa_edges))
> +{
> +  /* Early we want only to allow some generated tree codes. */
> +  if (!early_p
> + || op1.code.is_tree_code ()
> + || phiopt_early_allow ((tree_code)op1.code))
> +   {
> + result = maybe_push_res_to_seq (&op1, seq);
> + if (result)
> +   return result;
> +   }
> +}
>
>return NULL;
>  }
> --
> 2.27.0
>
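
The rewrite this patch relies on, "A ? B : C" versus "(!A) ? C : B", can be pictured at the source level with the small C sketch below. The function names are hypothetical and only serve the illustration; nothing here is taken from tree-ssa-phiopt.c. For integer operands !(a < b) is simply a >= b, so both forms select the same value. With floating-point operands that may be NaN that identity does not hold, which appears to be why the patch passes HONOR_NANS (cmp0) to invert_tree_comparison and gives up when it returns ERROR_MARK.

```c
/* Original form: CMP ? ARG0 : ARG1, where CMP is a < b.  */
int select_original (int a, int b, int arg0, int arg1)
{
  return (a < b) ? arg0 : arg1;
}

/* Inverted form: (!CMP) ? ARG1 : ARG0.  For integers the inverse of
   a < b is a >= b, and the two arms swap places.  */
int select_inverted (int a, int b, int arg0, int arg1)
{
  return (a >= b) ? arg1 : arg0;
}
```

Both functions return the same result for every integer input; the second form just gives match-and-simplify another pattern shape to try.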


Re: [RFA] Attach MEM_EXPR information when flushing BLKmode args to the stack

2021-07-05 Thread Richard Biener via Gcc-patches
On Fri, Jul 2, 2021 at 6:13 PM Jeff Law  wrote:
>
>
> This is a minor missed optimization we found with our internal port.
>
> Given this code:
>
> typedef struct {short a; short b;} T;
>
> extern void g1();
>
> void f(T s)
> {
>  if (s.a < 0)
>  g1();
> }
>
>
> "s" is passed in a register, but it's still a BLKmode object because the
> alignment of T is smaller than the alignment that an integer of the same
> size would have (16 bits vs 32 bits).
>
>
> Because "s" is BLKmode, function.c is going to store it into a stack slot
> and we'll load it from that slot for each reference.  So on the v850
> (just to pick a port that likely has the same behavior we're seeing) we
> have this RTL from CSE2:
>
>
> (insn 2 4 3 2 (set (mem/c:SI (plus:SI (reg/f:SI 34 .fp)
>  (const_int -4 [0xfffc])) [2 S4 A32])
>  (reg:SI 6 r6)) "j.c":6:1 7 {*movsi_internal}
>   (expr_list:REG_DEAD (reg:SI 6 r6)
>  (nil)))
> (note 3 2 8 2 NOTE_INSN_FUNCTION_BEG)
> (insn 8 3 9 2 (set (reg:HI 44 [ s.a ])
>  (mem/c:HI (plus:SI (reg/f:SI 34 .fp)
>  (const_int -4 [0xfffc])) [1 s.a+0 S2 A32]))
> "j.c":7:5 3 {*movhi_internal}
>   (nil))
> (insn 9 8 10 2 (parallel [
>  (set (reg:SI 45)
>  (ashift:SI (subreg:SI (reg:HI 44 [ s.a ]) 0)
>  (const_int 16 [0x10])))
>  (clobber (reg:CC 32 psw))
>  ]) "j.c":7:5 94 {ashlsi3_clobber_flags}
>   (expr_list:REG_DEAD (reg:HI 44 [ s.a ])
>  (expr_list:REG_UNUSED (reg:CC 32 psw)
>  (nil))))
> (insn 10 9 11 2 (parallel [
>  (set (reg:SI 43)
>  (ashiftrt:SI (reg:SI 45)
>  (const_int 16 [0x10])))
>  (clobber (reg:CC 32 psw))
>  ]) "j.c":7:5 104 {ashrsi3_clobber_flags}
>   (expr_list:REG_DEAD (reg:SI 45)
>  (expr_list:REG_UNUSED (reg:CC 32 psw)
>  (nil))))
>
>
> Insn 2 is the store into the stack. insn 8 is the load for s.a in the
> conditional.  DSE1 replaces the MEM in insn 8 with (reg 6) since (reg 6)
> has the value we want.  After that the store at insn 2 is dead.  Sadly
> DSE never removes the store.
>
> The problem is RTL DSE considers a store with no MEM_EXPR as escaping,
> which keeps the MEM live.  The lack of a MEM_EXPR is due to the call to
> change_address to twiddle the mode on the MEM for the store at insn 2.
> It should be safe to copy the MEM_EXPR (which should always be a
> PARM_DECL) from the original memory to the memory returned by
> change_address.  Doing so results in DSE1 removing the store at insn 2.
>
> It would be nice to remove the stack setup/teardown.   I'm not offhand
> aware of mechanisms to remove the setup/teardown after we've already
> allocated a slot, even if the slot is no longer used.
>
> Bootstrapped and regression tested on x86, though I don't think that's a
> particularly useful test.  So I also ran it through my tester across
> those pesky embedded targets without regressions as well.
>
> I didn't include a test simply because I didn't want to have an insane
> target selector.  I guess if we really wanted a test we could look after
> DSE1 is done and verify there aren't any MEMs left at all.  Willing to
> try that if the consensus is we want this tested.
>
> OK for the trunk?

I wonder why the code doesn't use adjust_address instead?  That
handles most cases already and the code doesn't change the
address but just the mode (and access size)?

Richard.

> Jeff
>
>
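
The size/alignment mismatch described for T above can be checked with a short C sketch. The helper names are hypothetical, and int32_t stands in for "an integer of the same size" as an assumption; the exact numbers depend on the target ABI (the message quotes 16 vs 32 bits for the port in question).

```c
#include <stddef.h>
#include <stdint.h>

typedef struct { short a; short b; } T;

/* Size of T in bytes on the current target.  */
size_t t_size (void)
{
  return sizeof (T);
}

/* Alignment of T in bytes on the current target.  */
size_t t_align (void)
{
  return _Alignof (T);
}

/* Nonzero when T is as large as int32_t but less strictly aligned,
   which is the situation the message describes for "s".  */
int t_less_aligned_than_same_size_int (void)
{
  return sizeof (T) == sizeof (int32_t)
         && _Alignof (T) < _Alignof (int32_t);
}
```

On ABIs where a struct takes the alignment of its most-aligned member, T ends up aligned like short while having the size of a 32-bit integer.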


Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-05 Thread Richard Biener via Gcc-patches
On Mon, Jul 5, 2021 at 3:21 AM Hongtao Liu via Gcc-patches
 wrote:
>
> On Fri, Jul 2, 2021 at 4:03 PM Uros Bizjak  wrote:
> >
> > On Fri, Jul 2, 2021 at 8:25 AM Hongtao Liu  wrote:
> >
> > > > >   AVX512FP16 is disclosed, refer to [1].
> > > > >   There're 100+ instructions for AVX512FP16, 67 gcc patches, for the 
> > > > > convenience of review, we divide the 67 patches into 2 major parts.
> > > > >   The first part is 2 patches containing basic support for AVX512FP16 
> > > > > (options, cpuid, _Float16 type, libgcc, etc.), and the second part is 
> > > > > 65 patches covering all instructions of AVX512FP16(including 
> > > > > intrinsic support and some optimizations).
> > > > >   There is a problem with the first part, _Float16 is not a C++ 
> > > > > standard, so the front-end does not support this type and its 
> > > > > mangling, so we "make up" a _Float16 type on the back-end and use 
> > > > > _DF16 as its mangling. The purpose of this is to align with llvm 
> > > > > side, because llvm C++ FE already supports _Float16[2].
> > > > >
> > > > > [1] 
> > > > > https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html
> > > > > [2] https://reviews.llvm.org/D33719
> > > >
> > > > Looking through implementation of _Float16 support, I think, there is
> > > > no need for _Float16 support to depend on AVX512FP16.
> > > >
> > > > The compiler is smart enough to use either a named pattern that
> > > > describes the instruction when available or diverts to a library call
> > > > to a soft-fp implementation. So, I think that general _Float16 support
> > > > should be implemented first (similar to _float128) and then upgraded
> > > > with AVX512FP16 specific instructions.
> > > >
> > > > MOVW loads/stores to XMM reg can be emulated with MOVD and a SImode
> > > > secondary_reload register.
> > > >
> > > MOVD is under sse2, so is pinsrw, which means if we want xmm
> > > load/stores for HF, sse2 is the least requirement.
> > > Also we support PEXTRW reg/m16, xmm, imm8 under SSE4_1 under which we
> > > have 16bit direct load/store for HFmode and no need for a secondary
> > > reload.
> > > So for simplicity, can we just restrict _Float16 under sse4_1?
> >
> > When baseline is not met, the equivalent integer calling convention is
> > used, for example:
> Problem is under TARGET_SSE and w/ -mno-sse2, float calling convention
>  is available for sse register, it's ok for float since there's movss
> under sse, but there's no 16bit load/store for sse registers, nor
> movement between gpr and sse register.

You can always spill though, that's preferred for some archs
over xmm <-> gpr moves anyway.

Richard.

> >
> > --cut here--
> > typedef int __v2si __attribute__ ((vector_size (8)));
> >
> > __v2si foo (__v2si a, __v2si b)
> > {
> >   return a + b;
> > }
> > --cut here--
> >
> > will still compile with -m32 -mno-mmx with warnings:
> >
> > mmx1.c: In function ‘foo’:
> > mmx1.c:4:1: warning: MMX vector return without MMX enabled changes the
> > ABI [-Wpsabi]
> > mmx1.c:3:8: warning: MMX vector argument without MMX enabled changes
> > the ABI [-Wpsabi]
> >
> > So, by setting the baseline to SSE4.1, a big pool of targets will be
> > forced to use alternative ABI. This is quite inconvenient, and we
> > revert to the alternative ABI if we *really*  can't satisfy ABI
> > requirements (e.g. register type is not available, basic move insn
> > can't be implemented). Based on your analysis, I think that SSE2
> > should be the baseline.
> Agreed.
> >
> > Also, looking at insn tables, it looks that movzwl from memory + movd
> > is faster than pinsrw (and similar for pextrw to memory), but I have
> > no hard data here.
> >
> > Regarding secondary_reload, a scratch register is needed in case of
> > HImode moves between memory and XMM reg, since scratch register needs
> > a different mode than source and destination. Please see
> > TARGET_SECONDARY_RELOAD documentation and several examples in the
> > source.
> >
> > Uros.
>
>
>
> --
> BR,
> Hongtao


Re: [PATCH] Darwin, configury : Allow for specification and detection of dsymutil.

2021-07-05 Thread Richard Biener via Gcc-patches
On Sun, Jul 4, 2021 at 10:22 PM Iain Sandoe  wrote:
>
> Hi,
>
> IMO this was an omission when the dsymutil program was added
> (before my time).  Essentially, we have been ‘getting away with it’
> on Darwin because of (a) restrictions in DWARF versions and (b)
> that the installed tools handle a wide range of platform versions and
> archs.
>
> However, (a) is a barrier to moving Darwin to DWARF-4 or greater
> and (b) is no longer true for people who might build cross- toolchains
> for older Darwin (or, the motivating case, for brand new Arm64
> Darwin).  In order to support necessary tests for (a) we produce a
> version record that can be tested.
>
> This replicates the logic used for ‘as’ and ‘ld’ and now correctly
> reports for ‘-v’ and works with discovery of installed “binutils” in the
> target dir (or for specific paths given).
>
> tested across the Darwin range and on crosses and canadian (native)
> crosses to powerpc and Arm64 darwin.  Also tested on x86_64
> and powerpc64 linux.
>
> OK for master?

OK.

Thanks,
Richard.

> thanks
> Iain
>
> ===
>
> In order to enable DWARF versions > 2 we need a sufficiently modern
> version of dsymutil (in addition to the assembler / linker).  This
> allows the user to configure a different path from the installed one.
>
> In addition, there are several sources of dsymutil so we differentiate
> these in order to get accurate version information.
>
> Signed-off-by: Iain Sandoe 
>
> gcc/ChangeLog:
>
> * configure.ac: Handle --with-dsymutil in the same way as we
> do for the assembler and linker.  (DEFAULT_DSYMUTIL): New.
> Extract the type and version for the dsymutil configured or
> found by the default searches.
> * config.in: Regenerated.
> * configure: Regenerated.
> * collect2.c (do_dsymutil): Handle locating dsymutil in the
> same way as for the assembler and  linker.
> * config/darwin.h (DSYMUTIL): Delete.
> * gcc.c: Report a configured dsymutil correctly.
>
> ChangeLog:
>
> * Makefile.def: Add dsymutil defs.
> * Makefile.in: Regenerated.
> * Makefile.tpl: Add dsymutil to flags.
> * configure: Regenerated.
> * configure.ac: Add dsymutil to target and build recipes.
> ---
>  Makefile.def|   1 +
>  Makefile.in |  10 ++
>  Makefile.tpl|   9 +
>  configure   | 413 
>  configure.ac|   6 +
>  gcc/collect2.c  |  40 -
>  gcc/config.in   |  12 ++
>  gcc/config/darwin.h |   2 -
>  gcc/configure   | 166 +-
>  gcc/configure.ac|  96 +-
>  gcc/exec-tool.in|   8 +
>  gcc/gcc.c   |   5 +
>  12 files changed, 757 insertions(+), 11 deletions(-)
>
> diff --git a/Makefile.def b/Makefile.def
> index c83d9c4a813..fbfdb6fee08 100644
> --- a/Makefile.def
> +++ b/Makefile.def
> @@ -291,6 +291,7 @@ flags_to_pass = { flag= CFLAGS_FOR_TARGET ; };
>  flags_to_pass = { flag= CPPFLAGS_FOR_TARGET ; };
>  flags_to_pass = { flag= CXXFLAGS_FOR_TARGET ; };
>  flags_to_pass = { flag= DLLTOOL_FOR_TARGET ; };
> +flags_to_pass = { flag= DSYMUTIL_FOR_TARGET ; };
>  flags_to_pass = { flag= FLAGS_FOR_TARGET ; };
>  flags_to_pass = { flag= GFORTRAN_FOR_TARGET ; };
>  flags_to_pass = { flag= GOC_FOR_TARGET ; };
>
> diff --git a/Makefile.tpl b/Makefile.tpl
> index 6e0337fb48f..bffd85bd68e 100644
> --- a/Makefile.tpl
> +++ b/Makefile.tpl
> @@ -162,6 +162,7 @@ BUILD_EXPORTS = \
> GDC="$(GDC_FOR_BUILD)"; export GDC; \
> GDCFLAGS="$(GDCFLAGS_FOR_BUILD)"; export GDCFLAGS; \
> DLLTOOL="$(DLLTOOL_FOR_BUILD)"; export DLLTOOL; \
> +   DSYMUTIL="$(DSYMUTIL_FOR_BUILD)"; export DSYMUTIL; \
> LD="$(LD_FOR_BUILD)"; export LD; \
> LDFLAGS="$(LDFLAGS_FOR_BUILD)"; export LDFLAGS; \
> NM="$(NM_FOR_BUILD)"; export NM; \
> @@ -203,6 +204,7 @@ HOST_EXPORTS = \
> CC_FOR_BUILD="$(CC_FOR_BUILD)"; export CC_FOR_BUILD; \
> CXX_FOR_BUILD="$(CXX_FOR_BUILD)"; export CXX_FOR_BUILD; \
> DLLTOOL="$(DLLTOOL)"; export DLLTOOL; \
> +   DSYMUTIL="$(DSYMUTIL)"; export DSYMUTIL; \
> LD="$(LD)"; export LD; \
> LDFLAGS="$(STAGE1_LDFLAGS) $(LDFLAGS)"; export LDFLAGS; \
> NM="$(NM)"; export NM; \
> @@ -215,6 +217,7 @@ HOST_EXPORTS = \
> READELF="$(READELF)"; export READELF; \
> AR_FOR_TARGET="$(AR_FOR_TARGET)"; export AR_FOR_TARGET; \
> AS_FOR_TARGET="$(AS_FOR_TARGET)"; export AS_FOR_TARGET; \
> +   DSYMUTIL_FOR_TARGET="$(DSYMUTIL_FOR_TARGET)"; export DSYMUTIL_FOR_TARGET; \
> GCC_FOR_TARGET="$(GCC_FOR_TARGET)"; export GCC_FOR_TARGET; \
> LD_FOR_TARGET="$(LD_FOR_TARGET)"; export LD_FOR_TARGET; \
> NM_FOR_TARGET="$(NM_FOR_TARGET)"; export NM_FOR_TARGET; \
> @@ -297,6 +300,7 @@ BASE_TARGET_EXPORTS = \
> GOC="$(GOC_FOR_TARGET) $(XGCC_FLAGS_FOR_TARGET) $$TFLAGS"; export 
> GOC; \
> GDC="$(GDC_FOR_TARGET) 

[PATCH] middle-end/101291 - set loop copy of versioned loop

2021-07-05 Thread Richard Biener
This fixes the vectorizer loop versioning code failing to clear
niter related info on the scalar loop as it assumed get_loop_copy
would work even for the outermost loop.  The patch makes that
assumption hold by adjusting the loop versioning code.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-07-05  Richard Biener  

PR middle-end/101291
* cfgloopmanip.c (loop_version): Set the loop copy of the
versioned loop to the new loop.
---
 gcc/cfgloopmanip.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
index e6df28036c4..2af59fedc92 100644
--- a/gcc/cfgloopmanip.c
+++ b/gcc/cfgloopmanip.c
@@ -1731,6 +1731,7 @@ loop_version (class loop *loop,
   then_scale, else_scale);
 
   copy_loop_info (loop, nloop);
+  set_loop_copy (loop, nloop);
 
   /* loopify redirected latch_edge. Update its PENDING_STMTS.  */
   lv_flush_pending_stmts (latch_edge);
-- 
2.26.2


Re: [PATCH] X86: Provide a CTOR for stringop_algs [PR100246].

2021-07-05 Thread Richard Biener via Gcc-patches
On Sun, Jul 4, 2021 at 10:04 PM Iain Sandoe  wrote:
>
> Hi,
>
> Several older compilers fail to build modern GCC because of missing
> or incomplete C++11 support.
>
> (although the PR mentions clang, specifically, this has also been reported
>  for some GCC versions within the range that should be able to bootstrap
>  GCC)
>
> There are several possible solutions proposed in the PR, this one seems
>  the least invasive.
>
> The header is pulled into the gcov code that builds with C, so we have to
> make the CTOR conditional on C++.
>
> tested on Darwin12 with xcode-6, bootstrapped on x86_64-darwin and linux.
> OK for master / GCC-11?

Hmm, what is specifically built with a C compiler?  gcov.c not, I think.

Instead of commenting the CTOR, does it work to comment the whole stringop_algs
type?  Also it seems on trunk this CTOR is no more?

> thanks
> Iain
>
> Signed-off-by: Iain Sandoe 
>
> PR bootstrap/100246 - [11/12 Regression] GCC will not bootstrap with clang 
> 3.4/3.5 [xcode 5/6, Darwin 12/13]
>
> PR bootstrap/100246
>
> gcc/ChangeLog:
>
> * config/i386/i386.h (struct stringop_algs): Define a CTOR for
> this type.
> ---
>  gcc/config/i386/i386.h | 5 +
>  1 file changed, 5 insertions(+)
>
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 6e0340a4b60..84151156999 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -73,6 +73,11 @@ struct stringop_algs
>  {
>const enum stringop_alg unknown_size;
>const struct stringop_strategy {
> +#ifdef __cplusplus
> +stringop_strategy(int _max = -1, enum stringop_alg _alg = libcall,
> + int _noalign = false)
> +  : max (_max), alg (_alg), noalign (_noalign) {}
> +#endif
>  const int max;
>  const enum stringop_alg alg;
>  int noalign;
> --
> 2.24.1
>
>


Re: [PATCH] combine: Check for paradoxical subreg

2021-07-05 Thread Robin Dapp via Gcc-patches

gcc/ChangeLog:

  * combine.c (try_combine): Check for paradoxical subreg.


ping.


RE: [ARM] PR66791: Replace builtins for fp and unsigned vmul_n intrinsics

2021-07-05 Thread Kyrylo Tkachov via Gcc-patches


> -Original Message-
> From: Prathamesh Kulkarni 
> Sent: 05 July 2021 10:18
> To: gcc Patches ; Kyrylo Tkachov
> 
> Subject: [ARM] PR66791: Replace builtins for fp and unsigned vmul_n
> intrinsics
> 
> Hi Kyrill,
> I assume this patch is OK to commit after bootstrap+testing ?

Yes.
Thanks,
Kyrill

> 
> Thanks,
> Prathamesh


[ARM] PR66791: Replace builtins for signed vmul_n intrinsics

2021-07-05 Thread Prathamesh Kulkarni via Gcc-patches
Hi,
This patch replaces builtins with __a * __b for signed variants of
vmul_n intrinsics.
As discussed earlier, the patch has an issue if __a * __b overflows, and
it is open whether we wish to leave that as UB.

Thanks,
Prathamesh
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 41b596b5fc6..5928c25318b 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -8370,14 +8370,14 @@ __extension__ extern __inline int16x4_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmul_n_s16 (int16x4_t __a, int16_t __b)
 {
-  return (int16x4_t)__builtin_neon_vmul_nv4hi (__a, (__builtin_neon_hi) __b);
+  return __a * __b;
 }
 
 __extension__ extern __inline int32x2_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmul_n_s32 (int32x2_t __a, int32_t __b)
 {
-  return (int32x2_t)__builtin_neon_vmul_nv2si (__a, (__builtin_neon_si) __b);
+  return __a * __b;
 }
 
 __extension__ extern __inline float32x2_t
@@ -8409,14 +8409,14 @@ __extension__ extern __inline int16x8_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmulq_n_s16 (int16x8_t __a, int16_t __b)
 {
-  return (int16x8_t)__builtin_neon_vmul_nv8hi (__a, (__builtin_neon_hi) __b);
+  return __a * __b;
 }
 
 __extension__ extern __inline int32x4_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmulq_n_s32 (int32x4_t __a, int32_t __b)
 {
-  return (int32x4_t)__builtin_neon_vmul_nv4si (__a, (__builtin_neon_si) __b);
+  return __a * __b;
 }
 
 __extension__ extern __inline float32x4_t
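
The "__a * __b" form used in the patch above relies on the GNU C vector extension, which allows a binary operation between a vector and a scalar by broadcasting the scalar to every lane. A minimal sketch, assuming GCC or a compatible compiler; the type v2si_t and the function mul_n_s32 are hypothetical stand-ins, not the arm_neon.h definitions, and the sketch does not settle the overflow question raised in the message.

```c
#include <stdint.h>

/* Hypothetical two-lane vector of 32-bit signed integers, declared
   with the GNU vector_size attribute.  */
typedef int32_t v2si_t __attribute__ ((vector_size (8)));

/* Multiply every lane of A by the scalar B.  The scalar operand is
   implicitly converted to a vector with B in each lane.  */
v2si_t mul_n_s32 (v2si_t a, int32_t b)
{
  return a * b;
}
```

Written this way the multiply is ordinary arithmetic that the middle end can fold and combine, rather than an opaque builtin call.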


[ARM] PR66791: Replace builtins for fp and unsigned vmul_n intrinsics

2021-07-05 Thread Prathamesh Kulkarni via Gcc-patches
Hi Kyrill,
I assume this patch is OK to commit after bootstrap+testing ?

Thanks,
Prathamesh
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index f42a15f7912..41b596b5fc6 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -8384,21 +8384,25 @@ __extension__ extern __inline float32x2_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmul_n_f32 (float32x2_t __a, float32_t __b)
 {
+#ifdef __FAST_MATH__
+  return __a * __b;
+#else
   return (float32x2_t)__builtin_neon_vmul_nv2sf (__a, (__builtin_neon_sf) __b);
+#endif
 }
 
 __extension__ extern __inline uint16x4_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmul_n_u16 (uint16x4_t __a, uint16_t __b)
 {
-  return (uint16x4_t)__builtin_neon_vmul_nv4hi ((int16x4_t) __a, (__builtin_neon_hi) __b);
+  return __a * __b;
 }
 
 __extension__ extern __inline uint32x2_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmul_n_u32 (uint32x2_t __a, uint32_t __b)
 {
-  return (uint32x2_t)__builtin_neon_vmul_nv2si ((int32x2_t) __a, (__builtin_neon_si) __b);
+  return __a * __b;
 }
 
 __extension__ extern __inline int16x8_t
@@ -8419,21 +8423,25 @@ __extension__ extern __inline float32x4_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmulq_n_f32 (float32x4_t __a, float32_t __b)
 {
+#ifdef __FAST_MATH__
+  return __a * __b;
+#else
   return (float32x4_t)__builtin_neon_vmul_nv4sf (__a, (__builtin_neon_sf) __b);
+#endif
 }
 
 __extension__ extern __inline uint16x8_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmulq_n_u16 (uint16x8_t __a, uint16_t __b)
 {
-  return (uint16x8_t)__builtin_neon_vmul_nv8hi ((int16x8_t) __a, (__builtin_neon_hi) __b);
+  return __a * __b;
 }
 
 __extension__ extern __inline uint32x4_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmulq_n_u32 (uint32x4_t __a, uint32_t __b)
 {
-  return (uint32x4_t)__builtin_neon_vmul_nv4si ((int32x4_t) __a, (__builtin_neon_si) __b);
+  return __a * __b;
 }
 
 __extension__ extern __inline int32x4_t
@@ -17740,7 +17748,11 @@ __extension__ extern __inline float16x4_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmul_n_f16 (float16x4_t __a, float16_t __b)
 {
+#ifdef __FAST_MATH__
+  return __a * __b;
+#else
   return __builtin_neon_vmul_nv4hf (__a, __b);
+#endif
 }
 
 __extension__ extern __inline float16x8_t
@@ -17765,7 +17777,11 @@ __extension__ extern __inline float16x8_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vmulq_n_f16 (float16x8_t __a, float16_t __b)
 {
+#ifdef __FAST_MATH__
+  return __a * __b;
+#else
   return __builtin_neon_vmul_nv8hf (__a, __b);
+#endif
 }
 
 __extension__ extern __inline float16x4_t
diff --git a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-neon-2.c b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-neon-2.c
index 50f689352ca..6808576ce59 100644
--- a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-neon-2.c
+++ b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-neon-2.c
@@ -327,13 +327,13 @@ BINOP_TEST (vminnm)
 
 BINOP_TEST (vmul)
 /* { dg-final { scan-assembler-times {vmul\.f16\td[0-9]+, d[0-9]+, d[0-9]+} 3 } }
-   { dg-final { scan-assembler-times {vmul\.f16\tq[0-9]+, q[0-9]+, q[0-9]+} 1 } }  */
+   { dg-final { scan-assembler-times {vmul\.f16\tq[0-9]+, q[0-9]+, q[0-9]+} 2 } }  */
 BINOP_LANE_TEST (vmul, 2)
 /* { dg-final { scan-assembler-times {vmul\.f16\td[0-9]+, d[0-9]+, d[0-9]+\[2\]} 1 } }
    { dg-final { scan-assembler-times {vmul\.f16\tq[0-9]+, q[0-9]+, d[0-9]+\[2\]} 1 } }  */
 BINOP_N_TEST (vmul)
-/* { dg-final { scan-assembler-times {vmul\.f16\td[0-9]+, d[0-9]+, d[0-9]+\[0\]} 1 } }
-   { dg-final { scan-assembler-times {vmul\.f16\tq[0-9]+, q[0-9]+, d[0-9]+\[0\]} 1 } }*/
+/* { dg-final { scan-assembler-times {vmul\.f16\td[0-9]+, d[0-9]+, d[0-9]+} 3 } }
+   { dg-final { scan-assembler-times {vmul\.f16\tq[0-9]+, q[0-9]+, q[0-9]+} 2 } }*/
 
 float16x4_t
 test_vpadd_16x4 (float16x4_t a, float16x4_t b)
@@ -387,7 +387,7 @@ test_vdup_n_f16 (float16_t a)
 {
   return vdup_n_f16 (a);
 }
-/* { dg-final { scan-assembler-times {vdup\.16\td[0-9]+, r[0-9]+} 2 } }  */
+/* { dg-final { scan-assembler-times {vdup\.16\td[0-9]+, r[0-9]+} 3 } }  */
 
 float16x8_t
 test_vmovq_n_f16 (float16_t a)
@@ -400,7 +400,7 @@ test_vdupq_n_f16 (float16_t a)
 {
   return vdupq_n_f16 (a);
 }
-/* { dg-final { scan-assembler-times {vdup\.16\tq[0-9]+, r[0-9]+} 2 } }  */
+/* { dg-final { scan-assembler-times {vdup\.16\tq[0-9]+, r[0-9]+} 3 } }  */
 
 float16x4_t
 test_vdup_lane_f16 (float16x4_t a)


Re: [PATCH] Port GCC documentation to Sphinx

2021-07-05 Thread Richard Sandiford via Gcc-patches
Hans-Peter Nilsson  writes:
> I've read the discussion downthread, but I seem to miss (a recap
> of) the benefits of moving to Sphinx.  Maybe others have too and
> it'd be a good idea to repeat them?  Otherwise, the impression
> is not so good, as all I see is bits here and there getting lost
> in translation.

Better cross-referencing is one big feature.  IMO this subthread has
demonstrated why the limitations of info formatting have held back
the amount of cross-referencing in the online html.  (And based on
empirical evidence, I get the impression that far more people use
the online html docs than the info docs.)

E.g. quoting from Richard's recent patch:

  @item -fmove-loop-stores
  @opindex fmove-loop-stores
  Enables the loop store motion pass in the GIMPLE loop optimizer.  This
  moves invariant stores to after the end of the loop in exchange for
  carrying the stored value in a register across the iteration.
  Note for this option to have an effect @option{-ftree-loop-im} has to 
  be enabled as well.  Enabled at level @option{-O1} and higher, except 
  for @option{-Og}.

In the online docs, this will just be plain text.  Anyone who doesn't
know what -ftree-loop-im is will have to search for it manually.

Adding the extra references to the html (and pdf) output but dropping
them from the info sounds like a good compromise.

Thanks,
Richard


Re: [PATCH] add -fmove-loop-stores option to control GIMPLE loop store-motion

2021-07-05 Thread Richard Biener
On Fri, 2 Jul 2021, Martin Sebor wrote:

> On 7/2/21 5:55 AM, Richard Biener wrote:
> > This adds the -fmove-loop-stores option, mainly as a way to disable
> > the store-motion part of GIMPLE invariant motion (-ftree-loop-im)
> > which is enabled by default.  It might be sensible to turn off
> > -fmove-loop-stores at -O1 since it can result in compile-time
> > as well as memory usage issues but this patch tries to preserve
> > existing behavior besides introducing the new option with the
> > exception of -Og where I've disabled it.
> > 
> > Controlling store-motion has been made easy by earlier refactoring
> > for the invariant motion only use after loop interchange.
> > 
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> > 
> > OK?
> > 
> > Thanks,
> > Richard.
> > 
> > 2021-07-02  Richard Biener  
> > 
> >  * doc/invoke.texi (fmove-loop-stores): Document.
> >  * common.opt (fmove-loop-stores): New option.
> >  * opts.c (default_options_table): Enable -fmove-loop-stores
> >  at -O1 but not -Og.
> >  * tree-ssa-loop-im.c (pass_lim::execute): Pass
> >  flag_move_loop_stores instead of true to
> >  loop_invariant_motion_in_fun.
> > ---
> >   gcc/common.opt |  4 
> >   gcc/doc/invoke.texi| 11 +--
> >   gcc/opts.c |  1 +
> >   gcc/tree-ssa-loop-im.c |  2 +-
> >   4 files changed, 15 insertions(+), 3 deletions(-)
> > 
> > diff --git a/gcc/common.opt b/gcc/common.opt
> > index 5b03bbc6662..d9da1131eda 100644
> > --- a/gcc/common.opt
> > +++ b/gcc/common.opt
> > @@ -2084,6 +2084,10 @@ fmove-loop-invariants
> >   Common Var(flag_move_loop_invariants) Optimization
> >   Move loop invariant computations out of loops.
> >   
> > +fmove-loop-stores
> > +Common Var(flag_move_loop_stores) Optimization
> > +Move stores out of loops.
> > +
> >   fdce
> >   Common Var(flag_dce) Init(1) Optimization
> >   Use the RTL dead code elimination pass.
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index a9fd5fdc104..7b4f5d26738 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -528,7 +528,7 @@ Objective-C and Objective-C++ Dialects}.
> >   -floop-parallelize-all  -flra-remat  -flto  -flto-compression-level @gol
> >   -flto-partition=@var{alg}  -fmerge-all-constants @gol
> >   -fmerge-constants  -fmodulo-sched  -fmodulo-sched-allow-regmoves @gol
> > --fmove-loop-invariants  -fno-branch-count-reg @gol
> > +-fmove-loop-invariants  -fmove-loop-stores  -fno-branch-count-reg @gol
> >   -fno-defer-pop  -fno-fp-int-builtin-inexact  -fno-function-cse @gol
> >   -fno-guess-branch-probability  -fno-inline  -fno-math-errno  -fno-peephole
> >   @gol
> >   -fno-peephole2  -fno-printf-return-value  -fno-sched-interblock @gol
> > @@ -10260,6 +10260,7 @@ compilation time.
> >   -fipa-reference-addressable @gol
> >   -fmerge-constants @gol
> >   -fmove-loop-invariants @gol
> > +-fmove-loop-stores@gol
> >   -fomit-frame-pointer @gol
> >   -freorder-blocks @gol
> >   -fshrink-wrap @gol
> > @@ -10403,7 +10404,7 @@ optimization flags except for those that may
> > interfere with debugging:
> >   @gccoptlist{-fbranch-count-reg  -fdelayed-branch @gol
> >   -fdse  -fif-conversion  -fif-conversion2  @gol
> >   -finline-functions-called-once @gol
> > --fmove-loop-invariants  -fssa-phiopt @gol
> > +-fmove-loop-invariants  -fmove-loop-stores  -fssa-phiopt @gol
> >   -ftree-bit-ccp  -ftree-dse  -ftree-pta  -ftree-sra}
> >   
> >   @end table
> > @@ -13011,6 +13012,12 @@ Enabled by @option{-O3}, @option{-fprofile-use},
> > and @option{-fauto-profile}.
> >   Enables the loop invariant motion pass in the RTL loop optimizer.  Enabled
> >   at level @option{-O1} and higher, except for @option{-Og}.
> >   
> > +@item -fmove-loop-stores
> > +@opindex fmove-loop-stores
> > +Enables the loop store motion pass in the GIMPLE loop optimizer.  Note for
> > +this option to have an effect @code{-ftree-loop-im} has to be enabled as
> > well.
>  ^
> 
> The @code markup should be @option as well (same as below).

Ah, thanks - fixed.

> I find the brief text added to gcc/common.opt more informative than
> this longer description.  Explaining what the store motion pass does
> in a few words would be helpful to those not familiar with
> the implementation.

OK, so is the following more useful then?

"
@item -fmove-loop-stores
@opindex fmove-loop-stores
Enables the loop store motion pass in the GIMPLE loop optimizer.  This
moves invariant stores to after the end of the loop in exchange for
carrying the stored value in a register across the iteration.
Note for this option to have an effect @option{-ftree-loop-im} has to 
be enabled as well.  Enabled at level @option{-O1} and higher, except 
for @option{-Og}.
"
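
To make the wording above concrete, here is a hand-written sketch of the
kind of rewrite loop store motion aims for.  It is an illustration only,
not compiler output: the function names are hypothetical, and "restrict"
is an added assumption so that the store to *dst cannot alias the loads
from src.

```c
#include <stddef.h>

/* As written: *dst is loaded and stored on every iteration.
   sum_into is a hypothetical name used only for this sketch.  */
void
sum_into (int *restrict dst, const int *restrict src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    *dst += src[i];
}

/* Hand-hoisted equivalent: the stored value is carried in a local
   (ideally a register) across the iterations, and the single store
   happens after the end of the loop.  */
void
sum_into_hoisted (int *restrict dst, const int *restrict src, size_t n)
{
  int tmp = *dst;
  for (size_t i = 0; i < n; i++)
    tmp += src[i];
  *dst = tmp;
}
```

Both functions leave the same value in *dst; the second trades the
per-iteration store for one extra live value across the loop, which is
the exchange the documentation text describes.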

Richard.

> Martin
> 
> 
> > +Enabled at level @option{-O1} and higher, except for @option{-Og}.
> > +
> >   @item -fsplit-loops
> >   @opindex fsplit-loops
> >   Split a loop into two if it contains a condition that's always true
> > diff --git 

[PATCH] testsuite/101299 - add missing vect_double requires to bb-slp-74.c

2021-07-05 Thread Richard Biener
This should fix the FAIL of gcc.dg/vect/bb-slp-74.c on arm.

2021-07-05  Richard Biener  

PR testsuite/101299
* gcc.dg/vect/bb-slp-74.c: Add vect_double requires.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-74.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-74.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-74.c
index d3d5a02a29b..9c1ebb7ecbe 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-74.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-74.c
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-require-effective-target vect_double } */
 
 #include "tree-vect.h"
 
-- 
2.26.2



Re: [PATCH][gcc] Allow functions without C-style ellipsis to use format attribute

2021-07-05 Thread Tuan Le Quang via Gcc-patches
Hi Martin,

Thank you for your quick response.

>  The main benefit of variadic functions templates over C vararg
>  functions is that they make use of the type system for type safety.
>  I'm not sure I see applying attribute format to them as a very
>  compelling use case.  (I'd expect the format string in a variadic
>  function template to use generic conversion specifiers, say %@ or
>  some such, and only let the caller specify things like flags, width
>  and precision but not type conversion specifiers).  Is there one
>  where relying on the type system isn't good enough?

One case we have is wrapping a C API in a C++ one. Hence, the underlying
API is C and we must specify types for format arguments. It means that
generic conversion is not possible.
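
For reference, a minimal sketch of the C side of such a wrapper, using
the format attribute as it already works today with a C-style ellipsis.
The name format_into and its signature are hypothetical; the patch below
is about allowing the same attribute when the trailing arguments are a
template parameter pack rather than "...".

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical C API: argument 3 is the format string and the
   arguments to check start at position 4 (the ellipsis).  */
__attribute__ ((format (printf, 3, 4)))
int
format_into (char *buf, size_t size, const char *fmt, ...)
{
  va_list ap;
  va_start (ap, fmt);
  /* Forward the variable arguments to the C library formatter.  */
  int n = vsnprintf (buf, size, fmt, ap);
  va_end (ap);
  return n;
}
```

With -Wformat, a call such as format_into (buf, sizeof buf, "%s", 3)
gets a format warning because the attribute tells GCC to check the
trailing arguments against the format string.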

> Do you have an actual
> use case for it or did it just fall out of the variadic template
> implementation?

Yes, it fell out of the variadic template implementation. It is the only
way to let variadic templates use the format attribute. I also feel the
same: it might be useful sometimes.

Also, your suggestion on the code is helpful! I have drafted a new patch
here. I have bootstrapped and regression tested it on x86_64-pc-linux-gnu.

Regards,
Tuan

gcc/c-family/ChangeLog:

* c-attribs.c (positional_argument): allow third argument of format
attribute to point to parameters of any type if the function is not C-style
variadic
* c-common.h (enum posargflags): add flag POSARG_ANY for the third argument
of format attribute
* c-format.c (decode_format_attr): read third argument with POSARG_ELLIPSIS
only if the function has a variable argument
(handle_format_attribute): relax explicit checks for non-variadic functions

gcc/testsuite/ChangeLog:

* gcc.dg/format/attr-3.c: modify comment
* objc.dg/attributes/method-format-1.m: errors do not hold anymore, a
warning is given instead
* g++.dg/warn/format9.C: New test.
* gcc.dg/format/attr-9.c: New test.

diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c
index 6bf492afcc0..a46d882feba 100644
--- a/gcc/c-family/c-attribs.c
+++ b/gcc/c-family/c-attribs.c
@@ -714,6 +714,9 @@ positional_argument (const_tree fntype, const_tree
atname, tree pos,
   return NULL_TREE;
  }

+  if (flags & POSARG_ANY)
+return pos;
+
   /* Where the expected code is STRING_CST accept any pointer
  expected by attribute format (this includes possibly qualified
  char pointers and, for targets like Darwin, also pointers to
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index be4b29a017b..391cfa685d4 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1462,7 +1462,10 @@ enum posargflags {
   POSARG_ZERO = 1,
   /* Consider positional attribute argument value valid if it refers
  to the ellipsis (i.e., beyond the last typed argument).  */
-  POSARG_ELLIPSIS = 2
+  POSARG_ELLIPSIS = 2,
+  /* Consider positional attribute argument value valid if it refers
+ to an argument of any type */
+  POSARG_ANY = 4
 };

 extern tree positional_argument (const_tree, const_tree, tree, tree_code,
diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index bda3b18fcd0..0cef3152828 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -380,9 +380,16 @@ decode_format_attr (const_tree fntype, tree atname,
tree args,
   else
 return false;

+  bool has_variable_arg = !type_argument_type(fntype,
type_num_arguments(fntype) + 1);
+  int extra_flag = 0;
+  if (has_variable_arg)
+extra_flag = POSARG_ELLIPSIS;
+  else
+extra_flag = POSARG_ANY;
+
   if (tree val = get_constant (fntype, atname, *first_arg_num_expr,
3, &info->first_arg_num,
-   (POSARG_ZERO | POSARG_ELLIPSIS), validated_p))
+   (POSARG_ZERO | extra_flag), validated_p))
 *first_arg_num_expr = val;
   else
 return false;
@@ -5193,11 +5200,11 @@ handle_format_attribute (tree *node, tree atname,
tree args,
   tree arg_type;

   /* Verify that first_arg_num points to the last arg,
- the ...  */
+ if the last arg is  ... */
   FOREACH_FUNCTION_ARGS (type, arg_type, iter)
 arg_num++;

-  if (arg_num != info.first_arg_num)
+  if (arg_num != info.first_arg_num && !type_argument_type(type, arg_num))
 {
   if (!(flags & (int) ATTR_FLAG_BUILT_IN))
  error ("argument to be formatted is not %<...%>");
diff --git a/gcc/testsuite/g++.dg/warn/format9.C
b/gcc/testsuite/g++.dg/warn/format9.C
new file mode 100644
index 00000000000..39b615859fc
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/format9.C
@@ -0,0 +1,16 @@
+// Test format attribute used with variadic templates
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wformat" }
+
+template<typename... Args>
+__attribute__((format(printf, 1, 2))) void fa (const char * fmt,
Args...args)
+{
+return;
+}
+
+int main()
+{
+fa ("%d%d", 5, 7);
+fa ("%d%d%d", 5, 6, 7);
+fa ("%s%d", 3, 4); // { dg-warning "format" "printf warning" }
+}
diff --git a/gcc/testsuite/gcc.dg/format/attr-3.c
b/gcc/testsuite/gcc.dg/format/attr-3.c
index