[Bug ipa/98594] [11 Regression] IPA modref codegen bug

2021-01-27 Thread rguenther at suse dot de via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98594

--- Comment #4 from rguenther at suse dot de  ---
On Wed, 27 Jan 2021, hubicka at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98594
> 
> --- Comment #3 from Jan Hubicka  ---
> The initialization is removed by dse1 pass.  We get:
> ipa-modref: call stmt D.3199 = bitCount::bitCount_bitfield<1, int,
> glm::packed_highp> (); [return slot optimization]
> ipa-modref: call to glm::vec bitCount::bitCount_bitfield(const
> glm::vec&) [with int L = 1; T = int; glm::qualifier Q =
> glm::packed_highp]/8 does not use ref: D.3185.D.3097.x alias sets: 3->1
>   Deleted dead store: D.3185.D.3097.x = x_2(D);
> 
> ipa-modref: call stmt D.3199 = bitCount::bitCount_bitfield<1, int,
> glm::packed_highp> (); [return slot optimization]
> ipa-modref: call to glm::vec bitCount::bitCount_bitfield(const
> glm::vec&) [with int L = 1; T = int; glm::qualifier Q =
> glm::packed_highp]/8 does not use ref: D.3185 alias sets: 3->3
>   Deleted dead store: D.3185 ={v} {CLOBBER};
> 
> Now the modref summary for function is
>   loads:
>     Limits: 32 bases, 16 refs
>       Base 0: alias set 5
>         Ref 0: alias set 5
>           access: Parm 0 param offset:0 offset:0 size:32 max_size:32
> 
> alias set 5 corresponds to const struct vec but a different instantiation than
> alias set 3 used in the store.
> There is reinterpret cast:
> 
>   glm::vec::type, Q>
>   x(*reinterpret_cast<glm::vec::type, Q> const *>());
> 
> turning it to
> 
>   glm::vec::type, Q> x(*());
> 
> makes the aliasing difference go away.  So it seems to me that the testcase
> simply includes TBAA violation?

Not sure but if my visuals do not cheat me then the difference is only
const qualification so it should not matter for TBAA?  Of course the
question is what type 'v' has since this maybe invokes a different CTOR?
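
A minimal sketch of why const qualification alone should not matter for TBAA, written in C and assuming a hypothetical one-member struct `vec1` in place of the glm types from the testcase. Reading an object through a pointer to the const-qualified version of its own type is a permitted access, so a store made just before such a read cannot be considered dead on that basis alone. This does not reproduce the PR; it only illustrates the aliasing rule.

```c
#include <assert.h>

/* Hypothetical stand-in for the vector type; not the glm type from the PR. */
struct vec1 { int x; };

/* Reads the object through a pointer to the const-qualified type. */
static int read_through_const(const struct vec1 *p)
{
    assert(p != 0);
    return p->x;
}

/* Stores through the unqualified type, then reads through the const one. */
int store_then_read(int value)
{
    struct vec1 v;
    v.x = value;
    return read_through_const(&v);
}
```

Whether the real testcase is valid still depends on the actual type of the object being cast, which is the open question above.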

[Bug tree-optimization/98856] [11 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2021-01-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856

--- Comment #2 from Richard Biener  ---
The cxx bench Botan doesn't know --cxxflags; what Botan version are you looking
at?

[Bug c++/98861] I want deterministic exceptions (Herbception)

2021-01-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861

Richard Biener  changed:

   What|Removed |Added

   Severity|normal  |enhancement
   Last reconfirmed||2021-01-28
 Status|UNCONFIRMED |NEW
 Ever confirmed|0   |1

[Bug bootstrap/98860] [11 Regression] boostrap failure on MinGW-w64 windows 10

2021-01-27 Thread rguenth at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860

Richard Biener  changed:

   What|Removed |Added

Summary|boostrap failure on |[11 Regression] boostrap
   |MinGW-w64 windows 10|failure on MinGW-w64
   ||windows 10
   Target Milestone|--- |11.0

[Bug bootstrap/98860] boostrap failure on MinGW-w64 windows 10

2021-01-27 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860

--- Comment #6 from cqwrteur  ---
configure:4069: ./conftest.exe
/home/unlvs/mcf_build/src/gcc-git/libgomp/configure: line 4071: ./conftest.exe:
cannot execute binary file: Exec format error
configure:4073: $? = 126
configure:4080: error: in
`/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libgomp':
configure:4082: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details

[Bug fortran/93524] [ISO C Binding][F2018] CFI_allocate – elem_size mishandled + sm wrongly set?

2021-01-27 Thread tkoenig at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93524

Thomas Koenig  changed:

   What|Removed |Added

 CC||tkoenig at gcc dot gnu.org

--- Comment #3 from Thomas Koenig  ---
A related patch was applied at

https://gcc.gnu.org/g:1cdca4261e88f4dc9c3293c6b3c2fff3071ca32b .

Re: [PATCH] [8/9/10/11 Regression] [OOP] PR fortran/86470 - ICE with OpenMP

2021-01-27 Thread Thomas Koenig via Gcc-patches



Hello Harald,


OK for master / backports?


OK. It is indeed fairly obvious, as you write.


Should the testcase be moved to the gomp/ subdirectory?

Yes. It's a compile-time test, and it will then only be run
if the compiler can do OpenMP.

You will not need the

+! { dg-options "-fopenmp" }

line, then.

Thanks for the patch!

Best regards

Thomas


[PATCH][X86] Enable X86_TUNE_AVX256_UNALIGNED_{LOAD,STORE}_OPTIMAL for generic tune [PR target/98172]

2021-01-27 Thread Hongtao Liu via Gcc-patches
Hi:
   GCC 11 will be the system GCC two years from now, and the processors
of that time shouldn't even need to split a 256-bit vector into two
128-bit vectors.
   I.e., testing SPEC2017 with the two options below on Zen3/ICL shows that
Option B is better than Option A.
Option A:
-march=x86-64 -mtune=generic -mavx2 -mfma -Ofast

Option B:
Option A + -mtune-ctrl="256_unaligned_load_optimal,256_unaligned_store_optimal"

  Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
  Ok for trunk?




-- 
BR,
Hongtao


0001-Enable-X86_TUNE_AVX256_UNALIGNED_-LOAD-STORE-_OPTIMA.patch
Description: Binary data


Re: [RFC] test builtin ratio for loop distribution

2021-01-27 Thread Alexandre Oliva
On Jan 27, 2021, Richard Biener  wrote:

> That said, rather than not transforming the loop as you do I'd
> say we want to re-inline small copies more forcefully during
> loop distribution code-gen so we turn a loop that sets
> 3 'short int' to zero into a 'int' store and a 'short' store for example
> (we have more code-size leeway here because we formerly had
> a loop).

Ok, that makes sense to me, if there's a chance of growing the access
stride.
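
A rough sketch of the transformation described above, assuming `short` is 2 bytes wide: a loop that zeroes three `short int` elements next to an equivalent form that uses one 4-byte store and one 2-byte store. The function names are hypothetical, and `memcpy` is used only to express the wider stores portably; this is not what the loop distribution pass emits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The original form: one 2-byte store per iteration. */
void clear3_loop(short a[3])
{
    for (int i = 0; i < 3; i++)
        a[i] = 0;
}

/* The re-inlined form: one 4-byte store followed by one 2-byte store. */
void clear3_wide(short a[3])
{
    const uint32_t zero32 = 0;
    const uint16_t zero16 = 0;

    assert(sizeof(short) == 2);               /* assumption of this sketch */
    memcpy(a, &zero32, sizeof zero32);        /* covers a[0] and a[1] */
    memcpy((unsigned char *)a + 4, &zero16, sizeof zero16);  /* covers a[2] */
}
```

Both functions leave the array in the same state; the second one trades the loop for two straight-line stores.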

> Since you don't add a testcase

Uhh, sorry, I mentioned TFmode emulation calls, but I wasn't explicit I
meant the soft-fp ones from libgcc.

./xgcc -B./ -O2 $srcdir/libgcc/soft-fp/fixtfdi.c \
  -I$srcdir/libgcc/config/riscv -I$srcdir/include \
  -S -o - | grep memset

> I can't see whether the actual case would be fixed by setting SSA
> pointer alignment on the memset arguments

The dest pointer alignment is known at the builtin expansion time,
that's not a problem.  What isn't known (*) is that the length is a
multiple of the word size: what gets to the expander is that it's
between 4 and 12 bytes (targeting 32-bit risc-v), but not that it's
either 4, 8 or 12.  Coming up with an efficient inline expansion becomes
a bit of a challenge without that extra knowledge.
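
To illustrate why the extra knowledge helps, here is a small sketch assuming a 4-byte word and a hypothetical helper `clear_words`: once the length is known to be exactly 4, 8 or 12, the clear needs nothing but whole-word stores and no byte or halfword tail handling. It is only a model of the idea, not the expander's output.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Zeroes len bytes with 4-byte stores only.  Valid only under the extra
   knowledge discussed above: len is 4, 8 or 12.  */
void clear_words(void *dst, size_t len)
{
    const uint32_t zero = 0;
    unsigned char *p = dst;

    assert(len >= 4 && len <= 12 && len % 4 == 0);
    for (size_t off = 0; off < len; off += 4)
        memcpy(p + off, &zero, sizeof zero);
}
```

Without the multiple-of-4 guarantee, the same range of 4 to 12 bytes would also need code for the lengths in between.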


(*) at least before my related patch for get_nonzero_bits
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564344.html

-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar


Re: [PATCH,rs6000] Fusion patterns for logical-logical

2021-01-27 Thread Segher Boessenkool
Hi!

On Thu, Dec 10, 2020 at 08:41:11PM -0600, acsaw...@linux.ibm.com wrote:
> This patch adds a new function to genfusion.pl to generate patterns for
> logical-logical fusion. They are enabled by default for power10 and can
> be disabled by -mno-power10-fusion-2logical or -mno-power10-fusion.

> +;; logical-logical fusion pattern generated by gen_2logical
> +;; kind: scalar outer: and op and rtl and inv 0 comp 0
> +;; inner: and op and rtl and inv 0 comp 0

These lines are a bit mysterious; can you make them more obvious
somehow?  (You do want to keep it short maybe, so it may be hard then).

> +(define_insn "*fuse_and_and"
> +  [(set (match_operand:GPR 3 "gpc_reg_operand" "=,0,1,r")
> +(and:GPR (and:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r") 
> (match_operand:GPR 1 "gpc_reg_operand" "%r,r,r,r")) (match_operand:GPR 2 
> "gpc_reg_operand" "r,r,r,r")))
> +   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]

You miss some newlines here:

  [(set (match_operand:GPR 3 "gpc_reg_operand" "=,0,1,r")
(and:GPR (and:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r")
  (match_operand:GPR 1 "gpc_reg_operand" "%r,r,r,r"))
 (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
   (clobber (match_scratch:GPR 4 "=X,X,X,r"))]

The alt 3 clobber needs an earlyclobber, because you clobber it before
operand 2, which can be in the same hard reg.  Likely?  Not at all.
Impossible?  Also not clear!  It may be true, but needs some
explanation then.

(Does % help here btw?)

> +  "(TARGET_P10_FUSION && TARGET_P10_FUSION_2LOGICAL)"
> +  "@
> +   and %3,%1,%0\;and %3,%3,%2
> +   and %0,%1,%0\;and %0,%0,%2
> +   and %1,%1,%0\;and %1,%1,%2
> +   and %4,%1,%0\;and %3,%4,%2"

Since you bind op 3 to op 0 resp. 1 in the alts 0 and 1, you can use
exactly the same template as for alt 0 for them, which probably is
easier to read, like:

  "@
   and %3,%1,%0\;and %3,%3,%2
   and %3,%1,%0\;and %3,%3,%2
   and %3,%1,%0\;and %3,%3,%2
   and %4,%1,%0\;and %3,%4,%2"

Do you agree?  Or is that nasty for other patterns maybe :-)

Have you checked that all these pattern combinations canonicalise to the
RTL you use here?

Okay for trunk with those things considered / fixed.  Thanks!


Segher


Re: [PATCH] RISC-V: Always define MULTILIB_DEFAULTS

2021-01-27 Thread Kito Cheng via Gcc-patches
Hi Sebastian:

Thanks for reporting this issue. I can reproduce it and will investigate what
happened today :)

On Tue, Jan 26, 2021 at 14:13, Sebastian Huber wrote:

> Hello Kito,
>
> On 20/11/2020 09:33, Kito Cheng wrote:
> >   - Define MULTILIB_DEFAULTS can reduce the total number of multilib if
> > the default arch and ABI are listed in the multilib config.
> >
> >   - This also simplify the implementation of --with-multilib-list.
> >
> > gcc/ChangeLog:
> >
> >   * config.gcc (riscv*-*-*): Add TARGET_RISCV_DEFAULT_ABI and
> >   TARGET_RISCV_DEFAULT_ARCH to tm_defines.
> >   Remove including riscv/withmultilib.h for --with-multilib-list.
> >   * config/riscv/riscv.h (STRINGIZING): New.
> >   (__STRINGIZING): Ditto.
> >   (MULTILIB_DEFAULTS): Ditto.
> >   * config/riscv/withmultilib.h: Remove.
>
> I think this change broke the multilib generation for RTEMS (git
> bisect). I had to apply this local patch to build
> a5ad5d5c478ee7bebf057161bb8715ee7d286875:
>
> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 6f1ee62f7fd..7449c470265 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -4612,7 +4612,6 @@ case "${target}" in
>  exit 1
>  ;;
>  esac
> - with_arch=`${srcdir}/config/riscv/arch-canonicalize ${with_arch}`
>  tm_defines="${tm_defines}
> TARGET_RISCV_DEFAULT_ARCH=${with_arch}"
>
>  # Make sure --with-abi is valid.  If it was not specified,
>
> With this commit we have:
>
> ./gcc/xgcc -print-multi-lib
> .;
> rv32i/ilp32;@march=rv32i@mabi=ilp32
> rv32im/ilp32;@march=rv32im@mabi=ilp32
> rv32iac/ilp32;@march=rv32iac@mabi=ilp32
> rv32imac/ilp32;@march=rv32imac@mabi=ilp32
> rv32imafc/ilp32f;@march=rv32imafc@mabi=ilp32f
> rv64imafd/lp64d;@march=rv64imafd@mabi=lp64d
> rv64imafd/lp64d/medany;@march=rv64imafd@mabi=lp64d@mcmodel=medany
> rv64imac/lp64;@march=rv64imac@mabi=lp64
> rv64imac/lp64/medany;@march=rv64imac@mabi=lp64@mcmodel=medany
> rv64imafdc/lp64d;@march=rv64imafdc@mabi=lp64d
> rv64imafdc/lp64d/medany;@march=rv64imafdc@mabi=lp64d@mcmodel=medany
>
> ./gcc/xgcc -print-multi-directory -march=rv32imafd -mabi=ilp32d
> rv32imafd/ilp32d
>
> So for this option set it prints a multilib directory which is not
> listed. Also GCC seems to use this directory for the search paths and
> cannot find multilib specific C++ header files for example.
>
> In the commit before (3a5d8ed231a0329822b7c032ba0834991732d2a0) we have:
>
> ./gcc/xgcc -print-multi-lib
> .;
> rv32i/ilp32;@march=rv32i@mabi=ilp32
> rv32im/ilp32;@march=rv32im@mabi=ilp32
> rv32imafd/ilp32d;@march=rv32imafd@mabi=ilp32d <-- HERE
> rv32iac/ilp32;@march=rv32iac@mabi=ilp32
> rv32imac/ilp32;@march=rv32imac@mabi=ilp32
> rv32imafc/ilp32f;@march=rv32imafc@mabi=ilp32f
> rv64imafd/lp64d;@march=rv64imafd@mabi=lp64d
> rv64imafd/lp64d/medany;@march=rv64imafd@mabi=lp64d@mcmodel=medany
> rv64imac/lp64;@march=rv64imac@mabi=lp64
> rv64imac/lp64/medany;@march=rv64imac@mabi=lp64@mcmodel=medany
> rv64imafdc/lp64d;@march=rv64imafdc@mabi=lp64d
> rv64imafdc/lp64d/medany;@march=rv64imafdc@mabi=lp64d@mcmodel=medany
>
> ./gcc/xgcc -print-multi-directory -march=rv32imafd -mabi=ilp32d
> rv32imafd/ilp32d
>
> I was not able to figure out what prevents the generation of the
> rv32imafd/ilp32d multilib in commit
> a5ad5d5c478ee7bebf057161bb8715ee7d286875. The gcc/tm.h contains this:
>
> gcc/tm.h:#ifndef TARGET_RISCV_DEFAULT_ARCH
> gcc/tm.h:# define TARGET_RISCV_DEFAULT_ARCH rv32gc
> gcc/tm.h:#ifndef TARGET_RISCV_DEFAULT_ABI
> gcc/tm.h:# define TARGET_RISCV_DEFAULT_ABI ilp32d
>
> --
> embedded brains GmbH
> Herr Sebastian HUBER
> Dornierstr. 4
> 82178 Puchheim
> Germany
> email: sebastian.hu...@embedded-brains.de
> phone: +49-89-18 94 741 - 16
> fax:   +49-89-18 94 741 - 08
>
> Registergericht: Amtsgericht München
> Registernummer: HRB 157899
> Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
> Unsere Datenschutzerklärung finden Sie hier:
> https://embedded-brains.de/datenschutzerklaerung/
>
>


[Bug target/98799] [11 Regression] vector_set_var ICE

2021-01-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98799

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Xiong Hu Luo :

https://gcc.gnu.org/g:fbe37371cf372b84d5b7f1a6f5f0971a513dd5fa

commit r11-6947-gfbe37371cf372b84d5b7f1a6f5f0971a513dd5fa
Author: Xionghu Luo 
Date:   Wed Jan 27 20:24:03 2021 -0600

rs6000: Fix vec insert ilp32 ICE and test failures [PR98799]

UNSPEC_SI_FROM_SF is not supported when TARGET_DIRECT_MOVE_64BIT
is false for -m32, don't generate VIEW_CONVERT_EXPR(ARRAY_REF) for
variable vector insert.  Remove rs6000_expand_vector_set_var helper
function, adjust the p8 and p9 definitions position and make them
static.

The previous commit r11-6858 missed check m32, This patch is tested pass
on P7BE{m32,m64}/P8BE{m32,m64}/P8LE/P9LE with
RUNTESTFLAGS="--target_board =unix'{-m32,-m64}'" for BE targets.

gcc/ChangeLog:

2021-01-27  Xionghu Luo  
David Edelsohn  

PR target/98799
* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Don't generate VIEW_CONVERT_EXPR for fcode
ALTIVEC_BUILTIN_VEC_INSERT
when -m32.
* config/rs6000/rs6000-protos.h (rs6000_expand_vector_set_var):
Delete.
* config/rs6000/rs6000.c (rs6000_expand_vector_set): Remove the
wrapper call rs6000_expand_vector_set_var for cleanup.  Call
rs6000_expand_vector_set_var_p9 and rs6000_expand_vector_set_var_p8
directly.
(rs6000_expand_vector_set_var): Delete.
(rs6000_expand_vector_set_var_p9): Make static.
(rs6000_expand_vector_set_var_p8): Make static.

gcc/testsuite/ChangeLog:

2021-01-27  Xionghu Luo  

PR target/98827
* gcc.target/powerpc/fold-vec-insert-char-p8.c: Adjust ilp32.
* gcc.target/powerpc/fold-vec-insert-char-p9.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-double.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-float-p8.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-float-p9.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-int-p8.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-int-p9.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-longlong.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-short-p8.c: Likewise.
* gcc.target/powerpc/fold-vec-insert-short-p9.c: Likewise.
* gcc.target/powerpc/pr79251.p8.c: Likewise.
* gcc.target/powerpc/pr79251.p9.c: Likewise.
* gcc.target/powerpc/vsx-builtin-7.c: Likewise.
* gcc.target/powerpc/pr79251-run.c: Build and run with vsx
option.

[Bug target/98827] [11 regression] gcc.target/powerpc/vsx-builtin-7.c assembler counts off after r11-6857

2021-01-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98827

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Xiong Hu Luo :

https://gcc.gnu.org/g:fbe37371cf372b84d5b7f1a6f5f0971a513dd5fa


Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread will schmidt via Gcc-patches
On Wed, 2021-01-27 at 19:43 -0600, Segher Boessenkool wrote:
> On Wed, Jan 27, 2021 at 01:06:46PM -0600, will schmidt wrote:
> > On Thu, 2021-01-14 at 11:59 -0500, Michael Meissner via Gcc-patches wrote:
> > > November 19th, 2020:
> > > Message-ID: <20201119235814.ga...@ibm-toto.the-meissners.org>
> > 
> > Subject and date should be sufficient
> 
> Only if people pick good subjects, and do not send ten patches with a
> similar subject line on the same day.  I asked for the message id,
> that works pretty much everywhere.

Good points..  I wasn't aware you had specifically asked for the
message ids.  Thanks for clarifying the situation. :-)


> 
> > _if_ having the old versions
> > of the patches are necessary to review the latest version of the
> > patch.  Which ideally is not the case.
> 
> Stronger than that: I need to know what changed!  So please just explain
> what changed, in just a short sentence or two, or more if that is needed
> (but not if it is not needed).
> 
> 
> Segher



Re: [PATCH] RISC-V: Fix -march option parsing when `p` extension exists.

2021-01-27 Thread Kito Cheng via Gcc-patches
Thanks! committed to master :)

On Wed, Jan 27, 2021 at 1:58 PM Xing GUO via Gcc-patches
 wrote:
>
> Sorry, I forgot to remove the line '*explicit_version_p = true;' in my
> previous patch.
>
> This is an updated patch.
>
> Thanks!
>
> ---
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.c
> (riscv_subset_list::parsing_subset_version):
> Fix -march option parsing when `p` extension exists.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/attribute-18.c: New test.
>
> --
> Cheers,
> Xing


Re: [PATCH, rs6000] improve vec_ctf invalid parameter handling. (pr91903)

2021-01-27 Thread will schmidt via Gcc-patches
On Wed, 2021-01-27 at 18:24 -0600, Segher Boessenkool wrote:
> Hi!
> 
> On Mon, Oct 26, 2020 at 04:22:32PM -0500, will schmidt wrote:
> >   Per PR91903, GCC ICEs when we attempt to pass a variable
> > (or out of range value) into the vec_ctf() builtin.  Per
> > investigation, the parameter checking exists for this
> > builtin with the int types, but was missing for
> > the long long types.
> > 
> > This patch adds the missing CODE_FOR_* entries to the
> > rs6000_expand_binop_builtin to cover that scenario.
> > This patch also updates some existing tests to remove
> > calls to vec_ctf() and vec_cts() that contain negative
> > values.
> > --- a/gcc/testsuite/gcc.target/powerpc/builtins-1.fold.h
> > +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.fold.h
> > @@ -212,14 +212,14 @@ int main ()
> >extern vector unsigned long long u9; u9 = vec_mergeo (u3, u4);
> >  
> >extern vector long long l8; l8 = vec_mul (l3, l4);
> >extern vector unsigned long long u6; u6 = vec_mul (u3, u4);
> >  
> > -  extern vector double dh; dh = vec_ctf (la, -2);
> > +  extern vector double dh; dh = vec_ctf (la, 2);
> >extern vector double di; di = vec_ctf (ua, 2);
> >extern vector int sz; sz = vec_cts (fa, 0x1F);
> > -  extern vector long long l9; l9 = vec_cts (dh, -2);
> > +  extern vector long long l9; l9 = vec_cts (dh, 2);
> 
> I think removing the negative inputs here reduces test coverage?  Why
> did you change them, it isn't immediately clear to me?


The vec_ctf() and vec_cts() builtins accept a const int parameter which
should be in the range of 0..31.   The PR was initially
written/described as an ICE when a variable was passed into the
builtin, and part of debug/fixups revealed that the testcase negative
values were also invalid.
I'll clarify that in the commit message.
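
As a scalar illustration of why the second argument has to stay within 0..31, the following sketch models only the scaling step as assumed here: the integer is converted to floating point and divided by 2 raised to the scale. The name `ctf_scalar_model` is hypothetical; this is not the builtin and does not use AltiVec/VSX types.

```c
#include <assert.h>

/* Hypothetical scalar model: converts value to double and divides it by
   2**scale.  The scale must be in 0..31, mirroring the builtin's range. */
double ctf_scalar_model(long long value, int scale)
{
    assert(scale >= 0 && scale <= 31);
    return (double)value / (double)(1ULL << scale);
}
```

For example, `ctf_scalar_model(8, 2)` divides 8 by 4. A scale of -2 falls outside the accepted range, which is why the testcase values were changed.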


> 
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr91903.c
> > @@ -0,0 +1,74 @@
> > +/* { dg-do compile */
> > +/* { dg-require-effective-target p8vector_hw } */
> 
> Compile tests should use p8vector_ok, instead.  (We do not care what
> kind of hardware the system under test is: we can run this on a
> cross-compiler just fine, after all!)

ok

> 
> > +/* { dg-skip-if "" { powerpc*-*-darwin* } } */
> 
> Please skip this line.  If the test does not work for Darwin Iain can
> easily disable it, but if you do, no one will find out if it does
> work.

ok, sounds good.
> 
> Okay for trunk with those things fixed, and the -2 thing looked at.
> Thanks!
> 

Thanks for the review. :-)

> 
> Segher



[Bug c++/98862] New: Complex reduction support in offload region

2021-01-27 Thread xw111luoye at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98862

Bug ID: 98862
   Summary: Complex reduction support in offload region
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xw111luoye at gmail dot com
  Target Milestone: ---

Using the std::complex type in an offload region is highly desirable.

$ g++ -fopenmp complex_reduction.cpp
ptxas /tmp/cceLNaYr.o, line 484; error   : Label expected for argument 0 of
instruction 'call'
ptxas /tmp/cceLNaYr.o, line 484; error   : Function '_ZNSt7complexIfEC1Eff' not
declared in this scope
ptxas /tmp/cceLNaYr.o, line 484; fatal   : Call target not recognized
ptxas fatal   : Ptx assembly aborted due to errors
nvptx-as: ptxas returned 255 exit status
mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1
exit status
compilation terminated.
lto-wrapper: fatal error:
/soft/gcc/gcc-11-dev-2021-01-27/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.0.0//accel/nvptx-none/mkoffload
returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

$ g++ -fopenmp -O2 complex_reduction.cpp
unresolved symbol __atomic_compare_exchange_16
collect2: error: ld returned 1 exit status
mkoffload: fatal error: x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1
exit status
compilation terminated.
lto-wrapper: fatal error:
/soft/gcc/gcc-11-dev-2021-01-27/bin/../libexec/gcc/x86_64-pc-linux-gnu/11.0.0//accel/nvptx-none/mkoffload
returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

The -O2 case is more relevant for production. Fixing both is desired.

source code:
https://github.com/ye-luo/openmp-target/blob/master/hands-on/tests/complex/complex_reduction.cpp

Re: [Ping] PowerPC: Add float128/Decimal conversions.

2021-01-27 Thread Segher Boessenkool
On Tue, Jan 26, 2021 at 06:43:06PM -0500, Michael Meissner wrote:
> I posted this patch on January 14th, 2021:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563498.html
> 
> | Date: Thu, 14 Jan 2021 12:09:36 -0500
> | Subject: [PATCH] PowerPC: Add float128/Decimal conversions.
> | Message-ID: <20210114170936.ga3...@ibm-toto.the-meissners.org>
> 
> You had a question about what changed, and I replied:
> 
> | In your last message, you said that it was unacceptable that the conversion
> | fails if the user uses an old GLIBC.  So I rewrote the code using weak
> | references.  If the user has at least GLIBC 2.32, it will use the IEEE
> | 128-bit string support in the library.
> |
> | If an older GLIBC is used, I then use the IBM 128-bit format as an
> | intermediate value.  Obviously there are cases where IEEE 128-bit can hold
> | values that IBM 128-bit can't (mostly due to the increased exponent range
> | in IEEE 128-bit), but it at least does the conversion for the numbers in
> | the common range.
> |
> | In doing this transformation, I needed to do minor edits to the main decimal
> | to/from binary conversion functions to allow the KF functions to be
> | declared.  Previously, I used preprocessor magic to rename the functions.
> 
> This is the second most important patch in the IEEE 128-bit work.  What do I
> need to do to be able to commit the patch?

The whole thread is at 
https://patchwork.ozlabs.org/project/gcc/patch/2020112524.ga...@ibm-toto.the-meissners.org/
 .

I approved *that* version of the patch.


Segher


Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Michael Meissner via Gcc-patches
Whoops, I thought I was replying to the second patch about Decimal and IEEE
128-bit conversion, not about built-in support.

On Wed, Jan 27, 2021 at 10:01:38PM -0500, Michael Meissner wrote:
> On Wed, Jan 27, 2021 at 07:43:56PM -0600, Segher Boessenkool wrote:
> > On Wed, Jan 27, 2021 at 01:06:46PM -0600, will schmidt wrote:
> > > On Thu, 2021-01-14 at 11:59 -0500, Michael Meissner via Gcc-patches wrote:
> > > > November 19th, 2020:
> > > > Message-ID: <20201119235814.ga...@ibm-toto.the-meissners.org>
> > > 
> > > Subject and date should be sufficient
> > 
> > Only if people pick good subjects, and do not send ten patches with a
> > similar subject line on the same day.  I asked for the message id,
> > that works pretty much everywhere.
> > 
> > > _if_ having the old versions
> > > of the patches are necessary to review the latest version of the
> > > patch.  Which ideally is not the case.
> > 
> > Stronger than that: I need to know what changed!  So please just explain
> > what changed, in just a short sentence or two, or more if that is needed
> > (but not if it is not needed).
> 
> In the past you complained that the patch would abort if the user did not link
> against GLIBC 2.32 (because there is an #ifdef in the code to do the abort if
> gcc was configured against an older GLIBC).
> 
> In addition, it used some pre-processor magic so that I didn't have to modify
> the dfp-bit.{c,h} functions to add new functions.  In particular, the new
> functions pretended they were the TF functions, and used #define to change
> the names.
> 
> The new code modifies dfp-bit.{c,h} to have support for the KF functions as
> separate #ifdef's.  It eliminates the preprocessor trickery, since I did
> modify the dfp-bit.{c,h} support.
> 
> In order to deal with older GLIBC's, I used a different function for the KF
> library (__sprintfkf instead of sprintf, and __strtokf instead of strtold).
> This function uses weak references to see if we had the GLIBC symbols
> (__sprintfieee128 and __strtoieee128 that are in GLIBC 2.32).  If those
> functions exist, we call those functions directly.
> 
> If those functions do not exist, I converted the _Float128 type to or from
> __ibm128, and I did the normal long double conversions.  Given that IEEE
> 128-bit has a much larger exponent range than IBM 128-bit, it means there are
> some numbers that can't be converted.  But at least the majority of the values
> are converted.
> 
> Note all of the other binary/decimal conversions use the GLIBC functions
> (either sprintf or strto).  The GLIBC people have the expertise to do the
> conversion, whereas I do not.  But until GLIBC 2.32, there was not enough of
> the support in GLIBC to handle IEEE 128-bit conversions.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Michael Meissner via Gcc-patches
On Wed, Jan 27, 2021 at 07:43:56PM -0600, Segher Boessenkool wrote:
> On Wed, Jan 27, 2021 at 01:06:46PM -0600, will schmidt wrote:
> > On Thu, 2021-01-14 at 11:59 -0500, Michael Meissner via Gcc-patches wrote:
> > > November 19th, 2020:
> > > Message-ID: <20201119235814.ga...@ibm-toto.the-meissners.org>
> > 
> > Subject and date should be sufficient
> 
> Only if people pick good subjects, and do not send ten patches with a
> similar subject line on the same day.  I asked for the message id,
> that works pretty much everywhere.
> 
> > _if_ having the old versions
> > of the patches are necessary to review the latest version of the
> > patch.  Which ideally is not the case.
> 
> Stronger than that: I need to know what changed!  So please just explain
> what changed, in just a short sentence or two, or more if that is needed
> (but not if it is not needed).

In the past you complained that the patch would abort if the user did not link
against GLIBC 2.32 (because there is an #ifdef in the code to do the abort if
gcc was configured against an older GLIBC).

In addition, it used some pre-processor magic so that I didn't have to modify
the dfp-bit.{c,h} functions to add new functions.  In particular, the new
functions pretended they were the TF functions, and used #define to change the
names.

The new code modifies dfp-bit.{c,h} to have support for the KF functions as
separate #ifdef's.  It eliminates the preprocessor trickery, since I did modify
the dfp-bit.{c,h} support.

In order to deal with older GLIBC's, I used a different function for the KF
library (__sprintfkf instead of sprintf, and __strtokf instead of strtold).
This function uses weak references to see if we had the GLIBC symbols
(__sprintfieee128 and __strtoieee128 that are in GLIBC 2.32).  If those
functions exist, we call those functions directly.

If those functions do not exist, I converted the _Float128 type to or from
__ibm128, and I did the normal long double conversions.  Given that IEEE
128-bit has a much larger exponent range than IBM 128-bit, it means there are
some numbers that can't be converted.  But at least the majority of the values
are converted.
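
The weak-reference technique described above can be sketched roughly as follows, assuming GCC or a compatible compiler on an ELF target. The symbol `hypothetical_convert_ieee128` and the helper `pick_backend` are made-up names for illustration and are not the functions from the patch. A weak declaration lets the program link even when the library lacks the symbol, and the address can then be tested at run time.

```c
#include <assert.h>

/* Hypothetical library entry point; declared weak so that the program still
   links when the library does not provide it. */
extern int hypothetical_convert_ieee128(void) __attribute__((weak));

enum backend { BACKEND_FALLBACK = 0, BACKEND_LIBRARY = 1 };

/* An undefined weak function has a null address, which selects the fallback. */
enum backend pick_backend(void)
{
    return hypothetical_convert_ieee128 ? BACKEND_LIBRARY : BACKEND_FALLBACK;
}
```

In this sketch nothing defines the weak symbol, so the fallback branch is taken; a library that does provide the symbol would switch the result without recompiling the caller.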

Note all of the other binary/decimal conversions use the GLIBC functions
(either sprintf or strto).  The GLIBC people have the expertise to do the
conversion, whereas I do not.  But until GLIBC 2.32, there was not enough of the
support in GLIBC to handle IEEE 128-bit conversions.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH PR97627]Avoid computing niters info for fake edges

2021-01-27 Thread bin.cheng via Gcc-patches
Hi,
As described in the commit message, we need to avoid computing niters info for
fake edges.  This simple patch does so with two changes.

Bootstrapped and tested on x86_64; is it OK?

Thanks,
bin

pr97627-20210128.patch
Description: Binary data


[Bug bootstrap/98860] boostrap failure on MinGW-w64 windows 10

2021-01-27 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860

--- Comment #5 from cqwrteur  ---
I do not know whether it has to do with the CRLF issue because GCC on Linux
emits the same result as it does on MinGW-w64 or msys2.

conftestx.c

#ifdef __x86_64__
#ifndef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4
#error need -march=i486
#endif
#ifndef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_16
#error need -mcx16
#endif
#else
#ifndef __GCC_HAVE_SYNC_COMPARE_AND_SWAP_8
#error need -march=i686
#endif
#endif


MinGW32

unlvs@DESKTOP-DFHPDC1 MINGW32 ~/gcc_bug
$  gcc -E conftestx.c
# 1 "conftestx.c"
# 1 ""
# 1 ""
# 1 "conftestx.c"

unlvs@DESKTOP-DFHPDC1 MINGW32 ~/gcc_bug
$  gcc -E conftestx.c -march=i486
# 1 "conftestx.c"
# 1 ""
# 1 ""
# 1 "conftestx.c"
conftestx.c:10:2: error: #error need -march=i686
   10 | #error need -march=i686
  |  ^

MinGW64

unlvs@DESKTOP-DFHPDC1 MINGW64 ~/gcc_bug
$  gcc -E conftestx.c -m32
# 1 "conftestx.c"
# 1 ""
# 1 ""
# 1 "conftestx.c"

unlvs@DESKTOP-DFHPDC1 MINGW64 ~/gcc_bug
$  gcc -E conftestx.c -march=i486 -mtune=generic
# 1 "conftestx.c"
cc1.exe: error: CPU you selected does not support x86-64 instruction set

MSYS (which is x86_64 with CYGWIN)

unlvs@DESKTOP-DFHPDC1 MSYS ~/gcc_bug
$  gcc -E conftestx.c
# 1 "conftestx.c"
# 1 ""
# 1 ""
# 1 "conftestx.c"
conftestx.c:6:2: error: #error need -mcx16
6 | #error need -mcx16
  |  ^

unlvs@DESKTOP-DFHPDC1 MSYS ~/gcc_bug
$  gcc -E conftestx.c -march=i486
# 1 "conftestx.c"
cc1: error: CPU you selected does not support x86-64 instruction set


The result on Linux:

cqwrteur@DESKTOP-DFHPDC1:/mnt/d/msys64/home/unlvs/gcc_bug$ gcc -E conftestx.c
# 0 "conftestx.c"
# 0 ""
# 0 ""
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 0 "" 2
# 1 "conftestx.c"
conftestx.c:6:2: error: #error need -mcx16
6 | #error need -mcx16
  |  ^
cqwrteur@DESKTOP-DFHPDC1:/mnt/d/msys64/home/unlvs/gcc_bug$ gcc -E conftestx.c
-march=i486
# 0 "conftestx.c"
cc1: error: CPU you selected does not support x86-64 instruction set

[Bug bootstrap/98860] boostrap failure on MinGW-w64 windows 10

2021-01-27 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860

--- Comment #4 from cqwrteur  ---
Created attachment 50071
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50071&action=edit
bootstrap failure picture

Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Segher Boessenkool
On Wed, Jan 27, 2021 at 01:06:46PM -0600, will schmidt wrote:
> On Thu, 2021-01-14 at 11:59 -0500, Michael Meissner via Gcc-patches wrote:
> > November 19th, 2020:
> > Message-ID: <20201119235814.ga...@ibm-toto.the-meissners.org>
> 
> Subject and date should be sufficient

Only if people pick good subjects, and do not send ten patches with a
similar subject line on the same day.  I asked for the message id,
that works pretty much everywhere.

> _if_ having the old versions
> of the patches are necessary to review the latest version of the
> patch.  Which ideally is not the case.

Stronger than that: I need to know what changed!  So please just explain
what changed, in just a short sentence or two, or more if that is needed
(but not if it is not needed).


Segher


Re: [Ping] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Segher Boessenkool
On Tue, Jan 26, 2021 at 06:39:22PM -0500, Michael Meissner wrote:
> Ping https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563496.html
> 
> | Date: Thu, 14 Jan 2021 11:59:19 -0500
> | Subject: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.
> | Message-ID: <20210114165919.ga1...@ibm-toto.the-meissners.org>
> 
> As I've said in the past, this is the most important patch of the IEEE 128-bit
> patches.  What do I need to do to be able to commit this patch ASAP?  Or what
> changes do I need to make?

https://patchwork.ozlabs.org/project/gcc/patch/20201119235814.ga...@ibm-toto.the-meissners.org/

  I cannot understand this code, and it does seem far from obviously
  correct.  But, okay for trunk if you handle all fallout (and I mean all,
  not just "all you consider important").

(And that is for *that* patch, not including later changes.  Send those
separately, don't make me do much more work than needed).


Segher


Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Segher Boessenkool
On Tue, Jan 19, 2021 at 12:24:51PM -0500, Michael Meissner wrote:
> On Fri, Jan 15, 2021 at 03:43:13PM -0600, Segher Boessenkool wrote:
> > Hi!
> > 
> > On Thu, Jan 14, 2021 at 11:59:19AM -0500, Michael Meissner wrote:
> > > >From 78435dee177447080434cdc08fc76b1029c7f576 Mon Sep 17 00:00:00 2001
> > > From: Michael Meissner 
> > > Date: Wed, 13 Jan 2021 21:47:03 -0500
> > > Subject: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.
> > > 
> > > This patch replaces patches previously submitted:
> > 
> > What did you change after I approved it?
> 
> You grumbled about the way I converted the names from the current name to the
> IEEE 128-bit name as being unclear.
> 
> 1) I moved the table of known mappings from within a function to a separate
> function, and I populated the switch statement with all of the current names.
> 
> 2) I moved the code that looks at a built-in function's arguments and returns
> whether it uses long double to a separate function rather than being buried
> within a larger function.
> 
> 3) I changed the code for the case where we didn't provide a name (i.e. new
> built-ins) to hopefully be clearer on the conversion.

Don't Do That.

Commit what was approved (unless it actually does not work, then explain
that clearly).  You can send incremental patches after that.


I am not going to review this whole patch once again.


If you change things in a series, the 0/N message is a good free-form
place to explain that (and start with a summary, and a summary of what
is different from the previous version, for example).  Some people
keep a changelog of what changed in all versions (newest on top of
course).

If there is only one patch, or you need to comment on something in just
one patch, you can do that after the "---" line.  Everything before that
line then is the exact commit message you will use (or anyone else can
do it as well, with a simple "git am").


The goal of a patch submission is for it to be reviewed.  Your
submission should be optimised for that, not for anything else.


So please send an incremental patch if you want more changes, or if the
previous version was actually very much broken, explain what?


Segher


Re: [PATCH] document BLOCK_ABSTRACT_ORIGIN et al.

2021-01-27 Thread Martin Sebor via Gcc-patches

Attached is an updated patch for both tree.h and the internals manual
documenting the most important BLOCK_ macros and what they represent.

On 1/21/21 2:52 PM, Martin Sebor wrote:

On 1/18/21 6:25 AM, Richard Biener wrote:

PS Here are my notes on the macros and the two related functions:

BLOCK: Denotes a lexical scope.  Contains BLOCK_VARS of variables
declared in it, BLOCK_SUBBLOCKS of scopes nested in it, and
BLOCK_CHAIN pointing to the next BLOCK.  Its BLOCK_SUPERCONTEXT
points to the BLOCK of the enclosing scope.  May have
a BLOCK_ABSTRACT_ORIGIN and a BLOCK_SOURCE_LOCATION.

BLOCK_SUPERCONTEXT: The scope of the enclosing block, or FUNCTION_DECL
for the "outermost" function scope.  Inlined functions are chained by
this so that given expression E and its TREE_BLOCK(E) B,
BLOCK_SUPERCONTEXT(B) is the scope (BLOCK) in which E has been made
or into which E has been inlined.  In the latter case,
BLOCK_ORIGIN(B) evaluates either to the enclosing BLOCK or to
the enclosing function DECL.  It's never null.

BLOCK_ABSTRACT_ORIGIN(B) is the FUNCTION_DECL of the function into
which it has been inlined, or null if B is not inlined.


It's the BLOCK or FUNCTION it was inlined _from_, not where it was inlined to.
It's the "ultimate" source, thus the abstract copy of the block or function
decl (for the outermost scope, aka inlined_function_outer_scope_p).  It
corresponds to what you'd expect for the DWARF abstract origin.


Thanks for the correction!  It's just the "innermost" block that
points to the "ultimate" destination into which it's been inlined.



BLOCK_ABSTRACT_ORIGIN can be NULL (in case it isn't an inline instance).


BLOCK_ABSTRACT_ORIGIN: A BLOCK, or FUNCTION_DECL of the function
into which a block has been inlined.  In a BLOCK immediately enclosing
an inlined leaf expression points to the outermost BLOCK into which it
has been inlined (thus bypassing all intermediate BLOCK_SUPERCONTEXTs).
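
To make the relationships in these notes concrete, here is a toy model in
plain C.  It is not GCC's tree representation: struct toy_block and the
toy_* functions are invented for illustration, and the outermost scope uses a
NULL supercontext where GCC would point at the FUNCTION_DECL.  The origin
helper follows the BLOCK_ORIGIN definition (the abstract origin if there is
one, else the node itself), with the abstract origin read as the scope a
block was inlined from.

```c
#include <stddef.h>

/* Toy stand-in for a BLOCK node (hypothetical, not GCC's layout).  */
struct toy_block
{
  const char *name;
  struct toy_block *supercontext;     /* enclosing scope, NULL at the top */
  struct toy_block *abstract_origin;  /* scope inlined from, or NULL */
};

/* Like BLOCK_ORIGIN: the abstract origin if set, otherwise B itself.  */
struct toy_block *
toy_block_origin (struct toy_block *b)
{
  return b->abstract_origin ? b->abstract_origin : b;
}

/* Follow supercontext links out to the outermost scope.  */
struct toy_block *
toy_outermost_scope (struct toy_block *b)
{
  while (b->supercontext != NULL)
    b = b->supercontext;
  return b;
}

/* Count the enclosing scopes (including B) that are inline instances,
   i.e. that have an abstract origin.  */
int
toy_inline_depth (const struct toy_block *b)
{
  int depth = 0;
  for (; b != NULL; b = b->supercontext)
    if (b->abstract_origin != NULL)
      depth++;
  return depth;
}
```

With a caller scope, an inlined copy of a callee nested in it, and an inner
scope inside that copy, toy_outermost_scope on the inner scope yields the
caller and toy_inline_depth yields 1.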

BLOCK_FRAGMENT_ORIGIN: ???
BLOCK_FRAGMENT_CHAIN: ???


that's for scope blocks split by hot/cold partitioning and only temporarily
populated.


Thanks, I now see these documented in detail in tree.h.




bool inlined_function_outer_scope_p(BLOCK)   [tree.h]
    Returns true if a BLOCK has a source location.
    True for all but the innermost (no SUBBLOCKs?) and outermost blocks
    into which an expression has been inlined. (Is this always true?)

tree block_ultimate_origin(BLOCK)   [tree.c]
    Returns BLOCK_ABSTRACT_ORIGIN(BLOCK), AO, after asserting that
    (DECL_P(AO) && DECL_ORIGIN(AO) == AO) || BLOCK_ORIGIN(AO) == AO).


The attached diff adds the comments above to tree.h.

I looked for a good place in the manual to add the same text but I'm
not sure.  Would the Blocks @subsection in generic.texi be appropriate?

Martin



Document various BLOCK macros.

gcc/ChangeLog:

	* doc/generic.texi (Function Basics): Mention BLOCK_SUBBLOCKS,
	BLOCK_VARS, BLOCK_SUPERCONTEXT, and BLOCK_ABSTRACT_ORIGIN.
	* doc/gimple.texi (GIMPLE): Update.  Mention free_lang_data pass.
	* tree.h (BLOCK_VARS): Add comment.
	(BLOCK_SUBBLOCKS): Same.
	(BLOCK_SUPERCONTEXT): Same.
	(BLOCK_ABSTRACT_ORIGIN): Same.
	(inlined_function_outer_scope_p): Same.

diff --git a/gcc/tree.h b/gcc/tree.h
index 02b03d1f68e..0dd2196008b 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1912,18 +1912,29 @@ class auto_suppress_location_wrappers
 #define OMP_CLAUSE_OPERAND(NODE, I)\
 	OMP_CLAUSE_ELT_CHECK (NODE, I)
 
-/* In a BLOCK node.  */
+/* In a BLOCK (scope) node:
+   Variables declared in the scope NODE.  */
 #define BLOCK_VARS(NODE) (BLOCK_CHECK (NODE)->block.vars)
 #define BLOCK_NONLOCALIZED_VARS(NODE) \
   (BLOCK_CHECK (NODE)->block.nonlocalized_vars)
 #define BLOCK_NUM_NONLOCALIZED_VARS(NODE) \
   vec_safe_length (BLOCK_NONLOCALIZED_VARS (NODE))
 #define BLOCK_NONLOCALIZED_VAR(NODE,N) (*BLOCK_NONLOCALIZED_VARS (NODE))[N]
+/* A chain of BLOCKs (scopes) nested within the scope NODE.  */
 #define BLOCK_SUBBLOCKS(NODE) (BLOCK_CHECK (NODE)->block.subblocks)
+/* The scope enclosing the scope NODE, or FUNCTION_DECL for the "outermost"
+   function scope.  Inlined functions are chained by this so that given
+   expression E and its TREE_BLOCK(E) B, BLOCK_SUPERCONTEXT(B) is the scope
+   in which E has been made or into which E has been inlined.   */
 #define BLOCK_SUPERCONTEXT(NODE) (BLOCK_CHECK (NODE)->block.supercontext)
+/* Points to the next scope at the same level of nesting as scope NODE.  */
 #define BLOCK_CHAIN(NODE) (BLOCK_CHECK (NODE)->block.chain)
+/* A BLOCK, or FUNCTION_DECL of the function from which a block has been
+   inlined.  In a scope immediately enclosing an inlined leaf expression,
+   points to the outermost scope into which it has been inlined (thus
+   bypassing all intermediate BLOCK_SUPERCONTEXTs). */
 #define BLOCK_ABSTRACT_ORIGIN(NODE) (BLOCK_CHECK (NODE)->block.abstract_origin)
-#define BLOCK_ORIGIN(NODE) \
+#define BLOCK_ORIGIN(NODE)		\
   (BLOCK_ABSTRACT_ORIGIN(NODE) ? BLOCK_ABSTRACT_ORIGIN(NODE) : (NODE))
 #define 

[Bug bootstrap/98860] boostrap failure on MinGW-w64 windows 10

2021-01-27 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860

--- Comment #3 from cqwrteur  ---
After reverting to the previous commit, compilation succeeds.

https://github.com/gcc-mirror/gcc/commit/bfab355012ca0f5219da8beb04f2fdaf757d34b7

I think it has to do with the script you changed, Jakub.

[Bug c++/98861] New: I want deterministic exceptions (Herbception)

2021-01-27 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98861

Bug ID: 98861
   Summary: I want deterministic exceptions (Herbception)
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: unlvsur at live dot com
  Target Milestone: ---

The mailing list requires me to request the feature here. I put it here.
https://www.mail-archive.com/gcc@gcc.gnu.org/msg94104.html
http://open-std.org/JTC1/SC22/WG21/docs/papers/2019/p0709r4.pdf

[Bug bootstrap/98860] boostrap failure on MinGW-w64 windows 10

2021-01-27 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860

--- Comment #2 from cqwrteur  ---
I guess it is because of this commit

https://github.com/gcc-mirror/gcc/commit/0411ae7f08e0f5a8b02ff313d26d27a0f6d1bb34

https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=0411ae7f08e0f5a8b02ff313d26d27a0f6d1bb34

Re: [PATCH, rs6000] improve vec_ctf invalid parameter handling. (pr91903)

2021-01-27 Thread Segher Boessenkool
Hi!

On Mon, Oct 26, 2020 at 04:22:32PM -0500, will schmidt wrote:
>   Per PR91903, GCC ICEs when we attempt to pass a variable
> (or out of range value) into the vec_ctf() builtin.  Per
> investigation, the parameter checking exists for this
> builtin with the int types, but was missing for
> the long long types.
> 
> This patch adds the missing CODE_FOR_* entries to the
> rs6000_expand_binop_builtin to cover that scenario.
> This patch also updates some existing tests to remove
> calls to vec_ctf() and vec_cts() that contain negative
> values.

> --- a/gcc/testsuite/gcc.target/powerpc/builtins-1.fold.h
> +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.fold.h
> @@ -212,14 +212,14 @@ int main ()
>extern vector unsigned long long u9; u9 = vec_mergeo (u3, u4);
>  
>extern vector long long l8; l8 = vec_mul (l3, l4);
>extern vector unsigned long long u6; u6 = vec_mul (u3, u4);
>  
> -  extern vector double dh; dh = vec_ctf (la, -2);
> +  extern vector double dh; dh = vec_ctf (la, 2);
>extern vector double di; di = vec_ctf (ua, 2);
>extern vector int sz; sz = vec_cts (fa, 0x1F);
> -  extern vector long long l9; l9 = vec_cts (dh, -2);
> +  extern vector long long l9; l9 = vec_cts (dh, 2);

I think removing the negative inputs here reduces test coverage?  Why
did you change them, it isn't immediately clear to me?
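
As background for the question above, here is a scalar model of the
conversion, written under the assumption that the scale operand of vec_ctf is
an unsigned 5-bit literal (0..31) and that the result is the integer input
divided by 2**scale.  The toy_* names are invented for this sketch; it is not
the builtin's implementation.

```c
#include <assert.h>

/* Assumed valid range for the scale operand: an unsigned 5-bit literal.  */
int
toy_ctf_scale_is_valid (int scale)
{
  return scale >= 0 && scale <= 31;
}

/* Scalar model of one lane: convert VALUE to floating point and divide
   by 2**SCALE.  Out-of-range scales are rejected up front.  */
double
toy_ctf (long long value, int scale)
{
  assert (toy_ctf_scale_is_valid (scale));
  return (double) value / (double) (1ULL << scale);
}
```

Under that assumption a scale of -2 is out of range rather than an extra
test input, while negative values in the vector operand remain valid, for
example toy_ctf (-8, 2).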

> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr91903.c
> @@ -0,0 +1,74 @@
> +/* { dg-do compile */
> +/* { dg-require-effective-target p8vector_hw } */

Compile tests should use p8vector_ok, instead.  (We do not care what
kind of hardware the system under test is: we can run this on a cross-
compiler just fine, after all!)

> +/* { dg-skip-if "" { powerpc*-*-darwin* } } */

Please skip this line.  If the test does not work for Darwin Iain can
easily disable it, but if you do, no one will find out if it does work.

Okay for trunk with those things fixed, and the -2 thing looked at.
Thanks!


Segher


Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-01-27 Thread Ed Smith-Rowland via Gcc-patches

On 1/27/21 3:32 PM, Jakub Jelinek wrote:

On Sun, Oct 21, 2018 at 04:39:30PM -0400, Ed Smith-Rowland wrote:

This patch implements C++2a proposal P0330R2 Literal Suffixes for ptrdiff_t
and size_t*.  It's not official yet but looks very likely to pass.  It is
incomplete because I'm looking for some opinions. (We also might wait until
it actually passes).

This paper takes the direction of a language change rather than a library
change through C++11 literal operators.  This was after feedback on that
paper after a few iterations.

As coded in this patch, integer suffixes involving 'z' are errors in C and
warnings for C++ <= 17 (in addition to the usual warning about
implementation suffixes shadowing user-defined ones).

OTOH, the 'z' suffix is not currently legal - it can't break
currently-correct code in any C/C++ dialect.  Furthermore, I suspect the
language direction was chosen to accommodate a similar addition to C20.

I'm thinking of making this feature available as an extension to all of
C/C++ perhaps with appropriate pedwarn.

GCC now supports -std=c++2b and -std=gnu++2b, are you going to update your
patch against it (and change for z/Z standing for ssize_t rather than
ptrdiff_t), plus incorporate the feedback from Joseph and Jason?

Jakub


I'm actually working on it now!




[Bug bootstrap/98860] boostrap failure on MinGW-w64 windows 10

2021-01-27 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860

--- Comment #1 from cqwrteur  ---
The question is why it says we are not cross-compiling.  I am using the
same script I used before.
https://bitbucket.org/ejsvifq_mabmip/mingw-gcc-mcf-gthread/src/master/PKGBUILD

It is so weird.


checking whether we are cross compiling... configure: error: in
`/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libgomp':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
configure: error: in
`/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libatomic':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
.exe
checking whether we are cross compiling... make[1]: *** [Makefile:15606:
configure-target-libgomp] Error 1
make[1]: *** Waiting for unfinished jobs
configure: error: in
`/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libssp':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
make[1]: *** [Makefile:16174: configure-target-libatomic] Error 1
configure: error: in
`/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libquadmath':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
make[1]: *** [Makefile:13329: configure-target-libssp] Error 1
make[1]: *** [Makefile:14375: configure-target-libquadmath] Error 1
make[1]: Leaving directory '/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32'
make: *** [Makefile:973: all] Error 2
==> ERROR: A failure occurred in build().
Aborting...

[Bug ipa/98594] [11 Regression] IPA modref codegen bug

2021-01-27 Thread hubicka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98594

--- Comment #3 from Jan Hubicka  ---
The initialization is removed by dse1 pass.  We get:
ipa-modref: call stmt D.3199 = bitCount::bitCount_bitfield<1, int,
glm::packed_highp> (); [return slot optimization]
ipa-modref: call to glm::vec bitCount::bitCount_bitfield(const
glm::vec&) [with int L = 1; T = int; glm::qualifier Q =
glm::packed_highp]/8 does not use ref: D.3185.D.3097.x alias sets: 3->1
  Deleted dead store: D.3185.D.3097.x = x_2(D); 

ipa-modref: call stmt D.3199 = bitCount::bitCount_bitfield<1, int,
glm::packed_highp> (); [return slot optimization]
ipa-modref: call to glm::vec bitCount::bitCount_bitfield(const
glm::vec&) [with int L = 1; T = int; glm::qualifier Q =
glm::packed_highp]/8 does not use ref: D.3185 alias sets: 3->3
  Deleted dead store: D.3185 ={v} {CLOBBER};

Now the modref summary for function is
  loads:
Limits: 32 bases, 16 refs   
  Base 0: alias set 5   
Ref 0: alias set 5  
  access: Parm 0 param offset:0 offset:0 size:32 max_size:32

alias set 5 corresponds to const struct vec but a different instantiation than
alias set 3 used in the store.
There is reinterpret cast:

  glm::vec::type,
Q>x(*reinterpret_cast::type,
Q> const *>());

turning it to

  glm::vec::type, Q> x(*());

makes the aliasing difference go away.  So it seems to me that the testcase
simply includes TBAA violation?
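
As general background on what a TBAA (type-based alias analysis) violation
looks like, and not as a diagnosis of this testcase, the C sketch below shows
the usual well-defined alternative to reading an object through a pointer of
an unrelated type.  It assumes a 32-bit IEEE 754 float; float_bits is a name
made up for the example.

```c
#include <stdint.h>
#include <string.h>

_Static_assert (sizeof (float) == sizeof (uint32_t),
		"sketch assumes a 32-bit float");

/* Well-defined: copy the object representation into the other type.
   Compilers typically turn this memcpy into a single register move.  */
uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* The form that breaks the aliasing rules would be
     return *(uint32_t *) &f;
   because it reads a float object through an lvalue of type uint32_t.
   The optimizer may then assume the load and any float store do not
   conflict, and reorder or delete the store.  */
```

The deleted dead stores in the dump above are the kind of transformation
that becomes visible when two accesses are assumed not to alias.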

[Bug c/97172] [11 Regression] ICE: tree code ‘ssa_name’ is not supported in LTO streams since r11-3303-g6450f07388f9fe57

2021-01-27 Thread msebor at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97172

--- Comment #25 from Martin Sebor  ---
Patch v3: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/564411.html

[PATCH v3] clear VLA bounds in attribute access (PR 97172)

2021-01-27 Thread Martin Sebor via Gcc-patches

Attached is another attempt to fix the problem caused by allowing
front-end trees representing nontrivial VLA bound expressions to
stay in attribute access attached to functions.  Since removing
these trees seems to be everyone's preference this patch does that
by extending the free_lang_data pass to look for and zero out these
trees.

Because free_lang_data only frees anything when LTO is enabled and
we want these trees cleared regardless to keep them from getting
clobbered during gimplification, this change also modifies the pass
to do the clearing even when the pass is otherwise inactive.
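
The clearing step can be pictured with a toy model in plain C.  The types and
names below (toy_node, toy_bound, toy_clear_expr_bounds) are invented for
illustration and are not GCC's tree API; the loop only mirrors the idea of
keeping bounds that are declarations and resetting bounds that are
expressions.

```c
#include <stddef.h>

/* Toy classification of a bound: a declaration or an expression.  */
enum toy_kind { TOY_DECL, TOY_EXPR };

struct toy_node
{
  enum toy_kind kind;
};

/* Toy list of VLA bounds, chained like a TREE_LIST.  */
struct toy_bound
{
  struct toy_node *value;
  struct toy_bound *chain;
};

/* Reset every bound that is an expression, leaving empty slots and
   declarations alone.  Returns the number of bounds cleared.  */
int
toy_clear_expr_bounds (struct toy_bound *list)
{
  int cleared = 0;
  for (; list != NULL; list = list->chain)
    {
      if (list->value == NULL || list->value->kind == TOY_DECL)
	continue;

      list->value = NULL;
      cleared++;
    }
  return cleared;
}
```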

Tested on x86_64-linux.

Martin
PR middle-end/97172 - ICE: tree code 'ssa_name' is not supported in LTO streams

gcc/ChangeLog:

	PR middle-end/97172
	* attribs.c (attr_access::free_lang_data): Define new function.
	* attribs.h (attr_access::free_lang_data): Declare new function.
	* tree.c (free_lang_data_in_type): Call attr_access::free_lang_data.
	(array_bound_from_maxval): Define new function.
	* tree.h (array_bound_from_maxval): Declare new function.

gcc/c-family/ChangeLog:

	PR middle-end/97172
	* c-pretty-print.c (c_pretty_printer::direct_abstract_declarator):
	Call array_bound_from_maxval.

gcc/c/ChangeLog:

	PR middle-end/97172
	* c-decl.c (get_parm_array_spec): Call array_bound_from_maxval.

gcc/testsuite/ChangeLog:

	PR middle-end/97172
	* gcc.dg/pr97172.c: New test.

diff --git a/gcc/attribs.c b/gcc/attribs.c
index 94991fbbeab..81322d40f1d 100644
--- a/gcc/attribs.c
+++ b/gcc/attribs.c
@@ -2238,6 +2238,38 @@ attr_access::vla_bounds (unsigned *nunspec) const
   return list_length (size);
 }
 
+/* Reset front end-specific attribute access data from ATTRS.
+   Called from the free_lang_data pass.  */
+
+/* static */ void
+attr_access::free_lang_data (tree attrs)
+{
+  for (tree acs = attrs; (acs = lookup_attribute ("access", acs));
+   acs = TREE_CHAIN (acs))
+{
+  tree vblist = TREE_VALUE (acs);
+  vblist = TREE_CHAIN (vblist);
+  if (!vblist)
+	continue;
+
+  vblist = TREE_VALUE (vblist);
+  if (!vblist)
+	continue;
+
+  for (vblist = TREE_VALUE (vblist); vblist; vblist = TREE_CHAIN (vblist))
+	{
+	  tree *pvbnd = &TREE_VALUE (vblist);
+	  if (!*pvbnd || DECL_P (*pvbnd))
+	continue;
+
+	  /* VLA bounds that are expressions as opposed to DECLs are
+	 only used in the front end.  Reset them to keep front end
+	 trees leaking into the middle end (see pr97172) and to
+	 free up memory.  */
+	  *pvbnd = NULL_TREE;
+	}
+}
+}
 
 /* Defined in attr_access.  */
 constexpr char attr_access::mode_chars[];
diff --git a/gcc/attribs.h b/gcc/attribs.h
index 21d28a47f39..898e73db3e4 100644
--- a/gcc/attribs.h
+++ b/gcc/attribs.h
@@ -274,6 +274,9 @@ struct attr_access
   /* Return the access mode corresponding to the character code.  */
   static access_mode from_mode_char (char);
 
+  /* Reset front end-specific attribute access data from attributes.  */
+  static void free_lang_data (tree);
+
   /* The character codes corresponding to all the access modes.  */
   static constexpr char mode_chars[5] = { '-', 'r', 'w', 'x', '^' };
 
diff --git a/gcc/c-family/c-pretty-print.c b/gcc/c-family/c-pretty-print.c
index 2095d4badf7..c6e8a45afd5 100644
--- a/gcc/c-family/c-pretty-print.c
+++ b/gcc/c-family/c-pretty-print.c
@@ -635,22 +635,7 @@ c_pretty_printer::direct_abstract_declarator (tree t)
 		  /* Strip the expressions from around a VLA bound added
 		 internally to make it fit the domain mold, including
 		 any casts.  */
-		  if (TREE_CODE (maxval) == NOP_EXPR)
-		maxval = TREE_OPERAND (maxval, 0);
-		  if (TREE_CODE (maxval) == PLUS_EXPR
-		  && integer_all_onesp (TREE_OPERAND (maxval, 1)))
-		{
-		  maxval = TREE_OPERAND (maxval, 0);
-		  if (TREE_CODE (maxval) == NOP_EXPR)
-			maxval = TREE_OPERAND (maxval, 0);
-		}
-		  if (TREE_CODE (maxval) == SAVE_EXPR)
-		{
-		  maxval = TREE_OPERAND (maxval, 0);
-		  if (TREE_CODE (maxval) == NOP_EXPR)
-			maxval = TREE_OPERAND (maxval, 0);
-		}
-
+		  maxval = array_bound_from_maxval (maxval);
 		  expression (maxval);
 		}
 	}
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 4ba9477f5d1..9dcad5e362d 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -5781,7 +5781,8 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)
 		{
 		  /* Each variable VLA bound is represented by the dollar
 		 sign.  */
-		  spec += "$";
+		  spec += '$';
+		  nelts = array_bound_from_maxval (nelts);
 		  tpbnds = tree_cons (NULL_TREE, nelts, tpbnds);
 		}
 	}
@@ -5835,7 +5836,8 @@ get_parm_array_spec (const struct c_parm *parm, tree attrs)
 	}
 
   /* Each variable VLA bound is represented by a dollar sign.  */
-  spec += "$";
+  spec += '$';
+  nelts = array_bound_from_maxval (nelts);
   vbchain = tree_cons (NULL_TREE, nelts, vbchain);
 }
 
diff --git a/gcc/testsuite/gcc.dg/pr97172.c b/gcc/testsuite/gcc.dg/pr97172.c
new file mode 100644
index 

[Bug bootstrap/98860] New: boostrap failure on MinGW-w64 windows 10

2021-01-27 Thread unlvsur at live dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98860

Bug ID: 98860
   Summary: boostrap failure on MinGW-w64 windows 10
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: bootstrap
  Assignee: unassigned at gcc dot gnu.org
  Reporter: unlvsur at live dot com
  Target Milestone: ---

checking whether we are cross compiling... configure: error: in
`/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libatomic':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
.exe
checking whether we are cross compiling... make[1]: *** [Makefile:15606:
configure-target-libgomp] Error 1
make[1]: *** Waiting for unfinished jobs
make[1]: *** [Makefile:16174: configure-target-libatomic] Error 1
configure: error: in
`/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libssp':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
configure: error: in
`/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32/x86_64-w64-mingw32/libquadmath':
configure: error: cannot run C compiled programs.
If you meant to cross compile, use `--host'.
See `config.log' for more details
make[1]: *** [Makefile:13329: configure-target-libssp] Error 1
make[1]: *** [Makefile:14375: configure-target-libquadmath] Error 1
make[1]: Leaving directory '/home/unlvs/mcf_build/src/build-x86_64-w64-mingw32'
make: *** [Makefile:973: all] Error 2
==> ERROR: A failure occurred in build().
Aborting...

[Bug rtl-optimization/80960] [8/9/10/11 Regression] Huge memory use when compiling a very large test case

2021-01-27 Thread segher at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80960

--- Comment #26 from Segher Boessenkool  ---
(In reply to Richard Biener from comment #23)
> (that combine number prevails on trunk as well, I can't spot any code
> that disables combine on large BBs so not sure what goes on here)

There is no such thing, indeed.  And the instruction combiner is
"mostly linear", so it shouldn't actually matter.

Re: [PATCH Fortran] Re: PR fortran/93524 - rank >= 3 array stride incorrectly set in CFI_establish

2021-01-27 Thread Thomas Koenig via Gcc-patches



Hi Harris!


OK for master? I do not have write access, so someone will need to
commit this for me.


Reviewed, regression-tested and committed as

https://gcc.gnu.org/g:1cdca4261e88f4dc9c3293c6b3c2fff3071ca32b

Thanks for your patch, and welcome aboard!

Best regards

Thomas


[Bug fortran/86470] [8/9/10/11 Regression] [OOP] ICE with OMP

2021-01-27 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86470

anlauf at gcc dot gnu.org changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org  |anlauf at gcc dot gnu.org
                 CC|                               |anlauf at gcc dot gnu.org
             Status|NEW                            |ASSIGNED

--- Comment #8 from anlauf at gcc dot gnu.org ---
Submitted: https://gcc.gnu.org/pipermail/fortran/2021-January/055647.html

[PATCH] [8/9/10/11 Regression] [OOP] PR fortran/86470 - ICE with OpenMP

2021-01-27 Thread Harald Anlauf via Gcc-patches
Dear all,

the fix for this ICE is obvious: make gfc_call_malloc behave as documented.
Apparently the special case in question was not exercised in the testsuite.
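
The documented behaviour can be modelled at run time in plain C.  This is
only a sketch of the rule (no size given means one byte, and the request is
never smaller than one byte); effective_alloc_size and toy_call_malloc are
names invented here, whereas the real function builds trees at compile time.

```c
#include <stdlib.h>

/* SIZE == NULL models "no size given".  The result is MAX (size, 1),
   with a missing size treated as 1.  */
size_t
effective_alloc_size (const size_t *size)
{
  size_t n = size != NULL ? *size : 1;
  return n > 1 ? n : 1;
}

/* Allocate at least one byte so a successful call never returns a
   zero-sized (possibly NULL) result.  */
void *
toy_call_malloc (const size_t *size)
{
  return malloc (effective_alloc_size (size));
}
```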

Regtested on x86_64-pc-linux-gnu.

OK for master / backports?

Should the testcase be moved to the gomp/ subdirectory?

Thanks,
Harald


PR fortran/86470 - ICE with OpenMP, class(*) allocatable

gfc_call_malloc should malloc an area of size 1 if no size given.

gcc/fortran/ChangeLog:

PR fortran/86470
* trans.c (gfc_call_malloc): Allocate area of size 1 if passed
size is NULL (as documented).

gcc/testsuite/ChangeLog:

PR fortran/86470
* gfortran.dg/pr86470.f90: New test.

diff --git a/gcc/fortran/trans.c b/gcc/fortran/trans.c
index a2376917635..ab53fc5f441 100644
--- a/gcc/fortran/trans.c
+++ b/gcc/fortran/trans.c
@@ -689,6 +689,9 @@ gfc_call_malloc (stmtblock_t * block, tree type, tree size)
   /* Call malloc.  */
   gfc_start_block ();

+  if (size == NULL_TREE)
+size = build_int_cst (size_type_node, 1);
+
   size = fold_convert (size_type_node, size);
   size = fold_build2_loc (input_location, MAX_EXPR, size_type_node, size,
 			  build_int_cst (size_type_node, 1));
diff --git a/gcc/testsuite/gfortran.dg/pr86470.f90 b/gcc/testsuite/gfortran.dg/pr86470.f90
new file mode 100644
index 000..4021e5d655c
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr86470.f90
@@ -0,0 +1,13 @@
+! { dg-do compile }
+! { dg-options "-fopenmp" }
+! PR fortran/86470 - ICE with OpenMP, class(*)
+
+program p
+  implicit none
+  class(*), allocatable :: val
+!$OMP PARALLEL private(val)
+  allocate(integer::val)
+  val = 1
+  deallocate(val)
+!$OMP END PARALLEL
+end


[PATCH Fortran] Re: PR fortran/93524 - rank >= 3 array stride incorrectly set in CFI_establish

2021-01-27 Thread Harris Snyder
(re-sending with subject line tags)

Hi all,

Now that my copyright assignment is complete, I'm submitting this fix.
Test cases are included.
OK for master? I do not have write access, so someone will need to
commit this for me.

Regards,
Harris

libgfortran/ChangeLog:

* runtime/ISO_Fortran_binding.c (CFI_establish):  fixed strides
for rank >2 arrays

gcc/testsuite/ChangeLog:

* gfortran.dg/ISO_Fortran_binding_18.c: New test.
* gfortran.dg/ISO_Fortran_binding_18.f90: New test.

> On Wed, Jan 13, 2021 at 2:10 PM Harris Snyder  wrote:
> >
> > Hi Tobias / all,
> >
> > Further related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93524
> > `sm` is being incorrectly computed in CFI_establish. Take a look at
> > the diff below - we are currently only using the extent of the
> > previous rank to assign `sm`, instead of all previous ranks. Have I
> > got this right, or am I missing something / does this need to be
> > handled differently? I can offer some test cases and submit a proper
> > patch if we think this solution is OK...
> >
> > Thanks,
> > Harris
> >
> > diff --git a/libgfortran/runtime/ISO_Fortran_binding.c
> > b/libgfortran/runtime/ISO_Fortran_binding.c
> > index 3746ec1c681..20833ad2025 100644
> > --- a/libgfortran/runtime/ISO_Fortran_binding.c
> > +++ b/libgfortran/runtime/ISO_Fortran_binding.c
> > @@ -391,7 +391,12 @@ int CFI_establish (CFI_cdesc_t *dv, void *base_addr, CFI_attribute_t attribute,
> >   if (i == 0)
> > dv->dim[i].sm = dv->elem_len;
> >   else
> > -   dv->dim[i].sm = (CFI_index_t)(dv->elem_len * extents[i - 1]);
> > +   {
> > + CFI_index_t extents_product = 1;
> > + for (int j = 0; j < i; j++)
> > +   extents_product *= extents[j];
> > + dv->dim[i].sm = (CFI_index_t)(dv->elem_len * extents_product);
> > +   }
> > }
> >  }
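
To make the stride arithmetic concrete, here is a minimal standalone sketch.
It does not use ISO_Fortran_binding.h: the helper names strides_old and
strides_fixed and the plain ptrdiff_t arrays are illustrative assumptions,
not libgfortran code. It only contrasts "previous extent" with "product of
all previous extents".

```c
#include <stddef.h>

/* Hypothetical stand-ins for the sm computation in CFI_establish.  */

/* Old rule: only the extent of the previous dimension is used.  */
static void
strides_old (ptrdiff_t elem_len, int rank, const ptrdiff_t *extents,
             ptrdiff_t *sm)
{
  for (int i = 0; i < rank; i++)
    sm[i] = (i == 0) ? elem_len : elem_len * extents[i - 1];
}

/* Fixed rule: the stride of dimension i is elem_len times the product
   of the extents of all previous dimensions.  */
static void
strides_fixed (ptrdiff_t elem_len, int rank, const ptrdiff_t *extents,
               ptrdiff_t *sm)
{
  ptrdiff_t stride = elem_len;
  for (int i = 0; i < rank; i++)
    {
      sm[i] = stride;
      stride *= extents[i];
    }
}
```

With extents {2, 10, 9} and a 4-byte element, the old rule gives
sm = {4, 8, 40} while the product rule gives {4, 8, 80}. The two agree up to
rank 2, which is why only arrays of rank 3 and above were affected.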
commit 451bd40aca006ebdba52553de2392fcb5b1ff42f
Author: Harris M. Snyder 
Date:   Tue Jan 26 23:29:24 2021 -0500

Partial fix for PR fortran/93524

diff --git a/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.c b/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.c
new file mode 100644
index 000..4d1c4ecbd72
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.c
@@ -0,0 +1,29 @@
+#include <ISO_Fortran_binding.h>
+
+#include <stdlib.h>
+#include <string.h>
+
+
+
+extern int do_loop(CFI_cdesc_t* array);
+
+int main(int argc, char ** argv)
+{
+	int nx = 9;
+	int ny = 10;
+	int nz = 2;
+
+	int arr[nx*ny*nz];
+	memset(arr,0,sizeof(int)*nx*ny*nz);
+	CFI_index_t shape[3];
+	shape[0] = nz;
+	shape[1] = ny;
+	shape[2] = nx;
+
+	CFI_CDESC_T(3) farr;
+	int rc = CFI_establish((CFI_cdesc_t*)&farr, arr, CFI_attribute_other, CFI_type_int, 0, (CFI_rank_t)3, (const CFI_index_t *)shape);
+	if (rc != CFI_SUCCESS) abort();
+	int result = do_loop((CFI_cdesc_t*)&farr);
+	if (result != nx*ny*nz) abort();
+	return 0;
+}
diff --git a/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.f90 b/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.f90
new file mode 100644
index 000..76be51d22fb
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.f90
@@ -0,0 +1,28 @@
+! { dg-do run }
+! { dg-additional-sources ISO_Fortran_binding_18.c }
+
+module fortran_binding_test_18
+use iso_c_binding
+implicit none
+contains
+
+subroutine test(array)
+integer(c_int) :: array(:)
+array = 1
+end subroutine
+
+function do_loop(array) result(the_sum) bind(c)
+integer(c_int), intent(in out) :: array(:,:,:)
+integer(c_int) :: the_sum, i, j
+
+the_sum = 0  
+array = 0
+do i=1,size(array,3)
+do j=1,size(array,2)
+call test(array(:,j,i))
+end do
+end do
+the_sum = sum(array)
+end function
+
+end module
diff --git a/libgfortran/runtime/ISO_Fortran_binding.c b/libgfortran/runtime/ISO_Fortran_binding.c
index 3746ec1c681..20833ad2025 100644
--- a/libgfortran/runtime/ISO_Fortran_binding.c
+++ b/libgfortran/runtime/ISO_Fortran_binding.c
@@ -391,7 +391,12 @@ int CFI_establish (CFI_cdesc_t *dv, void *base_addr, CFI_attribute_t attribute,
 	  if (i == 0)
 	dv->dim[i].sm = dv->elem_len;
 	  else
-	dv->dim[i].sm = (CFI_index_t)(dv->elem_len * extents[i - 1]);
+	{
+	  CFI_index_t extents_product = 1;
+	  for (int j = 0; j < i; j++)
+		extents_product *= extents[j];
+	  dv->dim[i].sm = (CFI_index_t)(dv->elem_len * extents_product);
+	}
 	}
 }
 


[PATCH, revised, #2] PowerPC: Add float128/Decimal conversions.

2021-01-27 Thread Michael Meissner via Gcc-patches
From 02b04aed77130f2ec9156d2f7ff89d4cc6b5a78b Mon Sep 17 00:00:00 2001
From: Michael Meissner 
Date: Thu, 21 Jan 2021 12:58:56 -0500
Subject: [PATCH, revised] PowerPC: Add float128/Decimal conversions.

[PATCH, revised] PowerPC: Add float128/Decimal conversions.

Unfortunately, the revision I just posted had the old patch, and not the new
patch.  This patch actually has the BFP_FMT set to "%.36Le" which gives enough
accuracy to allow c-c++-common/dfp/convert-bfp-6.c test to pass.

This patch replaces the following three patches:

September 24th, 2020:
Message-ID: <20200924203545.gd31...@ibm-toto.the-meissners.org>

October 22nd, 2020:
Message-ID: <2020100603.ga11...@ibm-toto.the-meissners.org>

January 14th, 2021:
Message-ID: <20210114170936.ga3...@ibm-toto.the-meissners.org>
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563498.html

This patch rewrites those patches.  In order to run with older GLIBC's, this
patch uses weak references to the IEEE 128-bit conversions to/from string that
are found in GLIBC 2.32.

If the user uses GLIBC 2.32 or later, the Decimal <-> Float128 conversions will
call the functions in that library.  This isn't ideal, as IEEE 128-bit has more
exponent range than IBM 128-bit.

If an older library is used, these patches will convert IEEE 128-bit to IBM
128-bit and do the conversion with IBM 128-bit.  I have tested this with a
compiler configured to use an older library, and it worked for the conversion
if the number could be represented in the IBM 128-bit format.

While most of the Decimal <-> Long double tests now pass when long doubles are
IEEE 128-bit, there is one test that fails:

*   c-c++-common/dfp/convert-bfp-11.c

I have patches for the bfp-11 test (which requires that long double be IBM
128-bit).

Compared to the patch on January 14th, this patch fixes the format string
for converting IEEE 128-bit floating point to string.  This in turn allows
the c-c++-common/dfp/convert-bfp-6.c to pass.

I have tested this patch by doing builds, bootstraps, and make check with 3
builds on a power9 little endian server:

*   Build one used the default long double being IBM 128-bit;
*   Build two set the long double default to IEEE 128-bit; (and)
*   Build three set the long double default to 64-bit.

I have also built and tested this patch on a big endian Power8 system with
both 64 and 32-bit targets.  There were no regressions.

The compilers built fine providing I recompiled gmp, mpc, and mpfr with the
appropriate long double options.  There were a few differences in the test
suite runs that will be addressed in later patches, but overall it works
well.  This patch is required to be able to build a toolchain where the default
long double is IEEE 128-bit.  Can I check this patch into the master branch for
GCC 11?

libgcc/
2021-01-27  Michael Meissner  

* config/rs6000/_dd_to_kf.c: New file.
* config/rs6000/_kf_to_dd.c: New file.
* config/rs6000/_kf_to_sd.c: New file.
* config/rs6000/_kf_to_td.c: New file.
* config/rs6000/_sd_to_kf.c: New file.
* config/rs6000/_sprintfkf.c: New file.
* config/rs6000/_sprintfkf.h: New file.
* config/rs6000/_strtokf.h: New file.
* config/rs6000/_strtokf.c: New file.
* config/rs6000/_td_to_kf.c: New file.
* config/rs6000/quad-float128.h: Add new declarations.
* config/rs6000/t-float128 (fp128_dec_funcs): New macro.
(fp128_decstr_funcs): New macro.
(ibm128_dec_funcs): New macro.
(fp128_ppc_funcs): Add the new conversions.
(fp128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(fp128_decstr_objs): Force __float128 <-> string conversions to be
compiled with -mabi=ibmlongdouble.
(ibm128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(FP128_CFLAGS_DECIMAL): New macro.
(IBM128_CFLAGS_DECIMAL): New macro.
* dfp-bit.c (DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
* dfp-bit.h (BFP_KIND): Add new binary floating point kind for
IEEE 128-bit floating point.
(DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
(BFP_SPRINTF): New macro.
---
 libgcc/config/rs6000/_dd_to_kf.c | 37 ++
 libgcc/config/rs6000/_kf_to_dd.c | 37 ++
 libgcc/config/rs6000/_kf_to_sd.c | 37 ++
 libgcc/config/rs6000/_kf_to_td.c | 37 ++
 libgcc/config/rs6000/_sd_to_kf.c | 37 ++
 libgcc/config/rs6000/_sprintfkf.c| 57 
 libgcc/config/rs6000/_sprintfkf.h| 28 ++
 libgcc/config/rs6000/_strtokf.c  | 56 +++
 libgcc/config/rs6000/_strtokf.h  | 27 +
 libgcc/config/rs6000/_td_to_kf.c  

Re: [PATCH, revised] PowerPC: Add float128/Decimal conversions.

2021-01-27 Thread Michael Meissner via Gcc-patches
[PATCH, revised] PowerPC: Add float128/Decimal conversions.

This patch revises the patch on January 14th.  The only change in this patch
compared to the previous patch is to change the format string for converting
IEEE 128-bit to string.  This allows the c-c++-common/dfp/convert-bfp-6.c test
to pass.

This patch replaces the following three patches:

September 24th, 2020:
Message-ID: <20200924203545.gd31...@ibm-toto.the-meissners.org>

October 22nd, 2020:
Message-ID: <2020100603.ga11...@ibm-toto.the-meissners.org>

January 14th, 2021:
Message-ID: <20210114170936.ga3...@ibm-toto.the-meissners.org>
https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563498.html

This patch rewrites those patches.  In order to run with older GLIBC's, this
patch uses weak references to the IEEE 128-bit conversions to/from string that
are found in GLIBC 2.32.

If the user uses GLIBC 2.32 or later, the Decimal <-> Float128 conversions will
call the functions in that library.  This isn't ideal, as IEEE 128-bit has more
exponent range than IBM 128-bit.

If an older library is used, these patches will convert IEEE 128-bit to IBM
128-bit and do the conversion with IBM 128-bit.  I have tested this with a
compiler configured to use an older library, and it worked for the conversion
if the number could be represented in the IBM 128-bit format.

While most of the Decimal <-> Long double tests now pass when long doubles are
IEEE 128-bit, there is one test that fails:

*   c-c++-common/dfp/convert-bfp-11.c

I have patches for the bfp-11 test (which requires that long double be IBM
128-bit).

I have tested this patch by doing builds, bootstraps, and make check with 3
builds on a power9 little endian server:

*   Build one used the default long double being IBM 128-bit;
*   Build two set the long double default to IEEE 128-bit; (and)
*   Build three set the long double default to 64-bit.

I have also built and tested this patch on a big endian Power8 system with
both 64 and 32-bit targets.  There were no regressions.

The compilers built fine providing I recompiled gmp, mpc, and mpfr with the
appropriate long double options.  There were a few differences in the test
suite runs that will be addressed in later patches, but overall it works
well.  This patch is required to be able to build a toolchain where the default
long double is IEEE 128-bit.  Can I check this patch into the master branch for
GCC 11?

libgcc/
2021-01-27  Michael Meissner  

* config/rs6000/_dd_to_kf.c: New file.
* config/rs6000/_kf_to_dd.c: New file.
* config/rs6000/_kf_to_sd.c: New file.
* config/rs6000/_kf_to_td.c: New file.
* config/rs6000/_sd_to_kf.c: New file.
* config/rs6000/_sprintfkf.c: New file.
* config/rs6000/_sprintfkf.h: New file.
* config/rs6000/_strtokf.h: New file.
* config/rs6000/_strtokf.c: New file.
* config/rs6000/_td_to_kf.c: New file.
* config/rs6000/quad-float128.h: Add new declarations.
* config/rs6000/t-float128 (fp128_dec_funcs): New macro.
(fp128_decstr_funcs): New macro.
(ibm128_dec_funcs): New macro.
(fp128_ppc_funcs): Add the new conversions.
(fp128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(fp128_decstr_objs): Force __float128 <-> string conversions to be
compiled with -mabi=ibmlongdouble.
(ibm128_dec_objs): Force Decimal <-> __float128 conversions to be
compiled with -mabi=ieeelongdouble.
(FP128_CFLAGS_DECIMAL): New macro.
(IBM128_CFLAGS_DECIMAL): New macro.
* dfp-bit.c (DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
* dfp-bit.h (BFP_KIND): Add new binary floating point kind for
IEEE 128-bit floating point.
(DFP_TO_BFP): Add PowerPC _Float128 support.
(BFP_TO_DFP): Add PowerPC _Float128 support.
(BFP_SPRINTF): New macro.
---
 libgcc/config/rs6000/_dd_to_kf.c | 37 ++
 libgcc/config/rs6000/_kf_to_dd.c | 37 ++
 libgcc/config/rs6000/_kf_to_sd.c | 37 ++
 libgcc/config/rs6000/_kf_to_td.c | 37 ++
 libgcc/config/rs6000/_sd_to_kf.c | 37 ++
 libgcc/config/rs6000/_sprintfkf.c| 57 
 libgcc/config/rs6000/_sprintfkf.h| 28 ++
 libgcc/config/rs6000/_strtokf.c  | 56 +++
 libgcc/config/rs6000/_strtokf.h  | 27 +
 libgcc/config/rs6000/_td_to_kf.c | 37 ++
 libgcc/config/rs6000/quad-float128.h |  8 
 libgcc/config/rs6000/t-float128  | 37 +-
 libgcc/dfp-bit.c | 12 +-
 libgcc/dfp-bit.h | 26 +
 14 files changed, 470 insertions(+), 3 deletions(-)
 create mode 100644 libgcc/config/rs6000/_dd_to_kf.c
 create mode 100644 

[committed] [PR97684] IRA: Recalculate pseudo classes if we added new pseudos since last calculation before updating equiv regs

2021-01-27 Thread Vladimir Makarov via Gcc-patches

The patch solves the following problem:

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97684

The patch was successfully bootstrapped and tested on x86-64.

commit 238ea13cca75ad499f227b60a95c40174c6caf78
Author: Vladimir N. Makarov 
Date:   Wed Jan 27 14:53:28 2021 -0500

[PR97684] IRA: Recalculate pseudo classes if we added new pseudos since last calculation before updating equiv regs

update_equiv_regs can use reg classes of pseudos and they are set up in
register pressure sensitive scheduling and loop invariant motion and in
live range shrinking.  This info can become obsolete if we add new pseudos
since the last set up.  Recalculate it again if the new pseudos were
added.

gcc/ChangeLog:

PR rtl-optimization/97684
* ira.c (ira): Call ira_set_pseudo_classes before
update_equiv_regs when it is necessary.

gcc/testsuite/ChangeLog:

PR rtl-optimization/97684
* gcc.target/i386/pr97684.c: New.

diff --git a/gcc/ira.c b/gcc/ira.c
index f0bdbc8cf56..c32ecf814fd 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5566,6 +5566,15 @@ ira (FILE *f)
   if (warn_clobbered)
 generate_setjmp_warnings ();
 
+  /* update_equiv_regs can use reg classes of pseudos and they are set up in
+ register pressure sensitive scheduling and loop invariant motion and in
+ live range shrinking.  This info can become obsolete if we add new pseudos
+ since the last set up.  Recalculate it again if the new pseudos were
+ added.  */
+  if (resize_reg_info () && (flag_sched_pressure || flag_live_range_shrinkage
+			 || flag_ira_loop_pressure))
+ira_set_pseudo_classes (true, ira_dump_file);
+
   init_alias_analysis ();
   loop_optimizer_init (AVOID_CFG_MODIFICATIONS);
   reg_equiv = XCNEWVEC (struct equivalence, max_reg_num ());
@@ -5610,9 +5619,6 @@ ira (FILE *f)
   regstat_recompute_for_max_regno ();
 }
 
-  if (resize_reg_info () && flag_ira_loop_pressure)
-ira_set_pseudo_classes (true, ira_dump_file);
-
   setup_reg_equiv ();
   grow_reg_equivs ();
   setup_reg_equiv_init ();
diff --git a/gcc/testsuite/gcc.target/i386/pr97684.c b/gcc/testsuite/gcc.target/i386/pr97684.c
new file mode 100644
index 000..983bf535ad8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr97684.c
@@ -0,0 +1,24 @@
+/* PR rtl-optimization/97684 */
+/* { dg-do compile } */
+/* { dg-options "-O1 -flive-range-shrinkage -fschedule-insns -fselective-scheduling -funroll-all-loops -fno-web" } */
+
+void
+c5 (double);
+
+void
+g4 (int *n4)
+{
+  double lp = 0.0;
+  int fn;
+
+  for (fn = 0; fn < 18; ++fn)
+{
+  int as;
+
+  as = __builtin_abs (n4[fn]);
+  if (as > lp)
+lp = as;
+}
+
+  c5 (lp);
+}


[Bug rtl-optimization/97684] [11 Regression] ICE in reg_preferred_class, at reginfo.c:789 by r11-4577

2021-01-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97684

--- Comment #6 from CVS Commits  ---
The master branch has been updated by Vladimir Makarov :

https://gcc.gnu.org/g:081c96621da658760b4a67c07530805f770fa22c

commit r11-6943-g081c96621da658760b4a67c07530805f770fa22c
Author: Vladimir N. Makarov 
Date:   Wed Jan 27 14:53:28 2021 -0500

[PR97684] IRA: Recalculate pseudo classes if we added new pseudos since
last calculation before updating equiv regs

update_equiv_regs can use reg classes of pseudos and they are set up in
register pressure sensitive scheduling and loop invariant motion and in
live range shrinking.  This info can become obsolete if we add new pseudos
since the last set up.  Recalculate it again if the new pseudos were
added.

gcc/ChangeLog:

PR rtl-optimization/97684
* ira.c (ira): Call ira_set_pseudo_classes before
update_equiv_regs when it is necessary.

gcc/testsuite/ChangeLog:

PR rtl-optimization/97684
* gcc.target/i386/pr97684.c: New.

[Bug libstdc++/70303] Value-initialized debug iterators

2021-01-27 Thread fdumont at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70303

François Dumont  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
                 CC|                              |fdumont at gcc dot gnu.org
           Assignee|unassigned at gcc dot gnu.org |fdumont at gcc dot gnu.org

--- Comment #6 from François Dumont  ---
After fixing the duplicate PR 98466 std::vector::iterator is ok but
std::deque::iterator seems to be broken still.

Taking it.

[Bug c++/98859] pedantic error on use of __VA_OPT__ before C++20 is unnecessary and counterproductive

2021-01-27 Thread mpolacek at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98859

Marek Polacek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mpolacek at gcc dot gnu.org
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2021-01-27
           Keywords|                            |diagnostic

--- Comment #1 from Marek Polacek  ---
That sounds reasonable.

[Bug c++/98859] New: pedantic error on use of __VA_OPT__ before C++20 is unnecessary and counterproductive

2021-01-27 Thread richard-gccbugzilla at metafoo dot co.uk via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98859

Bug ID: 98859
   Summary: pedantic error on use of __VA_OPT__ before C++20 is
unnecessary and counterproductive
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: richard-gccbugzilla at metafoo dot co.uk
  Target Milestone: ---

There's no good way in ISO C or C++ to express what the GNU ,##__VA_ARGS__
extension does prior to the addition of __VA_OPT__. However, code targeting new
compilers (that doesn't want to use GNU C / GNU C++) cannot reliably use
__VA_OPT__ instead of the comma paste extension, because GCC's -pedantic-errors
mode rejects it outside C++20.

Such rejection is unnecessary: __VA_OPT__ is a reserved identifier in other
language modes, so there is no conformance reason to issue a diagnostic on its
use. I think it'd be useful for GCC to unconditionally allow using __VA_OPT__
in all language modes. (I'm changing Clang to do the same.)
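
For illustration, a minimal sketch of the two spellings side by side. The
macro names LOG_GNU and LOG_OPT and the helper same_expansion are made up
for this example, and it assumes a compiler that accepts both the GNU comma
paste extension and __VA_OPT__ in the selected language mode (which is
exactly what this report asks for).

```c
#include <stdio.h>
#include <string.h>

/* GNU extension: ## before __VA_ARGS__ drops the preceding comma when
   no variable arguments are given.  */
#define LOG_GNU(buf, size, fmt, ...) \
  snprintf ((buf), (size), fmt , ##__VA_ARGS__)

/* C++20 / C2x spelling: __VA_OPT__(,) emits the comma only when
   variable arguments are present.  */
#define LOG_OPT(buf, size, fmt, ...) \
  snprintf ((buf), (size), fmt __VA_OPT__ (,) __VA_ARGS__)

/* Hypothetical check: returns 1 if both macros format identically,
   with and without variable arguments.  */
static int
same_expansion (void)
{
  char a[32], b[32];

  LOG_GNU (a, sizeof a, "no args");
  LOG_OPT (b, sizeof b, "no args");
  if (strcmp (a, b) != 0)
    return 0;

  LOG_GNU (a, sizeof a, "%d-%s", 7, "x");
  LOG_OPT (b, sizeof b, "%d-%s", 7, "x");
  return strcmp (a, b) == 0;
}
```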

Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread Michael Meissner via Gcc-patches
On Wed, Jan 27, 2021 at 01:06:46PM -0600, will schmidt wrote:
> On Thu, 2021-01-14 at 11:59 -0500, Michael Meissner via Gcc-patches wrote:
> > From 78435dee177447080434cdc08fc76b1029c7f576 Mon Sep 17 00:00:00 2001
> > From: Michael Meissner 
> > Date: Wed, 13 Jan 2021 21:47:03 -0500
> > Subject: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.
> > 
> > This patch replaces patches previously submitted:
> > 
> > September 24th, 2020:
> > Message-ID: <20200924203159.ga31...@ibm-toto.the-meissners.org>
> > 
> > October 9th, 2020:
> > Message-ID: <20201009043543.ga11...@ibm-toto.the-meissners.org>
> > 
> > October 24th, 2020:
> > Message-ID: <2020100346.ga8...@ibm-toto.the-meissners.org>
> > 
> > November 19th, 2020:
> > Message-ID: <20201119235814.ga...@ibm-toto.the-meissners.org>
> 
> 
> Subject and date should be sufficient _if_ having the old versions
> of the patches is necessary to review the latest version of the
> patch.  Which ideally is not the case.
> 
> 
> > 
> > This patch maps the built-in functions that take or return long double
> > arguments on systems where long double is IEEE 128-bit.
> > 
> > If long double is IEEE 128-bit, this patch goes through the built-in
> > functions and changes the name of the math, scanf, and printf built-in
> > functions to use the functions that GLIBC provides when long double uses
> > the IEEE 128-bit representation.
> 
> ok.
> 
> > 
> > In addition, changing the name in GCC allows the Fortran compiler to
> > automatically use the correct name.
> 
> Does the fortran compiler currently use the wrong name? (pr?)

Yes.  If the compiler is configured for IBM 128-bit long double, the Fortran
compiler calls 'sinl' for real*16.  If the compiler is configured for IEEE
128-bit long double, the compiler needs to call __sinieee128 instead of sinl.

Similarly if a C or C++ user calls __builtin_sinl directly without including
math.h, the wrong name would be used.

Hence what this code does is change the names of all of the built-in functions
that can use long double to be the names appropriate for IEEE 128-bit.
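
As a rough sketch of the common case only (sinl becoming __sinieee128), the
renaming can be pictured as below. The function map_ieee128_name is a
made-up helper for illustration, not the code in rs6000.c: it looks only at
the spelling of the name, and it ignores the exceptions and the
printf/scanf families. The real mapping is applied only to built-ins that
take or return long double.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: map "<name>l" to "__<name>ieee128".
   Returns 0 on success, -1 if NAME does not end in 'l' or OUT is too
   small.  This is a simplification based purely on the name.  */
static int
map_ieee128_name (const char *name, char *out, size_t out_size)
{
  size_t len = strlen (name);

  if (len < 2 || name[len - 1] != 'l')
    return -1;

  /* Copy everything but the trailing 'l' between the prefix and suffix.  */
  int n = snprintf (out, out_size, "__%.*sieee128", (int) (len - 1), name);
  return (n < 0 || (size_t) n >= out_size) ? -1 : 0;
}
```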

> > 
> > To map the math functions, typically this patch changes <name>l to
> > __<name>ieee128.  However there are some exceptions that are handled with
> > this patch.
> 
> This appears to be the rs6000_mangle_decl_assembler_name() function, which
> also maps <name>l_r to <name>ieee128_r, and looks like some additional special
> handling for printf and scanf.

Yes, the rs6000_mangle_decl_assembler_name was not complete in the mapping.  In
particular, it did not handle *printf, *scanf, or *l_r calls.  There are also a
few names that need to have a different mapping.

> 
> > To map the printf functions, <name> is mapped to __<name>ieee128.
> > 
> > To map the scanf functions, <name> is mapped to __isoc99_<name>ieee128.
> 
> 
> > 
> > I have tested this patch by doing builds, bootstraps, and make check with 3
> > builds on a power9 little endian server:
> > 
> > *   Build one used the default long double being IBM 128-bit;
> > *   Build two set the long double default to IEEE 128-bit; (and)
> > *   Build three set the long double default to 64-bit.
> > 
> 
> ok
> 
> > The compilers built fine providing I recompiled gmp, mpc, and mpfr with the
> > appropriate long double options.
> 
> Presumably the build is otherwise broken... 
> Does that mean more than invoking download_prerequisites as part of the
> build?   If there are specific options required during configure/build of
> those packages, they should be called out.
> 
> > There were a few differences in the test
> > suite runs that will be addressed in later patches, but overall it works
> > well.
> 
> Presumably minimal. :-)

It depends on what you mean by minimal.

* There are 5 C tests that fail (2 Decimal/IEEE, 3 NaN related)
* 2 C tests that need some changes to be able to run
* There are 2 C++ tests that fail (Decimal/IEEE, same as the C tests)
* There are 31 C++ modules tests that fail (PR 98645)
* There are 3 Fortran tests that used to fail that now pass

I have patches for the Decimal/IEEE tests

> 
> > This patch is required to be able to build a toolchain where the
> > default long double is IEEE 128-bit.
> 
> Ok.  Could lead the patch description with this.  I imagine this is
> just one of several patches that are still required towards that goal.

In terms of 'need', this patch and the Decimal patch next are the two patches
that absolutely need to be installed.  The others fix some things and tests,
but are not required.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] rs6000: Fix vec insert ilp32 ICE and test failures [PR98799]

2021-01-27 Thread David Edelsohn via Gcc-patches
This patch is okay with the removal of

{ target powerpc*-*-* }

from the pr79251-run.c testcase directives.

As I explained in the earlier email, I still believe that the testcase
is not testing what you intend, but this patch is a definite
improvement and removes the failures.  We can correct the testcase in
a follow-up patch.

Thanks for the clarification about P9 support.  32 bit doesn't have a
fast mechanism to move SImode to SFmode.

Thanks, David

On Tue, Jan 26, 2021 at 10:56 PM Xionghu Luo  wrote:
>
> Hi,
>
> On 2021/1/27 03:00, David Edelsohn wrote:
> > On Tue, Jan 26, 2021 at 2:46 AM Xionghu Luo  wrote:
> >>
> >> From: "luo...@cn.ibm.com" 
> >>
> >> UNSPEC_SI_FROM_SF is not supported when TARGET_DIRECT_MOVE_64BIT
> >> is false for -m32, so don't generate VIEW_CONVERT_EXPR(ARRAY_REF) for
> >> variable vector insert.  Remove the rs6000_expand_vector_set_var helper
> >> function, move the p8 and p9 definitions, and make them static.
> >>
> >> The previous commit r11-6858 missed the -m32 check.  This patch was tested
> >> and passes on P7BE{m32,m64}/P8BE{m32,m64}/P8LE/P9LE with
> >> RUNTESTFLAGS="--target_board=unix'{-m32,-m64}" for BE targets.
> >
> > Hi, Xionghu
> >
> > Thanks for addressing these failures and the cleanups.
> >
> > This patch addresses most of the failures.
> >
> > pr79251-run.c continues to fail.  The directives are not complete.
> > I'm not certain if your intention is to run the testcase on all
> > targets or only on Power7 and above.  The testcase relies on vector
> > "long long", which only is available with -mvsx, but the testcase only
> > enables -maltivec.  I believe that the testcase happens to pass on the
> > Linux platforms you tested because GCC defaulted to Power7 or Power8
> > ISA and the ABI specifies VSX.  The testcase probably needs to be
> > restricted to only run on some level of VSX enabled processor (VSX?
> > Power8? Power9?) and also needs some additional compiler options when
> > compiling the testcase instead of relying upon the default
> > configuration of the compiler.
>
>
> P8BE: gcc/testsuite/gcc/gcc.sum(it didn't run before due to no 'dg-do run'):
>
> Running target unix/-m32
> Running /home/luoxhu/workspace/gcc/gcc/testsuite/gcc.target/powerpc/powerpc.exp ...
> PASS: gcc.target/powerpc/pr79251-run.c (test for excess errors)
> PASS: gcc.target/powerpc/pr79251-run.c execution test
> === gcc Summary for unix/-m32 ===
>
> # of expected passes            2
> Running target unix/-m64
> Running /home/luoxhu/workspace/gcc/gcc/testsuite/gcc.target/powerpc/powerpc.exp ...
> PASS: gcc.target/powerpc/pr79251-run.c (test for excess errors)
> PASS: gcc.target/powerpc/pr79251-run.c execution test
> === gcc Summary for unix/-m64 ===
>
> # of expected passes            2
>
>
> How did you get the failure of pr79251-run.c, please?  I tested it and it
> passes on P7BE{m32,m64}/P8BE{m32,m64}/P8LE/P9LE on Linux.  This case
> just verifies the *functionality* of "u = vec_insert (254, v, k)" by
> checking whether u[k] is changed to 254.  It must work on all platforms,
> with or without optimization; otherwise there is a functional
> error.  As to "long long", add target vsx_hw and powerpc like below?
> (Also change -maltivec to -mvsx for pr79251.p8.c/pr79251.p9.c.)
>
> --- a/gcc/testsuite/gcc.target/powerpc/pr79251-run.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr79251-run.c
> @@ -1,4 +1,6 @@
> -/* { dg-options "-O2 -maltivec" } */
> +/* { dg-do run { target powerpc*-*-* } } */
> +/* { dg-require-effective-target vsx_hw { target powerpc*-*-* } } */
> +/* { dg-options "-O2 -mvsx" } */
>
>
> Any other options necessary to limit the testcases? :)
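
The functional check described above can be sketched portably with GCC's
generic vector extension instead of <altivec.h>. This is only an analogue:
insert_elem and check_insert are made-up names, the element type is int
rather than the types used in pr79251-run.c, and reducing the index modulo
the element count is an assumption about how vec_insert treats a variable
index.

```c
/* Four 32-bit ints in a 16-byte vector (GCC/Clang vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Stand-in for "u = vec_insert (value, v, k)" with a variable index.  */
static v4si
insert_elem (int value, v4si v, unsigned int k)
{
  v[k % 4] = value;
  return v;
}

/* Returns 1 if element k (mod 4) became 254 and every other element
   kept its original value, 0 otherwise.  */
static int
check_insert (unsigned int k)
{
  v4si v = { 1, 2, 3, 4 };
  v4si u = insert_elem (254, v, k);

  for (unsigned int i = 0; i < 4; i++)
    {
      int expected = (i == k % 4) ? 254 : v[i];
      if (u[i] != expected)
        return 0;
    }
  return 1;
}
```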
>
> >
> > Also, part of the change seems to be
> >
> >> -  if (TARGET_P9_VECTOR || GET_MODE_SIZE (inner_mode) == 8)
> >> -rs6000_expand_vector_set_var_p9 (target, val, idx);
> >> + if ((TARGET_P9_VECTOR && TARGET_POWERPC64) || width == 8)
> >> +   {
> >> + rs6000_expand_vector_set_var_p9 (target, val, elt_rtx);
> >> + return;
> >> +   }
> >
> > Does the P9 case need TARGET_POWERPC64?  This optimization seemed to
> > be functioning on P9 in 32 bit mode prior to this fix.  It would be a
> > shame to unnecessarily disable this optimization in 32 bit mode.  Or
> > maybe it generated a functioning sequence but didn't utilize the
> > optimization.  Would you please check / clarify?
>
>
> >> -  if (TARGET_P8_VECTOR)
> >> +  if (TARGET_P8_VECTOR && TARGET_DIRECT_MOVE_64BIT)
> >>  {
> >>stmt = build_array_ref (loc, stmt, arg2);
> >>stmt = fold_build2 (MODIFY_EXPR, TREE_TYPE (arg0), stmt,
>
>
> This change in rs6000-c.c means VIEW_CONVERT_EXPR(ARRAY_REF) gimple code
> is no longer generated for P9 32-bit, so the IFN VEC_SET won't be matched
> and rs6000.c:rs6000_expand_vector_set_var_p9 won't be called to produce
> the optimized "lvsl+xxperm+lvsr" sequence for P9 32-bit.  It's a pity, but without
> this, it 

[PATCH 16/16] Improve "find_first/last_set" for NEON

2021-01-27 Thread Matthias Kretz
From: yaozhongxiao 

The find_first_set and find_last_set implementations are not optimal for
NEON.  They can be improved by using horizontal adds (vaddv), which reduces
the generated assembly.  In the following case, vaddvq_s16 generates 2
instructions, whereas the chained pairwise adds (vpaddq_s16) generate 4
instructions:
```
 # vaddvq_s16
vaddvq_s16(__asint);
//  addv h0, v1.8h
//  smov w1, v0.h[0]
 # vpadd_s16
vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero), __zero)[0]
// addp v1.8h, v1.8h, v2.8h
// addp v1.8h, v1.8h, v2.8h
// addp v1.8h, v1.8h, v2.8h
// smov w1, v1.h[0]
 #
```
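
For readers without the surrounding code at hand, the reduction that both
instruction sequences compute can be modelled in plain C. This is a scalar
sketch, not NEON code: mask_to_bits is a made-up name, and it assumes each
mask lane is either 0 or all-ones, as in the patch below where __asint is
ANDed with a per-lane bit selector before the horizontal add.

```c
#include <stdint.h>

/* Scalar model of "__asint &= __bitsel; return vaddvq_s16(__asint);".
   Lane i contributes 2^i when its mask is all-ones (-1) and 0 otherwise,
   so the horizontal sum is the bitmask of the true lanes.  */
static int
mask_to_bits (const int16_t lanes[8])
{
  int sum = 0;

  for (int i = 0; i < 8; i++)
    {
      int16_t bitsel = (int16_t) (1 << i);
      sum += lanes[i] & bitsel;
    }
  return sum;
}
```

Because every lane contributes a distinct bit, the sum equals the bitwise OR
of the lanes, which is why a single horizontal add can replace the chain of
pairwise adds.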

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd_neon.h: Replace repeated vpadd
calls with a single vaddv for aarch64.
---
 .../include/experimental/bits/simd_neon.h   | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd_neon.h b/libstdc++-v3/include/experimental/bits/simd_neon.h
index a3a8ffe165f..0b8ccc17513 100644
--- a/libstdc++-v3/include/experimental/bits/simd_neon.h
+++ b/libstdc++-v3/include/experimental/bits/simd_neon.h
@@ -311,8 +311,7 @@ struct _MaskImplNeonMixin
  });
  __asint &= __bitsel;
 #ifdef __aarch64__
- return vpaddq_s16(vpaddq_s16(vpaddq_s16(__asint, __zero), __zero),
-   __zero)[0];
+ return vaddvq_s16(__asint);
 #else
  return vpadd_s16(
vpadd_s16(vpadd_s16(__lo64(__asint), __hi64(__asint)), __zero),
@@ -328,7 +327,7 @@ struct _MaskImplNeonMixin
  });
  __asint &= __bitsel;
 #ifdef __aarch64__
- return vpaddq_s32(vpaddq_s32(__asint, __zero), __zero)[0];
+ return vaddvq_s32(__asint);
 #else
  return vpadd_s32(vpadd_s32(__lo64(__asint), __hi64(__asint)),
   __zero)[0];
@@ -351,8 +350,12 @@ struct _MaskImplNeonMixin
return static_cast<_I>(__i < _Np ? 1 << __i : 0);
  });
  __asint &= __bitsel;
+#ifdef __aarch64__
+ return vaddv_s8(__asint);
+#else
  return vpadd_s8(vpadd_s8(vpadd_s8(__asint, __zero), __zero),
  __zero)[0];
+#endif
}
  else if constexpr (sizeof(_Tp) == 2)
{
@@ -362,12 +365,20 @@ struct _MaskImplNeonMixin
return static_cast<_I>(__i < _Np ? 1 << __i : 0);
  });
  __asint &= __bitsel;
+#ifdef __aarch64__
+ return vaddv_s16(__asint);
+#else
  return vpadd_s16(vpadd_s16(__asint, __zero), __zero)[0];
+#endif
}
  else if constexpr (sizeof(_Tp) == 4)
{
  __asint &= __make_vector<_I>(0x1, 0x2);
+#ifdef __aarch64__
+ return vaddv_s32(__asint);
+#else
  return vpadd_s32(__asint, __zero)[0];
+#endif
}
  else
__assert_unreachable<_Tp>();
-- 
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 std::experimental::simd  https://github.com/VcDevel/std-simd
──






[PATCH 15/16] Work around test failures using -fno-tree-vrp

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

This is necessary to avoid failures resulting from PR98834.

libstdc++-v3/ChangeLog:
* testsuite/Makefile.am: Warn about the workaround. Add
-fno-tree-vrp to CXXFLAGS passed to the check_simd script.
Improve initial user feedback from make check-simd.
* testsuite/Makefile.in: Regenerated.
---
 libstdc++-v3/testsuite/Makefile.am | 4 +++-
 libstdc++-v3/testsuite/Makefile.in | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index 2d3ad481dba..ba5023a8b54 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -191,8 +191,10 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
${glibcxx_srcdir}/scripts/check_simd \
testsuite_files_simd \
${glibcxx_builddir}/scripts/testsuite_flags
+   @echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around PR98834."
@rm -f .simd.summary
-   ${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
+   @echo "Generating simd testsuite subdirs and Makefiles ..."
+   @${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
  while read subdir; do \
$(MAKE) -C "$${subdir}"; \
tail -n20 $${subdir}/simd_testsuite.sum | \
diff --git a/libstdc++-v3/testsuite/Makefile.in b/libstdc++-v3/testsuite/Makefile.in
index ac6207ae75c..c9dd7f5da61 100644
--- a/libstdc++-v3/testsuite/Makefile.in
+++ b/libstdc++-v3/testsuite/Makefile.in
@@ -716,8 +716,10 @@ check-simd: $(srcdir)/experimental/simd/
generate_makefile.sh \
${glibcxx_srcdir}/scripts/check_simd \
testsuite_files_simd \
${glibcxx_builddir}/scripts/testsuite_flags
+   @echo "WARNING: Adding -fno-tree-vrp to CXXFLAGS to work around 
PR98834."
@rm -f .simd.summary
-   ${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "$
{glibcxx_builddir}" "$(CXXFLAGS)" | \
+   @echo "Generating simd testsuite subdirs and Makefiles ..."
+   @${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "$
{glibcxx_builddir}" "$(CXXFLAGS) -fno-tree-vrp" | \
  while read subdir; do \
$(MAKE) -C "$${subdir}"; \
tail -n20 $${subdir}/simd_testsuite.sum | \



[PATCH 14/16] Implement hmin and hmax

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

From 9.7.4 in Parallelism TS 2. For some reason I overlooked these two
functions. Implement them via call to _S_reduce.
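
As a rough illustration of the idea (not the libstdc++ code itself), a horizontal minimum can be written as a reduction with a "minimum" binary operation, and the masked overload can first overwrite the masked-out elements with an identity element. The names Minimum, reduce_with, hmin_sketch and hmin_masked_sketch are hypothetical, and the sketch works on std::array instead of simd; hmax would be the mirror image with max and the lowest value (or negative infinity) as identity.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical stand-in for __detail::_Minimum: a binary operation object.
struct Minimum {
  template <typename T>
  constexpr T operator()(T a, T b) const { return std::min(a, b); }
};

// Hypothetical stand-in for _S_reduce: fold all elements with binary_op.
// Taking the operation by const reference allows passing an rvalue.
template <typename T, std::size_t N, typename BinaryOperation>
T reduce_with(const std::array<T, N>& v, const BinaryOperation& binary_op)
{
  static_assert(N > 0, "cannot reduce an empty array");
  T acc = v[0];
  for (std::size_t i = 1; i < N; ++i)
    acc = binary_op(acc, v[i]);
  return acc;
}

// Unmasked horizontal minimum: reduce with the minimum operation.
template <typename T, std::size_t N>
T hmin_sketch(const std::array<T, N>& v)
{ return reduce_with(v, Minimum()); }

// Masked horizontal minimum: elements whose mask entry is false are replaced
// by the identity element of min (infinity if T has one, else the largest
// finite value), so they cannot influence the result.
template <typename T, std::size_t N>
T hmin_masked_sketch(const std::array<bool, N>& mask, const std::array<T, N>& v)
{
  constexpr T id_elem = std::numeric_limits<T>::has_infinity
                          ? std::numeric_limits<T>::infinity()
                          : std::numeric_limits<T>::max();
  std::array<T, N> tmp{};
  for (std::size_t i = 0; i < N; ++i)
    tmp[i] = mask[i] ? v[i] : id_elem;
  return reduce_with(tmp, Minimum());
}
```

With all mask entries false the sketch returns the identity element, which is the behavior the choice of identity element implies for the patch as well.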

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Add __detail::_Minimum and
__detail::_Maximum to use them as _BinaryOperation to _S_reduce.
Add hmin and hmax overloads for simd and const_where_expression.
* include/experimental/bits/simd_scalar.h
(_SimdImplScalar::_S_reduce): Make unused _BinaryOperation
parameter const-ref to allow calling _S_reduce with an rvalue.
* testsuite/experimental/simd/tests/reductions.cc: Add tests for
hmin and hmax. Since the compiler statically determined that all
tests pass, repeat the test after a call to make_value_unknown.
---
 libstdc++-v3/include/experimental/bits/simd.h | 78 ++-
 .../include/experimental/bits/simd_scalar.h   |  2 +-
 .../experimental/simd/tests/reductions.cc | 21 +
 3 files changed, 99 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 14179491f9d..f08ef4c027d 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -204,6 +204,27 @@ template 
 template 
   using _SizeConstant = integral_constant;
 
+namespace __detail {
+  struct _Minimum {
+template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  _Tp
+  operator()(_Tp __a, _Tp __b) const {
+   using std::min;
+   return min(__a, __b);
+  }
+  };
+  struct _Maximum {
+template <typename _Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr
+  _Tp
+  operator()(_Tp __a, _Tp __b) const {
+   using std::max;
+   return max(__a, __b);
+  }
+  };
+} // namespace __detail
+
 // unrolled/pack execution helpers
 // __execute_n_times{{{
 template 
@@ -3408,7 +3429,7 @@ template 
 
 // }}}1
 // reductions [simd.reductions] {{{1
-  template >
+template >
   _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
   reduce(const simd<_Tp, _Abi>& __v,
 _BinaryOperation __binary_op = _BinaryOperation())
@@ -3454,6 +3475,61 @@ template 
   reduce(const const_where_expression<_M, _V>& __x, bit_xor<> __binary_op)
   { return reduce(__x, 0, __binary_op); }
 
+template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmin(const simd<_Tp, _Abi>& __v) noexcept
+  {
+return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Minimum());
+  }
+
+template <typename _Tp, typename _Abi>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _Tp
+  hmax(const simd<_Tp, _Abi>& __v) noexcept
+  {
+return _Abi::_SimdImpl::_S_reduce(__v, __detail::_Maximum());
+  }
+
+template <typename _M, typename _V>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmin(const const_where_expression<_M, _V>& __x) noexcept
+  {
+using _Tp = typename _V::value_type;
+constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+  __finite_max_v<_Tp>;
+#else
+  __value_or<__infinity, _Tp>(__finite_max_v<_Tp>);
+#endif
+_V __tmp = __id_elem;
+_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+   __data(__get_lvalue(__x)));
+return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Minimum());
+  }
+
+template <typename _M, typename _V>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  typename _V::value_type
+  hmax(const const_where_expression<_M, _V>& __x) noexcept
+  {
+using _Tp = typename _V::value_type;
+constexpr _Tp __id_elem =
+#ifdef __FINITE_MATH_ONLY__
+  __finite_min_v<_Tp>;
+#else
+  [] {
+   if constexpr (__value_exists_v<__infinity, _Tp>)
+ return -__infinity_v<_Tp>;
+   else
+ return __finite_min_v<_Tp>;
+  }();
+#endif
+_V __tmp = __id_elem;
+_V::_Impl::_S_masked_assign(__data(__get_mask(__x)), __data(__tmp),
+   __data(__get_lvalue(__x)));
+return _V::abi_type::_SimdImpl::_S_reduce(__tmp, __detail::_Maximum());
+  }
+
 // }}}1
 // algorithms [simd.alg] {{{
 template 
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
index 7680bc39c30..7e480ecdb37 100644
--- a/libstdc++-v3/include/experimental/bits/simd_scalar.h
+++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h
@@ -182,7 +182,7 @@ struct _SimdImplScalar
   // _S_reduce {{{2
   template 
 static constexpr inline _Tp
-_S_reduce(const simd<_Tp, simd_abi::scalar>& __x, _BinaryOperation&)
+_S_reduce(const simd<_Tp, simd_abi::scalar>& __x, const _BinaryOperation&)
 { return __x._M_data; }
 
   // _S_min, _S_max {{{2
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
index 9d897d5ccd6..02df68fafbc 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/reductions.cc
@@ -57,6 +57,8 @@ template 
 }
 
   

[PATCH 13/16] Improve test codegen for interpreting assembly

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

In many failure cases it is helpful to inspect the instructions leading
up to the test failure. After this change, both the failure location and
the branch taken after a failure are easier to find in the disassembly.
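
A much reduced sketch of the pattern, assuming a GCC-compatible compiler for the [[gnu::always_inline]] attribute: the constructor is forced inline so the `if (m_failed)` test sits directly in the calling code, while the reporting work lives in an immediately invoked lambda that the compiler may emit as a separate function. check_sketch, CHECK_SKETCH and failure_count are hypothetical names, not part of verify.h.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical, simplified relative of the testsuite's verify class.
class check_sketch
{
  const bool m_failed;

public:
  // Counts reported failures; only here so the sketch is observable.
  static int& failure_count()
  {
    static int n = 0;
    return n;
  }

  // Forced inline: the branch on m_failed ends up next to the code that
  // computed 'ok', which makes it easy to locate in the disassembly.
  [[gnu::always_inline]]
  check_sketch(bool ok, const char* file, int line, const char* cond)
  : m_failed(!ok)
  {
    if (m_failed)
      // The cold path is wrapped in a lambda, so its body can remain an
      // out-of-line call instead of bloating every call site.
      [&] {
        std::fprintf(stderr, "%s:%d: Assertion '%s' failed.\n", file, line,
                     cond);
        ++failure_count();
      }();
  }

  bool failed() const { return m_failed; }
};

#define CHECK_SKETCH(cond) check_sketch((cond), __FILE__, __LINE__, #cond)
```

Whether the lambda body is really kept out of line is up to the optimizer; the attribute only guarantees inlining of the constructor itself.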

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/tests/bits/verify.h (verify): Add
instruction pointer data member. Ensure that the `if (m_failed)`
branch is always inlined into the calling code. The body of the
conditional can still be a function call. Move the get_ip call
into the verify ctor to simplify the ctor calls.
(COMPARE): Don't mention the use of all_of for reduction of a
simd_mask. It only distracts from the real issue.
---
 .../experimental/simd/tests/bits/verify.h | 44 +--
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
index 5da47b35536..17bda71b77e 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/bits/verify.h
@@ -60,6 +60,7 @@ template 
 class verify
 {
   const bool m_failed = false;
+  size_t m_ip = 0;
 
   template ()
@@ -129,20 +130,21 @@ class verify
 
 public:
   template 
-verify(bool ok, size_t ip, const char* file, const int line,
+[[gnu::always_inline]]
+verify(bool ok, const char* file, const int line,
   const char* func, const char* cond, const Ts&... extra_info)
-: m_failed(!ok)
+: m_failed(!ok), m_ip(get_ip())
 {
   if (m_failed)
-   {
+   [&] {
  __builtin_fprintf(stderr, "%s:%d: (%s):\nInstruction Pointer: %x\n"
"Assertion '%s' failed.\n",
-   file, line, func, ip, cond);
+   file, line, func, m_ip, cond);
  (print(extra_info, int()), ...);
-   }
+   }();
 }
 
-  ~verify()
+  [[gnu::always_inline]] ~verify()
   {
 if (m_failed)
   {
@@ -152,26 +154,27 @@ public:
   }
 
   template 
+[[gnu::always_inline]]
 const verify&
 operator<<(const T& x) const
 {
   if (m_failed)
-   {
- print(x, int());
-   }
+   print(x, int());
   return *this;
 }
 
   template 
+[[gnu::always_inline]]
 const verify&
 on_failure(const Ts&... xs) const
 {
   if (m_failed)
-   (print(xs, int()), ...);
+   [&] { (print(xs, int()), ...); }();
   return *this;
 }
 
-  [[gnu::always_inline]] static inline size_t
+  [[gnu::always_inline]] static inline
+  size_t
   get_ip()
   {
 size_t _ip = 0;
@@ -220,24 +223,21 @@ template 
 
 #define COMPARE(_a, _b) \
   [&](auto&& _aa, auto&& _bb) { \
-return verify(std::experimental::all_of(_aa == _bb), verify::get_ip(), \
- __FILE__, __LINE__, __PRETTY_FUNCTION__, \
- "all_of(" #_a " == " #_b ")", #_a " = ", _aa,\
+return verify(std::experimental::all_of(_aa == _bb), __FILE__, __LINE__, \
+ __PRETTY_FUNCTION__, #_a " == " #_b, #_a " = ", _aa, \
  "\n" #_b " = ", _bb);\
   }(force_fp_truncation(_a), force_fp_truncation(_b))
 #else
 #define COMPARE(_a, _b) \
   [&](auto&& _aa, auto&& _bb) { \
-return verify(std::experimental::all_of(_aa == _bb), verify::get_ip(), \
- __FILE__, __LINE__, __PRETTY_FUNCTION__, \
- "all_of(" #_a " == " #_b ")", #_a " = ", _aa,\
+return verify(std::experimental::all_of(_aa == _bb), __FILE__, __LINE__, \
+ __PRETTY_FUNCTION__, #_a " == " #_b, #_a " = ", _aa, \
  "\n" #_b " = ", _bb);\
   }((_a), (_b))
 #endif
 
 #define VERIFY(_test) \
-  verify(_test, verify::get_ip(), __FILE__, __LINE__, __PRETTY_FUNCTION__, \
-#_test)
+  verify(_test, __FILE__, __LINE__, __PRETTY_FUNCTION__, #_test)
 
   // ulp_distance_signed can raise FP exceptions and thus must be conditionally
   // executed
@@ -245,9 +245,9 @@ template 
   [&](auto&& _aa, auto&& _bb) { \
 const bool success = std::experimental::all_of( \
   vir::test::ulp_distance(_aa, _bb) <= (_allowed_distance)); \
-return verify(success, verify::get_ip(), __FILE__, __LINE__, \
- __PRETTY_FUNCTION__, "all_of(" #_a " ~~ " #_b ")",   \
- #_a " = ", _aa, "\n" #_b " = ", _bb, "\ndistance = ", 

[PATCH 12/16] Support timeout and timeout-factor options

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh: Abstract reading test
options into read_src_option function. Read skip, only,
expensive, and xfail via read_src_option. Add timeout and
timeout-factor options and adjust timeout variable accordingly.
* testsuite/experimental/simd/tests/loadstore.cc: Set
timeout-factor 2.
---
 .../testsuite/experimental/simd/driver.sh | 38 +--
 .../experimental/simd/tests/loadstore.cc  |  1 +
 2 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index 719e4db8e68..71e0c7d5ee8 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -214,35 +214,43 @@ trap "rm -f '$log' '$sum' $exe; exit" INT
 rm -f "$log" "$sum"
 touch "$log" "$sum"
 
-skip="$(head -n25 "$src" | grep '^//\s*skip: ')"
-if [ -n "$skip" ]; then
-  skip="$(echo "$skip" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')"
+read_src_option() {
+  local key tmp var
+  key="$1"
+  var="$2"
+  [ -z "$var" ] && var="$1"
+  local tmp="$(head -n25 "$src" | grep "^//\\s*${key}: ")"
+  if [ -n "$tmp" ]; then
+tmp="$(echo "${tmp#//*${key}: }" | sed -e 's/ \+/ /g' -e 's/^ //' -e 's/ $//')"
+eval "$var=\"$tmp\""
+  else
+return 1
+  fi
+}
+
+if read_src_option skip; then
   if test_selector "$skip"; then
 # silently skip this test
 exit 0
   fi
 fi
-only="$(head -n25 "$src" | grep '^//\s*only: ')"
-if [ -n "$only" ]; then
-  only="$(echo "$only" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')"
+if read_src_option only; then
   if ! test_selector "$only"; then
 # silently skip this test
 exit 0
   fi
 fi
+
 if ! $run_expensive; then
-  expensive="$(head -n25 "$src" | grep '^//\s*expensive: ')"
-  if [ -n "$expensive" ]; then
-expensive="$(echo "$expensive" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')"
+  if read_src_option expensive; then
 if test_selector "$expensive"; then
   unsupported "skip expensive tests"
   exit 0
 fi
   fi
 fi
-xfail="$(head -n25 "$src" | grep '^//\s*xfail: ')"
-if [ -n "$xfail" ]; then
-  xfail="$(echo "$xfail" | sed -e 's/^.*:\s*//' -e 's/ \+/ /g')"
+
+if read_src_option xfail; then
   if test_selector "${xfail#* }"; then
 xfail="${xfail%% *}"
   else
@@ -250,6 +258,12 @@ if [ -n "$xfail" ]; then
   fi
 fi
 
+read_src_option timeout
+
+if read_src_option timeout-factor factor; then
+  timeout=$(awk "BEGIN { print int($timeout * $factor) }")
+fi
+
 log_output() {
   if $verbose; then
 maxcol=${1:-1024}
diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc b/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc
index dd7d6c30e8c..cd27c3a7426 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/loadstore.cc
@@ -16,6 +16,7 @@
 // .
 
 // expensive: * [1-9] * *
+// timeout-factor: 2
 #include "bits/verify.h"
 #include "bits/make_vec.h"
 #include "bits/conversions.h"






[PATCH 11/16] Abort test after 1000 lines of output

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

Handle overly large output by aborting the log and thus the test. This
is a similar condition to a timeout.
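
The control flow of the filter can be sketched as follows. The real implementation is an awk script embedded in driver.sh; this C++ rendering is only an illustration, copy_with_line_limit is a hypothetical name, and the handling of the exit-status marker line is left out.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Copies lines from 'in' to 'log'. Once more than max_lines lines have been
// seen, a note is appended and 125 is returned, which the caller is expected
// to treat as a failure. Returns 0 if the input ends within the limit.
inline int
copy_with_line_limit(std::istream& in, std::ostream& log, int max_lines = 1000)
{
  int count = 0;
  std::string line;
  while (std::getline(in, line))
    {
      log << line << '\n';           // every line read is still logged
      if (count >= max_lines)
        {
          log << "Aborting: too much output\n";
          return 125;
        }
      ++count;
    }
  return 0;
}
```

As in the awk code, the line that crosses the limit is written before the abort message, so the log holds max_lines + 1 lines of output followed by the note.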

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh: When handling the pipe
to log (and on verbose to stdout) count the lines. If it exceeds
1000 log the issue and exit 125, which is then handled as a
failure.
---
 .../testsuite/experimental/simd/driver.sh   | 17 +++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index 314c6a16f86..719e4db8e68 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -258,7 +258,11 @@ BEGIN { count = 0 }
 /^###exitstatus### [0-9]+$/ { exit \$2 }
 {
   print >> \"$log\"
-  if (count >= 1000) next
+  if (count >= 1000) {
+print \"Aborting: too much output\" >> \"$log\"
+print \"Aborting: too much output\"
+exit 125
+  }
   ++count
   if (length(\$0) > $maxcol) {
 i = 1
@@ -282,8 +286,17 @@ END { close(\"$log\") }
 "
   else
 awk "
+BEGIN { count = 0 }
 /^###exitstatus### [0-9]+$/ { exit \$2 }
-{ print >> \"$log\" }
+{
+  print >> \"$log\"
+  if (count >= 1000) {
+print \"Aborting: too much output\" >> \"$log\"
+print \"Aborting: too much output\"
+exit 125
+  }
+  ++count
+}
 END { close(\"$log\") }
 "
   fi



[PATCH 10/16] Skip testing hypot3 for long double on PPC

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

std::hypot(a, b, c) is imprecise and makes this test fail even though
the failure is unrelated to simd.

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/tests/hypot3_fma.cc: Add skip:
markup for long double on powerpc64*.
---
 libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc
index 689a90c10a5..94d267fccfb 100644
--- a/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc
+++ b/libstdc++-v3/testsuite/experimental/simd/tests/hypot3_fma.cc
@@ -16,6 +16,7 @@
 // .
 
 // only: float|double|ldouble * * *
+// skip: ldouble * powerpc64* *
 // expensive: * [1-9] * *
 #include "bits/verify.h"
 #include "bits/metahelpers.h"



[PATCH 09/16] Fix mask reduction of simd_mask on POWER7

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

POWER7 does not support __vector long long reductions, making the
generic _S_popcount implementation ill-formed. Specializing _S_popcount
for PPC allows optimization and avoids the issue.
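
The arithmetic behind the non-POWER10 path for elements of at least int size can be modeled portably. The sketch assumes that a true mask element is an all-ones bit pattern (so each of its int-sized lanes reads as -1) and that the sum over the four int lanes of a 16-byte vector is available, which is what the patch obtains from vec_sums. mask16_sketch and popcount_sketch are hypothetical names and no AltiVec intrinsics are used here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical model of a 16-byte vector mask: every element of type T is
// either all-ones (true) or all-zeros (false).
template <typename T>
struct mask16_sketch
{
  static constexpr int size = 16 / sizeof(T);
  T lanes[size];
};

template <typename T>
int
popcount_sketch(const mask16_sketch<T>& k)
{
  static_assert(sizeof(T) >= sizeof(std::int32_t),
                "this sketch only covers elements of at least int size");
  // View the 16 bytes as four int lanes.
  std::int32_t as_int[4];
  std::memcpy(as_int, k.lanes, sizeof(as_int));
  // Every int lane that belongs to a true element contributes -1.
  std::int32_t sum = 0;
  for (std::int32_t x : as_int)
    sum += x;
  // A true element spans sizeof(T) / sizeof(int) lanes, so divide by that.
  return -sum / int(sizeof(T) / sizeof(std::int32_t));
}
```

Elements narrower than int need an extra widening step first, which the patch does with vec_sum4s; that branch is not modeled above.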

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Add __have_power10vec
conditional on _ARCH_PWR10.
* include/experimental/bits/simd_builtin.h: Forward declare
_MaskImplPpc and use it as _MaskImpl when __ALTIVEC__ is
defined.
(_MaskImplBuiltin::_S_some_of): Call _S_popcount from the
_SuperImpl for optimizations and correctness.
* include/experimental/bits/simd_ppc.h: Add _MaskImplPpc.
(_MaskImplPpc::_S_popcount): Implement via vec_cntm for POWER10.
Otherwise, for >=int use -vec_sums divided by a sizeof factor.
For  struct _MaskImplX86;
 template  struct _SimdImplNeon;
 template  struct _MaskImplNeon;
 template  struct _SimdImplPpc;
+template  struct _MaskImplPpc;
 
 // simd_abi::_VecBuiltin {{{
 template 
@@ -959,10 +960,11 @@ template 
 using _CommonImpl = _CommonImplBuiltin;
 #ifdef __ALTIVEC__
 using _SimdImpl = _SimdImplPpc<_VecBuiltin<_UsedBytes>>;
+using _MaskImpl = _MaskImplPpc<_VecBuiltin<_UsedBytes>>;
 #else
 using _SimdImpl = _SimdImplBuiltin<_VecBuiltin<_UsedBytes>>;
-#endif
 using _MaskImpl = _MaskImplBuiltin<_VecBuiltin<_UsedBytes>>;
+#endif
 #endif
 
 // }}}
@@ -2899,7 +2901,7 @@ template 
   _GLIBCXX_SIMD_INTRINSIC static bool
   _S_some_of(simd_mask<_Tp, _Abi> __k)
   {
-   const int __n_true = _S_popcount(__k);
+   const int __n_true = _SuperImpl::_S_popcount(__k);
return __n_true > 0 && __n_true < int(_S_size<_Tp>);
   }
 
diff --git a/libstdc++-v3/include/experimental/bits/simd_ppc.h b/libstdc++-v3/include/experimental/bits/simd_ppc.h
index c00d2323ac6..1d649931eb9 100644
--- a/libstdc++-v3/include/experimental/bits/simd_ppc.h
+++ b/libstdc++-v3/include/experimental/bits/simd_ppc.h
@@ -30,6 +30,7 @@
 #ifndef __ALTIVEC__
 #error "simd_ppc.h may only be included when AltiVec/VMX is available"
 #endif
+#include 
 
 _GLIBCXX_SIMD_BEGIN_NAMESPACE
 
@@ -114,10 +115,42 @@ template 
 // }}}
   };
 
+// }}}
+// _MaskImplPpc {{{
+template 
+  struct _MaskImplPpc : _MaskImplBuiltin<_Abi>
+  {
+using _Base = _MaskImplBuiltin<_Abi>;
+
+// _S_popcount {{{
+template 
+  _GLIBCXX_SIMD_INTRINSIC static int _S_popcount(simd_mask<_Tp, _Abi> __k)
+  {
+   const auto __kv = __as_vector(__k);
+   if constexpr (__have_power10vec)
+ {
+   return vec_cntm(__to_intrin(__kv), 1);
+ }
+   else if constexpr (sizeof(_Tp) >= sizeof(int))
+ {
+   using _Intrin = __intrinsic_type16_t;
+   const int __sum = -vec_sums(__intrin_bitcast<_Intrin>(__kv), _Intrin())[3];
+   return __sum / (sizeof(_Tp) / sizeof(int));
+ }
+   else
+ {
+   const auto __summed_to_int = vec_sum4s(__to_intrin(__kv), __intrinsic_type16_t());
+   return -vec_sums(__summed_to_int, __intrinsic_type16_t())[3];
+ }
+  }
+
+// }}}
+  };
+
 // }}}
 
 _GLIBCXX_SIMD_END_NAMESPACE
 #endif // __cplusplus >= 201703L
 #endif // _GLIBCXX_EXPERIMENTAL_SIMD_PPC_H_
 
-// vim: foldmethod=marker sw=2 noet ts=8 sts=2 tw=80
+// vim: foldmethod=marker foldmarker={{{,}}} sw=2 noet ts=8 sts=2 tw=100






[PATCH 08/16] Immediate feedback with -v

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh: Remove executable on
SIGINT. Process compiler and test executable output: In verbose
mode print messages immediately, limited to 1000 lines and
breaking long lines to below $COLUMNS (or 1024 if not set).
Communicating the exit status of the compiler / test with the
necessary pipe is done via a message through stdout/-in.
---
 .../testsuite/experimental/simd/driver.sh | 194 +++---
 1 file changed, 116 insertions(+), 78 deletions(-)


diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index cf07ff9ad85..314c6a16f86 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -172,81 +172,14 @@ unsupported() {
   echo "UNSUPPORTED: $src $type $abiflag ($*)" >> "$log"
 }
 
-verify_compilation() {
-  failed=$1
-  if [ $failed -eq 0 ]; then
-warnings=$(grep -ic 'warning:' "$log")
-if [ $warnings -gt 0 ]; then
-  fail "excess warnings:" $warnings
-  if $verbose; then
-cat "$log"
-  elif ! $quiet; then
-grep -i 'warning:' "$log" | head -n5
-  fi
-elif [ "$xfail" = "compile" ]; then
-  xpass "test for excess errors"
-else
-  pass "test for excess errors"
-fi
-  else
-if [ $failed -eq 124 ]; then
-  fail "timeout: test for excess errors"
-else
-  errors=$(grep -ic 'error:' "$log")
-  if [ "$xfail" = "compile" ]; then
-xfail "excess errors:" $errors
-exit 0
-  else
-fail "excess errors:" $errors
-  fi
-fi
-if $verbose; then
-  cat "$log"
-elif ! $quiet; then
-  grep -i 'error:' "$log" | head -n5
-fi
-exit 0
-  fi
-}
-
-verify_test() {
-  failed=$1
-  if [ $failed -eq 0 ]; then
-rm "$exe"
-if [ "$xfail" = "run" ]; then
-  xpass "execution test"
-else
-  pass "execution test"
-fi
-  else
-$keep_failed || rm "$exe"
-if [ $failed -eq 124 ]; then
-  fail "timeout: execution test"
-elif [ "$xfail" = "run" ]; then
-  xfail "execution test"
-else
-  fail "execution test"
-fi
-if $verbose; then
-  lines=$(wc -l < "$log")
-  lines=$((lines-3))
-  if [ $lines -gt 1000 ]; then
-echo "[...]"
-tail -n1000 "$log"
-  else
-tail -n$lines "$log"
-  fi
-elif ! $quiet; then
-  grep -i fail "$log" | head -n5
-fi
-exit 0
-  fi
-}
-
 write_log_and_verbose() {
   echo "$*" >> "$log"
   if $verbose; then
-echo "$*"
+if [ -z "$COLUMNS" ] || ! type fmt>/dev/null; then
+  echo "$*"
+else
+  echo "$*" | fmt -w $COLUMNS -s - || cat
+fi
   fi
 }
 
@@ -277,7 +210,7 @@ test_selector() {
   return 1
 }
 
-trap "rm -f '$log' '$sum'; exit" INT
+trap "rm -f '$log' '$sum' $exe; exit" INT
 rm -f "$log" "$sum"
 touch "$log" "$sum"
 
@@ -317,17 +250,122 @@ if [ -n "$xfail" ]; then
   fi
 fi
 
+log_output() {
+  if $verbose; then
+maxcol=${1:-1024}
+awk "
+BEGIN { count = 0 }
+/^###exitstatus### [0-9]+$/ { exit \$2 }
+{
+  print >> \"$log\"
+  if (count >= 1000) next
+  ++count
+  if (length(\$0) > $maxcol) {
+i = 1
+while (i + $maxcol <= length(\$0)) {
+  len = $maxcol
+  line = substr(\$0, i, len)
+  len = match(line, / [^ ]*$/)
+  if (len <= 0) {
+len = match(substr(\$0, i), / [^ ]/)
+if (len <= 0) len = $maxcol
+  }
+  print substr(\$0, i, len)
+  i += len
+}
+print substr(\$0, i)
+  } else {
+print
+  }
+}
+END { close(\"$log\") }
+"
+  else
+awk "
+/^###exitstatus### [0-9]+$/ { exit \$2 }
+{ print >> \"$log\" }
+END { close(\"$log\") }
+"
+  fi
+}
+
+verify_compilation() {
+  log_output $COLUMNS
+  exitstatus=$?
+  if [ $exitstatus -eq 0 ]; then
+warnings=$(grep -ic 'warning:' "$log")
+if [ $warnings -gt 0 ]; then
+  fail "excess warnings:" $warnings
+  if ! $verbose && ! $quiet; then
+grep -i 'warning:' "$log" | head -n5
+  fi
+elif [ "$xfail" = "compile" ]; then
+  xpass "test for excess errors"
+else
+  pass "test for excess errors"
+fi
+return 0
+  else
+if [ $exitstatus -eq 124 ]; then
+  fail "timeout: test for excess errors"
+else
+  errors=$(grep -ic 'error:' "$log")
+  if [ "$xfail" = "compile" ]; then
+xfail "excess errors:" $errors
+exit 0
+  else
+fail "excess errors:" $errors
+  fi
+fi
+if ! $verbose && ! $quiet; then
+  grep -i 

[PATCH 07/16] Fix incorrect display of old test summaries

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/Makefile.am: Ensure .simd.summary is empty before
collecting a new summary.
* testsuite/Makefile.in: Regenerate.
---
 libstdc++-v3/testsuite/Makefile.am | 1 +
 libstdc++-v3/testsuite/Makefile.in | 1 +
 2 files changed, 2 insertions(+)

diff --git a/libstdc++-v3/testsuite/Makefile.am b/libstdc++-v3/testsuite/Makefile.am
index 5dd109b40c9..2d3ad481dba 100644
--- a/libstdc++-v3/testsuite/Makefile.am
+++ b/libstdc++-v3/testsuite/Makefile.am
@@ -191,6 +191,7 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
${glibcxx_srcdir}/scripts/check_simd \
testsuite_files_simd \
${glibcxx_builddir}/scripts/testsuite_flags
+   @rm -f .simd.summary
${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
  while read subdir; do \
$(MAKE) -C "$${subdir}"; \
diff --git a/libstdc++-v3/testsuite/Makefile.in b/libstdc++-v3/testsuite/Makefile.in
index 3900d6d87b4..ac6207ae75c 100644
--- a/libstdc++-v3/testsuite/Makefile.in
+++ b/libstdc++-v3/testsuite/Makefile.in
@@ -716,6 +716,7 @@ check-simd: $(srcdir)/experimental/simd/generate_makefile.sh \
${glibcxx_srcdir}/scripts/check_simd \
testsuite_files_simd \
${glibcxx_builddir}/scripts/testsuite_flags
+   @rm -f .simd.summary
${glibcxx_srcdir}/scripts/check_simd "${glibcxx_srcdir}" "${glibcxx_builddir}" "$(CXXFLAGS)" | \
  while read subdir; do \
$(MAKE) -C "$${subdir}"; \



[PATCH 05/16] Fix several check-simd interaction issues

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh (verify_test): Print
test output on run xfail. Do not repeat lines from the log that
were already printed on stdout.
(test_selector): Make the compiler flags pattern usable as a
substring selector.
(toplevel): Trap on SIGINT and remove the log and sum files.
Call timeout with --foreground to quickly terminate on SIGINT.
* testsuite/experimental/simd/generate_makefile.sh: Simplify run
targets via target patterns. Default DRIVEROPTS to -v for run
targets. Remove log and sum files after completion of the run
target (so that it's always recompiled).
Place help text into text file for reasonable 'make help'
performance.
---
 .../testsuite/experimental/simd/driver.sh | 16 +++--
 .../experimental/simd/generate_makefile.sh| 70 +--
 2 files changed, 44 insertions(+), 42 deletions(-)


diff --git a/libstdc++-v3/testsuite/experimental/simd/driver.sh b/libstdc++-v3/testsuite/experimental/simd/driver.sh
index 84f3829c2d4..cf07ff9ad85 100755
--- a/libstdc++-v3/testsuite/experimental/simd/driver.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/driver.sh
@@ -224,16 +224,17 @@ verify_test() {
   fail "timeout: execution test"
 elif [ "$xfail" = "run" ]; then
   xfail "execution test"
-  exit 0
 else
   fail "execution test"
 fi
 if $verbose; then
-  if [ $(cat "$log"|wc -l) -gt 1000 ]; then
+  lines=$(wc -l < "$log")
+  lines=$((lines-3))
+  if [ $lines -gt 1000 ]; then
 echo "[...]"
 tail -n1000 "$log"
   else
-cat "$log"
+tail -n$lines "$log"
   fi
 elif ! $quiet; then
   grep -i fail "$log" | head -n5
@@ -267,7 +268,7 @@ test_selector() {
   [ -z "$target_triplet" ] && target_triplet=$($CXX -dumpmachine)
   if matches "$target_triplet" "$pat_triplet"; then
 pat_flags="${string#* }"
-if matches "$CXXFLAGS" "$pat_flags"; then
+if matches "$CXXFLAGS" "*$pat_flags*"; then
   return 0
 fi
   fi
@@ -276,6 +277,7 @@ test_selector() {
   return 1
 }
 
+trap "rm -f '$log' '$sum'; exit" INT
 rm -f "$log" "$sum"
 touch "$log" "$sum"
 
@@ -316,15 +318,15 @@ if [ -n "$xfail" ]; then
 fi
 
 write_log_and_verbose "$CXX $src $@ -D_GLIBCXX_SIMD_TESTTYPE=$type $abiflag -o $exe"
-timeout $timeout "$CXX" "$src" "$@" "-D_GLIBCXX_SIMD_TESTTYPE=$type" $abiflag -o "$exe" >> "$log" 2>&1
+timeout --foreground $timeout "$CXX" "$src" "$@" "-D_GLIBCXX_SIMD_TESTTYPE=$type" $abiflag -o "$exe" >> "$log" 2>&1
 verify_compilation $?
 if [ -n "$sim" ]; then
   write_log_and_verbose "$sim ./$exe"
-  timeout $timeout $sim "./$exe" >> "$log" 2>&1 <&-
+  timeout --foreground $timeout $sim "./$exe" >> "$log" 2>&1 <&-
 else
   write_log_and_verbose "./$exe"
   timeout=$(awk "BEGIN { print int($timeout / 2) }")
-  timeout $timeout "./$exe" >> "$log" 2>&1 <&-
+  timeout --foreground $timeout "./$exe" >> "$log" 2>&1 <&-
 fi
 verify_test $?
 
diff --git a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
index 553bc98f60b..8d642a2941a 100755
--- a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
@@ -240,7 +240,7 @@ EOF
 %-$type.log: %-$type-0.log %-$type-1.log %-$type-2.log %-$type-3.log \
 %-$type-4.log %-$type-5.log %-$type-6.log %-$type-7.log \
 %-$type-8.log %-$type-9.log
-	@cat $^ > \$@
+	@cat \$^ > \$@
 	@cat \$(^:log=sum) > \$(@:log=sum)${rmline}
 
 EOF
@@ -252,47 +252,47 @@ EOF
 EOF
 done
   done
-  echo 'run-%: export GCC_TEST_RUN_EXPENSIVE=yes'
-  all_tests | while read file && read name; do
-echo "run-$name: $name.log"
-all_types "$file" | while read t && read type; do
-  echo "run-$name-$type: $name-$type.log"
-  for i in $(seq 0 9); do
-echo "run-$name-$type-$i: $name-$type-$i.log"
-  done
-done
-echo
-  done
   cat < to pass the following options:\n"\\
-	"-q, --quiet Only print failures.\n"\\
-	"-v, --verbose   Print compiler and test output on failure.\n"\\
-	"-k, --keep-failed   Keep executables of failed tests.\n"\\
-	"--sim   Path to an executable that is prepended to the test\n"\\
-	"execution binary (default: the value of\n"\\
-	"GCC_TEST_SIMULATOR).\n"\\
-	"--timeout-factor \n"\\
-	"Multiply the default timeout with x.\n"\\
-	"--run-expensive Compile 

[PATCH 04/16] Fix simd_mask on POWER w/o POWER8

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Remove unnecessary static
assertion. Allow __intrinsic_type for 8-byte integers to enable
the necessary mask type.
---
 libstdc++-v3/include/experimental/bits/simd.h | 6 --
 1 file changed, 6 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 64cf8d32328..9685df0be9e 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -2292,12 +2292,6 @@ template 
 #ifndef __VSX__
 static_assert(!is_same_v<_Tp, double>,
  "no __intrinsic_type support for double on PPC w/o VSX");
-#endif
-#ifndef __POWER8_VECTOR__
-static_assert(
-  !(is_integral_v<_Tp> && sizeof(_Tp) > 4),
-  "no __intrinsic_type support for integers larger than 4 Bytes "
-  "on PPC w/o POWER8 vectors");
 #endif
 using type =
   typename __intrinsic_type_impl<



[PATCH 06/16] Fix DRIVEROPTS and TESTFLAGS processing

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/generate_makefile.sh: Use
different variables internally than documented for user
overrides. This makes internal append/prepend work as intended.
---
 .../testsuite/experimental/simd/generate_makefile.sh  | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
index 8d642a2941a..4fb710c7767 100755
--- a/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
+++ b/libstdc++-v3/testsuite/experimental/simd/generate_makefile.sh
@@ -85,19 +85,20 @@ CXX="$1"
 shift
 
 echo "TESTFLAGS ?=" > "$dst"
-[ -n "$testflags" ] && echo "TESTFLAGS := $testflags \$(TESTFLAGS)" >> "$dst"
-echo CXXFLAGS = "$@" "\$(TESTFLAGS)" >> "$dst"
+echo "test_flags := $testflags \$(TESTFLAGS)" >> "$dst"
+echo CXXFLAGS = "$@" "\$(test_flags)" >> "$dst"
 [ -n "$sim" ] && echo "export GCC_TEST_SIMULATOR = $sim" >> "$dst"
 cat >> "$dst" 

[PATCH 03/16] Support -mlong-double-64 on PPC

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h: Let __intrinsic_type be valid if sizeof(long double) == sizeof(double) and
use a __vector double as member type.
---
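A minimal sketch of the idea behind this change, not the libstdc++
code itself: long double is acceptable only when it is no wider than
double, and in that case double stands in as the element type. The
names supports_intrinsic_v and vector_element_t are hypothetical.

```cpp
#include <type_traits>

// Hypothetical helper: true unless T is a long double that is wider than
// double, which is the case the static_assert in the patch still rejects.
template <typename T>
  inline constexpr bool supports_intrinsic_v
    = !(std::is_same_v<T, long double>
        && sizeof(long double) > sizeof(double));

// Hypothetical helper: long double is represented by double; every other
// type is kept unchanged.
template <typename T>
  using vector_element_t
    = std::conditional_t<std::is_same_v<T, long double>, double, T>;
```

Under these assumptions, supports_intrinsic_v<long double> is true
only on targets or option sets where sizeof(long double) equals
sizeof(double), for example with -mlong-double-64.
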
 libstdc++-v3/include/experimental/bits/simd.h | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/
include/experimental/bits/simd.h
index d56176210df..64cf8d32328 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -2285,7 +2285,9 @@ template 
   struct __intrinsic_type<_Tp, _Bytes,
  enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
   {
-static_assert(!is_same_v<_Tp, long double>,
+static constexpr bool _S_is_ldouble = is_same_v<_Tp, long double>;
+// allow _Tp == long double with -mlong-double-64
+static_assert(!(_S_is_ldouble && sizeof(long double) > sizeof(double)),
  "no __intrinsic_type support for long double on PPC");
 #ifndef __VSX__
 static_assert(!is_same_v<_Tp, double>,
@@ -2297,8 +2299,11 @@ template 
   "no __intrinsic_type support for integers larger than 4 Bytes "
   "on PPC w/o POWER8 vectors");
 #endif
-using type = typename __intrinsic_type_impl<conditional_t<is_floating_point_v<_Tp>, _Tp, __int_for_sizeof_t<_Tp>>>::type;
+using type =
+  typename __intrinsic_type_impl<
+conditional_t<is_floating_point_v<_Tp>,
+  conditional_t<_S_is_ldouble, double, _Tp>,
+  __int_for_sizeof_t<_Tp>>>::type;
   };
 #endif // __ALTIVEC__
 



[PATCH 02/16] Fix NEON intrinsic types usage

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

Intrinsic types for NEON now differ from gnu::vector_size types. This
requires explicit specializations for __intrinsic_type and a new
__is_intrinsic_type trait.

libstdc++-v3/ChangeLog:
* include/experimental/bits/simd.h (__is_intrinsic_type): New
internal type trait. Alias for __is_vector_type on x86.
(_VectorTraitsImpl): Enable for __intrinsic_type in addition to
__vector_type.
(__intrin_bitcast): Allow casting to & from vector & intrinsic
types.
(__intrinsic_type): Explicitly specialize for NEON intrinsic
vector types.
---
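The pattern described above (explicit specializations plus a detection
trait) can be sketched in portable C++ as follows. This is only an
illustration under stated assumptions: fake_f32x2, fake_f32x4 and
other_f32x4 are hypothetical stand-ins for target types, the trait
names drop the reserved leading underscores, and a 4-byte float is
assumed.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical stand-ins for target intrinsic vector types; real code
// would use types such as float32x2_t from the target's headers.
struct fake_f32x2
{ float lane[2]; constexpr float operator[](int i) const { return lane[i]; } };
struct fake_f32x4
{ float lane[4]; constexpr float operator[](int i) const { return lane[i]; } };
// Same layout as fake_f32x4, but deliberately not registered below.
struct other_f32x4
{ float lane[4]; constexpr float operator[](int i) const { return lane[i]; } };

// The primary template has no ::type member; explicit specializations map
// (element type, size in bytes) to the intrinsic type.
template <typename T, std::size_t Bytes, typename = void>
  struct intrinsic_type {};

template <>
  struct intrinsic_type<float, 8, void> { using type = fake_f32x2; };

template <>
  struct intrinsic_type<float, 16, void> { using type = fake_f32x4; };

// Element type obtained by subscripting a value of type T.
template <typename T>
  using element_t = std::remove_cv_t<
    std::remove_reference_t<decltype(std::declval<T>()[0])>>;

// Fallback: T cannot be subscripted, or no mapping exists for it.
template <typename T, typename = void>
  struct is_intrinsic_type : std::false_type {};

// Chosen when a mapping exists; true only if it maps back to T itself.
template <typename T>
  struct is_intrinsic_type<
    T, std::void_t<typename intrinsic_type<element_t<T>, sizeof(T)>::type>>
  : std::is_same<T, typename intrinsic_type<element_t<T>, sizeof(T)>::type>
  {};

template <typename T>
  inline constexpr bool is_intrinsic_type_v = is_intrinsic_type<T>::value;
```

With these definitions a type counts as an intrinsic type only if the
mapping for its element type and size yields exactly that type, so
other_f32x4 is rejected even though it has the same layout.
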
 libstdc++-v3/include/experimental/bits/simd.h | 70 +--
 1 file changed, 66 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/
include/experimental/bits/simd.h
index 00eec50d64f..d56176210df 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -1379,13 +1379,35 @@ template 
 template <typename _Tp>
   inline constexpr bool __is_vector_type_v = __is_vector_type<_Tp>::value;
 
+// }}}
+// __is_intrinsic_type {{{
+#if _GLIBCXX_SIMD_HAVE_SSE_ABI
+template <typename _Tp>
+  using __is_intrinsic_type = __is_vector_type<_Tp>;
+#else // not SSE (x86)
+template <typename _Tp, typename = void_t<>>
+  struct __is_intrinsic_type : false_type {};
+
+template <typename _Tp>
+  struct __is_intrinsic_type<
+_Tp, void_t<typename __intrinsic_type<remove_reference_t<decltype(declval<_Tp>()[0])>, 
sizeof(_Tp)>::type>>
+: is_same<_Tp, typename __intrinsic_type<
+remove_reference_t<decltype(declval<_Tp>()[0])>,
+sizeof(_Tp)>::type> {};
+#endif
+
+template <typename _Tp>
+  inline constexpr bool __is_intrinsic_type_v = 
__is_intrinsic_type<_Tp>::value;
+
 // }}}
 // _VectorTraits{{{
 template <typename _Tp, typename = void_t<>>
   struct _VectorTraitsImpl;
 
 template <typename _Tp>
-  struct _VectorTraitsImpl<_Tp, enable_if_t<__is_vector_type_v<_Tp>>>
+  struct _VectorTraitsImpl<_Tp, enable_if_t<__is_vector_type_v<_Tp>
+ || __is_intrinsic_type_v<_Tp>>>
   {
 using type = _Tp;
 using value_type = remove_reference_t<decltype(declval<_Tp>()[0])>;
@@ -1457,7 +1479,8 @@ template 
   _GLIBCXX_SIMD_INTRINSIC constexpr _To
   __intrin_bitcast(_From __v)
   {
-static_assert(__is_vector_type_v<_From> && __is_vector_type_v<_To>);
+static_assert((__is_vector_type_v<_From> || __is_intrinsic_type_v<_From>)
+   && (__is_vector_type_v<_To> || __is_intrinsic_type_v<_To>));
 if constexpr (sizeof(_To) == sizeof(_From))
   return reinterpret_cast<_To>(__v);
 else if constexpr (sizeof(_From) > sizeof(_To))
@@ -2183,16 +2206,55 @@ template 
 #endif // _GLIBCXX_SIMD_HAVE_SSE_ABI
 // __intrinsic_type (ARM){{{
 #if _GLIBCXX_SIMD_HAVE_NEON
+template <>
+  struct __intrinsic_type<float, 8, void>
+  { using type = float32x2_t; };
+
+template <>
+  struct __intrinsic_type<float, 16, void>
+  { using type = float32x4_t; };
+
+#if _GLIBCXX_SIMD_HAVE_NEON_A64
+template <>
+  struct __intrinsic_type<double, 8, void>
+  { using type = float64x1_t; };
+
+template <>
+  struct __intrinsic_type<double, 16, void>
+  { using type = float64x2_t; };
+#endif
+
+#define _GLIBCXX_SIMD_ARM_INTRIN(_Bits, _Np)   
\
+template <>
\
+  struct __intrinsic_type<__int_with_sizeof_t<_Bits / 8>,  
\
+ _Np * _Bits / 8, void>   \
+  { using type = int##_Bits##x##_Np##_t; };
\
+template <>
\
+  struct __intrinsic_type<make_unsigned_t<__int_with_sizeof_t<_Bits / 8>>, 
\
+ _Np * _Bits / 8, void>   \
+  { using type = uint##_Bits##x##_Np##_t; }
+_GLIBCXX_SIMD_ARM_INTRIN(8, 8);
+_GLIBCXX_SIMD_ARM_INTRIN(8, 16);
+_GLIBCXX_SIMD_ARM_INTRIN(16, 4);
+_GLIBCXX_SIMD_ARM_INTRIN(16, 8);
+_GLIBCXX_SIMD_ARM_INTRIN(32, 2);
+_GLIBCXX_SIMD_ARM_INTRIN(32, 4);
+_GLIBCXX_SIMD_ARM_INTRIN(64, 1);
+_GLIBCXX_SIMD_ARM_INTRIN(64, 2);
+#undef _GLIBCXX_SIMD_ARM_INTRIN
+
 template <typename _Tp, size_t _Bytes>
   struct __intrinsic_type<_Tp, _Bytes,
  enable_if_t<__is_vectorizable_v<_Tp> && _Bytes <= 16>>
   {
-static constexpr int _S_VBytes = _Bytes <= 8 ? 8 : 16;
+static constexpr int _SVecBytes = _Bytes <= 8 ? 8 : 16;
 using _Ip = __int_for_sizeof_t<_Tp>;
 using _Up = conditional_t<
   is_floating_point_v<_Tp>, _Tp,
   conditional_t<is_unsigned_v<_Tp>, make_unsigned_t<_Ip>, _Ip>>;
-using type [[__gnu__::__vector_size__(_S_VBytes)]] = _Up;
+static_assert(!is_same_v<_Tp, _Up> || _SVecBytes != _Bytes,
+ "should use explicit specialization above");
+using type = typename __intrinsic_type<_Up, _SVecBytes>::type;
   };
 #endif // _GLIBCXX_SIMD_HAVE_NEON
 

[PATCH 01/16] Support skip, only, expensive, and xfail markers

2021-01-27 Thread Matthias Kretz
From: Matthias Kretz 

libstdc++-v3/ChangeLog:
* testsuite/experimental/simd/driver.sh: Implement skip, only,
expensive, and xfail markers. They can select on type, ABI tag
subset number, target-triplet, and compiler flags.
* testsuite/experimental/simd/generate_makefile.sh: The summary
now includes lines for unexpected passes and expected failures.
If the skip or only markers are only conditional on the type, do
not generate rules for those types.
* testsuite/experimental/simd/tests/abs.cc: Mark test expensive
for ABI tag subsets 1-9.
* testsuite/experimental/simd/tests/algorithms.cc: Ditto.
* testsuite/experimental/simd/tests/broadcast.cc: Ditto.
* testsuite/experimental/simd/tests/casts.cc: Ditto.
* testsuite/experimental/simd/tests/generator.cc: Ditto.
* testsuite/experimental/simd/tests/integer_operators.cc: Ditto.
* testsuite/experimental/simd/tests/loadstore.cc: Ditto.
* testsuite/experimental/simd/tests/mask_broadcast.cc: Ditto.
* testsuite/experimental/simd/tests/mask_conversions.cc: Ditto.
* testsuite/experimental/simd/tests/mask_implicit_cvt.cc: Ditto.
* testsuite/experimental/simd/tests/mask_loadstore.cc: Ditto.
* testsuite/experimental/simd/tests/mask_operator_cvt.cc: Ditto.
* testsuite/experimental/simd/tests/mask_operators.cc: Ditto.
* testsuite/experimental/simd/tests/mask_reductions.cc: Ditto.
* testsuite/experimental/simd/tests/operator_cvt.cc: Ditto.
* testsuite/experimental/simd/tests/operators.cc: Ditto.
* testsuite/experimental/simd/tests/reductions.cc: Ditto.
* testsuite/experimental/simd/tests/simd.cc: Ditto.
* testsuite/experimental/simd/tests/split_concat.cc: Ditto.
* testsuite/experimental/simd/tests/splits.cc: Ditto.
* testsuite/experimental/simd/tests/where.cc: Ditto.
* testsuite/experimental/simd/tests/fpclassify.cc: Ditto. In
addition replace "test only floattypes" marker by unconditional
"float|double|ldouble" only marker.
* testsuite/experimental/simd/tests/frexp.cc: Ditto.
* testsuite/experimental/simd/tests/hypot3_fma.cc: Ditto.
* testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.cc:
Ditto.
* testsuite/experimental/simd/tests/logarithm.cc: Ditto.
* testsuite/experimental/simd/tests/math_1arg.cc: Ditto.
* testsuite/experimental/simd/tests/math_2arg.cc: Ditto.
* testsuite/experimental/simd/tests/remqo.cc: Ditto.
* testsuite/experimental/simd/tests/trigonometric.cc: Ditto.
* testsuite/experimental/simd/tests/trunc_ceil_floor.cc: Ditto.
* testsuite/experimental/simd/tests/sincos.cc: Ditto. In
addition, xfail on run because the reference data is missing.
---
 .../testsuite/experimental/simd/driver.sh | 114 +---
 .../experimental/simd/generate_makefile.sh| 122 --
 .../testsuite/experimental/simd/tests/abs.cc  |   1 +
 .../experimental/simd/tests/algorithms.cc |   1 +
 .../experimental/simd/tests/broadcast.cc  |   1 +
 .../experimental/simd/tests/casts.cc  |   1 +
 .../experimental/simd/tests/fpclassify.cc |   3 +-
 .../experimental/simd/tests/frexp.cc  |   3 +-
 .../experimental/simd/tests/generator.cc  |   1 +
 .../experimental/simd/tests/hypot3_fma.cc |   3 +-
 .../simd/tests/integer_operators.cc   |   1 +
 .../simd/tests/ldexp_scalbn_scalbln_modf.cc   |   3 +-
 .../experimental/simd/tests/loadstore.cc  |   1 +
 .../experimental/simd/tests/logarithm.cc  |   3 +-
 .../experimental/simd/tests/mask_broadcast.cc |   1 +
 .../simd/tests/mask_conversions.cc|   1 +
 .../simd/tests/mask_implicit_cvt.cc   |   1 +
 .../experimental/simd/tests/mask_loadstore.cc |   1 +
 .../simd/tests/mask_operator_cvt.cc   |   1 +
 .../experimental/simd/tests/mask_operators.cc |   1 +
 .../simd/tests/mask_reductions.cc |   1 +
 .../experimental/simd/tests/math_1arg.cc  |   3 +-
 .../experimental/simd/tests/math_2arg.cc  |   3 +-
 .../experimental/simd/tests/operator_cvt.cc   |   1 +
 .../experimental/simd/tests/operators.cc  |   1 +
 .../experimental/simd/tests/reductions.cc |   1 +
 .../experimental/simd/tests/remqo.cc  |   3 +-
 .../testsuite/experimental/simd/tests/simd.cc |   1 +
 .../experimental/simd/tests/sincos.cc |   4 +-
 .../experimental/simd/tests/split_concat.cc   |   1 +
 .../experimental/simd/tests/splits.cc |   1 +
 .../experimental/simd/tests/trigonometric.cc  |   3 +-
 .../simd/tests/trunc_ceil_floor.cc|   3 +-
 .../experimental/simd/tests/where.cc  |   1 +
 34 files changed, 225 insertions(+), 66 deletions(-)



[PATCH 00/16] stdx::simd fixes and testsuite improvements

2021-01-27 Thread Matthias Kretz
As promised on IRC ...

Matthias Kretz (15):
  Support skip, only, expensive, and xfail markers
  Fix NEON intrinsic types usage
  Support -mlong-double-64 on PPC
  Fix simd_mask on POWER w/o POWER8
  Fix several check-simd interaction issues
  Fix DRIVEROPTS and TESTFLAGS processing
  Fix incorrect display of old test summaries
  Immediate feedback with -v
  Fix mask reduction of simd_mask on POWER7
  Skip testing hypot3 for long double on PPC
  Abort test after 1000 lines of output
  Support timeout and timeout-factor options
  Improve test codegen for interpreting assembly
  Implement hmin and hmax
  Work around test failures using -mno-tree-vrp

yaozhongxiao (1):
  Improve "find_first/last_set" for NEON

 libstdc++-v3/include/experimental/bits/simd.h | 170 ++-
 .../include/experimental/bits/simd_builtin.h  |   6 +-
 .../include/experimental/bits/simd_neon.h |  17 +-
 .../include/experimental/bits/simd_ppc.h  |  35 ++-
 .../include/experimental/bits/simd_scalar.h   |   2 +-
 libstdc++-v3/testsuite/Makefile.am|   5 +-
 libstdc++-v3/testsuite/Makefile.in|   5 +-
 .../testsuite/experimental/simd/driver.sh | 263 ++
 .../experimental/simd/generate_makefile.sh| 201 +++--
 .../testsuite/experimental/simd/tests/abs.cc  |   1 +
 .../experimental/simd/tests/algorithms.cc |   1 +
 .../experimental/simd/tests/bits/verify.h |  44 +--
 .../experimental/simd/tests/broadcast.cc  |   1 +
 .../experimental/simd/tests/casts.cc  |   1 +
 .../experimental/simd/tests/fpclassify.cc |   3 +-
 .../experimental/simd/tests/frexp.cc  |   3 +-
 .../experimental/simd/tests/generator.cc  |   1 +
 .../experimental/simd/tests/hypot3_fma.cc |   4 +-
 .../simd/tests/integer_operators.cc   |   1 +
 .../simd/tests/ldexp_scalbn_scalbln_modf.cc   |   3 +-
 .../experimental/simd/tests/loadstore.cc  |   2 +
 .../experimental/simd/tests/logarithm.cc  |   3 +-
 .../experimental/simd/tests/mask_broadcast.cc |   1 +
 .../simd/tests/mask_conversions.cc|   1 +
 .../simd/tests/mask_implicit_cvt.cc   |   1 +
 .../experimental/simd/tests/mask_loadstore.cc |   1 +
 .../simd/tests/mask_operator_cvt.cc   |   1 +
 .../experimental/simd/tests/mask_operators.cc |   1 +
 .../simd/tests/mask_reductions.cc |   1 +
 .../experimental/simd/tests/math_1arg.cc  |   3 +-
 .../experimental/simd/tests/math_2arg.cc  |   3 +-
 .../experimental/simd/tests/operator_cvt.cc   |   1 +
 .../experimental/simd/tests/operators.cc  |   1 +
 .../experimental/simd/tests/reductions.cc |  22 ++
 .../experimental/simd/tests/remqo.cc  |   3 +-
 .../testsuite/experimental/simd/tests/simd.cc |   1 +
 .../experimental/simd/tests/sincos.cc |   4 +-
 .../experimental/simd/tests/split_concat.cc   |   1 +
 .../experimental/simd/tests/splits.cc |   1 +
 .../experimental/simd/tests/trigonometric.cc  |   3 +-
 .../simd/tests/trunc_ceil_floor.cc|   3 +-
 .../experimental/simd/tests/where.cc  |   1 +
 42 files changed, 635 insertions(+), 191 deletions(-)




Re: [[C++ PATCH]] Implement C++2a P0330R2 - Literal Suffixes for ptrdiff_t and size_t

2021-01-27 Thread Jakub Jelinek via Gcc-patches
On Sun, Oct 21, 2018 at 04:39:30PM -0400, Ed Smith-Rowland wrote:
> This patch implements C++2a proposal P0330R2 Literal Suffixes for ptrdiff_t
> and size_t*.  It's not official yet but looks very likely to pass.  It is
> incomplete because I'm looking for some opinions. (We also might wait 'till
> it actually passes).
> 
> This paper takes the direction of a language change rather than a library
> change through C++11 literal operators.  This was after feedback on that
> paper after a few iterations.
> 
> As coded in this patch, integer suffixes involving 'z' are errors in C and
> warnings for C++ <= 17 (in addition to the usual warning about
> implementation suffixes shadowing user-defined ones).
> 
> OTOH, the 'z' suffix is not currently legal - it can't break
> currently-correct code in any C/C++ dialect.  Furthermore, I suspect the
> language direction was chosen to accommodate a similar addition to C20.
> 
> I'm thinking of making this feature available as an extension to all of
> C/C++ perhaps with appropriate pedwarn.

GCC now supports -std=c++2b and -std=gnu++2b, are you going to update your
patch against it (and change for z/Z standing for ssize_t rather than
ptrdiff_t), plus incorporate the feedback from Joseph and Jason?

Jakub
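
For comparison, the library-only route that the quoted message sets
aside (C++11 literal operators) can be sketched as below. This is not
the P0330 built-in suffix; the suffix names _uz and _z and the helper
count_nonzero are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical user-defined literals: _uz yields std::size_t and _z
// yields std::ptrdiff_t.
constexpr std::size_t operator""_uz(unsigned long long value)
{ return static_cast<std::size_t>(value); }

constexpr std::ptrdiff_t operator""_z(unsigned long long value)
{ return static_cast<std::ptrdiff_t>(value); }

// Example use: the loop index is deduced as std::size_t, so comparing it
// with a sizeof expression involves no signed/unsigned mismatch.
constexpr std::size_t count_nonzero(const int (&values)[4])
{
  std::size_t n = 0_uz;
  for (auto i = 0_uz; i < sizeof(values) / sizeof(values[0]); ++i)
    if (values[i] != 0)
      ++n;
  return n;
}
```

A built-in suffix avoids the leading underscore that user-defined
literal suffixes outside the standard library must carry.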



[Bug c++/98570] [8/9/10/11 Regression] ICE: canonical types differ for identical types

2021-01-27 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98570

Jason Merrill  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |jason at gcc dot gnu.org
             Status|NEW                           |ASSIGNED
                 CC|                              |jason at gcc dot gnu.org

DWARF Committee resumes meetings

2021-01-27 Thread Michael Eager
The DWARF Debugging Standard Committee (http://dwarfstd.org) is planning 
to resume meetings starting in February.  The DWARF debugging file 
format, as you likely know, is widely used by compilers and debuggers to 
support source level debugging.  The Committee released Version 5 of the 
DWARF Standard in 2017 and will start work on Version 6.


The committee consists of more than 20 committee members from a broad 
cross-section of the compiler and debugger development community, 
representing a dozen companies and including a number of independent 
developers, working on both proprietary and open source products.


A list of the current proposals or issues can be found on the Open 
Issues web page:  http://dwarfstd.org/Issues.php.  If you would like to 
submit a comment, proposal, or issue, go to the Public Comment page: 
http://dwarfstd.org/Comment.php.


If you are interested in participating on the DWARF Committee please 
contact me privately.


--
Michael Eager, DWARF Committee Chair


[Bug lto/85574] [8/9 Regression] LTO bootstapped binaries differ

2021-01-27 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85574

--- Comment #38 from Eric Botcazou  ---
> Feel free to improve things - I do not have any Windows system to
> test on or an idea what you think needs to be improved.  I would
> guess similar things apply to compare-debug which it was derived from.

That's even more broken than initially thought: nobody sets $(exeext) at top
level so gcc/lto1 is passed and then the behavior is random since some tools
append the missing .exe implicitly and some don't.

[Bug c++/97874] [11 Regression] ICE: tree check: expected record_type or union_type or qual_union_type, have template_type_parm in lookup_using_decl, at cp/name-lookup.c:4652

2021-01-27 Thread jason at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97874

Jason Merrill  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|ice-on-invalid-code         |ice-on-valid-code
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #3 from Jason Merrill  ---
Fixed.

[Bug c++/97874] [11 Regression] ICE: tree check: expected record_type or union_type or qual_union_type, have template_type_parm in lookup_using_decl, at cp/name-lookup.c:4652

2021-01-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97874

--- Comment #2 from CVS Commits  ---
The master branch has been updated by Jason Merrill :

https://gcc.gnu.org/g:9cd7c32549fa334885b716fe98b674f6447fa7c0

commit r11-6942-g9cd7c32549fa334885b716fe98b674f6447fa7c0
Author: Jason Merrill 
Date:   Wed Jan 27 00:51:01 2021 -0500

c++: Dependent using enum [PR97874]

The handling of dependent scopes and unsuitable scopes in lookup_using_decl
was a bit convoluted; I tweaked it for a while and then eventually
reorganized much of the function to hopefully be clearer.  Along the way I
noticed a couple of ways we were mishandling inherited constructors.

The local binding for a dependent using is the USING_DECL.

Implement instantiation of a dependent USING_DECL at function scope.

gcc/cp/ChangeLog:

PR c++/97874
* name-lookup.c (lookup_using_decl): Clean up handling
of dependency and inherited constructors.
(finish_nonmember_using_decl): Handle DECL_DEPENDENT_P.
* pt.c (tsubst_expr): Handle DECL_DEPENDENT_P.

gcc/testsuite/ChangeLog:

PR c++/97874
* g++.dg/lookup/using4.C: No error in C++20.
* g++.dg/cpp0x/decltype37.C: Adjust message.
* g++.dg/template/crash75.C: Adjust message.
* g++.dg/template/crash76.C: Adjust message.
* g++.dg/cpp0x/inh-ctor36.C: New test.
* g++.dg/cpp1z/inh-ctor39.C: New test.
* g++.dg/cpp2a/using-enum-7.C: New test.

[pushed] c++: Dependent using enum [PR97874]

2021-01-27 Thread Jason Merrill via Gcc-patches
The handling of dependent scopes and unsuitable scopes in lookup_using_decl
was a bit convoluted; I tweaked it for a while and then eventually
reorganized much of the function to hopefully be clearer.  Along the way I
noticed a couple of ways we were mishandling inherited constructors.

The local binding for a dependent using is the USING_DECL.

Implement instantiation of a dependent USING_DECL at function scope.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/97874
* name-lookup.c (lookup_using_decl): Clean up handling
of dependency and inherited constructors.
(finish_nonmember_using_decl): Handle DECL_DEPENDENT_P.
* pt.c (tsubst_expr): Handle DECL_DEPENDENT_P.

gcc/testsuite/ChangeLog:

PR c++/97874
* g++.dg/lookup/using4.C: No error in C++20.
* g++.dg/cpp0x/decltype37.C: Adjust message.
* g++.dg/template/crash75.C: Adjust message.
* g++.dg/template/crash76.C: Adjust message.
* g++.dg/cpp0x/inh-ctor36.C: New test.
* g++.dg/cpp1z/inh-ctor39.C: New test.
* g++.dg/cpp2a/using-enum-7.C: New test.
---
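For readers unfamiliar with the source constructs involved, here is a
small sketch of using-declarations whose scope is dependent. It shows
the kind of user code concerned, not the compiler change itself, and
the names Counter and Wrapper are hypothetical.

```cpp
// Base class with a constructor and a member function to re-export.
struct Counter
{
  int value;
  constexpr explicit Counter(int v) : value(v) {}
  constexpr int get() const { return value; }
};

// Both using-declarations name a scope that depends on the template
// parameter, so they can only be resolved at instantiation time.
template <typename Base>
  struct Wrapper : Base
  {
    using Base::Base; // names the constructors: declares inheriting ctors
    using Base::get;  // ordinary dependent using-declaration
  };
```

Instantiating Wrapper<Counter> makes both declarations concrete, so
Wrapper<Counter>(7).get() evaluates to 7.
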
 gcc/cp/name-lookup.c  | 144 +++---
 gcc/cp/pt.c   |  41 +++---
 gcc/testsuite/g++.dg/cpp0x/decltype37.C   |   2 +-
 gcc/testsuite/g++.dg/cpp0x/inh-ctor36.C   |  10 ++
 gcc/testsuite/g++.dg/cpp1z/inh-ctor39.C   |  12 ++
 gcc/testsuite/g++.dg/cpp2a/using-enum-7.C |  27 
 gcc/testsuite/g++.dg/lookup/using4.C  |   2 +-
 gcc/testsuite/g++.dg/template/crash75.C   |   4 +-
 gcc/testsuite/g++.dg/template/crash76.C   |   2 +-
 9 files changed, 154 insertions(+), 90 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/inh-ctor36.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/inh-ctor39.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/using-enum-7.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 0fb0036c4f3..52e4a630e25 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -5729,6 +5729,16 @@ lookup_using_decl (tree scope, name_lookup &lookup)
   /* Naming a class member.  This is awkward in C++20, because we
 might be naming an enumerator of an unrelated class.  */
 
+  tree npscope = scope;
+  if (PACK_EXPANSION_P (scope))
+   npscope = PACK_EXPANSION_PATTERN (scope);
+
+  if (!MAYBE_CLASS_TYPE_P (npscope))
+   {
+ error ("%qT is not a class, namespace, or enumeration", npscope);
+ return NULL_TREE;
+   }
+
   /* You cannot using-decl a destructor.  */
   if (TREE_CODE (lookup.name) == BIT_NOT_EXPR)
{
@@ -5737,14 +5747,13 @@ lookup_using_decl (tree scope, name_lookup &lookup)
}
 
   /* Using T::T declares inheriting ctors, even if T is a typedef.  */
-  if (MAYBE_CLASS_TYPE_P (scope)
- && (lookup.name == TYPE_IDENTIFIER (scope)
- || constructor_name_p (lookup.name, scope)))
+  if (lookup.name == TYPE_IDENTIFIER (npscope)
+ || constructor_name_p (lookup.name, npscope))
{
  if (!TYPE_P (current))
{
  error ("non-member using-declaration names constructor of %qT",
-scope);
+npscope);
  return NULL_TREE;
}
  maybe_warn_cpp0x (CPP0X_INHERITING_CTORS);
@@ -5752,88 +5761,79 @@ lookup_using_decl (tree scope, name_lookup &lookup)
  CLASSTYPE_NON_AGGREGATE (current) = true;
}
 
-  if (!MAYBE_CLASS_TYPE_P (scope))
-   ;
+  if (!TYPE_P (current) && cxx_dialect < cxx20)
+   {
+ error ("using-declaration for member at non-class scope");
+ return NULL_TREE;
+   }
+
+  bool depscope = dependent_scope_p (scope);
+
+  if (depscope)
+   /* Leave binfo null.  */;
   else if (TYPE_P (current))
{
- dependent_p = dependent_scope_p (scope);
- if (!dependent_p)
-   {
- binfo = lookup_base (current, scope, ba_any, &b_kind, tf_none);
- gcc_checking_assert (b_kind >= bk_not_base);
+ binfo = lookup_base (current, scope, ba_any, &b_kind, tf_none);
+ gcc_checking_assert (b_kind >= bk_not_base);
 
- if (lookup.name == ctor_identifier)
+ if (b_kind == bk_not_base && any_dependent_bases_p ())
+   /* Treat as-if dependent.  */
+   depscope = true;
+ else if (lookup.name == ctor_identifier
+  && (b_kind < bk_proper_base || !binfo_direct_p (binfo)))
+   {
+ if (any_dependent_bases_p ())
+   depscope = true;
+ else
{
- /* Even if there are dependent bases, SCOPE will not
-be direct base, no matter.  */
- if (b_kind < bk_proper_base || !binfo_direct_p (binfo))
-   {
- error ("%qT is not a direct base of %qT", scope, current);
- return NULL_TREE;
- 

Re: [PATCH, rs6000] improve vec_ctf invalid parameter handling. (pr91903)

2021-01-27 Thread will schmidt via Gcc-patches


Ping!  

Thanks
-Will


On Mon, 2021-01-04 at 18:03 -0600, will schmidt via Gcc-patches wrote:
> On Mon, 2020-10-26 at 16:22 -0500, will schmidt wrote:
> > [PATCH, rs6000] improve vec_ctf invalid parameter handling.
> > 
> > Hi,
> >   Per PR91903, GCC ICEs when we attempt to pass a variable
> > (or out of range value) into the vec_ctf() builtin.  Per
> > investigation, the parameter checking exists for this
> > builtin with the int types, but was missing for
> > the long long types.
> > 
> > This patch adds the missing CODE_FOR_* entries to the
> > rs6000_expand_binop_builtin to cover that scenario.
> > This patch also updates some existing tests to remove
> > calls to vec_ctf() and vec_cts() that contain negative
> > values.
> > 
> > Regtested clean on power7, power8, power9 Linux targets.
> > 
> > OK for trunk?
> 
> 
> I've reviewed the list archives in case my local inbox lost a
> response..  I don't think this one was reviewed.  
> so..
> 
> ping!  
> 
> :-) 
> 
> thanks
> -Will
> 
> 
> > 
> > Thanks,
> > -Will
> > 
> > PR target/91903
> > 
> > 2020-10-26  Will Schmidt  
> > 
> > gcc/ChangeLog:
> > * config/rs6000/rs6000-call.c (rs6000_expand_binop_builtin):
> > Add
> > clauses for CODE_FOR_vsx_xvcvuxddp_scale and
> > CODE_FOR_vsx_xvcvsxddp_scale to the parameter checking code.
> > 
> > gcc/testsuite/ChangeLog:
> > * testsuite/gcc.target/powerpc/pr91903.c: New test.
> > * testsuite/gcc.target/powerpc/builtins-1.fold.h: Update.
> > * testsuite/gcc.target/powerpc/builtins-2.c: Update.
> > 
> > diff --git a/gcc/config/rs6000/rs6000-call.c
> > b/gcc/config/rs6000/rs6000-call.c
> > index b044778a7ae4..eb7e007e68d3 100644
> > --- a/gcc/config/rs6000/rs6000-call.c
> > +++ b/gcc/config/rs6000/rs6000-call.c
> > @@ -9447,11 +9447,13 @@ rs6000_expand_binop_builtin (enum insn_code
> > icode, tree exp, rtx target)
> > }
> >  }
> >else if (icode == CODE_FOR_altivec_vcfux
> >|| icode == CODE_FOR_altivec_vcfsx
> >|| icode == CODE_FOR_altivec_vctsxs
> > -  || icode == CODE_FOR_altivec_vctuxs)
> > +  || icode == CODE_FOR_altivec_vctuxs
> > +  || icode == CODE_FOR_vsx_xvcvuxddp_scale
> > +  || icode == CODE_FOR_vsx_xvcvsxddp_scale)
> >  {
> >/* Only allow 5-bit unsigned literals.  */
> >STRIP_NOPS (arg1);
> >if (TREE_CODE (arg1) != INTEGER_CST
> >   || TREE_INT_CST_LOW (arg1) & ~0x1f)
> > diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-1.fold.h
> > b/gcc/testsuite/gcc.target/powerpc/builtins-1.fold.h
> > index 8bc5f5e43366..42d552295e3e 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/builtins-1.fold.h
> > +++ b/gcc/testsuite/gcc.target/powerpc/builtins-1.fold.h
> > @@ -212,14 +212,14 @@ int main ()
> >extern vector unsigned long long u9; u9 = vec_mergeo (u3, u4);
> >  
> >extern vector long long l8; l8 = vec_mul (l3, l4);
> >extern vector unsigned long long u6; u6 = vec_mul (u3, u4);
> >  
> > -  extern vector double dh; dh = vec_ctf (la, -2);
> > +  extern vector double dh; dh = vec_ctf (la, 2);
> >extern vector double di; di = vec_ctf (ua, 2);
> >extern vector int sz; sz = vec_cts (fa, 0x1F);
> > -  extern vector long long l9; l9 = vec_cts (dh, -2);
> > +  extern vector long long l9; l9 = vec_cts (dh, 2);
> >extern vector unsigned long long u7; u7 = vec_ctu (di, 2);
> >extern vector unsigned int usz; usz = vec_ctu (fa, 0x1F);
> >  
> >extern vector float f1; f1 = vec_mergee (fa, fb);
> >extern vector float f2; f2 = vec_mergeo (fa, fb);
> > diff --git a/gcc/testsuite/gcc.target/powerpc/builtins-2.c
> > b/gcc/testsuite/gcc.target/powerpc/builtins-2.c
> > index 2aa23a377992..30acae47faff 100644
> > --- a/gcc/testsuite/gcc.target/powerpc/builtins-2.c
> > +++ b/gcc/testsuite/gcc.target/powerpc/builtins-2.c
> > @@ -40,16 +40,16 @@ int main ()
> >  
> >if (se[0] != 27L || se[1] != 27L || sf[0] != -14L || sf[1] !=
> > -14L
> >|| ue[0] != 27L || ue[1] != 27L || uf[0] != 14L || uf[1] !=
> > 14L)
> >  abort ();
> >  
> > -  vector double da = vec_ctf (sa, -2);
> > +  vector double da = vec_ctf (sa, 2);
> >vector double db = vec_ctf (ua, 2);
> > -  vector long long sg = vec_cts (da, -2);
> > +  vector long long sg = vec_cts (da, 2);
> >vector unsigned long long ug = vec_ctu (db, 2);
> >  
> > -  if (da[0] != 108.0 || da[1] != -56.0 || db[0] != 6.75 || db[1]
> > != 3.5
> > +  if (da[0] != 6.75 || da[1] != -3.5 || db[0] != 6.75 || db[1] !=
> > 3.5
> >|| sg[0] != 27L || sg[1] != -14L || ug[0] != 27L || ug[1] !=
> > 14L)
> >  abort ();
> >  
> >vector float fa = vec_ctf (inta, 5);
> >if (fa[0] != 0.843750 || fa[1] != -0.031250 || fa[2] != 0.125000
> > || fa[3] != 0.281250)
> > diff --git a/gcc/testsuite/gcc.target/powerpc/pr91903.c
> > b/gcc/testsuite/gcc.target/powerpc/pr91903.c
> > new file mode 100644
> > index ..f0792117a88f
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/powerpc/pr91903.c
> > @@ -0,0 +1,74 @@
> > +/* { dg-do 
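
The range check and the updated test expectations in the quoted patch
can be modelled in scalar code roughly as follows. This is a sketch
under assumptions: is_valid_scale and ctf_model are hypothetical
helpers, and ctf_model treats the conversion as division by 2^scale,
which matches the values the adjusted tests expect.

```cpp
// Mirrors the patch's literal check: accept only values that fit in
// five unsigned bits (0..31). Negative values have high bits set and
// are rejected.
constexpr bool is_valid_scale(long long scale)
{ return (scale & ~0x1fLL) == 0; }

// Hypothetical scalar model of one converted element: value / 2^scale.
constexpr double ctf_model(long long value, unsigned scale)
{
  double result = static_cast<double>(value);
  for (unsigned i = 0; i < scale; ++i)
    result /= 2.0;
  return result;
}
```

For the inputs 27 and -14 with a scale of 2 the model yields 6.75 and
-3.5, and a scale of -2 fails the check.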

[Bug target/98853] [9/10 Regression] wrong use of bfxil at -O1

2021-01-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98853

Jakub Jelinek  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2021-01-27
            Summary|[9/10/11 Regression] wrong  |[9/10 Regression] wrong use
                   |use of bfxil at -O1         |of bfxil at -O1
             Status|UNCONFIRMED                 |ASSIGNED
     Ever confirmed|0                           |1

--- Comment #5 from Jakub Jelinek  ---
Fixed on the trunk so far.

Re: PR fortran/93524 - rank >= 3 array stride incorrectly set in CFI_establish

2021-01-27 Thread Harris Snyder
Hi all,

Now that my copyright assignment is complete, I'm submitting this fix.
Test cases are included.
OK for master? I do not have write access, so someone will need to
commit this for me.

Regards,
Harris

libgfortran/ChangeLog:

* runtime/ISO_Fortran_binding.c (CFI_establish): Fix strides
for rank > 2 arrays.

gcc/testsuite/ChangeLog:

* gfortran.dg/ISO_Fortran_binding_18.c: New test.
* gfortran.dg/ISO_Fortran_binding_18.f90: New test.
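
The stride rule the fix implements can be sketched on its own as
follows: the byte stride of dimension i is the element length times
the product of all lower extents, not just the previous one. The
helper byte_stride is hypothetical and is not the libgfortran code.

```cpp
#include <cstddef>

// Hypothetical helper: byte stride of dimension `dim` in a contiguous,
// column-major array with the given element length and extents.
constexpr std::ptrdiff_t byte_stride(std::size_t elem_len,
                                     const std::ptrdiff_t* extents, int dim)
{
  std::ptrdiff_t extents_product = 1;
  for (int j = 0; j < dim; ++j)
    extents_product *= extents[j]; // accumulate all lower extents
  return static_cast<std::ptrdiff_t>(elem_len) * extents_product;
}
```

With 4-byte elements and extents {2, 10, 9} this gives strides of 4,
8 and 80 bytes. The old formula, elem_len * extents[i - 1], would give
40 for the last dimension, which is why only rank >= 3 was affected.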




On Wed, Jan 13, 2021 at 2:10 PM Harris Snyder  wrote:
>
> Hi Tobias / all,
>
> Further related to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93524
> `sm` is being incorrectly computed in CFI_establish. Take a look at
> the diff below - we are currently only using the extent of the
> previous rank to assign `sm`, instead of all previous ranks. Have I
> got this right, or am I missing something / does this need to be
> handled differently? I can offer some test cases and submit a proper
> patch if we think this solution is OK...
>
> Thanks,
> Harris
>
> diff --git a/libgfortran/runtime/ISO_Fortran_binding.c
> b/libgfortran/runtime/ISO_Fortran_binding.c
> index 3746ec1c681..20833ad2025 100644
> --- a/libgfortran/runtime/ISO_Fortran_binding.c
> +++ b/libgfortran/runtime/ISO_Fortran_binding.c
> @@ -391,7 +391,12 @@ int CFI_establish (CFI_cdesc_t *dv, void
> *base_addr, CFI_attribute_t attribute,
>   if (i == 0)
> dv->dim[i].sm = dv->elem_len;
>   else
> -   dv->dim[i].sm = (CFI_index_t)(dv->elem_len * extents[i - 1]);
> +   {
> + CFI_index_t extents_product = 1;
> + for (int j = 0; j < i; j++)
> +   extents_product *= extents[j];
> + dv->dim[i].sm = (CFI_index_t)(dv->elem_len * extents_product);
> +   }
> }
>  }
commit 451bd40aca006ebdba52553de2392fcb5b1ff42f
Author: Harris M. Snyder 
Date:   Tue Jan 26 23:29:24 2021 -0500

Partial fix for PR fortran/93524

diff --git a/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.c b/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.c
new file mode 100644
index 000..4d1c4ecbd72
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.c
@@ -0,0 +1,29 @@
+#include <ISO_Fortran_binding.h>
+
+#include <stdlib.h>
+#include <string.h>
+
+
+
+extern int do_loop(CFI_cdesc_t* array);
+
+int main(int argc, char ** argv)
+{
+	int nx = 9;
+	int ny = 10;
+	int nz = 2;
+
+	int arr[nx*ny*nz];
+	memset(arr,0,sizeof(int)*nx*ny*nz);
+	CFI_index_t shape[3];
+	shape[0] = nz;
+	shape[1] = ny;
+	shape[2] = nx;
+
+	CFI_CDESC_T(3) farr;
+	int rc = CFI_establish((CFI_cdesc_t*)&farr, arr, CFI_attribute_other, CFI_type_int, 0, (CFI_rank_t)3, (const CFI_index_t *)shape);
+	if (rc != CFI_SUCCESS) abort();
+	int result = do_loop((CFI_cdesc_t*)&farr);
+	if (result != nx*ny*nz) abort();
+	return 0;
+}
diff --git a/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.f90 b/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.f90
new file mode 100644
index 000..76be51d22fb
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/ISO_Fortran_binding_18.f90
@@ -0,0 +1,28 @@
+! { dg-do run }
+! { dg-additional-sources ISO_Fortran_binding_18.c }
+
+module fortran_binding_test_18
+use iso_c_binding
+implicit none
+contains
+
+subroutine test(array)
+integer(c_int) :: array(:)
+array = 1
+end subroutine
+
+function do_loop(array) result(the_sum) bind(c)
+integer(c_int), intent(in out) :: array(:,:,:)
+integer(c_int) :: the_sum, i, j
+
+the_sum = 0  
+array = 0
+do i=1,size(array,3)
+do j=1,size(array,2)
+call test(array(:,j,i))
+end do
+end do
+the_sum = sum(array)
+end function
+
+end module
diff --git a/libgfortran/runtime/ISO_Fortran_binding.c b/libgfortran/runtime/ISO_Fortran_binding.c
index 3746ec1c681..20833ad2025 100644
--- a/libgfortran/runtime/ISO_Fortran_binding.c
+++ b/libgfortran/runtime/ISO_Fortran_binding.c
@@ -391,7 +391,12 @@ int CFI_establish (CFI_cdesc_t *dv, void *base_addr, CFI_attribute_t attribute,
 	  if (i == 0)
 	dv->dim[i].sm = dv->elem_len;
 	  else
-	dv->dim[i].sm = (CFI_index_t)(dv->elem_len * extents[i - 1]);
+	{
+	  CFI_index_t extents_product = 1;
+	  for (int j = 0; j < i; j++)
+		extents_product *= extents[j];
+	  dv->dim[i].sm = (CFI_index_t)(dv->elem_len * extents_product);
+	}
 	}
 }
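
For illustration, the stride rule that the hunk above implements can be written as a standalone function. This is a minimal sketch, not libgfortran code: compute_strides and its plain size_t/ptrdiff_t parameters are hypothetical stand-ins for the descriptor fields, and it assumes a contiguous column-major array. It keeps a running product rather than recomputing the product in an inner loop.

```c
#include <stddef.h>

/* Hypothetical helper (not part of libgfortran): fill sm[0..rank-1] with
   the byte strides of a contiguous column-major array.  The stride of
   dimension i is elem_len times the product of extents[0..i-1].  */
void
compute_strides (size_t elem_len, int rank, const ptrdiff_t *extents,
                 ptrdiff_t *sm)
{
  ptrdiff_t stride = (ptrdiff_t) elem_len;
  for (int i = 0; i < rank; i++)
    {
      sm[i] = stride;        /* stride of dimension i */
      stride *= extents[i];  /* running product for the next dimension */
    }
}
```

With the test's shape {2, 10, 9} and 4-byte elements this gives strides of 4, 8 and 80 bytes; the old expression would have produced 4 * 10 = 40 for the last dimension.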
 


[Bug target/98853] [9/10/11 Regression] wrong use of bfxil at -O1

2021-01-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98853

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:55163419211c6f17e3e22c68304384eba35782a3

commit r11-6941-g55163419211c6f17e3e22c68304384eba35782a3
Author: Jakub Jelinek 
Date:   Wed Jan 27 20:35:21 2021 +0100

aarch64: Fix up *aarch64_bfxilsi_uxtw [PR98853]

The https://gcc.gnu.org/legacy-ml/gcc-patches/2018-07/msg01895.html
patch that introduced this pattern claimed:
Would generate:

combine_balanced_int:
bfxil   w0, w1, 0, 16
uxtw    x0, w0
ret

But with this patch generates:

combine_balanced_int:
bfxil   w0, w1, 0, 16
ret
and it is indeed what it should generate, but it doesn't do that,
it emits bfxil  x0, x1, 0, 16
instead which doesn't zero extend from 32 to 64 bits, but preserves
the bits from the destination register.

2021-01-27  Jakub Jelinek  

PR target/98853
* config/aarch64/aarch64.md (*aarch64_bfxilsi_uxtw): Use
%w0, %w1 and %w2 instead of %0, %1 and %2.

* gcc.c-torture/execute/pr98853-1.c: New test.
* gcc.c-torture/execute/pr98853-2.c: New test.

RE: [PATCH] aarch64: Fix up *aarch64_bfxilsi_uxtw [PR98853]

2021-01-27 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Jakub Jelinek 
> Sent: 27 January 2021 19:11
> To: Richard Earnshaw ; Richard Sandiford
> ; Marcus Shawcroft
> ; Kyrylo Tkachov 
> Cc: gcc-patches@gcc.gnu.org
> Subject: [PATCH] aarch64: Fix up *aarch64_bfxilsi_uxtw [PR98853]
> 
> Hi!
> 
> The https://gcc.gnu.org/legacy-ml/gcc-patches/2018-07/msg01895.html
> patch that introduced this pattern claimed:
> Would generate:
> 
> combine_balanced_int:
> bfxil   w0, w1, 0, 16
> uxtw    x0, w0
> ret
> 
> But with this patch generates:
> 
> combine_balanced_int:
> bfxil   w0, w1, 0, 16
> ret
> and it is indeed what it should generate, but it doesn't do that,
> it emits bfxil  x0, x1, 0, 16
> instead which doesn't zero extend from 32 to 64 bits, but preserves
> the bits from the destination register.
> 
> The following patch fixes that, bootstrapped/regtested on aarch64-linux,
> ok for trunk (and later backports)?

Ok.
Thanks,
Kyrill

> 
> 2021-01-27  Jakub Jelinek  
> 
>   PR target/98853
>   * config/aarch64/aarch64.md (*aarch64_bfxilsi_uxtw): Use
>   %w0, %w1 and %w2 instead of %0, %1 and %2.
> 
>   * gcc.c-torture/execute/pr98853-1.c: New test.
>   * gcc.c-torture/execute/pr98853-2.c: New test.
> 
> --- gcc/config/aarch64/aarch64.md.jj  2021-01-04 10:25:46.435147744
> +0100
> +++ gcc/config/aarch64/aarch64.md 2021-01-27 15:13:13.993275204
> +0100
> @@ -5724,10 +5724,10 @@ (define_insn "*aarch64_bfxilsi_uxtw"
>  {
>case 0:
>   operands[3] = GEN_INT (ctz_hwi (~INTVAL (operands[3])));
> - return "bfxil\\t%0, %1, 0, %3";
> + return "bfxil\\t%w0, %w1, 0, %3";
>case 1:
>   operands[3] = GEN_INT (ctz_hwi (~INTVAL (operands[4])));
> - return "bfxil\\t%0, %2, 0, %3";
> + return "bfxil\\t%w0, %w2, 0, %3";
>default:
>   gcc_unreachable ();
>  }
> --- gcc/testsuite/gcc.c-torture/execute/pr98853-1.c.jj2021-01-27
> 15:26:15.544335342 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr98853-1.c   2021-01-27
> 15:28:37.877710203 +0100
> @@ -0,0 +1,21 @@
> +/* PR target/98853 */
> +
> +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 &&
> __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> +__attribute__((__noipa__)) unsigned long long
> +foo (unsigned x, unsigned long long y, unsigned long long z)
> +{
> +  __builtin_memcpy (2 + (char *) &x, 2 + (char *) &y, 2);
> +  return x + z;
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 &&
> __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
> +  if (foo (0xU, 0xULL,
> 0xULL)
> +  != 0xULL)
> +__builtin_abort ();
> +#endif
> +  return 0;
> +}
> --- gcc/testsuite/gcc.c-torture/execute/pr98853-2.c.jj2021-01-27
> 19:35:52.312351623 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr98853-2.c   2021-01-27
> 19:37:51.369515183 +0100
> @@ -0,0 +1,19 @@
> +/* PR target/98853 */
> +
> +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8
> +__attribute__((noipa)) unsigned long long
> +foo (unsigned long long x, unsigned int y)
> +{
> +  return ((unsigned) x & 0xfffe0000U) | (y & 0x1ffff);
> +}
> +#endif
> +
> +int
> +main ()
> +{
> +#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8
> +  if (foo (0xdeadbeefcaf2babeULL, 0xdeaffeedU) !=
> 0xcaf3feedULL)
> +__builtin_abort ();
> +#endif
> +  return 0;
> +}
> 
>   Jakub



[PATCH v2] IBM Z: Fix usage of "f" constraint with long doubles

2021-01-27 Thread Ilya Leoshkevich via Gcc-patches
v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563799.html

v1 -> v2: Handle constraint modifiers, use AR constraint instead of R,
add testcases for & and %.




After switching the s390 backend to store long doubles in vector
registers, "f" constraint broke when used with the former: long doubles
correspond to TFmode, which in combination with "f" corresponds to
hard regs %v0-%v15, however, asm users expect a %f0-%f15 pair.

Fix by using TARGET_MD_ASM_ADJUST hook to convert TFmode values to
FPRX2mode and back.
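
The modifier handling mentioned in the v2 notes amounts to looking for the letter "f" anywhere in the constraint string rather than only at a fixed position. The following is a minimal sketch under a stated simplification, not the code from the patch: has_f_constraint is a hypothetical name, and every constraint letter is assumed to be one character long, whereas GCC itself advances by CONSTRAINT_LEN because some constraints span several characters.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplification: return true if the constraint string
   contains 'f', skipping over any modifiers such as '=', '&' or '%'.
   Assumes single-character constraints only.  */
bool
has_f_constraint (const char *constraint)
{
  for (size_t i = 0; constraint[i] != '\0'; i++)
    if (constraint[i] == 'f')
      return true;
  return false;
}
```

Under that assumption "=f", "=&f" and "%f" are all recognised, while "=r" is not.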

gcc/ChangeLog:

2020-12-14  Ilya Leoshkevich  

* config/s390/s390.c (f_constraint_p): New function.
(s390_md_asm_adjust): Implement TARGET_MD_ASM_ADJUST.
(TARGET_MD_ASM_ADJUST): Likewise.
* config/s390/vector.md (fprx2_to_tf): Rename from *fprx2_to_tf,
add memory alternative.
(tf_to_fprx2): New pattern.

gcc/testsuite/ChangeLog:

2020-12-14  Ilya Leoshkevich  

* gcc.target/s390/vector/long-double-asm-abi.c: New test.
* gcc.target/s390/vector/long-double-asm-commutative.c: New
test.
* gcc.target/s390/vector/long-double-asm-earlyclobber.c: New
test.
* gcc.target/s390/vector/long-double-asm-in-out.c: New test.
* gcc.target/s390/vector/long-double-asm-inout.c: New test.
* gcc.target/s390/vector/long-double-volatile-from-i64.c: New
test.
---
 gcc/config/s390/s390.c| 88 +++
 gcc/config/s390/vector.md | 36 ++--
 .../s390/vector/long-double-asm-abi.c | 26 ++
 .../s390/vector/long-double-asm-commutative.c | 16 
 .../vector/long-double-asm-earlyclobber.c | 17 
 .../s390/vector/long-double-asm-in-out.c  | 14 +++
 .../s390/vector/long-double-asm-inout.c   | 14 +++
 .../s390/vector/long-double-asm-matching.c| 13 +++
 .../vector/long-double-volatile-from-i64.c| 22 +
 9 files changed, 241 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/long-double-asm-abi.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-asm-commutative.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-asm-earlyclobber.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-asm-in-out.c
 create mode 100644 gcc/testsuite/gcc.target/s390/vector/long-double-asm-inout.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-asm-matching.c
 create mode 100644 
gcc/testsuite/gcc.target/s390/vector/long-double-volatile-from-i64.c

diff --git a/gcc/config/s390/s390.c b/gcc/config/s390/s390.c
index 9d2cee950d0..d4b098325e8 100644
--- a/gcc/config/s390/s390.c
+++ b/gcc/config/s390/s390.c
@@ -16688,6 +16688,91 @@ s390_shift_truncation_mask (machine_mode mode)
   return mode == DImode || mode == SImode ? 63 : 0;
 }
 
+/* Return TRUE iff CONSTRAINT is an "f" constraint, possibly with additional
+   modifiers.  */
+
+static bool
+f_constraint_p (const char *constraint)
+{
+  for (size_t i = 0, c_len = strlen (constraint); i < c_len;
+   i += CONSTRAINT_LEN (constraint[i], constraint + i))
+{
+  if (constraint[i] == 'f')
+   return true;
+}
+  return false;
+}
+
+/* Implement TARGET_MD_ASM_ADJUST hook in order to fix up "f"
+   constraints when long doubles are stored in vector registers.  */
+
+static rtx_insn *
+s390_md_asm_adjust (vec<rtx> &outputs, vec<rtx> &inputs,
+   vec<machine_mode> &input_modes,
+   vec<const char *> &constraints, vec<rtx> & /*clobbers*/,
+   HARD_REG_SET & /*clobbered_regs*/)
+{
+  if (!TARGET_VXE)
+/* Long doubles are stored in FPR pairs - nothing to do.  */
+return NULL;
+
+  rtx_insn *after_md_seq = NULL, *after_md_end = NULL;
+
+  unsigned ninputs = inputs.length ();
+  unsigned noutputs = outputs.length ();
+  for (unsigned i = 0; i < noutputs; i++)
+{
+  if (GET_MODE (outputs[i]) != TFmode)
+   /* Not a long double - nothing to do.  */
+   continue;
+  const char *constraint = constraints[i];
+  bool allows_mem, allows_reg, is_inout;
+  bool ok = parse_output_constraint (&constraint, i, ninputs, noutputs,
+ &allows_mem, &allows_reg, &is_inout);
+  gcc_assert (ok);
+  if (!f_constraint_p (constraint + 1))
+   /* Long double with a constraint other than "=f" - nothing to do.  */
+   continue;
+  gcc_assert (allows_reg);
+  gcc_assert (!allows_mem);
+  gcc_assert (!is_inout);
+  /* Copy output value from a FPR pair into a vector register.  */
+  rtx fprx2 = gen_reg_rtx (FPRX2mode);
+  push_to_sequence2 (after_md_seq, after_md_end);
+  emit_insn (gen_fprx2_to_tf (outputs[i], fprx2));
+  after_md_seq = get_insns ();
+  after_md_end = get_last_insn ();
+  end_sequence ();
+  outputs[i] = fprx2;
+}
+
+  for (unsigned i = 0; i < ninputs; i++)
+{
+  if (GET_MODE (inputs[i]) != TFmode)
+   /* Not a long double - nothing to do.  */
+   continue;
+  const char 

[PATCH] aarch64: Fix up *aarch64_bfxilsi_uxtw [PR98853]

2021-01-27 Thread Jakub Jelinek via Gcc-patches
Hi!

The https://gcc.gnu.org/legacy-ml/gcc-patches/2018-07/msg01895.html
patch that introduced this pattern claimed:
Would generate:

combine_balanced_int:
bfxil   w0, w1, 0, 16
uxtw    x0, w0
ret

But with this patch generates:

combine_balanced_int:
bfxil   w0, w1, 0, 16
ret
and it is indeed what it should generate, but it doesn't do that,
it emits bfxil  x0, x1, 0, 16
instead which doesn't zero extend from 32 to 64 bits, but preserves
the bits from the destination register.
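
To make the difference concrete, here is a minimal sketch in plain C that models the two forms of the instruction on a 64-bit register value. The function names bfxil_w and bfxil_x are hypothetical; the model assumes the usual A64 rule that writing a W register zeroes bits 63:32 of the corresponding X register.

```c
#include <stdint.h>

/* Model of "bfxil Wd, Wn, #lsb, #width": insert WIDTH bits of Wn starting
   at LSB into the low bits of Wd.  The result is zero-extended to 64 bits,
   as a write to a W register would be.  */
uint64_t
bfxil_w (uint64_t xd, uint64_t xn, unsigned lsb, unsigned width)
{
  uint32_t d = (uint32_t) xd;
  uint32_t n = (uint32_t) xn;
  uint32_t mask = width >= 32 ? UINT32_MAX : (UINT32_C (1) << width) - 1;
  d = (d & ~mask) | ((n >> lsb) & mask);
  return d;
}

/* Model of "bfxil Xd, Xn, #lsb, #width": same insertion, but the upper
   bits of the destination register are preserved.  */
uint64_t
bfxil_x (uint64_t xd, uint64_t xn, unsigned lsb, unsigned width)
{
  uint64_t mask = width >= 64 ? UINT64_MAX : (UINT64_C (1) << width) - 1;
  return (xd & ~mask) | ((xn >> lsb) & mask);
}
```

For a destination of 0xdeadbeefcaf2babe and a source of 0xdeaffeed with lsb 0 and width 16, the W form yields 0xcaf2feed while the X form yields 0xdeadbeefcaf2feed, i.e. the stale upper half of the destination survives.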

The following patch fixes that, bootstrapped/regtested on aarch64-linux,
ok for trunk (and later backports)?

2021-01-27  Jakub Jelinek  

PR target/98853
* config/aarch64/aarch64.md (*aarch64_bfxilsi_uxtw): Use
%w0, %w1 and %w2 instead of %0, %1 and %2.

* gcc.c-torture/execute/pr98853-1.c: New test.
* gcc.c-torture/execute/pr98853-2.c: New test.

--- gcc/config/aarch64/aarch64.md.jj2021-01-04 10:25:46.435147744 +0100
+++ gcc/config/aarch64/aarch64.md   2021-01-27 15:13:13.993275204 +0100
@@ -5724,10 +5724,10 @@ (define_insn "*aarch64_bfxilsi_uxtw"
 {
   case 0:
operands[3] = GEN_INT (ctz_hwi (~INTVAL (operands[3])));
-   return "bfxil\\t%0, %1, 0, %3";
+   return "bfxil\\t%w0, %w1, 0, %3";
   case 1:
operands[3] = GEN_INT (ctz_hwi (~INTVAL (operands[4])));
-   return "bfxil\\t%0, %2, 0, %3";
+   return "bfxil\\t%w0, %w2, 0, %3";
   default:
gcc_unreachable ();
 }
--- gcc/testsuite/gcc.c-torture/execute/pr98853-1.c.jj  2021-01-27 
15:26:15.544335342 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr98853-1.c 2021-01-27 
15:28:37.877710203 +0100
@@ -0,0 +1,21 @@
+/* PR target/98853 */
+
+#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 && __BYTE_ORDER__ == 
__ORDER_LITTLE_ENDIAN__
+__attribute__((__noipa__)) unsigned long long
+foo (unsigned x, unsigned long long y, unsigned long long z)
+{
+  __builtin_memcpy (2 + (char *) &x, 2 + (char *) &y, 2);
+  return x + z;
+}
+#endif
+
+int
+main ()
+{
+#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 && __BYTE_ORDER__ == 
__ORDER_LITTLE_ENDIAN__
+  if (foo (0xU, 0xULL, 0xULL)
+  != 0xULL)
+__builtin_abort ();
+#endif
+  return 0;
+}
--- gcc/testsuite/gcc.c-torture/execute/pr98853-2.c.jj  2021-01-27 
19:35:52.312351623 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr98853-2.c 2021-01-27 
19:37:51.369515183 +0100
@@ -0,0 +1,19 @@
+/* PR target/98853 */
+
+#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8
+__attribute__((noipa)) unsigned long long
+foo (unsigned long long x, unsigned int y)
+{
+  return ((unsigned) x & 0xfffe0000U) | (y & 0x1ffff);
+}
+#endif
+
+int
+main ()
+{
+#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8
+  if (foo (0xdeadbeefcaf2babeULL, 0xdeaffeedU) != 0xcaf3feedULL)
+__builtin_abort ();
+#endif
+  return 0;
+}

Jakub



Re: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.

2021-01-27 Thread will schmidt via Gcc-patches
On Thu, 2021-01-14 at 11:59 -0500, Michael Meissner via Gcc-patches wrote:
> From 78435dee177447080434cdc08fc76b1029c7f576 Mon Sep 17 00:00:00 2001
> From: Michael Meissner 
> Date: Wed, 13 Jan 2021 21:47:03 -0500
> Subject: [PATCH] PowerPC: Map IEEE 128-bit long double built-ins.
> 
> This patch replaces patches previously submitted:
> 
> September 24th, 2020:
> Message-ID: <20200924203159.ga31...@ibm-toto.the-meissners.org>
> 
> October 9th, 2020:
> Message-ID: <20201009043543.ga11...@ibm-toto.the-meissners.org>
> 
> October 24th, 2020:
> Message-ID: <2020100346.ga8...@ibm-toto.the-meissners.org>
> 
> November 19th, 2020:
> Message-ID: <20201119235814.ga...@ibm-toto.the-meissners.org>


Subject and date should be sufficient _if_ having the old versions
of the patches is necessary to review the latest version of the
patch.  Which ideally is not the case.


> 
> This patch maps the built-in functions that take or return long double
> arguments on systems where long double is IEEE 128-bit.
> 
> If long double is IEEE 128-bit, this patch goes through the built-in functions
> and changes the name of the math, scanf, and printf built-in functions to use
> the functions that GLIBC provides when long double uses the IEEE 128-bit
> representation.

ok.

> 
> In addition, changing the name in GCC allows the Fortran compiler to
> automatically use the correct name.

Does the fortran compiler currently use the wrong name? (pr?)

> 
> To map the math functions, typically this patch changes <name>l to
> __<name>ieee128.  However there are some exceptions that are handled with this
> patch.

This appears to be  the rs6000_mangle_decl_assembler_name() function, which
also maps l_r to ieee128_r, and looks like some additional special
handling for printf and scanf.  


> To map the printf functions, <name> is mapped to __<name>ieee128.
> 
> To map the scanf functions, <name> is mapped to __isoc99_<name>ieee128.
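
The three mapping rules quoted above can be sketched as a small string transformation. This is only an illustration, not the code in rs6000.c: map_ieee128_name and enum builtin_kind are hypothetical, the rules are read as "strip the trailing l and wrap the name in __ ... ieee128" (math), "__ name ieee128" (printf family) and "__isoc99_ name ieee128" (scanf family), and the exceptions the patch handles are left out.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of a long double built-in.  */
enum builtin_kind { KIND_MATH, KIND_PRINTF, KIND_SCANF };

/* Hypothetical sketch of the general naming rules only.  Writes the mapped
   name to OUT and returns 0, or returns -1 if the name does not fit or a
   math name does not end in 'l'.  */
int
map_ieee128_name (const char *name, enum builtin_kind kind,
                  char *out, size_t out_size)
{
  size_t len = strlen (name);
  int n;

  switch (kind)
    {
    case KIND_MATH:
      if (len < 2 || name[len - 1] != 'l')
        return -1;
      /* Drop the trailing 'l': sinl -> __sinieee128.  */
      n = snprintf (out, out_size, "__%.*sieee128", (int) (len - 1), name);
      break;
    case KIND_PRINTF:
      n = snprintf (out, out_size, "__%sieee128", name);
      break;
    case KIND_SCANF:
      n = snprintf (out, out_size, "__isoc99_%sieee128", name);
      break;
    default:
      return -1;
    }
  return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

Under these assumptions sinl becomes __sinieee128, printf becomes __printfieee128 and sscanf becomes __isoc99_sscanfieee128.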


> 
> I have tested this patch by doing builds, bootstraps, and make check with 3
> builds on a power9 little endian server:
> 
> * Build one used the default long double being IBM 128-bit;
> * Build two set the long double default to IEEE 128-bit; (and)
> * Build three set the long double default to 64-bit.
> 

ok

> The compilers built fine providing I recompiled gmp, mpc, and mpfr with the
> appropriate long double options.

Presumably the build is otherwise broken...
Does that mean more than invoking download_prerequisites as part of the
build?  If there are specific options required during configure/build of
those packages, they should be called out.

> There were a few differences in the test
> suite runs that will be addressed in later patches, but over all it works
> well.

Presumably minimal. :-)


>   This patch is required to be able to build a toolchain where the default
> long double is IEEE 128-bit. 

Ok.  Could lead the patch description with this.  I imagine this is
just one of several patches that are still required towards that goal.



>  Can I check this patch into the master branch for
> GCC 11?





> 
> gcc/
> 2021-01-14  Michael Meissner  
> 
>   * config/rs6000/rs6000.c (ieee128_builtin_name): New function.
>   (built_in_uses_long_double): New function.
>   (identifier_ends_in_suffix): New function.
>   (rs6000_mangle_decl_assembler_name): Update support for mapping built-in
>   function names for long double built-in functions if long double is
>   IEEE 128-bit to catch all of the built-in functions that take or
>   return long double arguments.
> 
> gcc/testsuite/
> 2021-01-14  Michael Meissner  
> 
>   * gcc.target/powerpc/float128-longdouble-math.c: New test.
>   * gcc.target/powerpc/float128-longdouble-stdio.c: New test.
>   * gcc.target/powerpc/float128-math.c: Adjust test for new name
>   being generated.  Add support for running test on power10.  Add
>   support for running if long double defaults to 64-bits.
> ---
>  gcc/config/rs6000/rs6000.c| 239 --
>  .../powerpc/float128-longdouble-math.c| 442 ++
>  .../powerpc/float128-longdouble-stdio.c   |  36 ++
>  .../gcc.target/powerpc/float128-math.c|  16 +-
>  4 files changed, 694 insertions(+), 39 deletions(-)
>  create mode 100644 
> gcc/testsuite/gcc.target/powerpc/float128-longdouble-math.c
>  create mode 100644 
> gcc/testsuite/gcc.target/powerpc/float128-longdouble-stdio.c
> 
> diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
> index 6f48dd6566d..282703b9715 100644
> --- a/gcc/config/rs6000/rs6000.c
> +++ b/gcc/config/rs6000/rs6000.c
> @@ -27100,6 +27100,172 @@ rs6000_globalize_decl_name (FILE * stream, tree 
> decl)
>  #endif
> 
>  
> +/* If long double uses the IEEE 128-bit representation, return the name used
> +   within GLIBC for the IEEE 128-bit long double built-in, instead of the
> +   default IBM 128-bit long double built-in.  Or return NULL if the built-in
> +   function does not use long double.  

[Bug c++/98295] [8/9/10/11 Regression] ICE in verify_ctor_sanity, at cp/constexpr.c:4312

2021-01-27 Thread ppalka at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98295

Patrick Palka  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |ppalka at gcc dot gnu.org
 CC||ppalka at gcc dot gnu.org

[Bug tree-optimization/60770] disappearing clobbers

2021-01-27 Thread orgads at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60770

--- Comment #15 from Orgad Shaneh  ---
test.cpp: In function ‘int f(int)’:
test.cpp:7:11: warning: ‘q’ is used uninitialized in this function
[-Wuninitialized]
7 |   return *p;
  |   ^

Is this the intended description? It doesn't refer to the real problem (storing
a pointer to a variable that is out of scope).

[Bug fortran/98858] OpenMP offload target data ICE at use_device_ptr

2021-01-27 Thread xw111luoye at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98858

--- Comment #1 from Ye Luo  ---
GNU Fortran (GCC) 11.0.0 20210127 (experimental)

[Bug fortran/98858] New: OpenMP offload target data ICE at use_device_ptr

2021-01-27 Thread xw111luoye at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98858

Bug ID: 98858
   Summary: OpenMP offload target data ICE at use_device_ptr
   Product: gcc
   Version: 11.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
  Assignee: unassigned at gcc dot gnu.org
  Reporter: xw111luoye at gmail dot com
  Target Milestone: ---

Getting ICE


yeluo@ryzen-box:~/opt/openmp-target/hands-on/tests/fortran_use_device_ptr$
gfortran -fopenmp test_use_device_ptr_target.f90 
test_use_device_ptr_target.f90:15:41:

   15 |   !$omp target data use_device_ptr(a)
  | ^
internal compiler error: Segmentation fault
0xf55ee3 crash_signal

Source code:
https://github.com/ye-luo/openmp-target/blob/master/hands-on/tests/fortran_use_device_ptr/test_use_device_ptr_target.f90

Re: [PATCH 2/2] Add simd testsuite

2021-01-27 Thread Jonathan Wakely via Gcc-patches

On 27/01/21 17:54 +, Jonathan Wakely wrote:

and add something to the release notes too.


Also done. Pushed to wwwdocs.


commit f948177c3d01d09cbc8035a75583d425a4dca46e
Author: Jonathan Wakely 
Date:   Wed Jan 27 18:30:00 2021 +

Document simd additions to libstdc++

diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
index 9b86e6c8..efbf3341 100644
--- a/htdocs/gcc-11/changes.html
+++ b/htdocs/gcc-11/changes.html
@@ -329,6 +329,9 @@ a work-in-progress.
   
 
   
+  Experimental support for Data-Parallel Types (simd)
+from the Parallelism 2 TS, thanks to Matthias Kretz.
+  
   Faster std::uniform_int_distribution,
   thanks to Daniel Lemire.
   


[Bug target/98849] [11 Regression] ICE in expand_shift_1, at expmed.c:2658 since g:7432f255b70811dafaf325d9403

2021-01-27 Thread clyon at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98849

--- Comment #11 from Christophe Lyon  ---
Yes MVE is incompatible with iWMMXt.

Regarding the pattern name, quoting what I wrote in the commit message:
I kept the mve_vshlq_ naming instead of renaming it to
ashl3__ as discussed because the reference in
arm_mve_builtins.def automatically inserts the "mve_" prefix and I
didn't want to make a special case for this.

Re: [PATCH 2/2] Add simd testsuite

2021-01-27 Thread Jonathan Wakely via Gcc-patches

On 27/01/21 16:45 +, Jonathan Wakely wrote:

I'll regen the docs [...]


Done. Regenerating the docs needed the attached fix.


commit 3670dbe49059ab1746ac2e3b77940160c05db6c2
Author: Jonathan Wakely 
Date:   Wed Jan 27 17:52:27 2021

libstdc++: Regenerate libstdc++ HTML docs

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2017.xml: Replace invalid entity.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
index bc740f8e1ba..f97fc060fa0 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2017.xml
@@ -3113,7 +3113,7 @@ since C++14 and the implementation is complete.
 is supported if __ALTIVEC__ is defined and sizeof(T)
  8. Additionally, double is supported if
 __VSX__ is defined, and any T with 
-sizeof(T)  8 is supported if __POWER8_VECTOR__
+sizeof(T) = 8 is supported if __POWER8_VECTOR__
 is defined.
 
 On x86, given an extended ABI tag Abi,


[Bug c++/98841] wrong ‘operator=’ should return a reference to ‘*this’ [-Weffc++]

2021-01-27 Thread o.mandel at menlosystems dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98841

--- Comment #7 from Olaf Mandel  ---
(In reply to Olaf Mandel from comment #0)
> In the minimal demo used here this only happens for a template member
> function, but in larger code it can also be observed for a plain member
> function: see e.g. https://github.com/jbeder/yaml-cpp/issues/970
> 
I have to retract that statement: I cannot reproduce this and the two line
numbers in the larger code in question are very similar: 212 and 221. Maybe I
just confused them?

[Bug tree-optimization/60770] disappearing clobbers

2021-01-27 Thread glisse at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60770

--- Comment #14 from Marc Glisse  ---
(In reply to Orgad Shaneh from comment #13)
> The case described in comment 1 doesn't issue a warning with GCC 10.

It does for me with -Wall -O (you need at least some optimization). If there is
still a problem, you need to open a new issue.

Re: [PATCH] aarch64: Use GCC vector extensions for FP ml[as]_n intrinsics

2021-01-27 Thread Richard Sandiford via Gcc-patches
Kyrylo Tkachov  writes:
> Hi Jonathan,
>
>> -Original Message-
>> From: Jonathan Wright 
>> Sent: 27 January 2021 16:03
>> To: gcc-patches@gcc.gnu.org
>> Cc: Kyrylo Tkachov 
>> Subject: [PATCH] aarch64: Use GCC vector extensions for FP ml[as]_n
>> intrinsics
>>
>> Hi,
>>
>> As subject, this patch rewrites floating-point mla_n/mls_n intrinsics to use
>> a + b * c / a - b * c rather than inline assembly code, allowing for better
>> scheduling and optimization.
>>
>> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
>> issues.
>>
>> Ok for master?
>
> I'm quite keen to remove that ugly inline asm, but I'm a bit concerned about 
> the floating-point semantics now being affected by things like FP 
> contractions.
> The intrinsics are supposed to preserve the semantics of the instructions as 
> much as possible.
> Richard, does this mean we'll want to implement this using RTL builtins, like 
> for the integer ones?

It seems like a grey area in the spec.  E.g. vmlaq_f32 is described as:

RESULT[I] = a[i] + (b[i] * c[i]) for i = 0 to 3

which could be taken to mean that it behaves in the same way as the
C arithmetic would, and so should be subject to -ffp-contract.

At the moment, a separate vmulq_f32 and vaddq_f32 could be fused,
but that's arguably a bug, since the spec says that they should
behave like FMUL and FADD respectively.  So:

* At the moment, vmla_* is the only way of forcibly disabling fusing.

* -ffp-contract has different defaults between Clang and GCC,
  and the default GCC behaviour would be to contract the vmlas.

* It would be a change in behaviour from previous releases.

So I agree we should probably use builtins.

We'd need to be careful that we don't grow define_insns or RTL
optimisations that do their own fusing of separate MULTs and ADDs.
I think we should have new tests to make sure that we generate
separate FMULs and FADDs, if we don't already.
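
As a scalar illustration of why contraction is observable, the sketch below compares a multiply followed by an add (two roundings) with a single-rounding evaluation. The function names are hypothetical and this is not the intrinsic implementation. The "fused" variant is emulated in double precision: the double product of two floats is exact, but the double addition can round in general, so it only stands in for a true FMA on inputs where that sum is exact.

```c
/* Separate multiply and add: the product is rounded to float before the
   addition.  The volatile temporary keeps the compiler from contracting
   the two operations into one.  */
float
mla_separate (float a, float b, float c)
{
  volatile float product = b * c;
  return a + product;
}

/* Emulated fused multiply-add: evaluate in double and round to float once.
   Only an approximation of a real FMA for arbitrary inputs (see above).  */
float
mla_fused (float a, float b, float c)
{
  return (float) ((double) a + (double) b * (double) c);
}
```

With a = -1 and b = c = 1 + 2^-12, the exact product is 1 + 2^-11 + 2^-24; rounding it to float first drops the last term, so the separate form returns 2^-11 while the single-rounding form returns 2^-11 + 2^-24.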

Thanks,
Richard


RE: [PATCH] aarch64: Use RTL builtins for [su]mlal intrinsics

2021-01-27 Thread Kyrylo Tkachov via Gcc-patches



> -Original Message-
> From: Jonathan Wright 
> Sent: 27 January 2021 16:28
> To: gcc-patches@gcc.gnu.org
> Cc: Kyrylo Tkachov 
> Subject: [PATCH] aarch64: Use RTL builtins for [su]mlal intrinsics
> 
> Hi,
> 
> As subject, this patch rewrites [su]mlal Neon intrinsics to use RTL
> builtins rather than inline assembly code, allowing for better
> scheduling and optimization.
> 
> Regression tested and bootstrapped on aarch64-none-linux-gnu - no
> issues.
> 
> Ok for master?
> 

Ok.
Thanks,
Kyrill

> Thanks,
> Jonathan
> 
> ---
> 
> gcc/ChangeLog:
> 
> 2021-01-26  Jonathan Wright  
> 
> * config/aarch64/aarch64-simd-builtins.def: Add [su]mlal
> builtin generator macros.
> * config/aarch64/aarch64-simd.md (*aarch64_mlal):
> Rename to...
> (aarch64_mlal): This.
> * config/aarch64/arm_neon.h (vmlal_s8): Use RTL builtin
> instead of inline asm.
> (vmlal_s16): Likewise.
> (vmlal_s32): Likewise.
> (vmlal_u8): Likewise.
> (vmlal_u16): Likewise.
> (vmlal_u32): Likewise.



[Bug tree-optimization/60770] disappearing clobbers

2021-01-27 Thread orgads at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60770

Orgad Shaneh  changed:

   What|Removed |Added

 CC||orgads at gmail dot com

--- Comment #13 from Orgad Shaneh  ---
The case described in comment 1 doesn't issue a warning with GCC 10.

Looks like it's a different case than bug 60517.

[Bug c++/98295] [8/9/10/11 Regression] ICE in verify_ctor_sanity, at cp/constexpr.c:4312

2021-01-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98295

--- Comment #4 from Jakub Jelinek  ---
Still ICEs even when that other bug is fixed.

[Bug testsuite/98351] [11 regression] gcc.target/powerpc/sse-andnps-1.c and sse2-andnpd-1.c fail after r11-3308

2021-01-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98351

Jakub Jelinek  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 CC||jakub at gcc dot gnu.org
 Status|UNCONFIRMED |RESOLVED

--- Comment #2 from Jakub Jelinek  ---
Should be fixed with r11-6869-gd08677c11dc4b43cc8bab862d1c986563897ce3f and
r11-6871-g70ab52b8cafffedb05b55c68c847173ff80f2652 and
https://gcc.gnu.org/g:e80f1f6b7a339bce1db03567e497658ae32d135e  

commit r11-6917-ge80f1f6b7a339bce1db03567e497658ae32d135e   
Author: Jakub Jelinek 
Date:   Tue Jan 26 20:02:29 2021 +0100  

testsuite: Fix TBAA in sse*and*p[sd]*.c tests   

This patch drops the no-strict-aliasing hack in m128-check.h and instead
ensures the tests read objects with the right dynamic type. 
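
A minimal sketch of the copying approach, assuming GCC's vector extension; it is not the code from the testsuite change, and v4sf and v4sf_to_array are hypothetical names. Copying the bytes with memcpy gives the destination array its own float elements, so later reads go through the array's declared type.

```c
#include <string.h>

/* Hypothetical 4 x float vector type (GCC vector extension).  */
typedef float v4sf __attribute__ ((vector_size (16)));

/* Copy the vector into a plain float[4] instead of reading it through a
   float pointer cast, which is the kind of access TBAA may assume away.  */
void
v4sf_to_array (v4sf v, float out[4])
{
  memcpy (out, &v, sizeof v);
}
```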

2021-01-26  Jakub Jelinek 

* gcc.target/powerpc/m128-check.h (CHECK_EXP): Remove   
optimize ("no-strict-aliasing") attribute.  
* gcc.target/powerpc/sse-andnps-1.c (TEST): Copy e into float[4]
array to avoid violating TBAA.  
* gcc.target/powerpc/sse2-andpd-1.c (TEST): Copy e.d into double[2] 
array to avoid violating TBAA.  
* gcc.target/powerpc/sse-andps-1.c (TEST): Copy e.f into float[4]   
array to avoid violating TBAA.  
* gcc.target/powerpc/sse2-andnpd-1.c (TEST): Copy e into double[2]  
array to avoid violating TBAA.

[Bug testsuite/98349] [11 regression] gcc.target/powerpc/sse-movhps-1.c and sse-movlps.c fail after r11-3434

2021-01-27 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98349

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED
 CC||jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek  ---
Should be fixed by:
https://gcc.gnu.org/g:c63f091db89a56ae56b2bfa2ba4d9e956bd9693f  

commit r11-6879-gc63f091db89a56ae56b2bfa2ba4d9e956bd9693f   
Author: Jakub Jelinek 
Date:   Sat Jan 23 09:41:58 2021 +0100  

rs6000: Fix up __m64 typedef in mmintrin.h [PR97301]

The x86 __m64 type is defined as:   
/* The Intel API is flexible enough that we must allow aliasing with other  
   vector types, and their scalar components.  */   
typedef int __m64 __attribute__ ((__vector_size__ (8), __may_alias__)); 
and so matches the comment above it in that reads and stores through
pointers to __m64 can alias anything.   
But in the rs6000 headers that is the case only for __m128, but not __m64.  

The following patch adds that attribute, which fixes the
FAIL: gcc.target/powerpc/sse-movhps-1.c execution test  
FAIL: gcc.target/powerpc/sse-movlps-1.c execution test  
regressions that appeared when Honza improved ipa-modref.   
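
A minimal sketch of what the attribute permits, assuming GCC, 32-bit int and IEEE 754 single-precision float. The type v2si_any and the function store_float_bits are hypothetical and only mirror the shape of the __m64 typedef quoted above; the destination must be 8-byte aligned.

```c
/* Hypothetical 2 x int vector type that may alias any other type, in the
   same style as the x86 __m64 typedef.  */
typedef int v2si_any __attribute__ ((__vector_size__ (8), __may_alias__));

/* Store two 32-bit patterns over a pair of floats through the may_alias
   type.  Because of __may_alias__, the compiler must assume this store can
   modify the float objects, so later float reads see the new values.
   DST must be 8-byte aligned.  */
void
store_float_bits (float *dst, int first_bits, int second_bits)
{
  v2si_any *p = (v2si_any *) dst;
  *p = (v2si_any) { first_bits, second_bits };
}
```

Without __may_alias__ on the typedef, the same store would be a type-based aliasing violation, and an optimizer that relies on TBAA would be free to treat the floats as unchanged.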

2021-01-23  Jakub Jelinek 

PR testsuite/97301  
* config/rs6000/mmintrin.h (__m64): Add __may_alias__ attribute.

Re: [PATCH 2/2] Add simd testsuite

2021-01-27 Thread Jonathan Wakely via Gcc-patches

On 18/12/20 16:49 +0100, Matthias Kretz wrote:

Resending squashed patch after addressing Jonathan's comments.

From: Matthias Kretz 

Add a new check-simd target to the testsuite. The new target creates a
subdirectory, generates the necessary Makefiles, and spawns submakes to
build and run the tests. Running this testsuite with defaults on my
machine takes half of the time the dejagnu testsuite required to only
determine whether to run tests. Since the simd testsuite integrated in
dejagnu increased the time of the whole libstdc++ testsuite by ~100%
this approach is a compromise for speed while not sacrificing coverage
too much. Since the test driver is invoked individually per test
executable from a Makefile, make's jobserver (-j) trivially parallelizes
testing.

Testing different flags and with simulator (or remote execution) is
possible. E.g.
`make check-simd DRIVEROPTS=-q target_list="unix{-m64,-m32}{-march=sandybridge,-march=skylake-avx512}{,-ffast-math}"`
runs the testsuite 8 times in different subdirectories, using 8
different combinations of compiler flags, only outputs failing tests
(-q), and prints all summaries at the end. It skips most ABI tags by
default unless --run-expensive is passed to DRIVEROPTS or
GCC_TEST_RUN_EXPENSIVE is not empty.

To use a simulator, the CHECK_SIMD_CONFIG variable needs to point to a
shell script which calls `define_target   ` and
set target_list as needed. E.g.:
case "$target_triplet" in
x86_64-*)
 target_list="unix{-march=sandybridge,-march=skylake-avx512}"
 ;;
powerpc64le-*)
 define_target power8 "-static -mcpu=power8" "/usr/bin/qemu-ppc64le -cpu power8"
 define_target power9 -mcpu=power9 "$HOME/bin/run_on_gcc135"
 target_list="power8 power9{,-ffast-math}"
 ;;
esac

libstdc++-v3/ChangeLog:
* scripts/check_simd: New file. This script is called from the
check-simd target. It determines a set of compiler flags and
simulator setups for calling generate_makefile.sh and passes the
information back to the check-simd target, which recurses to the
generated Makefiles.
* scripts/create_testsuite_files: Remove files below simd/tests/
from testsuite_files and place them in testsuite_files_simd.
* testsuite/Makefile.am: Add testsuite_files_simd. Add
check-simd target.
* testsuite/Makefile.in: Regenerate.
* testsuite/experimental/simd/driver.sh: New file. This script
compiles and runs a given simd test, logging its output and
status. It uses the timeout command to implement compile and
test timeouts.
* testsuite/experimental/simd/generate_makefile.sh: New file.
This script generates a Makefile which uses driver.sh to compile
and run the tests and collect the logs into a single log file.
* testsuite/experimental/simd/tests/abs.cc: New file. Tests
abs(simd).
* testsuite/experimental/simd/tests/algorithms.cc: New file.
Tests min/max(simd, simd).
* testsuite/experimental/simd/tests/bits/conversions.h: New
file. Contains functions to support tests involving conversions.
* testsuite/experimental/simd/tests/bits/make_vec.h: New file.
Support functions make_mask and make_vec.
* testsuite/experimental/simd/tests/bits/mathreference.h: New
file. Support functions to supply precomputed math function
reference data.
* testsuite/experimental/simd/tests/bits/metahelpers.h: New
file. Support code for SFINAE testing.
* testsuite/experimental/simd/tests/bits/simd_view.h: New file.
* testsuite/experimental/simd/tests/bits/test_values.h: New
file. Test functions to easily drive a test with simd objects
initialized from a given list of values and a range of random
values.
* testsuite/experimental/simd/tests/bits/ulp.h: New file.
Support code to determine the ULP distance of simd objects.
* testsuite/experimental/simd/tests/bits/verify.h: New file.
Test framework for COMPARE'ing simd objects and instantiating
the test templates with value_type and ABI tag.
* testsuite/experimental/simd/tests/broadcast.cc: New file. Test
simd broadcasts.
* testsuite/experimental/simd/tests/casts.cc: New file. Test
simd casts.
* testsuite/experimental/simd/tests/fpclassify.cc: New file.
Test floating-point classification functions.
* testsuite/experimental/simd/tests/frexp.cc: New file. Test
frexp(simd).
* testsuite/experimental/simd/tests/generator.cc: New file. Test
simd generator constructor.
* testsuite/experimental/simd/tests/hypot3_fma.cc: New file.
Test 3-arg hypot(simd,simd,simd) and fma(simd,simd,simd).
* testsuite/experimental/simd/tests/integer_operators.cc: New
file. Test integer operators.
* testsuite/experimental/simd/tests/ldexp_scalbn_scalbln_modf.cc:
New 

Re: [PATCH] rtl-optimization/80960 - avoid creating garbage RTL in DSE

2021-01-27 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 27, 2021 at 05:37:54PM +0100, Richard Biener wrote:
> Sure, more micro-optimizing is possible, including passing a flag
> to canon_true_dependence whether the addr RTX already had get_addr
> called on it.  And pass in the offset as poly-rtx-int and make
> get_addr apply it if not zero.  But I've mostly tried to address
> the non-linearity here, after the patch the number of get_addr
> and plus_constant calls should be linear in the number of loads
> rather than O (#loads * #stores).
> 
> I've also tried to find the most minimalistic change at this point
> (so it could be eventually backported).

Ok.  I'll gather stats incrementally and see if it is worth doing something
further later.

Jakub
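
The complexity point in the quoted text can be illustrated with a small sketch. It is not GCC code: canon_addr is a hypothetical stand-in for an expensive address canonicalization such as the get_addr and plus_constant calls mentioned above, and both loop functions are invented for this example. The only thing shown is how hoisting the per-load work out of the per-store loop changes the call count from #loads * #stores to #loads.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how often the (hypothetical) expensive helper runs.  */
static size_t canon_calls;

/* Hypothetical stand-in for an expensive canonicalization step;
   here it just rounds the address down to a multiple of 8.  */
static unsigned long
canon_addr (unsigned long addr)
{
  canon_calls++;
  return addr & ~7UL;
}

/* Naive variant: the load address is canonicalized again for every
   store it is compared against, so calls = n_loads * n_stores.  */
static size_t
conflicts_naive (const unsigned long *loads, size_t n_loads,
                 const unsigned long *stores, size_t n_stores)
{
  size_t hits = 0;
  for (size_t i = 0; i < n_loads; i++)
    for (size_t j = 0; j < n_stores; j++)
      if (canon_addr (loads[i]) == stores[j])
        hits++;
  return hits;
}

/* Hoisted variant: canonicalize once per load and reuse the result,
   so calls = n_loads while the comparisons stay the same.  */
static size_t
conflicts_hoisted (const unsigned long *loads, size_t n_loads,
                   const unsigned long *stores, size_t n_stores)
{
  size_t hits = 0;
  for (size_t i = 0; i < n_loads; i++)
    {
      unsigned long addr = canon_addr (loads[i]);
      for (size_t j = 0; j < n_stores; j++)
        if (addr == stores[j])
          hits++;
    }
  return hits;
}
```

Both variants report the same conflicts; only the number of canon_addr calls differs.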


