date:20220115

[r12-6606 Regression] FAIL: g++.dg/asan/asan_test.C -O2 (test for excess errors) on Linux/x86_64

2022-01-15 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

9d6a0f388eb048f8d87f47af78f07b5ce513bfe6 is the first bad commit
commit 9d6a0f388eb048f8d87f47af78f07b5ce513bfe6
Author: Martin Sebor 
Date:   Sat Jan 15 16:41:40 2022 -0700

Add -Wdangling-pointer [PR63272].

caused

FAIL: gcc.dg/torture/pr57147-2.c   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: g++.dg/asan/asan_test.C   -O2  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6606/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr57147-2.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr57147-2.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr57147-2.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=gcc.dg/torture/pr57147-2.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="asan.exp=g++.dg/asan/asan_test.C --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="asan.exp=g++.dg/asan/asan_test.C --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="asan.exp=g++.dg/asan/asan_test.C --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="asan.exp=g++.dg/asan/asan_test.C --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-15 Thread Hongtao Liu via Gcc-patches

On Sun, Jan 16, 2022 at 12:44 AM Uros Bizjak via Gcc-patches
 wrote:
>
> On Sat, Jan 15, 2022 at 5:39 PM Hongyu Wang  wrote:
> >
> > Thanks for the suggestion, here is the updated patch that survived
> > bootstrap/regtest.
>
> LGTM for me, but please get the final approval from Hongtao.
>
Ok, thanks.
> Thanks,
> Uros.
>
> > > Please note reg_mentioned_p in the above condition. This function
> > > returns nonzero if register op0 appears somewhere within op1 and is
> > > critical for the correct operation of your patch.
> > I added reg_mentioned_p for all insns except fp16 complex mult, since
> > they have constraint & to the dest so it must be allocated different
> > register from src.
> >
> > Uros Bizjak  于2022年1月14日周五 23:49写道：
> >
> >
> > >
> > > On Fri, Jan 14, 2022 at 2:44 PM Hongyu Wang  
> > > wrote:
> > > >
> > > > > Are there any technical obstacles to introduce subst to
> > > > > define_{,insn_and_}split?
> > > >
> > > > gccint says: define_subst can be used only in define_insn and
> > > > define_expand, it cannot be used in other expressions (e.g. in
> > > > define_insn_and_split).
> > >
> > > Hm, hm ... annoying ...
> > >
> > > > I have no idea how to implement it in current infrastructure.
> > > >
> > > > > In the proposed patch, if the output register is also mentioned in the
> > > > > input, then clearing before insn will clear the value in the input
> > > > > register. The solution in the i386.md also takes care of this issue.
> > > > >
> > > >
> > > > For this, I think we can add REGNO checks for operands in condition
> > > > (which means there is true dependency).
> > >
> > > Let's go in your direction, considering the limitations of current
> > > infrastructure.
> > > +{
> > > +  if (TARGET_DEST_FALSE_DEPENDENCY
> > > +  && get_attr_dest_false_dep (insn) ==
> > > +DEST_FALSE_DEP_TRUE)
> > > +output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands);
> > > +  return "vgetmant\t{%2, %1,
> > > %0|%0, %1, %2}";
> > > +}
> > >
> > > There is no need to pass the information via attributes. IMO, you
> > > shoud use subst attribute directly in the condition:
> > >
> > > {
> > >   if (TARGET_DEST_FALSE_DEPENDENCY
> > >   && 
> > >   && !reg_mentioned_p (operands[0], operands[1]))
> > >output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands);
> > >   return ".";
> > > }
> > >
> > > Assume the above works, so please:
> > >
> > > - rename TARGET_DEST_FALSE_DEPENDENCY to something less generic, maybe
> > > following existing BMI example with TARGET_AVOID_FALSE_DEP_FOR_AVX512F
> > > - rename "mask3_dest_false_dep_attr" to "mask3_false_dep_for_avx512f_cond"
> > >
> > > Please note reg_mentioned_p in the above condition. This function
> > > returns nonzero if register op0 appears somewhere within op1 and is
> > > critical for the correct operation of your patch.
> > >
> > > Uros.



-- 
BR,
Hongtao

[PATCH] Strengthen memory memory order for atomic::wait/notify

2022-01-15 Thread Thomas Rodgers via Gcc-patches

This patch updates the memory order of atomic accesses to the waiter's
count to match libc++'s usage. It should be backported to GCC11.

Tested x86_64-pc-linux-gnu.
From f5ed7674f86283db4f4ff49a2cc65d4f852413a1 Mon Sep 17 00:00:00 2001
From: Thomas Rodgers 
Date: Sat, 15 Jan 2022 17:40:49 -0800
Subject: [PATCH] Strengthen memory memory order for atomic::wait/notify

This matches the memory order in libc++.

libstdc++-v3/ChangeLog:
	* libstdc++-v3/include/bits/atomic_wait.h: Change memory order
	from Acquire/Release with relaxed loads to SeqCst+Release for
	accesses to the waiter's count.
---
 libstdc++-v3/include/bits/atomic_wait.h | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/libstdc++-v3/include/bits/atomic_wait.h b/libstdc++-v3/include/bits/atomic_wait.h
index 05cf0013d2a..d7de0d7eb9e 100644
--- a/libstdc++-v3/include/bits/atomic_wait.h
+++ b/libstdc++-v3/include/bits/atomic_wait.h
@@ -209,18 +209,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   void
   _M_enter_wait() noexcept
-  { __atomic_fetch_add(&_M_wait, 1, __ATOMIC_ACQ_REL); }
+  { __atomic_fetch_add(&_M_wait, 1, __ATOMIC_SEQ_CST); }
 
   void
   _M_leave_wait() noexcept
-  { __atomic_fetch_sub(&_M_wait, 1, __ATOMIC_ACQ_REL); }
+  { __atomic_fetch_sub(&_M_wait, 1, __ATOMIC_RELEASE); }
 
   bool
   _M_waiting() const noexcept
   {
 	__platform_wait_t __res;
-	__atomic_load(&_M_wait, &__res, __ATOMIC_ACQUIRE);
-	return __res > 0;
+	__atomic_load(&_M_wait, &__res, __ATOMIC_SEQ_CST);
+	return __res != 0;
   }
 
   void
@@ -258,7 +258,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	__platform_wait(__addr, __old);
 #else
 	__platform_wait_t __val;
-	__atomic_load(__addr, &__val, __ATOMIC_RELAXED);
+	__atomic_load(__addr, &__val, __ATOMIC_SEQ_CST);
 	if (__val == __old)
 	  {
 	lock_guard __l(_M_mtx);
@@ -309,7 +309,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 	{
 	  if (_M_laundered())
 	{
-	  __atomic_fetch_add(_M_addr, 1, __ATOMIC_ACQ_REL);
+	  __atomic_fetch_add(_M_addr, 1, __ATOMIC_SEQ_CST);
 	  __all = true;
 	}
 	  _M_w._M_notify(_M_addr, __all, __bare);
-- 
2.31.1

[PATCHv3] libiberty rust-demangle, ignore .suffix

2022-01-15 Thread Mark Wielaard

Rust symbols can have a .suffix because of compiler transformations.
These can be ignored in the demangled name. Which is what this patch
implements. By stopping at the first dot for v0 symbols and searching
backwards to the ending 'E' for legacy symbols.

An alternative implementation could be to follow what C++ does and
represent these as [clone .suffix] tagged onto the demangled name.
But this seems somewhat confusing since it results in a demangled
name that cannot be mangled again. And it would mean trying to
decode compiler internal naming.

https://bugs.kde.org/show_bug.cgi?id=445916
https://github.com/rust-lang/rust/issues/60705

libiberty/Changelog

* rust-demangle.c (rust_demangle_callback): Ignore everything
after '.' char in sym for v0. For legacy symbols search
backwards to find the last 'E' before any '.'.
* testsuite/rust-demangle-expected: Add new .suffix testcases.
---
 libiberty/rust-demangle.c  | 21 ++---
 libiberty/testsuite/rust-demangle-expected | 26 ++
 2 files changed, 44 insertions(+), 3 deletions(-)

V3 - Add more testcases
   - Allow @ in legacy symbols (which can appear in the .suffix)

diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c
index 18c760491bdc..42c88161da30 100644
--- a/libiberty/rust-demangle.c
+++ b/libiberty/rust-demangle.c
@@ -1340,13 +1340,19 @@ rust_demangle_callback (const char *mangled, int 
options,
   /* Rust symbols (v0) use only [_0-9a-zA-Z] characters. */
   for (p = rdm.sym; *p; p++)
 {
+  /* Rust v0 symbols can have '.' suffixes, ignore those.  */
+  if (rdm.version == 0 && *p == '.')
+break;
+
   rdm.sym_len++;
 
   if (*p == '_' || ISALNUM (*p))
 continue;
 
-  /* Legacy Rust symbols can also contain [.:$] characters. */
-  if (rdm.version == -1 && (*p == '$' || *p == '.' || *p == ':'))
+  /* Legacy Rust symbols can also contain [.:$] characters.
+ Or @ in the .suffix (which will be skipped, see below). */
+  if (rdm.version == -1 && (*p == '$' || *p == '.' || *p == ':'
+|| *p == '@'))
 continue;
 
   return 0;
@@ -1355,7 +1361,16 @@ rust_demangle_callback (const char *mangled, int options,
   /* Legacy Rust symbols need to be handled separately. */
   if (rdm.version == -1)
 {
-  /* Legacy Rust symbols always end with E. */
+  /* Legacy Rust symbols always end with E.  But can be followed by a
+ .suffix (which we want to ignore).  */
+  int dot_suffix = 1;
+  while (rdm.sym_len > 0 &&
+ !(dot_suffix && rdm.sym[rdm.sym_len - 1] == 'E'))
+{
+  dot_suffix = rdm.sym[rdm.sym_len - 1] == '.';
+  rdm.sym_len--;
+}
+
   if (!(rdm.sym_len > 0 && rdm.sym[rdm.sym_len - 1] == 'E'))
 return 0;
   rdm.sym_len--;
diff --git a/libiberty/testsuite/rust-demangle-expected 
b/libiberty/testsuite/rust-demangle-expected
index 7dca315d0054..b565084cfefa 100644
--- a/libiberty/testsuite/rust-demangle-expected
+++ b/libiberty/testsuite/rust-demangle-expected
@@ -295,3 +295,29 @@ _RMCs4fqI2P2rA04_13const_genericINtB0_4CharKc2202_E
 --format=auto
 _RNvNvMCs4fqI2P2rA04_13const_genericINtB4_3FooKpE3foo3FOO
 >::foo::FOO
+#
+# Suffixes
+#
+--format=rust
+_RNvMs0_NtCs5l0EXMQXRMU_21rustc_data_structures17obligation_forestINtB5_16ObligationForestNtNtNtCsdozMG8X9FIu_21rustc_trait_selection6traits7fulfill26PendingPredicateObligationE22register_obligation_atB1v_.llvm.8517020237817239694
+>::register_obligation_at
+--format=rust
+_ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17h27f14859c664490dE.llvm.8091179795805947855
+core::ptr::drop_in_place::{{closure}}>
+# old style rustc llvm thinlto
+--format=rust
+_ZN9backtrace3foo17hbb467fcdaea5d79bE.llvm.A5310EB9
+backtrace::foo
+--format=rust
+_ZN9backtrace3foo17hbb467fcdaea5d79bE.llvm.A5310EB9@@16
+backtrace::foo
+# new style rustc llvm thinlto
+--format=rust
+_RC3foo.llvm.9D1C9369
+foo
+--format=rust
+_RC3foo.llvm.9D1C9369@@16
+foo
+--format=rust
+_RNvC9backtrace3foo.llvm.A5310EB9
+backtrace::foo
-- 
2.30.2

Re: gcc/configure: out of date

2022-01-15 Thread Martin Sebor via Gcc-patches


Martin, I've looked into removing the -Wno-error for this warning
for just a subset of targets.  It seems doable with some hardcoding
in configure.ac but if you're planning to do the cleanup for all of
them I'm wondering if we should even bother.  What do you think?

Martin

On 1/14/22 08:46, Martin Liška wrote:

Hello.

I noticed that when I run:
ACLOCAL=~/bin/automake-1.15.1/bin/aclocal  
AUTOMAKE=~/bin/automake-1.15.1/bin/automake autoconf


in gcc subfolder I get the following diff:

diff --git a/gcc/configure b/gcc/configure
index d19059e13cc..ff570f73ef5 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -5352,7 +5352,26 @@ else
    GDC="$ac_cv_prog_GDC"
  fi

+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether the D 
compiler works" >&5

+$as_echo_n "checking whether the D compiler works... " >&6; }
+if ${acx_cv_d_compiler_works+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat >conftest.d <&1 || echo failure`
+  if test x"$errors" = x && test -f conftest.$ac_objext; then
+    acx_cv_d_compiler_works=yes
+  fi
+  rm -f conftest.*
+fi
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: 
$acx_cv_d_compiler_works" >&5

+$as_echo "$acx_cv_d_compiler_works" >&6; }
+if test "x$GDC" != xno && test x$acx_cv_d_compiler_works != xno; then
    have_gdc=yes
  else
    have_gdc=no
@@ -19640,7 +19659,7 @@ else
    lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
    lt_status=$lt_dlunknown
    cat > conftest.$ac_ext <<_LT_EOF
-#line 19643 "configure"
+#line 19662 "configure"
  #include "confdefs.h"

  #if HAVE_DLFCN_H
@@ -19746,7 +19765,7 @@ else
    lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
    lt_status=$lt_dlunknown
    cat > conftest.$ac_ext <<_LT_EOF
-#line 19749 "configure"
+#line 19768 "configure"
  #include "confdefs.h"

  #if HAVE_DLFCN_H

Am I correct that somebody forgot to re-generate the file?

Martin

Re: [PATCH] libiberty rust-demangle, ignore .suffix

2022-01-15 Thread Mark Wielaard

Hi Eddy,

On Mon, Dec 20, 2021 at 01:50:52PM +0200, Eduard-Mihai Burtescu wrote:
> Apologies for the delay, the email fell through the cracks somehow.

And then I went on vacation... Sorry this fairly simple patch takes so
long.

> The updated patch looks like it would work alright, only needs a couple 
> tests, e.g.:
> https://github.com/rust-lang/rustc-demangle/blob/2811a1ad6f7c8bead2ef3671e4fdc10de1553e96/src/lib.rs#L413-L422
> https://github.com/rust-lang/rustc-demangle/blob/2811a1ad6f7c8bead2ef3671e4fdc10de1553e96/src/v0.rs#L1442-L1444

Thanks. I have added those testcases and noticed we need a tweak for
having an @ in the suffix of a legacy symbol. I'll sent the updated
patch. Hope it can still be pushed even though stage 3 is closing
soon.

Cheers,

Mark

Re: [PATCH v2 1/2] add -Wuse-after-free

2022-01-15 Thread Martin Sebor via Gcc-patches


On 1/11/22 15:40, Jason Merrill wrote:

On 11/30/21 17:32, Martin Sebor via Gcc-patches wrote:

Attached is a revised patch with the following changes based
on your comments:

1) Set and use statement uids to determine which statement
    precedes which in the same basic block.
2) Avoid testing flag_isolate_erroneous_paths_dereference.
3) Use post-dominance to decide whether to use the "maybe"
    phrasing vs a definite form.

David raised (and in our offline discussion today reiterated)
an objection to the default setting of the option being
the strictest.  I have not changed that in this revision.
See my rationale for this choice in my reply below:
https://gcc.gnu.org/pipermail/gcc-patches/2021-November/583176.html


In the latest C2x draft I see in the list of undefined behavior

"The value of a pointer that refers to space deallocated by a call to 
the free or realloc function is used (7.22.3)."


So the case that would be technically undefined would be comparing the 
reallocated pointer to the old pointer which has been deallocated.


The C++ draft is more nuanced: it says, "When the end of the duration of 
a region of storage is reached, the values of all pointers representing 
the address of any part of that region of storage become invalid pointer 
values (6.8.3).  Indirection through an invalid pointer value and 
passing an invalid pointer value to a deallocation function have 
undefined behavior.  Any other use of an invalid pointer value has 
implementation-defined behavior."


So the case above is implementation-defined in C++, not undefined.

Let's put =2 in -Wall for now.

With that change, this and the -Wdangling-pointer patch are OK on Friday 
afternoon if there are no other comments before then.

I've adjusted both patches and pushed r12-6605 and r12-6606 after
retesting with a few packages.  In -Wdangling-pointer I reworded
the warning to refer to "unnamed temporary" instead of "compound
literal" since it triggers for those as well.

While rebuilding LLVM and a few its projects with the patches
I noticed the -Wdangling-pointer change exposes what seems like
a Ranger bug for one translation unit (it enters an infinite loop):
I opened pr104038 for it.

Martin




On 11/23/21 2:16 PM, Martin Sebor wrote:

On 11/22/21 6:32 PM, Jeff Law wrote:



On 11/1/2021 4:17 PM, Martin Sebor via Gcc-patches wrote:

Patch 1 in the series detects a small subset of uses of pointers
made indeterminate by calls to deallocation functions like free
or C++ operator delete.  To control the conditions the warnings
are issued under the new -Wuse-after-free= option provides three
levels.  At the lowest level the warning triggers only for
unconditional uses of freed pointers and doesn't warn for uses
in equality expressions.  Level 2 warns also for come conditional
uses, and level 3 also for uses in equality expressions.

I debated whether to make level 2 or 3 the default included in
-Wall.  I decided on 3 for two reasons: 1) to raise awareness
of both the problem and GCC's new ability to detect it: using
a pointer after it's been freed, even only in principle, by
a successful call to realloc, is undefined, and 2) because
it's trivial to lower the level either globally, or locally
by suppressing the warning around such misuses.

I've tested the patch on x86_64-linux and by building Glibc
and Binutils/GDB.  It triggers a number of times in each, all
due to comparing invalidated pointers for equality (i.e., level
3).  I have suppressed these in GCC (libiberty) by a #pragma,
and will see how the Glibc folks want to deal with theirs (I
track them in BZ #28521).

The tests contain a number of xfails due to limitations I'm
aware of.  I marked them pr?? until the patch is approved.
I will open bugs for them before committing if I don't resolve
them in a followup.

Martin

gcc-63272-1.diff

Add -Wuse-after-free.

gcc/c-family/ChangeLog

* c.opt (-Wuse-after-free): New options.

gcc/ChangeLog:

* diagnostic-spec.c (nowarn_spec_t::nowarn_spec_t): Handle
OPT_Wreturn_local_addr and OPT_Wuse_after_free_.
* diagnostic-spec.h (NW_DANGLING): New enumerator.
* doc/invoke.texi (-Wuse-after-free): Document new option.
* gimple-ssa-warn-access.cc (pass_waccess::check_call): Rename...
(pass_waccess::check_call_access): ...to this.
(pass_waccess::check): Rename...
(pass_waccess::check_block): ...to this.
(pass_waccess::check_pointer_uses): New function.
(pass_waccess::gimple_call_return_arg): New function.
(pass_waccess::warn_invalid_pointer): New function.
(pass_waccess::check_builtin): Handle free and realloc.
(gimple_use_after_inval_p): New function.
(get_realloc_lhs): New function.
(maybe_warn_mismatched_realloc): New function.
(pointers_related_p): New function.
(pass_waccess::check_call): Call check_pointer_uses.
(pass_waccess::execute): Compute and free dominance info.

libcpp/ChangeLog:

* files.c (_cpp_find_file): Substitute a valid pointer for

Re: [PATCH] PR fortran/83079 - ICE and wrong code with TRANSFER and character(kind=4)

2022-01-15 Thread Thomas Koenig via Gcc-patches




Hi Harald,


An early *ping* ...


OK.  Thanks for the patch!

Best regards

Thomas

Re: [PATCH] PR fortran/83079 - ICE and wrong code with TRANSFER and character(kind=4)

2022-01-15 Thread Harald Anlauf via Gcc-patches


An early *ping* ...

Am 11.01.22 um 22:17 schrieb Harald Anlauf via Fortran:

Dear Fortranners,

when digging into the issue pointed out in the PR by Gerhard it turned
out that there were several issues with the TRANSFER intrinsics in the
case MOLD was CHARACTER(kind=4).  Default CHARACTER was fine, though.

- the size of the result was wrongly calculated
- the string length used in temporaries was wrong
- the result characteristics was wrong

Fortunately, the few fixes were very local and needed only fix-ups of
the respective computations.

Since the details of which issue would show up seemed to depend on the
properties of a the arguments to TRANSFER, I wrote an extended testcase
which is a "hull" of what I used in debugging.

(The testcase was and can be cross-checked with the NAG compiler.)

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Thanks,
Harald

[PATCH] i386: Improve and optimize ix86_expand_sse_movcc

2022-01-15 Thread Uros Bizjak via Gcc-patches

Modernize ix86_expand_sse_movcc to use expand_simple_{unop,binop}
infrastructure to avoid manual twiddling with output registers.
Also fix a couple of inconsistent vector_all_ones_operand usages,
break a couple of unnecessary else-if chains, eliminate common
subexpressions and do some general code simplifications.

2022-01-15  Uroš Bizjak  

gcc/ChangeLog:

* config/i386/i386-expand.c (ix86_expand_sse_movcc): Use
expand_simple_unop and expand_simple_binop instead of manually
constructing NOT, AND and IOR RTXes.  Use vector_all_ones_operand
consistently.  Eliminate common subexpressions and simplify code.
* config/i386/sse.md (3): New expander.
(3): Make public.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index c740d6e5c04..138580da96e 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -3781,6 +3781,7 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
 {
   machine_mode mode = GET_MODE (dest);
   machine_mode cmpmode = GET_MODE (cmp);
+  rtx x;
 
   /* Simplify trivial VEC_COND_EXPR to avoid ICE in pr97506.  */
   if (rtx_equal_p (op_true, op_false))
@@ -3789,8 +3790,6 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
   return;
 }
 
-  rtx t2, t3, x;
-
   /* If we have an integer mask and FP value then we need
  to cast mask to FP mode.  */
   if (mode != cmpmode && VECTOR_MODE_P (cmpmode))
@@ -3813,12 +3812,14 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
  ? force_reg (mode, op_false) : op_false);
   if (op_true == CONST0_RTX (mode))
{
- rtx n = gen_reg_rtx (cmpmode);
  if (cmpmode == E_DImode && !TARGET_64BIT)
-   emit_insn (gen_knotdi (n, cmp));
+   {
+ x = gen_reg_rtx (cmpmode);
+ emit_insn (gen_knotdi (x, cmp));
+   }
  else
-   emit_insn (gen_rtx_SET (n, gen_rtx_fmt_e (NOT, cmpmode, cmp)));
- cmp = n;
+   x = expand_simple_unop (cmpmode, NOT, cmp, NULL, 1);
+ cmp = x;
  /* Reverse op_true op_false.  */
  std::swap (op_true, op_false);
}
@@ -3826,22 +3827,24 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
   if (mode == HFmode)
emit_insn (gen_movhf_mask (dest, op_true, op_false, cmp));
   else
-   {
- rtx vec_merge = gen_rtx_VEC_MERGE (mode, op_true, op_false, cmp);
- emit_insn (gen_rtx_SET (dest, vec_merge));
-   }
+   emit_insn (gen_rtx_SET (dest,
+   gen_rtx_VEC_MERGE (mode,
+  op_true, op_false, cmp)));
   return;
 }
-  else if (vector_all_ones_operand (op_true, mode)
-  && op_false == CONST0_RTX (mode))
+
+  if (vector_all_ones_operand (op_true, mode)
+  && op_false == CONST0_RTX (mode))
 {
-  emit_insn (gen_rtx_SET (dest, cmp));
+  emit_move_insn (dest, cmp);
   return;
 }
   else if (op_false == CONST0_RTX (mode))
 {
-  op_true = force_reg (mode, op_true);
-  ix86_emit_vec_binop (AND, mode, dest, cmp, op_true);
+  x = expand_simple_binop (mode, AND, cmp, op_true,
+  dest, 1, OPTAB_DIRECT);
+  if (x != dest)
+   emit_move_insn (dest, x);
   return;
 }
   else if (op_true == CONST0_RTX (mode))
@@ -3851,13 +3854,16 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
   ix86_emit_vec_binop (AND, mode, dest, x, op_false);
   return;
 }
-  else if (INTEGRAL_MODE_P (mode) && op_true == CONSTM1_RTX (mode))
+  else if (vector_all_ones_operand (op_true, mode))
 {
-  op_false = force_reg (mode, op_false);
-  ix86_emit_vec_binop (IOR, mode, dest, cmp, op_false);
+  x = expand_simple_binop (mode, IOR, cmp, op_false,
+  dest, 1, OPTAB_DIRECT);
+  if (x != dest)
+   emit_move_insn (dest, x);
   return;
 }
-  else if (TARGET_XOP)
+
+  if (TARGET_XOP)
 {
   op_true = force_reg (mode, op_true);
 
@@ -3865,16 +3871,17 @@ ix86_expand_sse_movcc (rtx dest, rtx cmp, rtx op_true, 
rtx op_false)
  || !nonimmediate_operand (op_false, mode))
op_false = force_reg (mode, op_false);
 
-  emit_insn (gen_rtx_SET (dest, gen_rtx_IF_THEN_ELSE (mode, cmp,
- op_true,
- op_false)));
+  emit_insn (gen_rtx_SET (dest,
+ gen_rtx_IF_THEN_ELSE (mode, cmp,
+   op_true, op_false)));
   return;
 }
 
   rtx (*gen) (rtx, rtx, rtx, rtx) = NULL;
-  rtx d = dest;
+  machine_mode blend_mode = mode;
 
-  if (!vector_operand (op_true, mode))
+  if (GET_MODE_SIZE (mode) < 16
+

[Patch][V5][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.

2022-01-15 Thread Qing Zhao via Gcc-patches

Hi, Richard,

This is the updated version for the change of "Enable -Wuninitialized + 
-ftrivial-auto-var-init for  address taken variables”.

Compared to the previous patch, I mainly made the following change:

Delete the 4th parameter of “warn_uninit”, construct the warning message string 
based on the value of OPT and VAR.

This patch has been bootstrapped and regressing tested on both X86 and aarch64.

Okay for GCC12?

thanks.

Qing.

==
Enable -Wuninitialized + -ftrivial-auto-var-init for address
 taken variables.

With -ftrivial-auto-var-init, the address taken auto variable is replaced with
a temporary variable during gimplification, and the original auto variable might
be eliminated by compiler optimization completely. As a result, the current
uninitialized warning analysis cannot get enough information from the IR,
therefore the uninitialized warnings for address taken variable cannot be
issued based on the current implemenation of -ftrival-auto-var-init.

For more info please refer to:
https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577431.html

In order to improve this situation, we can improve uninitialized analysis
for address taken auto variables with -ftrivial-auto-var-init as following:

for the following stmt:

_1 = .DEFERRED_INIT (4, 2, &"alt_reloc"[0]);
if (_1 != 0)

The original variable DECL has been eliminated from the IR, all the necessary
information that is needed for reporting the warnings for DECL can be acquired
from the call to .DEFERRED_INIT.

A. the name string of DECL from the 3rd parameter of the call;
B. the location of the DECL from the location of the call;
C. the call can also be used to hold the information on whether the warning
   has been issued or not to suppress warning messages when needed;

The current testing cases for uninitialized warnings + -ftrivial-auto-var-init
are adjusted to reflect the fact that we can issue warnings for address taken
variables.

gcc/ChangeLog:

2022-01-14  qing zhao  

* tree-ssa-uninit.c (warn_uninit): Delete the 4th parameter. Handle
.DEFERRED_INIT call with an anonymous SSA_NAME specially.
(check_defs): Handle .DEFERRED_INIT call with an anonymous SSA_NAME
specially.
(warn_uninit_phi_uses): Delete the 4th actual when call warn_uninit.
(warn_uninitialized_vars): Likewise.
(warn_uninitialized_phi): Likewise.

The complete patch is:



0001-Enable-Wuninitialized-ftrivial-auto-var-init-for-add.patch
Description: 0001-Enable-Wuninitialized-ftrivial-auto-var-init-for-add.patch

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-15 Thread Uros Bizjak via Gcc-patches

On Sat, Jan 15, 2022 at 5:39 PM Hongyu Wang  wrote:
>
> Thanks for the suggestion, here is the updated patch that survived
> bootstrap/regtest.

LGTM for me, but please get the final approval from Hongtao.

Thanks,
Uros.

> > Please note reg_mentioned_p in the above condition. This function
> > returns nonzero if register op0 appears somewhere within op1 and is
> > critical for the correct operation of your patch.
> I added reg_mentioned_p for all insns except fp16 complex mult, since
> they have constraint & to the dest so it must be allocated different
> register from src.
>
> Uros Bizjak  于2022年1月14日周五 23:49写道：
>
>
> >
> > On Fri, Jan 14, 2022 at 2:44 PM Hongyu Wang  wrote:
> > >
> > > > Are there any technical obstacles to introduce subst to
> > > > define_{,insn_and_}split?
> > >
> > > gccint says: define_subst can be used only in define_insn and
> > > define_expand, it cannot be used in other expressions (e.g. in
> > > define_insn_and_split).
> >
> > Hm, hm ... annoying ...
> >
> > > I have no idea how to implement it in current infrastructure.
> > >
> > > > In the proposed patch, if the output register is also mentioned in the
> > > > input, then clearing before insn will clear the value in the input
> > > > register. The solution in the i386.md also takes care of this issue.
> > > >
> > >
> > > For this, I think we can add REGNO checks for operands in condition
> > > (which means there is true dependency).
> >
> > Let's go in your direction, considering the limitations of current
> > infrastructure.
> > +{
> > +  if (TARGET_DEST_FALSE_DEPENDENCY
> > +  && get_attr_dest_false_dep (insn) ==
> > +DEST_FALSE_DEP_TRUE)
> > +output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands);
> > +  return "vgetmant\t{%2, %1,
> > %0|%0, %1, %2}";
> > +}
> >
> > There is no need to pass the information via attributes. IMO, you
> > shoud use subst attribute directly in the condition:
> >
> > {
> >   if (TARGET_DEST_FALSE_DEPENDENCY
> >   && 
> >   && !reg_mentioned_p (operands[0], operands[1]))
> >output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands);
> >   return ".";
> > }
> >
> > Assume the above works, so please:
> >
> > - rename TARGET_DEST_FALSE_DEPENDENCY to something less generic, maybe
> > following existing BMI example with TARGET_AVOID_FALSE_DEP_FOR_AVX512F
> > - rename "mask3_dest_false_dep_attr" to "mask3_false_dep_for_avx512f_cond"
> >
> > Please note reg_mentioned_p in the above condition. This function
> > returns nonzero if register op0 appears somewhere within op1 and is
> > critical for the correct operation of your patch.
> >
> > Uros.

Re: [PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]

2022-01-15 Thread Uros Bizjak via Gcc-patches

On Sat, Jan 15, 2022 at 12:23 PM Jakub Jelinek  wrote:
>
> On Sat, Jan 15, 2022 at 11:42:55AM +0100, Uros Bizjak wrote:
> > Yes, that would be nice. XFmode is used for long double, and not obsolete.
>
> Ok, that seems to work.  Compared to the incremental patch I've posted, I
> also had to add handling of the case where we have just
> x == y ? 0 : x < y ? -1 : 1 (both for -ffast-math and non-ffast-math).
> Apparently even that is worth optimizing.
> Tested so far on the new testcases, will run full bootstrap/regtest tonight.
>
> > > Why?  That seems to be a waste of time to me, unless something uses them
> > > already during expansion.  Because pass_expand::execute
> > > runs:
> > >   /* We need JUMP_LABEL be set in order to redirect jumps, and hence
> > >  split edges which edge insertions might do.  */
> > >   rebuild_jump_labels (get_insns ());
> > > which resets all LABEL_NUSES to 0 (well, to:
> > >   if (LABEL_P (insn))
> > > LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0);
> > > and then recomputes them and adds JUMP_LABEL if needed:
> > >   JUMP_LABEL (insn) = label;
> >
> > I was not aware of that detail. Thanks for sharing (and I wonder if
> > all other cases should be removed from the source).
>
> I guess it depends, for code that can only be called during the expand pass
> dropping it should be just fine, for code that can be called also (or only)
> later I think adding JUMP_LABEL and correct LABEL_NUSES is needed because
> nothing will fix it up afterwards.
>
> 2022-01-15  Jakub Jelinek  
>
> PR target/103973
> * tree-cfg.h (cond_only_block_p): Declare.
> * tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
> * tree-cfg.c (cond_only_block_p): ... here.  No longer static.
> * optabs.def (spaceship_optab): New optab.
> * internal-fn.def (SPACESHIP): New internal function.
> * internal-fn.h (expand_SPACESHIP): Declare.
> * internal-fn.c (expand_PHI): Formatting fix.
> (expand_SPACESHIP): New function.
> * tree-ssa-math-opts.c (optimize_spaceship): New function.
> (math_opts_dom_walker::after_dom_children): Use it.
> * config/i386/i386.md (spaceship3): New define_expand.
> * config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
> * config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
> * doc/md.texi (spaceship@var{m}3): Document.
>
> * gcc.target/i386/pr103973-1.c: New test.
> * gcc.target/i386/pr103973-2.c: New test.
> * gcc.target/i386/pr103973-3.c: New test.
> * gcc.target/i386/pr103973-4.c: New test.
> * gcc.target/i386/pr103973-5.c: New test.
> * gcc.target/i386/pr103973-6.c: New test.
> * gcc.target/i386/pr103973-7.c: New test.
> * gcc.target/i386/pr103973-8.c: New test.
> * gcc.target/i386/pr103973-9.c: New test.
> * gcc.target/i386/pr103973-10.c: New test.
> * gcc.target/i386/pr103973-11.c: New test.
> * gcc.target/i386/pr103973-12.c: New test.
> * gcc.target/i386/pr103973-13.c: New test.
> * gcc.target/i386/pr103973-14.c: New test.
> * gcc.target/i386/pr103973-15.c: New test.
> * gcc.target/i386/pr103973-16.c: New test.
> * gcc.target/i386/pr103973-17.c: New test.
> * gcc.target/i386/pr103973-18.c: New test.
> * gcc.target/i386/pr103973-19.c: New test.
> * gcc.target/i386/pr103973-20.c: New test.
> * g++.target/i386/pr103973-1.C: New test.
> * g++.target/i386/pr103973-2.C: New test.
> * g++.target/i386/pr103973-3.C: New test.
> * g++.target/i386/pr103973-4.C: New test.
> * g++.target/i386/pr103973-5.C: New test.
> * g++.target/i386/pr103973-6.C: New test.
> * g++.target/i386/pr103973-7.C: New test.
> * g++.target/i386/pr103973-8.C: New test.
> * g++.target/i386/pr103973-9.C: New test.
> * g++.target/i386/pr103973-10.C: New test.
> * g++.target/i386/pr103973-11.C: New test.
> * g++.target/i386/pr103973-12.C: New test.
> * g++.target/i386/pr103973-13.C: New test.
> * g++.target/i386/pr103973-14.C: New test.
> * g++.target/i386/pr103973-15.C: New test.
> * g++.target/i386/pr103973-16.C: New test.
> * g++.target/i386/pr103973-17.C: New test.
> * g++.target/i386/pr103973-18.C: New test.
> * g++.target/i386/pr103973-19.C: New test.
> * g++.target/i386/pr103973-20.C: New test.

OK (with a comment fix below) for the x86 part.

Thanks,
Uros.

> --- gcc/tree-cfg.h.jj   2022-01-14 23:57:44.491718086 +0100
> +++ gcc/tree-cfg.h  2022-01-15 09:51:25.359468982 +0100
> @@ -111,6 +111,7 @@ extern basic_block gimple_switch_label_b
>  extern basic_block gimple_switch_default_bb (function *, gswitch *);
>  extern edge gimple_switch_edge (function *, gswitch *, unsigned);
>  extern edge

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

2022-01-15 Thread Hongyu Wang via Gcc-patches

Thanks for the suggestion, here is the updated patch that survived
bootstrap/regtest.

> Please note reg_mentioned_p in the above condition. This function
> returns nonzero if register op0 appears somewhere within op1 and is
> critical for the correct operation of your patch.
I added reg_mentioned_p for all insns except fp16 complex mult, since
they have constraint & to the dest so it must be allocated different
register from src.

Uros Bizjak  于2022年1月14日周五 23:49写道：


>
> On Fri, Jan 14, 2022 at 2:44 PM Hongyu Wang  wrote:
> >
> > > Are there any technical obstacles to introduce subst to
> > > define_{,insn_and_}split?
> >
> > gccint says: define_subst can be used only in define_insn and
> > define_expand, it cannot be used in other expressions (e.g. in
> > define_insn_and_split).
>
> Hm, hm ... annoying ...
>
> > I have no idea how to implement it in current infrastructure.
> >
> > > In the proposed patch, if the output register is also mentioned in the
> > > input, then clearing before insn will clear the value in the input
> > > register. The solution in the i386.md also takes care of this issue.
> > >
> >
> > For this, I think we can add REGNO checks for operands in condition
> > (which means there is true dependency).
>
> Let's go in your direction, considering the limitations of current
> infrastructure.
> +{
> +  if (TARGET_DEST_FALSE_DEPENDENCY
> +  && get_attr_dest_false_dep (insn) ==
> +DEST_FALSE_DEP_TRUE)
> +output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands);
> +  return "vgetmant\t{%2, %1,
> %0|%0, %1, %2}";
> +}
>
> There is no need to pass the information via attributes. IMO, you
> shoud use subst attribute directly in the condition:
>
> {
>   if (TARGET_DEST_FALSE_DEPENDENCY
>   && 
>   && !reg_mentioned_p (operands[0], operands[1]))
>output_asm_insn ("vxorps\t{%x0, %x0, %x0}", operands);
>   return ".";
> }
>
> Assume the above works, so please:
>
> - rename TARGET_DEST_FALSE_DEPENDENCY to something less generic, maybe
> following existing BMI example with TARGET_AVOID_FALSE_DEP_FOR_AVX512F
> - rename "mask3_dest_false_dep_attr" to "mask3_false_dep_for_avx512f_cond"
>
> Please note reg_mentioned_p in the above condition. This function
> returns nonzero if register op0 appears somewhere within op1 and is
> critical for the correct operation of your patch.
>
> Uros.
From ac472527b653030c89c06021f38268441e64a502 Mon Sep 17 00:00:00 2001
From: wwwhhhyyy 
Date: Mon, 30 Aug 2021 16:41:41 +0800
Subject: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

For GoldenCove micro-architecture, force insert zero-idiom in asm
template to break false dependency of dest register for several insns.

The related insns are:

VPERM/D/Q/PS/PD
VRANGEPD/PS/SD/SS
VGETMANTSS/SD/SH
VGETMANDPS/PD - mem version only
VPMULLQ
VFMULCSH/PH
VFCMULCSH/PH

gcc/ChangeLog:

	* config/i386/i386.h (TARGET_DEST_FALSE_DEP_FOR_GLC): New macro.
	* config/i386/sse.md (__):
	Insert zero-idiom in output template when attr enabled, set new attribute to
	true for non-mask/maskz insn.
	(avx512fp16_sh_v8hf):
	Likewise.
	(avx512dq_mul3): Likewise.
	(_permvar): Likewise.
	(avx2_perm_1): Likewise.
	(avx512f_perm_1): Likewise.
	(avx512dq_rangep): Likewise.
	(avx512dq_ranges):
	Likewise.
	(_getmant): Likewise.
	(avx512f_vgetmant):
	Likewise.
	* config/i386/subst.md (mask3_dest_false_dep_for_glc_cond): New
	subst_attr.
	(mask4_dest_false_dep_for_glc_cond): Likewise.
	(mask6_dest_false_dep_for_glc_cond): Likewise.
	(mask10_dest_false_dep_for_glc_cond): Likewise.
	(maskc_dest_false_dep_for_glc_cond): Likewise.
	(mask_scalar4_dest_false_dep_for_glc_cond): Likewise.
	(mask_scalarc_dest_false_dep_for_glc_cond): Likewise.
	* config/i386/x86-tune.def (X86_TUNE_DEST_FALSE_DEP_FOR_GLC): New
	DEF_TUNE enabled for m_SAPPHIRERAPIDS and m_ALDERLAKE

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx2-dest-false-dep-for-glc.c: New test.
	* gcc.target/i386/avx512dq-dest-false-dep-for-glc.c: Ditto.
	* gcc.target/i386/avx512f-dest-false-dep-for-glc.c: Ditto.
	* gcc.target/i386/avx512fp16-dest-false-dep-for-glc.c: Ditto.
	* gcc.target/i386/avx512fp16vl-dest-false-dep-for-glc.c: Ditto.
	* gcc.target/i386/avx512vl-dest-false-dep-for-glc.c: Ditto.
---
 gcc/config/i386/i386.h|   2 +
 gcc/config/i386/sse.md|  75 +++--
 gcc/config/i386/subst.md  |   7 ++
 gcc/config/i386/x86-tune.def  |   6 +
 .../i386/avx2-dest-false-dep-for-glc.c|  24 
 .../i386/avx512dq-dest-false-dep-for-glc.c|  73 +
 .../i386/avx512f-dest-false-dep-for-glc.c | 103 ++
 .../i386/avx512fp16-dest-false-dep-for-glc.c  |  45 
 .../avx512fp16vl-dest-false-dep-for-glc.c |  24 
 .../i386/avx512vl-dest-false-dep-for-glc.c|  76 +
 10 files changed, 427 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/avx2-dest-false-dep-for-glc.c
 create mode 100644

Re: [PATCH] c++: ICE with noexcept and canonical types [PR101715]

2022-01-15 Thread Patrick Palka via Gcc-patches

On Fri, 14 Jan 2022, Marek Polacek via Gcc-patches wrote:

> This is a "canonical types differ for identical types" ICE, which started
> with r11-4682.  It's a bit tricky to explain.  Consider:
> 
>   template  struct S {
> S bar() noexcept(T::value);  // #1
> S foo() noexcept(T::value);  // #2
>   };
> 
>   template  S S::foo() noexcept(T::value) {}  // #3
> 
> We ICE because #3 and #2 have the same type, but their canonical types
> differ: TYPE_CANONICAL (#3) == #2 but TYPE_CANONICAL (#2) == #1.
> 
> The member functions #1 and #2 have the same type.  However, since their
> noexcept-specifier is deferred, when parsing them, we create a variant for
> both of them, because DEFERRED_PARSE cannot be compared.  In other words,
> build_cp_fntype_variant's
> 
>   tree v = TYPE_MAIN_VARIANT (type);
>   for (; v; v = TYPE_NEXT_VARIANT (v))
> if (cp_check_qualified_type (v, type, type_quals, rqual, raises, late))
>   return v;
> 
> will *not* find an existing variant when creating a method_type for #2, so we
> have to create a new one.
> 
> But then we perform delayed parsing and call fixup_deferred_exception_variants
> for #1 and #2.  f_d_e_v will replace TYPE_RAISES_EXCEPTIONS with the newly
> parsed noexcept-specifier.  It also sets TYPE_CANONICAL (#2) to #1.  Both
> noexcepts turned out to be the same, so now we have two equivalent variants in
> the list!  I.e.,
> 
> +-+  +-+  +-+
> |  main   |  |  #2 |  |  #1 |
> | S S::(S*) |->| S S::(S*) |->| S S::(S*) 
> |->NULL
> |-|  |  noex(T::value) |  |  noex(T::value) |
> +-+  +-+  +-+
> 
> Then we get to #3.  As for #1 and #2, grokdeclarator calls build_memfn_type,
> which ends up calling build_cp_fntype_variant, which will use the loop
> above to look for an existing variant.  The first one that matches
> cp_check_qualified_type will be used, so we use #2 rather than #1, and the
> TYPE_CANONICAL mismatch follows.  Hopefully that makes sense.
> 
> As for the fix, I didn't think I could rewrite the method_type #2 with #1
> because the type may have escaped via decltype.  So my approach is to
> elide #2 from the list, so when looking for a matching variant, we always
> find #1 (#2 remains live though, which admittedly sounds sort of dodgy).

I wonder about instead making build_cp_fntype_variant set the TYPE_CANONICAL for
#3 to TYPE_CANONICAL(#2) (i.e. #1) instead of to #2?  Something like:

-- >8 --

 gcc/cp/tree.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 7f7de86b4e8..b89135fa121 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -2779,8 +2779,9 @@ build_cp_fntype_variant (tree type, cp_ref_qualifier 
rqual,
   else if (TYPE_CANONICAL (type) != type || cr != raises || late)
 /* Build the underlying canonical type, since it is different
from TYPE. */
-TYPE_CANONICAL (v) = build_cp_fntype_variant (TYPE_CANONICAL (type),
- rqual, cr, false);
+TYPE_CANONICAL (v)
+  = TYPE_CANONICAL (build_cp_fntype_variant (TYPE_CANONICAL (type),
+rqual, cr, false));
   else
 /* T is its own canonical type. */
 TYPE_CANONICAL (v) = v;

[PATCH] c++, dyninit, v2: Optimize C++ dynamic initialization by constants into DECL_INITIAL adjustment [PR102876]

2022-01-15 Thread Jakub Jelinek via Gcc-patches

Hi!

Sorry for the delay in response.

On Thu, Dec 02, 2021 at 02:35:25PM +0100, Richard Biener wrote:
> So with
> 
> +/* Mark start and end of dynamic initialization of a variable.  */
> +DEF_INTERNAL_FN (DYNAMIC_INIT_START, ECF_LEAF | ECF_NOTHROW, ". r ")
> +DEF_INTERNAL_FN (DYNAMIC_INIT_END, ECF_LEAF | ECF_NOTHROW, ". r ")
> 
> there's nothing preventing code motion of unrelated stmts into
> the block, but that should be harmless.  What it also does
> is make 'e' aliased (because it's address is now taken), probably
> relevant only for IPA/LTO or for statics.
> 
> The setup does not prevent CSEing the inits with uses from another
> initializer - probably OK as well (if not then .DYNAMIC_INIT_END
> should also be considered writing to 'e').
> 
> ". r " also means it clobbers and uses all global memory, I think
> we'd like to have it const + looping-pure-or-const.  ".cr " would
> possibly achieve this, not sure about the looping part.

https://eel.is/c++draft/basic.start.static#3 in
https://eel.is/c++draft/basic.start.static#3.1 says that the optimization
I'd like to do is only allowed if "the dynamic version of the initialization
does not change the value of any other object of static or thread storage
duration prior to its initialization", so we at least need to ensure that
stores from other dynamic initialization or stores from this
dynamic initialization aren't moved across the
.DYNAMIC_INIT_{START,END} calls (i.e. that if a dynamic initialization
of a also stores to b, we don't move the store to b somewhere else and
optimize it, or we don't move store to some unrelated var into the block
and punt because of that).

> > Currently the optimization is only able to optimize cases where the whole
> > variable is stored in a single store (typically scalar variables), or
> > uses the native_{encode,interpret}* infrastructure to create or update
> > the CONSTRUCTOR.  This means that except for the first category, we can't
> > right now handle unions or anything that needs relocations (vars containing
> > pointers to other vars or references).
> > I think it would be nice to incrementally add before the native_* fallback
> > some attempt to just create or update a CONSTRUCTOR if possible.  If we only
> > see var.a.b.c.d[10].e = const; style of stores, this shouldn't be that hard
> > as the whole access path is recorded there and we'd just need to decide what
> > to do with unions if two or more union members are accessed.  And do a deep
> > copy of the CONSTRUCTOR and try to efficiently update the copy afterwards
> > (the CONSTRUCTORs should be sorted on increasing offsets of the
> > members/elements, so doing an ordered vec insertion might not be the best
> > idea).  But MEM_REFs complicate this, parts or all of the access path
> > is lost.  For non-unions in most cases we could try to guess which field
> > it is (do we have some existing function to do that?  I vaguely remember
> > we've been doing that in some cases in the past in some folding but stopped
> > doing so) but with unions it will be harder or impossible.
> 
> I suppose we could, at least for non-overlapping inits, create a
> new aggregate type on the fly to be able to compose the CTOR and then
> view-convert it to the decls type.  Would need to check that
> a CTOR wrapped in a V_C_E is handled OK by varasm of course.
> 
> An alternative way of recording the initializer (maybe just emit
> it right away into asm?) would be another possibility.

It would be nice to come up with something that works for those, but perhaps
that can be improved incrementally (so for sure GCC13+ only)?

> I also note that loops are quite common in some initializers so more
> aggressively unrolling those for initializations might be a good
> idea as well.

Sure.

> > As the middle-end can't easily differentiate between const variables without
> > and with mutable members, both of those will have TREE_READONLY on the
> > var decl clear (because of dynamic initialization) and TYPE_READONLY set
> > on the type, the patch remembers this in an extra argument to
> > .DYNAMIC_INIT_START (true if it is ok to set TREE_READONLY on the var decl
> > back if the var dynamic initialization could be optimized into 
> > DECL_INITIAL).
> > Thinking more about it, I'm not sure about const vars without mutable
> > members with non-trivial destructors, do we register their dtors dynamically
> > through __cxa_atexit in the ctors (that would mean the optimization
> > currently punts on them), or not (in that case we could put it into .rodata
> > even when the dtor will want to perhaps write to them)?
> 
> I think anything like this asks for doing the whole thing at IPA level
> to see which functions are "initialization" and thus need not considered
> writing when the initializer is made static.

I don't see how we could do it at the IPA level.  Because for it we need
inlining to happen and then perform at least some minimal cleanup passes
(forwprop, ccp, fre, dce, cunrolli, etc.) and I

Re: [PATCH] gcc: pass-manager: Fix memory leak. [PR jit/63854]

2022-01-15 Thread Marc Nieper-Wißkirchen

Jeff, David, do you need any more input from my side?

-- Marc

Am Sa., 8. Jan. 2022 um 17:32 Uhr schrieb Jeff Law :
>
>
>
> On 1/6/2022 6:53 AM, David Malcolm via Gcc-patches wrote:
> > On Sun, 2021-12-19 at 22:30 +0100, Marc Nieper-Wißkirchen wrote:
> >> This patch fixes a memory leak in the pass manager. In the existing
> >> code,
> >> the m_name_to_pass_map is allocated in
> >> pass_manager::register_pass_name, but
> >> never deallocated.  This is fixed by adding a deletion in
> >> pass_manager::~pass_manager.  Moreover the string keys in
> >> m_name_to_pass_map are
> >> all dynamically allocated.  To free them, this patch adds a new hash
> >> trait for
> >> string hashes that are to be freed when the corresponding hash entry
> >> is removed.
> >>
> >> This fix is particularly relevant for using GCC as a library through
> >> libgccjit.
> >> The memory leak also occurs when libgccjit is instructed to use an
> >> external
> >> driver.
> >>
> >> Before the patch, compiling the hello world example of libgccjit with
> >> the external driver under Valgrind shows a loss of 12,611 (48 direct)
> >> bytes.  After the patch, no memory leaks are reported anymore.
> >> (Memory leaks occurring when using the internal driver are mostly in
> >> the driver code in gcc/gcc.c and have to be fixed separately.)
> >>
> >> The patch has been tested by fully bootstrapping the compiler with
> >> the
> >> frontends C, C++, Fortran, LTO, ObjC, JIT and running the test suite
> >> under a x86_64-pc-linux-gnu host.
> > Thanks for the patch.
> >
> > It looks correct to me, given that pass_manager::register_pass_name
> > does an xstrdup and puts the result in the map.
> >
> > That said:
> > - I'm not officially a reviewer for this part of gcc (though I probably
> > touched this code last)
> > - is it cleaner to instead change m_name_to_pass_map's key type from
> > const char * to char *, to convey that the map "owns" the name?  That
> > way we probably wouldn't need struct typed_const_free_remove, and (I
> > hope) works better with the type system.
> >
> > Dave
> >
> >> gcc/ChangeLog:
> >>
> >>  PR jit/63854
> >>  * hash-traits.h (struct typed_const_free_remove): New.
> >>  (struct free_string_hash): New.
> >>  * pass_manager.h: Use free_string_hash.
> >>  * passes.c (pass_manager::register_pass_name): Use
> >> free_string_hash.
> >>  (pass_manager::~pass_manager): Delete allocated
> >> m_name_to_pass_map.
> My concern (and what I hadn't had time to dig into) was we initially
> used nofree_string_hash -- I wanted to make sure there wasn't any path
> where the name came from the stack (can't be free'd), was saved
> elsewhere (danging pointer) and the like.  ie, why were we using
> nofree_string_hash to begin with?  I've never really mucked around with
> these bits, so the analysis side kept falling off the daily todo list.
>
> If/once you're comfortable with the patch David, then go ahead and apply
> it on Marc's behalf.
>
> jeff
>

[r12-6420 Regression] FAIL: g++.dg/vect/pr92595.cc -std=c++2a (test for excess errors) on Linux/x86_64

2022-01-15 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

d3ff7420e941931d32ce2e332e7968fe67ba20af is the first bad commit
commit d3ff7420e941931d32ce2e332e7968fe67ba20af
Author: Andre Vieira 
Date:   Thu Dec 2 14:34:15 2021 +

[vect] Re-analyze all modes for epilogues

caused

FAIL: gcc.dg/tree-ssa/gen-vect-26.c (internal compiler error: in operator[], at 
vec.h:889)
FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect "vectorized 1 
loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-26.c (test for excess errors)
FAIL: gcc.dg/tree-ssa/gen-vect-28.c (internal compiler error: in operator[], at 
vec.h:889)
FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect "vectorized 1 
loops" 1
FAIL: gcc.dg/tree-ssa/gen-vect-28.c (test for excess errors)
FAIL: gcc.dg/vect/slp-perm-9.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorizing stmts using SLP" 1
FAIL: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "vectorizing stmts 
using SLP" 1
FAIL: gcc.dg/vect/slp-reduc-11.c -flto -ffat-lto-objects  scan-tree-dump vect 
"vectorizing stmts using SLP"
FAIL: gcc.dg/vect/slp-reduc-11.c scan-tree-dump vect "vectorizing stmts using 
SLP"
FAIL: gcc.dg/vect/vect-tail-nomask-1.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "LOOP EPILOGUE VECTORIZED \\(MODE=V16QI\\)" 2
FAIL: gcc.dg/vect/vect-tail-nomask-1.c scan-tree-dump-times vect "LOOP EPILOGUE 
VECTORIZED \\(MODE=V16QI\\)" 2
FAIL: gcc.target/i386/avx512fp16-recip-1.c scan-assembler-not vdivph
FAIL: gcc.target/i386/pr61403.c scan-assembler rsqrtps
FAIL: gcc.target/i386/pr88531-1b.c scan-assembler-times vgatherdpd 4
FAIL: gcc.target/i386/pr88531-1b.c scan-assembler-times vgatherqpd 4
FAIL: gcc.target/i386/pr88531-1b.c scan-assembler-times vmulpd 4
FAIL: gcc.target/i386/pr88531-1c.c scan-assembler-times vgatherdpd 4
FAIL: gcc.target/i386/pr88531-1c.c scan-assembler-times vgatherqpd 4
FAIL: gcc.target/i386/pr88531-1c.c scan-assembler-times vmulpd 4
FAIL: gcc.target/i386/pr88531-2b.c scan-assembler-times vmulps 2
FAIL: gcc.target/i386/pr88531-2c.c scan-assembler-times vmulps 2
FAIL: gcc.target/i386/pr94494.c (internal compiler error: in operator[], at 
vec.h:889)
FAIL: gcc.target/i386/pr94494.c (test for excess errors)
FAIL: gcc.target/i386/vect-reduc-1.c scan-assembler-times padd 5
FAIL: gcc.target/i386/vect-reduc-1.c scan-tree-dump vect "LOOP EPILOGUE 
VECTORIZED"
FAIL: g++.dg/vect/pr92595.cc  -std=c++14 (internal compiler error: in 
operator[], at vec.h:889)
FAIL: g++.dg/vect/pr92595.cc  -std=c++14 (test for excess errors)
FAIL: g++.dg/vect/pr92595.cc  -std=c++17 (internal compiler error: in 
operator[], at vec.h:889)
FAIL: g++.dg/vect/pr92595.cc  -std=c++17 (test for excess errors)
FAIL: g++.dg/vect/pr92595.cc  -std=c++2a (internal compiler error: in 
operator[], at vec.h:889)
FAIL: g++.dg/vect/pr92595.cc  -std=c++2a (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6420/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/gen-vect-26.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/gen-vect-26.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/gen-vect-28.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="tree-ssa.exp=gcc.dg/tree-ssa/gen-vect-28.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/slp-perm-9.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/slp-reduc-11.c --target_board='unix{-m32\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/slp-reduc-11.c --target_board='unix{-m64\ 
-march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-tail-nomask-1.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-tail-nomask-1.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-tail-nomask-1.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="vect.exp=gcc.dg/vect/vect-tail-nomask-1.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/avx512fp16-recip-1.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/avx512fp16-recip-1.c 
--target_board='unix{-m64\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr61403.c

[r12-6560 Regression] FAIL: gfortran.dg/gomp/allocate-2.f90 -O (test for errors, line 36) on Linux/x86_64

2022-01-15 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

69561fc781aca3dea3aa4d5d562ef5a502965924 is the first bad commit
commit 69561fc781aca3dea3aa4d5d562ef5a502965924
Author: Hafiz Abid Qadeer 
Date:   Fri Sep 24 10:04:12 2021 +0100

Add support for allocate clause (OpenMP 5.0).

caused

FAIL: gfortran.dg/gomp/allocate-2.f90   -O   (test for errors, line 36)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6560/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="gomp.exp=gfortran.dg/gomp/allocate-2.f90 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="gomp.exp=gfortran.dg/gomp/allocate-2.f90 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

[r12-6404 Regression] FAIL: gfortran.dg/ieee/signaling_1.f90 -Os (test for excess errors) on Linux/x86_64

2022-01-15 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

492954263e39346287a5a2a32bcc5312466a0ee1 is the first bad commit
commit 492954263e39346287a5a2a32bcc5312466a0ee1
Author: Francois-Xavier Coudert 
Date:   Sun Jan 2 11:36:23 2022 +0100

Fortran: Allow IEEE_CLASS to identify signaling NaNs

caused

FAIL: gfortran.dg/ieee/signaling_1.f90   -O0  (test for excess errors)
FAIL: gfortran.dg/ieee/signaling_1.f90   -O1  (test for excess errors)
FAIL: gfortran.dg/ieee/signaling_1.f90   -O2  (test for excess errors)
FAIL: gfortran.dg/ieee/signaling_1.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess 
errors)
FAIL: gfortran.dg/ieee/signaling_1.f90   -O3 -g  (test for excess errors)
FAIL: gfortran.dg/ieee/signaling_1.f90   -Os  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6404/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

[r12-6586 Regression] FAIL: g++.dg/torture/pr57993-2.C -Os (test for excess errors) on Linux/x86_64

2022-01-15 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

74abb0beb420830e52dfc6b3ee74e77dae8e31a3 is the first bad commit
commit 74abb0beb420830e52dfc6b3ee74e77dae8e31a3
Author: Martin Liska 
Date:   Fri Jan 14 15:21:40 2022 +0100

testsuite: rename 2 files.

caused

FAIL: g++.dg/torture/pr57993-2.C   -O0  (test for excess errors)
FAIL: g++.dg/torture/pr57993-2.C   -O1  (test for excess errors)
FAIL: g++.dg/torture/pr57993-2.C   -O2 -flto -fno-use-linker-plugin 
-flto-partition=none  (test for excess errors)
FAIL: g++.dg/torture/pr57993-2.C   -O2 -flto -fuse-linker-plugin 
-fno-fat-lto-objects  (test for excess errors)
FAIL: g++.dg/torture/pr57993-2.C   -O2  (test for excess errors)
FAIL: g++.dg/torture/pr57993-2.C   -O3 -fomit-frame-pointer -funroll-loops 
-fpeel-loops -ftracer -finline-functions  (test for excess errors)
FAIL: g++.dg/torture/pr57993-2.C   -O3 -g  (test for excess errors)
FAIL: g++.dg/torture/pr57993-2.C   -Os  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6586/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr57993-2.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr57993-2.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr57993-2.C 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg-torture.exp=g++.dg/torture/pr57993-2.C 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

[r12-6517 Regression] FAIL: gfortran.dg/ieee/signaling_1.f90 -Os execution test on Linux/x86_64

2022-01-15 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

6b14100b9504800768da726dcb81f1857db3b493 is the first bad commit
commit 6b14100b9504800768da726dcb81f1857db3b493
Author: Francois-Xavier Coudert 
Date:   Wed Jan 12 11:19:37 2022 +0100

Fortran: fix testcase compiler flags

caused

FAIL: gfortran.dg/ieee/signaling_1.f90   -O0  execution test
FAIL: gfortran.dg/ieee/signaling_1.f90   -O1  execution test
FAIL: gfortran.dg/ieee/signaling_1.f90   -O2  execution test
FAIL: gfortran.dg/ieee/signaling_1.f90   -O3 -fomit-frame-pointer 
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution test
FAIL: gfortran.dg/ieee/signaling_1.f90   -O3 -g  execution test
FAIL: gfortran.dg/ieee/signaling_1.f90   -Os  execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6517/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="ieee.exp=gfortran.dg/ieee/signaling_1.f90 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

[r12-6578 Regression] FAIL: g++.dg/cpp0x/constexpr-compare2.C -std=c++2a (test for excess errors) on Linux/x86_64

2022-01-15 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

d686d5d85c23451c03799dc55e456b73065f7333 is the first bad commit
commit d686d5d85c23451c03799dc55e456b73065f7333
Author: Jakub Jelinek 
Date:   Fri Jan 14 12:07:49 2022 +0100

c++: Reject in constant evaluation address comparisons of start of one var 
and end of another [PR89074]

caused

FAIL: g++.dg/cpp0x/constexpr-compare2.C  -std=c++14 (test for excess errors)
FAIL: g++.dg/cpp0x/constexpr-compare2.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp0x/constexpr-compare2.C  -std=c++2a (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6578/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/constexpr-compare2.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/constexpr-compare2.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/constexpr-compare2.C 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp0x/constexpr-compare2.C 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

[r12-6585 Regression] FAIL: c-c++-common/Walloca-larger-than.c -Wc++-compat (test for excess errors) on Linux/x86_64

2022-01-15 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

d8b64476138671f3d89cd66f224a9b59e465631b is the first bad commit
commit d8b64476138671f3d89cd66f224a9b59e465631b
Author: Martin Liska 
Date:   Fri Jan 14 15:04:33 2022 +0100

testsuite: rename files in c-c++-common.

caused

FAIL: c-c++-common/Walloca-larger-than.c  -std=gnu++14 (test for excess errors)
FAIL: c-c++-common/Walloca-larger-than.c  -std=gnu++17 (test for excess errors)
FAIL: c-c++-common/Walloca-larger-than.c  -std=gnu++2a (test for excess errors)
FAIL: c-c++-common/Walloca-larger-than.c  -std=gnu++98 (test for excess errors)
FAIL: c-c++-common/Walloca-larger-than.c  -Wc++-compat  (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-6585/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=c-c++-common/Walloca-larger-than.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=c-c++-common/Walloca-larger-than.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=c-c++-common/Walloca-larger-than.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=c-c++-common/Walloca-larger-than.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email, for question about this report, contact me 
at skpgkp2 at gmail dot com)

Re: [PATCH] OpenMP front-end: allow requires dynamic_allocators

2022-01-15 Thread Jakub Jelinek via Gcc-patches

On Mon, Dec 20, 2021 at 03:16:23PM +, Andrew Stubbs wrote:
> This patch removes the "sorry" message for the OpenMP "requires
> dynamic_allocators" feature in C, C++ and Fortran.
> 
> The clause is supposed to state that the user code will not work without the
> omp_alloc/omp_free and omp_init_allocator/omp_destroy_allocator and these
> things *are* present, so there should be no problem allowing it.
> 
> I can't see any reason for our implementation to *do* anything with this
> statement -- it's fine for the allocators to work the same with or without
> it.

Well, we should do a lot with it, but that can wait for later.
In particular, if OMP_REQUIRES_DYNAMIC_ALLOCATORS is not present,
we should be rejecting omp_init_allocator and omp_destroy_allocator
in target regions (maybe just a warning though), and for allocate clauses,
directives, and the omp_alloc etc. APIs we should in the target region
enforce they don't use omp_null_allocator and the allocators they use
is mentioned in uses_allocators (and implement uses_allocators).
But, right now it is unclear to me how exactly is that supposed to work
with declare target to routines, those aren't nested inside of the target
and so can't know what uses_allocator is used, so they can't make any
allocations?  Or is it just that we can't diagnose this in such routines
and only can diagnose omp_null_allocator isn't used...
> 
> I think this patch ought to be fine for GCC 12, but if not it can wait until
> stage 1 opens. I shall backport this to OG11 shortly.

Anyway, your patch is a step forward, as we don't support uses_allocators,
the non-requires dynamic_allocators way is unusable for any target region
allocations due to that and so requires dynamic_allocators is the only thing
we actually support, but we were rejecting that directive...

Ok and sorry for the delay.

> gcc/c/ChangeLog:
> 
>   * c-parser.c (c_parser_omp_requires): Don't "sorry" dynamic_allocators.
> 
> gcc/cp/ChangeLog:
> 
>   * parser.c (cp_parser_omp_requires): Don't "sorry" dynamic_allocators.
> 
> gcc/fortran/ChangeLog:
> 
>   * openmp.c (gfc_match_omp_requires): Don't "sorry" dynamic_allocators.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gfortran.dg/gomp/requires-8.f90: Reinstate dynamic allocators
>   requirement.

Jakub

[PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]

2022-01-15 Thread Jakub Jelinek via Gcc-patches

On Sat, Jan 15, 2022 at 11:42:55AM +0100, Uros Bizjak wrote:
> Yes, that would be nice. XFmode is used for long double, and not obsolete.

Ok, that seems to work.  Compared to the incremental patch I've posted, I
also had to add handling of the case where we have just
x == y ? 0 : x < y ? -1 : 1 (both for -ffast-math and non-ffast-math).
Apparently even that is worth optimizing.
Tested so far on the new testcases, will run full bootstrap/regtest tonight.

> > Why?  That seems to be a waste of time to me, unless something uses them
> > already during expansion.  Because pass_expand::execute
> > runs:
> >   /* We need JUMP_LABEL be set in order to redirect jumps, and hence
> >  split edges which edge insertions might do.  */
> >   rebuild_jump_labels (get_insns ());
> > which resets all LABEL_NUSES to 0 (well, to:
> >   if (LABEL_P (insn))
> > LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0);
> > and then recomputes them and adds JUMP_LABEL if needed:
> >   JUMP_LABEL (insn) = label;
> 
> I was not aware of that detail. Thanks for sharing (and I wonder if
> all other cases should be removed from the source).

I guess it depends, for code that can only be called during the expand pass
dropping it should be just fine, for code that can be called also (or only)
later I think adding JUMP_LABEL and correct LABEL_NUSES is needed because
nothing will fix it up afterwards.

2022-01-15  Jakub Jelinek  

PR target/103973
* tree-cfg.h (cond_only_block_p): Declare.
* tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
* tree-cfg.c (cond_only_block_p): ... here.  No longer static.
* optabs.def (spaceship_optab): New optab.
* internal-fn.def (SPACESHIP): New internal function.
* internal-fn.h (expand_SPACESHIP): Declare.
* internal-fn.c (expand_PHI): Formatting fix.
(expand_SPACESHIP): New function.
* tree-ssa-math-opts.c (optimize_spaceship): New function.
(math_opts_dom_walker::after_dom_children): Use it.
* config/i386/i386.md (spaceship3): New define_expand.
* config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
* config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
* doc/md.texi (spaceship@var{m}3): Document.

* gcc.target/i386/pr103973-1.c: New test.
* gcc.target/i386/pr103973-2.c: New test.
* gcc.target/i386/pr103973-3.c: New test.
* gcc.target/i386/pr103973-4.c: New test.
* gcc.target/i386/pr103973-5.c: New test.
* gcc.target/i386/pr103973-6.c: New test.
* gcc.target/i386/pr103973-7.c: New test.
* gcc.target/i386/pr103973-8.c: New test.
* gcc.target/i386/pr103973-9.c: New test.
* gcc.target/i386/pr103973-10.c: New test.
* gcc.target/i386/pr103973-11.c: New test.
* gcc.target/i386/pr103973-12.c: New test.
* gcc.target/i386/pr103973-13.c: New test.
* gcc.target/i386/pr103973-14.c: New test.
* gcc.target/i386/pr103973-15.c: New test.
* gcc.target/i386/pr103973-16.c: New test.
* gcc.target/i386/pr103973-17.c: New test.
* gcc.target/i386/pr103973-18.c: New test.
* gcc.target/i386/pr103973-19.c: New test.
* gcc.target/i386/pr103973-20.c: New test.
* g++.target/i386/pr103973-1.C: New test.
* g++.target/i386/pr103973-2.C: New test.
* g++.target/i386/pr103973-3.C: New test.
* g++.target/i386/pr103973-4.C: New test.
* g++.target/i386/pr103973-5.C: New test.
* g++.target/i386/pr103973-6.C: New test.
* g++.target/i386/pr103973-7.C: New test.
* g++.target/i386/pr103973-8.C: New test.
* g++.target/i386/pr103973-9.C: New test.
* g++.target/i386/pr103973-10.C: New test.
* g++.target/i386/pr103973-11.C: New test.
* g++.target/i386/pr103973-12.C: New test.
* g++.target/i386/pr103973-13.C: New test.
* g++.target/i386/pr103973-14.C: New test.
* g++.target/i386/pr103973-15.C: New test.
* g++.target/i386/pr103973-16.C: New test.
* g++.target/i386/pr103973-17.C: New test.
* g++.target/i386/pr103973-18.C: New test.
* g++.target/i386/pr103973-19.C: New test.
* g++.target/i386/pr103973-20.C: New test.

--- gcc/tree-cfg.h.jj   2022-01-14 23:57:44.491718086 +0100
+++ gcc/tree-cfg.h  2022-01-15 09:51:25.359468982 +0100
@@ -111,6 +111,7 @@ extern basic_block gimple_switch_label_b
 extern basic_block gimple_switch_default_bb (function *, gswitch *);
 extern edge gimple_switch_edge (function *, gswitch *, unsigned);
 extern edge gimple_switch_default_edge (function *, gswitch *);
+extern bool cond_only_block_p (basic_block);

 /* Return true if the LHS of a call should be removed.  */

--- gcc/tree-ssa-phiopt.c.jj2022-01-14 23:57:44.536717549 +0100
+++ gcc/tree-ssa-phiopt.c   2022-01-15 09:51:25.361468954 +0100
@@ -1958,31 +1958,6 @@

Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]

2022-01-15 Thread Uros Bizjak via Gcc-patches

On Sat, Jan 15, 2022 at 10:56 AM Jakub Jelinek  wrote:
>
> On Sat, Jan 15, 2022 at 09:29:05AM +0100, Uros Bizjak wrote:
> > > --- gcc/config/i386/i386.md.jj  2022-01-14 11:51:34.432384170 +0100
> > > +++ gcc/config/i386/i386.md 2022-01-14 18:22:41.140906449 +0100
> > > @@ -23886,6 +23886,18 @@ (define_insn "hreset"
> > >[(set_attr "type" "other")
> > > (set_attr "length" "4")])
> > >
> > > +;; Spaceship optimization
> > > +(define_expand "spaceship3"
> > > +  [(match_operand:SI 0 "register_operand")
> > > +   (match_operand:MODEF 1 "cmp_fp_expander_operand")
> > > +   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
> > > +  "(TARGET_80387 || (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH))
> > > +   && TARGET_CMOVE && TARGET_IEEE_FP"
> >
> > Is there a reason that this pattern is limited to TARGET_IEEE_FP?
> > During the expansion in ix86_expand_fp_spaceship, we can just skip
> > jump on unordered, while ix86_expand_fp_compare will emit the correct
> > comparison mode depending on TARGET_IEEE_FP.
>
> For -ffast-math I thought <=> expands to just x == y ? 0 : x < y ? -1 : 1
> but apparently not, it is still x == y ? 0 : x < y ? -1 : x > y ? 1 : 2
> but it is still optimized much better than the non-fast-math case
> without the patch:
> comisd  %xmm1, %xmm0
> je  .L12
> jb  .L13
> movl$1, %edx
> movl$2, %eax
> cmova   %edx, %eax
> ret
> .p2align 4,,10
> .p2align 3
> .L12:
> xorl%eax, %eax
> ret
> .p2align 4,,10
> .p2align 3
> .L13:
> movl$-1, %eax
> ret
> so just one comparison but admittedly the
> movl$1, %edx
> movl$2, %eax
> cmova   %edx, %eax
> part is unnecessary.
> So below is an incremental patch that handles even the !HONORS_NANS
> case at the gimple pattern matching (while for HONOR_NANS we must
> obey that for NaN operand(s) >/>=/ we look for the third comparison on the false edge of the second one,
> for !HONOR_NANS that is not the case.  With the incremental patch we get:
> comisd  %xmm1, %xmm0
> je  .L2
> seta%al
> leal-1(%rax,%rax), %eax
> ret
> .p2align 4,,10
> .p2align 3
> .L2:
> xorl%eax, %eax
> ret
> for -O2 -ffast-math.
> Also, I've added || (TARGET_SAHF && TARGET_USE_SAHF), because apparently
> we can handle that case nicely too, it is just the IX86_FPCMP_ARITH
> case where fp_compare already emits very specific code (and we can't
> call ix86_expand_fp_compare 3 times because that would for IEEE_FP
> emit different comparisons which couldn't be CSEd).
> I'll add also -ffast-math testsuite coverage and retest.
>
> Also, I wonder if I shouldn't handle XFmode the same, thoughts on that?

Yes, that would be nice. XFmode is used for long double, and not obsolete.

>
> > > +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == 
> > > IX86_FPCMP_COMI);
> > > +  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
> > > +  rtx l0 = gen_label_rtx ();
> > > +  rtx l1 = gen_label_rtx ();
> > > +  rtx l2 = gen_label_rtx ();
> > > +  rtx lend = gen_label_rtx ();
> > > +  rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> > > +  gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> > > +  rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
> > > + gen_rtx_LABEL_REF (VOIDmode, l2), 
> > > pc_rtx);
> > > +  rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > > +  add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());
> >
> > Please also add JUMP_LABEL to the insn.
> >
> > > +  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
> > > +  gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> > > +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
> > > + gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
> > > +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > > +  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
> > > +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
> > > + gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
> > > +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > > +  add_reg_br_prob_note (jmp, profile_probability::even ());
> > > +  emit_move_insn (dest, constm1_rtx);
> > > +  emit_jump (lend);
> > > +  emit_label (l0);
> >
> > and LABEL_NUSES label.
>
> Why?  That seems to be a waste of time to me, unless something uses them
> already during expansion.  Because pass_expand::execute
> runs:
>   /* We need JUMP_LABEL be set in order to redirect jumps, and hence
>  split edges which edge insertions might do.  */
>   rebuild_jump_labels (get_insns ());
> which resets all LABEL_NUSES to 0 (well, to:
>   if (LABEL_P (insn))
> LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0);
> and then recomputes them and adds JUMP_LABEL if needed:
>   JUMP_LABEL (insn) = label;

Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]

2022-01-15 Thread Jakub Jelinek via Gcc-patches

On Sat, Jan 15, 2022 at 09:29:05AM +0100, Uros Bizjak wrote:
> > --- gcc/config/i386/i386.md.jj  2022-01-14 11:51:34.432384170 +0100
> > +++ gcc/config/i386/i386.md 2022-01-14 18:22:41.140906449 +0100
> > @@ -23886,6 +23886,18 @@ (define_insn "hreset"
> >[(set_attr "type" "other")
> > (set_attr "length" "4")])
> >
> > +;; Spaceship optimization
> > +(define_expand "spaceship3"
> > +  [(match_operand:SI 0 "register_operand")
> > +   (match_operand:MODEF 1 "cmp_fp_expander_operand")
> > +   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
> > +  "(TARGET_80387 || (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH))
> > +   && TARGET_CMOVE && TARGET_IEEE_FP"
> 
> Is there a reason that this pattern is limited to TARGET_IEEE_FP?
> During the expansion in ix86_expand_fp_spaceship, we can just skip
> jump on unordered, while ix86_expand_fp_compare will emit the correct
> comparison mode depending on TARGET_IEEE_FP.

For -ffast-math I thought <=> expands to just x == y ? 0 : x < y ? -1 : 1
but apparently not, it is still x == y ? 0 : x < y ? -1 : x > y ? 1 : 2
but it is still optimized much better than the non-fast-math case
without the patch:
comisd  %xmm1, %xmm0
je  .L12
jb  .L13
movl$1, %edx
movl$2, %eax
cmova   %edx, %eax
ret
.p2align 4,,10
.p2align 3
.L12:
xorl%eax, %eax
ret
.p2align 4,,10
.p2align 3
.L13:
movl$-1, %eax
ret
so just one comparison but admittedly the
movl$1, %edx
movl$2, %eax
cmova   %edx, %eax
part is unnecessary.
So below is an incremental patch that handles even the !HONORS_NANS
case at the gimple pattern matching (while for HONOR_NANS we must
obey that for NaN operand(s) >/>=/ > +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == 
> > IX86_FPCMP_COMI);
> > +  rtx gt = ix86_expand_fp_compare (GT, op0, op1);
> > +  rtx l0 = gen_label_rtx ();
> > +  rtx l1 = gen_label_rtx ();
> > +  rtx l2 = gen_label_rtx ();
> > +  rtx lend = gen_label_rtx ();
> > +  rtx un = gen_rtx_fmt_ee (UNORDERED, VOIDmode,
> > +  gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> > +  rtx tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, un,
> > + gen_rtx_LABEL_REF (VOIDmode, l2), pc_rtx);
> > +  rtx_insn *jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > +  add_reg_br_prob_note (jmp, profile_probability:: very_unlikely ());
> 
> Please also add JUMP_LABEL to the insn.
> 
> > +  rtx eq = gen_rtx_fmt_ee (UNEQ, VOIDmode,
> > +  gen_rtx_REG (CCFPmode, FLAGS_REG), const0_rtx);
> > +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, eq,
> > + gen_rtx_LABEL_REF (VOIDmode, l0), pc_rtx);
> > +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > +  add_reg_br_prob_note (jmp, profile_probability::unlikely ());
> > +  tmp = gen_rtx_IF_THEN_ELSE (VOIDmode, gt,
> > + gen_rtx_LABEL_REF (VOIDmode, l1), pc_rtx);
> > +  jmp = emit_jump_insn (gen_rtx_SET (pc_rtx, tmp));
> > +  add_reg_br_prob_note (jmp, profile_probability::even ());
> > +  emit_move_insn (dest, constm1_rtx);
> > +  emit_jump (lend);
> > +  emit_label (l0);
> 
> and LABEL_NUSES label.

Why?  That seems to be a waste of time to me, unless something uses them
already during expansion.  Because pass_expand::execute
runs:
  /* We need JUMP_LABEL be set in order to redirect jumps, and hence
 split edges which edge insertions might do.  */
  rebuild_jump_labels (get_insns ());
which resets all LABEL_NUSES to 0 (well, to:
  if (LABEL_P (insn))
LABEL_NUSES (insn) = (LABEL_PRESERVE_P (insn) != 0);
and then recomputes them and adds JUMP_LABEL if needed:
  JUMP_LABEL (insn) = label;

--- gcc/config/i386/i386.md.jj  2022-01-15 09:51:25.404468342 +0100
+++ gcc/config/i386/i386.md 2022-01-15 09:56:31.602109421 +0100
@@ -23892,7 +23892,7 @@ (define_expand "spaceship3"
(match_operand:MODEF 1 "cmp_fp_expander_operand")
(match_operand:MODEF 2 "cmp_fp_expander_operand")]
   "(TARGET_80387 || (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH))
-   && TARGET_CMOVE && TARGET_IEEE_FP"
+   && (TARGET_CMOVE || (TARGET_SAHF && TARGET_USE_SAHF))"
 {
   ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
   DONE;
--- gcc/config/i386/i386-expand.c.jj2022-01-15 09:51:25.411468242 +0100
+++ gcc/config/i386/i386-expand.c   2022-01-15 10:38:26.924333651 +0100
@@ -2884,18 +2884,23 @@ ix86_expand_setcc (rtx dest, enum rtx_co
 void
 ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
 {
-  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI);
+  gcc_checking_assert (ix86_fp_comparison_strategy (GT) != IX86_FPCMP_ARITH);
   rtx gt = ix86_expand_fp_compare (GT, op0, op1);
   rtx l0 = gen_label_rtx ();
   rtx l1 = gen_label_rtx ();
-  rtx l2 = gen_label_rtx ();
+  rtx l2 = TARGET_IEEE_FP ? gen_label_rtx () : NULL_RTX;
   rtx

Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]

2022-01-15 Thread Uros Bizjak via Gcc-patches

On Fri, Jan 14, 2022 at 11:56 PM Jakub Jelinek  wrote:
>
> Hi!
>
> C++20:
> #include 
> auto cmp4way(double a, double b)
> {
>   return a <=> b;
> }
> expands to:
> ucomisd %xmm1, %xmm0
> jp  .L8
> movl$0, %eax
> jne .L8
> .L2:
> ret
> .p2align 4,,10
> .p2align 3
> .L8:
> comisd  %xmm0, %xmm1
> movl$-1, %eax
> ja  .L2
> ucomisd %xmm1, %xmm0
> setbe   %al
> addl$1, %eax
> ret
> That is 3 comparisons of the same operands.
> The following patch improves it to just one comparison:
> comisd  %xmm1, %xmm0
> jp  .L4
> seta%al
> movl$0, %edx
> leal-1(%rax,%rax), %eax
> cmove   %edx, %eax
> ret
> .L4:
> movl$2, %eax
> ret
> While a <=> b expands to a == b ? 0 : a < b ? -1 : a > b ? 1 : 2
> where the first comparison is equality and this shouldn't raise
> exceptions on qNaN operands, if the operands aren't equal (which
> includes unordered cases), then it immediately performs < or >
> comparison and that raises exceptions even on qNaNs, so we can just
> perform a single comparison that raises exceptions on qNaN.
> As the 4 different cases are encoded as
> ZF CF PF
> 1  1  1  a unordered b
> 0  0  0  a > b
> 0  1  0  a < b
> 1  0  0  a == b
> we can emit optimal sequence of comparions, first jp
> for the unordered case, then je for the == case and finally jb
> for the < case.
>
> The patch pattern recognizes spaceship-like comparisons during
> widening_mul if the spaceship optab is implemented, and replaces
> thoose comparisons with comparisons of .SPACESHIP ifn which returns
> -1/0/1/2 based on the comparison.  This seems to work well both for the
> case of just returning the -1/0/1/2 (when we have just a common
> successor with a PHI) or when the different cases are handled with
> various other basic blocks.  The testcases cover both of those cases,
> the latter with different function calls in those.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-01-14  Jakub Jelinek  
>
> PR target/103973
> * tree-cfg.h (cond_only_block_p): Declare.
> * tree-ssa-phiopt.c (cond_only_block_p): Move function to ...
> * tree-cfg.c (cond_only_block_p): ... here.  No longer static.
> * optabs.def (spaceship_optab): New optab.
> * internal-fn.def (SPACESHIP): New internal function.
> * internal-fn.h (expand_SPACESHIP): Declare.
> * internal-fn.c (expand_PHI): Formatting fix.
> (expand_SPACESHIP): New function.
> * tree-ssa-math-opts.c (optimize_spaceship): New function.
> (math_opts_dom_walker::after_dom_children): Use it.
> * config/i386/i386.md (spaceship3): New define_expand.
> * config/i386/i386-protos.h (ix86_expand_fp_spaceship): Declare.
> * config/i386/i386-expand.c (ix86_expand_fp_spaceship): New function.
> * doc/md.texi (spaceship@var{m}3): Document.
>
> * gcc.target/i386/pr103973-1.c: New test.
> * gcc.target/i386/pr103973-2.c: New test.
> * gcc.target/i386/pr103973-3.c: New test.
> * gcc.target/i386/pr103973-4.c: New test.
> * g++.target/i386/pr103973-1.C: New test.
> * g++.target/i386/pr103973-2.C: New test.
> * g++.target/i386/pr103973-3.C: New test.
> * g++.target/i386/pr103973-4.C: New test.

>--- gcc/config/i386/i386.md.jj  2022-01-14 11:51:34.432384170 +0100
> +++ gcc/config/i386/i386.md 2022-01-14 18:22:41.140906449 +0100
> @@ -23886,6 +23886,18 @@ (define_insn "hreset"
>[(set_attr "type" "other")
> (set_attr "length" "4")])
>
> +;; Spaceship optimization
> +(define_expand "spaceship3"
> +  [(match_operand:SI 0 "register_operand")
> +   (match_operand:MODEF 1 "cmp_fp_expander_operand")
> +   (match_operand:MODEF 2 "cmp_fp_expander_operand")]
> +  "(TARGET_80387 || (SSE_FLOAT_MODE_P (mode) && TARGET_SSE_MATH))
> +   && TARGET_CMOVE && TARGET_IEEE_FP"

Is there a reason that this pattern is limited to TARGET_IEEE_FP?
During the expansion in ix86_expand_fp_spaceship, we can just skip
jump on unordered, while ix86_expand_fp_compare will emit the correct
comparison mode depending on TARGET_IEEE_FP.

> +{
> +  ix86_expand_fp_spaceship (operands[0], operands[1], operands[2]);
> +  DONE;
> +})
> +
>  (include "mmx.md")
>  (include "sse.md")
>  (include "sync.md")
> --- gcc/config/i386/i386-expand.c.jj2022-01-14 11:51:34.429384213 +0100
> +++ gcc/config/i386/i386-expand.c   2022-01-14 21:02:09.446437810 +0100
> @@ -2879,6 +2879,46 @@ ix86_expand_setcc (rtx dest, enum rtx_co
>emit_insn (gen_rtx_SET (dest, ret));
>  }
>
> +/* Expand floating point op0 <=> op1 if NaNs are honored.  */
> +
> +void
> +ix86_expand_fp_spaceship (rtx dest, rtx op0, rtx op1)
> +{
> +  gcc_checking_assert (ix86_fp_comparison_strategy (GT) == IX86_FPCMP_COMI);
> +  rtx gt =

Re: [PATCH] c++: error message for dependent template members [PR70417]

2022-01-15 Thread Anthony Sharp via Gcc-patches

Hi Jason,

Hope you are well. Apologies, I've not had time to sit down and look at
this since last month I quit my old job, then I had family around for the
whole of the Christmas period, and then even more recently I've had to
start my new job.

In any case happy that you managed to figure it all out. I noticed the
small regression at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104025. To
be honest I wouldn't even know how to begin to go about fixing that so
perhaps it's best if I leave that to you? I guess it's not playing well
with the use of warn_missing_template_keyword. Maybe I didn't set that
variable up properly or something.

Kind regards,
Anthony

On Thu, 13 Jan 2022 at 21:01, Jason Merrill  wrote:

> On 12/9/21 10:51, Jason Merrill wrote:
> > On 12/4/21 12:23, Anthony Sharp wrote:
> >> Hi Jason,
> >>
> >> Hope you are well. Apologies for not coming back sooner.
> >>
> >>  >I'd put it just above the definition of saved_token_sentinel in
> >> parser.c.
> >>
> >> Sounds good, done.
> >>
> >>  >Maybe cp_parser_require_end_of_template_parameter_list?  Either way
> >> is fine.
> >>
> >> Even better, have changed it.
> >>
> >>  >Hmm, good point; operators that are member functions must be
> >> non-static,
> >>  >so we couldn't be doing a comparison of the address of the function.
> >>
> >> In that case I have set it to return early there.
> >>
> >>  >So declarator_p should be true there.  I'll fix that.
> >>
> >> Thank you.
> >>
> >>  >> +  if (next_token->keyword == RID_TEMPLATE)
> >>  >> +{
> >>  >> +  /* But at least make sure it's properly formed (e.g. see
> >> PR19397).  */
> >>  >> +  if (cp_lexer_peek_nth_token (parser->lexer, 2)->type ==
> >> CPP_NAME)
> >>  >> +   return 1;
> >>  >> +
> >>  >> +  return -1;
> >>  >> +}
> >>  >> +
> >>  >> +  /* Could be a ~ referencing the destructor of a class template.
> >>  */
> >>  >> +  if (next_token->type == CPP_COMPL)
> >>  >> +{
> >>  >> +  /* It could only be a template.  */
> >>  >> +  if (cp_lexer_peek_nth_token (parser->lexer, 2)->type ==
> >> CPP_NAME)
> >>  >> +   return 1;
> >>  >> +
> >>  >> +  return -1;
> >>  >> +}
> >>  >
> >>  >Why don't these check for the < ?
> >>
> >> I think perhaps I could have named the function better; instead of
> >> next_token_begins_template_id, how's about
> >> next_token_begins_template_name?
> >> That's all I intended to check for.
> >
> > You're using it to control whether we try to parse a template-id, and
> > it's used to initialize variables named looks_like_template_id, so I
> > think better to keep the old name.
> >
> >> In the first case, something like "->template some_name" will always be
> >> intended as a template, so no need to check for the <. If there were
> >> default
> >> template arguments you could also validly omit the <> completely, I
> think
> >> (could be wrong).
> >
> > Or if the template arguments can be deduced, yes:
> >
> > template  struct A
> > {
> >template  void f(U u);
> > };
> >
> > template  void g(A a)
> > {
> >a->template f(42);
> > }
> >
> > But 'f' is still not a template-id.
> >
> > ...
> >
> > Actually, it occurs to me that you might be better off handling this in
> > cp_parser_template_name, something like the below, to avoid the complex
> > duplicate logic in the id-expression handling.
> >
> > Note that in this patch I'm using "any_object_scope" as a proxy for "I'm
> > parsing an expression", since !is_declaration doesn't work for that; as
> > a result, this doesn't handle the static member function template case.
> > For that you'd probably still need to pass down a flag so that
> > cp_parser_template_name knows it's being called from
> > cp_parser_id_expression.
> >
> > Your patch has a false positive on
> >
> > template  struct A { };
> > template  void f()
> > {
> >A();
> > };
> >
> > which my patch checks in_template_argument_list_p to avoid, though
> > checking any_object_scope also currently avoids it.
> >
> > What do you think?
>
> I decided that it made more sense to keep the check in
> cp_parser_id_expression like you had it, but I moved it to the end to
> simplify the logic.  Here's what I'm applying, thanks!

[r12-6606 Regression] FAIL: g++.dg/asan/asan_test.C -O2 (test for excess errors) on Linux/x86_64

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

[PATCH] Strengthen memory memory order for atomic::wait/notify

[PATCHv3] libiberty rust-demangle, ignore .suffix

Re: gcc/configure: out of date

Re: [PATCH] libiberty rust-demangle, ignore .suffix

Re: [PATCH v2 1/2] add -Wuse-after-free

Re: [PATCH] PR fortran/83079 - ICE and wrong code with TRANSFER and character(kind=4)

Re: [PATCH] PR fortran/83079 - ICE and wrong code with TRANSFER and character(kind=4)

[PATCH] i386: Improve and optimize ix86_expand_sse_movcc

[Patch][V5][Patch 2/2]Enable -Wuninitialized + -ftrivial-auto-var-init for address taken variables.

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

Re: [PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]

Re: [PATCH] [i386] GLC tuning: Break false dependency for dest register.

Re: [PATCH] c++: ICE with noexcept and canonical types [PR101715]

[PATCH] c++, dyninit, v2: Optimize C++ dynamic initialization by constants into DECL_INITIAL adjustment [PR102876]

Re: [PATCH] gcc: pass-manager: Fix memory leak. [PR jit/63854]

[r12-6420 Regression] FAIL: g++.dg/vect/pr92595.cc -std=c++2a (test for excess errors) on Linux/x86_64

[r12-6560 Regression] FAIL: gfortran.dg/gomp/allocate-2.f90 -O (test for errors, line 36) on Linux/x86_64

[r12-6404 Regression] FAIL: gfortran.dg/ieee/signaling_1.f90 -Os (test for excess errors) on Linux/x86_64

[r12-6586 Regression] FAIL: g++.dg/torture/pr57993-2.C -Os (test for excess errors) on Linux/x86_64

[r12-6517 Regression] FAIL: gfortran.dg/ieee/signaling_1.f90 -Os execution test on Linux/x86_64

[r12-6578 Regression] FAIL: g++.dg/cpp0x/constexpr-compare2.C -std=c++2a (test for excess errors) on Linux/x86_64

[r12-6585 Regression] FAIL: c-c++-common/Walloca-larger-than.c -Wc++-compat (test for excess errors) on Linux/x86_64

Re: [PATCH] OpenMP front-end: allow requires dynamic_allocators

[PATCH] widening_mul, i386, v2: Improve spaceship expansion on x86 [PR103973]

Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]

Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]

Re: [PATCH] widening_mul, i386: Improve spaceship expansion on x86 [PR103973]

Re: [PATCH] c++: error message for dependent template members [PR70417]

30 matches

Site Navigation

Mail list logo

Footer information