[r12-7287 Regression] FAIL: gcc.dg/deprecated.c (test for warnings, line 28) on Linux/x86_64

2022-02-17 Thread sunil.k.pandey via Gcc-patches
On Linux/x86_64,

1b71bc7c8b18bd1b22debfde155f175fd1654942 is the first bad commit
commit 1b71bc7c8b18bd1b22debfde155f175fd1654942
Author: Jason Merrill 
Date:   Tue Feb 15 19:17:03 2022 -0500

tree: tweak warn_deprecated_use

caused

FAIL: gcc.dg/deprecated.c (test for excess errors)
FAIL: gcc.dg/deprecated.c  (test for warnings, line 28)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r12-7287/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/deprecated.c --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/deprecated.c --target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/deprecated.c --target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=gcc.dg/deprecated.c --target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email. For questions about this report, contact
me at skpgkp2 at gmail dot com)


Re: [PATCH v7] c++: Add diagnostic when operator= is used as truth cond [PR25689]

2022-02-17 Thread Zhao Wei Liew via Gcc-patches
On Fri, 18 Feb 2022 at 08:32, Zhao Wei Liew  wrote:
>
> > >>> +/* Test non-empty class */
> > >>> +void f2(B b1, B b2)
> > >>> +{
> > >>> + if (b1 = 0); /* { dg-warning "suggest parentheses" } */
> > >>> + if (b1 = 0.); /* { dg-warning "suggest parentheses" } */
> > >>> + if (b1 = b2); /* { dg-warning "suggest parentheses" } */
> > >>> + if (b1.operator= (0));
> > >>> +
> > >>> + /* Ideally, we wouldn't warn for non-empty classes using trivial
> > >>> +  operator= (below), but we currently do as it is a MODIFY_EXPR. */
> > >>> + // if (b1.operator= (b2));
> > >>
> > >> You can avoid it by calling suppress_warning on that MODIFY_EXPR in
> > >> build_over_call.
> > >
> > > Unfortunately, that also affects the warning for if (b1 = b2) just 5
> > > lines above. Both expressions seem to generate the same tree structure.
> >
> > True, you would need to put the call to suppress_warning in build_new_op
> > around where CALL_EXPR_OPERATOR_SYNTAX is set.
>
> It seems like that would suppress the warning for the case of if (b1 = b2)
> instead of if (b1.operator= (b2)). Do you mean to add the call to
> suppress_warning in build_method_call instead?
>
> This is what I've tried so far:
>
> 1. Call suppress_warning (result, ...) in the trivial_fn_p block in
>    build_new_op, right above the comment "There won't be a CALL_EXPR"
>    (line 6699). This suppresses the warning for if (b1 = b2) but not for
>    if (b1.operator= (b2)).
>
> 2. Call suppress_warning (result, ...) in build_method_call, right after
>    the call to build_over_call (line 11141). This suppresses the warning
>    for if (b1.operator= (b2)) and not if (b1 = b2).
>
> Based on this, I think the 2nd option might be what we want here? Please
> correct me if I'm wrong. I'm also unsure if there are issues that might
> arise with this change.

To better illustrate the 2nd option, I've attached it as a patch v8.
How does it look?
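
The reason the two options behave differently can be sketched with a toy model. This is not GCC code: Expr, Kind and the *_toy functions are hypothetical stand-ins for the tree node and for build_over_call, build_new_op and build_method_call. It only assumes what the thread states, namely that a trivial operator= ends up as the same MODIFY_EXPR for both syntaxes, so the suppression has to happen in the caller that knows which syntax was used.

```cpp
#include <cassert>

// Toy model only: hypothetical stand-ins, not GCC's real tree API.
enum class Kind { ModifyExpr, CallExpr };

struct Expr {
  Kind kind = Kind::CallExpr;
  bool parens_warning_suppressed = false;
};

// Stand-in for build_over_call: a trivial operator= becomes a plain
// assignment node, whichever syntax the user wrote.
Expr build_over_call_toy() {
  Expr e;
  e.kind = Kind::ModifyExpr;
  return e;
}

// Stand-in for the operator-syntax path, as in: if (b1 = b2)
Expr build_new_op_toy() { return build_over_call_toy(); }

// Stand-in for the explicit-call path, as in: if (b1.operator= (b2))
// Option 2 marks the result here, where the syntax is known.
Expr build_method_call_toy() {
  Expr result = build_over_call_toy();
  result.parens_warning_suppressed = true;
  return result;
}

// Stand-in for the check made when the expression is used as a condition.
bool would_warn_in_condition(const Expr &e) {
  return e.kind == Kind::ModifyExpr && !e.parens_warning_suppressed;
}
```

In this model the node built through build_new_op_toy still warns, while the one built through build_method_call_toy does not, which matches the behaviour reported for option 2.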

v7: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590464.html
Changes since v7:
1. Suppress -Wparentheses warnings in build_new_method_call.
2. Uncomment the test case for if (b1.operator= (b2)).

v6: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590419.html
Changes since v6:
1. Check for error_mark_node in is_assignment_op_expr_p.
2. Change "c:" to "c++:".

v5: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590393.html
Changes since v5:
1. Revert changes in v4.
2. Replace gcc_assert with a return NULL_TREE in extract_call_expr.

v4: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590379.html
Changes since v4:
1. Refactor the non-assert-related code out of extract_call_expr and
   call that function instead to check for call expressions.

v3: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590310.html
Changes since v3:
1. Also handle COMPOUND_EXPRs and TARGET_EXPRs.

v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590236.html
Changes since v2:
1. Add more test cases in Wparentheses-31.C.
2. Refactor added logic to a function (is_assignment_overload_ref_p).
3. Use REFERENCE_REF_P instead of INDIRECT_REF_P.

v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590158.html
Changes since v1:
1. Use CALL_EXPR_OPERATOR_SYNTAX to avoid warnings for explicit
   operator=() calls.
2. Use INDIRECT_REF_P to filter implicit operator=() calls.
3. Use cp_get_callee_fndecl_nofold.
4. Add spaces before (.
From ef4cfecca64b2cb199a5d3979fe99f8c9bd0f414 Mon Sep 17 00:00:00 2001
From: Zhao Wei Liew 
Date: Tue, 15 Feb 2022 17:44:29 +0800
Subject: [PATCH] c++: Add diagnostic when operator= is used as truth cond
 [PR25689]

When compiling the following code with g++ -Wparentheses, GCC does not
warn on the if statement:

struct A {
	A& operator=(int);
	operator bool();
};

void f(A a) {
	if (a = 0); // no warning
}

This is because a = 0 is a call to operator=, which GCC does not handle.

This patch fixes this issue by handling calls to operator= when deciding
to warn.
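
To show why the missing warning matters, here is a small sketch of how the assignment form and the comparison the author probably intended give different answers. The class below is a hypothetical variant of A with a stored value and an explicit operator bool; it is written so that it compiles without triggering the diagnostic itself.

```cpp
#include <cassert>

// Hypothetical class for illustration, modelled on the example above.
struct A {
  int value = 0;
  A& operator=(int v) { value = v; return *this; }
  explicit operator bool() const { return value != 0; }
};

// What "if (a = v)" actually tests: assign first, then convert to bool.
bool assigned_value_is_true(A a, int v) {
  a = v;
  return static_cast<bool>(a);
}

// What "if (a == v)" would have tested, had that been the intent.
bool equals(const A &a, int v) { return a.value == v; }
```

For a default-constructed A, equals(a, 0) is true but assigned_value_is_true(a, 0) is false, so a typo of = for == silently flips the branch.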

Bootstrapped and tested on x86_64-unknown-linux-gnu.

	PR c++/25689

gcc/cp/ChangeLog:

	* call.cc (extract_call_expr): Return a NULL_TREE on failure
	  instead of asserting.
	* semantics.cc (is_assignment_op_expr_p): Add function to check
	  if an expression is a call to an op= operator expression.
	(maybe_convert_cond): Handle the case of an op= operator expression
	  for the -Wparentheses diagnostic.

gcc/testsuite/ChangeLog:

	* g++.dg/warn/Wparentheses-31.C: New test.

Signed-off-by: Zhao Wei Liew 
---
 gcc/cp/call.cc  | 12 +++--
 gcc/cp/semantics.cc | 22 +++-
 gcc/testsuite/g++.dg/warn/Wparentheses-31.C | 59 +
 3 files changed, 89 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wparentheses-31.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index d6eed5ed835..caf22e02b39 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -7090,9 +7090,10 @@ extract_call_expr 

Re: [PATCH] target/104581 - compile-time regression in mode-switching

2022-02-17 Thread Hongtao Liu via Gcc-patches
On Thu, Feb 17, 2022 at 9:47 PM Richard Biener via Gcc-patches
 wrote:
>
> The x86 backend piggy-backs on mode-switching for insertion of
> vzeroupper.  A recent improvement there was implemented in a way
> to walk possibly the whole basic-block for all DF reg def definitions
> in its mode_needed hook which is called for each instruction in
> a basic-block during mode-switching local analysis.
>
> The following mostly reverts this improvement.  It needs to be
> re-done in a way more consistent with a local dataflow which
> probably means making targets aware of the state of the local
> dataflow analysis.
>
> This improves compile-time of some 538.imagick_r TU from
> 362s to 16s with -Ofast -mavx2 -fprofile-generate.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
LGTM, have talked to H.J, he also agrees.
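
For readers following along, the cost pattern described above can be sketched as a step count. This is a toy model, not the i386 backend: an "instruction" is just an index, and both function names are made up. It assumes only what the description says, that a hook called once per instruction may itself walk the whole basic block.

```cpp
#include <cassert>
#include <cstddef>

// Toy model only. Counts the steps taken when a hook that is called for
// each instruction rescans every instruction of the same block.
std::size_t steps_with_block_scan(std::size_t block_size) {
  std::size_t steps = 0;
  for (std::size_t insn = 0; insn < block_size; ++insn)        // local analysis
    for (std::size_t other = 0; other < block_size; ++other)   // hook walks block
      ++steps;
  return steps;
}

// Counts the steps when the hook answers from the instruction alone.
std::size_t steps_with_local_answer(std::size_t block_size) {
  std::size_t steps = 0;
  for (std::size_t insn = 0; insn < block_size; ++insn)
    ++steps;
  return steps;
}
```

With a block of 1000 instructions the first variant takes a million steps and the second a thousand, which is the kind of gap that shows up on very large basic blocks.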
>
> Thanks,
> Richard.
>
> 2022-02-17  Richard Biener  
>
> PR target/104581
> * config/i386/i386.cc (ix86_avx_u128_mode_source): Remove.
> (ix86_avx_u128_mode_needed): Return AVX_U128_DIRTY instead
> of calling ix86_avx_u128_mode_source which would eventually
> have returned AVX_U128_ANY in some very special case.
>
> * gcc.target/i386/pr101456-1.c: XFAIL.
> ---
>  gcc/config/i386/i386.cc| 78 +-
>  gcc/testsuite/gcc.target/i386/pr101456-1.c |  3 +-
>  2 files changed, 5 insertions(+), 76 deletions(-)
>
> diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> index cf246e74e57..e4b42fbba6f 100644
> --- a/gcc/config/i386/i386.cc
> +++ b/gcc/config/i386/i386.cc
> @@ -14377,80 +14377,12 @@ ix86_check_avx_upper_register (const_rtx exp)
>
>  static void
>  ix86_check_avx_upper_stores (rtx dest, const_rtx, void *data)
> - {
> -   if (ix86_check_avx_upper_register (dest))
> +{
> +  if (ix86_check_avx_upper_register (dest))
>  {
>bool *used = (bool *) data;
>*used = true;
>  }
> - }
> -
> -/* For YMM/ZMM store or YMM/ZMM extract.  Return mode for the source
> -   operand of SRC DEFs in the same basic block before INSN.  */
> -
> -static int
> -ix86_avx_u128_mode_source (rtx_insn *insn, const_rtx src)
> -{
> -  basic_block bb = BLOCK_FOR_INSN (insn);
> -  rtx_insn *end = BB_END (bb);
> -
> -  /* Return AVX_U128_DIRTY if there is no DEF in the same basic
> - block.  */
> -  int status = AVX_U128_DIRTY;
> -
> -  for (df_ref def = DF_REG_DEF_CHAIN (REGNO (src));
> -   def; def = DF_REF_NEXT_REG (def))
> -if (DF_REF_BB (def) == bb)
> -  {
> -   /* Ignore DEF from different basic blocks.  */
> -   rtx_insn *def_insn = DF_REF_INSN (def);
> -
> -   /* Check if DEF_INSN is before INSN.  */
> -   rtx_insn *next;
> -   for (next = NEXT_INSN (def_insn);
> -next != nullptr && next != end && next != insn;
> -next = NEXT_INSN (next))
> - ;
> -
> -   /* Skip if DEF_INSN isn't before INSN.  */
> -   if (next != insn)
> - continue;
> -
> -   /* Return AVX_U128_DIRTY if the source operand of DEF_INSN
> -  isn't constant zero.  */
> -
> -   if (CALL_P (def_insn))
> - {
> -   bool avx_upper_reg_found = false;
> -   note_stores (def_insn,
> -ix86_check_avx_upper_stores,
> -&avx_upper_reg_found);
> -
> -   /* Return AVX_U128_DIRTY if call returns AVX.  */
> -   if (avx_upper_reg_found)
> - return AVX_U128_DIRTY;
> -
> -   continue;
> - }
> -
> -   rtx set = single_set (def_insn);
> -   if (!set)
> - return AVX_U128_DIRTY;
> -
> -   rtx dest = SET_DEST (set);
> -
> -   /* Skip if DEF_INSN is not an AVX load.  Return AVX_U128_DIRTY
> -  if the source operand isn't constant zero.  */
> -   if (ix86_check_avx_upper_register (dest)
> -   && standard_sse_constant_p (SET_SRC (set),
> -   GET_MODE (dest)) != 1)
> - return AVX_U128_DIRTY;
> -
> -   /* We get here only if all AVX loads are from constant zero.  */
> -   status = AVX_U128_ANY;
> -  }
> -
> -  return status;
>  }
>
>  /* Return needed mode for entity in optimize_mode_switching pass.  */
> @@ -14520,11 +14452,7 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
> {
>   FOR_EACH_SUBRTX (iter, array, src, NONCONST)
> if (ix86_check_avx_upper_register (*iter))
> - {
> -   int status = ix86_avx_u128_mode_source (insn, *iter);
> -   if (status == AVX_U128_DIRTY)
> - return status;
> - }
> + return AVX_U128_DIRTY;
> }
>
>/* This isn't YMM/ZMM load/store.  */
> diff --git a/gcc/testsuite/gcc.target/i386/pr101456-1.c 
> b/gcc/testsuite/gcc.target/i386/pr101456-1.c
> index 803fc6e0207..7fb3a3f055c 100644
> --- a/gcc/testsuite/gcc.target/i386/pr101456-1.c
> +++ b/gcc/testsuite/gcc.target/i386/pr101456-1.c
> @@ -30,4 +30,5 @@ foo3 (void)
>

Re: [PATCH v7] c++: Add diagnostic when operator= is used as truth cond [PR25689]

2022-02-17 Thread Zhao Wei Liew via Gcc-patches
On Thu, 17 Feb 2022 at 00:59, Jason Merrill  wrote:
>
> On 2/16/22 02:16, Zhao Wei Liew wrote:
> > On Wed Feb 16, 2022 at 4:06 AM +08, Jason Merrill wrote:
> >>> Ah, I see. I found it a bit odd that gcc-commit-mklog auto-generated a
> >>> subject with "c:",
> >>> but I just went with it as I didn't know any better. Unfortunately, I
> >>> can't change it now on the current thread.
> >>
> >> That came from this line in the testcase:
> >>
> >>   > +/* PR c/25689 */
> >>
> >> The PR should be c++/25689.  Also, sometimes the bugzilla component
> >> isn't the same as the area of the compiler you're changing; the latter
> >> is what you want in the patch subject, so that the right people know to
> >> review it.
> >
> > Oh, I see. Thanks for the explanation. I've fixed the line.
> >
> >>> Ah, I didn't notice that. Sorry about that! I'm kinda new to the whole
> >>> mailing list setup so there are some kinks I have to iron out.
> >>
> >> FWIW it's often easier to send the patch as an attachment.
> >
> > Alright, I'll send patches as attachments instead. I originally sent
> > them as text as it is easier to comment on them.
>
> It is a bit more of a hassle in this case because your mail sender
> doesn't mark the patch as text, but rather application/mbox or
> application/x-patch, so my mail reader for patch review (Thunderbird)
> doesn't display it inline.  I tried sending myself a patch through the
> gmail web interface, and it used text/x-patch, which is OK; what are you
> using to send?
>
> Maybe renaming the file to .txt before sending would help?

Hmm, in the end I used gmail to send the patch, so I'm not sure why it was
marked that way. I'll test it out again before sending another patch.

> >>> +/* Test non-empty class */
> >>> +void f2(B b1, B b2)
> >>> +{
> >>> + if (b1 = 0); /* { dg-warning "suggest parentheses" } */
> >>> + if (b1 = 0.); /* { dg-warning "suggest parentheses" } */
> >>> + if (b1 = b2); /* { dg-warning "suggest parentheses" } */
> >>> + if (b1.operator= (0));
> >>> +
> >>> + /* Ideally, we wouldn't warn for non-empty classes using trivial
> >>> +  operator= (below), but we currently do as it is a MODIFY_EXPR. */
> >>> + // if (b1.operator= (b2));
> >>
> >> You can avoid it by calling suppress_warning on that MODIFY_EXPR in
> >> build_over_call.
> >
> > Unfortunately, that also affects the warning for if (b1 = b2) just 5
> > lines above. Both expressions seem to generate the same tree structure.
>
> True, you would need to put the call to suppress_warning in build_new_op
> around where CALL_EXPR_OPERATOR_SYNTAX is set.

It seems like that would suppress the warning for the case of if (b1 = b2)
instead of if (b1.operator= (b2)). Do you mean to add the call to
suppress_warning in build_method_call instead?

This is what I've tried so far:

1. Call suppress_warning (result, ...) in the trivial_fn_p block in
   build_new_op, right above the comment "There won't be a CALL_EXPR"
   (line 6699). This suppresses the warning for if (b1 = b2) but not for
   if (b1.operator= (b2)).

2. Call suppress_warning (result, ...) in build_method_call, right after
   the call to build_over_call (line 11141). This suppresses the warning
   for if (b1.operator= (b2)) and not if (b1 = b2).

Based on this, I think the 2nd option might be what we want here? Please
correct me if I'm wrong. I'm also unsure if there are issues that might
arise with this change.


[committed] libstdc++: Deprecate non-standard std::vector::insert(pos) [PR104559]

2022-02-17 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.

-- >8--

The SGI STL and pre-1998 drafts of the C++ standard had a default
argument for vector<bool>::insert(iterator, const bool&) which was
removed by N1051. The default argument is still present in libstdc++ for
some reason. There are no tests verifying it as an extension, so I don't
think it has been kept intentionally.

This removes the default argument but adds an overload without the
second parameter, and adds the deprecated attribute to it. This allows
any code using it to keep working (for now) but with a warning.
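
The same technique, sketched on a hypothetical container (BitList is made up and positions are plain indices; this is not the libstdc++ code, which uses its own _GLIBCXX_DEPRECATED_SUGGEST macro rather than the standard attribute):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical container used only to illustrate the pattern.
class BitList {
public:
  // Previously: insert(std::size_t pos, const bool& x = bool())
  std::size_t insert(std::size_t pos, const bool &x) {
    bits_.insert(bits_.begin() + static_cast<std::ptrdiff_t>(pos), x);
    return pos;
  }

  // The old one-argument call still compiles, but now draws a warning.
  [[deprecated("use insert(pos, false)")]]
  std::size_t insert(std::size_t pos) { return insert(pos, false); }

  bool at(std::size_t i) const { return bits_[i]; }
  std::size_t size() const { return bits_.size(); }

private:
  std::vector<bool> bits_;
};
```

Splitting the default argument into a separate overload is what makes it possible to attach the attribute to the one-argument form only, leaving the two-argument form warning-free.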

libstdc++-v3/ChangeLog:

PR libstdc++/104559
* doc/xml/manual/evolution.xml: Document deprecation.
* doc/html/manual/api.html: Regenerate.
* include/bits/stl_bvector.h (insert(const_iterator, const bool&)):
Remove default argument.
(insert(const_iterator)): New overload with deprecated attribute.
* testsuite/23_containers/vector/bool/modifiers/insert/104559.cc:
New test.
---
 libstdc++-v3/doc/html/manual/api.html   |  3 +++
 libstdc++-v3/doc/xml/manual/evolution.xml   |  3 +++
 libstdc++-v3/include/bits/stl_bvector.h | 11 +--
 .../vector/bool/modifiers/insert/104559.cc  | 13 +
 4 files changed, 28 insertions(+), 2 deletions(-)
 create mode 100644 
libstdc++-v3/testsuite/23_containers/vector/bool/modifiers/insert/104559.cc

diff --git a/libstdc++-v3/doc/html/manual/api.html 
b/libstdc++-v3/doc/html/manual/api.html
index 26087775708..bbda6f5acf3 100644
--- a/libstdc++-v3/doc/html/manual/api.html
+++ b/libstdc++-v3/doc/html/manual/api.html
@@ -454,6 +454,9 @@ were deprecated for C++11.
 were deprecated for C++17.
 
 Non-standard std::pair constructors were deprecated.
+A non-standard default argument for
+vectorbool::insert(const_iterator, const 
bool)
+was deprecated.
 
 The bitmap, mt, and 
pool
 options for --enable-libstdcxx-allocator were 
removed.
diff --git a/libstdc++-v3/doc/xml/manual/evolution.xml 
b/libstdc++-v3/doc/xml/manual/evolution.xml
index f5bc6471465..4923e8c4783 100644
--- a/libstdc++-v3/doc/xml/manual/evolution.xml
+++ b/libstdc++-v3/doc/xml/manual/evolution.xml
@@ -1045,6 +1045,9 @@ were deprecated for C++17.
 
 
 Non-standard std::pair constructors were deprecated.
+A non-standard default argument for
+vectorbool::insert(const_iterator, const bool)
+was deprecated.
 
 
 
diff --git a/libstdc++-v3/include/bits/stl_bvector.h 
b/libstdc++-v3/include/bits/stl_bvector.h
index 75f38812807..d256af40f40 100644
--- a/libstdc++-v3/include/bits/stl_bvector.h
+++ b/libstdc++-v3/include/bits/stl_bvector.h
@@ -1135,9 +1135,9 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
   _GLIBCXX20_CONSTEXPR
   iterator
 #if __cplusplus >= 201103L
-  insert(const_iterator __position, const bool& __x = bool())
+  insert(const_iterator __position, const bool& __x)
 #else
-  insert(iterator __position, const bool& __x = bool())
+  insert(iterator __position, const bool& __x)
 #endif
   {
const difference_type __n = __position - begin();
@@ -1149,6 +1149,13 @@ _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
return begin() + __n;
   }
 
+#if _GLIBCXX_USE_DEPRECATED
+  _GLIBCXX_DEPRECATED_SUGGEST("insert(position, false)")
+  iterator
+  insert(const_iterator __position)
+  { return this->insert(__position._M_const_cast(), false); }
+#endif
+
 #if __cplusplus >= 201103L
   template>
diff --git 
a/libstdc++-v3/testsuite/23_containers/vector/bool/modifiers/insert/104559.cc 
b/libstdc++-v3/testsuite/23_containers/vector/bool/modifiers/insert/104559.cc
new file mode 100644
index 000..1121827477f
--- /dev/null
+++ 
b/libstdc++-v3/testsuite/23_containers/vector/bool/modifiers/insert/104559.cc
@@ -0,0 +1,13 @@
+// { dg-options "-Wdeprecated" }
+// { dg-do compile }
+// { dg-require-normal-mode "" }
+
+#include <vector>
+
+void
+test01()
+{
+  std::vector<bool> v;
+  v.insert(v.begin(), false);
+  v.insert(v.begin());  // { dg-warning "deprecated" }
+}
-- 
2.34.1



Re: [PATCH] Don't do int cmoves for IEEE comparisons, PR target/104256.

2022-02-17 Thread Segher Boessenkool
Hi!

First, you need to adjust after Robin's patch, and retest.

On Thu, Feb 17, 2022 at 01:56:04PM -0500, Michael Meissner wrote:
> Don't do int cmoves for IEEE comparisons, PR target/104256.

> Unfortunately there are some conditions like UNLE that can't easily be
> reversed due to NaNs.

What do you mean?  The reverse of UNLE is GT.  You don't even need to
check if fast-math is active, or whether this is FP at all -- that is a
*given* if you see UNLE!

You need more context to reverse GT.  For fast-math and integer that
gives LE, for fp it is UNLE.
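
A quick way to convince yourself of this in plain C++ (the helper names below simply mirror the RTL codes and are otherwise made up; the sketch assumes IEEE semantics, so no -ffast-math):

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// GT and LE are both false when either operand is a NaN.
bool gt(double a, double b) { return a > b; }
bool le(double a, double b) { return a <= b; }

// UNLE: true when the operands are unordered or a <= b.
bool unle(double a, double b) {
  return std::isunordered(a, b) || a <= b;
}

// The logical negation of GT, i.e. its reverse.
bool not_gt(double a, double b) { return !(a > b); }
```

For every pair of doubles, NaNs included, not_gt agrees with unle; it only agrees with le when both operands are ordered.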

> The patch changes the code so that it does the reversal before generating the
> comparison.  If the comparison cannot be reversed, it just returns false,
> which indicates that we can't do an int conditional move in this case.

> +  /* Swap the comparison if isel can't handle it directly.  Don't generate int
> +     cmoves if we can't swap the condition code due to NaNs.  */

"swap" has a specific meaning for comparisons, and this isn't it (it
refers to swapping the arguments).

You can do the reverse condition for all codes that include UN just
fine.  reversed_comparison knows how to do this.
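
To make the terminology concrete, a small sketch in plain C++ (illustrative helper names, IEEE semantics assumed): swapping exchanges the operands and keeps the truth value, reversing negates the truth value.

```cpp
#include <cassert>
#include <limits>

bool lt(double a, double b) { return a < b; }

// Swapped: operands exchanged, code changed from LT to GT. Same result
// as lt for all inputs, NaNs included.
bool lt_swapped(double a, double b) { return b > a; }

// Reversed: the opposite truth value. For floating point this is
// "unordered or >=", which is not the same thing as plain >=.
bool lt_reversed(double a, double b) { return !(a < b); }

bool ge(double a, double b) { return a >= b; }
```

With a NaN operand, lt_swapped still matches lt, while lt_reversed is true even though ge is false.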

> +  enum rtx_code op_code = GET_CODE (op);
> +  if (op_code != LT && op_code != GT && op_code != LTU && op_code != GTU
> +  && op_code != EQ)

There are functions to test this.  Perhaps scc_comparison_operator
and exclude unordered?  But, this seems wrong, as said.

> -  switch (cond_code)
> -{
> -case LT: case GT: case LTU: case GTU: case EQ:
> -  /* isel handles these directly.  */
> -  break;

Ah, you got that from existing code.  Well, a good chance to improve
things, isn't it :-)

> new file mode 100644
> index 000..d1bfab23482
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr104254.f90
> @@ -0,0 +1,25 @@
> +! { dg-do compile }
> +! { dg-require-effective-target powerpc_p9vector_ok }
> +! { dg-options "-mdejagnu-cpu=power9 -O1 -fnon-call-exceptions" }
> +
> +! PR target/104254.  GCC would raise an assertion error if this program was

PR104256.


Segher


Re: [PATCH] c++: implicit 'this' in noexcept-spec within class tmpl [PR94944]

2022-02-17 Thread Jason Merrill via Gcc-patches

On 2/17/22 09:26, Patrick Palka wrote:

Here when instantiating the noexcept-spec we fail to resolve the
implicit object parameter for the call A<T>::f() ultimately because
maybe_instantiate_noexcept sets current_class_ptr/ref to the dependent
'this' (of type B<T>) rather than the specialized 'this' (of type B<int>).
This ends up causing maybe_dummy_object (called from
finish_qualified_id_expr) to return a dummy object instead of 'this'.

This patch corrects this by making maybe_instantiate_noexcept always set
current_class_ptr/ref to the specialized 'this', consistent with what
tsubst_function_type does when substituting into a trailing return type.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk perhaps?


OK.


PR c++/94944

gcc/cp/ChangeLog:

* pt.cc (maybe_instantiate_noexcept): For non-static member
functions, set current_class_ptr/ref to the specialized 'this'
instead.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept34.C: Adjust expected diagnostics.
* g++.dg/cpp0x/noexcept75.C: New test.
---
  gcc/cp/pt.cc| 19 +++
  gcc/testsuite/g++.dg/cpp0x/noexcept34.C |  8 
  gcc/testsuite/g++.dg/cpp0x/noexcept75.C | 17 +
  3 files changed, 28 insertions(+), 16 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept75.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 6dda66081bd..a7a524fe9fc 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -26139,20 +26139,15 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t 
complain)
  push_deferring_access_checks (dk_no_deferred);
  input_location = DECL_SOURCE_LOCATION (fn);
  
-	  if (!DECL_LOCAL_DECL_P (fn))

+ if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn)
+ && !DECL_LOCAL_DECL_P (fn))
{
  /* If needed, set current_class_ptr for the benefit of
-tsubst_copy/PARM_DECL.  The exception pattern will
-refer to the parm of the template, not the
-instantiation.  */
- tree tdecl = DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (fn));
- if (DECL_NONSTATIC_MEMBER_FUNCTION_P (tdecl))
-   {
- tree this_parm = DECL_ARGUMENTS (tdecl);
- current_class_ptr = NULL_TREE;
- current_class_ref = cp_build_fold_indirect_ref (this_parm);
- current_class_ptr = this_parm;
-   }
+tsubst_copy/PARM_DECL.  */
+ tree this_parm = DECL_ARGUMENTS (fn);
+ current_class_ptr = NULL_TREE;
+ current_class_ref = cp_build_fold_indirect_ref (this_parm);
+ current_class_ptr = this_parm;
}
  
  	  /* If this function is represented by a TEMPLATE_DECL, then

diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept34.C 
b/gcc/testsuite/g++.dg/cpp0x/noexcept34.C
index dce35652ef5..86129e7a520 100644
--- a/gcc/testsuite/g++.dg/cpp0x/noexcept34.C
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept34.C
@@ -7,13 +7,13 @@ template struct A
  {
constexpr int f () { return 0; }
bool b = true;
-  void g () noexcept (f()) { } // { dg-error "use of parameter" }
-  void g2 () noexcept (this->f()) { } // { dg-error "use of parameter" }
+  void g () noexcept (f()) { } // { dg-error ".this. is not a constant" }
+  void g2 () noexcept (this->f()) { } // { dg-error ".this. is not a constant" 
}
void g3 () noexcept (b) { } // { dg-error "use of .this. in a constant 
expression|use of parameter" }
void g4 (int i) noexcept (i) { } // { dg-error "use of parameter" }
-  void g5 () noexcept (A::f()) { } // { dg-error "use of parameter" }
+  void g5 () noexcept (A::f()) { } // { dg-error ".this. is not a constant" }
void g6 () noexcept (foo(b)) { } // { dg-error "use of .this. in a constant 
expression|use of parameter" }
-  void g7 () noexcept (int{f()}) { } // { dg-error "use of parameter" }
+  void g7 () noexcept (int{f()}) { } // { dg-error ".this. is not a constant" }
  };
  
  int main ()

diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept75.C 
b/gcc/testsuite/g++.dg/cpp0x/noexcept75.C
new file mode 100644
index 000..d746f4768d0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept75.C
@@ -0,0 +1,17 @@
+// PR c++/94944
+// { dg-do compile { target c++11 } }
+
+template
+struct A {
+  void f();
+};
+
+template
+struct B : A {
+  void g() noexcept(noexcept(A::f()));
+};
+
+int main() {
+  B b;
+  b.g();
+}




[pushed] c++: inlining explicit instantiations [PR104539]

2022-02-17 Thread Jason Merrill via Gcc-patches
The PR10968 fix cleared DECL_COMDAT to force output of explicit
instantiations.  Then the PR59469 fix added a call to mark_needed, after
which we no longer need to clear DECL_COMDAT, and leaving it set allows us
to inline explicit instantiations without worrying about symbol
interposition.

I suppose there's an argument to be made that an explicit instantiation
declaration (extern template) should clear DECL_COMDAT, since that suggests
that there will be only a single instantiation somewhere that could be
subject to interposition, but that doesn't change the 'inline' semantics,
and it seems cleaner to treat template instantiations uniformly.
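
For reference, the two forms being discussed look like this in source. The names f, h and g are illustrative only, and the sketch says nothing about what the optimizer decides; it just shows an explicit instantiation definition next to an explicit instantiation declaration.

```cpp
#include <cassert>

template <int N>
int f() { return N; }

// Explicit instantiation definition: f<0> is emitted in this
// translation unit.
template int f<0>();

template <int N>
int h() { return N + 10; }

// Explicit instantiation declaration ("extern template"): promises that
// h<0> is instantiated in some translation unit...
extern template int h<0>();
// ...and here that promise is kept in the same file, to keep the sketch
// self-contained.
template int h<0>();

// Calls that the inliner may or may not fold into g.
int g() { return f<0>() + h<0>() + 1; }
```

Calling g() returns 11 whether or not either call is inlined.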

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/104539

gcc/cp/ChangeLog:

* pt.cc (mark_decl_instantiated): Don't clear DECL_COMDAT.

gcc/testsuite/ChangeLog:

* g++.dg/ipa/inline-4.C: New test.
---
 gcc/cp/pt.cc|  3 ---
 gcc/testsuite/g++.dg/ipa/inline-4.C | 15 +++
 2 files changed, 15 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/ipa/inline-4.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index d4a40d517d1..352cff944d0 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -24726,9 +24726,6 @@ mark_decl_instantiated (tree result, int extern_p)
set correctly by tsubst.  */
 TREE_PUBLIC (result) = 1;
 
-  /* This might have been set by an earlier implicit instantiation.  */
-  DECL_COMDAT (result) = 0;
-
   if (extern_p)
 {
   DECL_EXTERNAL (result) = 1;
diff --git a/gcc/testsuite/g++.dg/ipa/inline-4.C 
b/gcc/testsuite/g++.dg/ipa/inline-4.C
new file mode 100644
index 000..204aa7a366e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/inline-4.C
@@ -0,0 +1,15 @@
+// PR c++/104539
+// { dg-additional-options "-O3 -fdump-ipa-inline" }
+// { dg-final { scan-ipa-dump-not "overwritten at link time" "inline" } }
+
+template 
+//inline
+int f() {
+  return 0;
+}
+
+template int f<0>();
+
+int g() {
+  return f<0>() + 1;
+}

base-commit: c352ef0ed90cfc07d494dfec21bc683e337b
-- 
2.27.0



[pushed] tree: tweak warn_deprecated_use

2022-02-17 Thread Jason Merrill via Gcc-patches
While looking at PR90451 I noticed that this function was failing to find the
attributes if called with a variant of the struct.  And we were doing a
redundant lookup_attribute.

Tested x86_64-pc-linux-gnu, applying to trunk as obvious.

gcc/ChangeLog:

* tree.cc (warn_deprecated_use): Look for TYPE_STUB_DECL
on TYPE_MAIN_VARIANT.

gcc/testsuite/ChangeLog:

* g++.dg/warn/deprecated-16.C: New test.
---
 gcc/testsuite/g++.dg/warn/deprecated-16.C | 2 ++
 gcc/tree.cc   | 8 +---
 2 files changed, 7 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-16.C

diff --git a/gcc/testsuite/g++.dg/warn/deprecated-16.C 
b/gcc/testsuite/g++.dg/warn/deprecated-16.C
new file mode 100644
index 000..8d1f4191270
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/deprecated-16.C
@@ -0,0 +1,2 @@
+struct __attribute((deprecated ("foo"))) A { }; // { dg-message "declared" }
+void f(const A&) { }   // { dg-warning "deprecated: foo" }
diff --git a/gcc/tree.cc b/gcc/tree.cc
index dd919ff0717..2bbef2d6b75 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -12047,10 +12047,12 @@ warn_deprecated_use (tree node, tree attr)
attr = DECL_ATTRIBUTES (node);
   else if (TYPE_P (node))
{
- tree decl = TYPE_STUB_DECL (node);
+ tree decl = TYPE_STUB_DECL (TYPE_MAIN_VARIANT (node));
  if (decl)
-   attr = lookup_attribute ("deprecated",
-TYPE_ATTRIBUTES (TREE_TYPE (decl)));
+   {
+ node = TREE_TYPE (decl);
+ attr = TYPE_ATTRIBUTES (node);
+   }
}
 }
 

base-commit: c352ef0ed90cfc07d494dfec21bc683e337b
-- 
2.27.0



[committed] libstdc++: Make std::error_code printer more robust

2022-02-17 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux, pushed to trunk. The StdErrorCodePrinter that
crashes GDB is on gcc-11 too so this should be backported there.

-- >8 --

This attempts to implement a partial workaround for the GDB bug
https://sourceware.org/bugzilla/show_bug.cgi?id=28856 which causes GDB
to crash when printing a frame with a std::error_code argument.

By recognising the known error categories defined in the library and
hardcoding their names we do not need to call cat->name() on the
category.  This has the additional benefit of also working when
debugging a core file rather than a running process. For those known
categories we can also cast the int value to the corresponding error
code enum (e.g. future_errc) so that we show an enumerator instead of
just an integer.

For program-defined categories we just use the name of the dynamic type
to identify the category, and print the value as an integer. Once the
GDB bug is fixed and the virtual name() function can be called safely,
that would be preferable. For now it's better to have an imperfect
printer that doesn't crash GDB.

This rewritten StdErrorCodePrinter needs gdb.Value.dynamic_type, so is
only registered if that is supported, which means GDB 7.7 and later.
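
As a reminder of what the printer is reproducing, here is the same information obtained from inside the program (the two helper functions are made up for illustration; the category names are the ones the C++ standard specifies):

```cpp
#include <cassert>
#include <future>
#include <ios>
#include <string>
#include <system_error>

// What the old printer obtained by calling the virtual name() in the
// inferior, and what the new one hardcodes for the known categories.
std::string category_name(const std::error_code &ec) {
  return ec.category().name();
}

// For a known category the stored int can be cast back to its enum,
// which is how an enumerator can be shown instead of a bare number.
std::future_errc as_future_errc(const std::error_code &ec) {
  return static_cast<std::future_errc>(ec.value());
}
```

For example, std::make_error_code(std::future_errc::broken_promise) has the category name "future", and casting its value back gives std::future_errc::broken_promise again.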

libstdc++-v3/ChangeLog:

* python/libstdcxx/v6/printers.py (StdErrorCodePrinter): Replace
code that calls cat->name() on std::error_category objects.
Identify known categories by symbol name and use a hardcoded
name. Print error code values as enumerators where appropriate.
* testsuite/libstdc++-prettyprinters/cxx11.cc: Adjust expected
name of custom category. Check io_errc and future_errc errors.
---
 libstdc++-v3/python/libstdcxx/v6/printers.py  | 112 +++---
 .../libstdc++-prettyprinters/cxx11.cc |  10 +-
 2 files changed, 102 insertions(+), 20 deletions(-)

diff --git a/libstdc++-v3/python/libstdcxx/v6/printers.py 
b/libstdc++-v3/python/libstdcxx/v6/printers.py
index b3f4956381b..f7a7f9961a7 100644
--- a/libstdc++-v3/python/libstdcxx/v6/printers.py
+++ b/libstdc++-v3/python/libstdcxx/v6/printers.py
@@ -1518,41 +1518,113 @@ class StdCmpCatPrinter:
 class StdErrorCodePrinter:
 "Print a std::error_code or std::error_condition"
 
-_errno_categories = None # List of categories that use errno values
+_system_is_posix = None  # Whether std::system_category() use errno values.
 
 def __init__ (self, typename, val):
 self.val = val
 self.typename = strip_versioned_namespace(typename)
 # Do this only once ...
-if StdErrorCodePrinter._errno_categories is None:
-StdErrorCodePrinter._errno_categories = ['generic']
+if StdErrorCodePrinter._system_is_posix is None:
 try:
 import posix
-StdErrorCodePrinter._errno_categories.append('system')
+StdErrorCodePrinter._system_is_posix = True
 except ImportError:
-pass
+StdErrorCodePrinter._system_is_posix = False
 
 @staticmethod
-def _category_name(cat):
-"Call the virtual function that overrides std::error_category::name()"
-gdb.set_convenience_variable('__cat', cat)
-return gdb.parse_and_eval('$__cat->name()').string()
+def _find_errc_enum(name):
+typ = gdb.lookup_type(name)
+if typ is not None and typ.code == gdb.TYPE_CODE_ENUM:
+return typ
+return None
+
+@classmethod
+def _match_net_ts_category(cls, cat):
+net_cats = ['stream', 'socket', 'ip::resolver']
+for c in net_cats:
+func = c + '_category()'
+for ns in ['', _versioned_namespace]:
+ns = 'std::{}experimental::net::v1'.format(ns)
+sym = gdb.lookup_symbol('{}::{}::__c'.format(ns, func))[0]
+if sym is not None:
+if cat == sym.value().address:
+name = 'net::' + func
+enum = cls._find_errc_enum('{}::{}_errc'.format(ns, c))
+return (name, enum)
+return (None, None)
+
+@classmethod
+def _category_info(cls, cat):
+"Return details of a std::error_category"
+
+name = None
+enum = None
+is_errno = False
+
+# Try these first, or we get "warning: RTTI symbol not found" when
+# using cat.dynamic_type on the local class types for Net TS categories.
+func, enum = cls._match_net_ts_category(cat)
+if func is not None:
+return (None, func, enum, is_errno)
+
+# This might give a warning for a program-defined category defined as
+# a local class, but there doesn't seem to be any way to avoid that.
+typ = cat.dynamic_type.target()
+# Shortcuts for the known categories defined by libstdc++.
+if typ.tag.endswith('::generic_error_category'):
+name = 'generic'
+is_errno = True
+if 

Re: [PATCH, rs6000] Clean up Power10 fusion options

2022-02-17 Thread Segher Boessenkool
Hi!

On Fri, Jan 28, 2022 at 12:03:09PM -0600, Pat Haugen wrote:
> Mark Power10 fusion option undocumented and remove sub-options.

> gcc/
>   * config/rs6000/rs6000.opt (mpower10-fusion): Mark Undocumented.
>   (mpower10-fusion-ld-cmpi, mpower10-fusion-2logical,
>   mpower10-fusion-logical-add, mpower10-fusion-add-logical,
>   mpower10-fusion-2add, mpower10-fusion-2store): Remove.
>   * config/rs6000/rs6000-cpus.def (ISA_3_1_MASKS_SERVER,
>   OTHER_P9_VECTOR_MASKS): Remove Power10 fusion sub-options.
>   * config/rs6000/rs6000.cc (rs6000_option_override_internal,
>   power10_sched_reorder): Likewise.
>   * config/rs6000/genfusion.pl (gen_ld_cmpi_p10, gen_logical_addsubf,
>   gen_addadd): Likewise
>   * config/rs6000/fusion.md: Regenerate.

>/* Try to pair certain store insns to adjacent memory locations
>   so that the hardware will fuse them to a single operation.  */
> -  if (TARGET_P10_FUSION && TARGET_P10_FUSION_2STORE
> +  if (TARGET_P10_FUSION
>&& is_fusable_store (last_scheduled_insn, ))

Please fit that on one line now :-)

Okay for trunk with that triviality.  Thanks!


Segher


[pushed] c++: avoid duplicate deprecated warning [PR90451]

2022-02-17 Thread Jason Merrill via Gcc-patches
We were getting the deprecated warning twice for the same call because we
called mark_used first in finish_qualified_id_expr and then again in
build_over_call.  Let's not call it the first time; C++17 clarified that a
function is used only when it is selected from an overload set, which
happens later.

Then I had to add a few more uses in places that don't do anything further
with the expression (convert_to_void, finish_decltype_type), and places that
use the expression more unusually (cp_build_addr_expr_1,
convert_nontype_argument).  The new mark_single_function is mostly so
that I only have to put the comment in one place.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/90451

gcc/cp/ChangeLog:

* decl2.cc (mark_single_function): New.
* cp-tree.h: Declare it.
* typeck.cc (cp_build_addr_expr_1): mark_used when making a PMF.
* semantics.cc (finish_qualified_id_expr): Not here.
(finish_id_expression_1): Or here.
(finish_decltype_type): Call mark_single_function.
* cvt.cc (convert_to_void): And here.
* pt.cc (convert_nontype_argument): And here.
* init.cc (build_offset_ref): Adjust assert.

gcc/testsuite/ChangeLog:

* g++.dg/warn/deprecated-14.C: New test.
* g++.dg/warn/deprecated-15.C: New test.
---
 gcc/cp/cp-tree.h  |  1 +
 gcc/cp/cvt.cc |  3 +
 gcc/cp/decl2.cc   | 23 
 gcc/cp/init.cc|  5 +-
 gcc/cp/pt.cc  |  4 ++
 gcc/cp/semantics.cc   | 22 ++-
 gcc/cp/typeck.cc  |  6 ++
 gcc/testsuite/g++.dg/warn/deprecated-14.C | 72 +++
 gcc/testsuite/g++.dg/warn/deprecated-15.C | 14 +
 9 files changed, 132 insertions(+), 18 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-14.C
 create mode 100644 gcc/testsuite/g++.dg/warn/deprecated-15.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index f253b32c3f2..37d462fca6e 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6930,6 +6930,7 @@ extern void no_linkage_error  (tree);
 extern void check_default_args (tree);
 extern bool mark_used  (tree);
 extern bool mark_used  (tree, tsubst_flags_t);
+extern bool mark_single_function   (tree, tsubst_flags_t);
 extern void finish_static_data_member_decl (tree, tree, bool, tree, int);
 extern tree cp_build_parm_decl (tree, tree, tree);
 extern void copy_linkage   (tree, tree);
diff --git a/gcc/cp/cvt.cc b/gcc/cp/cvt.cc
index e9803c1be31..53aa41368fe 100644
--- a/gcc/cp/cvt.cc
+++ b/gcc/cp/cvt.cc
@@ -1482,6 +1482,9 @@ convert_to_void (tree expr, impl_conv_void implicit, tsubst_flags_t complain)
 default:;
 }
   expr = resolve_nondeduced_context (expr, complain);
+  if (!mark_single_function (expr, complain))
+return error_mark_node;
+
   {
 tree probe = expr;
 
diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index c6bfcfe631a..2e58419ea51 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -5718,6 +5718,29 @@ decl_dependent_p (tree decl)
   return false;
 }
 
+/* [basic.def.odr] A function is named [and therefore odr-used] by an
+   expression or conversion if it is the selected member of an overload set in
+   an overload resolution performed as part of forming that expression or
+   conversion, unless it is a pure virtual function and either the expression
+   is not an id-expression naming the function with an explicitly qualified
+   name or the expression forms a pointer to member.
+
+   Mostly, we call mark_used in places that actually do something with a
+   function, like build_over_call.  But in a few places we end up with a
+   non-overloaded FUNCTION_DECL that we aren't going to do any more with, like
+   convert_to_void.  resolve_nondeduced_context is called in those places,
+   but it's also called in too many other places.  */
+
+bool
+mark_single_function (tree expr, tsubst_flags_t complain)
+{
+  if (is_overloaded_fn (expr) == 1
+  && !mark_used (expr, complain)
+  && (complain & tf_error))
+return false;
+  return true;
+}
+
 /* Mark DECL (either a _DECL or a BASELINK) as "used" in the program.
If DECL is a specialization or implicitly declared class member,
generate the actual definition.  Return false if something goes
diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index fcb255f1ac7..545d904c0f9 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -2362,8 +2362,9 @@ build_offset_ref (tree type, tree member, bool address_p,
 return error_mark_node;
 
   gcc_assert (DECL_P (member) || BASELINK_P (member));
-  /* Callers should call mark_used before this point.  */
-  gcc_assert (!DECL_P (member) || TREE_USED (member));
+  /* Callers should call mark_used before this point, except for functions.  */
+  

PING: [PATCH, rs6000] Clean up Power10 fusion options

2022-02-17 Thread Pat Haugen via Gcc-patches

Ping.

On 1/28/22 12:03 PM, Pat Haugen via Gcc-patches wrote:

Mark Power10 fusion option undocumented and remove sub-options.

Bootstrapped and regression tested on powerpc64le(Power10).
Ok for master?

-Pat


2022-01-28  Pat Haugen  

gcc/
* config/rs6000/rs6000.opt (mpower10-fusion): Mark Undocumented.
(mpower10-fusion-ld-cmpi, mpower10-fusion-2logical,
mpower10-fusion-logical-add, mpower10-fusion-add-logical,
mpower10-fusion-2add, mpower10-fusion-2store): Remove.
* config/rs6000/rs6000-cpus.def (ISA_3_1_MASKS_SERVER,
OTHER_P9_VECTOR_MASKS): Remove Power10 fusion sub-options.
* config/rs6000/rs6000.cc (rs6000_option_override_internal,
power10_sched_reorder): Likewise.
* config/rs6000/genfusion.pl (gen_ld_cmpi_p10, gen_logical_addsubf,
gen_addadd): Likewise
* config/rs6000/fusion.md: Regenerate.


[PATCH] c++: memory corruption during name lookup w/ modules [PR99479]

2022-02-17 Thread Patrick Palka via Gcc-patches
name_lookup::search_unqualified uses a statically allocated vector
in order to avoid repeated reallocation, under the assumption that
the function can't be called recursively.  With modules however,
this assumption turns out to be false, and search_unqualified can
be called recursively as demonstrated by testcase in comment #19
of PR99479[1] where the recursive call causes the vector to get
reallocated which invalidates the reference held by the parent call.

This patch makes search_unqualified instead use an auto_vec with 16
elements of internal storage (since with the various libraries I tested,
the size of the vector never exceeded 12).  In turn we can simplify the
API of subroutines to take the vector by reference and return void.
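
The shape of the fix can be sketched outside of GCC.  This is only an
illustration in standard C++, not the name-lookup code: Scope, queue_scope
and search are hypothetical names, and std::vector stands in for auto_vec
(reserve(16) is a loose analogue of the 16 elements of internal storage).
Each call owns its queue and the helper takes it by reference, so a
re-entrant call cannot reallocate storage that an outer call still refers to.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a namespace: an id plus the scopes it pulls in
// through using-directives.
struct Scope {
  int id;
  std::vector<const Scope *> usings;
};

// Append SCOPE, then everything reachable through its using-directives, to
// QUEUE.  The queue is passed by reference and nothing is returned, so no
// caller holds a pointer that must be refreshed after the vector grows.
// (No cycle detection here; this sketch assumes an acyclic graph.)
void queue_scope(std::vector<int> &queue, const Scope &scope) {
  queue.push_back(scope.id);
  for (const Scope *u : scope.usings) {
    assert(u != nullptr);
    queue_scope(queue, *u);
  }
}

// Every call gets a fresh local queue instead of sharing one static vector,
// which is what makes the function safe to re-enter.
std::vector<int> search(const Scope &scope) {
  std::vector<int> queue;
  queue.reserve(16);
  queue_scope(queue, scope);
  return queue;
}
```

With three scopes where 1 uses 2 and 3, and 2 uses 3, search on scope 1
yields the ids 1, 2, 3, 3 in that order.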

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

[1]: https://gcc.gnu.org/PR99479#c19

PR c++/99479

gcc/cp/ChangeLog:

* name-lookup.cc (name_lookup::using_queue): Change to an
auto_vec (with 16 elements of internal storage).
(name_lookup::queue_namespace): Change return type to void,
take queue parameter by reference and adjust function body
accordingly.
(name_lookup::do_queue_usings): Inline into ...
(name_lookup::queue_usings): ... here.  As in queue_namespace.
(name_lookup::search_unqualified): Don't make queue static,
assume its incoming length is 0, and adjust function body
accordingly.
---
 gcc/cp/name-lookup.cc | 62 +++
 1 file changed, 22 insertions(+), 40 deletions(-)

diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index 93c4eb7193b..5c965d6fba1 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -429,7 +429,7 @@ class name_lookup
 {
 public:
   typedef std::pair<tree, tree> using_pair;
-  typedef vec<using_pair, va_heap, vl_embed> using_queue;
+  typedef auto_vec<using_pair, 16> using_queue;
 
 public:
   tree name;   /* The identifier being looked for.  */
@@ -528,16 +528,8 @@ private:
   bool search_usings (tree scope);
 
 private:
-  using_queue *queue_namespace (using_queue *queue, int depth, tree scope);
-  using_queue *do_queue_usings (using_queue *queue, int depth,
-   vec *usings);
-  using_queue *queue_usings (using_queue *queue, int depth,
-vec *usings)
-  {
-if (usings)
-  queue = do_queue_usings (queue, depth, usings);
-return queue;
-  }
+  void queue_namespace (using_queue& queue, int depth, tree scope);
+  void queue_usings (using_queue& queue, int depth, vec *usings);
 
 private:
   void add_fns (tree);
@@ -1084,39 +1076,35 @@ name_lookup::search_qualified (tree scope, bool usings)
 /* Add SCOPE to the unqualified search queue, recursively add its
inlines and those via using directives.  */
 
-name_lookup::using_queue *
-name_lookup::queue_namespace (using_queue *queue, int depth, tree scope)
+void
+name_lookup::queue_namespace (using_queue& queue, int depth, tree scope)
 {
   if (see_and_mark (scope))
-return queue;
+return;
 
   /* Record it.  */
   tree common = scope;
   while (SCOPE_DEPTH (common) > depth)
 common = CP_DECL_CONTEXT (common);
-  vec_safe_push (queue, using_pair (common, scope));
+  queue.safe_push (using_pair (common, scope));
 
   /* Queue its inline children.  */
   if (vec *inlinees = DECL_NAMESPACE_INLINEES (scope))
 for (unsigned ix = inlinees->length (); ix--;)
-  queue = queue_namespace (queue, depth, (*inlinees)[ix]);
+  queue_namespace (queue, depth, (*inlinees)[ix]);
 
   /* Queue its using targets.  */
-  queue = queue_usings (queue, depth, NAMESPACE_LEVEL (scope)->using_directives);
-
-  return queue;
+  queue_usings (queue, depth, NAMESPACE_LEVEL (scope)->using_directives);
 }
 
 /* Add the namespaces in USINGS to the unqualified search queue.  */
 
-name_lookup::using_queue *
-name_lookup::do_queue_usings (using_queue *queue, int depth,
- vec *usings)
+void
+name_lookup::queue_usings (using_queue& queue, int depth, vec *usings)
 {
-  for (unsigned ix = usings->length (); ix--;)
-queue = queue_namespace (queue, depth, (*usings)[ix]);
-
-  return queue;
+  if (usings)
+for (unsigned ix = usings->length (); ix--;)
+  queue_namespace (queue, depth, (*usings)[ix]);
 }
 
 /* Unqualified namespace lookup in SCOPE.
@@ -1128,15 +1116,12 @@ name_lookup::do_queue_usings (using_queue *queue, int depth,
 bool
 name_lookup::search_unqualified (tree scope, cp_binding_level *level)
 {
-  /* Make static to avoid continual reallocation.  We're not
- recursive.  */
-  static using_queue *queue = NULL;
+  using_queue queue;
   bool found = false;
-  int length = vec_safe_length (queue);
 
   /* Queue local using-directives.  */
   for (; level->kind != sk_namespace; level = level->level_chain)
-queue = queue_usings (queue, SCOPE_DEPTH (scope), level->using_directives);
+queue_usings (queue, SCOPE_DEPTH (scope), level->using_directives);
 
   for (; !found; 

Re: [PATCH] rs6000: Workaround for new ifcvt behavior [PR104335]

2022-02-17 Thread Robin Dapp via Gcc-patches
> Please send patches as plain text, not as base64.

It seems that recent versions of Thunderbird no longer support this,
grml.  I probably need to look for another mail client.

> Why that first test?  XEXP (op, 0) is required to not be nil.
> 
> The patch is okay without that (if it passes testing of course :-) )
> Thanks!

Pushed without the XEXP test.  Hope this fixes things for you.

Regards
 Robin


[PATCH] Don't do int cmoves for IEEE comparisons, PR target/104256.

2022-02-17 Thread Michael Meissner via Gcc-patches
Don't do int cmoves for IEEE comparisons, PR target/104256.

Protect int cmove from raising an assertion if it is trying to do an int
conditional move where the test involves floating point comparisons that
can't easily be reversed due to NaNs.

The code used to generate the condition, and possibly reverse the condition if
ISEL could not handle it by rewriting the OP in the comparison rtx.

Unfortunately there are some conditions like UNLE that can't easily be reversed
due to NaNs.

The patch changes the code so that it does the reversal before generating the
comparison.  If the comparison cannot be reversed, it just returns false,
which indicates that we can't do an int conditional move in this case.
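
Why NaNs get in the way can be shown with ordinary IEEE comparisons.  The
sketch below is plain C++, not the RTL code; le, gt and unle are hypothetical
helper names.  With a NaN operand both "a <= b" and "a > b" are false, so
replacing one by the other is not a valid reversal, and the "unordered or
less-equal" predicate (what RTL calls UNLE) is a distinct condition.

```cpp
#include <cassert>  // assert, for the checks described in the note below
#include <cmath>

// Ordered "less than or equal": false whenever either operand is a NaN.
bool le(double a, double b) { return a <= b; }

// Ordered "greater than": also false whenever either operand is a NaN.
bool gt(double a, double b) { return a > b; }

// "Unordered or less than or equal": true if either operand is a NaN,
// or if a <= b holds.
bool unle(double a, double b) { return std::isunordered(a, b) || a <= b; }
```

For ordinary numbers le and gt are exact opposites, but for a NaN operand
le and gt are both false while unle is true.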

I have tested this on a little endian power9 system doing a bootstrap.  There
were no regressions.  Can I install this in the trunk?  Neither GCC 10 nor GCC
11 seems to generate an assertion failure, so I don't plan to backport it.

2022-02-17  Michael Meissner  

gcc/
PR target/104256
* config/rs6000/rs6000.cc (rs6000_emit_int_cmove): Don't do
integer conditional moves if the test needs to be reversed and
there isn't a direct reverse comparison.

gcc/testsuite/
PR target/104256
* gcc.target/powerpc/ppc-fortran/pr104254.f90: New test.
---
 gcc/config/rs6000/rs6000.cc   | 36 ++-
 .../powerpc/ppc-fortran/pr104254.f90  | 25 +
 2 files changed, 44 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr104254.f90

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index f56cf66313a..15d324d13aa 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -16174,18 +16174,35 @@ rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
 {
   rtx condition_rtx, cr;
   machine_mode mode = GET_MODE (dest);
-  enum rtx_code cond_code;
   rtx (*isel_func) (rtx, rtx, rtx, rtx, rtx);
   bool signedp;
 
   if (mode != SImode && (!TARGET_POWERPC64 || mode != DImode))
 return false;
 
+  /* Swap the comparison if isel can't handle it directly.  Don't generate int
+ cmoves if we can't swap the condition code due to NaNs.  */
+  enum rtx_code op_code = GET_CODE (op);
+  if (op_code != LT && op_code != GT && op_code != LTU && op_code != GTU
+  && op_code != EQ)
+{
+  if (!COMPARISON_P (op))
+   return false;
+
+  enum rtx_code rev_code = reverse_condition (op_code);
+  if (rev_code == UNKNOWN)
+   return false;
+
+  std::swap (false_cond, true_cond);
+  op = gen_rtx_fmt_ee (rev_code, GET_MODE (op),
+  XEXP (op, 0),
+  XEXP (op, 1));
+}
+
   /* We still have to do the compare, because isel doesn't do a
  compare, it just looks at the CRx bits set by a previous compare
  instruction.  */
   condition_rtx = rs6000_generate_compare (op, mode);
-  cond_code = GET_CODE (condition_rtx);
   cr = XEXP (condition_rtx, 0);
   signedp = GET_MODE (cr) == CCmode;
 
@@ -16193,21 +16210,6 @@ rs6000_emit_int_cmove (rtx dest, rtx op, rtx true_cond, rtx false_cond)
   ? (signedp ? gen_isel_signed_si : gen_isel_unsigned_si)
   : (signedp ? gen_isel_signed_di : gen_isel_unsigned_di));
 
-  switch (cond_code)
-{
-case LT: case GT: case LTU: case GTU: case EQ:
-  /* isel handles these directly.  */
-  break;
-
-default:
-  /* We need to swap the sense of the comparison.  */
-  {
-   std::swap (false_cond, true_cond);
-   PUT_CODE (condition_rtx, reverse_condition (cond_code));
-  }
-  break;
-}
-
   false_cond = force_reg (mode, false_cond);
   if (true_cond != const0_rtx)
 true_cond = force_reg (mode, true_cond);
diff --git a/gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr104254.f90 b/gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr104254.f90
new file mode 100644
index 000..d1bfab23482
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/ppc-fortran/pr104254.f90
@@ -0,0 +1,25 @@
+! { dg-do compile }
+! { dg-require-effective-target powerpc_p9vector_ok }
+! { dg-options "-mdejagnu-cpu=power9 -O1 -fnon-call-exceptions" }
+
+! PR target/104254.  GCC would raise an assertion error if this program was
+! compiled with -O1 and -fnon-call-exceptions on a power9 or higher.  The issue
+! occurs because at this optimization level, the compiler is trying to make
+! a conditional move to store integers using a 32-bit floating point compare.
+! It wants to use UNLE, which is not supported for integer modes.
+  
+  real :: a(2), nan
+  real, allocatable :: c(:)
+  integer :: ia(1)
+
+  nan = 0.0
+  nan = 0.0/nan
+
+  a(:) = nan
+  ia = maxloc (a)
+  if (ia(1).ne.1) STOP 1
+
+  allocate (c(1))
+  c(:) = nan
+  deallocate (c)
+end
-- 
2.35.1


-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com


libgo patch committed: Add hurd build tag for setReadMsgCloseOnExec

2022-02-17 Thread Ian Lance Taylor via Gcc-patches
This libgo patch, from Svante Signell, adds a hurd build tag for
setReadMsgCloseOnExec.  This fixes GCC PRs 103573 and 104290.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu (which
doesn't test much).  Committed to mainline.

Ian
8ec374f329b72e640bffe3abf8c082f9a287adb3
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 3742414c828..1fdc5a95d44 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-3742e8a154bfec805054b4ebf0809f12dc7694da
+90ed127ef053b758288af9c4e43473e257770bc3
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/libgo/go/net/unixsock_readmsg_cloexec.go b/libgo/go/net/unixsock_readmsg_cloexec.go
index fa4fd7d9331..84479e58d65 100644
--- a/libgo/go/net/unixsock_readmsg_cloexec.go
+++ b/libgo/go/net/unixsock_readmsg_cloexec.go
@@ -2,7 +2,7 @@
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-//go:build aix || darwin || freebsd || solaris
+//go:build aix || darwin || freebsd || hurd || solaris
 
 package net
 


Re: [PATCH] rs6000: __Uglify non-uglified local variables in headers

2022-02-17 Thread Segher Boessenkool
On Wed, Feb 16, 2022 at 09:05:05PM -0600, Paul A. Clarke wrote:
> Properly prefix (with "__")  all local variables in shipped headers for x86
> compatibility intrinsics implementations.  This avoids possible problems with
> usages like:
> ```
> #define result X
> #include 
> ```
> 
> 2021-02-16  Paul A. Clarke  
> 
> gcc
>   PR target/104257
>   * config/rs6000/bmi2intrin.h: Uglify local variables.
>   * /config/rs6000/emmintrin.h: Likewise.
>   * /config/rs6000/mm_malloc.h: Likewise.
>   * /config/rs6000/mmintrin.h: Likewise.
>   * /config/rs6000/pmmintrin.h: Likewise.
>   * /config/rs6000/smmintrin.h: Likewise.
>   * /config/rs6000/tmmintrin.h: Likewise.
>   * /config/rs6000/xmmintrin.h: Likewise.

Okay for trunk.  Thanks!  Do you want to backport this as well?  That's
preapproved (if you think it is useful).


Segher
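
The clash described in the quoted patch description can be sketched as
follows.  This is an illustration only: add_one is a hypothetical stand-in
for an inline function in a shipped header, and the double-underscore names
are used purely to mimic one (such names are reserved for the
implementation, so ordinary user code should not declare them).

```cpp
#include <cassert>

// A user is entitled to define this before including a system header.
#define result 42

// Hypothetical stand-in for an inline function from a shipped header.  Had
// the local variable been spelled `result`, the macro above would have
// turned its declaration into `int 42 = ...` and the header would no longer
// compile.  The reserved-style spelling is out of reach of user macros.
static inline int add_one(int __x) {
  int __sum = __x + 1;
  return __sum;
}

// Drop the macro again so the rest of the translation unit is unaffected.
#undef result
```

The function compiles and behaves normally despite the macro, for example
add_one(1) evaluates to 2.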


[Patch] nvptx: Add -mptx=6.0 + -misa=sm_70

2022-02-17 Thread Tobias Burnus

This patch exposes two -m* option values which are already
internally available. I think it makes sense to expose them
explicitly to the user (see below), but there are also arguments
against. Thoughts?


PTX version (-mptx=)
[patch adds -mptx=6.0 as option]

* Currently supported internally are 3.1 (CUDA 5.0, used by GCC <= 11),
  6.0 (CUDA 9.0, current GCC 12 default), 6.3 (CUDA 10.0), 7.0 (CUDA 11.0)
* -mptx= supports 3.1, 6.3, 7.0 – but not the internal default 6.0

First, I think all versions make sense:
* 3.1 is the previous default and permits running with older CUDA (if needed)
* 6.0 is for CUDA 9 - and if we want to support it, it has to stay.
  6.0 is the default since commit
  https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590007.html
* 6.3 is CUDA 10.0. In that PTX version, a lot of nice features
  were added like .alias
* 7.0 is CUDA 11.0. This adds support for sm_80 (honored in code gen).

PTX >= 6.0 makes sense as it permits newer sm_* (in particular: sm_53 and sm_70)
and
+  /* Pick at least 6.0, to enable using bar.warp.sync to have a way to force
+ warp convergence.  */
On the other hand, for older systems, CUDA 10.0 might be too new and we still
want to support CUDA 9. (At least that's how I understood one of the nvptx GCC
emails, which I cannot find at the moment.)

Assuming we don't want to change the default minimal version from PTX 6.0
back to 6.3, it looks as if both should stay.
Downside: we probably need one lib{c,gomp,gfortran,...} per PTX version,
i.e. 4 versions (3.1, 6.0, 6.3, 7.0).

I think it makes sense to expose the 6.0 value to the user and not
only use it internally behind the scenes. As it is already used internally,
the change is tiny but user visible. Thus, it has to stay when we will
bump the default in later GCC versions; on the other hand, if we bump
the default, it might be also a good reason to have it to permit the
user to have a backward compatible PTX output for linking libraries.

 * * *

SM version (-misa=)
[Patch adds -misa=sm_70]

* The compiler supports internally: SM_30, SM_35, SM_53, SM_70, SM_75, SM_80.
* GCC <= 11 only had sm_30 and sm_35 (supported since PTX 3.1/CUDA 5.0)
* GCC 12 exposes
  - sm_30, sm_35,
  - sm_53 (PTX 4.2, CUDA 7.0),
  - sm_75 (PTX 6.3, CUDA 10.0)
  - sm_80 (PTX 7.0, CUDA 11.0)
  but it does not permit using -misa=sm_70 (PTX 6.0, CUDA 9.0).
* Note: sm_75 + sm_80 imply a newer PTX version, which
  the compiler defaults to (if no -mptx= has been specified).

I think it makes sense to have sm_70 in addition:
* sm_70 enables several new features (see PTX documentation)
* sm_70 is the highest supported for CUDA 9 (default PTX version);
  as sm_75 will require CUDA 10, currently only sm_53 can be used with CUDA 9.
* The current code actually does generate different code for >= sm_70
  already.
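
The SM/PTX pairings listed above can be written down as a small lookup.
This is only a sketch of the table from this message, in plain C++;
first_ptx_for_sm is a hypothetical name and is not the signature of the
function in nvptx.cc.

```cpp
#include <cassert>  // assert, for the checks described in the note below
#include <string>

// Hypothetical helper: the first PTX version that supports a given -misa=
// value, using only the pairings stated in the message above.  Returns an
// empty string for values the message does not list.
std::string first_ptx_for_sm(const std::string &sm) {
  if (sm == "sm_30" || sm == "sm_35") return "3.1";
  if (sm == "sm_53") return "4.2";
  if (sm == "sm_70") return "6.0";
  if (sm == "sm_75") return "6.3";
  if (sm == "sm_80") return "7.0";
  return "";
}
```

Under this table sm_70 is the newest value that still fits PTX 6.0 (CUDA 9),
while sm_75 already needs PTX 6.3.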

 * * *

This patch updates -misa= and -mptx= documentation to match what actually has
been implemented. I think that makes sense as:
* The currently documented default for -mptx= is no longer true.
* The available values are already exposed via the diagnostic
* The multilib issue already occurs when the user explicitly specifies -mptx=6.3
  (or -mptx=3.1).
* If needed, we could note that certain PTX or ISA values are experimental.

I think that, apart from > sm_35 being experimental, there is no reason that
higher sm_* should not be used, except for the pre-existing multilib issue and
the ICE when bootstrapping with sm_53 (instead of sm_35) as default ISA version.
But that's solved by Roger's patch (pending ME (and then BE) review),
https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590545.html

* * *

Comments to any of those three patches (-mptx=6.0, -misa=sm_70, documentation)?
(Lightly tested on x86-64 with nvptx offloading.)
OK? (All, some?)

Tobias
nvptx: Add -mptx=6.0 + -misa=sm_70

gcc/ChangeLog:

	* config/nvptx/nvptx-c.cc (nvptx_cpu_cpp_builtins): Handle SM70.
	* gcc/config/nvptx/nvptx.cc (first_ptx_version_supporting_sm):
	Likewise.
	* config/nvptx/nvptx.opt (misa): Add sm_70 alias PTX_ISA_SM70.
	(mptx): Add 6.0 alias PTX_VERSION_6_0.
	* config/nvptx/t-omp-device: Add sm_53, sm_70, sm_75, sm_80.
	* doc/invoke.texi (-misa, -mptx): Update for new values and
	defaults.

 gcc/config/nvptx/nvptx-c.cc   |  2 ++
 gcc/config/nvptx/nvptx.cc |  2 ++
 gcc/config/nvptx/nvptx.opt|  6 ++
 gcc/config/nvptx/t-omp-device |  2 +-
 gcc/doc/invoke.texi   | 17 +++--
 5 files changed, 22 insertions(+), 7 deletions(-)

diff --git a/gcc/config/nvptx/nvptx-c.cc b/gcc/config/nvptx/nvptx-c.cc
index d68b9910d7e..b2375fb5b16 100644
--- a/gcc/config/nvptx/nvptx-c.cc
+++ b/gcc/config/nvptx/nvptx-c.cc
@@ -43,6 +43,8 @@ nvptx_cpu_cpp_builtins (void)
 cpp_define 

[committed] [PR104447] LRA: Do not split non-alloc hard regs

2022-02-17 Thread Vladimir Makarov via Gcc-patches

The patch solves the following PR:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104447

The patch was successfully bootstrapped and tested on x86-64.

commit db69f666a728ce800a840115829f6b64bc3174d2
Author: Vladimir N. Makarov 
Date:   Thu Feb 17 11:31:50 2022 -0500

[PR104447] LRA: Do not split non-alloc hard regs.

LRA tried to split non-allocated hard reg for reload pseudos again and
again until number of assignment passes reaches the limit.  The patch fixes
this.

gcc/ChangeLog:

PR rtl-optimization/104447
* lra-constraints.cc (spill_hard_reg_in_range): Initiate ignore
hard reg set by lra_no_alloc_regs.

gcc/testsuite/ChangeLog:

PR rtl-optimization/104447
* gcc.target/i386/pr104447.c: New.

diff --git a/gcc/lra-constraints.cc b/gcc/lra-constraints.cc
index c700c3f4578..b2c4590153c 100644
--- a/gcc/lra-constraints.cc
+++ b/gcc/lra-constraints.cc
@@ -6008,7 +6008,7 @@ spill_hard_reg_in_range (int regno, enum reg_class rclass, rtx_insn *from, rtx_i
   HARD_REG_SET ignore;
   
   lra_assert (from != NULL && to != NULL);
-  CLEAR_HARD_REG_SET (ignore);
+  ignore = lra_no_alloc_regs;
   EXECUTE_IF_SET_IN_BITMAP (&lra_reg_info[regno].insn_bitmap, 0, uid, bi)
 {
   lra_insn_recog_data_t id = lra_insn_recog_data[uid];
diff --git a/gcc/testsuite/gcc.target/i386/pr104447.c b/gcc/testsuite/gcc.target/i386/pr104447.c
new file mode 100644
index 000..bf11e8696e6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104447.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -pg" } */
+
+int
+bar (int x)
+{
+  asm goto ("" : : "r" (x), "r" (x + 1), "r" (x + 2), "r" (x + 3), /* { dg-error "operand has impossible constraints" } */
+	"r" (x + 4), "r" (x + 5), "r" (x + 6), "r" (x + 7),
+	"r" (x + 8), "r" (x + 9), "r" (x + 10), "r" (x + 11),
+	"r" (x + 12), "r" (x + 13), "r" (x + 14), "r" (x + 15),
+	"r" (x + 16) : : lab);
+ lab:
+  return 0;
+}


Re: [PATCHv3] libiberty rust-demangle, ignore .suffix

2022-02-17 Thread Ian Lance Taylor via Gcc-patches
On Thu, Feb 17, 2022 at 2:45 AM Mark Wielaard  wrote:
>
> Ping. Is this OK to commit now?
> I am not sure who can approve this.
>
> On Sun, Jan 16, 2022 at 01:35:34AM +0100, Mark Wielaard wrote:
> > Rust symbols can have a .suffix because of compiler transformations.
> > These can be ignored in the demangled name, which is what this patch
> > implements, by stopping at the first dot for v0 symbols and searching
> > backwards to the ending 'E' for legacy symbols.
> >
> > An alternative implementation could be to follow what C++ does and
> > represent these as [clone .suffix] tagged onto the demangled name.
> > But this seems somewhat confusing since it results in a demangled
> > name that cannot be mangled again. And it would mean trying to
> > decode compiler internal naming.
> >
> > https://bugs.kde.org/show_bug.cgi?id=445916
> > https://github.com/rust-lang/rust/issues/60705
> >
> > libiberty/Changelog
> >
> >   * rust-demangle.c (rust_demangle_callback): Ignore everything
> >   after '.' char in sym for v0. For legacy symbols search
> >   backwards to find the last 'E' before any '.'.
> >   * testsuite/rust-demangle-expected: Add new .suffix testcases.

This is OK.

Thanks.

Ian
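
The suffix handling described in the quoted patch can be sketched as below.
This is an illustration in C++, not the code in rust-demangle.c:
strip_rust_suffix is a hypothetical helper, it does no validation of the
symbol, and for legacy symbols it assumes the 'E' to stop at is either the
last character or directly followed by '.'.

```cpp
#include <cassert>  // assert, for the checks described in the note below
#include <string>

// Hypothetical helper: return the part of MANGLED that would be demangled,
// dropping a compiler-added ".suffix".  Other symbols are returned unchanged.
std::string strip_rust_suffix(const std::string &mangled) {
  if (mangled.compare(0, 2, "_R") == 0) {
    // v0 symbol: everything from the first '.' onwards is ignored.
    return mangled.substr(0, mangled.find('.'));
  }
  if (mangled.compare(0, 3, "_ZN") == 0) {
    // Legacy symbol: walk backwards to an 'E' that ends the string or is
    // immediately followed by '.', and keep everything up to that 'E'.
    for (std::string::size_type i = mangled.size(); i-- > 0;) {
      if (mangled[i] == 'E'
          && (i + 1 == mangled.size() || mangled[i + 1] == '.'))
        return mangled.substr(0, i + 1);
    }
  }
  return mangled;
}
```

With made-up inputs, "_RNvC3foo3bar.llvm.1" is cut to "_RNvC3foo3bar" and
"_ZN3foo3barE.llvm.123" is cut to "_ZN3foo3barE".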


Re: [PATCH] i386: Skip decimal float vector modes in type_natural_mode [PR79754]

2022-02-17 Thread Eric Botcazou via Gcc-patches
> gcc/testsuite/ChangeLog:
> 
> PR target/79754
> * gcc.target/i386/pr79754.c: New test.
> 
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
> 
> Pushed to master.

And 11 branch apparently, but it should be:

/* { dg-do compile { target dfp } } */

instead of just:

/* { dg-do compile } */

-- 
Eric Botcazou




Contents of PO file 'cpplib-12.1-b20220213.sv.po'

2022-02-17 Thread Translation Project Robot


cpplib-12.1-b20220213.sv.po.gz
Description: Binary data
The Translation Project robot, in the
name of your translation coordinator.



New Swedish PO file for 'cpplib' (version 12.1-b20220213)

2022-02-17 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the Swedish team of translators.  The file is available at:

https://translationproject.org/latest/cpplib/sv.po

(This file, 'cpplib-12.1-b20220213.sv.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH v3 07/15] arm: Implement MVE predicates as vectors of booleans

2022-02-17 Thread Christophe Lyon via Gcc-patches
Hi,

On Fri, Feb 4, 2022 at 10:43 AM Richard Sandiford 
wrote:

> Christophe Lyon  writes:
> > On Tue, Feb 1, 2022 at 4:42 AM Richard Sandiford <
> richard.sandif...@arm.com>
> > wrote:
> >
> >> Christophe Lyon via Gcc-patches  writes:
> >> > On Mon, Jan 31, 2022 at 7:01 PM Richard Sandiford via Gcc-patches <
> >> > gcc-patches@gcc.gnu.org> wrote:
> >> >
> >> >> Sorry for the slow response, was out last week.
> >> >>
> >> >> Christophe Lyon via Gcc-patches  writes:
> >> >> > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
> >> >> > index f16d320..5f559f8fd93 100644
> >> >> > --- a/gcc/emit-rtl.c
> >> >> > +++ b/gcc/emit-rtl.c
> >> >> > @@ -6239,9 +6239,14 @@ init_emit_once (void)
> >> >> >
> >> >> >/* For BImode, 1 and -1 are unsigned and signed interpretations
> >> >> >   of the same value.  */
> >> >> > -  const_tiny_rtx[0][(int) BImode] = const0_rtx;
> >> >> > -  const_tiny_rtx[1][(int) BImode] = const_true_rtx;
> >> >> > -  const_tiny_rtx[3][(int) BImode] = const_true_rtx;
> >> >> > +  for (mode = MIN_MODE_BOOL;
> >> >> > +   mode <= MAX_MODE_BOOL;
> >> >> > +   mode = (machine_mode)((int)(mode) + 1))
> >> >> > +{
> >> >> > +  const_tiny_rtx[0][(int) mode] = const0_rtx;
> >> >> > +  const_tiny_rtx[1][(int) mode] = const_true_rtx;
> >> >> > +  const_tiny_rtx[3][(int) mode] = const_true_rtx;
> >> >> > +}
> >> >> >
> >> >> >for (mode = MIN_MODE_PARTIAL_INT;
> >> >> > mode <= MAX_MODE_PARTIAL_INT;
> >> >>
> >> >> Does this do the right thing for:
> >> >>
> >> >>   gen_int_mode (-1, B2Imode)
> >> >>
> >> >> (which is used e.g. in native_decode_vector_rtx)?  It looks like it
> >> >> would give 0b01 rather than 0b11.
> >> >>
> >> >> Maybe for non-BImode we should use const1_rtx and constm1_rtx, like
> with
> >> >> MODE_INT.
> >> >>
> >> >
> >> > debug_rtx (gen_int_mode (-1, B2Imode)) says:
> >> > (const_int -1 [0x])
> >> > so that looks right?
> >>
> >> Ah, right, I forgot that the mode is unused for the small constant
> lookup.
> >> But it looks like CONSTM1_RTX (B2Imode) would be (const_int 1) instead,
> >> even though the two should be equal.
> >>
> >
> > Indeed!
> >
> > So I changed the above loop into:
> >/* For BImode, 1 and -1 are unsigned and signed interpretations
> >  of the same value.  */
> >   for (mode = MIN_MODE_BOOL;
> >mode <= MAX_MODE_BOOL;
> >mode = (machine_mode)((int)(mode) + 1))
> > {
> >   const_tiny_rtx[0][(int) mode] = const0_rtx;
> >   const_tiny_rtx[1][(int) mode] = const_true_rtx;
> > -  const_tiny_rtx[3][(int) mode] = const_true_rtx;
> > +  const_tiny_rtx[3][(int) mode] = constm1_rtx;
> > }
> > which works, both constants are now equal and the validation still
> passes.
>
> I think we need to keep const_true_rtx for both [BImode][1] and
> [BImode][3].
> BImode is an awkward special case in that the (only) nonzero value must be
> exactly STORE_FLAG_VALUE, even if that leads to an otherwise non-canonical
> const_int representation.
>

OK, done.
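
To make the 1 versus -1 point above concrete, here is a minimal sketch of sign-extending the low bits of a value, which is how I understand the canonical const_int form to be derived for ordinary integer modes. The helper name sketch_sign_extend is hypothetical and is not a GCC function; it only illustrates why an all-ones 2-bit value reads back as -1 and why a 1-bit mode, where the same rule would turn 1 into -1, has to be treated as a special case.

```c
#include <stdint.h>

/* Hypothetical helper, not part of GCC: sign-extend the low PRECISION
   bits of VALUE.  Assumes 1 <= PRECISION <= 63.  */
int64_t
sketch_sign_extend (int64_t value, unsigned precision)
{
  uint64_t mask = (UINT64_C (1) << precision) - 1;
  uint64_t low = (uint64_t) value & mask;          /* keep the low bits */
  uint64_t sign = UINT64_C (1) << (precision - 1); /* top bit of the mode */

  /* Flipping the sign bit and subtracting it again extends the sign.  */
  return (int64_t) (low ^ sign) - (int64_t) sign;
}
```

With a precision of 2, both -1 and 3 (0b11) come back as -1, while 1 (0b01) stays 1. With a precision of 1, the value 1 comes back as -1.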


>
> For the multi-bit booleans, [1] needs to be const1_rtx rather than
> const_true_rtx in case STORE_FLAG_VALUE != 1.
>
> >> >> > @@ -1679,15 +1708,25 @@ emit_class_narrowest_mode (void)
> >> >> >print_decl ("unsigned char", "class_narrowest_mode",
> >> >> "MAX_MODE_CLASS");
> >> >> >
> >> >> >for (c = 0; c < MAX_MODE_CLASS; c++)
> >> >> > -/* Bleah, all this to get the comment right for
> MIN_MODE_INT.  */
> >> >> > -tagged_printf ("MIN_%s", mode_class_names[c],
> >> >> > -modes[c]
> >> >> > -? ((c != MODE_INT || modes[c]->precision != 1)
> >> >> > -   ? modes[c]->name
> >> >> > -   : (modes[c]->next
> >> >> > -  ? modes[c]->next->name
> >> >> > -  : void_mode->name))
> >> >> > -: void_mode->name);
> >> >> > +{
> >> >> > +  /* Bleah, all this to get the comment right for
> MIN_MODE_INT.
> >> */
> >> >> > +  const char *comment_name = void_mode->name;
> >> >> > +
> >> >> > +  if (modes[c])
> >> >> > + if (c != MODE_INT || !modes[c]->boolean)
> >> >> > +   comment_name = modes[c]->name;
> >> >> > + else
> >> >> > +   {
> >> >> > + struct mode_data *m = modes[c];
> >> >> > + while (m->boolean)
> >> >> > +   m = m->next;
> >> >> > + if (m)
> >> >> > +   comment_name = m->name;
> >> >> > + else
> >> >> > +   comment_name = void_mode->name;
> >> >> > +   }
> >> >>
> >> >> Have you tried bootstrapping the patch on a host of your choice?
> >> >> I would expect a warning/Werror about an ambiguous else here.
> >> >>
> >> > No I hadn't and indeed the build fails
> >> >
> >> >>
> >> >> I guess this reduces to:
> >> >>
> >> >> struct mode_data *m = modes[c];
> >> >> while (m && m->boolean)
> >> >>   m = m->next;
> >> >> const char *comment_name = (m ? m : void_mode)->name;
> >> >>
> >> >> but I don't 

Re: Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'

2022-02-17 Thread Richard Biener via Gcc-patches
On Thu, Feb 17, 2022 at 1:23 PM Thomas Schwinge  wrote:
>
> Hi!
>
> On 2022-02-11T08:02:20+0100, Richard Biener  
> wrote:
> > On Thu, Feb 10, 2022 at 11:20 PM Thomas Schwinge
> >  wrote:
> >> On 2022-02-10T16:36:51+, Michael Matz via Gcc-patches 
> >>  wrote:
> >> > On Thu, 10 Feb 2022, Richard Biener via Gcc-patches wrote:
> >> >> On Wed, Feb 9, 2022 at 2:21 PM Thomas Schwinge 
> >> >>  wrote:
> >> >> > OK to push (now, or in next development stage 1?) the attached
> >> >> > "Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'",
> >> >> > or should that be done differently -- or, per the current state (why?)
> >> >> > not at all?
> >>
> >> First, thanks for (indirectly) having confirmed that my confusion is not
> >> completely off, why this is currently missing.  ;-)
> >>
> >> >> Hmm, I wonder if we shouldn't simply dump DECL_UID as
> >> >>
> >> >>  'uid NNN'
> >> >
> >> > Yes, much better in line with the normal dump_tree output.
> >>
> >> >> somewhere.  For example after or before DECL_NAME?
> >>
> >> Heh -- that's what I wanted to do initially, but then I saw that we've
> >> currently got in 'print_node_brief' (and very similar in 'print_node'):
> >>
> >> [...]
> >>   fprintf (file, "%s <%s", prefix, get_tree_code_name (TREE_CODE 
> >> (node)));
> >>   dump_addr (file, " ", node);
> >>
> >>   if (tclass == tcc_declaration)
> >> {
> >>   if (DECL_NAME (node))
> >> fprintf (file, " %s", IDENTIFIER_POINTER (DECL_NAME (node)));
> >>   else if (TREE_CODE (node) == LABEL_DECL
> >>&& LABEL_DECL_UID (node) != -1)
> >> {
> >>   if (dump_flags & TDF_NOUID)
> >> fprintf (file, " L.");
> >>   else
> >> fprintf (file, " L.%d", (int) LABEL_DECL_UID (node));
> >> }
> >>   else
> >> {
> >>   if (dump_flags & TDF_NOUID)
> >> fprintf (file, " %c.",
> >>  TREE_CODE (node) == CONST_DECL ? 'C' : 'D');
> >>   else
> >> fprintf (file, " %c.%u",
> >>  TREE_CODE (node) == CONST_DECL ? 'C' : 'D',
> >>  DECL_UID (node));
> >> }
> >> }
> >> [...]
> >>
> >> That is, if there's no 'DECL_NAME', we print 'L.[UID]', 'C.[UID]',
> >> 'D.[UID]'.  The same we do in 'gcc/tree-pretty-print.cc:dump_decl_name',
> >> I found.  But in the latter function, we also do it that same way if
> >> there is a 'DECL_NAME' ('i' -> 'iD.4249', for example), so that's why I
> >> copied that style back to my proposed 'print_node_brief'/'print_node'
> >> change.
> >>
> >> Are you now suggesting to only print 'DECL_NAME' as '[NAME] uid [UID]',
> >> but keep 'L.[UID]', 'C.[UID]', 'D.[UID]' in the "dot" form, or change
> >> these to 'L uid [UID]', 'C uid [UID]', 'D uid [UID]' correspondingly?
> >
> > I'd say these should then be 'D.[UID] uid [UID]' even if that's
> > somewhat redundant.
>
> Sure, that's fine for me.  So, like in the attached
> "Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'";
> OK to push?  (...  which evidently I forgot to send last week...)

'TDF_UID', 'TDF_NOUID' (now appends 'uid '):

just not append 'uid ...' for NOUID, it doesn't add any useful information here.

So

diff --git a/gcc/print-tree.cc b/gcc/print-tree.cc
index 0876da873a9..38dd032fbf7 100644
--- a/gcc/print-tree.cc
+++ b/gcc/print-tree.cc
@@ -158,6 +158,14 @@ print_node_brief (FILE *file, const char *prefix,
const_tree node, int indent)
 TREE_CODE (node) == CONST_DECL ? 'C' : 'D',
 DECL_UID (node));
}
+
+  if (dump_flags & TDF_UID)
+   {
+ if (dump_flags & TDF_NOUID)
+   fprintf (file, " uid ");
+ else
+   fprintf (file, " uid %d", DECL_UID (node));
+   }

just

if (dump_flags & TDF_UID)
  fprintf (file, " uid %d", DECL_UID (node));

Asking for both at the same time is odd and I'd really not expect
that.  It should be a tri-state, UID (everywhere), default (in some
places), NOUID (nowhere).

+  if (dump_flags & TDF_UID)
+   {
+ if (dump_flags & TDF_NOUID)
+   fprintf (file, " uid ");
+ else
+   fprintf (file, " uid %d", DECL_UID (node));
+   }

same here.  But for print_node I think we should default to printing
the UID.  So it would be

   if (!(dump_flags & TDF_NOUID))
fprintf (file, " uid %d", DECL_UID (node));

Note that UIDs can be negative, but decl_minimal.uid is unsigned, while you
are using a signed format.  See DEBUG_TEMP_UID in tree.h.  I don't have a
well thought out opinion on how to present the uid here for debug temps;
signed works for me, but then I think you should have (int) DECL_UID (node)
for the prints?
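
As a rough illustration of the tri-state behaviour suggested above, here is a minimal sketch. The SKETCH_TDF_* constants and both helper functions are hypothetical stand-ins, not GCC's real flags or functions; the only point is that the brief form appends the UID when explicitly asked, while the full form appends it unless NOUID is given, and that the UID is cast to int so a signed format is valid.

```c
#include <stdio.h>

/* Hypothetical stand-ins for the dump flags; GCC's real TDF_* values
   differ.  */
enum
{
  SKETCH_TDF_UID = 1 << 0,
  SKETCH_TDF_NOUID = 1 << 1
};

/* Brief dump: append " uid N" only when UIDs were explicitly requested.
   Returns the number of characters written, as snprintf does.  */
int
sketch_brief_uid_suffix (char *buf, size_t len, unsigned flags, unsigned uid)
{
  if (flags & SKETCH_TDF_UID)
    return snprintf (buf, len, " uid %d", (int) uid);
  return snprintf (buf, len, "%s", "");
}

/* Full dump: append " uid N" by default and omit it only for NOUID.  */
int
sketch_full_uid_suffix (char *buf, size_t len, unsigned flags, unsigned uid)
{
  if (!(flags & SKETCH_TDF_NOUID))
    return snprintf (buf, len, " uid %d", (int) uid);
  return snprintf (buf, len, "%s", "");
}
```

Neither helper ever prints a bare " uid " without a number, which matches the remark that doing so adds no useful information.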

Thanks,
Richard.

>
> Grüße
>  Thomas
>
>
> >> And also do the similar changes in
> >> 

PING - [PATCH] middle-end: Support ABIs that pass FP values as wider integers.

2022-02-17 Thread Tobias Burnus

PING for this cfgexpand.cc + expr.cc change by Roger.

This is a pre-requisite for Roger's nvptx patch to avoid an ICE during
bootstrap:

* https://gcc.gnu.org/pipermail/gcc-patches/2022-February/590250.html
  "[PATCH] nvptx: Back-end portion of a fix for PR target/104489."
  (see patch for additional reasoning for this patch)

* See also https://gcc.gnu.org/PR104489
   nvptx, sm_53: internal compiler error: in gen_rtx_SUBREG, at
emit-rtl.cc:1022

Thanks,

Tobias

On 09.02.22 21:12, Roger Sayle wrote:

This patch adds middle-end support for target ABIs that pass/return
floating point values in integer registers with precision wider than
the original FP mode.  An example, is the nvptx backend where 16-bit
HFmode registers are passed/returned as (promoted to) SImode registers.
Unfortunately, this currently falls foul of the various (recent?) sanity
checks that (very sensibly) prevent creating paradoxical SUBREGs of
floating point registers.  The approach below is to explicitly perform the
conversion/promotion in two steps, via an integer mode of same precision
as the floating point value.  So on nvptx, 16-bit HFmode is initially
converted to 16-bit HImode (using SUBREG), then zero-extended to SImode,
and likewise when going the other way, parameters truncated to HImode
then converted to HFmode (using SUBREG).  These changes are localized
to expand_value_return and expanding DECL_RTL to support strange ABIs,
rather than inside convert_modes or gen_lowpart, as mismatched
precision integer/FP conversions should be explicit in the RTL,
and these semantics not generally visible/implicit in user code.
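
The two-step conversion described above can be sketched in plain C. This is only an analogy under stated assumptions: float, uint32_t and uint64_t stand in for HFmode, HImode and SImode, the function names are hypothetical, and float is assumed to be a 32-bit IEEE 754 type. It is not the RTL the patch generates.

```c
#include <stdint.h>
#include <string.h>

/* Promote: first reinterpret the FP bits as an integer of the same
   width (the SUBREG-like step), then zero-extend to the wider integer.
   Assumes float is 32 bits wide.  */
uint64_t
sketch_promote_fp (float value)
{
  uint32_t same_width;
  memcpy (&same_width, &value, sizeof same_width);
  return (uint64_t) same_width;
}

/* Demote: first truncate to the integer of matching width, then
   reinterpret those bits as the FP value.  */
float
sketch_demote_fp (uint64_t wide)
{
  uint32_t same_width = (uint32_t) wide;
  float value;
  memcpy (&value, &same_width, sizeof value);
  return value;
}
```

Keeping the reinterpretation and the width change as separate steps means no single operation both changes the width and crosses between the FP and integer domains, which is the property the sanity checks mentioned above enforce.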

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check with no new failures, and on nvptx-none, where it is
the middle-end portion of a pair of patches to allow the default ISA to
be advanced.  Ok for mainline?


2022-02-09  Roger Sayle  

gcc/ChangeLog
* cfgexpand.cc (expand_value_return): Allow backends to promote
a scalar floating point return value to a wider integer mode.
* expr.cc (expand_expr_real_1) [expand_decl_rtl]: Likewise, allow
backends to promote scalar FP PARM_DECLs to wider integer modes.


Thanks in advance,
Roger
--


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


[PATCH] c++: implicit 'this' in noexcept-spec within class tmpl [PR94944]

2022-02-17 Thread Patrick Palka via Gcc-patches
Here when instantiating the noexcept-spec we fail to resolve the
implicit object parameter for the call A::f() ultimately because
maybe_instantiate_noexcept sets current_class_ptr/ref to the dependent
'this' (of type B) rather than the specialized 'this' (of type B).
This ends up causing maybe_dummy_object (called from
finish_qualified_id_expr) to return a dummy object instead of 'this'.

This patch corrects this by making maybe_instantiate_noexcept always set
current_class_ptr/ref to the specialized 'this', consistent with what
tsubst_function_type does when substituting into a trailing return type.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk perhaps?

PR c++/94944

gcc/cp/ChangeLog:

* pt.cc (maybe_instantiate_noexcept): For non-static member
functions, set current_class_ptr/ref to the specialized 'this'
instead.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/noexcept34.C: Adjusted expected diagnostics.
* g++.dg/cpp0x/noexcept75.C: New test.
---
 gcc/cp/pt.cc| 19 +++
 gcc/testsuite/g++.dg/cpp0x/noexcept34.C |  8 
 gcc/testsuite/g++.dg/cpp0x/noexcept75.C | 17 +
 3 files changed, 28 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/noexcept75.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 6dda66081bd..a7a524fe9fc 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -26139,20 +26139,15 @@ maybe_instantiate_noexcept (tree fn, tsubst_flags_t 
complain)
  push_deferring_access_checks (dk_no_deferred);
  input_location = DECL_SOURCE_LOCATION (fn);
 
- if (!DECL_LOCAL_DECL_P (fn))
+ if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fn)
+ && !DECL_LOCAL_DECL_P (fn))
{
  /* If needed, set current_class_ptr for the benefit of
-tsubst_copy/PARM_DECL.  The exception pattern will
-refer to the parm of the template, not the
-instantiation.  */
- tree tdecl = DECL_TEMPLATE_RESULT (DECL_TI_TEMPLATE (fn));
- if (DECL_NONSTATIC_MEMBER_FUNCTION_P (tdecl))
-   {
- tree this_parm = DECL_ARGUMENTS (tdecl);
- current_class_ptr = NULL_TREE;
- current_class_ref = cp_build_fold_indirect_ref (this_parm);
- current_class_ptr = this_parm;
-   }
+tsubst_copy/PARM_DECL.  */
+ tree this_parm = DECL_ARGUMENTS (fn);
+ current_class_ptr = NULL_TREE;
+ current_class_ref = cp_build_fold_indirect_ref (this_parm);
+ current_class_ptr = this_parm;
}
 
  /* If this function is represented by a TEMPLATE_DECL, then
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept34.C 
b/gcc/testsuite/g++.dg/cpp0x/noexcept34.C
index dce35652ef5..86129e7a520 100644
--- a/gcc/testsuite/g++.dg/cpp0x/noexcept34.C
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept34.C
@@ -7,13 +7,13 @@ template struct A
 {
   constexpr int f () { return 0; }
   bool b = true;
-  void g () noexcept (f()) { } // { dg-error "use of parameter" }
-  void g2 () noexcept (this->f()) { } // { dg-error "use of parameter" }
+  void g () noexcept (f()) { } // { dg-error ".this. is not a constant" }
+  void g2 () noexcept (this->f()) { } // { dg-error ".this. is not a constant" 
}
   void g3 () noexcept (b) { } // { dg-error "use of .this. in a constant 
expression|use of parameter" }
   void g4 (int i) noexcept (i) { } // { dg-error "use of parameter" }
-  void g5 () noexcept (A::f()) { } // { dg-error "use of parameter" }
+  void g5 () noexcept (A::f()) { } // { dg-error ".this. is not a constant" }
   void g6 () noexcept (foo(b)) { } // { dg-error "use of .this. in a constant 
expression|use of parameter" }
-  void g7 () noexcept (int{f()}) { } // { dg-error "use of parameter" }
+  void g7 () noexcept (int{f()}) { } // { dg-error ".this. is not a constant" }
 };
 
 int main ()
diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept75.C 
b/gcc/testsuite/g++.dg/cpp0x/noexcept75.C
new file mode 100644
index 000..d746f4768d0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept75.C
@@ -0,0 +1,17 @@
+// PR c++/94944
+// { dg-do compile { target c++11 } }
+
+template
+struct A {
+  void f();
+};
+
+template
+struct B : A {
+  void g() noexcept(noexcept(A::f()));
+};
+
+int main() {
+  B b;
+  b.g();
+}
-- 
2.35.1.129.gb80121027d



Re: [PATCH] x86: Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER

2022-02-17 Thread H.J. Lu via Gcc-patches
On Thu, Feb 17, 2022 at 10:49:48AM +0100, Richard Biener via Gcc-patches wrote:
> On Thu, Feb 17, 2022 at 8:52 AM Uros Bizjak via Gcc-patches
>  wrote:
> >
> > On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches
> >  wrote:
> > >
> > > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
> > >  wrote:
> > > >
> > > > Reading YMM registers with all zero bits needs VZEROUPPER on Sandy
> > > > Bridge, Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> > > > transition penalty.  Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> > > > generate vzeroupper instruction after loading all-zero YMM/ZMM registers
> > > > and enable it by default.
> > > Shouldn't TARGET_READ_ZERO_YMM_ZMM_NONEED_VZEROUPPER sound a bit
> > > smoother?
> > > Because originally we needed to add vzeroupper to all avx<->sse cases,
> > > now it's a tune to indicate that we don't need to add it in some
> >
> > Perhaps we should go from the other side and use
> > X86_TUNE_OPTIMIZE_AVX_READ for new processors?
> 
> Btw, do you have a micro-benchmark to test this on AMD archs?
> 

I don't believe AMD CPUs need vzeroupper.

H.J.


[PATCH v2] x86: Add TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO

2022-02-17 Thread H.J. Lu via Gcc-patches
On Thu, Feb 17, 2022 at 08:51:31AM +0100, Uros Bizjak wrote:
> On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches
>  wrote:
> >
> > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
> >  wrote:
> > >
> > > Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
> > > Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> > > transition penalty.  Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> > > generate vzeroupper instruction after loading all-zero YMM/ZMM registers
> > > and enable it by default.
> > Shouldn't TARGET_READ_ZERO_YMM_ZMM_NONEED_VZEROUPPER sound a bit smoother?
> > Because originally we needed to add vzeroupper to all avx<->sse cases,
> > now it's a tune to indicate that we don't need to add it in some
> 
> Perhaps we should go from the other side and use
> X86_TUNE_OPTIMIZE_AVX_READ for new processors?
> 

Here is the v2 patch to add TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO.


H.J.
---
Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
transition penalty.  Add TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO to
omit vzeroupper instruction after loading all-zero YMM/ZMM registers.

gcc/

PR target/101456
* config/i386/i386.cc (ix86_avx_u128_mode_needed): Omit
vzeroupper after reading all-zero YMM/ZMM registers for
TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO.
* config/i386/i386.h (TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO):
New.
* config/i386/x86-tune.def
(X86_TUNE_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO): New.

gcc/testsuite/

PR target/101456
* gcc.target/i386/pr101456-1.c (dg-options): Add
-mtune-ctrl=omit_vzeroupper_after_avx_read_zero.
* gcc.target/i386/pr101456-2.c: Likewise.
* gcc.target/i386/pr101456-3.c: New test.
* gcc.target/i386/pr101456-4.c: Likewise.
---
 gcc/config/i386/i386.cc| 51 --
 gcc/config/i386/i386.h |  2 +
 gcc/config/i386/x86-tune.def   |  5 +++
 gcc/testsuite/gcc.target/i386/pr101456-1.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr101456-2.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr101456-3.c | 33 ++
 gcc/testsuite/gcc.target/i386/pr101456-4.c | 33 ++
 7 files changed, 103 insertions(+), 25 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-4.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index cf246e74e57..60c72ceb72d 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -14502,33 +14502,38 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
 
   subrtx_iterator::array_type array;
 
-  rtx set = single_set (insn);
-  if (set)
+  if (TARGET_OMIT_VZEROUPPER_AFTER_AVX_READ_ZERO)
 {
-  rtx dest = SET_DEST (set);
-  rtx src = SET_SRC (set);
-  if (ix86_check_avx_upper_register (dest))
+  /* Perform this vzeroupper optimization if target doesn't need
+vzeroupper after reading all-zero YMM/ZMM registers.  */
+  rtx set = single_set (insn);
+  if (set)
{
- /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
-source isn't zero.  */
- if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
-   return AVX_U128_DIRTY;
+ rtx dest = SET_DEST (set);
+ rtx src = SET_SRC (set);
+ if (ix86_check_avx_upper_register (dest))
+   {
+ /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
+source isn't zero.  */
+ if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
+   return AVX_U128_DIRTY;
+ else
+   return AVX_U128_ANY;
+   }
  else
-   return AVX_U128_ANY;
-   }
-  else
-   {
- FOR_EACH_SUBRTX (iter, array, src, NONCONST)
-   if (ix86_check_avx_upper_register (*iter))
- {
-   int status = ix86_avx_u128_mode_source (insn, *iter);
-   if (status == AVX_U128_DIRTY)
- return status;
- }
-   }
+   {
+ FOR_EACH_SUBRTX (iter, array, src, NONCONST)
+   if (ix86_check_avx_upper_register (*iter))
+ {
+   int status = ix86_avx_u128_mode_source (insn, *iter);
+   if (status == AVX_U128_DIRTY)
+ return status;
+ }
+   }
 
-  /* This isn't YMM/ZMM load/store.  */
-  return AVX_U128_ANY;
+ /* This isn't YMM/ZMM load/store.  */
+ return AVX_U128_ANY;
+   }
 }
 
   /* Require DIRTY mode if a 256bit or 512bit AVX register is referenced.
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index f41e0908250..46379d2231b 100644
--- a/gcc/config/i386/i386.h
+++ 

[PATCH] target/104581 - compile-time regression in mode-switching

2022-02-17 Thread Richard Biener via Gcc-patches
The x86 backend piggy-backs on mode-switching for insertion of
vzeroupper.  A recent improvement there was implemented such that the
mode_needed hook, which is called for each instruction in a basic block
during mode-switching local analysis, may walk the whole basic block
for every DF register definition.

The following mostly reverts this improvement.  It needs to be
re-done in a way more consistent with a local dataflow which
probably means making targets aware of the state of the local
dataflow analysis.
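
The cost problem can be illustrated with a toy model. This is a simplified sketch, not the i386 code: the instruction kinds and both functions are hypothetical, and a basic block is reduced to an array in which a use counts as dirty unless the nearest earlier definition in the block was a zeroing one. The first function rescans the block for every use, in the spirit of a per-instruction hook that walks the block; the second carries one bit of state through a single forward pass, which is roughly what a local dataflow formulation would do.

```c
#include <stddef.h>

/* Hypothetical instruction kinds for a toy basic block.  */
enum sketch_kind
{
  SKETCH_DEF_ZERO,    /* defines the register as all-zero */
  SKETCH_DEF_NONZERO, /* defines the register with other contents */
  SKETCH_USE,         /* reads the register */
  SKETCH_OTHER        /* unrelated instruction */
};

/* Per-use rescan: for every use, walk backwards to the nearest
   definition.  Worst case this is quadratic in the block length.  */
size_t
sketch_dirty_uses_rescan (const enum sketch_kind *insns, size_t n)
{
  size_t dirty = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (insns[i] != SKETCH_USE)
        continue;
      int is_dirty = 1; /* no definition in the block means dirty */
      for (size_t j = i; j-- > 0;)
        {
          if (insns[j] == SKETCH_DEF_ZERO)
            {
              is_dirty = 0;
              break;
            }
          if (insns[j] == SKETCH_DEF_NONZERO)
            break;
        }
      dirty += is_dirty;
    }
  return dirty;
}

/* Single forward pass: remember the state left by the last definition.
   This is linear in the block length.  */
size_t
sketch_dirty_uses_single_pass (const enum sketch_kind *insns, size_t n)
{
  size_t dirty = 0;
  int state_dirty = 1; /* dirty until a zeroing definition is seen */
  for (size_t i = 0; i < n; i++)
    switch (insns[i])
      {
      case SKETCH_DEF_ZERO:
        state_dirty = 0;
        break;
      case SKETCH_DEF_NONZERO:
        state_dirty = 1;
        break;
      case SKETCH_USE:
        dirty += state_dirty;
        break;
      default:
        break;
      }
  return dirty;
}
```

Both functions give the same answer on any block; only the amount of work differs, and the gap grows with block length when uses are far from their definitions.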

This improves compile-time of some 538.imagick_r TU from
362s to 16s with -Ofast -mavx2 -fprofile-generate.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

2022-02-17  Richard Biener  

PR target/104581
* config/i386/i386.cc (ix86_avx_u128_mode_source): Remove.
(ix86_avx_u128_mode_needed): Return AVX_U128_DIRTY instead
of calling ix86_avx_u128_mode_source which would eventually
have returned AVX_U128_ANY in some very special case.

* gcc.target/i386/pr101456-1.c: XFAIL.
---
 gcc/config/i386/i386.cc| 78 +-
 gcc/testsuite/gcc.target/i386/pr101456-1.c |  3 +-
 2 files changed, 5 insertions(+), 76 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index cf246e74e57..e4b42fbba6f 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -14377,80 +14377,12 @@ ix86_check_avx_upper_register (const_rtx exp)
 
 static void
 ix86_check_avx_upper_stores (rtx dest, const_rtx, void *data)
- {
-   if (ix86_check_avx_upper_register (dest))
+{
+  if (ix86_check_avx_upper_register (dest))
 {
   bool *used = (bool *) data;
   *used = true;
 }
- }
-
-/* For YMM/ZMM store or YMM/ZMM extract.  Return mode for the source
-   operand of SRC DEFs in the same basic block before INSN.  */
-
-static int
-ix86_avx_u128_mode_source (rtx_insn *insn, const_rtx src)
-{
-  basic_block bb = BLOCK_FOR_INSN (insn);
-  rtx_insn *end = BB_END (bb);
-
-  /* Return AVX_U128_DIRTY if there is no DEF in the same basic
- block.  */
-  int status = AVX_U128_DIRTY;
-
-  for (df_ref def = DF_REG_DEF_CHAIN (REGNO (src));
-   def; def = DF_REF_NEXT_REG (def))
-if (DF_REF_BB (def) == bb)
-  {
-   /* Ignore DEF from different basic blocks.  */
-   rtx_insn *def_insn = DF_REF_INSN (def);
-
-   /* Check if DEF_INSN is before INSN.  */
-   rtx_insn *next;
-   for (next = NEXT_INSN (def_insn);
-next != nullptr && next != end && next != insn;
-next = NEXT_INSN (next))
- ;
-
-   /* Skip if DEF_INSN isn't before INSN.  */
-   if (next != insn)
- continue;
-
-   /* Return AVX_U128_DIRTY if the source operand of DEF_INSN
-  isn't constant zero.  */
-
-   if (CALL_P (def_insn))
- {
-   bool avx_upper_reg_found = false;
-   note_stores (def_insn,
-ix86_check_avx_upper_stores,
-&avx_upper_reg_found);
-
-   /* Return AVX_U128_DIRTY if call returns AVX.  */
-   if (avx_upper_reg_found)
- return AVX_U128_DIRTY;
-
-   continue;
- }
-
-   rtx set = single_set (def_insn);
-   if (!set)
- return AVX_U128_DIRTY;
-
-   rtx dest = SET_DEST (set);
-
-   /* Skip if DEF_INSN is not an AVX load.  Return AVX_U128_DIRTY
-  if the source operand isn't constant zero.  */
-   if (ix86_check_avx_upper_register (dest)
-   && standard_sse_constant_p (SET_SRC (set),
-   GET_MODE (dest)) != 1)
- return AVX_U128_DIRTY;
-
-   /* We get here only if all AVX loads are from constant zero.  */
-   status = AVX_U128_ANY;
-  }
-
-  return status;
 }
 
 /* Return needed mode for entity in optimize_mode_switching pass.  */
@@ -14520,11 +14452,7 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
{
  FOR_EACH_SUBRTX (iter, array, src, NONCONST)
if (ix86_check_avx_upper_register (*iter))
- {
-   int status = ix86_avx_u128_mode_source (insn, *iter);
-   if (status == AVX_U128_DIRTY)
- return status;
- }
+ return AVX_U128_DIRTY;
}
 
   /* This isn't YMM/ZMM load/store.  */
diff --git a/gcc/testsuite/gcc.target/i386/pr101456-1.c 
b/gcc/testsuite/gcc.target/i386/pr101456-1.c
index 803fc6e0207..7fb3a3f055c 100644
--- a/gcc/testsuite/gcc.target/i386/pr101456-1.c
+++ b/gcc/testsuite/gcc.target/i386/pr101456-1.c
@@ -30,4 +30,5 @@ foo3 (void)
   bar ();
 }
 
-/* { dg-final { scan-assembler-not "vzeroupper" } } */
+/* See PR104581 for the XFAIL reason.  */
+/* { dg-final { scan-assembler-not "vzeroupper" { xfail *-*-* } } } */
-- 
2.34.1


Re: [PATCH] x86: Don't set AVX_U128_DIRTY when zeroing YMM/ZMM register

2022-02-17 Thread Richard Biener via Gcc-patches
On Wed, Jul 28, 2021 at 5:00 AM Hongtao Liu via Gcc-patches
 wrote:
>
> On Wed, Jul 28, 2021 at 10:46 AM H.J. Lu  wrote:
> >
> > On Tue, Jul 27, 2021 at 7:02 PM Hongtao Liu  wrote:
> > >
> > > On Tue, Jul 27, 2021 at 10:46 PM H.J. Lu via Gcc-patches
> > >  wrote:
> > > >
> > > > There is no SSE <-> AVX transition penalty if the upper bits of YMM/ZMM
> > > > registers are unchanged and YMM/ZMM store doesn't change the upper bits
> > > > of YMM/ZMM registers.
> > > >
> > > > 1. Since zeroing YMM/ZMM register is implemented with zeroing XMM
> > > > register, don't set AVX_U128_DIRTY when zeroing YMM/ZMM register.
> > > > 2. Since store doesn't change the INIT state on the upper bits of
> > > > YMM/ZMM register, don't set AVX_U128_DIRTY on store if the source
> > > > of store was never non-zero.
> > > >
> > > > Here are the vzeroupper count differences on SPEC CPU 2017 with
> > > >
> > > > -Ofast -march=skylake-avx512
> > > >
> > > >                  Before   After   Diff
> > > > 500.perlbench_r     226     225   -0.44%
> > > > 502.gcc_r          1263    1103   -12.67%
> > > > 503.bwaves_r         14      14   0.00%
> > > > 505.mcf_r            29      28   -3.45%
> > > > 507.cactuBSSN_r    4651    4628   -0.49%
> > > > 508.namd_r          433     432   -0.23%
> > > > 510.parest_r      20380   19347   -5.07%
> > > > 511.povray_r        495     452   -8.69%
> > > > 519.lbm_r             2       2   0.00%
> > > > 520.omnetpp_r      5954    5677   -4.65%
> > > > 521.wrf_r         12353   12339   -0.11%
> > > > 523.xalancbmk_r   13137   13001   -1.04%
> > > > 525.x264_r          192     191   -0.52%
> > > > 526.blender_r      2515    2366   -5.92%
> > > > 527.cam4_r         4601    4583   -0.39%
> > > > 531.deepsjeng_r      20      19   -5.00%
> > > > 538.imagick_r       898     805   -10.36%
> > > > 541.leela_r         427     399   -6.56%
> > > > 544.nab_r            74      74   0.00%
> > > > 548.exchange2_r      72      72   0.00%
> > > > 549.fotonik3d_r     318     318   0.00%
> > > > 554.roms_r          558     554   -0.72%
> > > > 557.xz_r             79      52   -34.18%
> > > >
> > > > and performance differences are within noise range.
> > > >
> > > > gcc/
> > > >
> > > > PR target/101456
> > > > * config/i386/i386.c (ix86_avx_u128_mode_needed): Don't set
> > > > AVX_U128_DIRTY when all bits are zero.
> > > >
> > > > gcc/testsuite/
> > > >
> > > > PR target/101456
> > > > * gcc.target/i386/pr101456-1.c: New test.
> > > > * gcc.target/i386/pr101456-2.c: Likewise.
> > > > ---
> > > >  gcc/config/i386/i386.c | 88 ++
> > > >  gcc/testsuite/gcc.target/i386/pr101456-1.c | 33 
> > > >  gcc/testsuite/gcc.target/i386/pr101456-2.c | 33 
> > > >  3 files changed, 154 insertions(+)
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-1.c
> > > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-2.c
> > > >
> > > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> > > > index 876a19f4c1f..a1eb7c18d65 100644
> > > > --- a/gcc/config/i386/i386.c
> > > > +++ b/gcc/config/i386/i386.c
> > > > @@ -14149,6 +14149,94 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
> > > >return AVX_U128_CLEAN;
> > > >  }
> > > >
> > > > +  rtx set = single_set (insn);
> > > > +  if (set)
> > > > +{
> > > > +  rtx dest = SET_DEST (set);
> > > > +  rtx src = SET_SRC (set);
> > > > +  if (ix86_check_avx_upper_register (dest))
> > > > +   {
> > > > + /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
> > > > +source isn't zero.  */
> > > > + if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
> > > > +   return AVX_U128_DIRTY;
> > > > + else
> > > > +   return AVX_U128_ANY;
> > > > +   }
> > > > +  else if (ix86_check_avx_upper_register (src))
> > > > +   {
> > > > + /* This is an YMM/ZMM store.  Check for the source operand
> > > > +of SRC DEFs in the same basic block before INSN.  */
> > > > + basic_block bb = BLOCK_FOR_INSN (insn);
> > > > + rtx_insn *end = BB_END (bb);
> > > > +
> > > > + /* Return AVX_U128_DIRTY if there is no DEF in the same basic
> > > > +block.  */
> > > > + int status = AVX_U128_DIRTY;
> > > > +
> > > > + for (df_ref def = DF_REG_DEF_CHAIN (REGNO (src));
> > > > +  def; def = DF_REF_NEXT_REG (def))
> > > > +   if (DF_REF_BB (def) == bb)
> > > > + {
> > > > +   /* Ignore DEF from different basic blocks.  */
> > > > +   rtx_insn *def_insn = DF_REF_INSN (def);
> > > > +
> > > > +   /* Check if DEF_INSN is before INSN.  */
> > > > +   rtx_insn *next;
> > > > +   for (next = NEXT_INSN (def_insn);
> > > > +next != nullptr && next != end && next != insn;
> > > > +next = NEXT_INSN (next))
> > > > + ;

This causes 

Add 'gcc/tree.cc:user_omp_clause_code_name' [PR65095] (was: [PATCH 1/4] Add function for pretty-printing OpenACC clause names)

2022-02-17 Thread Thomas Schwinge
Hi!

On 2019-10-18T14:28:18+0200, I wrote:
> On 2019-10-06T15:32:34-0700, Julian Brown  wrote:
>> This patch adds a function to pretty-print OpenACC clause names from
>> OMP_CLAUSE_MAP_KINDs, for error output.
>
> Indeed talking about (OpenMP) 'map' clauses in an OpenACC context is not
> quite ideal -- that's what PR65095 is about

>> Previously approved as part of:
>>
>>   https://gcc.gnu.org/ml/gcc-patches/2018-12/msg01292.html


> A few more comments, for later:
>
>>  gcc/c-family/c-common.h |  1 +
>>  gcc/c-family/c-omp.c| 33 +
>
> As I'd mentioned before: 'Eventually (that is, later), this should move
> into generic code, next to the other "clause printing".

As part of an ICE bug fix that I'm working on, I now need to use
this in GCC middle end code.  Once tested, OK to push the attached
"Add 'gcc/tree.cc:user_omp_clause_code_name' [PR65095]"?


Grüße
 Thomas


> Also to be
> shared with Fortran.'
>
>> --- a/gcc/c-family/c-omp.c
>> +++ b/gcc/c-family/c-omp.c
>
>> +/* For OpenACC, the OMP_CLAUSE_MAP_KIND of an OMP_CLAUSE_MAP is used 
>> internally
>> +   to distinguish clauses as seen by the user.  Return the "friendly" clause
>> +   name for error messages etc., where possible.  See also
>> +   c/c-parser.c:c_parser_oacc_data_clause and
>> +   cp/parser.c:cp_parser_oacc_data_clause.  */
>> +
>> +const char *
>> +c_omp_map_clause_name (tree clause, bool oacc)
>> +{
>> +  if (oacc && OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_MAP)
>> +switch (OMP_CLAUSE_MAP_KIND (clause))
>> +{
>> +case GOMP_MAP_FORCE_ALLOC:
>> +case GOMP_MAP_ALLOC: return "create";
>> +case GOMP_MAP_FORCE_TO:
>> +case GOMP_MAP_TO: return "copyin";
>> +case GOMP_MAP_FORCE_FROM:
>> +case GOMP_MAP_FROM: return "copyout";
>> +case GOMP_MAP_FORCE_TOFROM:
>> +case GOMP_MAP_TOFROM: return "copy";
>> +case GOMP_MAP_RELEASE: return "delete";
>> +case GOMP_MAP_FORCE_PRESENT: return "present";
>> +case GOMP_MAP_ATTACH: return "attach";
>> +case GOMP_MAP_FORCE_DETACH:
>> +case GOMP_MAP_DETACH: return "detach";
>> +case GOMP_MAP_DEVICE_RESIDENT: return "device_resident";
>> +case GOMP_MAP_LINK: return "link";
>> +case GOMP_MAP_FORCE_DEVICEPTR: return "deviceptr";
>> +default: break;
>> +}
>> +  return omp_clause_code_name[OMP_CLAUSE_CODE (clause)];
>> +}
>
> Indeed nearby (after) the 'omp_clause_code_name' definition in
> 'gcc/tree.c' would probably be a better place for this, as that's where
> the current clause names are coming from.
>
> I did wonder whether we need to explicitly translate from (OpenMP) "'map'
> clause" into (OpenACC) "'create' clause" etc., or if a generic (OpenACC)
> "data clause" would be sufficient?  (After all, in diagnostics we also
> print out the original code, so the user can then see which specific data
> clause is being complained about.  But -- somewhat funnily! -- the way
> you're doing this might actually be better in terms of translatability in
> diagnostics printing: "%qs clause" might require a different translation
> when its "%s" can be "'map'" (doesn't get translated) vs. "data" (gets
> translated), but remains the same when "%s" is "'map'" vs. "'create'"
> etc.
>
> Do we at all still generate 'GOMP_MAP_FORCE_*' anywhere, or should these
> in fact be 'gcc_unreachable'?
>
> Generally, I prefer if all possible 'case's are listed explicitly, and
> then the 'default' (and here OpenMP-only ones, too) be 'gcc_unreachable',
> so that we easily catch the case that new 'GOMP_MAP_*' get added but such
> functions not updated, for example.
>
>
> Regards
>  Thomas
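
The preference stated above (list every 'case' explicitly and make the
'default' unreachable) can be sketched in plain C as follows.  This is only
an illustration: 'enum demo_map_kind' and 'demo_clause_name' are
hypothetical stand-ins rather than the real 'GOMP_MAP_*' enumeration or the
function from the patch, and 'abort' stands in for 'gcc_unreachable'.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a handful of GOMP_MAP_* kinds; not the real
   enumeration.  */
enum demo_map_kind
{
  DEMO_MAP_ALLOC,
  DEMO_MAP_TO,
  DEMO_MAP_FROM,
  DEMO_MAP_TOFROM
};

/* Return the user-visible OpenACC clause name for KIND.  Every known
   enumerator is listed explicitly; anything else is treated as a bug and
   aborts (standing in for gcc_unreachable), so a newly added kind that
   was not handled here is caught at run time.  */
static const char *
demo_clause_name (enum demo_map_kind kind)
{
  switch (kind)
    {
    case DEMO_MAP_ALLOC: return "create";
    case DEMO_MAP_TO: return "copyin";
    case DEMO_MAP_FROM: return "copyout";
    case DEMO_MAP_TOFROM: return "copy";
    default: abort ();
    }
}
```

With this shape, forgetting to extend the function after adding a new kind
fails loudly instead of silently falling back to a generic name.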


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From 741a15e861fb97f720d527f917b5888c2b9324e9 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Thu, 17 Feb 2022 12:46:57 +0100
Subject: [PATCH] Add 'gcc/tree.cc:user_omp_clause_code_name' [PR65095]

Re PR65095 "Adapt OpenMP diagnostic messages for OpenACC", move
'gcc/c-family/c-omp.cc:c_omp_map_clause_name' C/C++ front end to
'gcc/tree.cc:user_omp_clause_code_name' middle end.  No functional change.

	PR other/65095
	TODO
---
 gcc/c-family/c-common.h |  1 -
 gcc/c-family/c-omp.cc   | 33 -
 gcc/c/c-typeck.cc   |  4 ++--
 gcc/cp/semantics.cc |  4 ++--
 gcc/tree-core.h |  1 +
 gcc/tree.cc | 36 
 6 files changed, 41 insertions(+), 38 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index ee0c4de2a05..09e4c378cb9 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1249,7 +1249,6 @@ extern enum omp_clause_default_kind c_omp_predetermined_sharing (tree);
 extern enum omp_clause_defaultmap_kind c_omp_predetermined_mapping (tree);
 extern tree 

Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'

2022-02-17 Thread Thomas Schwinge
Hi!

On 2022-02-11T08:02:20+0100, Richard Biener  wrote:
> On Thu, Feb 10, 2022 at 11:20 PM Thomas Schwinge
>  wrote:
>> On 2022-02-10T16:36:51+, Michael Matz via Gcc-patches 
>>  wrote:
>> > On Thu, 10 Feb 2022, Richard Biener via Gcc-patches wrote:
>> >> On Wed, Feb 9, 2022 at 2:21 PM Thomas Schwinge  
>> >> wrote:
>> >> > OK to push (now, or in next development stage 1?) the attached
>> >> > "Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'",
>> >> > or should that be done differently -- or, per the current state (why?)
>> >> > not at all?
>>
>> First, thanks for (indirectly) having confirmed that my confusion is not
>> completely off, why this is currently missing.  ;-)
>>
>> >> Hmm, I wonder if we shouldn't simply dump DECL_UID as
>> >>
>> >>  'uid NNN'
>> >
>> > Yes, much better in line with the normal dump_tree output.
>>
>> >> somewhere.  For example after or before DECL_NAME?
>>
>> Heh -- that's what I wanted to do initially, but then I saw that we've
>> currently got in 'print_node_brief' (and very similar in 'print_node'):
>>
>> [...]
>>   fprintf (file, "%s <%s", prefix, get_tree_code_name (TREE_CODE (node)));
>>   dump_addr (file, " ", node);
>>
>>   if (tclass == tcc_declaration)
>> {
>>   if (DECL_NAME (node))
>> fprintf (file, " %s", IDENTIFIER_POINTER (DECL_NAME (node)));
>>   else if (TREE_CODE (node) == LABEL_DECL
>>&& LABEL_DECL_UID (node) != -1)
>> {
>>   if (dump_flags & TDF_NOUID)
>> fprintf (file, " L.");
>>   else
>> fprintf (file, " L.%d", (int) LABEL_DECL_UID (node));
>> }
>>   else
>> {
>>   if (dump_flags & TDF_NOUID)
>> fprintf (file, " %c.",
>>  TREE_CODE (node) == CONST_DECL ? 'C' : 'D');
>>   else
>> fprintf (file, " %c.%u",
>>  TREE_CODE (node) == CONST_DECL ? 'C' : 'D',
>>  DECL_UID (node));
>> }
>> }
>> [...]
>>
>> That is, if there's no 'DECL_NAME', we print 'L.[UID]', 'C.[UID]',
>> 'D.[UID]'.  The same we do in 'gcc/tree-pretty-print.cc:dump_decl_name',
>> I found.  But in the latter function, we also do it that same way if
>> there is a 'DECL_NAME' ('i' -> 'iD.4249', for example), so that's why I
>> copied that style back to my proposed 'print_node_brief'/'print_node'
>> change.
>>
>> Are you now suggesting to only print 'DECL_NAME' as '[NAME] uid [UID]',
>> but keep 'L.[UID]', 'C.[UID]', 'D.[UID]' in the "dot" form, or change
>> these to 'L uid [UID]', 'C uid [UID]', 'D uid [UID]' correspondingly?
>
> I'd say these should then be 'D.[UID] uid [UID]' even if that's
> somewhat redundant.

Sure, that's fine for me.  So, like in the attached
"Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief', 'print_node'";
OK to push?  (...  which evidently I forgot to send last week...)


Regards
 Thomas
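
The output format agreed on above (keep the existing name or 'D.[UID]'
text, then append 'uid [UID]' only when the UID flag is set) can be
sketched as below.  This is not the attached patch: 'struct demo_decl',
'DEMO_TDF_UID', 'DEMO_TDF_NOUID' and 'demo_format_decl' are hypothetical
names for illustration, and the flag values are not the real 'TDF_*' ones.

```c
#include <stdio.h>

/* Hypothetical flag bits; the real TDF_* values differ.  */
enum { DEMO_TDF_UID = 1, DEMO_TDF_NOUID = 2 };

/* Hypothetical, much simplified declaration record.  */
struct demo_decl
{
  const char *name;	/* NULL for an unnamed declaration.  */
  unsigned uid;
};

/* Format DECL into BUF (of SIZE bytes).  The leading part is unchanged
   from the existing scheme: the name if there is one, else 'D.[UID]' (or
   just 'D.' under DEMO_TDF_NOUID).  If DEMO_TDF_UID is set, ' uid [UID]'
   is appended, even if that is redundant with 'D.[UID]'.  Returns the
   number of characters written, not counting the terminating NUL.  */
static int
demo_format_decl (char *buf, size_t size, const struct demo_decl *decl,
		  int flags)
{
  int n;

  if (decl->name)
    n = snprintf (buf, size, " %s", decl->name);
  else if (flags & DEMO_TDF_NOUID)
    n = snprintf (buf, size, " D.");
  else
    n = snprintf (buf, size, " D.%u", decl->uid);

  if ((flags & DEMO_TDF_UID) && n >= 0 && (size_t) n < size)
    n += snprintf (buf + n, size - (size_t) n, " uid %u", decl->uid);

  return n;
}
```

Without the UID flag the output is the same as before, which matches the
"no change if 'TDF_UID' isn't set" property described for the patch.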


>> And also do the similar changes in
>> 'gcc/tree-pretty-print.cc:dump_decl_name' (as well as another dozen or so
>> places where such things are printed...), or don't change those?
>
> Don't change those - you were targeting the tree dumper, not the
> pretty printers.
> The tree dumpers generally dump attributes separately.
>
>
>>
>> I don't care very much which way, just have some slight preference to
>> keep things similar.
>>
>>
>> Regards
>>  Thomas


From 0a39ef2415e5b4376e5554533b33ff86f16d Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 11 Feb 2022 10:10:25 +0100
Subject: [PATCH] Consider 'TDF_UID', 'TDF_NOUID' in 'print_node_brief',
 'print_node'

Running GCC with '-fdump-tree-all-uid' (so that 'TDF_UID' is set in
'dump_flags') and '-wrapper gdb,--args', then for a 'call debug_tree(decl)',
that does (pretty-)print all kinds of things -- but not the 'DECL_UID':

[...]
(gdb) print dump_flags & TDF_UID
$1 = 256
(gdb) call debug_tree(decl)
 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x77e8 precision:32 min  max 
pointer_to_this >
used SI source-gcc/gcc/testsuite/gfortran.dg/goacc-gomp/pr102330-3.f90:10:3 size  unit-size 
align:32 warn_if_not_align:0 context >
(gdb) print decl.decl_minimal.uid
$3 = 4249

In my opinion, that's a bit unfortunate, as the 'DECL_UID' is very important
for debugging certain classes of issues.

With this patch, there is no change if 'TDF_UID' isn't set, but if it is, we
now append 'uid [DECL_UID]':

 
unit-size 
align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 

Re: libgo patch committed: Update to Go1.18beta2 release

2022-02-17 Thread Eric Botcazou via Gcc-patches
> I've committed this patch to fix these problems.  Bootstrapped and ran
> Go testsuite on x86_64-pc-linux-gnu and x86_64-solaris.

Fine by me, thanks for the quick turnaround!

-- 
Eric Botcazou




Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-17 Thread Richard Biener via Gcc-patches
On Thu, 17 Feb 2022, Jan Hubicka wrote:

> > +/* Returns whether the control parents of BB are preserved.  */
> > +
> > +static bool
> > +control_parents_preserved_p (basic_block bb)
> > +{
> > +  /* If we marked the control parents from BB they are preserved.  */
> > +  if (bitmap_bit_p (visited_control_parents, bb->index))
> > +return true;
> > +
> > +  /* But they can also end up being marked from elsewhere.  */
> > +  bitmap_iterator bi;
> > +  unsigned edge_number;
> > +  EXECUTE_IF_SET_IN_BITMAP (cd->get_edges_dependent_on (bb->index),
> > +   0, edge_number, bi)
> > +{
> > +  basic_block cd_bb = cd->get_edge_src (edge_number);
> > +  if (cd_bb != bb
> > + && !bitmap_bit_p (last_stmt_necessary, cd_bb->index))
> > +   return false;
> > +}
> > +  return true;
> I suppose you can also set visited_control_parents bit here so the loop
> is not re-done for every clobber in a BB.

Good idea, will do that.

Richard.


Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-17 Thread Jan Hubicka via Gcc-patches
> +/* Returns whether the control parents of BB are preserved.  */
> +
> +static bool
> +control_parents_preserved_p (basic_block bb)
> +{
> +  /* If we marked the control parents from BB they are preserved.  */
> +  if (bitmap_bit_p (visited_control_parents, bb->index))
> +return true;
> +
> +  /* But they can also end up being marked from elsewhere.  */
> +  bitmap_iterator bi;
> +  unsigned edge_number;
> +  EXECUTE_IF_SET_IN_BITMAP (cd->get_edges_dependent_on (bb->index),
> + 0, edge_number, bi)
> +{
> +  basic_block cd_bb = cd->get_edge_src (edge_number);
> +  if (cd_bb != bb
> +   && !bitmap_bit_p (last_stmt_necessary, cd_bb->index))
> + return false;
> +}
> +  return true;
I suppose you can also set visited_control_parents bit here so the loop
is not re-done for every clobber in a BB.

Honza
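
The caching idea suggested above can be sketched in standalone C.  This is
a toy model only: it uses plain 64-bit masks instead of GCC bitmaps and the
control dependence structure, and 'demo_visited', 'demo_necessary',
'demo_scans' and 'demo_parents_preserved_p' are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MAX_BLOCKS 64

/* Bit I set: the control parents of block I are known to be preserved.  */
static uint64_t demo_visited;
/* Bit I set: the last statement of block I is necessary.  */
static uint64_t demo_necessary;
/* Number of full scans performed, kept only to show the caching effect.  */
static unsigned demo_scans;

/* Return whether all blocks in DEPS (a mask of the blocks that block BB is
   control dependent on, BB < DEMO_MAX_BLOCKS) are necessary.  A positive
   answer is cached in demo_visited, so the scan is not repeated for every
   further query about the same block.  */
static bool
demo_parents_preserved_p (unsigned bb, uint64_t deps)
{
  if (demo_visited & (UINT64_C (1) << bb))
    return true;

  demo_scans++;
  for (unsigned i = 0; i < DEMO_MAX_BLOCKS; i++)
    if ((deps & (UINT64_C (1) << i))
	&& i != bb
	&& !(demo_necessary & (UINT64_C (1) << i)))
      return false;

  /* Remember the positive result for later queries.  */
  demo_visited |= UINT64_C (1) << bb;
  return true;
}
```

Only the positive result is cached here; a negative result is recomputed,
which is the conservative choice if necessity bits can still change.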


Re: [PATCHv3] libiberty rust-demangle, ignore .suffix

2022-02-17 Thread Mark Wielaard
Ping. Is this OK to commit now?
I am not sure who can approve this.

On Sun, Jan 16, 2022 at 01:35:34AM +0100, Mark Wielaard wrote:
> Rust symbols can have a .suffix because of compiler transformations.
> These can be ignored in the demangled name, which is what this patch
> implements, by stopping at the first dot for v0 symbols and searching
> backwards to the ending 'E' for legacy symbols.
> 
> An alternative implementation could be to follow what C++ does and
> represent these as [clone .suffix] tagged onto the demangled name.
> But this seems somewhat confusing since it results in a demangled
> name that cannot be mangled again. And it would mean trying to
> decode compiler internal naming.
> 
> https://bugs.kde.org/show_bug.cgi?id=445916
> https://github.com/rust-lang/rust/issues/60705
> 
> libiberty/Changelog
> 
>   * rust-demangle.c (rust_demangle_callback): Ignore everything
>   after '.' char in sym for v0. For legacy symbols search
>   backwards to find the last 'E' before any '.'.
>   * testsuite/rust-demangle-expected: Add new .suffix testcases.
> ---
>  libiberty/rust-demangle.c  | 21 ++---
>  libiberty/testsuite/rust-demangle-expected | 26 ++
>  2 files changed, 44 insertions(+), 3 deletions(-)
> 
> V3 - Add more testcases
>- Allow @ in legacy symbols (which can appear in the .suffix)
> 
> diff --git a/libiberty/rust-demangle.c b/libiberty/rust-demangle.c
> index 18c760491bdc..42c88161da30 100644
> --- a/libiberty/rust-demangle.c
> +++ b/libiberty/rust-demangle.c
> @@ -1340,13 +1340,19 @@ rust_demangle_callback (const char *mangled, int options,
>/* Rust symbols (v0) use only [_0-9a-zA-Z] characters. */
>for (p = rdm.sym; *p; p++)
>  {
> +  /* Rust v0 symbols can have '.' suffixes, ignore those.  */
> +  if (rdm.version == 0 && *p == '.')
> +break;
> +
>rdm.sym_len++;
>  
>if (*p == '_' || ISALNUM (*p))
>  continue;
>  
> -  /* Legacy Rust symbols can also contain [.:$] characters. */
> -  if (rdm.version == -1 && (*p == '$' || *p == '.' || *p == ':'))
> +  /* Legacy Rust symbols can also contain [.:$] characters.
> + Or @ in the .suffix (which will be skipped, see below). */
> +  if (rdm.version == -1 && (*p == '$' || *p == '.' || *p == ':'
> +|| *p == '@'))
>  continue;
>  
>return 0;
> @@ -1355,7 +1361,16 @@ rust_demangle_callback (const char *mangled, int options,
>/* Legacy Rust symbols need to be handled separately. */
>if (rdm.version == -1)
>  {
> -  /* Legacy Rust symbols always end with E. */
> +  /* Legacy Rust symbols always end with E.  But can be followed by a
> + .suffix (which we want to ignore).  */
> +  int dot_suffix = 1;
> +  while (rdm.sym_len > 0 &&
> + !(dot_suffix && rdm.sym[rdm.sym_len - 1] == 'E'))
> +{
> +  dot_suffix = rdm.sym[rdm.sym_len - 1] == '.';
> +  rdm.sym_len--;
> +}
> +
>if (!(rdm.sym_len > 0 && rdm.sym[rdm.sym_len - 1] == 'E'))
>  return 0;
>rdm.sym_len--;
> diff --git a/libiberty/testsuite/rust-demangle-expected 
> b/libiberty/testsuite/rust-demangle-expected
> index 7dca315d0054..b565084cfefa 100644
> --- a/libiberty/testsuite/rust-demangle-expected
> +++ b/libiberty/testsuite/rust-demangle-expected
> @@ -295,3 +295,29 @@ _RMCs4fqI2P2rA04_13const_genericINtB0_4CharKc2202_E
>  --format=auto
>  _RNvNvMCs4fqI2P2rA04_13const_genericINtB4_3FooKpE3foo3FOO
>  >::foo::FOO
> +#
> +# Suffixes
> +#
> +--format=rust
> +_RNvMs0_NtCs5l0EXMQXRMU_21rustc_data_structures17obligation_forestINtB5_16ObligationForestNtNtNtCsdozMG8X9FIu_21rustc_trait_selection6traits7fulfill26PendingPredicateObligationE22register_obligation_atB1v_.llvm.8517020237817239694
> +>::register_obligation_at
> +--format=rust
> +_ZN4core3ptr85drop_in_place$LT$std..rt..lang_start$LT$$LP$$RP$$GT$..$u7b$$u7b$closure$u7d$$u7d$$GT$17h27f14859c664490dE.llvm.8091179795805947855
> +core::ptr::drop_in_place::{{closure}}>
> +# old style rustc llvm thinlto
> +--format=rust
> +_ZN9backtrace3foo17hbb467fcdaea5d79bE.llvm.A5310EB9
> +backtrace::foo
> +--format=rust
> +_ZN9backtrace3foo17hbb467fcdaea5d79bE.llvm.A5310EB9@@16
> +backtrace::foo
> +# new style rustc llvm thinlto
> +--format=rust
> +_RC3foo.llvm.9D1C9369
> +foo
> +--format=rust
> +_RC3foo.llvm.9D1C9369@@16
> +foo
> +--format=rust
> +_RNvC9backtrace3foo.llvm.A5310EB9
> +backtrace::foo
> -- 
> 2.30.2
> 
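
The two suffix rules in the quoted patch can be tried in isolation with the
sketch below.  It mirrors the loops from the diff but works on a whole
NUL-terminated string rather than on the demangler state; 'demo_v0_len' and
'demo_legacy_len' are hypothetical helper names, not libiberty functions.

```c
#include <stddef.h>
#include <string.h>

/* v0 symbols: everything from the first '.' onwards is a suffix, so the
   symbol proper ends just before it.  */
static size_t
demo_v0_len (const char *sym)
{
  return strcspn (sym, ".");
}

/* Legacy symbols: they may contain '.' themselves, so search backwards
   for an 'E' that is either the last character or directly followed by
   '.'.  Returns the length up to and including that 'E', or 0 if there
   is none.  */
static size_t
demo_legacy_len (const char *sym)
{
  size_t len = strlen (sym);
  int dot_suffix = 1;

  while (len > 0 && !(dot_suffix && sym[len - 1] == 'E'))
    {
      /* Remember whether the character just skipped was a '.', because
	 only an 'E' right before a '.' can end the symbol proper.  */
      dot_suffix = sym[len - 1] == '.';
      len--;
    }
  return len;
}
```

For the thinlto testcase above, the 'E' inside the '.llvm.A5310EB9' suffix
is skipped because it is not followed by a '.', so the search continues to
the 'E' that really ends the symbol.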


[PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-17 Thread Richard Biener via Gcc-patches
CD-DCE does not consider CLOBBERs as necessary in the attempt
to not prevent DCE of SSA defs it uses.  A side-effect of that
is that it also removes all its control dependences if they are
not made necessary by other means.  When we later try to preserve
as many CLOBBERs as possible we have to make sure we also
preserved the controlling conditions, otherwise a CLOBBER can
now appear on a path where it was not executed before, leading
to wrong code as seen in the testcase.

I've tried to continue to handle both direct and indirect
CLOBBERs optimistically, allowing CD-DCE to remove control
flow that just controls CLOBBERs but that regresses for
example the stack coalescing test g++.dg/opt/pr80032.C.
The pattern there is
  if (pred) D.2512 = CLOBBER; else D.2512 = CLOBBER;
basically we have all paths leading to the same clobber but
we could safely cut some branches which we do not realize
early enough.  This regression can be mitigated by no longer
considering direct CLOBBERs optimistically - the original
motivation for the CD-DCE handling wasn't removal of control
flow but SSA defs of the address.

Handling indirect vs. direct clobbers differently feels
somewhat wrong, still the patch goes with this solution.

Bootstrapped and tested on x86_64-unknown-linux-gnu, will
push later today unless I hear otherwise.

Richard.

2022-02-15  Richard Biener  

PR tree-optimization/96881
* tree-ssa-dce.cc (mark_stmt_if_obviously_necessary): Comment
CLOBBER handling.
(control_parents_preserved_p): New function.
(eliminate_unnecessary_stmts): Check that we preserved control
parents before retaining a CLOBBER.
(perform_tree_ssa_dce): Pass down aggressive flag
to eliminate_unnecessary_stmts.

* g++.dg/torture/pr96881-1.C: New testcase.
* g++.dg/torture/pr96881-2.C: Likewise.
---
 gcc/testsuite/g++.dg/torture/pr96881-1.C | 37 
 gcc/testsuite/g++.dg/torture/pr96881-2.C | 37 
 gcc/tree-ssa-dce.cc  | 37 +---
 3 files changed, 107 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr96881-1.C
 create mode 100644 gcc/testsuite/g++.dg/torture/pr96881-2.C

diff --git a/gcc/testsuite/g++.dg/torture/pr96881-1.C 
b/gcc/testsuite/g++.dg/torture/pr96881-1.C
new file mode 100644
index 000..1c182e6a8b3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr96881-1.C
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+
+struct S { int s; ~S () {} } s;
+
+void __attribute__((noipa))
+foo (struct S *s, int flag)
+{
+  s->s = 1;
+  // We have to make sure to not make the inlined CLOBBER
+  // unconditional but we have to remove it to be able
+  // to elide the branch
+  if (!flag)
+return;
+  s->~S();
+}
+
+void __attribute__((noipa))
+bar (struct S *s, int flag)
+{
+  s->s = 1;
+  // CD-DCE chooses an arbitrary path, try to present it
+  // with all variants
+  if (flag)
+s->~S();
+}
+
+int
+main ()
+{
+  foo (&s, 0);
+  if (s.s != 1)
+__builtin_abort ();
+  bar (&s, 0);
+  if (s.s != 1)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/torture/pr96881-2.C 
b/gcc/testsuite/g++.dg/torture/pr96881-2.C
new file mode 100644
index 000..35c788791e8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr96881-2.C
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+
+struct S { int s; ~S () {} } s;
+
+void __attribute__((noipa))
+foo (int flag)
+{
+  s.s = 1;
+  // We have to make sure to not make the inlined CLOBBER
+  // unconditional but we have to remove it to be able
+  // to elide the branch
+  if (!flag)
+return;
+  s.~S();
+}
+
+void __attribute__((noipa))
+bar (int flag)
+{
+  s.s = 1;
+  // CD-DCE chooses an arbitrary path, try to present it
+  // with all variants
+  if (flag)
+s.~S();
+}
+
+int
+main ()
+{
+  foo (0);
+  if (s.s != 1)
+__builtin_abort ();
+  bar (0);
+  if (s.s != 1)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-dce.cc b/gcc/tree-ssa-dce.cc
index f1034878eaf..ea8c6c98b81 100644
--- a/gcc/tree-ssa-dce.cc
+++ b/gcc/tree-ssa-dce.cc
@@ -284,7 +284,10 @@ mark_stmt_if_obviously_necessary (gimple *stmt, bool 
aggressive)
   break;
 
 case GIMPLE_ASSIGN:
-  if (gimple_clobber_p (stmt))
+  /* Mark indirect CLOBBERs to be lazily removed if their SSA operands
+do not prevail.  That also makes control flow leading to them
+not necessary in aggressive mode.  */
+  if (gimple_clobber_p (stmt) && !zero_ssa_operands (stmt, SSA_OP_USE))
return;
   break;
 
@@ -1268,11 +1271,34 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator 
*gsi,
   gimplify_and_update_call_from_tree (gsi, result);
 }
 
+/* Returns whether the control parents of BB are preserved.  */
+
+static bool
+control_parents_preserved_p (basic_block bb)
+{
+  /* If we marked the control parents from BB they are preserved.  */
+  if (bitmap_bit_p (visited_control_parents, bb->index))
+ 

Re: [PATCH][gcc][middle-end] PR104498: Fix comparing symbol reference

2022-02-17 Thread Richard Biener via Gcc-patches
On Wed, 16 Feb 2022, Andre Vieira (lists) wrote:

> Hi,
> 
> As reported on PR104498, the issue here is that compare_base_symbol_refs
> swaps x and y but doesn't take that into account when computing the distance.
> This patch makes sure that if x and y are swapped, we correct the distance
> computation by multiplying it by -1 to end up with the correct expected result
> of the original Y_BASE - X_BASE.
> 
> Bootstrapped and regression tested on aarch64-none-linux.
> 
> OK for trunk?

OK.

Thanks,
Richard.

> gcc/ChangeLog:
> 
>     PR middle-end/104498
>     * alias.cc (compare_base_symbol_refs): Correct distance 
> computation when
>     swapping x and y.
> 
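
The sign correction described in the quoted patch can be illustrated with a
toy model.  The swap criterion below (smaller base first) is an assumption
made for the example and is not the condition used in
compare_base_symbol_refs; 'demo_base_distance' is a hypothetical name.

```c
#include <stdbool.h>

/* Return Y_BASE - X_BASE.  Internally the operands may be swapped into a
   canonical order first; if so, the distance computed afterwards has the
   wrong sign and must be multiplied by -1 to give the caller the result
   for the original operand order.  */
static long
demo_base_distance (long x_base, long y_base)
{
  bool swapped = false;

  if (x_base > y_base)
    {
      long tmp = x_base;
      x_base = y_base;
      y_base = tmp;
      swapped = true;
    }

  /* Distance in the canonical (possibly swapped) order.  */
  long distance = y_base - x_base;

  /* Undo the effect of the swap on the sign.  */
  return swapped ? -distance : distance;
}
```

Dropping the final negation reproduces the kind of bug being fixed: the
caller would see the same distance regardless of operand order.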


Re: [PATCH] valtrack: Avoid creating raw SUBREGs with VOIDmode argument [PR104557]

2022-02-17 Thread Richard Biener via Gcc-patches
On Thu, 17 Feb 2022, Jakub Jelinek wrote:

> Hi!
> 
> After the recent r12-7240 simplify_immed_subreg changes, we bail on more
> simplify_subreg calls than before, e.g. apparently for decimal modes
> in the NaN representations  we almost never preserve anything except the
> canonical {q,s}NaNs.
> simplify_gen_subreg will punt in such cases because a SUBREG with VOIDmode
> is not valid, but debug_lowpart_subreg wants to attempt even harder, even
> if e.g. target indicates certain mode combinations aren't valid for the
> backend, dwarf2out can still handle them.  But a SUBREG from a VOIDmode
> operand is just too much, the inner mode is lost there.  We'd need some
> new rtx that would be able to represent those cases.
> For now, just punt in those cases.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Richard.

> 2022-02-17  Jakub Jelinek  
> 
>   PR debug/104557
>   * valtrack.cc (debug_lowpart_subreg): Don't call gen_rtx_raw_SUBREG
>   if expr has VOIDmode.
> 
>   * gcc.dg/dfp/pr104557.c: New test.
> 
> --- gcc/valtrack.cc.jj2022-01-18 11:59:00.252972485 +0100
> +++ gcc/valtrack.cc   2022-02-16 11:29:28.234826860 +0100
> @@ -558,7 +558,9 @@ debug_lowpart_subreg (machine_mode outer
>rtx ret = simplify_gen_subreg (outer_mode, expr, inner_mode, offset);
>if (ret)
>  return ret;
> -  return gen_rtx_raw_SUBREG (outer_mode, expr, offset);
> +  if (GET_MODE (expr) != VOIDmode)
> +return gen_rtx_raw_SUBREG (outer_mode, expr, offset);
> +  return NULL_RTX;
>  }
>  
>  /* If UREGNO is referenced by any entry in DEBUG, emit a debug insn
> --- gcc/testsuite/gcc.dg/dfp/pr104557.c.jj 2022-02-16 11:36:03.733329235 +0100
> +++ gcc/testsuite/gcc.dg/dfp/pr104557.c   2022-02-16 11:35:27.599831513 +0100
> @@ -0,0 +1,22 @@
> +/* PR debug/104557 */
> +/* { dg-do compile } */
> +/* { dg-options "-O -g -Wno-psabi" } */
> +
> +typedef int __attribute__((__vector_size__ (32))) U;
> +typedef double __attribute__((__vector_size__ (32))) F;
> +typedef _Decimal64 __attribute__((__vector_size__ (32))) D;
> +
> +F
> +bar (void)
> +{
> +  F f = __builtin_convertvector ((D) (-10.d < (D) ((D) (U) { 0, 0, 0, 0, 0, 0, 0, -0xe0 }
> +>= (D) { 8000 })), F);
> +  return f;
> +}
> +
> +F
> +foo ()
> +{
> +  F x = bar ();
> +  return x;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev; HRB 36809 (AG Nuernberg)


Re: [PATCH] x86: Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER

2022-02-17 Thread Richard Biener via Gcc-patches
On Thu, Feb 17, 2022 at 8:52 AM Uros Bizjak via Gcc-patches
 wrote:
>
> On Thu, Feb 17, 2022 at 6:25 AM Hongtao Liu via Gcc-patches
>  wrote:
> >
> > On Thu, Feb 17, 2022 at 12:26 PM H.J. Lu via Gcc-patches
> >  wrote:
> > >
> > > Reading YMM registers with all zero bits needs VZEROUPPER on Sandy Bridge,
> > > Ivy Bridge, Haswell, Broadwell and Alder Lake to avoid SSE <-> AVX
> > > transition penalty.  Add TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER to
> > > generate vzeroupper instruction after loading all-zero YMM/ZMM registers
> > > and enable it by default.
> > Wouldn't TARGET_READ_ZERO_YMM_ZMM_NONEED_VZEROUPPER sound a bit smoother?
> > Because originally we needed to add vzeroupper to all avx<->sse cases,
> > now it's a tune to indicate that we don't need to add it in some
>
> Perhaps we should go from the other side and use
> X86_TUNE_OPTIMIZE_AVX_READ for new processors?

Btw, do you have a micro-benchmark to test this on AMD archs?

Thanks,
Richard.

> Uros.
>
> > cases.
> > >
> > > gcc/
> > >
> > > PR target/101456
> > > * config/i386/i386.cc (ix86_avx_u128_mode_needed): Skip the
> > > vzeroupper optimization if target needs vzeroupper after reading
> > > all-zero YMM/YMM registers.
> > > * config/i386/i386.h (TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER):
> > > New.
> > > * config/i386/x86-tune.def
> > > (X86_TUNE_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER): New.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/101456
> > > * gcc.target/i386/pr101456-1.c (dg-options): Add
> > > -mtune-ctrl=^read_zero_ymm_zmm_need_vzeroupper.
> > > * gcc.target/i386/pr101456-2.c: Likewise.
> > > * gcc.target/i386/pr101456-3.c: New test.
> > > * gcc.target/i386/pr101456-4.c: Likewise.
> > > ---
> > >  gcc/config/i386/i386.cc| 51 --
> > >  gcc/config/i386/i386.h |  2 +
> > >  gcc/config/i386/x86-tune.def   |  5 +++
> > >  gcc/testsuite/gcc.target/i386/pr101456-1.c |  2 +-
> > >  gcc/testsuite/gcc.target/i386/pr101456-2.c |  2 +-
> > >  gcc/testsuite/gcc.target/i386/pr101456-3.c | 33 ++
> > >  gcc/testsuite/gcc.target/i386/pr101456-4.c | 33 ++
> > >  7 files changed, 103 insertions(+), 25 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-3.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr101456-4.c
> > >
> > > diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
> > > index cf246e74e57..1f8b4caf24c 100644
> > > --- a/gcc/config/i386/i386.cc
> > > +++ b/gcc/config/i386/i386.cc
> > > @@ -14502,33 +14502,38 @@ ix86_avx_u128_mode_needed (rtx_insn *insn)
> > >
> > >subrtx_iterator::array_type array;
> > >
> > > -  rtx set = single_set (insn);
> > > -  if (set)
> > > +  if (!TARGET_READ_ZERO_YMM_ZMM_NEED_VZEROUPPER)
> > >  {
> > > -  rtx dest = SET_DEST (set);
> > > -  rtx src = SET_SRC (set);
> > > -  if (ix86_check_avx_upper_register (dest))
> > > +  /* Perform this vzeroupper optimization if target doesn't need
> > > +vzeroupper after reading all-zero YMM/YMM registers.  */
> > > +  rtx set = single_set (insn);
> > > +  if (set)
> > > {
> > > - /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
> > > -source isn't zero.  */
> > > - if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
> > > -   return AVX_U128_DIRTY;
> > > + rtx dest = SET_DEST (set);
> > > + rtx src = SET_SRC (set);
> > > + if (ix86_check_avx_upper_register (dest))
> > > +   {
> > > + /* This is an YMM/ZMM load.  Return AVX_U128_DIRTY if the
> > > +source isn't zero.  */
> > > + if (standard_sse_constant_p (src, GET_MODE (dest)) != 1)
> > > +   return AVX_U128_DIRTY;
> > > + else
> > > +   return AVX_U128_ANY;
> > > +   }
> > >   else
> > > -   return AVX_U128_ANY;
> > > -   }
> > > -  else
> > > -   {
> > > - FOR_EACH_SUBRTX (iter, array, src, NONCONST)
> > > -   if (ix86_check_avx_upper_register (*iter))
> > > - {
> > > -   int status = ix86_avx_u128_mode_source (insn, *iter);
> > > -   if (status == AVX_U128_DIRTY)
> > > - return status;
> > > - }
> > > -   }
> > > +   {
> > > + FOR_EACH_SUBRTX (iter, array, src, NONCONST)
> > > +   if (ix86_check_avx_upper_register (*iter))
> > > + {
> > > +   int status = ix86_avx_u128_mode_source (insn, *iter);
> > > +   if (status == AVX_U128_DIRTY)
> > > + return status;
> > > + }
> > > +   }
> > >
> > > -  /* This isn't YMM/ZMM load/store.  */
> > > -  return AVX_U128_ANY;
> > > + /* This isn't YMM/ZMM load/store.  */

Re: [PATCH V2] Restrict the two sources of vect_recog_cond_expr_convert_pattern to be of the same type when convert is extension.

2022-02-17 Thread Richard Biener via Gcc-patches
On Thu, Feb 17, 2022 at 6:32 AM liuhongt via Gcc-patches
 wrote:
>
> > I find this quite unreadable, it looks like if @2 and @3 are treated
> > differently.  I think keeping the old 3 lines and just adding
> >   && (TYPE_PRECISION (TREE_TYPE (@0)) >= TYPE_PRECISION (type)
> >   || (TYPE_UNSIGNED (TREE_TYPE (@2))
> >   == TYPE_UNSIGNED (TREE_TYPE (@3
> > after it ideally with a comment why would be better.
> Update patch.

OK.

Thanks,
Richard.

> gcc/ChangeLog:
>
> PR tree-optimization/104551
> PR tree-optimization/103771
> * match.pd (cond_expr_convert_p): Add types_match check when
> convert is extension.
> * tree-vect-patterns.cc
> (gimple_cond_expr_convert_p): Adjust comments.
> (vect_recog_cond_expr_convert_pattern): Ditto.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr104551.c: New test.
> ---
>  gcc/match.pd |  6 ++
>  gcc/testsuite/gcc.target/i386/pr104551.c | 24 
>  gcc/tree-vect-patterns.cc|  6 --
>  3 files changed, 34 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr104551.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 05a10ab6bfd..8b6f22f1065 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -7698,5 +7698,11 @@ and,
>   == TYPE_PRECISION (TREE_TYPE (@2))
> && TYPE_PRECISION (TREE_TYPE (@0))
>   == TYPE_PRECISION (TREE_TYPE (@3))
> +   /* For vect_recog_cond_expr_convert_pattern, @2 and @3 can differ in
> + signedness when the convert is a truncation, but not when it is an
> + extension, since that is sign_extend vs zero_extend.  */
> +   && (TYPE_PRECISION (TREE_TYPE (@0)) > TYPE_PRECISION (type)
> +  || (TYPE_UNSIGNED (TREE_TYPE (@2))
> +  == TYPE_UNSIGNED (TREE_TYPE (@3
> && single_use (@4)
> && single_use (@5
> diff --git a/gcc/testsuite/gcc.target/i386/pr104551.c 
> b/gcc/testsuite/gcc.target/i386/pr104551.c
> new file mode 100644
> index 000..6300f25c0d5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr104551.c
> @@ -0,0 +1,24 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -mavx2" } */
> +/* { dg-require-effective-target avx2 } */
> +
> +unsigned int
> +__attribute__((noipa))
> +test(unsigned int a, unsigned char p[16]) {
> +  unsigned int res = 0;
> +  for (unsigned b = 0; b < a; b += 1)
> +res = p[b] ? p[b] : (char) b;
> +  return res;
> +}
> +
> +int main ()
> +{
> +  unsigned int a = 16U;
> +  unsigned char p[16];
> +  for (int i = 0; i != 16; i++)
> +p[i] = (unsigned char)128;
> +  unsigned int res = test (a, p);
> +  if (res != 128)
> +__builtin_abort ();
> +  return 0;
> +}
> diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
> index a8f96d59643..217bdfd7045 100644
> --- a/gcc/tree-vect-patterns.cc
> +++ b/gcc/tree-vect-patterns.cc
> @@ -929,8 +929,10 @@ vect_reassociating_reduction_p (vec_info *vinfo,
> with conditions:
> 1) @1, @2, c, d, a, b are all integral type.
> 2) There's single_use for both @1 and @2.
> -   3) a, c and d have same precision.
> +   3) a, c have same precision.
> 4) c and @1 have different precision.
> +   5) c, d are the same type or they can differ in sign when convert is
> +   truncation.
>
> record a and c and d and @3.  */
>
> @@ -952,7 +954,7 @@ extern bool gimple_cond_expr_convert_p (tree, tree*, tree 
> (*)(tree));
> TYPE_PRECISION (TYPE_E) != TYPE_PRECISION (TYPE_CD);
> TYPE_PRECISION (TYPE_AB) == TYPE_PRECISION (TYPE_CD);
> single_use of op_true and op_false.
> -   TYPE_AB could differ in sign.
> +   TYPE_AB could differ in sign when (TYPE_E) A is a truncation.
>
> Input:
>
> --
> 2.18.1
>
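
The reason the quoted match.pd comment treats truncation and extension
differently can be checked with fixed-width integers.  This is a scalar
illustration only, unrelated to the vectorizer code itself; the 'demo_*'
helpers are hypothetical names.

```c
#include <stdint.h>

/* Extension: the result depends on the signedness of the source, because
   a signed source is sign-extended and an unsigned one zero-extended.  */
static uint32_t
demo_extend_signed (int8_t v)
{
  return (uint32_t) (int32_t) v;
}

static uint32_t
demo_extend_unsigned (uint8_t v)
{
  return (uint32_t) v;
}

/* Truncation: only the low bits are kept, so the signedness of the source
   does not matter.  */
static uint8_t
demo_truncate_signed (int32_t v)
{
  return (uint8_t) v;
}

static uint8_t
demo_truncate_unsigned (uint32_t v)
{
  return (uint8_t) v;
}
```

The same 8-bit pattern 0x80 widens to 0xffffff80 or 0x00000080 depending on
signedness, whereas both 32-bit values narrow back to 0x80.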


Re: [wwwdocs PATCH] gcc-11.3: Mention -mharden-sls= and -mindirect-branch-cs-prefix

2022-02-17 Thread Richard Biener via Gcc-patches
On Wed, Feb 16, 2022 at 3:42 PM H.J. Lu via Gcc-patches
 wrote:

OK.

> ---
>  htdocs/gcc-11/changes.html | 7 +++
>  1 file changed, 7 insertions(+)
>
> diff --git a/htdocs/gcc-11/changes.html b/htdocs/gcc-11/changes.html
> index fbd1b8ba..8e6d4ec8 100644
> --- a/htdocs/gcc-11/changes.html
> +++ b/htdocs/gcc-11/changes.html
> @@ -1129,6 +1129,13 @@ are not listed here).
>  no longer changes how they are passed nor returned.  This ABI change
>  is now diagnosed with -Wpsabi.
>
> +  Mitigation against straight line speculation (SLS) for function
> +  return and indirect jump is supported via
> +  -mharden-sls=[none|all|return|indirect-jmp].
> +  
> +  Add CS prefix to call and jmp to indirect thunk with branch target
> +  in r8-r15 registers via -mindirect-branch-cs-prefix.
> +  
>  
>
>  
> --
> 2.35.1
>


[committed] openmp: Ensure proper diagnostics for -> in map/to/from clauses [PR104532]

2022-02-17 Thread Jakub Jelinek via Gcc-patches
Hi!

The following patch uses the functions normal CPP_DEREF parsing uses,
i.e. convert_lvalue_to_rvalue and build_indirect_ref, instead of
blindly calling build_simple_mem_ref, so that if the variable does not
have correct type, we properly diagnose it instead of ICEing on it.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2022-02-17  Jakub Jelinek  

PR c/104532
* c-parser.cc (c_parser_omp_variable_list): For CPP_DEREF, use
convert_lvalue_to_rvalue and build_indirect_ref instead of
build_simple_mem_ref.

* gcc.dg/gomp/pr104532.c: New test.

--- gcc/c/c-parser.cc.jj2022-02-11 00:19:22.121067484 +0100
+++ gcc/c/c-parser.cc   2022-02-16 14:12:03.562041597 +0100
@@ -13145,7 +13145,16 @@ c_parser_omp_variable_list (c_parser *pa
{
  location_t op_loc = c_parser_peek_token (parser)->location;
  if (c_parser_next_token_is (parser, CPP_DEREF))
-   t = build_simple_mem_ref (t);
+   {
+ c_expr t_expr;
+ t_expr.value = t;
+ t_expr.original_code = ERROR_MARK;
+ t_expr.original_type = NULL;
+ set_c_expr_source_range (&t_expr, op_loc, op_loc);
+ t_expr = convert_lvalue_to_rvalue (op_loc, t_expr,
+true, false);
+ t = build_indirect_ref (op_loc, t_expr.value, RO_ARROW);
+   }
  c_parser_consume_token (parser);
  if (!c_parser_next_token_is (parser, CPP_NAME))
{
--- gcc/testsuite/gcc.dg/gomp/pr104532.c.jj 2022-02-16 14:23:51.749180699 +0100
+++ gcc/testsuite/gcc.dg/gomp/pr104532.c 2022-02-16 14:23:31.896457132 +0100
@@ -0,0 +1,15 @@
+/* PR c/104532 */
+/* { dg-do compile } */
+
+void
+foo (int x)
+{
+  #pragma omp target enter data map (to: x->vectors)   /* { dg-error "invalid type argument of '->'" } */
+}  /* { dg-error "must contain at least one" "" { target *-*-* } .-1 } */
+
+void
+bar (int x)
+{
+  #pragma omp target enter data map (to: x->vectors[]) /* { dg-error "invalid type argument of '->'" } */
+}  /* { dg-error "must contain at least one" "" { target *-*-* } .-1 } */
+/* { dg-error "expected expression before" "" { target *-*-* } .-2 } */

Jakub



[PATCH] calls: When bypassing emit_push_insn for 0 sized arg, emit at least anti_adjust_stack for alignment pad if needed [PR104558]

2022-02-17 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase ICEs on x86_64 when asked to use the pre-GCC 8
ABI where zero sized arguments weren't ignored.
In GCC 7 the emit_push_insn calls in store_one_arg were unconditional,
it is true that they didn't actually push anything because it had zero
size, but because arg->locate.alignment_pad is 8 in this case,
emit_push_insn at the end performs
  if (alignment_pad && args_addr == 0)
anti_adjust_stack (alignment_pad);
and an assert later on is upset if we don't do it.
The following patch keeps the emit_push_insn conditional but calls
the anti_adjust_stack when needed by hand for the zero sized arguments.
For the new x86_64 ABI where zero sized arguments are ignored
arg->locate.alignment_pad is 0 in this case, so nothing changes
- we in that case really do ignore it.
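
The control flow described above can be sketched with a toy model in
plain C.  This is not the calls.cc code: toy_emit_push,
toy_anti_adjust_stack and toy_store_one_arg are hypothetical stand-ins,
sizes are plain longs rather than args_size values, and the push is
reduced to a stack adjustment.

```c
#include <assert.h>

/* Toy record of how far the stack pointer was moved; hypothetical.  */
static long toy_stack_adjust;

static void
toy_anti_adjust_stack (long bytes)
{
  toy_stack_adjust += bytes;
}

/* Toy push: when there is no preallocated argument block, make room
   for SIZE bytes and then for the alignment pad.  */
static void
toy_emit_push (long size, long alignment_pad, int have_argblock)
{
  if (!have_argblock)
    toy_anti_adjust_stack (size);
  if (alignment_pad && !have_argblock)
    toy_anti_adjust_stack (alignment_pad);
}

/* Toy store_one_arg: the push stays conditional on a non-zero size,
   but a zero-sized argument still gets its alignment pad.  Returns the
   total stack adjustment made for this argument.  */
static long
toy_store_one_arg (long size, long alignment_pad, int have_argblock)
{
  assert (size >= 0 && alignment_pad >= 0);
  toy_stack_adjust = 0;
  if (size != 0)
    toy_emit_push (size, alignment_pad, have_argblock);
  else if (alignment_pad != 0 && !have_argblock)
    toy_anti_adjust_stack (alignment_pad);
  return toy_stack_adjust;
}
```

In this model a zero-sized argument with an 8 byte pad and no argument
block adjusts the stack by 8, and one with a pad of 0 adjusts nothing,
which mirrors the old-ABI and new-ABI cases respectively.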

There is another emit_push_insn call earlier in store_one_arg, also made
conditional on non-zero size by Marek in GCC 8, but that one is for
arguments with non-BLKmode and the only way those can be zero size is
if they are TYPE_EMPTY_P aka when they are completely ignored.  But
I believe arg->locate.alignment_pad should be 0 in that case, so IMHO
there is no need to do anything in the second spot.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-02-17  Jakub Jelinek  

PR middle-end/104558
* calls.cc (store_one_arg): When not calling emit_push_insn
because size_rtx is const0_rtx, call at least anti_adjust_stack
on arg->locate.alignment_pad if !argblock and the alignment might
be non-zero.

* gcc.dg/pr104558.c: New test.

--- gcc/calls.cc.jj 2022-01-18 11:58:58.944991171 +0100
+++ gcc/calls.cc 2022-02-16 13:00:01.079192624 +0100
@@ -5139,6 +5139,10 @@ store_one_arg (struct arg_data *arg, rtx
ARGS_SIZE_RTX (arg->locate.offset),
reg_parm_stack_space,
ARGS_SIZE_RTX (arg->locate.alignment_pad), false);
+  else if ((arg->locate.alignment_pad.var
+   || maybe_ne (arg->locate.alignment_pad.constant, 0))
+  && !argblock)
+   anti_adjust_stack (ARGS_SIZE_RTX (arg->locate.alignment_pad));
 
   /* Unless this is a partially-in-register argument, the argument is now
 in the stack.
--- gcc/testsuite/gcc.dg/pr104558.c.jj  2022-02-16 13:24:46.986523821 +0100
+++ gcc/testsuite/gcc.dg/pr104558.c 2022-02-16 13:24:26.491808905 +0100
@@ -0,0 +1,15 @@
+/* PR middle-end/104558 */
+/* { dg-do compile } */
+/* { dg-options "-fabi-version=9" } */
+
+struct __attribute__ ((aligned)) A {};
+
+struct A a;
+
+void bar (int, int, int, int, int, int, int, struct A);
+
+void
+foo (void)
+{
+  bar (0, 1, 2, 3, 4, 5, 6, a);
+}

Jakub



[PATCH] valtrack: Avoid creating raw SUBREGs with VOIDmode argument [PR104557]

2022-02-17 Thread Jakub Jelinek via Gcc-patches
Hi!

After the recent r12-7240 simplify_immed_subreg changes, we bail on more
simplify_subreg calls than before, e.g. apparently for decimal modes
in the NaN representations we almost never preserve anything except the
canonical {q,s}NaNs.
simplify_gen_subreg will punt in such cases because a SUBREG with VOIDmode
is not valid, but debug_lowpart_subreg wants to attempt even harder, even
if e.g. target indicates certain mode combinations aren't valid for the
backend, dwarf2out can still handle them.  But a SUBREG from a VOIDmode
operand is just too much, the inner mode is lost there.  We'd need some
new rtx that would be able to represent those cases.
For now, just punt in those cases.
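
The resulting fallback order can be sketched as a toy model in plain C.
None of this is valtrack.cc code: toy_rtx, toy_mode and the helpers are
hypothetical, and the stand-in simplifier only succeeds when the modes
already match.

```c
#include <assert.h>
#include <stddef.h>

/* Toy machine modes; hypothetical stand-ins for machine_mode.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };

struct toy_rtx
{
  enum toy_mode mode;
  int is_raw_subreg;
};

/* Pretend simplifier: returns NULL (punts) unless the expression
   already has the requested mode.  */
static struct toy_rtx *
toy_simplify_gen_subreg (enum toy_mode outer, struct toy_rtx *expr)
{
  return expr->mode == outer ? expr : NULL;
}

/* Try the simplifier first; fall back to a raw subreg only when the
   operand still carries a mode, otherwise give up with NULL.  STORAGE
   is caller-provided space for the raw subreg.  */
static struct toy_rtx *
toy_debug_lowpart_subreg (enum toy_mode outer, struct toy_rtx *expr,
			  struct toy_rtx *storage)
{
  assert (expr != NULL && storage != NULL);
  struct toy_rtx *ret = toy_simplify_gen_subreg (outer, expr);
  if (ret)
    return ret;
  if (expr->mode != TOY_VOIDmode)
    {
      storage->mode = outer;
      storage->is_raw_subreg = 1;
      return storage;
    }
  return NULL;
}
```

A caller of such a helper has to cope with a NULL result, which is the
price for not building a subreg whose inner mode is unknown.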

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-02-17  Jakub Jelinek  

PR debug/104557
* valtrack.cc (debug_lowpart_subreg): Don't call gen_rtx_raw_SUBREG
if expr has VOIDmode.

* gcc.dg/dfp/pr104557.c: New test.

--- gcc/valtrack.cc.jj  2022-01-18 11:59:00.252972485 +0100
+++ gcc/valtrack.cc 2022-02-16 11:29:28.234826860 +0100
@@ -558,7 +558,9 @@ debug_lowpart_subreg (machine_mode outer
   rtx ret = simplify_gen_subreg (outer_mode, expr, inner_mode, offset);
   if (ret)
 return ret;
-  return gen_rtx_raw_SUBREG (outer_mode, expr, offset);
+  if (GET_MODE (expr) != VOIDmode)
+return gen_rtx_raw_SUBREG (outer_mode, expr, offset);
+  return NULL_RTX;
 }
 
 /* If UREGNO is referenced by any entry in DEBUG, emit a debug insn
--- gcc/testsuite/gcc.dg/dfp/pr104557.c.jj 2022-02-16 11:36:03.733329235 +0100
+++ gcc/testsuite/gcc.dg/dfp/pr104557.c 2022-02-16 11:35:27.599831513 +0100
@@ -0,0 +1,22 @@
+/* PR debug/104557 */
+/* { dg-do compile } */
+/* { dg-options "-O -g -Wno-psabi" } */
+
+typedef int __attribute__((__vector_size__ (32))) U;
+typedef double __attribute__((__vector_size__ (32))) F;
+typedef _Decimal64 __attribute__((__vector_size__ (32))) D;
+
+F
+bar (void)
+{
+  F f = __builtin_convertvector ((D) (-10.d < (D) ((D) (U) { 0, 0, 0, 0, 0, 0, 0, -0xe0 }
+  >= (D) { 8000 })), F);
+  return f;
+}
+
+F
+foo ()
+{
+  F x = bar ();
+  return x;
+}

Jakub



Re: [PATCH] tree-optimization/96881 - CD-DCE and CLOBBERs

2022-02-17 Thread Richard Biener via Gcc-patches
On Thu, 17 Feb 2022, Richard Biener wrote:

> On Tue, 15 Feb 2022, Jan Hubicka wrote:
> 
> > > @@ -1272,7 +1275,7 @@ maybe_optimize_arith_overflow (gimple_stmt_iterator *gsi,
> > > contributes nothing to the program, and can be deleted.  */
> > >  
> > >  static bool
> > > -eliminate_unnecessary_stmts (void)
> > > +eliminate_unnecessary_stmts (bool aggressive)
> > >  {
> > >bool something_changed = false;
> > >basic_block bb;
> > > @@ -1366,7 +1369,9 @@ eliminate_unnecessary_stmts (void)
> > > break;
> > >   }
> > >   }
> > > -   if (!dead)
> > > +   if (!dead
> > > +   && (!aggressive
> > > +   || bitmap_bit_p (visited_control_parents, bb->index)))
> > 
> > It seems to me that it may be worth to consider case where
> > visited_control_parents is 0 while all basic blocks in the CD relation
> > are live for different reasons.  I suppose this can happen in more
> > complex CFGs when the other arms of conditionals are live...
> 
> It's a bit difficult to do in this place though since we might already
> have altered those blocks (and we need to check not for the block being
> live but for its control stmt).  I suppose we could use the
> last_stmt_necessary bitmap.  I'll do some statistics to see whether
> this helps.

So it does help.  The visited_control_parents check catches
44010 of 44033 candidates and the remaining 23 are caught by doing

  EXECUTE_IF_SET_IN_BITMAP (cd->get_edges_dependent_on (bb->index),
                            0, edge_number, bi)
    {
      basic_block cd_bb = cd->get_edge_src (edge_number);
      if (cd_bb != bb
          && !bitmap_bit_p (last_stmt_necessary, cd_bb->index))
        {
          dead = true;
          break;
        }
    }

in addition to that simple check.  That means for all files in gcc/
the patch would be a no-op but it still fixes the problematical case.
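
For readers without the GCC sources at hand, the extra check can be
modelled with a small self-contained C sketch.  The flat toy_cfg layout
and the name toy_dead_by_control_parent_p are hypothetical; the real
code walks control-dependence edges through bitmaps as in the snippet
above.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_BLOCKS 8

/* Hypothetical flat model: control_parents[b] lists the source blocks
   of the edges that block B is control dependent on.  */
struct toy_cfg
{
  int n_parents[TOY_MAX_BLOCKS];
  int control_parents[TOY_MAX_BLOCKS][TOY_MAX_BLOCKS];
  bool last_stmt_necessary[TOY_MAX_BLOCKS];
};

/* Return true if a statement in BB should be treated as dead because
   some other block that BB is control dependent on has a control
   statement that was not marked necessary.  */
static bool
toy_dead_by_control_parent_p (const struct toy_cfg *cfg, int bb)
{
  assert (bb >= 0 && bb < TOY_MAX_BLOCKS);
  for (int i = 0; i < cfg->n_parents[bb]; i++)
    {
      int cd_bb = cfg->control_parents[bb][i];
      if (cd_bb != bb && !cfg->last_stmt_necessary[cd_bb])
	return true;
    }
  return false;
}
```

As in the snippet above, a block that is only control dependent on
itself never triggers the check.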

I'll put an adjusted patch to testing.

Richard.