date:20220602

[PATCH] libgccjit: Support getting the size of a float

2022-06-02 Thread Antoni Boucher via Gcc-patches

Hi.
The attached patch fixes bug 105829:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105829

Thanks for the review.
From 5d75fd342cdcea94a785312252be0c0046afdf43 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Wed, 4 May 2022 21:42:44 -0400
Subject: [PATCH] libgccjit: Support getting the size of a float

2022-06-02  Antoni Boucher  

gcc/jit/
	PR target/105829
	* libgccjit.cc: Add support for floating-point types in
	gcc_jit_type_get_size.

gcc/testsuite/
	PR target/105829
	* jit.dg/test-types.c: Add tests for gcc_jit_type_get_size.
---
 gcc/jit/libgccjit.cc  | 4 ++--
 gcc/testsuite/jit.dg/test-types.c | 3 +++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/gcc/jit/libgccjit.cc b/gcc/jit/libgccjit.cc
index cc6486c9cad..12e7679988b 100644
--- a/gcc/jit/libgccjit.cc
+++ b/gcc/jit/libgccjit.cc
@@ -545,8 +545,8 @@ gcc_jit_type_get_size (gcc_jit_type *type)
 {
   RETURN_VAL_IF_FAIL (type, -1, NULL, NULL, "NULL type");
   RETURN_VAL_IF_FAIL
-(type->is_int (), -1, NULL, NULL,
- "only getting the size of an integer type is supported for now");
+(type->is_int () || type->is_float (), -1, NULL, NULL,
+ "only getting the size of an int or float type is supported for now");
   return type->get_size ();
 }
 
diff --git a/gcc/testsuite/jit.dg/test-types.c b/gcc/testsuite/jit.dg/test-types.c
index 6836597d14e..53bdeafed61 100644
--- a/gcc/testsuite/jit.dg/test-types.c
+++ b/gcc/testsuite/jit.dg/test-types.c
@@ -489,4 +489,7 @@ verify_code (gcc_jit_context *ctxt, gcc_jit_result *result)
 CHECK (gcc_jit_compatible_types (
   gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_LONG),
   gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_INT64_T)));
+
+  CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_FLOAT)), 4);
+  CHECK_VALUE (gcc_jit_type_get_size (gcc_jit_context_get_type (ctxt, GCC_JIT_TYPE_DOUBLE)), 8);
 }
-- 
2.26.2.7.g19db9cfb68.dirty

Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations

2022-06-02 Thread Yonghong Song via Gcc-patches





On 5/27/22 12:56 PM, David Faust wrote:
On 5/26/22 00:29, Yonghong Song wrote:
On 5/24/22 10:04 AM, David Faust wrote:
On 5/24/22 09:03, Yonghong Song wrote:
On 5/24/22 8:53 AM, David Faust wrote:
On 5/24/22 04:07, Jose E. Marchesi wrote:
On 5/11/22 11:44 AM, David Faust wrote:
On 5/10/22 22:05, Yonghong Song wrote:
On 5/10/22 8:43 PM, Yonghong Song wrote:
On 5/6/22 2:18 PM, David Faust wrote:
On 5/5/22 16:00, Yonghong Song wrote:
On 5/4/22 10:03 AM, David Faust wrote:
On 5/3/22 15:32, Joseph Myers wrote:
On Mon, 2 May 2022, David Faust via Gcc-patches wrote:


Consider the following example:

     #define __typetag1 __attribute__((btf_type_tag("tag1")))
     #define __typetag2 __attribute__((btf_type_tag("tag2")))
     #define __typetag3 __attribute__((btf_type_tag("tag3")))

     int __typetag1 * __typetag2 __typetag3 * g;

The expected behavior is that 'g' is "a pointer with tags 'tag2' and 'tag3',
to a pointer with tag 'tag1' to an int". i.e.:


That's not a correct expectation for either GNU __attribute__ or C2x [[]]
attribute syntax.  In either syntax, __typetag2 __typetag3 should apply to
the type to which g points, not to g or its type, just as if you had a
type qualifier there.  You'd need to put the attributes (or qualifier)
after the *, not before, to make them apply to the pointer type.  See
"Attribute Syntax" in the GCC manual for how the syntax is defined for GNU
attributes and deduce in turn, for each subsequence of the tokens matching
the syntax for some kind of declarator, what the type for "T D1" would be
as defined there and in the C standard, as deduced from the type for "T D"
for a sub-declarator D.

But GCC's attribute parsing produces a variable 'g' which is "a pointer with
tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an int", i.e.


In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
syntax it applies to int.  Again, if you wanted it to apply to the pointer
type it would need to go after the * not before.

If you are concerned with the fine details of what construct an attribute
appertains to, I recommend using C2x syntax not GNU syntax.
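
The qualifier analogy above can be checked mechanically. The following is a minimal sketch in standard C++, not part of the thread: it uses `volatile` as a stand-in for the tag attributes (an assumption made purely for illustration, since btf_type_tag itself needs compiler support), and the names G, H and qualifier_binds_to_pointee are hypothetical.

```cpp
#include <type_traits>

// Qualifier analogy only: `volatile` stands in for the tag attributes.
// A qualifier written before the last `*` belongs to the pointed-to type.
using G = int * volatile *;

// G itself (the outer pointer) is not volatile...
static_assert(!std::is_volatile<G>::value, "outer pointer is unqualified");
// ...but the type G points to, `int * volatile`, is.
static_assert(std::is_volatile<std::remove_pointer<G>::type>::value,
              "qualifier applies to the pointed-to type");

// Moving the qualifier after the last `*` makes it apply to the pointer itself.
using H = int * * volatile;
static_assert(std::is_volatile<H>::value, "outer pointer is qualified");

// Same facts as a callable predicate, for use from ordinary code.
constexpr bool qualifier_binds_to_pointee() {
  return !std::is_volatile<G>::value
         && std::is_volatile<std::remove_pointer<G>::type>::value;
}
```

If the attributes follow the same placement rule as qualifiers, as described above, then attributes written before the final `*` describe the type that g points to.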



Joseph, thank you! This is very helpful. My understanding of the syntax
was not correct.

(Actually, I made a bad mistake in paraphrasing this example from the
discussion of it in the series cover letter. But, the reason why it is
incorrect is the same.)


Yonghong, is the specific ordering an expectation in BPF programs or
other users of the tags?


This is probably a wording issue. We have been saying tags only
apply to pointers. We should probably say they only apply to the pointee.

$ cat t.c
int const *ptr;

the llvm ir debuginfo:

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
!6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

We could replace 'const' with a tag like below:

int __attribute__((btf_type_tag("tag"))) *ptr;

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
annotations: !7)
!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!7 = !{!8}
!8 = !{!"btf_type_tag", !"tag"}

In the above IR, we attach the annotations to the pointer_type because
we didn't invent a new DI type to encode btf_type_tag. But it is
totally okay to have IR that looks like

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
!11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)


OK, thanks.

There is still the question of why the DWARF generated for this case
that I have been concerned about:

       int __typetag1 * __typetag2 __typetag3 * g;

differs between GCC (with this series) and clang. After studying it,
GCC is doing with the attributes exactly as is described in the
Attribute Syntax portion of the GCC manual where the GNU syntax is
described. I do not think there is any problem here.

So the difference in DWARF suggests to me that clang is not handling
the GNU attribute syntax in this particular case correctly, since it
seems to be associating __typetag2 and __typetag3 to g's type rather
than the type to which it points.

I am not sure whether for the use purposes of the tags this difference
is very important, but it is worth noting.


As Joseph suggested, it may be better to encourage users of these tags
to use the C2x attribute syntax if they are concerned with precisely
which construct the tag applies to.

This would also be a way around any issues in handling the attributes
due to the GNU syntax.

I tried a few test cases using C2x syntax BTF type tags with a
clang-15 build, but ran into some issues (in particular, some of the
tag attributes being ignored altogether). I couldn't find confirmation
whether C2x attribute syntax is fully supported in clang yet, so maybe
this isn't expected to work. Do you know whether the C2x syntax is
fully supported in

Re: [PATCH] libgccjit: Fix infinite recursion in gt_ggc_mx_lang_tree_node

2022-06-02 Thread Andrew Pinski via Gcc-patches

On Thu, Jun 2, 2022 at 6:21 PM Antoni Boucher via Gcc-patches
 wrote:
>
> Hi.
> The attached patch fixes bug 105827:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105827
>
> I'm not sure how to test this, so please share ideas.

Looks like the attachment was removed ...

>
> Thanks for the review.

Re: [PATCH] libgccjit: Fix infinite recursion in gt_ggc_mx_lang_tree_node

2022-06-02 Thread Antoni Boucher via Gcc-patches

Sorry, forgot to attach the patch.

Here it is.

On Thu, 2022-06-02 at 21:20 -0400, Antoni Boucher via Jit wrote:
> Hi.
> The attached patch fixes bug 105827:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105827
> 
> I'm not sure how to test this, so please share ideas.
> 
> Thanks for the review.

From 609153a39921b8e9aa1934da131575bb64881d67 Mon Sep 17 00:00:00 2001
From: Antoni Boucher 
Date: Thu, 2 Jun 2022 21:14:06 -0400
Subject: [PATCH] libgccjit: Fix infinite recursion in gt_ggc_mx_lang_tree_node

2022-06-02  Antoni Boucher  

gcc/jit/
	PR target/105827
	* dummy-frontend.cc: Fix lang_tree_node.
	* jit-common.h: New function (jit_tree_chain_next) used by
	lang_tree_node.
---
 gcc/jit/dummy-frontend.cc | 13 +++--
 gcc/jit/jit-common.h  | 15 +++
 2 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/gcc/jit/dummy-frontend.cc b/gcc/jit/dummy-frontend.cc
index 84ff359bfe3..8bb5d03d630 100644
--- a/gcc/jit/dummy-frontend.cc
+++ b/gcc/jit/dummy-frontend.cc
@@ -506,13 +506,14 @@ struct GTY(()) lang_identifier
 
 /* The resulting tree type.  */
 
+/* See lang_tree_node in gcc/c/c-decl.cc.  */
 union GTY((desc ("TREE_CODE (&%h.generic) == IDENTIFIER_NODE"),
-	   chain_next ("CODE_CONTAINS_STRUCT (TREE_CODE (&%h.generic), TS_COMMON) ? ((union lang_tree_node *) TREE_CHAIN (&%h.generic)) : NULL")))
-lang_tree_node
-{
-  union tree_node GTY((tag ("0"),
-		   desc ("tree_node_structure (&%h)"))) generic;
-  struct lang_identifier GTY((tag ("1"))) identifier;
+   chain_next ("(union lang_tree_node *) jit_tree_chain_next (&%h.generic)"))) lang_tree_node
+ {
+  union tree_node GTY ((tag ("0"),
+			desc ("tree_node_structure (&%h)")))
+generic;
+  struct lang_identifier GTY ((tag ("1"))) identifier;
 };
 
 /* We don't use language_function.  */
diff --git a/gcc/jit/jit-common.h b/gcc/jit/jit-common.h
index 3ff7447fbf3..50580358a1f 100644
--- a/gcc/jit/jit-common.h
+++ b/gcc/jit/jit-common.h
@@ -93,6 +93,21 @@ const int NUM_GCC_JIT_TYPES = GCC_JIT_TYPE_INT128_T + 1;
 
End of comment for inclusion in the docs.  */
 
+/* See c_tree_chain_next in gcc/c-family/c-common.h.  */
+static inline tree
+jit_tree_chain_next (tree t)
+{
+  /* TREE_CHAIN of a type is TYPE_STUB_DECL, which is different
+ kind of object, never a long chain of nodes.  Prefer
+ TYPE_NEXT_VARIANT for types.  */
+  if (CODE_CONTAINS_STRUCT (TREE_CODE (t), TS_TYPE_COMMON))
+return TYPE_NEXT_VARIANT (t);
+  /* Otherwise, if there is TREE_CHAIN, return it.  */
+  if (CODE_CONTAINS_STRUCT (TREE_CODE (t), TS_COMMON))
+return TREE_CHAIN (t);
+  return NULL;
+}
+
 namespace gcc {
 
 namespace jit {
-- 
2.26.2.7.g19db9cfb68.dirty

[PATCH] libgccjit: Fix infinite recursion in gt_ggc_mx_lang_tree_node

2022-06-02 Thread Antoni Boucher via Gcc-patches

Hi.
The attached patch fixes bug 105827:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105827

I'm not sure how to test this, so please share ideas.

Thanks for the review.

Re: Ping: [PATCH] Optimize vec_splats of constant V2DI/V2DF vec_extract, PR target/99293

2022-06-02 Thread Segher Boessenkool

On Thu, Jun 02, 2022 at 03:06:52PM -0400, Michael Meissner wrote:
> Ping patch posted on May 13th:

Are you not going to apply any of Will's suggestions?  They looked solid
to me.


Segher

Re: [PATCH] c++: fix broken copy elision with nested TARGET_EXPRs [PR105550]

2022-06-02 Thread Jason Merrill via Gcc-patches


On 5/26/22 11:01, Marek Polacek wrote:

In this problem, we are failing to properly perform copy elision with
a conditional operator, so this:

   constexpr A a = true ? A{} : A{};

fails with:

   error: 'A{((const A*)(&))}' is not a constant expression

The whole initializer is

   TARGET_EXPR }> : TARGET_EXPR }>>

where the outermost TARGET_EXPR is elided, but not the nested ones.
Then we end up replacing the PLACEHOLDER_EXPRs with the temporaries the
TARGET_EXPRs represent, which is precisely what should *not* happen with
copy elision.

I've tried the approach of tweaking ctx->object, but I ran into a gazillion
problems with that.  I thought that I would let cxx_eval_constant_expression
/TARGET_EXPR create a new object only when ctx->object was null, then
adjust setting of ctx->object in places like cxx_bind_parameters_in_call
and cxx_eval_component_reference but that failed completely.  Sometimes
ctx->object has to be reset, sometimes it cannot be reset, 'this' needed
special handling, etc.  I gave up.
But now that we have potential_prvalue_result_of, a simple but less
elegant solution is the following.  I thought about setting a flag on
a TARGET_EXPR to avoid adding ctx.full_expr, but a new flag would be
overkill and using TARGET_EXPR_DIRECT_INIT_P broke things.


This doesn't seem like a general solution; the same issue would also 
apply to ?: of TARGET_EXPR when it's a subexpression rather than the 
full expression, like f(true ? A{} : B{}).


Another simple approach, but more general, would be to routinely strip 
TARGET_EXPR from the operands of ?: like we do in various other places 
in constexpr.c.
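
For reference, the language rule at stake can be sketched at run time, without relying on constexpr evaluation. This is an illustrative sketch, not the testcase from the patch: it assumes C++17 or later (where a conditional expression whose arms are prvalues of the same type initializes the result object directly), and the names A::self, A::copies and initialized_in_place are hypothetical.

```cpp
struct A {
  const A *self;       // records the address the object was constructed at
  static int copies;   // counts copy-constructor calls
  A() : self(this) {}
  A(const A &other) : self(other.self) { ++copies; }
};
int A::copies = 0;

// Returns true when `a` was initialized in place by the selected arm of the
// conditional, i.e. no temporary was materialized and then copied.
bool initialized_in_place(bool pick_first) {
  A::copies = 0;
  A a = pick_first ? A{} : A{};
  return a.self == &a && A::copies == 0;
}
```

The constant evaluator has to preserve the same property: the nested TARGET_EXPRs must not get their own temporaries, otherwise `self` would point at an object other than `a`.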



Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/105550

gcc/cp/ChangeLog:

* constexpr.cc (struct constexpr_ctx): Add a tree member.
(init_subob_ctx): Set it.
(cxx_eval_constant_expression): Don't initialize a temporary object
if potential_prvalue_result_of says true.
(cxx_eval_outermost_constant_expr): Adjust the ctx initializer.  Set
ctx.full_expr.
* cp-tree.h (potential_prvalue_result_of): Declare.
* typeck2.cc (potential_prvalue_result_of): No longer static.  Return
if full_expr is null.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-elision1.C: New test.
---
  gcc/cp/constexpr.cc   | 33 +---
  gcc/cp/cp-tree.h  |  1 +
  gcc/cp/typeck2.cc |  4 +-
  .../g++.dg/cpp1y/constexpr-elision1.C | 53 +++
  4 files changed, 82 insertions(+), 9 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-elision1.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 45208478c3f..73880fb089e 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -1129,6 +1129,9 @@ struct constexpr_ctx {
tree ctor;
/* The object we're building the CONSTRUCTOR for.  */
tree object;
+  /* The whole initializer expression.  Currently only used when the whole
+ expression is a TARGET_EXPR.  */
+  tree full_expr;
/* If inside SWITCH_EXPR.  */
constexpr_switch_state *css_state;
/* The aggregate initialization context inside which this one is nested.  This
@@ -4700,8 +4703,11 @@ init_subob_ctx (const constexpr_ctx *ctx, constexpr_ctx &new_ctx,
new_ctx.ctor = elt;
  
if (TREE_CODE (value) == TARGET_EXPR)

-/* Avoid creating another CONSTRUCTOR when we expand the TARGET_EXPR.  */
-value = TARGET_EXPR_INITIAL (value);
+{
+  new_ctx.full_expr = value;
+  /* Avoid creating another CONSTRUCTOR when we expand the TARGET_EXPR.  */
+  value = TARGET_EXPR_INITIAL (value);
+}
  }
  
  /* We're about to process an initializer for a class or array TYPE.  Make

@@ -6822,6 +6828,14 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
break;
  }
gcc_checking_assert (!TARGET_EXPR_DIRECT_INIT_P (t));
+   /* This TARGET_EXPR may be nested inside another TARGET_EXPR, but
+  still serve as the initializer for the same object as the outer
+  TARGET_EXPR, as in
+A a = true ? A{} : A{};
+  so we can't materialize a temporary.  IOW, don't set ctx->object
+  to the TARGET_EXPR's slot.  */
+   const bool prvalue = potential_prvalue_result_of (t, ctx->full_expr);
+   gcc_checking_assert (!prvalue || lval == vc_prvalue);
/* Avoid evaluating a TARGET_EXPR more than once.  */
tree slot = TARGET_EXPR_SLOT (t);
if (tree *p = ctx->global->values.get (slot))
@@ -6831,7 +6845,7 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
r = *p;
break;
  }
-   if ((AGGREGATE_TYPE_P (type) || VECTOR_TYPE_P (type)))
+   if (!prvalue && (AGGREGATE_TYPE_P (type) || VECTOR_TYPE_P (type)))
  {
/* We're being expanded without an explicit target, so start

Re: [PATCH v2] x86: {,v}psadbw have commutative source operands

2022-06-02 Thread Uros Bizjak via Gcc-patches

On Thu, Jun 2, 2022 at 5:00 PM Jan Beulich  wrote:
>
> Like noticed for gas as well (binutils-gdb commit c8cad9d389b7), the
> "absolute difference" aspect of the insns makes their source operands
> commutative.
>
> gcc/
>
> * config/i386/mmx.md (mmx_psadbw): Convert to expander.
> (*mmx_psadbw): New. Mark as commutative.
> * config/i386/sse.md (_psadbw): Convert to expander.
> (*_psadbw): New. Mark as commutative.

OK with a nit below.

Thanks,
Uros.

> ---
> v2: Introduce expanders.
>
> --- a/gcc/config/i386/mmx.md
> +++ b/gcc/config/i386/mmx.md
> @@ -4405,13 +4405,21 @@
> (set_attr "type" "sseiadd")
> (set_attr "mode" "TI")])
>
> -(define_insn "mmx_psadbw"
> +(define_expand "mmx_psadbw"
> +  [(set (match_operand:V1DI 0 "register_operand")
> +   (unspec:V1DI [(match_operand:V8QI 1 "register_mmxmem_operand")
> + (match_operand:V8QI 2 "register_mmxmem_operand")]
> +UNSPEC_PSADBW))]
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)"
> +  "ix86_fixup_binary_operands_no_copy (PLUS, V8QImode, operands);")
> +
> +(define_insn "*mmx_psadbw"
>[(set (match_operand:V1DI 0 "register_operand" "=y,x,Yw")
> -(unspec:V1DI [(match_operand:V8QI 1 "register_operand" "0,0,Yw")
> +   (unspec:V1DI [(match_operand:V8QI 1 "register_mmxmem_operand" "%0,0,Yw")
>   (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yw")]
>  UNSPEC_PSADBW))]
> -  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
> -   && (TARGET_SSE || TARGET_3DNOW_A)"
> +  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)
> +   && ix86_binary_operator_ok (PLUS, V8QImode, operands)"
>"@
> psadbw\t{%2, %0|%0, %2}
> psadbw\t{%2, %0|%0, %2}
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -19981,13 +19981,22 @@
>
>  ;; The correct representation for this is absolutely enormous, and
>  ;; surely not generally useful.
> -(define_insn "_psadbw"
> +(define_expand "_psadbw"
> +  [(set (match_operand:VI8_AVX2_AVX512BW 0 "register_operand")
> +   (unspec:VI8_AVX2_AVX512BW
> + [(match_operand: 1 "vector_operand")
> +  (match_operand: 2 "vector_operand")]
> + UNSPEC_PSADBW))]
> +  "TARGET_SSE2"
> +  "ix86_fixup_binary_operands_no_copy (PLUS, mode, operands);")
> +
> +(define_insn "*_psadbw"
>[(set (match_operand:VI8_AVX2_AVX512BW 0 "register_operand" "=x,YW")
> (unspec:VI8_AVX2_AVX512BW
> - [(match_operand: 1 "register_operand" "0,YW")
> + [(match_operand: 1 "vector_operand" "%0,YW")
>(match_operand: 2 "vector_operand" "xBm,YWm")]
>   UNSPEC_PSADBW))]
> -  "TARGET_SSE2"
> +  "TARGET_SSE2 && ix86_binary_operator_ok (PLUS, mode, operands)"

Please put && ix86_binary_operator_ok on a separate line.

>"@
> psadbw\t{%2, %0|%0, %2}
> vpsadbw\t{%2, %1, %0|%0, %1, %2}"
>
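
The commutativity claim in the quoted commit message follows from |a - b| = |b - a| holding for every byte pair. A minimal scalar sketch, assuming the usual description of psadbw as a sum of absolute differences over each group of eight unsigned bytes; Bytes8 and sad8 are hypothetical names, not GCC or intrinsic APIs.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

using Bytes8 = std::array<std::uint8_t, 8>;

// Scalar model of one 8-byte group: sum of absolute byte differences.
unsigned sad8(const Bytes8 &a, const Bytes8 &b) {
  unsigned sum = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    // Widen to int first so the subtraction cannot wrap around.
    int diff = static_cast<int>(a[i]) - static_cast<int>(b[i]);
    sum += static_cast<unsigned>(std::abs(diff));
  }
  return sum;
}
```

Swapping the two arguments changes the sign of every `diff` but not its absolute value, which is why the pattern can mark operand 1 with `%`.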

Re: [PATCH] c++: find_template_parameters and PARM_DECLs [PR105797]

2022-06-02 Thread Jason Merrill via Gcc-patches


On 6/1/22 12:44, Patrick Palka wrote:

As explained in r11-4959-gde6f64f9556ae3, the atom cache assumes two
equivalent expressions (according to cp_tree_equal) must use the same
template parameters (according to find_template_parameters).  This
assumption turned out to not hold for TARGET_EXPR, which was addressed
by that commit.

But this assumption apparently doesn't hold for PARM_DECL either:
find_template_parameters walks its DECL_CONTEXT but cp_tree_equal by
default doesn't consider DECL_CONTEXT unless comparing_specializations
is set.  Thus in the first testcase below, the atomic constraints of #1
and #2 are equivalent according to cp_tree_equal, but according to
find_template_parameters the former uses T and the latter uses both T
and U.

I suppose we can fix this assumption violation by setting
comparing_specializations in the atom_hasher, which would make
cp_tree_equal return false for the two atoms, but that seems overly
pessimistic here.  Ideally the atoms should be considered equivalent
and we should fix find_template_parameters to return just T for #2's
atom.

To that end this patch makes for_each_template_parm_r stop walking the
DECL_CONTEXT of a PARM_DECL.  This should be safe to do because
tsubst_copy / tsubst_decl only cares about the TREE_TYPE of a PARM_DECL
and doesn't bother substituting the DECL_CONTEXT, thus the only relevant
template parameters are those used in its type.  any_template_parm_r is
currently responsible for walking its TREE_TYPE, but I suppose it now makes
sense to make for_each_template_parm_r do so instead.

In passing this patch also makes for_each_template_parm_r stop walking
the DECL_CONTEXT of a VAR_/FUNCTION_DECL since it should be unnecessary
after walking DECL_TI_ARGS.

I experimented with not walking DECL_CONTEXT for CONST_DECL, but the
second testcase below demonstrates it's necessary to walk it.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?


OK.


PR c++/105797

gcc/cp/ChangeLog:

* pt.cc (for_each_template_parm_r) :
Don't walk DECL_CONTEXT.
: Likewise.  Walk TREE_TYPE.
: Simplify accordingly.
(any_template_parm_r) : Don't walk TREE_TYPE.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-decltype4.C: New test.
---
  gcc/cp/pt.cc| 10 +-
  gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C | 16 
  gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C   | 12 
  3 files changed, 33 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 4f0ace2644b..e4a473002a0 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -10561,11 +10561,14 @@ for_each_template_parm_r (tree *tp, int *walk_subtrees, void *d)
  case VAR_DECL:
if (DECL_LANG_SPECIFIC (t) && DECL_TEMPLATE_INFO (t))
WALK_SUBTREE (DECL_TI_ARGS (t));
-  /* Fall through.  */
+  break;
  
  case PARM_DECL:

+  WALK_SUBTREE (TREE_TYPE (t));
+  break;
+
  case CONST_DECL:
-  if (TREE_CODE (t) == CONST_DECL && DECL_TEMPLATE_PARM_P (t))
+  if (DECL_TEMPLATE_PARM_P (t))
WALK_SUBTREE (DECL_INITIAL (t));
if (DECL_CONTEXT (t)
  && pfd->include_nondeduced_p)
@@ -10824,9 +10827,6 @@ any_template_parm_r (tree t, void *data)
break;
  
  case TEMPLATE_PARM_INDEX:

-case PARM_DECL:
-  /* A parameter or constraint variable may also depend on a template
-parameter without explicitly naming it.  */
WALK_SUBTREE (TREE_TYPE (t));
break;
  
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C

new file mode 100644
index 000..6683d224cf8
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-decltype4.C
@@ -0,0 +1,16 @@
+// PR c++/105797
+// { dg-do compile { target c++20 } }
+
+template
+concept C = requires { T(); };
+
+template
+void f(T v) requires C; // #1
+
+template
+void f(T v) requires C; // #2
+
+int main() {
+  f(0);
+  f(0);
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C b/gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C
new file mode 100644
index 000..3fa4fb82818
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/concepts-memfun3.C
@@ -0,0 +1,12 @@
+// { dg-do compile { target c++20 } }
+
+template
+struct A {
+  enum E { e = I };
+  static void f() requires (e != 0);
+};
+
+int main() {
+  A::f();
+  A::f(); // { dg-error "no match" }
+}

Re: [PATCH] c++: ICE with template NEW_EXPR [PR105803]

2022-06-02 Thread Jason Merrill via Gcc-patches


On 6/2/22 16:33, Marek Polacek wrote:

On Thu, Jun 02, 2022 at 04:26:27PM -0400, Jason Merrill wrote:

On 6/2/22 16:10, Marek Polacek wrote:

index 1346a1d4c10..2bbd8785627 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -9039,10 +9039,18 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
  "before C++17");
 return false;
-case DYNAMIC_CAST_EXPR:
-case PSEUDO_DTOR_EXPR:
   case NEW_EXPR:
   case VEC_NEW_EXPR:
+  if (cxx_dialect >= cxx20)
+   /* In C++20, new-expressions are potentially constant.  */
+   return true;
+  else if (flags & tf_error)
+   error_at (loc, "new-expression is not a constant expression "
+ "before C++20");
+  return false;
+
+case DYNAMIC_CAST_EXPR:
+case PSEUDO_DTOR_EXPR:
   case DELETE_EXPR:
   case VEC_DELETE_EXPR:


Delete, too.


Duh.  Fixed:


OK, thanks.


 From 2423f6548405185e256036df3d0ef3c13fd996c5 Mon Sep 17 00:00:00 2001
From: Marek Polacek 
Date: Thu, 2 Jun 2022 15:56:18 -0400
Subject: [PATCH] c++: new-expression is potentially constant in C++20

... so adjust p_c_e accordingly.

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1): Treat
{,VEC_}NEW_EXPR and {,VEC_}DELETE_EXPR as potentially constant in C++20.
---
  gcc/cp/constexpr.cc | 12 ++--
  1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 1346a1d4c10..684238883dc 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -9039,12 +9039,20 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
  "before C++17");
return false;
  
-case DYNAMIC_CAST_EXPR:

-case PSEUDO_DTOR_EXPR:
  case NEW_EXPR:
  case VEC_NEW_EXPR:
  case DELETE_EXPR:
  case VEC_DELETE_EXPR:
+  if (cxx_dialect >= cxx20)
+   /* In C++20, new-expressions are potentially constant.  */
+   return true;
+  else if (flags & tf_error)
+   error_at (loc, "new-expression is not a constant expression "
+ "before C++20");
+  return false;
+
+case DYNAMIC_CAST_EXPR:
+case PSEUDO_DTOR_EXPR:
  case THROW_EXPR:
  case OMP_PARALLEL:
  case OMP_TASK:

base-commit: 7b98910406b5000a6429c188b0c6cc14e3140637

Re: [PATCH] c++: ICE with template NEW_EXPR [PR105803]

2022-06-02 Thread Marek Polacek via Gcc-patches

On Thu, Jun 02, 2022 at 04:26:27PM -0400, Jason Merrill wrote:
> On 6/2/22 16:10, Marek Polacek wrote:
> > index 1346a1d4c10..2bbd8785627 100644
> > --- a/gcc/cp/constexpr.cc
> > +++ b/gcc/cp/constexpr.cc
> > @@ -9039,10 +9039,18 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
> >   "before C++17");
> > return false;
> > -case DYNAMIC_CAST_EXPR:
> > -case PSEUDO_DTOR_EXPR:
> >   case NEW_EXPR:
> >   case VEC_NEW_EXPR:
> > +  if (cxx_dialect >= cxx20)
> > +   /* In C++20, new-expressions are potentially constant.  */
> > +   return true;
> > +  else if (flags & tf_error)
> > +   error_at (loc, "new-expression is not a constant expression "
> > + "before C++20");
> > +  return false;
> > +
> > +case DYNAMIC_CAST_EXPR:
> > +case PSEUDO_DTOR_EXPR:
> >   case DELETE_EXPR:
> >   case VEC_DELETE_EXPR:
> 
> Delete, too.

Duh.  Fixed:

From 2423f6548405185e256036df3d0ef3c13fd996c5 Mon Sep 17 00:00:00 2001
From: Marek Polacek 
Date: Thu, 2 Jun 2022 15:56:18 -0400
Subject: [PATCH] c++: new-expression is potentially constant in C++20

... so adjust p_c_e accordingly.

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1): Treat
{,VEC_}NEW_EXPR and {,VEC_}DELETE_EXPR as potentially constant in C++20.
---
 gcc/cp/constexpr.cc | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 1346a1d4c10..684238883dc 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -9039,12 +9039,20 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
  "before C++17");
   return false;
 
-case DYNAMIC_CAST_EXPR:
-case PSEUDO_DTOR_EXPR:
 case NEW_EXPR:
 case VEC_NEW_EXPR:
 case DELETE_EXPR:
 case VEC_DELETE_EXPR:
+  if (cxx_dialect >= cxx20)
+   /* In C++20, new-expressions are potentially constant.  */
+   return true;
+  else if (flags & tf_error)
+   error_at (loc, "new-expression is not a constant expression "
+ "before C++20");
+  return false;
+
+case DYNAMIC_CAST_EXPR:
+case PSEUDO_DTOR_EXPR:
 case THROW_EXPR:
 case OMP_PARALLEL:
 case OMP_TASK:

base-commit: 7b98910406b5000a6429c188b0c6cc14e3140637
-- 
2.36.1
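
What "potentially constant in C++20" permits can be sketched as follows. This is an illustrative example, not taken from the patch or its testsuite: the macro SKETCH_CONSTEXPR and the function roundtrip_through_heap are hypothetical, and the feature-test guard is there so the code still builds (as a plain inline function) with pre-C++20 compilers.

```cpp
// Use constexpr only when the compiler advertises C++20 constexpr allocation.
#if defined(__cpp_constexpr_dynamic_alloc) && __cpp_constexpr_dynamic_alloc >= 201907L
#define SKETCH_CONSTEXPR constexpr
#define SKETCH_HAS_CONSTEXPR_NEW 1
#else
#define SKETCH_CONSTEXPR inline
#define SKETCH_HAS_CONSTEXPR_NEW 0
#endif

// Allocates, reads back, and frees within a single evaluation.  In C++20 the
// new- and delete-expressions do not by themselves disqualify this function
// from constant evaluation, provided the allocation is released before the
// evaluation ends.
SKETCH_CONSTEXPR int roundtrip_through_heap(int value) {
  int *p = new int(value);
  int result = *p;
  delete p;
  return result;
}

#if SKETCH_HAS_CONSTEXPR_NEW
static_assert(roundtrip_through_heap(42) == 42,
              "new/delete evaluated at compile time");
#endif
```

Whether a given call is actually constant is still decided later by the evaluator; the function patched above only answers whether the expression could be constant.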

Re: [PATCH] c++: cv-quals of dummy obj for non-dep memfn call [PR105637]

2022-06-02 Thread Jason Merrill via Gcc-patches


On 6/2/22 15:57, Patrick Palka wrote:

On Thu, 2 Jun 2022, Jason Merrill wrote:


On 5/27/22 09:57, Patrick Palka wrote:

On Thu, 26 May 2022, Patrick Palka wrote:


On Thu, 26 May 2022, Jason Merrill wrote:


On 5/26/22 14:57, Patrick Palka wrote:

On Thu, 26 May 2022, Patrick Palka wrote:


Here we expect the calls to BaseClass::baseDevice resolve to the second,
third and fourth overloads respectively in light of the cv-qualifiers
of 'this' in each case.  But ever since r12-6075-g2decd2cabe5a4f, the
calls incorrectly resolve to the first overload at instantiation time.

This happens because the calls to BaseClass::baseDevice are all deemed
non-dependent (ever since r7-755-g23cb72663051cd made us ignore the
dependentness of 'this' when considering the dependence of a non-static
memfn call), hence we end up checking the call ahead of time, using as
the object argument a dummy object of type BaseClass.  Since this object
argument is cv-unqualified, the calls incoherently resolve to the first
overload of baseDevice.  Before r12-6075, this incorrect result would
just get silently discarded and we'd end up redoing OR at instantiation
time using 'this' as the object argument.  But after r12-6075, we now
reuse this incorrect result at instantiation time.

This patch fixes this by making finish_call_expr request from
maybe_dummy_object a cv-qualified object consistent with the cv-quals of
'this'.  That way, ahead of time OR using a dummy object will give us
the right answer and we could safely reuse it at instantiation time.
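
The expected overload selection can be sketched without templates. This is a simplified illustration, not the non-dependent23.C testcase: TopClass is deliberately a plain class here (an assumption, so the example does not depend on the template handling being fixed), and the return values and member function names are hypothetical markers for which overload was chosen.

```cpp
struct BaseClass {
  int baseDevice() { return 0; }                 // unqualified
  int baseDevice() const { return 1; }           // const
  int baseDevice() volatile { return 2; }        // volatile
  int baseDevice() const volatile { return 3; }  // const volatile
};

struct TopClass : BaseClass {
  // In each member function the implied object argument is '*this', so
  // overload resolution follows the cv-qualifiers of the enclosing function.
  int fromPlain() { return BaseClass::baseDevice(); }
  int fromConst() const { return BaseClass::baseDevice(); }
  int fromVolatile() volatile { return BaseClass::baseDevice(); }
  int fromConstVolatile() const volatile { return BaseClass::baseDevice(); }
};
```

A dummy object that is always cv-unqualified would pick the first overload in all four cases, which is the wrong answer the description above refers to.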

NB: r7-755 is also the cause of the related issue PR105742.  Not sure
if there's a fix that could resolve both PRs at once..

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk/12?

PR c++/105637

gcc/cp/ChangeLog:

* semantics.cc (finish_call_expr): Pass a cv-qualified object
type to maybe_dummy_object that is consistent with the
cv-qualifiers of 'this' if available.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent23.C: New test.
---
gcc/cp/semantics.cc   | 15 ---
.../g++.dg/template/non-dependent23.C | 25 +++
2 files changed, 37 insertions(+), 3 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/template/non-dependent23.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index cd7a2818feb..1d9348c6cf1 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -2802,16 +2802,25 @@ finish_call_expr (tree fn, vec **args, bool disallow_virtual,
[class.access.base] says that we need to convert 'this' to B* as
part of the access, so we pass 'B' to maybe_dummy_object.  */
+  tree object_type = BINFO_TYPE (BASELINK_ACCESS_BINFO (fn));
  if (DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (get_first_fn (fn)))
{
  /* A constructor call always uses a dummy object.  (This constructor
 call which has the form A::A () is actually invalid and we are
 going to reject it later in build_new_method_call.)  */
- object = build_dummy_object (BINFO_TYPE (BASELINK_ACCESS_BINFO (fn)));
+ object = build_dummy_object (object_type);
}
  else
-   object = maybe_dummy_object (BINFO_TYPE (BASELINK_ACCESS_BINFO (fn)),
-NULL);
+   {
+ if (current_class_ref)
+   {
+ /* Make sure that if maybe_dummy_object gives us a dummy object,
+it'll have the same cv-quals as '*this'.  */
+ int quals = cp_type_quals (TREE_TYPE (current_class_ref));
+ object_type = cp_build_qualified_type (object_type, quals);
+   }
+ object = maybe_dummy_object (object_type, NULL);
+   }
result = build_new_method_call (object, fn, args, NULL_TREE,
  (disallow_virtual


Drat, this fix doesn't interact well with 'this'-capturing lambdas:

   struct BaseClass {
 void baseDevice();// #1
 void baseDevice() const = delete; // #2
   };

   template
   struct TopClass : T {
 void failsToCompile() {
   [this] { BaseClass::baseDevice(); }();
 }
   };

   template struct TopClass;

Here after the fix, we'd incorrectly select the const #2 overload at
template definition time because current_class_ref is the const 'this'
for the lambda rather than the non-const 'this' for TopClass..  I suppose
we need something like current_nonlambda_class_type for getting at the
innermost non-lambda 'this'?


Do you want maybe_resolve_dummy (ob, false)?


That sadly doesn't seem to work -- the object type is BaseClass which is
not necessarily a base of the dependent TopClass, so
resolvable_dummy_lambda returns NULL_TREE.  I guess it would work at
instantiation time though.


Ah, what seems to work well is directly using lambda_expr_this_capture
instead of maybe_resolve_dummy.

Re: [PATCH] c++: value-dep but not type-dep decltype operand [PR105756]

2022-06-02 Thread Patrick Palka via Gcc-patches

On Thu, 2 Jun 2022, Jason Merrill wrote:

> On 6/1/22 14:20, Patrick Palka wrote:
> > r12-7564-gec0f53a3a542e7 made us instantiate non-constant non-dependent
> > decltype operands by relaxing instantiate_non_dependent_expr to check
> > instantiation_dependent_uneval_expression_p.  But as the testcase below
> > demonstrates, this predicate is too permissive here because it allows
> > value-dependent-only expressions to go through and get instantiated
> > ahead of time, which causes us to crash during constexpr evaluation of
> > (5 % N).
> 
> Why are we doing constexpr evaluation in unevaluated context?

Looks like because cp_build_binary_op attempts to fold the resulting
operator expression via cp_fully_fold (which performs speculative
constexpr evaluation):

  if (!processing_template_decl)
    {
      if (resultcode == SPACESHIP_EXPR)
        result = get_target_expr (result, complain);
      op0 = cp_fully_fold (op0);
      /* Only consider the second argument if the first isn't overflowed.  */
      if (!CONSTANT_CLASS_P (op0) || TREE_OVERFLOW_P (op0))
        return result;
      op1 = cp_fully_fold (op1);
      if (!CONSTANT_CLASS_P (op1) || TREE_OVERFLOW_P (op1))
        return result;
    }

But in an unevaluated context I suppose we don't need or want to do this
folding.  I'll work on a patch to that effect.

> 
> > This patch strengthens instantiate_non_dependent_expr to use the
> > non-uneval version of the predicate instead, which does consider value
> > dependence.  In turn, we need to make finish_decltype_type avoid calling
> > i_n_d_e on a value-dependent-only expression; I assume we still want to
> > resolve the decltype ahead of time in this case.  (Doing so seems
> > unintuitive to me since the expression could be ill-formed at
> > instantiation time as in the testcase, but it matches the behavior of
> > Clang and MSVC.)
> 
> I don't think there's any problem with the testcase;  decltype(1/0) is
> well-formed because the expression is not evaluated.

D'oh, that makes sense, not sure what I was on about :)
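
To make the point about unevaluated operands concrete, here is a small
sketch (not taken from the patch or its testcase): decltype only computes
the type of its operand, so a division or remainder by zero inside it is
never performed.  The alias and function names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// The operand of decltype is unevaluated: only its type is determined.
using quotient_t = decltype(1 / 0);
using compare_t  = decltype((5 % 0) == 0);

static_assert(std::is_same<quotient_t, int>::value,
              "int / int has type int, whatever the divisor's value");
static_assert(std::is_same<compare_t, bool>::value,
              "the comparison has type bool");

// Hypothetical helper so the same facts can also be checked at run time.
constexpr bool decltype_ignores_division_by_zero() {
  return std::is_same<quotient_t, int>::value
         && std::is_same<compare_t, bool>::value;
}
```

The same reasoning applies to decltype((5 % N) == 0) in a template: the type
is bool for every N, which is why the decltype can be resolved ahead of time
even though the operand is value-dependent.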

> 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> > trunk/12?
> > 
> > PR c++/105756
> > 
> > gcc/cp/ChangeLog:
> > 
> > * pt.cc (instantiate_non_dependent_expr_internal): Adjust
> > comment.
> > (instantiate_non_dependent_expr_sfinae): Assert i_d_e_p instead
> > of i_d_u_e_p.
> > * semantics.cc (finish_decltype_type): Don't instantiate the
> > expression when i_d_e_p is true.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/cpp0x/decltype82.C: New test.
> > ---
> >   gcc/cp/pt.cc|  4 ++--
> >   gcc/cp/semantics.cc | 13 -
> >   gcc/testsuite/g++.dg/cpp0x/decltype82.C | 10 ++
> >   3 files changed, 24 insertions(+), 3 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype82.C
> > 
> > diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> > index e4a473002a0..1ea2545e115 100644
> > --- a/gcc/cp/pt.cc
> > +++ b/gcc/cp/pt.cc
> > @@ -6372,7 +6372,7 @@ redeclare_class_template (tree type, tree parms, tree
> > cons)
> > /* The actual substitution part of
> > instantiate_non_dependent_expr_sfinae,
> >  to be used when the caller has already checked
> > -!instantiation_dependent_uneval_expression_p (expr)
> > +!instantiation_dependent_expression_p (expr)
> >  and cleared processing_template_decl.  */
> > tree
> > @@ -6397,7 +6397,7 @@ instantiate_non_dependent_expr_sfinae (tree expr,
> > tsubst_flags_t complain)
> > if (processing_template_decl)
> >   {
> > /* The caller should have checked this already.  */
> > -  gcc_checking_assert (!instantiation_dependent_uneval_expression_p
> > (expr));
> > +  gcc_checking_assert (!instantiation_dependent_expression_p (expr));
> > processing_template_decl_sentinel s;
> > expr = instantiate_non_dependent_expr_internal (expr, complain);
> >   }
> > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > index 3600d270ff8..b23848ab94c 100644
> > --- a/gcc/cp/semantics.cc
> > +++ b/gcc/cp/semantics.cc
> > @@ -11302,9 +11302,20 @@ finish_decltype_type (tree expr, bool
> > id_expression_or_member_access_p,
> >   return type;
> >   }
> > +  else if (processing_template_decl
> > +  && potential_constant_expression (expr)
> > +  && value_dependent_expression_p (expr))
> > +/* The above test is equivalent to instantiation_dependent_expression_p
> > +   after instantiation_dependent_uneval_expression_p has been ruled
> > out.
> > +   In this case the expression is dependent but not type-dependent, so
> > +   we can resolve the decltype ahead of time but we can't instantiate
> > +   the expression.  */;
> > else if (processing_template_decl)
> >   {
> > -  expr = instantiate_non_dependent_expr_sfinae (expr,
> > complain|tf_decltype);
> > +  /* The

Re: [PATCH] c++: ICE with template NEW_EXPR [PR105803]

2022-06-02 Thread Jason Merrill via Gcc-patches


On 6/2/22 16:10, Marek Polacek wrote:
> On Thu, Jun 02, 2022 at 03:42:15PM -0400, Jason Merrill wrote:
> > On 6/2/22 10:03, Marek Polacek wrote:
> > > On Thu, Jun 02, 2022 at 08:42:36AM -0400, Patrick Palka wrote:
> > > > On Wed, 1 Jun 2022, Marek Polacek via Gcc-patches wrote:
> > > > 
> > > > > Here we ICE because value_dependent_expression_p gets a NEW_EXPR
> > > > > whose operand is a type, and we go to the default case which just
> > > > > calls v_d_e_p on each operand of the NEW_EXPR.  Since one of them
> > > > > is a type, we crash on the new assert in t_d_e_p.
> > > > 
> > > > Looks like NEW_EXPR is considered to be not potentially constant
> > > > according to potential_constant_expression.  I thought we shouldn't
> > > > be calling value_dependent_expression_p on such exprs?
> > 
> > Except in C++20 new-expressions are potentially constant, so the patch is
> 
> Thanks, pushed.
> 
> > OK, and we should adjust pce1 accordingly.
> 
> Is the attached patch OK then?  So far dg.exp passed.  Though it won't help
> with...
> 
> > I notice we currently fail to handle
> > 
> > struct A
> > {
> >   int i;
> >   constexpr A(int *p): i(*p) { delete p; }
> > };
> > 
> > constexpr int i = A(new int(42)).i;
> > 
> > though it does work inside a function.
> 
> ...this test (it complains about a TARGET_EXPR's slot variable not being
> declared constexpr), so I'm going to open a PR.
> 
> From cf70354894bc31cc542ed8df40633bea2427fee7 Mon Sep 17 00:00:00 2001
> From: Marek Polacek
> Date: Thu, 2 Jun 2022 15:56:18 -0400
> Subject: [PATCH] c++: new-expression is potentially constant in C++20
> 
> ... so adjust p_c_e accordingly.
> 
> gcc/cp/ChangeLog:
> 
> * constexpr.cc (potential_constant_expression_1): Treat
> {,VEC_}NEW_EXPR as potentially constant in C++20.
> ---
>  gcc/cp/constexpr.cc | 12 ++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> index 1346a1d4c10..2bbd8785627 100644
> --- a/gcc/cp/constexpr.cc
> +++ b/gcc/cp/constexpr.cc
> @@ -9039,10 +9039,18 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
>   "before C++17");
>    return false;
>  
> -case DYNAMIC_CAST_EXPR:
> -case PSEUDO_DTOR_EXPR:
>  case NEW_EXPR:
>  case VEC_NEW_EXPR:
> +  if (cxx_dialect >= cxx20)
> +   /* In C++20, new-expressions are potentially constant.  */
> +   return true;
> +  else if (flags & tf_error)
> +   error_at (loc, "new-expression is not a constant expression "
> + "before C++20");
> +  return false;
> +
> +case DYNAMIC_CAST_EXPR:
> +case PSEUDO_DTOR_EXPR:
>  case DELETE_EXPR:
>  case VEC_DELETE_EXPR:

Delete, too.

>  case THROW_EXPR:
> 
> base-commit: 7b98910406b5000a6429c188b0c6cc14e3140637

Re: [PATCH] c++: ICE with template NEW_EXPR [PR105803]

2022-06-02 Thread Marek Polacek via Gcc-patches

On Thu, Jun 02, 2022 at 03:42:15PM -0400, Jason Merrill wrote:
> On 6/2/22 10:03, Marek Polacek wrote:
> > On Thu, Jun 02, 2022 at 08:42:36AM -0400, Patrick Palka wrote:
> > > On Wed, 1 Jun 2022, Marek Polacek via Gcc-patches wrote:
> > > 
> > > > Here we ICE because value_dependent_expression_p gets a NEW_EXPR
> > > > whose operand is a type, and we go to the default case which just
> > > > calls v_d_e_p on each operand of the NEW_EXPR.  Since one of them
> > > > is a type, we crash on the new assert in t_d_e_p.
> > > 
> > > Looks like NEW_EXPR is considered to be not potentially constant
> > > according to potential_constant_expression.  I thought we shouldn't
> > > be calling value_dependent_expression_p on such exprs?
> 
> Except in C++20 new-expressions are potentially constant, so the patch is

Thanks, pushed.

> OK, and we should adjust pce1 accordingly.

Is the attached patch OK then?  So far dg.exp passed.  Though it won't help
with...
 
> I notice we currently fail to handle
> 
> struct A
> {
>   int i;
>   constexpr A(int *p): i(*p) { delete p; }
> };
> 
> constexpr int i = A(new int(42)).i;
> 
> though it does work inside a function.

...this test (it complains about a TARGET_EXPR's slot variable not being
declared constexpr), so I'm going to open a PR.
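
For readers following along, here is a sketch of the "works inside a
function" variant mentioned above, assuming a compiler that defines the
C++20 feature-test macro __cpp_constexpr_dynamic_alloc.  The macro
CXX20_CONSTEXPR and the function value_via_function are hypothetical names;
before C++20 the code falls back to an ordinary run-time function so that it
still compiles.

```cpp
#include <cassert>

#if defined(__cpp_constexpr_dynamic_alloc)
#  define CXX20_CONSTEXPR constexpr
#else
#  define CXX20_CONSTEXPR inline
#endif

struct A
{
  int i;
  // Reads the pointee, then frees it; constexpr only when C++20 allows it.
  CXX20_CONSTEXPR A(int *p): i(*p) { delete p; }
};

// The allocation and the matching delete both happen within this one
// evaluation, which is what C++20 requires of a constant expression.
CXX20_CONSTEXPR int value_via_function()
{
  return A(new int(42)).i;
}

#if defined(__cpp_constexpr_dynamic_alloc)
static_assert(value_via_function() == 42,
              "new/delete inside a constexpr function call");
#endif
```

The namespace-scope form 'constexpr int i = A(new int(42)).i;' is the case
reported above as currently failing, so it is intentionally left out of the
sketch.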

From cf70354894bc31cc542ed8df40633bea2427fee7 Mon Sep 17 00:00:00 2001
From: Marek Polacek 
Date: Thu, 2 Jun 2022 15:56:18 -0400
Subject: [PATCH] c++: new-expression is potentially constant in C++20

... so adjust p_c_e accordingly.

gcc/cp/ChangeLog:

* constexpr.cc (potential_constant_expression_1): Treat
{,VEC_}NEW_EXPR as potentially constant in C++20.
---
 gcc/cp/constexpr.cc | 12 ++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 1346a1d4c10..2bbd8785627 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -9039,10 +9039,18 @@ potential_constant_expression_1 (tree t, bool want_rval, bool strict, bool now,
  "before C++17");
   return false;
 
-case DYNAMIC_CAST_EXPR:
-case PSEUDO_DTOR_EXPR:
 case NEW_EXPR:
 case VEC_NEW_EXPR:
+  if (cxx_dialect >= cxx20)
+   /* In C++20, new-expressions are potentially constant.  */
+   return true;
+  else if (flags & tf_error)
+   error_at (loc, "new-expression is not a constant expression "
+ "before C++20");
+  return false;
+
+case DYNAMIC_CAST_EXPR:
+case PSEUDO_DTOR_EXPR:
 case DELETE_EXPR:
 case VEC_DELETE_EXPR:
 case THROW_EXPR:

base-commit: 7b98910406b5000a6429c188b0c6cc14e3140637
-- 
2.36.1

Re: [PATCH] c++: value-dep but not type-dep decltype operand [PR105756]

2022-06-02 Thread Jason Merrill via Gcc-patches


On 6/1/22 14:20, Patrick Palka wrote:

r12-7564-gec0f53a3a542e7 made us instantiate non-constant non-dependent
decltype operands by relaxing instantiate_non_dependent_expr to check
instantiation_dependent_uneval_expression_p.  But as the testcase below
demonstrates, this predicate is too permissive here because it allows
value-dependent-only expressions to go through and get instantiated
ahead of time, which causes us to crash during constexpr evaluation of
(5 % N).


Why are we doing constexpr evaluation in unevaluated context?


This patch strengthens instantiate_non_dependent_expr to use the
non-uneval version of the predicate instead, which does consider value
dependence.  In turn, we need to make finish_decltype_type avoid calling
i_n_d_e on a value-dependent-only expression; I assume we still want to
resolve the decltype ahead of time in this case.  (Doing so seems
unintuitive to me since the expression could be ill-formed at
instantiation time as in the testcase, but it matches the behavior of
Clang and MSVC.)


I don't think there's any problem with the testcase;  decltype(1/0) is 
well-formed because the expression is not evaluated.



Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk/12?

PR c++/105756

gcc/cp/ChangeLog:

* pt.cc (instantiate_non_dependent_expr_internal): Adjust
comment.
(instantiate_non_dependent_expr_sfinae): Assert i_d_e_p instead
of i_d_u_e_p.
* semantics.cc (finish_decltype_type): Don't instantiate the
expression when i_d_e_p is true.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/decltype82.C: New test.
---
  gcc/cp/pt.cc|  4 ++--
  gcc/cp/semantics.cc | 13 -
  gcc/testsuite/g++.dg/cpp0x/decltype82.C | 10 ++
  3 files changed, 24 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/decltype82.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index e4a473002a0..1ea2545e115 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -6372,7 +6372,7 @@ redeclare_class_template (tree type, tree parms, tree cons)
 
 /* The actual substitution part of instantiate_non_dependent_expr_sfinae,
    to be used when the caller has already checked
-    !instantiation_dependent_uneval_expression_p (expr)
+    !instantiation_dependent_expression_p (expr)
    and cleared processing_template_decl.  */
 
 tree
@@ -6397,7 +6397,7 @@ instantiate_non_dependent_expr_sfinae (tree expr, tsubst_flags_t complain)
   if (processing_template_decl)
     {
       /* The caller should have checked this already.  */
-      gcc_checking_assert (!instantiation_dependent_uneval_expression_p (expr));
+      gcc_checking_assert (!instantiation_dependent_expression_p (expr));
       processing_template_decl_sentinel s;
       expr = instantiate_non_dependent_expr_internal (expr, complain);
     }
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index 3600d270ff8..b23848ab94c 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -11302,9 +11302,20 @@ finish_decltype_type (tree expr, bool id_expression_or_member_access_p,
           return type;
         }
+  else if (processing_template_decl
+           && potential_constant_expression (expr)
+           && value_dependent_expression_p (expr))
+    /* The above test is equivalent to instantiation_dependent_expression_p
+       after instantiation_dependent_uneval_expression_p has been ruled out.
+       In this case the expression is dependent but not type-dependent, so
+       we can resolve the decltype ahead of time but we can't instantiate
+       the expression.  */;
   else if (processing_template_decl)
     {
-      expr = instantiate_non_dependent_expr_sfinae (expr, complain|tf_decltype);
+      /* The expression isn't instantiation dependent, so we can fully
+         instantiate it ahead of time.  */
+      expr = instantiate_non_dependent_expr_sfinae (expr,
+                                                    complain|tf_decltype);
       if (expr == error_mark_node)
         return error_mark_node;
       /* Keep processing_template_decl cleared for the rest of the function
diff --git a/gcc/testsuite/g++.dg/cpp0x/decltype82.C b/gcc/testsuite/g++.dg/cpp0x/decltype82.C
new file mode 100644
index 00000000000..915e5e37675
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/decltype82.C
@@ -0,0 +1,10 @@
+// PR c++/105756
+// { dg-do compile { target c++11 } }
+
+template<int N>
+void f() {
+  using ty1 = decltype((5 % N) == 0);
+  using ty2 = decltype((5 / N) == 0);
+}
+
+template void f<0>();

Re: [PATCH] c++: cv-quals of dummy obj for non-dep memfn call [PR105637]

2022-06-02 Thread Patrick Palka via Gcc-patches

On Thu, 2 Jun 2022, Jason Merrill wrote:

> On 5/27/22 09:57, Patrick Palka wrote:
> > On Thu, 26 May 2022, Patrick Palka wrote:
> > 
> > > On Thu, 26 May 2022, Jason Merrill wrote:
> > > 
> > > > On 5/26/22 14:57, Patrick Palka wrote:
> > > > > On Thu, 26 May 2022, Patrick Palka wrote:
> > > > > 
> > > > > > > Here we expect the calls to BaseClass::baseDevice to resolve to the
> > > > > > > second, third and fourth overloads respectively in light of the
> > > > > > > cv-qualifiers of 'this' in each case.  But ever since
> > > > > > > r12-6075-g2decd2cabe5a4f, the calls incorrectly resolve to the first
> > > > > > > overload at instantiation time.
> > > > > > > 
> > > > > > > This happens because the calls to BaseClass::baseDevice are all deemed
> > > > > > > non-dependent (ever since r7-755-g23cb72663051cd made us ignore the
> > > > > > > dependentness of 'this' when considering the dependence of a non-static
> > > > > > > memfn call), hence we end up checking the call ahead of time, using as
> > > > > > > the object argument a dummy object of type BaseClass.  Since this object
> > > > > > > argument is cv-unqualified, the calls incoherently resolve to the first
> > > > > > > overload of baseDevice.  Before r12-6075, this incorrect result would
> > > > > > > just get silently discarded and we'd end up redoing OR at instantiation
> > > > > > > time using 'this' as the object argument.  But after r12-6075, we now
> > > > > > > reuse this incorrect result at instantiation time.
> > > > > > > 
> > > > > > > This patch fixes this by making finish_call_expr request from
> > > > > > > maybe_dummy_object a cv-qualified object consistent with the cv-quals of
> > > > > > > 'this'.  That way, ahead of time OR using a dummy object will give us
> > > > > > > the right answer and we could safely reuse it at instantiation time.
> > > > > > > 
> > > > > > > NB: r7-755 is also the cause of the related issue PR105742.  Not sure
> > > > > > > if there's a fix that could resolve both PRs at once.
> > > > > > 
> > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> > > > > > for trunk/12?
> > > > > > 
> > > > > > PR c++/105637
> > > > > > 
> > > > > > gcc/cp/ChangeLog:
> > > > > > 
> > > > > > * semantics.cc (finish_call_expr): Pass a cv-qualified object
> > > > > > type to maybe_dummy_object that is consistent with the
> > > > > > cv-qualifiers of 'this' if available.
> > > > > > 
> > > > > > gcc/testsuite/ChangeLog:
> > > > > > 
> > > > > > * g++.dg/template/non-dependent23.C: New test.
> > > > > > ---
> > > > > >gcc/cp/semantics.cc   | 15 ---
> > > > > >.../g++.dg/template/non-dependent23.C | 25
> > > > > > +++
> > > > > >2 files changed, 37 insertions(+), 3 deletions(-)
> > > > > >create mode 100644
> > > > > > gcc/testsuite/g++.dg/template/non-dependent23.C
> > > > > > 
> > > > > > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > > > > > index cd7a2818feb..1d9348c6cf1 100644
> > > > > > --- a/gcc/cp/semantics.cc
> > > > > > +++ b/gcc/cp/semantics.cc
> > > > > > @@ -2802,16 +2802,25 @@ finish_call_expr (tree fn, vec
> > > > > > **args, bool disallow_virtual,
> > > > > > [class.access.base] says that we need to convert 'this' to B*
> > > > > > as
> > > > > > part of the access, so we pass 'B' to maybe_dummy_object.  */
> > > > > >+  tree object_type = BINFO_TYPE (BASELINK_ACCESS_BINFO
> > > > > > (fn));
> > > > > >  if (DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (get_first_fn (fn)))
> > > > > > {
> > > > > >   /* A constructor call always uses a dummy object.  (This
> > > > > > constructor
> > > > > >  call which has the form A::A () is actually invalid and
> > > > > > we are
> > > > > >  going to reject it later in build_new_method_call.)  */
> > > > > > - object = build_dummy_object (BINFO_TYPE
> > > > > > (BASELINK_ACCESS_BINFO
> > > > > > (fn)));
> > > > > > + object = build_dummy_object (object_type);
> > > > > > }
> > > > > >  else
> > > > > > -   object = maybe_dummy_object (BINFO_TYPE (BASELINK_ACCESS_BINFO
> > > > > > (fn)),
> > > > > > -NULL);
> > > > > > +   {
> > > > > > + if (current_class_ref)
> > > > > > +   {
> > > > > > + /* Make sure that if maybe_dummy_object gives us a dummy
> > > > > > object,
> > > > > > +it'll have the same cv-quals as '*this'.  */
> > > > > > + int quals = cp_type_quals (TREE_TYPE
> > > > > > (current_class_ref));
> > > > > > + object_type = cp_build_qualified_type (object_type,
> > > > > > quals);
> > > > > > +   }
> > > > > > + object = maybe_dummy_object (object_type, NULL);
> > > > > > +   }
> > > > > >result = build_new_method_call (object, fn, args,
> > > > > > NULL_TREE,
> > > >

Re: [PATCH] c++: don't substitute TEMPLATE_PARM_CONSTRAINT [PR100374]

2022-06-02 Thread Jason Merrill via Gcc-patches


On 6/2/22 11:40, Patrick Palka wrote:

On Tue, 31 May 2022, Jason Merrill wrote:


On 5/31/22 08:56, Patrick Palka wrote:

On Sun, 29 May 2022, Jason Merrill wrote:


On 5/29/22 22:10, Jason Merrill wrote:

On 5/27/22 14:05, Patrick Palka wrote:

This makes us avoid substituting into the TEMPLATE_PARM_CONSTRAINT of
each template parameter except as necessary for (friend) declaration
matching, like we already do for the overall
TEMPLATE_PARMS_CONSTRAINTS
of a template parameter list.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for
trunk and perhaps 12.2?  Also tested on range-v3 and cmcstl2.


Are there already tests that cover the friend cases?


Yes, by cpp2a/concepts-friend{2,3,7}.C I think.



Also, don't you also need to handle specialization of partial
instantiations?


Hmm, do you have an example?  IIUC we call tsubst_friend_function and
tsubst_friend_class only from instantiate_class_template_1, which always
uses the most general template and full template argument set to
instantiate any friend declarations.  So friend declarations are never
partially instantiated I think.  (And IIUC non-friends are irrelevant
here since we don't ever want to substitute their constraints outside of
satisfaction.)


From C++20 CA104:

   template <typename T> struct A {
     template <typename U> U f(U) requires C<typename T::type>;
     template <typename U> U f(U) requires C<T>;
   };

   // Substitute int for T in above requirements to find match.
   template <> template <typename U> U A<int>::f(U) requires C<int>  { }


Aha, thanks.  In this case of declaration matching, it looks like
determine_specialization ignores all but the trailing requirement
clause.  I think it's doable if a little messy to precisely handle
this case but in the meantime it seems we could get 90% of the way there
by considering the overall constraints instead of just the trailing
constraints?  Something like the following.

(Either way, IIUC the tsubst_template_parm change shouldn't affect this
case at all since determine_specialization uses comp_template_parms
instead of template_heads_equivalent_p and so it ignores
TEMPLATE_PARM_CONSTRAINT.)


OK.


-- >8 --

Subject: [PATCH] c++: don't substitute TEMPLATE_PARM_CONSTRAINT [PR100374]

This makes us avoid substituting into the TEMPLATE_PARM_CONSTRAINT of
each template parameter except as necessary for friend declaration
matching, like we already do for the overall associated constraints.

In passing this improves upon the CA104 implementation of explicit
specialization matching of a constrained function template inside
a class template, by considering the overall constraints instead of
just the trailing constraints.  This allows us to correctly handle the
first three explicit specializations in concepts-spec2.C below, but
because we compare the constraints as a whole, it means we incorrectly
accept the fourth explicit specialization.  For complete correctness,
we should be using tsubst_each_template_parm_constraint and
template_parameter_heads_equivalent_p in determine_specialization.

PR c++/100374

gcc/cp/ChangeLog:

* pt.cc (determine_specialization): Compare overall constraints,
not just the trailing constraints, in the CA104 case.
(tsubst_each_template_parm_constraint): Define.
(tsubst_friend_function): Use it.
(tsubst_friend_class): Use it.
(tsubst_template_parm): Don't substitute TEMPLATE_PARM_CONSTRAINT.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-template-parm11.C: New test.
---
  gcc/cp/pt.cc  | 41 +++
  gcc/testsuite/g++.dg/cpp2a/concepts-spec2.C   | 15 +++
  .../g++.dg/cpp2a/concepts-template-parm11.C   | 20 +
  3 files changed, 69 insertions(+), 7 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-spec2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-template-parm11.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 45dd036c2d5..d867ce8e141 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -184,6 +184,7 @@ static int unify_pack_expansion (tree, tree, tree,
 tree, unification_kind_t, bool, bool);
  static tree copy_template_args (tree);
  static tree tsubst_template_parms (tree, tree, tsubst_flags_t);
+static void tsubst_each_template_parm_constraint (tree, tree, tsubst_flags_t);
  tree most_specialized_partial_spec (tree, tsubst_flags_t);
  static tree tsubst_aggr_type (tree, tree, tsubst_flags_t, tree, int);
  static tree tsubst_arg_types (tree, tree, tree, tsubst_flags_t, tree);
@@ -2323,8 +2324,8 @@ determine_specialization (tree template_id,
           if (!compparms (fn_arg_types, decl_arg_types))
             continue;
 
-          tree freq = get_trailing_function_requirements (fn);
-          tree dreq = get_trailing_function_requirements (decl);
+          tree freq = get_constraints (fn);
+          tree dreq = get_constraints (decl);
           if (!freq != !dreq)
             continue;
           if (freq)

[committed] diagnostics: add SARIF output format

2022-06-02 Thread David Malcolm via Gcc-patches

This patch adds support to gcc's diagnostic subsystem for emitting
diagnostics in SARIF, aka the Static Analysis Results Interchange Format:
  https://sarifweb.azurewebsites.net/
by extending -fdiagnostics-format= to add two new options:
  -fdiagnostics-format=sarif-stderr
and:
  -fdiagnostics-format=sarif-file

The patch targets SARIF v2.1.0.

This is a JSON-based format suited for capturing the results of static
analysis tools (like GCC's -fanalyzer), but it can also be used for plain
GCC warnings and errors.

SARIF supports per-event metadata in diagnostic paths such as
["acquire", "resource"] and ["release", "lock"] (specifically, the
threadFlowLocation "kinds" property: SARIF v2.1.0 section 3.38.8), so
the patch extends GCC's diagnostic_event subclass with a "struct meaning"
with similar purpose.  The patch implements this for -fanalyzer so that
the various state-machine-based warnings set these in the SARIF output.

The heart of the implementation is in the new file
diagnostic-format-sarif.cc.  Much of the rest of the patch is interface
classes, isolating the diagnostic subsystem (which has no knowledge of
e.g. tree or langhook) from the "client" code in the compiler proper
(cc1 etc).

The patch adds a langhook for specifying the SARIF v2.1.0
"artifact.sourceLanguage" property, based on the list in
SARIF v2.1.0 Appendix J.

The patch adds automated DejaGnu tests to our testsuite via new
scan-sarif-file and scan-sarif-file-not directives (although these
merely use regexps, rather than attempting to use a proper JSON parser).

I've tested the patch by hand using the validator at:
  https://sarifweb.azurewebsites.net/Validation
and the react-based viewer at:
  https://microsoft.github.io/sarif-web-component/
which successfully shows most of the information (although not paths,
and not CWE IDs), and I've fixed all validation errors I've seen (though
bugs no doubt remain).

I've also tested the generated SARIF using the VS Code extension linked
to from the SARIF website; I'm a novice with VS Code, but it seems to be
able to handle my generated SARIF files (e.g. showing the data in the
SARIF tab, and showing squiggly underlines under issues, and when I
click on them, it visualizes the events in the path inline within the
source window).

Has anyone written an Emacs mode for SARIF files? (pretty please)

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-967-g6cf276ddf22066.

gcc/ChangeLog:
* Makefile.in (OBJS): Add tree-diagnostic-client-data-hooks.o and
tree-logical-location.o.
(OBJS-libcommon): Add diagnostic-format-sarif.o; reorder.
(CFLAGS-tree-diagnostic-client-data-hooks.o): Add TARGET_NAME.
* common.opt (fdiagnostics-format=): Add sarif-stderr and sarif-file.
(sarif-stderr, sarif-file): New enum values.
* diagnostic-client-data-hooks.h: New file.
* diagnostic-format-sarif.cc: New file.
* diagnostic-path.h (enum diagnostic_event::verb): New enum.
(enum diagnostic_event::noun): New enum.
(enum diagnostic_event::property): New enum.
(struct diagnostic_event::meaning): New struct.
(diagnostic_event::get_logical_location): New vfunc.
(diagnostic_event::get_meaning): New vfunc.
(simple_diagnostic_event::get_logical_location): New vfunc impl.
(simple_diagnostic_event::get_meaning): New vfunc impl.
* diagnostic.cc: Include "diagnostic-client-data-hooks.h".
(diagnostic_initialize): Initialize m_client_data_hooks.
(diagnostic_finish): Clean up m_client_data_hooks.
(diagnostic_event::meaning::dump_to_pp): New.
(diagnostic_event::meaning::maybe_get_verb_str): New.
(diagnostic_event::meaning::maybe_get_noun_str): New.
(diagnostic_event::meaning::maybe_get_property_str): New.
(get_cwe_url): Make non-static.
(diagnostic_output_format_init): Handle
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR and
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE.
* diagnostic.h (enum diagnostics_output_format): Add
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_STDERR and
DIAGNOSTICS_OUTPUT_FORMAT_SARIF_FILE.
(class diagnostic_client_data_hooks): New forward decl.
(class logical_location): New forward decl.
(diagnostic_context::m_client_data_hooks): New field.
(diagnostic_output_format_init_sarif_stderr): New decl.
(diagnostic_output_format_init_sarif_file): New decl.
(get_cwe_url): New decl.
* doc/invoke.texi (-fdiagnostics-format=): Add sarif-stderr and
sarif-file.
* doc/sourcebuild.texi (Scan a particular file): Add
scan-sarif-file and scan-sarif-file-not.
* langhooks-def.h (lhd_get_sarif_source_language): New decl.
(LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE): New macro.
(LANG_HOOKS_INITIALIZER): Add
LANG_HOOKS_GET_SARIF_SOURCE_LANGUAGE.
* langhooks.cc

[committed] Add -fdiagnostics-format={json-stderr|json-file}

2022-06-02 Thread David Malcolm via Gcc-patches

This commit adds -fdiagnostics-format=json-file, writing to
DUMP_BASE_NAME.gcc.json, and adds -fdiagnostics-format=json-stderr,
a synonym for the existing -fdiagnostics-format=json.

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-966-g5ab73173cca461.

gcc/ChangeLog:
* common.opt (fdiagnostics-format=): Add json-stderr and json-file
to description.
(DIAGNOSTICS_OUTPUT_FORMAT_JSON): Rename to...
(DIAGNOSTICS_OUTPUT_FORMAT_JSON_STDERR): ...this.
(diagnostics_output_format): Add json-stderr and json-file.
* diagnostic-format-json.cc (json_flush_to_file): New.
(json_final_cb): Convert to...
(json_flush_to_file): ...this, ...
(json_stderr_final_cb): ...this, and...
(json_file_final_cb): ...this.
(diagnostic_output_format_init): Move to diagnostic.cc.
(json_output_base_file_name): New.
(diagnostic_output_format_init_json): New.
(diagnostic_output_format_init_json_stderr): New.
(diagnostic_output_format_init_json_file): New.
* diagnostic.cc (diagnostic_output_format_init): Move here from
diagnostic-format-json.cc; update for changes to enum.
* diagnostic.h (enum diagnostics_output_format): Rename
DIAGNOSTICS_OUTPUT_FORMAT_JSON to
DIAGNOSTICS_OUTPUT_FORMAT_JSON_STDERR, and add
DIAGNOSTICS_OUTPUT_FORMAT_JSON_FILE.
(diagnostic_output_format_init): Add base_file_name param.
(diagnostic_output_format_init_json_stderr): New decl.
(diagnostic_output_format_init_json_file): New decl.
* doc/invoke.texi (-fdiagnostics-format=): Add "json-stderr" and
"json-file".  Rewrite so that the existing "json" is a synonym of
"json-stderr".
* gcc.cc (driver_handle_option): Pass dump_base_name to
diagnostic_output_format_init.
* opts.cc (common_handle_option): Likewise.

gcc/testsuite/ChangeLog:
* c-c++-common/diagnostic-format-json-file-1.c: New test.
* c-c++-common/diagnostic-format-json-stderr-1.c: New test.

Signed-off-by: David Malcolm 
---
 gcc/common.opt|  10 +-
 gcc/diagnostic-format-json.cc | 116 --
 gcc/diagnostic.cc |  26 
 gcc/diagnostic.h  |  11 +-
 gcc/doc/invoke.texi   |  17 ++-
 gcc/gcc.cc|   2 +-
 gcc/opts.cc   |   2 +-
 .../diagnostic-format-json-file-1.c   |   8 ++
 .../diagnostic-format-json-stderr-1.c |  33 +
 9 files changed, 173 insertions(+), 52 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-file-1.c
 create mode 100644 gcc/testsuite/c-c++-common/diagnostic-format-json-stderr-1.c

diff --git a/gcc/common.opt b/gcc/common.opt
index 8a0dafc522d..3237ce9b530 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1390,7 +1390,7 @@ Common Joined RejectNegative UInteger
 
 fdiagnostics-format=
 Common Joined RejectNegative Enum(diagnostics_output_format)
--fdiagnostics-format=[text|json]   Select output format.
+-fdiagnostics-format=[text|json|json-stderr|json-file] Select output format.
 
 fdiagnostics-escape-format=
 Common Joined RejectNegative Enum(diagnostics_escape_format)
@@ -1425,7 +1425,13 @@ EnumValue
 Enum(diagnostics_output_format) String(text) Value(DIAGNOSTICS_OUTPUT_FORMAT_TEXT)
 
 EnumValue
-Enum(diagnostics_output_format) String(json) Value(DIAGNOSTICS_OUTPUT_FORMAT_JSON)
+Enum(diagnostics_output_format) String(json) Value(DIAGNOSTICS_OUTPUT_FORMAT_JSON_STDERR)
+
+EnumValue
+Enum(diagnostics_output_format) String(json-stderr) Value(DIAGNOSTICS_OUTPUT_FORMAT_JSON_STDERR)
+
+EnumValue
+Enum(diagnostics_output_format) String(json-file) Value(DIAGNOSTICS_OUTPUT_FORMAT_JSON_FILE)
 
 fdiagnostics-parseable-fixits
 Common Var(flag_diagnostics_parseable_fixits)
diff --git a/gcc/diagnostic-format-json.cc b/gcc/diagnostic-format-json.cc
index 62594ebb4d7..051fa6c2e48 100644
--- a/gcc/diagnostic-format-json.cc
+++ b/gcc/diagnostic-format-json.cc
@@ -285,57 +285,93 @@ json_end_group (diagnostic_context *)
   cur_children_array = NULL;
 }
 
-/* Callback for final cleanup for JSON output.  */
+/* Flush the top-level array to OUTF.  */
 
 static void
-json_final_cb (diagnostic_context *)
+json_flush_to_file (FILE *outf)
 {
-  /* Flush the top-level array.  */
-  toplevel_array->dump (stderr);
-  fprintf (stderr, "\n");
+  toplevel_array->dump (outf);
+  fprintf (outf, "\n");
   delete toplevel_array;
   toplevel_array = NULL;
 }
 
-/* Set the output format for CONTEXT to FORMAT.  */
+/* Callback for final cleanup for JSON output to stderr.  */
 
-void
-diagnostic_output_format_init (diagnostic_context *context,
-  enum diagnostics_output_format format)
+static void
+json_stderr_final_cb (diagnostic_context *)
+{
+

Re: [PATCH] c++: cv-quals of dummy obj for non-dep memfn call [PR105637]

2022-06-02 Thread Jason Merrill via Gcc-patches


On 5/27/22 09:57, Patrick Palka wrote:

On Thu, 26 May 2022, Patrick Palka wrote:


On Thu, 26 May 2022, Jason Merrill wrote:


On 5/26/22 14:57, Patrick Palka wrote:

On Thu, 26 May 2022, Patrick Palka wrote:


Here we expect the calls to BaseClass::baseDevice to resolve to the second,
third and fourth overloads respectively in light of the cv-qualifiers
of 'this' in each case.  But ever since r12-6075-g2decd2cabe5a4f, the
calls incorrectly resolve to the first overload at instantiation time.

This happens because the calls to BaseClass::baseDevice are all deemed
non-dependent (ever since r7-755-g23cb72663051cd made us ignore the
dependentness of 'this' when considering the dependence of a non-static
memfn call), hence we end up checking the call ahead of time, using as
the object argument a dummy object of type BaseClass.  Since this object
argument is cv-unqualified, the calls incoherently resolve to the first
overload of baseDevice.  Before r12-6075, this incorrect result would
just get silently discarded and we'd end up redoing OR at instantiation
time using 'this' as the object argument.  But after r12-6075, we now
reuse this incorrect result at instantiation time.

This patch fixes this by making finish_call_expr request from
maybe_dummy_object a cv-qualified object consistent with the cv-quals of
'this'.  That way, ahead of time OR using a dummy object will give us
the right answer and we could safely reuse it at instantiation time.
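
As an illustration of the behaviour described above, here is a minimal sketch. It is not the actual non-dependent23.C test: the return values, the member names in TopClass and the helper overload_chosen_from are assumptions made for this example. Each call names the non-dependent BaseClass, but the implied object is '*this', so the cv-qualifiers of the enclosing member function should pick the overload.

```cpp
#include <cassert>

// Four overloads that differ only in the cv-qualifiers of 'this'.
struct BaseClass {
  int baseDevice() { return 1; }                 // #1: cv-unqualified
  int baseDevice() const { return 2; }           // #2
  int baseDevice() volatile { return 3; }        // #3
  int baseDevice() const volatile { return 4; }  // #4
};

template <class T>
struct TopClass : T {
  // BaseClass::baseDevice names a non-dependent class, so the call can be
  // checked when the template is defined; the object argument is still
  // '*this', whose cv-quals come from the enclosing member function.
  int from_plain() { return BaseClass::baseDevice(); }
  int from_const() const { return BaseClass::baseDevice(); }
  int from_volatile() volatile { return BaseClass::baseDevice(); }
  int from_const_volatile() const volatile { return BaseClass::baseDevice(); }
};

// Hypothetical helper: reports which overload a given member ends up calling.
inline int overload_chosen_from(char kind) {
  TopClass<BaseClass> t;
  switch (kind) {
    case 'c': return t.from_const();
    case 'v': return t.from_volatile();
    case 'x': return t.from_const_volatile();
    default:  return t.from_plain();
  }
}

// Self-check: the const, volatile and const volatile callers should reach
// overloads #2, #3 and #4 respectively.
inline void check_overloads() {
  assert(overload_chosen_from('c') == 2);
  assert(overload_chosen_from('v') == 3);
  assert(overload_chosen_from('x') == 4);
}
```

A compiler affected by the regression would report overload #1 for the qualified callers, since the dummy object used ahead of time is cv-unqualified.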

NB: r7-755 is also the cause of the related issue PR105742.  Not sure
if there's a fix that could resolve both PRs at once..

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk/12?

PR c++/105637

gcc/cp/ChangeLog:

* semantics.cc (finish_call_expr): Pass a cv-qualified object
type to maybe_dummy_object that is consistent with the
cv-qualifiers of 'this' if available.

gcc/testsuite/ChangeLog:

* g++.dg/template/non-dependent23.C: New test.
---
   gcc/cp/semantics.cc   | 15 ---
   .../g++.dg/template/non-dependent23.C | 25 +++
   2 files changed, 37 insertions(+), 3 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent23.C

diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index cd7a2818feb..1d9348c6cf1 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -2802,16 +2802,25 @@ finish_call_expr (tree fn, vec<tree, va_gc> **args, bool disallow_virtual,
[class.access.base] says that we need to convert 'this' to B* as
part of the access, so we pass 'B' to maybe_dummy_object.  */
   +  tree object_type = BINFO_TYPE (BASELINK_ACCESS_BINFO (fn));
 if (DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (get_first_fn (fn)))
{
  /* A constructor call always uses a dummy object.  (This constructor
 call which has the form A::A () is actually invalid and we are
 going to reject it later in build_new_method_call.)  */
- object = build_dummy_object (BINFO_TYPE (BASELINK_ACCESS_BINFO
(fn)));
+ object = build_dummy_object (object_type);
}
 else
-   object = maybe_dummy_object (BINFO_TYPE (BASELINK_ACCESS_BINFO (fn)),
-NULL);
+   {
+ if (current_class_ref)
+   {
+ /* Make sure that if maybe_dummy_object gives us a dummy object,
+it'll have the same cv-quals as '*this'.  */
+ int quals = cp_type_quals (TREE_TYPE (current_class_ref));
+ object_type = cp_build_qualified_type (object_type, quals);
+   }
+ object = maybe_dummy_object (object_type, NULL);
+   }
   result = build_new_method_call (object, fn, args, NULL_TREE,
  (disallow_virtual


Drat, this fix doesn't interact well with 'this'-capturing lambdas:

  struct BaseClass {
    void baseDevice();                // #1
    void baseDevice() const = delete; // #2
  };

  template<class T>
  struct TopClass : T {
    void failsToCompile() {
      [this] { BaseClass::baseDevice(); }();
    }
  };

  template struct TopClass<BaseClass>;

Here after the fix, we'd incorrectly select the const #2 overload at
template definition time because current_class_ref is the const 'this'
for the lambda rather than the non-const 'this' for TopClass..  I suppose
we need something like current_nonlambda_class_type for getting at the
innermost non-lambda 'this'?


Do you want maybe_resolve_dummy (ob, false)?


That sadly doesn't seem to work -- the object type is BaseClass which is
not necessarily a base of the dependent TopClass, so
resolvable_dummy_lambda returns NULL_TREE.  I guess it would work at
instantiation time though.


Ah, what seems to work well is directly using lambda_expr_this_capture
instead of maybe_resolve_dummy.  And we might as well handle this in
maybe_dummy_object for benefit of all callers.  How does the

[committed] json: fix escaping of '\'

2022-06-02 Thread David Malcolm via Gcc-patches

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-965-g4f9ad0b4b0a8c7.

gcc/ChangeLog:
* json.cc (string::print): Fix escaping of '\'.
---
 gcc/json.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/json.cc b/gcc/json.cc
index 3bf9b61a641..974f8c36825 100644
--- a/gcc/json.cc
+++ b/gcc/json.cc
@@ -207,7 +207,7 @@ string::print (pretty_printer *pp) const
  pp_string (pp, "\\\"");
  break;
case '\\':
- pp_string (pp, "\\n");
+ pp_string (pp, "\\\\");
  break;
case '\b':
  pp_string (pp, "\\b");
-- 
2.26.3
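
For readers who want to see the intended escaping in isolation, the following is a rough standalone sketch. It does not use GCC's pretty_printer or json::string; json_escape is a hypothetical helper written for this note, and it only covers a handful of characters. The point it shows is that a backslash in the input must come out as two backslashes, not as backslash followed by 'n'.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not part of GCC): wrap IN in double quotes and escape
// the characters that a JSON string literal cannot contain verbatim.
inline std::string json_escape(const std::string &in) {
  std::string out = "\"";
  for (char ch : in) {
    switch (ch) {
      case '"':  out += "\\\""; break;
      case '\\': out += "\\\\"; break;  // backslash -> backslash backslash
      case '\b': out += "\\b";  break;
      case '\n': out += "\\n";  break;  // only a real newline becomes \n
      default:   out += ch;     break;
    }
  }
  out += '"';
  return out;
}

// Self-check: a lone backslash is doubled and nothing else is added.
inline void check_json_escape() {
  assert(json_escape("\\") == "\"\\\\\"");
}
```

With this sketch, the six-character input C:\tmp is emitted as a nine-character literal: two quotes, the original six characters, and one extra backslash.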

Re: [PATCH] c++: ICE with template NEW_EXPR [PR105803]

2022-06-02 Thread Jason Merrill via Gcc-patches


On 6/2/22 10:03, Marek Polacek wrote:

On Thu, Jun 02, 2022 at 08:42:36AM -0400, Patrick Palka wrote:

On Wed, 1 Jun 2022, Marek Polacek via Gcc-patches wrote:


Here we ICE because value_dependent_expression_p gets a NEW_EXPR
whose operand is a type, and we go to the default case which just
calls v_d_e_p on each operand of the NEW_EXPR.  Since one of them
is a type, we crash on the new assert in t_d_e_p.


Looks like NEW_EXPR is considered to be not potentially constant
according to potential_constant_expression.  I thought we shouldn't
be calling value_dependent_expression_p on such exprs?


Except in C++20 new-expressions are potentially constant, so the patch 
is OK, and we should adjust pce1 accordingly.


I notice we currently fail to handle

struct A
{
  int i;
  constexpr A(int *p): i(*p) { delete p; }
};

constexpr int i = A(new int(42)).i;

though it does work inside a function.


You're correct.  This is non-obvious: instantiation_dependent_expression_p
calls p_c_e before v_d_e_p, but the expression is CAST_EXPR<[NEW_EXPR]>,
where the [] denotes a TREE_LIST, created in cp_parser_functional_cast.
This TREE_LIST has no type.  So p_c_e_1/CAST_EXPR goes to
9183   /* If this is a dependent type, it could end up being a class
9184  with conversions.  */
9185   if (type == NULL_TREE || WILDCARD_TYPE_P (type))
9186 return true;
and returns true.

So we call v_d_e_p, which looks at the CAST_EXPR's op and sees a TREE_LIST,
so it calls any_value_dependent_elements_p, and we end up with a NEW_EXPR.

An alternative/more proper fix would be to fix p_c_e_1/CAST_EXPR.  Maybe
by calling any_type_dependent_elements_p (which currently has no uses).
Thoughts?


--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/new13.C
@@ -0,0 +1,11 @@
+// PR c++/105803
+// { dg-do compile }
+// { dg-additional-options "-fchecking=2" }
+
+namespace std {
+template  class shared_ptr;
+}
+struct S {};
+template  void build_matrices() {
+  std::shared_ptr(new S);
+}


I think this testcase might be IFNDR since shared_ptr is incomplete
at the point of its non-dependent use.


Ah, overreduced.  I've made shared_ptr complete.

Marek

Re: [PATCH v2] RISC-V: bitmanip: improve constant-loading for (1ULL << 31) in DImode

2022-06-02 Thread Philipp Tomsich

OK for backport?

On Thu, 2 Jun 2022 at 21:23, Philipp Tomsich  wrote:
>
> Thanks, applied to trunk!
>
> On Thu, 2 Jun 2022 at 15:17, Kito Cheng  wrote:
> >
> > LGTM
> >
> > On Mon, May 30, 2022 at 5:52 AM Philipp Tomsich
> >  wrote:
> > >
> > > The SINGLE_BIT_MASK_OPERAND() is overly restrictive, triggering for
> > > bits above 31 only (to side-step any issues with the negative SImode
> > > value 0x80000000/(-1ull << 31)/(1 << 31)).  This moves the special
> > > handling of this SImode value (i.e. the check for (-1ull << 31)) to
> > > riscv.cc and relaxes the SINGLE_BIT_MASK_OPERAND() test.
> > >
> > > With this, the code-generation for loading (1ULL << 31) changes from:
> > > li      a0,1
> > > slli    a0,a0,31
> > > to:
> > > bseti   a0,zero,31
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/riscv/riscv.cc (riscv_build_integer_1): Rewrite value as
> > > (-1 << 31) for the single-bit case, when operating on (1 << 31)
> > > in SImode.
> > > * gcc/config/riscv/riscv.h (SINGLE_BIT_MASK_OPERAND): Allow for
> > > any single-bit value, moving the special case for (1 << 31) to
> > > riscv_build_integer_1 (in riscv.c).
> > >
> > > Signed-off-by: Philipp Tomsich 
> > >
> > > ---
> > >
> > > Changes in v2:
> > > - Use HOST_WIDE_INT_1U/HOST_WIDE_INT_M1U instead of constants.
> > > - Fix some typos in the comment above the rewrite of the value.
> > > - Update the comment to clarify that we expect a LUI to be emitted for
> > >   the SImode case (i.e. sign-extended for RV64) of (1 << 31).
> > >
> > >  gcc/config/riscv/riscv.cc |  9 +
> > >  gcc/config/riscv/riscv.h  | 11 ---
> > >  2 files changed, 13 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > > index f83dc796d88..2e83ca07394 100644
> > > --- a/gcc/config/riscv/riscv.cc
> > > +++ b/gcc/config/riscv/riscv.cc
> > > @@ -420,6 +420,15 @@ riscv_build_integer_1 (struct riscv_integer_op 
> > > codes[RISCV_MAX_INTEGER_OPS],
> > >/* Simply BSETI.  */
> > >codes[0].code = UNKNOWN;
> > >codes[0].value = value;
> > > +
> > > +  /* RISC-V sign-extends all 32bit values that live in a 32bit
> > > +register.  To avoid paradoxes, we thus need to use the
> > > +sign-extended (negative) representation (-1 << 31) for the
> > > +value, if we want to build (1 << 31) in SImode.  This will
> > > +then expand to an LUI instruction.  */
> > > +  if (mode == SImode && value == (HOST_WIDE_INT_1U << 31))
> > > +   codes[0].value = (HOST_WIDE_INT_M1U << 31);
> > > +
> > >return 1;
> > >  }
> > >
> > > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > > index 5083a1c24b0..6f7f4d3fbdc 100644
> > > --- a/gcc/config/riscv/riscv.h
> > > +++ b/gcc/config/riscv/riscv.h
> > > @@ -528,13 +528,10 @@ enum reg_class
> > >(((VALUE) | ((1UL<<31) - IMM_REACH)) == ((1UL<<31) - IMM_REACH)  \
> > > || ((VALUE) | ((1UL<<31) - IMM_REACH)) + IMM_REACH == 0)
> > >
> > > -/* If this is a single bit mask, then we can load it with bseti.  But this
> > > -   is not useful for any of the low 31 bits because we can use addi or lui
> > > -   to load them.  It is wrong for loading SImode 0x80000000 on rv64 because it
> > > -   needs to be sign-extended.  So we restrict this to the upper 32-bits
> > > -   only.  */
> > > -#define SINGLE_BIT_MASK_OPERAND(VALUE) \
> > > -  (pow2p_hwi (VALUE) && (ctz_hwi (VALUE) >= 32))
> > > +/* If this is a single bit mask, then we can load it with bseti.  Special
> > > +   handling of SImode 0x80000000 on RV64 is done in riscv_build_integer_1. */
> > > +#define SINGLE_BIT_MASK_OPERAND(VALUE) \
> > > +  (pow2p_hwi (VALUE))
> > >
> > >  /* Stack layout; function entry, exit and calling.  */
> > >
> > > --
> > > 2.34.1
> > >
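
The effect of the quoted change can be modelled outside GCC with plain 64-bit integers. The sketch below is only an approximation under stated assumptions: old_single_bit_mask, new_single_bit_mask and canonical_value are hypothetical names standing in for the old macro, the relaxed macro and the special case added to riscv_build_integer_1, and a bool replaces the machine mode.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for pow2p_hwi: true when exactly one bit is set.
constexpr bool new_single_bit_mask(uint64_t v) {
  return v != 0 && (v & (v - 1)) == 0;
}

// Model of the old, stricter test: a single bit at position 32 or above.
constexpr bool old_single_bit_mask(uint64_t v) {
  return new_single_bit_mask(v) && v >= (UINT64_C(1) << 32);
}

// Hypothetical model of the special case: a 32-bit value lives sign-extended
// in a 64-bit register, so (1 << 31) in a 32-bit mode is kept as (-1 << 31).
constexpr int64_t canonical_value(bool is_simode, uint64_t value) {
  if (is_simode && value == (UINT64_C(1) << 31))
    return static_cast<int64_t>(~UINT64_C(0) << 31);
  return static_cast<int64_t>(value);
}

// Self-check: bit 31 is rejected by the old test and accepted by the new one.
inline void check_single_bit() {
  assert(!old_single_bit_mask(UINT64_C(1) << 31));
  assert(new_single_bit_mask(UINT64_C(1) << 31));
}
```

In this model the 64-bit constant (1ULL << 31) passes the relaxed predicate and keeps its positive value, while the 32-bit case is rewritten to the negative, sign-extended form.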

Re: [PATCH v2] RISC-V: bitmanip: improve constant-loading for (1ULL << 31) in DImode

2022-06-02 Thread Philipp Tomsich

Thanks, applied to trunk!

On Thu, 2 Jun 2022 at 15:17, Kito Cheng  wrote:
>
> LGTM
>
> On Mon, May 30, 2022 at 5:52 AM Philipp Tomsich
>  wrote:
> >
> > The SINGLE_BIT_MASK_OPERAND() is overly restrictive, triggering for
> > bits above 31 only (to side-step any issues with the negative SImode
> > value 0x80000000/(-1ull << 31)/(1 << 31)).  This moves the special
> > handling of this SImode value (i.e. the check for (-1ull << 31)) to
> > riscv.cc and relaxes the SINGLE_BIT_MASK_OPERAND() test.
> >
> > With this, the code-generation for loading (1ULL << 31) changes from:
> > li      a0,1
> > slli    a0,a0,31
> > to:
> > bseti   a0,zero,31
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv.cc (riscv_build_integer_1): Rewrite value as
> > (-1 << 31) for the single-bit case, when operating on (1 << 31)
> > in SImode.
> > * gcc/config/riscv/riscv.h (SINGLE_BIT_MASK_OPERAND): Allow for
> > any single-bit value, moving the special case for (1 << 31) to
> > riscv_build_integer_1 (in riscv.c).
> >
> > Signed-off-by: Philipp Tomsich 
> >
> > ---
> >
> > Changes in v2:
> > - Use HOST_WIDE_INT_1U/HOST_WIDE_INT_M1U instead of constants.
> > - Fix some typos in the comment above the rewrite of the value.
> > - Update the comment to clarify that we expect a LUI to be emitted for
> >   the SImode case (i.e. sign-extended for RV64) of (1 << 31).
> >
> >  gcc/config/riscv/riscv.cc |  9 +
> >  gcc/config/riscv/riscv.h  | 11 ---
> >  2 files changed, 13 insertions(+), 7 deletions(-)
> >
> > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> > index f83dc796d88..2e83ca07394 100644
> > --- a/gcc/config/riscv/riscv.cc
> > +++ b/gcc/config/riscv/riscv.cc
> > @@ -420,6 +420,15 @@ riscv_build_integer_1 (struct riscv_integer_op 
> > codes[RISCV_MAX_INTEGER_OPS],
> >/* Simply BSETI.  */
> >codes[0].code = UNKNOWN;
> >codes[0].value = value;
> > +
> > +  /* RISC-V sign-extends all 32bit values that live in a 32bit
> > +register.  To avoid paradoxes, we thus need to use the
> > +sign-extended (negative) representation (-1 << 31) for the
> > +value, if we want to build (1 << 31) in SImode.  This will
> > +then expand to an LUI instruction.  */
> > +  if (mode == SImode && value == (HOST_WIDE_INT_1U << 31))
> > +   codes[0].value = (HOST_WIDE_INT_M1U << 31);
> > +
> >return 1;
> >  }
> >
> > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > index 5083a1c24b0..6f7f4d3fbdc 100644
> > --- a/gcc/config/riscv/riscv.h
> > +++ b/gcc/config/riscv/riscv.h
> > @@ -528,13 +528,10 @@ enum reg_class
> >(((VALUE) | ((1UL<<31) - IMM_REACH)) == ((1UL<<31) - IMM_REACH)  \
> > || ((VALUE) | ((1UL<<31) - IMM_REACH)) + IMM_REACH == 0)
> >
> > -/* If this is a single bit mask, then we can load it with bseti.  But this
> > -   is not useful for any of the low 31 bits because we can use addi or lui
> > -   to load them.  It is wrong for loading SImode 0x80000000 on rv64 because it
> > -   needs to be sign-extended.  So we restrict this to the upper 32-bits
> > -   only.  */
> > -#define SINGLE_BIT_MASK_OPERAND(VALUE) \
> > -  (pow2p_hwi (VALUE) && (ctz_hwi (VALUE) >= 32))
> > +/* If this is a single bit mask, then we can load it with bseti.  Special
> > +   handling of SImode 0x80000000 on RV64 is done in riscv_build_integer_1. */
> > +#define SINGLE_BIT_MASK_OPERAND(VALUE) \
> > +  (pow2p_hwi (VALUE))
> >
> >  /* Stack layout; function entry, exit and calling.  */
> >
> > --
> > 2.34.1
> >

[pushed] c++: more constexpr empty base [PR105795]

2022-06-02 Thread Jason Merrill via Gcc-patches

Following on from the previous patch, for trunk let's consistently set
ctx->ctor to NULL_TREE for empty subobjects.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/105795

gcc/cp/ChangeLog:

* constexpr.cc (init_subob_ctx): Clear ctx->ctor for empty subob.
(cxx_eval_store_expression): Likewise.
(cxx_eval_bare_aggregate): Handle null ctx->ctor.
---
 gcc/cp/constexpr.cc | 63 +
 1 file changed, 41 insertions(+), 22 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 021eaa34920..1346a1d4c10 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -4695,9 +4695,17 @@ init_subob_ctx (const constexpr_ctx *ctx, constexpr_ctx &new_ctx,
   else
new_ctx.object = build_ctor_subob_ref (index, type, ctx->object);
 }
-  tree elt = build_constructor (type, NULL);
-  CONSTRUCTOR_NO_CLEARING (elt) = true;
-  new_ctx.ctor = elt;
+
+  if (is_empty_class (type))
+/* Leave ctor null for an empty subobject, they aren't represented in the
+   result of evaluation.  */
+new_ctx.ctor = NULL_TREE;
+  else
+{
+  tree elt = build_constructor (type, NULL);
+  CONSTRUCTOR_NO_CLEARING (elt) = true;
+  new_ctx.ctor = elt;
+}
 
   if (TREE_CODE (value) == TARGET_EXPR)
 /* Avoid creating another CONSTRUCTOR when we expand the TARGET_EXPR.  */
@@ -4762,11 +4770,14 @@ cxx_eval_bare_aggregate (const constexpr_ctx *ctx, tree t,
   ctx = &new_ctx;
 };
   verify_ctor_sanity (ctx, type);
-  vec<constructor_elt, va_gc> **p = &CONSTRUCTOR_ELTS (ctx->ctor);
-  vec_alloc (*p, vec_safe_length (v));
-
-  if (CONSTRUCTOR_PLACEHOLDER_BOUNDARY (t))
-CONSTRUCTOR_PLACEHOLDER_BOUNDARY (ctx->ctor) = 1;
+  vec<constructor_elt, va_gc> **p = nullptr;
+  if (ctx->ctor)
+{
+  p = &CONSTRUCTOR_ELTS (ctx->ctor);
+  vec_alloc (*p, vec_safe_length (v));
+  if (CONSTRUCTOR_PLACEHOLDER_BOUNDARY (t))
+   CONSTRUCTOR_PLACEHOLDER_BOUNDARY (ctx->ctor) = 1;
+}
 
   unsigned i;
   tree index, value;
@@ -4814,17 +4825,19 @@ cxx_eval_bare_aggregate (const constexpr_ctx *ctx, tree 
t,
  inner->value = elt;
  changed = true;
}
+  else if (no_slot)
+   /* This is an initializer for an empty field; now that we've
+  checked that it's constant, we can ignore it.  */
+   changed = true;
   else if (index
   && (TREE_CODE (index) == NOP_EXPR
   || TREE_CODE (index) == POINTER_PLUS_EXPR))
{
- /* This is an initializer for an empty base; now that we've
-checked that it's constant, we can ignore it.  */
+ /* Old representation of empty bases.  FIXME remove.  */
+ gcc_checking_assert (false);
  gcc_assert (is_empty_class (TREE_TYPE (TREE_TYPE (index))));
  changed = true;
}
-  else if (no_slot)
-   changed = true;
   else
{
  if (TREE_CODE (type) == UNION_TYPE
@@ -4849,6 +4862,8 @@ cxx_eval_bare_aggregate (const constexpr_ctx *ctx, tree t,
   if (*non_constant_p || !changed)
 return t;
   t = ctx->ctor;
+  if (!t)
+t = build_constructor (type, NULL);
   /* We're done building this CONSTRUCTOR, so now we can interpret an
  element without an explicit initializer as value-initialized.  */
   CONSTRUCTOR_NO_CLEARING (t) = false;
@@ -5833,6 +5848,16 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
   valp = &cep->value;
 }
 
+  /* For initialization of an empty base, the original target will be
+ *(base*)this, evaluation of which resolves to the object
+ argument, which has the derived type rather than the base type.  */
+  if (!empty_base && !(same_type_ignoring_top_level_qualifiers_p
+  (initialized_type (init), type)))
+{
+  gcc_assert (is_empty_class (TREE_TYPE (target)));
+  empty_base = true;
+}
+
   /* Detect modifying a constant object in constexpr evaluation.
  We have found a const object that is being modified.  Figure out
  if we need to issue an error.  Consider
@@ -5901,7 +5926,7 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, tree 
t,
  *valp = build_constructor (type, NULL);
  CONSTRUCTOR_NO_CLEARING (*valp) = no_zero_init;
}
-  new_ctx.ctor = *valp;
+  new_ctx.ctor = empty_base ? NULL_TREE : *valp;
   new_ctx.object = target;
   /* Avoid temporary materialization when initializing from a TARGET_EXPR.
 We don't need to mess with AGGR_EXPR_SLOT/VEC_INIT_EXPR_SLOT because
@@ -5931,16 +5956,10 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
 
   gcc_checking_assert (!*valp || (same_type_ignoring_top_level_qualifiers_p
  (TREE_TYPE (*valp), type)));
-  if (empty_base || !(same_type_ignoring_top_level_qualifiers_p
- (initialized_type (init), type)))
+  if (empty_base)
 {
-  /* For initialization of an empty base, the original target will be
-   *(base*)this, evaluation of

[pushed] c++: constexpr empty aggr [PR105795]

2022-06-02 Thread Jason Merrill via Gcc-patches

In this testcase, leaving ctx->ctor pointing to the enclosing object meant
that evaluating the initializer for the subobject clobbered previous
initializers for the enclosing object.  So do update ctx->ctor, just don't
add it to the enclosing object ctor.

Tested x86_64-pc-linux-gnu, applying to trunk and 12.

PR c++/105795

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_bare_aggregate): Always call
init_subob_ctx.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/constexpr-aggr-base1.C: New test.
---
 gcc/cp/constexpr.cc   |  7 ++---
 .../g++.dg/cpp1z/constexpr-aggr-base1.C   | 27 +++
 2 files changed, 29 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/constexpr-aggr-base1.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 45208478c3f..021eaa34920 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -4777,12 +4777,9 @@ cxx_eval_bare_aggregate (const constexpr_ctx *ctx, tree 
t,
   tree orig_value = value;
   /* Like in cxx_eval_store_expression, omit entries for empty fields.  */
   bool no_slot = TREE_CODE (type) == RECORD_TYPE && is_empty_field (index);
-  if (no_slot)
-   new_ctx = *ctx;
-  else
-   init_subob_ctx (ctx, new_ctx, index, value);
+  init_subob_ctx (ctx, new_ctx, index, value);
   int pos_hint = -1;
-  if (new_ctx.ctor != ctx->ctor)
+  if (new_ctx.ctor != ctx->ctor && !no_slot)
{
  /* If we built a new CONSTRUCTOR, attach it now so that other
 initializers can refer to it.  */
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-aggr-base1.C 
b/gcc/testsuite/g++.dg/cpp1z/constexpr-aggr-base1.C
new file mode 100644
index 000..06acb4a9f78
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-aggr-base1.C
@@ -0,0 +1,27 @@
+// PR c++/105795
+// { dg-do compile { target c++17 } }
+
+struct empty
+{};
+
+template <typename T>
+struct tuple_holder
+{
+    [[no_unique_address]] T value;
+};
+
+struct tuple : tuple_holder<int>, tuple_holder<empty>
+{};
+
+constexpr auto make_tuple(int&& i, empty&& e)
+{
+    return tuple{i, e};
+}
+
+constexpr int foo()
+{
+    auto tuple = make_tuple(1, empty{});
+    return static_cast<tuple_holder<int>&>(tuple).value;
+}
+
+static_assert (foo() == 1);

base-commit: 37e4e7f77d8f7b7e911bf611a0f8edbc3a850c7a
-- 
2.27.0

Ping: [PATCH] Optimize vec_splats of constant V2DI/V2DF vec_extract, PR target/99293

2022-06-02 Thread Michael Meissner via Gcc-patches

Ping patch posted on May 13th:

| Date: Fri, 13 May 2022 10:49:26 -0400
| From: Michael Meissner 
| Subject: [PATCH] Optimize vec_splats of constant V2DI/V2DF vec_extract, PR 
target/99293
| Message-ID: 

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

Re: [PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-06-02 Thread Philipp Tomsich

Cherry picked from commit 16f7fcadac19dabd04a5abbe6601df52d22e9685
onto releases/gcc-12.

On Thu, 2 Jun 2022 at 10:49, Kito Cheng  wrote:
>
> OK to back port, thanks!
>
> On Thu, Jun 2, 2022 at 4:46 PM Philipp Tomsich  
> wrote:
> >
> > OK for backport?
> >
> > Thanks,
> > Phil.
> >
> > On Fri, 13 May 2022 at 22:23, Philipp Tomsich 
> > wrote:
> >
> > > Added the two nits from Kito's review and … Applied to trunk!
> > >
> > >
> > > On Fri, 13 May 2022 at 22:16, Philipp Tomsich 
> > > wrote:
> > > >
> > > > The Zbb support has introduced ctz and clz to the backend, but some
> > > > transformations in GCC need to know what the value of c[lt]z at zero
> > > > is. This affects how the optab is generated and may suppress use of
> > > > CLZ/CTZ in tree passes.
> > > >
> > > > Among other things, this is needed for the transformation of
> > > > table-based ctz-implementations, such as in deepsjeng, to work
> > > > (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838).
> > > >
> > > > > Prior to this change, the test case from PR90838 would compile,
> > > > > on RISC-V targets with Zbb, to:
> > > > >   myctz:
> > > > >     lui     a4,%hi(.LC0)
> > > > >     ld      a4,%lo(.LC0)(a4)
> > > > >     neg     a5,a0
> > > > >     and     a5,a5,a0
> > > > >     mul     a5,a5,a4
> > > > >     lui     a4,%hi(.LANCHOR0)
> > > > >     addi    a4,a4,%lo(.LANCHOR0)
> > > > >     srli    a5,a5,58
> > > > >     sh2add  a5,a5,a4
> > > > >     lw      a0,0(a5)
> > > > >     ret
> > > > >
> > > > > After this change, we get:
> > > > >   myctz:
> > > > >     ctz     a0,a0
> > > > >     andi    a0,a0,63
> > > > >     ret
> > > >
> > > > Testing this with deepsjeng_r (from SPEC 2017) against QEMU, this
> > > > shows a clear reduction in dynamic instruction count:
> > > >  - before  1961888067076
> > > >  - after   1907928279874 (2.75% reduction)
> > > >
> > > > This also merges the various target-specific test-cases (for x86-64,
> > > > aarch64 and riscv) within gcc.dg/pr90838.c.
> > > >
> > > > This extends the macros (i.e., effective-target keywords) used in
> > > > testing (lib/target-supports.exp) to reliably distinguish between RV32
> > > > and RV64 via __riscv_xlen (i.e., the integer register bitwidth) :
> > > > testing for ILP32 could be misleading (as ILP32 is a valid memory
> > > > model for 64bit systems).
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * config/riscv/riscv.h (CLZ_DEFINED_VALUE_AT_ZERO): Implement.
> > > > (CTZ_DEFINED_VALUE_AT_ZERO): Same.
> > > > * doc/sourcebuild.texi: add documentation for RISC-V specific
> > > > test target keywords
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.dg/pr90838.c: Add additional flags (dg-additional-options)
> > > >   when compiling for riscv64 and subsume
> > > gcc.target/aarch64/pr90838.c
> > > >   and gcc.target/i386/pr95863-2.c.
> > > > * gcc.target/riscv/zbb-ctz.c: New test.
> > > > * gcc.target/aarch64/pr90838.c: Removed.
> > > > * gcc.target/i386/pr95863-2.c: Removed.
> > > > * lib/target-supports.exp: Recognize RV32 or RV64 via XLEN
> > > >
> > > > Signed-off-by: Philipp Tomsich 
> > > > Signed-off-by: Manolis Tsamis 
> > > > Co-developed-by: Manolis Tsamis 
> > > >
> > > > ---
> > > > Changes in v3:
> > > > > - Address nit from Kito (use rv64 and rv32 on gcc.dg/pr90838.c
> > > > >   consistently).
> > > > >
> > > > > Changes in v2:
> > > > > - Address review comments from Palmer (merging testcases)
> > > > > - Merge the different target-specific testcases for CLZ into one
> > > > > - Add RV32 tests
> > > > > - Fix pr90838.c testcase for x86_64
> > > >
> > > >  gcc/config/riscv/riscv.h   |  5 ++
> > > >  gcc/doc/sourcebuild.texi   | 12 
> > > >  gcc/testsuite/gcc.dg/pr90838.c | 25 +
> > > >  gcc/testsuite/gcc.target/aarch64/pr90838.c | 64 --
> > > >  gcc/testsuite/gcc.target/i386/pr95863-2.c  | 27 -
> > > >  gcc/testsuite/lib/target-supports.exp  | 30 ++
> > > >  6 files changed, 72 insertions(+), 91 deletions(-)
> > > >  delete mode 100644 gcc/testsuite/gcc.target/aarch64/pr90838.c
> > > >  delete mode 100644 gcc/testsuite/gcc.target/i386/pr95863-2.c
> > > >
> > > > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > > > index 8a4d2cf7f85..b191606edb4 100644
> > > > --- a/gcc/config/riscv/riscv.h
> > > > +++ b/gcc/config/riscv/riscv.h
> > > > @@ -1004,4 +1004,9 @@ extern void
> > > riscv_remove_unneeded_save_restore_calls (void);
> > > >
> > > >  #define HARD_REGNO_RENAME_OK(FROM, TO) riscv_hard_regno_rename_ok
> > > (FROM, TO)
> > > >
> > > > +#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > > > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > > > +#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > > > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > > > +
> > > >  #endif /* ! GCC_RISCV_H */
> > > > diff --git a/gcc/doc/sourcebuild.texi
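
To make the "value at zero" point from the quoted description concrete, here is a small sketch that compares a table-based ctz with a ctz that is defined at zero. Several things here are assumptions for illustration: the function names, the choice of de Bruijn multiplier (a commonly used 64-bit one, not necessarily the constant in the PR90838 test case), and the lookup table being derived at run time instead of written out.

```cpp
#include <cassert>
#include <cstdint>

// Loop-based count-trailing-zeros that yields the operand width (64) for a
// zero input, which is the value the new macros advertise for 64-bit modes.
inline int ctz64_defined_at_zero(uint64_t x) {
  if (x == 0)
    return 64;
  int n = 0;
  while ((x & 1) == 0) {
    x >>= 1;
    ++n;
  }
  return n;
}

// Table-based ctz: isolate the lowest set bit, multiply by a de Bruijn
// constant and use the top six bits of the product as a table index.
inline int table_ctz(uint64_t b) {
  constexpr uint64_t magic = UINT64_C(0x03f79d71b4ca8b09);  // assumed constant
  static int table[64];
  static bool ready = false;
  if (!ready) {
    // Each single-bit input lands in a distinct slot of the table.
    for (int i = 0; i < 64; ++i)
      table[((UINT64_C(1) << i) * magic) >> 58] = i;
    ready = true;
  }
  uint64_t lsb = b & (~b + 1);  // same as b & -b
  return table[(lsb * magic) >> 58];
}

// Self-check: for a zero input the table yields entry 0, which matches the
// defined-at-zero result 64 masked with 63.
inline void check_ctz_at_zero() {
  assert(table_ctz(0) == (ctz64_defined_at_zero(0) & 63));
}
```

For a zero input the table lookup returns entry 0, which is 0, while the defined-at-zero ctz returns 64; masking with 63 makes the two agree, which is consistent with the andi in the quoted output.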

Re: [PATCH] c++: cv-quals of dummy obj for non-dep memfn call [PR105637]

2022-06-02 Thread Patrick Palka via Gcc-patches

On Fri, 27 May 2022, Patrick Palka wrote:

> On Thu, 26 May 2022, Patrick Palka wrote:
> 
> > On Thu, 26 May 2022, Jason Merrill wrote:
> > 
> > > On 5/26/22 14:57, Patrick Palka wrote:
> > > > On Thu, 26 May 2022, Patrick Palka wrote:
> > > > 
> > > > > > Here we expect the calls to BaseClass::baseDevice to resolve to the
> > > > > > second,
> > > > > third and fourth overloads respectively in light of the cv-qualifiers
> > > > > of 'this' in each case.  But ever since r12-6075-g2decd2cabe5a4f, the
> > > > > calls incorrectly resolve to the first overload at instantiation time.
> > > > > 
> > > > > This happens because the calls to BaseClass::baseDevice are all deemed
> > > > > non-dependent (ever since r7-755-g23cb72663051cd made us ignore the
> > > > > dependentness of 'this' when considering the dependence of a 
> > > > > non-static
> > > > > memfn call), hence we end up checking the call ahead of time, using as
> > > > > the object argument a dummy object of type BaseClass.  Since this 
> > > > > object
> > > > > argument is cv-unqualified, the calls incoherently resolve to the 
> > > > > first
> > > > > overload of baseDevice.  Before r12-6075, this incorrect result would
> > > > > just get silently discarded and we'd end up redoing OR at 
> > > > > instantiation
> > > > > time using 'this' as the object argument.  But after r12-6075, we now
> > > > > reuse this incorrect result at instantiation time.
> > > > > 
> > > > > This patch fixes this by making finish_call_expr request from
> > > > > maybe_dummy_object a cv-qualified object consistent with the cv-quals 
> > > > > of
> > > > > 'this'.  That way, ahead of time OR using a dummy object will give us
> > > > > the right answer and we could safely reuse it at instantiation time.
> > > > > 
> > > > > NB: r7-755 is also the cause of the related issue PR105742.  Not sure
> > > > > if there's a fix that could resolve both PRs at once..
> > > > > 
> > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> > > > > for trunk/12?
> > > > > 
> > > > >   PR c++/105637
> > > > > 
> > > > > gcc/cp/ChangeLog:
> > > > > 
> > > > >   * semantics.cc (finish_call_expr): Pass a cv-qualified object
> > > > >   type to maybe_dummy_object that is consistent with the
> > > > >   cv-qualifiers of 'this' if available.
> > > > > 
> > > > > gcc/testsuite/ChangeLog:
> > > > > 
> > > > >   * g++.dg/template/non-dependent23.C: New test.
> > > > > ---
> > > > >   gcc/cp/semantics.cc   | 15 ---
> > > > >   .../g++.dg/template/non-dependent23.C | 25 
> > > > > +++
> > > > >   2 files changed, 37 insertions(+), 3 deletions(-)
> > > > >   create mode 100644 gcc/testsuite/g++.dg/template/non-dependent23.C
> > > > > 
> > > > > diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
> > > > > index cd7a2818feb..1d9348c6cf1 100644
> > > > > --- a/gcc/cp/semantics.cc
> > > > > +++ b/gcc/cp/semantics.cc
> > > > > @@ -2802,16 +2802,25 @@ finish_call_expr (tree fn, vec
> > > > > **args, bool disallow_virtual,
> > > > >   [class.access.base] says that we need to convert 'this' to B* as
> > > > >   part of the access, so we pass 'B' to maybe_dummy_object.  */
> > > > >   +  tree object_type = BINFO_TYPE (BASELINK_ACCESS_BINFO (fn));
> > > > > if (DECL_MAYBE_IN_CHARGE_CONSTRUCTOR_P (get_first_fn (fn)))
> > > > >   {
> > > > > > /* A constructor call always uses a dummy object.  (This constructor
> > > > > >    call which has the form A::A () is actually invalid and we are
> > > > > >    going to reject it later in build_new_method_call.)  */
> > > > > > -   object = build_dummy_object (BINFO_TYPE (BASELINK_ACCESS_BINFO (fn)));
> > > > > +   object = build_dummy_object (object_type);
> > > > >   }
> > > > > else
> > > > > > - object = maybe_dummy_object (BINFO_TYPE (BASELINK_ACCESS_BINFO (fn)),
> > > > > > -  NULL);
> > > > > + {
> > > > > +   if (current_class_ref)
> > > > > + {
> > > > > > +   /* Make sure that if maybe_dummy_object gives us a dummy object,
> > > > > > +  it'll have the same cv-quals as '*this'.  */
> > > > > > +   int quals = cp_type_quals (TREE_TYPE (current_class_ref));
> > > > > > +   object_type = cp_build_qualified_type (object_type, quals);
> > > > > + }
> > > > > +   object = maybe_dummy_object (object_type, NULL);
> > > > > + }
> > > > >   result = build_new_method_call (object, fn, args, NULL_TREE,
> > > > > (disallow_virtual
> > > > 
> > > > Drat, this fix doesn't interact well with 'this'-capturing lambdas:
> > > > 
> > > >  struct BaseClass {
> > > >void baseDevice();// #1
> > > >void baseDevice() const = delete; // #2
> > > >  };
> > > > 
> > > >  template
> > > >

[PATCH][WIP] have configure probe prefix for gmp/mpfr/mpc [PR44425]

2022-06-02 Thread Eric Gallager via Gcc-patches

So, I'm working on fixing PR bootstrap/44425, and have this patch to
have the top-level configure script check in the value passed to
`--prefix=` when looking for gmp/mpfr/mpc. It "works" (in that
configuring with just `--prefix=` and none of
`--with-gmp=`/`--with-mpfr=`/`--with-mpc=` now works where it failed
before), but unfortunately it results in a bunch of duplicated
`-I`/`-L` flags stuck in ${gmplibs} and ${gmpinc}... is that
acceptable or should I try another approach?
Eric


patch-configure.diff
Description: Binary data

Re: [PATCH] c++: don't substitute TEMPLATE_PARM_CONSTRAINT [PR100374]

2022-06-02 Thread Patrick Palka via Gcc-patches

On Tue, 31 May 2022, Jason Merrill wrote:

> On 5/31/22 08:56, Patrick Palka wrote:
> > On Sun, 29 May 2022, Jason Merrill wrote:
> > 
> > > On 5/29/22 22:10, Jason Merrill wrote:
> > > > On 5/27/22 14:05, Patrick Palka wrote:
> > > > > This makes us avoid substituting into the TEMPLATE_PARM_CONSTRAINT of
> > > > > each template parameter except as necessary for (friend) declaration
> > > > > matching, like we already do for the overall
> > > > > TEMPLATE_PARMS_CONSTRAINTS of a template parameter list.
> > > > > 
> > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
> > > > > for trunk and perhaps 12.2?  Also tested on range-v3 and cmcstl2.
> > > > 
> > > > Are there already tests that cover the friend cases?
> > 
> > Yes, by cpp2a/concepts-friend{2,3,7}.C I think.
> > 
> > > 
> > > Also, don't you also need to handle specialization of partial
> > > instantiations?
> > 
> > Hmm, do you have an example?  IIUC we call tsubst_friend_function and
> > tsubst_friend_class only from instantiate_class_template_1, which always
> > uses the most general template and full template argument set to
> > instantiate any friend declarations.  So friend declarations are never
> > partially instantiated I think.  (And IIUC non-friends are irrelevant
> > here since we don't ever want to substitute their constraints outside of
> > satisfaction.)
> 
> From C++20 CA104:
> 
>   template  struct A {
> template  U f(U) requires C;
> template  U f(U) requires C;
>   };
> 
>   // Substitute int for T in above requirements to find match.
>   template <> template  U A::f(U) requires C  { }

Aha, thanks.  In this case of declaration matching, it looks like
determine_specialization ignores all but the trailing requirement
clause.  I think it's doable if a little messy to precisely handle
this case but in the meantime it seems we could get 90% of the way there
by considering the overall constraints instead of just the trailing
constraints?  Something like the following.

(Either way, IIUC the tsubst_template_parm change shouldn't affect this
case at all since determine_specialization uses comp_template_parms
instead of template_heads_equivalent_p and so it ignores
TEMPLATE_PARM_CONSTRAINT.)

-- >8 --

Subject: [PATCH] c++: don't substitute TEMPLATE_PARM_CONSTRAINT [PR100374]

This makes us avoid substituting into the TEMPLATE_PARM_CONSTRAINT of
each template parameter except as necessary for friend declaration
matching, like we already do for the overall associated constraints.

In passing this improves upon the CA104 implementation of explicit
specialization matching of a constrained function template inside
a class template, by considering the overall constraints instead of
just the trailing constraints.  This allows us to correctly handle the
first three explicit specializations in concepts-spec2.C below, but
because we compare the constraints as a whole, it means we incorrectly
accept the fourth explicit specialization.  For complete correctness,
we should be using tsubst_each_template_parm_constraint and
template_parameter_heads_equivalent_p in determine_specialization.

PR c++/100374

gcc/cp/ChangeLog:

* pt.cc (determine_specialization): Compare overall constraints,
not just the trailing constraints, in the CA104 case.
(tsubst_each_template_parm_constraint): Define.
(tsubst_friend_function): Use it.
(tsubst_friend_class): Use it.
(tsubst_template_parm): Don't substitute TEMPLATE_PARM_CONSTRAINT.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-template-parm11.C: New test.
---
 gcc/cp/pt.cc  | 41 +++
 gcc/testsuite/g++.dg/cpp2a/concepts-spec2.C   | 15 +++
 .../g++.dg/cpp2a/concepts-template-parm11.C   | 20 +
 3 files changed, 69 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-spec2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-template-parm11.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 45dd036c2d5..d867ce8e141 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -184,6 +184,7 @@ static int unify_pack_expansion (tree, tree, tree,
 tree, unification_kind_t, bool, bool);
 static tree copy_template_args (tree);
 static tree tsubst_template_parms (tree, tree, tsubst_flags_t);
+static void tsubst_each_template_parm_constraint (tree, tree, tsubst_flags_t);
 tree most_specialized_partial_spec (tree, tsubst_flags_t);
 static tree tsubst_aggr_type (tree, tree, tsubst_flags_t, tree, int);
 static tree tsubst_arg_types (tree, tree, tree, tsubst_flags_t, tree);
@@ -2323,8 +2324,8 @@ determine_specialization (tree template_id,
  if (!compparms (fn_arg_types, decl_arg_types))
continue;

- tree freq = get_trailing_function_requirements (fn);
- tree dreq = get_trailing_function_requirements (decl);
+ tree freq = get_constraints

[PATCH] configure: arrange to use appropriate objcopy

2022-06-02 Thread Jan Beulich via Gcc-patches

Using the system objcopy is wrong when other configure checks have
probed a different set of binutils (I've noticed the problem on a system
where the base objcopy can't deal with compressed debug sections).
Arrange for the matching one to be picked up, first and foremost if an
"in tree" one is available, by mirroring respective logic already
present for nm.

gcc/

* Makefile.in (ORIGINAL_OBJCOPY_FOR_TARGET): New.
* configure.ac: Check for objcopy, producing
ORIGINAL_OBJCOPY_FOR_TARGET.
* configure: Update accordingly.
* exec-tool.in (ORIGINAL_OBJCOPY_FOR_TARGET): New.
Handle objcopy.

--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -499,6 +499,7 @@ RANLIB_FOR_TARGET := $(shell \
 ORIGINAL_LD_FOR_TARGET = @ORIGINAL_LD_FOR_TARGET@
 ORIGINAL_NM_FOR_TARGET = @ORIGINAL_NM_FOR_TARGET@
 NM_FOR_TARGET = ./nm
+ORIGINAL_OBJCOPY_FOR_TARGET = @ORIGINAL_OBJCOPY_FOR_TARGET@
 STRIP_FOR_TARGET := $(shell \
   if [ -f $(objdir)/../binutils/strip-new ] ; then \
 echo $(objdir)/../binutils/strip-new ; \
--- a/gcc/configure
+++ b/gcc/configure
@@ -733,6 +733,8 @@ gcc_cv_readelf
 gcc_cv_objdump
 ORIGINAL_NM_FOR_TARGET
 gcc_cv_nm
+ORIGINAL_OBJCOPY_FOR_TARGET
+gcc_cv_objcopy
 ORIGINAL_LD_GOLD_FOR_TARGET
 ORIGINAL_LD_BFD_FOR_TARGET
 ORIGINAL_LD_FOR_TARGET
@@ -23436,6 +23438,83 @@ case "$ORIGINAL_NM_FOR_TARGET" in
  ;;
 esac
 
+# Figure out what objcopy we will be using.
+if ${gcc_cv_objcopy+:} false; then :
+
+else
+
+if test -f $gcc_cv_binutils_srcdir/configure.ac \
+ && test -f ../binutils/Makefile \
+ && test x$build = x$host; then
+   gcc_cv_objcopy=../binutils/objcopy$build_exeext
+elif test -x objcopy$build_exeext; then
+   gcc_cv_objcopy=./objcopy$build_exeext
+elif ( set dummy $OBJCOPY_FOR_TARGET; test -x $2 ); then
+gcc_cv_objcopy="$OBJCOPY_FOR_TARGET"
+else
+# Extract the first word of "$OBJCOPY_FOR_TARGET", so it can be a program name with args.
+set dummy $OBJCOPY_FOR_TARGET; ac_word=$2
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+$as_echo_n "checking for $ac_word... " >&6; }
+if ${ac_cv_path_gcc_cv_objcopy+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  case $gcc_cv_objcopy in
+  [\\/]* | ?:[\\/]*)
+  ac_cv_path_gcc_cv_objcopy="$gcc_cv_objcopy" # Let the user override the test with a path.
+  ;;
+  *)
+  as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+for as_dir in $PATH
+do
+  IFS=$as_save_IFS
+  test -z "$as_dir" && as_dir=.
+for ac_exec_ext in '' $ac_executable_extensions; do
+  if as_fn_executable_p "$as_dir/$ac_word$ac_exec_ext"; then
+ac_cv_path_gcc_cv_objcopy="$as_dir/$ac_word$ac_exec_ext"
+$as_echo "$as_me:${as_lineno-$LINENO}: found $as_dir/$ac_word$ac_exec_ext" >&5
+break 2
+  fi
+done
+  done
+IFS=$as_save_IFS
+
+  ;;
+esac
+fi
+gcc_cv_objcopy=$ac_cv_path_gcc_cv_objcopy
+if test -n "$gcc_cv_objcopy"; then
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_objcopy" >&5
+$as_echo "$gcc_cv_objcopy" >&6; }
+else
+  { $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
+$as_echo "no" >&6; }
+fi
+
+
+fi
+fi
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking what objcopy to use" >&5
+$as_echo_n "checking what objcopy to use... " >&6; }
+if test "$gcc_cv_objcopy" = ../binutils/objcopy$build_exeext; then
+   # Single tree build which includes binutils.
+   { $as_echo "$as_me:${as_lineno-$LINENO}: result: newly built objcopy" >&5
+$as_echo "newly built objcopy" >&6; }
+   in_tree_objcopy=yes
+else
+   { $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_objcopy" >&5
+$as_echo "$gcc_cv_objcopy" >&6; }
+   in_tree_objcopy=no
+fi
+
+ORIGINAL_OBJCOPY_FOR_TARGET=$gcc_cv_objcopy
+
+case "$ORIGINAL_OBJCOPY_FOR_TARGET" in
+  ./objcopy | ./objcopy$build_exeext) ;;
+  *) ac_config_files="$ac_config_files objcopy:exec-tool.in"
+ ;;
+esac
 
 # Figure out what objdump we will be using.
 if ${gcc_cv_objdump+:} false; then :
@@ -33176,6 +33255,7 @@ do
 "as") CONFIG_FILES="$CONFIG_FILES as:exec-tool.in" ;;
 "collect-ld") CONFIG_FILES="$CONFIG_FILES collect-ld:exec-tool.in" ;;
 "nm") CONFIG_FILES="$CONFIG_FILES nm:exec-tool.in" ;;
+"objcopy") CONFIG_FILES="$CONFIG_FILES objcopy:exec-tool.in" ;;
 "dsymutil") CONFIG_FILES="$CONFIG_FILES dsymutil:exec-tool.in" ;;
 "clearcap.map") CONFIG_LINKS="$CONFIG_LINKS 
clearcap.map:${srcdir}/config/$clearcap_map" ;;
 "$all_outputs") CONFIG_FILES="$CONFIG_FILES $all_outputs" ;;
@@ -33811,6 +33891,7 @@ $as_echo "$as_me: executing $ac_file com
 "as":F) chmod +x as ;;
 "collect-ld":F) chmod +x collect-ld ;;
 "nm":F) chmod +x nm ;;
+"objcopy":F) chmod +x objcopy ;;
 "dsymutil":F) chmod +x dsymutil ;;
 "default":C)
 case ${CONFIG_HEADERS} in
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -2815,6 +2815,36 @@ case "$ORIGINAL_NM_FOR_TARGET" in
   *) AC_CONFIG_FILES(nm:exec-tool.in, [chmod +x nm]) ;;
 esac
 
+# Figure out what objcopy we will be using.
+AS_VAR_SET_IF(gcc_cv_objcopy,, [
+if

[PATCH] x86-64: make "length_vex" also account for VEX.B use by register operand

2022-06-02 Thread Jan Beulich via Gcc-patches

The length attribute ought to be "the (bounding maximum) length of an
instruction" according to the comment next to its definition. A register
operand encoded using the ModR/M.rm field will additionally use VEX.B
for encoding the highest bit of the register number. Hence for the high
8 GPR registers as well as the [xy]mm{8..15} ones 3-byte VEX encoding
may be needed. Since it isn't known to the function calculating the
length which register goes where in the insn encoding, be conservative
and assume a 3-byte VEX prefix whenever any such register operand is
present and there's no memory operand.
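
To make the decision order described above easier to follow, here is a small standalone C model of it. This is only a sketch: `struct operand`, `enum operand_kind` and `vex_length_model` are hypothetical names invented for illustration, not GCC's RTL or recog_data interfaces, and the model assumes a 64-bit target. The return values keep the `2 + 1` / `3 + 1` sums used by the function being patched.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-in for one recognized operand.  */
enum operand_kind { OPERAND_REG, OPERAND_MEM, OPERAND_OTHER };

struct operand
{
  enum operand_kind kind;
  bool is_di_general_reg;  /* REG only: DImode general register.  */
  bool mentions_ext_reg;   /* REG: the register is r8..r15 or [xy]mm8..15;
                              MEM: the address mentions such a register.  */
};

/* Model of the conservative length choice (64-bit target assumed).  */
static int
vex_length_model (const struct operand *ops, int nops,
                  bool has_0f_opcode, bool has_vex_w)
{
  int reg_only = 2 + 1;
  bool has_mem = false;

  /* Only 0f opcodes can use the 2-byte prefix; VEX.W needs 3 bytes.  */
  if (!has_0f_opcode || has_vex_w)
    return 3 + 1;

  for (int i = nops - 1; i >= 0; --i)
    if (ops[i].kind == OPERAND_REG)
      {
        /* A 64-bit general register operand needs the 3-byte form.  */
        if (ops[i].is_di_general_reg)
          return 3 + 1;
        /* The register might land in ModR/M.rm and then need VEX.B.  */
        if (ops[i].mentions_ext_reg)
          reg_only = 3 + 1;
      }
    else if (ops[i].kind == OPERAND_MEM)
      {
        /* Extended base or index register: 3-byte prefix.  */
        if (ops[i].mentions_ext_reg)
          return 3 + 1;
        has_mem = true;
      }

  /* With a memory operand, ModR/M.rm is taken by the memory operand,
     so the pessimistic register-only answer is not needed.  */
  return has_mem ? 2 + 1 : reg_only;
}
```

In this model, two low registers give 3, a register-only insn mentioning an extended register gives 4, and the same extended register next to a plain memory operand gives 3 again.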

gcc/

* config/i386/i386.cc (ix86_attr_length_vex_default): Take REX.B
into account for reg-only insns.

--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -16820,7 +16820,8 @@ int
 ix86_attr_length_vex_default (rtx_insn *insn, bool has_0f_opcode,
  bool has_vex_w)
 {
-  int i;
+  int i, reg_only = 2 + 1;
+  bool has_mem = false;
 
   /* Only 0f opcode can use 2 byte VEX prefix and  VEX W bit uses 3
  byte VEX prefix.  */
@@ -16840,16 +16841,23 @@ ix86_attr_length_vex_default (rtx_insn *
if (GET_MODE (recog_data.operand[i]) == DImode
&& GENERAL_REG_P (recog_data.operand[i]))
  return 3 + 1;
+
+   /* REX.B bit requires 3-byte VEX. Right here we don't know which
+  operand will be encoded using VEX.B, so be conservative.  */
+   if (REX_INT_REGNO_P (recog_data.operand[i])
+   || REX_SSE_REGNO_P (recog_data.operand[i]))
+ reg_only = 3 + 1;
   }
-else
+else if (MEM_P (recog_data.operand[i]))
   {
/* REX.X or REX.B bits use 3 byte VEX prefix.  */
-   if (MEM_P (recog_data.operand[i])
-   && x86_extended_reg_mentioned_p (recog_data.operand[i]))
+   if (x86_extended_reg_mentioned_p (recog_data.operand[i]))
  return 3 + 1;
+
+   has_mem = true;
   }
 
-  return 2 + 1;
+  return has_mem ? 2 + 1 : reg_only;
 }

[PATCH] x86: harmonize __builtin_ia32_psadbw*() types

2022-06-02 Thread Jan Beulich via Gcc-patches

The 64-bit, 128-bit, and 512-bit variants have VDI return type, in
line with instruction behavior. Make the 256-bit builtin match, thus
also making it match the insn it expands to (using VI8_AVX2_AVX512BW).

gcc/

* config/i386/i386-builtin.def (__builtin_ia32_psadbw256):
Change type.
* config/i386/i386-builtin-types.def: New function type
(V4DI, V32QI, V32QI).
* config/i386/i386-expand.cc (ix86_expand_args_builtin): Handle
V4DI_FTYPE_V32QI_V32QI.

--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -1217,7 +1217,7 @@ BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_mulv8si3, "__builtin_ia32_pmulld256"  , IX86_BUILTIN_PMULLD256  , UNKNOWN, (int) V8SI_FTYPE_V8SI_V8SI)
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_vec_widen_umult_even_v8si, "__builtin_ia32_pmuludq256", IX86_BUILTIN_PMULUDQ256, UNKNOWN, (int) V4DI_FTYPE_V8SI_V8SI)
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_iorv4di3, "__builtin_ia32_por256", IX86_BUILTIN_POR256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI)
-BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_avx2_psadbw, "__builtin_ia32_psadbw256", IX86_BUILTIN_PSADBW256, UNKNOWN, (int) V16HI_FTYPE_V32QI_V32QI)
+BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_avx2_psadbw, "__builtin_ia32_psadbw256", IX86_BUILTIN_PSADBW256, UNKNOWN, (int) V4DI_FTYPE_V32QI_V32QI)
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_avx2_pshufbv32qi3, "__builtin_ia32_pshufb256", IX86_BUILTIN_PSHUFB256, UNKNOWN, (int) V32QI_FTYPE_V32QI_V32QI)
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_avx2_pshufdv3, "__builtin_ia32_pshufd256", IX86_BUILTIN_PSHUFD256, UNKNOWN, (int) V8SI_FTYPE_V8SI_INT)
 BDESC (OPTION_MASK_ISA_AVX2, 0, CODE_FOR_avx2_pshufhwv3, "__builtin_ia32_pshufhw256", IX86_BUILTIN_PSHUFHW256, UNKNOWN, (int) V16HI_FTYPE_V16HI_INT)
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -516,6 +516,7 @@ DEF_FUNCTION_TYPE (V8DI, V8DI, V2DI, INT
 DEF_FUNCTION_TYPE (V8DI, V8DI, V2DI, INT, V8DI, UQI)
 DEF_FUNCTION_TYPE (V8DI, V8DI, V4DI, INT, V8DI, UQI)
 DEF_FUNCTION_TYPE (V4DI, V8SI, V8SI)
+DEF_FUNCTION_TYPE (V4DI, V32QI, V32QI)
 DEF_FUNCTION_TYPE (V8DI, V64QI, V64QI)
 DEF_FUNCTION_TYPE (V4DI, V4DI, V2DI)
 DEF_FUNCTION_TYPE (V4DI, PCV4DI, V4DI)
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -10359,6 +10359,7 @@ ix86_expand_args_builtin (const struct b
 case V8SI_FTYPE_V16HI_V16HI:
 case V4DI_FTYPE_V4DI_V4DI:
 case V4DI_FTYPE_V8SI_V8SI:
+case V4DI_FTYPE_V32QI_V32QI:
 case V8DI_FTYPE_V64QI_V64QI:
   if (comparison == UNKNOWN)
return ix86_expand_binop_builtin (icode, exp, target);

[PATCH v2] x86: {,v}psadbw have commutative source operands

2022-06-02 Thread Jan Beulich via Gcc-patches

As was also noticed for gas (binutils-gdb commit c8cad9d389b7), the
"absolute difference" aspect of the insns makes their source operands
commutative.
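
For readers who want to see why the operands commute, here is a scalar C model of a single 64-bit lane: eight unsigned byte differences, taken in absolute value and summed. It is only an illustrative sketch; `sad8` is a hypothetical helper name and not part of GCC or of any intrinsics header.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one psadbw lane: sum of absolute differences of
   eight unsigned bytes.  The maximum is 8 * 255 = 2040, which fits
   comfortably in 16 bits.  */
static uint16_t
sad8 (const uint8_t a[8], const uint8_t b[8])
{
  uint16_t sum = 0;
  for (int i = 0; i < 8; ++i)
    /* |a - b| equals |b - a|, which is all the commutativity needs.  */
    sum += a[i] > b[i] ? a[i] - b[i] : b[i] - a[i];
  return sum;
}
```

Because every term is symmetric in its two inputs, sad8 (a, b) and sad8 (b, a) agree for any inputs, which is the property the "%" constraint modifier exposes to the register allocator.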

gcc/

* config/i386/mmx.md (mmx_psadbw): Convert to expander.
(*mmx_psadbw): New. Mark as commutative.
* config/i386/sse.md (_psadbw): Convert to expander.
(*_psadbw): New. Mark as commutative.
---
v2: Introduce expanders.

--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -4405,13 +4405,21 @@
(set_attr "type" "sseiadd")
(set_attr "mode" "TI")])
 
-(define_insn "mmx_psadbw"
+(define_expand "mmx_psadbw"
+  [(set (match_operand:V1DI 0 "register_operand")
+   (unspec:V1DI [(match_operand:V8QI 1 "register_mmxmem_operand")
+ (match_operand:V8QI 2 "register_mmxmem_operand")]
+UNSPEC_PSADBW))]
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)"
+  "ix86_fixup_binary_operands_no_copy (PLUS, V8QImode, operands);")
+
+(define_insn "*mmx_psadbw"
   [(set (match_operand:V1DI 0 "register_operand" "=y,x,Yw")
-(unspec:V1DI [(match_operand:V8QI 1 "register_operand" "0,0,Yw")
+   (unspec:V1DI [(match_operand:V8QI 1 "register_mmxmem_operand" "%0,0,Yw")
  (match_operand:V8QI 2 "register_mmxmem_operand" "ym,x,Yw")]
 UNSPEC_PSADBW))]
-  "(TARGET_MMX || TARGET_MMX_WITH_SSE)
-   && (TARGET_SSE || TARGET_3DNOW_A)"
+  "(TARGET_MMX || TARGET_MMX_WITH_SSE) && (TARGET_SSE || TARGET_3DNOW_A)
+   && ix86_binary_operator_ok (PLUS, V8QImode, operands)"
   "@
psadbw\t{%2, %0|%0, %2}
psadbw\t{%2, %0|%0, %2}
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -19981,13 +19981,22 @@
 
 ;; The correct representation for this is absolutely enormous, and
 ;; surely not generally useful.
-(define_insn "_psadbw"
+(define_expand "_psadbw"
+  [(set (match_operand:VI8_AVX2_AVX512BW 0 "register_operand")
+   (unspec:VI8_AVX2_AVX512BW
+ [(match_operand: 1 "vector_operand")
+  (match_operand: 2 "vector_operand")]
+ UNSPEC_PSADBW))]
+  "TARGET_SSE2"
+  "ix86_fixup_binary_operands_no_copy (PLUS, mode, operands);")
+
+(define_insn "*_psadbw"
   [(set (match_operand:VI8_AVX2_AVX512BW 0 "register_operand" "=x,YW")
(unspec:VI8_AVX2_AVX512BW
- [(match_operand: 1 "register_operand" "0,YW")
+ [(match_operand: 1 "vector_operand" "%0,YW")
   (match_operand: 2 "vector_operand" "xBm,YWm")]
  UNSPEC_PSADBW))]
-  "TARGET_SSE2"
+  "TARGET_SSE2 && ix86_binary_operator_ok (PLUS, mode, operands)"
   "@
psadbw\t{%2, %0|%0, %2}
vpsadbw\t{%2, %1, %0|%0, %1, %2}"

Re: [PATCH] c++: ICE with template NEW_EXPR [PR105803]

2022-06-02 Thread Marek Polacek via Gcc-patches

On Thu, Jun 02, 2022 at 08:42:36AM -0400, Patrick Palka wrote:
> On Wed, 1 Jun 2022, Marek Polacek via Gcc-patches wrote:
> 
> > Here we ICE because value_dependent_expression_p gets a NEW_EXPR
> > whose operand is a type, and we go to the default case which just
> > calls v_d_e_p on each operand of the NEW_EXPR.  Since one of them
> > is a type, we crash on the new assert in t_d_e_p.
> 
> Looks like NEW_EXPR is considered to be not potentially constant
> according to potential_constant_expression.  I thought we shouldn't
> be calling value_dependent_expression_p on such exprs?

You're correct.  This is non-obvious: instantiation_dependent_expression_p
calls p_c_e before v_d_e_p, but the expression is CAST_EXPR<[NEW_EXPR]>,
where the [] denotes a TREE_LIST, created in cp_parser_functional_cast.
This TREE_LIST has no type.  So p_c_e_1/CAST_EXPR goes to 
      /* If this is a dependent type, it could end up being a class
         with conversions.  */
      if (type == NULL_TREE || WILDCARD_TYPE_P (type))
        return true;
and returns true.

So we call v_d_e_p, which looks at the CAST_EXPR's op and sees a TREE_LIST,
so it calls any_value_dependent_elements_p, and we end up with a NEW_EXPR.

An alternative/more proper fix would be to fix p_c_e_1/CAST_EXPR.  Maybe
by calling any_type_dependent_elements_p (which currently has no uses).
Thoughts?

> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/template/new13.C
> > @@ -0,0 +1,11 @@
> > +// PR c++/105803
> > +// { dg-do compile }
> > +// { dg-additional-options "-fchecking=2" }
> > +
> > +namespace std {
> > +template  class shared_ptr;
> > +}
> > +struct S {};
> > +template  void build_matrices() {
> > +  std::shared_ptr(new S);
> > +}
> 
> I think this testcase might be IFNDR since shared_ptr is incomplete
> at the point of its non-dependent use.

Ah, overreduced.  I've made shared_ptr complete.

Marek

Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (stype)0) [PR105777]

2022-06-02 Thread Richard Biener via Gcc-patches

On Thu, 2 Jun 2022, Jakub Jelinek wrote:

> Hi!
> 
> The following patch is an incremental change to the PR30314 enhancement,
> this one handles signed types.
> For signed types (but still, the same for 1st and result element type
> and non-zero constant that fits into that type), we actually need to
> watch for overflow in direction to positive and negative infinity
> and it also depends on whether the cst operand is positive or negative.
> For __builtin_mul_overflow_p (x, cst, (stype) 0):
> For cst > 0, we can simplify it to:
> x > INT_MAX / cst || x < INT_MIN / cst
> aka:
> (unsigned) x - (unsigned) (INT_MIN / cst) > (unsigned) (INT_MAX / cst) - (unsigned) (INT_MIN / cst)
> and for cst < 0 to:
> x < INT_MAX / cst || x > INT_MIN / cst
> aka:
> (unsigned) x - (unsigned) (INT_MAX / cst) > (unsigned) (INT_MIN / cst) - (unsigned) (INT_MAX / cst)
> 
> Additionally, I've added executable testcases, so we don't just check for
> the optimization to be performed, but also that it is correct (done that
> even for the other PR's testcase).
> 
> Starting x86_64-linux and i686-linux bootstrap/regtest, ok for trunk if
> it passes them?

OK.

Thanks,
Richard.

> 2022-06-02  Jakub Jelinek  
> 
>   PR middle-end/30314
>   PR middle-end/105777
>   * match.pd (__builtin_mul_overflow_p (x, cst, (stype) 0) ->
>   x > stype_max / cst || x < stype_min / cst): New simplification.
> 
>   * gcc.dg/tree-ssa/pr30314.c: Add noipa attribute to all functions.
>   * gcc.dg/tree-ssa/pr105777.c: New test.
>   * gcc.c-torture/execute/pr30314.c: New test.
>   * gcc.c-torture/execute/pr105777.c: New test.
> 
> --- gcc/match.pd.jj   2022-06-01 17:54:30.536372912 +0200
> +++ gcc/match.pd  2022-06-02 13:16:17.171415948 +0200
> @@ -5970,15 +5970,37 @@ (define_operator_list SYNC_FETCH_AND_AND
> (ovf @1 @0
>  
>  /* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
> -   are unsigned to x > (umax / cst).  */
> +   are unsigned to x > (umax / cst).  Similarly for signed type, but
> +   in that case it needs to be outside of a range.  */
>  (simplify
>   (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))
>(if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> -   && TYPE_UNSIGNED (TREE_TYPE (@0))
> && TYPE_MAX_VALUE (TREE_TYPE (@0))
> && types_match (TREE_TYPE (@0), TREE_TYPE (TREE_TYPE (@2)))
> && int_fits_type_p (@1, TREE_TYPE (@0)))
> -   (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)
> +   (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
> +(convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)))
> +(if (TYPE_MIN_VALUE (TREE_TYPE (@0)))
> + (with
> +  {
> + tree lo = int_const_binop (TRUNC_DIV_EXPR,
> +TYPE_MIN_VALUE (TREE_TYPE (@0)),
> +fold_convert (TREE_TYPE (@0), @1));
> + tree hi = int_const_binop (TRUNC_DIV_EXPR,
> +TYPE_MAX_VALUE (TREE_TYPE (@0)),
> +fold_convert (TREE_TYPE (@0), @1));
> + tree etype = range_check_type (TREE_TYPE (@0));
> + if (etype)
> +   {
> + if (wi::neg_p (wi::to_wide (@1)))
> +   std::swap (lo, hi);
> + lo = fold_convert (etype, lo);
> + hi = fold_convert (etype, hi);
> + hi = int_const_binop (MINUS_EXPR, hi, lo);
> +   }
> +  }
> +  (if (etype)
> +   (convert (gt (minus (convert:etype @0) { lo; }) { hi; }
>  
>  /* Simplification of math builtins.  These rules must all be optimizations
> as well as IL simplifications.  If there is a possibility that the new
> --- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj  2022-06-02 11:17:23.689835550 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c   2022-06-02 14:23:19.294093445 +0200
> @@ -7,25 +7,25 @@
>  /* { dg-final { scan-tree-dump " > 102261126" "optimized" { target int32 } } } */
>  /* { dg-final { scan-tree-dump " > 439208192231179800" "optimized" { target lp64 } } } */
>  
> -int
> +__attribute__((noipa)) int
>  foo (unsigned int x)
>  {
>return __builtin_mul_overflow_p (x, 35U, 0U);
>  }
>  
> -int
> +__attribute__((noipa)) int
>  bar (unsigned long int x)
>  {
>return __builtin_mul_overflow_p (x, 35UL, 0UL);
>  }
>  
> -int
> +__attribute__((noipa)) int
>  baz (unsigned int x)
>  {
>return __builtin_mul_overflow_p (42, x, 0U);
>  }
>  
> -int
> +__attribute__((noipa)) int
>  qux (unsigned long int x)
>  {
>return __builtin_mul_overflow_p (42, x, 0UL);
> --- gcc/testsuite/gcc.dg/tree-ssa/pr105777.c.jj   2022-06-02 14:22:57.017328731 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr105777.c  2022-06-02 14:19:29.399521503 +0200
> @@ -0,0 +1,68 @@
> +/* PR middle-end/105777 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */
> +/* { dg-final { scan-tree-dump " \\+ 61356675;" "optimized" { target int32

Re: [PATCH v2] RISC-V: bitmanip: improve constant-loading for (1ULL << 31) in DImode

2022-06-02 Thread Kito Cheng via Gcc-patches

LGTM

On Mon, May 30, 2022 at 5:52 AM Philipp Tomsich
 wrote:
>
> The SINGLE_BIT_MASK_OPERAND() is overly restrictive, triggering for
> bits above 31 only (to side-step any issues with the negative SImode
> value 0x80000000/(-1ull << 31)/(1 << 31)).  This moves the special
> handling of this SImode value (i.e. the check for (-1ull << 31)) to
> riscv.cc and relaxes the SINGLE_BIT_MASK_OPERAND() test.
>
> With this, the code generation for loading (1ULL << 31) changes from:
> li      a0,1
> slli    a0,a0,31
> to:
> bseti   a0,zero,31
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_build_integer_1): Rewrite value as
> (-1 << 31) for the single-bit case, when operating on (1 << 31)
> in SImode.
> * gcc/config/riscv/riscv.h (SINGLE_BIT_MASK_OPERAND): Allow for
> any single-bit value, moving the special case for (1 << 31) to
> riscv_build_integer_1 (in riscv.c).
>
> Signed-off-by: Philipp Tomsich 
>
> ---
>
> Changes in v2:
> - Use HOST_WIDE_INT_1U/HOST_WIDE_INT_M1U instead of constants.
> - Fix some typos in the comment above the rewrite of the value.
> - Update the comment to clarify that we expect a LUI to be emitted for
>   the SImode case (i.e. sign-extended for RV64) of (1 << 31).
>
>  gcc/config/riscv/riscv.cc |  9 +
>  gcc/config/riscv/riscv.h  | 11 ---
>  2 files changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index f83dc796d88..2e83ca07394 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -420,6 +420,15 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
>/* Simply BSETI.  */
>codes[0].code = UNKNOWN;
>codes[0].value = value;
> +
> +  /* RISC-V sign-extends all 32bit values that live in a 32bit
> +register.  To avoid paradoxes, we thus need to use the
> +sign-extended (negative) representation (-1 << 31) for the
> +value, if we want to build (1 << 31) in SImode.  This will
> +then expand to an LUI instruction.  */
> +  if (mode == SImode && value == (HOST_WIDE_INT_1U << 31))
> +   codes[0].value = (HOST_WIDE_INT_M1U << 31);
> +
>return 1;
>  }
>
> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index 5083a1c24b0..6f7f4d3fbdc 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -528,13 +528,10 @@ enum reg_class
>(((VALUE) | ((1UL<<31) - IMM_REACH)) == ((1UL<<31) - IMM_REACH)  \
> || ((VALUE) | ((1UL<<31) - IMM_REACH)) + IMM_REACH == 0)
>
> -/* If this is a single bit mask, then we can load it with bseti.  But this
> -   is not useful for any of the low 31 bits because we can use addi or lui
> -   to load them.  It is wrong for loading SImode 0x80000000 on rv64 because it
> -   needs to be sign-extended.  So we restrict this to the upper 32-bits
> -   only.  */
> -#define SINGLE_BIT_MASK_OPERAND(VALUE) \
> -  (pow2p_hwi (VALUE) && (ctz_hwi (VALUE) >= 32))
> +/* If this is a single bit mask, then we can load it with bseti.  Special
> +   handling of SImode 0x80000000 on RV64 is done in riscv_build_integer_1. */
> +#define SINGLE_BIT_MASK_OPERAND(VALUE) \
> +  (pow2p_hwi (VALUE))
>
>  /* Stack layout; function entry, exit and calling.  */
>
> --
> 2.34.1
>

[PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (stype)0) [PR105777]

2022-06-02 Thread Jakub Jelinek via Gcc-patches

Hi!

The following patch is an incremental change to the PR30314 enhancement,
this one handles signed types.
For signed types (but still with the same type for the 1st operand and the
result element, and a non-zero constant that fits into that type), we
actually need to watch for overflow in the direction of both positive and
negative infinity, and the check also depends on whether the cst operand is
positive or negative.
For __builtin_mul_overflow_p (x, cst, (stype) 0):
For cst > 0, we can simplify it to:
x > INT_MAX / cst || x < INT_MIN / cst
aka:
(unsigned) x - (unsigned) (INT_MIN / cst) > (unsigned) (INT_MAX / cst) - (unsigned) (INT_MIN / cst)
and for cst < 0 to:
x < INT_MAX / cst || x > INT_MIN / cst
aka:
(unsigned) x - (unsigned) (INT_MAX / cst) > (unsigned) (INT_MIN / cst) - (unsigned) (INT_MAX / cst)
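
The range-check form can be tried out in plain C. The sketch below is not the match.pd code; `ovf_builtin`, `ovf_range` and `count_mismatches` are hypothetical helper names, the sample values assume a 32-bit int, and cst == -1 is deliberately left out because INT_MIN / -1 is undefined in C. As in the patch, the lower bound is subtracted from x in unsigned arithmetic, so for cst == 35 the test becomes x + 61356675 > 122713350.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Reference answer straight from the GCC builtin.  */
static bool
ovf_builtin (int x, int cst)
{
  return __builtin_mul_overflow_p (x, cst, (int) 0);
}

/* Range-check form.  Assumes cst != 0 and cst != -1.  */
static bool
ovf_range (int x, int cst)
{
  int lo = INT_MIN / cst;
  int hi = INT_MAX / cst;
  if (cst < 0)
    {
      /* Dividing by a negative constant swaps the two bounds.  */
      int tmp = lo;
      lo = hi;
      hi = tmp;
    }
  /* x lies outside [lo, hi] iff x - lo > hi - lo in unsigned arithmetic.  */
  return (unsigned) x - (unsigned) lo > (unsigned) hi - (unsigned) lo;
}

/* Count disagreements between the two forms over a few boundary values.  */
static int
count_mismatches (int cst)
{
  static const int samples[] = {
    INT_MIN, INT_MIN + 1, -1073741824, -61356676, -61356675, -1, 0, 1,
    61356675, 61356676, 1073741823, 1073741824, INT_MAX - 1, INT_MAX
  };
  int bad = 0;
  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
    if (ovf_builtin (samples[i], cst) != ovf_range (samples[i], cst))
      ++bad;
  return bad;
}
```

Calling count_mismatches with constants such as 35, -35, 2 or -2 compares the two forms on values right at the overflow boundaries; it is a spot check, not a proof.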

Additionally, I've added executable testcases, so we don't just check for
the optimization to be performed, but also that it is correct (done that
even for the other PR's testcase).

Starting x86_64-linux and i686-linux bootstrap/regtest, ok for trunk if
it passes them?

2022-06-02  Jakub Jelinek  

PR middle-end/30314
PR middle-end/105777
* match.pd (__builtin_mul_overflow_p (x, cst, (stype) 0) ->
x > stype_max / cst || x < stype_min / cst): New simplification.

* gcc.dg/tree-ssa/pr30314.c: Add noipa attribute to all functions.
* gcc.dg/tree-ssa/pr105777.c: New test.
* gcc.c-torture/execute/pr30314.c: New test.
* gcc.c-torture/execute/pr105777.c: New test.

--- gcc/match.pd.jj 2022-06-01 17:54:30.536372912 +0200
+++ gcc/match.pd    2022-06-02 13:16:17.171415948 +0200
@@ -5970,15 +5970,37 @@ (define_operator_list SYNC_FETCH_AND_AND
(ovf @1 @0
 
 /* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
-   are unsigned to x > (umax / cst).  */
+   are unsigned to x > (umax / cst).  Similarly for signed type, but
+   in that case it needs to be outside of a range.  */
 (simplify
  (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))
   (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
-   && TYPE_UNSIGNED (TREE_TYPE (@0))
&& TYPE_MAX_VALUE (TREE_TYPE (@0))
&& types_match (TREE_TYPE (@0), TREE_TYPE (TREE_TYPE (@2)))
&& int_fits_type_p (@1, TREE_TYPE (@0)))
-   (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)
+   (if (TYPE_UNSIGNED (TREE_TYPE (@0)))
+(convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)))
+(if (TYPE_MIN_VALUE (TREE_TYPE (@0)))
+ (with
+  {
+   tree lo = int_const_binop (TRUNC_DIV_EXPR,
+  TYPE_MIN_VALUE (TREE_TYPE (@0)),
+  fold_convert (TREE_TYPE (@0), @1));
+   tree hi = int_const_binop (TRUNC_DIV_EXPR,
+  TYPE_MAX_VALUE (TREE_TYPE (@0)),
+  fold_convert (TREE_TYPE (@0), @1));
+   tree etype = range_check_type (TREE_TYPE (@0));
+   if (etype)
+ {
+   if (wi::neg_p (wi::to_wide (@1)))
+ std::swap (lo, hi);
+   lo = fold_convert (etype, lo);
+   hi = fold_convert (etype, hi);
+   hi = int_const_binop (MINUS_EXPR, hi, lo);
+ }
+  }
+  (if (etype)
+   (convert (gt (minus (convert:etype @0) { lo; }) { hi; }
 
 /* Simplification of math builtins.  These rules must all be optimizations
as well as IL simplifications.  If there is a possibility that the new
--- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj  2022-06-02 11:17:23.689835550 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c 2022-06-02 14:23:19.294093445 +0200
@@ -7,25 +7,25 @@
 /* { dg-final { scan-tree-dump " > 102261126" "optimized" { target int32 } } } */
 /* { dg-final { scan-tree-dump " > 439208192231179800" "optimized" { target lp64 } } } */
 
-int
+__attribute__((noipa)) int
 foo (unsigned int x)
 {
   return __builtin_mul_overflow_p (x, 35U, 0U);
 }
 
-int
+__attribute__((noipa)) int
 bar (unsigned long int x)
 {
   return __builtin_mul_overflow_p (x, 35UL, 0UL);
 }
 
-int
+__attribute__((noipa)) int
 baz (unsigned int x)
 {
   return __builtin_mul_overflow_p (42, x, 0U);
 }
 
-int
+__attribute__((noipa)) int
 qux (unsigned long int x)
 {
   return __builtin_mul_overflow_p (42, x, 0UL);
--- gcc/testsuite/gcc.dg/tree-ssa/pr105777.c.jj 2022-06-02 14:22:57.017328731 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr105777.c 2022-06-02 14:19:29.399521503 +0200
@@ -0,0 +1,68 @@
+/* PR middle-end/105777 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */
+/* { dg-final { scan-tree-dump " \\+ 61356675;" "optimized" { target int32 } } } */
+/* { dg-final { scan-tree-dump " > 122713350" "optimized" { target int32 } } } */
+/* { dg-final { scan-tree-dump " \\+ 263524915338707880" "optimized" { target lp64 } } } */
+/* { dg-final { scan-tree-dump " > 527049830677415760" "optimized" { target lp64 } } }

Re: [PATCH] c++: ICE with template NEW_EXPR [PR105803]

2022-06-02 Thread Patrick Palka via Gcc-patches

On Wed, 1 Jun 2022, Marek Polacek via Gcc-patches wrote:

> Here we ICE because value_dependent_expression_p gets a NEW_EXPR
> whose operand is a type, and we go to the default case which just
> calls v_d_e_p on each operand of the NEW_EXPR.  Since one of them
> is a type, we crash on the new assert in t_d_e_p.

Looks like NEW_EXPR is considered to be not potentially constant
according to potential_constant_expression.  I thought we shouldn't
be calling value_dependent_expression_p on such exprs?

> 
> t_d_e_p has code to handle {,VEC_}NEW_EXPR, which at this point
> was already performed, so I think we can handle these two codes
> specifically and skip the second operand, which is always going
> to be a type.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
>   PR c++/105803
> 
> gcc/cp/ChangeLog:
> 
>   * pt.cc (value_dependent_expression_p): Handle {,VEC_}NEW_EXPR
>   in the switch.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/template/new13.C: New test.
> ---
>  gcc/cp/pt.cc  |  8 
>  gcc/testsuite/g++.dg/template/new13.C | 11 +++
>  2 files changed, 19 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/template/new13.C
> 
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 6de8e496859..836861e1039 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -27643,6 +27643,14 @@ value_dependent_expression_p (tree expression)
>under instantiate_non_dependent_expr; it can't be constant.  */
>return true;
>  
> +case NEW_EXPR:
> +case VEC_NEW_EXPR:
> +  /* The second operand is a type, which type_dependent_expression_p
> +  (and therefore value_dependent_expression_p) doesn't want to see.  */
> +  return (value_dependent_expression_p (TREE_OPERAND (expression, 0))
> +   || value_dependent_expression_p (TREE_OPERAND (expression, 2))
> +   || value_dependent_expression_p (TREE_OPERAND (expression, 3)));
> +
>  default:
>/* A constant expression is value-dependent if any subexpression is
>value-dependent.  */
> diff --git a/gcc/testsuite/g++.dg/template/new13.C 
> b/gcc/testsuite/g++.dg/template/new13.C
> new file mode 100644
> index 000..3168374b26d
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/template/new13.C
> @@ -0,0 +1,11 @@
> +// PR c++/105803
> +// { dg-do compile }
> +// { dg-additional-options "-fchecking=2" }
> +
> +namespace std {
> +template  class shared_ptr;
> +}
> +struct S {};
> +template  void build_matrices() {
> +  std::shared_ptr(new S);
> +}

I think this testcase might be IFNDR since shared_ptr is incomplete
at the point of its non-dependent use.

> 
> base-commit: 2d546ff69455f7deadab65309de89d19380a8864
> -- 
> 2.36.1
> 
>

Re: [PATCH] Fold truncations of left shifts in match.pd

2022-06-02 Thread Richard Biener via Gcc-patches

On Thu, Jun 2, 2022 at 12:55 PM Roger Sayle  wrote:
>
>
> Hi Richard,
> > +  /* RTL expansion knows how to expand rotates using shift/or.  */
> > +  if (icode == CODE_FOR_nothing
> > +      && (code == LROTATE_EXPR || code == RROTATE_EXPR)
> > +      && optab_handler (ior_optab, vec_mode) != CODE_FOR_nothing
> > +      && optab_handler (ashl_optab, vec_mode) != CODE_FOR_nothing)
> > +    icode = (int) optab_handler (lshr_optab, vec_mode);
> >
> > but we then get the vector costing wrong.
>
> The issue is that we currently get the (relative) vector costing wrong.
> Currently for gcc.dg/vect/pr98674.c, the vectorizer thinks the scalar
> code requires two shifts and an ior, so believes it's profitable to vectorize
> this loop using two vector shifts and a vector ior.  But once match.pd
> simplifies the truncate and recognizes the HImode rotate we end up with:
>
> pr98674.c:6:16: note:   ==> examining statement: _6 = _1 r>> 8;
> pr98674.c:6:16: note:   vect_is_simple_use: vectype vector(8) short int
> pr98674.c:6:16: note:   vect_is_simple_use: operand 8, type of def: constant
> pr98674.c:6:16: missed:   op not supported by target.
> pr98674.c:8:33: missed:   not vectorized: relevant stmt not supported: _6 = _1 r>> 8;
> pr98674.c:6:16: missed:  bad operation or unsupported loop bound.
>
>
> Clearly, it's a win to vectorize HImode rotates, when the backend can perform
> 8 (or 16) rotations at a time, but using 3 vector instructions, even when a
> scalar rotate can be performed in a single instruction.  Fundamentally,
> vectorization may still be desirable/profitable even when the backend doesn't
> provide an optab.

Yes, as said, it's tree-vect-patterns.cc's job to handle this rotate, which
is not natively supported, by rewriting it.  Can you check why
vect_recog_rotate_pattern does
not do this?  Ah, the code only handles !TYPE_UNSIGNED (type) - not sure
why though (for rotates it should not matter and for the lowered sequence
we can convert to desired signedness to get arithmetic/logical shifts)?
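
As an illustration of the lowered sequence discussed above, here is a
minimal C sketch of a 16-bit rotate written as two shifts and an ior,
with the signed case handled by converting to unsigned first.  This is
not the pr98674.c testcase and not vectorizer code; the function names
are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 16-bit value right by n bits (0 < n < 16) using two logical
   shifts and an inclusive or.  */
static uint16_t
rotr16 (uint16_t x, unsigned n)
{
  assert (n > 0 && n < 16);
  return (uint16_t) ((x >> n) | (x << (16 - n)));
}

/* Same operation on a signed element type: go through the unsigned type
   so the right shift is logical rather than arithmetic, then convert
   back (the conversion back wraps on GCC).  */
static int16_t
rotr16_signed (int16_t x, unsigned n)
{
  return (int16_t) rotr16 ((uint16_t) x, n);
}

/* A loop of the kind a vectorizer could handle once the rotate is
   expressed as shift, shift, ior.  */
static void
rotr16_by_8 (uint16_t *dst, const uint16_t *src, size_t len)
{
  for (size_t i = 0; i < len; i++)
    dst[i] = rotr16 (src[i], 8);
}
```

With the rotate spelled out this way, the shifts and the ior are
individually visible, which is what allows each of them to be costed.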

> The current situation where the i386's backend provides expanders to lower
> rotations (or vcond) into individual instruction sequences, also interferes
> with vector costing.   It's the vector cost function that needs to be fixed,
> not the generated code made worse (or the backend bloated performing its own
> RTL expansion workarounds).
>
> Is it instead ok to mark pr98674.c as XFAIL (a regression)?
> The tweak to tree-vect-stmts.cc was based on the assumption that we wished
> to continue vectorizing this loop.  Improving scalar code generation really
> shouldn't disable vectorization like this.

Yes, see above where the fix needs to be.  The pattern will then expose
the shift and ior to the vectorizer which then are properly costed.

Richard.

>
>
> Cheers,
> Roger
> --
>
>

Re: [PATCH/RFC] cprop_hardreg... Third time's a charm.

2022-06-02 Thread Richard Biener via Gcc-patches

On Thu, Jun 2, 2022 at 12:20 PM Roger Sayle  wrote:
>
>
>
> This middle-end patch proposes the "hard register constant propagation"
> pass be performed up to three times on each basic block (up from the
> current two times) if the second pass successfully made changes.
>
> The motivation for three passes is to handle the "swap idiom" (i.e.
> "t = x; x = y; y = t;" sequences) that get generated by register allocation
> (reload).
>
> Consider the x86_64 test case for __int128 addition recently discussed
> on gcc-patches.  With that proposed patch, the input to the cprop_hardreg
> pass looks like:
>
>         movq    %rdi, %r8
>         movq    %rsi, %rdi
>         movq    %r8, %rsi
>         movq    %rdx, %rax
>         movq    %rcx, %rdx
>         addq    %rsi, %rax
>         adcq    %rdi, %rdx
>         ret
>
> where the first three instructions effectively swap %rsi and %rdi.
>
> On the first pass of cprop_hardreg, we notice that the third insn,
> %rsi := %r8, is redundant and can be eliminated/propagated to produce:
>
>         movq    %rdi, %r8
>         movq    %rsi, %rdi
>         movq    %rdx, %rax
>         movq    %rcx, %rdx
>         addq    %r8, %rax
>         adcq    %rdi, %rdx
>         ret
>
> Because a successful propagation was found, cprop_hardreg then runs
> a second pass/sweep on affected basic blocks (using worklist), and
> on this second pass notices that the second instruction, %rdi := %rsi,
> may now be propagated (%rsi was killed before the first transform),
> and after a second pass, we now end up with:
>
>         movq    %rdi, %r8
>         movq    %rdx, %rax
>         movq    %rcx, %rdx
>         addq    %r8, %rax
>         adcq    %rsi, %rdx
>         ret
>
> which is the current behaviour on mainline.  However, a third and final
> pass would now notice that the first insn, "%r8 := %rdi" is also now
> eliminable, and a third iteration would produce optimal code:
>
>         movq    %rdx, %rax
>         movq    %rcx, %rdx
>         addq    %rdi, %rax
>         adcq    %rsi, %rdx
>         ret
>
> The patch below creates an additional worklist, third_pass, that is
> populated with the basic block id's of blocks that were updated during
> the second pass.  Does the motivation for three passes (reload doesn't
> generate more or less than three instructions to swap a pair of registers)
> seem reasonable for all targets?  If cprop_hardreg is considered an
> expensive pass, this change could be gated based on basic block count
> or similar.  Finally, I should point out that this is a regression fix;
> GCC 4.8 generated optimal code with two moves (whereas GCC 12 required
> 5 moves, up from GCC 11's 4 moves).
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32} with
> no new failures.  Thoughts?  Ok for mainline?

Can you instead refactor the code to have

auto_vec worklist1, worklist2;
vec *worklist = 

and alternate between worklist1 and worklist2 N times (const int N = 2
is OK for now I guess, OTOH for -O1 we might want to stick to 1).
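
To make the suggested shape concrete, here is a toy sketch of alternating
between two worklists for a bounded number of extra sweeps.  It is not
regcprop.cc code: struct toy_block, toy_process_block and toy_iterate are
hypothetical stand-ins, and the "pending" counter merely simulates how
many more sweeps would still change a block.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16

/* Stand-in for a basic block.  pending is how many more sweeps would
   still find something to propagate; visits counts sweeps performed.  */
struct toy_block
{
  int pending;
  int visits;
};

/* Stand-in for cprop_hardreg_bb: returns true if the block changed.  */
static bool
toy_process_block (struct toy_block *bb)
{
  bb->visits++;
  if (bb->pending == 0)
    return false;
  bb->pending--;
  return true;
}

/* Sweep every block once, then re-sweep only the blocks that changed,
   at most max_extra more times, alternating between two worklists.
   Returns the number of extra sweeps that were run.  */
static int
toy_iterate (struct toy_block *blocks, int n_blocks, int max_extra)
{
  int worklist1[MAX_BLOCKS], worklist2[MAX_BLOCKS];
  int *curr = worklist1, *next = worklist2;
  int n_curr = 0;

  assert (n_blocks <= MAX_BLOCKS);

  /* Initial sweep over all blocks; remember the ones that changed.  */
  for (int i = 0; i < n_blocks; i++)
    if (toy_process_block (&blocks[i]))
      curr[n_curr++] = i;

  int extra = 0;
  while (n_curr > 0 && extra < max_extra)
    {
      int n_next = 0;
      for (int k = 0; k < n_curr; k++)
        if (toy_process_block (&blocks[curr[k]]))
          next[n_next++] = curr[k];

      /* Swap the roles of the two worklists for the next round.  */
      int *tmp = curr;
      curr = next;
      next = tmp;
      n_curr = n_next;
      extra++;
    }
  return extra;
}
```

In this model a block with pending == 3 plays the role of the swap idiom:
it is fully cleaned up with max_extra == 2 (three sweeps in total), while
max_extra == 1 leaves one propagation undone.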

It might be tempting to maintain the number of times we visited a block
and just directly iterate on changed BBs.  There's also the comment
in cprop_hardreg_bb:

  /* If a block has a single predecessor, that we've already
 processed, begin with the value data that was live at
 the end of the predecessor block.  */
  /* ??? Ought to use more intelligent queuing of blocks.  */

which suggests we want to improve on its dataflow (like iterating
in RPO order so we can intersect the live value data of predecessors?)
That said, calling df_analyze () between iterations is what makes
iterating expensive I guess, as well as cprop_hardreg_debug.

Richard.

>
> 2022-06-02  Roger Sayle
>
> gcc/ChangeLog
>         * regcprop.cc (pass_cprop_hardreg::execute): Perform a third
>         iteration over each basic block that was updated by the second
>         iteration.
>
> Thanks in advance,
> Roger
> --
>

RE: [PATCH] Fold truncations of left shifts in match.pd

2022-06-02 Thread Roger Sayle



Hi Richard,
> +  /* RTL expansion knows how to expand rotates using shift/or.  */
> +  if (icode == CODE_FOR_nothing
> +      && (code == LROTATE_EXPR || code == RROTATE_EXPR)
> +      && optab_handler (ior_optab, vec_mode) != CODE_FOR_nothing
> +      && optab_handler (ashl_optab, vec_mode) != CODE_FOR_nothing)
> +    icode = (int) optab_handler (lshr_optab, vec_mode);
> 
> but we then get the vector costing wrong. 

The issue is that we currently get the (relative) vector costing wrong.
Currently for gcc.dg/vect/pr98674.c, the vectorizer thinks the scalar
code requires two shifts and an ior, so believes it's profitable to vectorize
this loop using two vector shifts and a vector ior.  But once match.pd
simplifies the truncate and recognizes the HImode rotate we end up with:

pr98674.c:6:16: note:   ==> examining statement: _6 = _1 r>> 8;
pr98674.c:6:16: note:   vect_is_simple_use: vectype vector(8) short int
pr98674.c:6:16: note:   vect_is_simple_use: operand 8, type of def: constant
pr98674.c:6:16: missed:   op not supported by target.
pr98674.c:8:33: missed:   not vectorized: relevant stmt not supported: _6 = _1 r>> 8;
pr98674.c:6:16: missed:  bad operation or unsupported loop bound.


Clearly, it's a win to vectorize HImode rotates, when the backend can perform
8 (or 16) rotations at a time, but using 3 vector instructions, even when a
scalar rotate can be performed in a single instruction.  Fundamentally,
vectorization may still be desirable/profitable even when the backend doesn't
provide an optab.

The current situation where the i386's backend provides expanders to lower
rotations (or vcond) into individual instruction sequences, also interferes with
vector costing.   It's the vector cost function that needs to be fixed, not the
generated code made worse (or the backend bloated performing its own
RTL expansion workarounds).

Is it instead ok to mark pr98674.c as XFAIL (a regression)?
The tweak to tree-vect-stmts.cc was based on the assumption that we wished
to continue vectorizing this loop.  Improving scalar code generation really
shouldn't disable vectorization like this.


Cheers,
Roger
--

[PATCH/RFC] cprop_hardreg... Third time's a charm.

2022-06-02 Thread Roger Sayle

 

This middle-end patch proposes the "hard register constant propagation"
pass be performed up to three times on each basic block (up from the
current two times) if the second pass successfully made changes.

The motivation for three passes is to handle the "swap idiom" (i.e.
"t = x; x = y; y = t;" sequences) that get generated by register allocation
(reload).

Consider the x86_64 test case for __int128 addition recently discussed
on gcc-patches.  With that proposed patch, the input to the cprop_hardreg
pass looks like:

        movq    %rdi, %r8
        movq    %rsi, %rdi
        movq    %r8, %rsi
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %rsi, %rax
        adcq    %rdi, %rdx
        ret

where the first three instructions effectively swap %rsi and %rdi.

On the first pass of cprop_hardreg, we notice that the third insn,
%rsi := %r8, is redundant and can be eliminated/propagated to produce:

        movq    %rdi, %r8
        movq    %rsi, %rdi
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %r8, %rax
        adcq    %rdi, %rdx
        ret

Because a successful propagation was found, cprop_hardreg then runs
a second pass/sweep on affected basic blocks (using worklist), and
on this second pass notices that the second instruction, %rdi := %rsi,
may now be propagated (%rsi was killed before the first transform),
and after a second pass, we now end up with:

        movq    %rdi, %r8
        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %r8, %rax
        adcq    %rsi, %rdx
        ret

which is the current behaviour on mainline.  However, a third and final
pass would now notice that the first insn, "%r8 := %rdi" is also now
eliminable, and a third iteration would produce optimal code:

        movq    %rdx, %rax
        movq    %rcx, %rdx
        addq    %rdi, %rax
        adcq    %rsi, %rdx
        ret

The patch below creates an additional worklist, third_pass, that is
populated with the basic block id's of blocks that were updated during
the second pass.  Does the motivation for three passes (reload doesn't
generate more or less than three instructions to swap a pair of registers)
seem reasonable for all targets?  If cprop_hardreg is considered an
expensive pass, this change could be gated based on basic block count
or similar.  Finally, I should point out that this is a regression fix;
GCC 4.8 generated optimal code with two moves (whereas GCC 12 required
5 moves, up from GCC 11's 4 moves).

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32} with
no new failures.  Thoughts?  Ok for mainline?

2022-06-02  Roger Sayle

gcc/ChangeLog
        * regcprop.cc (pass_cprop_hardreg::execute): Perform a third
        iteration over each basic block that was updated by the second
        iteration.

Thanks in advance,
Roger
--

 

diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc
index 1fdc367..c4a 100644
--- a/gcc/regcprop.cc
+++ b/gcc/regcprop.cc
@@ -1384,6 +1384,7 @@ pass_cprop_hardreg::execute (function *fun)
   bitmap_clear (visited);
 
   auto_vec<int> worklist;
+  auto_vec<int> third_pass;
   bool any_debug_changes = false;
 
   /* We need accurate notes.  Earlier passes such as if-conversion may
@@ -1425,7 +1426,27 @@ pass_cprop_hardreg::execute (function *fun)
   for (int index : worklist)
{
  bb = BASIC_BLOCK_FOR_FN (fun, index);
- cprop_hardreg_bb (bb, all_vd, visited);
+ /* Perform a third pass, if the second pass changed anything.
+Three passes are required for swaps: t = x; x = y; y = t.  */
+  if (cprop_hardreg_bb (bb, all_vd, visited))
+   third_pass.safe_push (bb->index);
+ if (all_vd[bb->index].n_debug_insn_changes)
+   any_debug_changes = true;
+   }
+
+  df_analyze ();
+  if (MAY_HAVE_DEBUG_BIND_INSNS && any_debug_changes)
+   cprop_hardreg_debug (fun, all_vd);
+}
+
+  if (!third_pass.is_empty ())
+{
+  any_debug_changes = false;
+  bitmap_clear (visited);
+  for (int index : third_pass)
+   {
+ bb = BASIC_BLOCK_FOR_FN (fun, index);
+  cprop_hardreg_bb (bb, all_vd, visited);
  if (all_vd[bb->index].n_debug_insn_changes)
any_debug_changes = true;
}

Re: [PATCH] Introduce -finstrument-functions-once

2022-06-02 Thread Richard Biener via Gcc-patches

On Tue, May 24, 2022 at 12:49 PM Eric Botcazou via Gcc-patches
 wrote:
>
> Hi,
>
> some time ago we were requested to implement a -finstrument-functions-once
> switch in the compiler, with the semantics that the profiling functions be
> called only once per instrumented function.  The goal was to make it possible
> to use it in (large) production binaries to do function-level coverage, so the
> overhead must be minimum and, in particular, there is no protection against
> data races so the "once" moniker is imprecise.

So that also applies to

"... and the second profiling function is called before the exit
+corresponding to this first entry"

specifically "corresponding to this first entry"?   As if the second
entry exits first will that call the second profiling function or will
it really be the thread that called the first profiling function
(what happens when that thread terminates before calling the second
profiling function? (***)).  Consider re-wording this slightly.

+  /* If -finstrument-functions-once is specified, generate:
+
+  static volatile bool F.0 = true;
+  bool tmp_first;

is there any good reason to make F.0 volatile?  That doesn't prevent
races.  Any reason to make F.0 initialized to true rather than false
(bss init?)

(***) looking at the implementation the second profiling function
can end up being never called when the thread calling the first
profiling function does not exit the function.  So I wonder if
the "optimization"(?) not re-reading F.0 makes sense (it also
requires to keep the value of F.0 live across the whole function)
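
To make the concern easier to follow, here is a hand-written C
approximation of the instrumentation shape being discussed, based only on
the fragment quoted above (a static flag initialised to true, read once
into tmp_first).  The hook names and the function body are invented, and
the real hooks take arguments that are omitted here; the sketch is
single-threaded and says nothing about races.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical profiling hooks; here they only count their calls.  */
static int enter_calls, exit_calls;

static void
toy_profile_enter (void)
{
  enter_calls++;
}

static void
toy_profile_exit (void)
{
  exit_calls++;
}

/* Approximation of a function instrumented "once".  */
static int
toy_instrumented (int x)
{
  static volatile bool first = true;  /* plays the role of F.0 */
  bool tmp_first = first;             /* read once, live until the exit */

  if (tmp_first)
    {
      first = false;
      toy_profile_enter ();
    }

  int result = x + 1;                 /* the function body proper */

  /* The exit hook is keyed on the saved value, not on a re-read of the
     flag, so only the invocation that saw "first" calls it.  */
  if (tmp_first)
    toy_profile_exit ();
  return result;
}
```

In this shape, if the invocation that saw the flag set never reaches the
exit, the exit hook is never called by anyone, which is the (***) case.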

Otherwise looks OK to me.

Richard.

> Tested on x86-64/Linux, OK for the mainline?
>
>
> 2022-05-24  Eric Botcazou  
>
> * common.opt (finstrument-functions): Set explicit value.
> (-finstrument-functions-once): New option.
> * doc/invoke.texi (Program Instrumentation Options): Document it.
> * gimplify.c (build_instrumentation_call): New static function.
> (gimplify_function_tree): Invoke it to emit the instrumentation calls
> if -finstrument-functions[-once] is specified.
>
> --
> Eric Botcazou

Re: [x86 PATCH] Add peephole2 to reduce double word register shuffling.

2022-06-02 Thread Richard Biener via Gcc-patches

On Thu, Jun 2, 2022 at 11:48 AM Uros Bizjak via Gcc-patches
 wrote:
>
> On Thu, Jun 2, 2022 at 9:20 AM Roger Sayle  wrote:
> >
> > The simple test case below demonstrates an interesting register
> > allocation challenge facing x86_64, imposed by ABI requirements
> > on int128.
> >
> > __int128 foo(__int128 x, __int128 y)
> > {
> >   return x+y;
> > }
> >
> > For which GCC currently generates the unusual sequence:
> >
> > >         movq    %rsi, %rax
> > >         movq    %rdi, %r8
> > >         movq    %rax, %rdi
> > >         movq    %rdx, %rax
> > >         movq    %rcx, %rdx
> > >         addq    %r8, %rax
> > >         adcq    %rdi, %rdx
> > >         ret
> >
> > The challenge is that the x86_64 ABI requires passing the first __int128,
> > x, in %rsi:%rdi (highpart in %rsi, lowpart in %rdi), where internally
> > GCC prefers TI mode (double word) integers to be register allocated as
> > %rdi:%rsi (highpart in %rdi, lowpart in %rsi).  So after reload, we have
> > four mov instructions, two to move the double word to temporary registers
> > and then two to move them back.
> >
> > This patch adds a peephole2 to spot this register shuffling, and with
> > -Os generates a xchg instruction, to produce:
> >
> > >         xchgq   %rsi, %rdi
> > >         movq    %rdx, %rax
> > >         movq    %rcx, %rdx
> > >         addq    %rsi, %rax
> > >         adcq    %rdi, %rdx
> > >         ret
> > >
> > > or when optimizing for speed, a three mov sequence, using just one of
> > > the temporary registers, which ultimately results in the improved:
> > >
> > >         movq    %rdi, %r8
> > >         movq    %rdx, %rax
> > >         movq    %rcx, %rdx
> > >         addq    %r8, %rax
> > >         adcq    %rsi, %rdx
> > >         ret
> >
> > I've a follow-up patch which improves things further, and with the
> > output in flux, I'd like to add the new testcase with part 2, once
> > we're back down to requiring only two movq instructions.
>
> Shouldn't we rather do something about:
>
> (insn 2 9 3 2 (set (reg:DI 85)
>(reg:DI 5 di [ x ])) "dword-2.c":2:1 82 {*movdi_internal}
> (nil))
> (insn 3 2 4 2 (set (reg:DI 86)
>(reg:DI 4 si [ x+8 ])) "dword-2.c":2:1 82 {*movdi_internal}
> (nil))
> (insn 4 3 5 2 (set (reg:TI 84)
>(subreg:TI (reg:DI 85) 0)) "dword-2.c":2:1 81 {*movti_internal}
> (nil))
> (insn 5 4 6 2 (set (subreg:DI (reg:TI 84) 8)
>(reg:DI 86)) "dword-2.c":2:1 82 {*movdi_internal}
> (nil))
> (insn 6 5 7 2 (set (reg/v:TI 83 [ x ])
>(reg:TI 84)) "dword-2.c":2:1 81 {*movti_internal}
> (nil))
>
> The above is how the function TImode argument is constructed.
>
> The other problem is that double-word addition gets split only after
> reload, mostly due to RA reasons. In the past it was determined that
> RA creates better code when registers are split late (this reason
> probably does not hold anymore), but nowadays the limitation remains
> only for arithmetic and shifts.

Hmm.  Presumably the lower-subreg pass doesn't split the above
after the double-word adds are split?  Or maybe we simply do it
too late.

> Attached to this message, please find the patch that performs dual
> word mode arithmetic splitting before reload. It improves generated
> code somewhat, but due to the above argument construction sequence, the
> bulk of moves remain. Unfortunately, when under register pressure
> (e.g. 32-bit targets), the peephole approach gets ineffective due to
> register spilling, so IMO the root of the problem should be fixed.
>
> Uros.
>
>
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32} with
> > no new failures.  Ok for mainline?
> >
> >
> > 2022-06-02  Roger Sayle  
> >
> > gcc/ChangeLog
> > * config/i386/i386.md (define_peephole2): Recognize double word
> > swap sequences, and replace them with more efficient idioms,
> > including using xchg when optimizing for size.
> >
> >
> > Thanks in advance,
> > Roger
> > --
> >

Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]

2022-06-02 Thread Richard Biener via Gcc-patches

On Thu, 2 Jun 2022, Jakub Jelinek wrote:

> On Thu, Jun 02, 2022 at 08:36:42AM +, Richard Biener wrote:
> > > --- gcc/match.pd.jj   2022-06-01 13:54:32.000654151 +0200
> > > +++ gcc/match.pd  2022-06-01 15:13:35.473084402 +0200
> > > @@ -5969,6 +5969,17 @@ (define_operator_list SYNC_FETCH_AND_AND
> > > > && (!TYPE_UNSIGNED (TREE_TYPE (@2)) || TYPE_UNSIGNED (TREE_TYPE (@0
> > > (ovf @1 @0
> > >  
> > > +/* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
> > > +   are unsigned to x > (umax / cst).  */
> > > +(simplify
> > > + (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))
> > 
> > does :c work here?  I think it is at least ignored, possibly diagnostic
> > in genmatch is missing ...
> 
> I saw it used in another pattern, so thought it will work fine:
>   (cmp:c (realpart (IFN_ADD_OVERFLOW:c@2 @0 @1)) @0)
> 
> And looking at the generated source, I think it does work:
>   case CFN_MUL_OVERFLOW:
>     if (gimple_call_num_args (_c1) == 2)
>       {
>         tree _q20 = gimple_call_arg (_c1, 0);
>         _q20 = do_valueize (valueize, _q20);
>         tree _q21 = gimple_call_arg (_c1, 1);
>         _q21 = do_valueize (valueize, _q21);
>         if (integer_nonzerop (_q21))
>           {
>             {
> /* #line 5976 "../../gcc/match.pd" */
>               tree captures[3] ATTRIBUTE_UNUSED = { _p0, _q20, _q21 };
>               if (gimple_simplify_256 (res_op, seq, valueize, type, captures))
>                 return true;
>             }
>           }
>         if (integer_nonzerop (_q20))
>           {
>             {
> /* #line 5976 "../../gcc/match.pd" */
>               tree captures[3] ATTRIBUTE_UNUSED = { _p0, _q21, _q20 };
>               if (gimple_simplify_256 (res_op, seq, valueize, type, captures))
>                 return true;
>             }
>           }
>       }
> Though, sure, I should test it in the testcase too.

Ah yeah, we simply trust :c on functions with two arguments (and
explicitely handle CFN_FMA).

Richard.

Re: [x86 PATCH] Add peephole2 to reduce double word register shuffling.

2022-06-02 Thread Uros Bizjak via Gcc-patches

On Thu, Jun 2, 2022 at 11:32 AM Uros Bizjak  wrote:
>
> On Thu, Jun 2, 2022 at 9:20 AM Roger Sayle  wrote:
> >
> > The simple test case below demonstrates an interesting register
> > allocation challenge facing x86_64, imposed by ABI requirements
> > on int128.
> >
> > __int128 foo(__int128 x, __int128 y)
> > {
> >   return x+y;
> > }
> >
> > For which GCC currently generates the unusual sequence:
> >
> >         movq    %rsi, %rax
> >         movq    %rdi, %r8
> >         movq    %rax, %rdi
> >         movq    %rdx, %rax
> >         movq    %rcx, %rdx
> >         addq    %r8, %rax
> >         adcq    %rdi, %rdx
> >         ret
> >
> > The challenge is that the x86_64 ABI requires passing the first __int128,
> > x, in %rsi:%rdi (highpart in %rsi, lowpart in %rdi), where internally
> > GCC prefers TI mode (double word) integers to be register allocated as
> > %rdi:%rsi (highpart in %rdi, lowpart in %rsi).  So after reload, we have
> > four mov instructions, two to move the double word to temporary registers
> > and then two to move them back.
> >
> > This patch adds a peephole2 to spot this register shuffling, and with
> > -Os generates a xchg instruction, to produce:
> >
> >         xchgq   %rsi, %rdi
> >         movq    %rdx, %rax
> >         movq    %rcx, %rdx
> >         addq    %rsi, %rax
> >         adcq    %rdi, %rdx
> >         ret
> >
> > or when optimizing for speed, a three mov sequence, using just one of
> > the temporary registers, which ultimately results in the improved:
> >
> >         movq    %rdi, %r8
> >         movq    %rdx, %rax
> >         movq    %rcx, %rdx
> >         addq    %r8, %rax
> >         adcq    %rsi, %rdx
> >         ret
> >
> > I've a follow-up patch which improves things further, and with the
> > output in flux, I'd like to add the new testcase with part 2, once
> > we're back down to requiring only two movq instructions.
>
> Shouldn't we rather do something about:
>
> (insn 2 9 3 2 (set (reg:DI 85)
>(reg:DI 5 di [ x ])) "dword-2.c":2:1 82 {*movdi_internal}
> (nil))
> (insn 3 2 4 2 (set (reg:DI 86)
>(reg:DI 4 si [ x+8 ])) "dword-2.c":2:1 82 {*movdi_internal}
> (nil))
> (insn 4 3 5 2 (set (reg:TI 84)
>(subreg:TI (reg:DI 85) 0)) "dword-2.c":2:1 81 {*movti_internal}
> (nil))
> (insn 5 4 6 2 (set (subreg:DI (reg:TI 84) 8)
>(reg:DI 86)) "dword-2.c":2:1 82 {*movdi_internal}
> (nil))
> (insn 6 5 7 2 (set (reg/v:TI 83 [ x ])
>(reg:TI 84)) "dword-2.c":2:1 81 {*movti_internal}
> (nil))
>
> The above is how the function TImode argument is constructed.
>
> The other problem is that double-word addition gets split only after
> reload, mostly due to RA reasons. In the past it was determined that
> RA creates better code when registers are split late (this reason
> probably does not hold anymore), but nowadays the limitation remains
> only for arithmetic and shifts.

FYI, the effect of the patch can be seen with the following testcase:

--cut here--
#include <stdint.h>

void test (int64_t n)
{
  while (1)
    {
      n++;
      asm volatile ("#"
                    :: "b" ((int32_t)n),
                       "c" ((int32_t)(n >> 32)));
    }
}
--cut here--

Please compile this with -O2 -m32 with patched and unpatched compiler.

Uros.

Re: [x86 PATCH] Add peephole2 to reduce double word register shuffling.

2022-06-02 Thread Uros Bizjak via Gcc-patches

On Thu, Jun 2, 2022 at 9:20 AM Roger Sayle  wrote:
>
> The simple test case below demonstrates an interesting register
> allocation challenge facing x86_64, imposed by ABI requirements
> on int128.
>
> __int128 foo(__int128 x, __int128 y)
> {
>   return x+y;
> }
>
> For which GCC currently generates the unusual sequence:
>
>         movq    %rsi, %rax
>         movq    %rdi, %r8
>         movq    %rax, %rdi
>         movq    %rdx, %rax
>         movq    %rcx, %rdx
>         addq    %r8, %rax
>         adcq    %rdi, %rdx
>         ret
>
> The challenge is that the x86_64 ABI requires passing the first __int128,
> x, in %rsi:%rdi (highpart in %rsi, lowpart in %rdi), where internally
> GCC prefers TI mode (double word) integers to be register allocated as
> %rdi:%rsi (highpart in %rdi, lowpart in %rsi).  So after reload, we have
> four mov instructions, two to move the double word to temporary registers
> and then two to move them back.
>
> This patch adds a peephole2 to spot this register shuffling, and with
> -Os generates a xchg instruction, to produce:
>
>         xchgq   %rsi, %rdi
>         movq    %rdx, %rax
>         movq    %rcx, %rdx
>         addq    %rsi, %rax
>         adcq    %rdi, %rdx
>         ret
>
> or when optimizing for speed, a three mov sequence, using just one of
> the temporary registers, which ultimately results in the improved:
>
>         movq    %rdi, %r8
>         movq    %rdx, %rax
>         movq    %rcx, %rdx
>         addq    %r8, %rax
>         adcq    %rsi, %rdx
>         ret
>
> I've a follow-up patch which improves things further, and with the
> output in flux, I'd like to add the new testcase with part 2, once
> we're back down to requiring only two movq instructions.

Shouldn't we rather do something about:

(insn 2 9 3 2 (set (reg:DI 85)
   (reg:DI 5 di [ x ])) "dword-2.c":2:1 82 {*movdi_internal}
(nil))
(insn 3 2 4 2 (set (reg:DI 86)
   (reg:DI 4 si [ x+8 ])) "dword-2.c":2:1 82 {*movdi_internal}
(nil))
(insn 4 3 5 2 (set (reg:TI 84)
   (subreg:TI (reg:DI 85) 0)) "dword-2.c":2:1 81 {*movti_internal}
(nil))
(insn 5 4 6 2 (set (subreg:DI (reg:TI 84) 8)
   (reg:DI 86)) "dword-2.c":2:1 82 {*movdi_internal}
(nil))
(insn 6 5 7 2 (set (reg/v:TI 83 [ x ])
   (reg:TI 84)) "dword-2.c":2:1 81 {*movti_internal}
(nil))

The above is how the function TImode argument is constructed.

The other problem is that double-word addition gets split only after
reload, mostly due to RA reasons. In the past it was determined that
RA creates better code when registers are split late (this reason
probably does not hold anymore), but nowadays the limitation remains
only for arithmetic and shifts.
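
For reference, this is roughly what the split double-word addition
computes, written as a C sketch on two 64-bit halves.  The struct and
function names are made up for the example; the comments map each step
to the addq/adcq pair seen in the assembly above.

```c
#include <assert.h>
#include <stdint.h>

/* A double-word (128-bit) value held as two 64-bit halves.  */
struct dword
{
  uint64_t lo;
  uint64_t hi;
};

/* Add two double-word values one word at a time.  */
static struct dword
dword_add (struct dword x, struct dword y)
{
  struct dword r;
  r.lo = x.lo + y.lo;            /* addq: low halves, may wrap around */
  uint64_t carry = r.lo < x.lo;  /* carry out of the low half */
  r.hi = x.hi + y.hi + carry;    /* adcq: high halves plus the carry */
  return r;
}
```

Once the operation is expressed on individual words like this, each half
is an ordinary DImode value as far as register allocation is concerned.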

Attached to this message, please find the patch that performs dual
word mode arithmetic splitting before reload. It improves generated
code somewhat, but due to the above argument construction sequence, the
bulk of moves remain. Unfortunately, when under register pressure
(e.g. 32-bit targets), the peephole approach gets ineffective due to
register spilling, so IMO the root of the problem should be fixed.

Uros.


>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32} with
> no new failures.  Ok for mainline?
>
>
> 2022-06-02  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386.md (define_peephole2): Recognize double word
> swap sequences, and replace them with more efficient idioms,
> including using xchg when optimizing for size.
>
>
> Thanks in advance,
> Roger
> --
>
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 050dee7d43a..876220d57f9 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -5730,9 +5730,10 @@
  (match_operand: 1 "nonimmediate_operand" "%0,0")
  (match_operand: 2 "x86_64_hilo_general_operand" "r,o")))
(clobber (reg:CC FLAGS_REG))]
-  "ix86_binary_operator_ok (PLUS, mode, operands)"
+  "ix86_binary_operator_ok (PLUS, mode, operands)
+   && ix86_pre_reload_split ()"
   "#"
-  "&& reload_completed"
+  "&& 1"
   [(parallel [(set (reg:CCC FLAGS_REG)
   (compare:CCC
 (plus:DWIH (match_dup 1) (match_dup 2))
@@ -5750,6 +5751,7 @@
   split_double_mode (mode, [0], 3, [0], [3]);
   if (operands[2] == const0_rtx)
 {
+  emit_move_insn (operands[0], operands[1]);
   ix86_expand_binary_operator (PLUS, mode, [3]);
   DONE;
 }
@@ -6539,9 +6541,10 @@
(plus: (match_dup 1) (match_dup 2)
(set (match_operand: 0 "nonimmediate_operand" "=ro,r")
(plus: (match_dup 1) (match_dup 2)))]
-  "ix86_binary_operator_ok (PLUS, mode, operands)"
+  "ix86_binary_operator_ok (PLUS, mode, operands)
+   && ix86_pre_reload_split ()"
   "#"
-  "&& reload_completed"
+  "&& 1"
   [(parallel [(set (reg:CCC FLAGS_REG)
   (compare:CCC
 (plus:DWIH (match_dup 1) (match_dup 2))
@@ -6567,9 +6570,7 @@

Re: [x86 PATCH] Add peephole2 to reduce double word register shuffling.

2022-06-02 Thread Richard Biener via Gcc-patches

On Thu, Jun 2, 2022 at 9:21 AM Roger Sayle  wrote:
>
> The simple test case below demonstrates an interesting register
> allocation challenge facing x86_64, imposed by ABI requirements
> on int128.
>
> __int128 foo(__int128 x, __int128 y)
> {
>   return x+y;
> }
>
> For which GCC currently generates the unusual sequence:
>
> movq    %rsi, %rax
> movq    %rdi, %r8
> movq    %rax, %rdi
> movq    %rdx, %rax
> movq    %rcx, %rdx
> addq    %r8, %rax
> adcq    %rdi, %rdx
> ret
>
> The challenge is that the x86_64 ABI requires passing the first __int128,
> x, in %rsi:%rdi (highpart in %rsi, lowpart in %rdi), where internally
> GCC prefers TI mode (double word) integers to be register allocated as
> %rdi:%rsi (highpart in %rdi, lowpart in %rsi).

Do you know if this is a hard limitation?  I guess reg:TI 2 will cover
hardreg 2 and 3 and the overlap is always implicit adjacent hardregs?  I
suspect that in other places we prefer the current hardreg ordering so
altering it to make it match the __int128 register passing convention is
not an option.

Alternatively TImode ops should be split before RA and for register passing
(concat:TI ...) could be allowed?

Fixing up after the fact is of course possible but it looks awkward that there's
no good way for the RA and the backend to communicate better here?

>  So after reload, we have
> four mov instructions, two to move the double word to temporary registers
> and then two to move them back.
>
> This patch adds a peephole2 to spot this register shuffling, and with
> -Os generates a xchg instruction, to produce:
>
> xchgq   %rsi, %rdi
> movq    %rdx, %rax
> movq    %rcx, %rdx
> addq    %rsi, %rax
> adcq    %rdi, %rdx
> ret
>
> or when optimizing for speed, a three mov sequence, using just one of
> the temporary registers, which ultimately results in the improved:
>
> movq    %rdi, %r8
> movq    %rdx, %rax
> movq    %rcx, %rdx
> addq    %r8, %rax
> adcq    %rsi, %rdx
> ret
>
> I've a follow-up patch which improves things further, and with the
> output in flux, I'd like to add the new testcase with part 2, once
> we're back down to requiring only two movq instructions.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32} with
> no new failures.  Ok for mainline?
>
>
> 2022-06-02  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386.md (define_peephole2): Recognize double word
> swap sequences, and replace them with more efficient idioms,
> including using xchg when optimizing for size.
>
>
> Thanks in advance,
> Roger
> --
>

Re: [PATCH] [PR105665] ivopts: check defs of names in base for undefs

2022-06-02 Thread Richard Biener via Gcc-patches

On Thu, Jun 2, 2022 at 9:10 AM Alexandre Oliva  wrote:
>
> On Jun  1, 2022, Alexandre Oliva  wrote:
>
> > Now I'm thinking we can go for an even stricter predicate to disable
> > the optimization: if a non-PHI use of a maybe-undefined dominates the
> > loop, then we can still perform the optimization:
>
> Here it is.
>
>
> [PR105665] ivopts: check defs of names in base for undefs
>
> From: Alexandre Oliva 
>
> The patch for PR 100810 tested for undefined SSA_NAMEs appearing
> directly in the base expression of the potential IV candidate, but
> that's not enough.  The testcase for PR105665 shows an undefined
> SSA_NAME has the same ill effect if it's referenced as a PHI_NODE arg
> in the referenced SSA_NAME.  The variant of that test shows it can be
> further removed from the referenced SSA_NAME.
>
> To avoid deep recursion, precompute maybe-undefined SSA_NAMEs: start
> from known-undefined nonvirtual default defs, and propagate them to
> any PHI nodes reached by a maybe-undefined arg, as long as there
> aren't intervening non-PHI uses, that would imply the maybe-undefined
> name must be defined at that point, otherwise it would invoke
> undefined behavior.  Also test for intervening non-PHI uses of DEFs in
> the base expr.
>
> The test for intervening uses implemented herein relies on dominance;
> this could be further extended, regarding conditional uses in every
> path leading to a point as an unconditional use dominating that point,
> but I haven't implemented that.
>
>
> for  gcc/ChangeLog
>
> PR tree-optimization/105665
> PR tree-optimization/100810
> * tree-ssa-loop-ivopts.cc
> (ssa_name_maybe_undef_p, ssa_name_set_maybe_undef): New.
> (ssa_name_any_use_dominates_bb_p, mark_ssa_maybe_undefs): New.
> (find_ssa_undef): Check precomputed flag and intervening uses.
> (tree_ssa_iv_optimize): Call mark_ssa_maybe_undefs.
>
> for  gcc/testsuite/ChangeLog
>
> PR tree-optimization/105665
> PR tree-optimization/100810
> * gcc.dg/torture/pr105665.c: New.
> ---
>  gcc/testsuite/gcc.dg/torture/pr105665.c |   20 +
>  gcc/tree-ssa-loop-ivopts.cc             |  124 ++-
>  2 files changed, 140 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/torture/pr105665.c
>
> diff --git a/gcc/testsuite/gcc.dg/torture/pr105665.c b/gcc/testsuite/gcc.dg/torture/pr105665.c
> new file mode 100644
> index 0..34cfc65843495
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr105665.c
> @@ -0,0 +1,20 @@
> +/* { dg-do run } */
> +
> +int a, b, c[1], d[2], *e = c;
> +int main() {
> +  int f = 0;
> +  for (; b < 2; b++) {
> +int g;
> +if (f)
> +  g++, b = 40;
> +a = d[b * b];
> +for (f = 0; f < 3; f++) {
> +  if (e)
> +break;
> +  g--;
> +  if (a)
> +a = g;
> +}
> +  }
> +  return 0;
> +}
> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
> index 81b536f930415..f20a985d7ca22 100644
> --- a/gcc/tree-ssa-loop-ivopts.cc
> +++ b/gcc/tree-ssa-loop-ivopts.cc
> @@ -3071,13 +3071,128 @@ get_loop_invariant_expr (struct ivopts_data *data, tree inv_expr)
>return *slot;
>  }
>
> -/* Find the first undefined SSA name in *TP.  */
> +/* Return TRUE iff VAR is marked as maybe-undefined.  See
> +   mark_ssa_maybe_undefs.  */
> +
> +static inline bool
> +ssa_name_maybe_undef_p (tree var)
> +{
> +  gcc_checking_assert (TREE_CODE (var) == SSA_NAME);
> +  return TREE_VISITED (var);
> +}
> +
> +/* Set (or clear, depending on VALUE) VAR's maybe-undefined mark.  */
> +
> +static inline void
> +ssa_name_set_maybe_undef (tree var, bool value = true)
> +{
> +  gcc_checking_assert (TREE_CODE (var) == SSA_NAME);
> +  TREE_VISITED (var) = value;
> +}
> +
> +/* Return TRUE iff there are any non-PHI uses of VAR that dominate the
> +   end of BB.  If we return TRUE and BB is a loop header, then VAR may
> +   be assumed to be defined within the loop, even if it is marked as
> +   maybe-undefined.  */
> +
> +static inline bool
> +ssa_name_any_use_dominates_bb_p (tree var, basic_block bb)
> +{
> +  imm_use_iterator iter;
> +  use_operand_p use_p;
> +  FOR_EACH_IMM_USE_FAST (use_p, iter, var)
> +{
> +  if (is_a <gphi *> (USE_STMT (use_p)))

I think you also want to skip debug stmts here?

> +   continue;
> +  basic_block dombb = gimple_bb (USE_STMT (use_p));
> +  if (dominated_by_p (CDI_DOMINATORS, bb, dombb))
> +   return true;
> +}
> +
> +  return false;
> +}
> +
> +/* Mark as maybe_undef any SSA_NAMEs that are unsuitable as ivopts
> +   candidates for potentially involving undefined behavior.  */
> +
> +static void
> +mark_ssa_maybe_undefs (void)
> +{
> +  auto_vec<tree> queue;
> +
> +  /* Scan all SSA_NAMEs, marking the definitely-undefined ones as
> + maybe-undefined and queuing them for propagation, while clearing
> + the mark on others.  */
> +  unsigned int i;
> +  tree var;
> +  FOR_EACH_SSA_NAME (i, var, cfun)
>

Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]

2022-06-02 Thread Jakub Jelinek via Gcc-patches

On Thu, Jun 02, 2022 at 08:36:42AM +, Richard Biener wrote:
> > --- gcc/match.pd.jj 2022-06-01 13:54:32.000654151 +0200
> > +++ gcc/match.pd 2022-06-01 15:13:35.473084402 +0200
> > @@ -5969,6 +5969,17 @@ (define_operator_list SYNC_FETCH_AND_AND
> > && (!TYPE_UNSIGNED (TREE_TYPE (@2)) || TYPE_UNSIGNED (TREE_TYPE 
> > (@0
> > (ovf @1 @0
> >  
> > +/* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
> > +   are unsigned to x > (umax / cst).  */
> > +(simplify
> > + (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))
> 
> does :c work here?  I think it is at least ignored, possibly diagnostic
> in genmatch is missing ...

I saw it used in another pattern, so thought it will work fine:
  (cmp:c (realpart (IFN_ADD_OVERFLOW:c@2 @0 @1)) @0)

And looking at the generated source, I think it does work:
  case CFN_MUL_OVERFLOW:
    if (gimple_call_num_args (_c1) == 2)
      {
        tree _q20 = gimple_call_arg (_c1, 0);
        _q20 = do_valueize (valueize, _q20);
        tree _q21 = gimple_call_arg (_c1, 1);
        _q21 = do_valueize (valueize, _q21);
        if (integer_nonzerop (_q21))
          {
            {
/* #line 5976 "../../gcc/match.pd" */
              tree captures[3] ATTRIBUTE_UNUSED = { _p0, _q20, _q21 };
              if (gimple_simplify_256 (res_op, seq, valueize, type, captures))
                return true;
            }
          }
        if (integer_nonzerop (_q20))
          {
            {
/* #line 5976 "../../gcc/match.pd" */
              tree captures[3] ATTRIBUTE_UNUSED = { _p0, _q21, _q20 };
              if (gimple_simplify_256 (res_op, seq, valueize, type, captures))
                return true;
            }
          }
      }
Though, sure, I should test it in the testcase too.
Here is what I've committed as obvious after testing on x86_64-linux
-m32/-m64:

2022-06-02  Jakub Jelinek  

PR middle-end/30314
* gcc.dg/tree-ssa/pr30314.c: Add tests with swapped arguments.

--- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj  2022-06-01 17:54:30.0 +0200
+++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c 2022-06-02 11:01:27.818987766 +0200
@@ -4,6 +4,8 @@
 /* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */
 /* { dg-final { scan-tree-dump " > 122713351" "optimized" { target int32 } } } */
 /* { dg-final { scan-tree-dump " > 527049830677415760" "optimized" { target lp64 } } } */
+/* { dg-final { scan-tree-dump " > 102261126" "optimized" { target int32 } } } */
+/* { dg-final { scan-tree-dump " > 439208192231179800" "optimized" { target lp64 } } } */
 
 int
 foo (unsigned int x)
@@ -16,3 +18,15 @@ bar (unsigned long int x)
 {
   return __builtin_mul_overflow_p (x, 35UL, 0UL);
 }
+
+int
+baz (unsigned int x)
+{
+  return __builtin_mul_overflow_p (42, x, 0U);
+}
+
+int
+qux (unsigned long int x)
+{
+  return __builtin_mul_overflow_p (42, x, 0UL);
+}
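
As a side check of the fold being tested here, the sketch below compares the overflow builtin with the folded comparison x > UINT_MAX / cst. It is an illustration only: the function names are hypothetical, and it uses `__builtin_mul_overflow` with a scratch result rather than `__builtin_mul_overflow_p` so that it also builds with compilers lacking the `_p` variant.

```c
#include <limits.h>
#include <stdbool.h>

/* Reference answer: does x * cst overflow unsigned int?  */
bool
mul_overflows_builtin (unsigned int x, unsigned int cst)
{
  unsigned int scratch;
  return __builtin_mul_overflow (x, cst, &scratch);
}

/* The folded form.  Only meaningful for cst != 0, which is why the
   match.pd pattern requires integer_nonzerop.  */
bool
mul_overflows_folded (unsigned int x, unsigned int cst)
{
  return x > UINT_MAX / cst;
}
```

With a 32-bit unsigned int, UINT_MAX / 42 is 102261126 and UINT_MAX / 35 is 122713351, which are the constants the int32 scan-tree-dump directives look for.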


Jakub

Re: [PATCH] Simplify (B * v + C) * D -> BD* v + CD when B, C, D are all INTEGER_CST.

2022-06-02 Thread Richard Biener via Gcc-patches

On Thu, Jun 2, 2022 at 3:10 AM liuhongt via Gcc-patches
 wrote:
>
> Similar for (v + B) * C + D -> C * v + BCD.
> Don't simplify it when there's overflow and overflow is UB for type v.
>
> There's a new failure
>
> gcc.dg/vect/slp-11a.c scan-tree-dump-times vect "vectorizing stmts using SLP" > 0
>
> It's because the patch simplifies different operations to mult + add and
> enables SLP. So I adjusted the testcase to prevent the simplification by
> making the multiplication result overflow.
>
> Also, with -fwrapv, the benchmark in the PR is now 100% faster than the
> original (scalar version).
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR tree-optimization/53533
> * match.pd: Simplify (B * v + C) * D -> BD * v + CD and
> (v + B) * C + D -> C * v + BCD when B,C,D are all INTEGER_CST,
> and there's no overflow or TYPE_OVERFLOW_WRAP.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr53533-1.c: New test.
> * gcc.target/i386/pr53533-2.c: New test.
> * gcc.target/i386/pr53533-3.c: New test.
> * gcc.target/i386/pr53533-4.c: New test.
> * gcc.dg/vect/slp-11a.c: Adjust testcase.
> ---
>  gcc/match.pd  | 36 ++
>  gcc/testsuite/gcc.dg/vect/slp-11a.c   | 10 ++---
>  gcc/testsuite/gcc.target/i386/pr53533-1.c | 23 
>  gcc/testsuite/gcc.target/i386/pr53533-2.c | 46 +++
>  gcc/testsuite/gcc.target/i386/pr53533-3.c | 24 
>  gcc/testsuite/gcc.target/i386/pr53533-4.c | 46 +++
>  6 files changed, 180 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53533-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53533-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53533-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr53533-4.c
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 88c6c414881..b753f7bda3c 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -489,6 +489,42 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>(if (!overflow || TYPE_OVERFLOW_WRAPS (type))
> (mult @0 { wide_int_to_tree (type, mul); }
>
> +/* Similar to above, but there could be an extra add/sub between
> +   successive multiplications.  */
> +(simplify
> + (mult:c (plus:c@4 (mult:c@5 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3)

since canonicalization puts INTEGER_CSTs last the :c should not be necessary.

> + (if (single_use (@4)
> +  && single_use (@5))

since the resulting expression is not simple using :s instead of
single_use (..) should work as well.

> +  (with {
> +wi::overflow_type overflow;
> +wi::overflow_type overflow2;
> +wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@3),
> +   TYPE_SIGN (type), &overflow);
> +wide_int add = wi::mul (wi::to_wide (@2), wi::to_wide (@3),
> +   TYPE_SIGN (type), &overflow2);
> +  }
> +   /* Skip folding on overflow.  */
> +   (if (!(overflow || overflow2) || TYPE_OVERFLOW_WRAPS (type))
> +(plus (mult @0 { wide_int_to_tree (type, mul); })
> + { wide_int_to_tree (type, add); })
> +
> +/* Similar to above, but a multiplication between successive additions.  */
> +(simplify
> + (plus:c (mult:c@4 (plus:c@5 @0 INTEGER_CST@1) INTEGER_CST@2) INTEGER_CST@3)

Likewise for :c and :s

> + (if (single_use (@4)
> +  && single_use (@5))
> +  (with {
> +wi::overflow_type overflow;
> +wi::overflow_type overflow2;
> +wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2),
> +   TYPE_SIGN (type), &overflow);
> +wide_int add = wi::add (mul, wi::to_wide (@3),
> +   TYPE_SIGN (type), &overflow2);
> +  }
> +   /* Skip folding on overflow.  */
> +   (if (!(overflow || overflow2) || TYPE_OVERFLOW_WRAPS (type))
> +(plus (mult @0 @2) { wide_int_to_tree (type, add); })

when we go from (a + CST1) * CST2 to a * CST2 + CST1*CST2 we have
to worry about CST1 == -a which would make (a+CST1) * INT_MAX
not overflow but a * INT_MAX + CST1 * INT_MAX might.  Is the
overflow check for CST1 * INT_MAX sufficient to guarantee
that a * CST2 does not overflow when (a + CST1) * CST2 does not
overflow?  Consider a == 2, CST1 == -1, CST2 == INT_MAX,
here 1 * INT_MAX does not overflow, nor does -1 * INT_MAX, but
2 * INT_MAX overflows and thus the resulting expression invokes
undefined behavior.
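
The counterexample can be checked mechanically. The following is a small sketch, not part of the patch, using GCC's overflow builtins; the helper names `original_ok` and `distributed_ok` are made up for this illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* True if (a + cst1) * cst2 can be evaluated in int without
   overflowing at any step.  */
bool
original_ok (int a, int cst1, int cst2)
{
  int sum, prod;
  return !__builtin_add_overflow (a, cst1, &sum)
	 && !__builtin_mul_overflow (sum, cst2, &prod);
}

/* True if a * cst2 + cst1 * cst2 can be evaluated in int without
   overflowing at any step.  */
bool
distributed_ok (int a, int cst1, int cst2)
{
  int lhs, rhs, sum;
  return !__builtin_mul_overflow (a, cst2, &lhs)
	 && !__builtin_mul_overflow (cst1, cst2, &rhs)
	 && !__builtin_add_overflow (lhs, rhs, &sum);
}
```

For a == 2, CST1 == -1, CST2 == INT_MAX, `original_ok` holds while `distributed_ok` does not, because 2 * INT_MAX overflows even though the constant-only product -1 * INT_MAX is fine.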

The same issue probably arises for the first pattern outer half
which looks like (a' + CST2) * CST3 with a' = a * CST1?

The appropriate solution might be to perform the arithmetic
in an unsigned type with the implication that has on value-range
analysis.
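
A minimal sketch of that idea, assuming the usual modular conversion between int and unsigned int: in unsigned arithmetic the distributive rewrite holds exactly modulo 2^N, so no intermediate step is undefined. The function name `fold_in_unsigned` is hypothetical.

```c
#include <limits.h>

/* Evaluate the distributed form a * cst2 + cst1 * cst2 in unsigned
   int, where wraparound is well defined.  The result is congruent
   modulo 2^N to (a + cst1) * cst2.  */
unsigned int
fold_in_unsigned (int a, int cst1, int cst2)
{
  unsigned int ua = (unsigned int) a;
  unsigned int uc1 = (unsigned int) cst1;
  unsigned int uc2 = (unsigned int) cst2;
  return ua * uc2 + uc1 * uc2;
}
```

For a == 2, cst1 == -1, cst2 == INT_MAX the intermediate product wraps, yet the final value equals INT_MAX, matching the original expression. The cost is the one noted above: the intermediate values are unsigned, so range information tied to signed overflow being undefined is no longer available.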

Richard.

> +
>  /* Optimize A / A to 1.0 if we don't care about
> NaNs or Infinities.  */
>  (simplify
> diff --git a/gcc/testsuite/gcc.dg/vect/slp-11a.c 
> b/gcc/testsuite/gcc.dg/vect/slp-11a.c
> index bcd3c861ca4..e6632fa77be 100644
> --- a/gcc/testsuite/gcc.dg/vect/slp-11a.c
> +++

[Ada] Fix record layout warnings not being tagged

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This allows tools ingesting GNAT's output to properly classify these
messages.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gcc-interface/decl.cc (warn_on_field_placement): Add insertion
character '.q' to warning string.

diff --git a/gcc/ada/gcc-interface/decl.cc b/gcc/ada/gcc-interface/decl.cc
--- a/gcc/ada/gcc-interface/decl.cc
+++ b/gcc/ada/gcc-interface/decl.cc
@@ -7770,20 +7770,20 @@ warn_on_field_placement (tree gnu_field, Node_Id gnat_component_list,
 
   const char *msg1
 = in_variant
-  ? "??variant layout may cause performance issues"
-  : "??record layout may cause performance issues";
+  ? "?.q?variant layout may cause performance issues"
+  : "?.q?record layout may cause performance issues";
   const char *msg2
 = Ekind (gnat_field) == E_Discriminant
-  ? "??discriminant & whose length is not multiple of a byte"
+  ? "?.q?discriminant & whose length is not multiple of a byte"
   : field_has_self_size (gnu_field)
-	? "??component & whose length depends on a discriminant"
+	? "?.q?component & whose length depends on a discriminant"
 	: field_has_variable_size (gnu_field)
-	  ? "??component & whose length is not fixed"
-	  : "??component & whose length is not multiple of a byte";
+	  ? "?.q?component & whose length is not fixed"
+	  : "?.q?component & whose length is not multiple of a byte";
   const char *msg3
 = do_reorder
-  ? "??comes too early and was moved down"
-  : "??comes too early and ought to be moved down";
+  ? "?.q?comes too early and was moved down"
+  : "?.q?comes too early and ought to be moved down";
 
   post_error (msg1, gnat_field);
   post_error_ne (msg2, gnat_field, gnat_field);

[Ada] Disable -flto when building the shared libgnat

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This setting isn't useful in this context.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gcc-interface/Makefile.in (gnatlib-shared-default): Add
-fno-lto.

diff --git a/gcc/ada/gcc-interface/Makefile.in b/gcc/ada/gcc-interface/Makefile.in
--- a/gcc/ada/gcc-interface/Makefile.in
+++ b/gcc/ada/gcc-interface/Makefile.in
@@ -663,8 +663,8 @@ gnatlib: ../stamp-gnatlib1-$(RTSDIR) ../stamp-gnatlib2-$(RTSDIR) $(RTSDIR)/s-osc
 gnatlib-shared-default:
 	$(MAKE) $(FLAGS_TO_PASS) \
  GNATLIBFLAGS="$(GNATLIBFLAGS)" \
-	 GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(PICFLAG_FOR_TARGET)" \
-	 GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C) $(PICFLAG_FOR_TARGET)" \
+	 GNATLIBCFLAGS="$(GNATLIBCFLAGS) $(PICFLAG_FOR_TARGET) -fno-lto" \
+	 GNATLIBCFLAGS_FOR_C="$(GNATLIBCFLAGS_FOR_C) $(PICFLAG_FOR_TARGET) -fno-lto" \
 	 MULTISUBDIR="$(MULTISUBDIR)" \
 	 THREAD_KIND="$(THREAD_KIND)" \
 	 LN_S="$(LN_S)" \

[Ada] Couple of small preparatory adjustments

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gcc-interface/decl.cc (gnat_to_gnu_entity): Do not check the
scope of anonymous access Itypes.
* gcc-interface/trans.cc (Identifier_to_gnu): Do not translate
the return type of a subprogram here.

diff --git a/gcc/ada/gcc-interface/decl.cc b/gcc/ada/gcc-interface/decl.cc
--- a/gcc/ada/gcc-interface/decl.cc
+++ b/gcc/ada/gcc-interface/decl.cc
@@ -346,10 +346,12 @@ gnat_to_gnu_entity (Entity_Id gnat_entity, tree gnu_expr, bool definition)
   && !present_gnu_tree (gnat_entity)
   && In_Extended_Main_Code_Unit (gnat_entity))
 {
-  /* Ensure that we are in a subprogram mentioned in the Scope chain of
+  /* Unless it's for an anonymous access type, whose scope is irrelevant,
+	 ensure that we are in a subprogram mentioned in the Scope chain of
 	 this entity, our current scope is global, or we encountered a task
 	 or entry (where we can't currently accurately check scoping).  */
-  if (!current_function_decl
+  if (Ekind (gnat_entity) == E_Anonymous_Access_Type
+	  || !current_function_decl
 	  || DECL_ELABORATION_PROC_P (current_function_decl))
 	{
 	  process_type (gnat_entity);


diff --git a/gcc/ada/gcc-interface/trans.cc b/gcc/ada/gcc-interface/trans.cc
--- a/gcc/ada/gcc-interface/trans.cc
+++ b/gcc/ada/gcc-interface/trans.cc
@@ -1220,10 +1220,13 @@ Identifier_to_gnu (Node_Id gnat_node, tree *gnu_result_type_p)
 	gnat_result_type = Etype (gnat_node);
 }
 
-  /* Expand the type of this identifier first, in case it is an enumeral
- literal, which only get made when the type is expanded.  There is no
- order-of-elaboration issue here.  */
-  gnu_result_type = get_unpadded_type (gnat_result_type);
+  /* Expand the type of this identifier first if it is needed, in case it is an
+ enumeral literal, which only get made when the type is expanded.  There is
+ no order-of-elaboration issue here.  */
+  if (Is_Subprogram (gnat_entity))
+gnu_result_type = NULL_TREE;
+  else
+gnu_result_type = get_unpadded_type (gnat_result_type);
 
   /* If this is a non-imported elementary constant with an address clause,
  retrieve the value instead of a pointer to be dereferenced unless

[Ada] Restore full generation of static dispatch tables with -gnatzr

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

The -gnatzr switch triggers the creation of distribution stubs for use
by the implementation of PolyORB.  Now these stubs declare tagged types
and are generated at the very end of the analysis of compilation units,
after the static dispatch tables have been built, so these tables are
missing for the tagged types of the stubs.

Therefore this change defers the generation of static dispatch tables
for compilation units, which is the common case, until after the stubs
are (potentially) generated.  For the other cases, in particular the
generic instances that are not compilation units, nothing is changed.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch7.adb (Expand_N_Package_Body): Build static dispatch
tables only for units that are not compilation units, unless
they are generic instances.  Do not push a scope for this.
(Expand_N_Package_Declaration): Build static dispatch tables
only for units that are both not compilation units and generic
instances.
* exp_disp.adb (Build_Static_Dispatch_Tables): Remove redundant
early return.  Push a scope for package bodies.
* sem_ch10.adb: Add with and use clauses for Exp_Disp.
(Analyze_Compilation_Unit): Build static dispatch tables here.

diff --git a/gcc/ada/exp_ch7.adb b/gcc/ada/exp_ch7.adb
--- a/gcc/ada/exp_ch7.adb
+++ b/gcc/ada/exp_ch7.adb
@@ -5876,16 +5876,20 @@ package body Exp_Ch7 is
   --  This is done only for non-generic packages
 
   if Ekind (Spec_Id) = E_Package then
- Push_Scope (Spec_Id);
-
- --  Build dispatch tables of library level tagged types
+ --  Build dispatch tables of library-level tagged types for bodies
+ --  that are not compilation units (see Analyze_Compilation_Unit),
+ --  except for instances because they have no N_Compilation_Unit.
 
  if Tagged_Type_Expansion
and then Is_Library_Level_Entity (Spec_Id)
+   and then (not Is_Compilation_Unit (Spec_Id)
+  or else Is_Generic_Instance (Spec_Id))
  then
 Build_Static_Dispatch_Tables (N);
  end if;
 
+ Push_Scope (Spec_Id);
+
  Expand_CUDA_Package (N);
 
  Build_Task_Activation_Call (N);
@@ -6035,12 +6039,13 @@ package body Exp_Ch7 is
  Pop_Scope;
   end if;
 
-  --  Build dispatch tables of library-level tagged types
+  --  Build dispatch tables of library-level tagged types for instances
+  --  that are not compilation units (see Analyze_Compilation_Unit).
 
   if Tagged_Type_Expansion
-and then (Is_Compilation_Unit (Id)
-   or else (Is_Generic_Instance (Id)
- and then Is_Library_Level_Entity (Id)))
+and then Is_Library_Level_Entity (Id)
+and then Is_Generic_Instance (Id)
+and then not Is_Compilation_Unit (Id)
   then
  Build_Static_Dispatch_Tables (N);
   end if;


diff --git a/gcc/ada/exp_disp.adb b/gcc/ada/exp_disp.adb
--- a/gcc/ada/exp_disp.adb
+++ b/gcc/ada/exp_disp.adb
@@ -524,12 +524,6 @@ package body Exp_Disp is
--  Start of processing for Build_Static_Dispatch_Tables
 
begin
-  if not Expander_Active
-or else not Tagged_Type_Expansion
-  then
- return;
-  end if;
-
   if Nkind (N) = N_Package_Declaration then
  declare
 Spec   : constant Node_Id := Specification (N);
@@ -553,8 +547,15 @@ package body Exp_Disp is
  end;
 
   else pragma Assert (Nkind (N) = N_Package_Body);
- Target_List := Declarations (N);
- Build_Dispatch_Tables (Target_List);
+ declare
+Spec_Id : constant Entity_Id := Corresponding_Spec (N);
+
+ begin
+Push_Scope (Spec_Id);
+Target_List := Declarations (N);
+Build_Dispatch_Tables (Target_List);
+Pop_Scope;
+ end;
   end if;
end Build_Static_Dispatch_Tables;
 


diff --git a/gcc/ada/sem_ch10.adb b/gcc/ada/sem_ch10.adb
--- a/gcc/ada/sem_ch10.adb
+++ b/gcc/ada/sem_ch10.adb
@@ -31,6 +31,7 @@ with Einfo;  use Einfo;
 with Einfo.Entities; use Einfo.Entities;
 with Einfo.Utils;use Einfo.Utils;
 with Errout; use Errout;
+with Exp_Disp;   use Exp_Disp;
 with Exp_Put_Image;
 with Exp_Util;   use Exp_Util;
 with Elists; use Elists;
@@ -1000,6 +1001,22 @@ package body Sem_Ch10 is
  end if;
   end if;
 
+  --  Build dispatch tables of library-level tagged types only now because
+  --  the generation of distribution stubs above may create some of them.
+
+  if Expander_Active and then Tagged_Type_Expansion then
+ case Nkind (Unit_Node) is
+when N_Package_Declaration | N_Package_Body =>
+   Build_Static_Dispatch_Tables (Unit_Node);
+
+when N_Package_Instantiation =>
+   Build_Static_Dispatch_Tables (Instance_Spec

[Ada] Never make symbols for thunks public

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

Thunks are only referenced locally by dispatch tables and never inlined.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch6.adb (Analyze_Subprogram_Body_Helper): Clear the Is_Public
flag on thunks.

diff --git a/gcc/ada/sem_ch6.adb b/gcc/ada/sem_ch6.adb
--- a/gcc/ada/sem_ch6.adb
+++ b/gcc/ada/sem_ch6.adb
@@ -4772,6 +4772,13 @@ package body Sem_Ch6 is
  if Nkind (N) /= N_Subprogram_Body
or else not Was_Expression_Function (N)
  then
+--  First clear the Is_Public flag on thunks since they are only
+--  referenced locally by dispatch tables and thus never inlined.
+
+if Is_Thunk (Body_Id) then
+   Set_Is_Public (Body_Id, False);
+end if;
+
 Freeze_Before (N, Body_Id);
  end if;

[Ada] Get rid of secondary stack for most calls returning tagged types

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This eliminates the use of the secondary stack to return specific tagged
types from functions in calls that are not dispatching on result, which
comprises returning controlled types, by introducing thunks whose only
purpose is to move the result from the primary to the secondary stack
for primitive functions that are controlling on result, and referencing
them in the dispatch table in lieu of the primitive functions.

The implementation reuses the existing machinery of interface thunks and
thus creates another kind of thunks, secondary stack thunks, which only
perform a call to the primitive function and return the result.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* einfo.ads (Has_Controlling_Result): Document new usage.
(Is_Thunk): Document secondary stack thunks.
(Returns_By_Ref): Adjust.
* exp_ch6.adb (Caller_Known_Size): Return true for tagged types.
(Expand_N_Extended_Return_Statement): Do not call Set_By_Ref.
(Expand_Simple_Function_Return): For a BIP return with an Alloc_Form
parameter, mark the node as returning on the secondary stack.
Replace call to Is_Limited_Interface with Is_Limited_View.  Deal with
secondary stack thunks.  Do not call Set_By_Ref.  Optimize the case
of a call to a function whose type also needs finalization.
(Needs_BIP_Task_Actuals): Replace Thunk_Entity with Thunk_Target.
(Needs_BIP_Finalization_Master): Cosmetic fixes.
(Needs_BIP_Alloc_Form): Check No_Secondary_Stack restriction and
return true for tagged types.
* exp_ch7.adb (Transient Scope Management): Update description.
* exp_disp.adb (Expand_Dispatching_Call): Always set Returns_By_Ref
on designated type if the call is dispatching on result.  Tidy up.
(Expand_Interface_Thunk): Change type of Thunk_Code from Node_Id to
List_Id.  Change type of local variables from Node_Id to Entity_Id.
Propagate Aliased_Present flag to create the formals and explicitly
set Has_Controlling_Result to False.  Build a secondary stack thunk
if necessary in the function case.
(Expand_Secondary_Stack_Thunk): New function.
(Make_Secondary_DT): Build secondary stack thunks if necessary.
(Make_DT): Likewise.
(Register_Predefined_Primitive): Likewise.
(Register_Primitive): Likewise.
* exp_util.ads (Is_Secondary_Stack_Thunk): Declare.
(Thunk_Target): Likewise.
* exp_util.adb (Is_Secondary_Stack_Thunk): New function.
(Thunk_Target): Likewise.
* fe.h (Is_Secondary_Stack_Thunk): Declare.
(Thunk_Target): Likewise.
* gen_il-fields.ads (Opt_Field_Enum): Remove By_Ref.
* gen_il-gen-gen_nodes.adb (N_Simple_Return_Statement): Likewise.
(N_Extended_Return_Statement): Likewise.
* sem_ch6.adb (Analyze_Subprogram_Specification): Skip check for
abstract return type in the thunk case.
(Create_Extra_Formals): Replace Thunk_Entity with Thunk_Target.
* sem_disp.adb (Check_Controlling_Formals): Skip in the thunk case.
* sem_util.adb: Add use and with clauses for Exp_Ch6.
(Compute_Returns_By_Ref): Do not process procedures and only set
the flag for direct return by reference.
(Needs_Secondary_Stack): Do not return true for specific tagged
types and adjust comments accordingly.
* sinfo.ads (By_Ref): Delete.
(N_Simple_Return_Statement): Remove By_Ref.
(N_Extended_Return_Statement): Likewise.
* gcc-interface/ada-tree.h (TYPE_RETURN_UNCONSTRAINED_P): Delete.
* gcc-interface/decl.cc (gnat_to_gnu_subprog_type): Do not use it.
Return by direct reference if the return type needs the secondary
stack as well as for secondary stack thunks.
* gcc-interface/gigi.h (fntype_same_flags_p): Remove parameter.
* gcc-interface/misc.cc (gnat_type_hash_eq): Adjust to above change.
* gcc-interface/trans.cc (finalize_nrv): Replace test on
TYPE_RETURN_UNCONSTRAINED_P with TYPE_RETURN_BY_DIRECT_REF_P.
(Subprogram_Body_to_gnu): Do not call maybe_make_gnu_thunk for
secondary stack thunks.
(Call_to_gnu): Do not test TYPE_RETURN_UNCONSTRAINED_P.
(gnat_to_gnu) : In the return by direct
reference case, test for the presence of Storage_Pool on the node
to build an allocator.
(maybe_make_gnu_thunk): Deal with Thunk_Entity and Thunk_Target.
* gcc-interface/utils.cc (fntype_same_flags_p): Remove parameter.

patch.diff.gz
Description: application/gzip

[Ada] Remove redundant checks for missing lists

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

When iterating over list elements with First/Next there is no need to
check if the list is present, because First intentionally returns Empty
if the list is not present and the condition of the subsequent loop will
not be satisfied.

Code cleanup; semantics is unaffected.

Occurrences of the redundant pattern were found with:

  $ grep First -B 3 | less

and examining the output for the calls to Present.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch13.adb, exp_ch5.adb, exp_ch9.adb, exp_strm.adb,
sem_ch10.adb, sem_ch13.adb, sem_ch5.adb, sem_ch6.adb,
sem_ch8.adb, sem_elab.adb, sem_eval.adb, sem_prag.adb,
sem_util.adb: Remove checks for the missing list before
iterating with First/Next; reindent code and refill comments.

patch.diff.gz
Description: application/gzip

[Ada] Fix references to old variables that have been refactored

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

Local_Entity_Suppress and Global_Entity_Suppress variables referencing
tables were refactored to Local_Suppress_Stack_Top and
Global_Suppress_Stack_Top stacks back in 2007.  Fix remaining references
to these variables.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* einfo.ads: Fix references to the Global_Entity_Suppress and
Local_Entity_Suppress variables in the comments.
* sem.ads: Likewise.
* sem_prag.adb: Likewise.

diff --git a/gcc/ada/einfo.ads b/gcc/ada/einfo.ads
--- a/gcc/ada/einfo.ads
+++ b/gcc/ada/einfo.ads
@@ -605,10 +605,10 @@ package Einfo is
 
 --Checks_May_Be_Suppressed
 --   Defined in all entities. Set if a pragma Suppress or Unsuppress
---   mentions the entity specifically in the second argument. If this
---   flag is set the Global_Entity_Suppress and Local_Entity_Suppress
---   tables must be consulted to determine if there actually is an active
---   Suppress or Unsuppress pragma that applies to the entity.
+--   mentions the entity specifically in the second argument. If this flag
+--   is set the global and local suppress stacks must be consulted to
+--   determine if there actually is an active Suppress or Unsuppress pragma
+--   that applies to the entity.
 
 --Class_Postconditions
 --   Defined on subprogram entities. Set if the subprogram has class-wide


diff --git a/gcc/ada/sem.ads b/gcc/ada/sem.ads
--- a/gcc/ada/sem.ads
+++ b/gcc/ada/sem.ads
@@ -327,8 +327,8 @@ package Sem is
--  using pragma Check_Name), are handled as follows. If a suppress or
--  unsuppress pragma is encountered for a given entity, then the flag
--  Checks_May_Be_Suppressed is set in the entity and an entry is made in
-   --  either the Local_Entity_Suppress stack (case of pragma that appears in
-   --  other than a package spec), or in the Global_Entity_Suppress stack (case
+   --  either the local suppress stack (case of pragma that appears in
+   --  other than a package spec), or in the global suppress stack (case
--  of pragma that appears in a package spec, which is by the rule of RM
--  11.5(7) applicable throughout the life of the entity). Similarly, a
--  Suppress/Unsuppress pragma for a non-predefined check which does not
@@ -340,7 +340,7 @@ package Sem is
--  other point is that we have to make sure that we have proper nested
--  interaction between such specific pragmas and locally applied general
--  pragmas applying to all entities. This is achieved by including in the
-   --  Local_Entity_Suppress table dummy entries with an empty Entity field
+   --  local suppress stack dummy entries with an empty Entity field
--  that are applicable to all entities. A similar search is needed for any
--  non-predefined check even if no specific entity is involved.
 
@@ -359,18 +359,18 @@ package Sem is
--  applies, and gives the right result when such pragmas are used even
--  in complex cases of nested Suppress and Unsuppress pragmas.
 
-   --  The Local_Entity_Suppress and Global_Entity_Suppress stacks are handled
-   --  using dynamic allocation and linked lists. We do not often use this
-   --  approach in the compiler (preferring to use extensible tables instead).
-   --  The reason we do it here is that scope stack entries save a pointer to
-   --  the current local stack top, which is also saved and restored on scope
-   --  exit. Furthermore for processing of generics we save pointers to the
-   --  top of the stack, so that the local stack is actually a tree of stacks
-   --  rather than a single stack, a structure that is easy to represent using
-   --  linked lists, but impossible to represent using a single table. Note
-   --  that because of the generic issue, we never release entries in these
-   --  stacks, but that's no big deal, since we are unlikely to have a huge
-   --  number of Suppress/Unsuppress entries in a single compilation.
+   --  The local and global suppress stacks are handled using dynamic
+   --  allocation and linked lists. We do not often use this approach in the
+   --  compiler (preferring to use extensible tables instead). The reason we do
+   --  it here is that scope stack entries save a pointer to the current local
+   --  stack top, which is also saved and restored on scope exit. Furthermore
+   --  for processing of generics we save pointers to the top of the stack, so
+   --  that the local stack is actually a tree of stacks rather than a single
+   --  stack, a structure that is easy to represent using linked lists, but
+   --  impossible to represent using a single table. Note that because of the
+   --  generic issue, we never release entries in these stacks, but that's no
+   --  big deal, since we are unlikely to have a huge number of
+   --  Suppress/Unsuppress entries in a single compilation.
 
type Suppress_Stack_Entry;
type Suppress_Stack_Entry_Ptr is access all Suppress_Stack_Entry;


diff --git

[Ada] Make the functional Maps and Sets unbounded

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

Before this patch, the Functional Sets and Maps were bounded both from
the user and the implementation points of view.  To make them closer to
mathematical Sets and Maps, this patch removes the bounds from the
contracts. Note that, in practice, they are still bounded by
Count_Type'Last, even if the user is no longer aware of it.

This patch removes the constraints on the length of sets and maps from
the preconditions of functions. The functions Length and Num_Overlaps now
return a Big_Natural.
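
A minimal sketch of the conversion this implies for callers that still
hold a Count_Type value, modelled on the Conversions instance and the Big
renaming that appear in the diff below.  The procedure name and the values
are made up for illustration, and the sketch assumes a compiler in Ada 2022
mode (for GNAT, the -gnat2022 switch).

```ada
with Ada.Containers; use Ada.Containers;
with Ada.Numerics.Big_Numbers.Big_Integers;
use Ada.Numerics.Big_Numbers.Big_Integers;
with Ada.Text_IO;

procedure Big_Length_Demo is

   --  Same instantiation and renaming as in the patch
   package Conversions is new Signed_Conversions (Int => Count_Type);

   function Big (J : Count_Type) return Big_Integer renames
     Conversions.To_Big_Integer;

   --  Hypothetical bounded length, as a formal container would report it
   N : constant Count_Type := 3;

   --  The same length expressed as an unbounded Big_Natural
   L : constant Big_Natural := Big (N);

begin
   --  Comparisons now happen on big integers ...
   pragma Assert (L = To_Big_Integer (3));

   --  ... and converting back is possible while the value is in range
   pragma Assert (Conversions.From_Big_Integer (L) = N);

   Ada.Text_IO.Put_Line (To_String (L));
end Big_Length_Demo;
```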

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-cofuse.ads, libgnat/a-cofuse.adb,
libgnat/a-cofuma.ads, libgnat/a-cofuma.adb: Make Length and
Num_Overlaps return Big_Natural.
* libgnat/a-cforse.ads, libgnat/a-cforse.adb,
libgnat/a-cforma.adb, libgnat/a-cfhase.ads,
libgnat/a-cfhase.adb, libgnat/a-cfhama.adb,
libgnat/a-cfdlli.adb: Adapt code to handle Big_Integers instead
of Count_Type.

diff --git a/gcc/ada/libgnat/a-cfdlli.adb b/gcc/ada/libgnat/a-cfdlli.adb
--- a/gcc/ada/libgnat/a-cfdlli.adb
+++ b/gcc/ada/libgnat/a-cfdlli.adb
@@ -29,9 +29,17 @@ with Ada.Containers.Stable_Sorting; use Ada.Containers.Stable_Sorting;
 
 with System; use type System.Address;
 
+with Ada.Numerics.Big_Numbers.Big_Integers;
+use Ada.Numerics.Big_Numbers.Big_Integers;
+
 package body Ada.Containers.Formal_Doubly_Linked_Lists with
   SPARK_Mode => Off
 is
+   --  Convert Count_Type to Big_Interger
+
+   package Conversions is new Signed_Conversions (Int => Count_Type);
+   use Conversions;
+
---
-- Local Subprograms --
---
@@ -809,7 +817,7 @@ is
 
  while Position /= 0 loop
 R := P.Add (R, (Node => Position), I);
-pragma Assert (P.Length (R) = I);
+pragma Assert (P.Length (R) = To_Big_Integer (I));
 Position := Container.Nodes (Position).Next;
 I := I + 1;
  end loop;


diff --git a/gcc/ada/libgnat/a-cfhama.adb b/gcc/ada/libgnat/a-cfhama.adb
--- a/gcc/ada/libgnat/a-cfhama.adb
+++ b/gcc/ada/libgnat/a-cfhama.adb
@@ -33,6 +33,9 @@ pragma Elaborate_All (Ada.Containers.Hash_Tables.Generic_Formal_Keys);
 
 with Ada.Containers.Prime_Numbers; use Ada.Containers.Prime_Numbers;
 
+with Ada.Numerics.Big_Numbers.Big_Integers;
+use Ada.Numerics.Big_Numbers.Big_Integers;
+
 with System; use type System.Address;
 
 package body Ada.Containers.Formal_Hashed_Maps with
@@ -71,6 +74,13 @@ is
function Vet (Container : Map; Position : Cursor) return Boolean
  with Inline;
 
+   --  Convert Count_Type to Big_Interger
+
+   package Conversions is new Signed_Conversions (Int => Count_Type);
+
+   function Big (J : Count_Type) return Big_Integer renames
+ Conversions.To_Big_Integer;
+
--
-- Local Instantiations --
--
@@ -526,7 +536,7 @@ is
 
  while Position /= 0 loop
 R := P.Add (R, (Node => Position), I);
-pragma Assert (P.Length (R) = I);
+pragma Assert (P.Length (R) = Big (I));
 Position := HT_Ops.Next (Container.Content, Position);
 I := I + 1;
  end loop;


diff --git a/gcc/ada/libgnat/a-cfhase.adb b/gcc/ada/libgnat/a-cfhase.adb
--- a/gcc/ada/libgnat/a-cfhase.adb
+++ b/gcc/ada/libgnat/a-cfhase.adb
@@ -753,7 +753,7 @@ is
 
  while Position /= 0 loop
 R := P.Add (R, (Node => Position), I);
-pragma Assert (P.Length (R) = I);
+pragma Assert (P.Length (R) = Big (I));
 Position := HT_Ops.Next (Container.Content, Position);
 I := I + 1;
  end loop;


diff --git a/gcc/ada/libgnat/a-cfhase.ads b/gcc/ada/libgnat/a-cfhase.ads
--- a/gcc/ada/libgnat/a-cfhase.ads
+++ b/gcc/ada/libgnat/a-cfhase.ads
@@ -48,6 +48,8 @@
 with Ada.Containers.Functional_Maps;
 with Ada.Containers.Functional_Sets;
 with Ada.Containers.Functional_Vectors;
+with Ada.Numerics.Big_Numbers.Big_Integers;
+use Ada.Numerics.Big_Numbers.Big_Integers;
 private with Ada.Containers.Hash_Tables;
 
 generic
@@ -70,6 +72,13 @@ is
pragma Assertion_Policy (Contract_Cases => Ignore);
pragma Annotate (CodePeer, Skip_Analysis);
 
+   --  Convert Count_Type to Big_Interger.
+
+   package Conversions is new Signed_Conversions (Int => Count_Type);
+
+   function Big (J : Count_Type) return Big_Integer renames
+ Conversions.To_Big_Integer;
+
type Set (Capacity : Count_Type; Modulus : Hash_Type) is private with
  Iterable => (First   => First,
   Next=> Next,
@@ -261,7 +270,7 @@ is
 
 Ghost,
 Global => null,
-Post   => M.Length (Model'Result) = Length (Container);
+Post   => M.Length (Model'Result) = Big (Length (Container));
 
   function Elements (Container : Set) return E.Sequence with
   --  The Elements sequence represents the underlying list structure of
@@ -859,9 +868,9 @@ is
Length (Source) -

[Ada] Rename GNATprove annotate pragma for termination to Always_Return

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

GNATprove changed the name of the pragma Annotate used to verify that
a subprogram always returns normally. It is now called Always_Return
instead of Terminating.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-aridou.adb: Use Always_Return instead of Terminating
to annotate termination for GNATprove.
* libgnat/s-arit32.adb: Idem.
* libgnat/s-spcuop.ads: Idem.

diff --git a/gcc/ada/libgnat/s-aridou.adb b/gcc/ada/libgnat/s-aridou.adb
--- a/gcc/ada/libgnat/s-aridou.adb
+++ b/gcc/ada/libgnat/s-aridou.adb
@@ -162,7 +162,7 @@ is
 
function To_Neg_Int (A : Double_Uns) return Double_Int
with
- Annotate => (GNATprove, Terminating),
+ Annotate => (GNATprove, Always_Return),
  Pre  => In_Double_Int_Range (-Big (A)),
  Post => Big (To_Neg_Int'Result) = -Big (A);
--  Convert to negative integer equivalent. If the input is in the range
@@ -172,7 +172,7 @@ is
 
function To_Pos_Int (A : Double_Uns) return Double_Int
with
- Annotate => (GNATprove, Terminating),
+ Annotate => (GNATprove, Always_Return),
  Pre  => In_Double_Int_Range (Big (A)),
  Post => Big (To_Pos_Int'Result) = Big (A);
--  Convert to positive integer equivalent. If the input is in the range


diff --git a/gcc/ada/libgnat/s-arit32.adb b/gcc/ada/libgnat/s-arit32.adb
--- a/gcc/ada/libgnat/s-arit32.adb
+++ b/gcc/ada/libgnat/s-arit32.adb
@@ -104,7 +104,7 @@ is
 
function To_Neg_Int (A : Uns32) return Int32
with
- Annotate => (GNATprove, Terminating),
+ Annotate => (GNATprove, Always_Return),
  Pre  => In_Int32_Range (-Big (A)),
  Post => Big (To_Neg_Int'Result) = -Big (A);
--  Convert to negative integer equivalent. If the input is in the range
@@ -114,7 +114,7 @@ is
 
function To_Pos_Int (A : Uns32) return Int32
with
- Annotate => (GNATprove, Terminating),
+ Annotate => (GNATprove, Always_Return),
  Pre  => In_Int32_Range (Big (A)),
  Post => Big (To_Pos_Int'Result) = Big (A);
--  Convert to positive integer equivalent. If the input is in the range


diff --git a/gcc/ada/libgnat/s-spcuop.ads b/gcc/ada/libgnat/s-spcuop.ads
--- a/gcc/ada/libgnat/s-spcuop.ads
+++ b/gcc/ada/libgnat/s-spcuop.ads
@@ -45,7 +45,7 @@
 package System.SPARK.Cut_Operations with
   SPARK_Mode,
   Pure,
-  Annotate => (GNATprove, Terminating)
+  Annotate => (GNATprove, Always_Return)
 is
 
function By (Consequence, Premise : Boolean) return Boolean with

[Ada] Fix spurious use of trampolines with interface types

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

Function pointers must always be built with '[Unrestricted_]Access.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch3.adb (Init_Secondary_Tags.Initialize_Tag): Initialize the
Offset_Func component by means of 'Unrestricted_Access.

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -9506,7 +9506,7 @@ package body Exp_Ch3 is
 --   Prim_T   => Typ'Tag,
 --   Interface_T  => Iface'Tag,
 --   Offset_Value => n,
---   Offset_Func  => Fn'Address)
+--   Offset_Func  => Fn'Unrestricted_Access)
 
 Append_To (Stmts_List,
   Make_Procedure_Call_Statement (Loc,
@@ -9541,7 +9541,7 @@ package body Exp_Ch3 is
 Make_Attribute_Reference (Loc,
   Prefix => New_Occurrence_Of
   (DT_Offset_To_Top_Func (Tag_Comp), Loc),
-  Attribute_Name => Name_Address);
+  Attribute_Name => Name_Unrestricted_Access);
 
 --  In this case the next component stores the value of the offset
 --  to the top.

[Ada] Add precise subprogram identification to -fdiagnostics-format=json

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This commit adds information allowing identification of the subprogram
surrounding the message emitted by gnat when using -gnatdJ along with
-fdiagnostics-format=json.
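
The shape of the resulting span object can be sketched as below.  This is
an illustration only, not the code from errout.adb: the helper names, the
"file"/"line"/"column" attribute names and the format of the subprogram
name are assumptions made for the sketch, and no JSON escaping is done.

```ada
with Ada.Text_IO;

procedure JSON_Span_Demo is

   --  Image of a positive number without the leading blank
   function Img (N : Positive) return String is
      S : constant String := Positive'Image (N);
   begin
      return S (S'First + 1 .. S'Last);
   end Img;

   --  Hypothetical location object; attribute names are assumed
   function Location (File : String; Line, Col : Positive) return String is
     ("{""file"":""" & File & """,""line"":" & Img (Line)
      & ",""column"":" & Img (Col) & "}");

   --  Span with a "caret" and, optionally, the extra "subprogram" attribute
   function Span
     (File       : String;
      Line, Col  : Positive;
      Subprogram : String := "") return String
   is
     ("{""caret"":" & Location (File, Line, Col)
      & (if Subprogram /= ""
         then ",""subprogram"":""" & Subprogram & """"
         else "")
      & "}");

   Expected : constant String :=
     "{""caret"":{""file"":""main.adb"",""line"":3,""column"":7},"
     & """subprogram"":""Main.Helper""}";

begin
   pragma Assert (Span ("main.adb", 3, 7, "Main.Helper") = Expected);
   Ada.Text_IO.Put_Line (Span ("main.adb", 3, 7, "Main.Helper"));
end JSON_Span_Demo;
```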

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* errout.adb (Write_JSON_Span): Add subprogram name to emitted
JSON.

diff --git a/gcc/ada/errout.adb b/gcc/ada/errout.adb
--- a/gcc/ada/errout.adb
+++ b/gcc/ada/errout.adb
@@ -2097,11 +2097,14 @@ package body Errout is
   --  Write Sptr as a JSON location, an object containing a file attribute,
   --  a line number and a column number.
 
-  procedure Write_JSON_Span (Span : Source_Span);
-  --  Write Span as a JSON span, an object containing a "caret" attribute
-  --  whose value is the JSON location of Span.Ptr. If Span.First and
-  --  Span.Last are different from Span.Ptr, they will be printed as JSON
+  procedure Write_JSON_Span (Error : Error_Msg_Object);
+  --  Write Error as a JSON span, an object containing a "caret" attribute
+  --  whose value is the JSON location of Error.Sptr.Ptr. If Sptr.First and
+  --  Sptr.Last are different from Sptr.Ptr, they will be printed as JSON
   --  locations under the names "start" and "finish".
+  --  When Include_Subprogram_In_Messages is true (-gnatdJ) an additional,
+  --  non-standard, attribute named "subprogram" will be added, allowing
+  --  precisely identifying the subprogram surrounding the span.
 
   ---
   --  Is_Continuation  --
@@ -2162,7 +2165,8 @@ package body Errout is
   -- Write_JSON_Span --
   -
 
-  procedure Write_JSON_Span (Span : Source_Span) is
+  procedure Write_JSON_Span (Error : Error_Msg_Object) is
+ Span : constant Source_Span := Error.Sptr;
   begin
  Write_Str ("{""caret"":");
  Write_JSON_Location (Span.Ptr);
@@ -2177,6 +2181,11 @@ package body Errout is
 Write_JSON_Location (Span.Last);
  end if;
 
+ if Include_Subprogram_In_Messages then
+Write_Str
+  (",""subprogram"":""" & Subprogram_Name_Ptr (Error.Node) & """");
+ end if;
+
  Write_Str ("}");
   end Write_JSON_Span;
 
@@ -2210,7 +2219,7 @@ package body Errout is
   --  Print message location
 
   Write_Str (",""locations"":[");
-  Write_JSON_Span (Errors.Table (E).Sptr);
+  Write_JSON_Span (Errors.Table (E));
 
   if Errors.Table (E).Optr /= Errors.Table (E).Sptr.Ptr then
  Write_Str (",{""caret"":");

[Ada] Refine previous changes on making symbols public/private

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

Inline_Always procedures should be kept public for proper inter-unit
inlining.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch7.adb (Set_Referencer_Of_Non_Subprograms): New local
procedure, used for code refactoring. Also take into account
Inline_Always pragma when deciding to make a symbol public for
C generation.

diff --git a/gcc/ada/sem_ch7.adb b/gcc/ada/sem_ch7.adb
--- a/gcc/ada/sem_ch7.adb
+++ b/gcc/ada/sem_ch7.adb
@@ -304,6 +304,46 @@ package body Sem_Ch7 is
 Decl_Id : Entity_Id;
 In_Instance : Boolean;
 Spec: Node_Id;
+Ignore  : Boolean;
+
+function Set_Referencer_Of_Non_Subprograms return Boolean;
+--  Set Has_Referencer_Of_Non_Subprograms and call
+--  Scan_Subprogram_Refs if relevant.
+--  Return whether Scan_Subprogram_Refs was called.
+
+---
+-- Set_Referencer_Of_Non_Subprograms --
+---
+
+function Set_Referencer_Of_Non_Subprograms return Boolean is
+begin
+   --  An inlined subprogram body acts as a referencer
+   --  unless we generate C code since inlining is then
+   --  handled by the C compiler.
+
+   --  Note that we test Has_Pragma_Inline here in addition
+   --  to Is_Inlined. We are doing this for a client, since
+   --  we are computing which entities should be public, and
+   --  it is the client who will decide if actual inlining
+   --  should occur, so we need to catch all cases where the
+   --  subprogram may be inlined by the client.
+
+   if (not CCG_Mode or else Has_Pragma_Inline_Always (Decl_Id))
+ and then (Is_Inlined (Decl_Id)
+or else Has_Pragma_Inline (Decl_Id))
+   then
+  Has_Referencer_Of_Non_Subprograms := True;
+
+  --  Inspect the statements of the subprogram body
+  --  to determine whether the body references other
+  --  subprograms.
+
+  Scan_Subprogram_Refs (Decl);
+  return True;
+   else
+  return False;
+   end if;
+end Set_Referencer_Of_Non_Subprograms;
 
  begin
 if No (Decls) then
@@ -398,54 +438,17 @@ package body Sem_Ch7 is
 return True;
  end if;
 
- --  An inlined subprogram body acts as a referencer
- --  unless we generate C code since inlining is then
- --  handled by the C compiler.
-
- --  Note that we test Has_Pragma_Inline here in addition
- --  to Is_Inlined. We are doing this for a client, since
- --  we are computing which entities should be public, and
- --  it is the client who will decide if actual inlining
- --  should occur, so we need to catch all cases where the
- --  subprogram may be inlined by the client.
-
- if not CCG_Mode
-   and then (Is_Inlined (Decl_Id)
-  or else Has_Pragma_Inline (Decl_Id))
- then
-Has_Referencer_Of_Non_Subprograms := True;
-
---  Inspect the statements of the subprogram body
---  to determine whether the body references other
---  subprograms.
-
-Scan_Subprogram_Refs (Decl);
- end if;
+ Ignore := Set_Referencer_Of_Non_Subprograms;
 
   --  Otherwise this is a stand alone subprogram body
 
   else
  Decl_Id := Defining_Entity (Decl);
 
- --  An inlined subprogram body acts as a referencer
- --  unless we generate C code since inlining is then
- --  handled by the C compiler.
-
- if not CCG_Mode
-   and then (Is_Inlined (Decl_Id)
-  or else Has_Pragma_Inline (Decl_Id))
+ if not Set_Referencer_Of_Non_Subprograms
+   and then not Subprogram_Table.Get (Decl_Id)
  then
-Has_Referencer_Of_Non_Subprograms := True;
-
---  Inspect the statements of the subprogram body
---  to determine whether the body references other
---  subprograms.
-
-Scan_Subprogram_Refs (Decl);
-
- --  Otherwise we can reset Is_Public right away
-
-

[Ada] Fix spurious errors on ghost code in generics

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

After the recent fix for detecting illegal use of ghost entities in
code, spurious errors could be raised on generic code containing ghost
entities, because the ghost flags were wrongly set on entities copied
from the generic to the instantiation.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* atree.adb (New_Copy): Reset flags related to ghost entities
before marking the new node.

diff --git a/gcc/ada/atree.adb b/gcc/ada/atree.adb
--- a/gcc/ada/atree.adb
+++ b/gcc/ada/atree.adb
@@ -1775,6 +1775,11 @@ package body Atree is
 
  --  Mark the copy as Ghost depending on the current Ghost region
 
+ if Nkind (New_Id) in N_Entity then
+Set_Is_Checked_Ghost_Entity (New_Id, False);
+Set_Is_Ignored_Ghost_Entity (New_Id, False);
+ end if;
+
  Mark_New_Ghost_Node (New_Id);
 
  New_Node_Debugging_Output (New_Id);

[Ada] Move registering code for predefined primitives to Exp_Disp

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This avoids making Expand_Interface_Thunk visible from the outside.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch6.adb (Freeze_Subprogram.Register_Predefined_DT_Entry): Move
procedure to...
* exp_disp.ads (Expand_Interface_Thunk): Move declaration to...
(Register_Predefined_Primitive): Declare.
* exp_disp.adb (Expand_Interface_Thunk): ...here.
(Register_Predefined_Primitive): ...here and change into a function
returning List_Id.

diff --git a/gcc/ada/exp_ch6.adb b/gcc/ada/exp_ch6.adb
--- a/gcc/ada/exp_ch6.adb
+++ b/gcc/ada/exp_ch6.adb
@@ -7828,109 +7828,9 @@ package body Exp_Ch6 is
---
 
procedure Freeze_Subprogram (N : Node_Id) is
-  Loc : constant Source_Ptr := Sloc (N);
-
-  procedure Register_Predefined_DT_Entry (Prim : Entity_Id);
-  --  (Ada 2005): Register a predefined primitive in all the secondary
-  --  dispatch tables of its primitive type.
-
-  --
-  -- Register_Predefined_DT_Entry --
-  --
-
-  procedure Register_Predefined_DT_Entry (Prim : Entity_Id) is
- Iface_DT_Ptr : Elmt_Id;
- L: List_Id;
- Tagged_Typ   : Entity_Id;
- Thunk_Id : Entity_Id;
- Thunk_Code   : Node_Id;
-
-  begin
- Tagged_Typ := Find_Dispatching_Type (Prim);
-
- if No (Access_Disp_Table (Tagged_Typ))
-   or else not Has_Interfaces (Tagged_Typ)
-   or else not RTE_Available (RE_Interface_Tag)
-   or else Restriction_Active (No_Dispatching_Calls)
- then
-return;
- end if;
-
- --  Skip the first two access-to-dispatch-table pointers since they
- --  leads to the primary dispatch table (predefined DT and user
- --  defined DT). We are only concerned with the secondary dispatch
- --  table pointers. Note that the access-to- dispatch-table pointer
- --  corresponds to the first implemented interface retrieved below.
-
- Iface_DT_Ptr :=
-   Next_Elmt (Next_Elmt (First_Elmt (Access_Disp_Table (Tagged_Typ))));
-
- while Present (Iface_DT_Ptr)
-   and then Ekind (Node (Iface_DT_Ptr)) = E_Constant
- loop
-pragma Assert (Has_Thunks (Node (Iface_DT_Ptr)));
-Expand_Interface_Thunk (Prim, Thunk_Id, Thunk_Code,
-  Iface => Related_Type (Node (Iface_DT_Ptr)));
-
-if Present (Thunk_Code) then
-   L := New_List (
- Thunk_Code,
-
- Build_Set_Predefined_Prim_Op_Address (Loc,
-   Tag_Node =>
- New_Occurrence_Of (Node (Next_Elmt (Iface_DT_Ptr)), Loc),
-   Position => DT_Position (Prim),
-   Address_Node =>
- Unchecked_Convert_To (RTE (RE_Prim_Ptr),
-   Make_Attribute_Reference (Loc,
- Prefix => New_Occurrence_Of (Thunk_Id, Loc),
- Attribute_Name => Name_Unrestricted_Access))),
-
- Build_Set_Predefined_Prim_Op_Address (Loc,
-   Tag_Node =>
- New_Occurrence_Of
-  (Node (Next_Elmt (Next_Elmt (Next_Elmt (Iface_DT_Ptr)))),
-   Loc),
-   Position => DT_Position (Prim),
-   Address_Node =>
- Unchecked_Convert_To (RTE (RE_Prim_Ptr),
-   Make_Attribute_Reference (Loc,
- Prefix => New_Occurrence_Of (Prim, Loc),
- Attribute_Name => Name_Unrestricted_Access))));
-
-   if No (Actions (N)) then
-  Set_Actions (N, L);
-
-   else
-  Append_List (L, Actions (N));
-   end if;
-end if;
-
---  Skip the tag of the predefined primitives dispatch table
-
-Next_Elmt (Iface_DT_Ptr);
-pragma Assert (Has_Thunks (Node (Iface_DT_Ptr)));
-
---  Skip tag of the no-thunks dispatch table
-
-Next_Elmt (Iface_DT_Ptr);
-pragma Assert (not Has_Thunks (Node (Iface_DT_Ptr)));
-
---  Skip tag of predefined primitives no-thunks dispatch table
-
-Next_Elmt (Iface_DT_Ptr);
-pragma Assert (not Has_Thunks (Node (Iface_DT_Ptr)));
-
-Next_Elmt (Iface_DT_Ptr);
- end loop;
-  end Register_Predefined_DT_Entry;
-
-  --  Local variables
-
+  Loc  : constant Source_Ptr := Sloc (N);
   Subp : constant Entity_Id  := Entity (N);
 
-   --  Start of processing for Freeze_Subprogram
-
begin
   --  We suppress the initialization of the dispatch table entry when
   --  not Tagged_Type_Expansion because the dispatching mechanism is
@@ -7985,10 +7885,12 @@ package body

[Ada] Build static dispatch tables always at the end of declarative part

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

The static dispatch tables of library-level tagged types are either built
on the first object declaration or at the end of the declarative part of
the package spec or body. There is no real need for the former case, and
the tables are not built for other constructs that freeze (tagged) types.

Therefore this change removes the former case, thus causing the tables to
be always built at the end of the declarative part; that's orthogonal to
freezing and the tagged types are still frozen at the appropriate place.

Moreover, it wraps the code in the Actions list of a freeze node (like
for the nonstatic case) so that it is considered elaboration code by the
processing done in Sem_Elab and does not disturb it.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* exp_ch3.adb (Expand_Freeze_Record_Type): Adjust comment.
(Expand_N_Object_Declaration): Do not build static dispatch tables.
* exp_disp.adb (Make_And_Insert_Dispatch_Table): New procedure.
(Build_Static_Dispatch_Tables): Call it to build the dispatch tables
and wrap them in the Actions list of a freeze node.

diff --git a/gcc/ada/exp_ch3.adb b/gcc/ada/exp_ch3.adb
--- a/gcc/ada/exp_ch3.adb
+++ b/gcc/ada/exp_ch3.adb
@@ -5763,7 +5763,7 @@ package body Exp_Ch3 is
 
--  Generate dispatch table of locally defined tagged type.
--  Dispatch tables of library level tagged types are built
-   --  later (see Analyze_Declarations).
+   --  later (see Build_Static_Dispatch_Tables).
 
if not Building_Static_DT (Typ) then
   Append_Freeze_Actions (Typ, Make_DT (Typ));
@@ -6907,37 +6907,6 @@ package body Exp_Ch3 is
  return;
   end if;
 
-  --  First we do special processing for objects of a tagged type where
-  --  this is the point at which the type is frozen. The creation of the
-  --  dispatch table and the initialization procedure have to be deferred
-  --  to this point, since we reference previously declared primitive
-  --  subprograms.
-
-  --  Force construction of dispatch tables of library level tagged types
-
-  if Tagged_Type_Expansion
-and then Building_Static_Dispatch_Tables
-and then Is_Library_Level_Entity (Def_Id)
-and then Is_Library_Level_Tagged_Type (Base_Typ)
-and then Ekind (Base_Typ) in E_Record_Type
-   | E_Protected_Type
-   | E_Task_Type
-and then not Has_Dispatch_Table (Base_Typ)
-  then
- declare
-New_Nodes : List_Id := No_List;
-
- begin
-if Is_Concurrent_Type (Base_Typ) then
-   New_Nodes := Make_DT (Corresponding_Record_Type (Base_Typ));
-else
-   New_Nodes := Make_DT (Base_Typ);
-end if;
-
-Insert_List_Before (N, New_Nodes);
- end;
-  end if;
-
   --  Make shared memory routines for shared passive variable
 
   if Is_Shared_Passive (Def_Id) then


diff --git a/gcc/ada/exp_disp.adb b/gcc/ada/exp_disp.adb
--- a/gcc/ada/exp_disp.adb
+++ b/gcc/ada/exp_disp.adb
@@ -358,6 +358,12 @@ package body Exp_Disp is
   procedure Build_Package_Dispatch_Tables (N : Node_Id);
   --  Build static dispatch tables associated with package declaration N
 
+  procedure Make_And_Insert_Dispatch_Table (Typ : Entity_Id);
+  --  Build the dispatch table of the tagged type Typ and insert it at the
+  --  end of Target_List after wrapping it in the Actions list of a freeze
+  --  node, so that it is skipped by Sem_Elab (Expand_Freeze_Record_Type
+  --  does the same for nonstatic dispatch tables).
+
   ---
   -- Build_Dispatch_Tables --
   ---
@@ -410,8 +416,7 @@ package body Exp_Disp is
then
   null;
else
-  Insert_List_After_And_Analyze (Last (Target_List),
-Make_DT (Defining_Entity (D)));
+  Make_And_Insert_Dispatch_Table (Defining_Entity (D));
end if;
 
 --  Handle private types of library level tagged types. We must
@@ -434,8 +439,7 @@ package body Exp_Disp is
 and then not Is_Concurrent_Type (E2)
   then
  Exchange_Declarations (E1);
- Insert_List_After_And_Analyze (Last (Target_List),
-   Make_DT (E1));
+ Make_And_Insert_Dispatch_Table (E1);
  Exchange_Declarations (E2);
   end if;
end;
@@ -469,6 +473,25 @@ package body Exp_Disp is
  Pop_Scope;
   end Build_Package_Dispatch_Tables;
 
+  
+  -- Make_And_Insert_Dispatch_Table --
+  
+
+  procedure

[Ada] vx7r2-arm/aarch64 - Support Atomic Primitives True

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This feature is an architecture feature, not an OS feature, so enable it
on vx7r2 for arm and aarch64, consistent with what is done on similarly
capable targets.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/system-vxworks7-arm.ads (Support_Atomic_Primitives):
Set True.
* libgnat/system-vxworks7-arm-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-aarch64.ads: Likewise.
* libgnat/system-vxworks7-aarch64-rtp-smp.ads: Likewise.

diff --git a/gcc/ada/libgnat/system-vxworks7-aarch64-rtp-smp.ads b/gcc/ada/libgnat/system-vxworks7-aarch64-rtp-smp.ads
--- a/gcc/ada/libgnat/system-vxworks7-aarch64-rtp-smp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-aarch64-rtp-smp.ads
@@ -151,6 +151,7 @@ private
Stack_Check_Probes: constant Boolean := True;
Stack_Check_Limits: constant Boolean := False;
Support_Aggregates: constant Boolean := True;
+   Support_Atomic_Primitives : constant Boolean := True;
Support_Composite_Assign  : constant Boolean := True;
Support_Composite_Compare : constant Boolean := True;
Support_Long_Shifts   : constant Boolean := True;


diff --git a/gcc/ada/libgnat/system-vxworks7-aarch64.ads b/gcc/ada/libgnat/system-vxworks7-aarch64.ads
--- a/gcc/ada/libgnat/system-vxworks7-aarch64.ads
+++ b/gcc/ada/libgnat/system-vxworks7-aarch64.ads
@@ -148,6 +148,7 @@ private
Stack_Check_Probes: constant Boolean := True;
Stack_Check_Limits: constant Boolean := False;
Support_Aggregates: constant Boolean := True;
+   Support_Atomic_Primitives : constant Boolean := True;
Support_Composite_Assign  : constant Boolean := True;
Support_Composite_Compare : constant Boolean := True;
Support_Long_Shifts   : constant Boolean := True;


diff --git a/gcc/ada/libgnat/system-vxworks7-arm-rtp-smp.ads b/gcc/ada/libgnat/system-vxworks7-arm-rtp-smp.ads
--- a/gcc/ada/libgnat/system-vxworks7-arm-rtp-smp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-arm-rtp-smp.ads
@@ -148,6 +148,7 @@ private
Stack_Check_Probes: constant Boolean := True;
Stack_Check_Limits: constant Boolean := False;
Support_Aggregates: constant Boolean := True;
+   Support_Atomic_Primitives : constant Boolean := True;
Support_Composite_Assign  : constant Boolean := True;
Support_Composite_Compare : constant Boolean := True;
Support_Long_Shifts   : constant Boolean := True;


diff --git a/gcc/ada/libgnat/system-vxworks7-arm.ads b/gcc/ada/libgnat/system-vxworks7-arm.ads
--- a/gcc/ada/libgnat/system-vxworks7-arm.ads
+++ b/gcc/ada/libgnat/system-vxworks7-arm.ads
@@ -146,6 +146,7 @@ private
Stack_Check_Probes: constant Boolean := True;
Stack_Check_Limits: constant Boolean := False;
Support_Aggregates: constant Boolean := True;
+   Support_Atomic_Primitives : constant Boolean := True;
Support_Composite_Assign  : constant Boolean := True;
Support_Composite_Compare : constant Boolean := True;
Support_Long_Shifts   : constant Boolean := True;

[Ada] Update documentation of GNAT.Dynamic_Tables

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

Document that the components of dynamic tables are not defined as
aliased, for backward compatibility.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/g-dyntab.ads (Table_Type): Update documentation.

diff --git a/gcc/ada/libgnat/g-dyntab.ads b/gcc/ada/libgnat/g-dyntab.ads
--- a/gcc/ada/libgnat/g-dyntab.ads
+++ b/gcc/ada/libgnat/g-dyntab.ads
@@ -82,10 +82,6 @@ package GNAT.Dynamic_Tables is
--  freely (expensive reallocation occurs only at major granularity
--  chunks controlled by the allocation parameters).
 
-   --  Note: we do not make the table components aliased, since this would
-   --  restrict the use of table for discriminated types. If it is necessary
-   --  to take the access of a table element, use Unrestricted_Access.
-
--  WARNING: On HPPA, the virtual addressing approach used in this unit is
--  incompatible with the indexing instructions on the HPPA. So when using
--  this unit, compile your application with -mdisable-indexing.
@@ -120,9 +116,10 @@ package GNAT.Dynamic_Tables is
--  freely (expensive reallocation occurs only at major granularity
--  chunks controlled by the allocation parameters).
 
-   --  Note: we do not make the table components aliased, since this would
-   --  restrict the use of table for discriminated types. If it is necessary
-   --  to take the access of a table element, use Unrestricted_Access.
+   --  Note: For backward compatibility we do not make the table components
+   --  aliased, since for Ada 95 this would have restricted the use of tables
+   --  for discriminated types. If it is necessary to take the access of a
+   --  table element, use Unrestricted_Access.
 
type Table_Type is
  array (Valid_Table_Index_Type range <>) of Table_Component_Type;

[Ada] Fix detection of deferred constants for freezing error

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

Testing Is_Frozen is not robust enough, so instead test that the full view
has been seen and that the Has_Completion flag is set on it.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* freeze.adb (Check_Expression_Function.Find_Constant): Make test
for deferred constants more robust.

diff --git a/gcc/ada/freeze.adb b/gcc/ada/freeze.adb
--- a/gcc/ada/freeze.adb
+++ b/gcc/ada/freeze.adb
@@ -1458,7 +1458,8 @@ package body Freeze is
  N_Object_Declaration
and then not Is_Imported (Entity (Nod))
and then not Has_Completion (Entity (Nod))
-   and then not Is_Frozen (Entity (Nod))
+   and then not (Present (Full_View (Entity (Nod)))
+  and then Has_Completion (Full_View (Entity (Nod))))
  then
 Error_Msg_NE
   ("premature use of& in call or instance", N, Entity (Nod));

[Ada] Fix preconditions of Interfaces.C.Strings

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

The preconditions of the Update procedures were always true when Offset was 0.
The updated preconditions also protect against Update_Error when Offset is 0.
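
As a rough illustration of why the old condition was vacuous, here is a
minimal C sketch. It assumes only that size_t is a modular (wrap-around)
type, as it is in Interfaces.C; the function names old_check and sum_fits
are hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Transcription of the old precondition:
   Offset <= Strlen (Item) - Chars'Length.
   size_t arithmetic is modular, so the subtraction wraps around
   instead of going negative. */
static bool old_check(size_t offset, size_t item_len, size_t chars_len)
{
    return offset <= item_len - chars_len;
}

/* Overflow guard in the style of the new first conjunct
   (Strlen (Item) <= size_t'Last - Offset): the sum a + b is
   representable exactly when a <= SIZE_MAX - b. */
static bool sum_fits(size_t a, size_t b)
{
    return a <= SIZE_MAX - b;
}
```

With an offset of 0 the old comparison holds for every pair of lengths,
since no unsigned value is below zero. The sum_fits helper mirrors the
shape of the guard that the patch places in front of the addition, so the
sum cannot wrap around before it is compared.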

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/i-cstrin.ads (Update): Update precondition.

diff --git a/gcc/ada/libgnat/i-cstrin.ads b/gcc/ada/libgnat/i-cstrin.ads
--- a/gcc/ada/libgnat/i-cstrin.ads
+++ b/gcc/ada/libgnat/i-cstrin.ads
@@ -120,7 +120,10 @@ is
with
  Pre=>
Item /= Null_Ptr
- and then (if Check then Offset <= Strlen (Item) - Chars'Length),
+ and then
+  (if Check then
+ Strlen (Item) <= size_t'Last - Offset
+   and then Strlen (Item) + Offset <= Chars'Length),
  Global => (In_Out => C_Memory);
 
procedure Update
@@ -131,7 +134,10 @@ is
with
  Pre=>
Item /= Null_Ptr
- and then (if Check then Offset <= Strlen (Item) - Str'Length),
+ and then
+  (if Check then
+ Strlen (Item) <= size_t'Last - Offset
+   and then Strlen (Item) + Offset <= Str'Length),
  Global => (In_Out => C_Memory);
 
Update_Error : exception;

[Ada] Issue errors on wrong context for ghost entities

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

References to ghost entities should only occur in ghost context.  This
was not checked systematically on all references.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_ch2.adb (Analyze_Identifier): Add checking for ghost
context.
* sem_ch5.adb (Analyze_Implicit_Label_Declaration): Treat
implicit labels like other entities by setting their ghost
status according to context.
* ghost.adb (Check_Ghost_Context): Adapt checking.

diff --git a/gcc/ada/ghost.adb b/gcc/ada/ghost.adb
--- a/gcc/ada/ghost.adb
+++ b/gcc/ada/ghost.adb
@@ -365,6 +365,17 @@ package body Ghost is
if Is_Ghost_Pragma (Prag) then
   return True;
 
+   --  A pragma may not be analyzed, so that its Ghost status is
+   --  not determined yet, but it is guaranteed to be Ghost when
+   --  referencing a Ghost entity.
+
+   elsif Prag_Nam in Name_Annotate
+   | Name_Compile_Time_Error
+   | Name_Compile_Time_Warning
+   | Name_Unreferenced
+   then
+  return True;
+
--  An assertion expression pragma is Ghost when it contains a
--  reference to a Ghost entity (SPARK RM 6.9(10)), except for
--  predicate pragmas (SPARK RM 6.9(11)).
@@ -444,14 +455,6 @@ package body Ghost is
  if Ghost_Mode > None then
 return True;
 
- --  A Ghost type may be referenced in a use_type clause
- --  (SPARK RM 6.9.10).
-
- elsif Present (Parent (Context))
-   and then Nkind (Parent (Context)) = N_Use_Type_Clause
- then
-return True;
-
  --  Routine Expand_Record_Extension creates a parent subtype without
  --  inserting it into the tree. There is no good way of recognizing
  --  this special case as there is no parent. Try to approximate the
@@ -482,6 +485,46 @@ package body Ghost is
then
   return True;
 
+   --  A Ghost type may be referenced in a use or use_type clause
+   --  (SPARK RM 6.9(10)).
+
+   elsif Present (Parent (Par))
+ and then Nkind (Parent (Par)) in N_Use_Package_Clause
+| N_Use_Type_Clause
+   then
+  return True;
+
+   --  The context is an attribute definition clause for a Ghost
+   --  entity.
+
+   elsif Nkind (Parent (Par)) = N_Attribute_Definition_Clause
+ and then Par = Name (Parent (Par))
+   then
+  return True;
+
+   --  The context is the instantiation or renaming of a Ghost
+   --  entity.
+
+   elsif Nkind (Parent (Par)) in N_Generic_Instantiation
+   | N_Renaming_Declaration
+   | N_Generic_Renaming_Declaration
+   and then Par = Name (Parent (Par))
+   then
+  return True;
+
+   --  In the context of an instantiation, accept currently Ghost
+   --  arguments for formal subprograms, as SPARK does not provide
+   --  a way to distinguish Ghost formal parameters from non-Ghost
+   --  ones. Illegal use of such arguments in a non-Ghost context
+   --  will lead to errors inside the instantiation.
+
+   elsif Nkind (Parent (Par)) = N_Generic_Association
+ and then (Nkind (Par) in N_Has_Entity
+and then Present (Entity (Par))
+and then Is_Subprogram (Entity (Par)))
+   then
+  return True;
+
elsif Is_OK_Declaration (Par) then
   return True;
 
@@ -593,6 +636,13 @@ package body Ghost is
  return;
   end if;
 
+  --  When assertions are enabled, compiler generates code for ghost
+  --  entities, that is not subject to Ghost policy.
+
+  if not Comes_From_Source (Ghost_Ref) then
+ return;
+  end if;
+
   --  Once it has been established that the reference to the Ghost entity
   --  is within a suitable context, ensure that the policy at the point of
   --  declaration and at the point of use match.


diff --git a/gcc/ada/sem_ch2.adb b/gcc/ada/sem_ch2.adb
--- a/gcc/ada/sem_ch2.adb
+++ b/gcc/ada/sem_ch2.adb
@@ -26,6 +26,7 @@
 with Atree;  use Atree;
 with Einfo;  use Einfo;
 with Einfo.Utils;use Einfo.Utils;
+with Ghost;  use Ghost;
 with Namet;  use Namet;
 with Opt;use Opt;
 with Restrict;   use Restrict;
@@ -34,6 +35,7 @@ with Sem_Ch8;use Sem_Ch8;
 with Sem_Dim;use Sem_Dim;
 with Sinfo;  use Sinfo;
 with Sinfo.Nodes;use

[Ada] Complete contracts of Interfaces.C.Strings subprograms

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This patch adds preconditions to Update procedures, to protect from
Update_Error propagations.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/i-cstrin.ads (Update): Add precondition.

diff --git a/gcc/ada/libgnat/i-cstrin.ads b/gcc/ada/libgnat/i-cstrin.ads
--- a/gcc/ada/libgnat/i-cstrin.ads
+++ b/gcc/ada/libgnat/i-cstrin.ads
@@ -36,7 +36,8 @@
 --  Preconditions in this unit are meant for analysis only, not for run-time
 --  checking, so that the expected exceptions are raised. This is enforced by
 --  setting the corresponding assertion policy to Ignore. These preconditions
---  do not protect against Storage_Error.
+--  protect from Dereference_Error and Update_Error, but not from
+--  Storage_Error.
 
 pragma Assertion_Policy (Pre => Ignore);
 
@@ -117,7 +118,9 @@ is
   Chars  : char_array;
   Check  : Boolean := True)
with
- Pre=> Item /= Null_Ptr,
+ Pre=>
+   Item /= Null_Ptr
+ and then (if Check then Offset <= Strlen (Item) - Chars'Length),
  Global => (In_Out => C_Memory);
 
procedure Update
@@ -126,7 +129,9 @@ is
   Str: String;
   Check  : Boolean := True)
with
- Pre=> Item /= Null_Ptr,
+ Pre=>
+   Item /= Null_Ptr
+ and then (if Check then Offset <= Strlen (Item) - Str'Length),
  Global => (In_Out => C_Memory);
 
Update_Error : exception;

[Ada] Propagate Has_Inherit{able,ed}_Invariants to base types

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

The two flags apply to base types only, like Has_Own_Invariants.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* sem_util.adb (Propagate_DIC_Attributes): Add ??? comment.
(Propagate_Invariant_Attributes): Likewise.  Propagate the
Has_Inheritable_Invariants and Has_Inherited_Invariants to
the base type of the target type.

diff --git a/gcc/ada/sem_util.adb b/gcc/ada/sem_util.adb
--- a/gcc/ada/sem_util.adb
+++ b/gcc/ada/sem_util.adb
@@ -27312,7 +27312,7 @@ package body Sem_Util is
  --  The setting of the attributes is intentionally conservative. This
  --  prevents accidental clobbering of enabled attributes. We need to
  --  call Base_Type twice, because it is sometimes not set to an actual
- --  base type.
+ --  base type???
 
  if Has_Inherited_DIC (From_Typ) then
 Set_Has_Inherited_DIC (Base_Type (Base_Type (Typ)));
@@ -27362,14 +27362,14 @@ package body Sem_Util is
  --  The setting of the attributes is intentionally conservative. This
  --  prevents accidental clobbering of enabled attributes. We need to
  --  call Base_Type twice, because it is sometimes not set to an actual
- --  base type.
+ --  base type???
 
  if Has_Inheritable_Invariants (From_Typ) then
-Set_Has_Inheritable_Invariants (Typ);
+Set_Has_Inheritable_Invariants (Base_Type (Base_Type (Typ)));
  end if;
 
  if Has_Inherited_Invariants (From_Typ) then
-Set_Has_Inherited_Invariants (Typ);
+Set_Has_Inherited_Invariants (Base_Type (Base_Type (Typ)));
  end if;
 
  if Has_Own_Invariants (From_Typ) then

[Ada] Combine system.ads files - arm and aarch64 linux

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

Systematize the Word_Size and Memory_Size declarations rather than
hard-coding numerical values or the OS-specific Long_Integer size.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/system-linux-arm.ads (Memory_Size): Compute based on
Word_Size.

diff --git a/gcc/ada/libgnat/system-linux-arm.ads b/gcc/ada/libgnat/system-linux-arm.ads
--- a/gcc/ada/libgnat/system-linux-arm.ads
+++ b/gcc/ada/libgnat/system-linux-arm.ads
@@ -70,7 +70,7 @@ package System is
 
Storage_Unit : constant := 8;
Word_Size: constant := Standard'Word_Size;
-   Memory_Size  : constant := 2 ** Long_Integer'Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison

[Ada] Combine system.ads file - vxworks7 rtp constants

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

Systematize the Word_Size and Memory_Size declarations rather than
hard-coding numerical values or the OS-specific Long_Integer size.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/system-vxworks7-aarch64-rtp-smp.ads (Word_Size):
Compute based on Standard'Word_Size. (Memory_Size): Compute
based on Word_Size.
* libgnat/system-vxworks7-arm-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-e500-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-e500-rtp.ads: Likewise.
* libgnat/system-vxworks7-ppc-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-ppc-rtp.ads: Likewise.
* libgnat/system-vxworks7-ppc64-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-x86-rtp-smp.ads: Likewise.
* libgnat/system-vxworks7-x86-rtp.ads: Likewise.

diff --git a/gcc/ada/libgnat/system-vxworks7-aarch64-rtp-smp.ads b/gcc/ada/libgnat/system-vxworks7-aarch64-rtp-smp.ads
--- a/gcc/ada/libgnat/system-vxworks7-aarch64-rtp-smp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-aarch64-rtp-smp.ads
@@ -71,8 +71,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 64;
-   Memory_Size  : constant := 2 ** 64;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-arm-rtp-smp.ads b/gcc/ada/libgnat/system-vxworks7-arm-rtp-smp.ads
--- a/gcc/ada/libgnat/system-vxworks7-arm-rtp-smp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-arm-rtp-smp.ads
@@ -71,8 +71,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-e500-rtp-smp.ads b/gcc/ada/libgnat/system-vxworks7-e500-rtp-smp.ads
--- a/gcc/ada/libgnat/system-vxworks7-e500-rtp-smp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-e500-rtp-smp.ads
@@ -71,8 +71,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-e500-rtp.ads b/gcc/ada/libgnat/system-vxworks7-e500-rtp.ads
--- a/gcc/ada/libgnat/system-vxworks7-e500-rtp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-e500-rtp.ads
@@ -71,8 +71,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-ppc-rtp-smp.ads b/gcc/ada/libgnat/system-vxworks7-ppc-rtp-smp.ads
--- a/gcc/ada/libgnat/system-vxworks7-ppc-rtp-smp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-ppc-rtp-smp.ads
@@ -71,8 +71,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-ppc-rtp.ads b/gcc/ada/libgnat/system-vxworks7-ppc-rtp.ads
--- a/gcc/ada/libgnat/system-vxworks7-ppc-rtp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-ppc-rtp.ads
@@ -71,8 +71,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-ppc64-rtp-smp.ads b/gcc/ada/libgnat/system-vxworks7-ppc64-rtp-smp.ads
--- a/gcc/ada/libgnat/system-vxworks7-ppc64-rtp-smp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-ppc64-rtp-smp.ads
@@ -71,8 +71,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 64;
-   Memory_Size  : constant := 2 ** 64;
+   Word_Size: constant := Standard'Word_Size;
+   Memory_Size  : constant := 2 ** Word_Size;
 
--  Address comparison
 


diff --git a/gcc/ada/libgnat/system-vxworks7-x86-rtp-smp.ads b/gcc/ada/libgnat/system-vxworks7-x86-rtp-smp.ads
--- a/gcc/ada/libgnat/system-vxworks7-x86-rtp-smp.ads
+++ b/gcc/ada/libgnat/system-vxworks7-x86-rtp-smp.ads
@@ -69,8 +69,8 @@ package System is
Null_Address : constant Address;
 
Storage_Unit : constant := 8;
-   Word_Size: constant := 32;
-   Memory_Size  : constant := 2 ** 32;
+   Word_Size:

[Ada] Gnatbind crash during checksum calculation

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This patch corrects an error in the compiler whereby gnatbind may crash
during calculation of file checksums in certain corner cases due to
uninitialized lookup tables.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* gnatbind.adb (Gnatbind): Add initialize call for Uintp
* gnatls.adb (Gnatls): Likewise.
* gprep.adb (Gnatprep): Likewise.
* make.adb (Initialize): Likewise.

diff --git a/gcc/ada/gnatbind.adb b/gcc/ada/gnatbind.adb
--- a/gcc/ada/gnatbind.adb
+++ b/gcc/ada/gnatbind.adb
@@ -50,6 +50,7 @@ with Switch;   use Switch;
 with Switch.B; use Switch.B;
 with Targparm; use Targparm;
 with Types;use Types;
+with Uintp;
 
 with System.Case_Util; use System.Case_Util;
 with System.Response_File;
@@ -617,6 +618,7 @@ begin
--  is in some cases important.
 
Csets.Initialize;
+   Uintp.Initialize;
Snames.Initialize;
 
--  Scan the switches and arguments. Note that Snames must already be


diff --git a/gcc/ada/gnatls.adb b/gcc/ada/gnatls.adb
--- a/gcc/ada/gnatls.adb
+++ b/gcc/ada/gnatls.adb
@@ -44,6 +44,7 @@ with Snames;
 with Stringt;
 with Switch;  use Switch;
 with Types;   use Types;
+with Uintp;
 
 with GNAT.Case_Util;use GNAT.Case_Util;
 with GNAT.Command_Line; use GNAT.Command_Line;
@@ -2023,6 +2024,7 @@ begin
--  Initialize standard packages
 
Csets.Initialize;
+   Uintp.Initialize;
Snames.Initialize;
Stringt.Initialize;
 


diff --git a/gcc/ada/gprep.adb b/gcc/ada/gprep.adb
--- a/gcc/ada/gprep.adb
+++ b/gcc/ada/gprep.adb
@@ -37,6 +37,7 @@ with Snames;
 with Stringt;  use Stringt;
 with Switch;   use Switch;
 with Types;use Types;
+with Uintp;
 
 with Ada.Command_Line; use Ada.Command_Line;
 with Ada.Text_IO;  use Ada.Text_IO;
@@ -169,6 +170,7 @@ package body GPrep is
   --  Do some initializations (order is important here)
 
   Csets.Initialize;
+  Uintp.Initialize;
   Snames.Initialize;
   Stringt.Initialize;
   Prep.Initialize;


diff --git a/gcc/ada/make.adb b/gcc/ada/make.adb
--- a/gcc/ada/make.adb
+++ b/gcc/ada/make.adb
@@ -44,6 +44,7 @@ with SFN_Scan;
 with Sinput;
 with Snames;
 with Stringt;
+with Uintp;
 
 pragma Warnings (Off);
 with System.HTable;
@@ -3676,6 +3677,7 @@ package body Make is
   Linker_Switches.Init;
 
   Csets.Initialize;
+  Uintp.Initialize;
   Snames.Initialize;
   Stringt.Initialize;

[Ada] Fix iteration on formal vectors

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

We need to use Extended_Index for the Position parameter of the Element
function in formal vectors so it is compatible with other primitives of
the Iterable aspect.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/a-cfinve.ads (Element): Change the type of the
Position parameter to Extended_Index.
* libgnat/a-cfinve.adb (Element): Idem.
* libgnat/a-cofove.ads (Element): Idem.
* libgnat/a-cofove.adb (Element): Idem.

diff --git a/gcc/ada/libgnat/a-cfinve.adb b/gcc/ada/libgnat/a-cfinve.adb
--- a/gcc/ada/libgnat/a-cfinve.adb
+++ b/gcc/ada/libgnat/a-cfinve.adb
@@ -432,7 +432,7 @@ is
 
function Element
  (Container : Vector;
-  Index : Index_Type) return Element_Type
+  Index : Extended_Index) return Element_Type
is
begin
   if Index > Container.Last then


diff --git a/gcc/ada/libgnat/a-cfinve.ads b/gcc/ada/libgnat/a-cfinve.ads
--- a/gcc/ada/libgnat/a-cfinve.ads
+++ b/gcc/ada/libgnat/a-cfinve.ads
@@ -284,7 +284,7 @@ is
 
function Element
  (Container : Vector;
-  Index : Index_Type) return Element_Type
+  Index : Extended_Index) return Element_Type
with
  Global => null,
  Pre=> Index in First_Index (Container) .. Last_Index (Container),


diff --git a/gcc/ada/libgnat/a-cofove.adb b/gcc/ada/libgnat/a-cofove.adb
--- a/gcc/ada/libgnat/a-cofove.adb
+++ b/gcc/ada/libgnat/a-cofove.adb
@@ -370,7 +370,7 @@ is
 
function Element
  (Container : Vector;
-  Index : Index_Type) return Element_Type
+  Index : Extended_Index) return Element_Type
is
begin
   if Index > Container.Last then


diff --git a/gcc/ada/libgnat/a-cofove.ads b/gcc/ada/libgnat/a-cofove.ads
--- a/gcc/ada/libgnat/a-cofove.ads
+++ b/gcc/ada/libgnat/a-cofove.ads
@@ -263,7 +263,7 @@ is
 
function Element
  (Container : Vector;
-  Index : Index_Type) return Element_Type
+  Index : Extended_Index) return Element_Type
with
  Global => null,
  Pre=> Index in First_Index (Container) .. Last_Index (Container),

[Ada] Add contracts to System.Address_To_Access_Conversions

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This patch adds SPARK annotations to subprograms from
System.Address_To_Access_Conversions. To_Pointer is considered to have
no global items, if the returned value has no aliases. To_Address is
forbidden in SPARK because addresses are not handled.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/s-atacco.ads (To_Pointer): Add Global => null.
(To_Address): Add SPARK_Mode => Off.

diff --git a/gcc/ada/libgnat/s-atacco.ads b/gcc/ada/libgnat/s-atacco.ads
--- a/gcc/ada/libgnat/s-atacco.ads
+++ b/gcc/ada/libgnat/s-atacco.ads
@@ -54,8 +54,10 @@ package System.Address_To_Access_Conversions is
--  optimizations that may cause unexpected results based on the assumption
--  of no strict aliasing.
 
-   function To_Pointer (Value : Address)return Object_Pointer;
-   function To_Address (Value : Object_Pointer) return Address;
+   function To_Pointer (Value : Address)return Object_Pointer with
+ Global => null;
+   function To_Address (Value : Object_Pointer) return Address with
+ SPARK_Mode => Off;
 
pragma Import (Intrinsic, To_Pointer);
pragma Import (Intrinsic, To_Address);

[Ada] Add contracts to Interfaces.C.Strings

2022-06-02 Thread Pierre-Marie de Rodat via Gcc-patches

This patch adds Global contracts and preconditions to subprograms of
Interfaces.C.Strings. Effects on allocated memory are modelled
through an abstract state, C_Memory. The preconditions protect against
Dereference_Error, but not Storage_Error (which is not handled by
SPARK). This patch also disables the use of To_Chars_Ptr, which
creates an alias between an ownership pointer and the abstract state,
and the use of Free, in SPARK code. Thus, memory leaks will happen
if the user creates the Chars_Ptr using New_Char_Array and New_String.

Tested on x86_64-pc-linux-gnu, committed on trunk

gcc/ada/

* libgnat/i-cstrin.ads (To_Chars_Ptr): Add SPARK_Mode => Off.
(Free): Likewise.
(New_Char_Array): Add global contracts and Volatile attribute.
(New_String): Likewise.
(Value, Strlen, Update): Add global contracts and preconditions.
* libgnat/i-cstrin.adb: Add SPARK_Mode => Off to the package
body.

diff --git a/gcc/ada/libgnat/i-cstrin.adb b/gcc/ada/libgnat/i-cstrin.adb
--- a/gcc/ada/libgnat/i-cstrin.adb
+++ b/gcc/ada/libgnat/i-cstrin.adb
@@ -34,7 +34,9 @@ with System.Storage_Elements; use System.Storage_Elements;
 
 with Ada.Unchecked_Conversion;
 
-package body Interfaces.C.Strings is
+package body Interfaces.C.Strings with
+  SPARK_Mode => Off
+is
 
--  Note that the type chars_ptr has a pragma No_Strict_Aliasing in the
--  spec, to prevent any assumptions about aliasing for values of this type,


diff --git a/gcc/ada/libgnat/i-cstrin.ads b/gcc/ada/libgnat/i-cstrin.ads
--- a/gcc/ada/libgnat/i-cstrin.ads
+++ b/gcc/ada/libgnat/i-cstrin.ads
@@ -33,7 +33,18 @@
 --  --
 --
 
-package Interfaces.C.Strings is
+--  Preconditions in this unit are meant for analysis only, not for run-time
+--  checking, so that the expected exceptions are raised. This is enforced by
+--  setting the corresponding assertion policy to Ignore. These preconditions
+--  do not protect against Storage_Error.
+
+pragma Assertion_Policy (Pre => Ignore);
+
+package Interfaces.C.Strings with
+  SPARK_Mode => On,
+  Abstract_State => (C_Memory),
+  Initializes=> (C_Memory)
+is
pragma Preelaborate;
 
type char_array_access is access all char_array;
@@ -53,47 +64,75 @@ package Interfaces.C.Strings is
 
function To_Chars_Ptr
  (Item  : char_array_access;
-  Nul_Check : Boolean := False) return chars_ptr;
-
-   function New_Char_Array (Chars : char_array) return chars_ptr;
-
-   function New_String (Str : String) return chars_ptr;
-
-   procedure Free (Item : in out chars_ptr);
+  Nul_Check : Boolean := False) return chars_ptr
+   with
+ SPARK_Mode => Off;
+
+   function New_Char_Array (Chars : char_array) return chars_ptr with
+ Volatile_Function,
+ Post   => New_Char_Array'Result /= Null_Ptr,
+ Global => (Input => C_Memory);
+
+   function New_String (Str : String) return chars_ptr with
+ Volatile_Function,
+ Post   => New_String'Result /= Null_Ptr,
+ Global => (Input => C_Memory);
+
+   procedure Free (Item : in out chars_ptr) with
+ SPARK_Mode => Off;
--  When deallocation is prohibited (eg: cert runtimes) this routine
--  will raise Program_Error
 
Dereference_Error : exception;
 
-   function Value (Item : chars_ptr) return char_array;
+   function Value (Item : chars_ptr) return char_array with
+ Pre=> Item /= Null_Ptr,
+ Global => (Input => C_Memory);
 
function Value
  (Item   : chars_ptr;
-  Length : size_t) return char_array;
+  Length : size_t) return char_array
+   with
+ Pre=> Item /= Null_Ptr,
+ Global => (Input => C_Memory);
 
-   function Value (Item : chars_ptr) return String;
+   function Value (Item : chars_ptr) return String with
+ Pre=> Item /= Null_Ptr,
+ Global => (Input => C_Memory);
 
function Value
  (Item   : chars_ptr;
-  Length : size_t) return String;
+  Length : size_t) return String
+   with
+ Pre=> Item /= Null_Ptr,
+ Global => (Input => C_Memory);
 
-   function Strlen (Item : chars_ptr) return size_t;
+   function Strlen (Item : chars_ptr) return size_t with
+ Pre=> Item /= Null_Ptr,
+ Global => (Input => C_Memory);
 
procedure Update
  (Item   : chars_ptr;
   Offset : size_t;
   Chars  : char_array;
-  Check  : Boolean := True);
+  Check  : Boolean := True)
+   with
+ Pre=> Item /= Null_Ptr,
+ Global => (In_Out => C_Memory);
 
procedure Update
  (Item   : chars_ptr;
   Offset : size_t;
   Str: String;
-  Check  : Boolean := True);
+  Check  : Boolean := True)
+   with
+ Pre=> Item /= Null_Ptr,
+ Global => (In_Out => C_Memory);
 
Update_Error : exception;
 
 private
+   pragma SPARK_Mode (Off);
type chars_ptr is access all Character;
for chars_ptr'Size use

Re: [PATCH v2, rs6000] Fix ICE on expand bcd__ [PR100736]

2022-06-02 Thread Segher Boessenkool

Hi!

On Thu, Jun 02, 2022 at 01:30:04PM +0800, HAO CHEN GUI wrote:
> Segher,
>   Does BCD comparison return false when either operand is invalid coding?

It sets all of LT, GT, and EQ to 0 (it normally sets exactly one of them
to 1).  It sets bit 3 (the "SO" bit usually) to 1.

That is what the machine insns do.  What the builtins do is undefined as
far as I know?  If so, we can do whatever is most convenient, so not
handle it specifically at all and just go with what falls out.

> If yes, the result could be 3-way. We can check gt and eq bits for ge.

You can check the LT bit, instead: it is only one branch insn, and also
only one setbc[r] insn (it can be slightly more expensive if you can use
only older insns).

> We still can't use crnot to only check lt bit as there could be invalid
> coding.
>   Also, do you think finite-math-only excludes invalid coding? Seems GCC
> doesn't clearly define it.

This is not floating-point code at all, it should not be influenced at
all by finite-math-only!

Segher
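
To make the difference between the two checks concrete, here is a small C
model of the CR field described above. It is only a sketch: struct cr_field
and both helper functions are made-up names for illustration, not GCC or
ISA definitions.

```c
#include <stdbool.h>

/* Hypothetical model of the 4-bit CR field written by a BCD compare. */
struct cr_field {
    bool lt;  /* first operand is less than the second */
    bool gt;  /* first operand is greater than the second */
    bool eq;  /* operands are equal */
    bool so;  /* bit 3; set when an operand has an invalid encoding */
};

/* "a >= b" derived from the GT and EQ bits (two bits to test). */
static bool ge_from_gt_eq(struct cr_field cr)
{
    return cr.gt || cr.eq;
}

/* "a >= b" derived from the LT bit alone (one bit to test). */
static bool ge_from_not_lt(struct cr_field cr)
{
    return !cr.lt;
}
```

For the three valid outcomes the two helpers agree. For the
invalid-encoding outcome (LT, GT and EQ all 0, bit 3 set) they differ,
which only matters if the builtins are required to give a particular
answer in that case.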

Re: [PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-06-02 Thread Kito Cheng via Gcc-patches

OK to back port, thanks!

On Thu, Jun 2, 2022 at 4:46 PM Philipp Tomsich  wrote:
>
> OK for backport?
>
> Thanks,
> Phil.
>
> On Fri, 13 May 2022 at 22:23, Philipp Tomsich 
> wrote:
>
> > Added the two nits from Kito's review and … Applied to trunk!
> >
> >
> > On Fri, 13 May 2022 at 22:16, Philipp Tomsich 
> > wrote:
> > >
> > > The Zbb support has introduced ctz and clz to the backend, but some
> > > transformations in GCC need to know what the value of c[lt]z at zero
> > > is. This affects how the optab is generated and may suppress use of
> > > CLZ/CTZ in tree passes.
> > >
> > > Among other things, this is needed for the transformation of
> > > table-based ctz-implementations, such as in deepsjeng, to work
> > > (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838).
> > >
> > > > Prior to this change, the test case from PR90838 would compile to
> > > > the following on RISC-V targets with Zbb:
> > > >   myctz:
> > > >     lui     a4,%hi(.LC0)
> > > >     ld      a4,%lo(.LC0)(a4)
> > > >     neg     a5,a0
> > > >     and     a5,a5,a0
> > > >     mul     a5,a5,a4
> > > >     lui     a4,%hi(.LANCHOR0)
> > > >     addi    a4,a4,%lo(.LANCHOR0)
> > > >     srli    a5,a5,58
> > > >     sh2add  a5,a5,a4
> > > >     lw      a0,0(a5)
> > > >     ret
> > > After this change, we get:
> > > >   myctz:
> > > >     ctz     a0,a0
> > > >     andi    a0,a0,63
> > > >     ret
> > >
> > > Testing this with deepsjeng_r (from SPEC 2017) against QEMU, this
> > > shows a clear reduction in dynamic instruction count:
> > >  - before  1961888067076
> > >  - after   1907928279874 (2.75% reduction)
> > >
> > > This also merges the various target-specific test-cases (for x86-64,
> > > aarch64 and riscv) within gcc.dg/pr90838.c.
> > >
> > > This extends the macros (i.e., effective-target keywords) used in
> > > testing (lib/target-supports.exp) to reliably distinguish between RV32
> > > and RV64 via __riscv_xlen (i.e., the integer register bitwidth) :
> > > testing for ILP32 could be misleading (as ILP32 is a valid memory
> > > model for 64bit systems).
> > >
> > > gcc/ChangeLog:
> > >
> > > * config/riscv/riscv.h (CLZ_DEFINED_VALUE_AT_ZERO): Implement.
> > > (CTZ_DEFINED_VALUE_AT_ZERO): Same.
> > > * doc/sourcebuild.texi: add documentation for RISC-V specific
> > > test target keywords
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > > * gcc.dg/pr90838.c: Add additional flags (dg-additional-options)
> > > >   when compiling for riscv64 and subsume gcc.target/aarch64/pr90838.c
> > > >   and gcc.target/i386/pr95863-2.c.
> > > * gcc.target/riscv/zbb-ctz.c: New test.
> > > * gcc.target/aarch64/pr90838.c: Removed.
> > > * gcc.target/i386/pr95863-2.c: Removed.
> > > * lib/target-supports.exp: Recognize RV32 or RV64 via XLEN
> > >
> > > Signed-off-by: Philipp Tomsich 
> > > Signed-off-by: Manolis Tsamis 
> > > Co-developed-by: Manolis Tsamis 
> > >
> > > ---
> > > Changes in v3:
> > > > - Address nit from Kito (use rv64 and rv32 on gcc.dg/pr90838.c
> > > >   consistently).
> > >
> > > Changes in v2:
> > > - Address review comments from Palmer (merging testcases)
> > > - Merge the different target-specific testcases for CLZ into one
> > > - Add RV32 tests
> > > > - Fix pr90838.c testcase for x86_64
> > >
> > >  gcc/config/riscv/riscv.h   |  5 ++
> > >  gcc/doc/sourcebuild.texi   | 12 
> > >  gcc/testsuite/gcc.dg/pr90838.c | 25 +
> > >  gcc/testsuite/gcc.target/aarch64/pr90838.c | 64 --
> > >  gcc/testsuite/gcc.target/i386/pr95863-2.c  | 27 -
> > >  gcc/testsuite/lib/target-supports.exp  | 30 ++
> > >  6 files changed, 72 insertions(+), 91 deletions(-)
> > >  delete mode 100644 gcc/testsuite/gcc.target/aarch64/pr90838.c
> > >  delete mode 100644 gcc/testsuite/gcc.target/i386/pr95863-2.c
> > >
> > > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > > index 8a4d2cf7f85..b191606edb4 100644
> > > --- a/gcc/config/riscv/riscv.h
> > > +++ b/gcc/config/riscv/riscv.h
> > > @@ -1004,4 +1004,9 @@ extern void
> > riscv_remove_unneeded_save_restore_calls (void);
> > >
> > >  #define HARD_REGNO_RENAME_OK(FROM, TO) riscv_hard_regno_rename_ok
> > (FROM, TO)
> > >
> > > +#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > > +#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > > +
> > >  #endif /* ! GCC_RISCV_H */
> > > diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> > > index 613ac29967b..71c04841df2 100644
> > > --- a/gcc/doc/sourcebuild.texi
> > > +++ b/gcc/doc/sourcebuild.texi
> > > @@ -2420,6 +2420,18 @@ PowerPC target pre-defines macro _ARCH_PWR9 which
> > means the @code{-mcpu}
> > >  setting is Power9 or later.
> > >  @end table
> > >
> > > +@subsection RISC-V specific attributes
> > > +
> > > +@table @code
> > > +

Re: [PATCH] Fold truncations of left shifts in match.pd

2022-06-02 Thread Richard Biener via Gcc-patches

On Mon, May 30, 2022 at 2:24 PM Roger Sayle  wrote:
>
>
> Whilst investigating PR 55278, I noticed that the tree-ssa optimizers
> aren't eliminating the promotions of shifts to "int" as inserted by the
> c-family front-ends, instead leaving this simplification to
> the RTL optimizers.  This patch allows match.pd to do this itself earlier,
> narrowing (T)(X << C) to (T)X << C when the constant C is known to be
> valid for the (narrower) type T.
>
> Hence for this simple test case:
> short foo(short x) { return x << 5; }
>
> the .optimized dump currently looks like:
>
> short int foo (short int x)
> {
>   int _1;
>   int _2;
>   short int _4;
>
>[local count: 1073741824]:
>   _1 = (int) x_3(D);
>   _2 = _1 << 5;
>   _4 = (short int) _2;
>   return _4;
> }
>
> but with this patch, now becomes:
>
> short int foo (short int x)
> {
>   short int _2;
>
>[local count: 1073741824]:
>   _2 = x_1(D) << 5;
>   return _2;
> }
>
> This is always reasonable as RTL expansion knows how to use
> widening optabs if it makes sense at the RTL level to perform
> this shift in a wider mode.
>
> Of course, there's often a catch.  The above simplification not only
> reduces the number of statements in gimple, but also allows further
> optimizations, for example including the perception of rotate idioms
> and bswap16.  Alas, optimizing things earlier than anticipated
> requires several testsuite changes [though all these tests have
> been confirmed to generate identical assembly code on x86_64].
> The only significant change is that the vectorization pass previously
> wouldn't vectorize rotations if the backend doesn't explicitly provide
> an optab for them.  This is curious as if the rotate is expressed as
> ior(lshift,rshift) it will vectorize, and likewise RTL expansion will
> generate the iorv(lshiftv,rshiftv) sequence if required for a vector
> mode rotation.  Hence this patch includes a tweak to the optabs
> test in tree-vect-stmts.cc's vectorizable_shifts to better reflect
> the functionality supported by RTL expansion.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?

can the lshift pattern be merged with the preceding one?  It looks
awfully similar.  Possibly even do

  (if (wi::ltu_p ())
   (lshift (convert @1) (convert @2))
   { build_zero_cst (type); })

for when the truncation leaves us with zero?

+  /* RTL expansion knows how to expand rotates using shift/or.  */
+  if (icode == CODE_FOR_nothing
+  && (code == LROTATE_EXPR || code == RROTATE_EXPR)
+  && optab_handler (ior_optab, vec_mode) != CODE_FOR_nothing
+  && optab_handler (ashl_optab, vec_mode) != CODE_FOR_nothing)
+icode = (int) optab_handler (lshr_optab, vec_mode);

but we then get the vector costing wrong.  Also note that vector lowering
will figure the rotate is not supported and do its own "lowering" using
IOR.  Also it seems that only handles the case of vector by scalar (aka
uniform vector) rotates, otherwise will expand to scalar operations.

That said, the appropriate way to deal with this is in tree-vect-patterns.cc
where there already is vect_recog_rotate_pattern that should be detected
so the above hunk shouldn't be necessary - instead eventually the
pattern recognition routine needs improving?

Thanks,
Richard.


>
> 2022-05-30  Roger Sayle  
>
> gcc/ChangeLog
> * match.pd (convert (lshift @1 INTEGER_CST@2)): Narrow integer
> left shifts by a constant when the result is truncated, and the
> shift constant is well-defined for the narrower mode.
> * tree-vect-stmts.cc (vectorizable_shift): Rotations by
> constants are vectorizable, if the backend supports logical
> shifts and IOR logical operations in the required vector mode.
>
> gcc/testsuite/ChangeLog
> * gcc.dg/fold-convlshift-4.c: New test case.
> * gcc.dg/optimize-bswaphi-1.c: Update found bswap count.
> * gcc.dg/tree-ssa/pr61839_3.c: Shift is now optimized before VRP.
> * gcc.dg/vect/vect-over-widen-1-big-array.c: Remove obsolete tests.
> * gcc.dg/vect/vect-over-widen-1.c: Likewise.
> * gcc.dg/vect/vect-over-widen-3-big-array.c: Likewise.
> * gcc.dg/vect/vect-over-widen-3.c: Likewise.
> * gcc.dg/vect/vect-over-widen-4-big-array.c: Likewise.
> * gcc.dg/vect/vect-over-widen-4.c: Likewise.
>
>
> Thanks in advance,
> Roger
> --
>
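
Editorial illustration of the narrowing discussed in this thread: the low
bits of a left shift depend only on the low bits of the shifted value, so
truncating before or after the shift gives the same narrow result as long
as the shift count is valid for the narrow type.  The following is a
minimal sketch under that assumption, not code from the patch; the function
names are hypothetical, and unsigned types are used so that the C code
itself has no overflow concerns.

```c
#include <stdint.h>

/* Shift in the wide (32-bit) type, then truncate: (T)(X << C).  */
static uint16_t shift_then_truncate(uint32_t x, unsigned c)
{
  return (uint16_t)(x << c);
}

/* Truncate first, then shift: (T)X << C.  The shift is written on a
   uint32_t copy of the narrow value and masked back to 16 bits by the
   final cast, which models a 16-bit shift.  */
static uint16_t truncate_then_shift(uint32_t x, unsigned c)
{
  uint16_t narrow = (uint16_t)x;
  return (uint16_t)((uint32_t)narrow << c);
}

/* Count disagreements over a sample of inputs and every shift count
   that is valid for a 16-bit type (0..15).  */
static unsigned count_mismatches(void)
{
  unsigned bad = 0;
  for (unsigned c = 0; c < 16; c++)
    for (uint32_t x = 0; x < 0x40000; x += 7)
      if (shift_then_truncate(x, c) != truncate_then_shift(x, c))
        bad++;
  return bad;
}
```

Calling count_mismatches () from a small driver is expected to report no
disagreements; the restriction to counts below 16 mirrors the requirement
that C be valid for the narrower type T.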

Re: [PATCH v3] RISC-V: Implement C[LT]Z_DEFINED_VALUE_AT_ZERO

2022-06-02 Thread Philipp Tomsich

OK for backport?

Thanks,
Phil.

On Fri, 13 May 2022 at 22:23, Philipp Tomsich 
wrote:

> Added the two nits from Kito's review and … Applied to trunk!
>
>
> On Fri, 13 May 2022 at 22:16, Philipp Tomsich 
> wrote:
> >
> > The Zbb support has introduced ctz and clz to the backend, but some
> > transformations in GCC need to know what the value of c[lt]z at zero
> > is. This affects how the optab is generated and may suppress use of
> > CLZ/CTZ in tree passes.
> >
> > Among other things, this is needed for the transformation of
> > table-based ctz-implementations, such as in deepsjeng, to work
> > (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90838).
> >
> > Prior to this change, the test case from PR90838 would compile to
> > the following on RISC-V targets with Zbb:
> >   myctz:
> > lui a4,%hi(.LC0)
> > ld  a4,%lo(.LC0)(a4)
> > neg a5,a0
> > and a5,a5,a0
> > mul a5,a5,a4
> > lui a4,%hi(.LANCHOR0)
> > addi a4,a4,%lo(.LANCHOR0)
> > srli a5,a5,58
> > sh2add  a5,a5,a4
> > lw  a0,0(a5)
> > ret
> >
> > After this change, we get:
> >   myctz:
> > ctz a0,a0
> > andi a0,a0,63
> > ret
> >
> > Testing this with deepsjeng_r (from SPEC 2017) against QEMU, this
> > shows a clear reduction in dynamic instruction count:
> >  - before  1961888067076
> >  - after   1907928279874 (2.75% reduction)
> >
> > This also merges the various target-specific test-cases (for x86-64,
> > aarch64 and riscv) within gcc.dg/pr90838.c.
> >
> > This extends the macros (i.e., effective-target keywords) used in
> > testing (lib/target-supports.exp) to reliably distinguish between RV32
> > and RV64 via __riscv_xlen (i.e., the integer register bitwidth):
> > testing for ILP32 could be misleading (as ILP32 is a valid memory
> > model for 64bit systems).
> >
> > gcc/ChangeLog:
> >
> > * config/riscv/riscv.h (CLZ_DEFINED_VALUE_AT_ZERO): Implement.
> > (CTZ_DEFINED_VALUE_AT_ZERO): Same.
> > * doc/sourcebuild.texi: add documentation for RISC-V specific
> > test target keywords
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.dg/pr90838.c: Add additional flags (dg-additional-options)
> >   when compiling for riscv64 and subsume
> gcc.target/aarch64/pr90838.c
> >   and gcc.target/i386/pr95863-2.c.
> > * gcc.target/riscv/zbb-ctz.c: New test.
> > * gcc.target/aarch64/pr90838.c: Removed.
> > * gcc.target/i386/pr95863-2.c: Removed.
> > * lib/target-supports.exp: Recognize RV32 or RV64 via XLEN
> >
> > Signed-off-by: Philipp Tomsich 
> > Signed-off-by: Manolis Tsamis 
> > Co-developed-by: Manolis Tsamis 
> >
> > ---
> > Changes in v3:
> > - Address nit from Kito (use rv64 and rv32 on gcc.dg/pr90838.c
> >   consistently).
> >
> > Changes in v2:
> > - Address review comments from Palmer (merging testcases)
> > - Merge the different target-specific testcases for CLZ into one
> > - Add RV32 tests
> > - Fix pr90383.c testcase for x86_64
> >
> >  gcc/config/riscv/riscv.h   |  5 ++
> >  gcc/doc/sourcebuild.texi   | 12 
> >  gcc/testsuite/gcc.dg/pr90838.c | 25 +
> >  gcc/testsuite/gcc.target/aarch64/pr90838.c | 64 --
> >  gcc/testsuite/gcc.target/i386/pr95863-2.c  | 27 -
> >  gcc/testsuite/lib/target-supports.exp  | 30 ++
> >  6 files changed, 72 insertions(+), 91 deletions(-)
> >  delete mode 100644 gcc/testsuite/gcc.target/aarch64/pr90838.c
> >  delete mode 100644 gcc/testsuite/gcc.target/i386/pr95863-2.c
> >
> > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > index 8a4d2cf7f85..b191606edb4 100644
> > --- a/gcc/config/riscv/riscv.h
> > +++ b/gcc/config/riscv/riscv.h
> > @@ -1004,4 +1004,9 @@ extern void
> riscv_remove_unneeded_save_restore_calls (void);
> >
> >  #define HARD_REGNO_RENAME_OK(FROM, TO) riscv_hard_regno_rename_ok
> (FROM, TO)
> >
> > +#define CLZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > +#define CTZ_DEFINED_VALUE_AT_ZERO(MODE, VALUE) \
> > +  ((VALUE) = GET_MODE_UNIT_BITSIZE (MODE), 2)
> > +
> >  #endif /* ! GCC_RISCV_H */
> > diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> > index 613ac29967b..71c04841df2 100644
> > --- a/gcc/doc/sourcebuild.texi
> > +++ b/gcc/doc/sourcebuild.texi
> > @@ -2420,6 +2420,18 @@ PowerPC target pre-defines macro _ARCH_PWR9 which
> means the @code{-mcpu}
> >  setting is Power9 or later.
> >  @end table
> >
> > +@subsection RISC-V specific attributes
> > +
> > +@table @code
> > +
> > +@item rv32
> > +Test system has an integer register width of 32 bits.
> > +
> > +@item rv64
> > +Test system has an integer register width of 64 bits.
> > +
> > +@end table
> > +
> >  @subsubsection Other hardware attributes
> >
> >  @c Please keep this table sorted alphabetically.
> > diff --git a/gcc/testsuite/gcc.dg/pr90838.c
>
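
Editorial illustration of why the value of ctz at zero matters for the
table-based idiom mentioned in the quoted commit message.  This is a 32-bit
sketch under stated assumptions, not the PR90838 test itself: the function
names are hypothetical, the table is the widely used de Bruijn lookup for
the multiplier 0x077CB531, and ctz32_defined models an instruction that
returns the operand width (32) for a zero input.

```c
#include <stdint.h>

/* Table-based count-trailing-zeros: isolate the lowest set bit, multiply
   by a de Bruijn constant and use the top five bits as a table index.
   For x == 0 the index is 0, so the result is table[0] == 0.  */
static int table_ctz32(uint32_t x)
{
  static const unsigned char table[32] = {
    0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
    31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9
  };
  return table[((x & -x) * UINT32_C(0x077CB531)) >> 27];
}

/* Model of a ctz operation whose value at zero is defined as the bit
   width of the operand.  */
static int ctz32_defined(uint32_t x)
{
  if (x == 0)
    return 32;
  int n = 0;
  while ((x & 1) == 0)
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* The table lookup can be replaced by the ctz operation followed by a
   mask, because 32 & 31 == 0 matches the table's answer for zero.  */
static int ctz32_via_mask(uint32_t x)
{
  return ctz32_defined(x) & 31;
}
```

The mask in ctz32_via_mask plays the same role as the andi with 63 in the
64-bit assembly quoted above: it maps the defined value at zero back to the
0 that the table returns.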

Re: [PATCH] match.pd: Optimize __builtin_mul_overflow_p (x, cst, (utype)0) to x > ~(utype)0 / cst [PR30314]

2022-06-02 Thread Richard Biener via Gcc-patches

On Wed, 1 Jun 2022, Jakub Jelinek wrote:

> Hi!
> 
> A comparison with a constant is most likely always faster than
> .MUL_OVERFLOW from which we only check whether it overflowed and not the
> multiplication result, and even if not, it is simpler operation on GIMPLE
> and even if a target exists where such multiplications with overflow checking
> are cheaper than comparisons, because comparisons are so much more common
> than overflow checking multiplications, it would be nice if it simply
> arranged for comparisons to be emitted like those multiplications on its
> own...
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2022-06-01  Jakub Jelinek  
> 
>   PR middle-end/30314
>   * match.pd (__builtin_mul_overflow_p (x, cst, (utype) 0) ->
>   x > ~(utype)0 / cst): New simplification.
> 
>   * gcc.dg/tree-ssa/pr30314.c: New test.
> 
> --- gcc/match.pd.jj   2022-06-01 13:54:32.000654151 +0200
> +++ gcc/match.pd  2022-06-01 15:13:35.473084402 +0200
> @@ -5969,6 +5969,17 @@ (define_operator_list SYNC_FETCH_AND_AND
> && (!TYPE_UNSIGNED (TREE_TYPE (@2)) || TYPE_UNSIGNED (TREE_TYPE 
> (@0
> (ovf @1 @0
>  
> +/* Optimize __builtin_mul_overflow_p (x, cst, (utype) 0) if all 3 types
> +   are unsigned to x > (umax / cst).  */
> +(simplify
> + (imagpart (IFN_MUL_OVERFLOW:cs@2 @0 integer_nonzerop@1))

does :c work here?  I think it is at least ignored, possibly diagnostic
in genmatch is missing ...

> +  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> +   && TYPE_UNSIGNED (TREE_TYPE (@0))
> +   && TYPE_MAX_VALUE (TREE_TYPE (@0))
> +   && types_match (TREE_TYPE (@0), TREE_TYPE (TREE_TYPE (@2)))
> +   && int_fits_type_p (@1, TREE_TYPE (@0)))
> +   (convert (gt @0 (trunc_div! { TYPE_MAX_VALUE (TREE_TYPE (@0)); } @1)
> +
>  /* Simplification of math builtins.  These rules must all be optimizations
> as well as IL simplifications.  If there is a possibility that the new
> form could be a pessimization, the rule should go in the canonicalization
> --- gcc/testsuite/gcc.dg/tree-ssa/pr30314.c.jj2022-06-01 
> 15:22:53.201271365 +0200
> +++ gcc/testsuite/gcc.dg/tree-ssa/pr30314.c   2022-06-01 15:13:24.725196482 
> +0200
> @@ -0,0 +1,18 @@
> +/* PR middle-end/30314 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-optimized" } */
> +/* { dg-final { scan-tree-dump-not "\.MUL_OVERFLOW " "optimized" } } */
> +/* { dg-final { scan-tree-dump " > 122713351" "optimized" { target int32 } } 
> } */
> +/* { dg-final { scan-tree-dump " > 527049830677415760" "optimized" { target 
> lp64 } } } */
> +
> +int
> +foo (unsigned int x)
> +{
> +  return __builtin_mul_overflow_p (x, 35U, 0U);
> +}
> +
> +int
> +bar (unsigned long int x)
> +{
> +  return __builtin_mul_overflow_p (x, 35UL, 0UL);
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
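
Editorial illustration of the equivalence behind the simplification
discussed in this thread: for unsigned x and a nonzero constant cst, the
product x * cst overflows exactly when x is greater than the maximum value
divided by cst.  This is a minimal sketch, not the match.pd pattern; the
function names and the CST macro are hypothetical, and it uses the
__builtin_mul_overflow builtin (available in GCC and Clang) rather than
__builtin_mul_overflow_p.

```c
#include <limits.h>
#include <stdbool.h>

#define CST 35u  /* same constant as in the quoted test case */

/* Overflow check expressed through the overflow-checking multiply;
   the product itself is discarded.  */
static bool overflows_builtin(unsigned x)
{
  unsigned ignored;
  return __builtin_mul_overflow(x, CST, &ignored);
}

/* The same check as a single comparison against a constant.  */
static bool overflows_compare(unsigned x)
{
  return x > UINT_MAX / CST;
}
```

With a 32-bit unsigned int, UINT_MAX / 35 is 122713351, which is the
constant the quoted test scans for in the optimized dump.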

Re: [PATCH] i386: Optimize away shift count masking of shifts/rotates some more [PR105778]

2022-06-02 Thread Uros Bizjak via Gcc-patches

On Thu, Jun 2, 2022 at 10:00 AM Jakub Jelinek  wrote:
>
> Hi!
>
> As the following testcase shows, our x86 backend support for optimizing
> out useless masking of shift/rotate counts when using instructions
> that naturally modulo the count themselves is insufficient.
> The *_mask define_insn_and_split patterns use
> (subreg:QI (and:SI (match_operand:SI) (match_operand "const_int_operand")))
> for the masking, but that can catch only the case where the masking
> is done in SImode, so typically in SImode in the source.
> We then have another set of patterns, *_mask_1, which use
> (and:QI (match_operand:QI) (match_operand "const_int_operand"))
> If the masking is done in DImode or in theory in HImode, we don't match
> it.
> The following patch does 4 different things to improve this:
> 1) drops the mode from AND and MATCH_OPERAND inside of the subreg:QI
>and replaces that by checking that the register shift count has
>SWI48 mode - I think doing it this way is cheaper than adding
>another mode iterator to patterns which use already another mode
>iterator and sometimes a code iterator as well
> 2) the doubleword shift patterns were only handling the case where
>the shift count is masked with a constant that has the most significant
>bit clear, i.e. where we know the shift count is less than half the
>number of bits in double-word.  If the mask is equal to half the
>number of bits in double-word minus 1, the masking was optimized
>away, otherwise the AND was kept.
>But if the most significant bit isn't clear, we use a word-sized shift
>and SHRD instruction, where the former does the modulo and the latter
>modulo with 64 / 32 depending on what mode the CPU is in (so 64 for
>128-bit doubleword and 32 for 64-bit doubleword).  So we can also
>optimize away the masking when the mask has all the relevant bits set,
>masking with the most significant bit will remain for the cmove
>test.
> 3) as requested, this patch adds a bunch of force_reg calls before
>gen_lowpart
> 4) 1-3 above unfortunately regressed
>+FAIL: gcc.target/i386/bt-mask-2.c scan-assembler-not and[lq][ \\t]
>+FAIL: gcc.target/i386/pr57819.c scan-assembler-not and[lq][ \\t]
>where we during combine match the new pattern we didn't match
>before and in the end don't match the pattern we were testing for.
>These 2 tests are fixed by the *jcc_bt_mask_1 pattern
>addition and small tweak to target rtx_costs, because even with
>the pattern around we'd refuse to match it because it appeared to
>have higher instruction cost
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2022-06-02  Jakub Jelinek  
>
> PR target/105778
> * config/i386/i386.md (*ashl3_doubleword_mask): Remove :SI
> from AND and its operands and just verify operands[2] has HImode,
> SImode or for TARGET_64BIT DImode.  Allow operands[3] to be a mask
> with all low 6 (64-bit) or 5 (32-bit) bits set and in that case
> just throw away the masking.  Use force_reg before calling
> gen_lowpart.
> (*ashl3_doubleword_mask_1): Allow operands[3] to be a mask
> with all low 6 (64-bit) or 5 (32-bit) bits set and in that case
> just throw away the masking.
> (*ashl3_doubleword): Rename to ...
> (ashl3_doubleword): ... this.
> (*ashl3_mask): Remove :SI from AND and its operands and just
> verify operands[2] has HImode, SImode or for TARGET_64BIT DImode.
> Use force_reg before calling gen_lowpart.
> (*3_mask): Likewise.
> (*3_doubleword_mask): Likewise.  Allow operands[3] to be
> a mask with all low 6 (64-bit) or 5 (32-bit) bits set and in that
> case just throw away the masking.  Use force_reg before calling
> gen_lowpart.
> (*3_doubleword_mask_1): Allow operands[3] to be a mask
> with all low 6 (64-bit) or 5 (32-bit) bits set and in that case just
> throw away the masking.
> (*3_doubleword): Rename to ...
> (3_doubleword): ... this.
> (*3_mask): Remove :SI from AND and its operands and just
> verify operands[2] has HImode, SImode or for TARGET_64BIT DImode.
> Use force_reg before calling gen_lowpart.
> (splitter after it): Remove :SI from AND and its operands and just
> verify operands[2] has HImode, SImode or for TARGET_64BIT DImode.
> (*_mask, *_mask): Remove :SI from AND and its
> operands and just verify operands[1] has HImode, SImode or for
> TARGET_64BIT DImode.  Use force_reg before calling gen_lowpart.
> (*jcc_bt_mask_1): New define_insn_and_split pattern.
> * config/i386/i386.cc (ix86_rtx_costs): For ZERO_EXTRACT with
> ZERO_EXTEND QI->SI in last operand ignore the cost of the ZERO_EXTEND.
>
> * gcc.target/i386/pr105778.c: New test.

OK.

Thanks,
Uros.

>
> --- gcc/config/i386/i386.md.jj

[PATCH] i386: Optimize away shift count masking of shifts/rotates some more [PR105778]

2022-06-02 Thread Jakub Jelinek via Gcc-patches

Hi!

As the following testcase shows, our x86 backend support for optimizing
out useless masking of shift/rotate counts when using instructions
that naturally modulo the count themselves is insufficient.
The *_mask define_insn_and_split patterns use
(subreg:QI (and:SI (match_operand:SI) (match_operand "const_int_operand")))
for the masking, but that can catch only the case where the masking
is done in SImode, so typically in SImode in the source.
We then have another set of patterns, *_mask_1, which use
(and:QI (match_operand:QI) (match_operand "const_int_operand"))
If the masking is done in DImode or in theory in HImode, we don't match
it.
The following patch does 4 different things to improve this:
1) drops the mode from AND and MATCH_OPERAND inside of the subreg:QI
   and replaces that by checking that the register shift count has
   SWI48 mode - I think doing it this way is cheaper than adding
   another mode iterator to patterns which use already another mode
   iterator and sometimes a code iterator as well
2) the doubleword shift patterns were only handling the case where
   the shift count is masked with a constant that has the most significant
   bit clear, i.e. where we know the shift count is less than half the
   number of bits in double-word.  If the mask is equal to half the
   number of bits in double-word minus 1, the masking was optimized
   away, otherwise the AND was kept.
   But if the most significant bit isn't clear, we use a word-sized shift
   and SHRD instruction, where the former does the modulo and the latter
   modulo with 64 / 32 depending on what mode the CPU is in (so 64 for
   128-bit doubleword and 32 for 64-bit doubleword).  So we can also
   optimize away the masking when the mask has all the relevant bits set,
   masking with the most significant bit will remain for the cmove
   test.
3) as requested, this patch adds a bunch of force_reg calls before
   gen_lowpart
4) 1-3 above unfortunately regressed
   +FAIL: gcc.target/i386/bt-mask-2.c scan-assembler-not and[lq][ \\t]
   +FAIL: gcc.target/i386/pr57819.c scan-assembler-not and[lq][ \\t]
   where we during combine match the new pattern we didn't match
   before and in the end don't match the pattern we were testing for.
   These 2 tests are fixed by the *jcc_bt_mask_1 pattern
   addition and small tweak to target rtx_costs, because even with
   the pattern around we'd refuse to match it because it appeared to
   have higher instruction cost
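
Editorial illustration of the source shapes the list above is about: the
same shift with the count masked in different integer types.  This is a
sketch with hypothetical function names, not the pr105778.c test.  All
three functions compute the same value; the difference is only the mode in
which the AND is performed, and whether a particular compiler version drops
the AND in each case is an assumption to check against its output.

```c
#include <stdint.h>

/* Count masked in a 32-bit type (the SImode case).  */
static uint32_t shl_mask_in_int(uint32_t x, uint32_t n)
{
  return x << (n & 31);
}

/* Count masked in a 64-bit type (the DImode case).  */
static uint32_t shl_mask_in_long(uint32_t x, uint64_t n)
{
  return x << (n & 31);
}

/* Count arriving as a 16-bit value (the HImode case).  */
static uint32_t shl_mask_in_short(uint32_t x, uint16_t n)
{
  return x << (n & 31);
}
```

Because the mask keeps the count in the range 0..31, each shift is well
defined in C, and an instruction that already reduces its count modulo 32
makes the explicit masking redundant.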

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-06-02  Jakub Jelinek  

PR target/105778
* config/i386/i386.md (*ashl3_doubleword_mask): Remove :SI
from AND and its operands and just verify operands[2] has HImode,
SImode or for TARGET_64BIT DImode.  Allow operands[3] to be a mask
with all low 6 (64-bit) or 5 (32-bit) bits set and in that case
just throw away the masking.  Use force_reg before calling
gen_lowpart.
(*ashl3_doubleword_mask_1): Allow operands[3] to be a mask
with all low 6 (64-bit) or 5 (32-bit) bits set and in that case
just throw away the masking.
(*ashl3_doubleword): Rename to ...
(ashl3_doubleword): ... this.
(*ashl3_mask): Remove :SI from AND and its operands and just
verify operands[2] has HImode, SImode or for TARGET_64BIT DImode.
Use force_reg before calling gen_lowpart.
(*3_mask): Likewise.
(*3_doubleword_mask): Likewise.  Allow operands[3] to be
a mask with all low 6 (64-bit) or 5 (32-bit) bits set and in that
case just throw away the masking.  Use force_reg before calling
gen_lowpart.
(*3_doubleword_mask_1): Allow operands[3] to be a mask
with all low 6 (64-bit) or 5 (32-bit) bits set and in that case just
throw away the masking.
(*3_doubleword): Rename to ...
(3_doubleword): ... this.
(*3_mask): Remove :SI from AND and its operands and just
verify operands[2] has HImode, SImode or for TARGET_64BIT DImode.
Use force_reg before calling gen_lowpart.
(splitter after it): Remove :SI from AND and its operands and just
verify operands[2] has HImode, SImode or for TARGET_64BIT DImode.
(*_mask, *_mask): Remove :SI from AND and its
operands and just verify operands[1] has HImode, SImode or for
TARGET_64BIT DImode.  Use force_reg before calling gen_lowpart.
(*jcc_bt_mask_1): New define_insn_and_split pattern.
* config/i386/i386.cc (ix86_rtx_costs): For ZERO_EXTRACT with
ZERO_EXTEND QI->SI in last operand ignore the cost of the ZERO_EXTEND.

* gcc.target/i386/pr105778.c: New test.

--- gcc/config/i386/i386.md.jj  2022-05-31 11:33:51.457251607 +0200
+++ gcc/config/i386/i386.md 2022-06-01 11:59:27.388631872 +0200
@@ -11890,11 +11890,16 @@ (define_insn_and_split "*ashl3_doub
(ashift:
  (match_operand: 1 "register_operand")

Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC

2022-06-02 Thread Fāng-ruì Sòng via Gcc-patches

On Sun, Oct 31, 2021 at 7:36 PM Fāng-ruì Sòng  wrote:
>
> On Fri, Oct 8, 2021 at 10:57 AM Fāng-ruì Sòng  wrote:
> >
> > On Fri, Sep 24, 2021 at 11:29 AM H.J. Lu  wrote:
> > >
> > > On Fri, Sep 24, 2021 at 11:14 AM Fāng-ruì Sòng  wrote:
> > > >
> > > > On Fri, Sep 24, 2021 at 10:41 AM H.J. Lu  wrote:
> > > > >
> > > > > On Fri, Sep 24, 2021 at 10:29 AM Fāng-ruì Sòng  
> > > > > wrote:
> > > > > >
> > > > > >  On Tue, Sep 21, 2021 at 7:08 PM Fāng-ruì Sòng  
> > > > > > wrote:
> > > > > > >
> > > > > > > On Tue, Sep 21, 2021 at 6:57 PM H.J. Lu  
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > On Tue, Sep 21, 2021 at 9:16 AM Uros Bizjak  
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches
> > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > PING^5 
> > > > > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > > > > >
> > > > > > > > > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng 
> > > > > > > > > >  wrote:
> > > > > > > > > > >
> > > > > > > > > > > PING^4 
> > > > > > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > > > > > >
> > > > > > > > > > > One major design goal of PIE was to avoid copy 
> > > > > > > > > > > relocations.
> > > > > > > > > > > The original patch for GCC 5 caused problems for many 
> > > > > > > > > > > years.
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Aug 18, 2021 at 11:54 PM Fāng-ruì Sòng 
> > > > > > > > > > >  wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> PING^3 
> > > > > > > > > > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > > > > > >>
> > > > > > > > > > >> On Fri, Jun 4, 2021 at 3:04 PM Fāng-ruì Sòng 
> > > > > > > > > > >>  wrote:
> > > > > > > > > > >> >
> > > > > > > > > > >> > PING^2 
> > > > > > > > > > >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > > > > > >> >
> > > > > > > > > > >> > On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng 
> > > > > > > > > > >> >  wrote:
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Ping 
> > > > > > > > > > >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > On Tue, May 11, 2021 at 8:29 PM Fangrui Song 
> > > > > > > > > > >> > >  wrote:
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > This was introduced in 2014-12 to use local 
> > > > > > > > > > >> > > > binding for external symbols
> > > > > > > > > > >> > > > for -fPIE. Now that we have H.J. Lu's GOTPCRELX 
> > > > > > > > > > >> > > > for years which mostly
> > > > > > > > > > >> > > > nullify the benefit of HAVE_LD_PIE_COPYRELOC, 
> > > > > > > > > > >> > > > HAVE_LD_PIE_COPYRELOC
> > > > > > > > > > >> > > > should retire now.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > One design goal of -fPIE was to avoid copy 
> > > > > > > > > > >> > > > relocations.
> > > > > > > > > > >> > > > HAVE_LD_PIE_COPYRELOC has deviated from the goal.  
> > > > > > > > > > >> > > > With this change, the
> > > > > > > > > > >> > > > -fPIE behavior of x86-64 will be closer to x86-32 
> > > > > > > > > > >> > > > and other targets.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > ---
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > See 
> > > > > > > > > > >> > > > https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html
> > > > > > > > > > >> > > >  for a list
> > > > > > > > > > >> > > > of fixed and unfixed (e.g. gold incompatibility 
> > > > > > > > > > >> > > > with protected
> > > > > > > > > > >> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=19823)
> > > > > > > > > > >> > > >  issues.
> > > > > > > > > > >> > > >
> > > > > > > > > > >> > > > If you prefer a longer write-up, see
> > > > > > > > > > >> > > > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected
> > > > > > > > > > >> > > > ---
> > > > > > > > > > >> > > >  gcc/config.in |  
> > > > > > > > > > >> > > > 6 ---
> > > > > > > > > > >> > > >  gcc/config/i386/i386.c| 
> > > > > > > > > > >> > > > 11 +---
> > > > > > > > > > >> > > >  gcc/configure | 
> > > > > > > > > > >> > > > 52 ---
> > > > > > > > > > >> > > >  gcc/configure.ac  | 
> > > > > > > > > > >> > > > 48 -
> > > > > > > > > > >> > > >  gcc/doc/sourcebuild.texi  |  
> > > > > > > > > > >> > > > 3 --
> > > > > > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-1.c| 
> > > > > > > > > > >> > > > 14 -
> > > > > > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-2.c| 
> > > > > > > > > > >> > > > 14 -
> > > > > > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-3.c| 
> > > > > > > > > > >> > > > 14 -
> > > > > > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-4.c

[PATCH] Add instruction level discriminator support.

2022-06-02 Thread Eugene Rozenfeld via Gcc-patches

This is the first in a series of patches to enable discriminator support
in AutoFDO.

This patch switches to tracking discriminators per statement/instruction
instead of per basic block. Tracking per basic block was problematic since
not all statements in a basic block needed a discriminator and, also, later
optimizations could move statements between basic blocks making correlation
during AutoFDO compilation unreliable. Tracking per statement also allows
us to assign different discriminators to multiple function calls in the same
basic block. A subsequent patch will add that support.

The idea of this patch is based on commit 
4c311d95cf6d9519c3c20f641cc77af7df491fdf
by Dehao Chen in vendors/google/heads/gcc-4_8 but uses a slightly different
approach. In Dehao's work special (normally unused) location ids and side tables
were used to keep track of locations with discriminators. Things have changed
since then and I don't think we have unused location ids anymore. Instead,
I made discriminators a part of ad-hoc locations.

The difference from Dehao's work also includes support for discriminator
reading/writing in lto streaming and in modules.

Tested on x86_64-pc-linux-gnu.


0001-Add-instruction-level-discriminator-support.patch
Description: 0001-Add-instruction-level-discriminator-support.patch

[x86 PATCH] Add peephole2 to reduce double word register shuffling.

2022-06-02 Thread Roger Sayle

The simple test case below demonstrates an interesting register
allocation challenge facing x86_64, imposed by ABI requirements
on int128.

__int128 foo(__int128 x, __int128 y)
{
  return x+y;
}

For which GCC currently generates the unusual sequence:

movq    %rsi, %rax
movq    %rdi, %r8
movq    %rax, %rdi
movq    %rdx, %rax
movq    %rcx, %rdx
addq    %r8, %rax
adcq    %rdi, %rdx
ret

The challenge is that the x86_64 ABI requires passing the first __int128,
x, in %rsi:%rdi (highpart in %rsi, lowpart in %rdi), where internally
GCC prefers TI mode (double word) integers to be register allocated as
%rdi:%rsi (highpart in %rdi, lowpart in %rsi).  So after reload, we have
four mov instructions, two to move the double word to temporary registers
and then two to move them back.

This patch adds a peephole2 to spot this register shuffling, and with
-Os generates a xchg instruction, to produce:

xchgq   %rsi, %rdi
movq    %rdx, %rax
movq    %rcx, %rdx
addq    %rsi, %rax
adcq    %rdi, %rdx
ret

or when optimizing for speed, a three mov sequence, using just one of
the temporary registers, which ultimately results in the improved:

movq    %rdi, %r8
movq    %rdx, %rax
movq    %rcx, %rdx
addq    %r8, %rax
adcq    %rsi, %rdx
ret

I've a follow-up patch which improves things further, and with the
output in flux, I'd like to add the new testcase with part 2, once
we're back down to requiring only two movq instructions.
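
Editorial illustration of the transformation described above, modelled in
plain C rather than RTL.  The struct and function names are hypothetical;
the two fields stand in for the pair of hard registers being exchanged.
The first function mirrors the four-move shape the peephole2 matches (two
temporaries), the second the three-move replacement (one temporary).

```c
#include <stdint.h>

struct reg_pair {
  uint64_t a;  /* stands in for the first hard register */
  uint64_t b;  /* stands in for the second hard register */
};

/* Swap using two temporaries: four moves.  */
static struct reg_pair swap_four_moves(struct reg_pair r)
{
  uint64_t t0 = r.a;  /* move 1 */
  uint64_t t2 = r.b;  /* move 2 */
  r.b = t0;           /* move 3 */
  r.a = t2;           /* move 4 */
  return r;
}

/* Swap using one temporary: three moves.  */
static struct reg_pair swap_three_moves(struct reg_pair r)
{
  uint64_t tmp = r.a; /* move 1 */
  r.a = r.b;          /* move 2 */
  r.b = tmp;          /* move 3 */
  return r;
}
```

Both functions leave the pair in the same final state, which is why the
shorter sequence is a valid replacement once the temporaries are known to
be dead afterwards.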

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32} with
no new failures.  Ok for mainline?


2022-06-02  Roger Sayle  

gcc/ChangeLog
* config/i386/i386.md (define_peephole2): Recognize double word
swap sequences, and replace them with more efficient idioms,
including using xchg when optimizing for size.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 2b1d65b..f3cf6e2 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3016,6 +3016,36 @@
   [(parallel [(set (match_dup 1) (match_dup 2))
  (set (match_dup 2) (match_dup 1))])])
 
+;; Replace a double word swap that requires 4 mov insns with a
+;; 3 mov insn implementation (or an xchg when optimizing for size).
+(define_peephole2
+  [(set (match_operand:DWIH 0 "general_reg_operand")
+   (match_operand:DWIH 1 "general_reg_operand"))
+   (set (match_operand:DWIH 2 "general_reg_operand")
+   (match_operand:DWIH 3 "general_reg_operand"))
+   (clobber (match_operand: 4 "general_reg_operand"))
+   (set (match_dup 3) (match_dup 0))
+   (set (match_dup 1) (match_dup 2))]
+  "REGNO (operands[0]) != REGNO (operands[3])
+   && REGNO (operands[1]) != REGNO (operands[2])
+   && REGNO (operands[1]) != REGNO (operands[3])
+   && REGNO (operands[3]) == REGNO (operands[4])
+   && peep2_reg_dead_p (4, operands[0])
+   && peep2_reg_dead_p (5, operands[2])"
+  [(parallel [(set (match_dup 1) (match_dup 3))
+ (set (match_dup 3) (match_dup 1))])]
+{
+  if (!optimize_insn_for_size_p ())
+{
+  rtx tmp = REGNO (operands[0]) > REGNO (operands[2]) ? operands[0]
+ : operands[2];
+  emit_move_insn (tmp, operands[1]);
+  emit_move_insn (operands[1], operands[3]);
+  emit_move_insn (operands[3], tmp);
+  DONE;
+}
+})
+
 (define_expand "movstrict"
   [(set (strict_low_part (match_operand:SWI12 0 "register_operand"))
(match_operand:SWI12 1 "general_operand"))]

Re: [PATCH] [PR105665] ivopts: check defs of names in base for undefs

2022-06-02 Thread Alexandre Oliva via Gcc-patches

On Jun  1, 2022, Alexandre Oliva  wrote:

> Now I'm thinking we can go for an even stricter predicate to disable
> the optimization: if a non-PHI use of a maybe-undefined dominates the
> loop, then we can still perform the optimization:

Here it is.

[PR105665] ivopts: check defs of names in base for undefs

From: Alexandre Oliva 

The patch for PR 100810 tested for undefined SSA_NAMEs appearing
directly in the base expression of the potential IV candidate, but
that's not enough.  The testcase for PR105665 shows an undefined
SSA_NAME has the same ill effect if it's referenced as a PHI_NODE arg
in the referenced SSA_NAME.  The variant of that test shows it can be
further removed from the referenced SSA_NAME.

To avoid deep recursion, precompute maybe-undefined SSA_NAMEs: start
from known-undefined nonvirtual default defs, and propagate them to
any PHI nodes reached by a maybe-undefined arg, as long as there
aren't intervening non-PHI uses, that would imply the maybe-undefined
name must be defined at that point, otherwise it would invoke
undefined behavior.  Also test for intervening non-PHI uses of DEFs in
the base expr.

The test for intervening uses implemented herein relies on dominance;
this could be further extended, regarding conditional uses in every
path leading to a point as an unconditional use dominating that point,
but I haven't implemented that.
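
The precomputation described above can be sketched outside of GCC as a plain worklist propagation. This is only an illustrative model, not the patch's code: `Name`, `PhiArg`, `use_dominates` and `mark_maybe_undef` are hypothetical names, and the `use_dominates` flag stands in for the dominance test on intervening non-PHI uses.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <vector>

// Hypothetical toy model of SSA names; none of these types exist in GCC.
struct PhiArg {
  int name;            // index of the SSA name used as a PHI argument
  bool use_dominates;  // assumed: a non-PHI use of `name` dominates this PHI
};

struct Name {
  bool undefined_default = false;  // known-undefined nonvirtual default def
  std::vector<PhiArg> phi_args;    // empty unless the name is defined by a PHI
};

// Seed with the known-undefined names, then propagate the mark to every PHI
// reached by a maybe-undefined argument, unless an intervening non-PHI use
// implies that the argument must be defined at that point.
std::vector<bool> mark_maybe_undef(const std::vector<Name>& names) {
  std::vector<bool> maybe_undef(names.size(), false);
  std::queue<int> work;

  for (std::size_t i = 0; i < names.size(); ++i)
    if (names[i].undefined_default) {
      maybe_undef[i] = true;
      work.push(static_cast<int>(i));
    }

  while (!work.empty()) {
    int var = work.front();
    work.pop();
    for (std::size_t phi = 0; phi < names.size(); ++phi) {
      if (maybe_undef[phi])
        continue;  // already marked, and therefore already queued once
      for (const PhiArg& arg : names[phi].phi_args) {
        assert(arg.name >= 0 && static_cast<std::size_t>(arg.name) < names.size());
        if (arg.name != var || arg.use_dominates)
          continue;
        maybe_undef[phi] = true;
        work.push(static_cast<int>(phi));
        break;
      }
    }
  }
  return maybe_undef;
}
```

Each name is queued at most once, so the propagation terminates without recursion, which is the point of precomputing the marks.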

for  gcc/ChangeLog

PR tree-optimization/105665
PR tree-optimization/100810
* tree-ssa-loop-ivopts.cc
(ssa_name_maybe_undef_p, ssa_name_set_maybe_undef): New.
(ssa_name_any_use_dominates_bb_p, mark_ssa_maybe_undefs): New.
(find_ssa_undef): Check precomputed flag and intervening uses.
(tree_ssa_iv_optimize): Call mark_ssa_maybe_undefs.

for  gcc/testsuite/ChangeLog

PR tree-optimization/105665
PR tree-optimization/100810
* gcc.dg/torture/pr105665.c: New.
---
 gcc/testsuite/gcc.dg/torture/pr105665.c |   20 +
 gcc/tree-ssa-loop-ivopts.cc |  124 ++-
 2 files changed, 140 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr105665.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr105665.c b/gcc/testsuite/gcc.dg/torture/pr105665.c
new file mode 100644
index 0..34cfc65843495
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr105665.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+
+int a, b, c[1], d[2], *e = c;
+int main() {
+  int f = 0;
+  for (; b < 2; b++) {
+int g;
+if (f)
+  g++, b = 40;
+a = d[b * b];
+for (f = 0; f < 3; f++) {
+  if (e)
+break;
+  g--;
+  if (a)
+a = g;
+}
+  }
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
index 81b536f930415..f20a985d7ca22 100644
--- a/gcc/tree-ssa-loop-ivopts.cc
+++ b/gcc/tree-ssa-loop-ivopts.cc
@@ -3071,13 +3071,128 @@ get_loop_invariant_expr (struct ivopts_data *data, tree inv_expr)
   return *slot;
 }

-/* Find the first undefined SSA name in *TP.  */
+/* Return TRUE iff VAR is marked as maybe-undefined.  See
+   mark_ssa_maybe_undefs.  */
+
+static inline bool
+ssa_name_maybe_undef_p (tree var)
+{
+  gcc_checking_assert (TREE_CODE (var) == SSA_NAME);
+  return TREE_VISITED (var);
+}
+
+/* Set (or clear, depending on VALUE) VAR's maybe-undefined mark.  */
+
+static inline void
+ssa_name_set_maybe_undef (tree var, bool value = true)
+{
+  gcc_checking_assert (TREE_CODE (var) == SSA_NAME);
+  TREE_VISITED (var) = value;
+}
+
+/* Return TRUE iff there are any non-PHI uses of VAR that dominate the
+   end of BB.  If we return TRUE and BB is a loop header, then VAR may
+   be assumed to be defined within the loop, even if it is marked as
+   maybe-undefined.  */
+
+static inline bool
+ssa_name_any_use_dominates_bb_p (tree var, basic_block bb)
+{
+  imm_use_iterator iter;
+  use_operand_p use_p;
+  FOR_EACH_IMM_USE_FAST (use_p, iter, var)
+{
+  if (is_a <gphi *> (USE_STMT (use_p)))
+   continue;
+  basic_block dombb = gimple_bb (USE_STMT (use_p));
+  if (dominated_by_p (CDI_DOMINATORS, bb, dombb))
+   return true;
+}
+
+  return false;
+}
+
+/* Mark as maybe_undef any SSA_NAMEs that are unsuitable as ivopts
+   candidates for potentially involving undefined behavior.  */
+
+static void
+mark_ssa_maybe_undefs (void)
+{
+  auto_vec<tree> queue;
+
+  /* Scan all SSA_NAMEs, marking the definitely-undefined ones as
+ maybe-undefined and queuing them for propagation, while clearing
+ the mark on others.  */
+  unsigned int i;
+  tree var;
+  FOR_EACH_SSA_NAME (i, var, cfun)
+{
+  if (SSA_NAME_IS_VIRTUAL_OPERAND (var)
+ || !ssa_undefined_value_p (var, false))
+   ssa_name_set_maybe_undef (var, false);
+  else
+   {
+ ssa_name_set_maybe_undef (var);
+ queue.safe_push (var);
+ if (dump_file)
+   fprintf (dump_file, "marking _%i as maybe-undef\n",
+SSA_NAME_VERSION (var));
+

Re: [x86_64 PATCH] PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti.

2022-06-02 Thread Hongtao Liu via Gcc-patches

On Thu, Jun 2, 2022 at 2:24 PM Roger Sayle  wrote:
>
>
> This patch resolves PR target/105791 which is a regression that was
> accidentally introduced for my workaround to PR tree-optimization/10566.
> (a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it
> shouldn't).  The latest issue is that by providing a vcond_mask_v1tiv1ti
> pattern in sse.md, the backend now calls ix86_expand_sse_movcc with
> V1TImode operands, which has a special case for TARGET_XOP to generate
> a vpcmov instruction.  Unfortunately, there wasn't previously a V1TImode
> variant, xop_pcmov_v1ti, so we'd ICE.
>
> This is easily fixed by adding V1TImode (and V2TImode) to V_128_256
> which is only used for defining XOP's vpcmov instruction.  This in turn
> requires V1TI (and V2TI) to be supported by <avxsizesuffix> (though
> the use of <avxsizesuffix> in the names xop_pcmov_<mode><avxsizesuffix>
> seems unnecessary; the mode makes the name unique).
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
LGTM.
>
>
> 2022-06-02  Roger Sayle  
>
> gcc/ChangeLog
> PR target/105791
> * config/i386/sse.md (V_128_256): Add V1TI and V2TI.
> (define_mode_attr avxsizesuffix): Add support for V1TI and V2TI.
>
> gcc/testsuite/ChangeLog
> PR target/105791
> * gcc.target/i386/pr105791.c: New test case.
>
>
> Thanks in advance. Sorry for the inconvenience/breakage.
> Roger
> --
>


-- 
BR,
Hongtao

[PATCH] tree-optimization/105802 - another unswitching type issue

2022-06-02 Thread Richard Biener via Gcc-patches

This also fixes the type of the irange used for unswitching of
switch statements.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.
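
As a rough illustration of the idea (not the actual irange API), the fix amounts to converting the case-label bounds to the switch index type before building the range. In the sketch below, `Ty`, `Value`, `Range`, `convert` and `label_range` are all hypothetical stand-ins; `convert` plays the role of fold_convert.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for tree types; not GCC's representation.
enum class Ty { Int, Enum };

struct Value {
  Ty type;
  std::int64_t v;
};

// Stand-in for fold_convert: re-type a constant, keeping its value.
Value convert(Ty to, Value x) { return {to, x.v}; }

struct Range {
  Ty type;
  std::int64_t lo, hi;

  // Both bounds must already share one type.
  static Range make(Value a, Value b) {
    assert(a.type == b.type);
    return {a.type, a.v, b.v};
  }

  // Combining ranges of different types is treated as an error here.
  bool intersects(const Range& o) const {
    assert(type == o.type);
    return lo <= o.hi && o.lo <= hi;
  }
};

// Build the range of one case label in the type of the switch index.
// `case_high` is null for a single-value label.
Range label_range(Ty idx_type, Value case_low, const Value* case_high) {
  Value low = convert(idx_type, case_low);
  Value high = case_high ? convert(idx_type, *case_high) : low;
  return Range::make(low, high);
}
```

With the bounds converted first, the label range can be combined with a range computed for the index without a type mismatch.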

PR tree-optimization/105802
* tree-ssa-loop-unswitch.cc (find_unswitching_predicates_for_bb):
Make sure to also compute the range in the type of the switch index.

* g++.dg/opt/pr105802.C: New testcase.
---
 gcc/testsuite/g++.dg/opt/pr105802.C | 23 +++
 gcc/tree-ssa-loop-unswitch.cc   | 17 +++--
 2 files changed, 30 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/opt/pr105802.C

diff --git a/gcc/testsuite/g++.dg/opt/pr105802.C b/gcc/testsuite/g++.dg/opt/pr105802.C
new file mode 100644
index 000..2514245d00a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/pr105802.C
@@ -0,0 +1,23 @@
+// { dg-do compile }
+// { dg-options "-O3" }
+
+enum E { E0, E1 };
+
+void bar ();
+void baz ();
+
+int c;
+
+void
+foo (int i)
+{
+  E e = (E) i;
+  while (c)
+switch (e)
+  {
+  case E0:
+bar ();
+  case E1:
+baz ();
+  }
+}
diff --git a/gcc/tree-ssa-loop-unswitch.cc b/gcc/tree-ssa-loop-unswitch.cc
index 2b6013e9d69..61c04ed9f2e 100644
--- a/gcc/tree-ssa-loop-unswitch.cc
+++ b/gcc/tree-ssa-loop-unswitch.cc
@@ -523,22 +523,19 @@ find_unswitching_predicates_for_bb (basic_block bb, class loop *loop,
  tree lab = gimple_switch_label (stmt, i);
  tree cmp;
  int_range<2> lab_range;
+ tree low = fold_convert (idx_type, CASE_LOW (lab));
  if (CASE_HIGH (lab) != NULL_TREE)
{
- tree cmp1 = fold_build2 (GE_EXPR, boolean_type_node, idx,
-  fold_convert (idx_type,
-CASE_LOW (lab)));
- tree cmp2 = fold_build2 (LE_EXPR, boolean_type_node, idx,
-  fold_convert (idx_type,
-CASE_HIGH (lab)));
+ tree high = fold_convert (idx_type, CASE_HIGH (lab));
+ tree cmp1 = fold_build2 (GE_EXPR, boolean_type_node, idx, low);
+ tree cmp2 = fold_build2 (LE_EXPR, boolean_type_node, idx, high);
  cmp = fold_build2 (BIT_AND_EXPR, boolean_type_node, cmp1, cmp2);
- lab_range.set (CASE_LOW (lab), CASE_HIGH (lab));
+ lab_range.set (low, high);
}
  else
{
- cmp = fold_build2 (EQ_EXPR, boolean_type_node, idx,
-fold_convert (idx_type, CASE_LOW (lab)));
- lab_range.set (CASE_LOW (lab));
+ cmp = fold_build2 (EQ_EXPR, boolean_type_node, idx, low);
+ lab_range.set (low);
}
 
  /* Combine the expression with the existing one.  */
-- 
2.35.3

[x86_64 PATCH] PR target/105791: Add V1TI to V_128_256 for xop_pcmov_v1ti.

2022-06-02 Thread Roger Sayle


This patch resolves PR target/105791 which is a regression that was
accidentally introduced for my workaround to PR tree-optimization/10566.
(a deeper problem in GCC's vectorizer creating VEC_COND_EXPR when it
shouldn't).  The latest issue is that by providing a vcond_mask_v1tiv1ti
pattern in sse.md, the backend now calls ix86_expand_sse_movcc with
V1TImode operands, which has a special case for TARGET_XOP to generate
a vpcmov instruction.  Unfortunately, there wasn't previously a V1TImode
variant, xop_pcmov_v1ti, so we'd ICE.

This is easily fixed by adding V1TImode (and V2TImode) to V_128_256
which is only used for defining XOP's vpcmov instruction.  This in turn
requires V1TI (and V2TI) to be supported by <avxsizesuffix> (though
the use of <avxsizesuffix> in the names xop_pcmov_<mode><avxsizesuffix>
seems unnecessary; the mode makes the name unique).
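
To make the naming concrete, here is a small sketch of how the pattern names come out, assuming the insn name template is xop_pcmov_<mode><avxsizesuffix> and using only a subset of the suffix table from the patch. `kAvxSizeSuffix` and `pcmov_name` are hypothetical names for illustration; this is not how the machine-description generator is implemented.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Subset of the avxsizesuffix mode attribute, including the new TI entries.
const std::map<std::string, std::string> kAvxSizeSuffix = {
    {"V4DI", "256"}, {"V2DI", ""}, {"V2TI", "256"}, {"V1TI", ""},
};

// Expand the assumed template "xop_pcmov_<mode><avxsizesuffix>" for one mode.
// <mode> is the lower-case mode name; a mode missing from the table throws
// std::out_of_range, mirroring the need for every iterated mode to have
// an attribute entry.
std::string pcmov_name(const std::string& mode) {
  std::string lower = mode;
  std::transform(lower.begin(), lower.end(), lower.begin(),
                 [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
  return "xop_pcmov_" + lower + kAvxSizeSuffix.at(mode);
}
```

Under these assumptions, V1TI expands to xop_pcmov_v1ti and V2TI to xop_pcmov_v2ti256.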

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-06-02  Roger Sayle  

gcc/ChangeLog
PR target/105791
* config/i386/sse.md (V_128_256): Add V1TI and V2TI.
(define_mode_attr avxsizesuffix): Add support for V1TI and V2TI.

gcc/testsuite/ChangeLog
PR target/105791
* gcc.target/i386/pr105791.c: New test case.


Thanks in advance. Sorry for the inconvenience/breakage.
Roger
--

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index c2e046e8..8b3163f 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -301,7 +301,8 @@
 
 ;; All 128bit and 256bit vector modes
 (define_mode_iterator V_128_256
-  [V32QI V16QI V16HI V8HI V8SI V4SI V4DI V2DI V16HF V8HF V8SF V4SF V4DF V2DF])
+  [V32QI V16QI V16HI V8HI V8SI V4SI V4DI V2DI V2TI V1TI
+   V16HF V8HF V8SF V4SF V4DF V2DF])
 
 ;; All 512bit vector modes
 (define_mode_iterator V_512 [V64QI V32HI V16SI V8DI V16SF V8DF])
@@ -897,9 +898,9 @@
(V8HI "sse4_1") (V16HI "avx")])
 
 (define_mode_attr avxsizesuffix
-  [(V64QI "512") (V32HI "512") (V16SI "512") (V8DI "512")
-   (V32QI "256") (V16HI "256") (V8SI "256") (V4DI "256")
-   (V16QI "") (V8HI "") (V4SI "") (V2DI "")
+  [(V64QI "512") (V32HI "512") (V16SI "512") (V8DI "512") (V4TI "512")
+   (V32QI "256") (V16HI "256") (V8SI "256") (V4DI "256") (V2TI "256")
+   (V16QI "") (V8HI "") (V4SI "") (V2DI "") (V1TI "")
(V32HF "512") (V16SF "512") (V8DF "512")
(V16HF "256") (V8SF "256") (V4DF "256")
(V8HF "") (V4SF "") (V2DF "")])
diff --git a/gcc/testsuite/gcc.target/i386/pr105791.c b/gcc/testsuite/gcc.target/i386/pr105791.c
new file mode 100644
index 000..55e278b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105791.c
@@ -0,0 +1,13 @@
+/* { dg-do compile { target int128 } } */
+/* { dg-options "-O2 -mxop" } */
+typedef __int128 __attribute__((__vector_size__ (sizeof (__int128)))) U;
+typedef int __attribute__((__vector_size__ (sizeof (int)))) V;
+
+U u;
+V v;
+
+U
+foo (void)
+{
+  return (0 != __builtin_convertvector (v, U)) <= (0 != u);
+}
