Re: [PATCH v2] Support --disable-fixincludes.

2022-05-24 Thread Alexandre Oliva via Gcc-patches
On May 24, 2022, Martin Liška  wrote:

> Allways install limits.h and syslimits.h header files
> to include folder.

typo: s/Allways/Always/

I'm a little worried about this bit, too.  limitx.h includes
"syslimits.h", mentioning it should be in the same directory.  Perhaps
it could be left in include-fixed?

The patch also changes syslimits.h from either the fixincluded file or
gsyslimits.h to use gsyslimits.h unconditionally, which seemed wrong at
first.

Now I see how these two hunks work together: syslimits.h will now always
#include_next <limits.h>, which will find it in include-fixed if it's
there, and system header files otherwise.  Nice!, but IMHO the commit
message could be a little more verbose on the extent of the change and
why that (is supposed to) work.


It also looks like install-mkheaders installs limits-related headers for
when fixincludes runs; we could probably skip the whole thing if
fixincludes is disabled, but I'm also worried about how the changes
above might impact post-install fixincludes: if that installs
gsyslimits.h et al in include-fixed while your changes move it to
include, headers might end up in a confusing state.  I haven't worked
out whether that's safe, but there appears to be room for cleanups
there.

gcc/config/mips/t-sdemtk also places syslimits.h explicitly in include/
rather than include-fixed/, as part of disabling fixincludes, which is
good, but it could be cleaned up as well.

I don't see other config fragments that might require adjustments, so I
think the patch looks good; hopefully my worries are unjustified, and
the cleanups it enables could be


We still create the README file in there and install it, even with
fixincludes disabled and thus unavailable, don't we?  That README then
becomes misleading; we might be better off not installing it.


> When --disable-fixincludes is used, then no systen header files
> are fixed by the tools in fixincludes. Moreover, the fixincludes
> tools are not built any longer.

typo: s/systen/system/


Could you please check that a post-install mkheaders still has a
functional limits.h with these changes?  The patch is ok (with the typo
fixes) if so.  The cleanups it enables would be welcome as separate
patches ;-)

Thanks!

-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


[PATCH V3]rs6000: Optimize comparison on rotated 16bits constant

2022-05-24 Thread Jiufu Guo via Gcc-patches
Jiufu Guo via Gcc-patches  writes:

Hi,

This patch is based on:
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/594252.html.
Compared with the previous patch, this patch refines the comments,
renames one function, and supports the case "*p == 0xc001".

When checking eq/ne against a constant that fits in 16 bits once it is
rotated, the comparison can be done on the rotated data instead, which
avoids building the full constant in a register.

As the example in PR103743:
For "in == 0x8000000000000000LL", this patch generates:
rotldi %r3,%r3,16
cmpldi %cr0,%r3,32768
instead of:
li %r9,-1
rldicr %r9,%r9,0,0
cmpd %cr0,%r3,%r9
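
The transformation rests on rotation being a bijection on 64-bit values, so
"x == C" holds exactly when "rotl(x, N) == rotl(C, N)". A minimal sketch of
that idea follows, assuming the constant built by the li/rldicr pair above is
0x8000000000000000. The helper names (rotl64, find_rotation_to_u16,
equal_via_rotation) are hypothetical and are not the functions added by the
patch; the brute-force search returns the smallest workable rotation, which
may differ from the one the patch chooses.

```cpp
#include <cassert>
#include <cstdint>

// Rotate a 64-bit value left by n bits (n is taken modulo 64).
constexpr std::uint64_t rotl64(std::uint64_t x, unsigned n) {
  return (n & 63) == 0 ? x : (x << (n & 63)) | (x >> (64 - (n & 63)));
}

// Hypothetical helper (not from the patch): brute-force search for the
// smallest left rotation that makes c fit an unsigned 16-bit immediate.
// Returns -1 when no rotation works.
inline int find_rotation_to_u16(std::uint64_t c) {
  for (int n = 0; n < 64; ++n)
    if (rotl64(c, static_cast<unsigned>(n)) <= 0xffffu)
      return n;
  return -1;
}

// Compare x against c by rotating both sides by the same amount.
constexpr bool equal_via_rotation(std::uint64_t x, std::uint64_t c, unsigned n) {
  return rotl64(x, n) == rotl64(c, n);
}

// Small self-check that mirrors the rotldi/cmpldi sequence shown above.
inline void rotation_demo() {
  const std::uint64_t c = 0x8000000000000000ull;
  assert(rotl64(c, 16) == 32768u);          // the cmpldi immediate
  assert(equal_via_rotation(c, c, 16));     // equal inputs stay equal
  assert(!equal_via_rotation(1, c, 16));    // unequal inputs stay unequal
}
```

The search loop is only for illustration; a compiler would derive the rotation
from leading and trailing zero counts rather than trying all 64 candidates.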

This patch passes bootstrap and regtest on ppc64 and ppc64le.
Ok for trunk?  Thanks!

BR,
Jiufu

PR target/103743

gcc/ChangeLog:

* config/rs6000/rs6000.cc (rotate_from_leading_zeros_const): New.
(rotate_comparison_ops): New.
(rs6000_generate_compare): Optimize compare on const.

gcc/testsuite/ChangeLog:

* gcc.target/powerpc/pr103743.c: New test.
* gcc.target/powerpc/pr103743_1.c: New test.

---
 gcc/config/rs6000/rs6000.cc   | 103 ++
 gcc/testsuite/gcc.target/powerpc/pr103743.c   |  52 +
 gcc/testsuite/gcc.target/powerpc/pr103743_1.c |  95 
 3 files changed, 250 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr103743_1.c

diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index d4defc855d0..2cfe3c49e85 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -14858,6 +14858,95 @@ rs6000_reverse_condition (machine_mode mode, enum rtx_code code)
 return reverse_condition (code);
 }
 
+/* Check if C can be rotated from an immediate which contains at least
+   CLZ leading zeros.
+
+   Return the number by which C can be rotated from the immediate.
+   Return -1 if C cannot be obtained by such a rotation.  */
+
+static int
+rotate_from_leading_zeros_const (unsigned HOST_WIDE_INT c, int clz)
+{
+  /* case. 0..0xxx: already at least clz zeros.  */
+  int lz = clz_hwi (c);
+  if (lz >= clz)
+    return 0;
+
+  /* case a. 0..0xxx0..0: at least clz zeros.  */
+  int tz = ctz_hwi (c);
+  if (lz + tz >= clz)
+    return tz;
+
+  /* xx0..0xx: rotate enough bits firstly, then check case a.  */
+  const int rot_bits = HOST_BITS_PER_WIDE_INT - clz + 1;
+  unsigned HOST_WIDE_INT rc = (c >> rot_bits) | (c << (clz - 1));
+  tz = ctz_hwi (rc);
+  if (clz_hwi (rc) + tz >= clz)
+    return tz + rot_bits;
+
+  return -1;
+}
+
+/* Check if able to optimize the CMP on rotated operands.
+   "i == C" ==> "rotl(i,N) == rotl(C,N)" if rotl(C,N) fits into
+   immediate operand of cmpldi or cmpdi.
+
+   Return the number by which the operands are rotated from.
+   Return -1 if unable to rotate.  */
+
+static int
+rotate_comparison_ops (rtx cmp, machine_mode mode, bool *sgn_cmp, rtx *cst)
+{
+  /* Now only support compare on DImode, for "== or !=".  */
+  if (mode != DImode)
+    return -1;
+
+  enum rtx_code code = GET_CODE (cmp);
+  if (code != NE && code != EQ)
+    return -1;
+
+  rtx op1 = XEXP (cmp, 1);
+
+  /* The constant would already have been set to reg by previous insn.  */
+  rtx_insn *insn = get_last_insn_anywhere ();
+  rtx src = NULL_RTX;
+  while (!src && insn && INSN_P (insn))
+    {
+      rtx set = single_set (insn);
+      if (set && SET_DEST (set) == op1)
+        src = SET_SRC (set);
+      else
+        insn = PREV_INSN (insn);
+    }
+
+  /* The constant may be in the constant pool.  */
+  if (src && MEM_P (src))
+    src = avoid_constant_pool_reference (src);
+
+  /* Check if able to compare against rotated const.  */
+  if (!(src && CONST_INT_P (src)))
+    return -1;
+
+  unsigned HOST_WIDE_INT C = INTVAL (src);
+  *cst = src;
+
+  /* For case like 0x8765LL, use logical cmpldi.
+     Rotated from 0x8765.  */
+  *sgn_cmp = false;
+  int rot = rotate_from_leading_zeros_const (C, 48);
+  if (rot >= 0)
+    return rot;
+
+  /* For case like 0x8765LL, use sign cmpdi.
+     Rotated from 0x8765.  */
+  *sgn_cmp = true;
+  rot = rotate_from_leading_zeros_const (~C, 49);
+  if (rot >= 0)
+    return rot;
+
+  return -1;
+}
+
 /* Generate a compare for CODE.  Return a brand-new rtx that
represents the result of the compare.  */
 
@@ -14887,6 +14976,20 @@ rs6000_generate_compare (rtx cmp, machine_mode mode)
   else
 comp_mode = CCmode;
 
+  /* "i == C" ==> "rotl(i,N) == rotl(C,N)" if rotl(C,N) only low 16bits.  */
+  bool sgn_cmp = false;
+  rtx cst = NULL_RTX;
+  int rot_bits = rotate_comparison_ops (cmp, mode, &sgn_cmp, &cst);
+  if (rot_bits > 0)
+    {
+      rtx n = GEN_INT (HOST_BITS_PER_WIDE_INT - rot_bits);
+      rtx rot_op0 = gen_reg_rtx (mode);
+      emit_insn (gen_rtx_SET (rot_op0, gen_rtx_ROTATE (mode, op0, n)));
+      op0 = rot_op0;
+      op1 = simplify_gen_binary (ROTATE, mode, cst, n);
+      comp_mode = sgn_cmp ? CCmode : 

Re: [PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-24 Thread Hongtao Liu via Gcc-patches
On Wed, May 25, 2022 at 11:39 AM liuhongt via Gcc-patches
 wrote:
>
> Right now, mem_cost for a separate mem alternative is 1 * frequency, which
> is pretty small and causes the unnecessary SSE spill in the PR. I've tried
> to rework the backend cost model, but RA is still not happy with that (it
> regresses somewhere else). I think the root cause is that the cost for a
> separate 'm' alternative is too small, especially considering that the mov
> cost of a GPR is 2 (the default for REGISTER_MOVE_COST). So this patch
> increases mem_cost to 2 * frequency, and also increases the reg_class cost
> by 1 for the 'm' alternative.
>
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> PR target/105513
> * ira-costs.cc (record_reg_classes): Increase both mem_cost
> and reg class cost by 1 for separate mem alternative when
> REG_P (op).
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr105513-1.c: New test.
> ---
>  gcc/ira-costs.cc   | 26 +-
>  gcc/testsuite/gcc.target/i386/pr105513-1.c | 16 +
>  2 files changed, 31 insertions(+), 11 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr105513-1.c
>
> diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc
> index 964c94a06ef..f7b8325e195 100644
> --- a/gcc/ira-costs.cc
> +++ b/gcc/ira-costs.cc
> @@ -625,7 +625,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
>   for (k = cost_classes_ptr->num - 1; k >= 0; k--)
> {
>   rclass = cost_classes[k];
> - pp_costs[k] = mem_cost[rclass][0] * frequency;
> + pp_costs[k] = (mem_cost[rclass][0]
> ++ 1) * frequency;
> }
> }
>   else
> @@ -648,7 +649,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
>   for (k = cost_classes_ptr->num - 1; k >= 0; k--)
> {
>   rclass = cost_classes[k];
> - pp_costs[k] = mem_cost[rclass][1] * frequency;
> + pp_costs[k] = (mem_cost[rclass][1]
> ++ 1) * frequency;
> }
> }
>   else
> @@ -670,9 +672,9 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
>   for (k = cost_classes_ptr->num - 1; k >= 0; k--)
> {
>   rclass = cost_classes[k];
> - pp_costs[k] = ((mem_cost[rclass][0]
> - + mem_cost[rclass][1])
> -* frequency);
> + pp_costs[k] = (mem_cost[rclass][0]
> ++ mem_cost[rclass][1]
> ++ 2) * frequency;
> }
> }
>   else
> @@ -861,7 +863,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
>   for (k = cost_classes_ptr->num - 1; k >= 0; k--)
> {
>   rclass = cost_classes[k];
> - pp_costs[k] = mem_cost[rclass][0] * frequency;
> + pp_costs[k] = (mem_cost[rclass][0]
> ++ 1) * frequency;
> }
> }
>   else
> @@ -884,7 +887,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
>   for (k = cost_classes_ptr->num - 1; k >= 0; k--)
> {
>   rclass = cost_classes[k];
> - pp_costs[k] = mem_cost[rclass][1] * frequency;
> + pp_costs[k] = (mem_cost[rclass][1]
> ++ 1) * frequency;
> }
> }
>   else
> @@ -906,9 +910,9 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
>   for (k = cost_classes_ptr->num - 1; k >= 0; k--)
> {
>   rclass = cost_classes[k];
> - pp_costs[k] = ((mem_cost[rclass][0]
> - + mem_cost[rclass][1])
> -* frequency);
> + pp_costs[k] = (mem_cost[rclass][0]
> ++ mem_cost[rclass][1]
> ++ 2) * frequency;
> }
> }
>   

Re: [PATCH][_Hashtable] Fix insertion of range of type convertible to value_type PR 105714

2022-05-24 Thread François Dumont via Gcc-patches

Here is the patch to fix just what is described in PR 105714.

    libstdc++: [_Hashtable] Insert range of types convertible to value_type PR 105714

    Fix insertion of range of types convertible to value_type.

    libstdc++-v3/ChangeLog:

    PR libstdc++/105714
    * include/bits/hashtable_policy.h (_ValueTypeEnforcer): New.
    * include/bits/hashtable.h (_Hashtable<>::_M_insert_unique_aux): New.
    (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&, true_type)): Use latters.
    (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&, false_type)): Likewise.
    (_Hashtable(_InputIterator, _InputIterator, size_type, const _Hash&, const _Equal&,
    const allocator_type&, true_type)): Use this.insert range.
    (_Hashtable(_InputIterator, _InputIterator, size_type, const _Hash&, const _Equal&,
    const allocator_type&, false_type)): Use _M_insert.
    * testsuite/23_containers/unordered_map/cons/56112.cc: Check how many times
    conversion is done.
    * testsuite/23_containers/unordered_map/insert/105714.cc: New test.
    * testsuite/23_containers/unordered_set/insert/105714.cc: New test.


Tested under Linux x64, ok to commit ?

François

On 24/05/22 12:31, Jonathan Wakely wrote:

On Tue, 24 May 2022 at 11:22, Jonathan Wakely wrote:

On Tue, 24 May 2022 at 11:18, Jonathan Wakely wrote:

On Thu, 5 May 2022 at 18:38, François Dumont via Libstdc++
 wrote:

Hi

Renewing my patch to fix PR 56112, but for the insert methods. I totally
changed it; it now also works with move-only key types.

I'll let you, Jonathan, find a better name than _ValueTypeEnforcer as usual :-)

libstdc++: [_Hashtable] Insert range of types convertible to value_type
PR 56112

Fix insertion of range of types convertible to value_type. Fix also when
this value_type
has a move-only key_type which also allow converted values to be moved.

libstdc++-v3/ChangeLog:

  PR libstdc++/56112
  * include/bits/hashtable_policy.h (_ValueTypeEnforcer): New.
  * include/bits/hashtable.h
(_Hashtable<>::_M_insert_unique_aux): New.
  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
true_type)): Use latters.
  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
false_type)): Likewise.
  (_Hashtable(_InputIterator, _InputIterator, size_type, const
_Hash&, const _Equal&,
  const allocator_type&, true_type)): Use this.insert range.
  (_Hashtable(_InputIterator, _InputIterator, size_type, const
_Hash&, const _Equal&,
  const allocator_type&, false_type)): Use _M_insert.
  * testsuite/23_containers/unordered_map/cons/56112.cc: Check
how many times conversion
  is done.
  (test02): New test case.
  * testsuite/23_containers/unordered_set/cons/56112.cc: New test.

Tested under Linux x86_64.

Ok to commit ?

No, sorry.

The new test02 function in 23_containers/unordered_map/cons/56112.cc
doesn't compile with libc++ or MSVC either, are you sure that test is
valid? I don't think it is, because S2 is not convertible to
pair. None of the pair constructors are
viable, because the move constructor would require two user-defined
conversions (from S2 to pair and then from
pair to pair). A conversion
sequence cannot have more than one user-defined conversion using a
constructor or conversion operator. So if your patch makes that
compile, it's a bug in the new code. I haven't analyzed that code to
see where the problem is, I'm just looking at the test results and the
changes in behaviour.
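
The rule cited above (at most one user-defined conversion per implicit
conversion sequence) can be checked in isolation. The following is a minimal
sketch with hypothetical types (Source, Middle, Target) that are unrelated to
the S2 and pair types in the test under discussion; it only illustrates the
language rule, not the hashtable code.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical types, chosen only to illustrate the rule.
struct Source {};

struct Middle {
  Middle(const Source&) {}  // user-defined conversion 1: Source -> Middle
};

struct Target {
  Target(const Middle&) {}  // user-defined conversion 2: Middle -> Target
};

// A single user-defined conversion is allowed implicitly.
static_assert(std::is_convertible<Source, Middle>::value, "one step is fine");
static_assert(std::is_convertible<Middle, Target>::value, "one step is fine");

// Source -> Middle -> Target would need two, so it is not implicit.
static_assert(!std::is_convertible<Source, Target>::value,
              "two user-defined conversions are not chained implicitly");

// Writing the first step explicitly leaves one conversion per expression.
inline Target convert_explicitly(const Source& s) {
  return Target(Middle(s));
}
```

A container that accepts such a Source where a Target is required would
therefore be performing a conversion that the language itself rejects in
copy-initialization.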

I meant to include this link showing that libc++ and MSVC reject
test02() as well:

https://godbolt.org/z/j7E9f6bd4

I've created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105717 for
the insertion bug, rather than reopening PR 56112.

diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h
index 5e1a417f7cd..cd42d3c9ba0 100644
--- a/libstdc++-v3/include/bits/hashtable.h
+++ b/libstdc++-v3/include/bits/hashtable.h
@@ -898,21 +898,33 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 	std::pair
-	_M_insert(_Arg&& __arg, const _NodeGenerator& __node_gen,
-		  true_type /* __uks */)
+	_M_insert_unique_aux(_Arg&& __arg, const _NodeGenerator& __node_gen)
 	{
 	  return _M_insert_unique(
 	_S_forward_key(_ExtractKey{}(std::forward<_Arg>(__arg))),
 	std::forward<_Arg>(__arg), __node_gen);
 	}
 
+  template
+	std::pair
+	_M_insert(_Arg&& __arg, const _NodeGenerator& __node_gen,
+		  true_type /* __uks */)
+	{
+	  using __to_value
+	= __detail::_ValueTypeEnforcer<_ExtractKey, value_type>;
+	  return _M_insert_unique_aux(
+	__to_value{}(std::forward<_Arg>(__arg)), __node_gen);
+	}
+
   template
 	iterator
 	_M_insert(_Arg&& __arg, const _NodeGenerator& __node_gen,
 		  false_type __uks)
 	{
-	  return _M_insert(cend(), std::forward<_Arg>(__arg), __node_gen,
-			   __uks);
+	  using __to_value
+	= 

[PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

2022-05-24 Thread liuhongt via Gcc-patches
Right now, mem_cost for a separate mem alternative is 1 * frequency, which
is pretty small and causes the unnecessary SSE spill in the PR. I've tried
to rework the backend cost model, but RA is still not happy with that (it
regresses somewhere else). I think the root cause is that the cost for a
separate 'm' alternative is too small, especially considering that the mov
cost of a GPR is 2 (the default for REGISTER_MOVE_COST). So this patch
increases mem_cost to 2 * frequency, and also increases the reg_class cost
by 1 for the 'm' alternative.
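
To make the arithmetic concrete, here is a minimal sketch of the old and new
cost formulas from the hunks in this patch. The function names are
hypothetical, the parameters stand in for single entries of mem_cost[rclass][]
and for frequency, and the value 1 for the memory cost is an assumption taken
from the description above rather than from any particular target.

```cpp
#include <cassert>

// Old formula for a one-directional 'm' alternative: mem_cost * frequency.
constexpr int old_mem_alt_cost(int mem_cost, int frequency) {
  return mem_cost * frequency;
}

// New formula from the patch: (mem_cost + 1) * frequency.
constexpr int new_mem_alt_cost(int mem_cost, int frequency) {
  return (mem_cost + 1) * frequency;
}

// New formula for the in/out case: (in + out + 2) * frequency.
constexpr int new_mem_alt_cost_inout(int in_cost, int out_cost, int frequency) {
  return (in_cost + out_cost + 2) * frequency;
}

// With a memory cost of 1 the penalty doubles, which brings it level with a
// register move cost of 2 at the same frequency.
static_assert(old_mem_alt_cost(1, 1000) == 1000, "old: 1 * frequency");
static_assert(new_mem_alt_cost(1, 1000) == 2000, "new: 2 * frequency");
```

Under these assumed numbers the memory alternative no longer looks cheaper
than a GPR move, which is the effect the description is after.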


Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

PR target/105513
* ira-costs.cc (record_reg_classes): Increase both mem_cost
and reg class cost by 1 for separate mem alternative when
REG_P (op).

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr105513-1.c: New test.
---
 gcc/ira-costs.cc   | 26 +-
 gcc/testsuite/gcc.target/i386/pr105513-1.c | 16 +
 2 files changed, 31 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr105513-1.c

diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc
index 964c94a06ef..f7b8325e195 100644
--- a/gcc/ira-costs.cc
+++ b/gcc/ira-costs.cc
@@ -625,7 +625,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
  for (k = cost_classes_ptr->num - 1; k >= 0; k--)
{
  rclass = cost_classes[k];
- pp_costs[k] = mem_cost[rclass][0] * frequency;
+ pp_costs[k] = (mem_cost[rclass][0]
++ 1) * frequency;
}
}
  else
@@ -648,7 +649,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
  for (k = cost_classes_ptr->num - 1; k >= 0; k--)
{
  rclass = cost_classes[k];
- pp_costs[k] = mem_cost[rclass][1] * frequency;
+ pp_costs[k] = (mem_cost[rclass][1]
++ 1) * frequency;
}
}
  else
@@ -670,9 +672,9 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
  for (k = cost_classes_ptr->num - 1; k >= 0; k--)
{
  rclass = cost_classes[k];
- pp_costs[k] = ((mem_cost[rclass][0]
- + mem_cost[rclass][1])
-* frequency);
+ pp_costs[k] = (mem_cost[rclass][0]
++ mem_cost[rclass][1]
++ 2) * frequency;
}
}
  else
@@ -861,7 +863,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
  for (k = cost_classes_ptr->num - 1; k >= 0; k--)
{
  rclass = cost_classes[k];
- pp_costs[k] = mem_cost[rclass][0] * frequency;
+ pp_costs[k] = (mem_cost[rclass][0]
++ 1) * frequency;
}
}
  else
@@ -884,7 +887,8 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
  for (k = cost_classes_ptr->num - 1; k >= 0; k--)
{
  rclass = cost_classes[k];
- pp_costs[k] = mem_cost[rclass][1] * frequency;
+ pp_costs[k] = (mem_cost[rclass][1]
++ 1) * frequency;
}
}
  else
@@ -906,9 +910,9 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
  for (k = cost_classes_ptr->num - 1; k >= 0; k--)
{
  rclass = cost_classes[k];
- pp_costs[k] = ((mem_cost[rclass][0]
- + mem_cost[rclass][1])
-* frequency);
+ pp_costs[k] = (mem_cost[rclass][0]
++ mem_cost[rclass][1]
++ 2) * frequency;
}
}
  else
@@ -929,7 +933,7 @@ record_reg_classes (int n_alts, int n_ops, rtx *ops,
/* Although we don't need insn to reload from
   memory, still accessing memory is usually more
   expensive than a 

Re: [PATCH 00/10] Add 'final' and 'override' where missing

2022-05-24 Thread Eric Gallager via Gcc-patches
On Mon, May 23, 2022 at 3:32 PM David Malcolm via Gcc-patches
 wrote:
>
> With C++11 we can add "final" and "override" to the decls of vfuncs
> in derived classes, which documents to both human and automated readers
> of the code that a decl is intended to override a vfunc in a base class,
> and can help catch mistakes where we intended to override a vfunc, but
> messed up the prototypes.
>
> The following patch kit adds "final" and "override" specifiers to the
> decls of vfunc implementations throughout the source tree.
>
> I added "final override" everywhere where this was possible, or just
> "override" for the places where the overridden vfunc gets further
> overridden.
>
> I also removed "virtual" from such decls, since this isn't required
> when overriding an existing vfunc, and the "final override" better
> implies the intent of the code.
>
> I temporarily hacked -Werror=suggest-override into the Makefile whilst
> I was creating the patches, but I skipped the following:
>
> (a) gcc/d/dmd/ ...since these sources are copied from an upstream
> (b) gcc/go/gofrontend/ ...likewise
> (c) gcc/range-op.cc: as I believe this code is under heavy development
> (d) target-specific passes other than i386 (for ease of testing); I can
> do these in a followup, if desired.
>
> I didn't attempt to add -Wsuggest-override into our compile flags
> "properly".

Have you tried clang's -Winconsistent-missing-override flag for
comparison? I noticed some warnings from it when doing a build of gcc
trunk with clang just now.
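
For reference, a minimal sketch of what the specifiers discussed in the quoted
cover letter buy. The class names (pass_base, my_pass) are hypothetical and
are not GCC's opt_pass hierarchy.

```cpp
#include <cassert>
#include <string>

// Hypothetical base class with two virtual functions.
struct pass_base {
  virtual ~pass_base() = default;
  virtual std::string name() const { return "base"; }
  virtual bool gate(int) const { return false; }
};

// "final override" documents the intent and lets the compiler check it.
struct my_pass final : pass_base {
  std::string name() const final override { return "my_pass"; }
  bool gate(int) const final override { return true; }

  // The following would be rejected at compile time, because no base class
  // function has this signature, instead of silently adding a new overload:
  //   bool gate(long) const override;
};
```

Calling through a pass_base reference still dispatches to the derived
implementations; the specifiers change diagnostics, not behaviour.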

>
> No functional changes intended.
>
> Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.
>
> I split them up into separate patches by topic for ease of review, and
> for ease of writing the ChangeLog entries.
>
> Worth an update to https://gcc.gnu.org/codingconventions.html ?
>
> OK for trunk?
> Dave
>
> David Malcolm (10):
>   Add 'final' and 'override' to opt_pass vfunc impls
>   Add 'final' and 'override' on dom_walker vfunc impls
>   expr.cc: use final/override on op_by_pieces_d vfuncs
>   tree-switch-conversion.h: use final/override for cluster vfunc impls
>   d: add 'final' and 'override' to gcc/d/*.cc 'visit' impls
>   ipa: add 'final' and 'override' to call_summary_base vfunc impls
>   value-relation.h: add 'final' and 'override' to relation_oracle vfunc
> impls
>   i386: add 'final' and 'override' to scalar_chain vfunc impls
>   tree-vect-slp-patterns.cc: add 'final' and 'override' to
> vect_pattern::build impls
>   Add 'final' and 'override' in various places
>
>  gcc/adjust-alignment.cc  |  2 +-
>  gcc/asan.cc  | 19 ++---
>  gcc/auto-inc-dec.cc  |  4 +-
>  gcc/auto-profile.cc  |  8 ++--
>  gcc/bb-reorder.cc| 12 +++---
>  gcc/cfgcleanup.cc|  8 ++--
>  gcc/cfgexpand.cc |  2 +-
>  gcc/cfgrtl.cc|  6 +--
>  gcc/cgraphbuild.cc   | 13 +++---
>  gcc/combine-stack-adj.cc |  4 +-
>  gcc/combine.cc   |  4 +-
>  gcc/compare-elim.cc  |  6 +--
>  gcc/config/i386/i386-features.cc | 20 -
>  gcc/config/i386/i386-features.h  | 16 +++
>  gcc/coroutine-passes.cc  |  8 ++--
>  gcc/cp/cxx-pretty-print.h| 38 -
>  gcc/cp/module.cc |  4 +-
>  gcc/cprop.cc |  9 ++--
>  gcc/cse.cc   | 18 +---
>  gcc/d/decl.cc| 36 
>  gcc/d/expr.cc|  2 +-
>  gcc/d/toir.cc| 64 ++--
>  gcc/d/typeinfo.cc| 34 +++
>  gcc/d/types.cc   | 30 ++---
>  gcc/dce.cc   |  8 ++--
>  gcc/df-core.cc   | 10 ++---
>  gcc/dse.cc   | 14 --
>  gcc/dwarf2cfi.cc |  7 ++-
>  gcc/early-remat.cc   |  4 +-
>  gcc/except.cc|  6 +--
>  gcc/expr.cc  | 14 +++---
>  gcc/final.cc | 14 --
>  gcc/function.cc  | 10 ++---
>  gcc/fwprop.cc|  8 ++--
>  gcc/gcse.cc  | 14 --
>  gcc/genmatch.cc  | 22 +-
>  gcc/gensupport.cc|  2 +-
>  gcc/gimple-harden-conditionals.cc| 20 ++---
>  gcc/gimple-if-to-switch.cc   |  4 +-
>  gcc/gimple-isel.cc   |  4 +-
>  gcc/gimple-laddress.cc   |  6 +--
>  gcc/gimple-loop-interchange.cc   |  6 +--
>  gcc/gimple-loop-jam.cc   |  4 +-
>  gcc/gimple-loop-versioning.cc|  7 ++-
>  gcc/gimple-low.cc|  5 ++-
>  gcc/gimple-range-cache.h |  4 +-
>  gcc/gimple-ssa-backprop.cc   |  6 +--
>  gcc/gimple-ssa-evrp.cc  

[PATCH v3] libstdc++: fix pointer type exception catch [PR105387]

2022-05-24 Thread Jakob Hasse via Gcc-patches
Hello,

Two weeks ago I submitted the second version of the patch for bug 105387.
I have now added a pointer-to-member exception test, just to make sure that
the patch doesn't break anything when RTTI is enabled. The test is disabled
if RTTI is disabled. I haven't received any feedback on the second version
so far. Is there any issue preventing acceptance?

I ran the conformance tests on libstdc++v3 by running
make -j 18 check RUNTESTFLAGS=conformance.exp

Results for the current version (only difference is the added pointer-to-member 
test):

Without RTTI before applying patch:
=== libstdc++ Summary ===

# of expected passes 14560
# of unexpected failures 5
# of expected failures 95
# of unsupported tests 702

Without RTTI after applying patch:
=== libstdc++ Summary ===

# of expected passes 14562
# of unexpected failures 5
# of expected failures 95
# of unsupported tests 703

With RTTI before applying patch:
=== libstdc++ Summary ===

# of expected passes 14598
# of unexpected failures 2
# of expected failures 95
# of unsupported tests 683

With RTTI after applying patch:
=== libstdc++ Summary ===

# of expected passes 14602
# of unexpected failures 2
# of expected failures 95
# of unsupported tests 683

Given that the pointer-to-member test is disabled when RTTI is disabled, the 
results look logical to me.
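
The behaviour the patch protects can be stated as a small standalone check: a
thrown non-pointer exception must skip a pointer handler and reach the
matching class handler. The sketch below follows the shape of portable_test in
the patch, but reports the outcome through a return value instead of the
testsuite's VERIFY macro; the function name is hypothetical.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>

// Returns true when a thrown std::runtime_error bypasses the pointer handler
// and is caught by the std::exception handler.
inline bool runtime_error_skips_pointer_handler() {
  try {
    throw std::runtime_error("test");
  } catch (const char*) {
    return false;  // a non-pointer exception must never match here
  } catch (const std::exception&) {
    return true;   // the expected handler
  }
}
```

On an implementation with the bug, the first handler's type comparison reads
fields that do not exist in the thrown type's type_info, so the outcome is
undefined rather than reliably false.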

From 26004c6f26f4b2f3e664184767d861c7291f3a16 Mon Sep 17 00:00:00 2001
From: Jakob Hasse <0xja...@users.noreply.github.com>
Date: Tue, 26 Apr 2022 12:03:47 +0800
Subject: [PATCH] [PATCH] libstdc++: fix pointer type exception catch (no RTTI)
 [PR105387]

PR libstdc++/105387

__pbase_type_info::__do_catch(), used to catch pointer type exceptions,
did not check if the type info object to compare against is a pointer
type info object before doing a static down-cast to a pointer type info
object. If RTTI is disabled, this leads to the following situation:
Since a pointer type info object has additional fields, they would
end up being undefined if the actual type info object was not a pointer
type info object.

A simple check has been added before the down-cast happens.

Note that a consequence of this check is that exceptions of type
pointer-to-member cannot be caught anymore.

In case RTTI is enabled, this does not seem to be a problem because
RTTI-based checks would run before and prevent running into the bad
down-cast. Hence, the fix is disabled if RTTI is enabled and exceptions
of type pointer-to-member can still be caught.

libstdc++-v3/ChangeLog:

	* libsupc++/pbase_type_info.cc (__do_catch):
	* testsuite/18_support/105387.cc: New test.

Signed-off-by: Jakob Hasse 
---
 libstdc++-v3/libsupc++/pbase_type_info.cc |  7 ++-
 libstdc++-v3/testsuite/18_support/105387.cc   | 61 +++
 .../18_support/exception_ptr/ptr_to_member.cc | 25 
 3 files changed, 92 insertions(+), 1 deletion(-)
 create mode 100644 libstdc++-v3/testsuite/18_support/105387.cc
 create mode 100644 libstdc++-v3/testsuite/18_support/exception_ptr/ptr_to_member.cc

diff --git a/libstdc++-v3/libsupc++/pbase_type_info.cc b/libstdc++-v3/libsupc++/pbase_type_info.cc
index 7e5720b84a3..934e049a4e0 100644
--- a/libstdc++-v3/libsupc++/pbase_type_info.cc
+++ b/libstdc++-v3/libsupc++/pbase_type_info.cc
@@ -74,7 +74,12 @@ __do_catch (const type_info *thr_type,
 // Therefore there must at least be a qualification conversion involved
 // But for that to be valid, our outer pointers must be const qualified.
 return false;
-  
+
+#if !__cpp_rtti
+  if (!thr_type->__is_pointer_p ())
+return false;
+#endif
+
   const __pbase_type_info *thrown_type =
     static_cast <const __pbase_type_info *> (thr_type);
 
diff --git a/libstdc++-v3/testsuite/18_support/105387.cc b/libstdc++-v3/testsuite/18_support/105387.cc
new file mode 100644
index 000..5cec222c334
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/105387.cc
@@ -0,0 +1,61 @@
+#include 
+#include 
+#include 
+
+// Test cases for PR libstdc++/105387
+
+// This test is to trigger undefined behavior if the bug 105387 is present
+// in the code. Note, however, given that the bug is present, this test runs
+// into undefined behavior which can also mean that it passes.
+// It has been observed to fail quite reliably on x86_64-linux-gnu but only
+// fail sporadically on Xtensa, depending on the code placement.
+void portable_test()
+{
+  bool exception_thrown = false;
+  try {
+throw std::runtime_error("test");
+  } catch (const char *e) {
+VERIFY(false);
+  } catch (const std::exception ) {
+exception_thrown = true;
+  }
+  VERIFY(exception_thrown);
+}
+
+// This test relies on the types defined in the files typeinfo and cxxabi.h
+// It is therefore less portable than the test case above but should be
+// guaranteed to fail if the implementation has the bug 105387.
+//
+// This test case checks that __pbase_type_info::__do_catch() behaves
+// correctly when called with a non-pointer type info object as argument.
+// In particular, 

Re: [PATCH v2] RISC-V: Enable overlap-by-pieces in case of fast unaliged access

2022-05-24 Thread Palmer Dabbelt

On Tue, 24 May 2022 18:36:27 PDT (-0700), Vineet Gupta wrote:



On 5/24/22 18:32, Palmer Dabbelt wrote:


Ping, IMO this needs to be (re)considered for trunk.
This goes really nicely with riscv_slow_unaligned_access_p==false, to
elide the unrolled tail copies for trailer word/sword/byte accesses.

@Kito, @Palmer ? Just from codegen pov this seems to be a no brainer


Has anything changed since this was posted?

IIRC the discussion essentially boiled down to that overlapping store
likely being a hard case on in-order machines (like the C906), but
there weren't any benchmarks or documentation so we could figure that
out.  I don't see how this is an obvious win: sure it's fewer ops (and
assuming a uniform distribution fewer misaligned accesses, though I
don't know how reasonable uniform distributions are here), but it's
only a small upside so that hard case would have to be fast in order
for this to be better code.

If someone has benchmarks showing these are actually faster on the
C906 (or even some documentation describing how these accesses are
handled) then I'm happy to take the code (with the -Os bit fixed).  It
shouldn't be all that hard of a benchmark to run...


Will this be acceptable, if this was a per cpu knob then ? There seem to
be existing OoO RV cores too !


It's being added as a per-cpu knob, it's just only being turned on for 
the C906 and -Os tunings where it's not obviously a win.


I'm certainly not saying nobody builds this flavor of machine, certainly 
Intel does as it's on for their machines, just that there's no solid 
evidence the C906 behaves this way.  Given that this flag had been 
explicitly discussed not to include generating misaligned accesses on 
purpose during the Os discussions, I don't want to just flip it over on 
a vendor and risk a performance regression.


The only other pipeline models are for in-order SiFive processors that 
trap into M-mode for unaligned accesses, so this sort of thing doesn't 
apply (though it's part of the reason -Os doesn't do this, as they're 
still pretty common).







foo:
     sd    zero,0(a0)
     sw    zero,8(a0)
     sh    zero,12(a0)
     sb    zero,14(a0)

vs.

     sd    zero,0(a0)
     sd    zero,7(a0)




Re: [PATCH v2] RISC-V: Enable overlap-by-pieces in case of fast unaliged access

2022-05-24 Thread Vineet Gupta




On 5/24/22 18:32, Palmer Dabbelt wrote:


Ping, IMO this needs to be (re)considered for trunk.
This goes really nicely with riscv_slow_unaligned_access_p==false, to
elide the unrolled tail copies for trailer word/sword/byte accesses.

@Kito, @Palmer ? Just from codegen pov this seems to be a no brainer


Has anything changed since this was posted?

IIRC the discussion essentially boiled down to that overlapping store 
likely being a hard case on in-order machines (like the C906), but 
there weren't any benchmarks or documentation so we could figure that 
out.  I don't see how this is an obvious win: sure it's fewer ops (and 
assuming a uniform distribution fewer misaligned accesses, though I 
don't know how reasonable uniform distributions are here), but it's 
only a small upside so that hard case would have to be fast in order 
for this to be better code.


If someone has benchmarks showing these are actually faster on the 
C906 (or even some documentation describing how these accesses are 
handled) then I'm happy to take the code (with the -Os bit fixed).  It 
shouldn't be all that hard of a benchmark to run...


Will this be acceptable, if this was a per cpu knob then ? There seem to 
be existing OoO RV cores too !





foo:
     sd    zero,0(a0)
     sw    zero,8(a0)
     sh    zero,12(a0)
     sb    zero,14(a0)

vs.

     sd    zero,0(a0)
     sd    zero,7(a0)






Re: [PATCH v2] RISC-V: Enable overlap-by-pieces in case of fast unaliged access

2022-05-24 Thread Palmer Dabbelt

On Tue, 24 May 2022 17:55:24 PDT (-0700), Vineet Gupta wrote:



On 7/22/21 15:41, Christoph Muellner via Gcc-patches wrote:

This patch enables the overlap-by-pieces feature of the by-pieces
infrastructure for inlining builtins in case the target has set
riscv_slow_unaligned_access_p to false.

An example to demonstrate the effect for targets with fast unaligned
access (targets that have slow_unaligned_access set to false) is
the code that is generated for "memset (p, 0, 15);", where the
alignment of p is unknown:

   Without overlap_op_by_pieces we get:
 8e:   00053023    sd  zero,0(a0)
 92:   00052423    sw  zero,8(a0)
 96:   00051623    sh  zero,12(a0)
 9a:   00050723    sb  zero,14(a0)

   With overlap_op_by_pieces we get:
 7e:   00053023    sd  zero,0(a0)
 82:   000533a3    sd  zero,7(a0)
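
For readers who want to check that the two-store form touches exactly the same bytes, here is a minimal C sketch. It is not part of the patch, and `zero15_overlap` is a made-up name. It models the two `sd` instructions as two 8-byte copies at offsets 0 and 7, and uses memcpy so the C code itself makes no alignment assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: zero 15 bytes starting at p with two
   overlapping 8-byte stores, mirroring "sd zero,0(a0); sd zero,7(a0)".  */
static void zero15_overlap(unsigned char *p)
{
    const uint64_t zero = 0;

    memcpy(p, &zero, sizeof zero);      /* covers bytes 0..7  */
    memcpy(p + 7, &zero, sizeof zero);  /* covers bytes 7..14 */
}
```

Byte 7 is written twice, which is the overlap under discussion; whether the second, necessarily misaligned, store is cheap depends on the core.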

gcc/ChangeLog:

* config/riscv/riscv.c (riscv_overlap_op_by_pieces): New function.
(TARGET_OVERLAP_OP_BY_PIECES_P): Connect to
riscv_overlap_op_by_pieces.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/builtins-overlap-1.c: New test.
* gcc.target/riscv/builtins-overlap-2.c: New test.
* gcc.target/riscv/builtins-overlap-3.c: New test.
* gcc.target/riscv/builtins-overlap-4.c: New test.
* gcc.target/riscv/builtins-overlap-5.c: New test.
* gcc.target/riscv/builtins-overlap-6.c: New test.
* gcc.target/riscv/builtins-overlap-7.c: New test.
* gcc.target/riscv/builtins-overlap-8.c: New test.
* gcc.target/riscv/builtins-strict-align.c: New test.
* gcc.target/riscv/builtins.h: New test.

Signed-off-by: Christoph Muellner 


Ping, IMO this needs to be (re)considered for trunk.
This goes really nicely with riscv_slow_unaligned_access_p==false, to
elide the unrolled tail copies for trailer word/sword/byte accesses.

@Kito, @Palmer ? Just from codegen pov this seems to be a no brainer


Has anything changed since this was posted?

IIRC the discussion essentially boiled down to that overlapping store
likely being a hard case on in-order machines (like the C906), but there
weren't any benchmarks or documentation from which we could figure that
out.  I don't see how this is an obvious win: sure, it's fewer ops (and,
assuming a uniform distribution, fewer misaligned accesses, though I
don't know how reasonable uniform distributions are here), but it's only
a small upside, so that hard case would have to be fast in order for this
to be better code.


If someone has benchmarks showing these are actually faster on the C906 
(or even some documentation describing how these accesses are handled) 
then I'm happy to take the code (with the -Os bit fixed).  It shouldn't 
be all that hard of a benchmark to run...



foo:
     sd    zero,0(a0)
     sw    zero,8(a0)
     sh    zero,12(a0)
     sb    zero,14(a0)

vs.

     sd    zero,0(a0)
     sd    zero,7(a0)

-Vineet


---
  gcc/config/riscv/riscv.c | 11 +++
  .../gcc.target/riscv/builtins-overlap-1.c| 10 ++
  .../gcc.target/riscv/builtins-overlap-2.c| 10 ++
  .../gcc.target/riscv/builtins-overlap-3.c| 10 ++
  .../gcc.target/riscv/builtins-overlap-4.c| 10 ++
  .../gcc.target/riscv/builtins-overlap-5.c| 11 +++
  .../gcc.target/riscv/builtins-overlap-6.c| 13 +
  .../gcc.target/riscv/builtins-overlap-7.c| 11 +++
  .../gcc.target/riscv/builtins-overlap-8.c| 11 +++
  .../gcc.target/riscv/builtins-strict-align.c | 10 ++
  gcc/testsuite/gcc.target/riscv/builtins.h| 16 
  11 files changed, 123 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-1.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-2.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-3.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-4.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-5.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-6.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-7.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-8.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-strict-align.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins.h

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 576960bb37c..98c76ba657a 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -5201,6 +5201,14 @@ riscv_slow_unaligned_access (machine_mode, unsigned int)
return riscv_slow_unaligned_access_p;
  }

+/* Implement TARGET_OVERLAP_OP_BY_PIECES_P.  */
+
+static bool
+riscv_overlap_op_by_pieces (void)
+{
+  return !riscv_slow_unaligned_access_p;
+}
+
  /* Implement 

Re: [PATCH v2] RISC-V: Enable overlap-by-pieces in case of fast unaliged access

2022-05-24 Thread Vineet Gupta




On 7/22/21 15:41, Christoph Muellner via Gcc-patches wrote:

This patch enables the overlap-by-pieces feature of the by-pieces
infrastructure for inlining builtins in case the target has set
riscv_slow_unaligned_access_p to false.

An example to demonstrate the effect for targets with fast unaligned
access (targets that have slow_unaligned_access set to false) is
the code that is generated for "memset (p, 0, 15);", where the
alignment of p is unknown:

   Without overlap_op_by_pieces we get:
 8e:   00053023    sd  zero,0(a0)
 92:   00052423    sw  zero,8(a0)
 96:   00051623    sh  zero,12(a0)
 9a:   00050723    sb  zero,14(a0)

   With overlap_op_by_pieces we get:
 7e:   00053023    sd  zero,0(a0)
 82:   000533a3    sd  zero,7(a0)

gcc/ChangeLog:

* config/riscv/riscv.c (riscv_overlap_op_by_pieces): New function.
(TARGET_OVERLAP_OP_BY_PIECES_P): Connect to
riscv_overlap_op_by_pieces.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/builtins-overlap-1.c: New test.
* gcc.target/riscv/builtins-overlap-2.c: New test.
* gcc.target/riscv/builtins-overlap-3.c: New test.
* gcc.target/riscv/builtins-overlap-4.c: New test.
* gcc.target/riscv/builtins-overlap-5.c: New test.
* gcc.target/riscv/builtins-overlap-6.c: New test.
* gcc.target/riscv/builtins-overlap-7.c: New test.
* gcc.target/riscv/builtins-overlap-8.c: New test.
* gcc.target/riscv/builtins-strict-align.c: New test.
* gcc.target/riscv/builtins.h: New test.

Signed-off-by: Christoph Muellner 


Ping, IMO this needs to be (re)considered for trunk.
This goes really nicely with riscv_slow_unaligned_access_p==false, to 
elide the unrolled tail copies for trailer word/sword/byte accesses.


@Kito, @Palmer ? Just from codegen pov this seems to be a no brainer

foo:
    sd    zero,0(a0)
    sw    zero,8(a0)
    sh    zero,12(a0)
    sb    zero,14(a0)

vs.

    sd    zero,0(a0)
    sd    zero,7(a0)

-Vineet


---
  gcc/config/riscv/riscv.c | 11 +++
  .../gcc.target/riscv/builtins-overlap-1.c| 10 ++
  .../gcc.target/riscv/builtins-overlap-2.c| 10 ++
  .../gcc.target/riscv/builtins-overlap-3.c| 10 ++
  .../gcc.target/riscv/builtins-overlap-4.c| 10 ++
  .../gcc.target/riscv/builtins-overlap-5.c| 11 +++
  .../gcc.target/riscv/builtins-overlap-6.c| 13 +
  .../gcc.target/riscv/builtins-overlap-7.c| 11 +++
  .../gcc.target/riscv/builtins-overlap-8.c| 11 +++
  .../gcc.target/riscv/builtins-strict-align.c | 10 ++
  gcc/testsuite/gcc.target/riscv/builtins.h| 16 
  11 files changed, 123 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-1.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-2.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-3.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-4.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-5.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-6.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-7.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-overlap-8.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins-strict-align.c
  create mode 100644 gcc/testsuite/gcc.target/riscv/builtins.h

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 576960bb37c..98c76ba657a 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -5201,6 +5201,14 @@ riscv_slow_unaligned_access (machine_mode, unsigned int)
return riscv_slow_unaligned_access_p;
  }
  
+/* Implement TARGET_OVERLAP_OP_BY_PIECES_P.  */
+
+static bool
+riscv_overlap_op_by_pieces (void)
+{
+  return !riscv_slow_unaligned_access_p;
+}
+
  /* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
  
  static bool

@@ -5525,6 +5533,9 @@ riscv_asan_shadow_offset (void)
  #undef TARGET_SLOW_UNALIGNED_ACCESS
  #define TARGET_SLOW_UNALIGNED_ACCESS riscv_slow_unaligned_access
  
+#undef TARGET_OVERLAP_OP_BY_PIECES_P
+#define TARGET_OVERLAP_OP_BY_PIECES_P riscv_overlap_op_by_pieces
+
  #undef TARGET_SECONDARY_MEMORY_NEEDED
  #define TARGET_SECONDARY_MEMORY_NEEDED riscv_secondary_memory_needed
  
diff --git a/gcc/testsuite/gcc.target/riscv/builtins-overlap-1.c b/gcc/testsuite/gcc.target/riscv/builtins-overlap-1.c
new file mode 100644
index 00000000000..ca51fff0fc6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/builtins-overlap-1.c
@@ -0,0 +1,10 @@
+/* { dg-options "-O2 -mtune=thead-c906 -march=rv64gc -mabi=lp64" } */
+/* { dg-do compile } */
+
+#include "builtins.h"
+
+DO_MEMSET0_N(7)
+
+/* { dg-final { scan-assembler-times "sw\tzero,0"  1 } } */
+/* { dg-final 

Re: [PATCH v1] RISC-V: bitmanip: improve constant-loading for (1ULL << 31) in DImode

2022-05-24 Thread Andrew Pinski via Gcc-patches
On Tue, May 24, 2022 at 3:57 PM Philipp Tomsich
 wrote:
>
> The SINGLE_BIT_MASK_OPERAND() is overly restrictive, triggering for
> bits above 31 only (to side-step any issues with the negative SImode
> value 0x80000000).  This moves the special handling of this SImode
> value (i.e. the check for -2147483648) to riscv.cc and relaxes the
> SINGLE_BIT_MASK_OPERAND() test.
>
> This changes the code-generation for loading (1ULL << 31) from:
> li  a0,1
> slli    a0,a0,31
> to:
> bseti   a0,zero,31
>
> gcc/ChangeLog:
>
> * config/riscv/riscv.cc (riscv_build_integer_1): Rewrite value as
> -2147483648 for the single-bit case, when operating on 0x80000000
> in SImode.
> * config/riscv/riscv.h (SINGLE_BIT_MASK_OPERAND): Allow for
> any single-bit value, moving the special case for 0x80000000 to
> riscv_build_integer_1 (in riscv.cc).
>
> Signed-off-by: Philipp Tomsich 
>
> ---
>
>  gcc/config/riscv/riscv.cc |  9 +
>  gcc/config/riscv/riscv.h  | 11 ---
>  2 files changed, 13 insertions(+), 7 deletions(-)
>
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index f83dc796d88..fe8196f5c80 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -420,6 +420,15 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
>/* Simply BSETI.  */
>codes[0].code = UNKNOWN;
>codes[0].value = value;
> +
> +  /* RISC-V sign-extends all 32bit values that live in a 32bit
> +register.  To avoid paradoxes, we thus need to use the
> +sign-extended (negative) representation for the value, if we
> +want to build 0x80000000 in SImode.  This will then expand
> +to an ADDI/LI instruction.  */
> +  if (mode == SImode && value == 0x80000000)
> +   codes[0].value = -2147483648;

It would be better to use HOST_WIDE_INT_M1U << 31 here instead of the
literal -2147483648, whose meaning is not obvious at a glance.
Likewise, could 0x80000000 be written as HOST_WIDE_INT_1U << 31?
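
To spell out what the two suggested expressions evaluate to, here is a small standalone C sketch. It assumes a 64-bit HOST_WIDE_INT; `HWI_1U` and `HWI_M1U` are local stand-ins for GCC's HOST_WIDE_INT_1U and HOST_WIDE_INT_M1U, not the real macros.

```c
#include <stdint.h>

/* Local stand-ins, assuming HOST_WIDE_INT is 64 bits wide.  */
#define HWI_1U  ((uint64_t) 1)      /* models HOST_WIDE_INT_1U  */
#define HWI_M1U (~(uint64_t) 0)     /* models HOST_WIDE_INT_M1U */

/* Bit 31 alone, i.e. the zero-extended view: 0x0000000080000000.  */
static const uint64_t bit31_zero_ext = HWI_1U << 31;

/* Bit 31 and everything above it, i.e. the sign-extended view:
   0xffffffff80000000, the bit pattern of -2147483648.  */
static const uint64_t bit31_sign_ext = HWI_M1U << 31;
```

Written as shifts, both constants make the "bit 31" intent visible, which is the readability point being made above.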

Thanks,
Andrew Pinski

> +
>return 1;
>  }
>
> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index 5083a1c24b0..6f7f4d3fbdc 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -528,13 +528,10 @@ enum reg_class
>(((VALUE) | ((1UL<<31) - IMM_REACH)) == ((1UL<<31) - IMM_REACH)  \
> || ((VALUE) | ((1UL<<31) - IMM_REACH)) + IMM_REACH == 0)
>
> -/* If this is a single bit mask, then we can load it with bseti.  But this
> -   is not useful for any of the low 31 bits because we can use addi or lui
> -   to load them.  It is wrong for loading SImode 0x80000000 on rv64 because it
> -   needs to be sign-extended.  So we restrict this to the upper 32-bits
> -   only.  */
> -#define SINGLE_BIT_MASK_OPERAND(VALUE) \
> -  (pow2p_hwi (VALUE) && (ctz_hwi (VALUE) >= 32))
> +/* If this is a single bit mask, then we can load it with bseti.  Special
> +   handling of SImode 0x80000000 on RV64 is done in riscv_build_integer_1. */
> +#define SINGLE_BIT_MASK_OPERAND(VALUE) \
> +  (pow2p_hwi (VALUE))
>
>  /* Stack layout; function entry, exit and calling.  */
>
> --
> 2.34.1
>


[PATCH v1] RISC-V: bitmanip: improve constant-loading for (1ULL << 31) in DImode

2022-05-24 Thread Philipp Tomsich
The SINGLE_BIT_MASK_OPERAND() is overly restrictive, triggering for
bits above 31 only (to side-step any issues with the negative SImode
value 0x80000000).  This moves the special handling of this SImode
value (i.e. the check for -2147483648) to riscv.cc and relaxes the
SINGLE_BIT_MASK_OPERAND() test.

This changes the code-generation for loading (1ULL << 31) from:
li  a0,1
slli    a0,a0,31
to:
bseti   a0,zero,31
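
As a sketch of why the SImode case needs separate handling (illustration only; `simode_in_reg` is a hypothetical name and this is a model, not code from the patch): on RV64 a 32-bit value is held sign-extended in its 64-bit register, so SImode 0x80000000 and DImode (1ULL << 31) are different register contents, and only the latter is a single set bit.

```c
#include <stdint.h>

/* Hypothetical model of the 64-bit register contents holding a 32-bit
   (SImode) value on RV64: sign-extend from bit 31.  The xor/subtract
   form avoids implementation-defined narrowing conversions.  */
static int64_t simode_in_reg(uint32_t v)
{
    return (int64_t) (v ^ UINT32_C(0x80000000)) - INT64_C(0x80000000);
}

/* DImode (1ULL << 31): a single set bit, which is what bseti produces.  */
static uint64_t dimode_bit31(void)
{
    return UINT64_C(1) << 31;
}
```

The first helper yields -2147483648 for an input of 0x80000000, which matches the value the patch substitutes in riscv_build_integer_1.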

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_build_integer_1): Rewrite value as
-2147483648 for the single-bit case, when operating on 0x80000000
in SImode.
* config/riscv/riscv.h (SINGLE_BIT_MASK_OPERAND): Allow for
any single-bit value, moving the special case for 0x80000000 to
riscv_build_integer_1 (in riscv.cc).

Signed-off-by: Philipp Tomsich 

---

 gcc/config/riscv/riscv.cc |  9 +
 gcc/config/riscv/riscv.h  | 11 ---
 2 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index f83dc796d88..fe8196f5c80 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -420,6 +420,15 @@ riscv_build_integer_1 (struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS],
   /* Simply BSETI.  */
   codes[0].code = UNKNOWN;
   codes[0].value = value;
+
+  /* RISC-V sign-extends all 32bit values that live in a 32bit
+register.  To avoid paradoxes, we thus need to use the
+sign-extended (negative) representation for the value, if we
+want to build 0x80000000 in SImode.  This will then expand
+to an ADDI/LI instruction.  */
+  if (mode == SImode && value == 0x80000000)
+   codes[0].value = -2147483648;
+
   return 1;
 }
 
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 5083a1c24b0..6f7f4d3fbdc 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -528,13 +528,10 @@ enum reg_class
   (((VALUE) | ((1UL<<31) - IMM_REACH)) == ((1UL<<31) - IMM_REACH)  \
|| ((VALUE) | ((1UL<<31) - IMM_REACH)) + IMM_REACH == 0)
 
-/* If this is a single bit mask, then we can load it with bseti.  But this
-   is not useful for any of the low 31 bits because we can use addi or lui
-   to load them.  It is wrong for loading SImode 0x80000000 on rv64 because it
-   needs to be sign-extended.  So we restrict this to the upper 32-bits
-   only.  */
-#define SINGLE_BIT_MASK_OPERAND(VALUE) \
-  (pow2p_hwi (VALUE) && (ctz_hwi (VALUE) >= 32))
+/* If this is a single bit mask, then we can load it with bseti.  Special
+   handling of SImode 0x80000000 on RV64 is done in riscv_build_integer_1. */
+#define SINGLE_BIT_MASK_OPERAND(VALUE) \
+  (pow2p_hwi (VALUE))
 
 /* Stack layout; function entry, exit and calling.  */
 
-- 
2.34.1



[PATCH v1 3/3] RISC-V: Split "(a & (1UL << bitno)) ? 0 : 1" to bext + xori

2022-05-24 Thread Philipp Tomsich
We avoid reassociating "(~(a >> BIT_NO)) & 1" into "((~a) >> BIT_NO) & 1"
by splitting it into a zero-extraction (bext) and an xori.  This both
avoids burning a register on a temporary and generates a sequence that
clearly captures 'extract bit, then invert bit'.

This change improves the previously generated
srl   a0,a0,a1
not   a0,a0
andi  a0,a0,1
into
bext  a0,a0,a1
xori  a0,a0,1
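
The equivalence behind this split can be checked with a small C model (a sketch only; the function names are invented for illustration, and bit numbers are assumed to be in the range 0..63).

```c
#include <stdint.h>

/* The source-level expression: 0 if the bit is set, 1 if it is clear.  */
static uint64_t ref_bit_clear(uint64_t a, unsigned bitno)
{
    return (a & (UINT64_C(1) << bitno)) ? 0 : 1;
}

/* Previously generated sequence: srl, not, andi.  */
static uint64_t srl_not_andi(uint64_t a, unsigned bitno)
{
    uint64_t t = a >> bitno;   /* srl  */
    t = ~t;                    /* not  */
    return t & 1;              /* andi */
}

/* New sequence: bext (extract one bit), then xori with 1.  */
static uint64_t bext_xori(uint64_t a, unsigned bitno)
{
    uint64_t bit = (a >> bitno) & 1;   /* bext */
    return bit ^ 1;                    /* xori */
}
```

All three agree because inverting bit 0 after the extraction is the same as inverting every bit before masking with 1.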

Signed-off-by: Philipp Tomsich 

gcc/ChangeLog:

* config/riscv/bitmanip.md: Add split covering
"(a & (1 << BIT_NO)) ? 0 : 1".

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbs-bext.c: Add testcases.
* gcc.target/riscv/zbs-bexti.c: Add testcases.

---

 gcc/config/riscv/bitmanip.md   | 13 +
 gcc/testsuite/gcc.target/riscv/zbs-bext.c  | 10 --
 gcc/testsuite/gcc.target/riscv/zbs-bexti.c | 10 --
 3 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 5d7c20e9fdc..c4b61880e0c 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -365,3 +365,16 @@ (define_split
   "TARGET_ZBS"
   [(set (match_dup 0) (zero_extract:GPR (match_dup 1) (const_int 1) (match_dup 2)))
    (set (match_dup 0) (plus:GPR (match_dup 0) (const_int -1)))])
+
+;; Split for "(a & (1 << BIT_NO)) ? 0 : 1":
+;; We avoid reassociating "(~(a >> BIT_NO)) & 1" into "((~a) >> BIT_NO) & 1",
+;; so we don't have to use a temporary.  Instead we extract the bit and then
+;; invert bit 0 ("a ^ 1") only.
+(define_split
+  [(set (match_operand:X 0 "register_operand")
+(and:X (not:X (lshiftrt:X (match_operand:X 1 "register_operand")
+  (subreg:QI (match_operand:X 2 "register_operand") 0)))
+   (const_int 1)))]
+  "TARGET_ZBS"
+  [(set (match_dup 0) (zero_extract:X (match_dup 1) (const_int 1) (match_dup 2)))
+   (set (match_dup 0) (xor:X (match_dup 0) (const_int 1)))])
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bext.c b/gcc/testsuite/gcc.target/riscv/zbs-bext.c
index 8de9c5a167c..a8aadb60390 100644
--- a/gcc/testsuite/gcc.target/riscv/zbs-bext.c
+++ b/gcc/testsuite/gcc.target/riscv/zbs-bext.c
@@ -23,16 +23,22 @@ long bext64_1(long a, char bitno)
 
 long bext64_2(long a, char bitno)
 {
-  return (a & (1UL << bitno)) ? 0 : -1;
+  return (a & (1UL << bitno)) ? 0 : 1;
 }
 
 long bext64_3(long a, char bitno)
+{
+  return (a & (1UL << bitno)) ? 0 : -1;
+}
+
+long bext64_4(long a, char bitno)
 {
   return (a & (1UL << bitno)) ? -1 : 0;
 }
 
 /* { dg-final { scan-assembler-times "bexti\t" 1 } } */
-/* { dg-final { scan-assembler-times "bext\t" 4 } } */
+/* { dg-final { scan-assembler-times "bext\t" 5 } } */
+/* { dg-final { scan-assembler-times "xori\t|snez\t" 1 } } */
 /* { dg-final { scan-assembler-times "addi\t" 1 } } */
 /* { dg-final { scan-assembler-times "neg\t" 1 } } */
 /* { dg-final { scan-assembler-not "andi" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bexti.c b/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
index 8182a61707d..aa13487b357 100644
--- a/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
+++ b/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
@@ -12,14 +12,20 @@ long bexti64_1(long a, char bitno)
 
 long bexti64_2(long a, char bitno)
 {
-  return (a & (1UL << BIT_NO)) ? 0 : -1;
+  return (a & (1UL << BIT_NO)) ? 0 : 1;
 }
 
 long bexti64_3(long a, char bitno)
+{
+  return (a & (1UL << BIT_NO)) ? 0 : -1;
+}
+
+long bexti64_4(long a, char bitno)
 {
   return (a & (1UL << BIT_NO)) ? -1 : 0;
 }
 
-/* { dg-final { scan-assembler-times "bexti\t" 3 } } */
+/* { dg-final { scan-assembler-times "bexti\t" 4 } } */
+/* { dg-final { scan-assembler-times "xori\t|snez\t" 1 } } */
 /* { dg-final { scan-assembler-times "addi\t" 1 } } */
 /* { dg-final { scan-assembler-times "neg\t" 1 } } */
\ No newline at end of file
-- 
2.34.1



[PATCH v1 2/3] RISC-V: Split "(a & (1UL << bitno)) ? 0 : -1" to bext + addi

2022-05-24 Thread Philipp Tomsich
For a straightforward application of bext for the following function
long bext64(long a, char bitno)
{
  return (a & (1UL << bitno)) ? 0 : -1;
}
we generate
srl     a0,a0,a1    # 7  [c=4 l=4]  lshrdi3
andi    a0,a0,1     # 8  [c=4 l=4]  anddi3/1
addi    a0,a0,-1    # 14 [c=4 l=4]  adddi3/1
due to the following failed match at combine time:
(set (reg:DI 82)
 (zero_extract:DI (reg:DI 83)
  (const_int 1 [0x1])
  (reg:DI 84)))

The existing pattern for bext requires the 3rd argument to
zero_extract to be a QImode register wrapped in a zero_extension.
This adds an additional pattern that allows an Xmode argument.

With this change, the testcase compiles to
bext    a0,a0,a1    # 8  [c=4 l=4]  *bextdi
addi    a0,a0,-1    # 14 [c=4 l=4]  adddi3/1

gcc/ChangeLog:

* config/riscv/bitmanip.md (*bext): Add an additional
pattern that allows the 3rd argument to zero_extract to be
an Xmode register operand.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbs-bext.c: Add testcases.
* gcc.target/riscv/zbs-bexti.c: Add testcases.

Signed-off-by: Philipp Tomsich 
Co-developed-by: Manolis Tsamis 

---

 gcc/config/riscv/bitmanip.md   | 12 +++
 gcc/testsuite/gcc.target/riscv/zbs-bext.c  | 23 +++---
 gcc/testsuite/gcc.target/riscv/zbs-bexti.c | 23 --
 3 files changed, 49 insertions(+), 9 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index ea5dea13cfb..5d7c20e9fdc 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -332,6 +332,18 @@ (define_insn "*bext"
   "bext\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
+;; When performing `(a & (1UL << bitno)) ? 0 : -1` the combiner
+;; usually has the `bitno` typed as X-mode (i.e. no further
+;; zero-extension is performed around the bitno).
+(define_insn "*bext"
+  [(set (match_operand:X 0 "register_operand" "=r")
+   (zero_extract:X (match_operand:X 1 "register_operand" "r")
+   (const_int 1)
+   (match_operand:X 2 "register_operand" "r")))]
+  "TARGET_ZBS"
+  "bext\t%0,%1,%2"
+  [(set_attr "type" "bitmanip")])
+
 (define_insn "*bexti"
   [(set (match_operand:X 0 "register_operand" "=r")
(zero_extract:X (match_operand:X 1 "register_operand" "r")
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bext.c b/gcc/testsuite/gcc.target/riscv/zbs-bext.c
index 47982396119..8de9c5a167c 100644
--- a/gcc/testsuite/gcc.target/riscv/zbs-bext.c
+++ b/gcc/testsuite/gcc.target/riscv/zbs-bext.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-march=rv64gc_zbs -mabi=lp64" } */
-/* { dg-skip-if "" { *-*-* } { "-O0" } } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
 
 /* bext */
 long
@@ -16,6 +16,23 @@ foo1 (long i)
   return 1L & (i >> 20);
 }
 
+long bext64_1(long a, char bitno)
+{
+  return (a & (1UL << bitno)) ? 1 : 0;
+}
+
+long bext64_2(long a, char bitno)
+{
+  return (a & (1UL << bitno)) ? 0 : -1;
+}
+
+long bext64_3(long a, char bitno)
+{
+  return (a & (1UL << bitno)) ? -1 : 0;
+}
+
 /* { dg-final { scan-assembler-times "bexti\t" 1 } } */
-/* { dg-final { scan-assembler-times "bext\t" 1 } } */
-/* { dg-final { scan-assembler-not "andi" } } */
+/* { dg-final { scan-assembler-times "bext\t" 4 } } */
+/* { dg-final { scan-assembler-times "addi\t" 1 } } */
+/* { dg-final { scan-assembler-times "neg\t" 1 } } */
+/* { dg-final { scan-assembler-not "andi" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bexti.c b/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
index 99e3b58309c..8182a61707d 100644
--- a/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
+++ b/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
@@ -1,14 +1,25 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv64gc_zbs -mabi=lp64 -O2" } */
+/* { dg-options "-march=rv64gc_zbs -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
 
 /* bexti */
 #define BIT_NO  4
 
-long
-foo0 (long a)
+long bexti64_1(long a, char bitno)
 {
-  return (a & (1 << BIT_NO)) ? 0 : -1;
+  return (a & (1UL << BIT_NO)) ? 1 : 0;
 }
 
-/* { dg-final { scan-assembler "bexti" } } */
-/* { dg-final { scan-assembler "addi" } } */
+long bexti64_2(long a, char bitno)
+{
+  return (a & (1UL << BIT_NO)) ? 0 : -1;
+}
+
+long bexti64_3(long a, char bitno)
+{
+  return (a & (1UL << BIT_NO)) ? -1 : 0;
+}
+
+/* { dg-final { scan-assembler-times "bexti\t" 3 } } */
+/* { dg-final { scan-assembler-times "addi\t" 1 } } */
+/* { dg-final { scan-assembler-times "neg\t" 1 } } */
\ No newline at end of file
-- 
2.34.1



[PATCH v1 1/3] RISC-V: Split "(a & (1 << BIT_NO)) ? 0 : -1" to bexti + addi

2022-05-24 Thread Philipp Tomsich
Consider creating a polarity-reversed mask from a set-bit (i.e., if
the bit is set, produce all-zeros; otherwise: all-ones).  Using Zbs,
this can be expressed as bexti, followed by an addi of minus-one.  To
enable the combiner to discover this opportunity, we need to split the
canonical expression for "(a & (1 << BIT_NO)) ? 0 : -1" into a form
combinable into bexti.

Consider the function:
long f(long a)
{
  return (a & (1 << BIT_NO)) ? 0 : -1;
}
This produces the following sequence prior to this change:
andi    a0,a0,16
seqz    a0,a0
neg     a0,a0
ret
Following this change, it results in:
bexti   a0,a0,4
addi    a0,a0,-1
ret
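
As a sanity check on the two sequences above, here is a minimal C model of each (illustrative only; the function names are made up, and `long` is assumed to be 64 bits as on lp64).

```c
#define BIT_NO 4

/* Sequence before the change: andi, seqz, neg.  */
static long mask_andi_seqz_neg(long a)
{
    long t = a & (1L << BIT_NO);   /* andi a0,a0,16 */
    t = (t == 0);                  /* seqz a0,a0    */
    return -t;                     /* neg  a0,a0    */
}

/* Sequence after the change: bexti, addi.  */
static long mask_bexti_addi(long a)
{
    long bit = (long) (((unsigned long) a >> BIT_NO) & 1UL);  /* bexti a0,a0,4  */
    return bit - 1;                                           /* addi  a0,a0,-1 */
}
```

The extracted bit is 0 or 1, so subtracting one yields -1 (all-ones) when the bit is clear and 0 when it is set, which is the polarity-reversed mask.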

Signed-off-by: Philipp Tomsich 

gcc/ChangeLog:

* config/riscv/bitmanip.md: Add a splitter to generate
  polarity-reversed masks from a set bit using bexti + addi.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbs-bexti.c: New test.

---

 gcc/config/riscv/bitmanip.md   | 13 +
 gcc/testsuite/gcc.target/riscv/zbs-bexti.c | 14 ++
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbs-bexti.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 0ab9ffe3c0b..ea5dea13cfb 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -340,3 +340,16 @@ (define_insn "*bexti"
   "TARGET_ZBS"
   "bexti\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
+
+;; We can create a polarity-reversed mask (i.e. bit N -> { set = 0, clear = -1 })
+;; using a bext(i) followed by an addi instruction.
+;; This splits the canonical representation of "(a & (1 << BIT_NO)) ? 0 : -1".
+(define_split
+  [(set (match_operand:GPR 0 "register_operand")
+   (neg:GPR (eq:GPR (zero_extract:GPR (match_operand:GPR 1 "register_operand")
+  (const_int 1)
+  (match_operand 2))
+(const_int 0))))]
+  "TARGET_ZBS"
+  [(set (match_dup 0) (zero_extract:GPR (match_dup 1) (const_int 1) (match_dup 2)))
+   (set (match_dup 0) (plus:GPR (match_dup 0) (const_int -1)))])
diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bexti.c b/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
new file mode 100644
index 00000000000..99e3b58309c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbs-bexti.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zbs -mabi=lp64 -O2" } */
+
+/* bexti */
+#define BIT_NO  4
+
+long
+foo0 (long a)
+{
+  return (a & (1 << BIT_NO)) ? 0 : -1;
+}
+
+/* { dg-final { scan-assembler "bexti" } } */
+/* { dg-final { scan-assembler "addi" } } */
-- 
2.34.1



Re: [wwwdocs] Add C status page

2022-05-24 Thread Marek Polacek via Gcc-patches
On Tue, May 24, 2022 at 06:11:09PM +0000, Joseph Myers wrote:
> On Tue, 24 May 2022, Marek Polacek via Gcc-patches wrote:
> 
> > I thought it'd be nice to have a table that documents our C support
> > status, like we have https://gcc.gnu.org/projects/cxx-status.html for C++.
> > We have https://gcc.gnu.org/c99status.html, but that's C99 only.
> > 
> > So here's a patch to add just that.  For C99, I used c99status.html but
> > added paper numbers (taken from https://clang.llvm.org/c_status.html).
> 
> For C11, see https://gcc.gnu.org/wiki/C11Status (note: I haven't checked 
> the accuracy of that page).

Ah, nice (I almost never use our wiki).  One more reason to have a single
place for such overviews.
 
> Listing in terms of features is more useful than listing in terms of 
> papers.  Referring to the original paper, even if it's the version that 
> got accepted into the standard, is liable to be actively misleading to 
> anyone working on the implementation; sometimes the paper has multiple 
> choices of which only one was accepted into the standard, or only some of 
> the listed changes were accepted, or there were various subsequent 
> features or fixes from various subsequent papers.

Right, so I think it would make sense to have one line for a feature, and
add related papers to the second column, as we do in the C++ table:
https://gcc.gnu.org/projects/cxx-status.html (see e.g. concepts).

> (By way of example, it 
> would make more sense to list _BitInt as a single entry for a missing 
> feature than to separately list N2763 and N2775 (accepted papers), while 
> N2960, to be considered at the July meeting of WG14, makes further wording 
> fixes but can't exactly be considered a feature in a sense that should be 
> listed in such a table.)

OK, so I think we should have one feature ("_BitInt") and have those three
papers in the "Proposal" column.

> Lots of papers are just cleanups, or 
> clarification of wording, or fixes to issues with previous papers, such 
> that it doesn't make sense to list them as implemented or not at all.

For those, we could either say "N/A" (gray color) or not mention them at
all, though I'd prefer the former.
 
> As usual there are also cases where a feature is implemented to the extent 
> relevant for conformance but e.g. more optimizations (such as built-in 
> functions) could be added.

Notes like these could go to the "Notes" column.

> And cases where some support in GCC should 
> definitely be done to consider the feature implemented, even when not 
> needed for conformance (e.g. the %wN, %wfN printf/scanf formats need 
> implementing in glibc, and corresponding format checking support needs 
> implementing in GCC).

These could be marked as "partially implemented" (yellow color).  Except
I often don't know which features need extensions like that.

> There are also cases where a feature is 
> substantially there but a more detailed review should be done for how it 
> matches up to the standard version (e.g. the DFP support based on TR 
> 24732-2009 could do with such a detailed review for how it matches C2x 
> requirements).

Indeed, and that's a hard problem.  I for one could never figure out this
one.  So I'd leave it in the "?" (red color) state.
 
> > +
> > +  Binary literals
> > +   > href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2549.pdf;>N2549
> > +  GCC 
> > 11
> > +  
> > +
> 
> This is an example of cases where the version where a feature was 
> supported in GCC as an extension is long before the version where 
> -pedantic knows about it being supported in a given standard version; 
> listing the version with the -pedantic change in such cases may not be 
> very helpful without noting when it was originally implemented.

Probably here we could just say "Yes" (green color) and make a note in the
"Notes" column.

> There are 
> probably other examples in the list.  (There are also examples where GCC 
> supports the feature but hasn't yet had -pedantic updated accordingly, 
> e.g. #warning.  And cases where it's largely supported but there are small 
> differences in the standard version that still need implementing, e.g. 
> typeof.)
 
Yeah, I bet.  It's tricky to decide :/.

> > +
> > +  What we think we reserve
> > +   > href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2572.pdf;>N2572
> > +  ?
> > +  
> > +
> 
> This is an example of the many cases where it doesn't make sense to 
> consider something as a feature with an "implemented" or "not implemented" 
> state at all - so it doesn't belong in such a table at all.  There are 
> many other such examples in the list.

Maybe it's just me, but I find some value in having proposals like that in
the table too (but it should be "N/A" and gray).  This is what I did for
N2641.


What I like about having a table like this is that it makes it clear what
remains to be implemented, and unimplemented features have a linked PR,
which makes it easy to see 

[PATCH v1 3/3] RISC-V: Replace zero_extendsidi2_shifted with generalized split

2022-05-24 Thread Philipp Tomsich
The current method of treating shifts of extended values on RISC-V
frequently causes sequences of 3 shifts, despite the presence of the
'zero_extendsidi2_shifted' pattern.

Consider:
unsigned long f(unsigned int a, unsigned long b)
{
a = a << 1;
unsigned long c = (unsigned long) a;
c = b + (c<<4);
return c;
}
which will present at combine-time as:
Trying 7, 8 -> 9:
7: r78:SI=r81:DI#0<<0x1
  REG_DEAD r81:DI
8: r79:DI=zero_extend(r78:SI)
  REG_DEAD r78:SI
9: r72:DI=r79:DI<<0x4
  REG_DEAD r79:DI
Failed to match this instruction:
(set (reg:DI 72 [ _1 ])
(and:DI (ashift:DI (reg:DI 81)
(const_int 5 [0x5]))
(const_int 68719476704 [0xfffffffe0])))
and produce the following (optimized) assembly:
f:
slliw   a5,a0,1
slli    a5,a5,32
srli    a5,a5,28
add a0,a5,a1
ret

The current way of handling this (in 'zero_extendsidi2_shifted')
doesn't apply for two reasons:
- this is seen before reload, and
- (more importantly) the constant mask is not 0xfffffffful.

To address this, we introduce a generalized version of shifting
zero-extended values that supports any mask of consecutive ones as
long as the number of trailing zeros is the inner shift-amount.

With this new split, we generate the following assembly for the
aforementioned function:
f:
slli    a0,a0,33
srli    a0,a0,28
add     a0,a0,a1
ret

gcc/ChangeLog:

* config/riscv/riscv.md (zero_extendsidi2_shifted): Replace
  with a generalized split that requires no clobber, runs
  before reload and works for smaller masks.

Signed-off-by: Philipp Tomsich 
---

 gcc/config/riscv/riscv.md | 37 -
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index b8ab0cf169a..cc10cd90a74 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -2119,23 +2119,26 @@ (define_split
 ;; occur when unsigned int is used for array indexing.  Split this into two
 ;; shifts.  Otherwise we can get 3 shifts.
 
-(define_insn_and_split "zero_extendsidi2_shifted"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-   (and:DI (ashift:DI (match_operand:DI 1 "register_operand" "r")
-  (match_operand:QI 2 "immediate_operand" "I"))
-   (match_operand 3 "immediate_operand" "")))
-   (clobber (match_scratch:DI 4 "=&r"))]
-  "TARGET_64BIT && !TARGET_ZBA
-   && ((INTVAL (operands[3]) >> INTVAL (operands[2])) == 0xffffffff)"
-  "#"
-  "&& reload_completed"
-  [(set (match_dup 4)
-   (ashift:DI (match_dup 1) (const_int 32)))
-   (set (match_dup 0)
-   (lshiftrt:DI (match_dup 4) (match_dup 5)))]
-  "operands[5] = GEN_INT (32 - (INTVAL (operands [2])));"
-  [(set_attr "type" "shift")
-   (set_attr "mode" "DI")])
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+   (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+  (match_operand:QI 2 "immediate_operand"))
+   (match_operand:DI 3 "consecutive_bits_operand")))]
+  "TARGET_64BIT"
+  [(set (match_dup 0) (ashift:DI (match_dup 1) (match_dup 4)))
+   (set (match_dup 0) (lshiftrt:DI (match_dup 0) (match_dup 5)))]
+{
+   unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
+   int leading = clz_hwi (mask);
+   int trailing = ctz_hwi (mask);
+
+   /* The shift-amount must match the number of trailing bits */
+   if (trailing != UINTVAL (operands[2]))
+  FAIL;
+
+   operands[4] = GEN_INT (leading + trailing);
+   operands[5] = GEN_INT (leading);
+})
 
 ;;
 ;;  
-- 
2.34.1
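
For readers who want to check the arithmetic behind the split above, here is a minimal sketch in plain C++ (not GCC code). It assumes the mask is a nonzero single run of ones whose trailing-zero count equals the inner shift amount, as the split requires. The helper names (count_leading_zeros, count_trailing_zeros, shift_then_mask, two_shifts) are hypothetical stand-ins, with simple loops in place of GCC's clz_hwi and ctz_hwi.

```cpp
#include <cstdint>

// Portable stand-in for clz_hwi on a 64-bit value (returns 64 for zero).
constexpr int count_leading_zeros(std::uint64_t v) {
    int n = 0;
    for (std::uint64_t bit = std::uint64_t{1} << 63;
         bit != 0 && (v & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// Portable stand-in for ctz_hwi on a 64-bit value (returns 64 for zero).
constexpr int count_trailing_zeros(std::uint64_t v) {
    if (v == 0)
        return 64;
    int n = 0;
    while ((v & 1) == 0) {
        v >>= 1;
        ++n;
    }
    return n;
}

// What the matched RTL computes: (x << shift) & mask.
constexpr std::uint64_t shift_then_mask(std::uint64_t x, int shift,
                                        std::uint64_t mask) {
    return (x << shift) & mask;
}

// What the split emits: a left shift by (leading + trailing), then a
// logical right shift by leading.  The mask must be nonzero.
constexpr std::uint64_t two_shifts(std::uint64_t x, std::uint64_t mask) {
    const int leading = count_leading_zeros(mask);
    const int trailing = count_trailing_zeros(mask);
    return (x << (leading + trailing)) >> leading;
}

// The mask from the combine dump: 28 leading zeros and 5 trailing zeros,
// which gives the shift amounts 33 and 28 seen in the new assembly.
static_assert(count_leading_zeros(0xfffffffe0ULL) == 28, "leading zeros");
static_assert(count_trailing_zeros(0xfffffffe0ULL) == 5, "trailing zeros");
static_assert(two_shifts(0x12345678ULL, 0xfffffffe0ULL)
                  == shift_then_mask(0x12345678ULL, 5, 0xfffffffe0ULL),
              "both forms agree");
```

The two forms agree because the left shift discards exactly the bits the mask would have cleared at the top, and the right shift moves the surviving bits back down so that the low `trailing` bits end up zero.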



[PATCH v1 2/3] RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w

2022-05-24 Thread Philipp Tomsich
When encountering a prescaled (biased) value as a candidate for
sh[123]add.uw, the combine pass will present this as shifted by the
aggregate amount (prescale + shift-amount) with an appropriately
adjusted mask constant that has fewer than 32 bits set.

E.g., here's the failing expression seen in combine for a prescale of
1 and a shift of 2 (note how 0x3fffffff8 >> 3 is 0x7fffffff).
  Trying 7, 8 -> 10:
  7: r78:SI=r81:DI#0<<0x1
REG_DEAD r81:DI
  8: r79:DI=zero_extend(r78:SI)
REG_DEAD r78:SI
 10: r80:DI=r79:DI<<0x2+r82:DI
REG_DEAD r79:DI
REG_DEAD r82:DI
  Failed to match this instruction:
  (set (reg:DI 80 [ cD.1491 ])
  (plus:DI (and:DI (ashift:DI (reg:DI 81)
   (const_int 3 [0x3]))
     (const_int 17179869176 [0x3fffffff8]))
  (reg:DI 82)))

To address this, we introduce a splitter handling these cases.

gcc/ChangeLog:

* config/riscv/bitmanip.md: Add split to handle opportunities
  for slli + sh[123]add.uw

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zba-shadd.c: New test.

Signed-off-by: Philipp Tomsich 
Co-developed-by: Manolis Tsamis 

---

 gcc/config/riscv/bitmanip.md   | 44 ++
 gcc/testsuite/gcc.target/riscv/zba-shadd.c | 13 +++
 2 files changed, 57 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shadd.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 0ab9ffe3c0b..6c1ccc6f8c5 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -79,6 +79,50 @@ (define_insn "*shNadduw"
   [(set_attr "type" "bitmanip")
(set_attr "mode" "DI")])
 
+;; During combine, we may encounter an attempt to combine
+;;   slli rtmp, rs, #imm
+;;   zext.w rtmp, rtmp
+;;   sh[123]add rd, rtmp, rs2
+;; which will lead to the immediate not satisfying the above constraints.
+;; By splitting the compound expression, we can simplify to a slli and a
+;; sh[123]add.uw.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+   (plus:DI (and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+   (match_operand:QI 2 "immediate_operand"))
+(match_operand:DI 3 "consecutive_bits_operand"))
+(match_operand:DI 4 "register_operand")))
+   (clobber (match_operand:DI 5 "register_operand"))]
+  "TARGET_64BIT && TARGET_ZBA"
+  [(set (match_dup 5) (ashift:DI (match_dup 1) (match_dup 6)))
+   (set (match_dup 0) (plus:DI (and:DI (ashift:DI (match_dup 5)
+ (match_dup 7))
+  (match_dup 8))
+  (match_dup 4)))]
+{
+   unsigned HOST_WIDE_INT mask = UINTVAL (operands[3]);
+   /* scale: shift within the sh[123]add.uw */
+   int scale = 32 - clz_hwi (mask);
+   /* bias:  pre-scale amount (i.e. the prior shift amount) */
+   int bias = ctz_hwi (mask) - scale;
+
+   /* If the bias + scale don't add up to operand[2], reject. */
+   if ((scale + bias) != UINTVAL (operands[2]))
+  FAIL;
+
+   /* If the shift-amount is out-of-range for sh[123]add.uw, reject. */
+   if ((scale < 1) || (scale > 3))
+  FAIL;
+
+   /* If there's no bias, the '*shNadduw' pattern should have matched. */
+   if (bias == 0)
+  FAIL;
+
+   operands[6] = GEN_INT (bias);
+   operands[7] = GEN_INT (scale);
+   operands[8] = GEN_INT (0xffffffffULL << scale);
+})
+
 (define_insn "*add.uw"
   [(set (match_operand:DI 0 "register_operand" "=r")
(plus:DI (zero_extend:DI
diff --git a/gcc/testsuite/gcc.target/riscv/zba-shadd.c 
b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
new file mode 100644
index 00000000000..33da2530f3f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zba-shadd.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zba -mabi=lp64" } */
+
+unsigned long foo(unsigned int a, unsigned long b)
+{
+a = a << 1;
+unsigned long c = (unsigned long) a;
+unsigned long d = b + (c<<2);
+return d;
+}
+
+/* { dg-final { scan-assembler "sh2add.uw" } } */
+/* { dg-final { scan-assembler-not "zext" } } */
\ No newline at end of file
-- 
2.34.1
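
The scale/bias decomposition in the splitter above can be tried outside the compiler. The following is only a sketch in plain C++ under stated assumptions: the struct ShaddSplit and the functions split_for_shadd_uw, combined and after_split are hypothetical names, and the bit-counting helpers are simple loops standing in for clz_hwi and ctz_hwi. It mirrors the three rejection checks of the patch and then compares the original expression with the two-step form.

```cpp
#include <cstdint>

// Stand-ins for clz_hwi / ctz_hwi on 64-bit values (64 for zero input).
constexpr int leading_zeros64(std::uint64_t v) {
    int n = 0;
    while (n < 64 && (v & (std::uint64_t{1} << (63 - n))) == 0)
        ++n;
    return n;
}

constexpr int trailing_zeros64(std::uint64_t v) {
    int n = 0;
    while (n < 64 && (v & (std::uint64_t{1} << n)) == 0)
        ++n;
    return n;
}

// Result of the decomposition; ok == false corresponds to FAIL.
struct ShaddSplit {
    bool ok;
    int bias;                  // shift amount of the leading slli
    int scale;                 // shift amount inside sh[123]add.uw
    std::uint64_t inner_mask;  // 0xffffffff << scale
};

constexpr ShaddSplit split_for_shadd_uw(std::uint64_t mask, int shift) {
    const int scale = 32 - leading_zeros64(mask);
    const int bias = trailing_zeros64(mask) - scale;
    if (scale + bias != shift)
        return {false, 0, 0, 0};  // mask and shift do not line up
    if (scale < 1 || scale > 3)
        return {false, 0, 0, 0};  // out of range for sh[123]add.uw
    if (bias == 0)
        return {false, 0, 0, 0};  // the plain shNadd.uw pattern applies
    return {true, bias, scale, std::uint64_t{0xffffffff} << scale};
}

// The expression combine presents: ((x << shift) & mask) + y.
constexpr std::uint64_t combined(std::uint64_t x, std::uint64_t y, int shift,
                                 std::uint64_t mask) {
    return ((x << shift) & mask) + y;
}

// The two-step form: slli by bias, then the shNadd.uw-shaped expression.
constexpr std::uint64_t after_split(std::uint64_t x, std::uint64_t y,
                                    ShaddSplit s) {
    return (((x << s.bias) << s.scale) & s.inner_mask) + y;
}

// The example from the commit message: prescale 1, shift 2.
static_assert(split_for_shadd_uw(0x3fffffff8ULL, 3).ok, "accepted");
static_assert(split_for_shadd_uw(0x3fffffff8ULL, 3).bias == 1, "bias");
static_assert(split_for_shadd_uw(0x3fffffff8ULL, 3).scale == 2, "scale");
static_assert(combined(0xdeadbeefULL, 7, 3, 0x3fffffff8ULL)
                  == after_split(0xdeadbeefULL, 7,
                                 split_for_shadd_uw(0x3fffffff8ULL, 3)),
              "both forms agree");
```

The input 0xdeadbeef has bit 31 set, so the comparison also covers the bit that the 32-bit zero-extension discards after the prescale.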



[PATCH v1 1/3] RISC-V: add consecutive_bits_operand predicate

2022-05-24 Thread Philipp Tomsich
Provide an easy way to constrain for constants that are a single,
consecutive run of ones.

gcc/ChangeLog:

* config/riscv/predicates.md (consecutive_bits_operand):
  Implement new predicate.

Signed-off-by: Philipp Tomsich 
---

 gcc/config/riscv/predicates.md | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md
index c37caa2502b..90db5dfcdd5 100644
--- a/gcc/config/riscv/predicates.md
+++ b/gcc/config/riscv/predicates.md
@@ -243,3 +243,14 @@ (define_predicate "const63_operand"
 (define_predicate "imm5_operand"
   (and (match_code "const_int")
(match_test "INTVAL (op) < 5")))
+
+;; A CONST_INT operand that consists of a single run of consecutive set bits.
+(define_predicate "consecutive_bits_operand"
+  (match_code "const_int")
+{
+   unsigned HOST_WIDE_INT val = UINTVAL (op);
+   if (exact_log2 ((val >> ctz_hwi (val)) + 1) < 0)
+   return false;
+
+   return true;
+})
-- 
2.34.1
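
As a rough illustration of what the predicate above tests, here is a standalone C++ sketch. The function name is_single_run_of_ones is hypothetical, and the code uses plain bit tricks rather than GCC's exact_log2 and ctz_hwi. Its handling of edge cases is this sketch's own choice and may not match the patch: it rejects zero and accepts an all-ones value, whereas the expression in the patch, as I read it, would reject all-ones because the addition wraps to zero.

```cpp
#include <cstdint>

// True if val is a single, consecutive run of set bits, e.g. 0b0111000.
constexpr bool is_single_run_of_ones(std::uint64_t val) {
    if (val == 0)
        return false;  // no set bits, so no run
    // Drop the trailing zeros so the run (if any) starts at bit 0.
    while ((val & 1) == 0)
        val >>= 1;
    // A value of the form 0...01...1 shares no set bit with itself plus one.
    return (val & (val + 1)) == 0;
}

// The masks that appear in the two follow-up patches are accepted.
static_assert(is_single_run_of_ones(0xfffffffe0ULL), "bits 5..35");
static_assert(is_single_run_of_ones(0x3fffffff8ULL), "bits 3..33");
// A mask with a hole in it is rejected.
static_assert(!is_single_run_of_ones(0x5ULL), "two separate runs");
```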



[PATCH v1 0/3] RISC-V: Improve sequences with shifted zero-extended operands

2022-05-24 Thread Philipp Tomsich


Code-generation currently misses some opportunities for optimized
sequences when zero-extension is combined with shifts.


Philipp Tomsich (3):
  RISC-V: add consecutive_bits_operand predicate
  RISC-V: Split slli+sh[123]add.uw opportunities to avoid zext.w
  RISC-V: Replace zero_extendsidi2_shifted with generalized split

 gcc/config/riscv/bitmanip.md   | 44 ++
 gcc/config/riscv/predicates.md | 11 ++
 gcc/config/riscv/riscv.md  | 37 +-
 gcc/testsuite/gcc.target/riscv/zba-shadd.c | 13 +++
 4 files changed, 88 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zba-shadd.c

-- 
2.34.1



Ping: [PATCH v2] diagnostics: Honor #pragma GCC diagnostic in the preprocessor [PR53431]

2022-05-24 Thread Lewis Hyatt via Gcc-patches
Hello-

Now that we're back in stage 1, I thought it might be a better time to
ask for feedback on this pair of patches, which tries to resolve
PR53431.

https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587357.html
https://gcc.gnu.org/pipermail/gcc-patches/2021-December/587358.html

Part 1/2 is a trivial cleanup in the C++ parser that simplifies
adding the support for early pragma handling.

Part 2/2 adds the concept of early pragma handling and makes the C++
and preprocessor frontends use it.

The patches required some minor rebasing, so I have attached updated
versions here.

bootstrap + regtest all languages still looks good:

FAIL 103 103
PASS 541178 541213
UNSUPPORTED 15177 15177
UNTESTED 136 136
XFAIL 4140 4140
XPASS 17 17

Thanks! If this approach doesn't seem like the right one, I am happy
to try another way.

-Lewis


On Fri, Dec 24, 2021 at 04:23:08PM -0500, Lewis Hyatt wrote:
> Hello-
> 
> I would like please to follow up on this patch submitted for PR53431 here:
> https://gcc.gnu.org/pipermail/gcc-patches/2021-December/586191.html
> 
> However, it was suggested on the PR that part of it could be split into a
> separate simpler patch. I have now done that, and also made a few tweaks to
> the first version at the same time, so may I please request that you review
> this version 2 instead? This email contains the first smaller cleanup patch,
> and the next email contains the main part of it. Thanks very much.
> 
> bootstrap and regtest were performed on x86-64 Linux, all tests look the same
> before + after, plus the new passing testcases.
> 
> FAIL 112 112
> PASS 528007 528042
> UNSUPPORTED 14888 14888
> UNTESTED 132 132
> XFAIL 3238 3238
> XPASS 17 17
> 
> -Lewis

> From: Lewis Hyatt 
> Date: Thu, 23 Dec 2021 17:03:04 -0500
> Subject: [PATCH] c++: Minor cleanup in parser.c
> 
> The code to determine whether a given token starts a module directive is
> currently repeated in 4 places in parser.c. I am about to submit a patch
> that needs to add it in a 5th place, so since the code is not completely
> trivial (needing to check for 3 different token types), it seems worthwhile
> to factor this logic into its own function.
> 
> gcc/cp/ChangeLog:
> 
>   * parser.c (cp_token_is_module_directive): New function
>   refactoring common code.
>   (cp_parser_skip_to_closing_parenthesis_1): Use the new function.
>   (cp_parser_skip_to_end_of_statement): Likewise.
>   (cp_parser_skip_to_end_of_block_or_statement): Likewise.
>   (cp_parser_declaration): Likewise.
> 
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 33fb40a5b59..9b7446655be 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -629,6 +629,16 @@ cp_lexer_alloc (void)
>return lexer;
>  }
>  
> +/* Return TRUE if token is the start of a module declaration that will be
> +   terminated by a CPP_PRAGMA_EOL token.  */
> +static inline bool
> +cp_token_is_module_directive (cp_token *token)
> +{
> +  return token->keyword == RID__EXPORT
> +|| token->keyword == RID__MODULE
> +|| token->keyword == RID__IMPORT;
> +}
> +
>  /* Create a new main C++ lexer, the lexer that gets tokens from the
> preprocessor.  */
>  
> @@ -3805,9 +3815,7 @@ cp_parser_skip_to_closing_parenthesis_1 (cp_parser 
> *parser,
> break;
>  
>   case CPP_KEYWORD:
> -   if (token->keyword != RID__EXPORT
> -   && token->keyword != RID__MODULE
> -   && token->keyword != RID__IMPORT)
> +   if (!cp_token_is_module_directive (token))
>   break;
> /* FALLTHROUGH  */
>  
> @@ -3908,9 +3916,7 @@ cp_parser_skip_to_end_of_statement (cp_parser* parser)
> break;
>  
>   case CPP_KEYWORD:
> -   if (token->keyword != RID__EXPORT
> -   && token->keyword != RID__MODULE
> -   && token->keyword != RID__IMPORT)
> +   if (!cp_token_is_module_directive (token))
>   break;
> /* FALLTHROUGH  */
>  
> @@ -3997,9 +4003,7 @@ cp_parser_skip_to_end_of_block_or_statement (cp_parser* 
> parser)
> break;
>  
>   case CPP_KEYWORD:
> -   if (token->keyword != RID__EXPORT
> -   && token->keyword != RID__MODULE
> -   && token->keyword != RID__IMPORT)
> +   if (!cp_token_is_module_directive (token))
>   break;
> /* FALLTHROUGH  */
>  
> @@ -14860,9 +14864,7 @@ cp_parser_declaration (cp_parser* parser, tree 
> prefix_attrs)
>else
>   cp_parser_module_export (parser);
>  }
> -  else if (token1->keyword == RID__EXPORT
> -|| token1->keyword == RID__IMPORT
> -|| token1->keyword == RID__MODULE)
> +  else if (cp_token_is_module_directive (token1))
>  {
>bool exporting = token1->keyword == RID__EXPORT;
>cp_token *next = exporting ? token2 : token1;

c++: Minor cleanup in parser.cc

The code to determine whether a given token starts a module directive is
currently repeated in 4 places in parser.cc. I am about to submit a patch
that needs to add it in a 5th place, 

[r13-726 Regression] FAIL: libgomp.fortran/taskwait-depend-nowait-1.f90 -O execution test on Linux/x86_64

2022-05-24 Thread skpandey--- via Gcc-patches
On Linux/x86_64,

4fb2b4f7ea6b80ae75d3efb6f86e7c6179080535 is the first bad commit
commit 4fb2b4f7ea6b80ae75d3efb6f86e7c6179080535
Author: Tobias Burnus 
Date:   Tue May 24 10:41:43 2022 +0200

OpenMP: Support nowait with Fortran [PR105378]

caused

FAIL: libgomp.fortran/taskwait-depend-nowait-1.f90   -O  execution test

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-726/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/x86_64-linux/libgomp/testsuite && make check 
RUNTESTFLAGS="fortran.exp=libgomp.fortran/taskwait-depend-nowait-1.f90 
--target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at skpgkp2 at gmail dot com)


Re: [PATCH][_Hashtable] Fix insertion of range of type convertible to value_type PR 56112

2022-05-24 Thread François Dumont via Gcc-patches

On 24/05/22 12:18, Jonathan Wakely wrote:

On Thu, 5 May 2022 at 18:38, François Dumont via Libstdc++
 wrote:

Hi

I am renewing my patch to fix PR 56112, this time for the insert methods.
I changed it completely, and it now also works with move-only key types.

I will let you, Jonathan, find a better name than _ValueTypeEnforcer, as usual :-)

libstdc++: [_Hashtable] Insert range of types convertible to value_type
PR 56112

Fix insertion of a range of types convertible to value_type. Also fix the
case where this value_type has a move-only key_type, which also allows
converted values to be moved.

libstdc++-v3/ChangeLog:

  PR libstdc++/56112
  * include/bits/hashtable_policy.h (_ValueTypeEnforcer): New.
  * include/bits/hashtable.h
(_Hashtable<>::_M_insert_unique_aux): New.
  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
true_type)): Use latters.
  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
false_type)): Likewise.
  (_Hashtable(_InputIterator, _InputIterator, size_type, const
_Hash&, const _Equal&,
  const allocator_type&, true_type)): Use this.insert range.
  (_Hashtable(_InputIterator, _InputIterator, size_type, const
_Hash&, const _Equal&,
  const allocator_type&, false_type)): Use _M_insert.
  * testsuite/23_containers/unordered_map/cons/56112.cc: Check
how many times conversion
  is done.
  (test02): New test case.
  * testsuite/23_containers/unordered_set/cons/56112.cc: New test.

Tested under Linux x86_64.

Ok to commit ?

No, sorry.

The new test02 function in 23_containers/unordered_map/cons/56112.cc
doesn't compile with libc++ or MSVC either, are you sure that test is
valid? I don't think it is, because S2 is not convertible to
pair. None of the pair constructors are
viable, because the move constructor would require two user-defined
conversions (from S2 to pair and then from
pair to pair). A conversion
sequence cannot have more than one user-defined conversion using a
constructor or conversion operator. So if your patch makes that
compile, it's a bug in the new code. I haven't analyzed that code to
see where the problem is, I'm just looking at the test results and the
changes in behaviour.
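
The rule about user-defined conversions can be seen in a small standalone example. This is only an illustration with hypothetical types (Source, Middle, Target); it is not the S2 type from the test in question.

```cpp
#include <type_traits>

struct Source {};

// One user-defined conversion: Source -> Middle.
struct Middle {
    Middle(const Source&) {}
};

// A second user-defined conversion: Middle -> Target.
struct Target {
    Target(const Middle&) {}
};

// A single user-defined conversion is allowed implicitly.
static_assert(std::is_convertible<Source, Middle>::value,
              "Source converts to Middle");

// Source -> Target would need both conversions in one implicit sequence,
// which is not permitted.
static_assert(!std::is_convertible<Source, Target>::value,
              "Source does not implicitly convert to Target");

// Spelling out the intermediate step is fine.
inline Target convert_in_two_steps(const Source& s) {
    return Target(Middle(s));
}
```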

The new 23_containers/unordered_set/cons/56112.cc test fails for GCC
11 but passes for GCC 12, even without your patch. Is it actually
testing some other change, not this patch, and not the 2013 fix for PR
56112?

.


Yes, I'm not surprised. This is due to this operator on _ValueTypeEnforcer:


  constexpr __enable_if_t::value,
          const __mutable_value_t<_Value>&>
  operator()(const __mutable_value_t<_Value>& __x) noexcept
  { return __x; }

I thought it was nice to allow move construction in this case.

If the Standard forces a copy in this case I can just remove it.

François



Re: [PATCH v5] c++: ICE with temporary of class type in DMI [PR100252]

2022-05-24 Thread Marek Polacek via Gcc-patches
On Tue, May 24, 2022 at 04:01:37PM -0400, Jason Merrill wrote:
> On 5/24/22 09:55, Marek Polacek wrote:
> > On Tue, May 24, 2022 at 08:36:39AM -0400, Jason Merrill wrote:
> > > On 5/16/22 11:36, Marek Polacek wrote:
> > > > +static tree
> > > > +replace_placeholders_for_class_temp_r (tree *tp, int *, void *data)
> > > > +{
> > > > +  tree t = *tp;
> > > > +  tree full_expr = *static_cast<tree *>(data);
> > > > +
> > > > +  /* We're looking for a TARGET_EXPR nested in the whole expression.  
> > > > */
> > > > +  if (TREE_CODE (t) == TARGET_EXPR
> > > > +  && !potential_prvalue_result_of (t, full_expr))
> > > > +{
> > > > +  tree init = TARGET_EXPR_INITIAL (t);
> > > > +  while (TREE_CODE (init) == COMPOUND_EXPR)
> > > > +   init = TREE_OPERAND (init, 1);
> > > 
> > > Hmm, how do we get a COMPOUND_EXPR around a CONSTRUCTOR?
> > 
> > Sadly, that's possible for code like (from nsdmi-aggr18.C)
> > 
> > struct D {
> >int x = 42;
> >B b = (true, A{x});
> > };
> > 
> > where the TARGET_EXPR_INITIAL is
> > <<< Unknown tree: void_cst >>>, {.x=((struct D *) this)->x, 
> > .y=(&)->x}
> 
> Hmm, perhaps cp_build_compound_expr should build an additional TARGET_EXPR
> around the COMPOUND_EXPR but leave the one inside alone. Feel free to
> investigate that if you'd like, or the patch is OK as is.

Sorry, I was unclear.  The whole expression is:

TARGET_EXPR >>, {.x=((struct D *) this)->x, .y=(&)->x}>)>

so there *is* a TARGET_EXPR around the COMPOUND_EXPR.  We'd have to build
a TARGET_EXPR around the COMPOUND_EXPR's RHS = the CONSTRUCTOR.  Frankly,
I'm not sure if it's worth the effort.  The while loop is somewhat unsightly
but not too bad.

Marek



Re: [PATCH v2] DSE: Use the constant store source if possible

2022-05-24 Thread H.J. Lu via Gcc-patches
On Mon, May 23, 2022 at 11:42 PM Richard Biener
 wrote:
>
> On Mon, May 23, 2022 at 8:34 PM H.J. Lu  wrote:
> >
> > On Mon, May 23, 2022 at 12:38:06PM +0200, Richard Biener wrote:
> > > On Sat, May 21, 2022 at 5:02 AM H.J. Lu via Gcc-patches
> > >  wrote:
> > > >
> > > > When recording store for RTL dead store elimination, check if the source
> > > > register is set only once to a constant.  If yes, record the constant
> > > > as the store source.  It eliminates unrolled zero stores after memset 0
> > > > in a loop where a vector register is used as the zero store source.
> > > >
> > > > gcc/
> > > >
> > > > PR rtl-optimization/105638
> > > > * dse.cc (record_store): Use the constant source if the source
> > > > register is set only once.
> > > >
> > > > gcc/testsuite/
> > > >
> > > > PR rtl-optimization/105638
> > > > * g++.target/i386/pr105638.C: New test.
> > > > ---
> > > >  gcc/dse.cc   | 19 ++
> > > >  gcc/testsuite/g++.target/i386/pr105638.C | 44 
> > > >  2 files changed, 63 insertions(+)
> > > >  create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
> > > >
> > > > diff --git a/gcc/dse.cc b/gcc/dse.cc
> > > > index 30c11cee034..0433dd3d846 100644
> > > > --- a/gcc/dse.cc
> > > > +++ b/gcc/dse.cc
> > > > @@ -1508,6 +1508,25 @@ record_store (rtx body, bb_info_t bb_info)
> > > >
> > > >   if (tem && CONSTANT_P (tem))
> > > > const_rhs = tem;
> > > > + else
> > > > +   {
> > > > + /* If RHS is set only once to a constant, set CONST_RHS
> > > > +to the constant.  */
> > > > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
> > > > + if (def != nullptr
> > > > + && !DF_REF_IS_ARTIFICIAL (def)
> > > > + && !DF_REF_NEXT_REG (def))
> > > > +   {
> > > > + rtx_insn *def_insn = DF_REF_INSN (def);
> > > > + rtx def_body = PATTERN (def_insn);
> > > > + if (GET_CODE (def_body) == SET)
> > > > +   {
> > > > + rtx def_src = SET_SRC (def_body);
> > > > + if (CONSTANT_P (def_src))
> > > > +   const_rhs = def_src;
> > >
> > > doesn't DSE have its own tracking of stored values?  Shouldn't we
> >
> > It tracks stored values only within the basic block.  When RTL loop
> > invariant motion hoists a constant initialization out of the loop into
> > a separate basic block, the constant store value becomes unknown
> > within the original basic block.
> >
> > > improve _that_ if it is not enough?  I also wonder if you need to
> >
> > My patch extends DSE stored value tracking to include the constant which
> > is set only once in another basic block.
> >
> > > verify the SET isn't partial?
> > >
> >
> > Here is the v2 patch to check that the constant is set by a non-partial
> > unconditional load.
> >
> > OK for master?
> >
> > Thanks.
> >
> > H.J.
> > ---
> > RTL DSE tracks redundant constant stores within a basic block.  When RTL
> > loop invariant motion hoists a constant initialization out of the loop
> > into a separate basic block, the constant store value becomes unknown
> > within the original basic block.  When recording store for RTL DSE, check
> > if the source register is set only once to a constant by a non-partial
> > unconditional load.  If yes, record the constant as the constant store
> > source.  It eliminates unrolled zero stores after memset 0 in a loop
> > where a vector register is used as the zero store source.
> >
> > gcc/
> >
> > PR rtl-optimization/105638
> > * dse.cc (record_store): Use the constant source if the source
> > register is set only once.
> >
> > gcc/testsuite/
> >
> > PR rtl-optimization/105638
> > * g++.target/i386/pr105638.C: New test.
> > ---
> >  gcc/dse.cc   | 22 
> >  gcc/testsuite/g++.target/i386/pr105638.C | 44 
> >  2 files changed, 66 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.target/i386/pr105638.C
> >
> > diff --git a/gcc/dse.cc b/gcc/dse.cc
> > index 30c11cee034..af8e88dac32 100644
> > --- a/gcc/dse.cc
> > +++ b/gcc/dse.cc
> > @@ -1508,6 +1508,28 @@ record_store (rtx body, bb_info_t bb_info)
> >
> >   if (tem && CONSTANT_P (tem))
> > const_rhs = tem;
> > + else
> > +   {
> > + /* If RHS is set only once to a constant, set CONST_RHS
> > +to the constant.  */
> > + df_ref def = DF_REG_DEF_CHAIN (REGNO (rhs));
> > + if (def != nullptr
> > + && !DF_REF_IS_ARTIFICIAL (def)
> > + && !(DF_REF_FLAGS (def)
> > +  & (DF_REF_PARTIAL | DF_REF_CONDITIONAL))
> > + && !DF_REF_NEXT_REG (def))
>
> Can we really use df-chain here and rely that a single definition is
> 

[pushed] c++: constexpr returning deallocated ptr

2022-05-24 Thread Jason Merrill via Gcc-patches
In constexpr-new3.C, the f7 function returns a deleted pointer, which we
were happily caching because the new and delete are balanced.  Don't.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Check for
heap vars in the result.
---
 gcc/cp/constexpr.cc | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 1a70fda1dc5..45208478c3f 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -1356,6 +1356,7 @@ static tree cxx_eval_constant_expression (const 
constexpr_ctx *, tree,
  value_cat, bool *, bool *, tree * = 
NULL);
 static tree cxx_fold_indirect_ref (const constexpr_ctx *, location_t, tree, 
tree,
   bool * = NULL);
+static tree find_heap_var_refs (tree *, int *, void *);
 
 /* Attempt to evaluate T which represents a call to a builtin function.
We assume here that all builtin functions evaluate to scalar types
@@ -2965,6 +2966,10 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
  cacheable = false;
  break;
}
+ /* Also don't cache a call that returns a deallocated pointer.  */
+ if (cacheable && (cp_walk_tree_without_duplicates
+   (&result, find_heap_var_refs, NULL)))
+   cacheable = false;
}
 
/* Rewrite all occurrences of the function's RESULT_DECL with the

base-commit: 1189c03859cefef4fc4fd44d57eb3d4d3348b562
prerequisite-patch-id: cc6e608c68f4eb133f6a153f83dfe4f033544cbd
-- 
2.27.0



[pushed] c++: strict constexpr and local vars

2022-05-24 Thread Jason Merrill via Gcc-patches
A change I was working on made constexpr_searcher.cc start to fail, and when
I looked at it I wondered why it had been accepted before.  This turned out
to be because we try to be more flexible about constant-evaluation of static
initializers, as allowed, but we were wrongly doing the same for non-static
initializers as well.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* constexpr.cc (maybe_constant_init_1): Only pass false for
strict when initializing a variable of static duration.

libstdc++-v3/ChangeLog:

* testsuite/20_util/function_objects/constexpr_searcher.cc: Add
constexpr.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/constexpr-local4.C: New test.
---
 gcc/cp/constexpr.cc | 12 +---
 gcc/testsuite/g++.dg/cpp1y/constexpr-local4.C   | 17 +
 .../function_objects/constexpr_searcher.cc  |  4 ++--
 3 files changed, 28 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-local4.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 388239ea8a8..1a70fda1dc5 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -8301,9 +8301,15 @@ maybe_constant_init_1 (tree t, tree decl, bool 
allow_non_constant,
   else if (CONSTANT_CLASS_P (t) && allow_non_constant)
 /* No evaluation needed.  */;
   else
-t = cxx_eval_outermost_constant_expr (t, allow_non_constant,
- /*strict*/false,
- manifestly_const_eval, false, decl);
+{
+  /* [basic.start.static] allows constant-initialization of variables with
+static or thread storage duration even if it isn't required, but we
+shouldn't bend the rules the same way for automatic variables.  */
+  bool is_static = (decl && DECL_P (decl)
+   && (TREE_STATIC (decl) || DECL_EXTERNAL (decl)));
+  t = cxx_eval_outermost_constant_expr (t, allow_non_constant, !is_static,
+   manifestly_const_eval, false, decl);
+}
   if (TREE_CODE (t) == TARGET_EXPR)
 {
   tree init = TARGET_EXPR_INITIAL (t);
diff --git a/gcc/testsuite/g++.dg/cpp1y/constexpr-local4.C 
b/gcc/testsuite/g++.dg/cpp1y/constexpr-local4.C
new file mode 100644
index 00000000000..bef62488579
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/constexpr-local4.C
@@ -0,0 +1,17 @@
+// { dg-do compile { target c++14 } }
+
+struct A
+{
+  int i;
+  constexpr A(int i): i(i) {};
+};
+
+const A a = 42;
+
+constexpr int f()
+{
+  const int j = a.i;   // { dg-message "'a'" }
+  return j;
+}
+
+static_assert (f() == 42,"");  // { dg-error "non-constant" }
diff --git 
a/libstdc++-v3/testsuite/20_util/function_objects/constexpr_searcher.cc 
b/libstdc++-v3/testsuite/20_util/function_objects/constexpr_searcher.cc
index 33012200cb0..17069694c1b 100644
--- a/libstdc++-v3/testsuite/20_util/function_objects/constexpr_searcher.cc
+++ b/libstdc++-v3/testsuite/20_util/function_objects/constexpr_searcher.cc
@@ -28,13 +28,13 @@
 
 #include 
 
-const std::string_view
+constexpr std::string_view
 patt = "World";
 
 constexpr std::string_view
 greet = "Hello, Humongous World of Wonder!!!";
 
-const std::wstring_view
+constexpr std::wstring_view
 wpatt = L"World";
 
 constexpr std::wstring_view

base-commit: 1189c03859cefef4fc4fd44d57eb3d4d3348b562
-- 
2.27.0



Re: [PATCH v5] c++: ICE with temporary of class type in DMI [PR100252]

2022-05-24 Thread Jason Merrill via Gcc-patches

On 5/24/22 09:55, Marek Polacek wrote:

On Tue, May 24, 2022 at 08:36:39AM -0400, Jason Merrill wrote:

On 5/16/22 11:36, Marek Polacek wrote:

+static tree
+replace_placeholders_for_class_temp_r (tree *tp, int *, void *data)
+{
+  tree t = *tp;
+  tree full_expr = *static_cast<tree *>(data);
+
+  /* We're looking for a TARGET_EXPR nested in the whole expression.  */
+  if (TREE_CODE (t) == TARGET_EXPR
+  && !potential_prvalue_result_of (t, full_expr))
+{
+  tree init = TARGET_EXPR_INITIAL (t);
+  while (TREE_CODE (init) == COMPOUND_EXPR)
+   init = TREE_OPERAND (init, 1);


Hmm, how do we get a COMPOUND_EXPR around a CONSTRUCTOR?


Sadly, that's possible for code like (from nsdmi-aggr18.C)

struct D {
   int x = 42;
   B b = (true, A{x});
};

where the TARGET_EXPR_INITIAL is
<<< Unknown tree: void_cst >>>, {.x=((struct D *) this)->x, .y=(&)->x}


Hmm, perhaps cp_build_compound_expr should build an additional 
TARGET_EXPR around the COMPOUND_EXPR but leave the one inside alone. 
Feel free to investigate that if you'd like, or the patch is OK as is.


Jason



Re: [ping] Re: [RFA] gcc.misc-tests/outputs.exp: Use link test to check for -gsplit-dwarf support

2022-05-24 Thread Joel Brobecker via Gcc-patches
> >> gcc/testsuite/ChangeLog:
> >> 
> >> * gcc.misc-tests/outputs.exp: Make the -gsplit-dwarf test
> >> a compile-and-link test rather than a compile-only test.
> 
> OK, thanks.

Thank you Richard. Pushed to master.

-- 
Joel


[pushed] c++: *this folding in constexpr call

2022-05-24 Thread Jason Merrill via Gcc-patches
The code in cxx_eval_call_expression to fold *this was doing the wrong thing
for array decay; we can use cxx_fold_indirect_ref instead.

This didn't end up being necessary to fix anything, but still seems like an
improvement.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* constexpr.cc (cxx_fold_indirect_ref): Add default arg.
(cxx_eval_call_expression): Call it.
(cxx_fold_indirect_ref_1): Handle null empty_base.
---
 gcc/cp/constexpr.cc | 11 ++-
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index a015bc7c818..388239ea8a8 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -1354,6 +1354,8 @@ enum value_cat {
 
 static tree cxx_eval_constant_expression (const constexpr_ctx *, tree,
  value_cat, bool *, bool *, tree * = 
NULL);
+static tree cxx_fold_indirect_ref (const constexpr_ctx *, location_t, tree, 
tree,
+  bool * = NULL);
 
 /* Attempt to evaluate T which represents a call to a builtin function.
We assume here that all builtin functions evaluate to scalar types
@@ -2720,9 +2722,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree 
t,
 At this point it has already been evaluated in the call
 to cxx_bind_parameters_in_call.  */
   new_obj = TREE_VEC_ELT (new_call.bindings, 0);
-  STRIP_NOPS (new_obj);
-  if (TREE_CODE (new_obj) == ADDR_EXPR)
-   new_obj = TREE_OPERAND (new_obj, 0);
+  new_obj = cxx_fold_indirect_ref (ctx, loc, DECL_CONTEXT (fun), new_obj);
 
   if (ctx->call && ctx->call->fundef
  && DECL_CONSTRUCTOR_P (ctx->call->fundef->decl))
@@ -5197,7 +5197,8 @@ cxx_fold_indirect_ref_1 (const constexpr_ctx *ctx, 
location_t loc, tree type,
  && CLASS_TYPE_P (optype)
  && DERIVED_FROM_P (type, optype))
{
- *empty_base = true;
+ if (empty_base)
+   *empty_base = true;
  return op;
}
 }
@@ -5216,7 +5217,7 @@ cxx_fold_indirect_ref_1 (const constexpr_ctx *ctx, 
location_t loc, tree type,
 
 static tree
 cxx_fold_indirect_ref (const constexpr_ctx *ctx, location_t loc, tree type,
-  tree op0, bool *empty_base)
+  tree op0, bool *empty_base /* = NULL*/)
 {
   tree sub = op0;
   tree subtype;

base-commit: 72f76540ad0e7185d4f516e781e8bead13ebc170
-- 
2.27.0



[pushed] c++: discarded-value and constexpr

2022-05-24 Thread Jason Merrill via Gcc-patches
I've been thinking for a while that the 'lval' parameter needed a third
value for discarded-value expressions; most importantly,
cxx_eval_store_expression does extra work for an lvalue result, and we also
don't want to do the l->r conversion.

Mostly this is pretty mechanical.  Apart from the _store_ fix, I also use
vc_discard for substatements of a STATEMENT_LIST other than a stmt-expr
result, and avoid building _REFs to be ignored in a few other places.
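
As a rough illustration of why a third category helps, here is a toy model
in C.  It is only a sketch: apart from the three vc_* enumerators taken from
the patch, every name (toy_work, toy_eval_store) is hypothetical, and the
real cxx_eval_store_expression does far more than this.

```c
#include <assert.h>
#include <stdbool.h>

/* Same three categories as in the patch.  */
enum value_cat { vc_prvalue = 0, vc_glvalue = 1, vc_discard = 2 };

/* Hypothetical record of the work a toy evaluator performed.  */
struct toy_work
{
  bool built_ref;     /* built a reference to the stored-to object */
  bool loaded_value;  /* performed an lvalue-to-rvalue conversion */
};

/* Toy model of evaluating a store under each requested category.  */
struct toy_work
toy_eval_store (enum value_cat lval)
{
  struct toy_work w = { false, false };
  switch (lval)
    {
    case vc_glvalue:
      w.built_ref = true;      /* caller wants the object itself */
      break;
    case vc_prvalue:
      w.loaded_value = true;   /* caller wants the stored value */
      break;
    case vc_discard:
      break;                   /* caller ignores the result: do neither */
    }
  return w;
}
```

With vc_discard the toy builds neither the reference nor the loaded value,
which is the kind of extra work the patch is meant to avoid.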

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* constexpr.cc (enum value_cat): New. Change all 'lval' parameters
from int to value_cat.  Change most false to vc_prvalue, most true
to vc_glvalue, cases where the return value is ignored to
vc_discard.
(cxx_eval_statement_list): Only vc_prvalue for stmt-expr result.
(cxx_eval_store_expression): Only build _REF for vc_glvalue.
(cxx_eval_array_reference, cxx_eval_component_reference)
(cxx_eval_indirect_ref, cxx_eval_constant_expression): Likewise.
---
 gcc/cp/constexpr.cc | 198 
 1 file changed, 108 insertions(+), 90 deletions(-)

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 0f1a43982d0..a015bc7c818 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -1210,9 +1210,6 @@ 
uid_sensitive_constexpr_evaluation_checker::evaluation_restricted_p () const
 
 static GTY (()) hash_table<constexpr_call_hasher> *constexpr_call_table;
 
-static tree cxx_eval_constant_expression (const constexpr_ctx *, tree,
- bool, bool *, bool *, tree * = NULL);
-
 /* Compute a hash value for a constexpr call representation.  */
 
 inline hashval_t
@@ -1346,13 +1343,25 @@ get_nth_callarg (tree t, int n)
 }
 }
 
+/* Whether our evaluation wants a prvalue (e.g. CONSTRUCTOR or _CST),
+   a glvalue (e.g. VAR_DECL or _REF), or nothing.  */
+
+enum value_cat {
+   vc_prvalue = 0,
+   vc_glvalue = 1,
+   vc_discard = 2
+};
+
+static tree cxx_eval_constant_expression (const constexpr_ctx *, tree,
+ value_cat, bool *, bool *, tree * = 
NULL);
+
 /* Attempt to evaluate T which represents a call to a builtin function.
We assume here that all builtin functions evaluate to scalar types
represented by _CST nodes.  */
 
 static tree
 cxx_eval_builtin_function_call (const constexpr_ctx *ctx, tree t, tree fun,
-   bool lval,
+   value_cat lval,
bool *non_constant_p, bool *overflow_p)
 {
   const int nargs = call_expr_nargs (t);
@@ -1458,7 +1467,7 @@ cxx_eval_builtin_function_call (const constexpr_ctx *ctx, 
tree t, tree fun,
  || potential_constant_expression (arg))
{
  bool dummy1 = false, dummy2 = false;
- arg = cxx_eval_constant_expression (&new_ctx, arg, false,
+ arg = cxx_eval_constant_expression (&new_ctx, arg, vc_prvalue,
  &dummy1, &dummy2);
}
 
@@ -1703,7 +1712,7 @@ cxx_bind_parameters_in_call (const constexpr_ctx *ctx, 
tree t, tree fun,
   /* Normally we would strip a TARGET_EXPR in an initialization context
 such as this, but here we do the elision differently: we keep the
 TARGET_EXPR, and use its CONSTRUCTOR as the value of the parm.  */
-  arg = cxx_eval_constant_expression (ctx, x, /*lval=*/false,
+  arg = cxx_eval_constant_expression (ctx, x, vc_prvalue,
  non_constant_p, overflow_p);
   /* Don't VERIFY_CONSTANT here.  */
   if (*non_constant_p && ctx->quiet)
@@ -1807,7 +1816,7 @@ cx_error_context (void)
 
 static tree
 cxx_eval_internal_function (const constexpr_ctx *ctx, tree t,
-   bool lval,
+   value_cat lval,
bool *non_constant_p, bool *overflow_p)
 {
   enum tree_code opcode = ERROR_MARK;
@@ -1832,12 +1841,13 @@ cxx_eval_internal_function (const constexpr_ctx *ctx, 
tree t,
 
 case IFN_LAUNDER:
   return cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
-  false, non_constant_p, overflow_p);
+  vc_prvalue, non_constant_p,
+  overflow_p);
 
 case IFN_VEC_CONVERT:
   {
tree arg = cxx_eval_constant_expression (ctx, CALL_EXPR_ARG (t, 0),
-false, non_constant_p,
+vc_prvalue, non_constant_p,
 overflow_p);
if (TREE_CODE (arg) == VECTOR_CST)
  if (tree r = fold_const_call (CFN_VEC_CONVERT, TREE_TYPE (t), arg))
@@ -2103,7 +2113,7 @@ cxx_eval_dynamic_cast_fn (const constexpr_ctx *ctx, tree 
call,
 }
 
   /* Evaluate the object so that we know its dynamic type.  */
-  obj = cxx_eval_constant_expression (ctx, 

Re: [wwwdocs] Add C status page

2022-05-24 Thread Joseph Myers
On Tue, 24 May 2022, Marek Polacek via Gcc-patches wrote:

> +

This actually looks like a mix of papers approved at the Oct 2020 meeting 
and those approved at the Nov/Dec 2020 meeting, though I haven't checked 
the lists in detail against those the minutes show as being approved as-is 
or with specified changes (as opposed to support "along the lines" = bring 
back a revised paper for more discussion = doesn't belong in this list 
unless it was actually approved into C2x at a subsequent meeting).

> +

There was a Mar 2021 meeting, not an Apr 2021 one.

There were also June 2021, Aug/Sep 2021, Nov 2021, Jan/Feb 2022 and May 
2022 meetings, which don't appear in this list.

Also note that www.open-std.org now uses https so all links should be 
updated accordingly.

The following are lists of papers accepted at the last two WG14 meetings, 
sorted by whether, at the time of the meeting, I thought they needed 
implementation work in GCC *or glibc*.  Note that needing implementation 
work, for the purposes of these lists, sometimes only means "needs a 
review of existing code before I'm sure it *doesn't* need implementation 
work", or "already implemented, but may need test cases added to the 
testsuite", or "already implemented but -pedantic handling needs adjusting 
accordingly" (sometimes the entries have notes I made to that effect) - it 
does *not* necessarily mean a missing feature.  Also note that some 
entries on these lists may actually be part of a feature added at a 
previous meeting for C2x that don't really deserve their own table entry, 
and some may have been implemented since I took these notes at the 
meetings.  The descriptions of the papers in these lists are very 
abbreviated, intended only for my own reference; they aren't the full 
official paper titles or suitable for a list on the GCC website.  And just 
as I didn't distinguish language and library features (some papers have 
both, of course), I also didn't distinguish *why* a paper doesn't need 
implementation work (most often because it's not making any normative 
changes of intent relevant to GCC or glibc, but sometimes maybe already 
implemented or for other reasons).

Jan/Feb 2022, needed implementation work:

N2833 timegm (option 1, mktime->timegm typo fixed in Returns)
N2836 identifier syntax
  [check any impact on $ in identifiers?] [check pp-number handling]
N2764 noreturn attribute
N2775 _BitInt literal suffixes
N2927 typeof
N2841 no function declarators without prototypes
  (subject to editorial fixes, Aaron Ballman reflector 21595, 13 Feb 2022)
  [note: revert previous changes to () compat with prototypes]
N2840 call_once, once_flag, ONCE_FLAG_INIT in stdlib.h
N2826 unreachable, macro in stddef.h (changes 5, 6) [no vote on change 7]
N2819 storage class of compound literals in function parameters
  (adopting GCC behavior, but add tests including that non-const inits OK)
N2653 char8_t (Tom Honermann has GCC, glibc patches)
N2829 assert variadic macro
N2900 {} init (includes VLA initializers; -pedantic fixes to allow {})
  (not including the Optional Change 0 on largest members)
N2828 Unicode Sequences More Than 21 Bits are a Constraint Violation
  (diagnosed by GCC since Sep 2019, but check for tests before saying done)
N2931 tgmath narrowing (may need work along with other tgmath changes)
N2934 keywords alignas alignof bool static_assert thread_local
N2935 keywords / predefined constants true false

Jan/Feb 2022, no implementation work needed:

N2810 calloc wraparound (possibly remove "of", editorially)
N2797 *_HAS_SUBNORM (green text only)
N2754 DFP quantum exponent of NaN etc. ("would be" -> "is" in footnote)
N2844 (remove _FloatN default argument promotion)
N2847 "numerically equal" example change
N2879 5.2.4.2.2 cleanup
N2880 overflow and underflow definitions
N2881 normal and subnormal
  ("Larger magnitude ..." sentence is only new part not already in C23)
N2882 max exponent
N2762 potentially reserved identifiers
  (remove "All library functions have external linkage." footnote)
N2701 @ and $ and ` (all three changes, need to be single-byte)
N2937 (blocks in grammar)

May 2022, needed implementation work:

N2867 checked n-bit integers (i.e. not required to support them)
N2886 remove ATOMIC_VAR_INIT (variant 2 + changes 3.2/3.3/3.4/3.5, not 3.6)
N2888 require exact-width integer type interfaces (3.1 already accepted)
  (i.e., allowing exact-width types wider than (u)intmax_t)
  No implementation work needed for conformance, BUT would allow
  int128_t / uint128_t typedefs etc. given some other implementation changes
N2897 memset_explicit (Alternative 1)

May 2022, no implementation work needed:

N2861 indeterminate values and trap representations
  (subject to editorial merges with other papers)
N2992 wording change for variably-modified types

For the previous three meetings, I have a more abbreviated version of such 
lists.

Jun 2021, needed implementation work:

N2683 checked 

[pushed] c++: constexpr empty base redux [PR105622]

2022-05-24 Thread Jason Merrill via Gcc-patches
Here calling the constructor for s.__size_ had ctx->ctor for s itself
because cxx_eval_store_expression doesn't create a ctor for the empty field.
Then cxx_eval_call_expression returned the s initializer, and my empty base
overhaul in r13-160 got confused because the type of init is not an empty
class.  But that's OK, we should be checking the type of the original LHS
instead.  We also want to use initialized_type in the condition, in case
init is an AGGR_INIT_EXPR.

I spent quite a while working on more complex solutions before coming back
to this simple one.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/105622

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_store_expression): Adjust assert.
Use initialized_type.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/no_unique_address14.C: New test.
---
 gcc/cp/constexpr.cc   |  4 ++--
 .../g++.dg/cpp2a/no_unique_address14.C| 19 +++
 2 files changed, 21 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/no_unique_address14.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 433fa767c03..0f1a43982d0 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -5916,15 +5916,15 @@ cxx_eval_store_expression (const constexpr_ctx *ctx, 
tree t,
   gcc_checking_assert (!*valp || (same_type_ignoring_top_level_qualifiers_p
  (TREE_TYPE (*valp), type)));
   if (empty_base || !(same_type_ignoring_top_level_qualifiers_p
- (TREE_TYPE (init), type)))
+ (initialized_type (init), type)))
 {
   /* For initialization of an empty base, the original target will be
*(base*)this, evaluation of which resolves to the object
argument, which has the derived type rather than the base type.  In
this situation, just evaluate the initializer and return, since
there's no actual data to store, and we didn't build a CONSTRUCTOR.  */
+  gcc_assert (is_empty_class (TREE_TYPE (target)));
   empty_base = true;
-  gcc_assert (is_empty_class (TREE_TYPE (init)));
   if (!*valp)
{
  /* But do make sure we have something in *valp.  */
diff --git a/gcc/testsuite/g++.dg/cpp2a/no_unique_address14.C 
b/gcc/testsuite/g++.dg/cpp2a/no_unique_address14.C
new file mode 100644
index 000..d3fcd4ad354
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/no_unique_address14.C
@@ -0,0 +1,19 @@
+// PR c++/105622
+// { dg-do compile { target c++20 } }
+
+struct empty {
+  empty() = default;
+  constexpr empty(int) { }
+};
+
+struct container {
+  empty __begin_ = {};
+  [[no_unique_address]] empty __size_ = 0;
+};
+
+constexpr bool test() {
+  container s;
+  return true;
+}
+static_assert(test());
+

base-commit: ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e
-- 
2.27.0



Re: [PATCH] [PR/target 105666] RISC-V: Inhibit FP <--> int register moves via tune param

2022-05-24 Thread Vineet Gupta




On 5/24/22 00:59, Kito Cheng wrote:

Committed, thanks!


Thx for the quick action Kito,
Can this be backported to gcc 12 as well ?

Thx,
-Vineet



On Tue, May 24, 2022 at 3:40 AM Philipp Tomsich
 wrote:

Good catch!

On Mon, 23 May 2022 at 20:12, Vineet Gupta  wrote:


Under extreme register pressure, the compiler can use FP <--> int
moves as a cheap alternative to spilling to memory.
This was seen with SPEC2017 FP benchmark 507.cactu:
ML_BSSN_Advect.cc:ML_BSSN_Advect_Body()

|   fmv.d.x fa5,s9  # PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
| .LVL325:
|   ld  s9,184(sp)  # _12469, %sfp
| ...
| .LVL339:
|   fmv.x.d s4,fa5  # PDupwindNthSymm2Xt1, PDupwindNthSymm2Xt1
|

The FMV instructions could be costlier (than stack spill) on certain
micro-architectures, thus this needs to be a per-cpu tunable
(default being to inhibit on all existing RV cpus).
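
To make the per-CPU tunable concrete, here is a standalone sketch in C of
the cost selection.  It is a toy under stated assumptions, not the riscv.cc
code: the toy_* names are hypothetical, toy_cheap_fmv_core is an invented
core used only to show what a non-inhibiting value might look like, and the
same-class cost of 2 simply mirrors the existing fallback value.

```c
#include <assert.h>

enum toy_reg_class { TOY_GR_REGS, TOY_FP_REGS };

/* Hypothetical, cut-down tuning record.  */
struct toy_tune_param
{
  unsigned short memory_cost;
  unsigned short fmv_cost;
};

/* Existing cores: fmv_cost 8 inhibits FP <--> int moves.  */
const struct toy_tune_param toy_existing_core = { 5, 8 };

/* Invented core where such moves are assumed to be cheap.  */
const struct toy_tune_param toy_cheap_fmv_core = { 5, 2 };

/* Cost of moving a value from register class FROM to class TO.  */
int
toy_register_move_cost (const struct toy_tune_param *tune,
                        enum toy_reg_class from, enum toy_reg_class to)
{
  if (from != to)
    return tune->fmv_cost;   /* cross-file move: use the per-CPU cost */
  return 2;                  /* same-class move */
}
```

A higher fmv_cost makes the cross-file move look less attractive to the
register allocator relative to a stack spill; the exact comparison is made
elsewhere and is not modelled here.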

Testsuite run with new test reports 10 failures without the fix
corresponding to the build variations of pr105666.c

|   === gcc Summary ===
|
| # of expected passes  123318   (+10)
| # of unexpected failures  34   (-10)
| # of unexpected successes 4
| # of expected failures780
| # of unresolved testcases 4
| # of unsupported tests2796

gcc/ChangeLog:

 * config/riscv/riscv.cc: (struct riscv_tune_param): Add
   fmv_cost.
 (rocket_tune_info): Add default fmv_cost 8.
 (sifive_7_tune_info): Ditto.
 (thead_c906_tune_info): Ditto.
 (optimize_size_tune_info): Ditto.
 (riscv_register_move_cost): Use fmv_cost for int<->fp moves.

gcc/testsuite/ChangeLog:

 * gcc.target/riscv/pr105666.c: New test.

Signed-off-by: Vineet Gupta 
---
  gcc/config/riscv/riscv.cc |  9 
  gcc/testsuite/gcc.target/riscv/pr105666.c | 55 +++
  2 files changed, 64 insertions(+)
  create mode 100644 gcc/testsuite/gcc.target/riscv/pr105666.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index ee756aab6940..f3ac0d8865f0 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -220,6 +220,7 @@ struct riscv_tune_param
unsigned short issue_rate;
unsigned short branch_cost;
unsigned short memory_cost;
+  unsigned short fmv_cost;
bool slow_unaligned_access;
  };

@@ -285,6 +286,7 @@ static const struct riscv_tune_param rocket_tune_info
= {
1,   /* issue_rate */
3,   /* branch_cost */
5,   /* memory_cost */
+  8,   /* fmv_cost */
true,/*
slow_unaligned_access */
  };

@@ -298,6 +300,7 @@ static const struct riscv_tune_param
sifive_7_tune_info = {
2,   /* issue_rate */
4,   /* branch_cost */
3,   /* memory_cost */
+  8,   /* fmv_cost */
true,/*
slow_unaligned_access */
  };

@@ -311,6 +314,7 @@ static const struct riscv_tune_param
thead_c906_tune_info = {
1,/* issue_rate */
3,/* branch_cost */
5,/* memory_cost */
+  8,   /* fmv_cost */
false,/* slow_unaligned_access */
  };

@@ -324,6 +328,7 @@ static const struct riscv_tune_param
optimize_size_tune_info = {
1,   /* issue_rate */
1,   /* branch_cost */
2,   /* memory_cost */
+  8,   /* fmv_cost */
false,   /* slow_unaligned_access */
  };

@@ -4737,6 +4742,10 @@ static int
  riscv_register_move_cost (machine_mode mode,
   reg_class_t from, reg_class_t to)
  {
+  if ((from == FP_REGS && to == GR_REGS) ||
+  (from == GR_REGS && to == FP_REGS))
+return tune_param->fmv_cost;
+
return riscv_secondary_memory_needed (mode, from, to) ? 8 : 2;
  }

diff --git a/gcc/testsuite/gcc.target/riscv/pr105666.c
b/gcc/testsuite/gcc.target/riscv/pr105666.c
new file mode 100644
index ..904f3bc0763f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr105666.c
@@ -0,0 +1,55 @@
+/* Shamelessly plugged off
gcc/testsuite/gcc.c-torture/execute/pr28982a.c.
+
+   The idea is to induce high register pressure for both int/fp registers
+   so that they spill. By default FMV instructions would be used to stash
+   int reg to a fp reg (and vice-versa) but that could be costlier than
+   spilling to stack.  */
+
+/* { dg-do compile } */
+/* { dg-options "-march=rv64g -ffast-math" } */
+
+#define NITER 4
+#define NVARS 20
+#define MULTI(X) \
+  X( 0), X( 1), X( 2), X( 3), X( 4), X( 5), X( 6), X( 

Re: [0/9] [middle-end] Add param to vec_perm_const hook to specify mode of input operand

2022-05-24 Thread Prathamesh Kulkarni via Gcc-patches
On Tue, 24 May 2022 at 14:50, Richard Sandiford
 wrote:
>
> Prathamesh Kulkarni  writes:
> > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> > index c5006afc00d..0a3c733ada9 100644
> > --- a/gcc/doc/tm.texi
> > +++ b/gcc/doc/tm.texi
> > @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}.  
> > @var{is_packed} is false if the scalar
> >  access using @var{type} is known to be naturally aligned.
> >  @end deftypefn
> >
> > -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST 
> > (machine_mode @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, 
> > const vec_perm_indices @var{})
> > +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST 
> > (machine_mode @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx 
> > @var{in0}, rtx @var{in1}, const vec_perm_indices @var{})
> >  This hook is used to test whether the target can permute up to two
> > -vectors of mode @var{mode} using the permutation vector @code{sel}, and
> > -also to emit such a permutation.  In the former case @var{in0}, @var{in1}
> > -and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are
> > -the source vectors and @var{out} is the destination vector; all three are
> > -operands of mode @var{mode}.  @var{in1} is the same as @var{in0} if
> > -@var{sel} describes a permutation on one vector instead of two.
> > +vectors of mode @var{op_mode} using the permutation vector @code{sel},
> > +producing a vector of mode @var{mode}.The hook is also used to emit such
>
> Should be two spaces between “@var{mode}.” and “The”.
>
> > +a permutation.
> > +
> > +When the hook is being used to test whether the target supports a 
> > permutation,
> > +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being 
> > used
>
> Same here: missing spaces before “When”.
>
> > +to emit a permutation, @var{in0} and @var{in1} are the source vectors of 
> > mode
> > +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}.
> > +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on 
> > one
> > +vector instead of two.
> >
> >  Return true if the operation is possible, emitting instructions for it
> >  if rtxes are provided.
> > diff --git a/gcc/match.pd b/gcc/match.pd
> > index f5efa77560c..f2a527d9c42 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -7596,6 +7596,8 @@ and,
> >   (with
> >{
> >  tree op0 = @0, op1 = @1, op2 = @2;
> > +machine_mode result_mode = TYPE_MODE (type);
> > +machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0));
> >
> >  /* Build a vector of integers from the tree mask.  */
> >  vec_perm_builder builder;
> > @@ -7703,12 +7705,12 @@ and,
> >  2-argument version.  */
> >   tree oldop2 = op2;
> >   if (sel.ninputs () == 2
> > -|| can_vec_perm_const_p (TYPE_MODE (type), sel, false))
> > +|| can_vec_perm_const_p (result_mode, op_mode, sel, false))
> > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel);
> >   else
> > {
> >   vec_perm_indices sel2 (builder, 2, nelts);
> > - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false))
> > + if (can_vec_perm_const_p (result_mode, op_mode, sel2, false))
> > op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2);
> >   else
> > /* Not directly supported with either encoding,
>
> Please replace the use of TYPE_MODE here:
>
> /* See if the permutation is performing a single element
>insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR
>in that case.  But only if the vector mode is supported,
>otherwise this is invalid GIMPLE.  */
> if (TYPE_MODE (type) != BLKmode
>
> as well.
>
> OK with those changes, thanks.
Thanks, committed the patch in ae8decf1d2b8329af59592b4fa78ee8dfab3ba5e.

Thanks,
Prathamesh
>
> Richard
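
The hook contract documented in the tm.texi hunk above (all operands null
for a query, non-null to emit) can be sketched in plain C.  This is a toy
under stated assumptions: toy_vec_perm_const, toy_mode and the int arrays
are hypothetical stand-ins for the real hook, machine_mode and rtx operands,
and the toy target arbitrarily refuses permutations that change mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum toy_mode { TOY_V4SI, TOY_V8HI };

/* Toy stand-in for the hook.  OUT, IN0 and IN1 are all NULL for a pure
   "is this supported?" query; otherwise the permutation is written to
   OUT.  SEL holds NELTS indices into the concatenation of IN0 and IN1.  */
bool
toy_vec_perm_const (enum toy_mode mode, enum toy_mode op_mode,
                    int *out, const int *in0, const int *in1,
                    const size_t *sel, size_t nelts)
{
  /* Assumption: this toy target cannot change modes while permuting.  */
  if (mode != op_mode)
    return false;

  /* Reject selectors that index past the two input vectors.  */
  for (size_t i = 0; i < nelts; i++)
    if (sel[i] >= 2 * nelts)
      return false;

  /* Query only: report support without emitting anything.  */
  if (out == NULL)
    return true;

  for (size_t i = 0; i < nelts; i++)
    out[i] = sel[i] < nelts ? in0[sel[i]] : in1[sel[i] - nelts];
  return true;
}
```

For a permutation on a single vector the caller passes the same array as
both IN0 and IN1, matching the "in1 is the same as in0" wording in the
documentation.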


Re: [PATCH][wwwdocs] Document ASAN changes for GCC 13.

2022-05-24 Thread Martin Liška
On 5/24/22 16:33, Gerald Pfeifer wrote:
> Hi Martin,
> 
> On Tue, 24 May 2022, Martin Liška wrote:
>> +AddressSanitizer defaults to 
>> detect_stack_use_after_return=1 on Linux target.
> 
> did you mean targets, or really just target?
> 
> (And Linux or GNU/Linux, though that one is more disputed, I know.
> Just following our own coding conventions...)
> 
> Gerald

Hello.

Both comments addressed and I've just pushed that.

Martin


Re: [wwwdocs] Add C status page

2022-05-24 Thread Joseph Myers
On Tue, 24 May 2022, Marek Polacek via Gcc-patches wrote:

> I thought it'd be nice to have a table that documents our C support
> status, like we have https://gcc.gnu.org/projects/cxx-status.html for C++.
> We have https://gcc.gnu.org/c99status.html, but that's C99 only.
> 
> So here's a patch to add just that.  For C99, I used c99status.html but
> added paper numbers (taken from https://clang.llvm.org/c_status.html).

For C11, see https://gcc.gnu.org/wiki/C11Status (note: I haven't checked 
the accuracy of that page).

Listing in terms of features is more useful than listing in terms of 
papers.  Referring to the original paper, even if it's the version that 
got accepted into the standard, is liable to be actively misleading to 
anyone working on the implementation; sometimes the paper has multiple 
choices of which only one was accepted into the standard, or only some of 
the listed changes were accepted, or there were various subsequent 
features or fixes from various subsequent papers.  (By way of example, it 
would make more sense to list _BitInt as a single entry for a missing 
feature than to separately list N2763 and N2775 (accepted papers), while 
N2960, to be considered at the July meeting of WG14, makes further wording 
fixes but can't exactly be considered a feature in a sense that should be 
listed in such a table.)  Lots of papers are just cleanups, or 
clarification of wording, or fixes to issues with previous papers, such 
that it doesn't make sense to list them as implemented or not at all.

As usual there are also cases where a feature is implemented to the extent 
relevant for conformance but e.g. more optimizations (such as built-in 
functions) could be added.  And cases where some support in GCC should 
definitely be done to consider the feature implemented, even when not 
needed for conformance (e.g. the %wN, %wfN printf/scanf formats need 
implementing in glibc, and corresponding format checking support needs 
implementing in GCC).  There are also cases where a feature is 
substantially there but a more detailed review should be done for how it 
matches up to the standard version (e.g. the DFP support based on TR 
24732-2009 could do with such a detailed review for how it matches C2x 
requirements).

> +
> +  Binary literals
> +   href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2549.pdf;>N2549
> +  GCC 11
> +  
> +

This is an example of cases where the version where a feature was 
supported in GCC as an extension is long before the version where 
-pedantic knows about it being supported in a given standard version; 
listing the version with the -pedantic change in such cases may not be 
very helpful without noting when it was originally implemented.  There are 
probably other examples in the list.  (There are also examples where GCC 
supports the feature but hasn't yet had -pedantic updated accordingly, 
e.g. #warning.  And cases where it's largely supported but there are small 
differences in the standard version that still need implementing, e.g. 
typeof.)
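
For reference, a minimal C sketch of the binary-literal case.  The toy_*
names are made up for illustration; the only claim taken from the discussion
is that GCC accepted such literals as an extension well before they became a
C2x feature (N2549).

```c
#include <assert.h>

/* Binary integer literals: a GNU extension in older -std modes,
   a standard feature in C2x.  */
const unsigned toy_mask = 0b1010;

/* Return the low four bits of X.  */
unsigned
toy_low_nibble (unsigned x)
{
  return x & 0b1111;
}
```

In pre-C2x -std modes, -pedantic is expected to diagnose these literals as
an extension, which is the distinction drawn above between "supported" and
"known to -pedantic as part of a given standard version".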

> +
> +  What we think we reserve
> +   href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2572.pdf;>N2572
> +  ?
> +  
> +

This is an example of the many cases where it doesn't make sense to 
consider something as a feature with an "implemented" or "not implemented" 
state at all - so it doesn't belong in such a table at all.  There are 
many other such examples in the list.

-- 
Joseph S. Myers
jos...@codesourcery.com


[wwwdocs] Add C status page

2022-05-24 Thread Marek Polacek via Gcc-patches
I thought it'd be nice to have a table that documents our C support
status, like we have https://gcc.gnu.org/projects/cxx-status.html for C++.
We have https://gcc.gnu.org/c99status.html, but that's C99 only.

So here's a patch to add just that.  For C99, I used c99status.html but
added paper numbers (taken from https://clang.llvm.org/c_status.html).

I've filled in the availability status of some of the features, but not all;
I know next to nothing about the floating-point features.  I hope these
will get filled in in due time.

It would also be nice to have a C DR table, like we have for C++:
https://gcc.gnu.org/projects/cxx-dr-status.html
but that's a bigger task.

Validated.  Any comments before I push it?

diff --git a/htdocs/projects/c-status.html b/htdocs/projects/c-status.html
new file mode 100644
index ..cc240c07
--- /dev/null
+++ b/htdocs/projects/c-status.html
@@ -0,0 +1,995 @@
+
+
+
+
+  C Standards Support in GCC
+https://gcc.gnu.org/gcc.css; />
+
+
+
+  C Standards Support in GCC
+
+  GCC supports different dialects of C, corresponding to the multiple
+  published ISO standards.  Which standard it implements can be selected using
+  the -std= command-line option.
+
+  
+C89
+C99
+C11
+C17
+C2x
+  
+
+
+
+  C2x Support in GCC
+
+  GCC has experimental support for the next revision of the C
+  standard.
+
+  C2X features are available since GCC 9. To enable C2X support, add the
+  command-line parameter -std=c2x to your gcc command
+  line.  Or, to enable GNU extensions in addition to C2X features, add
+  -std=gnu2x.  These options are available since
+  GCC 9.
+
+  Important: Because the ISO C2X standard is still
+evolving, GCC's support is experimental.  No attempt will
+be made to maintain backward compatibility with implementations of C2X
+features that do not reflect the final standard.
+
+  C2X Language Features
+
+  
+
+  Language Feature
+  Proposal
+  Available in GCC?
+  Notes
+
+
+
+  Evaluation formats
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2186.pdf;>N2186
+  ?
+  
+
+
+  Clarifying the restrict Keyword v2
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2260.pdf;>N2660
+  ?
+  
+
+
+  Harmonizing static_assert with C++
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2265.pdf;>N2665
+  ?
+  
+
+
+  nodiscard attribute
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2667.pdf;>N2667
+  GCC 11
+  
+
+
+  maybe_unused attribute
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2670.pdf;>N2670
+  GCC 10
+  
+
+
+  TS 18661 Integration
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2314.pdf;>N2314
+ http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2341.pdf;>N2341
+ http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2401.pdf;>N2401
+ http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2359.pdf;>N2359
+ http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2546.pdf;>N2546
+  ?
+  This is supported at least partially.
+
+
+  Preprocessor line numbers unspecified
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2322.htm;>N2322
+  ?
+  
+
+
+  deprecated attribute
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2334.pdf;>N2334
+  GCC 10
+  
+
+
+  Attributes
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2335.pdf;>N2335
+ http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2554.pdf;>N2554
+  ?
+  
+
+
+  Defining new types in offsetof
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2350.htm;>N2350
+  ?
+  
+
+
+  fallthrough attribute
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2408.pdf;>N2408
+  GCC 10
+  
+
+
+  Two's complement sign representation
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2412.pdf;>N2412
+  ?
+  
+
+
+  Adding the u8 character prefix
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2418.pdf;>N2418
+  ?
+  
+
+
+  Remove support for function definitions with identifier lists
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2432.pdf;>N2432
+  ?
+  
+
+
+
+  *_IS_IEC_60559 feature test macros
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2379.htm;>N2379
+  ?
+  
+
+
+  Floating-point negation and conversion
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2416.pdf;>N2416
+  ?
+  
+
+
+  Annex F.8 update for implementation extensions and rounding
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2384.pdf;>N2384
+  ?
+  
+
+
+  _Bool definitions for true and false
+  http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2393.pdf;>N2393
+  ?
+  
+
+
+
+  [[nodiscard("should have 

[PATCH] sourcebuild.texi: Document new toplevel directories [PR82383]

2022-05-24 Thread Eric Gallager via Gcc-patches
This patch adds entries for the c++tools, gotools, libbacktrace,
libcc1, libcody, liboffloadmic, and libsanitizer directories into the
list of toplevel source directories in sourcebuild.texi. I also
removed the entry for boehm-gc (which is no longer in-tree), and fixed
the alphabetization for libquadmath while I was at it. Any style nits
I need to fix before committing (with a proper ChangeLog entry)?


patch-sourcebuild.texi.diff
Description: Binary data


Re: [patch] [wwwdocs]+[invoke.texi] Update GCN for gfx90a (was: Re: [committed] amdgcn: Add gfx90a support)

2022-05-24 Thread Tobias Burnus

On 24.05.22 18:44, Tobias Burnus wrote:

On 24.05.22 17:31, Andrew Stubbs wrote:

amdgcn: Add gfx90a support


Attached is an attempt to update invoke.texi
and to update gcc-13/changes.html. Regarding the latter, I have two
versions: the first is more readable; the second makes it clearer
where to use it, but reads much worse. Pick one or suggest a better
one.


[wwwdocs, only]: Actually, regarding the gcc-13/changes, I am wondering
whether the best choice is to use the first wording but link to the
-march= page. That's what the new variant now does – see attachment.

(The linked-to page is:
https://gcc.gnu.org/onlinedocs/gcc/AMD-GCN-Options.html and
gfx906/gfx90a is added by the patch in the previous email in this thread.)

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company; Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Register court: 
München, HRB 106955
gcc-13/changes.html: Add gfx90a to GCN

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 6c5b2a37..183a4bba 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -95,6 +95,13 @@ a work-in-progress.
 
 
 
+AMD Radeon (GCN)
+
+  Support for the Instinct MI200 series (https://gcc.gnu.org/onlinedocs/gcc/AMD-GCN-Options.html;>
+  gfx90a) has been added.
+
+
 
 
 


Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations

2022-05-24 Thread David Faust via Gcc-patches



On 5/24/22 09:03, Yonghong Song wrote:
> 
> 
> On 5/24/22 8:53 AM, David Faust wrote:
>>
>>
>> On 5/24/22 04:07, Jose E. Marchesi wrote:
>>>
 On 5/11/22 11:44 AM, David Faust wrote:
>
> On 5/10/22 22:05, Yonghong Song wrote:
>>
>>
>> On 5/10/22 8:43 PM, Yonghong Song wrote:
>>>
>>>
>>> On 5/6/22 2:18 PM, David Faust wrote:


 On 5/5/22 16:00, Yonghong Song wrote:
>
>
> On 5/4/22 10:03 AM, David Faust wrote:
>>
>>
>> On 5/3/22 15:32, Joseph Myers wrote:
>>> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>>>
 Consider the following example:

    #define __typetag1 __attribute__((btf_type_tag("tag1")))
    #define __typetag2 __attribute__((btf_type_tag("tag2")))
    #define __typetag3 __attribute__((btf_type_tag("tag3")))

    int __typetag1 * __typetag2 __typetag3 * g;

 The expected behavior is that 'g' is "a pointer with tags
 'tag2' and
 'tag3',
 to a pointer with tag 'tag1' to an int". i.e.:
>>>
>>> That's not a correct expectation for either GNU __attribute__ or
>>> C2x [[]]
>>> attribute syntax.  In either syntax, __typetag2 __typetag3 should
>>> apply to
>>> the type to which g points, not to g or its type, just as if
>>> you had a
>>> type qualifier there.  You'd need to put the attributes (or
>>> qualifier)
>>> after the *, not before, to make them apply to the pointer
>>> type.  See
>>> "Attribute Syntax" in the GCC manual for how the syntax is
>>> defined for
>>> GNU
>>> attributes and deduce in turn, for each subsequence of the tokens
>>> matching
>>> the syntax for some kind of declarator, what the type for "T D1"
>>> would be
>>> as defined there and in the C standard, as deduced from the type for
>>> "T D"
>>> for a sub-declarator D.
>>>
 But GCC's attribute parsing produces a variable 'g' which is "a
 pointer with tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an
 int", i.e.
>>>
>>> In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
>>> syntax it applies to int.  Again, if you wanted it to apply to the
>>> pointer
>>> type it would need to go after the * not before.
>>>
>>> If you are concerned with the fine details of what construct an
>>> attribute
>>> appertains to, I recommend using C2x syntax not GNU syntax.
>>>
>>
>> Joseph, thank you! This is very helpful. My understanding of
>> the syntax
>> was not correct.
>>
>> (Actually, I made a bad mistake in paraphrasing this example from the
>> discussion of it in the series cover letter. But, the reason
>> why it is
>> incorrect is the same.)
>>
>>
>> Yonghong, is the specific ordering an expectation in BPF programs or
>> other users of the tags?
>
> This is probably a wording issue. We have been saying that tags only
> apply to pointers. We should probably say they only apply to the pointee.
>
> $ cat t.c
> int const *ptr;
>
> the llvm ir debuginfo:
>
> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>
> We could replace 'const' with a tag like below:
>
> int __attribute__((btf_type_tag("tag"))) *ptr;
>
> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
> annotations: !7)
> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
> !7 = !{!8}
> !8 = !{!"btf_type_tag", !"tag"}
>
> In the above IR, we attach the annotations to the pointer_type because
> we didn't invent a new DI type to encode btf_type_tag. But it is
> totally okay to have IR that looks like
>
> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>
 OK, thanks.

 There is still the question of why the DWARF generated for this case
 that I have been concerned about:

      int __typetag1 * __typetag2 __typetag3 * g;

 differs between GCC (with this series) and clang. After studying it,
 GCC is doing with the attributes exactly as is described 

[PATCH] AArch64: Prioritise init_have_lse_atomics constructor [PR 105708]

2022-05-24 Thread Wilco Dijkstra via Gcc-patches

Increase the priority of the init_have_lse_atomics constructor so it runs
before other constructors. This improves chances that rr works when LSE
atomics are supported.

Regress and bootstrap pass, OK for commit?

2022-05-24  Wilco Dijkstra  

libgcc/
PR libgcc/105708
* config/aarch64/lse-init.c: Increase constructor priority.

---
diff --git a/libgcc/config/aarch64/lse-init.c b/libgcc/config/aarch64/lse-init.c
index fc875b7fe80e947623e570eac130e7a14b516551..92d91dfeed77f299aa610d72091499271490 100644
--- a/libgcc/config/aarch64/lse-init.c
+++ b/libgcc/config/aarch64/lse-init.c
@@ -38,7 +38,7 @@ _Bool __aarch64_have_lse_atomics
 
 unsigned long int __getauxval (unsigned long int);
 
-static void __attribute__((constructor))
+static void __attribute__((constructor (90)))
 init_have_lse_atomics (void)
 {
   unsigned long hwcap = __getauxval (AT_HWCAP);


[patch] [wwwdocs]+[invoke.texi] Update GCN for gfx90a (was: Re: [committed] amdgcn: Add gfx90a support)

2022-05-24 Thread Tobias Burnus

On 24.05.22 17:31, Andrew Stubbs wrote:

amdgcn: Add gfx90a support


Attached is an attempt to update invoke.texi

And to update the gcc-13/changes.html. Regarding the latter, I have two
versions: the first is more readable; the second makes clearer where
to use it, but reads much worse. Pick one or suggest a better one.

OK for the two patches?

Tobias

PS: I was thinking of mentioning that GCN now requires llvm-mc of LLVM
13.0.1 or higher (during build + installed as assembler) but I then
thought that changes.html is not the best place and there is an error
during build, stating what is needed.
-
GCN: Add gfx908/gfx90a to -march/-mtune in invoke.texi

gcc/
	* doc/invoke.texi (AMD GCN Options): Add gfx908/gfx90a.

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a2f85f0a4ea..521e4b65d3d 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -19740,7 +19740,6 @@ Set architecture type or tuning for @var{gpu}. Supported values for @var{gpu}
 are
 
 @table @samp
-@opindex fiji
 @item fiji
 Compile for GCN3 Fiji devices (gfx803).
 
@@ -19750,6 +19749,12 @@ Compile for GCN5 Vega 10 devices (gfx900).
 @item gfx906
 Compile for GCN5 Vega 20 devices (gfx906).
 
+@item gfx908
+Compile for CDNA1 Instinct MI100 devices (gfx908).
+
+@item gfx90a
+Compile for CDNA2 Instinct MI200 series devices (gfx90a).
+
 @end table
 
 @item -msram-ecc=on
gcc-13/changes.html: Add gfx90a to GCN

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 6c5b2a37..745aa65c 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -95,6 +95,14 @@ a work-in-progress.
 
 
 
+AMD Radeon (GCN)
+
+  Support for the Instinct MI200 series (gfx90a) has been
+  added.
+  The -march= and -mtune= flags now support
+  gfx90a for the Instinct MI200 series (gfx90a).
+
+
 
 
 


Re: [PATCH] aarch64: Fix pac-ret with unusual dwarf in libgcc unwinder [PR104689]

2022-05-24 Thread Szabolcs Nagy via Gcc-patches
The 05/13/2022 16:35, Richard Sandiford wrote:
> Szabolcs Nagy via Gcc-patches  writes:
> > The RA_SIGN_STATE dwarf pseudo-register is normally only set using the
> > DW_CFA_AARCH64_negate_ra_state (== DW_CFA_window_save) operation which
> > toggles the return address signedness state (the default state is 0).
> > (It may be set by remember/restore_state CFI too, those save/restore
> > the state of all registers.)
> >
> > However RA_SIGN_STATE can be set directly via DW_CFA_val_expression too.
> > GCC does not generate such CFI but some other compilers reportedly do.
> >
> > Note: the toggle operation must not be mixed with other dwarf register
> > rule CFI within the same CIE and FDE.
> >
> > In libgcc we assume REG_UNSAVED means the RA_STATE is set using toggle
> > operations, otherwise we assume its value is set by other CFI.
> 
> AFAIK, this is the first time I've looked at the RA_SIGN_STATE code,
> so this is probably a naive point/question, but: it seems a bit
> underhand for the existing code to be using REG_UNSAVED and
> loc.offset to hold the toggle state.  Would it make sense to add
> a new enum value for known, pre-evaluated constants?  _Unwind_GetGR
> would then DTRT for both cases.
> 
> That's a comment about the pre-existing code though.  I agree this
> patch looks like the right fix if we keep to the current approach.

yes, this is a hack. i looked at introducing a generic REG_*
enum to deal with RA_SIGN_STATE now, but it's a bit awkward:

normally frame state for a reg starts out REG_UNSAVED, which
should mean 0 value for the RA_SIGN_STATE pseudo register.

when moving up frames the uw context gets copied and updated
according to the frame state (where REG_UNSAVED normally means
unmodified copy), this is not right for RA_SIGN_STATE which
should be reset in the absence of related dwarf ops. we can
fix this up in target hooks for update context, but we still
have to special case REG_UNSAVED.

i think introducing a new REG_CONST does not simplify the
aarch64 target code (we might need further changes to get
a clean solution).


Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations

2022-05-24 Thread Yonghong Song via Gcc-patches




On 5/24/22 8:53 AM, David Faust wrote:



On 5/24/22 04:07, Jose E. Marchesi wrote:



On 5/11/22 11:44 AM, David Faust wrote:


On 5/10/22 22:05, Yonghong Song wrote:



On 5/10/22 8:43 PM, Yonghong Song wrote:



On 5/6/22 2:18 PM, David Faust wrote:



On 5/5/22 16:00, Yonghong Song wrote:



On 5/4/22 10:03 AM, David Faust wrote:



On 5/3/22 15:32, Joseph Myers wrote:

On Mon, 2 May 2022, David Faust via Gcc-patches wrote:


Consider the following example:

   #define __typetag1 __attribute__((btf_type_tag("tag1")))
   #define __typetag2 __attribute__((btf_type_tag("tag2")))
   #define __typetag3 __attribute__((btf_type_tag("tag3")))

   int __typetag1 * __typetag2 __typetag3 * g;

The expected behavior is that 'g' is "a pointer with tags
'tag2' and
'tag3',
to a pointer with tag 'tag1' to an int". i.e.:


That's not a correct expectation for either GNU __attribute__ or
C2x [[]]
attribute syntax.  In either syntax, __typetag2 __typetag3 should
apply to
the type to which g points, not to g or its type, just as if
you had a
type qualifier there.  You'd need to put the attributes (or
qualifier)
after the *, not before, to make them apply to the pointer
type.  See
"Attribute Syntax" in the GCC manual for how the syntax is
defined for
GNU
attributes and deduce in turn, for each subsequence of the tokens
matching
the syntax for some kind of declarator, what the type for "T D1"
would be
as defined there and in the C standard, as deduced from the type for
"T D"
for a sub-declarator D.

But GCC's attribute parsing produces a variable 'g' which is "a pointer
with tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an int", i.e.


In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
syntax it applies to int.  Again, if you wanted it to apply to the
pointer
type it would need to go after the * not before.

If you are concerned with the fine details of what construct an
attribute
appertains to, I recommend using C2x syntax not GNU syntax.



Joseph, thank you! This is very helpful. My understanding of
the syntax
was not correct.

(Actually, I made a bad mistake in paraphrasing this example from the
discussion of it in the series cover letter. But, the reason
why it is
incorrect is the same.)


Yonghong, is the specific ordering an expectation in BPF programs or
other users of the tags?


This is probably a wording issue. We have been saying that tags only
apply to pointers. We should probably say they only apply to the pointee.

$ cat t.c
int const *ptr;

the llvm ir debuginfo:

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
!6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

We could replace 'const' with a tag like below:

int __attribute__((btf_type_tag("tag"))) *ptr;

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
annotations: !7)
!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!7 = !{!8}
!8 = !{!"btf_type_tag", !"tag"}

In the above IR, we attach the annotations to the pointer_type because
we didn't invent a new DI type to encode btf_type_tag. But it is
totally okay to have IR that looks like

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
!11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)


OK, thanks.

There is still the question of why the DWARF generated for this case
that I have been concerned about:

     int __typetag1 * __typetag2 __typetag3 * g;

differs between GCC (with this series) and clang. After studying it,
GCC is doing with the attributes exactly as is described in the
Attribute Syntax portion of the GCC manual where the GNU syntax is
described. I do not think there is any problem here.

So the difference in DWARF suggests to me that clang is not handling
the GNU attribute syntax in this particular case correctly, since it
seems to be associating __typetag2 and __typetag3 to g's type rather
than the type to which it points.

I am not sure whether for the use purposes of the tags this difference
is very important, but it is worth noting.


As Joseph suggested, it may be better to encourage users of these tags
to use the C2x attribute syntax if they are concerned with precisely
which construct the tag applies.

This would also be a way around any issues in handling the attributes
due to the GNU syntax.

I tried a few test cases using C2x syntax BTF type tags with a
clang-15 build, but ran into some issues (in particular, some of the
tag attributes being ignored altogether). I couldn't find confirmation
whether C2x attribute syntax is fully supported in clang yet, so maybe
this isn't expected to work. Do you know whether the C2x syntax is
fully supported in clang yet?


Actually, I don't know either. But since the btf decl_tag and type_tag
are also used to compile linux kernel and the minimum compiler version
to compile kernel is gcc5.1 and 

Re: Back porting to GCC11/GCC12: Re: [patch][gcc13][i386][pr101891]Adjust -fzero-call-used-regs to always use XOR

2022-05-24 Thread Qing Zhao via Gcc-patches
Pushed to both gcc11 and gcc12.

thanks.

Qing

> On May 24, 2022, at 1:19 AM, Richard Biener  wrote:
> 
> On Mon, 23 May 2022, Qing Zhao wrote:
> 
>> Hi,
>> 
>> I have added the patch to GCC11 and GCC12 in my local area and bootstrapped
>> and regression tested on both x86 and aarch64, with no issues.
>> 
>> Can I commit it to both the GCC11 and GCC12 branches?
> 
> Yes.
> 
> Thanks,
> Richard.
> 
>> Thanks.
>> 
>> 
>> 
>> 
>>> On May 10, 2022, at 8:38 AM, Qing Zhao via Gcc-patches 
>>>  wrote:
>>> 
>>> 
>>> 
 On May 10, 2022, at 1:12 AM, Richard Biener  wrote:
 
 On Mon, 9 May 2022, Uros Bizjak wrote:
 
> On Mon, May 9, 2022 at 5:44 PM Qing Zhao  wrote:
>> 
>> Another question:
>> 
>> I think that this patch might need to be backported to GCC12 and GCC11.
>> 
>> What's your opinion on this?
> 
> It is not a regression, so following general rules, the patch should
> not be backported. OTOH, the patch creates functionally equivalent
> code, better in some security aspects. The functionality is also
> hidden behind some non-default flag, so I think if release managers
> (CC'd) are OK with the backport, I'd give it a technical approval.
> 
>> If so, when can I backport it?
> 
> Let's keep it in the mainline for a week or two, before backporting it
> to non-EoL branches.
 
 OK from my POV after a week or two on trunk.
>>> 
>>> Sure, I will do the back porting after two weeks.
>>> 
>>> thanks.
>>> 
>>> Qing
 
 Richard.
 
> Uros.
> 
>> 
>> thanks.
>> 
>> Qing
>> 
>>> On May 7, 2022, at 4:06 AM, Uros Bizjak  wrote:
>>> 
>>> On Fri, May 6, 2022 at 6:42 PM Qing Zhao  wrote:
 
 
 
> On May 6, 2022, at 10:58 AM, Uros Bizjak  wrote:
> 
> On Fri, May 6, 2022 at 4:29 PM Qing Zhao  wrote:
>> 
>> Hi,
>> 
>> As Kees requested in this PR:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101891
>> 
>> =
>> 
>> Currently -fzero-call-used-regs will use a pattern of:
>> 
>> XOR regA,regA
>> MOV regA,regB
>> MOV regA,regC
>> ...
>> RET
>> 
>> However, this introduces a register ordering dependency (e.g.
>> the CPU cannot clear regB without clearing regA first), and while it
>> greatly reduces available ROP gadgets, it does technically leave a
>> set of "MOV" ROP gadgets at the end of functions (e.g. "MOV
>> regA,regC; RET").
>> 
>> Please switch to always using XOR:
>> 
>> XOR regA,regA
>> XOR regB,regB
>> XOR regC,regC
>> ...
>> RET
>> 
>> ===
>> 
>> This patch switches all MOV to XOR on i386.
>> 
>> Bootstrapped and regression tested on x86_64-linux-gnu.
>> 
>> Okay for gcc13?
>> 
>> Thanks.
>> 
>> Qing
>> 
>> ==
> 
>> gcc/ChangeLog:
>> 
>> * config/i386/i386.cc (zero_all_mm_registers): Use SET to zero 
>> instead
>> of MOV for zeroing scratch registers.
>> (ix86_zero_call_used_regs): Likewise.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> * gcc.target/i386/zero-scratch-regs-1.c: Add -fno-stack-protector
>> -fno-PIC.
>> * gcc.target/i386/zero-scratch-regs-10.c: Adjust mov to xor.
>> * gcc.target/i386/zero-scratch-regs-13.c: Add -msse.
>> * gcc.target/i386/zero-scratch-regs-14.c: Adjust mov to xor.
>> * gcc.target/i386/zero-scratch-regs-15.c: Add -fno-stack-protector
>> -fno-PIC.
>> * gcc.target/i386/zero-scratch-regs-16.c: Likewise.
>> * gcc.target/i386/zero-scratch-regs-17.c: Likewise.
>> * gcc.target/i386/zero-scratch-regs-18.c: Add -fno-stack-protector
>> -fno-PIC, adjust mov to xor.
>> * gcc.target/i386/zero-scratch-regs-19.c: Add -fno-stack-protector
>> -fno-PIC.
>> * gcc.target/i386/zero-scratch-regs-2.c: Adjust mov to xor.
>> * gcc.target/i386/zero-scratch-regs-20.c: Add -msse.
>> * gcc.target/i386/zero-scratch-regs-21.c: Add -fno-stack-protector
>> -fno-PIC, Adjust mov to xor.
>> * gcc.target/i386/zero-scratch-regs-22.c: Adjust mov to xor.
>> * gcc.target/i386/zero-scratch-regs-23.c: Likewise.
>> * gcc.target/i386/zero-scratch-regs-26.c: Likewise.
>> * gcc.target/i386/zero-scratch-regs-27.c: Likewise.
>> * gcc.target/i386/zero-scratch-regs-28.c: Likewise.
>> * gcc.target/i386/zero-scratch-regs-3.c: Add -fno-stack-protector.
>> * gcc.target/i386/zero-scratch-regs-31.c: Adjust mov to xor.
>> * gcc.target/i386/zero-scratch-regs-4.c: Add -fno-stack-protector
>> 

Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations

2022-05-24 Thread David Faust via Gcc-patches



On 5/24/22 04:07, Jose E. Marchesi wrote:
> 
>> On 5/11/22 11:44 AM, David Faust wrote:
>>>
>>> On 5/10/22 22:05, Yonghong Song wrote:


 On 5/10/22 8:43 PM, Yonghong Song wrote:
>
>
> On 5/6/22 2:18 PM, David Faust wrote:
>>
>>
>> On 5/5/22 16:00, Yonghong Song wrote:
>>>
>>>
>>> On 5/4/22 10:03 AM, David Faust wrote:


 On 5/3/22 15:32, Joseph Myers wrote:
> On Mon, 2 May 2022, David Faust via Gcc-patches wrote:
>
>> Consider the following example:
>>
>>   #define __typetag1 __attribute__((btf_type_tag("tag1")))
>>   #define __typetag2 __attribute__((btf_type_tag("tag2")))
>>   #define __typetag3 __attribute__((btf_type_tag("tag3")))
>>
>>   int __typetag1 * __typetag2 __typetag3 * g;
>>
>> The expected behavior is that 'g' is "a pointer with tags
>> 'tag2' and
>> 'tag3',
>> to a pointer with tag 'tag1' to an int". i.e.:
>
> That's not a correct expectation for either GNU __attribute__ or
> C2x [[]]
> attribute syntax.  In either syntax, __typetag2 __typetag3 should
> apply to
> the type to which g points, not to g or its type, just as if
> you had a
> type qualifier there.  You'd need to put the attributes (or
> qualifier)
> after the *, not before, to make them apply to the pointer
> type.  See
> "Attribute Syntax" in the GCC manual for how the syntax is
> defined for
> GNU
> attributes and deduce in turn, for each subsequence of the tokens
> matching
> the syntax for some kind of declarator, what the type for "T D1"
> would be
> as defined there and in the C standard, as deduced from the type for
> "T D"
> for a sub-declarator D.
>
>> But GCC's attribute parsing produces a variable 'g' which is "a
>> pointer with tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an
>> int", i.e.
>
> In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
> syntax it applies to int.  Again, if you wanted it to apply to the
> pointer
> type it would need to go after the * not before.
>
> If you are concerned with the fine details of what construct an
> attribute
> appertains to, I recommend using C2x syntax not GNU syntax.
>

 Joseph, thank you! This is very helpful. My understanding of
 the syntax
 was not correct.

 (Actually, I made a bad mistake in paraphrasing this example from the
 discussion of it in the series cover letter. But, the reason
 why it is
 incorrect is the same.)


 Yonghong, is the specific ordering an expectation in BPF programs or
 other users of the tags?
>>>
>>> This is probably a wording issue. We have been saying that tags only
>>> apply to pointers. We should probably say they only apply to the pointee.
>>>
>>> $ cat t.c
>>> int const *ptr;
>>>
>>> the llvm ir debuginfo:
>>>
>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>>> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
>>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>
>>> We could replace 'const' with a tag like below:
>>>
>>> int __attribute__((btf_type_tag("tag"))) *ptr;
>>>
>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
>>> annotations: !7)
>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>> !7 = !{!8}
>>> !8 = !{!"btf_type_tag", !"tag"}
>>>
>>> In the above IR, we attach the annotations to the pointer_type because
>>> we didn't invent a new DI type to encode btf_type_tag. But it is
>>> totally okay to have IR that looks like
>>>
>>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
>>> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
>>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>>
>> OK, thanks.
>>
>> There is still the question of why the DWARF generated for this case
>> that I have been concerned about:
>>
>>     int __typetag1 * __typetag2 __typetag3 * g;
>>
>> differs between GCC (with this series) and clang. After studying it,
>> GCC is doing with the attributes exactly as is described in the
>> Attribute Syntax portion of the GCC manual where the GNU syntax is
>> described. I do not think there is any problem here.
>>
>> So the difference in DWARF suggests to me that clang is not handling
>> the GNU attribute syntax in this particular case correctly, since it
>> seems to be associating __typetag2 

Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations

2022-05-24 Thread Yonghong Song via Gcc-patches




On 5/24/22 4:07 AM, Jose E. Marchesi wrote:



On 5/11/22 11:44 AM, David Faust wrote:


On 5/10/22 22:05, Yonghong Song wrote:



On 5/10/22 8:43 PM, Yonghong Song wrote:



On 5/6/22 2:18 PM, David Faust wrote:



On 5/5/22 16:00, Yonghong Song wrote:



On 5/4/22 10:03 AM, David Faust wrote:



On 5/3/22 15:32, Joseph Myers wrote:

On Mon, 2 May 2022, David Faust via Gcc-patches wrote:


Consider the following example:

   #define __typetag1 __attribute__((btf_type_tag("tag1")))
   #define __typetag2 __attribute__((btf_type_tag("tag2")))
   #define __typetag3 __attribute__((btf_type_tag("tag3")))

   int __typetag1 * __typetag2 __typetag3 * g;

The expected behavior is that 'g' is "a pointer with tags
'tag2' and
'tag3',
to a pointer with tag 'tag1' to an int". i.e.:


That's not a correct expectation for either GNU __attribute__ or
C2x [[]]
attribute syntax.  In either syntax, __typetag2 __typetag3 should
apply to
the type to which g points, not to g or its type, just as if
you had a
type qualifier there.  You'd need to put the attributes (or
qualifier)
after the *, not before, to make them apply to the pointer
type.  See
"Attribute Syntax" in the GCC manual for how the syntax is
defined for
GNU
attributes and deduce in turn, for each subsequence of the tokens
matching
the syntax for some kind of declarator, what the type for "T D1"
would be
as defined there and in the C standard, as deduced from the type for
"T D"
for a sub-declarator D.

But GCC's attribute parsing produces a variable 'g' which is "a pointer
with tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an int", i.e.


In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
syntax it applies to int.  Again, if you wanted it to apply to the
pointer
type it would need to go after the * not before.

If you are concerned with the fine details of what construct an
attribute
appertains to, I recommend using C2x syntax not GNU syntax.



Joseph, thank you! This is very helpful. My understanding of
the syntax
was not correct.

(Actually, I made a bad mistake in paraphrasing this example from the
discussion of it in the series cover letter. But, the reason
why it is
incorrect is the same.)


Yonghong, is the specific ordering an expectation in BPF programs or
other users of the tags?


This is probably a wording issue. We have been saying that tags only
apply to pointers. We should probably say they only apply to the pointee.

$ cat t.c
int const *ptr;

the llvm ir debuginfo:

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
!6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
!7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)

We could replace 'const' with a tag like below:

int __attribute__((btf_type_tag("tag"))) *ptr;

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
annotations: !7)
!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
!7 = !{!8}
!8 = !{!"btf_type_tag", !"tag"}

In the above IR, we attach the annotations to the pointer_type because
we didn't invent a new DI type to encode btf_type_tag. But it is
totally okay to have IR that looks like

!5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
!11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
!6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)


OK, thanks.

There is still the question of why the DWARF generated for this case
that I have been concerned about:

     int __typetag1 * __typetag2 __typetag3 * g;

differs between GCC (with this series) and clang. After studying it,
GCC is doing with the attributes exactly as is described in the
Attribute Syntax portion of the GCC manual where the GNU syntax is
described. I do not think there is any problem here.

So the difference in DWARF suggests to me that clang is not handling
the GNU attribute syntax in this particular case correctly, since it
seems to be associating __typetag2 and __typetag3 to g's type rather
than the type to which it points.

I am not sure whether for the use purposes of the tags this difference
is very important, but it is worth noting.


As Joseph suggested, it may be better to encourage users of these tags
to use the C2x attribute syntax if they are concerned with precisely
which construct the tag applies.

This would also be a way around any issues in handling the attributes
due to the GNU syntax.

I tried a few test cases using C2x syntax BTF type tags with a
clang-15 build, but ran into some issues (in particular, some of the
tag attributes being ignored altogether). I couldn't find confirmation
whether C2x attribute syntax is fully supported in clang yet, so maybe
this isn't expected to work. Do you know whether the C2x syntax is
fully supported in clang yet?


Actually, I don't know either. But since the btf decl_tag and type_tag
are also used to compile linux kernel and the minimum compiler version
to compile kernel is gcc5.1 and clang11. I am not sure whether gcc5.1

Re: [PATCH] Modula-2: merge proposal/review: 1/9 01.patch-set-01

2022-05-24 Thread Gaius Mulley via Gcc-patches
Richard Biener  writes:

> On Sat, May 21, 2022 at 3:11 AM Gaius Mulley  wrote:
>>
>>
>> Hi,
>>
>> Gaius wrote:
>>
>> > the changes do raise questions.  The reason for the changes here are to
>> > allow easy linking for modula-2 users.
>>
>> >  $ gm2 hello.mod
>>
>> > for example will compile and link with all dependent modules (dependants
>> > are generated by analysing module source imports).  The gm2 driver will
>> > add objects and libraries to the link.
>>
>> in more detail the gm2 driver does the following:
>>
>>   $ gm2 -v hello.mod
>>
>> full output below, but to summarise and annotate:
>>
>>  # cc1gm2 generates an assembler file from hello.mod
>>  as --64 /tmp/cc8BoL3d.s -o hello.o
>>
>>  # gm2l generates a list of all dependent modules from parsing all imports
>>  /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/gm2l -v \
>>  -I/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim -o \
>>  /tmp/ccSMojUb.l hello.mod
>>
>>  # gm2lorder reorders the critical runtime modules
>>  /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/gm2lorder \
>> /tmp/ccSMojUb.l -o /tmp/ccHDRdde.lst
>>
>>  # gm2lgen generates a C++ scaffold from the reordered module list
>>  /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/gm2lgen -fcpp \
>> /tmp/ccHDRdde.lst -o a-hello_m2.cpp
>>
>>  # cc1plus compiles the scaffold
>>  /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/cc1plus -v \
>>  -mtune=generic -march=x86-64 \
>>  -I/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim \
>>  -quiet a-hello_m2.cpp -o a-hello_m2.s
>>  as --64 a-hello_m2.s -o a-hello_m2.o
>>
>>  # gm2lcc creates an archive from the list of modules and the scaffold
>> /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/gm2lcc \
>>   -L/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim \
>>   -ftarget-ar=/usr/bin/ar -ftarget-ranlib=/usr/bin/ranlib \
>> -fobject-path=/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim \
>>   --exec --startup a-hello_m2.o --ar -o /tmp/ccNJ60fa.a --mainobject \
>>   a-hello_m2.o /tmp/ccHDRdde.lst
>>
>> /usr/bin/ar rc /tmp/ccNJ60fa.a  hello.o a-hello_m2.o
>> /usr/bin/ranlib /tmp/ccNJ60fa.a
>>
>> # finally collect2 performs the link from the archive and any default
>>   libraries
>>
>> hope this helps
>
> Yes, it does.  So historically when there was complex massaging required
> like this it was offloaded to a "helper driver".  With -flto there's
> lto-wrapper (but here invoked by the linker), with ada there's gnatmake
> and others
> and with certain targets collect2 does extra processing producing global
> CTORs (or for C++ with -frepo even invoked additional compilations).

Hi Richard,

interesting, thank you for the information about the different languages (yes
I recall years ago the lang-specs used to be much more complex).  Global
CTORs might be helpful, although I wonder if they are overly
complex for modula-2?  (As we still need to control order).  I guess
there would be pros/cons for multi-lingual projects.

I wonder whether it could be resolved in the modula-2 front end by
placing the scaffold generation inside cc1gm2 (which is generated when a
program module is seen - and/or perhaps forced by a switch if a user
really wanted to link an implementation module as the application,
say by -fm2-scaffold).  So by default the proposal would be that

  $ gm2 -c programmodule.mod

generates programmodule.o (which contains main and a scaffold to
construct/deconstruct every module as well as the user level code).
There could be a switch to emit the scaffold in C or C++ should users
want to interfere.

Overall much of the modula-2 code inside /gm2tools would go into cc1gm2
and many hundreds of lines of C code would disappear from the 'gm2' driver
and the code base would clean up :-).  In the process it would become
more like the other GCC language drivers.

Some of the gm2 link options could be passed into cc1gm2 (those forcing
the order of module initialization and user specified scaffold list
override).  The make dependencies could also be emitted if required
as cc1gm2 now has knowledge of all imports.

> I do think that this might all belong into the main driver code but then
> maybe all the different language compilation models will just make that
> very hard to maintain.

indeed, I can see it could become problematic, which makes the above quite
attractive, maybe?

> As for modula-2, does
>
> $ gm2 -c hello.mod

$ ~/opt/bin/gm2 -c hello.mod

> $ gm2 hello.o

$ ~/opt/bin/gm2 hello.o
/usr/bin/ld: /lib/x86_64-linux-gnu/crt1.o: in function `_start':
(.text+0x20): undefined reference to `main'
collect2: error: ld returned 1 exit status

alas no as there is no scaffold inside hello.o

> "work"?  And what intermediate files are build systems expecting to
> prevail?  Like for C/C++ code and GNU make there's the preprocessor
> driven dependence generation, but otherwise a single TU usually
> produces a single object file.  OTOH for GFortran a single TU might
> produce multiple .mod files for 

[committed] amdgcn: Add gfx90a support

2022-05-24 Thread Andrew Stubbs

I've committed this patch to add support for gfx90a AMD GPU devices.

The patch updates all the places that have architecture/ISA specific 
code, tidies up the ISA naming and handling in the backend, and adds a 
new multilib.


This is just lightly tested at this point, but there are no known issues 
and it shouldn't break anything for other architectures.


Andrew

amdgcn: Add gfx90a support

This adds architecture options and multilibs for the AMD GFX90a GPUs.
It also tidies up some of the ISA selection code, and corrects a few small
mistakes in the gfx908 naming.
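
As an illustration of the ISA tidy-up, the following sketch shows why an ordered enum is convenient: "this ISA or newer" tests become plain comparisons. The enumerator names follow the ChangeLog below; the helper functions isa_for_processor and isa_at_least are hypothetical stand-ins for the backend's TARGET_* macros, not the committed code.

```cpp
// Processor names as selected on the command line.
enum processor_type {
  PROCESSOR_FIJI,
  PROCESSOR_VEGA10,
  PROCESSOR_VEGA20,
  PROCESSOR_GFX908,
  PROCESSOR_GFX90a
};

// ISA generations, listed oldest to newest so that >= means "or later".
enum isa_kind { ISA_UNKNOWN, ISA_GCN3, ISA_GCN5, ISA_CDNA1, ISA_CDNA2 };

// Hypothetical helper: map a processor to its ISA generation.
inline isa_kind isa_for_processor(processor_type p) {
  switch (p) {
    case PROCESSOR_FIJI:   return ISA_GCN3;
    case PROCESSOR_VEGA10: return ISA_GCN5;
    case PROCESSOR_VEGA20: return ISA_GCN5;
    case PROCESSOR_GFX908: return ISA_CDNA1;
    case PROCESSOR_GFX90a: return ISA_CDNA2;
  }
  return ISA_UNKNOWN;
}

// Hypothetical helper: the "_PLUS" style test, relying on enum order.
inline bool isa_at_least(isa_kind isa, isa_kind minimum) {
  return isa >= minimum;
}
```

The ordering is the design choice to watch: inserting a new ISA in the wrong position would silently change every "or later" test.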

gcc/ChangeLog:

* config.gcc (amdgcn): Accept --with-arch=gfx908 and gfx90a.
* config/gcn/gcn-opts.h (enum gcn_isa): New.
(TARGET_GCN3): Use enum gcn_isa.
(TARGET_GCN3_PLUS): Likewise.
(TARGET_GCN5): Likewise.
(TARGET_GCN5_PLUS): Likewise.
(TARGET_CDNA1): New.
(TARGET_CDNA1_PLUS): New.
(TARGET_CDNA2): New.
(TARGET_CDNA2_PLUS): New.
(TARGET_M0_LDS_LIMIT): New.
(TARGET_PACKED_WORK_ITEMS): New.
* config/gcn/gcn.cc (gcn_isa): Change to enum gcn_isa.
(gcn_option_override): Recognise CDNA ISA variants.
(gcn_omp_device_kind_arch_isa): Support gfx90a.
(gcn_expand_prologue): Make m0 init optional.
Add support for packed work items.
(output_file_start): Support gfx90a.
(gcn_hsa_declare_function_name): Support gfx90a metadata.
* config/gcn/gcn.h (TARGET_CPU_CPP_BUILTINS): Add __CDNA1__ and
__CDNA2__.
* config/gcn/gcn.md (mulsi3_highpart): Use TARGET_GCN5_PLUS.
(mulsi3_highpart_imm): Likewise.
(mulsidi3): Likewise.
(mulsidi3_imm): Likewise.
* config/gcn/gcn.opt (gpu_type): Add gfx90a.
* config/gcn/mkoffload.cc (EF_AMDGPU_MACH_AMDGCN_GFX90a): New.
(main): Support gfx90a.
* config/gcn/t-gcn-hsa: Add gfx90a multilib.
* config/gcn/t-omp-device: Add gfx90a isa.

libgomp/ChangeLog:

* plugin/plugin-gcn.c (EF_AMDGPU_MACH): Add
EF_AMDGPU_MACH_AMDGCN_GFX90a.
(gcn_gfx90a_s): New.
(isa_hsa_name): Support gfx90a.
(isa_code): Likewise.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 600ac357366..cdbefb5b4f5 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4522,7 +4522,7 @@ case "${target}" in
for which in arch tune; do
eval "val=\$with_$which"
case ${val} in
-   "" | fiji | gfx900 | gfx906 )
+   "" | fiji | gfx900 | gfx906 | gfx908 | gfx90a)
# OK
;;
*)
diff --git a/gcc/config/gcn/gcn-opts.h b/gcc/config/gcn/gcn-opts.h
index c0805241bc5..b62dfb45f59 100644
--- a/gcc/config/gcn/gcn-opts.h
+++ b/gcc/config/gcn/gcn-opts.h
@@ -23,16 +23,30 @@ enum processor_type
   PROCESSOR_FIJI,// gfx803
   PROCESSOR_VEGA10,  // gfx900
   PROCESSOR_VEGA20,  // gfx906
-  PROCESSOR_GFX908   // as yet unnamed
+  PROCESSOR_GFX908,
+  PROCESSOR_GFX90a
 };
 
 /* Set in gcn_option_override.  */
-extern int gcn_isa;
-
-#define TARGET_GCN3 (gcn_isa == 3)
-#define TARGET_GCN3_PLUS (gcn_isa >= 3)
-#define TARGET_GCN5 (gcn_isa == 5)
-#define TARGET_GCN5_PLUS (gcn_isa >= 5)
+extern enum gcn_isa {
+  ISA_UNKNOWN,
+  ISA_GCN3,
+  ISA_GCN5,
+  ISA_CDNA1,
+  ISA_CDNA2
+} gcn_isa;
+
+#define TARGET_GCN3 (gcn_isa == ISA_GCN3)
+#define TARGET_GCN3_PLUS (gcn_isa >= ISA_GCN3)
+#define TARGET_GCN5 (gcn_isa == ISA_GCN5)
+#define TARGET_GCN5_PLUS (gcn_isa >= ISA_GCN5)
+#define TARGET_CDNA1 (gcn_isa == ISA_CDNA1)
+#define TARGET_CDNA1_PLUS (gcn_isa >= ISA_CDNA1)
+#define TARGET_CDNA2 (gcn_isa == ISA_CDNA2)
+#define TARGET_CDNA2_PLUS (gcn_isa >= ISA_CDNA2)
+
+#define TARGET_M0_LDS_LIMIT (TARGET_GCN3)
+#define TARGET_PACKED_WORK_ITEMS (TARGET_CDNA2_PLUS)
 
 enum sram_ecc_type
 {
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 39a7a966502..5e75a1b63aa 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -66,7 +66,7 @@ static bool ext_gcn_constants_init = 0;
 
 /* Holds the ISA variant, derived from the command line parameters.  */
 
-int gcn_isa = 3;   /* Default to GCN3.  */
+enum gcn_isa gcn_isa = ISA_GCN3;   /* Default to GCN3.  */
 
 /* Reserve this much space for LDS (for propagating variables from
worker-single mode to worker-partitioned mode), per workgroup.  Global
@@ -129,7 +129,13 @@ gcn_option_override (void)
   if (!flag_pic)
 flag_pic = flag_pie;
 
-  gcn_isa = gcn_arch == PROCESSOR_FIJI ? 3 : 5;
+  gcn_isa = (gcn_arch == PROCESSOR_FIJI ? ISA_GCN3
+  : gcn_arch == PROCESSOR_VEGA10 ? ISA_GCN5
+  : gcn_arch == PROCESSOR_VEGA20 ? ISA_GCN5
+  : gcn_arch == PROCESSOR_GFX908 ? ISA_CDNA1
+  : gcn_arch == PROCESSOR_GFX90a ? ISA_CDNA2
+  : ISA_UNKNOWN);
+  gcc_assert (gcn_isa != ISA_UNKNOWN);
 
   /* The default stack size needs to be small for offload kernels 

[committed] amdgcn: Remove LLVM 9 assembler/linker support

2022-05-24 Thread Andrew Stubbs
I've committed this patch to set the minimum required LLVM version, for 
the assembler and linker, to 13.0.1. An upgrade from LLVM 9 is a 
prerequisite for the gfx90a support, and 13.0.1 is now the oldest 
version not known to have compatibility issues.


The patch removes all the obsolete feature detection tests from 
configure and adds a new version test. Likewise the version dependencies 
in the backend are removed.
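
The version test itself lives in configure (shell/m4), but its logic can be sketched in C++ as a component-wise comparison against the 13.0.1 minimum. Everything here (Version, parse_version, llvm_tools_supported) is a hypothetical illustration and not the actual configure check.

```cpp
#include <cstdio>
#include <tuple>

// Three-part version number, e.g. 13.0.1.
struct Version {
  int maj;
  int mnr;
  int pat;
};

// Parse "X.Y.Z"; returns false if the text does not have three numeric parts.
inline bool parse_version(const char *text, Version &out) {
  return std::sscanf(text, "%d.%d.%d", &out.maj, &out.mnr, &out.pat) == 3;
}

// Lexicographic comparison: major first, then minor, then patch level.
inline bool version_at_least(const Version &v, const Version &minimum) {
  return std::tie(v.maj, v.mnr, v.pat) >=
         std::tie(minimum.maj, minimum.mnr, minimum.pat);
}

// Accept only assembler/linker versions at or above the stated minimum.
inline bool llvm_tools_supported(const char *text) {
  Version v{};
  const Version minimum{13, 0, 1};
  return parse_version(text, v) && version_at_least(v, minimum);
}
```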


Andrew

amdgcn: Remove LLVM 9 assembler/linker support

The minimum required LLVM version is now 13.0.1, and is enforced by configure.

gcc/ChangeLog:

* config.in: Regenerate.
* config/gcn/gcn-hsa.h (X_FIJI): Delete.
(X_900): Delete.
(X_906): Delete.
(X_908): Delete.
(S_FIJI): Delete.
(S_900): Delete.
(S_906): Delete.
(S_908): Delete.
(NO_XNACK): New macro.
(NO_SRAM_ECC): New macro.
(SRAMOPT): Keep only v4 variant.
(HSACO3_SELECT_OPT): Delete.
(DRIVER_SELF_SPECS): Delete.
(ASM_SPEC): Remove LLVM 9 support.
* config/gcn/gcn-valu.md
(gather_insn_2offsets): Remove assembler bug workaround.
(scatter_insn_2offsets): Likewise.
* config/gcn/gcn.cc (output_file_start): Remove LLVM 9 support.
(print_operand_address): Remove assembler bug workaround.
* config/gcn/mkoffload.cc (EF_AMDGPU_XNACK_V3): Delete.
(EF_AMDGPU_SRAM_ECC_V3): Delete.
(SET_XNACK_ON): Delete v3 variants.
(SET_XNACK_OFF): Delete v3 variants.
(TEST_XNACK): Delete v3 variants.
(SET_SRAM_ECC_ON): Delete v3 variants.
(SET_SRAM_ECC_ANY): Delete v3 variants.
(SET_SRAM_ECC_OFF): Delete v3 variants.
(SET_SRAM_ECC_UNSUPPORTED): Delete v3 variants.
(TEST_SRAM_ECC_ANY): Delete v3 variants.
(TEST_SRAM_ECC_ON): Delete v3 variants.
(copy_early_debug_info): Remove v3 support.
(main): Remove v3 support.
* configure: Regenerate.
* configure.ac: Replace all GCN feature checks with a version check.

diff --git a/gcc/config.in b/gcc/config.in
index 64c27c9cfac..6a4f8856c4f 100644
--- a/gcc/config.in
+++ b/gcc/config.in
@@ -1331,13 +1331,6 @@
 #endif
 
 
-/* Define if your Arm assembler permits context-specific feature extensions.
-   */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GAS_ARM_EXTENDED_ARCH
-#endif
-
-
 /* Define if your assembler supports .balign and .p2align. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_GAS_BALIGN_AND_P2ALIGN
@@ -1457,72 +1450,6 @@
 #endif
 
 
-/* Define if your assembler has fixed global_load functions. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_ASM_GLOBAL_LOAD_FIXED
-#endif
-
-
-/* Define if your assembler expects amdgcn_target gfx908+xnack syntax. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_ASM_V3_SYNTAX
-#endif
-
-
-/* Define if your assembler expects amdgcn_target gfx908:xnack+ syntax. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_ASM_V4_SYNTAX
-#endif
-
-
-/* Define if your assembler allows -mattr=+sramecc for fiji. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_SRAM_ECC_FIJI
-#endif
-
-
-/* Define if your assembler allows -mattr=+sramecc for gfx900. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_SRAM_ECC_GFX900
-#endif
-
-
-/* Define if your assembler allows -mattr=+sramecc for gfx906. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_SRAM_ECC_GFX906
-#endif
-
-
-/* Define if your assembler allows -mattr=+sramecc for gfx908. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_SRAM_ECC_GFX908
-#endif
-
-
-/* Define if your assembler allows -mattr=+xnack for fiji. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_XNACK_FIJI
-#endif
-
-
-/* Define if your assembler allows -mattr=+xnack for gfx900. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_XNACK_GFX900
-#endif
-
-
-/* Define if your assembler allows -mattr=+xnack for gfx906. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_XNACK_GFX906
-#endif
-
-
-/* Define if your assembler allows -mattr=+xnack for gfx908. */
-#ifndef USED_FOR_TARGET
-#undef HAVE_GCN_XNACK_GFX908
-#endif
-
-
 /* Define to 1 if you have the `getchar_unlocked' function. */
 #ifndef USED_FOR_TARGET
 #undef HAVE_GETCHAR_UNLOCKED
@@ -2208,6 +2135,12 @@
 #endif
 
 
+/* Define which stat syscall is able to handle 64bit indodes. */
+#ifndef USED_FOR_TARGET
+#undef HOST_STAT_FOR_64BIT_INODES
+#endif
+
+
 /* Define as const if the declaration of iconv() needs const. */
 #ifndef USED_FOR_TARGET
 #undef ICONV_CONST
diff --git a/gcc/config/gcn/gcn-hsa.h b/gcc/config/gcn/gcn-hsa.h
index 9b5fee9f7d4..b3079cebb43 100644
--- a/gcc/config/gcn/gcn-hsa.h
+++ b/gcc/config/gcn/gcn-hsa.h
@@ -75,68 +75,19 @@ extern unsigned int gcn_local_sym_hash (const char *name);
supported for gcn.  */
 #define GOMP_SELF_SPECS ""
 
-#ifdef HAVE_GCN_XNACK_FIJI
-#define X_FIJI
-#else
-#define X_FIJI "!march=*:;march=fiji:;"
-#endif
-#ifdef HAVE_GCN_XNACK_GFX900
-#define X_900
-#else
-#define X_900 "march=gfx900:;"
-#endif
-#ifdef HAVE_GCN_XNACK_GFX906
-#define X_906
-#else
-#define X_906 

Re: [PATCH 05/10] d: add 'final' and 'override' to gcc/d/*.cc 'visit' impls

2022-05-24 Thread Iain Buclaw via Gcc-patches
Excerpts from David Malcolm's message of May 24, 2022 3:15 pm:
> On Tue, 2022-05-24 at 14:56 +0200, Iain Buclaw wrote:
>> Excerpts from David Malcolm via Gcc-patches's message of May 23, 2022
>> 9:28 pm:
>> > gcc/d/ChangeLog:
>> > * decl.cc: Add "final" and "override" to all "visit" vfunc
>> > decls
>> > as appropriate.
>> > * expr.cc: Likewise.
>> > * toir.cc: Likewise.
>> > * typeinfo.cc: Likewise.
>> > * types.cc: Likewise.
>> > 
>> > Signed-off-by: David Malcolm 
>> 
>> 
>> Thanks David!
>> 
>> Looks OK to me.
>> 
>> Iain.
> 
> Thanks; I've pushed it to trunk as r13-736-g442cf0977a2993.
> 
> FWIW, to repeat something I said in the cover letter, I tried hacking -
> Werror=suggest-override into the Makefile whilst I was creating the
> patches, and IIRC there were a bunch of them in the gcc/d/dmd
> subdirectory - but that code is copied from the D community upstream,
> right?
> So maybe if that D parser C++ code requires a C++11 compiler, perhaps
> they might want to add "final" and "override" specifiers to it as
> appropriate, to better document the intent of the decls?
> 

The D parser code is written in D, but most of it is marked
"extern(C++)" so that the code generator can interface with it.  It is
already a hard requirement in D that all overridden methods have the
"override" keyword, and "final" is already tacked on most places, so for
the most part it is already there, just hasn't been mirrored to the C++
interfaces in the headers.
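
For reference, this is roughly what mirroring those specifiers on the C++ side looks like. The classes below are made up for illustration and are not taken from the dmd headers.

```cpp
// Hypothetical visitor base class with a virtual 'visit' entry point.
struct Visitor {
  virtual ~Visitor() = default;
  virtual int visit(int /*node*/) { return 0; }
};

// 'override' makes the compiler check that a base-class virtual with this
// exact signature exists; 'final' forbids further overriding.
struct ExprVisitor final : Visitor {
  int visit(int node) final override { return node + 1; }
};
```

If the base signature later changes, the 'override' turns a silent non-override into a compile error, which is the intent-documenting benefit mentioned in the quoted message.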

I'll have a look into it.

Iain.


Re: [PATCH] configure: cache result of "sys/sdt.h" header check

2022-05-24 Thread Eric Gallager via Gcc-patches
On Thu, Mar 24, 2022 at 9:27 AM David Seifert via Gcc-patches
 wrote:
>
> On Mon, 2022-03-14 at 18:38 +0100, David Seifert wrote:
> > Use AC_CACHE_CHECK to store the result of the header check for
> > systemtap's "sys/sdt.h", which is similar in spirit to libstdc++'s
> > AC_CACHE_CHECK(..., glibcxx_cv_sys_sdt_h).
> >
> > gcc/
> > * configure.ac: Add AC_CACHE_CHECK(..., gcc_cv_sys_sdt_h).
> > * configure: Regenerate.
> > ---
> >  gcc/configure| 20 +++-
> >  gcc/configure.ac | 16 +---
> >  2 files changed, 24 insertions(+), 12 deletions(-)
> >
> > diff --git a/gcc/configure b/gcc/configure
> > index 14b19c8fe0c..1dfc5cc7344 100755
> > --- a/gcc/configure
> > +++ b/gcc/configure
> > @@ -31389,15 +31389,25 @@ fi
> >
> >  { $as_echo "$as_me:${as_lineno-$LINENO}: checking sys/sdt.h in the
> > target C library" >&5
> >  $as_echo_n "checking sys/sdt.h in the target C library... " >&6; }
> > -have_sys_sdt_h=no
> > -if test -f $target_header_dir/sys/sdt.h; then
> > -  have_sys_sdt_h=yes
> > +if ${gcc_cv_sys_sdt_h+:} false; then :
> > +  $as_echo_n "(cached) " >&6
> > +else
> > +
> > +  gcc_cv_sys_sdt_h=no
> > +  if test -f $target_header_dir/sys/sdt.h; then
> > +gcc_cv_sys_sdt_h=yes
> > +  fi
> > +
> > +fi
> > +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $gcc_cv_sys_sdt_h"
> > >&5
> > +$as_echo "$gcc_cv_sys_sdt_h" >&6; }
> > +if test x$gcc_cv_sys_sdt_h = xyes; then :
> > +
> >
> >  $as_echo "#define HAVE_SYS_SDT_H 1" >>confdefs.h
> >
> > +
> >  fi
> > -{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $have_sys_sdt_h" >&5
> > -$as_echo "$have_sys_sdt_h" >&6; }
> >
> >  # Check if TFmode long double should be used by default or not.
> >  # Some glibc targets used DFmode long double, but with glibc 2.4
> > diff --git a/gcc/configure.ac b/gcc/configure.ac
> > index 90cad309762..c86ce5e0a9b 100644
> > --- a/gcc/configure.ac
> > +++ b/gcc/configure.ac
> > @@ -6904,14 +6904,16 @@ AC_SUBST([enable_default_ssp])
> >
> >  # Test for <sys/sdt.h> on the target.
> >  GCC_TARGET_TEMPLATE([HAVE_SYS_SDT_H])
> > -AC_MSG_CHECKING(sys/sdt.h in the target C library)
> > -have_sys_sdt_h=no
> > -if test -f $target_header_dir/sys/sdt.h; then
> > -  have_sys_sdt_h=yes
> > -  AC_DEFINE(HAVE_SYS_SDT_H, 1,
> > +AC_CACHE_CHECK([sys/sdt.h in the target C library],
> > [gcc_cv_sys_sdt_h], [
> > +  gcc_cv_sys_sdt_h=no
> > +  if test -f $target_header_dir/sys/sdt.h; then
> > +gcc_cv_sys_sdt_h=yes
> > +  fi
> > +])
> > +AS_IF([test x$gcc_cv_sys_sdt_h = xyes], [
> > +  AC_DEFINE([HAVE_SYS_SDT_H], [1],
> >  [Define if your target C library provides sys/sdt.h])
> > -fi
> > -AC_MSG_RESULT($have_sys_sdt_h)
> > +])
> >
> >  # Check if TFmode long double should be used by default or not.
> >  # Some glibc targets used DFmode long double, but with glibc 2.4
>
> Ping, I think we agreed making this a cache variable is fine (which is
> how libstdc++ also does it).
>

So wait, does this supersede the one adding a configure flag for it?
i.e.: https://gcc.gnu.org/pipermail/gcc-patches/2022-March/591704.html


Re: [PATCH v2 09/11] OpenMP 5.0 "declare mapper" support for C++

2022-05-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 18, 2022 at 09:26:50AM -0700, Julian Brown wrote:
> This patch implements OpenMP 5.0 "declare mapper" support for C++ --
> except for arrays of structs with mappers, which are TBD. I've taken cues
> from the existing "declare reduction" support where appropriate, though
> obviously the details of implementation differ somewhat (in particular,
> "declare mappers" must survive longer, until gimplification time).
> 
> Both named and unnamed (default) mappers are supported, and both
> explicitly-mapped structures and implicitly-mapped struct-typed variables
> used within an offload region are supported. For the latter, we need a
> way to communicate to the middle end which mapper (visible, in-scope) is
> the right one to use -- for that, we scan the target body in the front
> end for uses of structure (etc.) types, and create artificial "mapper
> binding" clauses to associate types with visible mappers. (It doesn't
> matter too much if we create these mapper bindings a bit over-eagerly,
> since they will only be used if needed later during gimplification.)
> 
> Another difficulty concerns the order in which an OpenMP offload region
> body's clauses are processed relative to its body: in order to add
> clauses for instantiated mappers, we need to have processed the body
> already in order to find which variables have been used, but at present
> the region's clauses are processed strictly before the body. So, this
> patch adds a second clause-processing step after gimplification of the
> body in order to add clauses resulting from instantiated mappers.
> 
> This version of the patch improves detection of explicitly-mapped struct
> accesses which inhibit implicitly-triggered user-defined mappers for a
> target region.

Will start with a general comment: from looking at the dumps, it seems
that handling the mappers in the FE right away for explicit mapping clauses,
attaching mapper binding clauses for types that are (or could
conservatively be, including from the recursive mappers themselves)
used in the target body, and letting gimplification find those vars in detail
and use mapper binding clauses to actually expand them looks like the right
approach to me.  As I raised in an earlier patch, a big question is if we
should do map clause sorting on gimplify_scan_omp_clauses or
gimplify_adjust_omp_clauses or both...
The conservative discovery of what types we might need to create mapper
binding clauses for should probably be done only if
!processing_template_decl.
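
For readers who have not used the feature, a minimal source-level sketch of an unnamed (default) mapper follows. The struct and function names are invented for illustration. Whether a given compiler accepts the directive under -fopenmp depends on its OpenMP 5.0 support; built without OpenMP the pragmas are ignored and the loop simply runs on the host.

```cpp
#include <cstddef>

struct Vec {
  std::size_t len;
  double *data;
};

// Default mapper for Vec: map the struct itself plus the array it points to.
#pragma omp declare mapper(Vec v) map(v, v.data[0:v.len])

inline double sum_elements(Vec v) {
  double total = 0.0;
  // 'v' is mapped explicitly here, so the mapper above would be applied to it;
  // an implicitly mapped Vec used in the region is the other case discussed.
  #pragma omp target map(tofrom: v) map(tofrom: total)
  for (std::size_t i = 0; i < v.len; ++i)
    total += v.data[i];
  return total;
}
```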

One question is though if DECL_OMP_DECLARE_MAPPER_P should be a magic
FUNCTION_DECL or a magic TREE_STATIC VAR_DECL or say CONST_DECLs.
The reason for the choice of FUNCTION_DECLs for UDRs is that they actually
contain code, but for UDMs we don't need any code, all we need is some
decl to which we can somehow attach list of clauses and a placeholder
decl used in them.  Perhaps a magic VAR_DECL or CONST_DECL would be
cheaper than a FUNCTION_DECL...

> @@ -32193,6 +32197,16 @@ cp_parser_late_parsing_for_member (cp_parser* 
> parser, tree member_function)
> finish_function (/*inline_p=*/true);
> cp_check_omp_declare_reduction (member_function);
>   }
> +  else if (DECL_OMP_DECLARE_MAPPER_P (member_function))
> + {
> +   parser->lexer->in_pragma = true;
> +   cp_parser_omp_declare_mapper_maplist (member_function, parser);
> +   finish_function (/*inline_p=*/true);
> +   cp_check_omp_declare_mapper (member_function);
> +   /* If this is a template class, this forces the body of the mapper
> +  to be instantiated.  */
> +   DECL_PRESERVE_P (member_function) = 1;

UDRs don't do this.  Why aren't the clauses instantiated when we actually
need such a template?

> @@ -39509,11 +39522,27 @@ cp_parser_omp_clause_map (cp_parser *parser, tree 
> list)
>  
>if (cp_lexer_peek_nth_token (parser->lexer, pos + 1)->type == 
> CPP_COMMA)
>   pos++;
> +  else if ((cp_lexer_peek_nth_token (parser->lexer, pos + 1)->type
> + == CPP_OPEN_PAREN)
> +&& ((cp_lexer_peek_nth_token (parser->lexer, pos + 2)->type
> + == CPP_NAME)
> +|| ((cp_lexer_peek_nth_token (parser->lexer, pos + 2)->type
> + == CPP_KEYWORD)
> +&& (cp_lexer_peek_nth_token (parser->lexer,
> + pos + 2)->keyword
> +== RID_DEFAULT)))
> +&& (cp_lexer_peek_nth_token (parser->lexer, pos + 3)->type
> +== CPP_CLOSE_PAREN)
> +&& (cp_lexer_peek_nth_token (parser->lexer, pos + 4)->type
> +== CPP_COMMA))

In this loop we don't need to be exact; all we want is to find out
whether the mapper-modifier candidates are followed by : or not, the
actual parsing is done only later.  So, can't we just use
cp_parser_skip_balanced_tokens for CPP_OPEN_PAREN to move over
all the modifier's arguments?
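
To spell out the suggestion, here is a toy sketch of the lookahead on a plain token vector. It is a stand-in written for illustration only: Tok, skip_balanced and modifiers_followed_by_colon are hypothetical and do not reflect the real cp_lexer or cp_parser_skip_balanced_tokens interfaces.

```cpp
#include <cstddef>
#include <vector>

enum Tok { NAME, COMMA, COLON, OPEN_PAREN, CLOSE_PAREN };

// If toks[pos] opens a parenthesis, return the index just past its matching
// close; otherwise (or if unbalanced) return pos unchanged.
inline std::size_t skip_balanced(const std::vector<Tok> &toks,
                                 std::size_t pos) {
  if (pos >= toks.size() || toks[pos] != OPEN_PAREN)
    return pos;
  int depth = 0;
  for (std::size_t i = pos; i < toks.size(); ++i) {
    if (toks[i] == OPEN_PAREN)
      ++depth;
    else if (toks[i] == CLOSE_PAREN && --depth == 0)
      return i + 1;
  }
  return pos;
}

// Walk over "name [ ( ... ) ] [,]" candidates without inspecting the
// arguments, then report whether a ':' follows.
inline bool modifiers_followed_by_colon(const std::vector<Tok> &toks) {
  std::size_t pos = 0;
  while (pos < toks.size() && toks[pos] == NAME) {
    ++pos;
    pos = skip_balanced(toks, pos);
    if (pos < toks.size() && toks[pos] == COMMA)
      ++pos;
  }
  return pos < toks.size() && toks[pos] == COLON;
}
```

The point of the design is that the lookahead never has to know what is legal inside the parentheses, so it stays correct as new modifiers with arguments are added.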

Jakub



Re: [PATCH] gcc: add --enable-systemtap switch [PR61257]

2022-05-24 Thread Eric Gallager via Gcc-patches
On Mon, Mar 14, 2022 at 10:13 AM Jakub Jelinek via Gcc-patches
 wrote:
>
> On Mon, Mar 14, 2022 at 09:26:57AM -0400, Marek Polacek via Gcc-patches wrote:
> > Thanks for the patch.
> >
> > The new configure option needs documenting in doc/install.texi, and 
> > configure
> > needs to be regenerated.
>
> More importantly, I don't see explanation why the patch is needed,

The patch was requested in bug 61257:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61257

> analysis why did the HAVE_SYS_SDT_H configure check say that the header
> exists but trying to include it in libgcc doesn't.
>
> Jakub
>


Re: [PATCH][wwwdocs] Document ASAN changes for GCC 13.

2022-05-24 Thread Gerald Pfeifer
Hi Martin,

On Tue, 24 May 2022, Martin Liška wrote:
> +AddressSanitizer defaults to 
> detect_stack_use_after_return=1 on Linux target.

did you mean targets, or really just target?

(And Linux or GNU/Linux, though that one is more disputed, I know.
Just following our own coding conventions...)

Gerald


Re: [PATCH v2 08/11] Use OMP_ARRAY_SECTION instead of TREE_LIST in C++ FE

2022-05-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 18, 2022 at 09:26:49AM -0700, Julian Brown wrote:
> This patch changes the representation of OMP array sections in the
> C++ front end to use the new OMP_ARRAY_SECTION tree code instead of a
> TREE_LIST.  This is important for "declare mapper" support, because the
> array section representation may stick around longer (in "declare mapper"
> definitions), and special-case handling TREE_LIST becomes necessary in
> more places, which starts to become unwieldy.
> 
> 2022-02-18  Julian Brown  
> 
> gcc/c-family/
>   * c-omp.cc (c_omp_split_clauses): Support OMP_ARRAY_SECTION.
> 
> gcc/cp/
>   * parser.cc (cp_parser_omp_var_list_no_open): Use OMP_ARRAY_SECTION
>   code instead of TREE_LIST to represent OpenMP array sections.
>   * pt.cc (tsubst_copy, tsubst_omp_clause_decl, tsubst_copy_and_build):
>   Add OMP_ARRAY_SECTION support.
>   * semantics.cc (handle_omp_array_sections_1, handle_omp_array_sections,
>   cp_oacc_check_attachments, finish_omp_clauses): Use OMP_ARRAY_SECTION
>   instead of TREE_LIST where appropriate.
>   * gimplify.cc (gimplify_expr): Ensure OMP_ARRAY_SECTION has been
>   processed out before gimplification.

This is all a step in the right direction, but we really do want to
transition from uses of TREE_LIST to represent array sections to
OMP_ARRAY_SECTION.  For some clauses that do allow lvalue expressions that
is a must, for the rest just a good cleanup even when the OMP_ARRAY_SECTION
are created instead of TREE_LIST during the explicit array section parsing
in OpenMP var list parsing.

Jakub



Re: [PATCH][wwwdocs] Document ASAN changes for GCC 13.

2022-05-24 Thread Eric Gallager via Gcc-patches
On Tue, May 24, 2022 at 8:42 AM Martin Liška  wrote:
>
> Ready to be installed?
>
> Thanks,
> Martin
>
> ---
>  htdocs/gcc-13/changes.html | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
> index 6c5b2a37..f7f6866d 100644
> --- a/htdocs/gcc-13/changes.html
> +++ b/htdocs/gcc-13/changes.html
> @@ -47,6 +47,9 @@ a work-in-progress.
>  non-rectangular loop nests, which were added for C/C++ in GCC 11.
>
>
> +AddressSanitizer defaults to 
> detect_stack_use_after_return=1 on Linux target.
> +For compatibly, it can be disabled with env 
> ASAN_OPTIONS=detect_stack_use_after_return=0.
> +  
>  
>

Hm, the HTML tags look mismatched... also I'm assuming "compatibly"
should be "compatibility"?

>
> --
> 2.36.1
>


Re: [PATCH v2 06/11] OpenMP: lvalue parsing for map clauses (C++)

2022-05-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 18, 2022 at 09:26:47AM -0700, Julian Brown wrote:
> --- a/gcc/cp/parser.cc
> +++ b/gcc/cp/parser.cc
> @@ -4266,6 +4266,9 @@ cp_parser_new (cp_lexer *lexer)
>parser->omp_declare_simd = NULL;
>parser->oacc_routine = NULL;
>  
> +  /* Allow array slice in expression.  */

Better /* Disallow OpenMP array sections in expressions.  */

> +  parser->omp_array_section_p = false;
> +
>/* Not declaring an implicit function template.  */
>parser->auto_is_implicit_function_template_parm_p = false;
>parser->fully_implicit_function_template_p = false;

I think we should figure out when we should temporarily disable
  parser->omp_array_section_p = false;
and restore it afterwards to a saved value.  E.g.
cp_parser_lambda_expression seems like a good candidate, the fact that
OpenMP array sections are allowed say in map clause doesn't mean they are
allowed inside of lambdas and it would be especially hard when the lambda
is defining a separate function and the search for OMP_ARRAY_SECTION
probably wouldn't be able to discover those.
Other spots to consider might be statement expressions, perhaps type
definitions etc.
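
The save/disable/restore pattern described here can be sketched with a small RAII guard. ToyParser and ScopedOverride are hypothetical names for this illustration; the real parser would use whatever save/restore idiom the C++ front end already has.

```cpp
// Stand-in for the parser state; only the flag under discussion is modelled.
struct ToyParser {
  bool omp_array_section_p = false;
};

// Set a field to a temporary value and restore the saved value on scope exit.
template <typename T>
class ScopedOverride {
 public:
  ScopedOverride(T &slot, T value) : slot_(slot), saved_(slot) {
    slot_ = value;
  }
  ~ScopedOverride() { slot_ = saved_; }
  ScopedOverride(const ScopedOverride &) = delete;
  ScopedOverride &operator=(const ScopedOverride &) = delete;

 private:
  T &slot_;
  T saved_;
};

// Hypothetical nested construct (e.g. a lambda body): array sections are
// disabled while it is parsed.  Returns the flag value nested parsing sees.
inline bool parse_lambda_body(ToyParser &parser) {
  ScopedOverride<bool> guard(parser.omp_array_section_p, false);
  return parser.omp_array_section_p;
}
```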

> @@ -8021,6 +8024,7 @@ cp_parser_postfix_open_square_expression (cp_parser 
> *parser,
>releasing_vec expression_list = NULL;
>location_t loc = cp_lexer_peek_token (parser->lexer)->location;
>bool saved_greater_than_is_operator_p;
> +  bool saved_colon_corrects_to_scope_p;
>  
>/* Consume the `[' token.  */
>cp_lexer_consume_token (parser->lexer);
> @@ -8028,6 +8032,9 @@ cp_parser_postfix_open_square_expression (cp_parser 
> *parser,
>saved_greater_than_is_operator_p = parser->greater_than_is_operator_p;
>parser->greater_than_is_operator_p = true;
>  
> +  saved_colon_corrects_to_scope_p = parser->colon_corrects_to_scope_p;
> +  parser->colon_corrects_to_scope_p = false;

I think the last above line should be guarded on
  if (parser->omp_array_section_p)
There is no reason to get worse diagnostics in non-OpenMP code or even in
OpenMP code where array sections aren't allowed.

> +
> +  /* NOTE: We are reusing using the type of the whole array as the type 
> of
> +  the array section here, which isn't necessarily entirely correct.
> +  Might need revisiting.  */

"reusing using" looks weird.
As for the type of OMP_ARRAY_SECTION trees, perhaps we could initially use
an incomplete array (so array element would be meaningful)
and when we figure out the details and the array section is contiguous
change its type to array type covering it.

> +  return build3_loc (input_location, OMP_ARRAY_SECTION,
> +  TREE_TYPE (postfix_expression),
> +  postfix_expression, index, length);
> +}
> +
> +  parser->colon_corrects_to_scope_p = saved_colon_corrects_to_scope_p;
> +
>/* Look for the closing `]'.  */
>cp_parser_require (parser, CPP_CLOSE_SQUARE, RT_CLOSE_SQUARE);
>  
> @@ -36536,7 +36570,7 @@ struct omp_dim
>  static tree
>  cp_parser_omp_var_list_no_open (cp_parser *parser, enum omp_clause_code kind,
>   tree list, bool *colon,
> - bool allow_deref = false)
> + bool map_lvalue = false)
>  {
>auto_vec<omp_dim> dims;
>bool array_section_p;
> @@ -36547,12 +36581,95 @@ cp_parser_omp_var_list_no_open (cp_parser *parser, 
> enum omp_clause_code kind,
>parser->colon_corrects_to_scope_p = false;
>*colon = false;
>  }
> +  begin_scope (sk_omp, NULL);

Why?  Base-language-wise, clauses don't introduce a new scope
for name-lookup.
And if it is really needed, I'd strongly prefer to either do it solely
for the clauses that might need it, or do begin_scope before first
such clause and finish at the end if it has been introduced.

>while (1)
>  {
>tree name, decl;
>  
>if (kind == OMP_CLAUSE_DEPEND || kind == OMP_CLAUSE_AFFINITY)
>   cp_parser_parse_tentatively (parser);
> +  else if (map_lvalue && kind == OMP_CLAUSE_MAP)
> + {

This shouldn't be done just for OMP_CLAUSE_MAP, but for all the
other clauses that accept array sections, including
OMP_CLAUSE_DEPEND, OMP_CLAUSE_AFFINITY, OMP_CLAUSE_MAP, OMP_CLAUSE_TO,
OMP_CLAUSE_FROM, OMP_CLAUSE_INCLUSIVE, OMP_CLAUSE_EXCLUSIVE,
OMP_CLAUSE_USE_DEVICE_ADDR, OMP_CLAUSE_HAS_DEVICE_ADDR,
OMP_CLAUSE_*REDUCTION.
And preferably, they should be kept in the IL until *finish_omp_clauses,
which should handle those instead of TREE_LIST that represented them before.
Additionally, something should diagnose incorrect uses of OMP_ARRAY_SECTION,
which is everywhere in the expressions but as the outermost node(s),
i.e. for clauses that do allow array sections scan OMP_CLAUSE_DECL after
handling handleable array sections and complain about embedded
OMP_ARRAY_SECTION, including OMP_ARRAY_SECTION say in the lower-bound,
length and/or stride expressions of the valid OMP_ARRAY_SECTION.
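
A rough sketch of such a check on a toy expression tree follows: array sections are accepted only as the outermost node(s), and any section found in a bound, length, or deeper in the base is flagged. Node, make_node and the operand layout (base, lower bound, length) are assumptions made for this illustration and are unrelated to GCC's tree API.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

enum Code { VAR, PLUS, ARRAY_SECTION };

struct Node;
using NodePtr = std::shared_ptr<Node>;

struct Node {
  Code code;
  std::vector<NodePtr> ops;  // ARRAY_SECTION: {base, lower bound, length}
};

inline NodePtr make_node(Code code, std::vector<NodePtr> ops = {}) {
  return std::make_shared<Node>(Node{code, std::move(ops)});
}

// True if an ARRAY_SECTION appears anywhere in this subtree.
inline bool contains_section(const NodePtr &n) {
  if (!n)
    return false;
  if (n->code == ARRAY_SECTION)
    return true;
  for (const NodePtr &op : n->ops)
    if (contains_section(op))
      return true;
  return false;
}

// Peel the valid outermost sections via their base operand; anything that
// still contains a section afterwards (or sits in a bound) is misplaced.
inline bool has_misplaced_section(const NodePtr &decl) {
  NodePtr n = decl;
  while (n && n->code == ARRAY_SECTION) {
    for (std::size_t i = 1; i < n->ops.size(); ++i)
      if (contains_section(n->ops[i]))
        return true;
    n = n->ops.empty() ? nullptr : n->ops[0];
  }
  return contains_section(n);
}
```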

For C++ that also means handling OMP_ARRAY_SECTION code in pt.c.

 

Re: [PATCH] libiberty: remove FINAL and OVERRIDE from ansidecl.h

2022-05-24 Thread Richard Biener via Gcc-patches
On Tue, May 24, 2022 at 4:10 PM David Malcolm  wrote:
>
> On Tue, 2022-05-24 at 17:09 +0930, Alan Modra wrote:
> > On Mon, May 23, 2022 at 07:42:29PM -0400, David Malcolm via Binutils
> > wrote:
> > > Any objections, or is there a reason to keep these macros that I'm
> > > not aware of?  (and did I send this to all the pertinent lists?)
> >
> > No objection from me.  These macros are not used anywhere in
> > binutils-gdb.
>
> Thanks Alan.
>
> Richard, is the updated patch OK for gcc trunk? [1]

Sure.

Thanks,
Richard.

>
> Thanks
> Dave
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595453.html
>
>
>


Re: [PATCH] libiberty: remove FINAL and OVERRIDE from ansidecl.h

2022-05-24 Thread David Malcolm via Gcc-patches
On Tue, 2022-05-24 at 17:09 +0930, Alan Modra wrote:
> On Mon, May 23, 2022 at 07:42:29PM -0400, David Malcolm via Binutils
> wrote:
> > Any objections, or is there a reason to keep these macros that I'm
> > not aware of?  (and did I send this to all the pertinent lists?)
> 
> No objection from me.  These macros are not used anywhere in
> binutils-gdb.

Thanks Alan.

Richard, is the updated patch OK for gcc trunk? [1]


Thanks
Dave

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595453.html





Re: [PATCH] c: Improve build_component_ref diagnostics [PR91134]

2022-05-24 Thread David Malcolm via Gcc-patches
On Tue, 2022-05-24 at 09:57 -0400, David Malcolm wrote:
> On Tue, 2022-05-24 at 09:25 +0200, Jakub Jelinek via Gcc-patches
> wrote:
> > Hi!
> > 
> > On the following testcase (the first dg-error line) we emit a weird
> > diagnostics and even fixit on pointerpointer->member
> > where pointerpointer is pointer to pointer to struct and we say
> > 'pointerpointer' is a pointer; did you mean to use '->'?
> > The first part is indeed true, but suggesting -> when the code
> > already
> > does use -> is confusing.
> > The following patch adjusts callers so that they tell it if it is
> > from
> > . parsing or from -> parsing and in the latter case suggests to
> > dereference
> > the left operand instead by adding (* before it and ) after it
> > (before ->).
> > Or would a suggestion to add [0] before -> be better?
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> > trunk?
> > 
> 
> [...snip implementation...]
> 
> >  
> > --- gcc/testsuite/gcc.dg/pr91134.c.jj   2022-05-23
> > 20:31:11.751001817
> > +0200
> > +++ gcc/testsuite/gcc.dg/pr91134.c  2022-05-23
> > 20:30:45.291268997
> > +0200
> > @@ -0,0 +1,13 @@
> > +/* PR c/91134 */
> > +
> > +struct X { int member; } x;
> > +
> > +int
> > +foo (void)
> > +{
> > +  struct X *pointer = &x;
> > +  struct X **pointerpointer = &pointer;
> > +  int i = *pointerpointer->member; /* { dg-error
> > "'pointerpointer' is a pointer to pointer; did you mean to
> > dereference it before applying '->' to it\\\?" } */
> > +  int j = pointer.member;  /* { dg-error "'pointer' is
> > a
> > pointer; did you mean to use '->'\\\?" } */
> > +  return i + j;
> > +}
> 
> Ideally we'd have an automated check that the fix-it hint fixes the
> code, but failing that, I like to have at least some DejaGnu test
> coverage for fix-it hints - something like the tests in
> gcc.dg/fixits.c
> or gcc.dg/semicolon-fixits.c, perhaps?

Also, what does the output from:
  -fdiagnostics-generate-patch
look like?  That's usually the best way of checking if we're generating
good fix-it hints.

Dave



Re: [PATCH] c: Improve build_component_ref diagnostics [PR91134]

2022-05-24 Thread David Malcolm via Gcc-patches
On Tue, 2022-05-24 at 09:25 +0200, Jakub Jelinek via Gcc-patches wrote:
> Hi!
> 
> On the following testcase (the first dg-error line) we emit a weird
> diagnostics and even fixit on pointerpointer->member
> where pointerpointer is pointer to pointer to struct and we say
> 'pointerpointer' is a pointer; did you mean to use '->'?
> The first part is indeed true, but suggesting -> when the code
> already
> does use -> is confusing.
> The following patch adjusts callers so that they tell it if it is
> from
> . parsing or from -> parsing and in the latter case suggests to
> dereference
> the left operand instead by adding (* before it and ) after it
> (before ->).
> Or would a suggestion to add [0] before -> be better?
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 

[...snip implementation...]

>  
> --- gcc/testsuite/gcc.dg/pr91134.c.jj   2022-05-23 20:31:11.751001817
> +0200
> +++ gcc/testsuite/gcc.dg/pr91134.c  2022-05-23 20:30:45.291268997
> +0200
> @@ -0,0 +1,13 @@
> +/* PR c/91134 */
> +
> +struct X { int member; } x;
> +
> +int
> +foo (void)
> +{
> +  struct X *pointer = &x;
> +  struct X **pointerpointer = &pointer;
> +  int i = *pointerpointer->member; /* { dg-error
> "'pointerpointer' is a pointer to pointer; did you mean to
> dereference it before applying '->' to it\\\?" } */
> +  int j = pointer.member;  /* { dg-error "'pointer' is a
> pointer; did you mean to use '->'\\\?" } */
> +  return i + j;
> +}

Ideally we'd have an automated check that the fix-it hint fixes the
code, but failing that, I like to have at least some DejaGnu test
coverage for fix-it hints - something like the tests in gcc.dg/fixits.c
or gcc.dg/semicolon-fixits.c, perhaps?
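
For reference, the two candidate fix-its discussed in the quoted patch
description can be compared directly.  This is only a sketch, not part of
the patch: the helper names via_deref, via_index and check_fixit_forms are
made up for illustration, and it assumes nothing beyond the struct from the
quoted testcase.

```cpp
#include <cassert>

struct X { int member; };

/* Form produced by the proposed fix-it: "(*" before the operand and ")"
   after it, ahead of the "->".  */
int
via_deref (X **pointerpointer)
{
  return (*pointerpointer)->member;
}

/* Alternative raised in the patch description: "[0]" before the "->".  */
int
via_index (X **pointerpointer)
{
  return pointerpointer[0]->member;
}

/* Both spellings should name the very same object.  */
void
check_fixit_forms ()
{
  X x = { 7 };
  X *pointer = &x;
  X **pointerpointer = &pointer;
  assert (via_deref (pointerpointer) == via_index (pointerpointer));
  assert (&(*pointerpointer)->member == &pointerpointer[0]->member);
}
```

Either spelling makes the original line compile, so the choice between them
is about which hint reads better, not about semantics.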

Dave



Re: [PATCH v5] c++: ICE with temporary of class type in DMI [PR100252]

2022-05-24 Thread Marek Polacek via Gcc-patches
On Tue, May 24, 2022 at 08:36:39AM -0400, Jason Merrill wrote:
> On 5/16/22 11:36, Marek Polacek wrote:
> > +static tree
> > +replace_placeholders_for_class_temp_r (tree *tp, int *, void *data)
> > +{
> > +  tree t = *tp;
> > +  tree full_expr = *static_cast<tree *>(data);
> > +
> > +  /* We're looking for a TARGET_EXPR nested in the whole expression.  */
> > +  if (TREE_CODE (t) == TARGET_EXPR
> > +  && !potential_prvalue_result_of (t, full_expr))
> > +{
> > +  tree init = TARGET_EXPR_INITIAL (t);
> > +  while (TREE_CODE (init) == COMPOUND_EXPR)
> > +   init = TREE_OPERAND (init, 1);
> 
> Hmm, how do we get a COMPOUND_EXPR around a CONSTRUCTOR?

Sadly, that's possible for code like (from nsdmi-aggr18.C)

struct D {
  int x = 42;
  B b = (true, A{x});
};

where the TARGET_EXPR_INITIAL is
<<< Unknown tree: void_cst >>>, {.x=((struct D *) this)->x, 
.y=(&)->x}
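
At the source level the construct is just a parenthesized comma expression
used as a default member initializer.  A reduced sketch, assuming a plain
aggregate A in place of the B from nsdmi-aggr18.C (whose definition is not
shown here); check_dmi is a made-up helper:

```cpp
#include <cassert>

struct A { int x; };

struct D {
  int x = 42;
  /* The comma expression evaluates "true", discards it, and yields the
     temporary A{x}; that is where the wrapper around the initializer
     comes from.  */
  A a = (true, A{x});
};

void
check_dmi ()
{
  D d;
  assert (d.x == 42);
  assert (d.a.x == 42);
}
```

A compiler may warn that the left operand of the comma has no effect; that
is expected for this reduced example.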

Marek



[PATCH] Canonicalize X&-Y as X*Y in match.pd when Y is [0,1].

2022-05-24 Thread Roger Sayle

"For every pessimization, there's an equal and opposite optimization".

In the review of my original patch for PR middle-end/98865, Richard
Biener pointed out that match.pd shouldn't be transforming X*Y into
X&-Y as the former is considered cheaper by tree-ssa's cost model
(operator count).  A corollary of this is that we should instead be
transforming X&-Y into the cheaper X*Y as a preferred canonical form
(especially as RTL expansion now intelligently selects the appropriate
implementation based on the target's costs).

With this patch we now generate identical code for:
int foo(int x, int y) { return -(x&1) & y; }
int bar(int x, int y) { return (x&1) * y; }

specifically on x86_64-pc-linux-gnu both use and/neg/and when
optimizing for speed, but both use and/mul when optimizing for
size.
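
As a sanity check of the identity itself (independent of what the compiler
generates), the two functions can be compared over a grid of inputs.  This
is a sketch only; forms_agree and check_forms are made-up names, and
negative operands rely on two's complement representation.

```cpp
#include <cassert>

int foo (int x, int y) { return -(x & 1) & y; }
int bar (int x, int y) { return (x & 1) * y; }

/* Return true if foo and bar agree on a small grid of inputs,
   including negative values of both operands.  */
bool
forms_agree ()
{
  for (int x = -8; x <= 8; x++)
    for (int y = -1000; y <= 1000; y++)
      if (foo (x, y) != bar (x, y))
        return false;
  return true;
}

void
check_forms ()
{
  assert (forms_agree ());
}
```

When x&1 is 0 both forms give 0; when it is 1, -(x&1) is all ones, so the
AND passes y through unchanged, just as multiplying by 1 does.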

One minor wrinkle/improvement is that this patch includes three
additional optimizations (that account for the change in canonical
form) to continue to optimize PR92834 and PR94786.
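
The three additional patterns all rewrite a multiplication by a 0/1 value
into a conditional select.  A small sketch of the underlying identities,
using unsigned arithmetic so that wraparound in B-A is well defined; the
function names are made up for illustration and do not appear in the patch.

```cpp
#include <cassert>

/* C must be 0 or 1 in all of these.  */
unsigned sel_plus  (unsigned a, unsigned b, unsigned c) { return a + (b - a) * c; }
unsigned sel_minus (unsigned a, unsigned b, unsigned c) { return a - (a - b) * c; }
unsigned sel_xor   (unsigned a, unsigned b, unsigned c) { return a ^ ((a ^ b) * c); }
unsigned sel_cond  (unsigned a, unsigned b, unsigned c) { return c ? b : a; }

/* Return true if all three arithmetic forms match the conditional.  */
bool
selects_agree ()
{
  const unsigned vals[] = { 0u, 1u, 2u, 41u, 0x7fffffffu, 0x80000000u,
                            0xffffffffu };
  for (unsigned a : vals)
    for (unsigned b : vals)
      for (unsigned c = 0; c <= 1; c++)
        {
          unsigned want = sel_cond (a, b, c);
          if (sel_plus (a, b, c) != want
              || sel_minus (a, b, c) != want
              || sel_xor (a, b, c) != want)
            return false;
        }
  return true;
}

void
check_selects ()
{
  assert (selects_agree ());
}
```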

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-05-24  Roger Sayle  

gcc/ChangeLog
* match.pd (match_zero_one_valued_p): New predicate.
(mult @0 @1): Use zero_one_valued_p for optimization to the
expression "bit_and @0 @1".
(bit_and (negate zero_one_valued_p@0) @1): Optimize to MULT_EXPR.
(plus @0 (mult (minus @1 @0) zero_one_valued_p@2): New transform.
(minus @0 (mult (minus @0 @1) zero_one_valued_p@2): Likewise.
(bit_xor @0 (mult (bit_xor @0 @1) zero_one_valued_p@2): Likewise.

gcc/testsuite/ChangeLog
* gcc.dg/pr98865.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/match.pd b/gcc/match.pd
index c2fed9b..ce97d85 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -285,14 +285,6 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
|| !COMPLEX_FLOAT_TYPE_P (type)))
(negate @0)))
 
-/* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 } */
-(simplify
- (mult SSA_NAME@1 SSA_NAME@2)
-  (if (INTEGRAL_TYPE_P (type)
-   && get_nonzero_bits (@1) == 1
-   && get_nonzero_bits (@2) == 1)
-   (bit_and @1 @2)))
-
 /* Transform x * { 0 or 1, 0 or 1, ... } into x & { 0 or -1, 0 or -1, ...},
unless the target has native support for the former but not the latter.  */
 (simplify
@@ -1787,6 +1779,24 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (bit_not (bit_not @0))
   @0)
 
+(match zero_one_valued_p
+ @0
+ (if (INTEGRAL_TYPE_P (type) && tree_nonzero_bits (@0) == 1)))
+(match zero_one_valued_p
+ truth_valued_p@0)
+
+/* Transform { 0 or 1 } * { 0 or 1 } into { 0 or 1 } & { 0 or 1 }.  */
+(simplify
+ (mult zero_one_valued_p@0 zero_one_valued_p@1)
+ (if (INTEGRAL_TYPE_P (type))
+  (bit_and @0 @1)))
+
+/* Transform X & -Y into X * Y when Y is { 0 or 1 }.  */
+(simplify
+ (bit_and:c (negate zero_one_valued_p@0) @1)
+ (if (INTEGRAL_TYPE_P (type))
+  (mult @0 @1)))
+
 /* Convert ~ (-A) to A - 1.  */
 (simplify
  (bit_not (convert? (negate @0)))
@@ -3320,6 +3330,25 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
&& (GIMPLE || !TREE_SIDE_EFFECTS (@1)))
(cond (cmp @2 @3) @1 @0
 
+/* Likewise using multiplication, A + (B-A)*cmp into cmp ? B : A.  */
+(simplify
+ (plus:c @0 (mult:c (minus @1 @0) zero_one_valued_p@2))
+ (if (INTEGRAL_TYPE_P (type)
+  && (GIMPLE || !TREE_SIDE_EFFECTS (@1)))
+  (cond @2 @1 @0)))
+/* Likewise using multiplication, A - (A-B)*cmp into cmp ? B : A.  */
+(simplify
+ (minus @0 (mult:c (minus @0 @1) zero_one_valued_p@2))
+ (if (INTEGRAL_TYPE_P (type)
+  && (GIMPLE || !TREE_SIDE_EFFECTS (@1)))
+  (cond @2 @1 @0)))
+/* Likewise using multiplication, A ^ (A^B)*cmp into cmp ? B : A.  */
+(simplify
+ (bit_xor:c @0 (mult:c (bit_xor:c @0 @1) zero_one_valued_p@2))
+ (if (INTEGRAL_TYPE_P (type)
+  && (GIMPLE || !TREE_SIDE_EFFECTS (@1)))
+  (cond @2 @1 @0)))
+
 /* Simplifications of shift and rotates.  */
 
 (for rotate (lrotate rrotate)
diff --git a/gcc/testsuite/gcc.dg/pr98865.c b/gcc/testsuite/gcc.dg/pr98865.c
new file mode 100644
index 000..95f7270
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr98865.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+int foo(int x, int y)
+{
+  return -(x&1) & y;
+}
+
+int bar(int x, int y)
+{
+  return (x&1) * y;
+}
+
+/* { dg-final { scan-tree-dump-times " \\* " 2 "optimized" } } */


Re: [PATCH] c: Improve build_component_ref diagnostics [PR91134]

2022-05-24 Thread Marek Polacek via Gcc-patches
On Tue, May 24, 2022 at 09:25:57AM +0200, Jakub Jelinek wrote:
> Hi!
> 
> On the following testcase (the first dg-error line) we emit a weird
> diagnostics and even fixit on pointerpointer->member
> where pointerpointer is pointer to pointer to struct and we say
> 'pointerpointer' is a pointer; did you mean to use '->'?
> The first part is indeed true, but suggesting -> when the code already
> does use -> is confusing.
> The following patch adjusts callers so that they tell it if it is from
> . parsing or from -> parsing and in the latter case suggests to dereference
> the left operand instead by adding (* before it and ) after it (before ->).
> Or would a suggestion to add [0] before -> be better?
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2022-05-24  Jakub Jelinek  
> 
>   PR c/91134
> gcc/c/
>   * c-tree.h (build_component_ref): Add ARROW_LOC location_t argument.
>   * c-typeck.cc (build_component_ref): Likewise.  If DATUM is
>   INDIRECT_REF and ARROW_LOC isn't UNKNOWN_LOCATION, print a different
>   diagnostics and fixit hint if DATUM has pointer type.

s/diagnostic/, missing "than" before if?

>   * c-parser.cc (c_parser_postfix_expression,
>   c_parser_omp_variable_list): Adjust build_component_ref callers.
>   * gimple-parser.cc (c_parser_gimple_postfix_expression_after_primary):
>   Likewise.
> gcc/objc/
>   * objc-act.cc (objc_build_component_ref): Adjust build_component_ref
>   caller.
> gcc/testsuite/
>   * gcc.dg/pr91134.c: New test.
> 
> --- gcc/c/c-tree.h.jj 2022-05-19 11:48:56.058291437 +0200
> +++ gcc/c/c-tree.h2022-05-23 20:22:05.669515990 +0200
> @@ -699,7 +699,8 @@ extern struct c_expr convert_lvalue_to_r
>  extern tree decl_constant_value_1 (tree, bool);
>  extern void mark_exp_read (tree);
>  extern tree composite_type (tree, tree);
> -extern tree build_component_ref (location_t, tree, tree, location_t);
> +extern tree build_component_ref (location_t, tree, tree, location_t,
> +  location_t);
>  extern tree build_array_ref (location_t, tree, tree);
>  extern tree build_external_ref (location_t, tree, bool, tree *);
>  extern void pop_maybe_used (bool);
> --- gcc/c/c-typeck.cc.jj  2022-05-19 11:48:56.077291176 +0200
> +++ gcc/c/c-typeck.cc 2022-05-23 20:23:44.713515875 +0200
> @@ -2457,11 +2457,12 @@ should_suggest_deref_p (tree datum_type)
>  /* Make an expression to refer to the COMPONENT field of structure or
> union value DATUM.  COMPONENT is an IDENTIFIER_NODE.  LOC is the
> location of the COMPONENT_REF.  COMPONENT_LOC is the location
> -   of COMPONENT.  */
> +   of COMPONENT.  ARROW_LOC is the location of first -> operand if

"of the first"?

> +   it is from -> operator.  */
>  
>  tree
>  build_component_ref (location_t loc, tree datum, tree component,
> -  location_t component_loc)
> +  location_t component_loc, location_t arrow_loc)
>  {
>tree type = TREE_TYPE (datum);
>enum tree_code code = TREE_CODE (type);
> @@ -2577,11 +2578,23 @@ build_component_ref (location_t loc, tre
>/* Special-case the error message for "ptr.field" for the case
>where the user has confused "." vs "->".  */
>rich_location richloc (line_table, loc);
> -  /* "loc" should be the "." token.  */
> -  richloc.add_fixit_replace ("->");
> -  error_at (&richloc,
> - "%qE is a pointer; did you mean to use %<->%>?",
> - datum);
> +  if (TREE_CODE (datum) == INDIRECT_REF && arrow_loc != UNKNOWN_LOCATION)
> + {
> +   richloc.add_fixit_insert_before (arrow_loc, "(*");
> +   richloc.add_fixit_insert_after (arrow_loc, ")");
> +   error_at (&richloc,
> + "%qE is a pointer to pointer; did you mean to dereference "
> + "it before applying %<->%> to it?",
> + TREE_OPERAND (datum, 0));
> + }
> +  else
> + {
> +   /* "loc" should be the "." token.  */
> +   richloc.add_fixit_replace ("->");
> +   error_at (&richloc,
> + "%qE is a pointer; did you mean to use %<->%>?",
> + datum);
> + }
>return error_mark_node;
>  }
>else if (code != ERROR_MARK)
> --- gcc/c/c-parser.cc.jj  2022-05-23 16:16:30.360856580 +0200
> +++ gcc/c/c-parser.cc 2022-05-23 20:33:36.683537409 +0200
> @@ -9235,8 +9235,9 @@ c_parser_postfix_expression (c_parser *p
>   if (c_parser_next_token_is (parser, CPP_NAME))
> {
>   c_token *comp_tok = c_parser_peek_token (parser);
> - offsetof_ref = build_component_ref
> -   (loc, offsetof_ref, comp_tok->value, comp_tok->location);
> + offsetof_ref
> +   = build_component_ref (loc, offsetof_ref, comp_tok->value,
> +  comp_tok->location, UNKNOWN_LOCATION);
>   c_parser_consume_token (parser);
>   while (c_parser_next_token_is 

Re: [PATCH v2 05/11] OpenMP: Handle reference-typed struct members

2022-05-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 18, 2022 at 09:26:46AM -0700, Julian Brown wrote:
> This patch relates to OpenMP mapping clauses containing struct members of
> reference type, e.g. "mystruct.myref.myptr[:N]".  To be able to access
> the array slice through the reference in the middle, we need to perform
> an attach action for that reference, since it is represented internally
> as a pointer.
> 
> I don't think the spec allows for this case explicitly.  The closest
> clause is (OpenMP 5.0, "2.19.7.1 map Clause"):
> 
>   "If the type of a list item is a reference to a type T then the
>reference in the device data environment is initialized to refer to
>the object in the device data environment that corresponds to the
>object referenced by the list item. If mapping occurs, it occurs as
>though the object were mapped through a pointer with an array section
>of type T and length one."

Plus the general rule that an aggregate is mapped as a mapping of all its
individual members/elements.

> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -9813,7 +9813,10 @@ accumulate_sibling_list (enum omp_region_type 
> region_type, enum tree_code code,
>/* FIXME: If we're not mapping the base pointer in some other clause on 
> this
>   directive, I think we want to create ALLOC/RELEASE here -- i.e. not
>   early-exit.  */
> -  if (openmp && attach_detach)
> +  if (openmp
> +  && attach_detach
> +  && !(TREE_CODE (TREE_TYPE (ocd)) == REFERENCE_TYPE
> +&& TREE_CODE (TREE_TYPE (TREE_TYPE (ocd))) != POINTER_TYPE))
>  return NULL;

Why isn't a reference to pointer handled that way too?

Jakub



Re: [PATCH v2 04/11] OpenMP/OpenACC: Add inspector class to unify mapped address analysis

2022-05-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 18, 2022 at 09:24:54AM -0700, Julian Brown wrote:
> 2022-03-17  Julian Brown  
> 
> gcc/c-family/
> * c-common.h (c_omp_address_inspector): New class.
> * c-omp.c (c_omp_address_inspector::get_deref_origin,
> c_omp_address_inspector::component_access_p,
> c_omp_address_inspector::check_clause,
> c_omp_address_inspector::get_root_term,

Spaces instead of tabs.

>   c_omp_address_inspector::map_supported_p,
>   c_omp_address_inspector::mappable_type,
>   c_omp_address_inspector::get_origin,
>   c_omp_address_inspector::peel_components,
>   c_omp_address_inspector::maybe_peel_ref,
>   c_omp_address_inspector::maybe_zero_length_array_section,
>   c_omp_address_inspector::get_base_pointer,
>   c_omp_address_inspector::get_base_pointer_tgt,
>   c_omp_address_inspector::get_attachment_point): New methods.

> --- a/gcc/c-family/c-common.h
> +++ b/gcc/c-family/c-common.h
> @@ -1253,6 +1253,61 @@ extern void c_omp_mark_declare_variant (location_t, 
> tree, tree);
>  extern const char *c_omp_map_clause_name (tree, bool);
>  extern void c_omp_adjust_map_clauses (tree, bool);
>  
> +class c_omp_address_inspector
> +{
> +  location_t loc;
> +  tree root_term;
> +  bool indirections;
> +  int map_supported;
> +
> +protected:
> +  tree orig;
> +
> +public:
> +  c_omp_address_inspector (location_t loc, tree t)
> +: loc (loc), root_term (NULL_TREE), indirections (false),
> +  map_supported (-1), orig (t)
> +  { }
> +
> +  ~c_omp_address_inspector () {}
> +
> +  virtual bool processing_template_decl_p () { return false; }
> +  virtual bool mappable_type (tree t);
> +  virtual void emit_unmappable_type_notes (tree) { }
> +
> +  bool check_clause (tree);
> +  tree get_root_term (bool);
> +
> +  tree get_address () { return orig; }
> +  tree get_deref_origin ();
> +  bool component_access_p ();
> +
> +  bool has_indirections_p ()
> +{
> +  if (!root_term)
> + get_root_term (false);
> +  return indirections;
> +}
> +
> +  bool indir_component_ref_p ()
> +{
> +  return component_access_p () && has_indirections_p ();
> +}

I think https://gcc.gnu.org/codingconventions.html#Cxx_Conventions
just says that no member functions should be defined inside of the
class, which is something that almost nobody actually honors.
But, when they are inline, there should be one style, not many,
so either
  type method (args)
  {
  }
(guess my preference) or
  type method (args)
    {
    }
but not those mixed up, which you have in the patch.

> --- a/gcc/c-family/c-omp.cc
> +++ b/gcc/c-family/c-omp.cc
> @@ -3113,6 +3113,274 @@ c_omp_adjust_map_clauses (tree clauses, bool 
> is_target)
>  }
>  }
>  

There should be a function comment for all the out-of-line definitions.
> +tree
> +c_omp_address_inspector::get_deref_origin ()

>  {
>if (error_operand_p (t))
>   return error_mark_node;
> +  c_omp_address_inspector t_insp (OMP_CLAUSE_LOCATION (c), t);

Wouldn't ai (address inspector) be better than t_insp?

> +/* C++ specialisation of the c_omp_address_inspector class.  */
> +
> +class cp_omp_address_inspector : public c_omp_address_inspector
> +{
> +public:
> +  cp_omp_address_inspector (location_t loc, tree t)
> +: c_omp_address_inspector (loc, t)
> +  { }
> +
> +  ~cp_omp_address_inspector ()
> +  { }
> +
> +  bool processing_template_decl_p ()
> +  {
> +return processing_template_decl;
> +  }
> +
> +  bool mappable_type (tree t)
> +  {
> +return cp_omp_mappable_type (t);
> +  }
> +
> +  void emit_unmappable_type_notes (tree t)
> +  {
> +cp_omp_emit_unmappable_type_notes (t);
> +  }
> +
> +  static bool ref_p (tree t)
> +{
> +  return (TYPE_REF_P (TREE_TYPE (t))
> +   || REFERENCE_REF_P (t));
> +}

See above the mixing of styles...
I know, some headers are really bad examples, e.g. hash-map.h
even has 3 different styles,
  {
  }
and
{
}
and
  {
  }
for the type method (args) indented by 2 spaces.

> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/gomp/unmappable-component-1.C
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +
> +struct A {
> +  static int x[10];
> +};
> +
> +struct B {
> +  A a;
> +};
> +
> +int
> +main (int argc, char *argv[])
> +{
> +  B *b = new B;
> +#pragma omp target map(b->a) // { dg-error "'b->B::a' does not have a 
> mappable type in 'map' clause" }
> +  ;
> +  B bb;
> +#pragma omp target map(bb.a) // { dg-error "'bb\.B::a' does not have a 
> mappable type in 'map' clause" }

We don't diagnose static data members as non-mappable anymore.
So I don't see how this test can work.

> +int
> +main (int argc, char *argv[])

Why "int argc, char *argv[]" when you don't use it?

> +  p0 = (p0type *) malloc (sizeof *p0);
> +  p0->x0[0].p1 = (p1type *) malloc (sizeof *p0->x0[0].p1);
> +  p0->x0[0].p1->p2 = (p2type *) malloc (sizeof *p0->x0[0].p1->p2);
> +  memset (p0->x0[0].p1->p2, 0, sizeof *p0->x0[0].p1->p2);
> +
> +#pragma omp target 

Re: [PATCH v2 03/11] OpenMP/OpenACC struct sibling list gimplification extension and rework

2022-05-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 18, 2022 at 09:24:53AM -0700, Julian Brown wrote:
> 2022-03-17  Julian Brown  
> 
> gcc/fortran/
> * trans-openmp.cc (gfc_trans_omp_clauses): Don't create
> GOMP_MAP_TO_PSET mappings for class metadata, nor GOMP_MAP_POINTER
> mappings for POINTER_TYPE_P decls.
> 
> gcc/
> * gimplify.c (gimplify_omp_var_data): Remove GOVD_MAP_HAS_ATTACHMENTS.
> (insert_struct_comp_map): Refactor function into...
> (build_struct_comp_nodes): This new function.  Remove list handling
> and improve self-documentation.
> (extract_base_bit_offset): Remove BASE_REF, OFFSETP parameters.  Move
> code to strip outer parts of address out of function, but strip no-op
> conversions.
> (omp_mapping_group): Add DELETED field for use during reindexing.
> (strip_components_and_deref, strip_indirections): New functions.
> (omp_group_last, omp_group_base): Add GOMP_MAP_STRUCT handling.
> (omp_gather_mapping_groups): Initialise DELETED field for new groups.
> (omp_index_mapping_groups): Notice DELETED groups when (re)indexing.
> (insert_node_after, move_node_after, move_nodes_after,
> move_concat_nodes_after): New helper functions.
> (accumulate_sibling_list): New function to build up GOMP_MAP_STRUCT
> node groups for sibling lists. Outlined from 
> gimplify_scan_omp_clauses.
> (omp_build_struct_sibling_lists): New function.
> (gimplify_scan_omp_clauses): Remove struct_map_to_clause,
> struct_seen_clause, struct_deref_set.  Call
> omp_build_struct_sibling_lists as pre-pass instead of handling sibling
> lists in the function's main processing loop.
> (gimplify_adjust_omp_clauses_1): Remove GOVD_MAP_HAS_ATTACHMENTS
> handling, unused now.
> * omp-low.cc (scan_sharing_clauses): Handle pointer-type indirect
> struct references, and references to pointers to structs also.
> 
> gcc/testsuite/
> * g++.dg/goacc/member-array-acc.C: New test.
> * g++.dg/gomp/member-array-omp.C: New test.
> * g++.dg/gomp/target-3.C: Update expected output.
> * g++.dg/gomp/target-lambda-1.C: Likewise.
> * g++.dg/gomp/target-this-2.C: Likewise.
> * c-c++-common/goacc/deep-copy-arrayofstruct.c: Move test from here.
> 
> libgomp/
> * testsuite/libgomp.oacc-c-c++-common/deep-copy-15.c: New test.
> * testsuite/libgomp.oacc-c-c++-common/deep-copy-16.c: New test.
> * testsuite/libgomp.oacc-c++/deep-copy-17.C: New test.
> * testsuite/libgomp.oacc-c-c++-common/deep-copy-arrayofstruct.c: Move
> test to here, make "run" test.

> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -125,10 +125,6 @@ enum gimplify_omp_var_data
>/* Flag for GOVD_REDUCTION: inscan seen in {in,ex}clusive clause.  */
>GOVD_REDUCTION_INSCAN = 0x200,
>  
> -  /* Flag for GOVD_MAP: (struct) vars that have pointer attachments for
> - fields.  */
> -  GOVD_MAP_HAS_ATTACHMENTS = 0x400,
> -
>/* Flag for GOVD_FIRSTPRIVATE: OMP_CLAUSE_FIRSTPRIVATE_IMPLICIT.  */
>GOVD_FIRSTPRIVATE_IMPLICIT = 0x800,

I'd renumber the GOVD_* constants after this, otherwise we won't remember
we've left a gap.

> +   (or derived type, etc.) component, create an "alloc" or "release" node to
> +   insert into a list following a GOMP_MAP_STRUCT node.  For some types of
> +   mapping (e.g. Fortran arrays with descriptors), an additional mapping may
> +   be created that is inserted into the list of mapping nodes attached to the
> +   directive being processed -- not part of the sorted list of nodes after
> +   GOMP_MAP_STRUCT.
> +
> +   CODE is the code of the directive being processed.  GRP_START and GRP_END
> +   are the first and last of two or three nodes representing this array 
> section
> +   mapping (e.g. a data movement node like GOMP_MAP_{TO,FROM}, optionally a
> +   GOMP_MAP_TO_PSET, and finally a GOMP_MAP_ALWAYS_POINTER).  EXTRA_NODE is
> +   filled with the additional node described above, if needed.
> +
> +   This function does not add the new nodes to any lists itself.  It is the
> +   responsibility of the caller to do that.  */
>  
>  static tree
> -insert_struct_comp_map (enum tree_code code, tree c, tree struct_node,
> - tree prev_node, tree *scp)
> +build_struct_comp_nodes (enum tree_code code, tree grp_start, tree grp_end,
> +  tree *extra_node)

I think it would be nice to use omp_ prefixes even for these static
functions, this is all in the gimplifier, so it should be clear that it
isn't some generic code but OpenMP specific gimplification code.

Another variant would be to introduce omp-gimplify.cc and move lots of stuff
there, but if we do that, best time might be during stage3 so that it
doesn't collide with too many patches.
>  
> +/* Link node NEWNODE so it is pointed to by chain INSERT_AT.  NEWNODE's chain
> +   is linked 

Re: [PATCH 05/10] d: add 'final' and 'override' to gcc/d/*.cc 'visit' impls

2022-05-24 Thread David Malcolm via Gcc-patches
On Tue, 2022-05-24 at 14:56 +0200, Iain Buclaw wrote:
> Excerpts from David Malcolm via Gcc-patches's message of May 23, 2022
> 9:28 pm:
> > gcc/d/ChangeLog:
> > * decl.cc: Add "final" and "override" to all "visit" vfunc
> > decls
> > as appropriate.
> > * expr.cc: Likewise.
> > * toir.cc: Likewise.
> > * typeinfo.cc: Likewise.
> > * types.cc: Likewise.
> > 
> > Signed-off-by: David Malcolm 
> 
> 
> Thanks David!
> 
> Looks OK to me.
> 
> Iain.

Thanks; I've pushed it to trunk as r13-736-g442cf0977a2993.

FWIW, to repeat something I said in the cover letter, I tried hacking
-Werror=suggest-override into the Makefile whilst I was creating the
patches, and IIRC there were a bunch of them in the gcc/d/dmd
subdirectory - but that code is copied from the D community upstream,
right?
So maybe if that D parser C++ code requires a C++11 compiler, perhaps
they might want to add "final" and "override" specifiers to it as
appropriate, to better document the intent of the decls?
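
To illustrate what the specifiers buy, here is a reduced sketch.  The class
names Visitor and DemoVisitor are made up, and the visit functions return
an int tag (unlike the real void-returning ones) purely so the effect can
be observed; only Dsymbol and Module are names taken from the patch.

```cpp
#include <cassert>

struct Dsymbol { virtual ~Dsymbol () {} };
struct Module : Dsymbol {};

/* Stand-in for the visitor base class.  */
struct Visitor
{
  virtual ~Visitor () {}
  virtual int visit (Dsymbol *) { return 0; }
  virtual int visit (Module *) { return 0; }
};

struct DemoVisitor : Visitor
{
  /* "override" makes the compiler check that a base vfunc with this exact
     signature exists; "final" forbids further overriding.  */
  int visit (Dsymbol *) final override { return 1; }
  int visit (Module *) final override { return 2; }

  /* A signature that matches nothing in the base, such as
       int visit (const Module *) override;
     would be rejected at compile time instead of silently adding an
     unrelated overload.  */
};

void
check_visitor ()
{
  Dsymbol s;
  Module m;
  DemoVisitor v;
  Visitor &base = v;
  assert (base.visit (&s) == 1);
  assert (base.visit (&m) == 2);
}
```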

Hope this is constructive
Dave




Re: [PATCH v2 02/11] Remove omp_target_reorder_clauses

2022-05-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 18, 2022 at 09:24:52AM -0700, Julian Brown wrote:
> This patch has been split out from the previous one to avoid a
> confusingly-interleaved diff.  The two patches should probably be
> committed squashed together.

Agreed, LGTM.
> 
> 2021-10-01  Julian Brown  
> 
> gcc/
>   * gimplify.c (omp_target_reorder_clauses): Delete.

Jakub



Re: [PATCH v2 01/11] OpenMP 5.0: Clause ordering for OpenMP 5.0 (topological sorting by base pointer)

2022-05-24 Thread Jakub Jelinek via Gcc-patches
On Fri, Mar 18, 2022 at 09:24:51AM -0700, Julian Brown wrote:
> 2021-11-23  Julian Brown  
> 
> gcc/
>   * gimplify.c (is_or_contains_p, omp_target_reorder_clauses): Delete
>   functions.
>   (omp_tsort_mark): Add enum.
>   (omp_mapping_group): Add struct.
>   (debug_mapping_group, omp_get_base_pointer, omp_get_attachment,
>   omp_group_last, omp_gather_mapping_groups, omp_group_base,
>   omp_index_mapping_groups, omp_containing_struct,
>   omp_tsort_mapping_groups_1, omp_tsort_mapping_groups,
>   omp_segregate_mapping_groups, omp_reorder_mapping_groups): New
>   functions.
>   (gimplify_scan_omp_clauses): Call above functions instead of
>   omp_target_reorder_clauses, unless we've seen an error.
>   * omp-low.c (scan_sharing_clauses): Avoid strict test if we haven't
>   sorted mapping groups.
> 
> gcc/testsuite/
>   * g++.dg/gomp/target-lambda-1.C: Adjust expected output.
>   * g++.dg/gomp/target-this-3.C: Likewise.
>   * g++.dg/gomp/target-this-4.C: Likewise.
> +

Wouldn't hurt to add a comment on the meanings of the enumerators.
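
For what such a comment might say: the three names match the usual marks of
a depth-first topological sort.  The sketch below shows that textbook
scheme on a toy graph; it is an assumption that the patch's
omp_tsort_mapping_groups_1 uses the marks the same way, and the types
tsort_mark/node and the functions visit/tsort are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum tsort_mark
{
  UNVISITED,  /* Not reached yet.  */
  TEMPORARY,  /* On the current DFS path; seeing it again means a cycle.  */
  PERMANENT   /* Fully processed and already emitted.  */
};

struct node
{
  std::vector<std::size_t> deps;  /* Indices of nodes that must come first.  */
  tsort_mark mark = UNVISITED;
};

/* Visit node N depth-first, appending to OUT.  Return false on a cycle.  */
static bool
visit (std::vector<node> &g, std::size_t n, std::vector<std::size_t> &out)
{
  if (g[n].mark == PERMANENT)
    return true;
  if (g[n].mark == TEMPORARY)
    return false;
  g[n].mark = TEMPORARY;
  for (std::size_t d : g[n].deps)
    if (!visit (g, d, out))
      return false;
  g[n].mark = PERMANENT;
  out.push_back (n);
  return true;
}

/* Sort all of G so that dependencies precede their dependents.  */
bool
tsort (std::vector<node> &g, std::vector<std::size_t> &out)
{
  out.clear ();
  for (std::size_t i = 0; i < g.size (); i++)
    if (!visit (g, i, out))
      return false;
  return true;
}

void
check_tsort ()
{
  std::vector<node> g (3);
  g[0].deps.push_back (1);
  g[1].deps.push_back (2);
  std::vector<std::size_t> out;
  assert (tsort (g, out));
  assert (out.size () == 3 && out[0] == 2 && out[1] == 1 && out[2] == 0);
}
```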

> +enum omp_tsort_mark {
> +  UNVISITED,
> +  TEMPORARY,
> +  PERMANENT
> +};
> +
> +struct omp_mapping_group {
> +  tree *grp_start;
> +  tree grp_end;
> +  omp_tsort_mark mark;
> +  struct omp_mapping_group *sibling;
> +  struct omp_mapping_group *next;
> +};
> +
> +__attribute__((used)) static void

I'd use what is used elsewhere,
DEBUG_FUNCTION void
without static.

> +debug_mapping_group (omp_mapping_group *grp)
> +{
> +  tree tmp = OMP_CLAUSE_CHAIN (grp->grp_end);
> +  OMP_CLAUSE_CHAIN (grp->grp_end) = NULL;
> +  debug_generic_expr (*grp->grp_start);
> +  OMP_CLAUSE_CHAIN (grp->grp_end) = tmp;
> +}
> +
> +/* Return the OpenMP "base pointer" of an expression EXPR, or NULL if there
> +   isn't one.  This needs improvement.  */
> +
> +static tree
> +omp_get_base_pointer (tree expr)
> +{
> +  while (TREE_CODE (expr) == ARRAY_REF)
> +expr = TREE_OPERAND (expr, 0);
> +
> +  while (TREE_CODE (expr) == COMPONENT_REF
> +  && (DECL_P (TREE_OPERAND (expr, 0))
> +  || (TREE_CODE (TREE_OPERAND (expr, 0)) == COMPONENT_REF)
> +  || TREE_CODE (TREE_OPERAND (expr, 0)) == INDIRECT_REF
> +  || (TREE_CODE (TREE_OPERAND (expr, 0)) == MEM_REF
> +  && integer_zerop (TREE_OPERAND (TREE_OPERAND (expr, 0), 1)))
> +  || TREE_CODE (TREE_OPERAND (expr, 0)) == ARRAY_REF))
> +{
> +  expr = TREE_OPERAND (expr, 0);
> +
> +  while (TREE_CODE (expr) == ARRAY_REF)
> + expr = TREE_OPERAND (expr, 0);
> +
> +  if (TREE_CODE (expr) == INDIRECT_REF || TREE_CODE (expr) == MEM_REF)
> + break;
> +}

I must say I don't see the advantage of this over just a single loop that
looks through all ARRAY_REFs and all COMPONENT_REFs and then just
checks if the expr it got in the end is a decl or INDIRECT_REF
or MEM_REF with offset 0.
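
The single-loop shape could look roughly like the following.  This is a toy
model, not GCC code: the expr struct and the code enumerators only imitate
tree nodes, and the function names are hypothetical.

```cpp
#include <cassert>

enum code { DECL, ARRAY_REF, COMPONENT_REF, INDIRECT_REF, MEM_REF, OTHER };

struct expr
{
  code kind;
  expr *op0;    /* Operand 0, or nullptr for a leaf.  */
  long offset;  /* Only meaningful for MEM_REF.  */
};

/* Peel every ARRAY_REF and COMPONENT_REF in one loop.  */
expr *
strip_handled_components (expr *e)
{
  while (e->kind == ARRAY_REF || e->kind == COMPONENT_REF)
    e = e->op0;
  return e;
}

/* Check what is left once, after the loop.  */
bool
base_is_acceptable (expr *e)
{
  e = strip_handled_components (e);
  return (e->kind == DECL
          || e->kind == INDIRECT_REF
          || (e->kind == MEM_REF && e->offset == 0));
}

void
check_strip ()
{
  expr d = { DECL, nullptr, 0 };
  expr c = { COMPONENT_REF, &d, 0 };
  expr a = { ARRAY_REF, &c, 0 };
  assert (strip_handled_components (&a) == &d);
  assert (base_is_acceptable (&a));
}
```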

> +  if (DECL_P (expr))
> +return NULL_TREE;
> +
> +  if (TREE_CODE (expr) == INDIRECT_REF
> +  || TREE_CODE (expr) == MEM_REF)
> +{
> +  expr = TREE_OPERAND (expr, 0);
> +  while (TREE_CODE (expr) == COMPOUND_EXPR)
> + expr = TREE_OPERAND (expr, 1);
> +  if (TREE_CODE (expr) == POINTER_PLUS_EXPR)
> + expr = TREE_OPERAND (expr, 0);
> +  if (TREE_CODE (expr) == SAVE_EXPR)
> + expr = TREE_OPERAND (expr, 0);
> +  STRIP_NOPS (expr);
> +  return expr;
> +}
> +
> +  return NULL_TREE;
> +}
> +

> +static tree
> +omp_containing_struct (tree expr)
> +{
> +  tree expr0 = expr;
> +
> +  STRIP_NOPS (expr);
> +
> +  tree expr1 = expr;
> +
> +  /* FIXME: other types of accessors?  */
> +  while (TREE_CODE (expr) == ARRAY_REF)
> +expr = TREE_OPERAND (expr, 0);
> +
> +  if (TREE_CODE (expr) == COMPONENT_REF)
> +{
> +  if (DECL_P (TREE_OPERAND (expr, 0))
> +   || TREE_CODE (TREE_OPERAND (expr, 0)) == COMPONENT_REF
> +   || TREE_CODE (TREE_OPERAND (expr, 0)) == INDIRECT_REF
> +   || (TREE_CODE (TREE_OPERAND (expr, 0)) == MEM_REF
> +   && integer_zerop (TREE_OPERAND (TREE_OPERAND (expr, 0), 1)))
> +   || TREE_CODE (TREE_OPERAND (expr, 0)) == ARRAY_REF)
> + expr = TREE_OPERAND (expr, 0);
> +  else
> + internal_error ("unhandled component");
> +}

Again?

> @@ -9063,11 +9820,29 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq 
> *pre_p,
>   break;
>}
>  
> -  if (code == OMP_TARGET
> -  || code == OMP_TARGET_DATA
> -  || code == OMP_TARGET_ENTER_DATA
> -  || code == OMP_TARGET_EXIT_DATA)
> -omp_target_reorder_clauses (list_p);
> +  /* Topological sorting may fail if we have duplicate nodes, which
> + we should have detected and shown an error for already.  Skip
> + sorting in that case.  */
> +  if (!seen_error ()
> +  && (code == OMP_TARGET
> +   || code == OMP_TARGET_DATA
> +   || code == OMP_TARGET_ENTER_DATA
> +   || code == OMP_TARGET_EXIT_DATA))
> +{
> +  vec *groups;
> +  groups = 

Re: [PATCH V4 0/3] RISC-V:Add mininal support for Zicbo[mzp]

2022-05-24 Thread Kito Cheng via Gcc-patches
Committed with a few minor style fixes, thanks!

On Tue, May 10, 2022 at 11:26 AM  wrote:
>
> From: yulong 
>
> This patchset adds support for three recently ratified RISC-V extensions:
>
> -   Zicbom (Cache-Block Management Instructions)
> -   Zicbop (Cache-Block Prefetch hint instructions)
> -   Zicboz (Cache-Block Zero Instructions)
>
> Patch 1: Add Zicbom/z/p mininal support
> Patch 2: Add Zicbom/z/p instructions arch support
> Patch 3: Add Zicbom/z/p instructions testcases
>
> diff with the previous version:
> We use unspec_volatile instead of unspec for those cache operations, and move 
> those UNSPEC from unspec to unspecv.
>
> yulong (3):
>   RISC-V: Add mininal support for Zicbo[mzp]
>   RISC-V:Cache Management Operation instructions
>   RISC-V:Cache Management Operation instructions testcases
>
>  gcc/common/config/riscv/riscv-common.cc   |  8 +++
>  gcc/config/riscv/predicates.md|  4 ++
>  gcc/config/riscv/riscv-builtins.cc| 16 ++
>  gcc/config/riscv/riscv-cmo.def| 17 +++
>  gcc/config/riscv/riscv-ftypes.def |  4 ++
>  gcc/config/riscv/riscv-opts.h |  8 +++
>  gcc/config/riscv/riscv.md | 51 +++
>  gcc/config/riscv/riscv.opt|  3 ++
>  gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c | 21 
>  gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c | 21 
>  gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c | 23 +
>  gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c | 23 +
>  gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c |  9 
>  gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c |  9 
>  14 files changed, 217 insertions(+)
>  create mode 100644 gcc/config/riscv/riscv-cmo.def
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicbom-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicbom-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicbop-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicbop-2.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicboz-1.c
>  create mode 100644 gcc/testsuite/gcc.target/riscv/cmo-zicboz-2.c
>
> --
> 2.17.1
>


Re: [PATCH 05/10] d: add 'final' and 'override' to gcc/d/*.cc 'visit' impls

2022-05-24 Thread Iain Buclaw via Gcc-patches
Excerpts from David Malcolm via Gcc-patches's message of May 23, 2022 9:28 pm:
> gcc/d/ChangeLog:
>   * decl.cc: Add "final" and "override" to all "visit" vfunc decls
>   as appropriate.
>   * expr.cc: Likewise.
>   * toir.cc: Likewise.
>   * typeinfo.cc: Likewise.
>   * types.cc: Likewise.
> 
> Signed-off-by: David Malcolm 


Thanks David!

Looks OK to me.

Iain.


> ---
>  gcc/d/decl.cc | 36 +-
>  gcc/d/expr.cc |  2 +-
>  gcc/d/toir.cc | 64 +++
>  gcc/d/typeinfo.cc | 34 -
>  gcc/d/types.cc| 30 +++---
>  5 files changed, 83 insertions(+), 83 deletions(-)
> 
> diff --git a/gcc/d/decl.cc b/gcc/d/decl.cc
> index f5c21078aad..5d850065bf0 100644
> --- a/gcc/d/decl.cc
> +++ b/gcc/d/decl.cc
> @@ -149,13 +149,13 @@ public:
>  
>/* This should be overridden by each declaration class.  */
>  
> -  void visit (Dsymbol *)
> +  void visit (Dsymbol *) final override
>{
>}
>  
>/* Compile a D module, and all members of it.  */
>  
> -  void visit (Module *d)
> +  void visit (Module *d) final override
>{
>  if (d->semanticRun >= PASS::obj)
>return;
> @@ -166,7 +166,7 @@ public:
>  
>/* Write the imported symbol to debug.  */
>  
> -  void visit (Import *d)
> +  void visit (Import *d) final override
>{
>  if (d->semanticRun >= PASS::obj)
>return;
> @@ -218,7 +218,7 @@ public:
>  
>/* Expand any local variables found in tuples.  */
>  
> -  void visit (TupleDeclaration *d)
> +  void visit (TupleDeclaration *d) final override
>{
>  for (size_t i = 0; i < d->objects->length; i++)
>{
> @@ -234,7 +234,7 @@ public:
>  
>/* Walk over all declarations in the attribute scope.  */
>  
> -  void visit (AttribDeclaration *d)
> +  void visit (AttribDeclaration *d) final override
>{
>  Dsymbols *ds = d->include (NULL);
>  
> @@ -248,7 +248,7 @@ public:
>/* Pragmas are a way to pass special information to the compiler and to add
>   vendor specific extensions to D.  */
>  
> -  void visit (PragmaDeclaration *d)
> +  void visit (PragmaDeclaration *d) final override
>{
>  if (d->ident == Identifier::idPool ("lib")
>   || d->ident == Identifier::idPool ("startaddress"))
> @@ -266,7 +266,7 @@ public:
>/* Conditional compilation is the process of selecting which code to 
> compile
>   and which code to not compile.  Look for version conditions that may  */
>  
> -  void visit (ConditionalDeclaration *d)
> +  void visit (ConditionalDeclaration *d) final override
>{
>  bool old_condition = this->in_version_unittest_;
>  
> @@ -284,7 +284,7 @@ public:
>  
>/* Walk over all members in the namespace scope.  */
>  
> -  void visit (Nspace *d)
> +  void visit (Nspace *d) final override
>{
>  if (isError (d) || !d->members)
>return;
> @@ -298,7 +298,7 @@ public:
>   voldemort type, then it's members must be compiled before the parent
>   function finishes.  */
>  
> -  void visit (TemplateDeclaration *d)
> +  void visit (TemplateDeclaration *d) final override
>{
>  /* Type cannot be directly named outside of the scope it's declared in, 
> so
> the only way it can be escaped is if the function has auto return.  */
> @@ -329,7 +329,7 @@ public:
>  
>/* Walk over all members in the instantiated template.  */
>  
> -  void visit (TemplateInstance *d)
> +  void visit (TemplateInstance *d) final override
>{
>  if (isError (d)|| !d->members)
>return;
> @@ -343,7 +343,7 @@ public:
>  
>/* Walk over all members in the mixin template scope.  */
>  
> -  void visit (TemplateMixin *d)
> +  void visit (TemplateMixin *d) final override
>{
>  if (isError (d)|| !d->members)
>return;
> @@ -355,7 +355,7 @@ public:
>/* Write out compiler generated TypeInfo, initializer and functions for the
>   given struct declaration, walking over all static members.  */
>  
> -  void visit (StructDeclaration *d)
> +  void visit (StructDeclaration *d) final override
>{
>  if (d->semanticRun >= PASS::obj)
>return;
> @@ -470,7 +470,7 @@ public:
>/* Write out compiler generated TypeInfo, initializer and vtables for the
>   given class declaration, walking over all static members.  */
>  
> -  void visit (ClassDeclaration *d)
> +  void visit (ClassDeclaration *d) final override
>{
>  if (d->semanticRun >= PASS::obj)
>return;
> @@ -544,7 +544,7 @@ public:
>/* Write out compiler generated TypeInfo and vtables for the given 
> interface
>   declaration, walking over all static members.  */
>  
> -  void visit (InterfaceDeclaration *d)
> +  void visit (InterfaceDeclaration *d) final override
>{
>  if (d->semanticRun >= PASS::obj)
>return;
> @@ -587,7 +587,7 @@ public:
>/* Write out compiler generated TypeInfo and initializer for the given
>   enum declaration.  */
> 

Re: [PATCH] c++: fix ICE on invalid attributes [PR96637]

2022-05-24 Thread Marek Polacek via Gcc-patches
On Tue, May 24, 2022 at 05:29:56PM +0530, Prathamesh Kulkarni wrote:
> On Fri, 29 Apr 2022 at 19:44, Marek Polacek via Gcc-patches
>  wrote:
> >
> > This patch fixes crashes with invalid attributes.  Arguably it could
> > make sense to assert seen_error() too.
> >
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk = GCC 13?
> >
> > PR c++/96637
> >
> > gcc/ChangeLog:
> >
> > * attribs.cc (decl_attributes): Check error_mark_node.
> >
> > gcc/cp/ChangeLog:
> >
> > * decl2.cc (cp_check_const_attributes): Check error_mark_node.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * g++.dg/parse/error64.C: New test.
> > ---
> >  gcc/attribs.cc   | 3 +++
> >  gcc/cp/decl2.cc  | 2 ++
> >  gcc/testsuite/g++.dg/parse/error64.C | 4 
> >  3 files changed, 9 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.dg/parse/error64.C
> >
> > diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> > index b219f878042..ff157dcf81c 100644
> > --- a/gcc/attribs.cc
> > +++ b/gcc/attribs.cc
> > @@ -700,6 +700,9 @@ decl_attributes (tree *node, tree attributes, int flags,
> >   in the same order as in the source.  */
> >for (tree attr = attributes; attr; attr = TREE_CHAIN (attr))
> >  {
> > +  if (attr == error_mark_node)
> > +   continue;
> Not a comment on the patch specifically, but just wondering if it'd be
> better to use error_operand_p,
> than testing against error_mark_node explicitly ?

Not here, I don't think; it tests more than is needed here.

Marek
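
To illustrate the distinction discussed above, here is a simplified model.
It assumes error_operand_p is true both for error_mark_node itself and for
any node whose type is error_mark_node; the struct and the *_model names
are hypothetical and only mimic the shape of the real tree API.

```cpp
#include <cassert>

// Hypothetical, much-reduced model of a tree node.
struct node
{
  node *type;   // stands in for TREE_TYPE
  node *chain;  // stands in for TREE_CHAIN
};

// Stand-in for the global error_mark_node sentinel.
static node error_mark = { nullptr, nullptr };
static node *const error_mark_node_model = &error_mark;

// Mirrors the assumed shape of error_operand_p: true for the sentinel
// and also for any node whose type is the sentinel.
inline bool
error_operand_model_p (const node *t)
{
  return t == error_mark_node_model
         || (t && t->type == error_mark_node_model);
}

// Count the list elements a loop like the patched one would process:
// only the sentinel itself is skipped.
inline int
count_processed (node *list)
{
  int n = 0;
  for (node *attr = list; attr; attr = attr->chain)
    {
      if (attr == error_mark_node_model)
        continue;
      ++n;
    }
  return n;
}
```

In this model a node with an erroneous type is still processed by the
loop, while error_operand_model_p would reject it; that is the sense in
which the broader predicate "tests more than is needed".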



[PATCH][wwwdocs] Document ASAN changes for GCC 13.

2022-05-24 Thread Martin Liška
Ready to be installed?

Thanks,
Martin

---
 htdocs/gcc-13/changes.html | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 6c5b2a37..f7f6866d 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -47,6 +47,9 @@ a work-in-progress.
 non-rectangular loop nests, which were added for C/C++ in GCC 11.
   
   
+AddressSanitizer defaults to detect_stack_use_after_return=1 on Linux target.
+For compatibility, it can be disabled with env ASAN_OPTIONS=detect_stack_use_after_return=0.
+  
 
 
 
-- 
2.36.1



Re: [PATCH v5] c++: ICE with temporary of class type in DMI [PR100252]

2022-05-24 Thread Jason Merrill via Gcc-patches

On 5/16/22 11:36, Marek Polacek wrote:

On Sat, May 14, 2022 at 11:13:28PM -0400, Jason Merrill wrote:

On 5/13/22 19:41, Marek Polacek wrote:

--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -1371,6 +1371,70 @@ digest_init_flags (tree type, tree init, int flags, 
tsubst_flags_t complain)
 return digest_init_r (type, init, 0, flags, complain);
   }
+/* Return true if a prvalue is used as an initializer rather than for
+   temporary materialization.  For instance:


I might say "if SUBOB initializes the same object as FULL_EXPR"; the
full-expression could still end up initializing a temporary.


Fixed.
  

+ A a = A{};  // initializer
+ A a = (A{});// initializer
+ A a = (1, A{}); // initializer
+ A a = true ? A{} : A{};  // initializer
+ auto x = A{}.x; // temporary materialization
+ auto x = foo(A{});  // temporary materialization
+
+   FULL_EXPR is the whole expression, SUBOB is its TARGET_EXPR subobject.  */
+
+static bool
+potential_prvalue_result_of (tree subob, tree full_expr)
+{
+  if (subob == full_expr)
+return true;
+  else if (TREE_CODE (full_expr) == TARGET_EXPR)
+{
+  tree init = TARGET_EXPR_INITIAL (full_expr);
+  if (TREE_CODE (init) == COND_EXPR)
+   return (potential_prvalue_result_of (subob, TREE_OPERAND (init, 1))
+   || potential_prvalue_result_of (subob, TREE_OPERAND (init, 2)));
+  else if (TREE_CODE (init) == COMPOUND_EXPR)
+   return (potential_prvalue_result_of (subob, TREE_OPERAND (init, 0))


We shouldn't recurse into the LHS of the comma, only the RHS.


Fixed.
  

+   || potential_prvalue_result_of (subob, TREE_OPERAND (init, 1)));
+  /* ??? I don't know if this can be hit.  If so, look inside the ( )
+instead of the assert.  */
+  else if (TREE_CODE (init) == PAREN_EXPR)
+   gcc_checking_assert (false);


It seems trivial enough to recurse after the assert, in case it does happen
in the wild.


OK, adjusted.  Thanks!

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

-- >8 --
Consider

   struct A {
 int x;
 int y = x;
   };

   struct B {
 int x = 0;
 int y = A{x}.y; // #1
   };

where for #1 we end up with

   {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}

that is, two PLACEHOLDER_EXPRs for different types on the same level in
a {}.  This crashes because our CONSTRUCTOR_PLACEHOLDER_BOUNDARY mechanism to
avoid replacing unrelated PLACEHOLDER_EXPRs cannot deal with it.

Here's why we wound up with those PLACEHOLDER_EXPRs: When we're performing
cp_parser_late_parsing_nsdmi for "int y = A{x}.y;" we use 
finish_compound_literal
on type=A, compound_literal={((struct B *) this)->x}.  When digesting this
initializer, we call get_nsdmi which creates a PLACEHOLDER_EXPR for A -- we 
don't
have any object to refer to yet.  After digesting, we have

   {.x=((struct B *) this)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}

and since we've created a PLACEHOLDER_EXPR inside it, we marked the whole ctor
CONSTRUCTOR_PLACEHOLDER_BOUNDARY.  f_c_l creates a TARGET_EXPR and returns

   TARGET_EXPR <D.2384, {.x=((struct B *) this)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}>

Then we get to

   B b = {};

and call store_init_value, which digests the {}, which produces

   {.x=NON_LVALUE_EXPR <0>, .y=(TARGET_EXPR <D.2395, {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}>).y}

lookup_placeholder in constexpr won't find an object to replace the
PLACEHOLDER_EXPR for B, because ctx->object will be D.2395 of type A, and we
cannot search outward from D.2395 to find 'b'.

The call to replace_placeholders in store_init_value will not do anything:
we've marked the inner { } CONSTRUCTOR_PLACEHOLDER_BOUNDARY, and it's only
a sub-expression, so replace_placeholders does nothing, so the
PLACEHOLDER_EXPR for B stays even though now is the perfect time to replace
it because we have an object for it: 'b'.

Later, in cp_gimplify_init_expr the *expr_p is

   D.2395 = {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}

where D.2395 is of type A, but we crash because we hit the PLACEHOLDER_EXPR
for B, which has a different type.

My idea was to replace the PLACEHOLDER_EXPR for A with D.2384 after creating
the TARGET_EXPR because that means we have an object we can refer to.
Then clear CONSTRUCTOR_PLACEHOLDER_BOUNDARY because we no longer have
a PLACEHOLDER_EXPR in the {}.  Then store_init_value will be able to
replace the PLACEHOLDER_EXPR for B with 'b', and we should be good to go.  We must
be careful not to break guaranteed copy elision, so this replacement
happens in digest_nsdmi_init where we can see the whole initializer,
and avoid replacing any placeholders in TARGET_EXPRs used in the context
of initialization/copy elision.  This is achieved via the new function
called potential_prvalue_result_of.

While fixing this problem, I found PR105550, thus the FIXMEs in the
tests.
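
As a rough illustration of the initializer vs. temporary materialization
split that potential_prvalue_result_of is described as detecting, the
sketch below counts copy/move constructor calls.  It assumes C++17 or
later (guaranteed copy elision); the class and function names are
hypothetical and this is not the testcase from the patch.

```cpp
#include <cassert>

// Hypothetical class that records every copy or move construction.
struct A
{
  static int copies_or_moves;
  int x = 7;
  A () = default;
  A (const A &o) : x (o.x) { ++copies_or_moves; }
  A (A &&o) : x (o.x) { ++copies_or_moves; }
};
int A::copies_or_moves = 0;

// Every prvalue here is used as an initializer: under C++17 rules it
// initializes the declared object directly, with no copy or move.
inline int
count_for_initializers ()
{
  A::copies_or_moves = 0;
  A a1 = A{};
  A a2 = (A{});
  A a3 = (static_cast<void> (0), A{});
  A a4 = true ? A{} : A{};
  (void) a1; (void) a2; (void) a3; (void) a4;
  return A::copies_or_moves;
}

// Here the prvalue is materialized as a temporary so that a member can
// be read from it; the temporary is a separate object from the result.
inline int
read_member_of_temporary ()
{
  return A{}.x;
}
```

The first function mirrors the "initializer" rows of the comment in the
patch, the second the "temporary materialization" rows; only in the first
group does the prvalue denote the same object as the full-expression.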

PR c++/100252

gcc/cp/ChangeLog:

* typeck2.cc (potential_prvalue_result_of): New.
(replace_placeholders_for_class_temp_r): New.
(digest_nsdmi_init): Call it.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1y/nsdmi-aggr14.C: New test.
* g++.dg/cpp1y/nsdmi-aggr15.C: New test.
* 

Re: [PATCH] c++: fix ICE on invalid attributes [PR96637]

2022-05-24 Thread Jason Merrill via Gcc-patches

On 4/29/22 10:12, Marek Polacek wrote:

This patch fixes crashes with invalid attributes.  Arguably it could
make sense to assert seen_error() too.


So in this testcase we have TREE_CHAIN of a TREE_LIST pointing to 
error_mark_node?  Can we avoid that?



Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk = GCC 13?

PR c++/96637

gcc/ChangeLog:

* attribs.cc (decl_attributes): Check error_mark_node.

gcc/cp/ChangeLog:

* decl2.cc (cp_check_const_attributes): Check error_mark_node.

gcc/testsuite/ChangeLog:

* g++.dg/parse/error64.C: New test.
---
  gcc/attribs.cc   | 3 +++
  gcc/cp/decl2.cc  | 2 ++
  gcc/testsuite/g++.dg/parse/error64.C | 4 
  3 files changed, 9 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/parse/error64.C

diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index b219f878042..ff157dcf81c 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -700,6 +700,9 @@ decl_attributes (tree *node, tree attributes, int flags,
   in the same order as in the source.  */
for (tree attr = attributes; attr; attr = TREE_CHAIN (attr))
  {
+  if (attr == error_mark_node)
+   continue;
+
tree ns = get_attribute_namespace (attr);
tree name = get_attribute_name (attr);
tree args = TREE_VALUE (attr);
diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index d2b29208ed5..c3ff1962a75 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -1537,6 +1537,8 @@ cp_check_const_attributes (tree attributes)
/* As we implement alignas using gnu::aligned attribute and
 alignas argument is a constant expression, force manifestly
 constant evaluation of aligned attribute argument.  */
+  if (attr == error_mark_node)
+   continue;
bool manifestly_const_eval
= is_attribute_p ("aligned", get_attribute_name (attr));
for (arg = TREE_VALUE (attr); arg && TREE_CODE (arg) == TREE_LIST;
diff --git a/gcc/testsuite/g++.dg/parse/error64.C 
b/gcc/testsuite/g++.dg/parse/error64.C
new file mode 100644
index 000..87848a58c27
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/error64.C
@@ -0,0 +1,4 @@
+// PR c++/96637
+// { dg-do compile }
+
+void foo(int[] alignas[1] alignas(1)){} // { dg-error "" }

base-commit: 9ae8b993cd362e8aea4f65580aaf1453120207f2




Re: [PATCH] c++: set TYPE_CANONICAL for most templated types

2022-05-24 Thread Jason Merrill via Gcc-patches

On 5/23/22 16:25, Patrick Palka wrote:

On 5/18/22, Jason Merrill wrote:

On 5/16/22 15:58, Patrick Palka wrote:

When processing a class template specialization, lookup_template_class
uses structural equality for the specialized type whenever one of its
template arguments uses structural equality.  This is the sensible thing to
do in a vacuum, but given that we already effectively deduplicate class
specializations via the spec_hasher, it seems to me we can safely assume
that each class specialization is unique and therefore canonical,
regardless of the structure of the template arguments.

Makes sense.


To that end this patch makes us use the canonical type machinery for all
type specializations except for the case where a PARM_DECL appears in
the template arguments (added in r12-3766-g72394d38d929c7).

Additionally, this patch makes us use the canonical type machinery for
TEMPLATE_TEMPLATE_PARMs and BOUND_TEMPLATE_TEMPLATE_PARMs, by extending
canonical_type_parameter appropriately.  A comment in tsubst says it's
unsafe to set TYPE_CANONICAL for a lowered TEMPLATE_TEMPLATE_PARM, but
I'm not sure I understand it.

I think that comment from r120341 became obsolete when r129844 (later that
year) started to substitute the template parms of ttps.

Ah, I see.  I'll make note of this in the v2 commit message.


Note that r10-7817-ga6f400239d792d
recently changed process_template_parm to clear TYPE_CANONICAL for
TEMPLATE_TEMPLATE_PARM consistent with the tsubst comment; this patch
changes both functions to set instead of clear TYPE_CANONICAL for ttps.

This change improves compile time of heavily templated code by around 10%
for me (with a release compiler).  For instance, compile time for the
libstdc++ test std/ranges/adaptors/all.cc drops from 1.45s to 1.25s, and
for the range-v3 test test/view/zip.cpp it goes from 5.38s to 4.88s.
The total number of calls to structural_comptypes for the latter test
drops from 8.5M to 1.5M.  Memory use is unchanged (unsurprisingly).

Nice!
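
The general idea behind the speedup (deduplicate through a hash table
once, then compare canonical pointers instead of comparing structure each
time) can be sketched as follows.  This is a toy model under stated
assumptions: type_parm, parm_hash, parm_equal and canonical_table are
hypothetical names, and the real code hashes tree nodes with GCC's own
hash_table rather than std::unordered_set.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical, much-simplified stand-in for a template type parameter.
struct type_parm
{
  int level;
  int index;
};

struct parm_hash
{
  std::size_t operator() (const type_parm *t) const
  {
    // Combine the fields that define structural identity.
    std::size_t h = std::hash<int> () (t->level);
    return h * 31 + std::hash<int> () (t->index);
  }
};

struct parm_equal
{
  bool operator() (const type_parm *a, const type_parm *b) const
  {
    // Structural comparison; done once, when interning.
    return a->level == b->level && a->index == b->index;
  }
};

class canonical_table
{
  std::unordered_set<const type_parm *, parm_hash, parm_equal> table_;

public:
  // Return the canonical representative for T, registering T itself
  // if no structurally equal entry exists yet.
  const type_parm *canonicalize (const type_parm *t)
  {
    return *table_.insert (t).first;
  }
};
```

Once two parameters have been canonicalized, checking whether they are
the same type is a pointer comparison on the returned representatives,
which is the kind of saving reflected in the drop in structural
comparisons quoted above.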


Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  Also tested on cmcstl2 and range-v3 and various boost libraries.
Will also do more testing overnight...

One comment below.


gcc/cp/ChangeLog:

* pt.cc (any_template_arguments_need_structural_equality_p):
Remove.
(struct ctp_hasher): Define.
(ctp_table): Define.
(canonical_type_parameter): Use it.
(process_template_parm): Set TYPE_CANONICAL for
TEMPLATE_TEMPLATE_PARM too.
(lookup_template_class_1): Don't call a_t_a_n_s_e_p.  Inline
the PARM_DECL special case from that subroutine into here.
(tsubst) : Remove special
TYPE_CANONICAL handling specific to ttps, and perform the
remaining handling later.
(find_parm_usage_r): Remove.
* tree.cc (bind_template_template_parm): Set TYPE_CANONICAL
when safe to do so.
* typeck.cc (structural_comptypes) [check_alias]: Increment
processing_template_decl before using
dependent_alias_template_spec_p.
---
   gcc/cp/pt.cc | 166 ---
   gcc/cp/tree.cc   |  16 -
   gcc/cp/typeck.cc |   2 +
   3 files changed, 73 insertions(+), 111 deletions(-)

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index fa05e9134df..76562877355 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -203,7 +203,6 @@ static tree copy_default_args_to_explicit_spec_1 (tree,
tree);
   static void copy_default_args_to_explicit_spec (tree);
   static bool invalid_nontype_parm_type_p (tree, tsubst_flags_t);
   static bool dependent_template_arg_p (tree);
-static bool any_template_arguments_need_structural_equality_p (tree);
   static bool dependent_type_p_r (tree);
   static tree tsubst_copy  (tree, tree, tsubst_flags_t, tree);
   static tree tsubst_decl (tree, tree, tsubst_flags_t);
@@ -4526,6 +4525,27 @@ build_template_parm_index (int index,
 return t;
   }
+struct ctp_hasher : ggc_ptr_hash<tree_node>
+{
+  static hashval_t hash (tree t)
+  {
+tree_code code = TREE_CODE (t);
+hashval_t val = iterative_hash_object (code, 0);
+val = iterative_hash_object (TEMPLATE_TYPE_LEVEL (t), val);
+val = iterative_hash_object (TEMPLATE_TYPE_IDX (t), val);
+if (TREE_CODE (t) == BOUND_TEMPLATE_TEMPLATE_PARM)
+  val = iterative_hash_template_arg (TYPE_TI_ARGS (t), val);
+return val;
+  }
+
+  static bool equal (tree t, tree u)
+  {
+return comptypes (t, u, COMPARE_STRUCTURAL);
+  }
+};
+
+static GTY (()) hash_table<ctp_hasher> *ctp_table;
+
   /* Find the canonical type parameter for the given template type
  parameter.  Returns the canonical type parameter, which may be TYPE
  if no such parameter existed.  */
@@ -4533,21 +4553,13 @@ build_template_parm_index (int index,
   tree
   canonical_type_parameter (tree type)
   {
-  int idx = TEMPLATE_TYPE_IDX (type);
-
-  gcc_assert (TREE_CODE (type) != TEMPLATE_TEMPLATE_PARM);
+  if (ctp_table == NULL)
+ctp_table = 

[wwwdocs] Document changes in libstdc++

2022-05-24 Thread Jonathan Wakely via Gcc-patches
Pushed to wwwdocs.
commit f55f35c86c68143a2b148c66e4b0b560c852ce6f
Author: Jonathan Wakely 
Date:   Tue May 24 13:06:11 2022 +0100

Document  changes in libstdc++

diff --git a/htdocs/gcc-13/porting_to.html b/htdocs/gcc-13/porting_to.html
index b3e0895a..84a00f21 100644
--- a/htdocs/gcc-13/porting_to.html
+++ b/htdocs/gcc-13/porting_to.html
@@ -24,5 +24,23 @@ porting to GCC 13. This document is an effort to identify 
common issues
 and provide solutions. Let us know if you have suggestions for improvements!
 
 
+C++ language issues
+
+Header dependency changes
+Some C++ Standard Library headers have been changed to no longer include
+other headers that were being used internally by the library.
+As such, C++ programs that used standard library components without
+including the right headers will no longer compile.
+
+
+The following headers are used less widely in libstdc++ and may need to
+be included explicitly when compiled with GCC 13:
+
+
+ cstdint
+  (for std::int8_t, std::int32_t etc.)
+
+
+
 
 

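A short sketch of the kind of fix the porting note above asks for: code
that uses the fixed-width integer types includes <cstdint> itself rather
than relying on another standard header to pull it in.  The helper
saturating_add is a hypothetical example, not something from the patch.

```cpp
#include <cassert>
#include <cstdint>   // declares std::int32_t, std::int64_t, ...
#include <limits>

// Add two 32-bit values, clamping to the representable range.
inline std::int32_t
saturating_add (std::int32_t a, std::int32_t b)
{
  // Widen to 64 bits so the intermediate sum cannot overflow.
  std::int64_t sum = static_cast<std::int64_t> (a) + b;
  if (sum > std::numeric_limits<std::int32_t>::max ())
    return std::numeric_limits<std::int32_t>::max ();
  if (sum < std::numeric_limits<std::int32_t>::min ())
    return std::numeric_limits<std::int32_t>::min ();
  return static_cast<std::int32_t> (sum);
}
```

Without the explicit include, a translation unit like this may have
compiled with earlier releases only because some other header happened to
include <cstdint> internally.
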

Re: [PATCH] limits.h, syslimits.h: do not install to include-fixed

2022-05-24 Thread Martin Liška
On 5/9/22 11:02, Martin Liška wrote:
> |Ready to be installed?|

This patch should be replaced with:
https://gcc.gnu.org/pipermail/gcc-patches/2022-May/595495.html

Martin


[PATCH v2] Support --disable-fixincludes.

2022-05-24 Thread Martin Liška
On 5/20/22 14:42, Alexandre Oliva wrote:
> On May 11, 2022, Martin Liška  wrote:
> 
>> Ready to be installed?
> 
> Hmm...  I don't like that --disable-fixincludes would still configure,
> build and even install fixincludes.  This would be surprising, given
> that the semantics of disabling a component is to not even configure it.
> 
> How about leaving the top-level alone, and changing gcc/configure.ac to
> clear STMP_FIXINC when --disable-fixincludes is given?
> 

Sure, that's a good idea.

Always install limits.h and syslimits.h header files
to include folder.

When --disable-fixincludes is used, then no system header files
are fixed by the tools in fixincludes. Moreover, the fixincludes
tools are not built any longer.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

From ba9bed4512d73d34d4c9bf5830e758097d517bc3 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 24 May 2022 13:06:07 +0200
Subject: [PATCH] Support --disable-fixincludes.

Always install limits.h and syslimits.h header files
to include folder.

When --disable-fixincludes is used, then no system header files
are fixed by the tools in fixincludes. Moreover, the fixincludes
tools are not built any longer.

gcc/ChangeLog:

	* Makefile.in: Always install limits.h and syslimits.h to
	include folder.
	* configure.ac: Assign STMP_FIXINC blank if
	--disable-fixincludes is used.
	* configure: Regenerate.
---
 gcc/Makefile.in  | 22 --
 gcc/configure| 10 --
 gcc/configure.ac |  6 ++
 3 files changed, 22 insertions(+), 16 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 97e5450ecb5..3ab8e36e1ed 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3153,19 +3153,20 @@ stmp-int-hdrs: $(STMP_FIXINC) $(T_GLIMITS_H) $(T_STDINT_GCC_H) $(USER_H) fixinc_
 	set -e; for ml in `cat fixinc_list`; do \
 	  sysroot_headers_suffix=`echo $${ml} | sed -e 's/;.*$$//'`; \
 	  multi_dir=`echo $${ml} | sed -e 's/^[^;]*;//'`; \
-	  fix_dir=include-fixed$${multi_dir}; \
+	  include_dir=include$${multi_dir}; \
 	  if $(LIMITS_H_TEST) ; then \
 	cat $(srcdir)/limitx.h $(T_GLIMITS_H) $(srcdir)/limity.h > tmp-xlimits.h; \
 	  else \
 	cat $(T_GLIMITS_H) > tmp-xlimits.h; \
 	  fi; \
-	  $(mkinstalldirs) $${fix_dir}; \
-	  chmod a+rx $${fix_dir} || true; \
+	  $(mkinstalldirs) $${include_dir}; \
+	  chmod a+rx $${include_dir} || true; \
 	  $(SHELL) $(srcdir)/../move-if-change \
 	tmp-xlimits.h  tmp-limits.h; \
-	  rm -f $${fix_dir}/limits.h; \
-	  cp -p tmp-limits.h $${fix_dir}/limits.h; \
-	  chmod a+r $${fix_dir}/limits.h; \
+	  rm -f $${include_dir}/limits.h; \
+	  cp -p tmp-limits.h $${include_dir}/limits.h; \
+	  chmod a+r $${include_dir}/limits.h; \
+	  cp $(srcdir)/gsyslimits.h $${include_dir}/syslimits.h; \
 	done
 # Install the README
 	rm -f include-fixed/README
@@ -3255,13 +3256,6 @@ stmp-fixinc: gsyslimits.h macro_list fixinc_list \
 	  cd $(build_objdir)/fixincludes && \
 	  $(SHELL) ./fixinc.sh "$${gcc_dir}/$${fix_dir}" \
 	$(BUILD_SYSTEM_HEADER_DIR) $(OTHER_FIXINCLUDES_DIRS) ); \
-	rm -f $${fix_dir}/syslimits.h; \
-	if [ -f $${fix_dir}/limits.h ]; then \
-	  mv $${fix_dir}/limits.h $${fix_dir}/syslimits.h; \
-	else \
-	  cp $(srcdir)/gsyslimits.h $${fix_dir}/syslimits.h; \
-	fi; \
-	chmod a+r $${fix_dir}/syslimits.h; \
 	  done; \
 	fi
 	$(STAMP) stmp-fixinc
@@ -3979,7 +3973,7 @@ install-mkheaders: stmp-int-hdrs install-itoolsdirs \
 	set -e; for ml in `cat fixinc_list`; do \
 	  multi_dir=`echo $${ml} | sed -e 's/^[^;]*;//'`; \
 	  $(mkinstalldirs) $(DESTDIR)$(itoolsdatadir)/include$${multi_dir}; \
-	  $(INSTALL_DATA) include-fixed$${multi_dir}/limits.h $(DESTDIR)$(itoolsdatadir)/include$${multi_dir}/limits.h; \
+	  $(INSTALL_DATA) include$${multi_dir}/limits.h $(DESTDIR)$(itoolsdatadir)/include$${multi_dir}/limits.h; \
 	done
 	$(INSTALL_SCRIPT) $(srcdir)/../mkinstalldirs \
 		$(DESTDIR)$(itoolsdir)/mkinstalldirs ; \
diff --git a/gcc/configure b/gcc/configure
index 37e0dd5e414..711e8e9b559 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -13548,6 +13548,12 @@ then
 BUILD_LDFLAGS='$(LDFLAGS_FOR_BUILD)'
 fi
 
+
+if test x$enable_fixincludes = xno;
+then
+STMP_FIXINC=''
+fi
+
 # Expand extra_headers to include complete path.
 # This substitutes for lots of t-* files.
 extra_headers_list=
@@ -19674,7 +19680,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19676 "configure"
+#line 19683 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19780,7 +19786,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19782 "configure"
+#line 19789 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index 23bee7010a3..8a2dd5a193a 100644
--- a/gcc/configure.ac
+++ 

[DOCS][PATCH][PUSHED] GCC 13: come up with Porting to.

2022-05-24 Thread Martin Liška
And link it from changes.html.
---
 htdocs/gcc-13/changes.html|  2 --
 htdocs/gcc-13/porting_to.html | 28 
 2 files changed, 28 insertions(+), 2 deletions(-)
 create mode 100644 htdocs/gcc-13/porting_to.html

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 2d974ae5..6c5b2a37 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -17,11 +17,9 @@
 
 This page is a "brief" summary of some of the huge number of improvements
 in GCC 13.
-
 
 
 Note: GCC 13 has not been released yet, so this document is
diff --git a/htdocs/gcc-13/porting_to.html b/htdocs/gcc-13/porting_to.html
new file mode 100644
index ..b3e0895a
--- /dev/null
+++ b/htdocs/gcc-13/porting_to.html
@@ -0,0 +1,28 @@
+
+
+
+
+
+Porting to GCC 13
+https://gcc.gnu.org/gcc.css; />
+
+
+
+Porting to GCC 13
+
+
+The GCC 13 release series differs from previous GCC releases in
+a number of ways. Some of these are a result
+of bug fixing, and some old behaviors have been intentionally changed
+to support new standards, or relaxed in standards-conforming ways to
+facilitate compilation or run-time performance.
+
+
+
+Some of these changes are user visible and can cause grief when
+porting to GCC 13. This document is an effort to identify common issues
+and provide solutions. Let us know if you have suggestions for improvements!
+
+
+
+
-- 
2.36.1



Re: [PATCH] c++: fix ICE on invalid attributes [PR96637]

2022-05-24 Thread Prathamesh Kulkarni via Gcc-patches
On Fri, 29 Apr 2022 at 19:44, Marek Polacek via Gcc-patches
 wrote:
>
> This patch fixes crashes with invalid attributes.  Arguably it could
> make sense to assert seen_error() too.
>
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk = GCC 13?
>
> PR c++/96637
>
> gcc/ChangeLog:
>
> * attribs.cc (decl_attributes): Check error_mark_node.
>
> gcc/cp/ChangeLog:
>
> * decl2.cc (cp_check_const_attributes): Check error_mark_node.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/parse/error64.C: New test.
> ---
>  gcc/attribs.cc   | 3 +++
>  gcc/cp/decl2.cc  | 2 ++
>  gcc/testsuite/g++.dg/parse/error64.C | 4 
>  3 files changed, 9 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/parse/error64.C
>
> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index b219f878042..ff157dcf81c 100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -700,6 +700,9 @@ decl_attributes (tree *node, tree attributes, int flags,
>   in the same order as in the source.  */
>for (tree attr = attributes; attr; attr = TREE_CHAIN (attr))
>  {
> +  if (attr == error_mark_node)
> +   continue;
Not a comment on the patch specifically, but just wondering if it'd be
better to use error_operand_p than to test against error_mark_node
explicitly?

Thanks,
Prathamesh
> +
>tree ns = get_attribute_namespace (attr);
>tree name = get_attribute_name (attr);
>tree args = TREE_VALUE (attr);
> diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> index d2b29208ed5..c3ff1962a75 100644
> --- a/gcc/cp/decl2.cc
> +++ b/gcc/cp/decl2.cc
> @@ -1537,6 +1537,8 @@ cp_check_const_attributes (tree attributes)
>/* As we implement alignas using gnu::aligned attribute and
>  alignas argument is a constant expression, force manifestly
>  constant evaluation of aligned attribute argument.  */
> +  if (attr == error_mark_node)
> +   continue;
>bool manifestly_const_eval
> = is_attribute_p ("aligned", get_attribute_name (attr));
>for (arg = TREE_VALUE (attr); arg && TREE_CODE (arg) == TREE_LIST;
> diff --git a/gcc/testsuite/g++.dg/parse/error64.C 
> b/gcc/testsuite/g++.dg/parse/error64.C
> new file mode 100644
> index 000..87848a58c27
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/parse/error64.C
> @@ -0,0 +1,4 @@
> +// PR c++/96637
> +// { dg-do compile }
> +
> +void foo(int[] alignas[1] alignas(1)){} // { dg-error "" }
>
> base-commit: 9ae8b993cd362e8aea4f65580aaf1453120207f2
> --
> 2.35.1
>


Re: [PATCH] c++: fix ICE on invalid attributes [PR96637]

2022-05-24 Thread Marek Polacek via Gcc-patches
Ping.

On Fri, Apr 29, 2022 at 10:12:33AM -0400, Marek Polacek via Gcc-patches wrote:
> This patch fixes crashes with invalid attributes.  Arguably it could
> make sense to assert seen_error() too.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk = GCC 13?
> 
>   PR c++/96637
> 
> gcc/ChangeLog:
> 
>   * attribs.cc (decl_attributes): Check error_mark_node.
> 
> gcc/cp/ChangeLog:
> 
>   * decl2.cc (cp_check_const_attributes): Check error_mark_node.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/parse/error64.C: New test.
> ---
>  gcc/attribs.cc   | 3 +++
>  gcc/cp/decl2.cc  | 2 ++
>  gcc/testsuite/g++.dg/parse/error64.C | 4 
>  3 files changed, 9 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/parse/error64.C
> 
> diff --git a/gcc/attribs.cc b/gcc/attribs.cc
> index b219f878042..ff157dcf81c 100644
> --- a/gcc/attribs.cc
> +++ b/gcc/attribs.cc
> @@ -700,6 +700,9 @@ decl_attributes (tree *node, tree attributes, int flags,
>   in the same order as in the source.  */
>for (tree attr = attributes; attr; attr = TREE_CHAIN (attr))
>  {
> +  if (attr == error_mark_node)
> + continue;
> +
>tree ns = get_attribute_namespace (attr);
>tree name = get_attribute_name (attr);
>tree args = TREE_VALUE (attr);
> diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
> index d2b29208ed5..c3ff1962a75 100644
> --- a/gcc/cp/decl2.cc
> +++ b/gcc/cp/decl2.cc
> @@ -1537,6 +1537,8 @@ cp_check_const_attributes (tree attributes)
>/* As we implement alignas using gnu::aligned attribute and
>alignas argument is a constant expression, force manifestly
>constant evaluation of aligned attribute argument.  */
> +  if (attr == error_mark_node)
> + continue;
>bool manifestly_const_eval
>   = is_attribute_p ("aligned", get_attribute_name (attr));
>for (arg = TREE_VALUE (attr); arg && TREE_CODE (arg) == TREE_LIST;
> diff --git a/gcc/testsuite/g++.dg/parse/error64.C 
> b/gcc/testsuite/g++.dg/parse/error64.C
> new file mode 100644
> index 000..87848a58c27
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/parse/error64.C
> @@ -0,0 +1,4 @@
> +// PR c++/96637
> +// { dg-do compile }
> +
> +void foo(int[] alignas[1] alignas(1)){} // { dg-error "" }
> 
> base-commit: 9ae8b993cd362e8aea4f65580aaf1453120207f2
> -- 
> 2.35.1
> 

Marek



Re: [PATCH v5] c++: ICE with temporary of class type in DMI [PR100252]

2022-05-24 Thread Marek Polacek via Gcc-patches
Ping.

On Mon, May 16, 2022 at 11:36:27AM -0400, Marek Polacek wrote:
> On Sat, May 14, 2022 at 11:13:28PM -0400, Jason Merrill wrote:
> > On 5/13/22 19:41, Marek Polacek wrote:
> > > --- a/gcc/cp/typeck2.cc
> > > +++ b/gcc/cp/typeck2.cc
> > > @@ -1371,6 +1371,70 @@ digest_init_flags (tree type, tree init, int 
> > > flags, tsubst_flags_t complain)
> > > return digest_init_r (type, init, 0, flags, complain);
> > >   }
> > > +/* Return true if a prvalue is used as an initializer rather than for
> > > +   temporary materialization.  For instance:
> > 
> > I might say "if SUBOB initializes the same object as FULL_EXPR"; the
> > full-expression could still end up initializing a temporary.
> 
> Fixed.
>  
> > > + A a = A{};// initializer
> > > + A a = (A{});  // initializer
> > > + A a = (1, A{});   // initializer
> > > + A a = true ? A{} : A{};  // initializer
> > > + auto x = A{}.x;   // temporary materialization
> > > + auto x = foo(A{});// temporary materialization
> > > +
> > > +   FULL_EXPR is the whole expression, SUBOB is its TARGET_EXPR 
> > > subobject.  */
> > > +
> > > +static bool
> > > +potential_prvalue_result_of (tree subob, tree full_expr)
> > > +{
> > > +  if (subob == full_expr)
> > > +return true;
> > > +  else if (TREE_CODE (full_expr) == TARGET_EXPR)
> > > +{
> > > +  tree init = TARGET_EXPR_INITIAL (full_expr);
> > > +  if (TREE_CODE (init) == COND_EXPR)
> > > + return (potential_prvalue_result_of (subob, TREE_OPERAND (init, 1))
> > > + || potential_prvalue_result_of (subob, TREE_OPERAND (init, 2)));
> > > +  else if (TREE_CODE (init) == COMPOUND_EXPR)
> > > + return (potential_prvalue_result_of (subob, TREE_OPERAND (init, 0))
> > 
> > We shouldn't recurse into the LHS of the comma, only the RHS.
> 
> Fixed.
>  
> > > + || potential_prvalue_result_of (subob, TREE_OPERAND (init, 1)));
> > > +  /* ??? I don't know if this can be hit.  If so, look inside the ( )
> > > +  instead of the assert.  */
> > > +  else if (TREE_CODE (init) == PAREN_EXPR)
> > > + gcc_checking_assert (false);
> > 
> > It seems trivial enough to recurse after the assert, in case it does happen
> > in the wild.
> 
> OK, adjusted.  Thanks!
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> 
> -- >8 --
> Consider
> 
>   struct A {
> int x;
> int y = x;
>   };
> 
>   struct B {
> int x = 0;
> int y = A{x}.y; // #1
>   };
> 
> where for #1 we end up with
> 
>   {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}
> 
> that is, two PLACEHOLDER_EXPRs for different types on the same level in
> a {}.  This crashes because our CONSTRUCTOR_PLACEHOLDER_BOUNDARY mechanism to
> avoid replacing unrelated PLACEHOLDER_EXPRs cannot deal with it.
> 
> Here's why we wound up with those PLACEHOLDER_EXPRs: When we're performing
> cp_parser_late_parsing_nsdmi for "int y = A{x}.y;" we use
> finish_compound_literal on type=A, compound_literal={((struct B *) this)->x}.
> When digesting this initializer, we call get_nsdmi which creates a
> PLACEHOLDER_EXPR for A -- we don't have any object to refer to yet.  After
> digesting, we have
> 
>   {.x=((struct B *) this)->x, .y=(&)->x}
> 
> and since we've created a PLACEHOLDER_EXPR inside it, we marked the whole ctor
> CONSTRUCTOR_PLACEHOLDER_BOUNDARY.  f_c_l creates a TARGET_EXPR and returns
> 
>   TARGET_EXPR <D.2384, {.x=((struct B *) this)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}>
> 
> Then we get to
> 
>   B b = {};
> 
> and call store_init_value, which digests the {}, which produces
> 
>   {.x=NON_LVALUE_EXPR <0>, .y=(TARGET_EXPR <D.2395, {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}>).y}
> 
> lookup_placeholder in constexpr won't find an object to replace the
> PLACEHOLDER_EXPR for B, because ctx->object will be D.2395 of type A, and we
> cannot search outward from D.2395 to find 'b'.
> 
> The call to replace_placeholders in store_init_value will not do anything:
> we've marked the inner { } CONSTRUCTOR_PLACEHOLDER_BOUNDARY, and it's only
> a sub-expression, so replace_placeholders does nothing, so the
> <PLACEHOLDER_EXPR struct B> stays even though now is the perfect time to
> replace it because we have an object for it: 'b'.
> 
> Later, in cp_gimplify_init_expr the *expr_p is
> 
>   D.2395 = {.x=(&<PLACEHOLDER_EXPR struct B>)->x, .y=(&<PLACEHOLDER_EXPR struct A>)->x}
> 
> where D.2395 is of type A, but we crash because we hit
> <PLACEHOLDER_EXPR struct B>, which has a different type.
> 
> My idea was to replace <PLACEHOLDER_EXPR struct A> with D.2384 after
> creating the TARGET_EXPR because that means we have an object we can refer to.
> Then clear CONSTRUCTOR_PLACEHOLDER_BOUNDARY because we no longer have
> a PLACEHOLDER_EXPR in the {}.  Then store_init_value will be able to
> replace <PLACEHOLDER_EXPR struct B> with 'b', and we should be good to go.  We must
> be careful not to break guaranteed copy elision, so this replacement
> happens in digest_nsdmi_init where we can see the whole initializer,
> and avoid replacing any placeholders in TARGET_EXPRs used in the context
> of initialization/copy elision.  This is achieved via the new function
> called potential_prvalue_result_of.
> 
> 
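
For readers following the "initializer" versus "temporary materialization"
distinction in the comment above potential_prvalue_result_of, here is a
minimal sketch in plain C++17.  The type A and all function names below are
hypothetical illustrations, not code from the patch.  Because A is neither
copyable nor movable, the "initializer" forms only compile if the prvalue
initializes the declared object directly.

```cpp
#include <cassert>

// Hypothetical type: remembers the address it was constructed at and
// cannot be copied or moved, so a hidden temporary would not compile.
struct A
{
  const A *constructed_at;
  int x = 42;
  A () : constructed_at (this) {}
  A (const A &) = delete;
  A (A &&) = delete;
};

// "Initializer" cases: the prvalue initializes `a` itself (C++17).
bool plain ()         { A a = A{};   return a.constructed_at == &a; }
bool parenthesized () { A a = (A{}); return a.constructed_at == &a; }
bool comma_rhs ()
{
  A a = (static_cast<void> (1), A{});   // only the RHS of the comma matters
  return a.constructed_at == &a;
}
bool conditional (bool c)
{
  A a = c ? A{} : A{};                  // either arm may initialize `a`
  return a.constructed_at == &a;
}

// "Temporary materialization" cases: a temporary A must exist so that a
// member can be read from it or a reference can bind to it.
int member_of_temporary () { auto x = A{}.x; return x; }
int foo (const A &a) { return a.x; }
int argument_temporary () { auto x = foo (A{}); return x; }

// Convenience check that exercises every case above.
void run_checks ()
{
  assert (plain () && parenthesized () && comma_rhs ());
  assert (conditional (true) && conditional (false));
  assert (member_of_temporary () == 42 && argument_temporary () == 42);
}
```

Calling run_checks () from a test driver built with -std=c++17 or later
exercises all six forms listed in the comment.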

Re: [PATCH] Add operators / and * for profile_{count,probability}.

2022-05-24 Thread Martin Liška
PING^1

On 5/5/22 20:15, Martin Liška wrote:
> On 5/5/22 15:49, Jan Hubicka wrote:
>> Hi,
>>> The patch simplifies usage of the profile_{count,probability} types.
>>>
>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>
>>> Ready to be installed?
>>
>> The reason I intentionally did not add * and / to the original API was
>> to detect situations where values that should be
>> profile_count/profile_probability are stored into integers, since
>> previous code used integers for everything.
>>
>> Having to add apply_scale made one (mostly me :) think about whether the
>> value is really just a fixed scale or whether it should be converted
>> to the proper data type (count or probability).
>>
>> I guess now we completed the conversion so risk of this creeping in is
>> relatively low and the code indeed looks better.
> 
> Yes, that's my impression as well: the profiling code is quite settled down.
> 
>> It will make it a bit
>> harder for me to backport jump threading profile updating fixes I plan
>> for 12.2 but it should not be hard.
> 
> You'll manage ;)
> 
>>> diff --git a/gcc/cfgloopmanip.cc b/gcc/cfgloopmanip.cc
>>> index b4357c03e86..a1ac1146445 100644
>>> --- a/gcc/cfgloopmanip.cc
>>> +++ b/gcc/cfgloopmanip.cc
>>> @@ -563,8 +563,7 @@ scale_loop_profile (class loop *loop, profile_probability p,
>>>  
>>>   /* Probability of exit must be 1/iterations.  */
>>>   count_delta = e->count ();
>>> - e->probability = profile_probability::always ()
>>> -   .apply_scale (1, iteration_bound);
>>> + e->probability = profile_probability::always () / iteration_bound;
>> However this is kind of example of the problem. 
>> iteration_bound is gcov_type so we can get overflow here.
> 
> typedef int64_t gcov_type;
> 
> and apply_scale takes int64_t types as arguments, as do the newly added
> operators, so how can that change anything?
> 
>> I guess we want to downgrade iteration_bound since it is always either 0
>> or int.
>>> diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc
>>> index e14b4e6c94a..cef26a9878e 100644
>>> --- a/gcc/tree-switch-conversion.cc
>>> +++ b/gcc/tree-switch-conversion.cc
>>> @@ -1782,7 +1782,7 @@ switch_decision_tree::analyze_switch_statement ()
>>>tree high = CASE_HIGH (elt);
>>>  
>>>profile_probability p
>>> -   = case_edge->probability.apply_scale (1, (intptr_t) (case_edge->aux));
>>> +   = case_edge->probability / ((intptr_t) (case_edge->aux));
>>
>> I think the switch ranges may also be at risk of overflow?
>>
>> We could make the operators accept gcov_type or int64_t.
> 
> As explained, they do.
> 
> Cheers,
> Martin
> 
>>
>> Thanks,
>> Honza
> 
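
To make the point about argument types concrete, here is a small sketch.
The class below is a hypothetical stand-in, not GCC's profile_probability:
its scale, its clamping and the name count_type are assumptions made for
illustration.  It only shows that an operator/ taking the same 64-bit type
as apply_scale performs the same arithmetic, so switching call sites from
one spelling to the other does not by itself change the overflow behaviour.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for gcov_type, which the thread notes is int64_t.
using count_type = std::int64_t;

// Hypothetical fixed-point probability; NOT GCC's profile_probability.
class probability
{
  std::uint32_t m_val;   // 0 .. max_value ()
  explicit probability (std::uint32_t v) : m_val (v) {}

public:
  static constexpr std::uint32_t max_value () { return 1u << 29; }  // assumed scale

  static probability always () { return probability (max_value ()); }
  std::uint32_t value () const { return m_val; }

  // Scale by num/den and clamp to max_value ().  Assumes num >= 0,
  // den > 0 and that m_val * num fits in 64 bits; this sketch has no
  // overflow protection of its own.
  probability apply_scale (count_type num, count_type den) const
  {
    assert (num >= 0 && den > 0);
    std::uint64_t scaled = static_cast<std::uint64_t> (m_val)
                           * static_cast<std::uint64_t> (num)
                           / static_cast<std::uint64_t> (den);
    return probability (scaled > max_value ()
                        ? max_value ()
                        : static_cast<std::uint32_t> (scaled));
  }

  // Operator forms: same parameter type, same arithmetic.
  probability operator/ (count_type den) const { return apply_scale (1, den); }
  probability operator* (count_type num) const { return apply_scale (num, 1); }
};
```

With this model, probability::always () / 4 and
probability::always ().apply_scale (1, 4) produce the same value.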



Re: [PATCH] Extend --with-zstd documentation

2022-05-24 Thread Martin Liška
On 5/16/22 11:27, Martin Liška wrote:
> On 5/12/22 09:00, Richard Biener via Gcc-patches wrote:
>> On Wed, May 11, 2022 at 5:10 PM Bruno Haible  wrote:
>>>
>>> The patch that was so far added for documenting --with-zstd is pretty
>>> minimal:
>>>   - it refers to undocumented options --with-zstd-include and
>>> --with-zstd-lib;
>>>   - it suggests that --with-zstd can be used without an argument;
>>>   - it does not clarify how this option applies to cross-compilation.
>>>
>>> How about adding the same details as for the --with-isl,
>>> --with-isl-include, --with-isl-lib options, mutatis mutandis? This patch
>>> does that.
>>
>> Sounds good!
>>
>> OK.
> 
> Bruno, are you planning to commit the change? Or should I do it on your
> behalf?

Pushed as 3677eb80b683cead7db972bc206fd2e75d997bd2 with the corresponding
Signed-off-by signature.

Martin

> 
> Cheers,
> Martin
> 
>>
>> Thanks,
>> Richard.
>>
>>> PR other/105527
>>>
>>> gcc/ChangeLog:
>>>
>>> * doc/install.texi (Configuration): Add more details about --with-zstd.
>>> Document --with-zstd-include and --with-zstd-lib
>>> ---
>>> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
>>> index 042241e9fad..ed0d1d882c3 100644
>>> --- a/gcc/doc/install.texi
>>> +++ b/gcc/doc/install.texi
>>> @@ -2360,10 +2360,20 @@ default is derived from glibc's behavior. When glibc clamps float_t to double,
>>>  GCC follows and enables the option. For other cross compiles, the default is
>>>  disabled.
>>>
>>> -@item --with-zstd
>>> -Specify prefix directory for installed zstd library.
>>> -Equivalent to @option{--with-zstd-include=PATH/include} plus
>>> -@option{--with-zstd-lib=PATH/lib}.
>>> +@item --with-zstd=@var{pathname}
>>> +@itemx --with-zstd-include=@var{pathname}
>>> +@itemx --with-zstd-lib=@var{pathname}
>>> +If you do not have the @code{zstd} library installed in a standard
>>> +location and you want to build GCC, you can explicitly specify the
>>> +directory where it is installed (@samp{--with-zstd=@/@var{zstdinstalldir}}).
>>> +The @option{--with-zstd=@/@var{zstdinstalldir}} option is shorthand for
>>> +@option{--with-zstd-lib=@/@var{zstdinstalldir}/lib} and
>>> +@option{--with-zstd-include=@/@var{zstdinstalldir}/include}. If this
>>> +shorthand assumption is not correct, you can use the explicit
>>> +include and lib options directly.
>>> +
>>> +These flags are applicable to the host platform only.  When building
>>> +a cross compiler, they will not be used to configure target libraries.
>>>  @end table
>>>
>>>  @subheading Cross-Compiler-Specific Options
>>>
>>>
>>>
> 



Re: [ping2][PATCH 0/8][RFC] Support BTF decl_tag and type_tag annotations

2022-05-24 Thread Jose E. Marchesi via Gcc-patches


> On 5/11/22 11:44 AM, David Faust wrote:
>> 
>> On 5/10/22 22:05, Yonghong Song wrote:
>>>
>>>
>>> On 5/10/22 8:43 PM, Yonghong Song wrote:


 On 5/6/22 2:18 PM, David Faust wrote:
>
>
> On 5/5/22 16:00, Yonghong Song wrote:
>>
>>
>> On 5/4/22 10:03 AM, David Faust wrote:
>>>
>>>
>>> On 5/3/22 15:32, Joseph Myers wrote:
 On Mon, 2 May 2022, David Faust via Gcc-patches wrote:

> Consider the following example:
>
>   #define __typetag1 __attribute__((btf_type_tag("tag1")))
>   #define __typetag2 __attribute__((btf_type_tag("tag2")))
>   #define __typetag3 __attribute__((btf_type_tag("tag3")))
>
>   int __typetag1 * __typetag2 __typetag3 * g;
>
> The expected behavior is that 'g' is "a pointer with tags 'tag2' and
> 'tag3', to a pointer with tag 'tag1' to an int". i.e.:

 That's not a correct expectation for either GNU __attribute__ or C2x [[]]
 attribute syntax.  In either syntax, __typetag2 __typetag3 should apply to
 the type to which g points, not to g or its type, just as if you had a
 type qualifier there.  You'd need to put the attributes (or qualifier)
 after the *, not before, to make them apply to the pointer type.  See
 "Attribute Syntax" in the GCC manual for how the syntax is defined for GNU
 attributes and deduce in turn, for each subsequence of the tokens matching
 the syntax for some kind of declarator, what the type for "T D1" would be
 as defined there and in the C standard, as deduced from the type for "T D"
 for a sub-declarator D.
> But GCC's attribute parsing produces a variable 'g' which is "a pointer with
> tag 'tag1' to a pointer with tags 'tag2' and 'tag3' to an int", i.e.

 In GNU syntax, __typetag1 applies to the declaration, whereas in C2x
 syntax it applies to int.  Again, if you wanted it to apply to the pointer
 type it would need to go after the * not before.

 If you are concerned with the fine details of what construct an attribute
 appertains to, I recommend using C2x syntax not GNU syntax.

>>>
>>> Joseph, thank you! This is very helpful. My understanding of the syntax
>>> was not correct.
>>>
>>> (Actually, I made a bad mistake in paraphrasing this example from the
>>> discussion of it in the series cover letter. But, the reason why it is
>>> incorrect is the same.)
>>>
>>>
>>> Yonghong, is the specific ordering an expectation in BPF programs or
>>> other users of the tags?
>>
>> This is probably a wording issue. We have been saying that tags only
>> apply to pointers. We should probably say that they only apply to the pointee.
>>
>> $ cat t.c
>> int const *ptr;
>>
>> the llvm ir debuginfo:
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64)
>> !6 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !7)
>> !7 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>
>> We could replace 'const' with a tag like below:
>>
>> int __attribute__((btf_type_tag("tag"))) *ptr;
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !6, size: 64,
>> annotations: !7)
>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>> !7 = !{!8}
>> !8 = !{!"btf_type_tag", !"tag"}
>>
>> In the above IR, we attach the annotations to the pointer_type because
>> we didn't invent a new DI type to encode btf_type_tag. But it would be
>> totally okay to have IR that looks like
>>
>> !5 = !DIDerivedType(tag: DW_TAG_pointer_type, baseType: !11, size: 64)
>> !11 = !DIBtfTypeTagType(..., baseType: !6, name: !"Tag")
>> !6 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
>>
> OK, thanks.
>
> There is still the question of why the DWARF generated for this case
> that I have been concerned about:
>
>     int __typetag1 * __typetag2 __typetag3 * g;
>
> differs between GCC (with this series) and clang. After studying it, I found
> that GCC handles the attributes exactly as described in the
> Attribute Syntax portion of the GCC manual where the GNU syntax is
> described. I do not think there is any problem here.
>
> So the difference in DWARF suggests to me that clang is not handling
> the GNU attribute syntax in this particular case correctly, since it
> seems to be associating __typetag2 and __typetag3 to g's type rather
> than the type to which it points.
>
> I am not sure whether for the use purposes of the tags this difference
> is very 
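
The placement rule discussed in this thread ("just as if you had a type
qualifier there") can be checked with ordinary qualifiers.  The sketch below
uses const and volatile as stand-ins for the __typetag macros; the variable
names g_like and h_like are hypothetical.  It only models the position after
a '*'.  It does not model what a GNU attribute written right after 'int'
applies to, since the thread notes that GNU and C2x syntax differ there.

```cpp
#include <type_traits>

// Qualifiers placed where the thread's tag macros sit:
//   int __typetag1 * __typetag2 __typetag3 * g;
//   int const      * volatile              * g_like;
int const * volatile * g_like = nullptr;

using G       = decltype (g_like);                    // int const * volatile *
using Pointee = std::remove_pointer<G>::type;         // int const * volatile
using Inner   = std::remove_pointer<Pointee>::type;   // int const

// The qualifier written between the two '*' does not qualify g_like itself.
static_assert (!std::is_volatile<G>::value,
               "g_like's own type is unqualified");

// It qualifies the pointer type that g_like points to.
static_assert (std::is_pointer<Pointee>::value
               && std::is_volatile<Pointee>::value,
               "the pointee of g_like is a volatile pointer");

// The qualifier written next to 'int' qualifies the innermost type.
static_assert (std::is_same<Inner, const int>::value,
               "the innermost type is const int");

// To qualify the variable's own pointer type, the qualifier has to
// follow the last '*'.
int const * * volatile h_like = nullptr;
static_assert (std::is_volatile<decltype (h_like)>::value,
               "h_like's own type is volatile");
```

The static_asserts pass at compile time, so simply compiling the snippet is
the check.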

[PATCH] Introduce -finstrument-functions-once

2022-05-24 Thread Eric Botcazou via Gcc-patches
Hi,

some time ago we were requested to implement a -finstrument-functions-once
switch in the compiler, with the semantics that the profiling functions be
called only once per instrumented function.  The goal was to make it possible
to use it in (large) production binaries to do function-level coverage, so the
overhead must be minimum and, in particular, there is no protection against
data races so the "once" moniker is imprecise.
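
As an illustration of the intended semantics, here is a hand-written C++
model of a single instrumented function.  It is only a sketch under stated
assumptions: hook_enter, hook_exit, the counters and the flag name are
hypothetical, and this is not the code the compiler emits.  The flag is a
plain bool with no atomics or locks, which is why two threads entering at
the same time could both run the hooks.

```cpp
#include <cassert>

// Hypothetical counters standing in for the two profiling hooks.
int enter_calls = 0;
int exit_calls = 0;
void hook_enter () { ++enter_calls; }
void hook_exit ()  { ++exit_calls; }

// Hand-written model of one function built with a "once" style of
// instrumentation.
int instrumented (int v)
{
  // Plain flag, deliberately unprotected against data races.
  static bool already_called = false;

  const bool first_entry = !already_called;
  if (first_entry)
    {
      already_called = true;
      hook_enter ();          // entry hook, first entry only
    }

  int result = v * 2;         // stands in for the original function body

  if (first_entry)
    hook_exit ();             // exit hook paired with that first entry
  return result;
}

// Convenience check: three calls, but each hook runs exactly once.
void run_checks ()
{
  assert (instrumented (1) == 2);
  assert (instrumented (2) == 4);
  assert (instrumented (3) == 6);
  assert (enter_calls == 1 && exit_calls == 1);
}
```

In a single-threaded run the hooks fire exactly once no matter how often
instrumented () is called.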

Tested on x86-64/Linux, OK for the mainline?


2022-05-24  Eric Botcazou  

* common.opt (finstrument-functions): Set explicit value.
(-finstrument-functions-once): New option.
* doc/invoke.texi (Program Instrumentation Options): Document it.
* gimplify.cc (build_instrumentation_call): New static function.
(gimplify_function_tree): Invoke it to emit the instrumentation calls
if -finstrument-functions[-once] is specified.

-- 
Eric Botcazou

diff --git a/gcc/common.opt b/gcc/common.opt
index 8a0dafc522d..c82df1778e6 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1878,9 +1878,13 @@ EnumValue
 Enum(cf_protection_level) String(none) Value(CF_NONE)
 
 finstrument-functions
-Common Var(flag_instrument_function_entry_exit)
+Common Var(flag_instrument_function_entry_exit,1)
 Instrument function entry and exit with profiling calls.
 
+finstrument-functions-once
+Common Var(flag_instrument_function_entry_exit,2)
+Instrument function entry and exit with profiling calls invoked once.
+
 finstrument-functions-exclude-function-list=
 Common RejectNegative Joined
 -finstrument-functions-exclude-function-list=name,...	Do not instrument listed functions.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index e8e6d4e039b..eaea3a7cb93 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -617,7 +617,7 @@ Objective-C and Objective-C++ Dialects}.
 -fno-stack-limit  -fsplit-stack @gol
 -fvtable-verify=@r{[}std@r{|}preinit@r{|}none@r{]} @gol
 -fvtv-counts  -fvtv-debug @gol
--finstrument-functions @gol
+-finstrument-functions  -finstrument-functions-once @gol
 -finstrument-functions-exclude-function-list=@var{sym},@var{sym},@dots{} @gol
 -finstrument-functions-exclude-file-list=@var{file},@var{file},@dots{}} @gol
 -fprofile-prefix-map=@var{old}=@var{new}
@@ -16365,6 +16365,20 @@ cannot safely be called (perhaps signal handlers, if the profiling
 routines generate output or allocate memory).
 @xref{Common Function Attributes}.
 
+@item -finstrument-functions-once
+@opindex finstrument-functions-once
+This is similar to @option{-finstrument-functions}, but the profiling
+functions are called only once per instrumented function, i.e. the first
+profiling function is called after the first entry into the instrumented
+function and the second profiling function is called before the exit
+corresponding to this first entry.
+
+The definition of @code{once} for the purpose of this option is a little
+vague because the implementation is not protected against data races.
+As a result, the implementation only guarantees that the profiling
+functions are invoked at @emph{least} once per process and at @emph{most}
+once per thread.
+
 @item -finstrument-functions-exclude-file-list=@var{file},@var{file},@dots{}
 @opindex finstrument-functions-exclude-file-list
 
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 260993be215..8ce15a2adad 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -16570,6 +16570,51 @@ flag_instrument_functions_exclude_p (tree fndecl)
   return false;
 }
 
+/* Build a call to the instrumentation function FNCODE and add it to SEQ.
+   If COND_VAR is not NULL, it is a boolean variable guarding the call to
+   the instrumentation function.  If STMT is not NULL, it is a statement
+   to be executed just before the call to the instrumentation function.  */
+
+static void
+build_instrumentation_call (gimple_seq *seq, enum built_in_function fncode,
+			tree cond_var, gimple *stmt)
+{
+  /* The instrumentation hooks aren't going to call the instrumented
+ function and the address they receive is expected to be matchable
+ against symbol addresses.  Make sure we don't create a trampoline,
+ in case the current function is nested.  */
+  tree this_fn_addr = build_fold_addr_expr (current_function_decl);
+  TREE_NO_TRAMPOLINE (this_fn_addr) = 1;
+
+  tree label_true, label_false;
+  if (cond_var)
+{
+  label_true = create_artificial_label (UNKNOWN_LOCATION);
+  label_false = create_artificial_label (UNKNOWN_LOCATION);
+  gcond *cond = gimple_build_cond (EQ_EXPR, cond_var, boolean_true_node,
+  label_true, label_false);
+  gimplify_seq_add_stmt (seq, cond);
+  gimplify_seq_add_stmt (seq, gimple_build_label (label_true));
+  gimplify_seq_add_stmt (seq, gimple_build_predict (PRED_COLD_LABEL,
+			NOT_TAKEN));
+}
+
+  if (stmt)
+gimplify_seq_add_stmt (seq, stmt);
+
+  tree x = builtin_decl_implicit (BUILT_IN_RETURN_ADDRESS);
+  gcall *call = gimple_build_call (x, 1, 

Re: [PATCH][_Hashtable] Fix insertion of range of type convertible to value_type PR 56112

2022-05-24 Thread Jonathan Wakely via Gcc-patches
On Tue, 24 May 2022 at 11:22, Jonathan Wakely wrote:
>
> On Tue, 24 May 2022 at 11:18, Jonathan Wakely wrote:
> >
> > On Thu, 5 May 2022 at 18:38, François Dumont via Libstdc++
> >  wrote:
> > >
> > > Hi
> > >
> > > Renewing my patch to fix PR 56112 but for the insert methods, I totally
> > > change it, now works also with move-only key types.
> > >
> > > I let you Jonathan find a better name than _ValueTypeEnforcer as usual :-)
> > >
> > > libstdc++: [_Hashtable] Insert range of types convertible to value_type
> > > PR 56112
> > >
> > > Fix insertion of range of types convertible to value_type. Fix also when
> > > this value_type
> > > has a move-only key_type which also allow converted values to be moved.
> > >
> > > libstdc++-v3/ChangeLog:
> > >
> > >  PR libstdc++/56112
> > >  * include/bits/hashtable_policy.h (_ValueTypeEnforcer): New.
> > >  * include/bits/hashtable.h
> > > (_Hashtable<>::_M_insert_unique_aux): New.
> > >  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
> > > true_type)): Use latters.
> > >  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
> > > false_type)): Likewise.
> > >  (_Hashtable(_InputIterator, _InputIterator, size_type, const
> > > _Hash&, const _Equal&,
> > >  const allocator_type&, true_type)): Use this.insert range.
> > >  (_Hashtable(_InputIterator, _InputIterator, size_type, const
> > > _Hash&, const _Equal&,
> > >  const allocator_type&, false_type)): Use _M_insert.
> > >  * testsuite/23_containers/unordered_map/cons/56112.cc: Check
> > > how many times conversion
> > >  is done.
> > >  (test02): New test case.
> > >  * testsuite/23_containers/unordered_set/cons/56112.cc: New test.
> > >
> > > Tested under Linux x86_64.
> > >
> > > Ok to commit ?
> >
> > No, sorry.
> >
> > The new test02 function in 23_containers/unordered_map/cons/56112.cc
> > doesn't compile with libc++ or MSVC either, are you sure that test is
> > valid? I don't think it is, because S2 is not convertible to
> > pair. None of the pair constructors are
> > viable, because the move constructor would require two user-defined
> > conversions (from S2 to pair and then from
> > pair to pair). A conversion
> > sequence cannot have more than one user-defined conversion using a
> > > constructor or conversion operator. So if your patch makes that
> > compile, it's a bug in the new code. I haven't analyzed that code to
> > see where the problem is, I'm just looking at the test results and the
> > changes in behaviour.
>
> I meant to include this link showing that libc++ and MSVC reject
> test02() as well:
>
> https://godbolt.org/z/j7E9f6bd4

I've created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105717 for
the insertion bug, rather than reopening PR 56112.



Re: [PATCH] Modula-2: merge proposal/review: 1/9 01.patch-set-01

2022-05-24 Thread Richard Biener via Gcc-patches
On Sat, May 21, 2022 at 3:11 AM Gaius Mulley  wrote:
>
>
> Hi,
>
> Gaius wrote:
>
> > the changes do raise questions.  The reason for the changes here are to
> > allow easy linking for modula-2 users.
>
> >  $ gm2 hello.mod
>
> > for example will compile and link with all dependent modules (dependants
> > are generated by analysing module source imports).  The gm2 driver will
> > add objects and libraries to the link.
>
> in more detail the gm2 driver does the following:
>
>   $ gm2 -v hello.mod
>
> full output below, but to summarise and annotate:
>
> cc1gm2 generates an assembler file from hello.mod
>  as --64 /tmp/cc8BoL3d.s -o hello.o
>
>  # gm2l generates a list of all dependent modules from parsing all imports
>  /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/gm2l -v \
>  -I/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim -o \
>  /tmp/ccSMojUb.l hello.mod
>
>  # gm2lorder reorders the critical runtime modules
>  /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/gm2lorder \
> /tmp/ccSMojUb.l -o /tmp/ccHDRdde.lst
>
>  # gm2lgen generates a C++ scaffold from the reordered module list
>  /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/gm2lgen -fcpp \
> /tmp/ccHDRdde.lst -o a-hello_m2.cpp
>
>  # cc1plus compiles the scaffold
>  /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/cc1plus -v \
>  -mtune=generic -march=x86-64 \
>  -I/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim \
>  -quiet a-hello_m2.cpp -o a-hello_m2.s
>  as --64 a-hello_m2.s -o a-hello_m2.o
>
>  # gm2lcc creates an archive from the list of modules and the scaffold
> /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/gm2lcc \
>   -L/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim \
>   -ftarget-ar=/usr/bin/ar -ftarget-ranlib=/usr/bin/ranlib \
> -fobject-path=/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim \
>   --exec --startup a-hello_m2.o --ar -o /tmp/ccNJ60fa.a --mainobject \
>   a-hello_m2.o /tmp/ccHDRdde.lst
>
> /usr/bin/ar rc /tmp/ccNJ60fa.a  hello.o a-hello_m2.o
> /usr/bin/ranlib /tmp/ccNJ60fa.a
>
> # finally collect2 performs the link from the archive and any default
>   libraries
>
> hope this helps

Yes, it does.  So historically when there was complex massaging required
like this it was offloaded to a "helper driver".  With -flto there's lto-wrapper
(but here invoked by the linker), with ada there's gnatmake and others
and with certain targets collect2 does extra processing producing global
CTORs (or for C++ with -frepo even invoked additional compilations).

I do think that this might all belong in the main driver code but then
maybe all the different language compilation models will just make that
very hard to maintain.

As for modula-2, does

$ gm2 -c hello.mod
$ gm2 hello.o

"work"?  And what intermediate files are build systems expecting to
prevail?  Like for C/C++ code and GNU make there's the preprocessor
driven dependence generation, but otherwise a single TU usually
produces a single object file.  OTOH for GFortran a single TU might produce
multiple .mod files for example.

Btw, does

$ gcc -c hello.mod

"work" (or with -x m2 if the extension isn't auto detected)?

Richard.

>
> regards,
> Gaius
>
>
>
>
>
>
> $ ~/opt/bin/gm2 -v hello.mod
> Using built-in specs.
> COLLECT_GCC=/home/gaius/opt/bin/gm2
> COLLECT_LTO_WRAPPER=/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/lto-wrapper
> Target: x86_64-pc-linux-gnu
> Configured with: 
> /home/gaius/GM2/graft-combine/gcc-git-devel-modula2/configure 
> --prefix=/home/gaius/opt --libexecdir=/home/gaius/opt/lib 
> --enable-threads=posix --enable-clocale=gnu --enable-languages=m2 
> --enable-multilib --enable-checking --enable-long-longx --enable-bootstrap 
> --with-build-config=bootstrap-Og
> Thread model: posix
> Supported LTO compression algorithms: zlib
> gcc version 13.0.0 20220519 (experimental) (GCC)
> COLLECT_GCC_OPTIONS='-I/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim'
>  '-ftarget-ar=/usr/bin/ar' '-ftarget-ranlib=/usr/bin/ranlib' 
> '-fobject-path=/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim' 
> '-x' 'modula-2' '-fplugin=m2rte' 
> '-L/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim' 
> '-L/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim' 
> '-shared-libgcc' '-v' '-mtune=generic' '-march=x86-64' '-dumpdir' 'a-'
>  /home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/cc1gm2 
> -iplugindir=/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/plugin -quiet 
> -dumpdir a- -dumpbase hello.mod -dumpbase-ext .mod -mtune=generic 
> -march=x86-64 -version -ftarget-ar=/usr/bin/ar 
> -ftarget-ranlib=/usr/bin/ranlib 
> -fobject-path=/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim 
> -fplugin=m2rte -ftarget-ar=/usr/bin/ar -ftarget-ranlib=/usr/bin/ranlib 
> -fobject-path=/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim 
> -fplugin=m2rte -I/home/gaius/opt/lib/gcc/x86_64-pc-linux-gnu/13.0.0/m2/m2pim 
> -o /tmp/cc8BoL3d.s hello.mod
> GNU Modula-2  1.9.5  

Re: [PATCH][_Hashtable] Fix insertion of range of type convertible to value_type PR 56112

2022-05-24 Thread Jonathan Wakely via Gcc-patches
On Tue, 24 May 2022 at 11:18, Jonathan Wakely wrote:
>
> On Thu, 5 May 2022 at 18:38, François Dumont via Libstdc++
>  wrote:
> >
> > Hi
> >
> > Renewing my patch to fix PR 56112 but for the insert methods, I totally
> > change it, now works also with move-only key types.
> >
> > I let you Jonathan find a better name than _ValueTypeEnforcer as usual :-)
> >
> > libstdc++: [_Hashtable] Insert range of types convertible to value_type
> > PR 56112
> >
> > Fix insertion of range of types convertible to value_type. Fix also when
> > this value_type
> > has a move-only key_type which also allow converted values to be moved.
> >
> > libstdc++-v3/ChangeLog:
> >
> >  PR libstdc++/56112
> >  * include/bits/hashtable_policy.h (_ValueTypeEnforcer): New.
> >  * include/bits/hashtable.h
> > (_Hashtable<>::_M_insert_unique_aux): New.
> >  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
> > true_type)): Use latters.
> >  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
> > false_type)): Likewise.
> >  (_Hashtable(_InputIterator, _InputIterator, size_type, const
> > _Hash&, const _Equal&,
> >  const allocator_type&, true_type)): Use this.insert range.
> >  (_Hashtable(_InputIterator, _InputIterator, size_type, const
> > _Hash&, const _Equal&,
> >  const allocator_type&, false_type)): Use _M_insert.
> >  * testsuite/23_containers/unordered_map/cons/56112.cc: Check
> > how many times conversion
> >  is done.
> >  (test02): New test case.
> >  * testsuite/23_containers/unordered_set/cons/56112.cc: New test.
> >
> > Tested under Linux x86_64.
> >
> > Ok to commit ?
>
> No, sorry.
>
> The new test02 function in 23_containers/unordered_map/cons/56112.cc
> doesn't compile with libc++ or MSVC either, are you sure that test is
> valid? I don't think it is, because S2 is not convertible to
> pair. None of the pair constructors are
> viable, because the move constructor would require two user-defined
> conversions (from S2 to pair and then from
> pair to pair). A conversion
> sequence cannot have more than one user-defined conversion using a
> constructor or conversion operator. So if your patch makes that
> compile, it's a bug in the new code. I haven't analyzed that code to
> see where the problem is, I'm just looking at the test results and the
> changes in behaviour.

I meant to include this link showing that libc++ and MSVC reject
test02() as well:

https://godbolt.org/z/j7E9f6bd4


>
> The new 23_containers/unordered_set/cons/56112.cc test fails for GCC
> 11 but passes for GCC 12, even without your patch. Is it actually
> testing some other change, not this patch, and not the 2013 fix for PR
> 56112?



Re: [PATCH][_Hashtable] Fix insertion of range of type convertible to value_type PR 56112

2022-05-24 Thread Jonathan Wakely via Gcc-patches
On Thu, 5 May 2022 at 18:38, François Dumont via Libstdc++
 wrote:
>
> Hi
>
> Renewing my patch to fix PR 56112 but for the insert methods, I totally
> change it, now works also with move-only key types.
>
> I let you Jonathan find a better name than _ValueTypeEnforcer as usual :-)
>
> libstdc++: [_Hashtable] Insert range of types convertible to value_type
> PR 56112
>
> Fix insertion of range of types convertible to value_type. Fix also when
> this value_type
> has a move-only key_type which also allow converted values to be moved.
>
> libstdc++-v3/ChangeLog:
>
>  PR libstdc++/56112
>  * include/bits/hashtable_policy.h (_ValueTypeEnforcer): New.
>  * include/bits/hashtable.h
> (_Hashtable<>::_M_insert_unique_aux): New.
>  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
> true_type)): Use latters.
>  (_Hashtable<>::_M_insert(_Arg&&, const _NodeGenerator&,
> false_type)): Likewise.
>  (_Hashtable(_InputIterator, _InputIterator, size_type, const
> _Hash&, const _Equal&,
>  const allocator_type&, true_type)): Use this.insert range.
>  (_Hashtable(_InputIterator, _InputIterator, size_type, const
> _Hash&, const _Equal&,
>  const allocator_type&, false_type)): Use _M_insert.
>  * testsuite/23_containers/unordered_map/cons/56112.cc: Check
> how many times conversion
>  is done.
>  (test02): New test case.
>  * testsuite/23_containers/unordered_set/cons/56112.cc: New test.
>
> Tested under Linux x86_64.
>
> Ok to commit ?

No, sorry.

The new test02 function in 23_containers/unordered_map/cons/56112.cc
doesn't compile with libc++ or MSVC either, are you sure that test is
valid? I don't think it is, because S2 is not convertible to
pair. None of the pair constructors are
viable, because the move constructor would require two user-defined
conversions (from S2 to pair and then from
pair to pair). A conversion
sequence cannot have more than one user-defined conversion using a
constructor or conversion operator. So if your patch makes that
compile, it's a bug in the new code. I haven't analyzed that code to
see where the problem is, I'm just looking at the test results and the
changes in behaviour.

The new 23_containers/unordered_set/cons/56112.cc test fails for GCC
11 but passes for GCC 12, even without your patch. Is it actually
testing some other change, not this patch, and not the 2013 fix for PR
56112?
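
The one-user-defined-conversion rule can be checked at compile time.  The
sketch below is only an illustration: Source, Middle, Target and ToPair are
hypothetical types, not the S2 type from the test, and the pair types used
are an assumption chosen to show the pattern.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical chain: Source -> Middle -> Target, each step being a
// user-defined conversion through a converting constructor.
struct Source {};
struct Middle { Middle (const Source &) {} };
struct Target { Target (const Middle &) {} };

static_assert (std::is_convertible<Source, Middle>::value,
               "one user-defined conversion is fine");
static_assert (std::is_convertible<Middle, Target>::value,
               "one user-defined conversion is fine");
static_assert (!std::is_convertible<Source, Target>::value,
               "two chained user-defined conversions are rejected");

// The same pattern with std::pair: a conversion operator to one pair
// type, followed by pair's converting constructor to another pair type.
struct ToPair
{
  operator std::pair<int, int> () const { return std::pair<int, int> (1, 2); }
};

static_assert (std::is_convertible<ToPair, std::pair<int, int>>::value,
               "conversion operator alone");
static_assert (std::is_convertible<std::pair<int, int>,
                                   std::pair<const int, int>>::value,
               "pair's converting constructor alone");
static_assert (!std::is_convertible<ToPair, std::pair<const int, int>>::value,
               "operator plus converting constructor is one step too many");
```

Each static_assert holds on its own, so a translation unit containing only
this snippet compiles cleanly.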



Re: [PATCH] middle-end/105711 - properly handle CONST_INT when expanding bitfields

2022-05-24 Thread Richard Biener via Gcc-patches
On Tue, 24 May 2022, Richard Sandiford wrote:

> Richard Biener  writes:
> > This is another place where we fail to pass down the mode of a
> > CONST_INT.
> >
> > Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
> >
> > Thanks,
> > Richard.
> >
> > 2022-05-24  Richard Biener  
> >
> > PR middle-end/105711
> > * expmed.cc (extract_bit_field_as_subreg): Add op0_mode parameter
> > and use it.
> > (extract_bit_field_1): Pass down the mode of op0 to
> > extract_bit_field_as_subreg.
> >
> > * gcc.target/i386/pr105711.c: New testcase.
> 
> LGTM, but I guess the new parameter should be documented above
> extract_bit_field_as_subreg.

Ah, yes - fixed and pushed.

Thanks,
Richard.


Re: [PATCH/RFC] PR tree-optimization/96912: Recognize VEC_COND_EXPR in match.pd

2022-05-24 Thread Richard Biener via Gcc-patches
On Mon, May 23, 2022 at 4:27 PM Roger Sayle  wrote:
>
>
> Hi Richard,
>
> Currently for pr96912, we end up with:
>
> W foo (W x, W y, V m)
> {
>   W t;
>   vector(16)  _1;
>   vector(16) signed char _2;
>   W _7;
>   vector(2) long long int _9;
>   vector(2) long long int _10;
>
>[local count: 1073741824]:
>   _1 = m_3(D) < { 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 };
>   _2 = VIEW_CONVERT_EXPR(_1);
>   t_4 = VIEW_CONVERT_EXPR(_2);
>   _9 = x_5(D) ^ y_6(D);
>   _10 = t_4 & _9;
>   _7 = x_5(D) ^ _10;
>   return _7;
> }
>
> The mask mode is V16QI and the data mode is V2DI (please forgive my RTL
> terminology).
> The assignment t_4 view-converts the mask into the "data type" for the
> bitwise operations.  To use x86's pblendvb, the "vcond_expr" operation
> needs to be in the mask's mode (V16QI) rather than the data's mode (V2DI).
> Hence the unsigned_type_for of the truth_type_for [if they have different
> NUNITS].  Obviously, converting the mask to V2DI and performing a V2DI
> blend won't produce the same result.

OK, so that's because pblendvb doesn't produce the bit operation result
but instead does element-wise selection.  So you rely on the all-zero or
all-ones constant matching or the VECTOR_BOOLEAN_TYPE_P guarantee that
this is a suitable mask to get at the "element" granularity that's usable,
since from the data modes in the bit operations there's no way to guess
that (which might also be the reason why it's hard to do this on the RTL
level).  Commentary above the pattern should explain this I think.
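
The granularity point can be sketched with a scaled-down scalar model.
Everything below is illustrative: one 64-bit word stands in for the data
vector and its eight bytes for the mask elements, and the function names
are made up rather than taken from GCC or from any intrinsic header.

```cpp
#include <cstdint>

// Bitwise select, as in the GIMPLE dump: x ^ (m & (x ^ y)) yields the
// bits of y where m is 1 and the bits of x where m is 0.
constexpr std::uint64_t
bit_select (std::uint64_t x, std::uint64_t y, std::uint64_t m)
{
  return x ^ (m & (x ^ y));
}

// Byte-granular blend: each byte comes from y if the top bit of the
// corresponding mask byte is set, otherwise from x.
constexpr std::uint64_t
blend_bytes (std::uint64_t x, std::uint64_t y, std::uint64_t m)
{
  std::uint64_t r = 0;
  for (int i = 0; i < 8; ++i)
    {
      const std::uint64_t lane = UINT64_C (0xFF) << (8 * i);
      const bool take_y = (m >> (8 * i + 7)) & 1;
      r |= (take_y ? y : x) & lane;
    }
  return r;
}

// Blend at the granularity of the whole 64-bit element: only the top bit
// of the entire mask word decides.
constexpr std::uint64_t
blend_qword (std::uint64_t x, std::uint64_t y, std::uint64_t m)
{
  return (m >> 63) ? y : x;
}

constexpr std::uint64_t X = UINT64_C (0x1111111111111111);
constexpr std::uint64_t Y = UINT64_C (0x2222222222222222);
// A comparison-style mask: every byte is all-ones or all-zeros.
constexpr std::uint64_t M = UINT64_C (0x00FF00FF00FF00FF);

static_assert (bit_select (X, Y, M) == blend_bytes (X, Y, M),
               "byte blend matches the bit operation for a byte mask");
static_assert (bit_select (X, Y, M) != blend_qword (X, Y, M),
               "blending at the data granularity gives a different result");
```

With a mask whose bytes are each all-ones or all-zeros, the byte-wise blend
reproduces the bit operation, while blending at the wider data granularity
does not.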

The recursive (and exponential via bit-ops) matching of this is a bit
awkward.  Since you match (view_convert vector_mask_p@2), isn't it enough
to require a VECTOR_BOOLEAN_TYPE_P here?  That should then eventually be
good enough to simply use TREE_TYPE (@2) as data type without further
checking and thus
 (view_convert (vec_cond @2 (view_convert:mask_type @1)
                            (view_convert:mask_type @0)))

>
> The most useful clause of vector_mask_p is actually the VECTOR_BOOLEAN_TYPE_P
> test that catches all "mask types", such as those that result from vector
> comparisons.
> Note that vector_mask_p is tested against the operand of a view_convert expr.
> The remaining cases handle truth_*_expr like operations on those comparisons.

But those should be still VECTOR_BOOLEAN_TYPE_P?

> One follow-up patch is to additionally allow VIEW_CONVERT_EXPR if both source
> and destination are of  VECTOR_TYPE_P and known_eq TYPE_VECTOR_SUBUNITS.
> Likewise, a C cst_vector_mask_p could check each element rather than the catch
> all "integer_zerop || integer_all_onesp".

Yes, you'd want to know whether the constant bit pattern covers a vector
mode (and which), making lanes all ones or zeros.

> I agree with gimple-isel replacing VEC_COND_EXPR when it's supported in
> hardware, just like .FMA is inserted to replace the universal
> MULT_EXPR/PLUS_EXPR tree codes; the question is whether vec_cond_expr
> (and vec_duplicate) can always be expanded moderately efficiently by the
> middle-end.  For one thing, we have vector cost models that should be
> able to distinguish between efficient and inefficient implementations.
> And a vec_duplicate expander for SPARC should be able to generate an
> optimal sequence of instructions, even if there isn't native hardware
> (instruction) support.  For example, scalar multiplication by
> 0x0101010101010101 may be a good way to vec_duplicate QI mode to V8QI
> mode (via DI mode), at least with -Os.  As you've mentioned, the VEC_PERM
> infrastructure should be useful here.
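
The multiplication trick mentioned above can be shown in scalar form.  This
is only a sketch of the arithmetic; splat_byte is a made-up name, and
nothing here is meant to reflect how an actual expander would be written.

```cpp
#include <cstdint>

// Replicate one byte into all eight bytes of a 64-bit word.  Each partial
// product b * 0x01 lands in its own byte and is below 256, so no carries
// cross byte boundaries.
constexpr std::uint64_t
splat_byte (std::uint8_t b)
{
  return static_cast<std::uint64_t> (b) * UINT64_C (0x0101010101010101);
}

static_assert (splat_byte (0xAB) == UINT64_C (0xABABABABABABABAB),
               "every byte of the result equals the input byte");
static_assert (splat_byte (0x00) == 0, "zero stays zero");
```

Viewing the 64-bit result as eight byte lanes gives the duplicated vector.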

If it can be done efficiently with alternate code sequences then the
current setup is so that vector lowering would be the place that does
this.  There's somewhat of a consensus that we want to move away from RTL
expansion doing all the "clever" things toward doing that on (late) GIMPLE
so that RTL expansion can eventually rely on optabs only.

I'm not saying that's going to win, likewise the current vector lowering is both
sub-optimal (relying on followup optimization) and placed too early
(due to this).

In theory vector match.pd patterns could condition themselves to
optimize_vectors_before_lowering_p (), alternatively they have to check
target support.  There are now plenty of passes between vectorization and
vector lowering though and we probably should not mess with the vectorizer
produced code (which is supported by the target) so we do not run into
the sub-optimal vector lowering implementation.  Possibly vector lowering
should be done earlier, before loop opts (before PRE).

> [p.s. it's unfortunate that some of my patches appear controversial.  By
> randomly selecting Bugzilla (regression) PRs that have been open for a
> long time, I seem to be selecting/enriching for bugs for which there's no
> simple solution, and that maintainers have already thought about for a
> long time but without coming up with a satisfactory solution].

You are simply hitting areas where GCC isn't nicely designed and the 

Re: [PATCH] middle-end/105711 - properly handle CONST_INT when expanding bitfields

2022-05-24 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> This is another place where we fail to pass down the mode of a
> CONST_INT.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?
>
> Thanks,
> Richard.
>
> 2022-05-24  Richard Biener  
>
>   PR middle-end/105711
>   * expmed.cc (extract_bit_field_as_subreg): Add op0_mode parameter
>   and use it.
>   (extract_bit_field_1): Pass down the mode of op0 to
>   extract_bit_field_as_subreg.
>
>   * gcc.target/i386/pr105711.c: New testcase.

LGTM, but I guess the new parameter should be documented above
extract_bit_field_as_subreg.

Thanks,
Richard

> ---
>  gcc/expmed.cc| 15 +--
>  gcc/testsuite/gcc.target/i386/pr105711.c | 12 
>  2 files changed, 21 insertions(+), 6 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr105711.c
>
> diff --git a/gcc/expmed.cc b/gcc/expmed.cc
> index 41738c1efe9..e22278370fd 100644
> --- a/gcc/expmed.cc
> +++ b/gcc/expmed.cc
> @@ -1611,14 +1611,15 @@ extract_bit_field_using_extv (const extraction_insn 
> *extv, rtx op0,
>  
>  static rtx
>  extract_bit_field_as_subreg (machine_mode mode, rtx op0,
> +  machine_mode op0_mode,
>poly_uint64 bitsize, poly_uint64 bitnum)
>  {
>poly_uint64 bytenum;
>if (multiple_p (bitnum, BITS_PER_UNIT, &bytenum)
>&& known_eq (bitsize, GET_MODE_BITSIZE (mode))
> -  && lowpart_bit_field_p (bitnum, bitsize, GET_MODE (op0))
> -  && TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op0)))
> -return simplify_gen_subreg (mode, op0, GET_MODE (op0), bytenum);
> +  && lowpart_bit_field_p (bitnum, bitsize, op0_mode)
> +  && TRULY_NOOP_TRUNCATION_MODES_P (mode, op0_mode))
> +return simplify_gen_subreg (mode, op0, op0_mode, bytenum);
>return NULL_RTX;
>  }
>  
> @@ -1777,7 +1778,8 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, 
> poly_uint64 bitnum,
>for valid bitsize and bitnum, so we don't need to do that here.  */
>if (VECTOR_MODE_P (mode))
>   {
> -   rtx sub = extract_bit_field_as_subreg (mode, op0, bitsize, bitnum);
> +   rtx sub = extract_bit_field_as_subreg (mode, op0, outermode,
> +  bitsize, bitnum);
> if (sub)
>   return sub;
>   }
> @@ -1824,9 +1826,10 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, 
> poly_uint64 bitnum,
>/* Extraction of a full MODE1 value can be done with a subreg as long
>   as the least significant bit of the value is the least significant
>   bit of either OP0 or a word of OP0.  */
> -  if (!MEM_P (op0) && !reverse)
> +  if (!MEM_P (op0) && !reverse && op0_mode.exists ())
>  {
> -  rtx sub = extract_bit_field_as_subreg (mode1, op0, bitsize, bitnum);
> +  rtx sub = extract_bit_field_as_subreg (mode1, op0, imode,
> +  bitsize, bitnum);
>if (sub)
>   return convert_extracted_bit_field (sub, mode, tmode, unsignedp);
>  }
> diff --git a/gcc/testsuite/gcc.target/i386/pr105711.c 
> b/gcc/testsuite/gcc.target/i386/pr105711.c
> new file mode 100644
> index 000..6d07e08138a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr105711.c
> @@ -0,0 +1,12 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 --param=sccvn-max-alias-queries-per-access=0" } */
> +
> +int *p, a, b;
> +
> +void
> +foo (_Complex char c)
> +{
> +  c /= 3040;
> +  a %= __builtin_memcmp (1 + , p, 1);
> +  b = c + __imag__ c;
> +}


[PATCH] middle-end/105711 - properly handle CONST_INT when expanding bitfields

2022-05-24 Thread Richard Biener via Gcc-patches
This is another place where we fail to pass down the mode of a
CONST_INT.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK?

Thanks,
Richard.

2022-05-24  Richard Biener  

PR middle-end/105711
* expmed.cc (extract_bit_field_as_subreg): Add op0_mode parameter
and use it.
(extract_bit_field_1): Pass down the mode of op0 to
extract_bit_field_as_subreg.

* gcc.target/i386/pr105711.c: New testcase.
---
 gcc/expmed.cc| 15 +--
 gcc/testsuite/gcc.target/i386/pr105711.c | 12 
 2 files changed, 21 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr105711.c

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 41738c1efe9..e22278370fd 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -1611,14 +1611,15 @@ extract_bit_field_using_extv (const extraction_insn 
*extv, rtx op0,
 
 static rtx
 extract_bit_field_as_subreg (machine_mode mode, rtx op0,
+machine_mode op0_mode,
 poly_uint64 bitsize, poly_uint64 bitnum)
 {
   poly_uint64 bytenum;
   if (multiple_p (bitnum, BITS_PER_UNIT, &bytenum)
   && known_eq (bitsize, GET_MODE_BITSIZE (mode))
-  && lowpart_bit_field_p (bitnum, bitsize, GET_MODE (op0))
-  && TRULY_NOOP_TRUNCATION_MODES_P (mode, GET_MODE (op0)))
-return simplify_gen_subreg (mode, op0, GET_MODE (op0), bytenum);
+  && lowpart_bit_field_p (bitnum, bitsize, op0_mode)
+  && TRULY_NOOP_TRUNCATION_MODES_P (mode, op0_mode))
+return simplify_gen_subreg (mode, op0, op0_mode, bytenum);
   return NULL_RTX;
 }
 
@@ -1777,7 +1778,8 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, 
poly_uint64 bitnum,
 for valid bitsize and bitnum, so we don't need to do that here.  */
   if (VECTOR_MODE_P (mode))
{
- rtx sub = extract_bit_field_as_subreg (mode, op0, bitsize, bitnum);
+ rtx sub = extract_bit_field_as_subreg (mode, op0, outermode,
+bitsize, bitnum);
  if (sub)
return sub;
}
@@ -1824,9 +1826,10 @@ extract_bit_field_1 (rtx str_rtx, poly_uint64 bitsize, 
poly_uint64 bitnum,
   /* Extraction of a full MODE1 value can be done with a subreg as long
  as the least significant bit of the value is the least significant
  bit of either OP0 or a word of OP0.  */
-  if (!MEM_P (op0) && !reverse)
+  if (!MEM_P (op0) && !reverse && op0_mode.exists ())
 {
-  rtx sub = extract_bit_field_as_subreg (mode1, op0, bitsize, bitnum);
+  rtx sub = extract_bit_field_as_subreg (mode1, op0, imode,
+bitsize, bitnum);
   if (sub)
return convert_extracted_bit_field (sub, mode, tmode, unsignedp);
 }
diff --git a/gcc/testsuite/gcc.target/i386/pr105711.c 
b/gcc/testsuite/gcc.target/i386/pr105711.c
new file mode 100644
index 000..6d07e08138a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr105711.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 --param=sccvn-max-alias-queries-per-access=0" } */
+
+int *p, a, b;
+
+void
+foo (_Complex char c)
+{
+  c /= 3040;
+  a %= __builtin_memcmp (1 + , p, 1);
+  b = c + __imag__ c;
+}
-- 
2.35.3


Re: [0/9] [middle-end] Add param to vec_perm_const hook to specify mode of input operand

2022-05-24 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni  writes:
> diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
> index c5006afc00d..0a3c733ada9 100644
> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6088,14 +6088,18 @@ for the given scalar type @var{type}.  
> @var{is_packed} is false if the scalar
>  access using @var{type} is known to be naturally aligned.
>  @end deftypefn
>  
> -@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode 
> @var{mode}, rtx @var{output}, rtx @var{in0}, rtx @var{in1}, const 
> vec_perm_indices @var{&sel})
> +@deftypefn {Target Hook} bool TARGET_VECTORIZE_VEC_PERM_CONST (machine_mode 
> @var{mode}, machine_mode @var{op_mode}, rtx @var{output}, rtx @var{in0}, rtx 
> @var{in1}, const vec_perm_indices @var{&sel})
>  This hook is used to test whether the target can permute up to two
> -vectors of mode @var{mode} using the permutation vector @code{sel}, and
> -also to emit such a permutation.  In the former case @var{in0}, @var{in1}
> -and @var{out} are all null.  In the latter case @var{in0} and @var{in1} are
> -the source vectors and @var{out} is the destination vector; all three are
> -operands of mode @var{mode}.  @var{in1} is the same as @var{in0} if
> -@var{sel} describes a permutation on one vector instead of two.
> +vectors of mode @var{op_mode} using the permutation vector @code{sel},
> +producing a vector of mode @var{mode}.The hook is also used to emit such

Should be two spaces between “@var{mode}.” and “The”.

> +a permutation.
> +
> +When the hook is being used to test whether the target supports a 
> permutation,
> +@var{in0}, @var{in1}, and @var{out} are all null.When the hook is being used

Same here: missing spaces before “When”.

> +to emit a permutation, @var{in0} and @var{in1} are the source vectors of mode
> +@var{op_mode} and @var{out} is the destination vector of mode @var{mode}.
> +@var{in1} is the same as @var{in0} if @var{sel} describes a permutation on 
> one
> +vector instead of two.
>  
>  Return true if the operation is possible, emitting instructions for it
>  if rtxes are provided.
> diff --git a/gcc/match.pd b/gcc/match.pd
> index f5efa77560c..f2a527d9c42 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -7596,6 +7596,8 @@ and,
>   (with
>{
>  tree op0 = @0, op1 = @1, op2 = @2;
> +machine_mode result_mode = TYPE_MODE (type);
> +machine_mode op_mode = TYPE_MODE (TREE_TYPE (op0));
>  
>  /* Build a vector of integers from the tree mask.  */
>  vec_perm_builder builder;
> @@ -7703,12 +7705,12 @@ and,
>  2-argument version.  */
>   tree oldop2 = op2;
>   if (sel.ninputs () == 2
> -|| can_vec_perm_const_p (TYPE_MODE (type), sel, false))
> +|| can_vec_perm_const_p (result_mode, op_mode, sel, false))
> op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel);
>   else
> {
>   vec_perm_indices sel2 (builder, 2, nelts);
> - if (can_vec_perm_const_p (TYPE_MODE (type), sel2, false))
> + if (can_vec_perm_const_p (result_mode, op_mode, sel2, false))
> op2 = vec_perm_indices_to_tree (TREE_TYPE (op2), sel2);
>   else
> /* Not directly supported with either encoding,

Please replace the use of TYPE_MODE here:

/* See if the permutation is performing a single element
   insert from a CONSTRUCTOR or constant and use a BIT_INSERT_EXPR
   in that case.  But only if the vector mode is supported,
   otherwise this is invalid GIMPLE.  */
if (TYPE_MODE (type) != BLKmode

as well.

OK with those changes, thanks.

Richard
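
For readers unfamiliar with the hook, the permutation described in the
quoted documentation can be modelled roughly as below.  This is a toy
sketch using std::vector; permute and its parameters are hypothetical names
and do not correspond to GCC's types or to the hook's real signature.  The
output length follows the selector, which is assumed here to be one way the
result mode could differ from the operand mode.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: SEL indexes into the concatenation of IN0 and IN1.
// Indices below IN0.size () pick from IN0, the rest pick from IN1.
std::vector<int>
permute (const std::vector<int> &in0, const std::vector<int> &in1,
         const std::vector<std::size_t> &sel)
{
  const std::size_t n = in0.size ();
  assert (in1.size () == n);   // both inputs share the operand "mode"

  std::vector<int> out;
  out.reserve (sel.size ());
  for (std::size_t idx : sel)
    {
      assert (idx < 2 * n);    // selector must stay within both inputs
      out.push_back (idx < n ? in0[idx] : in1[idx - n]);
    }
  return out;
}
```

A one-vector permutation passes the same vector for both inputs, for
example permute (v, v, {3, 2, 1, 0}) to reverse a four-element v.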

