[PATCH] lower-subreg: Fix ROTATE handling [PR114211]

2024-03-04 Thread Jakub Jelinek
Hi!

On the following testcase, we have
(insn 10 7 11 2 (set (reg/v:TI 106 [ h ])
(rotate:TI (reg/v:TI 106 [ h ])
(const_int 64 [0x40]))) "pr114211.c":8:5 1042 {rotl64ti2_doubleword}
 (nil))
before subreg1 and the pass decides to use
(reg:DI 127 [ h ]) / (reg:DI 128 [ h+8 ])
register pair instead of (reg/v:TI 106 [ h ]).
resolve_operand_for_swap_move_operator implements it by pretending it is
an assignment from
(concatn (reg:DI 127 [ h ]) (reg:DI 128 [ h+8 ]))
to
(concatn (reg:DI 128 [ h+8 ]) (reg:DI 127 [ h ]))
The problem is that if the rotate argument is the same as destination or
if there is even an overlap between the first half of the destination with
second half of the source we emit incorrect code, because the store to
(reg:DI 128 [ h+8 ]) overwrites what we need for source of the second
move.  The following patch detects that case and uses a temporary pseudo
to hold the original (reg:DI 128 [ h+8 ]) value across the first store.
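
The hazard can be reproduced outside the compiler.  What follows is only a
minimal C sketch of the idea, not the lower-subreg code: struct pair,
rotate_naive and rotate_with_temp are hypothetical names, and the two array
elements stand in for (reg:DI 127 [ h ]) and (reg:DI 128 [ h+8 ]).

```c
#include <stdint.h>

/* Hypothetical model of a double-word value split into two word-sized
   halves: half[0] plays the role of (reg:DI 127), half[1] of
   (reg:DI 128).  */
struct pair { uint64_t half[2]; };

/* Rotate by one word as two plain word moves.  If DEST and SRC are the
   same object, the first store clobbers the half that the second move
   still has to read.  */
void
rotate_naive (struct pair *dest, const struct pair *src)
{
  dest->half[1] = src->half[0];
  dest->half[0] = src->half[1];   /* Reads the new value if dest == src.  */
}

/* Same rotate, but the half that the first store may overwrite is saved
   in a temporary beforehand.  */
void
rotate_with_temp (struct pair *dest, const struct pair *src)
{
  uint64_t tem = src->half[1];
  dest->half[1] = src->half[0];
  dest->half[0] = tem;
}
```

Calling rotate_naive with the same object as source and destination leaves
both halves equal to the old half[0], while rotate_with_temp swaps them as
intended.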

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-03-05  Jakub Jelinek  

PR rtl-optimization/114211
* lower-subreg.cc (resolve_simple_move): For double-word
rotates by BITS_PER_WORD if there is overlap between source
and destination use a temporary.

* gcc.dg/pr114211.c: New test.

--- gcc/lower-subreg.cc.jj  2024-01-03 11:51:33.713700906 +0100
+++ gcc/lower-subreg.cc 2024-03-04 20:29:13.911428988 +0100
@@ -927,6 +927,21 @@ resolve_simple_move (rtx set, rtx_insn *
 SRC's operator.  */
  dest = resolve_operand_for_swap_move_operator (dest);
  src = src_op;
+ if (resolve_reg_p (src))
+   {
+ gcc_assert (GET_CODE (src) == CONCATN);
+ if (reg_overlap_mentioned_p (XVECEXP (dest, 0, 0),
+  XVECEXP (src, 0, 1)))
+   {
+ /* If there is overlap between the first half of the
+destination and what will be stored to the second one,
+use a temporary pseudo.  See PR114211.  */
+ rtx tem = gen_reg_rtx (GET_MODE (XVECEXP (src, 0, 1)));
+ emit_move_insn (tem, XVECEXP (src, 0, 1));
+ src = copy_rtx (src);
+ XVECEXP (src, 0, 1) = tem;
+   }
+   }
}
   else if (resolve_reg_p (src_op))
{
--- gcc/testsuite/gcc.dg/pr114211.c.jj  2024-03-04 20:37:58.735339443 +0100
+++ gcc/testsuite/gcc.dg/pr114211.c 2024-03-04 20:37:33.78077 +0100
@@ -0,0 +1,23 @@
+/* PR rtl-optimization/114211 */
+/* { dg-do run { target int128 } } */
+/* { dg-options "-O -fno-tree-coalesce-vars -Wno-psabi" } */
+
+typedef unsigned __int128 V __attribute__((__vector_size__ (16)));
+unsigned int u;
+V v;
+
+V
+foo (unsigned __int128 h)
+{
+  h = h << 64 | h >> 64;
+  h *= ~u;
+  return h + v;
+}
+
+int
+main ()
+{
+  V x = foo (1);
+  if (x[0] != (unsigned __int128) 0xffffffff << 64)
+    __builtin_abort ();
+}

Jakub



[PATCHv2] fwprop: Avoid volatile defines to be propagated

2024-03-04 Thread HAO CHEN GUI
Hi,
  This patch fixes a potential problem exposed by the patch for PR111267.
With that patch, a volatile asm operand can be propagated into a single
set insn. If the define insn is not eliminated after the propagation, the
volatile asm operand is then executed more than once. At present the
set_src_cost comparison may reject such a propagation, but it can be
accepted once set_src_cost is replaced with insn cost. I actually found
the problem while testing my patch that replaces set_src_cost with
insn_cost in the fwprop pass.
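
  To illustrate why duplicating a volatile operation matters, here is a
minimal C sketch. It is not taken from the patch: read_device, device_reg
and reads are hypothetical names, and the counter only stands in for the
side effect of a volatile asm or a volatile memory read.

```c
/* Hypothetical stand-in for a volatile operation: every evaluation has
   an observable side effect, counted here in READS.  */
int reads;
volatile int device_reg = 7;

int
read_device (void)
{
  reads++;
  return device_reg;
}

/* Original shape: the volatile source is evaluated once in the define,
   and the use only reads the defined value.  */
int
use_once (void)
{
  int def = read_device ();
  int use = def + 1;
  return use + def;
}

/* Shape after propagating the source into the use while the define is
   still needed: the volatile source is now evaluated twice.  */
int
use_after_bad_propagation (void)
{
  int def = read_device ();
  int use = read_device () + 1;
  return use + def;
}
```

Both functions return the same value here, but the second one performs the
volatile read twice, which is the behaviour the patch wants to rule out.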

  Compared to the last version, the volatile_insn_p check is replaced with
volatile_refs_p so that volatile memory references are checked as well.
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646482.html

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions. Is it OK for the trunk?

Thanks
Gui Haochen

ChangeLog
fwprop: Avoid volatile defines to be propagated

The patch for PR111267 (commit id 86de9b66480b710202a2898cf513db105d8c432f)
introduces an exception for propagation into a single set insn.  A
propagation that might not be profitable (as checked by profitable_p) is
still allowed when the target is a single set insn.  This has a potential
problem: a volatile operand might be propagated into a single set insn.
If the define insn is not eliminated after propagation, the volatile
operand will be executed multiple times.  This patch fixes the problem by
skipping propagation when the set source rtx is volatile.

gcc/
* fwprop.cc (forward_propagate_into): Return false for volatile set
source rtx.

gcc/testsuite/
* gcc.target/powerpc/fwprop-1.c: New.

patch.diff
diff --git a/gcc/fwprop.cc b/gcc/fwprop.cc
index 7872609b336..cb6fd6700ca 100644
--- a/gcc/fwprop.cc
+++ b/gcc/fwprop.cc
@@ -854,6 +854,8 @@ forward_propagate_into (use_info *use, bool reg_prop_only = false)

   rtx dest = SET_DEST (def_set);
   rtx src = SET_SRC (def_set);
+  if (volatile_refs_p (src))
+    return false;

   /* Allow propagations into a loop only for reg-to-reg copies, since
  replacing one register by another shouldn't increase the cost.
diff --git a/gcc/testsuite/gcc.target/powerpc/fwprop-1.c b/gcc/testsuite/gcc.target/powerpc/fwprop-1.c
new file mode 100644
index 000..07b207f980c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/fwprop-1.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -fdump-rtl-fwprop1-details" } */
+/* { dg-final { scan-rtl-dump-not "propagating insn" "fwprop1" } } */
+
+/* Verify that volatile asm operands are not propagated.  */
+long long foo ()
+{
+  long long res;
+  __asm__ __volatile__(
+""
+  : "=r" (res)
+  :
+  : "memory");
+  return res;
+}



RE: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-03-04 Thread Li, Pan2
Thanks Richard for comments.

> I do wonder what the existing usadd patterns with integer vector modes
> in various targets do?
> Those define_insn will at least not end up in the optab set I guess,
> so they must end up
> being either unused or used by explicit gen_* (via intrinsic
> functions?) or by combine?

For usadd with vector modes, I think backends like RISC-V try to leverage
instructions like Vector Single-Width Saturating Add (aka vsaddu.vv/x/i).
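
For reference, the scalar operation that an unsigned saturating add computes
can be written in plain C as below. This is only a sketch: sat_addu32 is a
hypothetical name, and the exact source form that the middle end would
recognize as IFN_SAT_ADD is not stated in this message.

```c
#include <stdint.h>

/* Branchless unsigned saturating add.  If x + y wraps, the truncated sum
   is smaller than x, the comparison yields 1, and negating it gives an
   all-ones mask that forces the result to UINT32_MAX.  */
uint32_t
sat_addu32 (uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;
  return sum | -(uint32_t) (sum < x);
}
```

Without overflow the mask is zero and the plain sum is returned unchanged.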

> I think simply changing gen_*_fixed_libfunc to gen_int_libfunc won't
> work.  Since there's
> no libgcc support I'd leave it as gen_*_fixed_libfunc thus no library
> fallback for integers?

The change to gen_int_libfunc follows the other int optabs. I am not sure
whether it will hit the standard name usaddm3 for vector modes.
But the happy path for scalar modes works up to a point; please correct me
if I have misunderstood anything.

#0  riscv_expand_usadd (dest=0x76a8c7c8, x=0x76a8c798, 
y=0x76a8c7b0) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/config/riscv/riscv.cc:10662
#1  0x029f142a in gen_usaddsi3 (operand0=0x76a8c7c8, 
operand1=0x76a8c798, operand2=0x76a8c7b0) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/config/riscv/riscv.md:3848
#2  0x01751e60 in insn_gen_fn::operator() 
(this=0x4910e70 ) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/recog.h:441
#3  0x0180f553 in maybe_gen_insn (icode=CODE_FOR_usaddsi3, nops=3, 
ops=0x7fffd2c0) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs.cc:8232
#4  0x0180fa42 in maybe_expand_insn (icode=CODE_FOR_usaddsi3, nops=3, 
ops=0x7fffd2c0) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs.cc:8275
#5  0x0180fade in expand_insn (icode=CODE_FOR_usaddsi3, nops=3, 
ops=0x7fffd2c0) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/optabs.cc:8306
#6  0x015cebdc in expand_fn_using_insn (stmt=0x76a36480, 
icode=CODE_FOR_usaddsi3, noutputs=1, ninputs=2) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/internal-fn.cc:254
#7  0x015de146 in expand_direct_optab_fn (fn=IFN_SAT_ADD, 
stmt=0x76a36480, optab=usadd_optab, nargs=2) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/internal-fn.cc:3818
#8  0x015e3610 in expand_SAT_ADD (fn=IFN_SAT_ADD, stmt=0x76a36480) 
at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/internal-fn.def:278
#9  0x015e65b6 in expand_internal_call (fn=IFN_SAT_ADD, 
stmt=0x76a36480) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/internal-fn.cc:4914
#10 0x015e65e5 in expand_internal_call (stmt=0x76a36480) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/internal-fn.cc:4922
#11 0x01248c8f in expand_call_stmt (stmt=0x76a36480) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cfgexpand.cc:2771
#12 0x0124d392 in expand_gimple_stmt_1 (stmt=0x76a36480) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cfgexpand.cc:3932
#13 0x0124d9aa in expand_gimple_stmt (stmt=0x76a36480) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cfgexpand.cc:4077
#14 0x0124dac4 in expand_gimple_tailcall (bb=0x76dddae0, 
stmt=0x76a36480, can_fallthru=0x7fffd800) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cfgexpand.cc:4123
#15 0x0125636b in expand_gimple_basic_block (bb=0x76dddae0, 
disable_tail_calls=false) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cfgexpand.cc:6107
#16 0x01258a1a in (anonymous namespace)::pass_expand::execute 
(this=0x556d180, fun=0x76a7f2e0) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cfgexpand.cc:6872
#17 0x01873565 in execute_one_pass (pass=0x556d180) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/passes.cc:2646
#18 0x01873948 in execute_pass_list_1 (pass=0x556d180) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/passes.cc:2755
#19 0x018739d6 in execute_pass_list (fn=0x76a7f2e0, pass=0x5568870) 
at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/passes.cc:2766
#20 0x012bc975 in cgraph_node::expand (this=0x76c2c880) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cgraphunit.cc:1845
#21 0x012bd18f in expand_all_functions () at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cgraphunit.cc:2028
#22 0x012bdcc5 in symbol_table::compile (this=0x76c06000) at 
/home/pli/gcc/111/riscv-gnu-toolchain/gcc/__RISC-V_BUILD__/../gcc/cgraphunit.cc:2402
#23 0x012be16c in symbol_table::finalize_compilation_unit 
(this=0x76c06000) at 

[PATCH] bitint: Handle BIT_FIELD_REF lowering [PR114157]

2024-03-04 Thread Jakub Jelinek
Hi!

The following patch adds support for BIT_FIELD_REF lowering with
large/huge _BitInt lhs.  BIT_FIELD_REF requires mode argument first
operand, so the operand shouldn't be any huge _BitInt.
If we only access limbs from inside of BIT_FIELD_REF using constant
indexes, we can just create a new BIT_FIELD_REF to extract the limb,
but if we need to use variable index in a loop, I'm afraid we need
to spill it into memory, which is what the following patch does.
If there is some bitwise type for the extraction, it extracts just
what we need and not more than that, otherwise it spills the whole
first argument of BIT_FIELD_REF and uses MEM_REF with an offset
with VIEW_CONVERT_EXPR around it.
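
The difference between the two access patterns can be pictured with a small
C analogy.  This is only a sketch under assumed names (struct wide, limb2,
limb_at and sum_limbs are hypothetical and not part of the patch): a value
whose limbs can only be named at fixed positions is read directly for a
constant index, but is copied to an addressable temporary before it is
indexed with a variable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register-like value: four 64-bit limbs with no array
   view, so a limb can only be named at a compile-time position.  */
struct wide { uint64_t l0, l1, l2, l3; };

/* Constant index: extract the limb directly, analogous to building a
   new BIT_FIELD_REF at a constant bit offset.  */
uint64_t
limb2 (struct wide v)
{
  return v.l2;
}

/* Variable index: spill all limbs to memory first, then index it.  */
uint64_t
limb_at (struct wide v, unsigned idx)
{
  uint64_t spill[4] = { v.l0, v.l1, v.l2, v.l3 };
  assert (idx < 4);
  return spill[idx];
}

/* A loop over the limbs spills once and then uses the variable index.  */
uint64_t
sum_limbs (struct wide v)
{
  uint64_t spill[4] = { v.l0, v.l1, v.l2, v.l3 };
  uint64_t sum = 0;
  for (unsigned i = 0; i < 4; i++)
    sum += spill[i];
  return sum;
}
```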

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-03-05  Jakub Jelinek  

PR middle-end/114157
* gimple-lower-bitint.cc: Include stor-layout.h.
(mergeable_op): Return true for BIT_FIELD_REF.
(struct bitint_large_huge): Declare handle_bit_field_ref method.
(bitint_large_huge::handle_bit_field_ref): New method.
(bitint_large_huge::handle_stmt): Use it for BIT_FIELD_REF.

* gcc.dg/bitint-98.c: New test.
* gcc.target/i386/avx2-pr114157.c: New test.
* gcc.target/i386/avx512f-pr114157.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2024-03-04 11:14:57.450288563 +0100
+++ gcc/gimple-lower-bitint.cc  2024-03-04 18:51:06.833008534 +0100
@@ -54,6 +54,7 @@ along with GCC; see the file COPYING3.
 #include "tree-cfgcleanup.h"
 #include "tree-switch-conversion.h"
 #include "ubsan.h"
+#include "stor-layout.h"
 #include "gimple-lower-bitint.h"
 
 /* Split BITINT_TYPE precisions in 4 categories.  Small _BitInt, where
@@ -212,6 +213,7 @@ mergeable_op (gimple *stmt)
 case BIT_NOT_EXPR:
 case SSA_NAME:
 case INTEGER_CST:
+case BIT_FIELD_REF:
   return true;
 case LSHIFT_EXPR:
   {
@@ -435,6 +437,7 @@ struct bitint_large_huge
   tree handle_plus_minus (tree_code, tree, tree, tree);
   tree handle_lshift (tree, tree, tree);
   tree handle_cast (tree, tree, tree);
+  tree handle_bit_field_ref (tree, tree);
   tree handle_load (gimple *, tree);
   tree handle_stmt (gimple *, tree);
   tree handle_operand_addr (tree, gimple *, int *, int *);
@@ -1685,6 +1688,86 @@ bitint_large_huge::handle_cast (tree lhs
   return NULL_TREE;
 }
 
+/* Helper function for handle_stmt method, handle a BIT_FIELD_REF.  */
+
+tree
+bitint_large_huge::handle_bit_field_ref (tree op, tree idx)
+{
+  if (tree_fits_uhwi_p (idx))
+{
+  if (m_first)
+   m_data.safe_push (NULL);
+  ++m_data_cnt;
+  unsigned HOST_WIDE_INT sz = tree_to_uhwi (TYPE_SIZE (m_limb_type));
+  tree bfr = build3 (BIT_FIELD_REF, m_limb_type,
+TREE_OPERAND (op, 0),
+TYPE_SIZE (m_limb_type),
+size_binop (PLUS_EXPR, TREE_OPERAND (op, 2),
+bitsize_int (tree_to_uhwi (idx) * sz)));
+  tree r = make_ssa_name (m_limb_type);
+  gimple *g = gimple_build_assign (r, bfr);
+  insert_before (g);
+  tree type = limb_access_type (TREE_TYPE (op), idx);
+  if (!useless_type_conversion_p (type, m_limb_type))
+   r = add_cast (type, r);
+  return r;
+}
+  tree var;
+  if (m_first)
+{
+  unsigned HOST_WIDE_INT sz = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (op)));
+  machine_mode mode;
+  tree type, bfr;
+  if (bitwise_mode_for_size (sz).exists (&mode)
+ && known_eq (GET_MODE_BITSIZE (mode), sz))
+   type = bitwise_type_for_mode (mode);
+  else
+   {
+ mode = VOIDmode;
+ type = TYPE_MAIN_VARIANT (TREE_TYPE (TREE_OPERAND (op, 0)));
+   }
+  if (TYPE_ALIGN (type) < TYPE_ALIGN (TREE_TYPE (op)))
+   type = build_aligned_type (type, TYPE_ALIGN (TREE_TYPE (op)));
+  var = create_tmp_var (type);
+  TREE_ADDRESSABLE (var) = 1;
+  gimple *g;
+  if (mode != VOIDmode)
+   {
+ bfr = build3 (BIT_FIELD_REF, type, TREE_OPERAND (op, 0),
+   TYPE_SIZE (type), TREE_OPERAND (op, 2));
+ g = gimple_build_assign (make_ssa_name (type),
+  BIT_FIELD_REF, bfr);
+ gimple_set_location (g, m_loc);
+ gsi_insert_after (&m_init_gsi, g, GSI_NEW_STMT);
+ bfr = gimple_assign_lhs (g);
+   }
+  else
+   bfr = TREE_OPERAND (op, 0);
+  g = gimple_build_assign (var, bfr);
+  gimple_set_location (g, m_loc);
+  gsi_insert_after (&m_init_gsi, g, GSI_NEW_STMT);
+  if (mode == VOIDmode)
+   {
+ unsigned HOST_WIDE_INT nelts
+   = CEIL (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (op))), limb_prec);
+ tree atype = build_array_type_nelts (m_limb_type, nelts);
+ var = build2 (MEM_REF, atype, build_fold_addr_expr (var),
+   build_int_cst (build_pointer_type (type),
+  tree_to_uhwi (TREE_OPERAND (op, 2))
+

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-03-04 Thread Li, Pan2
Thanks Jeff for comments.

> But in the case of a vector modes, we can usually reinterpret the 
> underlying bits in whatever mode we want and do any of the usual 
> operations on those bits.

Yes, I think that is why we can allow vector modes in get_stored_val, if my
understanding is correct.
The value is then returned in the different mode by gen_lowpart.
Unfortunately, some modes (smaller than a vector bit size, like V2SF or V2QI
for vlen=128) are considered invalid by validate_subreg, and the NULL_RTX
that comes back results in the final ICE.

Thus, considering that we are in stage 4, I wonder if this is an acceptable
fix, i.e. finding somewhere to filter out the invalid modes before going to
gen_lowpart.

Pan

-----Original Message-----
From: Jeff Law  
Sent: Monday, March 4, 2024 6:47 AM
To: Robin Dapp ; Li, Pan2 ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val



On 2/29/24 06:28, Robin Dapp wrote:
> On 2/29/24 02:38, Li, Pan2 wrote:
>>> So it's going to check if V2SF can be tied to DI and V4QI with SI.  I
>>> suspect those are going to fail for RISC-V as those aren't tieable.
>>
>> Yes, you are right. Different REG_CLASS are not allowed to be tieable in 
>> RISC-V.
>>
>> static bool
>> riscv_modes_tieable_p (machine_mode mode1, machine_mode mode2)
>> {
>>/* We don't allow different REG_CLASS modes tieable since it
>>   will cause ICE in register allocation (RA).
>>   E.g. V2SI and DI are not tieable.  */
>>if (riscv_v_ext_mode_p (mode1) != riscv_v_ext_mode_p (mode2))
>>  return false;
>>return (mode1 == mode2
>>|| !(GET_MODE_CLASS (mode1) == MODE_FLOAT
>> && GET_MODE_CLASS (mode2) == MODE_FLOAT));
>> }
> 
> Yes, but what we set tieable is e.g. V4QI and V2SF.
But in the case of a vector modes, we can usually reinterpret the 
underlying bits in whatever mode we want and do any of the usual 
operations on those bits.

In my mind that's fundamentally different than the int vs fp case.  If 
we have an integer value in an FP register, we can't really operate on 
the value in any sensible way without first copying it over to the 
integer register file and vice-versa.

Jeff


Re: [PATCH] c++: Don't set DECL_CONTEXT to nested template-template parameters [PR98881]

2024-03-04 Thread Nathaniel Shead
On Mon, Mar 04, 2024 at 10:07:33PM -0500, Patrick Palka wrote:
> On Tue, 5 Mar 2024, Nathaniel Shead wrote:
> 
> > On Mon, Mar 04, 2024 at 09:26:00PM -0500, Patrick Palka wrote:
> > > On Tue, 5 Mar 2024, Nathaniel Shead wrote:
> > > 
> > > > On Mon, Mar 04, 2024 at 07:14:54PM -0500, Patrick Palka wrote:
> > > > > On Sat, 2 Mar 2024, Nathaniel Shead wrote:
> > > > > 
> > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > > > > 
> > > > > > -- >8 --
> > > > > > 
> > > > > > When streaming in a nested template-template parameter as in the
> > > > > > attached testcase, we end up reaching the containing 
> > > > > > template-template
> > > > > > parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to
> > > > > > this (nested) template-template parameter, as it should already be 
> > > > > > the
> > > > > > struct that the outer template-template parameter is declared on.
> > > > > > 
> > > > > > PR c++/98881
> > > > > > 
> > > > > > gcc/cp/ChangeLog:
> > > > > > 
> > > > > > * module.cc (trees_out::tpl_parms_fini): Clarify logic purely
> > > > > > for checking purposes. Don't consider a template template
> > > > > > parameter as the owning template.
> > > > > > (trees_in::tpl_parms_fini): Don't consider a template template
> > > > > > parameter as the owning template.
> > > > > > 
> > > > > > gcc/testsuite/ChangeLog:
> > > > > > 
> > > > > > * g++.dg/modules/tpl-tpl-parm-3_a.H: New test.
> > > > > > * g++.dg/modules/tpl-tpl-parm-3_b.C: New test.
> > > > > > 
> > > > > > Signed-off-by: Nathaniel Shead 
> > > > > > ---
> > > > > >  gcc/cp/module.cc| 17 
> > > > > > -
> > > > > >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H | 11 +++
> > > > > >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C | 13 +
> > > > > >  3 files changed, 36 insertions(+), 5 deletions(-)
> > > > > >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > > > > >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C
> > > > > > 
> > > > > > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > > > > > index 67f132d28d7..5663d01ed9c 100644
> > > > > > --- a/gcc/cp/module.cc
> > > > > > +++ b/gcc/cp/module.cc
> > > > > > @@ -10126,10 +10126,14 @@ trees_out::tpl_parms_fini (tree tmpl, 
> > > > > > unsigned tpl_levels)
> > > > > >   tree dflt = TREE_PURPOSE (parm);
> > > > > >   tree_node (dflt);
> > > > > >  
> > > > > > - if (streaming_p ())
> > > > > > + if (CHECKING_P && streaming_p ())
> > > > > > {
> > > > > > + /* Sanity check that the DECL_CONTEXT we'll infer when
> > > > > > +streaming in is correct.  */
> > > > > >   tree decl = TREE_VALUE (parm);
> > > > > > - if (TREE_CODE (decl) == TEMPLATE_DECL)
> > > > > > + if (TREE_CODE (decl) == TEMPLATE_DECL
> > > > > > + /* A template template parm is not the owning 
> > > > > > template.  */
> > > > > > + && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > > > > > {
> > > > > >   tree ctx = DECL_CONTEXT (decl);
> > > > > >   tree inner = DECL_TEMPLATE_RESULT (decl);
> > > > > > @@ -10164,8 +10168,13 @@ trees_in::tpl_parms_fini (tree tmpl, 
> > > > > > unsigned tpl_levels)
> > > > > > return false;
> > > > > >   TREE_PURPOSE (parm) = dflt;
> > > > > >  
> > > > > > + /* Original template template parms have a context
> > > > > > +of their owning template.  Reduced ones do not.
> > > > > > +But if TMPL is itself a template template parm
> > > > > > +then it cannot be the owning template.  */
> > > > > >   tree decl = TREE_VALUE (parm);
> > > > > > - if (TREE_CODE (decl) == TEMPLATE_DECL)
> > > > > > + if (TREE_CODE (decl) == TEMPLATE_DECL
> > > > > > + && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > > > > 
> > > > > IIUC a TEMPLATE_DECL inside a template parameter list always 
> > > > > represents
> > > > > a template template parm, so won't this effectively disable the
> > > > > DECL_CONTEXT setting logic?
> > > > 
> > > > This is only when 'tmpl' (i.e. the containing TEMPLATE_DECL that we're
> > > > streaming) is itself a template template parm.
> > > 
> > > D'oh, makes sense.
> > > 
> > > > 
> > > > > > {
> > > > > >   tree inner = DECL_TEMPLATE_RESULT (decl);
> > > > > >   tree tpi = (TREE_CODE (inner) == TYPE_DECL
> > > > > > @@ -10173,8 +10182,6 @@ trees_in::tpl_parms_fini (tree tmpl, 
> > > > > > unsigned tpl_levels)
> > > > > >   : DECL_INITIAL (inner));
> > > > > >   bool original = (TEMPLATE_PARM_LEVEL (tpi)
> > > > > >== TEMPLATE_PARM_ORIG_LEVEL (tpi));
> > > > > > - /* Original template template parms have a context
> > > > > > -of their owning template.  Reduced ones do not.  */
> > > > > >   if (original)
> > > > > > 

Re: [PATCH] i386: Guard noreturn no-callee-saved-registers optimization with -mnoreturn-no-callee-saved-registers [PR38534]

2024-03-04 Thread Hongtao Liu
On Thu, Feb 29, 2024 at 2:20 PM Hongtao Liu  wrote:
>
> On Wed, Feb 28, 2024 at 4:54 PM Jakub Jelinek  wrote:
> >
> > Hi!
> >
> > Adding Hongtao and Honza into the loop as the ones who acked the original
> > patch.
> >
> > The no_callee_saved_registers by default for noreturn functions change can
> > break in-process backtrace(3) or backtraces from debugger or other process
> > (quite often, any time the noreturn function decides to use the bp register
> > and any of the parent frames uses a frame pointer; the unwinder just crashes
> > in the libgcc unwinder case, gdb prints stack corrupted message), so I'd
> > like to save bp register in that case:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2024-February/646591.html
> I think this patch makes sense and LGTM, we save and restore frame
> pointer for noreturn.
> >
> > and additionally the no_callee_saved_registers by default for noreturn
> > functions change can make debugging harder, again not localized to the
> > noreturn function, but any of its callers.  So, if say glibc abort function
> > implementation needs a lot of normally callee-saved registers, no matter how
> > users recompile their apps, they will see garbage or optimized out
> > vars/parameters in their code unless they rebuild their glibc with -O0.
> > So, I think we should guard that by a non-default option:
From what has been discussed so far, I am inclined toward this proposal.
If there are no additional objections (or concerns) in a few days, ok
for the trunk.
> >
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao


Re: [patch, libgfortran] Part 2: PR105456 Child I/O does not propagate iostat

2024-03-04 Thread Jerry D

On 3/1/24 11:24 AM, rep.dot@gmail.com wrote:

Hi Jerry and Steve,

On 29 February 2024 19:28:19 CET, Jerry D  wrote:

On 2/29/24 10:13 AM, Steve Kargl wrote:

On Thu, Feb 29, 2024 at 09:36:43AM -0800, Jerry D wrote:

On 2/29/24 1:47 AM, Bernhard Reutner-Fischer wrote:


And, just for my own education, the length limitation of iomsg to 255
chars is not backed by the standard AFAICS, right? It's just our
STRERR_MAXSZ?


Yes, it's what we have had for a long, long time. Once you throw an error
things get very processor dependent. I found MSGLEN set to 100 and IOMSG_len
to 256. Nothing magic about it.



There is no restriction on the length for the iomsg-variable
that receives the generated error message.  In fact, if the
iomsg-variable has a deferred-length type parameter, then
(re)-allocation to the exact length is expected.

F2023

12.11.6 IOMSG= specifier

If an error, end-of-file, or end-of-record condition occurs during
execution of an input/output statement, iomsg-variable is assigned
an explanatory message, as if by intrinsic assignment. If no such
condition occurs, the definition status and value of iomsg-variable
are unchanged.
   character(len=23) emsg
   read(fd,*,iomsg=emsg)

Here, the generated iomsg is either truncated to a length of 23
or padded with blanks to a length of 23.

   character(len=:), allocatable :: emsg
   read(fd,*,iomsg=emsg)

Here, emsg should have the length of whatever error message was
generated.

HTH
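
The truncate-or-pad rule for a fixed-length iomsg variable can be modelled in
C as follows. This is only an illustrative sketch, not libgfortran code:
assign_fixed is a hypothetical name, and the buffer is a Fortran-style
character variable without a terminating NUL.

```c
#include <string.h>

/* Model of intrinsic assignment to a fixed-length character variable:
   copy at most DEST_LEN characters of the message and fill whatever
   remains with blanks.  No NUL terminator is written.  */
void
assign_fixed (char *dest, size_t dest_len, const char *msg, size_t msg_len)
{
  size_t n = msg_len < dest_len ? msg_len : dest_len;
  memcpy (dest, msg, n);
  memset (dest + n, ' ', dest_len - n);
}
```

A message longer than the variable is cut off at dest_len; a shorter one is
followed by blanks up to dest_len.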



Well, currently, if someone uses a string longer than 256 we are going to
chop it off.

Do we want to process this differently now?


Yes. There is some odd hunk about discrepancy of passed len and actual len 
afterwards in 22-007-r1, IIRC. Didn't look closely though.


--- snip ---

Attached is the revised patch using the already available 
string_len_trim function.
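
For readers unfamiliar with it, the kind of trailing-blank trim involved can
be sketched in C as below. The body of libgfortran's string_len_trim is not
shown in this message, so len_trim_sketch is a hypothetical, simplified
stand-in and not the library's implementation.

```c
#include <stddef.h>

/* Return the length of the LEN-character, blank-padded string S once
   trailing blanks are ignored.  S need not be NUL-terminated.  */
size_t
len_trim_sketch (const char *s, size_t len)
{
  while (len > 0 && s[len - 1] == ' ')
    len--;
  return len;
}
```

Printing only the first len_trim_sketch (msg, len) characters avoids emitting
the blank padding of a fixed-length message buffer.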


This hunk is only executed if a user has not passed an iostat or iomsg 
variable in the parent I/O statement and an error is triggered which 
terminates execution of the program. In this case, the iomsg string is 
provided in the usual error message in a "processor defined" way.


(F2023):

12.6.4.8.3 Executing defined input/output data transfers
---
11 If the iostat argument of the defined input/output procedure has a 
nonzero value when that procedure returns, and the processor therefore 
terminates execution of the program as described in 12.11, the processor 
shall make the value of the iomsg argument available in a 
processor-dependent manner.

---

OK for trunk?

Regards,

Jerry


commit 51a24ace512e96b425bcde46c056e816c4606784
Author: Jerry DeLisle 
Date:   Mon Mar 4 18:45:49 2024 -0800

Fortran: Add user defined error messages for UDTIO.

The defines IOMSG_LEN and MSGLEN were redundant so these are combined
into IOMSG_LEN as defined in io.h.

The remainder of the patch adds checks for when a user defined
derived type IO procedure sets the IOSTAT or IOMSG variables
independent of the library defined I/O messages.

PR libfortran/105456

libgfortran/ChangeLog:

* io/io.h (IOMSG_LEN): Moved to here.
* io/list_read.c (MSGLEN): Removed MSGLEN.
(convert_integer): Changed MSGLEN to IOMSG_LEN.
(parse_repeat): Likewise.
(read_logical): Likewise.
(read_integer): Likewise.
(read_character): Likewise.
(parse_real): Likewise.
(read_complex): Likewise.
(read_real): Likewise.
(check_type): Likewise.
(list_formatted_read_scalar): Adjust to IOMSG_LEN.
(nml_read_obj): Add user defined error message.
* io/transfer.c (unformatted_read): Add user defined error
message.
(unformatted_write): Add user defined error message.
(formatted_transfer_scalar_read): Add user defined error message.
(formatted_transfer_scalar_write): Add user defined error message.
* io/write.c (list_formatted_write_scalar): Add user defined error message.
(nml_write_obj): Add user defined error message.

gcc/testsuite/ChangeLog:

* gfortran.dg/pr105456-nmlr.f90: New test.
* gfortran.dg/pr105456-nmlw.f90: New test.
* gfortran.dg/pr105456-ruf.f90: New test.
* gfortran.dg/pr105456-wf.f90: New test.
* gfortran.dg/pr105456-wuf.f90: New test.

diff --git a/gcc/testsuite/gfortran.dg/pr105456-nmlr.f90 b/gcc/testsuite/gfortran.dg/pr105456-nmlr.f90
new file mode 100644
index 000..5ce5d082133
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr105456-nmlr.f90
@@ -0,0 +1,60 @@
+! { dg-do run }
+! { dg-shouldfail "The users message" }
+module m
+  implicit none
+  type :: t
+character :: c
+integer :: k
+  contains
+procedure :: write_formatted
+generic :: write(formatted) => write_formatted
+procedure :: read_formatted
+generic :: 

Re: [PATCH] c++: Don't set DECL_CONTEXT to nested template-template parameters [PR98881]

2024-03-04 Thread Patrick Palka
On Tue, 5 Mar 2024, Nathaniel Shead wrote:

> On Mon, Mar 04, 2024 at 09:26:00PM -0500, Patrick Palka wrote:
> > On Tue, 5 Mar 2024, Nathaniel Shead wrote:
> > 
> > > On Mon, Mar 04, 2024 at 07:14:54PM -0500, Patrick Palka wrote:
> > > > On Sat, 2 Mar 2024, Nathaniel Shead wrote:
> > > > 
> > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > > > 
> > > > > -- >8 --
> > > > > 
> > > > > When streaming in a nested template-template parameter as in the
> > > > > attached testcase, we end up reaching the containing template-template
> > > > > parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to
> > > > > this (nested) template-template parameter, as it should already be the
> > > > > struct that the outer template-template parameter is declared on.
> > > > > 
> > > > >   PR c++/98881
> > > > > 
> > > > > gcc/cp/ChangeLog:
> > > > > 
> > > > >   * module.cc (trees_out::tpl_parms_fini): Clarify logic purely
> > > > >   for checking purposes. Don't consider a template template
> > > > >   parameter as the owning template.
> > > > >   (trees_in::tpl_parms_fini): Don't consider a template template
> > > > >   parameter as the owning template.
> > > > > 
> > > > > gcc/testsuite/ChangeLog:
> > > > > 
> > > > >   * g++.dg/modules/tpl-tpl-parm-3_a.H: New test.
> > > > >   * g++.dg/modules/tpl-tpl-parm-3_b.C: New test.
> > > > > 
> > > > > Signed-off-by: Nathaniel Shead 
> > > > > ---
> > > > >  gcc/cp/module.cc| 17 
> > > > > -
> > > > >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H | 11 +++
> > > > >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C | 13 +
> > > > >  3 files changed, 36 insertions(+), 5 deletions(-)
> > > > >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > > > >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C
> > > > > 
> > > > > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > > > > index 67f132d28d7..5663d01ed9c 100644
> > > > > --- a/gcc/cp/module.cc
> > > > > +++ b/gcc/cp/module.cc
> > > > > @@ -10126,10 +10126,14 @@ trees_out::tpl_parms_fini (tree tmpl, 
> > > > > unsigned tpl_levels)
> > > > > tree dflt = TREE_PURPOSE (parm);
> > > > > tree_node (dflt);
> > > > >  
> > > > > -   if (streaming_p ())
> > > > > +   if (CHECKING_P && streaming_p ())
> > > > >   {
> > > > > +   /* Sanity check that the DECL_CONTEXT we'll infer when
> > > > > +  streaming in is correct.  */
> > > > > tree decl = TREE_VALUE (parm);
> > > > > -   if (TREE_CODE (decl) == TEMPLATE_DECL)
> > > > > +   if (TREE_CODE (decl) == TEMPLATE_DECL
> > > > > +   /* A template template parm is not the owning 
> > > > > template.  */
> > > > > +   && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > > > >   {
> > > > > tree ctx = DECL_CONTEXT (decl);
> > > > > tree inner = DECL_TEMPLATE_RESULT (decl);
> > > > > @@ -10164,8 +10168,13 @@ trees_in::tpl_parms_fini (tree tmpl, 
> > > > > unsigned tpl_levels)
> > > > >   return false;
> > > > > TREE_PURPOSE (parm) = dflt;
> > > > >  
> > > > > +   /* Original template template parms have a context
> > > > > +  of their owning template.  Reduced ones do not.
> > > > > +  But if TMPL is itself a template template parm
> > > > > +  then it cannot be the owning template.  */
> > > > > tree decl = TREE_VALUE (parm);
> > > > > -   if (TREE_CODE (decl) == TEMPLATE_DECL)
> > > > > +   if (TREE_CODE (decl) == TEMPLATE_DECL
> > > > > +   && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > > > 
> > > > IIUC a TEMPLATE_DECL inside a template parameter list always represents
> > > > a template template parm, so won't this effectively disable the
> > > > DECL_CONTEXT setting logic?
> > > 
> > > This is only when 'tmpl' (i.e. the containing TEMPLATE_DECL that we're
> > > streaming) is itself a template template parm.
> > 
> > D'oh, makes sense.
> > 
> > > 
> > > > >   {
> > > > > tree inner = DECL_TEMPLATE_RESULT (decl);
> > > > > tree tpi = (TREE_CODE (inner) == TYPE_DECL
> > > > > @@ -10173,8 +10182,6 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> > > > > tpl_levels)
> > > > > : DECL_INITIAL (inner));
> > > > > bool original = (TEMPLATE_PARM_LEVEL (tpi)
> > > > >  == TEMPLATE_PARM_ORIG_LEVEL (tpi));
> > > > > -   /* Original template template parms have a context
> > > > > -  of their owning template.  Reduced ones do not.  */
> > > > > if (original)
> > > > >   DECL_CONTEXT (decl) = tmpl;
> > > > >   }
> > > > > diff --git a/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H 
> > > > > b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > > > > new file mode 100644

Re: [PATCH] c++: Don't set DECL_CONTEXT to nested template-template parameters [PR98881]

2024-03-04 Thread Nathaniel Shead
On Mon, Mar 04, 2024 at 09:26:00PM -0500, Patrick Palka wrote:
> On Tue, 5 Mar 2024, Nathaniel Shead wrote:
> 
> > On Mon, Mar 04, 2024 at 07:14:54PM -0500, Patrick Palka wrote:
> > > On Sat, 2 Mar 2024, Nathaniel Shead wrote:
> > > 
> > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > > 
> > > > -- >8 --
> > > > 
> > > > When streaming in a nested template-template parameter as in the
> > > > attached testcase, we end up reaching the containing template-template
> > > > parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to
> > > > this (nested) template-template parameter, as it should already be the
> > > > struct that the outer template-template parameter is declared on.
> > > > 
> > > > PR c++/98881
> > > > 
> > > > gcc/cp/ChangeLog:
> > > > 
> > > > * module.cc (trees_out::tpl_parms_fini): Clarify logic purely
> > > > for checking purposes. Don't consider a template template
> > > > parameter as the owning template.
> > > > (trees_in::tpl_parms_fini): Don't consider a template template
> > > > parameter as the owning template.
> > > > 
> > > > gcc/testsuite/ChangeLog:
> > > > 
> > > > * g++.dg/modules/tpl-tpl-parm-3_a.H: New test.
> > > > * g++.dg/modules/tpl-tpl-parm-3_b.C: New test.
> > > > 
> > > > Signed-off-by: Nathaniel Shead 
> > > > ---
> > > >  gcc/cp/module.cc| 17 -
> > > >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H | 11 +++
> > > >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C | 13 +
> > > >  3 files changed, 36 insertions(+), 5 deletions(-)
> > > >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > > >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C
> > > > 
> > > > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > > > index 67f132d28d7..5663d01ed9c 100644
> > > > --- a/gcc/cp/module.cc
> > > > +++ b/gcc/cp/module.cc
> > > > @@ -10126,10 +10126,14 @@ trees_out::tpl_parms_fini (tree tmpl, 
> > > > unsigned tpl_levels)
> > > >   tree dflt = TREE_PURPOSE (parm);
> > > >   tree_node (dflt);
> > > >  
> > > > - if (streaming_p ())
> > > > + if (CHECKING_P && streaming_p ())
> > > > {
> > > > + /* Sanity check that the DECL_CONTEXT we'll infer when
> > > > +streaming in is correct.  */
> > > >   tree decl = TREE_VALUE (parm);
> > > > - if (TREE_CODE (decl) == TEMPLATE_DECL)
> > > > + if (TREE_CODE (decl) == TEMPLATE_DECL
> > > > + /* A template template parm is not the owning 
> > > > template.  */
> > > > + && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > > > {
> > > >   tree ctx = DECL_CONTEXT (decl);
> > > >   tree inner = DECL_TEMPLATE_RESULT (decl);
> > > > @@ -10164,8 +10168,13 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> > > > tpl_levels)
> > > > return false;
> > > >   TREE_PURPOSE (parm) = dflt;
> > > >  
> > > > + /* Original template template parms have a context
> > > > +of their owning template.  Reduced ones do not.
> > > > +But if TMPL is itself a template template parm
> > > > +then it cannot be the owning template.  */
> > > >   tree decl = TREE_VALUE (parm);
> > > > - if (TREE_CODE (decl) == TEMPLATE_DECL)
> > > > + if (TREE_CODE (decl) == TEMPLATE_DECL
> > > > + && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > > 
> > > IIUC a TEMPLATE_DECL inside a template parameter list always represents
> > > a template template parm, so won't this effectively disable the
> > > DECL_CONTEXT setting logic?
> > 
> > This is only when 'tmpl' (i.e. the containing TEMPLATE_DECL that we're
> > streaming) is itself a template template parm.
> 
> D'oh, makes sense.
> 
> > 
> > > > {
> > > >   tree inner = DECL_TEMPLATE_RESULT (decl);
> > > >   tree tpi = (TREE_CODE (inner) == TYPE_DECL
> > > > @@ -10173,8 +10182,6 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> > > > tpl_levels)
> > > >   : DECL_INITIAL (inner));
> > > >   bool original = (TEMPLATE_PARM_LEVEL (tpi)
> > > >== TEMPLATE_PARM_ORIG_LEVEL (tpi));
> > > > - /* Original template template parms have a context
> > > > -of their owning template.  Reduced ones do not.  */
> > > >   if (original)
> > > > DECL_CONTEXT (decl) = tmpl;
> > > > }
> > > > diff --git a/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H 
> > > > b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > > > new file mode 100644
> > > > index 000..21bbc054fa3
> > > > --- /dev/null
> > > > +++ b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > > > @@ -0,0 +1,11 @@
> > > > +// PR c++/98881

Re: [PATCH] c++: Don't set DECL_CONTEXT to nested template-template parameters [PR98881]

2024-03-04 Thread Patrick Palka
On Tue, 5 Mar 2024, Nathaniel Shead wrote:

> On Mon, Mar 04, 2024 at 07:14:54PM -0500, Patrick Palka wrote:
> > On Sat, 2 Mar 2024, Nathaniel Shead wrote:
> > 
> > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > > 
> > > -- >8 --
> > > 
> > > When streaming in a nested template-template parameter as in the
> > > attached testcase, we end up reaching the containing template-template
> > > parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to
> > > this (nested) template-template parameter, as it should already be the
> > > struct that the outer template-template parameter is declared on.
> > > 
> > >   PR c++/98881
> > > 
> > > gcc/cp/ChangeLog:
> > > 
> > >   * module.cc (trees_out::tpl_parms_fini): Clarify logic purely
> > >   for checking purposes. Don't consider a template template
> > >   parameter as the owning template.
> > >   (trees_in::tpl_parms_fini): Don't consider a template template
> > >   parameter as the owning template.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   * g++.dg/modules/tpl-tpl-parm-3_a.H: New test.
> > >   * g++.dg/modules/tpl-tpl-parm-3_b.C: New test.
> > > 
> > > Signed-off-by: Nathaniel Shead 
> > > ---
> > >  gcc/cp/module.cc| 17 -
> > >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H | 11 +++
> > >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C | 13 +
> > >  3 files changed, 36 insertions(+), 5 deletions(-)
> > >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C
> > > 
> > > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > > index 67f132d28d7..5663d01ed9c 100644
> > > --- a/gcc/cp/module.cc
> > > +++ b/gcc/cp/module.cc
> > > @@ -10126,10 +10126,14 @@ trees_out::tpl_parms_fini (tree tmpl, unsigned 
> > > tpl_levels)
> > > tree dflt = TREE_PURPOSE (parm);
> > > tree_node (dflt);
> > >  
> > > -   if (streaming_p ())
> > > +   if (CHECKING_P && streaming_p ())
> > >   {
> > > +   /* Sanity check that the DECL_CONTEXT we'll infer when
> > > +  streaming in is correct.  */
> > > tree decl = TREE_VALUE (parm);
> > > -   if (TREE_CODE (decl) == TEMPLATE_DECL)
> > > +   if (TREE_CODE (decl) == TEMPLATE_DECL
> > > +   /* A template template parm is not the owning template.  */
> > > +   && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > >   {
> > > tree ctx = DECL_CONTEXT (decl);
> > > tree inner = DECL_TEMPLATE_RESULT (decl);
> > > @@ -10164,8 +10168,13 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> > > tpl_levels)
> > >   return false;
> > > TREE_PURPOSE (parm) = dflt;
> > >  
> > > +   /* Original template template parms have a context
> > > +  of their owning template.  Reduced ones do not.
> > > +  But if TMPL is itself a template template parm
> > > +  then it cannot be the owning template.  */
> > > tree decl = TREE_VALUE (parm);
> > > -   if (TREE_CODE (decl) == TEMPLATE_DECL)
> > > +   if (TREE_CODE (decl) == TEMPLATE_DECL
> > > +   && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > 
> > IIUC a TEMPLATE_DECL inside a template parameter list always represents
> > a template template parm, so won't this effectively disable the
> > DECL_CONTEXT setting logic?
> 
> This is only when 'tmpl' (i.e. the containing TEMPLATE_DECL that we're
> streaming) is itself a template template parm.

D'oh, makes sense.

> 
> > >   {
> > > tree inner = DECL_TEMPLATE_RESULT (decl);
> > > tree tpi = (TREE_CODE (inner) == TYPE_DECL
> > > @@ -10173,8 +10182,6 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> > > tpl_levels)
> > > : DECL_INITIAL (inner));
> > > bool original = (TEMPLATE_PARM_LEVEL (tpi)
> > >  == TEMPLATE_PARM_ORIG_LEVEL (tpi));
> > > -   /* Original template template parms have a context
> > > -  of their owning template.  Reduced ones do not.  */
> > > if (original)
> > >   DECL_CONTEXT (decl) = tmpl;
> > >   }
> > > diff --git a/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H 
> > > b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > > new file mode 100644
> > > index 000..21bbc054fa3
> > > --- /dev/null
> > > +++ b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > > @@ -0,0 +1,11 @@
> > > +// PR c++/98881
> > > +// { dg-additional-options "-fmodule-header" }
> > > +// { dg-module-cmi {} }
> > > +
> > > +template  struct X {};
> > > +
> > > +template typename TT>
> > > +struct X> {
> > > +  template typename UU>
> > > +  void f (X>&);
> > > +};
> > 
> > I wonder why the partial specialization is relevant here?  I can't
> > seem to trigger the ICE without the partial specialization.
> > Slightly further reduced to not use bound ttps:
> > 
> > template class TT>
> > struct X { };
> > 
> > template class TT> requires true
> > 

Re: [PATCH] LoongArch: Fix inconsistent description in *sge_

2024-03-04 Thread Guo Jie

Thanks for the feedback.

Comparing a const_imm12_operand with (const_int 1) is indeed handled by generic
constant folding before any tree-based optimization.


I will fix it in patch v2.


On 2024/3/4 at 5:18 PM, Xi Ruoyao wrote:

On Mon, 2024-03-04 at 11:03 +0800, Guo Jie wrote:

The constraint of op[1] is inconsistent with the output template.

gcc/ChangeLog:

* config/loongarch/loongarch.md
(define_insn "*sge_"): Fix inconsistency
error.

---
  gcc/config/loongarch/loongarch.md | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.md
b/gcc/config/loongarch/loongarch.md
index f3b5c641fce..2d25374bdc9 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -3357,10 +3357,10 @@ (define_insn "*sgt_"
  
  (define_insn "*sge_"

    [(set (match_operand:GPR 0 "register_operand" "=r")
-   (any_ge:GPR (match_operand:X 1 "register_operand" "r")
+   (any_ge:GPR (match_operand:X 1 "arith_operand" "rI")
     (const_int 1)))]

No, arith_operand is just register_operand or const_imm12_operand, but
comparing a const_imm12_operand with (const_int 1) should be folded into
a constant (even at -O0, AFAIK).  So allowing const_imm12_operand here
has no benefit.


    ""
-  "slti\t%0,%.,%1"
+  "slt%i1\t%0,%.,%1"
    [(set_attr "type" "slt")
     (set_attr "mode" "")])
  


Re: [PATCH] c++: Don't set DECL_CONTEXT to nested template-template parameters [PR98881]

2024-03-04 Thread Nathaniel Shead
On Mon, Mar 04, 2024 at 07:14:54PM -0500, Patrick Palka wrote:
> On Sat, 2 Mar 2024, Nathaniel Shead wrote:
> 
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > 
> > -- >8 --
> > 
> > When streaming in a nested template-template parameter as in the
> > attached testcase, we end up reaching the containing template-template
> > parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to
> > this (nested) template-template parameter, as it should already be the
> > struct that the outer template-template parameter is declared on.
> > 
> > PR c++/98881
> > 
> > gcc/cp/ChangeLog:
> > 
> > * module.cc (trees_out::tpl_parms_fini): Clarify logic purely
> > for checking purposes. Don't consider a template template
> > parameter as the owning template.
> > (trees_in::tpl_parms_fini): Don't consider a template template
> > parameter as the owning template.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/modules/tpl-tpl-parm-3_a.H: New test.
> > * g++.dg/modules/tpl-tpl-parm-3_b.C: New test.
> > 
> > Signed-off-by: Nathaniel Shead 
> > ---
> >  gcc/cp/module.cc| 17 -
> >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H | 11 +++
> >  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C | 13 +
> >  3 files changed, 36 insertions(+), 5 deletions(-)
> >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> >  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C
> > 
> > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > index 67f132d28d7..5663d01ed9c 100644
> > --- a/gcc/cp/module.cc
> > +++ b/gcc/cp/module.cc
> > @@ -10126,10 +10126,14 @@ trees_out::tpl_parms_fini (tree tmpl, unsigned 
> > tpl_levels)
> >   tree dflt = TREE_PURPOSE (parm);
> >   tree_node (dflt);
> >  
> > - if (streaming_p ())
> > + if (CHECKING_P && streaming_p ())
> > {
> > + /* Sanity check that the DECL_CONTEXT we'll infer when
> > +streaming in is correct.  */
> >   tree decl = TREE_VALUE (parm);
> > - if (TREE_CODE (decl) == TEMPLATE_DECL)
> > + if (TREE_CODE (decl) == TEMPLATE_DECL
> > + /* A template template parm is not the owning template.  */
> > + && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > {
> >   tree ctx = DECL_CONTEXT (decl);
> >   tree inner = DECL_TEMPLATE_RESULT (decl);
> > @@ -10164,8 +10168,13 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> > tpl_levels)
> > return false;
> >   TREE_PURPOSE (parm) = dflt;
> >  
> > + /* Original template template parms have a context
> > +of their owning template.  Reduced ones do not.
> > +But if TMPL is itself a template template parm
> > +then it cannot be the owning template.  */
> >   tree decl = TREE_VALUE (parm);
> > - if (TREE_CODE (decl) == TEMPLATE_DECL)
> > + if (TREE_CODE (decl) == TEMPLATE_DECL
> > + && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> 
> IIUC a TEMPLATE_DECL inside a template parameter list always represents
> a template template parm, so won't this effectively disable the
> DECL_CONTEXT setting logic?

This is only when 'tmpl' (i.e. the containing TEMPLATE_DECL that we're
streaming) is itself a template template parm.
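
To make the guard concrete, here is a minimal sketch using toy types. The
names Decl and maybe_set_context and all of the fields are hypothetical
stand-ins for illustration, not GCC's tree API; the sketch only models the
shape of the stream-in condition in the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for a tree node; not GCC's real representation.
struct Decl {
  bool is_template = false;                // models TREE_CODE (decl) == TEMPLATE_DECL
  bool is_template_template_parm = false;  // models DECL_TEMPLATE_TEMPLATE_PARM_P
  bool original = false;                   // models PARM_LEVEL == PARM_ORIG_LEVEL
  const Decl *context = nullptr;           // models DECL_CONTEXT
};

// Models the patched stream-in step: the context is inferred only when the
// containing template TMPL is not itself a template template parm.
inline void maybe_set_context (Decl &decl, const Decl &tmpl)
{
  if (decl.is_template && !tmpl.is_template_template_parm)
    {
      if (decl.original)
        decl.context = &tmpl;
    }
}
```

With an ordinary owning template as TMPL the context is still set as before;
only the case where TMPL is a template template parm is skipped.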

> > {
> >   tree inner = DECL_TEMPLATE_RESULT (decl);
> >   tree tpi = (TREE_CODE (inner) == TYPE_DECL
> > @@ -10173,8 +10182,6 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> > tpl_levels)
> >   : DECL_INITIAL (inner));
> >   bool original = (TEMPLATE_PARM_LEVEL (tpi)
> >== TEMPLATE_PARM_ORIG_LEVEL (tpi));
> > - /* Original template template parms have a context
> > -of their owning template.  Reduced ones do not.  */
> >   if (original)
> > DECL_CONTEXT (decl) = tmpl;
> > }
> > diff --git a/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H 
> > b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > new file mode 100644
> > index 000..21bbc054fa3
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > @@ -0,0 +1,11 @@
> > +// PR c++/98881
> > +// { dg-additional-options "-fmodule-header" }
> > +// { dg-module-cmi {} }
> > +
> > +template  struct X {};
> > +
> > +template typename TT>
> > +struct X> {
> > +  template typename UU>
> > +  void f (X>&);
> > +};
> 
> I wonder why the partial specialization is relevant here?  I can't
> seem to trigger the ICE without the partial specialization.
> Slightly further reduced to not use bound ttps:
> 
> template class TT>
> struct X { };
> 
> template class TT> requires true
> struct X {
>   template class UU>
>   void f(X);
> };
> 
> Maybe the expectation is that tpl_parms_fini for UU should be called
> with tpl_levels=1 (so that we stream only its own template parameters)
> but it's 

Re: [PATCH] c++: Don't set DECL_CONTEXT to nested template-template parameters [PR98881]

2024-03-04 Thread Patrick Palka
On Sat, 2 Mar 2024, Nathaniel Shead wrote:

> Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> 
> -- >8 --
> 
> When streaming in a nested template-template parameter as in the
> attached testcase, we end up reaching the containing template-template
> parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to
> this (nested) template-template parameter, as it should already be the
> struct that the outer template-template parameter is declared on.
> 
>   PR c++/98881
> 
> gcc/cp/ChangeLog:
> 
>   * module.cc (trees_out::tpl_parms_fini): Clarify logic purely
>   for checking purposes. Don't consider a template template
>   parameter as the owning template.
>   (trees_in::tpl_parms_fini): Don't consider a template template
>   parameter as the owning template.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/modules/tpl-tpl-parm-3_a.H: New test.
>   * g++.dg/modules/tpl-tpl-parm-3_b.C: New test.
> 
> Signed-off-by: Nathaniel Shead 
> ---
>  gcc/cp/module.cc| 17 -
>  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H | 11 +++
>  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C | 13 +
>  3 files changed, 36 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
>  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C
> 
> diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> index 67f132d28d7..5663d01ed9c 100644
> --- a/gcc/cp/module.cc
> +++ b/gcc/cp/module.cc
> @@ -10126,10 +10126,14 @@ trees_out::tpl_parms_fini (tree tmpl, unsigned 
> tpl_levels)
> tree dflt = TREE_PURPOSE (parm);
> tree_node (dflt);
>  
> -   if (streaming_p ())
> +   if (CHECKING_P && streaming_p ())
>   {
> +   /* Sanity check that the DECL_CONTEXT we'll infer when
> +  streaming in is correct.  */
> tree decl = TREE_VALUE (parm);
> -   if (TREE_CODE (decl) == TEMPLATE_DECL)
> +   if (TREE_CODE (decl) == TEMPLATE_DECL
> +   /* A template template parm is not the owning template.  */
> +   && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
>   {
> tree ctx = DECL_CONTEXT (decl);
> tree inner = DECL_TEMPLATE_RESULT (decl);
> @@ -10164,8 +10168,13 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> tpl_levels)
>   return false;
> TREE_PURPOSE (parm) = dflt;
>  
> +   /* Original template template parms have a context
> +  of their owning template.  Reduced ones do not.
> +  But if TMPL is itself a template template parm
> +  then it cannot be the owning template.  */
> tree decl = TREE_VALUE (parm);
> -   if (TREE_CODE (decl) == TEMPLATE_DECL)
> +   if (TREE_CODE (decl) == TEMPLATE_DECL
> +   && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))

IIUC a TEMPLATE_DECL inside a template parameter list always represents
a template template parm, so won't this effectively disable the
DECL_CONTEXT setting logic?

>   {
> tree inner = DECL_TEMPLATE_RESULT (decl);
> tree tpi = (TREE_CODE (inner) == TYPE_DECL
> @@ -10173,8 +10182,6 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> tpl_levels)
> : DECL_INITIAL (inner));
> bool original = (TEMPLATE_PARM_LEVEL (tpi)
>  == TEMPLATE_PARM_ORIG_LEVEL (tpi));
> -   /* Original template template parms have a context
> -  of their owning template.  Reduced ones do not.  */
> if (original)
>   DECL_CONTEXT (decl) = tmpl;
>   }
> diff --git a/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H 
> b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> new file mode 100644
> index 000..21bbc054fa3
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> @@ -0,0 +1,11 @@
> +// PR c++/98881
> +// { dg-additional-options "-fmodule-header" }
> +// { dg-module-cmi {} }
> +
> +template  struct X {};
> +
> +template typename TT>
> +struct X> {
> +  template typename UU>
> +  void f (X>&);
> +};

I wonder why the partial specialization is relevant here?  I can't
seem to trigger the ICE without the partial specialization.
Slightly further reduced to not use bound ttps:

template class TT>
struct X { };

template class TT> requires true
struct X {
  template class UU>
  void f(X);
};

Maybe the expectation is that tpl_parms_fini for UU should be called
with tpl_levels=1 (so that we stream only its own template parameters)
but it's instead called with tpl_levels=3 for some reason?

IIUC that assert should always hold in the first iteration of the loop
(for the ttp's own template parameters).  Perhaps for subsequent
iterations we need to actually stream the contexts?

Ah, but we also ICE in the same spot with:

  template class TT> struct X;  // #1
  template 

Re: [PATCH] c++: Don't set DECL_CONTEXT to nested template-template parameters [PR98881]

2024-03-04 Thread Nathaniel Shead
On Mon, Mar 04, 2024 at 06:01:48PM -0500, Jason Merrill wrote:
> On 3/2/24 01:54, Nathaniel Shead wrote:
> > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
> > 
> > -- >8 --
> > 
> > When streaming in a nested template-template parameter as in the
> > attached testcase, we end up reaching the containing template-template
> > parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to
> > this (nested) template-template parameter, as it should already be the
> > struct that the outer template-template parameter is declared on.
> 
> So in the case where tmpl is a template template parameter we want
> DECL_CONTEXT (parm) to be the same as DECL_CONTEXT (tmpl)?  Let's check that
> instead of ignoring it.

No, I don't think so. I guess the closest is that if we keep
iterating through all the nested 'DECL_CONTEXT's of tmpl we should
eventually reach the template result of parm's context, if we find
enough template infos etc.? I'm not entirely sure how to go about this.

But in particular, in the current test case we have:

- tmpl = 
- DECL_CONTEXT (tmpl) = 

- decl = 
- DECL_CONTEXT (decl) =  

('decl' is the declaration associated with 'parm', a tree_list.)
And it's not immediately obvious to me how to unify these.

> > PR c++/98881
> > 
> > gcc/cp/ChangeLog:
> > 
> > * module.cc (trees_out::tpl_parms_fini): Clarify logic purely
> > for checking purposes. Don't consider a template template
> > parameter as the owning template.
> > (trees_in::tpl_parms_fini): Don't consider a template template
> > parameter as the owning template.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > * g++.dg/modules/tpl-tpl-parm-3_a.H: New test.
> > * g++.dg/modules/tpl-tpl-parm-3_b.C: New test.
> > 
> > Signed-off-by: Nathaniel Shead 
> > ---
> >   gcc/cp/module.cc| 17 -
> >   gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H | 11 +++
> >   gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C | 13 +
> >   3 files changed, 36 insertions(+), 5 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> >   create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C
> > 
> > diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
> > index 67f132d28d7..5663d01ed9c 100644
> > --- a/gcc/cp/module.cc
> > +++ b/gcc/cp/module.cc
> > @@ -10126,10 +10126,14 @@ trees_out::tpl_parms_fini (tree tmpl, unsigned 
> > tpl_levels)
> >   tree dflt = TREE_PURPOSE (parm);
> >   tree_node (dflt);
> > - if (streaming_p ())
> > + if (CHECKING_P && streaming_p ())
> > {
> > + /* Sanity check that the DECL_CONTEXT we'll infer when
> > +streaming in is correct.  */
> >   tree decl = TREE_VALUE (parm);
> > - if (TREE_CODE (decl) == TEMPLATE_DECL)
> > + if (TREE_CODE (decl) == TEMPLATE_DECL
> > + /* A template template parm is not the owning template.  */
> > + && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > {
> >   tree ctx = DECL_CONTEXT (decl);
> >   tree inner = DECL_TEMPLATE_RESULT (decl);
> > @@ -10164,8 +10168,13 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> > tpl_levels)
> > return false;
> >   TREE_PURPOSE (parm) = dflt;
> > + /* Original template template parms have a context
> > +of their owning template.  Reduced ones do not.
> > +But if TMPL is itself a template template parm
> > +then it cannot be the owning template.  */
> >   tree decl = TREE_VALUE (parm);
> > - if (TREE_CODE (decl) == TEMPLATE_DECL)
> > + if (TREE_CODE (decl) == TEMPLATE_DECL
> > + && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
> > {
> >   tree inner = DECL_TEMPLATE_RESULT (decl);
> >   tree tpi = (TREE_CODE (inner) == TYPE_DECL
> > @@ -10173,8 +10182,6 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
> > tpl_levels)
> >   : DECL_INITIAL (inner));
> >   bool original = (TEMPLATE_PARM_LEVEL (tpi)
> >== TEMPLATE_PARM_ORIG_LEVEL (tpi));
> > - /* Original template template parms have a context
> > -of their owning template.  Reduced ones do not.  */
> >   if (original)
> > DECL_CONTEXT (decl) = tmpl;
> > }
> > diff --git a/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H 
> > b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > new file mode 100644
> > index 000..21bbc054fa3
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
> > @@ -0,0 +1,11 @@
> > +// PR c++/98881
> > +// { dg-additional-options "-fmodule-header" }
> > +// { dg-module-cmi {} }
> > +
> > +template  struct X {};
> > +
> > +template typename TT>
> > +struct X> {
> > +  template typename UU>
> > +  void f (X>&);
> > +};
> > diff --git a/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C 
> > 

[PATCH] c++: Fix ICE diagnosing incomplete type of overloaded function set [PR98356]

2024-03-04 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

In the linked PR the result of 'get_first_fn' is a USING_DECL against
the template parameter, to be filled in on instantiation. But we don't
actually need to get the first set of the member functions: it's enough
to know that we have a (possibly overloaded) member function at all.
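
As a rough model of the new logic, the sketch below separates "should we
diagnose" from "should we offer the fix-it". The Member type and both helper
functions are hypothetical and only mirror the conditions in the diff; they
are not the front end's real data structures.

```cpp
#include <cassert>

// Hypothetical stand-in for the 'member' operand; not GCC's tree type.
struct Member {
  bool overloaded_fn = false;     // models is_overloaded_fn (member)
  bool function_decl = false;     // models TREE_CODE (member) == FUNCTION_DECL
  bool object_member_fn = false;  // models DECL_OBJECT_MEMBER_FUNCTION_P (member)
  int num_arguments = 0;          // models type_num_arguments, counting "this"
};

// Whether to emit the "invalid use of member function" diagnostic: it is
// enough that the member is some (possibly overloaded) function set.
inline bool diagnose_member_use (const Member &m, bool ms_extensions)
{
  return m.overloaded_fn && !ms_extensions;
}

// Whether to attach the "()" fix-it hint: only for a single concrete
// member function that takes nothing but "this".
inline bool suggest_call_fixit (const Member &m)
{
  return m.function_decl && m.object_member_fn && m.num_arguments == 1;
}
```

In this model a set that is not a single FUNCTION_DECL (for example one
introduced through a dependent using-declaration) is still diagnosed, but no
fix-it is suggested for it.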

PR c++/98356

gcc/cp/ChangeLog:

* typeck2.cc (cxx_incomplete_type_diagnostic): Don't assume
'member' will be a FUNCTION_DECL (or something like it).

gcc/testsuite/ChangeLog:

* g++.dg/pr98356.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/typeck2.cc  | 11 +--
 gcc/testsuite/g++.dg/pr98356.C |  9 +
 2 files changed, 14 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr98356.C

diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 9608bdccd8b..31198b2f9f5 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -350,16 +350,15 @@ cxx_incomplete_type_diagnostic (location_t loc, 
const_tree value,
 bad_member:
   {
tree member = TREE_OPERAND (value, 1);
-   if (is_overloaded_fn (member))
- member = get_first_fn (member);
-
-   if (DECL_FUNCTION_MEMBER_P (member)
-   && ! flag_ms_extensions)
+   if (is_overloaded_fn (member) && !flag_ms_extensions)
  {
gcc_rich_location richloc (loc);
/* If "member" has no arguments (other than "this"), then
   add a fix-it hint.  */
-   if (type_num_arguments (TREE_TYPE (member)) == 1)
+   member = MAYBE_BASELINK_FUNCTIONS (member);
+   if (TREE_CODE (member) == FUNCTION_DECL
+   && DECL_OBJECT_MEMBER_FUNCTION_P (member)
+   && type_num_arguments (TREE_TYPE (member)) == 1)
  richloc.add_fixit_insert_after ("()");
complained = emit_diagnostic (diag_kind, &richloc, 0,
 "invalid use of member function %qD "
diff --git a/gcc/testsuite/g++.dg/pr98356.C b/gcc/testsuite/g++.dg/pr98356.C
new file mode 100644
index 000..acea238593b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr98356.C
@@ -0,0 +1,9 @@
+// PR c++/98356
+// { dg-do compile { target c++11 } }
+
+template  class T> struct S {
+  using A = T;
+  using A::foo;
+  void foo ();
+  void bar () {foo.}  // { dg-error "invalid use of member function" }
+};
-- 
2.43.2



Re: [PATCH] c++: DECL_DECOMPOSITION_P cleanup

2024-03-04 Thread Jason Merrill

On 3/1/24 19:59, Marek Polacek wrote:

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for 15?


OK for 15, yes.


-- >8 --
DECL_DECOMPOSITION_P already checks VAR_P but we repeat the check
in a lot of places.
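
A toy illustration of why the extra check is redundant. Node, is_var and
is_decomposition below are hypothetical stand-ins that only model the shape
of VAR_P and DECL_DECOMPOSITION_P as described above; they are not the real
macros.

```cpp
#include <cassert>

// Hypothetical stand-ins for tree nodes; not GCC's real data structures.
enum class NodeKind { Var, Function, Type };

struct Node {
  NodeKind kind;
  bool decomposition;  // meaningful only for variables
};

// Models VAR_P: true only for variable declarations.
inline bool is_var (const Node &n) { return n.kind == NodeKind::Var; }

// Models a predicate that performs the variable check itself before
// looking at the decomposition flag.
inline bool is_decomposition (const Node &n)
{
  return is_var (n) && n.decomposition;
}

// Before the cleanup: the caller repeats the variable check.
inline bool redundant_check (const Node &n)
{
  return is_var (n) && is_decomposition (n);
}

// After the cleanup: the predicate alone gives the same answer.
inline bool simplified_check (const Node &n)
{
  return is_decomposition (n);
}

// Sanity helper: both forms must agree for any node.
inline void check_equivalent (const Node &n)
{
  assert (redundant_check (n) == simplified_check (n));
}
```

Because the predicate already rejects non-variables, dropping the outer
check cannot change the result for any node kind.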

gcc/cp/ChangeLog:

* decl.cc (duplicate_decls): Don't check VAR_P before
DECL_DECOMPOSITION_P.
* init.cc (build_aggr_init): Likewise.
* parser.cc (cp_parser_range_for): Likewise.
(do_range_for_auto_deduction): Likewise.
(cp_convert_range_for): Likewise.
(cp_convert_omp_range_for): Likewise.
(cp_finish_omp_range_for): Likewise.
* pt.cc (extract_locals_r): Likewise.
(tsubst_omp_for_iterator): Likewise.
(tsubst_decomp_names): Likewise.
(tsubst_stmt): Likewise.
* typeck.cc (maybe_warn_about_returning_address_of_local): Likewise.
---
  gcc/cp/decl.cc   |  3 +--
  gcc/cp/init.cc   |  2 +-
  gcc/cp/parser.cc | 11 ---
  gcc/cp/pt.cc | 11 +++
  gcc/cp/typeck.cc |  3 +--
  5 files changed, 10 insertions(+), 20 deletions(-)

diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index dbc3df24e77..13df91ce17c 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -1928,8 +1928,7 @@ duplicate_decls (tree newdecl, tree olddecl, bool hiding, 
bool was_hidden)
  inform (olddecl_loc, "previous declaration %q#D", olddecl);
  return error_mark_node;
}
-  else if ((VAR_P (olddecl) && DECL_DECOMPOSITION_P (olddecl))
-  || (VAR_P (newdecl) && DECL_DECOMPOSITION_P (newdecl)))
+  else if (DECL_DECOMPOSITION_P (olddecl) || DECL_DECOMPOSITION_P 
(newdecl))
/* A structured binding must be unique in its declarative region.  */;
else if (DECL_IMPLICIT_TYPEDEF_P (olddecl)
   || DECL_IMPLICIT_TYPEDEF_P (newdecl))
diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index d2586fad86b..bcfb4d350bc 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -1979,7 +1979,7 @@ build_aggr_init (tree exp, tree init, int flags, 
tsubst_flags_t complain)
tree itype = init ? TREE_TYPE (init) : NULL_TREE;
int from_array = 0;
  
-  if (VAR_P (exp) && DECL_DECOMPOSITION_P (exp))
+  if (DECL_DECOMPOSITION_P (exp))
{
  from_array = 1;
  init = mark_rvalue_use (init);
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index a310b9e8c07..8a6ced17c5c 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -14117,7 +14117,6 @@ cp_parser_range_for (cp_parser *parser, tree scope, 
tree init, tree range_decl,
  /* For decomposition declaration get all of the corresponding
 declarations out of the way.  */
  if (TREE_CODE (v) == ARRAY_REF
- && VAR_P (TREE_OPERAND (v, 0))
  && DECL_DECOMPOSITION_P (TREE_OPERAND (v, 0)))
{
  tree d = range_decl;
@@ -14238,7 +14237,7 @@ do_range_for_auto_deduction (tree decl, tree 
range_expr, cp_decomp *decomp)
iter_decl, auto_node,
tf_warning_or_error,
adc_variable_type);
- if (VAR_P (decl) && DECL_DECOMPOSITION_P (decl))
+ if (DECL_DECOMPOSITION_P (decl))
cp_finish_decomp (decl, decomp);
}
  }
@@ -14437,7 +14436,7 @@ cp_convert_range_for (tree statement, tree range_decl, 
tree range_expr,
cp_finish_decl (range_decl, deref_begin,
  /*is_constant_init*/false, NULL_TREE,
  LOOKUP_ONLYCONVERTING, decomp);
-  if (VAR_P (range_decl) && DECL_DECOMPOSITION_P (range_decl))
+  if (DECL_DECOMPOSITION_P (range_decl))
  cp_finish_decomp (range_decl, decomp);
  
warn_for_range_copy (range_decl, deref_begin);

@@ -44288,7 +44287,6 @@ cp_convert_omp_range_for (tree _pre_body, tree ,
{
  tree v = DECL_VALUE_EXPR (decl);
  if (TREE_CODE (v) == ARRAY_REF
- && VAR_P (TREE_OPERAND (v, 0))
  && DECL_DECOMPOSITION_P (TREE_OPERAND (v, 0)))
{
  d = TREE_OPERAND (v, 0);
@@ -44393,7 +44391,6 @@ cp_convert_omp_range_for (tree _pre_body, tree ,
  {
tree v = DECL_VALUE_EXPR (orig_decl);
if (TREE_CODE (v) == ARRAY_REF
- && VAR_P (TREE_OPERAND (v, 0))
  && DECL_DECOMPOSITION_P (TREE_OPERAND (v, 0)))
{
  tree d = orig_decl;
@@ -44471,7 +44468,7 @@ cp_finish_omp_range_for (tree orig, tree begin)
tree decl = TREE_VEC_ELT (TREE_CHAIN (orig), 2);
cp_decomp decomp_d, *decomp = NULL;
  
-  if (VAR_P (decl) && DECL_DECOMPOSITION_P (decl))
+  if (DECL_DECOMPOSITION_P (decl))
  {
decomp = &decomp_d;
decomp_d.decl = TREE_VEC_ELT (TREE_CHAIN (orig), 3);
@@ -44497,7 +44494,7 @@ cp_finish_omp_range_for (tree orig, tree begin)
NULL_TREE, tf_warning_or_error),
  /*is_constant_init*/false, NULL_TREE,
  

Re: [PATCH] c++: Don't set DECL_CONTEXT to nested template-template parameters [PR98881]

2024-03-04 Thread Jason Merrill

On 3/2/24 01:54, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?

-- >8 --

When streaming in a nested template-template parameter as in the
attached testcase, we end up reaching the containing template-template
parameter in 'tpl_parms_fini'. We should not set the DECL_CONTEXT to
this (nested) template-template parameter, as it should already be the
struct that the outer template-template parameter is declared on.


So in the case where tmpl is a template template parameter we want 
DECL_CONTEXT (parm) to be the same as DECL_CONTEXT (tmpl)?  Let's check 
that instead of ignoring it.



PR c++/98881

gcc/cp/ChangeLog:

* module.cc (trees_out::tpl_parms_fini): Clarify logic purely
for checking purposes. Don't consider a template template
parameter as the owning template.
(trees_in::tpl_parms_fini): Don't consider a template template
parameter as the owning template.

gcc/testsuite/ChangeLog:

* g++.dg/modules/tpl-tpl-parm-3_a.H: New test.
* g++.dg/modules/tpl-tpl-parm-3_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/module.cc| 17 -
  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H | 11 +++
  gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C | 13 +
  3 files changed, 36 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
  create mode 100644 gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 67f132d28d7..5663d01ed9c 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -10126,10 +10126,14 @@ trees_out::tpl_parms_fini (tree tmpl, unsigned 
tpl_levels)
  tree dflt = TREE_PURPOSE (parm);
  tree_node (dflt);
  
-	  if (streaming_p ())

+ if (CHECKING_P && streaming_p ())
{
+ /* Sanity check that the DECL_CONTEXT we'll infer when
+streaming in is correct.  */
  tree decl = TREE_VALUE (parm);
- if (TREE_CODE (decl) == TEMPLATE_DECL)
+ if (TREE_CODE (decl) == TEMPLATE_DECL
+ /* A template template parm is not the owning template.  */
+ && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
{
  tree ctx = DECL_CONTEXT (decl);
  tree inner = DECL_TEMPLATE_RESULT (decl);
@@ -10164,8 +10168,13 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
tpl_levels)
return false;
  TREE_PURPOSE (parm) = dflt;
  
+	  /* Original template template parms have a context

+of their owning template.  Reduced ones do not.
+But if TMPL is itself a template template parm
+then it cannot be the owning template.  */
  tree decl = TREE_VALUE (parm);
- if (TREE_CODE (decl) == TEMPLATE_DECL)
+ if (TREE_CODE (decl) == TEMPLATE_DECL
+ && !DECL_TEMPLATE_TEMPLATE_PARM_P (tmpl))
{
  tree inner = DECL_TEMPLATE_RESULT (decl);
  tree tpi = (TREE_CODE (inner) == TYPE_DECL
@@ -10173,8 +10182,6 @@ trees_in::tpl_parms_fini (tree tmpl, unsigned 
tpl_levels)
  : DECL_INITIAL (inner));
  bool original = (TEMPLATE_PARM_LEVEL (tpi)
   == TEMPLATE_PARM_ORIG_LEVEL (tpi));
- /* Original template template parms have a context
-of their owning template.  Reduced ones do not.  */
  if (original)
DECL_CONTEXT (decl) = tmpl;
}
diff --git a/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H 
b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
new file mode 100644
index 000..21bbc054fa3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_a.H
@@ -0,0 +1,11 @@
+// PR c++/98881
+// { dg-additional-options "-fmodule-header" }
+// { dg-module-cmi {} }
+
+template  struct X {};
+
+template typename TT>
+struct X> {
+  template typename UU>
+  void f (X>&);
+};
diff --git a/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C 
b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C
new file mode 100644
index 000..234e822faa9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/tpl-tpl-parm-3_b.C
@@ -0,0 +1,13 @@
+// PR c++/98881
+// { dg-additional-options "-fmodules-ts" }
+
+import "tpl-tpl-parm-3_a.H";
+
+template  struct Y {};
+template  struct Z {};
+
+void foo() {
+  X> y;
+  X> z;
+  y.f(z);
+}




Re: [PATCH] c++/modules: Support exporting using-decls in same namespace as target

2024-03-04 Thread Jason Merrill

On 3/3/24 18:11, Nathaniel Shead wrote:

Came across this issue while working on another PR.

Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk?
Or otherwise for GCC 15?


OK.


-- >8 --

Currently a using-declaration bringing a name into its own namespace is
a no-op, except for functions. However, this prevents redeclaring a
name brought in from the GMF as exported; this patch fixes that.

Apart from marking declarations as exported they are also now marked as
effectively being in the module purview (due to the using-decl) so that
they are properly processed, as 'add_binding_entity' assumes that
declarations not in the module purview cannot possibly be exported.

gcc/cp/ChangeLog:

* name-lookup.cc (walk_module_binding): Remove completed FIXME.
(do_nonmember_using_decl): Mark redeclared entities as exported
when needed. Check for re-exporting internal linkage types.

gcc/testsuite/ChangeLog:

* g++.dg/modules/using-12.C: New test.
* g++.dg/modules/using-13.h: New test.
* g++.dg/modules/using-13_a.C: New test.
* g++.dg/modules/using-13_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/name-lookup.cc | 50 +---
  gcc/testsuite/g++.dg/modules/using-12.C   | 73 +++
  gcc/testsuite/g++.dg/modules/using-13.h   | 16 +
  gcc/testsuite/g++.dg/modules/using-13_a.C | 15 +
  gcc/testsuite/g++.dg/modules/using-13_b.C | 20 +++
  5 files changed, 166 insertions(+), 8 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/using-12.C
  create mode 100644 gcc/testsuite/g++.dg/modules/using-13.h
  create mode 100644 gcc/testsuite/g++.dg/modules/using-13_a.C
  create mode 100644 gcc/testsuite/g++.dg/modules/using-13_b.C

diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index 6444db3f0eb..dce4caf8981 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -4189,7 +4189,7 @@ walk_module_binding (tree binding, bitmap partitions,
 void *data)
  {
// FIXME: We don't quite deal with using decls naming stat hack
-  // type.  Also using decls exporting something from the same scope.
+  // type.
tree current = binding;
unsigned count = 0;
  
@@ -5238,13 +5238,36 @@ do_nonmember_using_decl (name_lookup , bool fn_scope_p,

  }
else if (insert_p)
  {
-  value = lookup.value;
-  if (revealing_p && module_exporting_p ())
-   check_can_export_using_decl (value);
+  if (revealing_p
+ && module_exporting_p ()
+ && check_can_export_using_decl (lookup.value)
+ && lookup.value == value
+ && !DECL_MODULE_EXPORT_P (value))
+   {
+ /* We're redeclaring the same value, but this time as
+newly exported: make sure to mark it as such.  */
+ if (TREE_CODE (value) == TEMPLATE_DECL)
+   {
+ DECL_MODULE_EXPORT_P (value) = true;
+
+ tree result = DECL_TEMPLATE_RESULT (value);
+ retrofit_lang_decl (result);
+ DECL_MODULE_PURVIEW_P (result) = true;
+ DECL_MODULE_EXPORT_P (result) = true;
+   }
+ else
+   {
+ retrofit_lang_decl (value);
+ DECL_MODULE_PURVIEW_P (value) = true;
+ DECL_MODULE_EXPORT_P (value) = true;
+   }
+   }
+  else
+   value = lookup.value;
  }

/* Now the type binding.  */

-  if (lookup.type && lookup.type != type)
+  if (lookup.type)
  {
if (type && !decls_match (lookup.type, type))
{
@@ -5253,9 +5276,20 @@ do_nonmember_using_decl (name_lookup , bool 
fn_scope_p,
}
else if (insert_p)
{
- type = lookup.type;
- if (revealing_p && module_exporting_p ())
-   check_can_export_using_decl (type);
+ if (revealing_p
+ && module_exporting_p ()
+ && check_can_export_using_decl (lookup.type)
+ && lookup.type == type
+ && !DECL_MODULE_EXPORT_P (type))
+   {
+ /* We're redeclaring the same type, but this time as
+newly exported: make sure to mark it as such.  */
+ retrofit_lang_decl (type);
+ DECL_MODULE_PURVIEW_P (type) = true;
+ DECL_MODULE_EXPORT_P (type) = true;
+   }
+ else
+   type = lookup.type;
}
  }
  
diff --git a/gcc/testsuite/g++.dg/modules/using-12.C b/gcc/testsuite/g++.dg/modules/using-12.C

new file mode 100644
index 000..54eacf7276e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/using-12.C
@@ -0,0 +1,73 @@
+// { dg-additional-options "-fmodules-ts" }
+// { dg-module-cmi !bad }
+
+// Like using-10.C, but test exporting names within the same namespace.
+
+export module bad;
+
+// internal linkage
+namespace s {
+  namespace {
+struct a1 {};  // { dg-message "declared here with internal linkage" }
+
+template 
+

[PATCH] c++/modules: member alias tmpl partial inst [PR103994]

2024-03-04 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look
OK for trunk?

-- >8 --

Alias templates are weird in that their specializations can appear in
both decl_specializations and type_specializations.  They appear in the
latter only at parse time via finish_template_type.  This should probably
be revisited in GCC 15 since it seems sufficient to store them only in
decl_specializations.  In the meantime, the below patch makes sure that
if such a specialization is stored in both tables then we don't
overwrite in the type code path the TEMPLATE_INFO set by the decl code
path (which always runs first).  That's because tsubst_template_decl
during partial instantiation of a member template sets TI_TEMPLATE of
the TYPE_DECL to point to the partially instantiated TEMPLATE_DECL
whereas lookup_template_class wants to always point to the most general
template.  This ends up confusing modules in the testcase below for the
partial instantiation A::key_arg -- we decide to stream the
TYPE_DECL for this partial instantiation separately from the
corresponding TEMPLATE_DECL due to this incorrect TI_TEMPLATE setting.

PR c++/103994

gcc/cp/ChangeLog:

* pt.cc (lookup_template_class): Don't overwrite TEMPLATE_INFO
for an alias template specialization.

gcc/testsuite/ChangeLog:

* g++.dg/modules/tpl-alias-2_a.H: New test.
* g++.dg/modules/tpl-alias-2_b.C: New test.
---
 gcc/cp/pt.cc |  7 ++-
 gcc/testsuite/g++.dg/modules/tpl-alias-2_a.H | 15 +++
 gcc/testsuite/g++.dg/modules/tpl-alias-2_b.C |  9 +
 3 files changed, 30 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/tpl-alias-2_a.H
 create mode 100644 gcc/testsuite/g++.dg/modules/tpl-alias-2_b.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index c4bc54a8fdb..ce2d53fe762 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -10431,7 +10431,12 @@ lookup_template_class (tree d1, tree arglist, tree 
in_decl, tree context,
}
 
   /* Build template info for the new specialization.  */
-  SET_TYPE_TEMPLATE_INFO (t, build_template_info (found, arglist));
+  if (DECL_ALIAS_TEMPLATE_P (gen_tmpl))
+   /* Already properly set by instantiate_template (or
+  tsubst_template_decl).  */
+   gcc_assert (DECL_TEMPLATE_INFO (TYPE_NAME (t)));
+  else
+   SET_TYPE_TEMPLATE_INFO (t, build_template_info (found, arglist));
 
   elt.spec = t;
   slot = type_specializations->find_slot_with_hash (, hash, INSERT);
diff --git a/gcc/testsuite/g++.dg/modules/tpl-alias-2_a.H 
b/gcc/testsuite/g++.dg/modules/tpl-alias-2_a.H
new file mode 100644
index 000..76917f778e0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/tpl-alias-2_a.H
@@ -0,0 +1,15 @@
+// PR c++/103994
+// { dg-additional-options -fmodule-header }
+// { dg-module-cmi {} }
+
+template
+struct A {
+  template using key_arg = int;
+};
+
+struct B {
+  template
+  void f() {
+using type = A::key_arg;
+  }
+};
diff --git a/gcc/testsuite/g++.dg/modules/tpl-alias-2_b.C 
b/gcc/testsuite/g++.dg/modules/tpl-alias-2_b.C
new file mode 100644
index 000..44fa5f42757
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/tpl-alias-2_b.C
@@ -0,0 +1,9 @@
+// PR c++/103994
+// { dg-additional-options -fmodules-ts }
+
+import "tpl-alias-2_a.H";
+
+int main() {
+  B b;
+  b.f();
+}
-- 
2.44.0.84.gb387623c12



[PATCH] c++: ICE with variable template and [[deprecated]] [PR110031]

2024-03-04 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/13?

-- >8 --
lookup_and_finish_template_variable already has and uses the complain
parameter, but it does not pass it down to mark_used, so mark_used gets
the default tf_warning_or_error.  This causes various problems when
lookup_and_finish_template_variable is called with complain=tf_none.

PR c++/110031

gcc/cp/ChangeLog:

* pt.cc (lookup_and_finish_template_variable): Pass complain to
mark_used.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/inline-var11.C: New test.
---
 gcc/cp/pt.cc  |  2 +-
 gcc/testsuite/g++.dg/cpp1z/inline-var11.C | 32 +++
 2 files changed, 33 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/inline-var11.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index c4bc54a8fdb..48d2b3cbac6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -10533,7 +10533,7 @@ lookup_and_finish_template_variable (tree templ, tree 
targs,
   if (var == error_mark_node)
 return error_mark_node;
   var = finish_template_variable (var, complain);
-  mark_used (var);
+  mark_used (var, complain);
   return var;
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp1z/inline-var11.C 
b/gcc/testsuite/g++.dg/cpp1z/inline-var11.C
new file mode 100644
index 000..d92911ed3a9
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/inline-var11.C
@@ -0,0 +1,32 @@
+// PR c++/110031
+// { dg-do compile { target c++17 } }
+
+template 
+[[deprecated]]
+inline constexpr bool t = true ;
+
+template 
+struct enableif;
+
+template<>
+struct enableif
+{
+using y = int;
+};
+template 
+using enableif_t = typename enableif::y;
+
+template > = 0>   // { dg-warning "deprecated" }
+struct A {  A(T &&)  {  }};
+
+template 
+struct A {
+  A(T &&) = delete;
+  A() = delete;
+};
+
+int main(void)
+{
+  A a(5.3); // { dg-error "use of deleted function" }
+  return 0;
+}

base-commit: a89c5df317d1de74871e2a05c36aed9cbbb21f42
-- 
2.44.0



[PATCH v2] middle-end/113680 - Optimize (x - y) CMP 0 as x CMP y

2024-03-04 Thread Ken Matsui
(x - y) CMP 0 is equivalent to x CMP y where x and y are signed
integers and CMP is <, <=, >, or >=.  Similarly, 0 CMP (x - y) is
equivalent to y CMP x.  As reported in PR middle-end/113680, this
equivalence does not hold for types other than signed integers.  When
it comes to conditions, the former was translated to a combination of
sub and test, whereas the latter was translated to a single cmp.
Thus, this optimization pass tries to optimize the former to the
latter.

When `-fwrapv` is enabled, GCC treats the overflow of signed integers
as defined behavior, specifically, wrapping around according to two's
complement arithmetic.  This has implications for optimizations that
rely on the standard behavior of signed integers, where overflow is
undefined.  Consider the example given:

long long llmax = __LONG_LONG_MAX__;
long long llmin = -llmax - 1;

Here, `llmax - llmin` effectively becomes `llmax - (-llmax - 1)`, which
simplifies to `2 * llmax + 1`.  Given that `llmax` is the maximum value
for a `long long`, this calculation overflows in a defined manner
(wrapping around), which under `-fwrapv` is a legal operation that
produces a negative value due to two's complement wraparound.
Therefore, `llmax - llmin < 0` is true.

However, the direct comparison `llmax < llmin` is false since `llmax`
is the maximum possible value and `llmin` is the minimum.  Hence,
optimizations that rely on the equivalence of `(x - y) CMP 0` to
`x CMP y` (and vice versa) cannot be safely applied when `-fwrapv` is
enabled.  This is why this optimization pass is disabled under
`-fwrapv`.

This optimization pass must run before the Jump Threading pass and the
VRP pass, as it may modify conditions. For example, in the VRP pass:

(1)
  int diff = x - y;
  if (diff > 0)
foo();
  if (diff < 0)
bar();

The VRP pass would convert the second condition to diff != 0, because
the postcondition of the first condition is known to be diff <= 0, and
diff != 0 is then cheaper to test than diff < 0. If we apply this pass
after VRP, we get:

(2)
  int diff = x - y;
  if (x > y)
foo();
  if (diff != 0)
bar();

This generates sub and test for the second condition and cmp for the
first condition. However, if we apply this pass beforehand, we simply
get:

(3)
  int diff = x - y;
  if (x > y)
foo();
  if (x < y)
bar();

In this code, diff will be eliminated as dead code, and sub and test
will not be generated, which is more efficient.

For the Jump Threading pass, without this optimization pass, (1) and
(3) above are recognized as different, which prevents TCO.

PR middle-end/113680

gcc/ChangeLog:

* Makefile.in: Add tree-ssa-cmp.o to OBJS.
* common.opt: Define ftree-cmp.
* doc/invoke.texi: Document ftree-cmp.
* opts.cc (default_options_table): Handle OPT_ftree_cmp.
* passes.def (pass_cmp): New optimization pass.
* timevar.def (TV_TREE_CMP): New variable for timing.
* tree-pass.h (make_pass_cmp): New declaration.
* tree-ssa-cmp.cc: New file.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr113680.c: New test.

Signed-off-by: Ken Matsui 
---
 gcc/Makefile.in  |   1 +
 gcc/common.opt   |   4 +
 gcc/doc/invoke.texi  |  11 +-
 gcc/opts.cc  |   1 +
 gcc/passes.def   |   3 +
 gcc/testsuite/gcc.dg/tree-ssa/pr113680.c |  47 
 gcc/timevar.def  |   1 +
 gcc/tree-pass.h  |   1 +
 gcc/tree-ssa-cmp.cc  | 262 +++
 9 files changed, 330 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr113680.c
 create mode 100644 gcc/tree-ssa-cmp.cc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a74761b7ab3..935b80b6947 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1731,6 +1731,7 @@ OBJS = \
tree-ssa-address.o \
tree-ssa-alias.o \
tree-ssa-ccp.o \
+   tree-ssa-cmp.o \
tree-ssa-coalesce.o \
tree-ssa-copy.o \
tree-ssa-dce.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 51c4a17da83..7c853224458 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3053,6 +3053,10 @@ ftree-ch
 Common Var(flag_tree_ch) Optimization
 Enable loop header copying on trees.
 
+ftree-cmp
+Common Var(flag_tree_cmp) Optimization
+Enable SSA comparison optimization on trees.
+
 ftree-coalesce-inlined-vars
 Common Ignore RejectNegative
 Does nothing.  Preserved for backward compatibility.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index bdf05be387d..04762d490a3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -619,7 +619,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-wide-types  -fsplit-wide-types-early  

[PATCH] middle-end/113680 - Optimize (x - y) CMP 0 as x CMP y

2024-03-04 Thread Ken Matsui
(x - y) CMP 0 is equivalent to x CMP y where x and y are signed
integers and CMP is <, <=, >, or >=.  Similarly, 0 CMP (x - y) is
equivalent to y CMP x.  As reported in PR middle-end/113680, this
equivalence does not hold for types other than signed integers.  When
it comes to conditions, the former was translated to a combination of
sub and test, whereas the latter was translated to a single cmp.
Thus, this optimization pass tries to optimize the former to the
latter.

When `-fwrapv` is enabled, GCC treats the overflow of signed integers
as defined behavior, specifically, wrapping around according to two's
complement arithmetic.  This has implications for optimizations that
rely on the standard behavior of signed integers, where overflow is
undefined.  Consider the example given:

long long llmax = __LONG_LONG_MAX__;
long long llmin = -llmax - 1;

Here, `llmax - llmin` effectively becomes `llmax - (-llmax - 1)`, which
simplifies to `2 * llmax + 1`.  Given that `llmax` is the maximum value
for a `long long`, this calculation overflows in a defined manner
(wrapping around), which under `-fwrapv` is a legal operation that
produces a negative value due to two's complement wraparound.
Therefore, `llmax - llmin < 0` is true.

However, the direct comparison `llmax < llmin` is false since `llmax`
is the maximum possible value and `llmin` is the minimum.  Hence,
optimizations that rely on the equivalence of `(x - y) CMP 0` to
`x CMP y` (and vice versa) cannot be safely applied when `-fwrapv` is
enabled.  This is why this optimization pass is disabled under
`-fwrapv`.

This optimization pass must run before the Jump Threading pass and the
VRP pass, as it may modify conditions. For example, in the VRP pass:

(1)
  int diff = x - y;
  if (diff > 0)
foo();
  if (diff < 0)
bar();

The VRP pass would convert the second condition to diff != 0, because
the postcondition of the first condition is known to be diff <= 0, and
diff != 0 is then cheaper to test than diff < 0. If we apply this pass
after VRP, we get:

(2)
  int diff = x - y;
  if (x > y)
foo();
  if (diff != 0)
bar();

This generates sub and test for the second condition and cmp for the
first condition. However, if we apply this pass beforehand, we simply
get:

(3)
  int diff = x - y;
  if (x > y)
foo();
  if (x < y)
bar();

In this code, diff will be eliminated as dead code, and sub and test
will not be generated, which is more efficient.

For the Jump Threading pass, without this optimization pass, (1) and
(3) above are recognized as different, which prevents TCO.

PR middle-end/113680

gcc/ChangeLog:

* Makefile.in: Add tree-ssa-cmp.o to OBJS.
* common.opt: Define ftree-cmp.
* doc/invoke.texi: Document ftree-cmp.
* opts.cc (default_options_table): Handle OPT_ftree_cmp.
* passes.def (pass_cmp): New optimization pass.
* timevar.def (TV_TREE_CMP): New variable for timing.
* tree-pass.h (make_pass_cmp): New declaration.
* tree-ssa-cmp.cc: New file.

gcc/testsuite/ChangeLog:

* gcc.dg/pr113680.c: New test.

Signed-off-by: Ken Matsui 
---
 gcc/Makefile.in |   1 +
 gcc/common.opt  |   4 +
 gcc/doc/invoke.texi |  11 +-
 gcc/opts.cc |   1 +
 gcc/passes.def  |   3 +
 gcc/testsuite/gcc.dg/pr113680.c |  47 ++
 gcc/timevar.def |   1 +
 gcc/tree-pass.h |   1 +
 gcc/tree-ssa-cmp.cc | 262 
 9 files changed, 330 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr113680.c
 create mode 100644 gcc/tree-ssa-cmp.cc

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a74761b7ab3..935b80b6947 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1731,6 +1731,7 @@ OBJS = \
tree-ssa-address.o \
tree-ssa-alias.o \
tree-ssa-ccp.o \
+   tree-ssa-cmp.o \
tree-ssa-coalesce.o \
tree-ssa-copy.o \
tree-ssa-dce.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index 51c4a17da83..7c853224458 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3053,6 +3053,10 @@ ftree-ch
 Common Var(flag_tree_ch) Optimization
 Enable loop header copying on trees.
 
+ftree-cmp
+Common Var(flag_tree_cmp) Optimization
+Enable SSA comparison optimization on trees.
+
 ftree-coalesce-inlined-vars
 Common Ignore RejectNegative
 Does nothing.  Preserved for backward compatibility.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index bdf05be387d..04762d490a3 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -619,7 +619,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-wide-types  -fsplit-wide-types-early  -fssa-backprop  -fssa-phiopt
 -fstdarg-opt  -fstore-merging  -fstrict-aliasing -fipa-strict-aliasing

Re: [PATCH,V2] ctf: fix incorrect CTF for multi-dimensional array types

2024-03-04 Thread David Faust
Hi Indu, Cupertino,

On 3/4/24 10:00, Indu Bhagat wrote:
> From: Cupertino Miranda 
> 
> [Changes from V1]
>   - Refactor the code a bit.
> [End of changes from V1]
> 
> PR debug/114186
> 
> DWARF DIEs of type DW_TAG_subrange_type are linked together to represent
> the information about the subsequent dimensions.  The CTF processing was
> so far working through them in the opposite (incorrect) order.
> 
> While fixing the issue, refactor the code a bit for readability.
> 
> co-authored-By: Indu Bhagat 

Thanks for the patch and refactor. I do find v2 easier to follow.
Two very minor typos in comments, noted inline below.

Otherwise, LGTM and OK.
Thanks!

> 
> gcc/
>   PR debug/114186
>   * dwarf2ctf.cc (gen_ctf_array_type): Invoke the ctf_add_array ()
>   in the correct order of the dimensions.
> (gen_ctf_subrange_type): Refactor out handling of
>   DW_TAG_subrange_type DIE to here.
> 
> gcc/testsuite/
>   PR debug/114186
>   * gcc.dg/debug/ctf/ctf-array-6.c: Add test.
> ---
> 
> Testing notes:
> 
> Regression tested on x86_64-linux-gnu default target.
> Regression tested for target bpf-unknown-none (btf.exp, ctf.exp, bpf.exp).
> 
> ---
>  gcc/dwarf2ctf.cc | 153 +--
>  gcc/testsuite/gcc.dg/debug/ctf/ctf-array-6.c |  14 ++
>  2 files changed, 84 insertions(+), 83 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-6.c
> 
> diff --git a/gcc/dwarf2ctf.cc b/gcc/dwarf2ctf.cc
> index dca86edfffa9..3985de115a79 100644
> --- a/gcc/dwarf2ctf.cc
> +++ b/gcc/dwarf2ctf.cc
> @@ -349,105 +349,92 @@ gen_ctf_pointer_type (ctf_container_ref ctfc, 
> dw_die_ref ptr_type)
>return ptr_type_id;
>  }
>  
> -/* Generate CTF for an array type.  */
> +/* Recursively generate CTF for array dimensions starting at DIE C (of type
> +   DW_TAG_subrange_type) until DIE LAST (of type DW_TAG_subrange_type) is
> +   reached.  ARRAY_ELEMS_TYPE_ID is base type for the array.  */
>  
>  static ctf_id_t
> -gen_ctf_array_type (ctf_container_ref ctfc, dw_die_ref array_type)
> +gen_ctf_subrange_type (ctf_container_ref ctfc, ctf_id_t array_elems_type_id,
> +dw_die_ref c, dw_die_ref last)
>  {
> -  dw_die_ref c;
> -  ctf_id_t array_elems_type_id = CTF_NULL_TYPEID;
> +  ctf_arinfo_t arinfo;
> +  ctf_id_t array_node_type_id = CTF_NULL_TYPEID;
> +
> +  dw_attr_node *upper_bound_at;
> +  dw_die_ref array_index_type;
> +  uint32_t array_num_elements;
> +
> +  /* When DW_AT_upper_bound is used to specify the size of an
> + array in DWARF, it is usually an unsigned constant
> + specifying the upper bound index of the array.  However,
> + for unsized arrays, such as foo[] or bar[0],
> + DW_AT_upper_bound is a signed integer constant
> + instead.  */
> +
> +  upper_bound_at = get_AT (c, DW_AT_upper_bound);
> +  if (upper_bound_at
> +  && AT_class (upper_bound_at) == dw_val_class_unsigned_const)
> +/* This is the ound index.  */

typo, I guess this is meant to be "bound" ? 

> +array_num_elements = get_AT_unsigned (c, DW_AT_upper_bound) + 1;
> +  else if (get_AT (c, DW_AT_count))
> +array_num_elements = get_AT_unsigned (c, DW_AT_count);
> +  else
> +{
> +  /* This is a VLA of some kind.  */
> +  array_num_elements = 0;
> +}
>  
> -  int vector_type_p = get_AT_flag (array_type, DW_AT_GNU_vector);
> -  if (vector_type_p)
> -return array_elems_type_id;
> +  /* Ok, mount and register the array type.  Note how the array
> + type we register here is the type of the elements in
> + subsequent "dimensions", if there are any.  */
> +  arinfo.ctr_nelems = array_num_elements;
>  
> -  dw_die_ref array_elems_type = ctf_get_AT_type (array_type);
> +  array_index_type = ctf_get_AT_type (c);
> +  arinfo.ctr_index = gen_ctf_type (ctfc, array_index_type);
>  
> -  /* First, register the type of the array elements if needed.  */
> -  array_elems_type_id = gen_ctf_type (ctfc, array_elems_type);
> +  if (c == last)
> +arinfo.ctr_contents = array_elems_type_id;
> +  else
> +arinfo.ctr_contents = gen_ctf_subrange_type (ctfc, array_elems_type_id,
> +  dw_get_die_sib (c), last);
>  
> -  /* DWARF array types pretend C supports multi-dimensional arrays.
> - So for the type int[N][M], the array type DIE contains two
> - subrange_type children, the first with upper bound N-1 and the
> - second with upper bound M-1.
> +  if (!ctf_type_exists (ctfc, c, &array_node_type_id))
> +    array_node_type_id = ctf_add_array (ctfc, CTF_ADD_ROOT, &arinfo, c);
>  
> - CTF, on the other hand, just encodes each array type in its own
> - array type CTF struct.  Therefore we have to iterate on the
> - children and create all the needed types.  */
> +  return array_node_type_id;
> +}
>  
> -  c = dw_get_die_child (array_type);
> -  gcc_assert (c);
> -  do
> -{
> -  ctf_arinfo_t arinfo;
> -  dw_die_ref array_index_type;
> -  uint32_t 

Re: [PATCH] bpf: add inline memset expansion

2024-03-04 Thread Jose E. Marchesi


Hi David.
Thanks for the patch.
OK.

> Similar to memmove and memcpy, the BPF backend cannot fall back on a
> library call to implement __builtin_memset, and should always expand
> calls to it inline if possible.
>
> This patch implements simple inline expansion of memset in the BPF
> backend in a verifier-friendly way. Similar to memcpy and memmove, the
> size must be an integer constant, as is also required by clang.
>
> Tested for bpf-unknown-none target on x86_64-linux-gnu host.
> Also tested against kernel BPF verifier by compiling and loading a
> test program using the inline memset expansion.
>
> gcc/
>   * config/bpf/bpf-protos.h (bpf_expand_setmem): New prototype.
>   * config/bpf/bpf.cc (bpf_expand_setmem): New.
>   * config/bpf/bpf.md (setmemdi): New define_expand.
>
> gcc/testsuite/
>   * gcc.target/bpf/memset-1.c: New test.
> ---
>  gcc/config/bpf/bpf-protos.h |  1 +
>  gcc/config/bpf/bpf.cc   | 66 +
>  gcc/config/bpf/bpf.md   | 17 +++
>  gcc/testsuite/gcc.target/bpf/memset-1.c | 39 +++
>  4 files changed, 123 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/bpf/memset-1.c
>
> diff --git a/gcc/config/bpf/bpf-protos.h b/gcc/config/bpf/bpf-protos.h
> index 366acb87ae4..ac0c2f4038f 100644
> --- a/gcc/config/bpf/bpf-protos.h
> +++ b/gcc/config/bpf/bpf-protos.h
> @@ -36,5 +36,6 @@ class gimple_opt_pass;
>  gimple_opt_pass *make_pass_lower_bpf_core (gcc::context *ctxt);
>  
>  bool bpf_expand_cpymem (rtx *, bool);
> +bool bpf_expand_setmem (rtx *);
>  
>  #endif /* ! GCC_BPF_PROTOS_H */
> diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
> index 22b0cf2dc46..0e33f4347ba 100644
> --- a/gcc/config/bpf/bpf.cc
> +++ b/gcc/config/bpf/bpf.cc
> @@ -1309,6 +1309,72 @@ bpf_expand_cpymem (rtx *operands, bool is_move)
>return true;
>  }
>  
> +/* Expand setmem, as from __builtin_memset.
> +   OPERANDS are the same as the setmem pattern.
> +   Return true if the expansion was successful, false otherwise.  */
> +
> +bool
> +bpf_expand_setmem (rtx *operands)
> +{
> +  /* Size must be constant for this expansion to work.  */
> +  if (!CONST_INT_P (operands[1]))
> +{
> +  if (flag_building_libgcc)
> + warning (0, "could not inline call to %<__builtin_memset%>: "
> +  "size must be constant");
> +  else
> + error ("could not inline call to %<__builtin_memset%>: "
> +"size must be constant");
> +  return false;
> +}
> +
> +  /* Alignment is a CONST_INT.  */
> +  gcc_assert (CONST_INT_P (operands[3]));
> +
> +  rtx dst = operands[0];
> +  rtx size = operands[1];
> +  rtx val = operands[2];
> +  unsigned HOST_WIDE_INT size_bytes = UINTVAL (size);
> +  unsigned align = UINTVAL (operands[3]);
> +  enum machine_mode mode;
> +  switch (align)
> +{
> +case 1: mode = QImode; break;
> +case 2: mode = HImode; break;
> +case 4: mode = SImode; break;
> +case 8: mode = DImode; break;
> +default:
> +  gcc_unreachable ();
> +}
> +
> +  unsigned iters = size_bytes >> ceil_log2 (align);
> +  unsigned remainder = size_bytes & (align - 1);
> +  unsigned inc = GET_MODE_SIZE (mode);
> +  unsigned offset = 0;
> +
> +  for (unsigned int i = 0; i < iters; i++)
> +{
> +  emit_move_insn (adjust_address (dst, mode, offset), val);
> +  offset += inc;
> +}
> +  if (remainder & 4)
> +{
> +  emit_move_insn (adjust_address (dst, SImode, offset), val);
> +  offset += 4;
> +  remainder -= 4;
> +}
> +  if (remainder & 2)
> +{
> +  emit_move_insn (adjust_address (dst, HImode, offset), val);
> +  offset += 2;
> +  remainder -= 2;
> +}
> +  if (remainder & 1)
> +emit_move_insn (adjust_address (dst, QImode, offset), val);
> +
> +  return true;
> +}
> +
>  /* Finally, build the GCC target.  */
>  
>  struct gcc_target targetm = TARGET_INITIALIZER;
> diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
> index ca677bc6b50..ea688aadf91 100644
> --- a/gcc/config/bpf/bpf.md
> +++ b/gcc/config/bpf/bpf.md
> @@ -663,4 +663,21 @@ (define_expand "movmemdi"
>FAIL;
>  })
>  
> +;; memset
> +;; 0 is dst
> +;; 1 is length
> +;; 2 is value
> +;; 3 is alignment
> +(define_expand "setmemdi"
> +  [(set (match_operand:BLK 0 "memory_operand")
> + (match_operand:QI  2 "nonmemory_operand"))
> +   (use (match_operand:DI  1 "general_operand"))
> +   (match_operand 3 "immediate_operand")]
> + ""
> + {
> +  if (bpf_expand_setmem (operands))
> +DONE;
> +  FAIL;
> +})
> +
>  (include "atomic.md")
> diff --git a/gcc/testsuite/gcc.target/bpf/memset-1.c b/gcc/testsuite/gcc.target/bpf/memset-1.c
> new file mode 100644
> index 000..9e9f8eff028
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/bpf/memset-1.c
> @@ -0,0 +1,39 @@
> +/* Ensure memset is expanded inline rather than emitting a libcall.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +struct context {
> + 
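The store sequence that bpf_expand_setmem computes in the quoted patch can be modelled outside the compiler. The following is only a sketch: `plan_setmem` and `struct store` are hypothetical names, not part of GCC, and the division by `align` stands in for the shift by `ceil_log2 (align)`, which gives the same result for the power-of-two alignments the patch accepts.

```c
#include <assert.h>
#include <stddef.h>

/* One planned store: WIDTH bytes at byte OFFSET from the destination.  */
struct store { unsigned offset; unsigned width; };

/* Hypothetical helper (not part of GCC): fill OUT with the stores that the
   expansion would emit for SIZE bytes at alignment ALIGN, and return how
   many there are.  OUT must have room for SIZE / ALIGN + 3 entries.  */
static size_t
plan_setmem (unsigned long size, unsigned align, struct store *out)
{
  /* Same restriction as the switch on ALIGN in the patch.  */
  assert (align == 1 || align == 2 || align == 4 || align == 8);

  unsigned long iters = size / align;       /* full-width stores */
  unsigned remainder = size & (align - 1);  /* 0 .. ALIGN-1 trailing bytes */
  unsigned offset = 0;
  size_t n = 0;

  for (unsigned long i = 0; i < iters; i++)
    {
      out[n++] = (struct store) { offset, align };
      offset += align;
    }

  /* Peel the tail in descending power-of-two widths: 4, 2, 1.  */
  for (unsigned width = 4; width >= 1; width >>= 1)
    if (remainder & width)
      {
        out[n++] = (struct store) { offset, width };
        offset += width;
      }

  return n;
}
```

For example, 13 bytes at alignment 8 yields one 8-byte store at offset 0, a 4-byte store at offset 8 and a 1-byte store at offset 12.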

[PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-03-04 Thread Evgeny Karpov
Zac Walker (13):
  Introduce aarch64-w64-mingw32 target
  aarch64: The aarch64-w64-mingw32 target implements the MS ABI
  aarch64: Mark x18 register as a fixed register for MS ABI
  aarch64: Add aarch64-w64-mingw32 COFF
  Reuse MinGW from i386 for AArch64
  Rename section and encoding functions from i386 which will be used in
aarch64
  Exclude i386 functionality from aarch64 build
  aarch64: Add Cygwin and MinGW environments for AArch64
  aarch64: Add SEH to machine_function
  Rename "x86 Windows Options" to "Cygwin and MinGW Options"
  aarch64: Build and add objects for Cygwin and MinGW for AArch64
  aarch64: Add aarch64-w64-mingw32 target to libatomic
  Add aarch64-w64-mingw32 target to libgcc

 fixincludes/mkfixinc.sh   |   3 +-
 gcc/config.gcc|  47 +++--
 gcc/config/aarch64/aarch64-abi-ms.h   |  64 +++
 gcc/config/aarch64/aarch64-coff.h |  91 +
 gcc/config/aarch64/aarch64-opts.h |   7 +
 gcc/config/aarch64/aarch64-protos.h   |   5 +
 gcc/config/aarch64/aarch64.h  |   6 +
 gcc/config/aarch64/cygming.h  | 175 ++
 gcc/config/i386/cygming.h |  18 +-
 gcc/config/i386/cygming.opt.urls  |  30 ---
 gcc/config/i386/i386-protos.h |  12 +-
 gcc/config/i386/mingw-w64.opt.urls|   2 +-
 gcc/config/lynx.opt.urls  |   2 +-
 gcc/config/{i386 => mingw}/cygming.opt|   0
 gcc/config/mingw/cygming.opt.urls |  30 +++
 gcc/config/{i386 => mingw}/cygwin-d.cc|   0
 gcc/config/{i386 => mingw}/mingw-stdint.h |   9 +-
 gcc/config/{i386 => mingw}/mingw.opt  |   0
 gcc/config/{i386 => mingw}/mingw.opt.urls |   2 +-
 gcc/config/{i386 => mingw}/mingw32.h  |   6 +-
 gcc/config/{i386 => mingw}/msformat-c.cc  |   0
 gcc/config/{i386 => mingw}/t-cygming  |  23 ++-
 gcc/config/{i386 => mingw}/winnt-cxx.cc   |   0
 gcc/config/{i386 => mingw}/winnt-d.cc |   0
 gcc/config/{i386 => mingw}/winnt-stubs.cc |   0
 gcc/config/{i386 => mingw}/winnt.cc   |  30 +--
 gcc/doc/invoke.texi   |  12 +-
 gcc/varasm.cc |   2 +-
 libatomic/configure.tgt   |   2 +-
 libgcc/config.host|  23 ++-
 libgcc/config/aarch64/t-no-eh |   2 +
 libgcc/config/{i386 => mingw}/t-gthr-win32|   0
 libgcc/config/{i386 => mingw}/t-mingw-pthread |   0
 33 files changed, 510 insertions(+), 93 deletions(-)
 create mode 100644 gcc/config/aarch64/aarch64-abi-ms.h
 create mode 100644 gcc/config/aarch64/aarch64-coff.h
 create mode 100644 gcc/config/aarch64/cygming.h
 delete mode 100644 gcc/config/i386/cygming.opt.urls
 rename gcc/config/{i386 => mingw}/cygming.opt (100%)
 create mode 100644 gcc/config/mingw/cygming.opt.urls
 rename gcc/config/{i386 => mingw}/cygwin-d.cc (100%)
 rename gcc/config/{i386 => mingw}/mingw-stdint.h (86%)
 rename gcc/config/{i386 => mingw}/mingw.opt (100%)
 rename gcc/config/{i386 => mingw}/mingw.opt.urls (86%)
 rename gcc/config/{i386 => mingw}/mingw32.h (98%)
 rename gcc/config/{i386 => mingw}/msformat-c.cc (100%)
 rename gcc/config/{i386 => mingw}/t-cygming (73%)
 rename gcc/config/{i386 => mingw}/winnt-cxx.cc (100%)
 rename gcc/config/{i386 => mingw}/winnt-d.cc (100%)
 rename gcc/config/{i386 => mingw}/winnt-stubs.cc (100%)
 rename gcc/config/{i386 => mingw}/winnt.cc (97%)
 create mode 100644 libgcc/config/aarch64/t-no-eh
 rename libgcc/config/{i386 => mingw}/t-gthr-win32 (100%)
 rename libgcc/config/{i386 => mingw}/t-mingw-pthread (100%)


[PATCH,V2] ctf: fix incorrect CTF for multi-dimensional array types

2024-03-04 Thread Indu Bhagat
From: Cupertino Miranda 

[Changes from V1]
  - Refactor the code a bit.
[End of changes from V1]

PR debug/114186

DWARF DIEs of type DW_TAG_subrange_type are linked together to represent
the information about the subsequent dimensions.  The CTF processing was
so far working through them in the opposite (incorrect) order.

While fixing the issue, refactor the code a bit for readability.
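To illustrate the ordering requirement, here is a small standalone model, not the dwarf2ctf.cc code. It assumes the dimensions arrive outermost first (as the subrange DIEs do for int[N][M]) and registers the innermost array first, so that each outer array can name the inner one as its contents. All `toy_*` names are made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a CTF array record; all names here are illustrative.  */
struct toy_array { unsigned nelems; int contents; };

#define TOY_MAX_TYPES 16
static struct toy_array toy_types[TOY_MAX_TYPES];
static int toy_ntypes;

/* Register an array type and return its id (an index into toy_types).  */
static int
toy_add_array (unsigned nelems, int contents)
{
  assert (toy_ntypes < TOY_MAX_TYPES);
  toy_types[toy_ntypes] = (struct toy_array) { nelems, contents };
  return toy_ntypes++;
}

/* Build nested array types for DIMS[0..NDIMS-1], given outermost
   dimension first.  The recursion reaches the last dimension before
   registering anything, so the innermost array gets the lowest id and
   the returned id is that of the outermost array.  ELEM_ID is the id of
   the scalar element type.  */
static int
toy_gen_dims (const unsigned *dims, size_t ndims, int elem_id)
{
  assert (ndims > 0);
  int contents = (ndims == 1
                  ? elem_id
                  : toy_gen_dims (dims + 1, ndims - 1, elem_id));
  return toy_add_array (dims[0], contents);
}
```

With dims = {2, 3} the outermost record has 2 elements whose contents is a 3-element array of the scalar type, which matches how C reads int[2][3].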

co-authored-By: Indu Bhagat 

gcc/
PR debug/114186
* dwarf2ctf.cc (gen_ctf_array_type): Invoke the ctf_add_array ()
in the correct order of the dimensions.
(gen_ctf_subrange_type): Refactor out handling of
DW_TAG_subrange_type DIE to here.

gcc/testsuite/
PR debug/114186
* gcc.dg/debug/ctf/ctf-array-6.c: Add test.
---

Testing notes:

Regression tested on x86_64-linux-gnu default target.
Regression tested for target bpf-unknown-none (btf.exp, ctf.exp, bpf.exp).

---
 gcc/dwarf2ctf.cc | 153 +--
 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-6.c |  14 ++
 2 files changed, 84 insertions(+), 83 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/ctf/ctf-array-6.c

diff --git a/gcc/dwarf2ctf.cc b/gcc/dwarf2ctf.cc
index dca86edfffa9..3985de115a79 100644
--- a/gcc/dwarf2ctf.cc
+++ b/gcc/dwarf2ctf.cc
@@ -349,105 +349,92 @@ gen_ctf_pointer_type (ctf_container_ref ctfc, dw_die_ref ptr_type)
   return ptr_type_id;
 }
 
-/* Generate CTF for an array type.  */
+/* Recursively generate CTF for array dimensions starting at DIE C (of type
+   DW_TAG_subrange_type) until DIE LAST (of type DW_TAG_subrange_type) is
+   reached.  ARRAY_ELEMS_TYPE_ID is base type for the array.  */
 
 static ctf_id_t
-gen_ctf_array_type (ctf_container_ref ctfc, dw_die_ref array_type)
+gen_ctf_subrange_type (ctf_container_ref ctfc, ctf_id_t array_elems_type_id,
+  dw_die_ref c, dw_die_ref last)
 {
-  dw_die_ref c;
-  ctf_id_t array_elems_type_id = CTF_NULL_TYPEID;
+  ctf_arinfo_t arinfo;
+  ctf_id_t array_node_type_id = CTF_NULL_TYPEID;
+
+  dw_attr_node *upper_bound_at;
+  dw_die_ref array_index_type;
+  uint32_t array_num_elements;
+
+  /* When DW_AT_upper_bound is used to specify the size of an
+ array in DWARF, it is usually an unsigned constant
+ specifying the upper bound index of the array.  However,
+ for unsized arrays, such as foo[] or bar[0],
+ DW_AT_upper_bound is a signed integer constant
+ instead.  */
+
+  upper_bound_at = get_AT (c, DW_AT_upper_bound);
+  if (upper_bound_at
+  && AT_class (upper_bound_at) == dw_val_class_unsigned_const)
+/* This is the upper bound index.  */
+array_num_elements = get_AT_unsigned (c, DW_AT_upper_bound) + 1;
+  else if (get_AT (c, DW_AT_count))
+array_num_elements = get_AT_unsigned (c, DW_AT_count);
+  else
+{
+  /* This is a VLA of some kind.  */
+  array_num_elements = 0;
+}
 
-  int vector_type_p = get_AT_flag (array_type, DW_AT_GNU_vector);
-  if (vector_type_p)
-return array_elems_type_id;
+  /* Ok, mount and register the array type.  Note how the array
+ type we register here is the type of the elements in
+ subsequent "dimensions", if there are any.  */
+  arinfo.ctr_nelems = array_num_elements;
 
-  dw_die_ref array_elems_type = ctf_get_AT_type (array_type);
+  array_index_type = ctf_get_AT_type (c);
+  arinfo.ctr_index = gen_ctf_type (ctfc, array_index_type);
 
-  /* First, register the type of the array elements if needed.  */
-  array_elems_type_id = gen_ctf_type (ctfc, array_elems_type);
+  if (c == last)
+arinfo.ctr_contents = array_elems_type_id;
+  else
+arinfo.ctr_contents = gen_ctf_subrange_type (ctfc, array_elems_type_id,
+dw_get_die_sib (c), last);
 
-  /* DWARF array types pretend C supports multi-dimensional arrays.
- So for the type int[N][M], the array type DIE contains two
- subrange_type children, the first with upper bound N-1 and the
- second with upper bound M-1.
+  if (!ctf_type_exists (ctfc, c, &array_node_type_id))
+array_node_type_id = ctf_add_array (ctfc, CTF_ADD_ROOT, &arinfo, c);
 
- CTF, on the other hand, just encodes each array type in its own
- array type CTF struct.  Therefore we have to iterate on the
- children and create all the needed types.  */
+  return array_node_type_id;
+}
 
-  c = dw_get_die_child (array_type);
-  gcc_assert (c);
-  do
-{
-  ctf_arinfo_t arinfo;
-  dw_die_ref array_index_type;
-  uint32_t array_num_elements;
+/* Generate CTF for an ARRAY_TYPE.  */
 
-  c = dw_get_die_sib (c);
+static ctf_id_t
+gen_ctf_array_type (ctf_container_ref ctfc,
+   dw_die_ref array_type)
+{
+  dw_die_ref first, last, array_elems_type;
+  ctf_id_t array_elems_type_id = CTF_NULL_TYPEID;
+  ctf_id_t array_type_id = CTF_NULL_TYPEID;
 
-  if (dw_get_die_tag (c) == DW_TAG_subrange_type)
-   {
- dw_attr_node *upper_bound_at;
-
- array_index_type = ctf_get_AT_type (c);
-
- 

Re: [PATCH] combine: Fix recent WORD_REGISTER_OPERATIONS check [PR113010]

2024-03-04 Thread Jeff Law




On 3/4/24 09:49, Jakub Jelinek wrote:

On Mon, Mar 04, 2024 at 05:18:39PM +0100, Rainer Orth wrote:

On 2/26/24 17:17, Greg McGary wrote:

The sign-bit-copies of a sign-extending load cannot be known until runtime on
WORD_REGISTER_OPERATIONS targets, except in the case of a zero-extending MEM
load.  See the fix for PR112758.
2024-02-22  Greg McGary  
  PR rtl-optimization/113010
* combine.cc (simplify_comparison): Simplify a SUBREG on
  WORD_REGISTER_OPERATIONS targets only if it is a zero-extending
  MEM load.
* gcc.c-torture/execute/pr113010.c: New test.

I think this is fine for the trunk.  I'll do some final testing on it
tomorrow.


unfortunately, the patch broke Solaris/SPARC bootstrap
(sparc-sun-solaris2.11):

/vol/gcc/src/hg/master/local/gcc/combine.cc: In function 'rtx_code simplify_comparison(rtx_code, rtx_def**, rtx_def**)':
/vol/gcc/src/hg/master/local/gcc/combine.cc:12101:25: error: '*(unsigned int*)((char*)&inner_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))' may be used uninitialized [-Werror=maybe-uninitialized]
12101 |   scalar_int_mode mode, inner_mode, tmode;
   | ^~


I don't see how it could ever work properly, inner_mode in that spot is
just uninitialized.

I think we shouldn't worry about paradoxical subregs of non-scalar_int_mode
REGs/MEMs and for the scalar_int_mode ones should initialize inner_mode
before we use it.
Another option would be to use
maybe_lt (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0))), BITS_PER_WORD)
and
load_extend_op (GET_MODE (SUBREG_REG (op0))) == ZERO_EXTEND,
or set machine_mode smode = GET_MODE (SUBREG_REG (op0)); and use it in
those two spots.
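The underlying pattern is an out-parameter that is only written when a check succeeds. A minimal standalone sketch of guarding such a value follows; the `toy_*` names and the mode enumerators are invented for illustration and are not GCC's API.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy mode classification; these enumerators are illustrative only.  */
enum toy_mode { TOY_V4SI, TOY_QI, TOY_HI, TOY_SI, TOY_DI };

/* Analogue of a checked conversion: write *PRECISION_OUT and return true
   only for scalar integer modes.  On failure *PRECISION_OUT is left
   untouched, so the caller must not read it.  */
static bool
toy_is_scalar_int (enum toy_mode m, unsigned *precision_out)
{
  switch (m)
    {
    case TOY_QI: *precision_out = 8; return true;
    case TOY_HI: *precision_out = 16; return true;
    case TOY_SI: *precision_out = 32; return true;
    case TOY_DI: *precision_out = 64; return true;
    default: return false;
    }
}

/* Return true if a value of mode M is a scalar integer narrower than a
   WORD_BITS-bit word.  The && ensures PRECISION is read only after the
   check has written it.  */
static bool
toy_narrower_than_word (enum toy_mode m, unsigned word_bits)
{
  assert (word_bits != 0);
  unsigned precision;
  return toy_is_scalar_int (m, &precision) && precision < word_bits;
}
```

Reading `precision` without the guard is the kind of use that -Wmaybe-uninitialized reports.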

2024-03-04  Jakub Jelinek  

PR rtl-optimization/113010
* combine.cc (simplify_comparison): Guard the
WORD_REGISTER_OPERATIONS check on scalar_int_mode of SUBREG_REG
and initialize inner_mode.

Egad.  Sorry.  OK for the trunk.  Thanks for picking this up.  Got
distracted by an internal issue.


jeff



[PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-03-04 Thread Evgeny Karpov
gcc/ChangeLog:

* config.gcc:
* config/aarch64/aarch64-opts.h (enum aarch64_calling_abi):
* config/aarch64/aarch64-protos.h 
(mingw_pe_maybe_record_exported_symbol):
(mingw_pe_section_type_flags):
(mingw_pe_unique_section):
(mingw_pe_encode_section_info):
* config/aarch64/aarch64.h (struct seh_frame_state):
(GTY):
* config/i386/cygming.h (SUBTARGET_ENCODE_SECTION_INFO):
(TARGET_ASM_UNIQUE_SECTION):
(TARGET_ASM_NAMED_SECTION):
(TARGET_SECTION_TYPE_FLAGS):
(ASM_DECLARE_COLD_FUNCTION_NAME):
(ASM_OUTPUT_EXTERNAL_LIBCALL):
* config/i386/i386-protos.h (i386_pe_unique_section):
(i386_pe_declare_function_type):
(mingw_pe_unique_section):
(mingw_pe_declare_function_type):
(i386_pe_maybe_record_exported_symbol):
(i386_pe_encode_section_info):
(mingw_pe_maybe_record_exported_symbol):
(mingw_pe_encode_section_info):
(i386_pe_section_type_flags):
(i386_pe_asm_named_section):
(mingw_pe_section_type_flags):
(mingw_pe_asm_named_section):
* config/i386/mingw-w64.opt.urls:
* config/lynx.opt.urls:
* config/i386/cygming.opt: Move to...
* config/mingw/cygming.opt: ...here.
* config/i386/cygwin-d.cc: Move to...
* config/mingw/cygwin-d.cc: ...here.
* config/i386/mingw-stdint.h: Move to...
* config/mingw/mingw-stdint.h: ...here.
* config/i386/mingw.opt: Move to...
* config/mingw/mingw.opt: ...here.
* config/i386/mingw.opt.urls: Move to...
* config/mingw/mingw.opt.urls: ...here.
* config/i386/mingw32.h: Move to...
* config/mingw/mingw32.h: ...here.
* config/i386/msformat-c.cc: Move to...
* config/mingw/msformat-c.cc: ...here.
* config/i386/t-cygming: Move to...
* config/mingw/t-cygming: ...here.
* config/i386/winnt-cxx.cc: Move to...
* config/mingw/winnt-cxx.cc: ...here.
* config/i386/winnt-d.cc: Move to...
* config/mingw/winnt-d.cc: ...here.
* config/i386/winnt-stubs.cc: Move to...
* config/mingw/winnt-stubs.cc: ...here.
* config/i386/winnt.cc: Move to...
* config/mingw/winnt.cc: ...here.
* doc/invoke.texi:
* varasm.cc (switch_to_comdat_section):
* config/i386/cygming.opt.urls: Removed.
* config/aarch64/aarch64-abi-ms.h: New file.
* config/aarch64/aarch64-coff.h: New file.
* config/aarch64/cygming.h: New file.
* config/mingw/cygming.opt.urls: New file.

libatomic/ChangeLog:

* configure.tgt:

libgcc/ChangeLog:

* config.host:
* config/i386/t-gthr-win32: Move to...
* config/mingw/t-gthr-win32: ...here.
* config/i386/t-mingw-pthread: Move to...
* config/mingw/t-mingw-pthread: ...here.
* config/aarch64/t-no-eh: New file.



[PATCH v2 13/13] Add aarch64-w64-mingw32 target to libgcc

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Mon, 12 Feb 2024 15:22:47 +0100
Subject: [PATCH v2 13/13] Add aarch64-w64-mingw32 target to libgcc

Reuse MinGW definitions from i386 for libgcc. Move reused files to
libgcc/config/mingw folder.

libgcc/ChangeLog:

* config.host: Add aarch64-w64-mingw32 target. Adjust targets
after moving MinGW files.
* config/i386/t-gthr-win32: Move to...
* config/mingw/t-gthr-win32: ...here.
* config/i386/t-mingw-pthread: Move to...
* config/mingw/t-mingw-pthread: ...here.
* config/aarch64/t-no-eh: New file. EH is not yet implemented for
the target, and the default definition should be disabled.
---
 libgcc/config.host| 23 +++
 libgcc/config/aarch64/t-no-eh |  2 ++
 libgcc/config/{i386 => mingw}/t-gthr-win32|  0
 libgcc/config/{i386 => mingw}/t-mingw-pthread |  0
 4 files changed, 21 insertions(+), 4 deletions(-)
 create mode 100644 libgcc/config/aarch64/t-no-eh
 rename libgcc/config/{i386 => mingw}/t-gthr-win32 (100%)
 rename libgcc/config/{i386 => mingw}/t-mingw-pthread (100%)

diff --git a/libgcc/config.host b/libgcc/config.host
index 59a42d3a01f..3396a84893f 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -456,6 +456,21 @@ aarch64*-*-vxworks7*)
tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
tmake_file="${tmake_file} t-dfprules"
;;
+aarch64-*-mingw*)
+   case ${target_thread_file} in
+ win32)
+   tmake_thr_file="mingw/t-gthr-win32"
+   ;;
+ posix)
+   tmake_thr_file="mingw/t-mingw-pthread"
+   ;;
+   esac
+   tmake_file="${tmake_file} ${cpu_type}/t-no-eh ${tmake_thr_file}"
+   tmake_file="${tmake_file} t-dfprules"
+   tmake_file="${tmake_file} ${cpu_type}/t-aarch64"
+   tmake_file="${tmake_file} ${cpu_type}/t-lse"
+   tmake_file="${tmake_file} ${cpu_type}/t-softfp t-softfp t-crtfm"
+   ;;
 alpha*-*-linux*)
tmake_file="${tmake_file} alpha/t-alpha alpha/t-ieee t-crtfm alpha/t-linux"
extra_parts="$extra_parts crtfastmath.o"
@@ -874,10 +889,10 @@ i[34567]86-*-mingw*)
fi
case ${target_thread_file} in
  win32)
-   tmake_thr_file="i386/t-gthr-win32"
+   tmake_thr_file="mingw/t-gthr-win32"
;;
  posix)
-   tmake_thr_file="i386/t-mingw-pthread"
+   tmake_thr_file="mingw/t-mingw-pthread"
;;
  mcf)
tmake_thr_file="i386/t-mingw-mcfgthread"
@@ -901,10 +916,10 @@ i[34567]86-*-mingw*)
 x86_64-*-mingw*)
case ${target_thread_file} in
  win32)
-   tmake_thr_file="i386/t-gthr-win32"
+   tmake_thr_file="mingw/t-gthr-win32"
;;
  posix)
-   tmake_thr_file="i386/t-mingw-pthread"
+   tmake_thr_file="mingw/t-mingw-pthread"
;;
  mcf)
tmake_thr_file="i386/t-mingw-mcfgthread"
diff --git a/libgcc/config/aarch64/t-no-eh b/libgcc/config/aarch64/t-no-eh
new file mode 100644
index 000..1802339a583
--- /dev/null
+++ b/libgcc/config/aarch64/t-no-eh
@@ -0,0 +1,2 @@
+# Not using EH
+LIB2ADDEH =
diff --git a/libgcc/config/i386/t-gthr-win32 b/libgcc/config/mingw/t-gthr-win32
similarity index 100%
rename from libgcc/config/i386/t-gthr-win32
rename to libgcc/config/mingw/t-gthr-win32
diff --git a/libgcc/config/i386/t-mingw-pthread b/libgcc/config/mingw/t-mingw-pthread
similarity index 100%
rename from libgcc/config/i386/t-mingw-pthread
rename to libgcc/config/mingw/t-mingw-pthread
-- 
2.25.1



[PATCH v2 12/13] aarch64: Add aarch64-w64-mingw32 target to libatomic

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Fri, 1 Mar 2024 02:23:45 +0100
Subject: [PATCH v2 12/13] aarch64: Add aarch64-w64-mingw32 target to libatomic

libatomic/ChangeLog:

* configure.tgt: Add aarch64-w64-mingw32 target.
---
 libatomic/configure.tgt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libatomic/configure.tgt b/libatomic/configure.tgt
index 4237f283fe4..e49fd57ab41 100644
--- a/libatomic/configure.tgt
+++ b/libatomic/configure.tgt
@@ -44,7 +44,7 @@ case "${target_cpu}" in
   aarch64*)
ARCH=aarch64
case "${target}" in
-   aarch64*-*-linux*)
+   aarch64*-*-linux* | aarch64-*-mingw*)
if test -n "$enable_aarch64_lse"; then
try_ifunc=yes
fi
-- 
2.25.1



[PATCH v2 11/13] aarch64: Build and add objects for Cygwin and MinGW for AArch64

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Tue, 20 Feb 2024 13:55:51 +0100
Subject: [PATCH v2 11/13] aarch64: Build and add objects for Cygwin and MinGW
 for AArch64

gcc/ChangeLog:

* config.gcc: Build and add objects for Cygwin and MinGW. Add Cygwin
and MinGW options to the target.
---
 gcc/config.gcc | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 4471599454b..ed5431b0f5d 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1272,6 +1272,11 @@ aarch64-*-mingw*)
tm_file="${tm_file} mingw/mingw-stdint.h"
tmake_file="${tmake_file} aarch64/t-aarch64"
target_gtfiles="$target_gtfiles \$(srcdir)/config/mingw/winnt.cc"
+   extra_options="${extra_options} mingw/cygming.opt mingw/mingw.opt"
+   extra_objs="${extra_objs} winnt.o"
+   c_target_objs="${c_target_objs} msformat-c.o"
+   d_target_objs="${d_target_objs} winnt-d.o"
+   tmake_file="${tmake_file} mingw/t-cygming"
case ${enable_threads} in
  "" | yes | win32)
thread_file='win32'
-- 
2.25.1



Re: [PATCH] c++: lambda capturing structured bindings [PR85889]

2024-03-04 Thread Marek Polacek
On Fri, Mar 01, 2024 at 07:58:24PM -0500, Marek Polacek wrote:
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for 15?  (Or even trunk?)
> 
> -- >8 --
>  clarifies that it's OK to capture structured
> bindings.
> 
> [expr.prim.lambda.capture]/4 says "The identifier in a simple-capture shall
> denote a local entity" and [basic.pre]/3: "An entity is a [...] structured
> binding".
> 
> It doesn't appear that this was made a DR, so, strictly speaking, we
> should have a -Wc++20-extensions warning, like clang++.
> 
>   PR c++/85889
> 
> gcc/cp/ChangeLog:
> 
>   * lambda.cc (add_capture): Add a pedwarn for capturing structured
>   bindings.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/cpp2a/decomp3.C: Use -Wno-c++20-extensions.
>   * g++.dg/cpp1z/decomp60.C: New test.
> ---
>  gcc/cp/lambda.cc  |  9 +
>  gcc/testsuite/g++.dg/cpp1z/decomp60.C | 12 
>  gcc/testsuite/g++.dg/cpp2a/decomp3.C  |  2 +-
>  3 files changed, 22 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp1z/decomp60.C
> 
> diff --git a/gcc/cp/lambda.cc b/gcc/cp/lambda.cc
> index 4b1f9391fee..470f9d2c4f1 100644
> --- a/gcc/cp/lambda.cc
> +++ b/gcc/cp/lambda.cc
> @@ -607,6 +607,15 @@ add_capture (tree lambda, tree id, tree orig_init, bool by_reference_p,
>TCTX_CAPTURE_BY_COPY, type))
>   return error_mark_node;
>   }
> +
> +  if (cxx_dialect < cxx20)
> + {
> +   tree stripped_init = tree_strip_any_location_wrapper (initializer);

I was missing an auto_diagnostic_group here.  Fixed.

> +   if (DECL_DECOMPOSITION_P (stripped_init)
> +   && pedwarn (input_location, OPT_Wc__20_extensions,
> +   "captured structured bindings are a C++20 extension"))
> + inform (DECL_SOURCE_LOCATION (stripped_init), "declared here");



[PATCH v2 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW Options"

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Fri, 1 Mar 2024 02:17:39 +0100
Subject: [PATCH v2 10/13] Rename "x86 Windows Options" to "Cygwin and MinGW
 Options"

Rename "x86 Windows Options" to "Cygwin and MinGW Options".
The section will also be used for AArch64.

gcc/ChangeLog:

* config/i386/mingw-w64.opt.urls: Rename options' name and
regenerate option URLs.
* config/lynx.opt.urls: Likewise.
* config/mingw/cygming.opt.urls: Likewise.
* config/mingw/mingw.opt.urls: Likewise.
* doc/invoke.texi: Likewise.
---
 gcc/config/i386/mingw-w64.opt.urls |  2 +-
 gcc/config/lynx.opt.urls   |  2 +-
 gcc/config/mingw/cygming.opt.urls  | 18 +-
 gcc/config/mingw/mingw.opt.urls|  2 +-
 gcc/doc/invoke.texi| 12 ++--
 5 files changed, 22 insertions(+), 14 deletions(-)

diff --git a/gcc/config/i386/mingw-w64.opt.urls b/gcc/config/i386/mingw-w64.opt.urls
index 6bb53ef29b2..5cceba1d1a1 100644
--- a/gcc/config/i386/mingw-w64.opt.urls
+++ b/gcc/config/i386/mingw-w64.opt.urls
@@ -1,5 +1,5 @@
 ; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/mingw-w64.opt and generated HTML
 
 municode
-UrlSuffix(gcc/x86-Windows-Options.html#index-municode)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-municode)
 
diff --git a/gcc/config/lynx.opt.urls b/gcc/config/lynx.opt.urls
index 63e7b9c4b33..b547138f7ff 100644
--- a/gcc/config/lynx.opt.urls
+++ b/gcc/config/lynx.opt.urls
@@ -1,5 +1,5 @@
 ; Autogenerated by regenerate-opt-urls.py from gcc/config/lynx.opt and generated HTML
 
 mthreads
-UrlSuffix(gcc/x86-Windows-Options.html#index-mthreads-1)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1)
 
diff --git a/gcc/config/mingw/cygming.opt.urls b/gcc/config/mingw/cygming.opt.urls
index 87799befe3c..c624e22e442 100644
--- a/gcc/config/mingw/cygming.opt.urls
+++ b/gcc/config/mingw/cygming.opt.urls
@@ -1,30 +1,30 @@
 ; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/cygming.opt and generated HTML
 
 mconsole
-UrlSuffix(gcc/x86-Windows-Options.html#index-mconsole)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mconsole)
 
 mdll
-UrlSuffix(gcc/x86-Windows-Options.html#index-mdll)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mdll)
 
 mnop-fun-dllimport
-UrlSuffix(gcc/x86-Windows-Options.html#index-mnop-fun-dllimport)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mnop-fun-dllimport)
 
 ; skipping UrlSuffix for 'mthreads' due to multiple URLs:
+;   duplicate: 'gcc/Cygwin-and-MinGW-Options.html#index-mthreads-1'
 ;   duplicate: 'gcc/x86-Options.html#index-mthreads'
-;   duplicate: 'gcc/x86-Windows-Options.html#index-mthreads-1'
 
 mwin32
-UrlSuffix(gcc/x86-Windows-Options.html#index-mwin32)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mwin32)
 
 mwindows
-UrlSuffix(gcc/x86-Windows-Options.html#index-mwindows)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mwindows)
 
 mpe-aligned-commons
-UrlSuffix(gcc/x86-Windows-Options.html#index-mpe-aligned-commons)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mpe-aligned-commons)
 
 fset-stack-executable
-UrlSuffix(gcc/x86-Windows-Options.html#index-fno-set-stack-executable)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-fno-set-stack-executable)
 
 fwritable-relocated-rdata
-UrlSuffix(gcc/x86-Windows-Options.html#index-fno-writable-relocated-rdata)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-fno-writable-relocated-rdata)
 
diff --git a/gcc/config/mingw/mingw.opt.urls b/gcc/config/mingw/mingw.opt.urls
index 2cbbaadf310..f8ee5be6a53 100644
--- a/gcc/config/mingw/mingw.opt.urls
+++ b/gcc/config/mingw/mingw.opt.urls
@@ -1,7 +1,7 @@
 ; Autogenerated by regenerate-opt-urls.py from gcc/config/i386/mingw.opt and generated HTML
 
 mcrtdll=
-UrlSuffix(gcc/x86-Windows-Options.html#index-mcrtdll)
+UrlSuffix(gcc/Cygwin-and-MinGW-Options.html#index-mcrtdll)
 
 ; skipping UrlSuffix for 'pthread' due to multiple URLs:
 ;   duplicate: 'gcc/Link-Options.html#index-pthread-1'
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index bdf05be387d..e2e473e095f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1493,6 +1493,8 @@ See RS/6000 and PowerPC Options.
 -munroll-only-small-loops -mlam=@var{choice}}
 
 @emph{x86 Windows Options}
+
+@emph{Cygwin and MinGW Options}
 @gccoptlist{-mconsole  -mcrtdll=@var{library}  -mdll
 -mnop-fun-dllimport  -mthread
 -municode  -mwin32  -mwindows  -fno-set-stack-executable}
@@ -20976,6 +20978,7 @@ platform.
 * C6X Options::
 * CRIS Options::
 * C-SKY Options::
+* Cygwin and MinGW Options::
 * Darwin Options::
 * DEC Alpha Options::
 * eBPF Options::
@@ -36112,8 +36115,13 @@ positions 62:57 can be used for metadata.
 
 @node x86 Windows Options
 @subsection x86 Windows Options
-@cindex x86 Windows Options
-@cindex Windows Options for x86
+
+@xref{Cygwin and MinGW Options}.
+
+@node Cygwin and MinGW Options
+@subsection Cygwin and MinGW Options
+@cindex Cygwin and MinGW Options
+@cindex Options for Cygwin and MinGW
 
 These 

[PATCH v2 09/13] aarch64: Add SEH to machine_function

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Tue, 20 Feb 2024 18:10:08 +0100
Subject: [PATCH v2 09/13] aarch64: Add SEH to machine_function

SEH is not yet enabled for the aarch64-w64-mingw32 target. However, it
needs to be declared in machine_function so that winnt.cc can be reused.
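The mechanism here is a plain incomplete-type declaration: a header can carry a pointer to a struct whose definition is private to one source file. A minimal sketch, with `toy_machine_function` and `toy_seh_active_p` as made-up stand-ins for the real structures:

```c
#include <assert.h>
#include <stddef.h>

/* Incomplete type: only its name is visible here.  The full definition
   would live in a single source file (winnt.cc in the patch).  */
struct seh_frame_state;

/* Illustrative stand-in for machine_function; only the pointer matters.  */
struct toy_machine_function
{
  int other_state;
  struct seh_frame_state *seh;  /* null unless SEH output is in progress */
};

/* Code that sees only the declaration can store and test the pointer,
   but cannot dereference it or apply sizeof to the struct.  */
static int
toy_seh_active_p (const struct toy_machine_function *mf)
{
  assert (mf != NULL);
  return mf->seh != NULL;
}
```

Since SEH is disabled for this target, the pointer would presumably stay null and the field exists only so that the shared code compiles.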

gcc/ChangeLog:

* config/aarch64/aarch64.h (struct seh_frame_state): Declare SEH
structure in machine_function.
(GTY): Add SEH field.
---
 gcc/config/aarch64/aarch64.h | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 45e901cda64..62cc97aa8c8 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -1042,6 +1042,9 @@ struct GTY (()) aarch64_frame
   bool is_scs_enabled;
 };
 
+/* Private to winnt.cc.  */
+struct seh_frame_state;
+
 #ifdef hash_set_h
 typedef struct GTY (()) machine_function
 {
@@ -1082,6 +1085,9 @@ typedef struct GTY (()) machine_function
  still exists and still fulfils its original purpose. the same register
  can be reused by other code.  */
   rtx_insn *advsimd_zero_insn;
+
+  /* During SEH output, this is non-null.  */
+  struct seh_frame_state * GTY ((skip (""))) seh;
 } machine_function;
 #endif
 #endif
-- 
2.25.1



[PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for AArch64

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Fri, 1 Mar 2024 10:49:28 +0100
Subject: [PATCH v2 08/13] aarch64: Add Cygwin and MinGW environments for
 AArch64

Define the Cygwin and MinGW environment: types, SEH definitions,
shared libraries, etc.

gcc/ChangeLog:

* config.gcc: Add Cygwin and MinGW definitions.
* config/aarch64/aarch64-protos.h
(mingw_pe_maybe_record_exported_symbol): Declare functions
which are used in Cygwin and MinGW environment.
(mingw_pe_section_type_flags): Likewise.
(mingw_pe_unique_section): Likewise.
(mingw_pe_encode_section_info): Likewise.
* config/aarch64/cygming.h: New file.
---
 gcc/config.gcc  |   4 +
 gcc/config/aarch64/aarch64-protos.h |   5 +
 gcc/config/aarch64/cygming.h| 175 
 3 files changed, 184 insertions(+)
 create mode 100644 gcc/config/aarch64/cygming.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 3aca257c322..4471599454b 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1267,7 +1267,11 @@ aarch64*-*-linux*)
 aarch64-*-mingw*)
tm_file="${tm_file} aarch64/aarch64-abi-ms.h"
tm_file="${tm_file} aarch64/aarch64-coff.h"
+   tm_file="${tm_file} aarch64/cygming.h"
+   tm_file="${tm_file} mingw/mingw32.h"
+   tm_file="${tm_file} mingw/mingw-stdint.h"
tmake_file="${tmake_file} aarch64/t-aarch64"
+   target_gtfiles="$target_gtfiles \$(srcdir)/config/mingw/winnt.cc"
case ${enable_threads} in
  "" | yes | win32)
thread_file='win32'
diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index bd719b992a5..759e1a0f9da 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -1110,6 +1110,11 @@ extern void aarch64_output_patchable_area (unsigned int, bool);
 
 extern void aarch64_adjust_reg_alloc_order ();
 
+extern void mingw_pe_maybe_record_exported_symbol (tree, const char *, int);
+extern unsigned int mingw_pe_section_type_flags (tree, const char *, int);
+extern void mingw_pe_unique_section (tree, int);
+extern void mingw_pe_encode_section_info (tree, rtx, int);
+
 bool aarch64_optimize_mode_switching (aarch64_mode_entity);
 void aarch64_restore_za (rtx);
 
diff --git a/gcc/config/aarch64/cygming.h b/gcc/config/aarch64/cygming.h
new file mode 100644
index 000..2f239c42a89
--- /dev/null
+++ b/gcc/config/aarch64/cygming.h
@@ -0,0 +1,175 @@
+/* Operating system specific defines to be used when targeting GCC for
+   hosting on Windows32, using a Unix style C library and tools.
+   Copyright (C) 1995-2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#ifndef GCC_AARCH64_CYGMING_H
+#define GCC_AARCH64_CYGMING_H
+
+#undef PREFERRED_DEBUGGING_TYPE
+#define PREFERRED_DEBUGGING_TYPE DINFO_TYPE_NONE
+
+#define FASTCALL_PREFIX '@'
+
+#define print_reg(rtx, code, file)
+
+#define SYMBOL_FLAG_DLLIMPORT 0
+#define SYMBOL_FLAG_DLLEXPORT 0
+
+#define SYMBOL_REF_DLLEXPORT_P(X) \
+   ((SYMBOL_REF_FLAGS (X) & SYMBOL_FLAG_DLLEXPORT) != 0)
+
+/* Disable SEH and declare the required SEH-related macros that are
+still needed for compilation.  */
+#undef TARGET_SEH
+#define TARGET_SEH 0
+
+#define SSE_REGNO_P(N) 0
+#define GENERAL_REGNO_P(N) 0
+#define SEH_MAX_FRAME_SIZE 0
+
+#undef DEFAULT_ABI
+#define DEFAULT_ABI AARCH64_CALLING_ABI_MS
+
+#undef TARGET_PECOFF
+#define TARGET_PECOFF 1
+
+#include 
+#ifdef __MINGW32__
+#include 
+#endif
+
+extern void mingw_pe_asm_named_section (const char *, unsigned int, tree);
+extern void mingw_pe_declare_function_type (FILE *file, const char *name,
+   int pub);
+
+#define TARGET_ASM_NAMED_SECTION  mingw_pe_asm_named_section
+
+/* Select attributes for named sections.  */
+#define TARGET_SECTION_TYPE_FLAGS  mingw_pe_section_type_flags
+
+#define TARGET_ASM_UNIQUE_SECTION mingw_pe_unique_section
+#define TARGET_ENCODE_SECTION_INFO  mingw_pe_encode_section_info
+
+/* Declare the type properly for any external libcall.  */
+#define ASM_OUTPUT_EXTERNAL_LIBCALL(FILE, FUN) \
+  mingw_pe_declare_function_type (FILE, XSTR (FUN, 0), 1)
+
+#define TARGET_OS_CPP_BUILTINS()   \
+  do   \
+{  \
+  builtin_define 

[PATCH v2 07/13] Exclude i386 functionality from aarch64 build

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Fri, 1 Mar 2024 02:35:40 +0100
Subject: [PATCH v2 07/13] Exclude i386 functionality from aarch64 build

This patch defines TARGET_AARCH64_MS_ABI in config.gcc and uses it to
exclude i386 functionality from aarch64 build and adjust MinGW headers
for AArch64 MS ABI.
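The selection technique is an ordinary preprocessor switch on a macro supplied by the build configuration. The sketch below assumes the macro arrives as -DTARGET_AARCH64_MS_ABI=1 and defines it by hand so that the example is self-contained; the `toy_*` functions are hypothetical helpers for inspecting the result.

```c
#include <assert.h>
#include <string.h>

/* Assumption for this sketch: the build passes -DTARGET_AARCH64_MS_ABI=1
   for aarch64-*-mingw* and nothing otherwise.  Defined by hand here.  */
#define TARGET_AARCH64_MS_ABI 1

#if defined (TARGET_AARCH64_MS_ABI)
/* The AArch64 branch is a plain constant and does not consult
   TARGET_64BIT at all.  */
# define INTPTR_TYPE "long long int"
#else
# define INTPTR_TYPE (TARGET_64BIT ? "long long int" : "int")
#endif

/* Return the spelling chosen for intptr_t.  */
static const char *
toy_intptr_type (void)
{
  const char *name = INTPTR_TYPE;
  assert (name[0] != '\0');
  return name;
}

/* Return nonzero if the chosen type is the 64-bit spelling.  */
static int
toy_intptr_is_64bit (void)
{
  return strcmp (toy_intptr_type (), "long long int") == 0;
}
```

Because the unused branch is discarded by the preprocessor, x86-only names inside it never need to resolve in an AArch64 build.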

gcc/ChangeLog:

* config.gcc: Define TARGET_AARCH64_MS_ABI.
* config/mingw/mingw-stdint.h (INTPTR_TYPE): Use
TARGET_AARCH64_MS_ABI to adjust MinGW headers for
AArch64 MS ABI.
(UINTPTR_TYPE): Likewise.
(defined): Likewise.
* config/mingw/mingw32.h (DEFAULT_ABI): Likewise.
(defined): Likewise.
* config/mingw/winnt.cc (defined): Use TARGET_AARCH64_MS_ABI to
exclude ix86_get_callcvt.
(i386_pe_maybe_mangle_decl_assembler_name): Likewise.
(i386_pe_mangle_decl_assembler_name): Likewise.
---
 gcc/config.gcc  | 1 +
 gcc/config/mingw/mingw-stdint.h | 9 +++--
 gcc/config/mingw/mingw32.h  | 6 +-
 gcc/config/mingw/winnt.cc   | 8 
 4 files changed, 21 insertions(+), 3 deletions(-)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 26564ead079..3aca257c322 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1278,6 +1278,7 @@ aarch64-*-mingw*)
esac
default_use_cxa_atexit=yes
user_headers_inc_next_post="${user_headers_inc_next_post} float.h"
+   tm_defines="${tm_defines} TARGET_AARCH64_MS_ABI=1"
;;
 aarch64*-wrs-vxworks*)
 tm_file="${tm_file} elfos.h aarch64/aarch64-elf.h"
diff --git a/gcc/config/mingw/mingw-stdint.h b/gcc/config/mingw/mingw-stdint.h
index c0feade76e9..debbe829bdf 100644
--- a/gcc/config/mingw/mingw-stdint.h
+++ b/gcc/config/mingw/mingw-stdint.h
@@ -46,5 +46,10 @@ along with GCC; see the file COPYING3.  If not see
 #define UINT_FAST32_TYPE "unsigned int"
 #define UINT_FAST64_TYPE "long long unsigned int"
 
-#define INTPTR_TYPE (TARGET_64BIT ? "long long int" : "int")
-#define UINTPTR_TYPE (TARGET_64BIT ? "long long unsigned int" : "unsigned int")
+#if defined (TARGET_AARCH64_MS_ABI)
+# define INTPTR_TYPE "long long int"
+# define UINTPTR_TYPE "long long unsigned int"
+#else
+# define INTPTR_TYPE (TARGET_64BIT ? "long long int" : "int")
+# define UINTPTR_TYPE (TARGET_64BIT ? "long long unsigned int" : "unsigned int")
+#endif
\ No newline at end of file
diff --git a/gcc/config/mingw/mingw32.h b/gcc/config/mingw/mingw32.h
index 58304fc55f6..040c3e1e521 100644
--- a/gcc/config/mingw/mingw32.h
+++ b/gcc/config/mingw/mingw32.h
@@ -19,7 +19,11 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #undef DEFAULT_ABI
-#define DEFAULT_ABI MS_ABI
+#if defined (TARGET_AARCH64_MS_ABI)
+# define DEFAULT_ABI AARCH64_CALLING_ABI_MS
+#else
+# define DEFAULT_ABI MS_ABI
+#endif
 
 /* By default, target has a 80387, uses IEEE compatible arithmetic,
returns float values in the 387 and needs stack probes.
diff --git a/gcc/config/mingw/winnt.cc b/gcc/config/mingw/winnt.cc
index 1ed383155d0..2a4fc03fc56 100644
--- a/gcc/config/mingw/winnt.cc
+++ b/gcc/config/mingw/winnt.cc
@@ -224,6 +224,8 @@ gen_stdcall_or_fastcall_suffix (tree decl, tree id, bool fastcall)
   return get_identifier (new_str);
 }
 
+#if !defined (TARGET_AARCH64_MS_ABI)
+
 /* Maybe decorate and get a new identifier for the DECL of a stdcall or
fastcall function. The original identifier is supplied in ID. */
 
@@ -250,6 +252,8 @@ i386_pe_maybe_mangle_decl_assembler_name (tree decl, tree id)
   return new_id;
 }
 
+#endif
+
 /* Emit an assembler directive to set symbol for DECL visibility to
the visibility type VIS, which must not be VISIBILITY_DEFAULT.
As for PE there is no hidden support in gas, we just warn for
@@ -266,6 +270,8 @@ i386_pe_assemble_visibility (tree decl, int)
  "in this configuration; ignored");
 }
 
+#if !defined (TARGET_AARCH64_MS_ABI)
+
 /* This is used as a target hook to modify the DECL_ASSEMBLER_NAME
in the language-independent default hook
langhooks,c:lhd_set_decl_assembler_name ()
@@ -278,6 +284,8 @@ i386_pe_mangle_decl_assembler_name (tree decl, tree id)
   return (new_id ? new_id : id);
 }
 
+#endif
+
 /* This hook behaves the same as varasm.cc/assemble_name(), but
generates the name into memory rather than outputting it to
a file stream.  */
-- 
2.25.1



[PATCH v2 05/13] Reuse MinGW from i386 for AArch64

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Fri, 1 Mar 2024 02:41:50 +0100
Subject: [PATCH v2 05/13] Reuse MinGW from i386 for AArch64

This patch creates a new config/mingw directory to share MinGW
related definitions, and moves there the corresponding existing files
from config/i386.

gcc/ChangeLog:

* config.gcc: Adjust targets after moving MinGW related files
from i386 to mingw folder.
* config/i386/cygming.opt: Move to...
* config/mingw/cygming.opt: ...here.
* config/i386/cygming.opt.urls: Move to...
* config/mingw/cygming.opt.urls: ...here.
* config/i386/cygwin-d.cc: Move to...
* config/mingw/cygwin-d.cc: ...here.
* config/i386/mingw-stdint.h: Move to...
* config/mingw/mingw-stdint.h: ...here.
* config/i386/mingw.opt: Move to...
* config/mingw/mingw.opt: ...here.
* config/i386/mingw.opt.urls: Move to...
* config/mingw/mingw.opt.urls: ...here.
* config/i386/mingw32.h: Move to...
* config/mingw/mingw32.h: ...here.
* config/i386/msformat-c.cc: Move to...
* config/mingw/msformat-c.cc: ...here.
* config/i386/t-cygming: Move to...
* config/mingw/t-cygming: ...here and updated.
* config/i386/winnt-cxx.cc: Move to...
* config/mingw/winnt-cxx.cc: ...here.
* config/i386/winnt-d.cc: Move to...
* config/mingw/winnt-d.cc: ...here.
* config/i386/winnt-stubs.cc: Move to...
* config/mingw/winnt-stubs.cc: ...here.
* config/i386/winnt.cc: Move to...
* config/mingw/winnt.cc: ...here.
---
 gcc/config.gcc  | 22 ++--
 gcc/config/{i386 => mingw}/cygming.opt  |  0
 gcc/config/{i386 => mingw}/cygming.opt.urls |  0
 gcc/config/{i386 => mingw}/cygwin-d.cc  |  0
 gcc/config/{i386 => mingw}/mingw-stdint.h   |  0
 gcc/config/{i386 => mingw}/mingw.opt|  0
 gcc/config/{i386 => mingw}/mingw.opt.urls   |  0
 gcc/config/{i386 => mingw}/mingw32.h|  0
 gcc/config/{i386 => mingw}/msformat-c.cc|  0
 gcc/config/{i386 => mingw}/t-cygming| 23 -
 gcc/config/{i386 => mingw}/winnt-cxx.cc |  0
 gcc/config/{i386 => mingw}/winnt-d.cc   |  0
 gcc/config/{i386 => mingw}/winnt-stubs.cc   |  0
 gcc/config/{i386 => mingw}/winnt.cc |  0
 14 files changed, 24 insertions(+), 21 deletions(-)
 rename gcc/config/{i386 => mingw}/cygming.opt (100%)
 rename gcc/config/{i386 => mingw}/cygming.opt.urls (100%)
 rename gcc/config/{i386 => mingw}/cygwin-d.cc (100%)
 rename gcc/config/{i386 => mingw}/mingw-stdint.h (100%)
 rename gcc/config/{i386 => mingw}/mingw.opt (100%)
 rename gcc/config/{i386 => mingw}/mingw.opt.urls (100%)
 rename gcc/config/{i386 => mingw}/mingw32.h (100%)
 rename gcc/config/{i386 => mingw}/msformat-c.cc (100%)
 rename gcc/config/{i386 => mingw}/t-cygming (73%)
 rename gcc/config/{i386 => mingw}/winnt-cxx.cc (100%)
 rename gcc/config/{i386 => mingw}/winnt-d.cc (100%)
 rename gcc/config/{i386 => mingw}/winnt-stubs.cc (100%)
 rename gcc/config/{i386 => mingw}/winnt.cc (100%)

diff --git a/gcc/config.gcc b/gcc/config.gcc
index cb6661f44ef..26564ead079 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2161,9 +2161,9 @@ i[4567]86-wrs-vxworks*|x86_64-wrs-vxworks7*)
 i[34567]86-*-cygwin*)
tm_file="${tm_file} i386/unix.h i386/bsd.h i386/gas.h i386/cygming.h i386/cygwin.h i386/cygwin-stdint.h"
xm_file=i386/xm-cygwin.h
-   tmake_file="${tmake_file} i386/t-cygming t-slibgcc"
-   target_gtfiles="$target_gtfiles \$(srcdir)/config/i386/winnt.cc"
-   extra_options="${extra_options} i386/cygming.opt i386/cygwin.opt"
+   tmake_file="${tmake_file} mingw/t-cygming t-slibgcc"
+   target_gtfiles="$target_gtfiles \$(srcdir)/config/mingw/winnt.cc"
+   extra_options="${extra_options} mingw/cygming.opt i386/cygwin.opt"
extra_objs="${extra_objs} winnt.o winnt-stubs.o"
c_target_objs="${c_target_objs} msformat-c.o"
cxx_target_objs="${cxx_target_objs} winnt-cxx.o msformat-c.o"
@@ -2179,9 +2179,9 @@ x86_64-*-cygwin*)
need_64bit_isa=yes
tm_file="${tm_file} i386/unix.h i386/bsd.h i386/gas.h i386/cygming.h i386/cygwin.h i386/cygwin-w64.h i386/cygwin-stdint.h"
xm_file=i386/xm-cygwin.h
-   tmake_file="${tmake_file} i386/t-cygming t-slibgcc"
-   target_gtfiles="$target_gtfiles \$(srcdir)/config/i386/winnt.cc"
-   extra_options="${extra_options} i386/cygming.opt i386/cygwin.opt"
+   tmake_file="${tmake_file} mingw/t-cygming t-slibgcc"
+   target_gtfiles="$target_gtfiles \$(srcdir)/config/mingw/winnt.cc"
+   extra_options="${extra_options} mingw/cygming.opt i386/cygwin.opt"
extra_objs="${extra_objs} winnt.o winnt-stubs.o"
c_target_objs="${c_target_objs} msformat-c.o"
cxx_target_objs="${cxx_target_objs} winnt-cxx.o msformat-c.o"
@@ -2217,7 +2217,7 @@ i[34567]86-*-mingw* | x86_64-*-mingw*)
if test x$enable_threads = xmcf 

[PATCH] bpf: add inline memset expansion

2024-03-04 Thread David Faust
Similar to memmove and memcpy, the BPF backend cannot fall back on a
library call to implement __builtin_memset, and should always expand
calls to it inline if possible.

This patch implements simple inline expansion of memset in the BPF
backend in a verifier-friendly way. Similar to memcpy and memmove, the
size must be an integer constant, as is also required by clang.

Tested for bpf-unknown-none target on x86_64-linux-gnu host.
Also tested against the kernel BPF verifier by compiling and loading a
test program using the inline memset expansion.
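
As a rough illustration of the store sequence described above, the
following standalone sketch plans the stores for a constant size and
alignment: full-width stores first, then 4-, 2- and 1-byte tail stores.
The names plan_memset_stores and struct store are hypothetical and exist
only for this sketch; it mirrors the shape of the expansion and is not
the backend code itself.

```c
#include <assert.h>
#include <stddef.h>

/* One planned store: byte offset into the destination and width in bytes.  */
struct store
{
  size_t offset;
  unsigned width;
};

/* Hypothetical helper, not part of GCC.  Record in PLAN (capacity CAP) the
   stores an expansion of this shape would emit for SIZE bytes when the
   destination alignment is ALIGN (1, 2, 4 or 8).  Return the store count.  */
static size_t
plan_memset_stores (size_t size, unsigned align, struct store *plan,
		    size_t cap)
{
  assert (align == 1 || align == 2 || align == 4 || align == 8);

  size_t n = 0;
  size_t offset = 0;
  size_t iters = size / align;      /* Number of full-width stores.  */
  size_t remainder = size % align;  /* Bytes left for the tail.  */

  for (size_t i = 0; i < iters; i++)
    {
      assert (n < cap);
      plan[n].offset = offset;
      plan[n].width = align;
      n++;
      offset += align;
    }

  /* The tail is at most align - 1 bytes, so each width is needed at most
     once.  */
  for (unsigned width = 4; width >= 1; width /= 2)
    if (remainder & width)
      {
	assert (n < cap);
	plan[n].offset = offset;
	plan[n].width = width;
	n++;
	offset += width;
      }

  return n;
}
```

For example, 20 bytes at 8-byte alignment would be planned as two 8-byte
stores followed by one 4-byte store at offset 16.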

gcc/
* config/bpf/bpf-protos.h (bpf_expand_setmem): New prototype.
* config/bpf/bpf.cc (bpf_expand_setmem): New.
* config/bpf/bpf.md (setmemdi): New define_expand.

gcc/testsuite/
* gcc.target/bpf/memset-1.c: New test.
---
 gcc/config/bpf/bpf-protos.h |  1 +
 gcc/config/bpf/bpf.cc   | 66 +
 gcc/config/bpf/bpf.md   | 17 +++
 gcc/testsuite/gcc.target/bpf/memset-1.c | 39 +++
 4 files changed, 123 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/bpf/memset-1.c

diff --git a/gcc/config/bpf/bpf-protos.h b/gcc/config/bpf/bpf-protos.h
index 366acb87ae4..ac0c2f4038f 100644
--- a/gcc/config/bpf/bpf-protos.h
+++ b/gcc/config/bpf/bpf-protos.h
@@ -36,5 +36,6 @@ class gimple_opt_pass;
 gimple_opt_pass *make_pass_lower_bpf_core (gcc::context *ctxt);
 
 bool bpf_expand_cpymem (rtx *, bool);
+bool bpf_expand_setmem (rtx *);
 
 #endif /* ! GCC_BPF_PROTOS_H */
diff --git a/gcc/config/bpf/bpf.cc b/gcc/config/bpf/bpf.cc
index 22b0cf2dc46..0e33f4347ba 100644
--- a/gcc/config/bpf/bpf.cc
+++ b/gcc/config/bpf/bpf.cc
@@ -1309,6 +1309,72 @@ bpf_expand_cpymem (rtx *operands, bool is_move)
   return true;
 }
 
+/* Expand setmem, as from __builtin_memset.
+   OPERANDS are the same as the setmem pattern.
+   Return true if the expansion was successful, false otherwise.  */
+
+bool
+bpf_expand_setmem (rtx *operands)
+{
+  /* Size must be constant for this expansion to work.  */
+  if (!CONST_INT_P (operands[1]))
+{
+  if (flag_building_libgcc)
+   warning (0, "could not inline call to %<__builtin_memset%>: "
+"size must be constant");
+  else
+   error ("could not inline call to %<__builtin_memset%>: "
+  "size must be constant");
+  return false;
+}
+
+  /* Alignment is a CONST_INT.  */
+  gcc_assert (CONST_INT_P (operands[3]));
+
+  rtx dst = operands[0];
+  rtx size = operands[1];
+  rtx val = operands[2];
+  unsigned HOST_WIDE_INT size_bytes = UINTVAL (size);
+  unsigned align = UINTVAL (operands[3]);
+  enum machine_mode mode;
+  switch (align)
+{
+case 1: mode = QImode; break;
+case 2: mode = HImode; break;
+case 4: mode = SImode; break;
+case 8: mode = DImode; break;
+default:
+  gcc_unreachable ();
+}
+
+  unsigned iters = size_bytes >> ceil_log2 (align);
+  unsigned remainder = size_bytes & (align - 1);
+  unsigned inc = GET_MODE_SIZE (mode);
+  unsigned offset = 0;
+
+  for (unsigned int i = 0; i < iters; i++)
+{
+  emit_move_insn (adjust_address (dst, mode, offset), val);
+  offset += inc;
+}
+  if (remainder & 4)
+{
+  emit_move_insn (adjust_address (dst, SImode, offset), val);
+  offset += 4;
+  remainder -= 4;
+}
+  if (remainder & 2)
+{
+  emit_move_insn (adjust_address (dst, HImode, offset), val);
+  offset += 2;
+  remainder -= 2;
+}
+  if (remainder & 1)
+emit_move_insn (adjust_address (dst, QImode, offset), val);
+
+  return true;
+}
+
 /* Finally, build the GCC target.  */
 
 struct gcc_target targetm = TARGET_INITIALIZER;
diff --git a/gcc/config/bpf/bpf.md b/gcc/config/bpf/bpf.md
index ca677bc6b50..ea688aadf91 100644
--- a/gcc/config/bpf/bpf.md
+++ b/gcc/config/bpf/bpf.md
@@ -663,4 +663,21 @@ (define_expand "movmemdi"
   FAIL;
 })
 
+;; memset
+;; 0 is dst
+;; 1 is length
+;; 2 is value
+;; 3 is alignment
+(define_expand "setmemdi"
+  [(set (match_operand:BLK 0 "memory_operand")
+   (match_operand:QI  2 "nonmemory_operand"))
+   (use (match_operand:DI  1 "general_operand"))
+   (match_operand 3 "immediate_operand")]
+ ""
+ {
+  if (bpf_expand_setmem (operands))
+DONE;
+  FAIL;
+})
+
 (include "atomic.md")
diff --git a/gcc/testsuite/gcc.target/bpf/memset-1.c b/gcc/testsuite/gcc.target/bpf/memset-1.c
new file mode 100644
index 000..9e9f8eff028
--- /dev/null
+++ b/gcc/testsuite/gcc.target/bpf/memset-1.c
@@ -0,0 +1,39 @@
+/* Ensure memset is expanded inline rather than emitting a libcall.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+struct context {
+ unsigned int data;
+ unsigned int data_end;
+ unsigned int data_meta;
+ unsigned int ingress;
+ unsigned int queue_index;
+ unsigned int egress;
+};
+
+void
+set_small (struct context *ctx)
+{
+  void *data = (void *)(long)ctx->data;
+  char *dest = data;
+  __builtin_memset (dest + 4, 0, sizeof (struct context) - 4);
+}
+
+void

[PATCH v2 06/13] Rename section and encoding functions from i386 which will be used in aarch64

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Tue, 20 Feb 2024 17:22:31 +0100
Subject: [PATCH v2 06/13] Rename section and encoding functions from i386
 which will be used in aarch64

gcc/ChangeLog:

* config/i386/cygming.h (SUBTARGET_ENCODE_SECTION_INFO):
Rename functions in mingw folder which will be reused for
aarch64.
(TARGET_ASM_UNIQUE_SECTION): Likewise.
(TARGET_ASM_NAMED_SECTION): Likewise.
(TARGET_SECTION_TYPE_FLAGS): Likewise.
(ASM_DECLARE_COLD_FUNCTION_NAME): Likewise.
(ASM_OUTPUT_EXTERNAL_LIBCALL): Likewise.
* config/i386/i386-protos.h (i386_pe_unique_section):
Rename into ...
(mingw_pe_unique_section): ... this.
(i386_pe_declare_function_type): Rename into ...
(mingw_pe_declare_function_type): ... this.
(i386_pe_encode_section_info): Rename into ...
(mingw_pe_encode_section_info): ... this.
(i386_pe_maybe_record_exported_symbol): Rename into ...
(mingw_pe_maybe_record_exported_symbol): ... this.
(i386_pe_section_type_flags): Rename into ...
(mingw_pe_section_type_flags): ... this.
(i386_pe_asm_named_section): Rename into ...
(mingw_pe_asm_named_section): ... this.
* config/mingw/winnt.cc (i386_pe_encode_section_info):
Rename into ...
(mingw_pe_encode_section_info): ... this.
(i386_pe_unique_section): Rename into ...
(mingw_pe_unique_section): ... this.
(i386_pe_section_type_flags): Rename into ...
(mingw_pe_section_type_flags): ... this.
(i386_pe_asm_named_section): Rename into ...
(mingw_pe_asm_named_section): ... this.
(i386_pe_asm_output_aligned_decl_common): Likewise.
(i386_pe_declare_function_type): Rename into ...
(mingw_pe_declare_function_type): ... this.
(i386_pe_maybe_record_exported_symbol): Rename into ...
(mingw_pe_maybe_record_exported_symbol): ... this.
(i386_pe_start_function): Likewise.
* varasm.cc (switch_to_comdat_section): Likewise.
---
 gcc/config/i386/cygming.h | 18 +-
 gcc/config/i386/i386-protos.h | 12 ++--
 gcc/config/mingw/winnt.cc | 22 +++---
 gcc/varasm.cc |  2 +-
 4 files changed, 27 insertions(+), 27 deletions(-)

diff --git a/gcc/config/i386/cygming.h b/gcc/config/i386/cygming.h
index 1af5bc380a5..beedf7c398a 100644
--- a/gcc/config/i386/cygming.h
+++ b/gcc/config/i386/cygming.h
@@ -219,7 +219,7 @@ do {
\
section and we need to set DECL_SECTION_NAME so we do that here.
Note that we can be called twice on the same decl.  */
 
-#define SUBTARGET_ENCODE_SECTION_INFO  i386_pe_encode_section_info
+#define SUBTARGET_ENCODE_SECTION_INFO  mingw_pe_encode_section_info
 
 /* Local and global relocs can be placed always into readonly memory
for PE-COFF targets.  */
@@ -235,7 +235,7 @@ do {
\
 #undef ASM_DECLARE_OBJECT_NAME
 #define ASM_DECLARE_OBJECT_NAME(STREAM, NAME, DECL)\
 do {   \
-  i386_pe_maybe_record_exported_symbol (DECL, NAME, 1);\
+  mingw_pe_maybe_record_exported_symbol (DECL, NAME, 1);   \
   ASM_OUTPUT_LABEL ((STREAM), (NAME)); \
 } while (0)
 
@@ -283,16 +283,16 @@ do {  \
 /* Windows uses explicit import from shared libraries.  */
 #define MULTIPLE_SYMBOL_SPACES 1
 
-#define TARGET_ASM_UNIQUE_SECTION i386_pe_unique_section
+#define TARGET_ASM_UNIQUE_SECTION mingw_pe_unique_section
 #define TARGET_ASM_FUNCTION_RODATA_SECTION default_no_function_rodata_section
 
 #define SUPPORTS_ONE_ONLY 1
 
 /* Switch into a generic section.  */
-#define TARGET_ASM_NAMED_SECTION  i386_pe_asm_named_section
+#define TARGET_ASM_NAMED_SECTION  mingw_pe_asm_named_section
 
 /* Select attributes for named sections.  */
-#define TARGET_SECTION_TYPE_FLAGS  i386_pe_section_type_flags
+#define TARGET_SECTION_TYPE_FLAGS  mingw_pe_section_type_flags
 
 /* Write the extra assembler code needed to declare a function
properly.  */
@@ -307,7 +307,7 @@ do {\
 #define ASM_DECLARE_COLD_FUNCTION_NAME(FILE, NAME, DECL)   \
   do   \
 {  \
-  i386_pe_declare_function_type (FILE, NAME, 0);   \
+  mingw_pe_declare_function_type (FILE, NAME, 0);  \
   i386_pe_seh_cold_init (FILE, NAME);  \
   ASM_OUTPUT_LABEL (FILE, NAME);   \
 }  \
@@ -333,7 +333,7 @@ do {\
 
 /* Declare the type properly for any external libcall.  */
 #define 

[PATCH v2 04/13] aarch64: Add aarch64-w64-mingw32 COFF

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Fri, 1 Mar 2024 01:55:47 +0100
Subject: [PATCH v2 04/13] aarch64: Add aarch64-w64-mingw32 COFF

Define ASM specific for COFF format on AArch64.

gcc/ChangeLog:

* config.gcc: Add COFF format support definitions.
* config/aarch64/aarch64-coff.h: New file.
---
 gcc/config.gcc|  1 +
 gcc/config/aarch64/aarch64-coff.h | 91 +++
 2 files changed, 92 insertions(+)
 create mode 100644 gcc/config/aarch64/aarch64-coff.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index b762393b64c..cb6661f44ef 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1266,6 +1266,7 @@ aarch64*-*-linux*)
;;
 aarch64-*-mingw*)
tm_file="${tm_file} aarch64/aarch64-abi-ms.h"
+   tm_file="${tm_file} aarch64/aarch64-coff.h"
tmake_file="${tmake_file} aarch64/t-aarch64"
case ${enable_threads} in
  "" | yes | win32)
diff --git a/gcc/config/aarch64/aarch64-coff.h b/gcc/config/aarch64/aarch64-coff.h
new file mode 100644
index 000..79c5a43b970
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-coff.h
@@ -0,0 +1,91 @@
+/* Machine description for AArch64 architecture.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published by
+   the Free Software Foundation; either version 3, or (at your option)
+   any later version.
+
+   GCC is distributed in the hope that it will be useful, but
+   WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   General Public License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_AARCH64_COFF_H
+#define GCC_AARCH64_COFF_H
+
+#include "aarch64.h"
+
+#ifndef LOCAL_LABEL_PREFIX
+# define LOCAL_LABEL_PREFIX""
+#endif
+
+/* Using long long breaks -ansi and -std=c90, so these will need to be
+   made conditional for an LLP64 ABI.  */
+#undef SIZE_TYPE
+#define SIZE_TYPE  "long long unsigned int"
+
+#undef PTRDIFF_TYPE
+#define PTRDIFF_TYPE   "long long int"
+
+#undef LONG_TYPE_SIZE
+#define LONG_TYPE_SIZE 32
+
+#ifndef ASM_GENERATE_INTERNAL_LABEL
+# define ASM_GENERATE_INTERNAL_LABEL(STRING, PREFIX, NUM)  \
+  sprintf (STRING, "*%s%s%u", LOCAL_LABEL_PREFIX, PREFIX, (unsigned int)(NUM))
+#endif
+
+#define ASM_OUTPUT_ALIGN(STREAM, POWER)\
+  fprintf (STREAM, "\t.align\t%d\n", (int)POWER)
+
+/* Output a common block.  */
+#ifndef ASM_OUTPUT_COMMON
+# define ASM_OUTPUT_COMMON(STREAM, NAME, SIZE, ROUNDED)\
+{  \
+  fprintf (STREAM, "\t.comm\t");   \
+  assemble_name (STREAM, NAME);\
+  asm_fprintf (STREAM, ", %d, %d\n",   \
+  (int)(ROUNDED), (int)(SIZE));\
+}
+#endif
+
+/* Output a local common block.  /bin/as can't do this, so hack a
+   `.space' into the bss segment.  Note that this is *bad* practice,
+   which is guaranteed NOT to work since it doesn't define STATIC
+   COMMON space but merely STATIC BSS space.  */
+#ifndef ASM_OUTPUT_ALIGNED_LOCAL
+# define ASM_OUTPUT_ALIGNED_LOCAL(STREAM, NAME, SIZE, ALIGN)   \
+{  \
+  switch_to_section (bss_section); \
+  ASM_OUTPUT_ALIGN (STREAM, floor_log2 (ALIGN / BITS_PER_UNIT));   \
+  ASM_OUTPUT_LABEL (STREAM, NAME); \
+  fprintf (STREAM, "\t.space\t%d\n", (int)(SIZE)); \
+}
+#endif
+
+#define ASM_OUTPUT_SKIP(STREAM, NBYTES)\
+  fprintf (STREAM, "\t.space\t%d  // skip\n", (int) (NBYTES))
+
+#define ASM_OUTPUT_TYPE_DIRECTIVE(STREAM, NAME, TYPE)
+#define ASM_DECLARE_FUNCTION_SIZE(FILE, FNAME, DECL)
+
+#define TEXT_SECTION_ASM_OP"\t.text"
+#define DATA_SECTION_ASM_OP"\t.data"
+#define BSS_SECTION_ASM_OP "\t.bss"
+
+#define CTORS_SECTION_ASM_OP   "\t.section\t.ctors, \"aw\""
+#define DTORS_SECTION_ASM_OP   "\t.section\t.dtors, \"aw\""
+
+#define GLOBAL_ASM_OP "\t.global\t"
+
+#undef SUPPORTS_INIT_PRIORITY
+#define SUPPORTS_INIT_PRIORITY 0
+
+#endif
-- 
2.25.1



[PATCH v2 03/13] aarch64: Mark x18 register as a fixed register for MS ABI

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Fri, 1 Mar 2024 09:56:59 +0100
Subject: [PATCH v2 03/13] aarch64: Mark x18 register as a fixed register for
 MS ABI

Define the MS ABI for aarch64-w64-mingw32.
Adjust FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
STATIC_CHAIN_REGNUM for AArch64 MS ABI.
The X18 register is reserved on Windows for the TEB.
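
A small sketch of how the general-purpose part of the table reads.  The
array below copies the first four rows (R0 - R30 and SP) of the
FIXED_REGISTERS initializer from this patch; the names fixed_gp_regs and
gp_reg_is_fixed are hypothetical and used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* First 32 entries of the FIXED_REGISTERS initializer in this patch:
   R0 - R30 followed by SP.  A 1 marks a register the allocator must not
   use.  */
static const unsigned char fixed_gp_regs[32] = {
  0, 0, 0, 0,   0, 0, 0, 0,	/* R0 - R7.  */
  0, 0, 0, 0,   0, 0, 0, 0,	/* R8 - R15.  */
  0, 0, 1, 0,   0, 0, 0, 0,	/* R16 - R23; R18 is fixed.  */
  0, 0, 0, 0,   0, 1, 0, 1	/* R24 - R30, SP.  */
};

/* Hypothetical helper: return true if general register REGNO is fixed.  */
static bool
gp_reg_is_fixed (unsigned regno)
{
  assert (regno < 32);
  return fixed_gp_regs[regno] != 0;
}
```

With this table, index 18 (X18) is fixed along with R29 and SP, while
R17, which the patch uses as STATIC_CHAIN_REGNUM, remains allocatable.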

gcc/ChangeLog:

* config.gcc: Define TARGET_AARCH64_MS_ABI when
AArch64 MS ABI is used.
* config/aarch64/aarch64-abi-ms.h: New file. Adjust
FIXED_REGISTERS, CALL_REALLY_USED_REGISTERS and
STATIC_CHAIN_REGNUM for AArch64 MS ABI.
---
 gcc/config.gcc  |  1 +
 gcc/config/aarch64/aarch64-abi-ms.h | 64 +
 2 files changed, 65 insertions(+)
 create mode 100644 gcc/config/aarch64/aarch64-abi-ms.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 2756377e50b..b762393b64c 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1265,6 +1265,7 @@ aarch64*-*-linux*)
TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
;;
 aarch64-*-mingw*)
+   tm_file="${tm_file} aarch64/aarch64-abi-ms.h"
tmake_file="${tmake_file} aarch64/t-aarch64"
case ${enable_threads} in
  "" | yes | win32)
diff --git a/gcc/config/aarch64/aarch64-abi-ms.h b/gcc/config/aarch64/aarch64-abi-ms.h
new file mode 100644
index 000..90b0dcc5edf
--- /dev/null
+++ b/gcc/config/aarch64/aarch64-abi-ms.h
@@ -0,0 +1,64 @@
+/* Machine description for AArch64 MS ABI.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_AARCH64_ABI_MS_H
+#define GCC_AARCH64_ABI_MS_H
+
+/* X18 reserved for the TEB on Windows.  */
+
+#undef FIXED_REGISTERS
+#define FIXED_REGISTERS\
+  {\
+0, 0, 0, 0,   0, 0, 0, 0,  /* R0 - R7.  */ \
+0, 0, 0, 0,   0, 0, 0, 0,  /* R8 - R15.  */\
+0, 0, 1, 0,   0, 0, 0, 0,  /* R16 - R23.  */   \
+0, 0, 0, 0,   0, 1, 0, 1,  /* R24 - R30, SP.  */   \
+0, 0, 0, 0,   0, 0, 0, 0,  /* V0 - V7.  */ \
+0, 0, 0, 0,   0, 0, 0, 0,   /* V8 - V15.  */   \
+0, 0, 0, 0,   0, 0, 0, 0,   /* V16 - V23.  */  \
+0, 0, 0, 0,   0, 0, 0, 0,   /* V24 - V31.  */  \
+1, 1, 1, 1,/* SFP, AP, CC, VG.  */ \
+0, 0, 0, 0,   0, 0, 0, 0,  /* P0 - P7.  */ \
+0, 0, 0, 0,   0, 0, 0, 0,   /* P8 - P15.  */   \
+1, 1,  /* FFR and FFRT.  */\
+1, 1, 1, 1, 1, 1, 1, 1 /* Fake registers.  */  \
+  }
+
+#undef CALL_REALLY_USED_REGISTERS
+#define CALL_REALLY_USED_REGISTERS \
+  {\
+1, 1, 1, 1,   1, 1, 1, 1,  /* R0 - R7.  */ \
+1, 1, 1, 1,   1, 1, 1, 1,  /* R8 - R15.  */\
+1, 1, 0, 0,   0, 0, 0, 0,   /* R16 - R23.  */  \
+0, 0, 0, 0,   0, 1, 1, 1,  /* R24 - R30, SP.  */   \
+1, 1, 1, 1,   1, 1, 1, 1,  /* V0 - V7.  */ \
+0, 0, 0, 0,   0, 0, 0, 0,  /* V8 - V15.  */\
+1, 1, 1, 1,   1, 1, 1, 1,   /* V16 - V23.  */  \
+1, 1, 1, 1,   1, 1, 1, 1,   /* V24 - V31.  */  \
+1, 1, 1, 0,/* SFP, AP, CC, VG.  */ \
+1, 1, 1, 1,   1, 1, 1, 1,  /* P0 - P7.  */ \
+1, 1, 1, 1,   1, 1, 1, 1,  /* P8 - P15.  */\
+1, 1,  /* FFR and FFRT.  */\
+0, 0, 0, 0, 0, 0, 0, 0 /* Fake registers.  */  \
+  }
+
+#undef  STATIC_CHAIN_REGNUM
+#define STATIC_CHAIN_REGNUM R17_REGNUM
+
+#endif /* GCC_AARCH64_ABI_MS_H.  */
-- 
2.25.1



[PATCH v2 02/13] aarch64: The aarch64-w64-mingw32 target implements

2024-03-04 Thread Evgeny Karpov
From: Zac Walker 
Date: Fri, 1 Mar 2024 01:45:13 +0100
Subject: [PATCH v2 02/13] aarch64: The aarch64-w64-mingw32 target implements
 the MS ABI

Two ABIs for aarch64 have been defined for different platforms.

gcc/ChangeLog:

* config/aarch64/aarch64-opts.h (enum aarch64_calling_abi):
Define two ABIs.
---
 gcc/config/aarch64/aarch64-opts.h | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/config/aarch64/aarch64-opts.h b/gcc/config/aarch64/aarch64-opts.h
index a05c0d3ded1..52c9e4596d6 100644
--- a/gcc/config/aarch64/aarch64-opts.h
+++ b/gcc/config/aarch64/aarch64-opts.h
@@ -131,4 +131,11 @@ enum aarch64_early_ra_scope {
   AARCH64_EARLY_RA_NONE
 };
 
+/* Available call ABIs.  */
+enum aarch64_calling_abi
+{
+  AARCH64_CALLING_ABI_EABI,
+  AARCH64_CALLING_ABI_MS
+};
+
 #endif
-- 
2.25.1



[PATCH v2 01/13] Introduce aarch64-w64-mingw32 target

2024-03-04 Thread Evgeny Karpov
From 38efaf5ab1fa017622d10239fff2ca23d2d3fb82 Mon Sep 17 00:00:00 2001
From: Zac Walker 
Date: Fri, 1 Mar 2024 01:40:53 +0100
Subject: [PATCH v2 01/13] Introduce aarch64-w64-mingw32 target

Add the initial aarch64-w64-mingw32 target for gcc.

fixincludes/ChangeLog:

* mkfixinc.sh: Extend for *-mingw32* targets.

gcc/ChangeLog:

* config.gcc: Add aarch64-w64-mingw32 target.
---
 fixincludes/mkfixinc.sh |  3 +--
 gcc/config.gcc  | 13 +
 2 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/fixincludes/mkfixinc.sh b/fixincludes/mkfixinc.sh
index df90720b716..7112f4dcd64 100755
--- a/fixincludes/mkfixinc.sh
+++ b/fixincludes/mkfixinc.sh
@@ -12,8 +12,7 @@ target=fixinc.sh
 # Check for special fix rules for particular targets
 case $machine in
 i?86-*-cygwin* | \
-i?86-*-mingw32* | \
-x86_64-*-mingw32* | \
+*-mingw32* | \
 powerpc-*-eabisim* | \
 powerpc-*-eabi*| \
 powerpc-*-rtems*   | \
diff --git a/gcc/config.gcc b/gcc/config.gcc
index a1480b72c46..2756377e50b 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -1264,6 +1264,19 @@ aarch64*-*-linux*)
done
TM_MULTILIB_CONFIG=`echo $TM_MULTILIB_CONFIG | sed 's/^,//'`
;;
+aarch64-*-mingw*)
+   tmake_file="${tmake_file} aarch64/t-aarch64"
+   case ${enable_threads} in
+ "" | yes | win32)
+   thread_file='win32'
+   ;;
+ posix)
+   thread_file='posix'
+   ;;
+   esac
+   default_use_cxa_atexit=yes
+   user_headers_inc_next_post="${user_headers_inc_next_post} float.h"
+   ;;
 aarch64*-wrs-vxworks*)
 tm_file="${tm_file} elfos.h aarch64/aarch64-elf.h"
 tm_file="${tm_file} vx-common.h vxworks.h aarch64/aarch64-vxworks.h"
-- 
2.25.1



[PATCH v2 00/13] Add aarch64-w64-mingw32 target

2024-03-04 Thread Evgeny Karpov
Hello,

v2 is ready for the review!
Based on the v1 review: 
https://gcc.gnu.org/pipermail/gcc-patches/2024-February/thread.html#646203

Testing for the x86_64-w64-mingw32 target is in progress to avoid
regression due to refactoring.

Regards,
Evgeny


Changes from v1 to v2:
Adjust the target name to aarch64-*-mingw* to exclude the big-endian
target from support.
Exclude 64-bit ISA.
Rename enum calling_abi to aarch64_calling_abi.
Move AArch64 MS ABI definitions FIXED_REGISTERS,
CALL_REALLY_USED_REGISTERS, and STATIC_CHAIN_REGNUM from aarch64.h 
to aarch64-abi-ms.h.
Rename TARGET_ARM64_MS_ABI to TARGET_AARCH64_MS_ABI.
Exclude TARGET_64BIT from the aarch64 target.
Exclude HAVE_GAS_WEAK.
Set HAVE_GAS_ALIGNED_COMM to 1 by default.
Use a reference from "x86 Windows Options" to 
"Cygwin and MinGW Options".
Update commit descriptions to follow standard style.
Rebase from 4th March 2024.



[PATCH] combine: Fix recent WORD_REGISTER_OPERATIONS check [PR113010]

2024-03-04 Thread Jakub Jelinek
On Mon, Mar 04, 2024 at 05:18:39PM +0100, Rainer Orth wrote:
> > On 2/26/24 17:17, Greg McGary wrote:
> >> The sign-bit-copies of a sign-extending load cannot be known until runtime 
> >> on
> >> WORD_REGISTER_OPERATIONS targets, except in the case of a zero-extending 
> >> MEM
> >> load.  See the fix for PR112758.
> >> 2024-02-22  Greg McGary  
> >>  PR rtl-optimization/113010
> >>* combine.cc (simplify_comparison): Simplify a SUBREG on
> >>  WORD_REGISTER_OPERATIONS targets only if it is a zero-extending
> >>  MEM load.
> >>* gcc.c-torture/execute/pr113010.c: New test.
> > I think this is fine for the trunk.  I'll do some final testing on it
> > tomorrow.
> 
> unfortunately, the patch broke Solaris/SPARC bootstrap
> (sparc-sun-solaris2.11):
> 
> /vol/gcc/src/hg/master/local/gcc/combine.cc: In function 'rtx_code 
> simplify_comparison(rtx_code, rtx_def**, rtx_def**)':
> /vol/gcc/src/hg/master/local/gcc/combine.cc:12101:25: error: '*(unsigned 
> int*)((char*)&inner_mode + offsetof(scalar_int_mode, 
> scalar_int_mode::m_mode))' may be used uninitialized 
> [-Werror=maybe-uninitialized]
> 12101 |   scalar_int_mode mode, inner_mode, tmode;
>   | ^~

I don't see how it could ever work properly, inner_mode in that spot is
just uninitialized.

I think we shouldn't worry about paradoxical subregs of non-scalar_int_mode
REGs/MEMs and for the scalar_int_mode ones should initialize inner_mode
before we use it.
Another option would be to use
maybe_lt (GET_MODE_PRECISION (GET_MODE (SUBREG_REG (op0))), BITS_PER_WORD)
and
load_extend_op (GET_MODE (SUBREG_REG (op0))) == ZERO_EXTEND,
or set machine_mode smode = GET_MODE (SUBREG_REG (op0)); and use it in
those two spots.
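
The idea of the first approach, reduced to a standalone sketch: a
predicate that both tests the mode and fills in an out-parameter is
placed ahead of every use of that out-parameter in the && chain, so the
variable is never read unless it has been written.  The types and names
below (struct mode, as_scalar_int, WORD_BITS) are hypothetical stand-ins,
not the GCC machine-mode API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for machine modes.  */
enum mode_class { CLASS_INT, CLASS_FLOAT };

struct mode
{
  enum mode_class cls;
  unsigned precision;
};

#define WORD_BITS 64u	/* Assumed word size for this sketch.  */

/* Return true and store M in *OUT if M is a scalar integer mode;
   otherwise leave *OUT untouched and return false.  */
static bool
as_scalar_int (struct mode m, struct mode *out)
{
  assert (out);
  if (m.cls != CLASS_INT)
    return false;
  *out = m;
  return true;
}

/* INNER starts out uninitialized.  Because && short-circuits, the
   precision test only runs after as_scalar_int has written INNER.  */
static bool
narrower_than_word (struct mode subreg_reg_mode)
{
  struct mode inner;
  return as_scalar_int (subreg_reg_mode, &inner)
	 && inner.precision < WORD_BITS;
}
```

Non-integer modes simply fail the first test, which matches the decision
above not to handle paradoxical subregs of non-scalar_int_mode operands.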

2024-03-04  Jakub Jelinek  

PR rtl-optimization/113010
* combine.cc (simplify_comparison): Guard the
WORD_REGISTER_OPERATIONS check on scalar_int_mode of SUBREG_REG
and initialize inner_mode.

--- gcc/combine.cc.jj   2024-03-04 10:01:21.054937316 +0100
+++ gcc/combine.cc  2024-03-04 17:40:51.556052647 +0100
@@ -12554,6 +12554,8 @@ simplify_comparison (enum rtx_code code,
  if (paradoxical_subreg_p (op0))
{
  if (WORD_REGISTER_OPERATIONS
+ && is_a <scalar_int_mode> (GET_MODE (SUBREG_REG (op0)),
+    &inner_mode)
  && GET_MODE_PRECISION (inner_mode) < BITS_PER_WORD
  /* On WORD_REGISTER_OPERATIONS targets the bits
 beyond sub_mode aren't considered undefined,


Jakub



Re: [PATCH v5] c++: implement [[gnu::non_owning]] [PR110358]

2024-03-04 Thread Marek Polacek
On Mon, Mar 04, 2024 at 11:00:18AM +0000, Jonathan Wakely wrote:
> On 01/03/24 15:38 -0500, Jason Merrill wrote:
> > On 3/1/24 14:24, Marek Polacek wrote:
> > > +@smallexample
> > > +template <typename T>
> > > +[[gnu::no_dangling(std::is_reference_v<T>)]] int foo (T& t) @{
> > 
> > I think this function should return a reference.
> 
> The condition in the attribute can only ever be true if you call this
> function with an explicit template argument list: foo<int&>(i). Is
> that intentional?

Not intentional.  I just wanted to make it clear that the user
can use something like std::is_reference as the attribute argument,
but I didn't think about it very long.
 
> And if T is non-const it can't be called with a temporary and so
> dangling seems less of a problem for this function anyway, right?

Right.
 
> Would it make more sense as something like this?
> 
> template <typename T>
> [[gnu::no_dangling(std::is_lvalue_reference_v<T>)]]
> decltype(auto) foo(T&& t) {
>   ...
> }
> 
> Or is this getting too complex/subtle for a simple example?

I like your example; it's only slightly more complex than the
original one and most likely more realistic.  I'm pushing the
following patch.  Thanks!

[pushed] doc: update [[gnu::no_dangling]]

...to offer a more realistic example.

gcc/ChangeLog:

* doc/extend.texi: Update [[gnu::no_dangling]].
---
 gcc/doc/extend.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index f679c81acf2..df0982fdfda 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -29370,7 +29370,8 @@ Or:
 
 @smallexample
 template <typename T>
-[[gnu::no_dangling(std::is_reference_v<T>)]] int& foo (T& t) @{
+[[gnu::no_dangling(std::is_lvalue_reference_v<T>)]]
+decltype(auto) foo(T&& t) @{
   @dots{}
 @};
 @end smallexample

base-commit: 77eb86be8841989651b3150a020dd1a95910cc00
-- 
2.44.0



Re: [PATCH] combine: Don't simplify paradoxical SUBREG on WORD_REGISTER_OPERATIONS [PR113010]

2024-03-04 Thread Rainer Orth
Hi Jeff,

> On 2/26/24 17:17, Greg McGary wrote:
>> The sign-bit-copies of a sign-extending load cannot be known until runtime on
>> WORD_REGISTER_OPERATIONS targets, except in the case of a zero-extending MEM
>> load.  See the fix for PR112758.
>> 2024-02-22  Greg McGary  
>>  PR rtl-optimization/113010
>>  * combine.cc (simplify_comparison): Simplify a SUBREG on
>>WORD_REGISTER_OPERATIONS targets only if it is a zero-extending
>>MEM load.
>>  * gcc.c-torture/execute/pr113010.c: New test.
> I think this is fine for the trunk.  I'll do some final testing on it
> tomorrow.

unfortunately, the patch broke Solaris/SPARC bootstrap
(sparc-sun-solaris2.11):

/vol/gcc/src/hg/master/local/gcc/combine.cc: In function 'rtx_code simplify_comparison(rtx_code, rtx_def**, rtx_def**)':
/vol/gcc/src/hg/master/local/gcc/combine.cc:12101:25: error: '*(unsigned int*)((char*)&inner_mode + offsetof(scalar_int_mode, scalar_int_mode::m_mode))' may be used uninitialized [-Werror=maybe-uninitialized]
12101 |   scalar_int_mode mode, inner_mode, tmode;
  | ^~

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[PATCH, OpenACC 2.7, v2] Adjust acc_map_data/acc_unmap_data interaction with reference counters

2024-03-04 Thread Chung-Lin Tang
Hi Thomas,

On 2023/10/31 11:06 PM, Thomas Schwinge wrote:
> A few comments, should be easy to work in:

>> @@ -460,7 +461,7 @@ acc_unmap_data (void *h)
>>   the different 'REFCOUNT_INFINITY' cases, or simply separate
>>   'REFCOUNT_INFINITY' values per different usage ('REFCOUNT_ACC_MAP_DATA'
>>   etc.)?  */
>> -  else if (n->refcount != REFCOUNT_INFINITY)
>> +  else if (n->refcount != REFCOUNT_ACC_MAP_DATA)
>>  {
>>gomp_mutex_unlock (&acc_dev->lock);
>>gomp_fatal ("refusing to unmap block [%p,+%d] that has not been mapped"
> 
> Thus remove the TODO comment before this 'else if' block?  :-)

Removed TODO block, comments added below around new assert.

> We should add a comment here that we're unmapping without consideration
> of 'n->dynamic_refcount' (that is, 'acc_unmap_data' has implicit
> 'finalize' semantics -- at least per my reading of the specification; do
> you agree?), that is:
> 
> acc_map_data([var]); // 'dynamic_refcount = 1'
> acc_copyin([var]); // 'dynamic_refcount++'
> acc_unmap_data([var]); // does un-map, despite 'dynamic_refcount == 2'?
> assert (!acc_is_present([var]));
> 
> Do we have such a test case?  If not, please add one.

I've added a new testsuite/libgomp.oacc-c-c++-common/lib-96.c testcase for this.

> To complement 'goacc_exit_datum_1' (see below), we should add here:
> 
> assert (n->dynamic_refcount >= 1);

Added.

> 
> The subsequent code:
> 
> if (tgt->refcount == REFCOUNT_INFINITY)
>   {
> gomp_mutex_unlock (&acc_dev->lock);
> gomp_fatal ("cannot unmap target block");
>   }
> 
> ... is now unreachable, I think, and may thus be removed -- and any
> inconsistency is caught by the subsequent:
> 
> /* Above, we've verified that the mapping must have been set up by
>'acc_map_data'.  */
> assert (tgt->refcount == 1);

Removed the 'if (tgt->refcount == REFCOUNT_INFINITY)' block.

>> @@ -691,15 +694,27 @@ goacc_exit_datum_1 (struct gomp_device_descr *acc_dev, 
>> void *h, size_t s,
>>
>>if (finalize)
>>  {
>> -  if (n->refcount != REFCOUNT_INFINITY)
>> +  if (n->refcount != REFCOUNT_INFINITY
>> +   && n->refcount != REFCOUNT_ACC_MAP_DATA)
>>   n->refcount -= n->dynamic_refcount;
>> -  n->dynamic_refcount = 0;
>> +
>> +  if (n->refcount == REFCOUNT_ACC_MAP_DATA)
>> + /* Mappings created by acc_map_data are returned to initial
>> +dynamic_refcount of 1. Can only be deleted by acc_unmap_data.  */
>> + n->dynamic_refcount = 1;
>> +  else
>> + n->dynamic_refcount = 0;
>>  }
>>else if (n->dynamic_refcount)
>>  {
>> -  if (n->refcount != REFCOUNT_INFINITY)
>> +  if (n->refcount != REFCOUNT_INFINITY
>> +   && n->refcount != REFCOUNT_ACC_MAP_DATA)
>>   n->refcount--;
>> -  n->dynamic_refcount--;
>> +
>> +  /* When mapping is created by acc_map_data, dynamic_refcount must be
>> +  maintained at >= 1.  */
>> +  if (n->refcount != REFCOUNT_ACC_MAP_DATA || n->dynamic_refcount > 1)
>> + n->dynamic_refcount--;
>>  }
> 
> I'd find those changes more concise to understand if done the following
> way: restore both 'if (finalize)' and 'else if (n->dynamic_refcount)'
> branches to their original form (other than excluding 'n->refcount'
> modification for 'REFCOUNT_ACC_MAP_DATA', as you have), and instead then
> afterwards (that is, here), do:
> 
> /* Mappings created by 'acc_map_data' can only be deleted by 
> 'acc_unmap_data'.  */
> if (n->refcount == REFCOUNT_ACC_MAP_DATA
> && n->dynamic_refcount == 0)
>   n->dynamic_refcount = 1;
> 
> That does have the same semantics, please verify?

This does not have the same semantics, because if the original
finalize/n->dynamic_refcount cases are left unmodified, they will treat
REFCOUNT_ACC_MAP_DATA like a normal refcount and decrement n->refcount,
and handling n->refcount == REFCOUNT_ACC_MAP_DATA later won't work either.

I have, however, adjusted the nesting of cases to split the
'n->refcount == REFCOUNT_ACC_MAP_DATA' case away. This should be easier
to read.
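
A minimal sketch of that split, modelled on the quoted 'goacc_exit_datum_1' hunk rather than on the final patch: the `toy_*` names and the two sentinel values are hypothetical stand-ins, and the real libgomp code also handles locking and the actual unmapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel values; the real ones are defined in libgomp.  */
#define TOY_REFCOUNT_INFINITY     (~(uintptr_t) 0)
#define TOY_REFCOUNT_ACC_MAP_DATA (~(uintptr_t) 1)

/* Toy model of the two counters kept per mapping.  */
struct toy_key
{
  uintptr_t refcount;
  uintptr_t dynamic_refcount;
};

static void
toy_exit_datum (struct toy_key *n, bool finalize)
{
  assert (n != 0);

  if (n->refcount == TOY_REFCOUNT_ACC_MAP_DATA)
    {
      /* Mappings set up by acc_map_data: refcount is never modified and
         dynamic_refcount is kept at >= 1, so only acc_unmap_data can
         remove the mapping.  */
      if (finalize)
        n->dynamic_refcount = 1;
      else if (n->dynamic_refcount > 1)
        n->dynamic_refcount--;
      return;
    }

  if (finalize)
    {
      if (n->refcount != TOY_REFCOUNT_INFINITY)
        n->refcount -= n->dynamic_refcount;
      n->dynamic_refcount = 0;
    }
  else if (n->dynamic_refcount)
    {
      if (n->refcount != TOY_REFCOUNT_INFINITY)
        n->refcount--;
      n->dynamic_refcount--;
    }
}
```

Handling the 'acc_map_data' case first leaves the remaining two branches in their original form, which is the readability point under discussion.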

>> @@ -480,7 +480,9 @@ gomp_increment_refcount (splay_tree_key k, htab_t 
>> *refcount_set)
>>
>>uintptr_t *refcount_ptr = &k->refcount;
>>
>> -  if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount))
>> +  if (k->refcount == REFCOUNT_ACC_MAP_DATA)
>> +refcount_ptr = &k->dynamic_refcount;
>> +  else if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount))
>>  refcount_ptr = &k->structelem_refcount;
>>else if (REFCOUNT_STRUCTELEM_P (k->refcount))
>>  refcount_ptr = k->structelem_refcount_ptr;
>> @@ -527,7 +529,9 @@ gomp_decrement_refcount (splay_tree_key k, htab_t 
>> *refcount_set, bool delete_p,
>>
>>uintptr_t *refcount_ptr = &k->refcount;
>>
>> -  if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount))
>> +  if (k->refcount == REFCOUNT_ACC_MAP_DATA)
>> +refcount_ptr = &k->dynamic_refcount;
>> +  else if (REFCOUNT_STRUCTELEM_FIRST_P (k->refcount))
>>  refcount_ptr = &k->structelem_refcount;

Re: [PATCH] fwprop: Avoid volatile defines to be propagated

2024-03-04 Thread Jeff Law




On 3/4/24 02:12, HAO CHEN GUI wrote:

Hi Jeff,

On 2024/3/4 11:37, Jeff Law wrote:

Can the same thing happen with a volatile memory load?  I don't think that will 
be caught by the volatile_insn_p check.


Yes, I think so. If the define rtx contains volatile memory references, it
may hit the same problem. We may use volatile_refs_p instead of
volatile_insn_p?

Yea.  OK with that change.

Thanks,
jeff



Re: [PATCH] vect: Fix integer overflow calculating mask

2024-03-04 Thread Jakub Jelinek
On Mon, Mar 04, 2024 at 03:30:01PM +, Andrew Stubbs wrote:
> vect: Fix integer overflow calculating mask
> 
> The masks and bitvectors were broken when nunits==32 on hosts where int is
> 32-bit.
> 
> gcc/ChangeLog:
> 
>   * dojump.cc (do_compare_and_jump): Use full-width integers for shifts.
>   * expr.cc (store_constructor): Likewise.
>   (do_store_flag): Likewise.

LGTM, thanks.

Jakub



Re: [PATCH] vect: Fix integer overflow calculating mask

2024-03-04 Thread Andrew Stubbs

On 23/02/2024 15:13, Richard Biener wrote:

On Fri, 23 Feb 2024, Jakub Jelinek wrote:


On Fri, Feb 23, 2024 at 02:22:19PM +, Andrew Stubbs wrote:

On 23/02/2024 13:02, Jakub Jelinek wrote:

On Fri, Feb 23, 2024 at 12:58:53PM +, Andrew Stubbs wrote:

This is a follow-up to the previous patch to ensure that integer vector
bit-masks do not have excess bits set. It fixes a bug, observed on
amdgcn, in which the mask could be incorrectly set to zero, resulting in
wrong-code.

The mask was broken when nunits==32. The patched version will probably
be broken for nunits==64, but I don't think any current targets have
masks with more than 64 bits.

OK for mainline?

Andrew

gcc/ChangeLog:

* expr.cc (store_constructor): Use 64-bit shifts.


No, this isn't 64-bit shift on all hosts.
Use HOST_WIDE_INT_1U instead.
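
To make the width issue concrete, here is a small sketch in plain C. It is not GCC code: `low_bits_mask` is a hypothetical helper, and `uint64_t` stands in for the type of HOST_WIDE_INT_1U (assumed here to be 64 bits wide). Shifting a plain `int` constant by 32 is undefined where int is 32-bit, whereas a 64-bit unsigned one works for every count below 64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC function: return a mask with the low
   NUNITS bits set, for 1 <= NUNITS <= 64.  */
static uint64_t
low_bits_mask (unsigned nunits)
{
  assert (nunits >= 1 && nunits <= 64);

  /* A shift by the full width of the type is undefined, so the 64-bit
     case needs separate handling.  */
  if (nunits == 64)
    return UINT64_MAX;

  /* The constant is 64 bits wide, so nunits == 32 is well defined.  */
  return (UINT64_C (1) << nunits) - 1;
}
```

The explicit branch for 64 goes beyond what the patch does; it is included only because the thread notes that the patched expression would still misbehave for nunits==64.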


OK, I did wonder if there was a proper way to do it. :)

How about this?


If you change the other two GEN_INT ((1 << nunits) - 1) occurrences in
expr.cc the same way, then LGTM.


There's also two in dojump.cc


This patch should fix all the cases, I think.

I have not observed any further test result changes.

OK?

Andrew
vect: Fix integer overflow calculating mask

The masks and bitvectors were broken when nunits==32 on hosts where int is
32-bit.

gcc/ChangeLog:

* dojump.cc (do_compare_and_jump): Use full-width integers for shifts.
* expr.cc (store_constructor): Likewise.
(do_store_flag): Likewise.

diff --git a/gcc/dojump.cc b/gcc/dojump.cc
index ac744e54cf8..88600cb42d3 100644
--- a/gcc/dojump.cc
+++ b/gcc/dojump.cc
@@ -1318,10 +1318,10 @@ do_compare_and_jump (tree treeop0, tree treeop1, enum 
rtx_code signed_code,
 {
   gcc_assert (code == EQ || code == NE);
   op0 = expand_binop (mode, and_optab, op0,
- GEN_INT ((1 << nunits) - 1), NULL_RTX,
+ GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1), NULL_RTX,
  true, OPTAB_WIDEN);
   op1 = expand_binop (mode, and_optab, op1,
- GEN_INT ((1 << nunits) - 1), NULL_RTX,
+ GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1), NULL_RTX,
  true, OPTAB_WIDEN);
 }
 
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 8d34d024c9c..f7d74525c15 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -7879,8 +7879,8 @@ store_constructor (tree exp, rtx target, int cleared, 
poly_int64 size,
auto nunits = TYPE_VECTOR_SUBPARTS (type).to_constant ();
if (maybe_ne (GET_MODE_PRECISION (mode), nunits))
  tmp = expand_binop (mode, and_optab, tmp,
- GEN_INT ((1 << nunits) - 1), target,
- true, OPTAB_WIDEN);
+ GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
+ target, true, OPTAB_WIDEN);
if (tmp != target)
  emit_move_insn (target, tmp);
break;
@@ -13707,11 +13707,11 @@ do_store_flag (sepops ops, rtx target, machine_mode 
mode)
 {
   gcc_assert (code == EQ || code == NE);
   op0 = expand_binop (mode, and_optab, op0,
- GEN_INT ((1 << nunits) - 1), NULL_RTX,
- true, OPTAB_WIDEN);
+ GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
+ NULL_RTX, true, OPTAB_WIDEN);
   op1 = expand_binop (mode, and_optab, op1,
- GEN_INT ((1 << nunits) - 1), NULL_RTX,
- true, OPTAB_WIDEN);
+ GEN_INT ((HOST_WIDE_INT_1U << nunits) - 1),
+ NULL_RTX, true, OPTAB_WIDEN);
 }
 
   if (target == 0)


[PATCH] c++/modules: Implement P2615 'Meaningful Exports' [PR107688]

2024-03-04 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu. This should probably
wait for GCC 15 I suppose, but sending it in now in case there are any
comments.

-- >8 --

This clarifies which kinds of declarations may and may not be exported
in various contexts. The patch additionally fixes up some small issues
that were clarified by the paper.

Most of the changes are with regards to export-declarations, which are
applied for all standards modes that we support '-fmodules-ts' for.
However there are also a couple of changes made to linkage specifiers
('extern "C"'); I've applied these from C++20 onwards, to line up with
when modules were actually introduced.

PR c++/107688

gcc/cp/ChangeLog:

* name-lookup.cc (push_namespace): Error when exporting
namespace with internal linkage.
* parser.h (struct cp_parser): Add new flag
'in_unbraced_export_declaration_p'.
* parser.cc (cp_debug_parser): Print the new flag.
(cp_parser_new): Initialise the new flag.
(cp_parser_module_export): Set the new flag.
(cp_parser_class_specifier): Clear and restore the new flag.
(cp_parser_import_declaration): Imports can now appear directly
in a linkage specification.
(cp_parser_declaration): Categorise declarations as "name" or
"special"; error on the latter in contexts where the former is
required.
(cp_parser_class_head): Error when exporting a partial
specialisation.

gcc/testsuite/ChangeLog:

* g++.dg/modules/contracts-1_a.C: Avoid now-illegal syntax.
* g++.dg/modules/contracts-2_a.C: Likewise.
* g++.dg/modules/contracts-3_a.C: Likewise.
* g++.dg/modules/contracts-4_a.C: Likewise.
* g++.dg/modules/lang-1_c.C: Clarify now-legal syntax.
* g++.dg/template/crash71.C: Update error messages.
* g++.dg/cpp2a/linkage-spec1.C: New test.
* g++.dg/modules/export-3.C: New test.
* g++.dg/modules/export-4_a.C: New test.
* g++.dg/modules/export-4_b.C: New test.

Signed-off-by: Nathaniel Shead 
---
 gcc/cp/name-lookup.cc|  10 +-
 gcc/cp/parser.cc | 105 +++
 gcc/cp/parser.h  |   6 +-
 gcc/testsuite/g++.dg/cpp2a/linkage-spec1.C   |  22 
 gcc/testsuite/g++.dg/modules/contracts-1_a.C |   2 +-
 gcc/testsuite/g++.dg/modules/contracts-2_a.C |   2 +-
 gcc/testsuite/g++.dg/modules/contracts-3_a.C |   2 +-
 gcc/testsuite/g++.dg/modules/contracts-4_a.C |   2 +-
 gcc/testsuite/g++.dg/modules/export-3.C  |  30 ++
 gcc/testsuite/g++.dg/modules/export-4_a.C|  23 
 gcc/testsuite/g++.dg/modules/export-4_b.C|  13 +++
 gcc/testsuite/g++.dg/modules/lang-1_c.C  |   2 +-
 gcc/testsuite/g++.dg/template/crash71.C  |   4 +-
 13 files changed, 192 insertions(+), 31 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/linkage-spec1.C
 create mode 100644 gcc/testsuite/g++.dg/modules/export-3.C
 create mode 100644 gcc/testsuite/g++.dg/modules/export-4_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/export-4_b.C

diff --git a/gcc/cp/name-lookup.cc b/gcc/cp/name-lookup.cc
index 6444db3f0eb..743102d8393 100644
--- a/gcc/cp/name-lookup.cc
+++ b/gcc/cp/name-lookup.cc
@@ -9053,8 +9053,14 @@ push_namespace (tree name, bool make_inline)
 {
   /* A public namespace is exported only if explicitly marked, or
 it contains exported entities.  */
-  if (TREE_PUBLIC (ns) && module_exporting_p ())
-   DECL_MODULE_EXPORT_P (ns) = true;
+  if (module_exporting_p ())
+   {
+ if (TREE_PUBLIC (ns))
+   DECL_MODULE_EXPORT_P (ns) = true;
+ else if (!header_module_p ())
+   error_at (input_location,
+ "exporting namespace with internal linkage");
+   }
   if (module_purview_p ())
DECL_MODULE_PURVIEW_P (ns) = true;
 
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index a310b9e8c07..448392e1bd9 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -560,6 +560,8 @@ cp_debug_parser (FILE *file, cp_parser *parser)
   & THIS_FORBIDDEN));
   cp_debug_print_flag (file, "In unbraced linkage specification",
  parser->in_unbraced_linkage_specification_p);
+  cp_debug_print_flag (file, "In unbraced export declaration",
+ parser->in_unbraced_export_declaration_p);
   cp_debug_print_flag (file, "Parsing a declarator",
  parser->in_declarator_p);
   cp_debug_print_flag (file, "In template argument list",
@@ -4425,6 +4427,9 @@ cp_parser_new (cp_lexer *lexer)
   /* We are not processing an `extern "C"' declaration.  */
   parser->in_unbraced_linkage_specification_p = false;
 
+  /* We aren't parsing an export-declaration.  */
+  parser->in_unbraced_export_declaration_p = false;
+
   /* We are not processing a declarator.  */
   parser->in_declarator_p = false;
 
@@ 

Fix 201001011-1.c on H8

2024-03-04 Thread Jeff Law


Excerpt from gcc.sum:
[...]
PASS: gcc.c-torture/execute/20101011-1.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/execute/20101011-1.c   -O0  execution test
PASS: gcc.c-torture/execute/20101011-1.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/execute/20101011-1.c   -O1  execution test
[ ... ]

This is because H8 MCUs do not throw a "divide by zero" exception.

gcc/testsuite
* gcc.c-torture/execute/20101011-1.c: Do not test on H8 series.

Pushed on Jan's behalf.

Thanks,

Jeff
commit bd6e613c115c758f961999770acedc92d44d6950
Author: Jan Dubiec 
Date:   Mon Mar 4 06:59:07 2024 -0700

Fix 201001011-1.c on H8

Excerpt from gcc.sum:
[...]
PASS: gcc.c-torture/execute/20101011-1.c   -O0  (test for excess errors)
FAIL: gcc.c-torture/execute/20101011-1.c   -O0  execution test
PASS: gcc.c-torture/execute/20101011-1.c   -O1  (test for excess errors)
FAIL: gcc.c-torture/execute/20101011-1.c   -O1  execution test
[ ... ]

This is because H8 MCUs do not throw a "divide by zero" exception.

gcc/testsuite
* gcc.c-torture/execute/20101011-1.c: Do not test on H8 series.

diff --git a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c 
b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
index d2c0f9ab7ec..9fa10309612 100644
--- a/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
+++ b/gcc/testsuite/gcc.c-torture/execute/20101011-1.c
@@ -26,6 +26,9 @@
 #elif defined (__RX__)
   /* On RX division by zero does not trap.  */
 # define DO_TEST 0
+#elif defined (__H8300H__) || defined (__H8300S__) || defined (__H8300SX__)
+  /* On H8/300H, H8S and H8SX division by zero does not trap.  */
+# define DO_TEST 0
 #elif defined (__aarch64__)
   /* On AArch64 integer division by zero does not trap.  */
 # define DO_TEST 0


[PATCH] tree-optimization/114197 - unexpected if-conversion for vectorization

2024-03-04 Thread Richard Biener
The following avoids lowering a volatile bitfield access.  In addition,
when the if-converted and original loops end up in different outer loops
because of the simplifications this enables, it scraps the result, since
that is not how the vectorizer expects the loops to be laid out.
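
The nesting condition can be pictured with a toy loop tree. This is only a sketch: `toy_loop` and `toy_keep_ifcvt_pair` are hypothetical names and do not correspond to GCC's 'class loop' or to any function in tree-if-conv.cc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy loop-tree node: each loop only records its enclosing loop.  */
struct toy_loop
{
  struct toy_loop *outer;
};

/* Keep the if-converted copy only when both loops still exist and sit
   directly inside the same outer loop.  */
static bool
toy_keep_ifcvt_pair (const struct toy_loop *ifcvt_loop,
                     const struct toy_loop *orig_loop)
{
  if (ifcvt_loop == NULL || orig_loop == NULL)
    return false;
  return ifcvt_loop->outer == orig_loop->outer;
}
```

A false result corresponds to the cases where the pass falls back to the original loop.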

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/114197
* tree-if-conv.cc (bitfields_to_lower_p): Do not lower if
there are volatile bitfield accesses.
(pass_if_conversion::execute): Throw away result if the
if-converted and original loops are not nested as expected.

* gcc.dg/torture/pr114197.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr114197.c | 15 +++
 gcc/tree-if-conv.cc | 23 +++
 2 files changed, 34 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr114197.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr114197.c 
b/gcc/testsuite/gcc.dg/torture/pr114197.c
new file mode 100644
index 000..fb7e2fb712c
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr114197.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+
+#pragma pack(push)
+struct a {
+  volatile signed b : 8;
+};
+#pragma pack(pop)
+int c;
+static struct a d = {5};
+void e() {
+f:
+  for (c = 8; c < 55; ++c)
+if (!d.b)
+  goto f;
+}
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index db0d0f4a497..09d99fb9dda 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -3701,6 +3701,14 @@ bitfields_to_lower_p (class loop *loop,
  if (dump_file && (dump_flags & TDF_DETAILS))
print_gimple_stmt (dump_file, stmt, 0, TDF_SLIM);
 
+ if (TREE_THIS_VOLATILE (op))
+   {
+ if (dump_file && (dump_flags & TDF_DETAILS))
+   fprintf (dump_file, "\t Bitfield NO OK to lower,"
+   " the access is volatile.\n");
+ return false;
+   }
+
  if (!INTEGRAL_TYPE_P (TREE_TYPE (op)))
{
  if (dump_file && (dump_flags & TDF_DETAILS))
@@ -4031,20 +4039,27 @@ pass_if_conversion::execute (function *fun)
   if (todo & TODO_update_ssa_any)
 update_ssa (todo & TODO_update_ssa_any);
 
-  /* If if-conversion elided the loop fall back to the original one.  */
+  /* If if-conversion elided the loop fall back to the original one.  Likewise
+ if the loops are not nested in the same outer loop.  */
   for (unsigned i = 0; i < preds.length (); ++i)
 {
   gimple *g = preds[i];
   if (!gimple_bb (g))
continue;
-  unsigned ifcvt_loop = tree_to_uhwi (gimple_call_arg (g, 0));
-  unsigned orig_loop = tree_to_uhwi (gimple_call_arg (g, 1));
-  if (!get_loop (fun, ifcvt_loop) || !get_loop (fun, orig_loop))
+  auto ifcvt_loop = get_loop (fun, tree_to_uhwi (gimple_call_arg (g, 0)));
+  auto orig_loop = get_loop (fun, tree_to_uhwi (gimple_call_arg (g, 1)));
+  if (!ifcvt_loop || !orig_loop)
{
  if (dump_file)
fprintf (dump_file, "If-converted loop vanished\n");
  fold_loop_internal_call (g, boolean_false_node);
}
+  else if (loop_outer (ifcvt_loop) != loop_outer (orig_loop))
+   {
+ if (dump_file)
+   fprintf (dump_file, "If-converted loop in different outer loop\n");
+ fold_loop_internal_call (g, boolean_false_node);
+   }
 }
 
   return 0;
-- 
2.35.3


GCC 14.0.1 Status Report (2024-03-04)

2024-03-04 Thread Richard Biener
Status
==

The GCC development branch which will become GCC 14 is still
in regression and documentation fixes only mode (Stage 4).

GCC 14.1 will be released when we reach the milestone of
zero P1 regressions.

We've been into regression fixing for a good month now and at
least the pace of new important bugs reported is slowing.  We
still have to make good quite a number of regressions, esp.
also on the testsuite side.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               14   -  18
P2              506   +   2
P3              244   +   3
P4              228   +  17
P5               26   +   1
--------        ---   -----------------------
Total P1-P3     764   -  13
Total          1018   +   5


Previous Report
===

https://gcc.gnu.org/pipermail/gcc/2024-January/243148.html


Re: CI for "Option handling: add documentation URLs"

2024-03-04 Thread David Malcolm
On Sun, 2024-03-03 at 21:04 +0100, Mark Wielaard wrote:
> Hi,
> 
> On Sat, Feb 24, 2024 at 06:42:58PM +0100, Mark Wielaard wrote:
> > On Thu, Feb 22, 2024 at 11:57:50AM +0800, YunQiang Su wrote:
> > > On Mon, Feb 19, 2024 at 06:58, Mark Wielaard wrote:
> > > > So, I did try the regenerate-opt-urls locally, and it did
> > > > generate the
> > > > attached diff. Which seems to show we really need this
> > > > automated.
> > > > 
> > > > Going over the diff. The -Winfinite-recursion in rust does
> > > > indeed seem
> > > > new.  As do the -mapx-inline-asm-use-gpr32 and mevex512 for
> > > > i386.  And
> > > > the avr options -mskip-bug, -mflmap and mrodata-in-ram.  The
> > > > change in
> > > > common.opt.urls for -Wuse-after-free comes from it being moved
> > > > from
> > > > c++ to the c-family. The changes in mips.opt.urls seem to come
> > > > from
> > > > commit 46df1369 "doc/invoke: Remove duplicate explicit-relocs
> > > > entry of
> > > > MIPS".
> > > 
> > > For MIPS, it's due to malformed patches to invoke.text.
> > > I will fix them.
> > 
> > Thanks. So with your commit 00bc8c0998d8 ("invoke.texi: Fix some
> > skipping UrlSuffix problem for MIPS") pushed now, the attached
> > patch
> > fixes the remaining issues.
> > 
> > Is this OK to push?

Thanks, looks good to me.

> Ping.
> 
> I have now regenerated the patch to also include the new avr mfuse-
> add
> change. It would be nice to get this committed so we can turn on the
> automatic checker.

Please go ahead with that.

Thanks
Dave



Re: [Patch] OpenMP: Reject non-const 'condition' trait in Fortran (was: [Patch] OpenMP: Handle DECL_ASSEMBLER_NAME with 'declare variant')

2024-03-04 Thread Thomas Schwinge
Hi Tobias!

On 2024-02-13T18:31:02+0100, Tobias Burnus  wrote:
> --- a/gcc/fortran/openmp.cc
> +++ b/gcc/fortran/openmp.cc

> +   /* Device number must be conforming, which includes
> +  omp_initial_device (-1) and omp_invalid_device (-4).  */
> +   if (property_kind == OMP_TRAIT_PROPERTY_DEV_NUM_EXPR
> +   && otp->expr->expr_type == EXPR_CONSTANT
> +   && mpz_sgn (otp->expr->value.integer) < 0
> +   && mpz_cmp_si (otp->expr->value.integer, -1) != 0
> +   && mpz_cmp_si (otp->expr->value.integer, -4) != 0)
> + {
> +   gfc_error ("property must be a conforming device number "
> +  "at %C");

Instead of magic numbers, shouldn't this use 'include/gomp-constants.h':

/* We have a compatibility issue.  OpenMP 5.2 introduced
   omp_initial_device with value of -1 which clashes with our
   GOMP_DEVICE_ICV, so we need to remap user supplied device
   ids, -1 (aka omp_initial_device) to GOMP_DEVICE_HOST_FALLBACK,
   and -2 (one of many non-conforming device numbers, but with
   OMP_TARGET_OFFLOAD=mandatory needs to be treated a
   omp_invalid_device) to -3 (so that for dev_num >= -2U we can
   subtract 1).  -4 is then what we use for omp_invalid_device,
   which unlike the other non-conforming device numbers results
   in fatal error regardless of OMP_TARGET_OFFLOAD.  */
#define GOMP_DEVICE_ICV -1
#define GOMP_DEVICE_HOST_FALLBACK   -2
#define GOMP_DEVICE_INVALID -4
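
For illustration only, the check from the quoted hunk can be written with named constants along these lines. The enumerator names are local stand-ins chosen for this sketch; the values -1 and -4 are the ones given in the patch comment for omp_initial_device and omp_invalid_device.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for this sketch; values taken from the patch comment.  */
enum
{
  TOY_OMP_INITIAL_DEVICE = -1,
  TOY_OMP_INVALID_DEVICE = -4
};

/* A constant device number is accepted when it is non-negative or one
   of the two named special values, mirroring the condition in the
   quoted openmp.cc hunk.  */
static bool
toy_conforming_device_num (long dev_num)
{
  return dev_num >= 0
         || dev_num == TOY_OMP_INITIAL_DEVICE
         || dev_num == TOY_OMP_INVALID_DEVICE;
}
```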


Regards
 Thomas


Re: [PATCH v5] c++: implement [[gnu::non_owning]] [PR110358]

2024-03-04 Thread Jonathan Wakely

On 01/03/24 15:38 -0500, Jason Merrill wrote:

On 3/1/24 14:24, Marek Polacek wrote:

+@smallexample
+template <typename T>
+[[gnu::no_dangling(std::is_reference_v<T>)]] int foo (T& t) @{


I think this function should return a reference.


The condition in the attribute can only ever be true if you call this
function with an explicit template argument list: foo(i). Is
that intentional?

And if T is non-const it can't be called with a temporary and so
dangling seems less of a problem for this function anyway, right?

Would it make more sense as something like this?

template <typename T>
[[gnu::no_dangling(std::is_lvalue_reference_v<T>)]]
decltype(auto) foo(T&& t) {
  ...
}

Or is this getting too complex/subtle for a simple example?




Re: [Patch] invoke.texi: Add note that -foffload= does not affect device detection

2024-03-04 Thread Tobias Burnus

Hi Sandra,

Sandra Loosemore wrote:

On 3/1/24 08:23, Tobias Burnus wrote:

Maybe the proposed wording will help others to avoid this pitfall.
(Or is this superfluous as -foffload= is not much used and, even if,
no one then remembers or finds this note?)


Well, I spent a long time looking at this, and my only conclusion is 
that I don't really understand what the problem you're trying to solve 
is.  If it's problematical to have the runtime know about offload 
devices the compiled code isn't using, don't users also need to know 
how to restrict the runtime to a particular set of devices the same 
way -foffload= lets you do, and not just how to disable offloading in 
the runtime entirely?
It's pretty clearly documented already how -foffload affects the 
compiler's behavior, and the library's behavior is already documented 
in its own manual.  Maybe what we don't have is a tutorial on how to 
build/link/run programs using a specific offload device, or on the host?


The problem is for code like the following, which is perfectly valid
and works

(A) If you don't have any offload device
(independent of the compiler options)

(B) If you have an offload device (supported by your libgomp)
and compiled with offloading support (for that device)

But (C) if you have an offload device and compile as:
  gcc -fopenmp -foffload=disabled

it will fail at runtime with:

dev = 0 / num devs = 1
Segmentation fault (core dumped)

The problem is that there is a mismatch between the code (assumes no
offload code + always host fallback) and the run-time library (which
detects offload devices), such that the API routines use a different
device than the 'target' code:


#include <omp.h>
#include <stdio.h>

#define N 2064
int
main ()
{
  int *x = (int*) omp_target_alloc (sizeof(int)*N,
omp_get_default_device ());
  printf ("dev = %d / num devs = %d\n",
  omp_get_default_device (), omp_get_num_devices ());
  #pragma omp target is_device_ptr(x)
  for (int i = 0; i < N; ++i)
x[i] = i;
}
---

On the technical side, it is not really surprising but it
might still be confusing for the user. Obviously, it can
also occur if you compile, e.g., for AMD GCN and only an
Nvidia device is available - but there the solution would be
the same (disable all devices).

(OpenMP 6.0 will provide an environment variable that allows
fine tuning of the available devices.)


Questions:

* Is such a usage common enough to matter?
I guess for some benchmark use it makes sense, to test whether
real offloading or host fallback is faster; and if the latter
is true, it might also get used in operational code.

* Are API routines used in such a code in a way that it breaks?
(Unfortunately not very unlikely in larger code.)

If there is enough real-world usage (= 2x yes to the questions above):
* How to word it to help users and not to confuse them?

Tobias


Re: [PATCH] arm: Fixed C23 call compatibility with arm-none-eabi

2024-03-04 Thread Torbjorn SVENSSON




On 2024-03-01 15:58, Richard Earnshaw (lists) wrote:

On 19/02/2024 09:13, Torbjörn SVENSSON wrote:

Ok for trunk and releases/gcc-13?
Regtested on top of 945cb8490cb for arm-none-eabi, without any regression.

Backporting to releases/gcc-13 will change -std=c23 to -std=c2x.


Jakub has just pushed a different fix for this, so I don't think we need this 
now.

R.


Would it still be beneficial to have the 2 test cases for the AAPCS 
validation?


Kind regards,
Torbjörn






--

In commit 4fe34cdcc80ac225b80670eabc38ac5e31ce8a5a, -std=c23 support was
introduced to support functions without any named arguments.  For
arm-none-eabi, this is not as simple as placing all arguments on the
stack.  Align the caller to use r0, r1, r2 and r3 for arguments even for
functions without any named arguments, as specified in the AAPCS.

Verify that the generic test cases have the arguments in the right
order and add ARM specific test cases.

gcc/ChangeLog:

* calls.h: Added the type of the function to function_arg_info.
* calls.cc: Save the type of the function.
* config/arm/arm.cc: Check in the AAPCS layout function if
function has no named args.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/c23-stdarg-split-1a.c: Detect out of order
arguments.
* gcc.dg/torture/c23-stdarg-split-1b.c: Likewise.
* gcc.target/arm/aapcs/align_vaarg3.c: New test.
* gcc.target/arm/aapcs/align_vaarg4.c: New test.

Signed-off-by: Torbjörn SVENSSON 
Co-authored-by: Yvan ROUX 
---
  gcc/calls.cc  |  2 +-
  gcc/calls.h   | 20 --
  gcc/config/arm/arm.cc | 13 ---
  .../gcc.dg/torture/c23-stdarg-split-1a.c  |  4 +-
  .../gcc.dg/torture/c23-stdarg-split-1b.c  | 15 +---
  .../gcc.target/arm/aapcs/align_vaarg3.c   | 37 +++
  .../gcc.target/arm/aapcs/align_vaarg4.c   | 31 
  7 files changed, 102 insertions(+), 20 deletions(-)
  create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/align_vaarg3.c
  create mode 100644 gcc/testsuite/gcc.target/arm/aapcs/align_vaarg4.c

diff --git a/gcc/calls.cc b/gcc/calls.cc
index 01f44734743..a1cc283b952 100644
--- a/gcc/calls.cc
+++ b/gcc/calls.cc
@@ -1376,7 +1376,7 @@ initialize_argument_information (int num_actuals 
ATTRIBUTE_UNUSED,
 with those made by function.cc.  */
  
/* See if this argument should be passed by invisible reference.  */

-  function_arg_info arg (type, argpos < n_named_args);
+  function_arg_info arg (type, fntype, argpos < n_named_args);
if (pass_by_reference (args_so_far_pnt, arg))
{
  const bool callee_copies
diff --git a/gcc/calls.h b/gcc/calls.h
index 464a4e34e33..88836559ebe 100644
--- a/gcc/calls.h
+++ b/gcc/calls.h
@@ -35,24 +35,33 @@ class function_arg_info
  {
  public:
function_arg_info ()
-: type (NULL_TREE), mode (VOIDmode), named (false),
+: type (NULL_TREE), fntype (NULL_TREE), mode (VOIDmode), named (false),
pass_by_reference (false)
{}
  
/* Initialize an argument of mode MODE, either before or after promotion.  */

function_arg_info (machine_mode mode, bool named)
-: type (NULL_TREE), mode (mode), named (named), pass_by_reference (false)
+: type (NULL_TREE), fntype (NULL_TREE), mode (mode), named (named),
+pass_by_reference (false)
{}
  
/* Initialize an unpromoted argument of type TYPE.  */

function_arg_info (tree type, bool named)
-: type (type), mode (TYPE_MODE (type)), named (named),
+: type (type), fntype (NULL_TREE), mode (TYPE_MODE (type)), named (named),
pass_by_reference (false)
{}
  
+  /* Initialize an unpromoted argument of type TYPE with a known function type

+ FNTYPE.  */
+  function_arg_info (tree type, tree fntype, bool named)
+: type (type), fntype (fntype), mode (TYPE_MODE (type)), named (named),
+pass_by_reference (false)
+  {}
+
/* Initialize an argument with explicit properties.  */
function_arg_info (tree type, machine_mode mode, bool named)
-: type (type), mode (mode), named (named), pass_by_reference (false)
+: type (type), fntype (NULL_TREE), mode (mode), named (named),
+pass_by_reference (false)
{}
  
/* Return true if the gimple-level type is an aggregate.  */

@@ -96,6 +105,9 @@ public:
   libgcc support functions).  */
tree type;
  
+  /* The type of the function that has this argument, or null if not known.  */
+  tree fntype;
+
/* The mode of the argument.  Depending on context, this might be
   the mode of the argument type or the mode after promotion.  */
machine_mode mode;
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 1cd69268ee9..98e149e5b7e 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -7006,7 +7006,7 @@ aapcs_libcall_value (machine_mode mode)
 numbers referred to here are those in the AAPCS.  */

[committed] libgomp: Use void (*) (void *) rather than void (*)() for host_fn type [PR114216]

2024-03-04 Thread Jakub Jelinek
Hi!

For the type of the target callbacks we use void (*) (void *) elsewhere, and
IMHO we should use that for the reverse offload fallback as well (where the
actual callback is emitted using the same code as for host fallback or device
kernel entry routines), even though void (*) () is also ok before C23 and
we aren't building libgomp with C23 yet.  On some arches void (*) () could
perhaps result in worse code generation, because calls through a pointer to
an unprototyped function sometimes need to pass the argument in two different
places, so that the callee works whether it receives it through ... or as a
named argument.
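
A minimal sketch of the prototyped form follows.  The names kernel, seen and
call_host_fn are hypothetical, and converting between a function pointer and
uintptr_t is an assumption that holds on common targets rather than something
ISO C guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback with the prototype used for target callbacks.  */
static int seen;

static void
kernel (void *data)
{
  seen = *(int *) data;
}

/* Call a callback whose address is stored as an integer, going through a
   fully prototyped function pointer type instead of void (*) ().  */
static void
call_host_fn (uintptr_t host_start, void *arg)
{
  void (*host_fn) (void *) = (void (*) (void *)) host_start;
  host_fn (arg);
}
```

With the prototype in place the compiler knows the callee takes exactly one
named pointer argument, so it does not have to treat the call as possibly
variadic.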

Tested on x86_64-linux, committed to trunk.

2024-03-04  Jakub Jelinek  

PR libgomp/114216
* target.c (gomp_target_rev): Change host_fn type and corresponding
cast from void (*)() to void (*) (void *).

--- libgomp/target.c.jj 2024-01-03 12:07:47.812094729 +0100
+++ libgomp/target.c 2024-03-04 11:26:05.745094586 +0100
@@ -3447,7 +3447,7 @@ gomp_target_rev (uint64_t fn_ptr, uint64
 
   if (n == NULL)
 gomp_fatal ("Cannot find reverse-offload function");
-  void (*host_fn)() = (void (*)()) n->k->host_start;
+  void (*host_fn) (void *) = (void (*) (void *)) n->k->host_start;
 
   if ((devicep->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM) || mapnum == 0)
 {

Jakub



[PATCH] tree-optimization/114203 - wrong CLZ niter computation

2024-03-04 Thread Richard Biener
For precision less than int we apply the CTZ->CLZ adjustment (subtracting
i_prec - prec) after the adjustment that makes the result defined at zero,
so the value substituted for a zero input gets adjusted as well.
That's wrong.
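
The effect of the ordering can be sketched in plain C for an 8-bit value
widened to unsigned int.  This is only an illustration under stated
assumptions, not the code the niter analysis generates: the function names
are made up, and the iteration count of the shift loop is taken to be
prec - CLZ.

```c
#include <assert.h>
#include <limits.h>

enum { PREC = 8, I_PREC = sizeof (unsigned) * CHAR_BIT };

/* Fixed order: narrow the CLZ result to PREC bits first, then make the
   result defined at zero.  __builtin_clz is undefined for 0, so the
   conditional keeps it from being evaluated there.  */
static int
clz8_fixed (unsigned char b)
{
  unsigned src = b;
  return src != 0 ? __builtin_clz (src) - (I_PREC - PREC) : PREC;
}

/* Buggy order: define at zero first, then subtract.  For a zero input the
   substituted PREC is adjusted too (-16 when unsigned int has 32 bits).  */
static int
clz8_buggy (unsigned char b)
{
  unsigned src = b;
  int call = src != 0 ? __builtin_clz (src) : PREC;
  return call - (I_PREC - PREC);
}

/* Iterations of "while (b) { b >>= 1; c++; }" expressed through CLZ.  */
static int
count_fixed (unsigned char b)
{
  return PREC - clz8_fixed (b);
}

static int
count_buggy (unsigned char b)
{
  return PREC - clz8_buggy (b);
}
```

For nonzero inputs both variants agree; only the zero input, which is what
the new testcase exercises, tells them apart.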

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/114203
* tree-ssa-loop-niter.cc (build_cltz_expr): Apply CTZ->CLZ
adjustment before making the result defined at zero.

* gcc.dg/torture/pr114203.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr114203.c | 21 +
 gcc/tree-ssa-loop-niter.cc  |  7 +++
 2 files changed, 24 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr114203.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr114203.c b/gcc/testsuite/gcc.dg/torture/pr114203.c
new file mode 100644
index 000..0ef6279942a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr114203.c
@@ -0,0 +1,21 @@
+/* { dg-do run } */
+
+int __attribute__((noipa))
+foo (unsigned char b)
+{
+  int c = 0;
+
+  while (b) {
+  b >>= 1;
+  c++;
+  }
+
+  return c;
+}
+
+int main()
+{
+  if (foo(0) != 0)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index 038e4331661..c6d010f6d89 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -2288,6 +2288,9 @@ build_cltz_expr (tree src, bool leading, bool define_at_zero)
src = fold_convert (unsigned_type_node, src);
 
   call = build_call_expr (fn, 1, src);
+  if (leading && prec < i_prec)
+   call = fold_build2 (MINUS_EXPR, integer_type_node, call,
+   build_int_cst (integer_type_node, i_prec - prec));
   if (define_at_zero)
{
  tree is_zero = fold_build2 (NE_EXPR, boolean_type_node, src,
@@ -2295,10 +2298,6 @@ build_cltz_expr (tree src, bool leading, bool define_at_zero)
  call = fold_build3 (COND_EXPR, integer_type_node, is_zero, call,
  build_int_cst (integer_type_node, prec));
}
-
-  if (leading && prec < i_prec)
-   call = fold_build2 (MINUS_EXPR, integer_type_node, call,
-   build_int_cst (integer_type_node, i_prec - prec));
 }
 
   return call;
-- 
2.35.3


[PATCH] tree-optimization/114192 - scalar reduction kept live with early break vect

2024-03-04 Thread Richard Biener
The following fixes a missing replacement of the reduction value
used in the epilog, causing the scalar reduction to be kept live
across the early break exit path.
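
For orientation, the kind of loop involved combines a reduction with an
early exit.  The function below is a hypothetical illustration of that
shape, not the testcase from the PR.

```c
#include <assert.h>

/* Sum elements until the first zero.  The "break" is the early (alternate)
   exit and "sum" is the reduction whose value is live after the loop on
   both exit paths.  */
static int
sum_until_zero (const int *a, int n)
{
  int sum = 0;
  for (int i = 0; i < n; i++)
    {
      if (a[i] == 0)
        break;
      sum += a[i];
    }
  return sum;
}
```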

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/114192
* tree-vect-loop.cc (vect_create_epilog_for_reduction): Use the
appropriate def for the live out stmt in case of an alternate
exit.
---
 gcc/tree-vect-loop.cc | 40 ++--
 1 file changed, 26 insertions(+), 14 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 35f1f8c7d42..761cdc67570 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -6066,20 +6066,32 @@ vect_create_epilog_for_reduction (loop_vec_info loop_vinfo,
 
   stmt_vec_info single_live_out_stmt[] = { stmt_info };
   array_slice live_out_stmts = single_live_out_stmt;
-  if (slp_reduc)
-/* All statements produce live-out values.  */
-live_out_stmts = SLP_TREE_SCALAR_STMTS (slp_node);
-  else if (slp_node)
-{
-  /* The last statement in the reduction chain produces the live-out
-value.  Note SLP optimization can shuffle scalar stmts to
-optimize permutations so we have to search for the last stmt.  */
-  for (k = 0; k < group_size; ++k)
-   if (!REDUC_GROUP_NEXT_ELEMENT (SLP_TREE_SCALAR_STMTS (slp_node)[k]))
- {
-   single_live_out_stmt[0] = SLP_TREE_SCALAR_STMTS (slp_node)[k];
-   break;
- }
+  if (LOOP_VINFO_EARLY_BREAKS (loop_vinfo)
+  && loop_exit != LOOP_VINFO_IV_EXIT (loop_vinfo)
+  /* ???  We should fend this off earlier.  For conversions we create
+multiple epilogues, one dead.  */
+  && stmt_info == reduc_info->reduc_def)
+{
+  gcc_assert (!slp_node);
+  single_live_out_stmt[0] = reduc_info;
+}
+  else
+{
+  if (slp_reduc)
+   /* All statements produce live-out values.  */
+   live_out_stmts = SLP_TREE_SCALAR_STMTS (slp_node);
+  else if (slp_node)
+   {
+ /* The last statement in the reduction chain produces the live-out
+value.  Note SLP optimization can shuffle scalar stmts to
+optimize permutations so we have to search for the last stmt.  */
+ for (k = 0; k < group_size; ++k)
+   if (!REDUC_GROUP_NEXT_ELEMENT (SLP_TREE_SCALAR_STMTS (slp_node)[k]))
+ {
+   single_live_out_stmt[0] = SLP_TREE_SCALAR_STMTS (slp_node)[k];
+   break;
+ }
+   }
 }
 
   unsigned vec_num;
-- 
2.35.3


Re: [PATCH] middle-end: Fix dominator information with loop duplication PR114197

2024-03-04 Thread Richard Biener
On Fri, Mar 1, 2024 at 11:16 PM Edwin Lu  wrote:
>
> When adding the new_preheader to the cfg, only the new_preheader's dominator
> information is updated. If one of the new basic block's children was part
> of the original cfg and adding new_preheader to the cfg introduces another path
> to that child, the child's dominator information will not be updated. This may
> cause verify_dominator's assertion to fail.
>
> Force recalculating dominators for all duplicated basic blocks and their
> successors when updating new_preheader's dominator information.

We're already doing this (which IMO is bad), via the
iterate_fix_dominators call.
You're adding another bunch of similar things and I think the use of
recompute_dominator isn't safe as it assumes all predecessors have valid
dominator info.
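
To illustrate the concern with a toy model (this is not GCC's
recompute_dominator; the reverse-post-order block numbering, the array
layout and the function names are all assumptions made for the sketch): if
an immediate dominator is recomputed by intersecting the dominator chains
of the predecessors, a stale entry on one predecessor silently gives a
wrong answer.

```c
#include <assert.h>

/* Blocks are numbered in reverse post-order, idom[b] is the immediate
   dominator of block b, and the entry block 0 is its own idom.  */
static int
intersect (const int *idom, int a, int b)
{
  while (a != b)
    {
      while (a > b)
        a = idom[a];
      while (b > a)
        b = idom[b];
    }
  return a;
}

/* Recompute the idom of a block from its predecessors' dominator info.
   The result is only as good as the idom entries walked on the way.  */
static int
recompute_idom (const int *idom, const int *preds, int npreds)
{
  int d = preds[0];
  for (int i = 1; i < npreds; i++)
    d = intersect (idom, d, preds[i]);
  return d;
}
```

In a diamond 0 -> {1, 2} -> 3 the correct answer for block 3 is 0.  If
idom[2] is stale and still says 1, the same call returns 1 instead.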

I'll have a look.

Richard.

> PR 114197
>
> gcc/ChangeLog:
>
> * tree-vect-loop-manip.cc (slpeel_tree_duplicate_loop_to_edge_cfg):
> Recalculate dominator info when adding new_preheader to cfg
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/vect/pr114197.c: New test.
>
> Signed-off-by: Edwin Lu 
> ---
>  gcc/testsuite/gcc.dg/vect/pr114197.c | 18 ++
>  gcc/tree-vect-loop-manip.cc  | 17 -
>  2 files changed, 34 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/pr114197.c
>
> diff --git a/gcc/testsuite/gcc.dg/vect/pr114197.c b/gcc/testsuite/gcc.dg/vect/pr114197.c
> new file mode 100644
> index 000..b1fb807729c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/pr114197.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-O3" } */
> +
> +
> +#pragma pack(push)
> +struct a {
> +  volatile signed b : 8;
> +};
> +#pragma pack(pop)
> +int c;
> +static struct a d = {5};
> +void e() {
> +f:
> +  for (c = 8; c < 55; ++c)
> +if (!d.b)
> +  goto f;
> +}
> +
> diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
> index f72da915103..0f3a489e78c 100644
> --- a/gcc/tree-vect-loop-manip.cc
> +++ b/gcc/tree-vect-loop-manip.cc
> @@ -1840,7 +1840,22 @@ slpeel_tree_duplicate_loop_to_edge_cfg (class loop *loop, edge loop_exit,
> }
>
>if (was_imm_dom || duplicate_outer_loop)
> -   set_immediate_dominator (CDI_DOMINATORS, exit_dest, new_exit->src);
> +   {
> + set_immediate_dominator (CDI_DOMINATORS, exit_dest, new_exit->src);
> +
> + /* Update the dominator info for children of duplicated bbs.  */
> + for (unsigned i = 0; i < scalar_loop->num_nodes; i++)
> +   {
> + basic_block dom_bb = NULL;
> + edge e;
> + edge_iterator ei;
> + FOR_EACH_EDGE (e, ei, new_bbs[i]->succs)
> +   {
> + dom_bb = recompute_dominator (CDI_DOMINATORS, e->dest);
> + set_immediate_dominator (CDI_DOMINATORS, e->dest, dom_bb);
> +   }
> +   }
> +   }
>
>/* And remove the non-necessary forwarder again.  Keep the other
>   one so we have a proper pre-header for the loop at the exit edge.  */
> --
> 2.34.1
>


Re: [PATCH v2] Draft|Internal-fn: Introduce internal fn saturation US_PLUS

2024-03-04 Thread Richard Biener
On Sat, Mar 2, 2024 at 8:46 AM Li, Pan2  wrote:
>
> Hi Richard and Tamar,
>
> I tried DEF_INTERNAL_SIGNED_OPTAB_FN for SAT_ADD/SUB/MUL but ran into some
> problems in match.pd.
>
> For unsigned SAT_ADD = (x + y) | - ((x + y) < x), the match.pd pattern can be
> (bit_ior:c (plus:c@2 @0 @1) (negate (convert (lt @2 @0)))).
> For unsigned SAT_SUB = x >= y ? x - y : 0, the match.pd pattern can be
> (cond (ge @0 @1) (minus @0 @1) integer_zerop).
>
> For signed SAT_ADD/SAT_SUB as below, it does not seem easy to make the
> simplify pattern work fully as expected.
> sint64_t sat_add (sint64_t x, sint64_t y)
> {
>   sint64_t a = x ^ y;
>   sint64_t add = x + y;
>   sint64_t b = add ^ x;
>
>   return (a < 0 || (a >= 0 && b >= 0)) ? add : (MAX_INT64 + (x < 0));
> }
>
> sint64_t sat_sub (sint64_t x, sint64_t y)
> {
>   sint64_t a = x ^ y;
>   sint64_t sub = x - y;
>   sint64_t b = sub ^ x;
>
>   return (a >= 0 || (a < 0 && b >= 0)) ? sub : (MAX_INT64 + (x < 0));
> }
>
> For SAT_MUL as below, it looks like we may need a wider type. I am not sure
> whether we can leverage MUL_OVERFLOW in match.pd.
>
> uint32_t sat_mul (uint32_t x, uint32_t y)
> {
>   uint64_t mul = (uint64_t)x * (uint64_t)y;
>   return mul > UINT32_MAX ? UINT32_MAX : (uint32_t)mul;
> }
>
> sint32_t sat_mul (sint32_t x, sint32_t y)
> {
>   sint64_t mul = (sint64_t)x * (sint64_t)y;
>
>   return mul <= MAX_INT32 && mul >= MIN_INT32 ? mul : MAX_INT32 + (x ^ y) > 0;
> }
>
> The diff below only contains unsigned SAT_ADD and SAT_SUB for prototype
> validation.
> I will continue with the remaining parts in match.pd and keep you posted.
>
> -
>
> diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
> index 80efdf2b7e5..d9ad6fe2b58 100644
> --- a/gcc/config/riscv/riscv-protos.h
> +++ b/gcc/config/riscv/riscv-protos.h
> @@ -132,6 +132,9 @@ extern void riscv_asm_output_external (FILE *, const 
> tree, const char *);
>  extern bool
>  riscv_zcmp_valid_stack_adj_bytes_p (HOST_WIDE_INT, int);
>  extern void riscv_legitimize_poly_move (machine_mode, rtx, rtx, rtx);
> +extern void riscv_expand_usadd (rtx, rtx, rtx);
> +extern void riscv_expand_ussub (rtx, rtx, rtx);
>
>  #ifdef RTX_CODE
>  extern void riscv_expand_int_scc (rtx, enum rtx_code, rtx, rtx, bool 
> *invert_ptr = 0);
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 5e984ee2a55..795462526df 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -10655,6 +10655,28 @@ riscv_vector_mode_supported_any_target_p 
> (machine_mode)
>return true;
>  }
>
> +/* Emit insn for the saturation addu, aka (x + y) | - ((x + y) < x).  */
> +void
> +riscv_expand_usadd (rtx dest, rtx x, rtx y)
> +{
> +  fprintf (stdout, "Hit riscv_expand_usadd.\n");
> +  // ToDo
> +}
> +
> +void
> +riscv_expand_ussub (rtx dest, rtx x, rtx y)
> +{
> +  fprintf (stdout, "Hit riscv_expand_ussub.\n");
> +  // ToDo
> +}
> +
>  /* Initialize the GCC target structure.  */
>  #undef TARGET_ASM_ALIGNED_HI_OP
>  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 1fec13092e2..e2dbadb3ead 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -3839,6 +3839,39 @@ (define_insn "*large_load_address"
>[(set_attr "type" "load")
> (set (attr "length") (const_int 8))])
>
> +(define_expand "usadd3"
> +  [(match_operand:ANYI 0 "register_operand")
> +   (match_operand:ANYI 1 "register_operand")
> +   (match_operand:ANYI 2 "register_operand")]
> +  ""
> +  {
> +riscv_expand_usadd (operands[0], operands[1], operands[2]);
> +DONE;
> +  }
> +)
> +
> +(define_expand "ussub3"
> +  [(match_operand:ANYI 0 "register_operand")
> +   (match_operand:ANYI 1 "register_operand")
> +   (match_operand:ANYI 2 "register_operand")]
> +  ""
> +  {
> +riscv_expand_ussub (operands[0], operands[1], operands[2]);
> +DONE;
> +  }
> +)
> +
>  (include "bitmanip.md")
>  (include "crypto.md")
>  (include "sync.md")
> diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> index 848bb9dbff3..0fff19c875f 100644
> --- a/gcc/internal-fn.def
> +++ b/gcc/internal-fn.def
> @@ -275,6 +275,13 @@ DEF_INTERNAL_SIGNED_OPTAB_FN (MULHS, ECF_CONST | 
> ECF_NOTHROW, first,
>  DEF_INTERNAL_SIGNED_OPTAB_FN (MULHRS, ECF_CONST | ECF_NOTHROW, first,
>   smulhrs, umulhrs, binary)
>
> +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_ADD, ECF_CONST | ECF_NOTHROW, first,
> + ssadd, usadd, binary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (SAT_SUB, ECF_CONST | ECF_NOTHROW, first,
> + sssub, ussub, binary)
> +
>  DEF_INTERNAL_COND_FN (ADD, ECF_CONST, add, binary)
>  DEF_INTERNAL_COND_FN (SUB, ECF_CONST, sub, binary)
>  DEF_INTERNAL_COND_FN (MUL, ECF_CONST, smul, binary)
> diff --git a/gcc/match.pd 
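
The unsigned forms quoted above can be checked with a small standalone C
sketch.  The function names are hypothetical and 32-bit operands are chosen
only for the example; the expressions themselves are the ones given in the
quoted message.

```c
#include <assert.h>
#include <stdint.h>

/* SAT_ADD = (x + y) | - ((x + y) < x): on wraparound the comparison is 1,
   its negation is all ones, and the OR saturates the result.  */
static uint32_t
sat_addu (uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;
  return sum | -(uint32_t) (sum < x);
}

/* SAT_SUB = x >= y ? x - y : 0.  */
static uint32_t
sat_subu (uint32_t x, uint32_t y)
{
  return x >= y ? x - y : 0;
}

/* Unsigned SAT_MUL through a widened multiplication.  */
static uint32_t
sat_mulu (uint32_t x, uint32_t y)
{
  uint64_t mul = (uint64_t) x * (uint64_t) y;
  return mul > UINT32_MAX ? UINT32_MAX : (uint32_t) mul;
}
```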

Re: [Patch] invoke.texi: Add note that -foffload= does not affect device detection

2024-03-04 Thread Tobias Burnus

Hi,

Sandra Loosemore wrote:

On 3/1/24 17:29, Sandra Loosemore wrote:

On 3/1/24 08:23, Tobias Burnus wrote:
Aside: Shouldn't all the HTML documents start with a  and 
 before

the table of content? Currently, it has:
   Top (GNU libgomp)
and the body starts with
   Short Table of Contents


I note that the 'Top(...)' in  already appears in the GCC 8.5 
docs (created with Texinfo 6.5; while GCC 7.5, created with texinfo 6.3, 
is okay). And the  disappears in the GCC 10.5 doc, created with 
Texinfo 7.0dev.


I have no idea why the 'Top(...)' appears with Texinfo 6.5, but the 
missing  is because of Texinfo 7.0, cf. 
https://git.savannah.gnu.org/cgit/texinfo.git/plain/NEWS


I think it would be useful to remove the 'Top()' in  and add the 
 in general.


For the GCC website, we might want to set TOP_NODE_UP_URL.

I think this is a bug in the version of texinfo used to produce the 
HTML content for the GCC web site.  Looking at a recent build of my 
own using Texinfo 6.7, I do see



GNU libgomp

The manual on the web site says it was produced by "GNU Texinfo 7.0dev".


I poked at this a little and apparently you need to fiddle with the 
SHOW_TITLE or NO_TOP_NODE_OUTPUT customization variables in recent 
versions of Texinfo in order to get the document title to show up in 
HTML output.


https://www.gnu.org/software/texinfo/manual/texinfo/texinfo.html#index-SHOW_005fTITLE 



Probably this has to be controlled by a configure check since older 
Texinfo versions may barf on unknown options.

...
I'd think that if we were going to do that, we'd also want to use an 
official release version of Texinfo instead of a "dev" snapshot.


(I concur that we should update 7.0dev to 7.0.3 or 7.1 on the server to 
have a defined version.)


Thanks,

Tobias



Re: [PATCH] bitint: Fix tree node sharing bug [PR114209]

2024-03-04 Thread Richard Biener
On Mon, 4 Mar 2024, Jakub Jelinek wrote:

> Hi!
> 
> We ICE on the following testcase due to invalid tree sharing.
> The second hunk fixes that, the first one is from me looking around at
> other spots which might end up with invalid tree sharing too.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

> 2024-03-04  Jakub Jelinek  
> 
>   PR middle-end/114209
>   * gimple-lower-bitint.cc (bitint_large_huge::limb_access): Call
>   unshare_expr when creating a MEM_REF from MEM_REF.
>   (bitint_large_huge::lower_stmt): Call unshare_expr.
> 
>   * gcc.dg/bitint-97.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj 2024-03-01 11:04:44.623537149 +0100
> +++ gcc/gimple-lower-bitint.cc 2024-03-03 19:18:30.017909558 +0100
> @@ -620,7 +620,7 @@ bitint_large_huge::limb_access (tree typ
>else if (TREE_CODE (var) == MEM_REF && tree_fits_uhwi_p (idx))
>  {
>ret
> - = build2 (MEM_REF, ltype, TREE_OPERAND (var, 0),
> + = build2 (MEM_REF, ltype, unshare_expr (TREE_OPERAND (var, 0)),
> size_binop (PLUS_EXPR, TREE_OPERAND (var, 1),
> build_int_cst (TREE_TYPE (TREE_OPERAND (var, 1)),
>tree_to_uhwi (idx)
> @@ -5342,7 +5342,7 @@ bitint_large_huge::lower_stmt (gimple *s
> = build_qualified_type (ltype,
> TYPE_QUALS (ltype)
> | ENCODE_QUAL_ADDR_SPACE (as));
> -   rhs1 = build1 (VIEW_CONVERT_EXPR, ltype, mem);
> +   rhs1 = build1 (VIEW_CONVERT_EXPR, ltype, unshare_expr (mem));
> gimple_assign_set_rhs1 (stmt, rhs1);
>   }
> else
> --- gcc/testsuite/gcc.dg/bitint-97.c.jj   2024-03-03 18:59:31.084588944 +0100
> +++ gcc/testsuite/gcc.dg/bitint-97.c  2024-03-03 19:16:50.114284071 +0100
> @@ -0,0 +1,18 @@
> +/* PR middle-end/114209 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-Og -std=c23 -fno-strict-aliasing" } */
> +/* { dg-add-options float128 } */
> +/* { dg-require-effective-target float128 } */
> +
> +typedef signed char V __attribute__((__vector_size__(16)));
> +typedef _Float128 W __attribute__((__vector_size__(16)));
> +
> +_Float128
> +foo (void *p)
> +{
> +  signed char c = *(_BitInt(128) *) p;
> +  _Float128 f = *(_Float128 *) p;
> +  W w = *(W *) p;
> +  signed char r = ((union { W a; signed char b[16]; }) w).b[1];
> +  return r + f;
> +}
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)


Re: [PATCH] LoongArch: Fix inconsistent description in *sge_

2024-03-04 Thread Xi Ruoyao
On Mon, 2024-03-04 at 11:03 +0800, Guo Jie wrote:
> The constraint of op[1] is inconsistent with the output template.
> 
> gcc/ChangeLog:
> 
>   * config/loongarch/loongarch.md
>   (define_insn "*sge_"): Fix inconsistency
>   error.
>
> ---
>  gcc/config/loongarch/loongarch.md | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/loongarch/loongarch.md
> b/gcc/config/loongarch/loongarch.md
> index f3b5c641fce..2d25374bdc9 100644
> --- a/gcc/config/loongarch/loongarch.md
> +++ b/gcc/config/loongarch/loongarch.md
> @@ -3357,10 +3357,10 @@ (define_insn "*sgt_"
>  
>  (define_insn "*sge_"
>    [(set (match_operand:GPR 0 "register_operand" "=r")
> - (any_ge:GPR (match_operand:X 1 "register_operand" "r")
> + (any_ge:GPR (match_operand:X 1 "arith_operand" "rI")
>    (const_int 1)))]

No, arith_operand is just register_operand or const_imm12_operand, but
comparing a const_imm12_operand with (const_int 1) should be folded into
a constant (even at -O0, AFAIK).  So allowing const_imm12_operand here
brings no benefit.

>    ""
> -  "slti\t%0,%.,%1"
> +  "slt%i1\t%0,%.,%1"
>    [(set_attr "type" "slt")
>     (set_attr "mode" "")])
>  

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] fwprop: Avoid volatile defines to be propagated

2024-03-04 Thread HAO CHEN GUI
Hi Jeff,

On 2024/3/4 11:37, Jeff Law wrote:
> Can the same thing happen with a volatile memory load?  I don't think that 
> will be caught by the volatile_insn_p check.

Yes, I think so. If the defining rtx contains volatile memory references, it
may hit the same problem. Could we use volatile_refs_p instead of
volatile_insn_p?

Thanks
Gui Haochen


Re: [PATCH] i386: Fix ICEs with SUBREGs from vector etc. constants to XFmode [PR114184]

2024-03-04 Thread Uros Bizjak
On Mon, Mar 4, 2024 at 9:41 AM Jakub Jelinek  wrote:
>
> On Mon, Mar 04, 2024 at 09:34:30AM +0100, Uros Bizjak wrote:
> > > --- gcc/config/i386/i386-expand.cc.jj   2024-03-01 14:56:34.120925989 +0100
> > > +++ gcc/config/i386/i386-expand.cc  2024-03-03 18:41:08.278793046 +0100
> > > @@ -451,6 +451,20 @@ ix86_expand_move (machine_mode mode, rtx
> > >   && GET_MODE (SUBREG_REG (op1)) == DImode
> > >   && SUBREG_BYTE (op1) == 0)
> > > op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
> > > +  /* As not all values in XFmode are representable in real_value,
> > > +we might be called with unfoldable SUBREGs of constants.  */
> > > +  if (mode == XFmode
> > > + && CONSTANT_P (SUBREG_REG (op1))
> > > + && can_create_pseudo_p ())
> >
> > We have quite a few unguarded force_regs in ix86_expand_move. While it
> > doesn't hurt to have an extra safety net, is there a particular reason
> > for the can_create_pseudo_p check in the added code?
>
> Various other places in ix86_expand_move do check can_create_pseudo_p, the
> case I've mostly copied this from in ix86_expand_vector_move also does that,
> and then there is the
>  Therefore, when given such a pair of operands, the pattern must
>  generate RTL which needs no reloading and needs no temporary
>  registers--no registers other than the operands.  For example, if
>  you support the pattern with a 'define_expand', then in such a case
>  the 'define_expand' mustn't call 'force_reg' or any other such
>  function which might generate new pseudo registers.
> in mov description, which initially scared me off from using it at all.
> Guess we'll ICE either way if something like that appears during RA.

Thanks for the insight - it was the PIC handling in ix86_expand_move that
caught my eye, especially the TARGET_MACHO part that looks like it
was somehow left behind. OTOH, the whole ix86_expand_move would need
some TLC anyway.

FAOD - the patch is OK as is.

Thanks,
Uros.


[PATCH] bitint: Fix tree node sharing bug [PR114209]

2024-03-04 Thread Jakub Jelinek
Hi!

We ICE on the following testcase due to invalid tree sharing.
The second hunk fixes that, the first one is from me looking around at
other spots which might end up with invalid tree sharing too.
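
As a toy illustration of why sharing matters (this models nothing of GCC's
tree representation; struct node, make_node and unshare_node are
hypothetical names for the sketch): when two parents point at the same
operand node, a later in-place change made through one parent shows up in
the other, while a parent holding its own copy is unaffected.

```c
#include <assert.h>
#include <stdlib.h>

struct node
{
  int value;
  struct node *op0;
};

/* Allocate a node with the given value and operand.  */
static struct node *
make_node (int value, struct node *op0)
{
  struct node *n = malloc (sizeof *n);
  if (n == NULL)
    abort ();
  n->value = value;
  n->op0 = op0;
  return n;
}

/* Deep-copy a node so the result shares nothing with the original.  */
static struct node *
unshare_node (const struct node *n)
{
  if (n == NULL)
    return NULL;
  return make_node (n->value, unshare_node (n->op0));
}
```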

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-03-04  Jakub Jelinek  

PR middle-end/114209
* gimple-lower-bitint.cc (bitint_large_huge::limb_access): Call
unshare_expr when creating a MEM_REF from MEM_REF.
(bitint_large_huge::lower_stmt): Call unshare_expr.

* gcc.dg/bitint-97.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2024-03-01 11:04:44.623537149 +0100
+++ gcc/gimple-lower-bitint.cc  2024-03-03 19:18:30.017909558 +0100
@@ -620,7 +620,7 @@ bitint_large_huge::limb_access (tree typ
   else if (TREE_CODE (var) == MEM_REF && tree_fits_uhwi_p (idx))
 {
   ret
-   = build2 (MEM_REF, ltype, TREE_OPERAND (var, 0),
+   = build2 (MEM_REF, ltype, unshare_expr (TREE_OPERAND (var, 0)),
  size_binop (PLUS_EXPR, TREE_OPERAND (var, 1),
  build_int_cst (TREE_TYPE (TREE_OPERAND (var, 1)),
 tree_to_uhwi (idx)
@@ -5342,7 +5342,7 @@ bitint_large_huge::lower_stmt (gimple *s
  = build_qualified_type (ltype,
  TYPE_QUALS (ltype)
  | ENCODE_QUAL_ADDR_SPACE (as));
- rhs1 = build1 (VIEW_CONVERT_EXPR, ltype, mem);
+ rhs1 = build1 (VIEW_CONVERT_EXPR, ltype, unshare_expr (mem));
  gimple_assign_set_rhs1 (stmt, rhs1);
}
  else
--- gcc/testsuite/gcc.dg/bitint-97.c.jj 2024-03-03 18:59:31.084588944 +0100
+++ gcc/testsuite/gcc.dg/bitint-97.c 2024-03-03 19:16:50.114284071 +0100
@@ -0,0 +1,18 @@
+/* PR middle-end/114209 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-Og -std=c23 -fno-strict-aliasing" } */
+/* { dg-add-options float128 } */
+/* { dg-require-effective-target float128 } */
+
+typedef signed char V __attribute__((__vector_size__(16)));
+typedef _Float128 W __attribute__((__vector_size__(16)));
+
+_Float128
+foo (void *p)
+{
+  signed char c = *(_BitInt(128) *) p;
+  _Float128 f = *(_Float128 *) p;
+  W w = *(W *) p;
+  signed char r = ((union { W a; signed char b[16]; }) w).b[1];
+  return r + f;
+}

Jakub



Re: [PATCH] i386: Fix ICEs with SUBREGs from vector etc. constants to XFmode [PR114184]

2024-03-04 Thread Jakub Jelinek
On Mon, Mar 04, 2024 at 09:34:30AM +0100, Uros Bizjak wrote:
> > --- gcc/config/i386/i386-expand.cc.jj   2024-03-01 14:56:34.120925989 +0100
> > +++ gcc/config/i386/i386-expand.cc  2024-03-03 18:41:08.278793046 +0100
> > @@ -451,6 +451,20 @@ ix86_expand_move (machine_mode mode, rtx
> >   && GET_MODE (SUBREG_REG (op1)) == DImode
> >   && SUBREG_BYTE (op1) == 0)
> > op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
> > +  /* As not all values in XFmode are representable in real_value,
> > +we might be called with unfoldable SUBREGs of constants.  */
> > +  if (mode == XFmode
> > + && CONSTANT_P (SUBREG_REG (op1))
> > + && can_create_pseudo_p ())
> 
> We have quite a few unguarded force_regs in ix86_expand_move. While it
> doesn't hurt to have an extra safety net, is there a particular reason
> for the can_create_pseudo_p check in the added code?

Various other places in ix86_expand_move do check can_create_pseudo_p, the
case I've mostly copied this from in ix86_expand_vector_move also does that,
and then there is the
 Therefore, when given such a pair of operands, the pattern must
 generate RTL which needs no reloading and needs no temporary
 registers--no registers other than the operands.  For example, if
 you support the pattern with a 'define_expand', then in such a case
 the 'define_expand' mustn't call 'force_reg' or any other such
 function which might generate new pseudo registers.
in mov description, which initially scared me off from using it at all.
Guess we'll ICE either way if something like that appears during RA.

Jakub



Re: [PATCH] i386: Fix ICEs with SUBREGs from vector etc. constants to XFmode [PR114184]

2024-03-04 Thread Uros Bizjak
On Mon, Mar 4, 2024 at 9:25 AM Jakub Jelinek  wrote:
>
> Hi!
>
> The Intel extended format has the various weird number categories,
> pseudo denormals, pseudo infinities, pseudo NaNs and unnormals.
> Those are not representable in the GCC real_value and so neither
> GIMPLE nor RTX VIEW_CONVERT_EXPR/SUBREG folding folds those into
> constants.
>
> As can be seen on the following testcase, because it isn't folded
> (since GCC 12, before that we were folding it) we can end up with
> a SUBREG of a CONST_VECTOR or similar constant, which isn't valid
> general_operand, so we ICE during vregs pass trying to recognize
> the move instruction.
> Initially I thought it is a middle-end bug, the movxf instruction
> has general_operand predicate, but the middle-end certainly never
> tests that predicate, seems moves are special optabs.
> And looking at other mov optabs, e.g. for vector modes the i386
> patterns use nonimmediate_operand predicate on the input, yet
> ix86_expand_vector_move deals with CONSTANT_P and SUBREG of CONSTANT_P
> arguments which if the predicate was checked couldn't ever make it through.
>
> The following patch handles this case similarly to the
> ix86_expand_vector_move's SUBREG of CONSTANT_P case, does it just for XFmode
> because I believe that is the only mode that needs it from the scalar ones,
> others should just be folded.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2024-03-04  Jakub Jelinek  
>
> PR target/114184
> * config/i386/i386-expand.cc (ix86_expand_move): If XFmode op1
> is SUBREG of CONSTANT_P, force the SUBREG_REG into memory or
> register.
>
> * gcc.target/i386/pr114184.c: New test.

OK, with a question inline.

Thanks,
Uros.

>
> --- gcc/config/i386/i386-expand.cc.jj   2024-03-01 14:56:34.120925989 +0100
> +++ gcc/config/i386/i386-expand.cc  2024-03-03 18:41:08.278793046 +0100
> @@ -451,6 +451,20 @@ ix86_expand_move (machine_mode mode, rtx
>   && GET_MODE (SUBREG_REG (op1)) == DImode
>   && SUBREG_BYTE (op1) == 0)
> op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
> +  /* As not all values in XFmode are representable in real_value,
> +we might be called with unfoldable SUBREGs of constants.  */
> +  if (mode == XFmode
> + && CONSTANT_P (SUBREG_REG (op1))
> + && can_create_pseudo_p ())

We have quite a few unguarded force_regs in ix86_expand_move. While it
doesn't hurt to have an extra safety net, is there a particular reason
for the can_create_pseudo_p check in the added code?

> +   {
> + machine_mode imode = GET_MODE (SUBREG_REG (op1));
> + rtx r = force_const_mem (imode, SUBREG_REG (op1));
> + if (r)
> +   r = validize_mem (r);
> + else
> +   r = force_reg (imode, SUBREG_REG (op1));
> + op1 = simplify_gen_subreg (mode, r, imode, SUBREG_BYTE (op1));
> +   }
>break;
>  }
>
> --- gcc/testsuite/gcc.target/i386/pr114184.c.jj 2024-03-03 18:45:45.912964030 +0100
> +++ gcc/testsuite/gcc.target/i386/pr114184.c 2024-03-03 18:45:37.639078138 +0100
> @@ -0,0 +1,22 @@
> +/* PR target/114184 */
> +/* { dg-do compile } */
> +/* { dg-options "-Og -mavx2" } */
> +
> +typedef unsigned char V __attribute__((vector_size (32)));
> +typedef unsigned char W __attribute__((vector_size (16)));
> +
> +_Complex long double
> +foo (void)
> +{
> +  _Complex long double d;
> +  *(V *) &d = (V) { 149, 136, 89, 42, 38, 240, 196, 194 };
> +  return d;
> +}
> +
> +long double
> +bar (void)
> +{
> +  long double d;
> +  *(W *) &d = (W) { 149, 136, 89, 42, 38, 240, 196, 194 };
> +  return d;
> +}
>
> Jakub
>


[PATCH] i386: Fix ICEs with SUBREGs from vector etc. constants to XFmode [PR114184]

2024-03-04 Thread Jakub Jelinek
Hi!

The Intel extended format has the various weird number categories,
pseudo denormals, pseudo infinities, pseudo NaNs and unnormals.
Those are not representable in the GCC real_value and so neither
GIMPLE nor RTX VIEW_CONVERT_EXPR/SUBREG folding folds those into
constants.
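
A rough sketch of how those categories fall out of the 80-bit encoding,
working on the raw fields rather than on long double so it does not depend
on the host.  The enum and function names are hypothetical; the rules are
the usual ones for the x87 format, where bit 63 of the significand is an
explicit integer bit.

```c
#include <assert.h>
#include <stdint.h>

enum x87_class
{
  X87_NORMAL,
  X87_ZERO_OR_DENORMAL,
  X87_INF_OR_NAN,
  X87_PSEUDO_DENORMAL,
  X87_PSEUDO_INF_OR_NAN,
  X87_UNNORMAL
};

/* Classify an x87 extended value from its 16-bit sign/exponent field and
   its 64-bit significand.  */
static enum x87_class
x87_classify (uint16_t sign_exp, uint64_t significand)
{
  unsigned exp = sign_exp & 0x7fff;
  int integer_bit = (int) (significand >> 63);

  if (exp == 0)
    return integer_bit ? X87_PSEUDO_DENORMAL : X87_ZERO_OR_DENORMAL;
  if (exp == 0x7fff)
    return integer_bit ? X87_INF_OR_NAN : X87_PSEUDO_INF_OR_NAN;
  return integer_bit ? X87_NORMAL : X87_UNNORMAL;
}
```

Read as little-endian bytes, the initializer used in the testcase gives a
significand of 0xc2c4f0262a598895 with a zero exponent field, which this
sketch classifies as a pseudo denormal.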

As can be seen on the following testcase, because it isn't folded
(since GCC 12, before that we were folding it) we can end up with
a SUBREG of a CONST_VECTOR or similar constant, which isn't valid
general_operand, so we ICE during vregs pass trying to recognize
the move instruction.
Initially I thought it is a middle-end bug, the movxf instruction
has general_operand predicate, but the middle-end certainly never
tests that predicate, seems moves are special optabs.
And looking at other mov optabs, e.g. for vector modes the i386
patterns use nonimmediate_operand predicate on the input, yet
ix86_expand_vector_move deals with CONSTANT_P and SUBREG of CONSTANT_P
arguments which if the predicate was checked couldn't ever make it through.

The following patch handles this case similarly to the
ix86_expand_vector_move's SUBREG of CONSTANT_P case, does it just for XFmode
because I believe that is the only mode that needs it from the scalar ones,
others should just be folded.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-03-04  Jakub Jelinek  

PR target/114184
* config/i386/i386-expand.cc (ix86_expand_move): If XFmode op1
is SUBREG of CONSTANT_P, force the SUBREG_REG into memory or
register.

* gcc.target/i386/pr114184.c: New test.

--- gcc/config/i386/i386-expand.cc.jj   2024-03-01 14:56:34.120925989 +0100
+++ gcc/config/i386/i386-expand.cc  2024-03-03 18:41:08.278793046 +0100
@@ -451,6 +451,20 @@ ix86_expand_move (machine_mode mode, rtx
  && GET_MODE (SUBREG_REG (op1)) == DImode
  && SUBREG_BYTE (op1) == 0)
op1 = gen_rtx_ZERO_EXTEND (TImode, SUBREG_REG (op1));
+  /* As not all values in XFmode are representable in real_value,
+     we might be called with unfoldable SUBREGs of constants.  */
+  if (mode == XFmode
+      && CONSTANT_P (SUBREG_REG (op1))
+      && can_create_pseudo_p ())
+    {
+      machine_mode imode = GET_MODE (SUBREG_REG (op1));
+      rtx r = force_const_mem (imode, SUBREG_REG (op1));
+      if (r)
+        r = validize_mem (r);
+      else
+        r = force_reg (imode, SUBREG_REG (op1));
+      op1 = simplify_gen_subreg (mode, r, imode, SUBREG_BYTE (op1));
+    }
   break;
 }
 
--- gcc/testsuite/gcc.target/i386/pr114184.c.jj 2024-03-03 18:45:45.912964030 +0100
+++ gcc/testsuite/gcc.target/i386/pr114184.c 2024-03-03 18:45:37.639078138 +0100
@@ -0,0 +1,22 @@
+/* PR target/114184 */
+/* { dg-do compile } */
+/* { dg-options "-Og -mavx2" } */
+
+typedef unsigned char V __attribute__((vector_size (32)));
+typedef unsigned char W __attribute__((vector_size (16)));
+
+_Complex long double
+foo (void)
+{
+  _Complex long double d;
+  *(V *) &d = (V) { 149, 136, 89, 42, 38, 240, 196, 194 };
+  return d;
+}
+
+long double
+bar (void)
+{
+  long double d;
+  *(W *) &d = (W) { 149, 136, 89, 42, 38, 240, 196, 194 };
+  return d;
+}

Jakub



[PATCH] arm: Force flag_pic for FDPIC

2024-03-04 Thread Fangrui Song
From: Fangrui Song 

Code generated with -fno-pic -mfdpic is like regular -fno-pic code and is
not suitable for FDPIC: it uses absolute addressing for symbol references
and no function descriptors.  The sh port simply upgrades -fno-pic to
-fpie by setting flag_pic.  Let's follow suit.
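
For readers unfamiliar with function descriptors, here is a rough
plain-C model of the idea, not of what GCC emits.  Under FDPIC a
function pointer refers to a small record holding the code address
and the GOT (data base) address of the module that owns the code.
All names below are hypothetical.

```c
/* Hypothetical model of a function descriptor: a code address plus
   the data base (GOT) of the module that owns the code.  */
struct funcdesc
{
  int (*entry) (const int *got, int arg);
  const int *got;
};

/* Stand-in for position-independent code: it reaches its data only
   through the base it is handed, never through an absolute address.  */
static int
read_slot_plus (const int *got, int arg)
{
  return got[0] + arg;
}

/* Two "modules" sharing the same code but with separate data.  */
static const int got_a[] = { 100 };
static const int got_b[] = { 200 };

static const struct funcdesc desc_a = { read_slot_plus, got_a };
static const struct funcdesc desc_b = { read_slot_plus, got_b };

/* An indirect call loads both fields of the descriptor and passes
   the data base along with the call.  */
static int
call_via_descriptor (const struct funcdesc *fd, int arg)
{
  return fd->entry (fd->got, arg);
}
```

Calling call_via_descriptor (&desc_a, 1) and
call_via_descriptor (&desc_b, 1) runs the same code against different
data.  Code compiled as plain -fno-pic would instead hard-code the
data addresses, which is why it does not fit FDPIC.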

Link: https://inbox.sourceware.org/gcc-patches/20150913165303.gc17...@brightrain.aerifal.cx/

gcc/ChangeLog:

* config/arm/arm.cc (arm_option_override): Set flag_pic if
  TARGET_FDPIC.

gcc/testsuite/ChangeLog:

* gcc.target/arm/fdpic-pie.c: New test.
---
 gcc/config/arm/arm.cc|  6 +
 gcc/testsuite/gcc.target/arm/fdpic-pie.c | 30 
 2 files changed, 36 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/fdpic-pie.c

diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 1cd69268ee9..f2fd3cce48c 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -3682,6 +3682,12 @@ arm_option_override (void)
   arm_pic_register = FDPIC_REGNUM;
   if (TARGET_THUMB1)
sorry ("FDPIC mode is not supported in Thumb-1 mode");
+
+  /* FDPIC code is a special form of PIC, and the vast majority of code
+     generation constraints that apply to PIC also apply to FDPIC, so we
+     set flag_pic to avoid the need to check TARGET_FDPIC everywhere
+     flag_pic is checked. */
+  flag_pic = 2;
 }
 
   if (arm_pic_register_string != NULL)
diff --git a/gcc/testsuite/gcc.target/arm/fdpic-pie.c b/gcc/testsuite/gcc.target/arm/fdpic-pie.c
new file mode 100644
index 000..909db8bce74
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/fdpic-pie.c
@@ -0,0 +1,30 @@
+// { dg-do compile }
+// { dg-options "-O2 -fno-pic -mfdpic" }
+// { dg-skip-if "-mpure-code and -fPIC incompatible" { *-*-* } { "-mpure-code" } }
+
+__attribute__((visibility("hidden"))) void hidden_fun(void);
+void fun(void);
+__attribute__((visibility("hidden"))) extern int hidden_var;
+extern int var;
+__attribute__((visibility("hidden"))) const int ro_hidden_var = 42;
+
+// { dg-final { scan-assembler "hidden_fun\\(GOTOFFFUNCDESC\\)" } }
+void *addr_hidden_fun(void) { return hidden_fun; }
+
+// { dg-final { scan-assembler "fun\\(GOTFUNCDESC\\)" } }
+void *addr_fun(void) { return fun; }
+
+// { dg-final { scan-assembler "hidden_var\\(GOT\\)" } }
+void *addr_hidden_var(void) { return &hidden_var; }
+
+// { dg-final { scan-assembler "var\\(GOT\\)" } }
+void *addr_var(void) { return &var; }
+
+// { dg-final { scan-assembler ".LANCHOR0\\(GOT\\)" } }
+const int *addr_ro_hidden_var(void) { return &ro_hidden_var; }
+
+// { dg-final { scan-assembler "hidden_var\\(GOT\\)" } }
+int read_hidden_var(void) { return hidden_var; }
+
+// { dg-final { scan-assembler "var\\(GOT\\)" } }
+int read_var(void) { return var; }
-- 
2.44.0.rc1.240.g4c46232300-goog



Re: [PATCH] rtl-optimization/113597 - recover base term for argument pointers

2024-03-04 Thread Richard Biener
On Sun, 3 Mar 2024, Jeff Law wrote:

> 
> 
> On 2/9/24 03:26, Richard Biener wrote:
> > The following allows a base term to be derived from an existing
> > MEM_EXPR, notably the points-to set of a MEM_REF base.  For the
> > testcase in the PR this helps RTL DSE elide stores to a stack
> > temporary.  This covers pointers to NONLOCAL which can be mapped
> > to arg_base_value, helping to disambiguate against other special
> > bases (ADDRESS) as well as PARM_DECL accesses.
> I like it and as you note later, it's extendable.
> 
> > 
> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > 
> > This is an attempt to recover some of the losses from dumbing down
> > find_base_{term,value}.  I did give my ideas how to properly do
> > this during stage1 a start, I will post a short incomplete RFC series
> > later today.
> I saw those, but set them aside for gcc-15.
> 
> > 
> > OK for trunk?
> > 
> > I've included all languages in testing and also tested with -m32 but
> > details of RTL alias analysis might escape me ...
> > 
> > Thanks,
> > Richard.
> > 
> >  PR rtl-optimization/113597
> >  * alias.cc (find_base_term): Add argument for the whole mem
> >  and derive a base term from its MEM_EXPR.
> >  (true_dependence_1): Pass down the MEMs to find_base_term.
> >  (write_dependence_p): Likewise.
> >  (may_alias_p): Likewise.
> I'd lean ever so slightly against including this.  Not because I see anything
> wrong, more so because we don't have a lot of time for this to shake out if
> there are any problems.  But I wouldn't go as far as to say I object to
> including it.
> 
> So OK for the trunk if you want to go forward now.  Or defer if you want to
> take the somewhat safer route of waiting to gcc-15 to tackle this.

There was fallout (an arm bootstrap failure) reported, so I'm deferring
it to GCC 15, for which I posted another RFC series.  I do admit that I
can't promise to finish anything here.  Luckily the reported fallout was
not too bad, or maybe nobody has noticed yet.

Richard.