Re: [aarch64] Use exact_log2 (INTVAL (operands[2])) >= 0 to gate for vec_merge patterns.

2023-01-18 Thread Prathamesh Kulkarni via Gcc-patches
On Wed, 18 Jan 2023 at 20:00, Richard Sandiford
 wrote:
>
> Prathamesh Kulkarni  writes:
> > Hi Richard,
> > Based on your suggestion in the other thread, the patch uses
> > exact_log2 (INTVAL (operands[2])) >= 0 to gate for vec_merge patterns.
> > Bootstrap+test in progress on aarch64-linux-gnu.
> > Does it look OK ?
>
> Yeah, this is OK, thanks.  IMO it's a latent bug and suitable for stage 4.
Thanks, pushed in 22c75b4ed94bd731cb6e37c507de1d91954a17cf.

Thanks,
Prathamesh
>
> Richard
>
> >
> > Thanks,
> > Prathamesh
> >
> > [aarch64] Use exact_log2 (INTVAL (operands[2])) >= 0 to gate for vec_merge 
> > patterns.
> >
> > gcc/ChangeLog:
> >   * gcc/config/aarch64-simd.md (aarch64_simd_vec_set): Use
> >   exact_log2 (INTVAL (operands[2])) >= 0 as condition for gating
> >   the pattern.
> >   (aarch64_simd_vec_copy_lane): Likewise.
> >   (aarch64_simd_vec_copy_lane_): Likewise.
> >
> > diff --git a/gcc/config/aarch64/aarch64-simd.md 
> > b/gcc/config/aarch64/aarch64-simd.md
> > index 104088f67d2..7cc8c00f0ec 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -1064,7 +1064,7 @@
> >   (match_operand: 1 "aarch64_simd_nonimmediate_operand" 
> > "w,?r,Utv"))
> >   (match_operand:VALL_F16 3 "register_operand" "0,0,0")
> >   (match_operand:SI 2 "immediate_operand" "i,i,i")))]
> > -  "TARGET_SIMD"
> > +  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
> >{
> > int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
> > operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
> > @@ -1093,7 +1093,7 @@
> > [(match_operand:SI 4 "immediate_operand" "i")])))
> >   (match_operand:VALL_F16 1 "register_operand" "0")
> >   (match_operand:SI 2 "immediate_operand" "i")))]
> > -  "TARGET_SIMD"
> > +  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
> >{
> >  int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
> >  operands[2] = GEN_INT (HOST_WIDE_INT_1 << elt);
> > @@ -1114,7 +1114,7 @@
> > [(match_operand:SI 4 "immediate_operand" "i")])))
> >   (match_operand:VALL_F16_NO_V2Q 1 "register_operand" "0")
> >   (match_operand:SI 2 "immediate_operand" "i")))]
> > -  "TARGET_SIMD"
> > +  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
> >{
> >  int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
> >  operands[2] = GEN_INT (HOST_WIDE_INT_1 << elt);
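
The gating condition above relies on exact_log2 returning -1 when its argument is not a power of two. A minimal standalone sketch of that behaviour follows; exact_log2_sketch and vec_merge_mask_selects_one_lane are illustrative names, not GCC's own code, and the mask is assumed to fit in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for GCC's exact_log2: log2 of x when x is a power of two, else -1.
int exact_log2_sketch(std::uint64_t x) {
    if (x == 0 || (x & (x - 1)) != 0)
        return -1;
    int n = 0;
    while ((x >>= 1) != 0)
        ++n;
    return n;
}

// Mirrors the new insn condition: the vec_merge mask must have exactly one
// bit set, so that it names a single lane.
bool vec_merge_mask_selects_one_lane(std::int64_t mask) {
    return exact_log2_sketch(static_cast<std::uint64_t>(mask)) >= 0;
}
```

With a mask such as 3 or 0 the helper returns false, which corresponds to the pattern no longer matching instead of computing a lane index from -1.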


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Richard Biener via Gcc-patches
On Wed, Jan 18, 2023 at 4:07 PM Jan Hubicka  wrote:
>
> > No unwind tables are generated, as if -funwind-tables is ignored.  If
> > LTO is disabled, everything works as expected.
> I think it is because dwarf2out_do_eh_frame is called out of function
> context at the end of compilation. At that time cfun is NULL
> and the flag is read from global settings that are wrong.
> So we need to bookkeep whether we saw a function that needs EH tables or not.

I think we do that with the do_eh_frame variable which was introduced to
fix a similar problem (PR81351).  But then dwarf2out_do_eh_frame seems
to be used both at a function and at the TU level and the latter now should
check do_eh_frame?  (and not be called early).  That is, the caller in
dwarf2out_do_frame wants to check that variable.

I see the first invocation of the function with cfun == null from c_cpp_builtins
but all others seem to have cfun set?

Richard.

> Honza
> >
> > --
> > Andreas Schwab, SUSE Labs, sch...@suse.de
> > GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> > "And now for something completely different."
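
The bookkeeping idea discussed above (remember that some function needed EH tables, and consult that at end of compilation when cfun is NULL) can be sketched as a toy model. Every name here is hypothetical and none of it is taken from dwarf2out; it only illustrates a sticky flag that outlives the per-function context.

```cpp
#include <cassert>

// Hypothetical per-function settings; stands in for flags reached via cfun.
struct FunctionOptions {
    bool unwind_tables;
};

// Hypothetical translation-unit state.
struct UnitState {
    bool global_unwind_tables;  // global setting, possibly wrong under LTO
    bool saw_eh_frame_user;     // sticky: set once any function needs tables
};

// Called while a function is being compiled.
void note_function(UnitState &unit, const FunctionOptions &fn) {
    if (fn.unwind_tables)
        unit.saw_eh_frame_user = true;
}

// Called at end of compilation, when there is no current function.
bool want_eh_frame_at_unit_end(const UnitState &unit) {
    return unit.saw_eh_frame_user || unit.global_unwind_tables;
}
```

The point of the sketch is only that the end-of-unit query does not look at the (possibly wrong) global setting alone.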


[PATCH] RISC-V: Fix pred_mov constraint for vle.v

2023-01-18 Thread juzhe . zhong
From: Ju-Zhe Zhong 

The original constraint in the pred_mov pattern is incorrect.
In alternative 2, operands[0] is "vr" while operands[1], the mask
operand, can be "vm".  Matching this alternative can produce wrong
codegen (vle.v v0,0(a5),v0.t), which is illegal according to the RVV ISA.

This patch fixes the pattern without hurting register allocation (RA)
performance.
 
gcc/ChangeLog:

* config/riscv/vector.md: Fix constraints.

---
 gcc/config/riscv/vector.md | 29 +++--
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md
index 48414e200cf..e1173f2d5a6 100644
--- a/gcc/config/riscv/vector.md
+++ b/gcc/config/riscv/vector.md
@@ -633,22 +633,23 @@
 ;;2. (const_vector:VNx1SF repeat [
 ;;(const_double:SF 0.0 [0x0.0p+0])]).
 (define_insn_and_split "@pred_mov"
-  [(set (match_operand:V 0 "nonimmediate_operand"  "=vd,vr, m, 
   vr,vr")
-   (if_then_else:V
- (unspec:
-   [(match_operand: 1 "vector_mask_operand" "vmWc1, vmWc1, vmWc1,  
 Wc1,   Wc1")
-(match_operand 4 "vector_length_operand""   rK,rK,rK,  
  rK,rK")
-(match_operand 5 "const_int_operand""i, i, i,  
   i, i")
-(match_operand 6 "const_int_operand""i, i, i,  
   i, i")
-(match_operand 7 "const_int_operand""i, i, i,  
   i, i")
-(reg:SI VL_REGNUM)
-(reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
- (match_operand:V 3 "vector_move_operand"   "m, m,vr,  
  vr, viWc0")
- (match_operand:V 2 "vector_merge_operand"  "0,vu,vu,  
 vu0,   vu0")))]
+  [(set (match_operand:V 0 "nonimmediate_operand"  "=vr,vr,vd, 
m,vr,vr")
+(if_then_else:V
+  (unspec:
+[(match_operand: 1 "vector_mask_operand" "vmWc1,   Wc1,vm, 
vmWc1,   Wc1,   Wc1")
+ (match_operand 4 "vector_length_operand""   rK,rK,rK,
rK,rK,rK")
+ (match_operand 5 "const_int_operand""i, i, i, 
i, i, i")
+ (match_operand 6 "const_int_operand""i, i, i, 
i, i, i")
+ (match_operand 7 "const_int_operand""i, i, i, 
i, i, i")
+ (reg:SI VL_REGNUM)
+ (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
+  (match_operand:V 3 "vector_move_operand"   "m, m, m,
vr,vr, viWc0")
+  (match_operand:V 2 "vector_merge_operand"  "0,vu,vu,
vu,   vu0,   vu0")))]
   "TARGET_VECTOR"
   "@
vle.v\t%0,%3%p1
-   vle.v\t%0,%3%p1
+   vle.v\t%0,%3
+   vle.v\t%0,%3,%1.t
vse.v\t%3,%0%p1
vmv.v.v\t%0,%3
vmv.v.i\t%0,%v3"
@@ -657,7 +658,7 @@
&& satisfies_constraint_vu (operands[2])"
   [(set (match_dup 0) (match_dup 3))]
   ""
-  [(set_attr "type" "vlde,vlde,vste,vimov,vimov")
+  [(set_attr "type" "vlde,vlde,vlde,vste,vimov,vimov")
(set_attr "mode" "")])
 
 ;; Dedicated pattern for vse.v instruction since we can't reuse pred_mov 
pattern to include
-- 
2.36.3
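
The rule behind the fix can be sketched as a small predicate: a load masked by v0 must not also write v0. The types and names below are hypothetical and ignore register grouping; they only model the single case called out in the commit message.

```cpp
#include <cassert>

// Hypothetical model of one vle.v operand choice.
struct VleOperands {
    int dest_vreg;  // 0..31, i.e. v0..v31
    bool masked;    // true when the instruction carries the v0.t suffix
};

// A masked load may not use v0 as destination, since v0 holds the mask.
bool is_legal_vle(const VleOperands &ops) {
    return !(ops.masked && ops.dest_vreg == 0);
}
```

Under this model "vle.v v0,0(a5),v0.t" is rejected, while the unmasked "vle.v v0,0(a5)" and a masked load into any other register are accepted.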



Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-18 Thread Richard Biener via Gcc-patches
On Wed, 18 Jan 2023, Jan Hubicka wrote:

> > On Tue, 17 Jan 2023, Jan Hubicka wrote:
> > 
> > > > > We don't use same argumentation about other control flow statements.
> > > > > The following:
> > > > > 
> > > > > fn()
> > > > > {
> > > > >   try {
> > > > > i_read_no_global_memory ();
> > > > >   } catch (...)
> > > > >   {
> > > > > > return 1;
> > > > >   }
> > > > >   return 0;
> > > > > }
> > > > > 
> > > > > should be detected as const.  Marking throw pure would make fn pure 
> > > > > too.
> > > > 
> > > > I suppose i_read_no_global_memory is const here.  Not sure why that
> > > Suppose we have:
> > > 
> > > void
> > > i_read_no_global_memory ()
> > > {
> > >   throw(0);
> > > }
> > > 
> > > If cxa_throw itself was annotated as 'p' rather than 'c' ipa-modref will
> > > believe that cxa_throw will read any global memory and will propagate it
> > > to all callers. So fn() will be also marked as reading all global
> > > memory.
> > 
> > Sure - but for the purpose of local optimizations in 
> > i_read_no_global_memory cxa_throw has to appear to read memory.
> 
> Yes, I think every stmt that can throw externally needs a VUSE (just like
> return_stmt needs it).  Even if throw(0) was replaced by a=b/c with
> -fnon-call-exceptions.  It is still not clear to me why this should
> imply that we need 'p' instead of 'c' in fnspecs.
> 
> So I think we should try to make the following work:
> 
> diff --git a/gcc/tree-ssa-operands.cc b/gcc/tree-ssa-operands.cc
> index 57e393ae164..d24f1721eb2 100644
> --- a/gcc/tree-ssa-operands.cc
> +++ b/gcc/tree-ssa-operands.cc
> @@ -951,6 +951,9 @@ operands_scanner::parse_ssa_operands ()
>enum gimple_code code = gimple_code (stmt);
>size_t i, n, start = 0;
>  
> +  if (stmt_can_throw_external (fn, stmt))
> +append_vuse (gimple_vop (fn));
> +
>switch (code)
>  {
>  case GIMPLE_ASM:

It's going to be a bit tricky since in many places we use
gimple_vuse () != NULL to check whether an assignment is a
load/store.  But yes, the above is sort-of what we'd need to do.

> > Having a VUSE there dependent on whether the function performs any
> > load or store would be quite ugly.  Instead modref could special-case
> > cxa_throw and not treat it as reading memory (like it already does
> > for the return stmt I suppose - that also has a VUSE).
> 
> modref looks into statements with VUSEs on them and checks what
> reads/stores are done.  So return statement with VUSE is walked and no
> load is recorded because no actual load is found.
> Similarly that would happen with __cxa_throw if it was 'c'.
> With 'p' it has nothing to analyze so it would trust the fact that
> cxa_throw itself reads some global state.

I see.  But does __cxa_throw stmt_can_throw_external ()?  Otherwise
the operand scanner elides VUSE on const function calls.

> > 
> > The problem is IIRC GIMPLE_RESX which doesn't derive from
> > gimple_statement_with_memory_ops_base.  There's a bugzilla I can't find
> > right now referring to this issue.
> 
> I never tried to play with the gimple hierarchy. Is it hard to fix resx?  I
> wonder if we have other cases.  I guess for a=b/c we are lucky just
> because gimple_assign can also be load or store so it has memory_ops...

Fixing resx would come at the cost of deriving from _with_ops, but not
sure if that waste of space is too important.

Richard.

> Thanks,
> Honza
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)
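
For reference, the two snippets quoted in this thread can be combined into one compilable unit, as sketched below. It only shows the shape of the code under discussion; whether GCC classifies fn as const or pure is exactly the open question above, and nothing here asserts an answer.

```cpp
#include <cassert>

// Reads no global memory; its only effect is to throw.
void i_read_no_global_memory() {
    throw 0;
}

// Catches everything the callee throws, so no exception escapes fn and its
// result does not depend on any global state.
int fn() {
    try {
        i_read_no_global_memory();
    } catch (...) {
        return 1;
    }
    return 0;
}
```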


[PATCH] RISC-V: Add vlm/vsm C/C++ API intrinsics support

2023-01-18 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc (BASE): Add vlm/vsm 
support.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vlm): New define.
(vsm): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct loadstore_def): 
Add vlm/vsm support.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_B_OPS): Ditto.
(vbool64_t): Ditto.
(vbool32_t): Ditto.
(vbool16_t): Ditto.
(vbool8_t): Ditto.
(vbool4_t): Ditto.
(vbool2_t): Ditto.
(vbool1_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_B_OPS): Ditto.
(rvv_arg_type_info::get_tree_type): Ditto.
(function_expander::use_contiguous_load_insn): Ditto.
* config/riscv/vector.md (@pred_store): Ditto.

gcc/testsuite/ChangeLog:

* g++.target/riscv/rvv/base/vsm-1.C: New test.
* g++.target/riscv/rvv/rvv.exp: New test.
* gcc.target/riscv/rvv/base/vlm_vsm-1.c: New test.
* gcc.target/riscv/rvv/base/vlm_vsm-2.c: New test.
* gcc.target/riscv/rvv/base/vlm_vsm-3.c: New test.

---
 .../riscv/riscv-vector-builtins-bases.cc  |  6 +-
 .../riscv/riscv-vector-builtins-bases.h   |  2 +
 .../riscv/riscv-vector-builtins-functions.def |  2 +
 .../riscv/riscv-vector-builtins-shapes.cc |  3 +-
 .../riscv/riscv-vector-builtins-types.def | 15 
 gcc/config/riscv/riscv-vector-builtins.cc | 43 ++-
 gcc/config/riscv/vector.md| 23 +-
 .../g++.target/riscv/rvv/base/vsm-1.C | 40 ++
 gcc/testsuite/g++.target/riscv/rvv/rvv.exp| 44 +++
 .../gcc.target/riscv/rvv/base/vlm_vsm-1.c | 75 +++
 .../gcc.target/riscv/rvv/base/vlm_vsm-2.c | 75 +++
 .../gcc.target/riscv/rvv/base/vlm_vsm-3.c | 75 +++
 12 files changed, 395 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/riscv/rvv/base/vsm-1.C
 create mode 100644 gcc/testsuite/g++.target/riscv/rvv/rvv.exp
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlm_vsm-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlm_vsm-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/vlm_vsm-3.c

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc 
b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index af66b016b49..0da4797d272 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -84,7 +84,7 @@ public:
   }
 };
 
-/* Implements vle.v/vse.v codegen.  */
+/* Implements vle.v/vse.v/vlm.v/vsm.v codegen.  */
 template 
 class loadstore : public function_base
 {
@@ -116,6 +116,8 @@ static CONSTEXPR const vsetvl vsetvl_obj;
 static CONSTEXPR const vsetvl vsetvlmax_obj;
 static CONSTEXPR const loadstore vle_obj;
 static CONSTEXPR const loadstore vse_obj;
+static CONSTEXPR const loadstore vlm_obj;
+static CONSTEXPR const loadstore vsm_obj;
 
 /* Declare the function base NAME, pointing it to an instance
of class _obj.  */
@@ -126,5 +128,7 @@ BASE (vsetvl)
 BASE (vsetvlmax)
 BASE (vle)
 BASE (vse)
+BASE (vlm)
+BASE (vsm)
 
 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.h 
b/gcc/config/riscv/riscv-vector-builtins-bases.h
index 79684bcb50d..28151a8d8d2 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.h
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.h
@@ -28,6 +28,8 @@ extern const function_base *const vsetvl;
 extern const function_base *const vsetvlmax;
 extern const function_base *const vle;
 extern const function_base *const vse;
+extern const function_base *const vlm;
+extern const function_base *const vsm;
 }
 
 } // end namespace riscv_vector
diff --git a/gcc/config/riscv/riscv-vector-builtins-functions.def 
b/gcc/config/riscv/riscv-vector-builtins-functions.def
index e5ebb7d829c..63aa8fe32c8 100644
--- a/gcc/config/riscv/riscv-vector-builtins-functions.def
+++ b/gcc/config/riscv/riscv-vector-builtins-functions.def
@@ -42,5 +42,7 @@ DEF_RVV_FUNCTION (vsetvlmax, vsetvlmax, none_preds, 
i_none_size_void_ops)
 /* 7. Vector Loads and Stores. */
 DEF_RVV_FUNCTION (vle, loadstore, full_preds, all_v_scalar_const_ptr_ops)
 DEF_RVV_FUNCTION (vse, loadstore, none_m_preds, all_v_scalar_ptr_ops)
+DEF_RVV_FUNCTION (vlm, loadstore, none_preds, b_v_scalar_const_ptr_ops)
+DEF_RVV_FUNCTION (vsm, loadstore, none_preds, b_v_scalar_ptr_ops)
 
 #undef DEF_RVV_FUNCTION
diff --git a/gcc/config/riscv/riscv-vector-builtins-shapes.cc 
b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
index 0332c031ce4..76cf14a8cc4 100644
--- a/gcc/config/riscv/riscv-vector-builtins-shapes.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-shapes.cc
@@ -116,7 +116,8 @@ struct loadstore_def : public build_base
 machine_mode mode = TYPE_MODE (type);
 int sew = GET_MODE_BITSIZE (GET_MODE_INNER (mode));
 

[PATCH v3] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-18 Thread Takayuki 'January June' Suwa via Gcc-patches
Register-register move instructions that can be easily seen as
unnecessary by the human eye may remain in the compiled result.
For example:

/* example */
double test(double a, double b) {
  return __builtin_copysign(a, b);
}

test:
add.n   a3, a3, a3
extui   a5, a5, 31, 1
ssai    1
;; be in the same BB
src a7, a5, a3  ;; No '0' in the source constraints
;; No CALL insns in this span
;; Both A3 and A7 are irrelevant to
;;   insns in this span
mov.n   a3, a7  ;; An unnecessary reg-reg move
;; A7 is not used after this
ret.n

The last two instructions above, excluding the return instruction,
could be done like this:

src a3, a5, a3

This symptom often occurs when handling DI/DFmode values with SImode
instructions.  This patch solves the above problem using a peephole2
pattern.

gcc/ChangeLog:

* config/xtensa/xtensa.md: New peephole2 pattern that eliminates
the occurrence of general-purpose register used only once and for
transferring intermediate value.
---
 gcc/config/xtensa/xtensa.md | 45 +
 1 file changed, 45 insertions(+)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index abcab231d8e..517dcecf2c1 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -3110,3 +3110,48 @@ FALLTHRU:;
   df_insn_rescan (insnR);
   set_insn_deleted (insnP);
 })
+
+(define_peephole2
+  [(set (match_operand 0 "register_operand")
+   (match_operand 1 "register_operand"))]
+  "GET_MODE_SIZE (GET_MODE (operands[0])) == 4
+   && GET_MODE_SIZE (GET_MODE (operands[1])) == 4
+   && GP_REG_P (REGNO (operands[0])) && GP_REG_P (REGNO (operands[1]))
+   && peep2_reg_dead_p (1, operands[1])"
+  [(const_int 0)]
+{
+  basic_block bb = BLOCK_FOR_INSN (curr_insn);
+  rtx_insn *head = BB_HEAD (bb), *insn;
+  rtx dest = operands[0], src = operands[1], pattern, t_dest;
+  int i;
+  for (insn = PREV_INSN (curr_insn);
+   insn && insn != head;
+   insn = PREV_INSN (insn))
+if (CALL_P (insn))
+  break;
+else if (INSN_P (insn))
+  {
+   if (GET_CODE (pattern = PATTERN (insn)) == SET
+   && REG_P (t_dest = SET_DEST (pattern))
+   && GET_MODE_SIZE (GET_MODE (t_dest)) == 4
+   && REGNO (t_dest) == REGNO (src))
+   {
+ extract_constrain_insn (insn);
+ for (i = 1; i < recog_data.n_operands; ++i)
+   if (strchr (recog_data.constraints[i], '0'))
+ goto ABORT;
+ SET_DEST (pattern) = gen_rtx_REG (GET_MODE (t_dest),
+   REGNO (dest));
+ df_insn_rescan (insn);
+ goto FALLTHRU;
+   }
+   if (reg_overlap_mentioned_p (dest, pattern)
+   || reg_overlap_mentioned_p (src, pattern)
+   || set_of (dest, insn)
+   || set_of (src, insn))
+ break;
+  }
+ABORT:
+  FAIL;
+FALLTHRU:;
+})
-- 
2.30.2
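
The backward scan performed by the peephole2 above can be sketched on a toy instruction list. The Insn type and eliminate_move are hypothetical simplifications (one destination per insn, no modes, no basic-block head handling), not the RTL code in the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, heavily simplified instruction.
struct Insn {
    int dest;               // register written, or -1 for none
    std::vector<int> srcs;  // registers read
    bool is_call;           // CALL insns stop the scan
    bool dest_tied;         // models a '0' constraint tying a source to dest
};

bool mentions(const Insn &insn, int reg) {
    if (insn.dest == reg)
        return true;
    for (int s : insn.srcs)
        if (s == reg)
            return true;
    return false;
}

// insns[mov_index] is "mov dest, src" with src dead afterwards.  Scan
// backwards for the insn that sets src; if nothing in between is a call or
// touches dest or src, retarget that insn to write dest and drop the move.
bool eliminate_move(std::vector<Insn> &insns, std::size_t mov_index) {
    const int dest = insns[mov_index].dest;
    const int src = insns[mov_index].srcs.at(0);
    for (std::size_t i = mov_index; i-- > 0;) {
        Insn &insn = insns[i];
        if (insn.is_call)
            return false;
        if (insn.dest == src) {
            if (insn.dest_tied)
                return false;
            insn.dest = dest;
            insns.erase(insns.begin() + static_cast<std::ptrdiff_t>(mov_index));
            return true;
        }
        if (mentions(insn, dest) || mentions(insn, src))
            return false;
    }
    return false;
}
```

In the copysign example, "src a7, a5, a3" followed by "mov.n a3, a7" becomes "src a3, a5, a3": the defining insn is found before the overlap check runs, so its own use of a3 as a source does not block the rewrite.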


[PATCH v4] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-18 Thread Takayuki 'January June' Suwa via Gcc-patches
In the previous patch, if insn is JUMP_INSN or CALL_INSN, it bypasses the reg 
check (possibly FAIL).

=
In the case of the CALL0 ABI, values that must be retained before and
after function calls are placed in the callee-saved registers (A12
through A15) and referenced later.  However, it is often the case that
the save and the reference each occur only once, each as a simple
register-register move.  (The frame pointer is needed to recover the
stack pointer and must be excluded.)

For example, in the following code, if there are no other occurrences of
register A14:

;; before
; prologue {
  ...
s32i.n  a14, sp, 16
  ...
; } prologue
  ...
mov.n   a14, a6
  ...
call0   foo
  ...
mov.n   a8, a14
  ...
; epilogue {
  ...
l32i.n  a14, sp, 16
  ...
; } epilogue

It can be rewritten like this:

;; after
; prologue {
  ...
(deleted)
  ...
; } prologue
  ...
s32i.n  a6, sp, 16
  ...
call0   foo
  ...
l32i.n  a8, sp, 16
  ...
; epilogue {
  ...
(deleted)
  ...
; } epilogue

This patch introduces a new peephole2 pattern that implements the above.

gcc/ChangeLog:

* config/xtensa/xtensa.md: New peephole2 pattern that eliminates
the use of callee-saved register that saves and restores only once
for other register, by using its stack slot directly.
---
 gcc/config/xtensa/xtensa.md | 62 +
 1 file changed, 62 insertions(+)

diff --git a/gcc/config/xtensa/xtensa.md b/gcc/config/xtensa/xtensa.md
index 4f1e8fd13..ac04ef6f0 100644
--- a/gcc/config/xtensa/xtensa.md
+++ b/gcc/config/xtensa/xtensa.md
@@ -3029,3 +3029,65 @@ FALLTHRU:;
   operands[1] = GEN_INT (imm0);
   operands[2] = GEN_INT (imm1);
 })
+
+(define_peephole2
+  [(set (match_operand:SI 0 "register_operand")
+   (match_operand:SI 1 "reload_operand"))]
+  "!TARGET_WINDOWED_ABI && df
+   && epilogue_contains (insn)
+   && ! call_used_or_fixed_reg_p (REGNO (operands[0]))
+   && (!frame_pointer_needed
+   || REGNO (operands[0]) != HARD_FRAME_POINTER_REGNUM)"
+  [(const_int 0)]
+{
+  rtx reg = operands[0], pattern;
+  rtx_insn *insnP = NULL, *insnS = NULL, *insnR = NULL;
+  df_ref ref;
+  rtx_insn *insn;
+  for (ref = DF_REG_DEF_CHAIN (REGNO (reg));
+   ref; ref = DF_REF_NEXT_REG (ref))
+if (DF_REF_CLASS (ref) != DF_REF_REGULAR
+   || DEBUG_INSN_P (insn = DF_REF_INSN (ref)))
+  continue;
+else if (insn == curr_insn)
+  continue;
+else if (GET_CODE (pattern = PATTERN (insn)) == SET
+&& rtx_equal_p (SET_DEST (pattern), reg)
+&& REG_P (SET_SRC (pattern)))
+  {
+   if (insnS)
+ FAIL;
+   insnS = insn;
+   continue;
+  }
+else
+  FAIL;
+  for (ref = DF_REG_USE_CHAIN (REGNO (reg));
+   ref; ref = DF_REF_NEXT_REG (ref))
+if (DF_REF_CLASS (ref) != DF_REF_REGULAR
+   || DEBUG_INSN_P (insn = DF_REF_INSN (ref)))
+  continue;
+else if (prologue_contains (insn))
+  {
+   insnP = insn;
+   continue;
+  }
+else if (GET_CODE (pattern = PATTERN (insn)) == SET
+&& rtx_equal_p (SET_SRC (pattern), reg)
+&& REG_P (SET_DEST (pattern)))
+  {
+   if (insnR)
+ FAIL;
+   insnR = insn;
+   continue;
+  }
+else
+  FAIL;
+  if (!insnP || !insnS || !insnR)
+FAIL;
+  SET_DEST (PATTERN (insnS)) = copy_rtx (operands[1]);
+  df_insn_rescan (insnS);
+  SET_SRC (PATTERN (insnR)) = copy_rtx (operands[1]);
+  df_insn_rescan (insnR);
+  set_insn_deleted (insnP);
+})
-- 
2.30.2


Re: [PATCH] x86: Check invalid third argument to __builtin_ia32_prefetch

2023-01-18 Thread Hongtao Liu via Gcc-patches
On Thu, Jan 19, 2023 at 3:12 AM H.J. Lu via Gcc-patches
 wrote:
>
> Check invalid third argument to __builtin_ia32_prefetch when expanding
> __builtin_ia32_prefetch to avoid ICE later.
Ok, thanks.
>
> gcc/
>
> PR target/108436
> * config/i386/i386-expand.cc (ix86_expand_builtin): Check
> invalid third argument to __builtin_ia32_prefetch.
>
> gcc/testsuite/
>
> * gcc.target/i386/pr108436.c: New test.
> ---
>  gcc/config/i386/i386-expand.cc   | 12 
>  gcc/testsuite/gcc.target/i386/pr108436.c | 15 +++
>  2 files changed, 27 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr108436.c
>
> diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
> index 54f700cd09d..e2e2d28bb47 100644
> --- a/gcc/config/i386/i386-expand.cc
> +++ b/gcc/config/i386/i386-expand.cc
> @@ -13175,6 +13175,12 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
> subtarget,
>
> if (INTVAL (op3) == 1)
>   {
> +   if (INTVAL (op2) < 2 || INTVAL (op2) > 3)
> + {
> +   error ("invalid third argument");
> +   return const0_rtx;
> + }
> +
> if (TARGET_64BIT && TARGET_PREFETCHI
> && local_func_symbolic_operand (op0, GET_MODE (op0)))
>   emit_insn (gen_prefetchi (op0, op2));
> @@ -13195,6 +13201,12 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
> subtarget,
> op0 = copy_addr_to_reg (op0);
>   }
>
> +   if (INTVAL (op2) < 0 || INTVAL (op2) > 3)
> + {
> +   warning (0, "invalid third argument to 
> %<__builtin_ia32_prefetch%>; using zero");
> +   op2 = const0_rtx;
> + }
> +
> if (TARGET_3DNOW || TARGET_PREFETCH_SSE
> || TARGET_PRFCHW || TARGET_PREFETCHWT1)
>   emit_insn (gen_prefetch (op0, op1, op2));
> diff --git a/gcc/testsuite/gcc.target/i386/pr108436.c 
> b/gcc/testsuite/gcc.target/i386/pr108436.c
> new file mode 100644
> index 000..d51f25863a5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr108436.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mprefetchi" } */
> +
> +int
> +foo (int a)
> +{
> +  return a + 1;
> +}
> +
> +void
> +bad (int *p)
> +{
> +  __builtin_ia32_prefetch (p, 0, 4, 0);   /* { dg-warning "invalid third 
> argument to '__builtin_ia32_prefetch'; using zero" } */
> +  __builtin_ia32_prefetch (foo, 0, 4, 1);   /* { dg-error "invalid third 
> argument" } */
> +}
> --
> 2.39.0
>


-- 
BR,
Hongtao
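
The two range checks added by the patch can be summarised in a standalone sketch. The enum, struct and function names are hypothetical; only the ranges and the warn-versus-error split are taken from the diff and its test.

```cpp
#include <cassert>

enum class Check { Ok, WarnUseZero, Error };

struct Result {
    Check status;
    int third_arg;  // value actually used; not meaningful when status is Error
};

// third_arg is the third argument to __builtin_ia32_prefetch;
// is_instruction_prefetch is true when the fourth argument equals 1.
Result check_prefetch_third_arg(int third_arg, bool is_instruction_prefetch) {
    if (is_instruction_prefetch) {
        // Instruction prefetch: only 2 and 3 are accepted, anything else
        // is a hard error.
        if (third_arg < 2 || third_arg > 3)
            return {Check::Error, 0};
        return {Check::Ok, third_arg};
    }
    // Data prefetch: out-of-range values are diagnosed and replaced by zero.
    if (third_arg < 0 || third_arg > 3)
        return {Check::WarnUseZero, 0};
    return {Check::Ok, third_arg};
}
```

The two calls in pr108436.c correspond to check_prefetch_third_arg(4, false), which warns and uses zero, and check_prefetch_third_arg(4, true), which is an error.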


Re: [Patch] libfortran: Fix execute_command_line for Windows

2023-01-18 Thread Jerry D via Gcc-patches

On 1/18/23 7:42 AM, Tobias Burnus wrote:

Reported by nightstrike, who also tested this patch.

On Windows, we call system() which works as described at
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/system-wsystem?view=msvc-170

Namely, it only fails with "-1" if the command interpreter
could not be started. Otherwise, it has the return value.
(Same on Linux.) On POSIX systems, 'sh' calls exit(127) or
_exit(127) if it cannot execute the program of the passed string,
as documented. Cf. https://www.unix.com/man-page/posix/3p/system/

Thus, the question is what happens on Windows. Our experiments, several
webpages (like stackoverflow) and the source code of WINE for cmd.exe
indicate that Windows returns 9009 in that case. See for instance
https://github.com/wine-mirror/wine/blob/master/programs/cmd/wcmdmain.c#L1262-L1269

Thus, we now do likewise. The code is for MINGW; Cygwin does not set
that var and is likely to use return values closer to POSIX.

OK for mainline?

Tobias


OK, thanks for the fix.

Jerry
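
The convention described above (127 from a POSIX sh, 9009 from cmd.exe, -1 when the interpreter itself could not be started) can be sketched as two small helpers. The names are hypothetical and this is not the libgfortran code; exit_code is assumed to be already decoded from the value system() returned.

```cpp
#include <cassert>

// Exit code reported when the command interpreter cannot run the command.
constexpr int kPosixCommandNotFound = 127;     // POSIX sh
constexpr int kWindowsCommandNotFound = 9009;  // observed for cmd.exe

// system() itself returns -1 when the interpreter could not be started.
bool interpreter_failed(int system_result) {
    return system_result == -1;
}

// Hypothetical helper: is_mingw selects the Windows convention.
bool command_not_found(int exit_code, bool is_mingw) {
    return exit_code == (is_mingw ? kWindowsCommandNotFound
                                  : kPosixCommandNotFound);
}
```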



[r13-5244 Regression] FAIL: gcc.dg/analyzer/SARD-tc841-basic-00182-min.c (test for excess errors) on Linux/x86_64

2023-01-18 Thread Jiang, Haochen via Gcc-patches
The mail system is still broken on that machine, so I am still sending this
manually.  Until the mail system is fixed, I will keep checking the script
daily to see if there are new regressions.

BTW, since there is a Bugzilla for the r13-5202 regression, I am not resending
that report.

On Linux/x86_64,

c6a09bfa038ccbfc9f123ede14a3d6237fab is the first bad commit
commit c6a09bfa038ccbfc9f123ede14a3d6237fab
Author: David Malcolm dmalc...@redhat.com
Date:   Wed Jan 18 11:41:47 2023 -0500

analyzer: add SARD testsuite 81

caused

FAIL: gcc.dg/analyzer/SARD-tc841-basic-00182-min.c (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-5244/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer/SARD-tc841-basic-00182-min.c 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer/SARD-tc841-basic-00182-min.c 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer/SARD-tc841-basic-00182-min.c 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="analyzer.exp=gcc.dg/analyzer/SARD-tc841-basic-00182-min.c 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact
me at haochen dot jiang at intel.com)


[PATCH v2] c++: -Wdangling-reference with reference wrapper [PR107532]

2023-01-18 Thread Marek Polacek via Gcc-patches
On Wed, Jan 18, 2023 at 04:07:59PM -0500, Jason Merrill wrote:
> On 1/18/23 12:52, Marek Polacek wrote:
> > Here, -Wdangling-reference triggers where it probably shouldn't, causing
> > some grief.  The code in question uses a reference wrapper with a member
> > function returning a reference to a subobject of a non-temporary object:
> > 
> >const Plane & meta = fm.planes().inner();
> > 
> > I've tried a few approaches, e.g., checking that the member function's
> > return type is the same as the type of the enclosing class (which is
> > the case for member functions returning *this), but that then breaks
> > Wdangling-reference4.C with std::optional.
> > 
> > So I figured that perhaps we want to look at the object we're invoking
> > the member function(s) on and see if that is a temporary, as in, don't
> > warn about
> > 
> >const Plane & meta = fm.planes().inner();
> > 
> > but do warn about
> > 
> >const Plane & meta = FrameMetadata().planes().inner();
> > 
> > It's ugly, but better than asking users to add #pragmas into their code.
> 
> Hmm, that doesn't seem right; the former is only OK because Ref is in fact a
> reference-like type.  If planes() returned a class that held data, we would
> want to warn.

Sure, it's always some kind of tradeoff with warnings :/.
 
> In this case, we might recognize the reference-like class because it has a
> reference member and a constructor taking the same reference type.

That occurred to me too, but then I found out that std::reference_wrapper
actually uses T*, not T&, as you say.  But here's a patch to do that
(I hope).
 
> That wouldn't help with std::reference_wrapper or std::ref_view because they
> have pointer members instead of references, but perhaps loosening the check
> to include that case would make sense?

Sorry, I don't understand what you mean by loosening the check.  I could
hardcode std::reference_wrapper and std::ref_view but I don't think that's
what you meant.  Surely I cannot _not_ warn for any class that contains a
T*.

Here's the patch so that we have some actual code to discuss...  Thanks.
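
For readers without the PR at hand, the reference-like class shape under discussion can be sketched as follows. Ref, Plane and FrameMetadata are reduced stand-ins modelled on the names in this thread, not the code from PR107532 or from the new test.

```cpp
#include <cassert>

struct Plane {
    int id;
};

// Reference-like wrapper: a reference member plus a constructor taking
// the same reference type.
template <typename T>
struct Ref {
    T &ref;
    explicit Ref(T &r) : ref(r) {}
    const T &inner() const { return ref; }
};

struct FrameMetadata {
    Plane plane{42};
    // Returns a temporary wrapper that refers to a member of *this.
    Ref<const Plane> planes() const { return Ref<const Plane>(plane); }
};

// The wrapper temporary dies at the end of the full expression, but the
// Plane it refers to lives in fm, so the reference stays valid.
const Plane &first_plane(const FrameMetadata &fm) {
    const Plane &meta = fm.planes().inner();
    return meta;
}
```

If planes() instead returned a class that held a Plane by value, the same initialization would dangle, which is the case Jason wants to keep warning about.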

-- >8 --
Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

  const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

Perhaps we want to look at the member function's enclosing class
to see if it's a reference wrapper class (meaning, has a reference
member and a constructor taking the same reference type) and don't
warn if so, supposing that the member function returns a reference
to a non-temporary object.

It's ugly, but better than asking users to add #pragmas into their code.

PR c++/107532

gcc/cp/ChangeLog:

* call.cc (do_warn_dangling_reference): Don't warn when the
member function comes from a reference wrapper class.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference8.C: New test.
---
 gcc/cp/call.cc| 32 
 .../g++.dg/warn/Wdangling-reference8.C| 77 +++
 2 files changed, 109 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference8.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 0780b5840a3..b0670a21240 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -13832,6 +13832,38 @@ do_warn_dangling_reference (tree expr)
if (!(TYPE_REF_OBJ_P (rettype) || std_pair_ref_ref_p (rettype)))
  return NULL_TREE;
 
+   /* An attempt to reduce the number of -Wdangling-reference
+  false positives concerning reference wrappers (c++/107532).
+  If the enclosing class is a reference-like class, that is, has
+  a reference member and a constructor taking the same reference type,
+  we suppose that the member function is returning a reference
+  to a non-temporary object.  */
+   if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fndecl)
+   && !DECL_OVERLOADED_OPERATOR_P (fndecl))
+ {
+   tree ctx = CP_DECL_CONTEXT (fndecl);
+   for (tree fields = TYPE_FIELDS (ctx);
+fields;
+fields = DECL_CHAIN (fields))
+ {
+   if (TREE_CODE (fields) != FIELD_DECL || DECL_ARTIFICIAL 
(fields))
+ continue;
+   tree type = TREE_TYPE (fields);
+   if (!TYPE_REF_P (type))
+ continue;
+   /* OK, the field is a reference member.  Do we have
+  a constructor taking its type?  */
+   for (tree fn : ovl_range (CLASSTYPE_CONSTRUCTORS (ctx)))
+ {
+   tree args = 

[committed] libstdc++: Minor updates to Policy Based Data Structures: Biblio

2023-01-18 Thread Gerald Pfeifer
Segher kindly pointed out that when I changed the COM reference I 
claimed I updated the title, but didn't. This fixes that and updates 
www.open-std.org links.

Pushed.

Gerald


libstdc++-v3/ChangeLog:

2023-01-18  Gerald Pfeifer  

* doc/xml/manual/policy_data_structures_biblio.xml: Adjust links
to www.open-std.org to use https.
(COM: Component Model Object Technologies): Rename from...
(The Component Object Model): ...to.
* doc/html/manual/policy_data_structures.html: Regenerate.
---
 libstdc++-v3/doc/html/manual/policy_data_structures.html  | 8 
 .../doc/xml/manual/policy_data_structures_biblio.xml  | 6 +++---
 2 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/libstdc++-v3/doc/html/manual/policy_data_structures.html 
b/libstdc++-v3/doc/html/manual/policy_data_structures.html
index cb9f2461e88..ef54da8053a 100644
--- a/libstdc++-v3/doc/html/manual/policy_data_structures.html
+++ b/libstdc++-v3/doc/html/manual/policy_data_structures.html
@@ -768,7 +768,7 @@
change the order of growth of the entire sequence of
operations.
  Bibliography[biblio.abrahams97exception] 
-   http://www.open-std.org/jtc1/sc22/wg21/docs/papers/1997/N1075.pdf; 
target="_top">
+   https://www.open-std.org/jtc1/sc22/wg21/docs/papers/1997/N1075.pdf; 
target="_top">
  STL Exception Handling Contract

   . 1997. 
@@ -809,7 +809,7 @@
  . 
  C++ Report
. [biblio.austern01htprop] 
-   http://www.open-std.org/JTC1/sc22/wg21/docs/papers/2001/n1326.html; 
target="_top">
+   https://www.open-std.org/JTC1/sc22/wg21/docs/papers/2001/n1326.html; 
target="_top">
  A Proposal to Add Hashtables to the Standard Library

   . 
@@ -1158,7 +1158,7 @@
  Cambridge University Press
. [biblio.mscom] 
https://docs.microsoft.com/en-us/windows/win32/com/the-component-object-model;
 target="_top">
- COM: Component Model Object Technologies
+ The Component Object Model

   . 
  Microsoft
@@ -1297,4 +1297,4 @@
Wickland
  . 
  National Psychological Institute
-   . Prev??Up??NextImplementation??Home??Using
\ No newline at end of file
+   . Prev??Up??NextImplementation??Home??Using
diff --git a/libstdc++-v3/doc/xml/manual/policy_data_structures_biblio.xml 
b/libstdc++-v3/doc/xml/manual/policy_data_structures_biblio.xml
index 7c563fec40a..5234a10d197 100644
--- a/libstdc++-v3/doc/xml/manual/policy_data_structures_biblio.xml
+++ b/libstdc++-v3/doc/xml/manual/policy_data_structures_biblio.xml
@@ -7,7 +7,7 @@
 
   
http://www.w3.org/1999/xlink;
- 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/1997/N1075.pdf;>
+ 
xlink:href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/1997/N1075.pdf;>
  STL Exception Handling Contract

   
@@ -123,7 +123,7 @@
 
   
http://www.w3.org/1999/xlink;
- 
xlink:href="http://www.open-std.org/JTC1/sc22/wg21/docs/papers/2001/n1326.html;>
+ 
xlink:href="https://www.open-std.org/JTC1/sc22/wg21/docs/papers/2001/n1326.html;>
  A Proposal to Add Hashtables to the Standard Library

   
@@ -1062,7 +1062,7 @@
   
http://www.w3.org/1999/xlink;
  
xlink:href="https://docs.microsoft.com/en-us/windows/win32/com/the-component-object-model;>
- COM: Component Model Object Technologies
+ The Component Object Model

   
   
-- 
2.39.0


[committed] libstdc++: Fix std::random_device::entropy() for non-posix targets

2023-01-18 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

Since the r12-4515-g58f339fc5eaae7 change std::random_device::entropy()
returns non-zero for hardware sources such as RDRAND. However, the call
to the underlying _M_getentropy function is conditionally compiled
according to #if _GLIBCXX_USE_DEV_RANDOM which means it only happens for
targets that support /dev/random and /dev/urandom. This means entropy()
always returns zero for x86 Windows, even though the RDRAND and RDSEED
sources work there.

The _M_getentropy() function is always compiled into the library; it
just doesn't get called for targets without /dev/random. We can change
that just by removing the #if conditional. This is not an ABI change,
because new code will just start calling the existing _M_getentropy
function, while old code that has inlined entropy() will not call it.

Similarly, the std::random_device destructor doesn't call the underlying
_M_fini function unless _GLIBCXX_USE_DEV_RANDOM is defined. That's less
of a problem because it's still true that the only resources that need
to be freed are when one of /dev/random or /dev/urandom has been opened
for reading, which is only possible when _GLIBCXX_USE_DEV_RANDOM is
defined. The _M_fini function does also destroy a random engine object
if a std::linear_congruential_engine object is used, but that destructor
is trivial and so no resources are leaked if it's not called. Remove the
preprocessor condition in the destructor too, so that we always call the
_M_fini function even if it doesn't have side effects. This makes the
destructor non-trivial for Windows and bare metal targets, but as the
class is non-copyable that shouldn't cause any ABI change in practice.
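
For readers who want to observe the user-visible effect, here is a minimal
sketch that is not part of the patch. The helper name
default_entropy_estimate is made up for illustration; only
std::random_device and its entropy() member come from the standard library.
The value returned depends on the target and on which source the library
picked, so nothing beyond "non-negative" should be assumed.

```cpp
#include <random>

// Hypothetical helper: returns the entropy estimate reported by a
// default-constructed std::random_device.  A result of 0.0 means the
// implementation reports no entropy (for example a deterministic source).
inline double default_entropy_estimate()
{
  std::random_device rd;   // source selection is implementation-defined
  return rd.entropy();     // with this change, always forwards to the library
}
```

On a target without /dev/random but with a working hardware source, the
expectation described above is that this stops returning 0.0.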

libstdc++-v3/ChangeLog:

* include/bits/random.h (random_device) [!_GLIBCXX_USE_DEV_RANDOM]:
Always call _M_fini and _M_getentropy.
---
 libstdc++-v3/include/bits/random.h | 10 +-
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/libstdc++-v3/include/bits/random.h 
b/libstdc++-v3/include/bits/random.h
index e2b9bdf568c..42f37c1e77e 100644
--- a/libstdc++-v3/include/bits/random.h
+++ b/libstdc++-v3/include/bits/random.h
@@ -1639,10 +1639,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 explicit
 random_device(const std::string& __token) { _M_init(__token); }
 
-#if defined _GLIBCXX_USE_DEV_RANDOM
 ~random_device()
 { _M_fini(); }
-#endif
 
 static constexpr result_type
 min()
@@ -1654,13 +1652,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 double
 entropy() const noexcept
-{
-#ifdef _GLIBCXX_USE_DEV_RANDOM
-  return this->_M_getentropy();
-#else
-  return 0.0;
-#endif
-}
+{ return this->_M_getentropy(); }
 
 result_type
 operator()()
-- 
2.39.0



[committed] libstdc++: Deprecate std::filesystem::u8path for C++20

2023-01-18 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux. Pushed to trunk.

-- >8 --

P0482R6 deprecated these functions for C++20. There was a ballot comment
on the C++23 CD asking to un-deprecate them, but LEWG just rejected that,
so let's add attributes to deprecate them.
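
As a rough illustration of what user code might migrate to (a sketch, not
part of the patch): the helper name utf8_path and the file name are
hypothetical. The C++20 branch builds the path from a std::u8string, in the
spirit of the suggestion text added to the attribute; the fallback keeps
u8path for modes where char8_t library support is not available.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

#ifdef __cpp_lib_char8_t
// char8_t available: a char8_t string is always treated as UTF-8,
// so the path constructor can be used directly.
inline fs::path utf8_path()
{
  return fs::path(std::u8string(u8"caf\u00e9.txt"));
}
#else
// No char8_t support (e.g. plain C++17): u8path is the way to say
// "these narrow chars are UTF-8", and it is not deprecated here.
inline fs::path utf8_path()
{
  return fs::u8path("caf\xC3\xA9.txt");
}
#endif
```

Both branches should produce the same path; only the spelling of "this is
UTF-8" differs.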

libstdc++-v3/ChangeLog:

* include/bits/fs_path.h (u8path): Add deprecated attribute.
* testsuite/27_io/filesystem/path/construct/90281.cc: Add
-Wno-deprecated-declarations for C++20 and later.
* testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc:
Likewise.
* testsuite/27_io/filesystem/path/factory/u8path.cc: Likewise.
* testsuite/27_io/filesystem/path/native/string.cc: Likewise.
* testsuite/27_io/filesystem/path/factory/u8path-depr.cc: New test.
---
 libstdc++-v3/include/bits/fs_path.h  |  2 ++
 .../27_io/filesystem/path/construct/90281.cc |  1 +
 .../filesystem/path/factory/u8path-char8_t.cc|  1 +
 .../27_io/filesystem/path/factory/u8path-depr.cc | 16 
 .../27_io/filesystem/path/factory/u8path.cc  |  1 +
 .../27_io/filesystem/path/native/string.cc   |  1 +
 6 files changed, 22 insertions(+)
 create mode 100644 
libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-depr.cc

diff --git a/libstdc++-v3/include/bits/fs_path.h 
b/libstdc++-v3/include/bits/fs_path.h
index 5f18f2314d1..1cbfaaa5427 100644
--- a/libstdc++-v3/include/bits/fs_path.h
+++ b/libstdc++-v3/include/bits/fs_path.h
@@ -808,6 +808,7 @@ namespace __detail
   typename _Require = __detail::_Path2<_InputIterator>,
   typename _CharT
 = __detail::__value_type_is_char_or_char8_t<_InputIterator>>
+_GLIBCXX20_DEPRECATED_SUGGEST("path(u8string(first, last))")
 inline path
 u8path(_InputIterator __first, _InputIterator __last)
 {
@@ -830,6 +831,7 @@ namespace __detail
   template,
   typename _CharT = __detail::__value_type_is_char_or_char8_t<_Source>>
+_GLIBCXX20_DEPRECATED_SUGGEST("path((const char8_t*)&*source)")
 inline path
 u8path(const _Source& __source)
 {
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/path/construct/90281.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/path/construct/90281.cc
index 4b38646c0e0..d26b5e15b29 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/path/construct/90281.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/path/construct/90281.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-do run { target c++17 } }
+// { dg-additional-options "-Wno-deprecated-declarations" { target c++20 } }
 
 #include 
 #include 
diff --git 
a/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc
index ceedd5fbdc2..eff95c18bdf 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-char8_t.cc
@@ -17,6 +17,7 @@
 
 // { dg-options "-fchar8_t -Wno-stringop-overread" }
 // { dg-do run { target c++17 } }
+// { dg-additional-options "-Wno-deprecated-declarations" { target c++20 } }
 
 #include 
 #include 
diff --git 
a/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-depr.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-depr.cc
new file mode 100644
index 000..de54668c055
--- /dev/null
+++ b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path-depr.cc
@@ -0,0 +1,16 @@
+// { dg-options "-std=gnu++20" }
+// { dg-do compile { target c++20 } }
+
+#include 
+
+namespace fs = std::filesystem;
+
+const char* s = "";
+auto p1 = fs::u8path(s); // { dg-warning "deprecated" }
+auto p2 = fs::u8path(s, s); // { dg-warning "deprecated" }
+
+#if __cpp_lib_char8_t
+const char8_t* u = u8"";
+auto p3 = fs::u8path(u); // { dg-warning "deprecated" }
+auto p4 = fs::u8path(u, u); // { dg-warning "deprecated" }
+#endif
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc
index 726b3eaadd8..4c41fc28da9 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/path/factory/u8path.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-do run { target c++17 } }
+// { dg-additional-options "-Wno-deprecated-declarations" { target c++20 } }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/27_io/filesystem/path/native/string.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/path/native/string.cc
index 8620c15fa88..d5942c9beaa 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/path/native/string.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/path/native/string.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-do run { target c++17 } }
+// { dg-additional-options "-Wno-deprecated-declarations" { target c++20 } }
 
 #include 
 #include 
-- 
2.39.0



Re: [PATCH] libstdc++: testsuite: Simplify codecvt_unicode

2023-01-18 Thread Jonathan Wakely via Gcc-patches
On Wed, 18 Jan 2023 at 19:52, Dimitrij Mijoski wrote:
>
> On Wed, 2023-01-18 at 18:53 +, Jonathan Wakely wrote:
> > This doesn't compile in C++11 or C++14, because there's no guaranteed
> > elision.
>
> I see. I just looked up in the docs and found that I need to put
> --target_board=unix/-std=c++11 inside RUNTESTFLAGS to test in C++11
> mode.

That's right. I have multiple options used by default, via ~/.dejagnurc

$ cat ~/.dejagnurc
# Need to test if $tool exists prior to the r11-551 change.
if { [info exists tool] && "$tool" == "libstdc++" } {
global tool_timeout
set tool_timeout 50
puts "dejagnu - timeout default set to ${tool_timeout}s"
set target_list {
"unix{,-D_GLIBCXX_USE_CXX11_ABI=0,-std=gnu++2b,-std=gnu++11}" }
}

This makes the testsuite take four times as long, but increases
coverage and finds issues like this one. As long as somebody runs the
extended list of options now and then, we don't need them to all be
run for everybody.



Re: [PATCH] c: ICE with nullptr as case expression [PR108424]

2023-01-18 Thread Joseph Myers
On Wed, 18 Jan 2023, Marek Polacek via Gcc-patches wrote:

> In this ICE-on-invalid, we crash on
> 
>   gcc_assert (INTEGRAL_TYPE_P (type));
> 
> in perform_integral_promotions, because a nullptr is an INTEGER_CST,
> but not INTEGRAL_TYPE_P, and check_case_value is only checking the
> former.  In the test I'm testing other "shall be an integral constant
> expression" contexts as well.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

OK.  (INTEGER_CST of pointer type is detected in c_add_case_label.)

-- 
Joseph S. Myers
jos...@codesourcery.com


[PATCH] Fortran: error recovery for invalid CLASS component [PR108434]

2023-01-18 Thread Harald Anlauf via Gcc-patches
Dear all,

I intend to commit the attached obvious fix for a NULL pointer dereference
within the next 24h unless there are comments or objections.
I have checked with valgrind that the patch prevents the invalid reads
for the testcase, and it is certainly safe.

Regtested on x86_64-pc-linux-gnu.

Thanks,
Harald

From e240637f6c2e2605a8424538bee885d899507506 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 18 Jan 2023 22:13:29 +0100
Subject: [PATCH] Fortran: error recovery for invalid CLASS component
 [PR108434]

gcc/fortran/ChangeLog:

	PR fortran/108434
	* expr.cc (class_allocatable): Prevent NULL pointer dereference
	or invalid read.
	(class_pointer): Likewise.

gcc/testsuite/ChangeLog:

	PR fortran/108434
	* gfortran.dg/pr108434.f90: New test.
---
 gcc/fortran/expr.cc|  4 ++--
 gcc/testsuite/gfortran.dg/pr108434.f90 | 11 +++
 2 files changed, 13 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr108434.f90

diff --git a/gcc/fortran/expr.cc b/gcc/fortran/expr.cc
index 5ec369c9cd8..3036b1be60f 100644
--- a/gcc/fortran/expr.cc
+++ b/gcc/fortran/expr.cc
@@ -4996,14 +4996,14 @@ get_union_initializer (gfc_symbol *union_type, gfc_component **map_p)
 static bool
 class_allocatable (gfc_component *comp)
 {
-  return comp->ts.type == BT_CLASS && CLASS_DATA (comp)
+  return comp->ts.type == BT_CLASS && comp->attr.class_ok && CLASS_DATA (comp)
 && CLASS_DATA (comp)->attr.allocatable;
 }

 static bool
 class_pointer (gfc_component *comp)
 {
-  return comp->ts.type == BT_CLASS && CLASS_DATA (comp)
+  return comp->ts.type == BT_CLASS && comp->attr.class_ok && CLASS_DATA (comp)
 && CLASS_DATA (comp)->attr.pointer;
 }

diff --git a/gcc/testsuite/gfortran.dg/pr108434.f90 b/gcc/testsuite/gfortran.dg/pr108434.f90
new file mode 100644
index 000..e1768a57574
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr108434.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! PR fortran/108434 - ICE in class_allocatable
+! Contributed by G.Steinmetz
+
+program p
+  type t
+ class(c), pointer :: a(2) ! { dg-error "must have a deferred shape" }
+  end type t
+  class(t), allocatable :: x
+  class(t), pointer :: y
+end
--
2.35.3



[PATCH] c: ICE with nullptr as case expression [PR108424]

2023-01-18 Thread Marek Polacek via Gcc-patches
In this ICE-on-invalid, we crash on

  gcc_assert (INTEGRAL_TYPE_P (type));

in perform_integral_promotions, because a nullptr is an INTEGER_CST,
but not INTEGRAL_TYPE_P, and check_case_value is only checking the
former.  In the test I'm testing other "shall be an integral constant
expression" contexts as well.
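
To make the distinction concrete, here is a toy model (a sketch only). The
enums, the Value struct and both predicates are hypothetical and are not
GCC's tree API; they only mimic the shape of the condition before and after
the change.

```cpp
// Toy model: none of these names exist in GCC.
enum class Code { IntegerCst, Other };
enum class TypeKind { Integer, Nullptr, Pointer };

struct Value {
  Code code;       // what kind of node the value is
  TypeKind type;   // what kind of type the value has
};

// Before: only the node kind is checked, so a nullptr constant slips through.
inline bool promotes_before(const Value& v)
{
  return v.code == Code::IntegerCst;
}

// After: the type must be integral as well as the node being a constant.
inline bool promotes_after(const Value& v)
{
  return v.type == TypeKind::Integer && v.code == Code::IntegerCst;
}
```

In this model a nullptr constant satisfies the first predicate but not the
second, so it falls through to the error path instead of reaching the
promotion code.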

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c/108424

gcc/c-family/ChangeLog:

* c-common.cc (check_case_value): Check INTEGRAL_TYPE_P.

gcc/testsuite/ChangeLog:

* gcc.dg/c2x-nullptr-6.c: New test.
---
 gcc/c-family/c-common.cc |  3 ++-
 gcc/testsuite/gcc.dg/c2x-nullptr-6.c | 33 
 2 files changed, 35 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/c2x-nullptr-6.c

diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 76c8abef296..ae92cd5adaf 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -2238,7 +2238,8 @@ check_case_value (location_t loc, tree value)
   if (value == NULL_TREE)
 return value;
 
-  if (TREE_CODE (value) == INTEGER_CST)
+  if (INTEGRAL_TYPE_P (TREE_TYPE (value))
+  && TREE_CODE (value) == INTEGER_CST)
 /* Promote char or short to int.  */
 value = perform_integral_promotions (value);
   else if (value != error_mark_node)
diff --git a/gcc/testsuite/gcc.dg/c2x-nullptr-6.c 
b/gcc/testsuite/gcc.dg/c2x-nullptr-6.c
new file mode 100644
index 000..24e14fa6921
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-nullptr-6.c
@@ -0,0 +1,33 @@
+/* PR c/108424 */
+/* { dg-options "-std=c2x" } */
+
+struct S {
+  int i;
+  int : nullptr; /* { dg-error "not an integer constant" } */
+};
+
+enum E { X = nullptr }; /* { dg-error "not an integer constant" } */
+
+alignas(nullptr) int g; /* { dg-error "not an integer constant" } */
+
+int arr[10] = { [nullptr] = 1 }; /* { dg-error "not of integer type" } */
+
+_Static_assert (nullptr, "nullptr"); /* { dg-error "not an integer" } */
+
+void f (int n)
+{
+  switch (n) {
+  case nullptr: /* { dg-error "an integer constant" } */
+  default:
+  }
+
+  switch (n) {
+  case 1 ... nullptr: /* { dg-error "an integer constant" } */
+  default:
+  }
+
+  switch (n) {
+  case nullptr ... 2: /* { dg-error "an integer constant" } */
+  default:
+  }
+}

base-commit: af7881e07631fc1c61deb307119f7cabdd4094a1
-- 
2.39.0



Re: [PATCH] c++: -Wdangling-reference with reference wrapper [PR107532]

2023-01-18 Thread Jason Merrill via Gcc-patches

On 1/18/23 12:52, Marek Polacek wrote:

Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

   const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

So I figured that perhaps we want to look at the object we're invoking
the member function(s) on and see if that is a temporary, as in, don't
warn about

   const Plane & meta = fm.planes().inner();

but do warn about

   const Plane & meta = FrameMetadata().planes().inner();

It's ugly, but better than asking users to add #pragmas into their code.


Hmm, that doesn't seem right; the former is only OK because Ref is in 
fact a reference-like type.  If planes() returned a class that held 
data, we would want to warn.


In this case, we might recognize the reference-like class because it has 
a reference member and a constructor taking the same reference type.


That wouldn't help with std::reference_wrapper or std::ref_view because 
they have pointer members instead of references, but perhaps loosening 
the check to include that case would make sense?
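
For illustration, a self-contained sketch of the "reference-like" shape
described above, modelled on the test case in the patch. The member
initializer value 42 and the helper read_through_wrapper are additions made
here so the example can be run; they are not in the original test.

```cpp
struct Plane { unsigned int bytesused; };

// Reference-like class: a reference member plus a constructor taking
// the same reference type.  It passes the reference through and does
// not own or extend the lifetime of anything.
template <typename T>
struct Ref {
  const T& i_;
  Ref(const T& i) : i_(i) {}
  const T& inner() const { return i_; }
};

struct FrameMetadata {
  Ref<Plane> planes() const { return p_; }
  Plane p_{42};   // value chosen only for this example
};

inline unsigned int read_through_wrapper(const FrameMetadata& fm)
{
  // The Ref temporary is gone after this statement, but the reference
  // it handed out still refers to fm.p_, which outlives it.
  const Plane& meta = fm.planes().inner();
  return meta.bytesused;
}
```

If planes() instead returned a class holding a Plane by value, the same
statement would leave meta dangling, which is the case where a warning is
wanted.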


Jason


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/107532

gcc/cp/ChangeLog:

* call.cc (do_warn_dangling_reference): Don't warn when the
object member functions are invoked on is not a temporary.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference8.C: New test.
---
  gcc/cp/call.cc| 33 +++-
  .../g++.dg/warn/Wdangling-reference8.C| 77 +++
  2 files changed, 109 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference8.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 0780b5840a3..43e65c3dffb 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -13850,7 +13850,38 @@ do_warn_dangling_reference (tree expr)
if (TREE_CODE (arg) == ADDR_EXPR)
  arg = TREE_OPERAND (arg, 0);
if (expr_represents_temporary_p (arg))
- return expr;
+ {
+   /* An ugly attempt to reduce the number of -Wdangling-reference
+  false positives concerning reference wrappers (c++/107532).
+  Don't warn about s.a().b() but do warn about S().a().b(),
+  supposing that the member function is returning a reference
+  to a subobject of the (non-temporary) object.  */
+   if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fndecl)
+   && !DECL_OVERLOADED_OPERATOR_P (fndecl)
+   && i == 0)
+ {
+   tree t = arg;
+   while (handled_component_p (t))
+ t = TREE_OPERAND (t, 0);
+   t = TARGET_EXPR_INITIAL (arg);
+   /* Quite likely we don't have a chain of member functions
+  (like a().b().c()).  */
+   if (TREE_CODE (t) != CALL_EXPR)
+ return expr;
+   /* Walk the call chain to the original object and see if
+  it was a temporary.  */
+   do
+ t = tree_strip_nop_conversions (CALL_EXPR_ARG (t, 0));
+   while (TREE_CODE (t) == CALL_EXPR);
+   /* If the object argument is _EXPR<>, we've started
+  off the chain with a temporary and we want to warn.  */
+   if (TREE_CODE (t) == ADDR_EXPR)
+ t = TREE_OPERAND (t, 0);
+   if (!expr_represents_temporary_p (t))
+ break;
+ }
+   return expr;
+ }
  /* Don't warn about member function like:
  std::any a(...);
  S& s = a.emplace({0}, 0);
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference8.C 
b/gcc/testsuite/g++.dg/warn/Wdangling-reference8.C
new file mode 100644
index 000..32280f3e282
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference8.C
@@ -0,0 +1,77 @@
+// PR c++/107532
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wdangling-reference" }
+
+struct Plane { unsigned int bytesused; };
+
+// Passes a reference through. Does not change lifetime.
+template 
+struct Ref {
+const T& i_;
+Ref(const T & i) : i_(i) {}
+const T & inner();
+};
+
+struct FrameMetadata {
+Ref planes() const { return p_; }
+
+Plane p_;
+};
+
+void bar(const Plane & meta);
+void foo(const FrameMetadata & fm)
+{
+const Plane & meta = fm.planes().inner();
+bar(meta);
+const Plane & meta2 = FrameMetadata().planes().inner(); // { 

Re: [Patch] libfortran: Fix execute_command_line for Windows

2023-01-18 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

Am 18.01.23 um 16:42 schrieb Tobias Burnus:

Reported by nightstrike, who also tested this patch.

On Windows, we call system() which works as described at
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/system-wsystem?view=msvc-170

Namely, it only fails with "-1" if the command interpreter
could not be started. Otherwise, it has the return value.
(Same on Linux.) On POSIX systems, 'sh' calls exit(127) or
_exit(127) if it cannot execute the program of the passed string,
as documented. Cf. https://www.unix.com/man-page/posix/3p/system/

Thus, the question is what happens on Windows. Our experiments, several
webpages (like stackoverflow) and the source code of WINE for cmd.exe
indicate that Windows returns 9009 in that case. See for instance
https://github.com/wine-mirror/wine/blob/master/programs/cmd/wcmdmain.c#L1262-L1269

Thus, we now do likewise. The code is for MINGW; Cygwin does not set that
var and is likely to use return values closer to POSIX.
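
A minimal sketch of the distinction discussed above, not taken from the
patch: the helper names run_and_get_exit_code and command_not_found are
hypothetical, and the 9009 and 127 values are the ones quoted in this
thread. On POSIX the status from system() has to be decoded with the
wait macros before comparing.

```cpp
#include <cstdlib>
#if !defined(_WIN32)
#include <sys/wait.h>
#endif

// Runs cmd through the command interpreter.  Returns the command's exit
// code, or -1 if the interpreter could not be started (or, on POSIX, if
// the command did not exit normally).
inline int run_and_get_exit_code(const char* cmd)
{
  int status = std::system(cmd);
  if (status == -1)
    return -1;
#if defined(_WIN32)
  return status;   // the interpreter's return value is handed back as-is
#else
  return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
#endif
}

// "Command could not be executed" as reported by the interpreter:
// 9009 for cmd.exe (per the references above), 127 for a POSIX sh.
inline bool command_not_found(int exit_code)
{
#if defined(_WIN32)
  return exit_code == 9009;
#else
  return exit_code == 127;
#endif
}
```

Keeping the "interpreter failed" case (-1) separate from the "command not
found" case mirrors the two failure modes described in the message.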


I don't use Windows, but this LGTM.


OK for mainline?


Yes, and thanks for the patch!

Harald



Tobias




[PATCH 1/2] aarch64: fix ICE in aarch64_layout_arg [PR108411]

2023-01-18 Thread Christophe Lyon via Gcc-patches
The previous patch added an assert which should not be applied to PST
types (Pure Scalable Types) because alignment does not matter in this
case.  This patch moves the assert after the PST case is handled to
avoid the ICE.

PR target/108411
gcc/
* config/aarch64/aarch64.cc (aarch64_layout_arg): Improve
comment. Move assert about alignment a bit later.
---
 gcc/config/aarch64/aarch64.cc | 28 +---
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index d36b57341b3..7175b453b3a 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -7659,7 +7659,18 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
function_arg_info )
&& (currently_expanding_function_start
   || currently_expanding_gimple_stmt));
 
-  /* There are several things to note here:
+  /* HFAs and HVAs can have an alignment greater than 16 bytes.  For example:
+
+   typedef struct foo {
+ __Int8x16_t foo[2] __attribute__((aligned(32)));
+   } foo;
+
+ is still a HVA despite its larger-than-normal alignment.
+ However, such over-aligned HFAs and HVAs are guaranteed to have
+ no padding.
+
+ If we exclude HFAs and HVAs from the discussion below, then there
+ are several things to note:
 
  - Both the C and AAPCS64 interpretations of a type's alignment should
give a value that is no greater than the type's size.
@@ -7704,12 +7715,6 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
function_arg_info )
would treat the alignment as though it was *equal to* 16 bytes.
 
  Both behaviors were wrong, but in different cases.  */
-  unsigned int alignment
-= aarch64_function_arg_alignment (mode, type, _break,
- _break_packed);
-  gcc_assert (alignment <= 16 * BITS_PER_UNIT
- && (!alignment || abi_break < alignment)
- && (!abi_break_packed || alignment < abi_break_packed));
 
   pcum->aapcs_arg_processed = true;
 
@@ -7780,6 +7785,15 @@ aarch64_layout_arg (cumulative_args_t pcum_v, const 
function_arg_info )
 );
   gcc_assert (!sve_p || !allocate_nvrn);
 
+  unsigned int alignment
+= aarch64_function_arg_alignment (mode, type, _break,
+ _break_packed);
+
+  gcc_assert (allocate_nvrn || (alignment <= 16 * BITS_PER_UNIT
+   && (!alignment || abi_break < alignment)
+   && (!abi_break_packed
+   || alignment < abi_break_packed)));
+
   /* allocate_ncrn may be false-positive, but allocate_nvrn is quite reliable.
  The following code thus handles passing by SIMD/FP registers first.  */
 
-- 
2.25.1



[PATCH 2/2] aarch64: add -fno-stack-protector to some tests [PR108411]

2023-01-18 Thread Christophe Lyon via Gcc-patches
As discussed in the PR, these recently added tests fail when the
testsuite is executed with -fstack-protector-strong.  To avoid this,
this patch adds -fno-stack-protector to dg-options.

PR target/108411
gcc/testsuite
* g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C: Add
-fno-stack-protector.
* g++.target/aarch64/bitfield-abi-warning-align16-O2.C: Likewise.
* g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C: Likewise.
* g++.target/aarch64/bitfield-abi-warning-align32-O2.C: Likewise.
* g++.target/aarch64/bitfield-abi-warning-align8-O2.C: Likewise.
* gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c: Likewise.
* gcc.target/aarch64/bitfield-abi-warning-align16-O2.c: Likewise.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c: Likewise.
* gcc.target/aarch64/bitfield-abi-warning-align32-O2.c: Likewise.
* gcc.target/aarch64/bitfield-abi-warning-align8-O2.c: Likewise.
---
 .../g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C  | 2 +-
 .../g++.target/aarch64/bitfield-abi-warning-align16-O2.C| 2 +-
 .../g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C  | 2 +-
 .../g++.target/aarch64/bitfield-abi-warning-align32-O2.C| 2 +-
 .../g++.target/aarch64/bitfield-abi-warning-align8-O2.C | 2 +-
 .../gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c  | 2 +-
 .../gcc.target/aarch64/bitfield-abi-warning-align16-O2.c| 2 +-
 .../gcc.target/aarch64/bitfield-abi-warning-align32-O2-extra.c  | 2 +-
 .../gcc.target/aarch64/bitfield-abi-warning-align32-O2.c| 2 +-
 .../gcc.target/aarch64/bitfield-abi-warning-align8-O2.c | 2 +-
 10 files changed, 10 insertions(+), 10 deletions(-)

diff --git 
a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C 
b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C
index 443cd458b4c..52f9cdd1ee9 100644
--- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C
+++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2-extra.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
+/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
 
 #define ALIGN 16
 //#define EXTRA
diff --git a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C 
b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C
index 76a7e3d0ad4..9ff4e46645b 100644
--- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C
+++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align16-O2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
+/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
 
 #define ALIGN 16
 #define EXTRA
diff --git 
a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C 
b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C
index 6f8f54f41ff..55dcbfe4b7c 100644
--- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C
+++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2-extra.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
+/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
 
 #define ALIGN 32
 //#define EXTRA
diff --git a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C 
b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C
index 6b8ad5fbea1..6bb8778ee90 100644
--- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C
+++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align32-O2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
+/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
 
 #define ALIGN 32
 #define EXTRA
diff --git a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C 
b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C
index b1764d97ea0..41bcc894a2b 100644
--- a/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C
+++ b/gcc/testsuite/g++.target/aarch64/bitfield-abi-warning-align8-O2.C
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -save-temps -Wno-narrowing" } */
+/* { dg-options "-O2 -fno-stack-protector -save-temps -Wno-narrowing" } */
 
 #define ALIGN 8
 #define EXTRA
diff --git 
a/gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c 
b/gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c
index f248a129509..3b2c932ac23 100644
--- a/gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c
+++ b/gcc/testsuite/gcc.target/aarch64/bitfield-abi-warning-align16-O2-extra.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -save-temps" } */
+/* { 

Re: [PATCH] libstdc++: testsuite: Simplify codecvt_unicode

2023-01-18 Thread Dimitrij Mijoski via Gcc-patches
On Wed, 2023-01-18 at 18:53 +, Jonathan Wakely wrote:
> This doesn't compile in C++11 or C++14, because there's no guaranteed
> elision.

I see. I just looked up in the docs and found that I need to put
--target_board=unix/-std=c++11 inside RUNTESTFLAGS to test in C++11
mode.


[PATCH] x86: Check invalid third argument to __builtin_ia32_prefetch

2023-01-18 Thread H.J. Lu via Gcc-patches
Check for an invalid third argument to __builtin_ia32_prefetch when
expanding the builtin, to avoid an ICE later.
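
The two range checks in the diff below can be summarised with a small toy
model (a sketch only). The Check enum and check_locality are hypothetical
names, and for_instruction stands for the fourth builtin argument being 1.

```cpp
// Toy model of the validation added by the patch; not GCC code.
enum class Check { Ok, WarnUseZero, Error };

inline Check check_locality(long locality, bool for_instruction)
{
  if (for_instruction)
    // Instruction prefetch: only 2 and 3 are accepted, anything else
    // is a hard error.
    return (locality < 2 || locality > 3) ? Check::Error : Check::Ok;

  // Data prefetch: values outside 0..3 get a warning and are treated
  // as zero.
  return (locality < 0 || locality > 3) ? Check::WarnUseZero : Check::Ok;
}
```

This matches the new test: a third argument of 4 gives a warning for the
data prefetch and an error for the instruction prefetch.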

gcc/

PR target/108436
* config/i386/i386-expand.cc (ix86_expand_builtin): Check
invalid third argument to __builtin_ia32_prefetch.

gcc/testsuite/

* gcc.target/i386/pr108436.c: New test.
---
 gcc/config/i386/i386-expand.cc   | 12 
 gcc/testsuite/gcc.target/i386/pr108436.c | 15 +++
 2 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr108436.c

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 54f700cd09d..e2e2d28bb47 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -13175,6 +13175,12 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
subtarget,
 
if (INTVAL (op3) == 1)
  {
+   if (INTVAL (op2) < 2 || INTVAL (op2) > 3)
+ {
+   error ("invalid third argument");
+   return const0_rtx;
+ }
+
if (TARGET_64BIT && TARGET_PREFETCHI
&& local_func_symbolic_operand (op0, GET_MODE (op0)))
  emit_insn (gen_prefetchi (op0, op2));
@@ -13195,6 +13201,12 @@ ix86_expand_builtin (tree exp, rtx target, rtx 
subtarget,
op0 = copy_addr_to_reg (op0);
  }
 
+   if (INTVAL (op2) < 0 || INTVAL (op2) > 3)
+ {
+   warning (0, "invalid third argument to 
%<__builtin_ia32_prefetch%>; using zero");
+   op2 = const0_rtx;
+ }
+
if (TARGET_3DNOW || TARGET_PREFETCH_SSE
|| TARGET_PRFCHW || TARGET_PREFETCHWT1)
  emit_insn (gen_prefetch (op0, op1, op2));
diff --git a/gcc/testsuite/gcc.target/i386/pr108436.c 
b/gcc/testsuite/gcc.target/i386/pr108436.c
new file mode 100644
index 000..d51f25863a5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr108436.c
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-mprefetchi" } */
+
+int
+foo (int a)
+{
+  return a + 1;
+}
+
+void
+bad (int *p)
+{
+  __builtin_ia32_prefetch (p, 0, 4, 0);   /* { dg-warning "invalid third argument to '__builtin_ia32_prefetch'; using zero" } */
+  __builtin_ia32_prefetch (foo, 0, 4, 1);   /* { dg-error "invalid third argument" } */
+}
-- 
2.39.0
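
The two range checks in the patch above can be modelled outside the compiler
roughly as follows.  This is only a sketch: HintCheck, HintResult and
check_prefetch_hint are hypothetical names, not GCC functions, and it assumes
the second check is reached only on the data-prefetch path (fourth argument
not equal to 1).

```cpp
#include <cassert>

// Hypothetical model of what the expander does with the locality hint
// (the third argument of __builtin_ia32_prefetch).
enum class HintCheck { Ok, WarnUseZero, Error };

struct HintResult
{
  HintCheck status;  // diagnostic the expander would issue, if any
  int locality;      // hint value that would be passed on
};

// is_insn_prefetch mirrors "INTVAL (op3) == 1" in the patch.
HintResult
check_prefetch_hint (int locality, bool is_insn_prefetch)
{
  if (is_insn_prefetch)
    {
      // Instruction prefetch: only hints 2 and 3 are accepted, anything
      // else is a hard error.
      if (locality < 2 || locality > 3)
        return { HintCheck::Error, 0 };
      return { HintCheck::Ok, locality };
    }

  // Data prefetch: out-of-range hints get a warning and fall back to 0.
  if (locality < 0 || locality > 3)
    return { HintCheck::WarnUseZero, 0 };
  return { HintCheck::Ok, locality };
}

// Mirrors the two calls in pr108436.c.
void
example ()
{
  assert (check_prefetch_hint (4, false).status == HintCheck::WarnUseZero);
  assert (check_prefetch_hint (4, true).status == HintCheck::Error);
}
```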



Re: [PATCH] libstdc++: testsuite: Simplify codecvt_unicode

2023-01-18 Thread Jonathan Wakely via Gcc-patches

On 17/01/23 22:12 +0100, Dimitrij Mijoski wrote:

Stop using unique_ptr, create some objects directly.

libstdc++-v3/ChangeLog:

* testsuite/22_locale/codecvt/codecvt_unicode.cc: Simplify.
* testsuite/22_locale/codecvt/codecvt_unicode.h: Simplify.
* testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc: Simplify.
---
.../22_locale/codecvt/codecvt_unicode.cc   | 18 ++
.../22_locale/codecvt/codecvt_unicode.h|  9 +
.../codecvt/codecvt_unicode_wchar_t.cc | 12 ++--
3 files changed, 17 insertions(+), 22 deletions(-)

diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
index ae4b6c896..3d7393e4a 100644
--- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
@@ -29,11 +29,12 @@ test_utf8_utf32_codecvts ()
  using codecvt_c32 = codecvt;
  auto loc_c = locale::classic ();
  VERIFY (has_facet (loc_c));
+
  auto  = use_facet (loc_c);
  test_utf8_utf32_codecvts (cvt);

-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
-  test_utf8_utf32_codecvts (*cvt_ptr);
+  auto cvt2 = codecvt_utf8 ();


This doesn't compile in C++11 or C++14, because there's no guaranteed
elision.
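
To illustrate the point about elision with a self-contained sketch: the type
below is a hypothetical stand-in (facet_like is not a libstdc++ name) for a
class that, like std::locale::facet, has a deleted copy constructor.  The
"auto x = T ();" form needs an accessible copy or move constructor before
C++17, while a plain declaration works in every standard mode.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a facet type: neither copyable nor movable.
struct facet_like
{
  facet_like () = default;
  facet_like (const facet_like &) = delete;
  facet_like &operator= (const facet_like &) = delete;
  int id () const { return 42; }
};

static_assert (!std::is_copy_constructible<facet_like>::value,
               "copying is deleted");
static_assert (!std::is_move_constructible<facet_like>::value,
               "no move constructor is declared either");

int
use_directly ()
{
  // auto f = facet_like ();  // ill-formed in C++11/14: the copy/move
  //                          // constructor must be accessible even if the
  //                          // copy would be elided; OK from C++17 on.
  facet_like f;               // fine in C++11, C++14 and later
  return f.id ();
}

void
example ()
{
  assert (use_directly () == 42);
}
```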

I've pushed the attached change instead. Thanks for the patch.

commit 7d0cdbbcdd1c9e5e709d0be7d1843e8c1045fd16
Author: Dimitrij Mijoski 
Date:   Tue Jan 17 21:12:12 2023

libstdc++: testsuite: Simplify codecvt_unicode

Stop using unique_ptr, create some objects directly.

libstdc++-v3/ChangeLog:

* testsuite/22_locale/codecvt/codecvt_unicode.cc: Simplify.
* testsuite/22_locale/codecvt/codecvt_unicode.h: Simplify.
* testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc: Simplify.

diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
index ae4b6c8968f..df1a2b4cc51 100644
--- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.cc
@@ -29,11 +29,12 @@ test_utf8_utf32_codecvts ()
   using codecvt_c32 = codecvt;
   auto loc_c = locale::classic ();
   VERIFY (has_facet (loc_c));
+
   auto  = use_facet (loc_c);
   test_utf8_utf32_codecvts (cvt);
 
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
-  test_utf8_utf32_codecvts (*cvt_ptr);
+  codecvt_utf8 cvt2;
+  test_utf8_utf32_codecvts (cvt2);
 }
 
 void
@@ -42,21 +43,22 @@ test_utf8_utf16_codecvts ()
   using codecvt_c16 = codecvt;
   auto loc_c = locale::classic ();
   VERIFY (has_facet (loc_c));
+
   auto  = use_facet (loc_c);
   test_utf8_utf16_cvts (cvt);
 
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16 ());
-  test_utf8_utf16_cvts (*cvt_ptr);
+  codecvt_utf8_utf16 cvt2;
+  test_utf8_utf16_cvts (cvt2);
 
-  auto cvt_ptr2 = to_unique_ptr (new codecvt_utf8_utf16 ());
-  test_utf8_utf16_cvts (*cvt_ptr2);
+  codecvt_utf8_utf16 cvt3;
+  test_utf8_utf16_cvts (cvt3);
 }
 
 void
 test_utf8_ucs2_codecvts ()
 {
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
-  test_utf8_ucs2_cvts (*cvt_ptr);
+  codecvt_utf8 cvt;
+  test_utf8_ucs2_cvts (cvt);
 }
 
 int
diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
index 99d1a46840e..fbdc7a35b28 100644
--- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode.h
@@ -15,18 +15,11 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
+#include 
 #include 
 #include 
-#include 
 #include 
 
-template 
-std::unique_ptr
-to_unique_ptr (T *ptr)
-{
-  return std::unique_ptr (ptr);
-}
-
 struct test_offsets_ok
 {
   size_t in_size, out_size;
diff --git a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
index 169504939a2..4fd1bfec63a 100644
--- a/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
+++ b/libstdc++-v3/testsuite/22_locale/codecvt/codecvt_unicode_wchar_t.cc
@@ -27,8 +27,8 @@ void
 test_utf8_utf32_codecvts ()
 {
 #if __SIZEOF_WCHAR_T__ == 4
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
-  test_utf8_utf32_codecvts (*cvt_ptr);
+  codecvt_utf8 cvt;
+  test_utf8_utf32_codecvts (cvt);
 #endif
 }
 
@@ -36,8 +36,8 @@ void
 test_utf8_utf16_codecvts ()
 {
 #if __SIZEOF_WCHAR_T__ >= 2
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8_utf16 ());
-  test_utf8_utf16_cvts (*cvt_ptr);
+  codecvt_utf8_utf16 cvt;
+  test_utf8_utf16_cvts (cvt);
 #endif
 }
 
@@ -45,8 +45,8 @@ void
 test_utf8_ucs2_codecvts ()
 {
 #if __SIZEOF_WCHAR_T__ == 2
-  auto cvt_ptr = to_unique_ptr (new codecvt_utf8 ());
-  test_utf8_ucs2_cvts (*cvt_ptr);
+  codecvt_utf8 cvt;
+  test_utf8_ucs2_cvts (cvt);
 #endif
 }
 


RE: [PATCH][GCC] arm: Add support for new frame unwinding instruction "0xb5".

2023-01-18 Thread Srinath Parvathaneni via Gcc-patches
Hi Ramana,

> -Original Message-
> From: Ramana Radhakrishnan 
> Sent: Sunday, November 20, 2022 10:48 PM
> To: Srinath Parvathaneni 
> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw
> ; Kyrylo Tkachov 
> Subject: Re: [PATCH][GCC] arm: Add support for new frame unwinding
> instruction "0xb5".
> 
> On Fri, Nov 18, 2022 at 9:33 AM Srinath Parvathaneni
>  wrote:
> >
> > Hi,
> >
> > > -Original Message-
> > > From: Ramana Radhakrishnan 
> > > Sent: Thursday, November 17, 2022 8:27 PM
> > > To: Srinath Parvathaneni 
> > > Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw
> > > ; Kyrylo Tkachov
> 
> > > Subject: Re: [PATCH][GCC] arm: Add support for new frame unwinding
> > > instruction "0xb5".
> > >
> > > On Thu, Nov 10, 2022 at 10:38 AM Srinath Parvathaneni via
> > > Gcc-patches  wrote:
> > > >
> > > > Hi,
> > > >
> > > > > This patch adds support for Arm frame unwinding instruction "0xb5"
> > > > > [1]. When an exception is taken and the "0xb5" instruction is
> > > > > encountered during runtime stack-unwinding, we use the effective
> > > > > vsp as the modifier in pointer authentication.
> > > > > On completion of stack unwinding, if the "0xb5" instruction was not
> > > > > encountered, then the CFA will be used as the modifier in pointer
> > > > > authentication.
> > > > >
> > > > > [1]
> > > > > https://github.com/ARM-software/abi-aa/releases/download/2022Q3/ehabi32.pdf
> > > >
> > > > Regression tested on arm-none-eabi target and found no regressions.
> > > >
> > > > Ok for master?
> > > >
> > >
> > > No, not yet.
> > >
> > > Presumably the logic to produce 0xb5 is in the source base and this
> > > was tested with suitable options that produce said opcode ? I see no
> > > logic in place to produce the said opcode in the backend in a quick
> > > read as the pacbti patches still seem to be in review. ?
> > >
> > > So what was the test suite run actually testing ?
> >
> > Sorry for the late response, the patch supporting the said opcode
> > (directive ".pacspval") is here:
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605524.html
> > (still under upstream review)
> >
> > and the patch to encode ".pacspval" with the mentioned opcode "0xb5" in
> > binutils is here:
> > https://sourceware.org/pipermail/binutils/2022-November/124328.html
> > (approved and committed to binutils).
> 
> Thanks for the answer but perhaps I should make my question more explicit
> - are you saying that this patch was tested in combination with those and
> other dependent patches on a suitable simulator with suitable multilibs and
> C++ to test for this presumably for frame unwinding ?
> 
Sorry for the late response.  I have been re-spinning the other pacbti patches,
on top of which this patch needs to be applied, so I could not respond sooner.

I have applied this patch on top of all the pacbti and related multilib patches;
the patch applies cleanly and the toolchain builds successfully.

I have tested this patch with a C testcase containing a nested function (which
emits the .pacspval directive when IP is clobbered) on a simulator which
supports PACBTI, and the binary executed successfully.

However, I'm unable to test C++ frame unwinding for this opcode, because C++
doesn't support nested functions and, with the current pacbti code, the IP
register is clobbered and we emit the .pacspval directive only for nested
functions.

> For the future , it would certainly be worth being explicit about this in your
> patch submission :)

Thank you, I will keep this in mind for my later patch submissions.

Regards,
Srinath.

> regards
> Ramana
> 
> >
> > Regards,
> > Srinath.
> >
> > > regards
> > > Ramana
> > >
> > >
> > > > Regards,
> > > > Srinath.
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > 2022-11-09  Srinath Parvathaneni  
> > > >
> > > > * libgcc/config/arm/pr-support.c (__gnu_unwind_execute):
> > > > Decode
> > > opcode
> > > > "0xb5".
> > > >
> > > >
> > > > ### Attachment also inlined for ease of reply
> > > ###
> > > >
> > > >
> > > > diff --git a/libgcc/config/arm/pr-support.c
> > > > b/libgcc/config/arm/pr-support.c index
> > > >
> > >
> e48854587c667a959aa66ccc4982231f6ecc..73e4942a39b34a83c2da85de
> > > f6b1
> > > > 3e82ec501552 100644
> > > > --- a/libgcc/config/arm/pr-support.c
> > > > +++ b/libgcc/config/arm/pr-support.c
> > > > @@ -107,7 +107,9 @@ __gnu_unwind_execute (_Unwind_Context *
> > > context, __gnu_unwind_state * uws)
> > > >_uw op;
> > > >int set_pc;
> > > >int set_pac = 0;
> > > > +  int set_pac_sp = 0;
> > > >_uw reg;
> > > > +  _uw sp;
> > > >
> > > >set_pc = 0;
> > > >for (;;)
> > > > @@ -124,10 +126,11 @@ __gnu_unwind_execute (_Unwind_Context *
> > > context,
> > > > __gnu_unwind_state * uws)  #if defined(TARGET_HAVE_PACBTI)
> > > >   if (set_pac)
> > > > {
> > > > - _uw sp;
> > > >   _uw lr;
> > > >   _uw pac;
> > > > - _Unwind_VRS_Get (context, _UVRSC_CORE, R_SP,
> > > _UVRSD_UINT32, );
> > > > + 

[PATCH] c++: -Wdangling-reference with reference wrapper [PR107532]

2023-01-18 Thread Marek Polacek via Gcc-patches
Here, -Wdangling-reference triggers where it probably shouldn't, causing
some grief.  The code in question uses a reference wrapper with a member
function returning a reference to a subobject of a non-temporary object:

  const Plane & meta = fm.planes().inner();

I've tried a few approaches, e.g., checking that the member function's
return type is the same as the type of the enclosing class (which is
the case for member functions returning *this), but that then breaks
Wdangling-reference4.C with std::optional.

So I figured that perhaps we want to look at the object we're invoking
the member function(s) on and see if that is a temporary, as in, don't
warn about

  const Plane & meta = fm.planes().inner();

but do warn about

  const Plane & meta = FrameMetadata().planes().inner();

It's ugly, but better than asking users to add #pragmas into their code.
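
For reference, a cut-down sketch of the pattern in question (same shape as the
new test, with the template parameter written out as <Plane>, which is my
reading of the testcase) shows why the first form is safe: the temporary Ref
goes away at the end of the full-expression, but the reference it hands out
points into fm itself.

```cpp
#include <cassert>

struct Plane { unsigned int bytesused; };

// Passes a reference through.  Does not change lifetime.
template <typename T>
struct Ref
{
  const T &i_;
  Ref (const T &i) : i_ (i) {}
  const T &inner () const { return i_; }
};

struct FrameMetadata
{
  Ref<Plane> planes () const { return p_; }
  Plane p_;
};

// Returns true if the reference obtained through the temporary wrapper
// refers to the subobject of the (non-temporary) fm.
bool
binds_into_object (const FrameMetadata &fm)
{
  const Plane &meta = fm.planes ().inner ();
  return &meta == &fm.p_;
}

void
example ()
{
  FrameMetadata fm{{7}};
  assert (binds_into_object (fm));
}
```

With FrameMetadata().planes().inner() instead, the referenced Plane would
live inside a temporary that is gone by the next statement, which is the case
that should still warn.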

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

PR c++/107532

gcc/cp/ChangeLog:

* call.cc (do_warn_dangling_reference): Don't warn when the
object member functions are invoked on is not a temporary.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wdangling-reference8.C: New test.
---
 gcc/cp/call.cc| 33 +++-
 .../g++.dg/warn/Wdangling-reference8.C| 77 +++
 2 files changed, 109 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/warn/Wdangling-reference8.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 0780b5840a3..43e65c3dffb 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -13850,7 +13850,38 @@ do_warn_dangling_reference (tree expr)
if (TREE_CODE (arg) == ADDR_EXPR)
  arg = TREE_OPERAND (arg, 0);
if (expr_represents_temporary_p (arg))
- return expr;
+ {
+   /* An ugly attempt to reduce the number of -Wdangling-reference
+  false positives concerning reference wrappers (c++/107532).
+  Don't warn about s.a().b() but do warn about S().a().b(),
+  supposing that the member function is returning a reference
+  to a subobject of the (non-temporary) object.  */
+   if (DECL_NONSTATIC_MEMBER_FUNCTION_P (fndecl)
+   && !DECL_OVERLOADED_OPERATOR_P (fndecl)
+   && i == 0)
+ {
+   tree t = arg;
+   while (handled_component_p (t))
+ t = TREE_OPERAND (t, 0);
+   t = TARGET_EXPR_INITIAL (arg);
+   /* Quite likely we don't have a chain of member functions
+  (like a().b().c()).  */
+   if (TREE_CODE (t) != CALL_EXPR)
+ return expr;
+   /* Walk the call chain to the original object and see if
+  it was a temporary.  */
+   do
+ t = tree_strip_nop_conversions (CALL_EXPR_ARG (t, 0));
+   while (TREE_CODE (t) == CALL_EXPR);
+   /* If the object argument is &TARGET_EXPR<>, we've started
+  off the chain with a temporary and we want to warn.  */
+   if (TREE_CODE (t) == ADDR_EXPR)
+ t = TREE_OPERAND (t, 0);
+   if (!expr_represents_temporary_p (t))
+ break;
+ }
+   return expr;
+ }
  /* Don't warn about member function like:
  std::any a(...);
  S& s = a.emplace({0}, 0);
diff --git a/gcc/testsuite/g++.dg/warn/Wdangling-reference8.C b/gcc/testsuite/g++.dg/warn/Wdangling-reference8.C
new file mode 100644
index 000..32280f3e282
--- /dev/null
+++ b/gcc/testsuite/g++.dg/warn/Wdangling-reference8.C
@@ -0,0 +1,77 @@
+// PR c++/107532
+// { dg-do compile { target c++11 } }
+// { dg-options "-Wdangling-reference" }
+
+struct Plane { unsigned int bytesused; };
+
+// Passes a reference through. Does not change lifetime.
+template 
+struct Ref {
+const T& i_;
+Ref(const T & i) : i_(i) {}
+const T & inner();
+};
+
+struct FrameMetadata {
+Ref planes() const { return p_; }
+
+Plane p_;
+};
+
+void bar(const Plane & meta);
+void foo(const FrameMetadata & fm)
+{
+const Plane & meta = fm.planes().inner();
+bar(meta);
+const Plane & meta2 = FrameMetadata().planes().inner(); // { dg-warning "dangling reference" }
+bar(meta2);
+}
+
+struct S {
+  const S& self () { return *this; }
+} s;
+
+const S& r1 = s.self();
+const S& r2 = S().self(); // { dg-warning "dangling reference" }
+
+struct D {
+};
+
+struct C {
+  D d;
+  Ref get() const { return d; }
+};
+
+struct B {
+  C c;
+  const C& get() const { return c; }
+  B();
+};
+
+struct A {
+  B b;
+  const B& get() const { return b; }
+};
+
+void
+g (const A& a)
+{
+  const auto& d1 = a.get().get().get().inner();
+  (void) d1;
+  const auto& d2 = A().get().get().get().inner(); // { 

[GCC][PATCH 13/15, v6] arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature.

2023-01-18 Thread Srinath Parvathaneni via Gcc-patches
Hello,

This patch teaches the DWARF support in gcc about the RA_AUTH_CODE pseudo
hard-register and also updates the ".save", ".cfi_register", ".cfi_offset" and
".cfi_restore" directives accordingly.
This patch also adds support to emit the ".pacspval" directive when the
"pac ip, lr, sp" instruction is generated in the assembly.

The RA_AUTH_CODE register number is 107 and its DWARF register number is 143.

When this patch is applied on top of the PACBTI series posted here
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599658.html and the
following test.c is compiled with the "-march=armv8.1-m.main+mve+pacbti
-mbranch-protection=pac-ret -mthumb -mfloat-abi=hard
-fasynchronous-unwind-tables -g -O0 -S" command line options, the assembly
output looks like this:

$cat test.c

void fun1(int a);
void fun(int a,...)
{
  fun1(a);
}

int main()
{
  fun (10);
  return 0;
}

$ arm-none-eabi-gcc -march=armv8.1-m.main+mve+pacbti \
    -mbranch-protection=pac-ret -mthumb -mfloat-abi=hard \
    -fasynchronous-unwind-tables -g -O0 -S test.c

Assembly output:
...
fun:
...
.pacspval
pac ip, lr, sp
.cfi_register 143, 12
push    {r3, r7, ip, lr}
.save {r3, r7, ra_auth_code, lr}
...
.cfi_offset 143, -24
...
.cfi_restore 143
...
aut ip, lr, sp
bx  lr
...
main:
...
.pacspval
pac ip, lr, sp
.cfi_register 143, 12
push    {r3, r7, ip, lr}
.save {r3, r7, ra_auth_code, lr}
...
.cfi_offset 143, -8
...
.cfi_restore 143
...
aut ip, lr, sp
bx  lr
...

Regression tested on arm-none-eabi target and found no regressions.

Ok for master?

Regards,
Srinath.

2023-01-18  Srinath Parvathaneni  

* config/arm/aout.h (ra_auth_code): Add entry in enum.
(emit_multi_reg_push): Add RA_AUTH_CODE register to
dwarf frame expression.
(arm_emit_multi_reg_pop): Restore RA_AUTH_CODE register.
(arm_expand_prologue): Update frame related information and reg notes
for pac/pacbti insn.
(arm_regno_class): Check for pac pseudo register.
(arm_dbx_register_number): Assign ra_auth_code register number in dwarf.
(arm_init_machine_status): Set pacspval_needed to zero.
(arm_debugger_regno): Check for PAC register.
(arm_unwind_emit_sequence): Print .save directive with ra_auth_code
register.
(arm_unwind_emit_set): Add entry for IP_REGNUM in switch case.
(arm_unwind_emit): Update REG_CFA_REGISTER case.
* config/arm/arm.h (FIRST_PSEUDO_REGISTER): Modify.
(DWARF_PAC_REGNUM): Define.
(IS_PAC_REGNUM): Likewise.
(enum reg_class): Add PAC_REG entry.
(machine_function): Add pacbti_needed state to structure.
* config/arm/arm.md (RA_AUTH_CODE): Define.

gcc/testsuite/ChangeLog:

2023-01-18  Srinath Parvathaneni  

* g++.target/arm/pac-1.C: New test.
* gcc.target/arm/pac-15.c: Likewise.


pacbti_dwarf.patch
Description: pacbti_dwarf.patch


[committed] analyzer: add SARD testsuite 81

2023-01-18 Thread David Malcolm via Gcc-patches
A 2013 paper [1] proposed 5 simple tests for evaluating the
effectiveness of static analysis tools at detecting
CWE-121 ("Stack-based Buffer Overflow").

The tests can be found in:
  https://samate.nist.gov/SARD/test-suites/81

This patch adds these 5 tests to -fanalyzer's testsuite, lightly
modified to add DejaGnu directives.

This is for unit-testing; for broader testing of -fanalyzer I'm working
on a separate integration testing suite that builds various real-world C
projects with -fanalyzer, currently here:
  https://github.com/davidmalcolm/gcc-analyzer-integration-tests

Successfully regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r13-5244-gc6a09bfa03.

[1] Black, P. , Koo, H. and Irish, T. (2013), A Basic CWE-121 Buffer Overflow 
Effectiveness Test Suite, Proc. 6th Latin-American Symposium on Dependable 
Computing, Rio de Janeiro, -1, [online], 
https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=913117 (Accessed January 
17, 2023)

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/SARD-tc117-basic-1-min.c: New test, adapted
from https://samate.nist.gov/SARD/test-suites/81.
* gcc.dg/analyzer/SARD-tc1909-stack_overflow_loop.c: Likewise.
* gcc.dg/analyzer/SARD-tc249-basic-00034-min.c: Likewise.
* gcc.dg/analyzer/SARD-tc293-basic-00045-min.c: Likewise.
* gcc.dg/analyzer/SARD-tc841-basic-00182-min.c: Likewise.

Signed-off-by: David Malcolm 
---
 .../analyzer/SARD-tc117-basic-1-min.c | 67 +
 .../SARD-tc1909-stack_overflow_loop.c | 29 
 .../analyzer/SARD-tc249-basic-00034-min.c | 67 +
 .../analyzer/SARD-tc293-basic-00045-min.c | 69 ++
 .../analyzer/SARD-tc841-basic-00182-min.c | 73 +++
 5 files changed, 305 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/SARD-tc117-basic-1-min.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/SARD-tc1909-stack_overflow_loop.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/SARD-tc249-basic-00034-min.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/SARD-tc293-basic-00045-min.c
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/SARD-tc841-basic-00182-min.c

diff --git a/gcc/testsuite/gcc.dg/analyzer/SARD-tc117-basic-1-min.c b/gcc/testsuite/gcc.dg/analyzer/SARD-tc117-basic-1-min.c
new file mode 100644
index 000..e1ce195ad8b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/analyzer/SARD-tc117-basic-1-min.c
@@ -0,0 +1,67 @@
+/* Adapted from https://samate.nist.gov/SARD/test-cases/117/versions/1.0.0
+   Part of https://samate.nist.gov/SARD/test-suites/81
+   See:
+ Black, P. , Koo, H. and Irish, T. (2013), A Basic CWE-121 Buffer Overflow 
Effectiveness Test Suite, Proc. 6th Latin-American Symposium on Dependable 
Computing, Rio de Janeiro, -1, [online], 
https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=913117 (Accessed January 
17, 2023)
+*/
+
+/* Taxonomy Classification: 000100 */
+
+/*
+ *  WRITE/READ  0  write
+ *  WHICH BOUND 0  upper
+ *  DATA TYPE   0  char
+ *  MEMORY LOCATION 0  stack
+ *  SCOPE   0  same
+ *  CONTAINER   0  no
+ *  POINTER 0  no
+ *  INDEX COMPLEXITY0  constant
+ *  ADDRESS COMPLEXITY  0  constant
+ *  LENGTH COMPLEXITY   0  N/A
+ *  ADDRESS ALIAS   0  none
+ *  INDEX ALIAS 0  none
+ *  LOCAL CONTROL FLOW  0  none
+ *  SECONDARY CONTROL FLOW  0  none
+ *  LOOP STRUCTURE  0  no
+ *  LOOP COMPLEXITY 0  N/A
+ *  ASYNCHRONY  0  no
+ *  TAINT   0  no
+ *  RUNTIME ENV. DEPENDENCE 0  no
+ *  MAGNITUDE   1  1 byte
+ *  CONTINUOUS/DISCRETE 0  discrete
+ *  SIGNEDNESS  0  no
+ */
+
+/*
+Copyright 2004 M.I.T.
+
+Permission is hereby granted, without written agreement or royalty fee, to 
use, 
+copy, modify, and distribute this software and its documentation for any 
+purpose, provided that the above copyright notice and the following three 
+paragraphs appear in all copies of this software.
+
+IN NO EVENT SHALL M.I.T. BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, 
+INCIDENTAL, OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE 
+AND ITS DOCUMENTATION, EVEN IF M.I.T. HAS BEEN ADVISED OF THE POSSIBILITY OF 
+SUCH DAMANGE.
+
+M.I.T. SPECIFICALLY DISCLAIMS ANY WARRANTIES INCLUDING, BUT NOT LIMITED TO 
+THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE, 
+AND NON-INFRINGEMENT.
+
+THE SOFTWARE IS PROVIDED ON AN "AS-IS" BASIS AND M.I.T. HAS NO OBLIGATION TO 
+PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR MODIFICATIONS.
+*/
+
+
+int main(int argc, char *argv[])
+{
+  char buf[10];
+
+
+  /*  BAD  */
+  buf[10] = 'A'; /* { 

Re: [GCC][PATCH 13/15, v5] arm: Add support for dwarf debug directives and pseudo hard-register for PAC feature.

2023-01-18 Thread Richard Earnshaw via Gcc-patches




On 13/01/2023 17:44, Srinath Parvathaneni via Gcc-patches wrote:

Hello,

This patch teaches the DWARF support in gcc about the RA_AUTH_CODE pseudo
hard-register and also updates the ".save", ".cfi_register", ".cfi_offset" and
".cfi_restore" directives accordingly.
This patch also adds support to emit the ".pacspval" directive when the
"pac ip, lr, sp" instruction is generated in the assembly.

The RA_AUTH_CODE register number is 107 and its DWARF register number is 143.

When this patch is applied on top of the PACBTI series posted here
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599658.html and the
following test.c is compiled with the "-march=armv8.1-m.main+mve+pacbti
-mbranch-protection=pac-ret -mthumb -mfloat-abi=hard
-fasynchronous-unwind-tables -g -O0 -S" command line options, the assembly
output looks like this:

$cat test.c

void fun1(int a);
void fun(int a,...)
{
   fun1(a);
}

int main()
{
   fun (10);
   return 0;
}

$ arm-none-eabi-gcc -march=armv8.1-m.main+mve+pacbti \
    -mbranch-protection=pac-ret -mthumb -mfloat-abi=hard \
    -fasynchronous-unwind-tables -g -O0 -S test.c

Assembly output:
...
fun:
...
 .pacspval
 pac ip, lr, sp
 .cfi_register 143, 12
 push    {r3, r7, ip, lr}
 .save {r3, r7, ra_auth_code, lr}
...
 .cfi_offset 143, -24
...
 .cfi_restore 143
...
 aut ip, lr, sp
 bx  lr
...
main:
...
 .pacspval
 pac ip, lr, sp
 .cfi_register 143, 12
 push    {r3, r7, ip, lr}
 .save {r3, r7, ra_auth_code, lr}
...
 .cfi_offset 143, -8
...
 .cfi_restore 143
...
 aut ip, lr, sp
 bx  lr
...

Regression tested on arm-none-eabi target and found no regressions.

Ok for master?

Regards,
Srinath.

2023-01-11  Srinath Parvathaneni  

 * config/arm/aout.h (ra_auth_code): Add entry in enum.
 (emit_multi_reg_push): Add RA_AUTH_CODE register to
 dwarf frame expression.
 (arm_emit_multi_reg_pop): Restore RA_AUTH_CODE register.
 (arm_expand_prologue): Update frame related information and reg notes
 for pac/pacbti insn.
 (arm_regno_class): Check for pac pseudo register.
 (arm_dbx_register_number): Assign ra_auth_code register number in 
dwarf.
 (arm_init_machine_status): Set pacspval_needed to zero.
 (arm_debugger_regno): Check for PAC register.
 (arm_unwind_emit_sequence): Print .save directive with ra_auth_code
 register.
 (arm_unwind_emit_set): Add entry for IP_REGNUM in switch case.
 (arm_unwind_emit): Update REG_CFA_REGISTER case.
 * config/arm/arm.h (FIRST_PSEUDO_REGISTER): Modify.
 (DWARF_PAC_REGNUM): Define.
 (IS_PAC_REGNUM): Likewise.
 (enum reg_class): Add PAC_REG entry.
 (machine_function): Add pacbti_needed state to structure.
 * config/arm/arm.md (RA_AUTH_CODE): Define.

gcc/testsuite/ChangeLog:

2023-01-11  Srinath Parvathaneni  

 * g++.target/arm/pac-1.C: New test.
 * gcc.target/arm/pac-15.c: Likewise.


Your attachments are still not being correctly detected.  Perhaps this 
is because of the filename you've chosen, which has no recognizable 
extension.  If you name your files .patch (or .diff, or even 
.txt) then the system should automatically pick the right mime type 
for encoding.


+ /* NOTE: Dwarf code emitter handle reg-reg copies correctly and in the
+following example reg-reg copy of SP to IP register is handled
+through .cfi_def_cfa_register directive and the .cfi_offset
+directive for IP register is skipped by dwarf code emitter.
+Example:
+   mov ip, sp
+   .cfi_def_cfa_register 12
+   push{fp, ip, lr, pc}
+   .cfi_offset 11, -16
+   .cfi_offset 13, -12
+   .cfi_offset 14, -8
+
+Where as Arm-specific .save directive reg-reg copy handling is
+buggy.  After the reg-reg copy, the copied registers need to be

It's not buggy (if it were you'd need to fix it :).  It just works in a 
different way to the dwarf tracker and doesn't need to handle reg->reg 
copies.  So please rephrase this.


+populated in .save directive register list but with the current
+implementation of .save directive original registers are getting
+populated in the register list.  So to avoid this issue for IP
+register when PACBTI is enabled we manually updated the .save
+directive register list to use "ra_auth_code" (pseduo register 143)
+instead of IP register as shown in following example.
+Example:
+   pacbti  ip, lr, sp
+   .cfi_register 143, 12
+   push{r3, r7, ip, lr}
+   .save {r3, r7, ra_auth_code, lr}
+ */

R.


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 18, 2023 at 05:25:10PM +0100, Jan Hubicka wrote:
> > On Jan 18 2023, Michael Matz wrote:
> > 
> > > The purest solution is to emit unwind tables for all functions that 
> > > request it into .eh_frame and for those that don't request it put 
> > > into .debug_frame (if also -g is on).
> > 
> > The assembler does not allow switching back to .eh_frame once a
> > different format has been chosen, so .eh_frame must be either on or off
> > all the way through.
> 
> > This is unfortunate (and I did not notice this earlier).
> > Would it be hard to fix the assembler? In general situations like this can
> be handled by forced partitioning in GCC, but it is not a good solution
> since we want to keep partitioning algorithm an optional step by design.

If it was just about compiler emitted .cfi_* directives, we could say
use .cfi_* directives for .eh_frame and hand emitted .debug_line for
.debug_frame or vice versa in the case of mixing functions with different
flags.
But inline asm can contain user written .cfi_* directives, so I think we
need to do something on the assembler side and then adjust gcc.

Jakub



Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jan Hubicka via Gcc-patches
> On Jan 18 2023, Michael Matz wrote:
> 
> > The purest solution is to emit unwind tables for all functions that 
> > request it into .eh_frame and for those that don't request it put 
> > into .debug_frame (if also -g is on).
> 
> The assembler does not allow switching back to .eh_frame once a
> different format has been chosen, so .eh_frame must be either on or off
> all the way through.

This is unfortunate (and I did not notice this earlier).
Would it be hard to fix the assembler? In general, situations like this can
be handled by forced partitioning in GCC, but it is not a good solution
since we want to keep the partitioning algorithm an optional step by design.

Honza
> 
> -- 
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
On Jan 18 2023, Michael Matz wrote:

> The purest solution is to emit unwind tables for all functions that 
> request it into .eh_frame and for those that don't request it put 
> into .debug_frame (if also -g is on).

The assembler does not allow switching back to .eh_frame once a
different format has been chosen, so .eh_frame must be either on or off
all the way through.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[Patch] libfortran: Fix execute_command_line for Windows

2023-01-18 Thread Tobias Burnus

Reported by nightstrike, who also tested this patch.

On Windows, we call system() which works as described at
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/system-wsystem?view=msvc-170

Namely, it only fails with "-1" if the command interpreter
could not be started. Otherwise, it returns the value returned by the
command interpreter. (Same on Linux.) On POSIX systems, 'sh' calls exit(127)
or _exit(127) if it cannot execute the program named in the passed string,
as documented. Cf. https://www.unix.com/man-page/posix/3p/system/

Thus, the question is what happens on Windows. Our experiments, several
webpages (like stackoverflow) and the source code of WINE for cmd.exe indicate
that Windows returns 9009 in that case. See for instance
https://github.com/wine-mirror/wine/blob/master/programs/cmd/wcmdmain.c#L1262-L1269

Thus, we now treat 9009 like 126 and 127. The code is for MINGW; Cygwin does
not define that macro and is likely to use return values closer to POSIX.
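
As a rough standalone model of the resulting check (a sketch only:
command_not_executable is a hypothetical helper, not a libgfortran function,
and it takes an already-decoded exit status instead of using
WIFEXITED/WEXITSTATUS as the real code does):

```cpp
#include <cassert>

// Hypothetical helper: returns true if RES, the decoded result of
// system(), says that the command line could not be executed.
// IS_MINGW stands in for the "#ifdef __MINGW32__" guard in the patch.
bool
command_not_executable (int res, bool is_mingw)
{
  // Shell return codes 126 and 127: command not executable / not found.
  if (res == 126 || res == 127)
    return true;

  // cmd.exe sets the errorlevel to 9009 in that situation.
  if (is_mingw && res == 9009)
    return true;

  return false;
}

void
example ()
{
  assert (command_not_executable (127, false));
  assert (command_not_executable (9009, true));
  assert (!command_not_executable (9009, false));
}
```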

OK for mainline?

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; limited liability company (GmbH); Managing Directors: Thomas
Heurung, Frank Thürauf; Registered office: Munich; Commercial register:
Munich, HRB 106955

libfortran: Fix execute_command_line for Windows

On Windows, 'system' is called - that fails with -1 if the command
interpreter could not be started; on POSIX systems, if the child
process could not be started by the shell, exit(127)/_exit(127) is
called/returned. On Windows, cmd.exe (and also the PowerShell) return
errorlevel 9009.

libgfortran/ChangeLog:

	* intrinsics/execute_command_line.c (execute_command_line): On
	Windows, regard system()'s return value of 9009 as EXEC_INVALIDCOMMAND.

diff --git a/libgfortran/intrinsics/execute_command_line.c b/libgfortran/intrinsics/execute_command_line.c
index 305f067d973..0d1688400c2 100644
--- a/libgfortran/intrinsics/execute_command_line.c
+++ b/libgfortran/intrinsics/execute_command_line.c
@@ -142,10 +142,15 @@ execute_command_line (const char *command, bool wait, int *exitstat,
 #endif
   else if (res == 127 || res == 126
 #if defined(WEXITSTATUS) && defined(WIFEXITED)
 	   || (WIFEXITED(res) && WEXITSTATUS(res) == 127)
 	   || (WIFEXITED(res) && WEXITSTATUS(res) == 126)
+#endif
+#ifdef __MINGW32__
+		  /* cmd.exe sets the errorlevel to 9009,
+		 if the command could not be executed.  */
+		|| res == 9009
 #endif
 	   )
 	/* Shell return codes 126 and 127 mean that the command line could
 	   not be executed for various reasons.  */
 	set_cmdstat (cmdstat, EXEC_INVALIDCOMMAND);


Re: [PATCH] IPA: do not release body if still needed

2023-01-18 Thread Jan Hubicka via Gcc-patches
> The code removing function bodies when the last call graph clone of a
> node is removed is too aggressive when there are nodes up the
> clone_of chain which still need them.  Fixed by expanding the check.
> 
> gcc/ChangeLog:
> 
> 2023-01-18  Martin Jambor  
> 
>   PR ipa/107944
>   * cgraph.cc (cgraph_node::remove): Check whether nodes up the
>   clone_of chain also do not need the body.
> ---
>  gcc/cgraph.cc | 14 ++++++++++++--
>  1 file changed, 12 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
> index 5e60c2b73db..5f72ace9b57 100644
> --- a/gcc/cgraph.cc
> +++ b/gcc/cgraph.cc
> @@ -1893,8 +1893,18 @@ cgraph_node::remove (void)
>else if (clone_of)
>  {
>clone_of->clones = next_sibling_clone;
> -  if (!clone_of->analyzed && !clone_of->clones && !clones)
> - clone_of->release_body ();
> +  if (!clones)
> + {
> +   bool need_body = false;
> +   for (cgraph_node *n = clone_of; n; n = n->clone_of)
> + if (n->analyzed || n->clones)
> +   {
> + need_body = true;
If you want to walk immediate clones and see if any of them is needed, I
wonder why you don't also recursively walk clones of clones?

The original idea was that the clones should be materialized and removed one
by one (or proved unreachable and just removed), and once we remove the last
one, we should figure out that the body is not needed. For that one does
not need the walk at all.

How exactly do we end up with clones that are not analyzed?

Honza
> + break;
> +   }
> +   if (!need_body)
> + clone_of->release_body ();
> + }
>  }
>if (next_sibling_clone)
>  next_sibling_clone->prev_sibling_clone = prev_sibling_clone;
> -- 
> 2.39.0
> 


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Michael Matz via Gcc-patches
Hello,

On Wed, 18 Jan 2023, Jakub Jelinek wrote:

> > > > > Partly OT, what is riscv not defaulting that on as well?  Does it have
> > > > > usable unwind info even without that option, something else?
> > > > 
> > > > The RISC-V ABI does not address this, AFAICS.
> > > 
> > > And neither do many other ABIs, still we default there to
> > > -fasynchronous-unwind-tables because we've decided it is a good idea.
> > 
> > That might or might not be, but in the context of this thread that's 
> > immaterial.  Doing the same as the other archs will then simply hide the 
> > problem on risc-v as well, instead of fixing it.
> 
> Yeah, that is why I've mentioned "Partly OT".  We want this bug to be fixed
> (but the fix is not what has been posted but rather decide what we want to
> ask there; if it is at the end of compilation, whether it is at least one
> function with that flag has been compiled, or all functions have been with
> that flag, something else),

The answer to this should be guided by normal use cases.  The normal use 
case is that a whole input file is compiled with or without 
-funwind-tables, and not that individual functions change this.  So any 
solution in which that use case doesn't work is not a solution.

The purest solution is to emit unwind tables into .eh_frame for all 
functions that request it, and into .debug_frame for those that don't 
(if -g is also on).  If that requires enabling 
unwind-tables globally first (i.e. if even just one input function 
requests it) then that is what needs to be done.  (this seems to be the 
problem currently, that the unwind-table activation on a per-function 
basis comes too late).

The easier solution might be to make -funwind-tables also a global 
special-cased option for LTO (so that the usual use case above works).  
That would trade one bug for another, but I'd say the current bug is more 
serious than the one that would be introduced.

> and IMHO riscv should switch to
> -fasynchronous-unwind-tables by default.

I don't disagree, as long as that doesn't lead to the bug being ignored :)


Ciao,
Michael.


Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-18 Thread Jan Hubicka via Gcc-patches
> On Tue, 17 Jan 2023, Jan Hubicka wrote:
> 
> > > > We don't use same argumentation about other control flow statements.
> > > > The following:
> > > > 
> > > > fn()
> > > > {
> > > >   try {
> > > > i_read_no_global_memory ();
> > > >   } catch (...)
> > > >   {
> > > > > return 1;
> > > >   }
> > > >   return 0;
> > > > }
> > > > 
> > > > should be detected as const.  Marking throw pure would make fn pure too.
> > > 
> > > I suppose i_read_no_global_memory is const here.  Not sure why that
> > Suppose we have:
> > 
> > void
> > i_read_no_global_memory ()
> > {
> >   throw(0);
> > }
> > 
> > If cxa_throw itself was annotated as 'p' rather than 'c' ipa-modref will
> > believe that cxa_throw will read any global memory and will propagate it
> > to all callers. So fn() will be also marked as reading all global
> > memory.
> 
> Sure - but for the purpose of local optimizations in 
> i_read_no_global_memory cxa_throw has to appear to read memory.

Yes, I think every stmt that can throw externally needs a VUSE (just like
return_stmt needs it).  Even if throw(0) was replaced by a=b/c with
-fnon-call-exceptions.  It is still not clear to me why this should
imply that we need 'p' instead of 'c' in fnspecs.

So I think we should try to make the following work:

diff --git a/gcc/tree-ssa-operands.cc b/gcc/tree-ssa-operands.cc
index 57e393ae164..d24f1721eb2 100644
--- a/gcc/tree-ssa-operands.cc
+++ b/gcc/tree-ssa-operands.cc
@@ -951,6 +951,9 @@ operands_scanner::parse_ssa_operands ()
   enum gimple_code code = gimple_code (stmt);
   size_t i, n, start = 0;
 
+  if (stmt_can_throw_external (fn, stmt))
+append_vuse (gimple_vop (fn));
+
   switch (code)
 {
 case GIMPLE_ASM:

> Having a VUSE there dependent on whether the function performs any
> load or store would be quite ugly.  Instead modref could special-case
> cxa_throw and not treat it as reading memory (like it already does
> for the return stmt I suppose - that also has a VUSE).

modref looks into statements with VUSEs on them and checks what
reads/stores are done.  So a return statement with a VUSE is walked and no
load is recorded because no actual load is found.
The same would happen with __cxa_throw if it was 'c'.
With 'p' it has nothing to analyze, so it would trust the fact that
cxa_throw itself reads some global state.
> 
> The problem is IIRC GIMPLE_RESX which doesn't derive from
> gimple_statement_with_memory_ops_base.  There's a bugzilla I can't find
> right now referring to this issue.

I never tried to play with the gimple hierarchy.  Is it hard to fix resx?  I
wonder if we have other cases.  I guess for a=b/c we are lucky just
because gimple_assign can also be a load or store so it has memory_ops...

Thanks,
Honza


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 18, 2023 at 03:16:07PM +, Michael Matz wrote:
> On Wed, 18 Jan 2023, Jakub Jelinek wrote:
> 
> > On Wed, Jan 18, 2023 at 04:09:08PM +0100, Andreas Schwab wrote:
> > > On Jan 18 2023, Jakub Jelinek wrote:
> > > 
> > > > Partly OT, what is riscv not defaulting that on as well?  Does it have
> > > > usable unwind info even without that option, something else?
> > > 
> > > The RISC-V ABI does not address this, AFAICS.
> > 
> > And neither do many other ABIs, still we default there to
> > -fasynchronous-unwind-tables because we've decided it is a good idea.
> 
> That might or might not be, but in the context of this thread that's 
> immaterial.  Doing the same as the other archs will then simply hide the 
> problem on risc-v as well, instead of fixing it.

Yeah, that is why I've mentioned "Partly OT".  We want this bug to be fixed
(but the fix is not what has been posted but rather decide what we want to
ask there; if it is at the end of compilation, whether it is at least one
function with that flag has been compiled, or all functions have been with
that flag, something else), and IMHO riscv should switch to
-fasynchronous-unwind-tables by default.

Jakub



Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Michael Matz via Gcc-patches
Hello,

On Wed, 18 Jan 2023, Jakub Jelinek wrote:

> On Wed, Jan 18, 2023 at 04:09:08PM +0100, Andreas Schwab wrote:
> > On Jan 18 2023, Jakub Jelinek wrote:
> > 
> > > Partly OT, what is riscv not defaulting that on as well?  Does it have
> > > usable unwind info even without that option, something else?
> > 
> > The RISC-V ABI does not address this, AFAICS.
> 
> And neither do many other ABIs, still we default there to
> -fasynchronous-unwind-tables because we've decided it is a good idea.

That might or might not be, but in the context of this thread that's 
immaterial.  Doing the same as the other archs will then simply hide the 
problem on risc-v as well, instead of fixing it.


Ciao,
Michael.


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
On Jan 18 2023, Jakub Jelinek wrote:

> Neither of those will always match all the states of all the functions.

But if the translation units are compiled with -funwind-tables, we want
the ltrans "units" to behave the same.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 18, 2023 at 04:09:08PM +0100, Andreas Schwab wrote:
> On Jan 18 2023, Jakub Jelinek wrote:
> 
> > Partly OT, what is riscv not defaulting that on as well?  Does it have
> > usable unwind info even without that option, something else?
> 
> The RISC-V ABI does not address this, AFAICS.

And neither do many other ABIs, still we default there to
-fasynchronous-unwind-tables because we've decided it is a good idea.

Jakub



Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
On Jan 18 2023, Jakub Jelinek wrote:

> Partly OT, what is riscv not defaulting that on as well?  Does it have
> usable unwind info even without that option, something else?

The RISC-V ABI does not address this, AFAICS.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jan Hubicka via Gcc-patches
> No unwind tables are generated, as if -funwind-tables is ignored.  If
> LTO is disabled, everything works as expected.
I think it is because dwarf2out_do_eh_frame is called out of function
context at the end of compilation. At that time cfun is NULL
and the flag is read from the global settings, which are wrong.
So we need to keep track of whether we saw a function that needs EH tables or not.
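
A minimal sketch of such bookkeeping, under stated assumptions: unwind_summary, note_function, any_wants and all_want are hypothetical names and do not exist in GCC. It only shows how a per-function flag could be accumulated so that a decision made outside any function context can ask either question raised in this thread (at least one function, or all of them).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary, accumulated while functions are compiled.  */
struct unwind_summary
{
  unsigned total;    /* functions seen so far */
  unsigned wanting;  /* functions that requested unwind tables */
};

/* Record one function and whether it wants unwind tables.  */
static void
note_function (struct unwind_summary *s, bool wants_unwind_tables)
{
  s->total++;
  if (wants_unwind_tables)
    s->wanting++;
}

/* True if at least one recorded function wants unwind tables.  */
static bool
any_wants (const struct unwind_summary *s)
{
  return s->wanting > 0;
}

/* True if every recorded function wants them (false if none were seen).  */
static bool
all_want (const struct unwind_summary *s)
{
  return s->total > 0 && s->wanting == s->total;
}
```

Which of the two predicates the end-of-compilation code should consult is exactly the open question here; the sketch does not settle it.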

Honza
> 
> -- 
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."


Re: [PATCH] IPA: do not release body if still needed

2023-01-18 Thread Martin Jambor
Hi,

On Mon, Jan 16 2023, Martin Liška wrote:
> On 1/14/23 22:36, Jan Hubicka wrote:
>>> Noticed during building of libbackend.a with the LTO partial linking.
>>>
>>> The function release_body is called even if clone_of is a clone
>>> of another function and thus it shares tree declaration. We should
>>> preserve it in that situation.
>>>
>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>>
>>> Ready to be installed?
>>> Thanks,
>>> Martin
>>>
>>> PR ipa/107944
>>>
>>> gcc/ChangeLog:
>>>
>>> * cgraph.cc (cgraph_node::remove): Do not release body
>>> if a node is clone of another node.
>>> ---
>>>  gcc/cgraph.cc | 2 +-
>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
>>> index f15cb47c8b8..2e7d77ffd6c 100644
>>> --- a/gcc/cgraph.cc
>>> +++ b/gcc/cgraph.cc
>>> @@ -1893,7 +1893,7 @@ cgraph_node::remove (void)
>>>else if (clone_of)
>>>  {
>>>clone_of->clones = next_sibling_clone;
>>> -  if (!clone_of->analyzed && !clone_of->clones && !clones)
>>> +  if (!clone_of->analyzed && !clone_of->clones && !clones && 
>>> !clone_of->clone_of)
>>> clone_of->release_body ();
>> 
>> It is interesting that the problem reproduced only after almost 20
>> years.  But I suppose it is because we materialize clones in particular
>> order.
>
> Well, it started with r13-48-g27ee75dbe81bb7 where Martin added new code
> that calls the release_body function. So it's pretty new.
>
>> 
>> I think there are two ways to fix it.  Either declare release_body to be
>> applicable only to the master clone and avoid calling it here (as you
>> do) or make release_body do nothing when called on a clone.
>> I guess it makes sense to keep your approach but please add sanity check
>> to release_body that clone_of == NULL with a comment.
>
> I do support Martin's enhanced version of the patch.
>

I take that as an approval, so I am about to commit the following after
re-testing it on trunk.  Afterwards I'll backport it to the affected
release branches too.

Thanks,

Martin


The code removing function bodies when the last call graph clone of a
node is removed is too aggressive when there are nodes up the
clone_of chain which still need them.  Fixed by expanding the check.

gcc/ChangeLog:

2023-01-18  Martin Jambor  

PR ipa/107944
* cgraph.cc (cgraph_node::remove): Check whether nodes up the
clone_of chain also do not need the body.
---
 gcc/cgraph.cc | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
index 5e60c2b73db..5f72ace9b57 100644
--- a/gcc/cgraph.cc
+++ b/gcc/cgraph.cc
@@ -1893,8 +1893,18 @@ cgraph_node::remove (void)
   else if (clone_of)
 {
   clone_of->clones = next_sibling_clone;
-  if (!clone_of->analyzed && !clone_of->clones && !clones)
-   clone_of->release_body ();
+  if (!clones)
+   {
+ bool need_body = false;
+ for (cgraph_node *n = clone_of; n; n = n->clone_of)
+   if (n->analyzed || n->clones)
+ {
+   need_body = true;
+   break;
+ }
+ if (!need_body)
+   clone_of->release_body ();
+   }
 }
   if (next_sibling_clone)
 next_sibling_clone->prev_sibling_clone = prev_sibling_clone;
-- 
2.39.0
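
For readers without the GCC sources at hand, the expanded check can be sketched on a simplified structure. The struct below is a hypothetical stand-in that keeps only the three cgraph_node fields the hunk touches; chain_needs_body is an illustrative name, not a GCC function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical stand-in for cgraph_node.  */
struct node
{
  bool analyzed;          /* node is still analyzed */
  struct node *clones;    /* first remaining clone, or NULL */
  struct node *clone_of;  /* node this one was cloned from, or NULL */
};

/* Return true if START or any node above it on the clone_of chain
   is analyzed or still has clones, i.e. the body must be kept.  */
static bool
chain_needs_body (const struct node *start)
{
  for (const struct node *n = start; n; n = n->clone_of)
    if (n->analyzed || n->clones)
      return true;
  return false;
}
```

The old condition looked only at the immediate clone_of node; with a chain such as root (analyzed) <- mid (not analyzed, no clones), it would release the body even though root still needs it, whereas the walk above returns true for mid.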



Re: [aarch64] Use exact_log2 (INTVAL (operands[2])) >= 0 to gate for vec_merge patterns.

2023-01-18 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni  writes:
> Hi Richard,
> Based on your suggestion in the other thread, the patch uses
> exact_log2 (INTVAL (operands[2])) >= 0 to gate for vec_merge patterns.
> Bootstrap+test in progress on aarch64-linux-gnu.
> Does it look OK ?

Yeah, this is OK, thanks.  IMO it's a latent bug and suitable for stage 4.

Richard

>
> Thanks,
> Prathamesh
>
> [aarch64] Use exact_log2 (INTVAL (operands[2])) >= 0 to gate for vec_merge 
> patterns.
>
> gcc/ChangeLog:
>   * gcc/config/aarch64-simd.md (aarch64_simd_vec_set): Use
>   exact_log2 (INTVAL (operands[2])) >= 0 as condition for gating
>   the pattern.
>   (aarch64_simd_vec_copy_lane): Likewise.
>   (aarch64_simd_vec_copy_lane_): Likewise.
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 104088f67d2..7cc8c00f0ec 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1064,7 +1064,7 @@
>   (match_operand: 1 "aarch64_simd_nonimmediate_operand" 
> "w,?r,Utv"))
>   (match_operand:VALL_F16 3 "register_operand" "0,0,0")
>   (match_operand:SI 2 "immediate_operand" "i,i,i")))]
> -  "TARGET_SIMD"
> +  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
>{
> int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
> operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
> @@ -1093,7 +1093,7 @@
> [(match_operand:SI 4 "immediate_operand" "i")])))
>   (match_operand:VALL_F16 1 "register_operand" "0")
>   (match_operand:SI 2 "immediate_operand" "i")))]
> -  "TARGET_SIMD"
> +  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
>{
>  int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
>  operands[2] = GEN_INT (HOST_WIDE_INT_1 << elt);
> @@ -1114,7 +1114,7 @@
> [(match_operand:SI 4 "immediate_operand" "i")])))
>   (match_operand:VALL_F16_NO_V2Q 1 "register_operand" "0")
>   (match_operand:SI 2 "immediate_operand" "i")))]
> -  "TARGET_SIMD"
> +  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
>{
>  int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
>  operands[2] = GEN_INT (HOST_WIDE_INT_1 << elt);
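
To illustrate what the new insn condition accepts, here is a small stand-alone sketch. exact_log2_sketch mimics the documented behaviour of GCC's exact_log2 (the log2 of a power of two, otherwise -1), and single_lane_mask_p is a hypothetical name for the gate; neither is the actual GCC implementation.

```c
#include <assert.h>

/* Stand-in for GCC's exact_log2: return log2 (X) if X is a power of
   two, otherwise -1.  */
static int
exact_log2_sketch (unsigned long long x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;

  int log = 0;
  while ((x >>= 1) != 0)
    log++;
  return log;
}

/* Hypothetical gate: a vec_merge mask is acceptable for these patterns
   only if exactly one bit is set, i.e. it selects a single lane.  */
static int
single_lane_mask_p (long long mask)
{
  return exact_log2_sketch ((unsigned long long) mask) >= 0;
}
```

A mask of 8 (lane 3) passes the gate, while a mask such as 12, which merges two lanes at once, is rejected instead of reaching ENDIAN_LANE_N with -1.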


Re: [aarch64] Use wzr/xzr for assigning vector element to 0

2023-01-18 Thread Richard Sandiford via Gcc-patches
Prathamesh Kulkarni  writes:
> On Tue, 17 Jan 2023 at 18:29, Richard Sandiford
>  wrote:
>>
>> Prathamesh Kulkarni  writes:
>> > Hi Richard,
>> > For the following (contrived) test:
>> >
>> > int32x4_t foo(int32x4_t v)
>> > {
>> >   v[3] = 0;
>> >   return v;
>> > }
>> >
>> > -O2 code-gen:
>> > foo:
>> > fmov    s1, wzr
>> > ins v0.s[3], v1.s[0]
>> > ret
>> >
>> > I suppose we can instead emit the following code-gen ?
>> > foo:
>> >  ins v0.s[3], wzr
>> >  ret
>> >
>> > combine produces:
>> > Failed to match this instruction:
>> > (set (reg:V4SI 95 [ v ])
>> > (vec_merge:V4SI (const_vector:V4SI [
>> > (const_int 0 [0]) repeated x4
>> > ])
>> > (reg:V4SI 97)
>> > (const_int 8 [0x8])))
>> >
>> > So, I wrote the following pattern to match the above insn:
>> > (define_insn "aarch64_simd_vec_set_zero"
>> >   [(set (match_operand:VALL_F16 0 "register_operand" "=w")
>> > (vec_merge:VALL_F16
>> > (match_operand:VALL_F16 1 "const_dup0_operand" "w")
>> > (match_operand:VALL_F16 3 "register_operand" "0")
>> > (match_operand:SI 2 "immediate_operand" "i")))]
>> >   "TARGET_SIMD"
>> >   {
>> > int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
>> > operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
>> > return "ins\\t%0.[%p2], wzr";
>> >   }
>> > )
>> >
>> > which now matches the above insn produced by combine.
>> > However, in reload dump, it creates a new insn for assigning
>> > register to (const_vector (const_int 0)),
>> > which results in:
>> > (insn 19 8 13 2 (set (reg:V4SI 33 v1 [99])
>> > (const_vector:V4SI [
>> > (const_int 0 [0]) repeated x4
>> > ])) "wzr-test.c":8:1 1269 {*aarch64_simd_movv4si}
>> >  (nil))
>> > (insn 13 19 14 2 (set (reg/i:V4SI 32 v0)
>> > (vec_merge:V4SI (reg:V4SI 33 v1 [99])
>> > (reg:V4SI 32 v0 [97])
>> > (const_int 8 [0x8]))) "wzr-test.c":8:1 1808
>> > {aarch64_simd_vec_set_zerov4si}
>> >  (nil))
>> >
>> > and eventually the code-gen:
>> > foo:
>> > movi    v1.4s, 0
>> > ins v0.s[3], wzr
>> > ret
>> >
>> > To get rid of redundant assignment of 0 to v1, I tried to split the
>> > above pattern
>> > as in the attached patch. This works to emit code-gen:
>> > foo:
>> > ins v0.s[3], wzr
>> > ret
>> >
>> > However, I am not sure if this is the right approach. Could you suggest,
>> > if it'd be possible to get rid of UNSPEC_SETZERO in the patch ?
>>
>> The problem is with the "w" constraint on operand 1, which tells LRA
>> to force the zero into an FPR.  It should work if you remove the
>> constraint.
> Ah indeed, sorry about that, changing the constraint works.

"i" isn't right though, because that's for scalar integers.
There's no need for any constraint here -- the predicate does
all of the work.

> Does the attached patch look OK after bootstrap+test ?
> Since we're in stage-4, shall it be OK to commit now, or queue it for stage-1?

It needs tests as well. :-)

Also:

> Thanks,
> Prathamesh
>
>
>>
>> Also, I think you'll need to use zr for the zero, so that
>> it uses xzr for 64-bit elements.
>>
>> I think this and the existing patterns ought to test
>> exact_log2 (INTVAL (operands[2])) >= 0 in the insn condition,
>> since there's no guarantee that RTL optimisations won't form
>> vec_merges that have other masks.
>>
>> Thanks,
>> Richard
>
> [aarch64] Use wzr/xzr for assigning 0 to vector element.
>
> gcc/ChangeLog:
>   * config/aarch64/aarch64-simd.md (aarch64_simd_vec_set_zero):
>   New pattern.
>   * config/aarch64/predicates.md (const_dup0_operand): New.
>
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 104088f67d2..8e54ee4e886 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1083,6 +1083,20 @@
>[(set_attr "type" "neon_ins, neon_from_gp, neon_load1_one_lane")]
>  )
>  
> +(define_insn "aarch64_simd_vec_set_zero"
> +  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> + (vec_merge:VALL_F16
> + (match_operand:VALL_F16 1 "const_dup0_operand" "i")
> + (match_operand:VALL_F16 3 "register_operand" "0")
> + (match_operand:SI 2 "immediate_operand" "i")))]
> +  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
> +  {
> +int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
> +operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
> +return "ins\\t%0.[%p2], zr";
> +  }
> +)
> +
>  (define_insn "@aarch64_simd_vec_copy_lane"
>[(set (match_operand:VALL_F16 0 "register_operand" "=w")
>   (vec_merge:VALL_F16
> diff --git a/gcc/config/aarch64/predicates.md 
> b/gcc/config/aarch64/predicates.md
> index ff7f73d3f30..901fa1bd7f9 100644
> --- a/gcc/config/aarch64/predicates.md
> +++ b/gcc/config/aarch64/predicates.md
> @@ -49,6 +49,13 @@
>return 

Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 18, 2023 at 02:03:42PM +, Michael Matz wrote:
> On Risc-V btw.  (which, unlike i386,aarch64,s390,rs6000 does not 
> effectively enable funwind-tables always via either backend or 
> common/config/$arch/ code, which is the reason why the problem can't be 
> seen there).  It's an interaction with -g :

Partly OT, why is riscv not defaulting that on as well?  Does it have
usable unwind info even without that option, something else?

Jakub



Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 18, 2023 at 03:14:01PM +0100, Andreas Schwab wrote:
> On Jan 18 2023, Michael Matz wrote:
> 
> > So, it's quite clear that the option merging algorithm related to all this 
> > is somewhat broken, the global (or per function, or whatever) 
> > -funwind-tables option from hello.o doesn't make it correctly into the 
> > output (when -g is there).  Adding -fexceptions makes it work because then 
> > the functions will have personalities and on LTO-read-in _that_ will 
> > implicitly enable funwind-tables again (which should have been enabled 
> > already by the option-read-in).
> 
> My guess is that flag_unwind_tables is not yet set when .cfi_sections is
> emitted (which is done by dwarf2out_assembly_start before compile starts).

Well, the primary question for PerFunction/Optimization flag is what such
flag means outside of any function.
Because with such flags, it no longer is everything wants unwind tables (or
asynchronous unwind tables), but perhaps some functions want that and others
don't.
So, do we for .cfi_sections want to know whether at least one of the
functions in the TU (or partition for lto1) wants unwind tables /
asynchronous unwind tables, or whether all of them do, something else?

That isn't specific to LTO btw, one can compile say:
-g -O2 -fasynchronous-unwind-tables -funwind-tables
__attribute__((optimize ("no-asynchronous-unwind-tables,no-unwind-tables"))) 
int foo (int x) { return x; }
__attribute__((optimize ("no-asynchronous-unwind-tables,no-unwind-tables"))) 
int bar (int x) { return x; }
or
-g -O2 -fno-asynchronous-unwind-tables -fno-unwind-tables
__attribute__((optimize ("asynchronous-unwind-tables,unwind-tables"))) 
int foo (int x) { return x; }
__attribute__((optimize ("asynchronous-unwind-tables,unwind-tables"))) 
int bar (int x) { return x; }
Now, for non-LTO what you get in flag_asynchronous_unwind_tables or
flag_unwind_tables when cfun is NULL is I think whatever has been
set on the command line (or defaulted), which doesn't need to match
any of the emitted functions.
For LTO we currently get there just whatever has been defaulted.
Neither of those will always match all the states of all the functions.

Jakub



Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
On Jan 18 2023, Michael Matz wrote:

> So, it's quite clear that the option merging algorithm related to all this 
> is somewhat broken, the global (or per function, or whatever) 
> -funwind-tables option from hello.o doesn't make it correctly into the 
> output (when -g is there).  Adding -fexceptions makes it work because then 
> the functions will have personalities and on LTO-read-in _that_ will 
> implicitly enable funwind-tables again (which should have been enabled 
> already by the option-read-in).

My guess is that flag_unwind_tables is not yet set when .cfi_sections is
emitted (which is done by dwarf2out_assembly_start before compile starts).

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


[PATCH] wwwdocs: Announce Solaris 11.3 obsoletion

2023-01-18 Thread Rainer Orth
Here's the changes.html patch corresponding to the Solaris 11.3
obsoletion notice in

https://gcc.gnu.org/pipermail/gcc/2022-December/240322.html
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608384.html

Since this is the only obsoletion in GCC 13 so far, I haven't introduced
a toplevel bulletpoint as in GCC 9.

Ok?


Btw., I noticed the -gz=zstd addition is listed under Caveats.  I don't
think this belongs here and probably only landed due to the -gz=zlib-gnu
removal above.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2023-01-18  Rainer Orth  

* htdocs/gcc-13/changes.html (Caveats): Document Solaris 11.3
obsoletion.

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index ca9cd2da..7047e742 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -32,6 +32,13 @@ a work-in-progress.
 The support for the cr16-elf, tilegx*-linux, tilepro*-linux,
   hppa[12]*-*-hpux10*, hppa[12]*-*-hpux11*
   and m32c-rtems configurations has been removed.
+Support for Solaris 11.3 (*-*-solaris2.11.3) has been
+  declared obsolete.  The next release of GCC will have corresponding
+  code permanently removed.  Details can be found in
+  the
+  https://gcc.gnu.org/pipermail/gcc/2022-December/240322.html;>
+  announcement.
+
 Support for emitting the STABS debugging format (including the
   -gstabs and -gxcoff options) has been removed.
   (This means the dbx debugger is no longer


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Michael Matz via Gcc-patches
On Wed, 18 Jan 2023, Andreas Schwab via Gcc-patches wrote:

> No unwind tables are generated, as if -funwind-tables is ignored.  If
> LTO is disabled, everything works as expected.

On Risc-V btw.  (which, unlike i386,aarch64,s390,rs6000 does not 
effectively enable funwind-tables always via either backend or 
common/config/$arch/ code, which is the reason why the problem can't be 
seen there).  It's an interaction with -g :

The problem (with cross compiler here, but shouldn't matter):

% riscv64-gcc -g -flto -funwind-tables -fPIC -c hello.c
% riscv64-gcc -shared hello.o
% readelf -wF a.out
... empty .eh_frame ...
Contents of the .debug_frame section:
... content ...

Using a compiler for any of the above archs makes this work (working means 
that the unwind info is placed into .eh_frame, _not_ into .debug_frame).  
Not using -g makes it work.  Adding -funwind-tables to the link command 
makes it work.  Adding -fexceptions to the compile command makes it work.  
Not using LTO makes it work.

So, it's quite clear that the option merging algorithm related to all this 
is somewhat broken, the global (or per function, or whatever) 
-funwind-tables option from hello.o doesn't make it correctly into the 
output (when -g is there).  Adding -fexceptions makes it work because then 
the functions will have personalities and on LTO-read-in _that_ will 
implicitly enable funwind-tables again (which should have been enabled 
already by the option-read-in).

As written initially the other archs are working because they all have 
various ways of essentially setting flag_unwind_tables always:

i386 via common/config/i386/i386-common.cc
   opts->x_flag_asynchronous_unwind_tables = 2;
  config/i386/i386-options.cc
 if (opts->x_flag_asynchronous_unwind_tables == 2)
   opts->x_flag_unwind_tables
 = opts->x_flag_asynchronous_unwind_tables = 1;

rs6000 via common/config/rs6000/rs6000-common.cc
   #ifdef OBJECT_FORMAT_ELF
 opts->x_flag_asynchronous_unwind_tables = 1;
   #endif
  (which ultimately also enables flag_unwind_tables)

s390 via common/config/s390/s390-common.cc
opts->x_flag_asynchronous_unwind_tables = 1;

aarch64 via common/config/aarch64/aarch64-common.cc
  #if (TARGET_DEFAULT_ASYNC_UNWIND_TABLES == 1)
{ OPT_LEVELS_ALL, OPT_fasynchronous_unwind_tables, NULL, 1 },
{ OPT_LEVELS_ALL, OPT_funwind_tables, NULL, 1},
  #endif

  (the #define here is set for aarch64*-*-linux* )

So the problem cannot be readily demonstrated on these architectures.  
risc-v has no such code (and doesn't need to).


Ciao,
Michael.


[PATCH] lto/108445 - avoid LTO decl wrapping being confused by tree sharing

2023-01-18 Thread Richard Biener via Gcc-patches
r13-4743 exposed more tree sharing which runs into a latent issue
with LTO decl wrapping during streaming.  The following adds a
testcase triggering the issue.

Pushed.

PR lto/108445
* gcc.dg/lto/pr108445_0.c: New testcase.
* gcc.dg/lto/pr108445_1.c: Likewise.
---
 gcc/testsuite/gcc.dg/lto/pr108445_0.c |  4 ++++
 gcc/testsuite/gcc.dg/lto/pr108445_1.c | 19 +++++++++++++++++++
 2 files changed, 23 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr108445_0.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr108445_1.c

diff --git a/gcc/testsuite/gcc.dg/lto/pr108445_0.c 
b/gcc/testsuite/gcc.dg/lto/pr108445_0.c
new file mode 100644
index 000..06dac691e84
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr108445_0.c
@@ -0,0 +1,4 @@
+/* { dg-lto-do link } */
+/* { dg-lto-options { "-g -O2 -flto" } } */
+
+int gArray[16];
diff --git a/gcc/testsuite/gcc.dg/lto/pr108445_1.c 
b/gcc/testsuite/gcc.dg/lto/pr108445_1.c
new file mode 100644
index 000..50db9feb8a5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr108445_1.c
@@ -0,0 +1,19 @@
+extern int gArray[];
+
+int foo(int *a)
+{
+  int *p = a;
+
+  return *p;
+}
+
+int main(int argc, char *argv[])
+{
+  if (argc & 1)
+gArray[argc - 1] = 1;
+
+  if (argc > 1)
+return foo(gArray);
+
+  return 0;
+}
-- 
2.35.3


Re: [PATCH] middle-end/108086 - avoid unshare_expr when remapping SSA names

2023-01-18 Thread Richard Biener via Gcc-patches
On Fri, 16 Dec 2022, Richard Biener wrote:

> r0-89280-g129a37fc319db8 added unsharing to remap_ssa_name but
> that wasn't in the version of the patch posted.  That has some
> non-trivial cost through mostly_copy_tree_r and copy_tree_r but
> more importantly it doesn't seem to be necessary.  I've successfully
> bootstrapped and tested with an assert we only get
> tree_node_can_be_shared trees here.
> 
> Bootstrapped and tested on x86_64-unknown-linux-gnu with all
> languages.
> 
> Pushed to trunk.

Reverted due to PR108445, will revisit during stage1.

Richard.

>   PR middle-end/108086
>   * tree-inline.cc (remap_ssa_name): Do not unshare the
>   result from the decl_map.
> ---
>  gcc/tree-inline.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
> index c802792fa07..b471774ce51 100644
> --- a/gcc/tree-inline.cc
> +++ b/gcc/tree-inline.cc
> @@ -183,7 +183,7 @@ remap_ssa_name (tree name, copy_body_data *id)
> return name;
>   }
>  
> -  return unshare_expr (*n);
> +  return *n;
>  }
>  
>if (processing_debug_stmt)
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Frankenstrasse 146, 90461 Nuernberg,
Germany; GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman;
HRB 36809 (AG Nuernberg)


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
On Jan 18 2023, Jakub Jelinek wrote:

> That is streamed in by lto1 back and on each set_cfun such saved options
> are stored into global_options{,_set}.

Is that done in time for dwarf2out_do_eh_frame?

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
No unwind tables are generated, as if -funwind-tables is ignored.  If
LTO is disabled, everything works as expected.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] libgcc: Fix uninitialized RA signing on AArch64 [PR107678]

2023-01-18 Thread Wilco Dijkstra via Gcc-patches
Hi,

>> +  /* Return-address signing state is toggled by DW_CFA_GNU_window_save (where
>> + REG_UNDEFINED means enabled), or set by a DW_CFA_expression.  */
>
> Needs updating to REG_UNSAVED_ARCHEXT.
> 
> OK with that change, thanks, and sorry for the delays & runaround.

Thanks, I've improved the comment and it has been committed to trunk now.

Cheers,
Wilco

Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 18, 2023 at 01:39:18PM +0100, Andreas Schwab wrote:
> On Jan 18 2023, Jakub Jelinek wrote:
> 
> > With LTO each function has the DECL_FUNCTION_SPECIFIC_OPTIMIZATION
> > (and _TARGET), for functions with optimize attribute obviously as without
> > LTO specific to what options have been overridden (but with defaults from
> > TU's command line etc.), for functions without that simply with what
> > options has the TU.
> 
> Sorry, I cannot parse that sentence.  Could you please try again?

After parsing options GCC creates an OPTIMIZATION_NODE tree with all the
PerFunction/Optimization option state recorded in it (and more).
When writing LTO, that node is then attached to each function (a different
node if a function has the optimize attribute, etc.).
That is streamed in by lto1 back and on each set_cfun such saved options
are stored into global_options{,_set}.
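
As a rough illustration of the mechanism described above, here is a toy model
in C.  It is only a sketch: struct opt_state, struct toy_function,
toy_global_options and toy_set_cfun are hypothetical names made up for this
example, not GCC's data structures, and the real OPTIMIZATION_NODE records far
more than two flags.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-function option state.  */
struct opt_state
{
  bool unwind_tables;
  bool asynchronous_unwind_tables;
};

/* Hypothetical stand-in for a function that carries its saved options.  */
struct toy_function
{
  const char *name;
  struct opt_state saved;
};

/* Plays the role of global_options in this model.  */
static struct opt_state toy_global_options;

/* Model of set_cfun: switching to a function copies that function's
   saved option state into the global option state.  */
static void
toy_set_cfun (const struct toy_function *fn)
{
  toy_global_options = fn->saved;
}
```

In this model, code that reads toy_global_options after toy_set_cfun sees the
options of the current function; code that reads it before any function has
been selected sees only the defaults.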

> > lto1 then streams in those options and when switching functions switches
> > the global options.
> 
> Why does that not work then?

I don't know what doesn't work.  You haven't mentioned what kind of PR
you're trying to fix or what problem you are seeing.

Jakub



Re: [wwwdocs] gcc-13/changes.html + projects/gomp/: OpenMP update

2023-01-18 Thread Tobias Burnus

Hi Gerald,

On 16.01.23 23:16, Gerald Pfeifer wrote:

On Mon, 16 Jan 2023, Tobias Burnus wrote:

 requires_offload, unified_address
-  and unified_shared_memory clauses cause that the
-  only available device is the initial device (the host). Fortran now
+  and unified_shared_memory clauses imply the initial
+  device (= the host) as the only available device. Fortran now

I really stumble over the "as" – that sounds wrong and I fail to parse this
part. I think it should be "is".

happy to make this change. Or do you have an idea to reframe the
sentence (or paragraph) altogether?


Actually, thinking about it again, the "imply" is also misleading – by
themselves the restrictions do not imply that accelerators/GPUs are not
supported; that's only implied in GCC as the libgomp plugins for nvptx
and amdgcn don't handle it, yet.

How about the following? I put the other change into its own bullet
point to be less confusing, completely rewording the remaining item and
mentioning reverse offload support.

(Reverse offload means: while in a target region ('omp target', i.e.
running code targeted for an offload device), it is possible to execute
code on the host. If there is no available non-host device, the
target region will run on the host (host fallback); in that case,
reverse offload is trivial (as host code calls host code).)
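
A minimal C sketch of what reverse offload looks like in user code, assuming a
compiler that accepts the OpenMP 5.x 'ancestor' device modifier; the function
names host_side_add and reverse_offload_demo are made up for this example.
Without OpenMP enabled the pragmas are ignored and everything runs on the
host, which matches the host-fallback case described above.

```c
/* Needed before device(ancestor: 1) may be used.  */
#pragma omp requires reverse_offload

/* Runs on the host, even when reached from inside a target region.  */
static int
host_side_add (int a, int b)
{
  return a + b;
}

int
reverse_offload_demo (void)
{
  int result = 0;

  /* Outer region: offloaded if a suitable device is available,
     otherwise executed on the host (host fallback).  */
  #pragma omp target map(tofrom: result)
  {
    /* Inner region: executed back on the host, the "ancestor" device.  */
    #pragma omp target device(ancestor: 1) map(tofrom: result)
    result = host_side_add (40, 2);
  }

  return result;
}
```

Calling reverse_offload_demo () is expected to give the same result whether or
not an offload device is present; only the place where the outer region runs
differs.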


BTW: Before the release, further updates to changes.html are required.

Keep them coming! :-)


Actually, I think only one change was missing (looking at
libgomp/libgomp.texi), unless some more pending patches are accepted. –
I have now included that change in the attached patch.

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
OpenMP: Update gcc-13/changes + projects/gomp

* htdocs/gcc-13/changes.html: Improve wording; mention nvptx reverse
  offload.
* htdocs/projects/gomp/index.html: Split clause/directive entry
  for 'allocate' and mark the clause variant as fully implemented.

 htdocs/gcc-13/changes.html  | 19 +--
 htdocs/projects/gomp/index.html |  9 +++--
 2 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index ca9cd2da..6deb445f 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -53,12 +53,19 @@ a work-in-progress.
   https://gcc.gnu.org/projects/gomp/;>OpenMP
   
 
-  Reverse offload is now supported and the all clauses to the
-  requires directive are now accepted. However, the
-  requires_offload, unified_address
-  and unified_shared_memory clauses imply the initial
-  device (= the host) as the only available device. Fortran now
-  supports non-rectangular loop nests, which were added for C/C++ in GCC 11.
+  Reverse offload is now supported with nvptx devices. Additionally, the
+  requires handling has been improved and all clauses are
+  now accepted. If a requirement cannot be fulfilled for an accessible
+  device, this device is excluded from the list of available devices. This
+  may imply that the only device left is the host (the initial device).
+  In particular, requires_offload is currently unsupported on
+  AMD GCN devices while unified_address and
+  unified_shared_memory are unsupported by all non-host
+  devices.
+
+
+  OpenMP 5.0: Fortran now supports non-rectangular loop nests, which were
+  added for C/C++ in GCC 11.
 
 
   The following OpenMP 5.1 features have been added: the
diff --git a/htdocs/projects/gomp/index.html b/htdocs/projects/gomp/index.html
index 19ff3c7d..dc9c88e7 100644
--- a/htdocs/projects/gomp/index.html
+++ b/htdocs/projects/gomp/index.html
@@ -547,9 +547,14 @@ than listed, depending on resolved corner cases and optimizations.
 
   
   
-align clause/modifier in allocate directive/clause and allocator directive
+align clause in allocate directive
+No
+
+  
+  
+align modifier in allocate clause
 GCC12
-C/C++ on clause only
+
   
   
 thread_limit clause to target construct


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
On Jan 18 2023, Jakub Jelinek wrote:

> With LTO each function has the DECL_FUNCTION_SPECIFIC_OPTIMIZATION
> (and _TARGET), for functions with optimize attribute obviously as without
> LTO specific to what options have been overridden (but with defaults from
> TU's command line etc.), for functions without that simply with what
> options has the TU.

Sorry, I cannot parse that sentence.  Could you please try again?

> lto1 then streams in those options and when switching functions switches
> the global options.

Why does that not work then?

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 18, 2023 at 01:30:53PM +0100, Andreas Schwab wrote:
> On Jan 18 2023, Jakub Jelinek wrote:
> 
> > On Wed, Jan 18, 2023 at 12:25:11PM +0100, Andreas Schwab via Gcc-patches 
> > wrote:
> >> On Jan 18 2023, Richard Biener wrote:
> >> 
> >> > On Wed, Jan 18, 2023 at 11:17 AM Andreas Schwab via Gcc-patches
> >> >  wrote:
> >> >>
> >> >> The -funwind-tables and -fasynchronous-unwind-tables options are 
> >> >> relevant
> >> >> for the output pass, thus they need to be passed through by the lto
> >> >> wrapper.
> >> >
> >> > They are already stored per function, and ...
> >> 
> >> Are they?  Are you sure you don't confuse that with -fexceptions?
> >
> > They clearly are:
> > fasynchronous-unwind-tables
> > Common Var(flag_asynchronous_unwind_tables) Optimization
> > Generate unwind tables that are exact at each instruction boundary.
> > and
> > funwind-tables
> > Common Var(flag_unwind_tables) Optimization
> > Just generate unwind tables for exception handling.
> 
> How is that supposed to work then?

With LTO each function has the DECL_FUNCTION_SPECIFIC_OPTIMIZATION
(and _TARGET), for functions with optimize attribute obviously as without
LTO specific to what options have been overridden (but with defaults from
TU's command line etc.), for functions without that simply with what
options has the TU.
lto1 then streams in those options and when switching functions switches
the global options.

Jakub



Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
On Jan 18 2023, Jakub Jelinek wrote:

> On Wed, Jan 18, 2023 at 12:25:11PM +0100, Andreas Schwab via Gcc-patches 
> wrote:
>> On Jan 18 2023, Richard Biener wrote:
>> 
>> > On Wed, Jan 18, 2023 at 11:17 AM Andreas Schwab via Gcc-patches
>> >  wrote:
>> >>
>> >> The -funwind-tables and -fasynchronous-unwind-tables options are relevant
>> >> for the output pass, thus they need to be passed through by the lto
>> >> wrapper.
>> >
>> > They are already stored per function, and ...
>> 
>> Are they?  Are you sure you don't confuse that with -fexceptions?
>
> They clearly are:
> fasynchronous-unwind-tables
> Common Var(flag_asynchronous_unwind_tables) Optimization
> Generate unwind tables that are exact at each instruction boundary.
> and
> funwind-tables
> Common Var(flag_unwind_tables) Optimization
> Just generate unwind tables for exception handling.

How is that supposed to work then?

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH][GCC] arm: fix __arm_vld1q_z* and __arm_vst1q_p* intrinsics.

2023-01-18 Thread Richard Earnshaw via Gcc-patches




On 22/12/2021 16:21, Murray Steele via Gcc-patches wrote:

Hi,

On 22/12/2021 16:04, Richard Earnshaw wrote:



Is there a PR in bugzilla for this?

R.




No, not at this time. It's something I came across whilst
making changes of my own.

For completeness, the ACLE specification I am referencing
has been added below [1].

[1]: https://github.com/ARM-software/acle/releases/tag/r2021Q3

Thanks,
Murray


Andre created one today and I've now pulled this patch in.  Thanks, and 
sorry for the delay getting it committed.


R.


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Jakub Jelinek via Gcc-patches
On Wed, Jan 18, 2023 at 12:25:11PM +0100, Andreas Schwab via Gcc-patches wrote:
> On Jan 18 2023, Richard Biener wrote:
> 
> > On Wed, Jan 18, 2023 at 11:17 AM Andreas Schwab via Gcc-patches
> >  wrote:
> >>
> >> The -funwind-tables and -fasynchronous-unwind-tables options are relevant
> >> for the output pass, thus they need to be passed through by the lto
> >> wrapper.
> >
> > They are already stored per function, and ...
> 
> Are they?  Are you sure you don't confuse that with -fexceptions?

They clearly are:
fasynchronous-unwind-tables
Common Var(flag_asynchronous_unwind_tables) Optimization
Generate unwind tables that are exact at each instruction boundary.
and
funwind-tables
Common Var(flag_unwind_tables) Optimization
Just generate unwind tables for exception handling.

The Optimization keyword is what implies that, as documented:
'PerFunction'
 This is an option that can be overridden on a per-function basis.
 'Optimization' implies 'PerFunction', but options that do not
 affect executable code generation may use this flag instead, so
 that the option is not taken into account in ways that might affect
 executable code generation.
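
For illustration, this is how a per-function override looks from the user's
side, using GCC's optimize attribute.  It is only a sketch: the macro
PER_FUNCTION_O0 and the two function names are invented for this example, and
"O0" is used simply because it is an override that is certain to be accepted.

```c
/* GCC-specific: the optimize attribute overrides Optimization-class
   options for one function only.  Other compilers fall back to the
   translation unit's options.  */
#if defined (__GNUC__) && !defined (__clang__)
# define PER_FUNCTION_O0 __attribute__ ((optimize ("O0")))
#else
# define PER_FUNCTION_O0
#endif

/* Compiled with whatever options the translation unit was given.  */
int
sum_default (const int *v, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += v[i];
  return s;
}

/* Same code, but with GCC this function alone is compiled as if
   with -O0.  */
PER_FUNCTION_O0 int
sum_unoptimized (const int *v, int n)
{
  int s = 0;
  for (int i = 0; i < n; i++)
    s += v[i];
  return s;
}
```

Both functions compute the same value; only the options recorded for each
function differ.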

Jakub



Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
On Jan 18 2023, Richard Biener wrote:

> On Wed, Jan 18, 2023 at 11:17 AM Andreas Schwab via Gcc-patches
>  wrote:
>>
>> The -funwind-tables and -fasynchronous-unwind-tables options are relevant
>> for the output pass, thus they need to be passed through by the lto
>> wrapper.
>
> They are already stored per function, and ...

Are they?  Are you sure you don't confuse that with -fexceptions?

> What exactly are you fixing?

Making -funwind-tables effective in LTO mode.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Richard Biener via Gcc-patches
On Wed, Jan 18, 2023 at 11:17 AM Andreas Schwab via Gcc-patches
 wrote:
>
> The -funwind-tables and -fasynchronous-unwind-tables options are relevant
> for the output pass, thus they need to be passed through by the lto
> wrapper.

They are already stored per function, and ...

> gcc/
> * lto-wrapper.cc (merge_and_complain): Pass through
> -funwind-tables and -fasynchronous-unwind-tables.
> (append_compiler_options): Likewise.
> ---
>  gcc/lto-wrapper.cc | 4 
>  1 file changed, 4 insertions(+)
>
> diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc
> index 11c4d1b38a4..627e8238606 100644
> --- a/gcc/lto-wrapper.cc
> +++ b/gcc/lto-wrapper.cc
> @@ -314,6 +314,8 @@ merge_and_complain (vec 
> _options,
> case OPT_fshow_column:
> case OPT_fcommon:
> case OPT_fgnu_tm:
> +   case OPT_funwind_tables:
> +   case OPT_fasynchronous_unwind_tables:

this would pick -fno-unwind-tables if picked up first?

If the setting of flag_unwind_tables matters in a global/IPA context,
should we then compute it in lto1 from the set of functions in the
partition instead?  That is, enable it when the user requested unwind
tables for at least one function?
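
The idea could be sketched roughly as below.  This is a toy model only:
struct part_function and partition_wants_unwind_tables are hypothetical names
for this example and do not correspond to existing lto1 code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-function record in a partition.  */
struct part_function
{
  const char *name;
  bool wants_unwind_tables;
};

/* The partition-wide flag is enabled as soon as at least one
   function in the partition requested unwind tables.  */
static bool
partition_wants_unwind_tables (const struct part_function *fns, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (fns[i].wants_unwind_tables)
      return true;
  return false;
}
```

With this rule the result no longer depends on which translation unit's
options happen to be seen first.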

Handling of options in lto-wrapper that are marked Optimization and thus
streamed per function is somewhat dubious.

What exactly are you fixing?

Richard.

> case OPT_g:
>   /* Do what the old LTO code did - collect exactly one option
>  setting per OPT code, we pick the first we encounter.
> @@ -737,6 +739,8 @@ append_compiler_options (obstack *argv_obstack, 
> vec opts)
> case OPT_fopenacc_dim_:
> case OPT_foffload_abi_:
> case OPT_fcf_protection_:
> +   case OPT_funwind_tables:
> +   case OPT_fasynchronous_unwind_tables:
> case OPT_g:
> case OPT_O:
> case OPT_Ofast:
> --
> 2.39.1
>
>
> --
> Andreas Schwab, SUSE Labs, sch...@suse.de
> GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
> "And now for something completely different."


[aarch64] Use exact_log2 (INTVAL (operands[2])) >= 0 to gate for vec_merge patterns.

2023-01-18 Thread Prathamesh Kulkarni via Gcc-patches
Hi Richard,
Based on your suggestion in the other thread, the patch uses
exact_log2 (INTVAL (operands[2])) >= 0 to gate for vec_merge patterns.
Bootstrap+test in progress on aarch64-linux-gnu.
Does it look OK ?
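
To spell out what the new condition checks, here is a standalone C model.  It
is a sketch, not GCC's code: model_exact_log2 and mask_selects_one_lane are
names invented for this example, and GCC's own exact_log2 operates on
unsigned HOST_WIDE_INT.

```c
/* Return log2 of X if X is an exact power of two, otherwise -1.  */
static int
model_exact_log2 (unsigned long long x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;

  int n = 0;
  while ((x >>= 1) != 0)
    n++;
  return n;
}

/* Mirrors the shape of the insn condition: the vec_merge mask is
   accepted only if it has exactly one bit set, i.e. selects one lane.  */
static int
mask_selects_one_lane (long long mask)
{
  return model_exact_log2 ((unsigned long long) mask) >= 0;
}
```

A mask of 8 (lane 3) passes the check, whereas a mask such as 12, which
selects two lanes, is rejected and the pattern does not match.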

Thanks,
Prathamesh
[aarch64] Use exact_log2 (INTVAL (operands[2])) >= 0 to gate for vec_merge 
patterns.

gcc/ChangeLog:
* gcc/config/aarch64-simd.md (aarch64_simd_vec_set): Use
exact_log2 (INTVAL (operands[2])) >= 0 as condition for gating
the pattern.
(aarch64_simd_vec_copy_lane): Likewise.
(aarch64_simd_vec_copy_lane_): Likewise.

diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 104088f67d2..7cc8c00f0ec 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1064,7 +1064,7 @@
(match_operand: 1 "aarch64_simd_nonimmediate_operand" 
"w,?r,Utv"))
(match_operand:VALL_F16 3 "register_operand" "0,0,0")
(match_operand:SI 2 "immediate_operand" "i,i,i")))]
-  "TARGET_SIMD"
+  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
   {
int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
@@ -1093,7 +1093,7 @@
  [(match_operand:SI 4 "immediate_operand" "i")])))
(match_operand:VALL_F16 1 "register_operand" "0")
(match_operand:SI 2 "immediate_operand" "i")))]
-  "TARGET_SIMD"
+  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
   {
 int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
 operands[2] = GEN_INT (HOST_WIDE_INT_1 << elt);
@@ -1114,7 +1114,7 @@
  [(match_operand:SI 4 "immediate_operand" "i")])))
(match_operand:VALL_F16_NO_V2Q 1 "register_operand" "0")
(match_operand:SI 2 "immediate_operand" "i")))]
-  "TARGET_SIMD"
+  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
   {
 int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
 operands[2] = GEN_INT (HOST_WIDE_INT_1 << elt);


Re: [aarch64] Use wzr/xzr for assigning vector element to 0

2023-01-18 Thread Prathamesh Kulkarni via Gcc-patches
On Tue, 17 Jan 2023 at 18:29, Richard Sandiford
 wrote:
>
> Prathamesh Kulkarni  writes:
> > Hi Richard,
> > For the following (contrived) test:
> >
> > int32x4_t foo(int32x4_t v)
> > {
> >   v[3] = 0;
> >   return v;
> > }
> >
> > -O2 code-gen:
> > foo:
> > fmov    s1, wzr
> > ins v0.s[3], v1.s[0]
> > ret
> >
> > I suppose we can instead emit the following code-gen ?
> > foo:
> >  ins v0.s[3], wzr
> >  ret
> >
> > combine produces:
> > Failed to match this instruction:
> > (set (reg:V4SI 95 [ v ])
> > (vec_merge:V4SI (const_vector:V4SI [
> > (const_int 0 [0]) repeated x4
> > ])
> > (reg:V4SI 97)
> > (const_int 8 [0x8])))
> >
> > So, I wrote the following pattern to match the above insn:
> > (define_insn "aarch64_simd_vec_set_zero"
> >   [(set (match_operand:VALL_F16 0 "register_operand" "=w")
> > (vec_merge:VALL_F16
> > (match_operand:VALL_F16 1 "const_dup0_operand" "w")
> > (match_operand:VALL_F16 3 "register_operand" "0")
> > (match_operand:SI 2 "immediate_operand" "i")))]
> >   "TARGET_SIMD"
> >   {
> > int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
> > operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
> > return "ins\\t%0.[%p2], wzr";
> >   }
> > )
> >
> > which now matches the above insn produced by combine.
> > However, in reload dump, it creates a new insn for assigning
> > register to (const_vector (const_int 0)),
> > which results in:
> > (insn 19 8 13 2 (set (reg:V4SI 33 v1 [99])
> > (const_vector:V4SI [
> > (const_int 0 [0]) repeated x4
> > ])) "wzr-test.c":8:1 1269 {*aarch64_simd_movv4si}
> >  (nil))
> > (insn 13 19 14 2 (set (reg/i:V4SI 32 v0)
> > (vec_merge:V4SI (reg:V4SI 33 v1 [99])
> > (reg:V4SI 32 v0 [97])
> > (const_int 8 [0x8]))) "wzr-test.c":8:1 1808
> > {aarch64_simd_vec_set_zerov4si}
> >  (nil))
> >
> > and eventually the code-gen:
> > foo:
> > movi    v1.4s, 0
> > ins v0.s[3], wzr
> > ret
> >
> > To get rid of redundant assignment of 0 to v1, I tried to split the
> > above pattern
> > as in the attached patch. This works to emit code-gen:
> > foo:
> > ins v0.s[3], wzr
> > ret
> >
> > However, I am not sure if this is the right approach. Could you suggest,
> > if it'd be possible to get rid of UNSPEC_SETZERO in the patch ?
>
> The problem is with the "w" constraint on operand 1, which tells LRA
> to force the zero into an FPR.  It should work if you remove the
> constraint.
Ah indeed, sorry about that, changing the constraint works.
Does the attached patch look OK after bootstrap+test ?
Since we're in stage-4, shall it be OK to commit now, or queue it for stage-1 ?

Thanks,
Prathamesh


>
> Also, I think you'll need to use zr for the zero, so that
> it uses xzr for 64-bit elements.
>
> I think this and the existing patterns ought to test
> exact_log2 (INTVAL (operands[2])) >= 0 in the insn condition,
> since there's no guarantee that RTL optimisations won't form
> vec_merges that have other masks.
>
> Thanks,
> Richard
[aarch64] Use wzr/xzr for assigning 0 to vector element.

gcc/ChangeLog:
* config/aarch64/aarch64-simd.md (aarch64_simd_vec_set_zero):
New pattern.
* config/aarch64/predicates.md (const_dup0_operand): New.

diff --git a/gcc/config/aarch64/aarch64-simd.md 
b/gcc/config/aarch64/aarch64-simd.md
index 104088f67d2..8e54ee4e886 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1083,6 +1083,20 @@
   [(set_attr "type" "neon_ins, neon_from_gp, neon_load1_one_lane")]
 )
 
+(define_insn "aarch64_simd_vec_set_zero"
+  [(set (match_operand:VALL_F16 0 "register_operand" "=w")
+   (vec_merge:VALL_F16
+   (match_operand:VALL_F16 1 "const_dup0_operand" "i")
+   (match_operand:VALL_F16 3 "register_operand" "0")
+   (match_operand:SI 2 "immediate_operand" "i")))]
+  "TARGET_SIMD && exact_log2 (INTVAL (operands[2])) >= 0"
+  {
+int elt = ENDIAN_LANE_N (, exact_log2 (INTVAL (operands[2])));
+operands[2] = GEN_INT ((HOST_WIDE_INT) 1 << elt);
+return "ins\\t%0.[%p2], zr";
+  }
+)
+
 (define_insn "@aarch64_simd_vec_copy_lane"
   [(set (match_operand:VALL_F16 0 "register_operand" "=w")
(vec_merge:VALL_F16
diff --git a/gcc/config/aarch64/predicates.md b/gcc/config/aarch64/predicates.md
index ff7f73d3f30..901fa1bd7f9 100644
--- a/gcc/config/aarch64/predicates.md
+++ b/gcc/config/aarch64/predicates.md
@@ -49,6 +49,13 @@
   return CONST_INT_P (op) && IN_RANGE (INTVAL (op), 1, 3);
 })
 
+(define_predicate "const_dup0_operand"
+  (match_code "const_vector")
+{
+  op = unwrap_const_vec_duplicate (op);
+  return CONST_INT_P (op) && rtx_equal_p (op, const0_rtx);
+})
+
 (define_predicate "subreg_lowpart_operator"
   (ior (match_code "truncate")
(and (match_code "subreg")


realpath() patch to fix symlinks resolution for win32

2023-01-18 Thread i.nixman--- via Gcc-patches

hello again!

the final version of the patch for
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108350


successfully bootstrapped for x86_64-mingw32 and x86_64-linux.

could anyone apply it please?



best!
diff --git a/libiberty/lrealpath.c b/libiberty/lrealpath.c
index 3c7053b0b70..a1ad074d00e 100644
--- a/libiberty/lrealpath.c
+++ b/libiberty/lrealpath.c
@@ -68,8 +68,135 @@ extern char *canonicalize_file_name (const char *);
   /* cygwin has realpath, so it won't get here.  */ 
 # if defined (_WIN32)
 #  define WIN32_LEAN_AND_MEAN
-#  include  /* for GetFullPathName */
-# endif
+#  include  /* for GetFullPathName/GetFinalPathNameByHandle/
+  CreateFile/CloseHandle */
+#  define WIN32_REPLACE_SLASHES(_ptr, _len) \
+ for (unsigned i = 0; i != (_len); ++i) \
+   if ((_ptr)[i] == '\\') (_ptr)[i] = '/';
+
+#  define WIN32_UNC_PREFIX "//?/UNC/"
+#  define WIN32_UNC_PREFIX_LEN (sizeof(WIN32_UNC_PREFIX)-1)
+#  define WIN32_IS_UNC_PREFIX(ptr) \
+  (0 == memcmp(ptr, WIN32_UNC_PREFIX, WIN32_UNC_PREFIX_LEN))
+
+#  define WIN32_NON_UNC_PREFIX "//?/"
+#  define WIN32_NON_UNC_PREFIX_LEN (sizeof(WIN32_NON_UNC_PREFIX)-1)
+#  define WIN32_IS_NON_UNC_PREFIX(ptr) \
+  (0 == memcmp(ptr, WIN32_NON_UNC_PREFIX, WIN32_NON_UNC_PREFIX_LEN))
+
+/* Get full path name without symlinks resolution.
+   It also converts all back slashes to forward slashes.
+*/
+char* get_full_path_name(const char *filename) {
+  DWORD len;
+  char *buf, *ptr, *res;
+
+  /* determining the required buffer size.
+ from the man: `If the lpBuffer buffer is too small to contain
+ the path, the return value is the size, in TCHARs, of the buffer
+ that is required to hold the path _and_the_terminating_null_character_`
+  */
+  len = GetFullPathName(filename, 0, NULL, NULL);
+
+  if ( len == 0 )
+return strdup(filename);
+
+  buf = (char *)malloc(len);
+
+  /* no point to check the result again */
+  len = GetFullPathName(filename, len, buf, NULL);
+  buf[len] = 0;
+
+  /* replace slashes */
+  WIN32_REPLACE_SLASHES(buf, len);
+
+  /* calculate offset based on prefix type */
+  len = WIN32_IS_UNC_PREFIX(buf)
+? (WIN32_UNC_PREFIX_LEN - 2)
+: WIN32_IS_NON_UNC_PREFIX(buf)
+  ? WIN32_NON_UNC_PREFIX_LEN
+  : 0
+  ;
+
+  ptr = buf + len;
+  if ( WIN32_IS_UNC_PREFIX(buf) ) {
+ptr[0] = '/';
+ptr[1] = '/';
+  }
+
+  res = strdup(ptr);
+
+  free(buf);
+
+  return res;
+}
+
+# if _WIN32_WINNT >= 0x0600
+
+/* Get full path name WITH symlinks resolution.
+   It also converts all back slashes to forward slashes.
+*/
+char* get_final_path_name(HANDLE fh) {
+  DWORD len;
+  char *buf, *ptr, *res;
+
+  /* determining the required buffer size.
+ from the  man: `If the function fails because lpszFilePath is too
+ small to hold the string plus the terminating null character,
+ the return value is the required buffer size, in TCHARs. This
+ value _includes_the_size_of_the_terminating_null_character_`.
+ but in my testcase I have path with 26 chars, the function
+ returns 26 also, ie without the trailing zero-char...
+  */
+  len = GetFinalPathNameByHandle(
+ fh
+,NULL
+,0
+,FILE_NAME_NORMALIZED | VOLUME_NAME_DOS
+  );
+
+  if ( len == 0 )
+return NULL;
+
+  len += 1; /* for zero-char */
+  buf = (char *)malloc(len);
+
+  /* no point to check the result again */
+  len = GetFinalPathNameByHandle(
+ fh
+,buf
+,len
+,FILE_NAME_NORMALIZED | VOLUME_NAME_DOS
+  );
+  buf[len] = 0;
+
+  /* replace slashes */
+  WIN32_REPLACE_SLASHES(buf, len);
+
+  /* calculate offset based on prefix type */
+  len = WIN32_IS_UNC_PREFIX(buf)
+? (WIN32_UNC_PREFIX_LEN - 2)
+: WIN32_IS_NON_UNC_PREFIX(buf)
+  ? WIN32_NON_UNC_PREFIX_LEN
+  : 0
+  ;
+
+  ptr = buf + len;
+  if ( WIN32_IS_UNC_PREFIX(buf) ) {
+ptr[0] = '/';
+ptr[1] = '/';
+  }
+
+  res = strdup(ptr);
+
+  free(buf);
+
+  return res;
+}
+
+# endif // _WIN32_WINNT >= 0x0600
+
+# endif // _WIN32
 #endif
 
 char *
@@ -128,30 +255,52 @@ lrealpath (const char *filename)
   }
 #endif
 
-  /* The MS Windows method.  If we don't have realpath, we assume we
- don't have symlinks and just canonicalize to a Windows absolute
- path.  GetFullPath converts ../ and ./ in relative paths to
- absolute paths, filling in current drive if one is not given
- or using the current directory of a specified drive (eg, "E:foo").
- It also converts all forward slashes to back slashes.  */
+  /* The MS Windows method */
 #if defined (_WIN32)
   {
-char buf[MAX_PATH];
-char* basename;
-DWORD len = GetFullPathName (filename, MAX_PATH, buf, );
-if (len == 0 || len > MAX_PATH - 1)
-  return strdup (filename);
-else
-  {
-	/* The file system is case-preserving but case-insensitive,
-	   Canonicalize to lowercase, using the codepage associated
-	   with the process locale.  */
-CharLowerBuff (buf, len);
-return strdup (buf);
-  }
-  }
-#endif
+char *res;
+
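
The prefix handling that both helper functions in the patch share can be
shown in isolation.  The following is a portable C sketch, not part of the
patch: skip_win32_prefix is a name invented for this example, and it uses
strncmp rather than memcmp so that short inputs are not read past their
terminating null character.

```c
#include <string.h>

/* Given a path whose back slashes were already replaced by forward
   slashes, drop the "//?/" prefix, or turn "//?/UNC/" into "//".
   The UNC case writes into PATH, so PATH must be modifiable.  */
static char *
skip_win32_prefix (char *path)
{
  static const char unc[] = "//?/UNC/";
  static const char non_unc[] = "//?/";

  if (strncmp (path, unc, sizeof unc - 1) == 0)
    {
      /* "//?/UNC/server/share" -> "//server/share": reuse the last two
         bytes of the prefix and overwrite them with "//".  */
      char *p = path + (sizeof unc - 1) - 2;
      p[0] = '/';
      p[1] = '/';
      return p;
    }

  if (strncmp (path, non_unc, sizeof non_unc - 1) == 0)
    return path + (sizeof non_unc - 1);

  return path;
}
```

The returned pointer refers into the original buffer, which is why the patch
duplicates the result with strdup before freeing the buffer.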

Re: [PATCH] xtensa: Optimize inversion of the MSB

2023-01-18 Thread Max Filippov via Gcc-patches
On Tue, Jan 17, 2023 at 9:43 PM Takayuki 'January June' Suwa
 wrote:
>
> Such operation can be done either bitwise-XOR or addition with -2147483648,
> but the latter is one byte less if TARGET_DENSITY.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md (xorsi3_internal):
> Rename from the original of "xorsi3".
> (xorsi3): New expansion pattern that emits addition rather than
> bitwise-XOR when the second source is a constant of -2147483648
> if TARGET_DENSITY.
> ---
>  gcc/config/xtensa/xtensa.md | 26 +-
>  1 file changed, 25 insertions(+), 1 deletion(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.
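
For reference, the equivalence the patch relies on can be checked in plain C.
This is just an illustrative sketch with made-up function names; it uses
unsigned 32-bit arithmetic so that the addition wraps modulo 2^32 without
invoking signed overflow.

```c
#include <stdint.h>

/* Invert the most significant bit with a bitwise XOR.  */
static uint32_t
flip_msb_xor (uint32_t x)
{
  return x ^ UINT32_C (0x80000000);
}

/* Invert the most significant bit by adding 2^31.  Bits 0..30 are
   unchanged because the addend is zero there, and the carry out of
   bit 31 is discarded, so only bit 31 toggles.  */
static uint32_t
flip_msb_add (uint32_t x)
{
  return x + UINT32_C (0x80000000);
}
```

Both functions return the same value for every 32-bit input, which is what
allows the expander to pick whichever form encodes shorter.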

-- 
Thanks.
-- Max


[PATCH] lto: pass through -funwind-tables and -fasynchronous-unwind-tables

2023-01-18 Thread Andreas Schwab via Gcc-patches
The -funwind-tables and -fasynchronous-unwind-tables options are relevant
for the output pass, thus they need to be passed through by the lto
wrapper.

gcc/
* lto-wrapper.cc (merge_and_complain): Pass through
-funwind-tables and -fasynchronous-unwind-tables.
(append_compiler_options): Likewise.
---
 gcc/lto-wrapper.cc | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc
index 11c4d1b38a4..627e8238606 100644
--- a/gcc/lto-wrapper.cc
+++ b/gcc/lto-wrapper.cc
@@ -314,6 +314,8 @@ merge_and_complain (vec _options,
case OPT_fshow_column:
case OPT_fcommon:
case OPT_fgnu_tm:
+   case OPT_funwind_tables:
+   case OPT_fasynchronous_unwind_tables:
case OPT_g:
  /* Do what the old LTO code did - collect exactly one option
 setting per OPT code, we pick the first we encounter.
@@ -737,6 +739,8 @@ append_compiler_options (obstack *argv_obstack, 
vec opts)
case OPT_fopenacc_dim_:
case OPT_foffload_abi_:
case OPT_fcf_protection_:
+   case OPT_funwind_tables:
+   case OPT_fasynchronous_unwind_tables:
case OPT_g:
case OPT_O:
case OPT_Ofast:
-- 
2.39.1


-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH v6] LoongArch: Fixed a compilation failure with '%c' in inline assembly [PR107731].

2023-01-18 Thread Richard Sandiford via Gcc-patches
Lulu Cheng  writes:
> Co-authored-by: Yang Yujie 
>
> gcc/ChangeLog:
>
>   * config/loongarch/loongarch.cc (loongarch_classify_address):
>   Add precessint for CONST_INT.
>   (loongarch_print_operand_reloc): Operand modifier 'c' is supported.
>   (loongarch_print_operand): Increase the processing of '%c'.
>   * doc/extend.texi: Adds documents for LoongArch operand modifiers.
>   And port the public operand modifiers information to this document.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/loongarch/tst-asm-const.c: Moved to...
>   * gcc.target/loongarch/pr107731.c: ...here.
> ---
> V2 -> v3:
> 1. Correct a clerical error.
> 2. Adding document for loongarch operand modifiers.
>
> v3 -> v4:
> Copy the description of "%c" "%n" "%a" "%l" from gccint.pdf to gcc.pdf.
>
> v4 -> v5:
> Move the operand modifiers description of "%c", "%n", "%a", "%l" to the top 
> of the
> x86Operandmodifiers section.
>
> v5 -> v6:
> Adjust the location of the added section in the document.
>
> ---
>  gcc/config/loongarch/loongarch.cc | 14 +
>  gcc/doc/extend.texi   | 51 +--
>  .../loongarch/{tst-asm-const.c => pr107731.c} |  6 +--
>  3 files changed, 64 insertions(+), 7 deletions(-)
>  rename gcc/testsuite/gcc.target/loongarch/{tst-asm-const.c => pr107731.c} 
> (78%)
>
> diff --git a/gcc/config/loongarch/loongarch.cc 
> b/gcc/config/loongarch/loongarch.cc
> index c6b03fcf2f9..cdf190b985e 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -2075,6 +2075,11 @@ loongarch_classify_address (struct 
> loongarch_address_info *info, rtx x,
>return (loongarch_valid_base_register_p (info->reg, mode, strict_p)
> && loongarch_valid_lo_sum_p (info->symbol_type, mode,
>  info->offset));
> +case CONST_INT:
> +  /* Small-integer addresses don't occur very often, but they
> +  are legitimate if $r0 is a valid base register.  */
> +  info->type = ADDRESS_CONST_INT;
> +  return IMM12_OPERAND (INTVAL (x));
>  
>  default:
>return false;
> @@ -4933,6 +4938,7 @@ loongarch_print_operand_reloc (FILE *file, rtx op, bool 
> hi64_part,
>  
> 'A'   Print a _DB suffix if the memory model requires a release.
> 'b'   Print the address of a memory operand, without offset.
> +   'c'  Print an integer.
> 'C'   Print the integer branch condition for comparison OP.
> 'd'   Print CONST_INT OP in decimal.
> 'F'   Print the FPU branch condition for comparison OP.
> @@ -4979,6 +4985,14 @@ loongarch_print_operand (FILE *file, rtx op, int 
> letter)
> fputs ("_db", file);
>break;
>  
> +case 'c':
> +  if (CONST_INT_P (op))
> + fprintf (file, HOST_WIDE_INT_PRINT_DEC, INTVAL (op));
> +  else
> + output_operand_lossage ("unsupported operand for code '%c'", letter);
> +
> +  break;
> +
>  case 'C':
>loongarch_print_int_branch_condition (file, code, letter);
>break;
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1103e9936f7..6a5d9faf2f3 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -10402,8 +10402,10 @@ ensures that modifying @var{a} does not affect the 
> address referenced by
>  is undefined if @var{a} is modified before using @var{b}.
>  
>  @code{asm} supports operand modifiers on operands (for example @samp{%k2} 
> -instead of simply @samp{%2}). Typically these qualifiers are hardware 
> -dependent. The list of supported modifiers for x86 is found at 
> +instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
> +Generic Operand modifiers} lists the modifiers that are available
> +on all targets.  Other modifiers are hardware dependent.
> +For example, the list of supported modifiers for x86 is found at
>  @ref{x86Operandmodifiers,x86 Operand modifiers}.
>  
>  If the C code that follows the @code{asm} makes no use of any of the output 
> @@ -10671,8 +10673,10 @@ optimizers may discard the @code{asm} statement as 
> unneeded
>  (see @ref{Volatile}).
>  
>  @code{asm} supports operand modifiers on operands (for example @samp{%k2} 
> -instead of simply @samp{%2}). Typically these qualifiers are hardware 
> -dependent. The list of supported modifiers for x86 is found at 
> +instead of simply @samp{%2}). @ref{GenericOperandmodifiers,
> +Generic Operand modifiers} lists the modifiers that are available
> +on all targets.  Other modifiers are hardware dependent.
> +For example, the list of supported modifiers for x86 is found at
>  @ref{x86Operandmodifiers,x86 Operand modifiers}.
>  
>  In this example using the fictitious @code{combine} instruction, the 
> @@ -11024,6 +11028,30 @@ lab:
>  @}
>  @end example
>  
> +@anchor{GenericOperandmodifiers}
> +@subsubsection Generic Operand Modifiers
> +@noindent
> +The following table shows the modifiers supported by all targets and their 
> effects:
> +
> +@multitable 
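
As an editorial aside: the CONST_INT case that the patch above adds to loongarch_classify_address accepts a bare integer address only when IMM12_OPERAND holds. A minimal sketch of that range check follows, assuming IMM12_OPERAND means "fits in a signed 12-bit immediate"; the helper name fits_simm12 is hypothetical and not part of GCC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, not GCC code: true if VALUE fits in a signed
   12-bit immediate, i.e. lies in [-2048, 2047].  */
bool
fits_simm12 (int64_t value)
{
  return value >= -2048 && value <= 2047;
}
```

Under that assumption, a constant address such as 2047 would be classified as ADDRESS_CONST_INT, while 2048 would be rejected.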

Re: Ping: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the UNSPECS [PR106069]

2023-01-18 Thread Kewen.Lin via Gcc-patches
Hi Segher,

I guess this patch slipped off your radar. :)

Since Jakub asked about its status in PR106069, I applied the attached patch
from Xionghu to the latest trunk, re-tested it, and confirmed that it still
bootstraps and regtests on powerpc64-linux-gnu P8 and powerpc64le-linux-gnu
P9 and P10.

This new version separates the direct patterns into LE and BE variants, which
is clearer than before, and it looks good to me.  What do you think?  Looking
forward to your opinion.

btw, the link in archives:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600169.html

BR,
Kewen

on 2022/8/24 09:24, Xionghu Luo wrote:
> Subject:
> Ping: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the 
> UNSPECS [PR106069]
> From:
> Xionghu Luo 
> Date:
> 2022/8/24, 09:24
> 
> To:
> "Kewen.Lin" , Segher Boessenkool 
> 
> Cc:
> Xionghu Luo , gcc-patches@gcc.gnu.org, David Edelsohn 
> , Segher Boessenkool 
> 
> 
> Hi Segher, I'd like to resend and ping for this patch. Thanks.
> 
> v4-0001-rs6000-Fix-incorrect-RTL-for-Power-LE-when-removi.patch
> 
> From 23bffdacdf0eb1140c7a3571e6158797f4818d57 Mon Sep 17 00:00:00 2001
> From: Xionghu Luo 
> Date: Thu, 4 Aug 2022 03:44:58 +
> Subject: [PATCH v4] rs6000: Fix incorrect RTL for Power LE when removing the
>  UNSPECS [PR106069]
> 
> v4: Update per comments.
> v3: rename altivec_vmrghb_direct_le to altivec_vmrglb_direct_le to match
> the actual output ASM vmrglb. Likewise for all similar xxx_direct_le
> patterns.
> v2: Split the direct pattern to be and le with same RTL but different insn.
> 
> The native RTL expression for vec_mrghw should be the same for BE and LE,
> as it is register- and endian-independent.  So both BE and LE need to
> generate exactly the same RTL with index [0 4 1 5] when expanding
> vec_mrghw with vec_select and vec_concat.
> 
> (set (reg:V4SI 141) (vec_select:V4SI (vec_concat:V8SI
>  (subreg:V4SI (reg:V16QI 139) 0)
>  (subreg:V4SI (reg:V16QI 140) 0))
>  [const_int 0 4 1 5]))
> 
> Then the combine pass can do the nested vec_select optimization
> in simplify-rtx.c:simplify_binary_operation_1 on both BE and LE:
> 
> 21: r150:V4SI=vec_select(vec_concat(r141:V4SI,r146:V4SI),parallel [0 4 1 5])
> 24: {r151:SI=vec_select(r150:V4SI,parallel [const_int 3]);}
> 
> =>
> 
> 21: r150:V4SI=vec_select(vec_concat(r141:V4SI,r146:V4SI),parallel)
> 24: {r151:SI=vec_select(r146:V4SI,parallel [const_int 1]);}
> 
> The endianness check is then needed only once, at final ASM generation.
> The generated ASM improves because the nested vec_select is simplified
> to a simple scalar load.
> 
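
As an editorial aside, the folding shown in the combine dump can be modelled in a few lines of C. This is only a sketch of the index arithmetic at the RTL level (lane numbering as written in the RTL, endianness aside); the function names are hypothetical and this is not how simplify-rtx is implemented.

```c
#include <stddef.h>

/* Model of the RTL in the description, not GCC code.  The vectors have
   four int lanes; vec_concat (a, b) is the 8-lane vector a0..a3 b0..b3.  */
static const size_t mrghw_sel[4] = { 0, 4, 1, 5 };

/* Lane LANE of vec_select (vec_concat (a, b), [0 4 1 5]).  */
int
merged_lane (const int a[4], const int b[4], size_t lane)
{
  size_t idx = mrghw_sel[lane];        /* index into the concatenation */
  return idx < 4 ? a[idx] : b[idx - 4];
}

/* What the nested select folds to: which input (0 = a, 1 = b) and which
   lane of it provides lane LANE of the merge.  */
void
resolve_lane (size_t lane, int *input, size_t *src_lane)
{
  size_t idx = mrghw_sel[lane];
  *input = idx >= 4;
  *src_lane = idx % 4;
}

/* Returns 1 if lane 3 of the merge equals lane 1 of the second input,
   matching the combine dump quoted above.  */
int
check_lane3 (void)
{
  const int a[4] = { 10, 11, 12, 13 };
  const int b[4] = { 20, 21, 22, 23 };
  int input;
  size_t src_lane;

  resolve_lane (3, &input, &src_lane);
  return merged_lane (a, b, 3) == b[1] && input == 1 && src_lane == 1;
}
```

Selector entry 3 is index 5 of the concatenation, which is lane 1 of the second operand; that is why insn 24 can read r146 directly with [const_int 1].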
> Regression tested pass for Power8{LE,BE}{32,64} and Power{9,10}LE{32,64}
> Linux.
> 
> gcc/ChangeLog:
> 
>   PR target/106069
>   * config/rs6000/altivec.md (altivec_vmrghb_direct): Remove.
>   (altivec_vmrghb_direct_be): New pattern for BE.
>   (altivec_vmrghb_direct_le): New pattern for LE.
>   (altivec_vmrghh_direct): Remove.
>   (altivec_vmrghh_direct_be): New pattern for BE.
>   (altivec_vmrghh_direct_le): New pattern for LE.
>   (altivec_vmrghw_direct_): Remove.
>   (altivec_vmrghw_direct__be): New pattern for BE.
>   (altivec_vmrghw_direct__le): New pattern for LE.
>   (altivec_vmrglb_direct): Remove.
>   (altivec_vmrglb_direct_be): New pattern for BE.
>   (altivec_vmrglb_direct_le): New pattern for LE.
>   (altivec_vmrglh_direct): Remove.
>   (altivec_vmrglh_direct_be): New pattern for BE.
>   (altivec_vmrglh_direct_le): New pattern for LE.
>   (altivec_vmrglw_direct_): Remove.
>   (altivec_vmrglw_direct__be): New pattern for BE.
>   (altivec_vmrglw_direct__le): New pattern for LE.
>   * config/rs6000/rs6000.cc (altivec_expand_vec_perm_const):
>   Adjust.
>   * config/rs6000/vsx.md: Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR target/106069
>   * g++.target/powerpc/pr106069.C: New test.
> 
> Signed-off-by: Xionghu Luo 
> ---
>  gcc/config/rs6000/altivec.md| 222 ++--
>  gcc/config/rs6000/rs6000.cc |  24 +--
>  gcc/config/rs6000/vsx.md|  28 +--
>  gcc/testsuite/g++.target/powerpc/pr106069.C | 118 +++
>  4 files changed, 307 insertions(+), 85 deletions(-)
>  create mode 100644 gcc/testsuite/g++.target/powerpc/pr106069.C
> 
> diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
> index 2c4940f2e21..c6a381908cb 100644
> --- a/gcc/config/rs6000/altivec.md
> +++ b/gcc/config/rs6000/altivec.md
> @@ -1144,15 +1144,16 @@ (define_expand "altivec_vmrghb"
> (use (match_operand:V16QI 2 "register_operand"))]
>"TARGET_ALTIVEC"
>  {
> -  rtx (*fun) (rtx, rtx, rtx) = BYTES_BIG_ENDIAN ? gen_altivec_vmrghb_direct
> - : gen_altivec_vmrglb_direct;
> -  if (!BYTES_BIG_ENDIAN)
> -std::swap (operands[1], operands[2]);
> -  emit_insn (fun (operands[0], operands[1], operands[2]));
> +  if (BYTES_BIG_ENDIAN)
> +emit_insn (
> +  

Re: [PATCH v2] xtensa: Eliminate unnecessary general-purpose reg-reg moves

2023-01-18 Thread Max Filippov via Gcc-patches
Hi Suwa-san,

On Tue, Jan 17, 2023 at 9:25 PM Takayuki 'January June' Suwa
 wrote:
>
> Register-register move instructions that can be easily seen as
> unnecessary by the human eye may remain in the compiled result.
> For example:
>
> /* example */
> double test(double a, double b) {
>   return __builtin_copysign(a, b);
> }
>
> test:
>         add.n   a3, a3, a3
>         extui   a5, a5, 31, 1
>         ssai    1
>                                 ;; be in the same BB
>         src     a7, a5, a3      ;; No '0' in the source constraints
>                                 ;; No CALL insns in this span
>                                 ;; Both A3 and A7 are irrelevant to
>                                 ;;   insns in this span
>         mov.n   a3, a7          ;; An unnecessary reg-reg move
>                                 ;; A7 is not used after this
>         ret.n
>
> The last two instructions above, excluding the return instruction,
> could be done like this:
>
> src a3, a5, a3
>
> This symptom often occurs when handling DI/DFmode values with SImode
> instructions.  This patch solves the above problem using peephole2
> pattern.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
> the occurrence of general-purpose register used only once and for
> transferring intermediate value.
> ---
>  gcc/config/xtensa/xtensa.md | 43 +
>  1 file changed, 43 insertions(+)

This still triggers an ICE, this time while building libstdc++:

during RTL pass: ce3
In file included from
build/xtensa-buildroot-linux-uclibc/libstdc++-v3/include/bits/locale_facets.h:2687,
from
build/xtensa-buildroot-linux-uclibc/libstdc++-v3/include/locale:42,
from gcc/libstdc++-v3/src/c++11/locale-inst.cc:38,
from gcc/libstdc++-v3/src/c++11/wlocale-inst.cc:35:
build/xtensa-buildroot-linux-uclibc/libstdc++-v3/include/bits/locale_facets.tcc:
In member function ‘_InIter std::num_get<_CharT,
_InIter>::do_get(iter_type, iter_type, std::ios_base&,
std::ios_base::iostate&, bool&) const [with _CharT = wchar_t; _InIter
= std::istreambuf_iterator >]’:
build/xtensa-buildroot-linux-uclibc/libstdc++-v3/include/bits/locale_facets.tcc:686:5:
internal compiler error: in df_refs_verify, at df-scan.cc:4009
 686 | }
 | ^
0x6eb0dc df_refs_verify
   gcc/gcc/df-scan.cc:4009
0xd19a74 df_insn_refs_verify
   gcc/gcc/df-scan.cc:4092
0xd1b94c df_bb_verify
   gcc/gcc/df-scan.cc:4125
0xd1bd77 df_scan_verify()
   gcc/gcc/df-scan.cc:4246
0xd06ca7 df_verify()
   gcc/gcc/df-core.cc:1818
0xd06ca7 df_analyze_1
   gcc/gcc/df-core.cc:1214
0x1a7287c if_convert
   gcc/gcc/ifcvt.cc:5858
0x1a73ddd execute
   gcc/gcc/ifcvt.cc:6026

-- 
Thanks.
-- Max


Re: [PATCH 1/1] [fwprop]: Add the support of forwarding the vec_duplicate rtx

2023-01-18 Thread Richard Sandiford via Gcc-patches
"丁乐华"  writes:
> > I don't think this pattern is correct, because SEL isn't commutative
> > in the vector operands.
>
> Indeed, I think I should first invert the PRED operand or the comparison
> operator that produces the PRED operand.

That would work, but it would no longer be a win.  The vectoriser already
has code to try to reuse existing predicates where possible, to increase
the chances that the operand order of VEC_COND_EXPRs is reasonable.

> > I think this should be:
> >
> > if (...)
> >  to = XEXP (to, 0);
> >
> > and should be before the REG_P test. We don't want to treat
> > arbitrary duplicates as profitable.
>
> Agree, the adjustment is more rigorous.
>
> > It's not obvious that vec_duplicate is special enough that we should
> > treat it differently from other unary operators. For example,
> > zero_extend and sign_extend don't seem fundamentally more expensive
> > than vec_duplicate.
>
> Juzhe and I also discussed this offline recently.  We also have widening
> vector operators that need to be added, and that can be done in RTL with
> forwarding instead of adding widening GIMPLE internal functions.  We think
> we could add a target hook, for example:
> `rtx try_forward (rtx dest, rtx src, rtx use_insn, rtx def_insn)`
>
> If it returns NULL_RTX, the value cannot be forwarded; otherwise the
> dest part in use_insn is replaced with the returned rtx.
> Letting the backend decide what can be forwarded has several advantages:
> 1. Target-specific insns, such as unspecs, can also be forwarded, and
>   when forwarding, the relevant content can be extracted from def_insn
>   instead of using the complete src part.
> 2. By default this hook returns NULL_RTX, which reduces compatibility
>   issues.

Personally, I'm not in favour of a hook along these lines.  I think
it would effectively split the pass between target-independent and
target-specific code, which (a) tends to lead to more duplication
between targets and (b) makes it harder to test for correctness
(as opposed to performance) when updating the target-independent code.

If a value can't be forwarded, then either (a) substitution will fail
to give a valid instruction or (b) the new instruction will be more
costly than the old one (as measured by existing hooks).
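
The two rejection conditions above can be written down as a small predicate. This is only an illustrative sketch: the function name and its parameters are hypothetical, and it does not reflect fwprop's actual data structures or cost hooks.

```c
#include <stdbool.h>

/* Hypothetical predicate, not GCC code.
   INSN_VALID: whether the instruction still matches after substitution.
   OLD_COST / NEW_COST: cost of the use instruction before and after
   substitution, as some existing cost hook would report them.  */
bool
forward_acceptable (bool insn_valid, int old_cost, int new_cost)
{
  /* Reject (a) invalid substitutions and (b) substitutions that make
     the instruction more costly than the one it replaces.  */
  return insn_valid && new_cost <= old_cost;
}
```

Neither condition needs a target-specific forwarding hook; both rely on checks the target already provides for other purposes.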

The possible downsides (e.g. on register pressure, as you mention below)
are something that target-independent code should deal with, since it
can look at the function as a whole.

> > It's a while since I looked at this code, but I assume that, even after
> > this change, we will still require the new in-loop instruction to be
> > no more expensive than the old in-loop instruction. Is that right?
>
>
> Yeah. Forwarding vec_duplicate may reduce the use of vector registers,
> but it lengthens the live ranges of scalar registers. If scalar register
> pressure is high, this change may make things more expensive. This
> decision doesn't feel easy to make; is there some way to do this?

Yeah.  But on many architectures, scalar floats are stored in the same
register file as vectors, so whether this is a problem will depend also
on the mode of the scalar.

Also, the cost is different if we eliminate all uses of the duplicate in
the loop vs. if we only eliminate some.

The handling of flag_ira_hoist_pressure is one example of code that
tries to use register pressure to guide optimisation, but I don't
know the code very well.  (Of course, if we did reuse that,
we'd want to commonise it rather than duplicate it.)

Thanks,
Richard


Re: [PATCH v3] xtensa: Eliminate the use of callee-saved register that saves and restores only once

2023-01-18 Thread Max Filippov via Gcc-patches
Hi Suwa-san,

On Tue, Jan 17, 2023 at 8:23 PM Takayuki 'January June' Suwa
 wrote:
> In the case of the CALL0 ABI, values that must be retained before and
> after function calls are placed in the callee-saved registers (A12
> through A15) and referenced later.  However, it is often the case that
> the save and the reference each occur only once and each is a simple
> register-register move (the frame pointer is needed to recover the
> stack pointer and must be excluded).
>
> e.g. in the following example, if there are no other occurrences of
> register A14:
>
> ;; before
> ; prologue {
>   ...
> s32i.n  a14, sp, 16
>   ...
> ; } prologue
>   ...
> mov.n   a14, a6
>   ...
> call0   foo
>   ...
> mov.n   a8, a14
>   ...
> ; epilogue {
>   ...
> l32i.n  a14, sp, 16
>   ...
> ; } epilogue
>
> This can be rewritten as:
>
> ;; after
> ; prologue {
>   ...
> (deleted)
>   ...
> ; } prologue
>   ...
> s32i.n  a6, sp, 16
>   ...
> call0   foo
>   ...
> l32i.n  a8, sp, 16
>   ...
> ; epilogue {
>   ...
> (deleted)
>   ...
> ; } epilogue
>
> This patch introduces a new peephole2 pattern that implements the above.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.md: New peephole2 pattern that eliminates
> the use of callee-saved register that saves and restores only once
> for other register, by using its stack slot directly.
> ---
>  gcc/config/xtensa/xtensa.md | 62 +
>  1 file changed, 62 insertions(+)

This change introduces a bunch of different test failures:

FAIL: gcc.c-torture/execute/builtins/strpbrk.c execution,  -O2
FAIL: gcc.c-torture/execute/builtins/strpbrk.c execution,  -O3 -g
FAIL: gcc.c-torture/execute/builtins/strpbrk.c execution,  -Os
FAIL: gcc.c-torture/execute/builtins/strpbrk.c execution,  -O2 -flto
-fno-use-linker-plugin -flto-partition=none
FAIL: gcc.c-torture/execute/builtins/strstr-asm.c execution,  -Os
FAIL: gcc.c-torture/execute/20001130-1.c   -Os  execution test
FAIL: gcc.c-torture/execute/20040311-1.c   -O2  execution test
FAIL: gcc.c-torture/execute/20040311-1.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/20040311-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/20121108-1.c   -O2  execution test
FAIL: gcc.c-torture/execute/20121108-1.c   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution
test
FAIL: gcc.c-torture/execute/20121108-1.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/20121108-1.c   -Os  execution test
FAIL: gcc.c-torture/execute/20121108-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/20121108-1.c   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/20140622-1.c   -O2  execution test
FAIL: gcc.c-torture/execute/20140622-1.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/20140622-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/20141022-1.c   -O2  execution test
FAIL: gcc.c-torture/execute/20141022-1.c   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution
test
FAIL: gcc.c-torture/execute/20141022-1.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/20141022-1.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/20141022-1.c   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/20141107-1.c   -O2 -flto
-fuse-linker-plugin -fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/961213-1.c   -Os  execution test
FAIL: gcc.c-torture/execute/builtin-bitops-1.c   -Os  execution test
FAIL: gcc.c-torture/execute/cvt-1.c   -O2  execution test
FAIL: gcc.c-torture/execute/cvt-1.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/cvt-1.c   -Os  execution test
FAIL: gcc.c-torture/execute/cvt-1.c   -O2 -flto -fno-use-linker-plugin
-flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr40747.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr40747.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/pr40747.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr60960.c   -O2  execution test
FAIL: gcc.c-torture/execute/pr60960.c   -O3 -g  execution test
FAIL: gcc.c-torture/execute/pr60960.c   -O2 -flto
-fno-use-linker-plugin -flto-partition=none  execution test
FAIL: gcc.c-torture/execute/pr60960.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  execution test
FAIL: gcc.c-torture/execute/ieee/fp-cmp-5.c execution,  -O2
FAIL: gcc.c-torture/execute/ieee/fp-cmp-5.c execution,  -O3
-fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
-finline-functions
FAIL: gcc.c-torture/execute/ieee/fp-cmp-5.c execution,  -O3 -g

Re: [PATCH/RFC] rs6000: Remove optimize_for_speed check for implicit TARGET_SAVE_TOC_INDIRECT [PR108184]

2023-01-18 Thread Kewen.Lin via Gcc-patches
Hi Mike,

Thanks for the comments!

on 2023/1/18 04:57, Michael Meissner wrote:
> On Mon, Jan 16, 2023 at 05:39:04PM +0800, Kewen.Lin wrote:
>> Hi,
>>
>> Now we will check optimize_function_for_speed_p (cfun) for
>> TARGET_SAVE_TOC_INDIRECT if it's implicitly enabled.  But
>> the effect of -msave-toc-indirect is actually to save the
>> TOC in the prologue for indirect calls rather than inline,
>> it's also good for optimize_function_for_size?  So this
>> patch is to remove the check of optimize_function_for_speed
>> and make it work for both optimizing for size and speed.
>>
>> Bootstrapped and regtested on powerpc64-linux-gnu P8,
>> powerpc64le-linux-gnu P{9,10} and powerpc-ibm-aix.
>>
>> Any thoughts?
>>
>> Thanks in advance!
> 
> Well in terms of size, it is only a savings if we have 2 or more indirect 
> calls
> within a module, and we are not compiling for power10.
> 
> On power9, if we have just one indirect call, then it is the same size.
> 
> On power10, the -msave-toc-indirect switch does nothing, because we don't need
> TOCs when we have prefixed addressing.

Yes, exactly, so the test cases have the explicit option -mno-pcrel.
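
The size argument quoted above can be checked with simple arithmetic. The sketch below is an illustrative model only: it assumes each TOC save is a single 4-byte store and that one prologue save covers all indirect calls in the scope being considered. The function names are hypothetical.

```c
/* Illustrative size model, not GCC code.  Assumes each TOC save is a
   single 4-byte store instruction.  */
enum { TOC_SAVE_BYTES = 4 };

/* Bytes of TOC saves when each indirect call saves the TOC inline.  */
int
inline_save_bytes (int indirect_calls)
{
  return indirect_calls * TOC_SAVE_BYTES;
}

/* Bytes when the TOC is saved once in the prologue instead.  */
int
prologue_save_bytes (int indirect_calls)
{
  return indirect_calls > 0 ? TOC_SAVE_BYTES : 0;
}

/* Size difference in favour of the prologue save.  */
int
bytes_saved (int indirect_calls)
{
  return inline_save_bytes (indirect_calls)
         - prologue_save_bytes (indirect_calls);
}
```

With one indirect call the two schemes are the same size; the prologue save only starts to win from the second indirect call onward.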

> 
> So I have objection to the change.  I suspect it may be better with a check 
> for
> just optimize either for speed or size, and not for speed.
> 
> The option however, can slow things down if there is an early exit to the
> function since the store would always be done, even if the function exits
> early.
> 

Good point, I guess that's why we only try to turn it on under the guard
flag_shrink_wrap_separate when there is no explicit -m{no-,}save-toc-indirect.

BR,
Kewen


[PATCH 2/2] rs6000: Refactor genfusion.pl a bit further

2023-01-18 Thread Kewen.Lin via Gcc-patches
Hi,

To avoid regenerating fusion.md in the previous refactoring patch and to
make the review easier, I didn't merge this patch into the previous one.

But I think this one helps make the subroutine
gen_logical_addsubf_scalar clearer, by separating the
logical-logical and add-logical handling into two
different loops.  It requires regenerating fusion.md,
since the add-logical type definitions now get their
own contiguous area (which needs some rearrangement).

Bootstrapped and regtested on powerpc64le-linux-gnu P10.

Any comments are highly appreciated.

BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/fusion.md: Regenerate.
* config/rs6000/genfusion.pl (gen_logical_addsubf_scalar): Split
logical-logical and add-logical handlings into two loops.
---
 gcc/config/rs6000/fusion.md| 288 -
 gcc/config/rs6000/genfusion.pl |  28 ++--
 2 files changed, 162 insertions(+), 154 deletions(-)

diff --git a/gcc/config/rs6000/fusion.md b/gcc/config/rs6000/fusion.md
index d45fb138a70..0427505b7f7 100644
--- a/gcc/config/rs6000/fusion.md
+++ b/gcc/config/rs6000/fusion.md
@@ -499,42 +499,6 @@ (define_insn "*fuse_xor_and"
(set_attr "cost" "6")
(set_attr "length" "8")])

-;; add-logical fusion pattern generated by gen_logical_addsubf
-;; scalar add -> and
-(define_insn "*fuse_add_and"
-  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&0,&1,,r")
-(and:GPR (plus:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r")
-  (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r"))
- (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
-   (clobber (match_scratch:GPR 4 "=X,X,X,"))]
-  "(TARGET_P10_FUSION)"
-  "@
-   add %3,%1,%0\;and %3,%3,%2
-   add %3,%1,%0\;and %3,%3,%2
-   add %3,%1,%0\;and %3,%3,%2
-   add %4,%1,%0\;and %3,%4,%2"
-  [(set_attr "type" "fused_arith_logical")
-   (set_attr "cost" "6")
-   (set_attr "length" "8")])
-
-;; add-logical fusion pattern generated by gen_logical_addsubf
-;; scalar subf -> and
-(define_insn "*fuse_subf_and"
-  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&0,&1,,r")
-(and:GPR (minus:GPR (match_operand:GPR 0 "gpc_reg_operand" "r,r,r,r")
-  (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r"))
- (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r")))
-   (clobber (match_scratch:GPR 4 "=X,X,X,"))]
-  "(TARGET_P10_FUSION)"
-  "@
-   subf %3,%1,%0\;and %3,%3,%2
-   subf %3,%1,%0\;and %3,%3,%2
-   subf %3,%1,%0\;and %3,%3,%2
-   subf %4,%1,%0\;and %3,%4,%2"
-  [(set_attr "type" "fused_arith_logical")
-   (set_attr "cost" "6")
-   (set_attr "length" "8")])
-
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; scalar and -> andc
 (define_insn "*fuse_and_andc"
@@ -967,42 +931,6 @@ (define_insn "*fuse_xor_nand"
(set_attr "cost" "6")
(set_attr "length" "8")])

-;; add-logical fusion pattern generated by gen_logical_addsubf
-;; scalar add -> nand
-(define_insn "*fuse_add_nand"
-  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&0,&1,,r")
-(ior:GPR (not:GPR (plus:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r")
-  (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r")))
- (not:GPR (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r"
-   (clobber (match_scratch:GPR 4 "=X,X,X,"))]
-  "(TARGET_P10_FUSION)"
-  "@
-   add %3,%1,%0\;nand %3,%3,%2
-   add %3,%1,%0\;nand %3,%3,%2
-   add %3,%1,%0\;nand %3,%3,%2
-   add %4,%1,%0\;nand %3,%4,%2"
-  [(set_attr "type" "fused_arith_logical")
-   (set_attr "cost" "6")
-   (set_attr "length" "8")])
-
-;; add-logical fusion pattern generated by gen_logical_addsubf
-;; scalar subf -> nand
-(define_insn "*fuse_subf_nand"
-  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&0,&1,,r")
-(ior:GPR (not:GPR (minus:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r")
-  (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r")))
- (not:GPR (match_operand:GPR 2 "gpc_reg_operand" "r,r,r,r"
-   (clobber (match_scratch:GPR 4 "=X,X,X,"))]
-  "(TARGET_P10_FUSION)"
-  "@
-   subf %3,%1,%0\;nand %3,%3,%2
-   subf %3,%1,%0\;nand %3,%3,%2
-   subf %3,%1,%0\;nand %3,%3,%2
-   subf %4,%1,%0\;nand %3,%4,%2"
-  [(set_attr "type" "fused_arith_logical")
-   (set_attr "cost" "6")
-   (set_attr "length" "8")])
-
 ;; logical-logical fusion pattern generated by gen_logical_addsubf
 ;; scalar and -> nor
 (define_insn "*fuse_and_nor"
@@ -1147,42 +1075,6 @@ (define_insn "*fuse_xor_nor"
(set_attr "cost" "6")
(set_attr "length" "8")])

-;; add-logical fusion pattern generated by gen_logical_addsubf
-;; scalar add -> nor
-(define_insn "*fuse_add_nor"
-  [(set (match_operand:GPR 3 "gpc_reg_operand" "=&0,&1,,r")
-(and:GPR (not:GPR (plus:GPR (match_operand:GPR 0 "gpc_reg_operand" 
"r,r,r,r")
-  (match_operand:GPR 1 "gpc_reg_operand" "r,r,r,r")))
- (not:GPR (match_operand:GPR 2 

Re: [PATCH] middle-end/106075 - non-call EH and DSE

2023-01-18 Thread Richard Biener via Gcc-patches
On Tue, 17 Jan 2023, Jan Hubicka wrote:

> > > We don't use the same argument for other control flow statements.
> > > The following:
> > > 
> > > fn()
> > > {
> > >   try {
> > > i_read_no_global_memory ();
> > >   } catch (...)
> > >   {
> > > return 1;
> > >   }
> > >   return 0;
> > > }
> > > 
> > > should be detected as const.  Marking throw pure would make fn pure too.
> > 
> > I suppose i_read_no_global_memory is const here.  Not sure why that
> Suppose we have:
> 
> void
> i_read_no_global_memory ()
> {
>   throw(0);
> }
> 
> If cxa_throw itself was annotated as 'p' rather than 'c', ipa-modref will
> believe that cxa_throw may read any global memory and will propagate that
> to all callers. So fn() will also be marked as reading all global
> memory.

Sure - but for the purpose of local optimizations in 
i_read_no_global_memory cxa_throw has to appear to read memory.
Having a VUSE there dependent on whether the function performs any
load or store would be quite ugly.  Instead modref could special-case
cxa_throw and not treat it as reading memory (like it already does
for the return stmt I suppose - that also has a VUSE).

> > should make it pure?  Only if anything throws externally (not caught
> > in fn) should force it to be pure, no?
> > 
> > Of course for IPA purposes whether 'fn' is to be considered const
> > or pure depends on whether its exceptions are caught in the context
> > where that's interesting - that is, whether the EH side-effect is
> > explicitly or implicitly modeled.
> 
> We have two things here: const/pure attributes and 'c'/'p' fnspec
> specifiers.  const/pure implies that a function call can be removed when
> the result is not needed. This is not the case for functions calling
> throw() (we have -fdelete-dead-exceptions for noncall exceptions and
> those would be OK).  However 'c'/'p' is about memory side effects only
> and it is safe for i_read_no_global_memory.
> 
> With the C++ FE change adding fnspec to EH handling modref will detect
> both i_read_no_global_memory and fn() as 'c'. It won't infer the const
> attribute; that is something I can implement later.
> We are very poor at detecting scenarios where all exceptions thrown are
> actually caught. It has long been on my TODO list to fix that, so probably
> next stage1 is the time to look into it.
> 
> > 
> > > With noncall exceptions a=b/c can also transfer control to a place that
> > > inspects memory.  We may want all statements with can_throw_external to
> > > have a VUSE on them (like with return) since they may cause the function
> > > to return, but I think fnspec is the wrong place to model this.
> > 
> > Yes, I think all control transfer instructions need a VUSE.
> 
> I think that is the right way to go.  So operands_scanner::parse_ssa_operands
> can add a VUSE to anything that can_throw_external_p (like it does for
> GIMPLE_RETURN) and passes like DSE can test for it and understand that
> on the EH path the globally accessible memory is live and thus "used" by
> the statement.
>
> I can try to cook up a patch.

The problem is IIRC GIMPLE_RESX which doesn't derive from
gimple_statement_with_memory_ops_base.  There's a bugzilla I can't find
right now referring to this issue.

Richard.

> Thanks,
> Honza
> > 
> > Richard.
> > 
> > > > > According to compiler explorer testcase:
> > > > > struct a{int a,b,c,d,e;};
> > > > > void
> > > > > test(struct a * __restrict a, struct a *b)
> > > > > {
> > > > >   *a = (struct a){0,1,2,3,4};
> > > > >   *a = *b;
> > > > > }
> > > > > Is compiled correctly by GCC 5.4 and first miscompiled by 6.1, so I
> > > > > think it is a regression. (For C++ not very important one as
> > > > > -fnon-call-exceptions is not very common for C++)
> > > > 
> > > > Ah, yes - RTL DSE probably is too weak for this and GIMPLE DSE
> > > > didn't handle aggregates well at some point.
> > > 
> > > Yep, we never handled it really correctly but were weaker on optimizing
> > > and thus also producing wrong code :)
> > > 
> > > Honza
> > > > 
> > > > Richard.
> > > > 
> > > > > 
> > > > > Honza
> > > > > > 
> > > > > > PR middle-end/106075
> > > > > > * dse.cc (scan_insn): Consider externally throwing insns
> > > > > > to read from not frame based memory.
> > > > > > * tree-ssa-dse.cc (dse_classify_store): Consider externally
> > > > > > throwing uses to read from global memory.
> > > > > > 
> > > > > > * gcc.dg/torture/pr106075-1.c: New testcase.
> > > > > > ---
> > > > > >  gcc/dse.cc|  5 
> > > > > >  gcc/testsuite/gcc.dg/torture/pr106075-1.c | 36 
> > > > > > +++
> > > > > >  gcc/tree-ssa-dse.cc   |  8 -
> > > > > >  3 files changed, 48 insertions(+), 1 deletion(-)
> > > > > >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr106075-1.c
> > > > > > 
> > > > > > diff --git a/gcc/dse.cc b/gcc/dse.cc
> > > > > > index a2db8d1cc32..7e258b81f66 100644
> > > > > > --- a/gcc/dse.cc
> > > > > > +++ b/gcc/dse.cc
> > > > > > @@ -2633,6 +2633,11 

[PATCH 1/2] rs6000: Refactor script genfusion.pl

2023-01-18 Thread Kewen.Lin via Gcc-patches
Hi,

As Segher suggested in [1], this patch refactors the
script genfusion.pl, which generates fusion.md.

It mainly consists of:
  1) Add a main subroutine, which calls several backbone
     subroutines, hopefully showing the skeleton clearly.
  2) Encapsulate the emission of the copyright and top
     comments in a separate subroutine
     gen_copyright_and_top_comments.
  3) Remove multiple nested loops in gen_ld_cmpi_p10 by
     expanding them directly, which should be clearer.
     Also factor out some logic to ld_cmpi_p10_emit_define,
     which focuses on define_insn_and_split emission.
     Refine subroutine mode_to_ldst_char a bit.
  4) For gen_logical_addsubf, separate the scalar and vector
     handling into gen_logical_addsubf_{vector,scalar},
     factor out op information querying on complement/invert/
     commute2/"rtl op name" to subroutine
     logical_addsub_get_op_info, factor out some logic on
     define_insn_and_split emission to subroutine
     logical_addsubf_emit_define, and factor out some logic
     to construct the inner and outer expressions to subroutine
     logical_addsubf_make_exp.
  5) gen_addadd is quite simple, so I leave it alone and
     just remove one useless variable.

Note that this patch keeps the generated fusion.md exactly the
same as before.
Any comments are highly appreciated.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608830.html

BR,
Kewen
-
gcc/ChangeLog:

* config/rs6000/genfusion.pl (gen_copyright_and_top_comments): New 
subroutine,
refactor from some existing code.
(mode_to_ldst_char): Adjust with die.
(ld_cmpi_p10_emit_define): New subroutine, refactor from 
gen_ld_cmpi_p10,
emit define_insn_and_split for load-cmpi fusion.
(gen_ld_cmpi_p10): Adjust with ld_cmpi_p10_emit_define.
(logical_addsubf_emit_define): New subroutine, refactor from
gen_logical_addsubf, emit define_insn_and_split for logical/addsubf 
fusion.
(logical_addsub_get_op_info): New subroutine, refactor from
gen_logical_addsubf, offer some information for the given operator.
(logical_addsubf_make_exp): New subroutine, refactor from
gen_logical_addsubf, construct the expression used for emission.
(gen_logical_addsubf_scalar): New subroutine, refactor from
gen_logical_addsubf, focus on scalar kind of logical/addsubf fusion.
(gen_logical_addsubf_vector): New subroutine, refactor from
gen_logical_addsubf, focus on vector kind of logical/addsubf fusion.
(gen_logical_addsubf): Adjust with calling gen_logical_addsubf_scalar
and gen_logical_addsubf_vector.
(gen_addadd): Remove useless variable.
(main): New subroutine, call the corresponding main subroutine for each
fusion type.
---
 gcc/config/rs6000/genfusion.pl | 554 -
 1 file changed, 337 insertions(+), 217 deletions(-)

diff --git a/gcc/config/rs6000/genfusion.pl b/gcc/config/rs6000/genfusion.pl
index e4db352e0ce..487e662ce05 100755
--- a/gcc/config/rs6000/genfusion.pl
+++ b/gcc/config/rs6000/genfusion.pl
@@ -22,7 +22,9 @@
 use warnings;
 use strict;

-print <<'EOF';
+sub gen_copyright_and_top_comments
+{
+  print << "EOF";
 ;; Generated automatically by genfusion.pl

 ;; Copyright (C) 2020-2023 Free Software Foundation, Inc.
@@ -44,255 +46,369 @@ print <<'EOF';
 ;; .

 EOF
+}

+# Map any mode of DI/SI/HI/QI to single char d/w/h/b,
+# die if the given mode in arg 0 isn't expected.
 sub mode_to_ldst_char
 {
-my ($mode) = @_;
-my %x = (DI => 'd', SI => 'w', HI => 'h', QI => 'b');
-return $x{$mode} if exists $x{$mode};
-return '?';
+  my $mode = $_[0];
+  die "Unexpected mode: $mode" unless $mode =~ /[QHSD]I/;
+  my %map = (DI => 'd', SI => 'w', HI => 'h', QI => 'b');
+  return $map{$mode};
 }

-sub gen_ld_cmpi_p10
+# Emit define_insn_and_split for load-cmpi fusion type based
+# on the below given arguments:
+#   arg 0: mode of load.
+#   arg 1: mode of result.
+#   arg 2: mode of comparison.
+#   arg 3: extension type.
+sub ld_cmpi_p10_emit_define
 {
-my ($lmode, $ldst, $clobbermode, $result, $cmpl, $echr, $constpred,
-   $mempred, $ccmode, $np, $extend, $resultmode);
-  LMODE: foreach $lmode ('DI','SI','HI','QI') {
-  $ldst = mode_to_ldst_char($lmode);
-  $clobbermode = $lmode;
-  # For clobber, we need a SI/DI reg in case we
-  # split because we have to sign/zero extend.
-  if ($lmode eq 'HI' || $lmode eq 'QI') { $clobbermode = "GPR"; }
-RESULT: foreach $result ('clobber', $lmode,  "EXT".$lmode) {
-   # EXTDI does not exist, and we cannot directly produce HI/QI results.
-   next RESULT if $result eq "EXTDI" || $result eq "HI" || $result eq "QI";
-   # Don't allow EXTQI because that would allow HI result which we can't do.
-   $result = "GPR" if $result eq "EXTQI";
-  CCMODE: foreach $ccmode ('CC','CCUNS') {
- $np = "NON_PREFIXED_D";

[PATCH, rs6000] Convert TI AND with a special constant to DI AND [PR93123]

2023-01-18 Thread HAO CHEN GUI via Gcc-patches
Hi,
  When a TI AND uses a special constant (one whose high part or low part
is all ones), it can be converted to a DI AND with a 64-bit constant plus
a simple DI move. When the DI AND can be implemented by rotate and mask or
"andi.", this avoids the cost of loading the 128-bit constant.

  The patch creates three insn_and_split patterns to match these cases
in the combine pass and split them later. The new predicate
"double_wide_cint_operand" identifies whether a constant is a
double-wide constant.
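
The equivalence behind the highpart and lowpart cases can be sketched in plain C++. This is only an illustration of the arithmetic, not of the RTL patterns: the struct U128 and all function names below are hypothetical stand-ins, with two 64-bit fields playing the role of the DImode halves of a TImode value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a TImode value: two 64-bit (DImode) halves.
struct U128
{
  std::uint64_t lo;
  std::uint64_t hi;
};

// Reference: a full 128-bit AND, computed half by half.
inline U128 and128 (U128 x, U128 mask)
{
  return U128{x.lo & mask.lo, x.hi & mask.hi};
}

// Mask whose low half is all ones: AND only the high half and
// copy the low half unchanged.
inline U128 and128_highpart (U128 x, std::uint64_t imm_hi)
{
  return U128{x.lo, x.hi & imm_hi};
}

// Mask whose high half is all ones: AND only the low half and
// copy the high half unchanged.
inline U128 and128_lowpart (U128 x, std::uint64_t imm_lo)
{
  return U128{x.lo & imm_lo, x.hi};
}

// Compare both halves.
inline bool same (U128 a, U128 b)
{
  return a.lo == b.lo && a.hi == b.hi;
}
```

For any x, and128_highpart (x, imm) gives the same result as and128 with the mask {all ones, imm}, and likewise for the low part, so only one 64-bit immediate is ever needed.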

  Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
Is this okay for trunk? Any recommendations? Thanks a lot.

Gui Haochen


ChangeLog
2023-01-18  Haochen Gui 

gcc/
PR target/93123
* config/rs6000/predicates.md (double_wide_cint_operand): New.
* config/rs6000/rs6000.md (*andti3_128bit_imm_highpart): New.
(*andti3_128bit_imm_lowpart): New.
(*andti3_64bit_imm): New.

gcc/testsuite/
PR target/93123
* gcc.target/powerpc/pr93123.c: New.

patch.diff
diff --git a/gcc/config/rs6000/predicates.md b/gcc/config/rs6000/predicates.md
index a1764018545..bacb87c3fb2 100644
--- a/gcc/config/rs6000/predicates.md
+++ b/gcc/config/rs6000/predicates.md
@@ -255,6 +255,19 @@ (define_predicate "u10bit_cint_operand"
   (and (match_code "const_int")
(match_test "INTVAL (op) >= 0 && INTVAL (op) <= 1023")))

+;; Return 1 if op is a 65-128 bits constant integer.
+(define_predicate "double_wide_cint_operand"
+  (match_operand 0 "const_scalar_int_operand")
+{
+  if (CONST_INT_P (op))
+return 0;
+
+  if (CONST_WIDE_INT_NUNITS (op) == 2)
+return 1;
+
+  return 0;
+})
+
 ;; Return 1 if op is a constant integer that can fit in a D field.
 (define_predicate "short_cint_operand"
   (and (match_code "const_int")
diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 6011f5bf76a..1fecb2d734e 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -7199,6 +7199,128 @@ (define_expand "orc3"
   "mode == TImode || mode == PTImode || TARGET_P8_VECTOR"
   "")

+(define_insn_and_split "*andti3_128bit_imm_highpart"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=r")
+   (and:TI
+ (match_operand:TI 1 "gpc_reg_operand" "r")
+ (match_operand:TI 2 "double_wide_cint_operand" "n")))]
+  "CONST_WIDE_INT_ELT (operands[2], 0) == -1
+   && (rs6000_is_valid_and_mask (GEN_INT (CONST_WIDE_INT_ELT (operands[2], 1)),
+E_DImode)
+   || logical_const_operand (GEN_INT (CONST_WIDE_INT_ELT (operands[2], 1)),
+E_DImode))"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx in_lo, in_hi, out_lo, out_hi;
+  rtx imm = GEN_INT (CONST_WIDE_INT_ELT (operands[2], 1));
+  int hi_off, lo_off;
+
+  if (BYTES_BIG_ENDIAN)
+{
+  hi_off = 0;
+  lo_off = 8;
+}
+  else
+{
+  hi_off = 8;
+  lo_off = 0;
+}
+
+  in_lo = simplify_gen_subreg (DImode, operands[1], TImode, lo_off);
+  out_lo = simplify_gen_subreg (DImode, operands[0], TImode, lo_off);
+  in_hi = simplify_gen_subreg (DImode, operands[1], TImode, hi_off);
+  out_hi = simplify_gen_subreg (DImode, operands[0], TImode, hi_off);
+
+  if (rs6000_is_valid_and_mask (imm, E_DImode))
+emit_insn (gen_anddi3_mask (out_hi, in_hi, imm));
+  else
+emit_insn (gen_anddi3_imm (out_hi, in_hi, imm));
+
+  emit_move_insn (out_lo, in_lo);
+}
+  [(set_attr "length" "8")])
+
+(define_insn_and_split "*andti3_128bit_imm_lowpart"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=r")
+   (and:TI
+ (match_operand:TI 1 "gpc_reg_operand" "r")
+ (match_operand:TI 2 "double_wide_cint_operand" "n")))]
+  "CONST_WIDE_INT_ELT (operands[2], 1) == -1
+   && (rs6000_is_valid_and_mask (GEN_INT (CONST_WIDE_INT_ELT (operands[2], 0)),
+E_DImode)
+   || logical_const_operand (GEN_INT (CONST_WIDE_INT_ELT (operands[2], 0)),
+E_DImode))"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  rtx in_lo, in_hi, out_lo, out_hi;
+  rtx imm = GEN_INT (CONST_WIDE_INT_ELT (operands[2], 0));
+  int hi_off, lo_off;
+
+  if (BYTES_BIG_ENDIAN)
+{
+  hi_off = 0;
+  lo_off = 8;
+}
+  else
+{
+  hi_off = 8;
+  lo_off = 0;
+}
+
+  in_lo = simplify_gen_subreg (DImode, operands[1], TImode, lo_off);
+  out_lo = simplify_gen_subreg (DImode, operands[0], TImode, lo_off);
+  in_hi = simplify_gen_subreg (DImode, operands[1], TImode, hi_off);
+  out_hi = simplify_gen_subreg (DImode, operands[0], TImode, hi_off);
+
+  if (rs6000_is_valid_and_mask (imm, E_DImode))
+emit_insn (gen_anddi3_mask (out_lo, in_lo, imm));
+  else
+emit_insn (gen_anddi3_imm (out_lo, in_lo, imm));
+
+  emit_move_insn (out_hi, in_hi);
+}
+  [(set_attr "length" "8")])
+
+
+(define_insn_and_split "*andti3_64bit_imm"
+  [(set (match_operand:TI 0 "gpc_reg_operand" "=r")
+   (and:TI
+ (match_operand:TI 1 "gpc_reg_operand" "r")
+ (match_operand:TI 

[committed] wwwdocs: gcc-4.6: Adjust www.open-std.org links to https

2023-01-18 Thread Gerald Pfeifer
Pushed

Gerald
---
 htdocs/gcc-4.6/changes.html  |   2 +-
 htdocs/gcc-4.6/cxx0x_status.html | 122 +++
 2 files changed, 62 insertions(+), 62 deletions(-)

diff --git a/htdocs/gcc-4.6/changes.html b/htdocs/gcc-4.6/changes.html
index eb71f855..c96d347f 100644
--- a/htdocs/gcc-4.6/changes.html
+++ b/htdocs/gcc-4.6/changes.html
@@ -447,7 +447,7 @@
 In 4.6.0 and 4.6.1 G++ no longer allows objects of const-qualified
   type to be default initialized unless the type has a user-declared
   default constructor.  In 4.6.2 G++ implements the proposed resolution
-  of http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#253;>DR
+  of https://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#253;>DR
   253, so default initialization is allowed if it initializes all
   subobjects.  Code that fails to compile can be fixed by providing an
   initializer e.g.
diff --git a/htdocs/gcc-4.6/cxx0x_status.html b/htdocs/gcc-4.6/cxx0x_status.html
index 767335e4..b678777a 100644
--- a/htdocs/gcc-4.6/cxx0x_status.html
+++ b/htdocs/gcc-4.6/cxx0x_status.html
@@ -18,7 +18,7 @@
 GCC's C++0x mode tracks the C++0x working paper drafts produced by
 the ISO C++ committee, available on the ISO C++ committee's web site
 at http://www.open-std.org/jtc1/sc22/wg21/;>http://www.open-std.org/jtc1/sc22/wg21/.
 Since
+href="https://www.open-std.org/jtc1/sc22/wg21/;>https://www.open-std.org/jtc1/sc22/wg21/.
 Since
 this standard is still being extended and modified, the feature set
 provided by the experimental C++0x mode may vary greatly from one GCC
 version to another. No attempts will be made to preserve backward
@@ -40,231 +40,231 @@ page.
 
 
   Rvalue references
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2118.html;>N2118
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2118.html;>N2118
Yes
 
 
   Rvalue references for *this
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm;>N2439
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2439.htm;>N2439
   No
 
 
   Initialization of class objects by rvalues
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1610.html;>N1610
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1610.html;>N1610
   Yes
 
 
   Non-static data member initializers
-  http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2008/n2756.htm;>N2756
+  https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2008/n2756.htm;>N2756
   No
 
 
   Variadic templates
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2242.pdf;>N2242
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2242.pdf;>N2242
Yes
 
 
   Extending variadic template template 
parameters
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2555.pdf;>N2555
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2555.pdf;>N2555
Yes
 
 
   Initializer lists
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2672.htm;>N2672
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2672.htm;>N2672
Yes
 
 
   Static assertions
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1720.html;>N1720
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1720.html;>N1720
Yes
 
 
   auto-typed variables
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1984.pdf;>N1984
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1984.pdf;>N1984
Yes
 
 
   Multi-declarator auto
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1737.pdf;>N1737
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1737.pdf;>N1737
Yes
 
 
   Removal of auto as a storage-class 
specifier
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2546.htm;>N2546
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2546.htm;>N2546
Yes
 
 
   New function declarator syntax
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm;>N2541
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2541.htm;>N2541
Yes
 
 
   New wording for C++0x lambdas
-  http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2927.pdf;>N2927
+  https://www.open-std.org/JTC1/SC22/WG21/docs/papers/2009/n2927.pdf;>N2927
   Yes
 
 
   Declared type of an expression
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2343.pdf;>N2343
+  https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2343.pdf;>N2343
Yes
 
 
   Right angle brackets
-  http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1757.html;>N1757
+  

Re: [PATCH] tree-optimization/104475 - bogus -Wstringop-overflow

2023-01-18 Thread Richard Biener via Gcc-patches
On Tue, 17 Jan 2023, Jason Merrill wrote:

> On 12/7/22 06:25, Richard Biener wrote:
> > The following avoids a bogus -Wstringop-overflow diagnostic by
> > properly recognizing that &this->m_mutex cannot be nullptr in C++
> > even if m_mutex is at offset zero.  The frontend already diagnoses
> > a &this->m_mutex != nullptr comparison and the following transfers
> > this knowledge to the middle-end which sees &this->m_mutex as
> > simple pointer arithmetic.  The new ADDR_NONZERO flag on an
> > ADDR_EXPR is used to carry this information and it's checked in
> > the tree_expr_nonzero_p API which causes this to be folded early.
> > 
> > To avoid the bogus diagnostic this avoids separating the nullptr
> > path via jump-threading by eliminating the nullptr check.
> > 
> > I'd appreciate C++ folks picking this up and put the flag on
> > the appropriate ADDR_EXPRs - I've tried avoiding to put it on
> > all of them and didn't try hard to mimic what -Waddress warns
> > on (the code is big, maybe some refactoring would help but also
> > not sure what exactly the C++ standard constraints are here).
> 
> This is allowed by the standard, at least after CWG2535, but we need to check
> -fsanitize=null before asserting that the address is non-null. With that
> elaboration, a flag on the ADDR_EXPR may not be a convenient way to express
> the property?

Adding a flag on the ADDR_EXPR was mostly out of caution for other
languages that do not have this guarantee (it seems C has a similar
guarantee at least) and for the middle-end (accidentally) producing
such expressions.  That is, I intended to set the flag on ADDR_EXPRs
written by the user as opposed to those created artificially.

I noticed the &* contraction rule and wondered how to conservatively
enforce that - I suppose we'd rely on the frontend to never actually
produce the ADDR_EXPR here.

That said, we could re-define GENERIC/GIMPLE here to the extent
that ADDR_EXPR of a COMPONENT_REF (or all handled components?)
is never nullptr when the target specifies nullptr is not a valid
object address.  We currently already assert there's a valid
object for &p->x if x lives at non-zero offset, so the case we
fail to handle is specifically _only_ the one where the component
is at offset zero.  Note &p->x != (void *)4 isn't currently optimized
when x is at offset 4 even though *p would be at address zero
and -Waddress also doesn't diagnose this case - we could
canonicalize this to p != (void *)0 but then we cannot
treat this as false anymore because of the address-taking of a component.
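
To make the offset distinction concrete, here is a small C++ sketch. The struct Pair and the helper names are hypothetical and not taken from the PR; the code only inspects a real object, so it says nothing about what the compiler may assume for a null pointer. It merely shows why the offset-zero member is the special case: its address coincides with the address of the object itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layout: 'first' sits at offset zero,
// 'second' at a non-zero offset.
struct Pair
{
  int first;
  int second;
};

// Distance in bytes from the start of a real object to one of its members.
inline std::ptrdiff_t byte_distance (const void *base, const void *member)
{
  return static_cast<const char *> (member)
	 - static_cast<const char *> (base);
}

// The member at offset zero has the same address as the object.
inline bool first_shares_object_address (const Pair &p)
{
  return byte_distance (&p, &p.first) == 0;
}

// The other member lives at base + offsetof (Pair, second).
inline bool second_is_at_its_offset (const Pair &p)
{
  return byte_distance (&p, &p.second)
	 == static_cast<std::ptrdiff_t> (offsetof (Pair, second));
}
```

With a non-zero offset the member address alone already differs from the object address, which is the situation that is handled today; with offset zero the two are indistinguishable as plain pointer values, which is the case the flag was meant to cover.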

Richard.

> > Bootstrapped and tested on x86_64-unknown-linux-gnu.
> > 
> > Thanks,
> > Richard.
> > 
> > PR tree-optimization/104475
> > gcc/
> >  * tree-core.h: Document use of nothrow_flag on ADDR_EXPR.
> >  * tree.h (ADDR_NONZERO): New.
> >  * fold-const.cc (tree_single_nonzero_warnv_p): Check
> >  ADDR_NONZERO.
> > 
> > gcc/cp/
> >  * typeck.cc (cp_build_addr_expr_1): Set ADDR_NONZERO
> >  on the built address if it is of a COMPONENT_REF.
> > 
> > * g++.dg/opt/pr104475.C: New testcase.
> > ---
> >   gcc/cp/typeck.cc|  3 +++
> >   gcc/fold-const.cc   |  4 +++-
> >   gcc/testsuite/g++.dg/opt/pr104475.C | 12 
> >   gcc/tree-core.h |  3 +++
> >   gcc/tree.h  |  4 
> >   5 files changed, 25 insertions(+), 1 deletion(-)
> >   create mode 100644 gcc/testsuite/g++.dg/opt/pr104475.C
> > 
> > diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
> > index 7dfe5acc67e..3563750803e 100644
> > --- a/gcc/cp/typeck.cc
> > +++ b/gcc/cp/typeck.cc
> > @@ -7232,6 +7232,9 @@ cp_build_addr_expr_1 (tree arg, bool strict_lvalue,
> > tsubst_flags_t complain)
> > gcc_assert (same_type_ignoring_top_level_qualifiers_p
> >   (TREE_TYPE (object), decl_type_context (field)));
> > val = build_address (arg);
> > +  if (TREE_CODE (val) == ADDR_EXPR
> > + && TREE_CODE (TREE_OPERAND (val, 0)) == COMPONENT_REF)
> > +   ADDR_NONZERO (val) = 1;
> >   }
> >   
> > if (TYPE_PTR_P (argtype)
> > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > index e80be8049e1..cdfe3f50ae3 100644
> > --- a/gcc/fold-const.cc
> > +++ b/gcc/fold-const.cc
> > @@ -15308,8 +15308,10 @@ tree_single_nonzero_warnv_p (tree t, bool
> > *strict_overflow_p)
> >   
> >   case ADDR_EXPR:
> > {
> > -   tree base = TREE_OPERAND (t, 0);
> > +   if (ADDR_NONZERO (t))
> > + return true;
> >   + tree base = TREE_OPERAND (t, 0);
> >if (!DECL_P (base))
> >  base = get_base_address (base);
> >   diff --git a/gcc/testsuite/g++.dg/opt/pr104475.C
> > b/gcc/testsuite/g++.dg/opt/pr104475.C
> > new file mode 100644
> > index 000..013c70302c6
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/opt/pr104475.C
> > @@ -0,0 +1,12 @@
> > +// { dg-do compile }
> > +// { dg-require-effective-target c++11 }
> > +// { dg-options "-O -Waddress -fdump-tree-original" }
> > +
> > +struct X { int i; };
> > +
> > +bool foo (struct X