Re: [PATCH v3 1/5] LoongArch: Fix usage of LSX and LASX frint/ftint instructions [PR112578]

2023-11-22 Thread chenglulu



On 2023/11/23 at 3:11 PM, Xi Ruoyao wrote:

On Thu, 2023-11-23 at 14:35 +0800, chenglulu wrote:

Hi,

   I don’t quite understand this part. Is it because define_insn would be
duplicated with the above implementation, so define_insn_and_split is used?

Yes, but if you think duplicating the above implementation is better I
can dup it as well (as it's just a single line).

(I wrote it as a define_expand but it didn't work, then I modified it to
define_insn_and_split).


I just thought it was weird when I was looking at the code.

I modified this code to use define_expand:

    (define_expand "fix_trunc2"
      [(set (match_operand: 0 "register_operand" "=f")
        (fix: (match_operand:FVEC 1 "register_operand" "f")))]
      ""
      {
        emit_insn (gen__vftintrz__ (
      operands[0], operands[1]));
        DONE;
      }
      [(set_attr "type" "simd_fcvt")
       (set_attr "mode" "")])

Here are my test cases:

    typedef float __attribute__ ((mode (SF))) float_t;
    typedef int __attribute__ ((mode (SI))) int_t;

    extern int_t v[4];
    int_t
    lt_fixdfsi (float_t *x)
    {

      for (int i=0;i<4;i++)
        v[i] = x[i];
    }

This still achieves the desired effect, generating the following 
assembly code:


lt_fixdfsi:
.LFB0 = .
    .cfi_startproc

    or    $r13,$r4,$r0     # 16    [c=4 l=4]  *movdi_64bit/0
    la.global    $r12,v     # 8    [c=4 l=12]  *movdi_64bit/1
    vld    $vr0,$r13,0     # 6    [c=12 l=4]  movv4sf_lsx/1
    vftintrz.w.s    $vr0,$vr0     # 7    [c=12 l=4] lsx_vftintrz_w_s
    vst    $vr0,$r12,0     # 9    [c=4 l=4]  movv4si_lsx/2

So I don't know if I'm getting it right?:-(


+(define_insn_and_split "fix_trunc2"
+  [(set (match_operand: 0 "register_operand" "=f")
+   (fix: (match_operand:FVEC 1 "register_operand" "f")))]
+  ""
+  "#"
+  ""
+  [(const_int 0)]
+  {
+    emit_insn (gen__vftintrz__ (
+  operands[0], operands[1]));
+    DONE;
+  }
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "")])




Re: Propagate value ranges of return values

2023-11-22 Thread Andrew Pinski
On Tue, Nov 21, 2023 at 6:07 AM Jan Hubicka  wrote:
>
> > After this patch in addition to the problem already reported about
> > vlda1.c and return-value-range-1.c, we have noticed these regressions
> > on aarch64:
> > Running gcc:gcc.target/aarch64/aarch64.exp ...
> > FAIL: gcc.target/aarch64/movk.c scan-assembler movk\tx[0-9]+, 0x4667, lsl 16
> > FAIL: gcc.target/aarch64/movk.c scan-assembler movk\tx[0-9]+, 0x7a3d, lsl 32
> >
> > Running gcc:gcc.target/aarch64/simd/simd.exp ...
> > FAIL: gcc.target/aarch64/simd/vmulxd_f64_2.c scan-assembler-times
> > fmul[ \t]+[dD][0-9]+, ?[dD][0-9]+, ?[dD][0-9]+\n 1
> > FAIL: gcc.target/aarch64/simd/vmulxd_f64_2.c scan-assembler-times
> > fmulx[ \t]+[dD][0-9]+, ?[dD][0-9]+, ?[dD][0-9]+\n 4
> > FAIL: gcc.target/aarch64/simd/vmulxs_f32_2.c scan-assembler-times
> > fmul[ \t]+[sS][0-9]+, ?[sS][0-9]+, ?[sS][0-9]+\n 1
> > FAIL: gcc.target/aarch64/simd/vmulxs_f32_2.c scan-assembler-times
> > fmulx[ \t]+[sS][0-9]+, ?[sS][0-9]+, ?[sS][0-9]+\n 4
>
> Sorry for that - I guess we will see some on various targets.
> This is quite a common issue - the testcase has a
> dummy_number_generator function that returns a constant and prevents
> inlining so that the constant is not visible to the compiler.  This no
> longer works, since we now get it from the return value range.  This
> should fix it.
>
> return-value_range-1.c should be fixed now and I do not have vlda1.c in
> my tree.  I will check.
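
To illustrate the point quoted above, here is a minimal sketch (not taken
from the testsuite; the constant and the HIDE_FROM_IPA macro are made up for
the example). noinline only stops inlining, whereas noipa (available since
GCC 8) also disables interprocedural analyses, so the caller should not be
able to learn the returned constant from the return value range:

```c
/* Sketch only: the constant below is arbitrary, not the one used in movk.c.  */
#if defined(__has_attribute)
# if __has_attribute(noipa)
#  define HIDE_FROM_IPA __attribute__ ((noipa))
# endif
#endif
#ifndef HIDE_FROM_IPA
/* Fallback for compilers without noipa; this only blocks inlining.  */
# define HIDE_FROM_IPA __attribute__ ((noinline))
#endif

HIDE_FROM_IPA long long
dummy_number_generator (void)
{
  return 0x123456789abcLL;
}

/* The caller has to really call the function and materialise the value.  */
long long
use_number (void)
{
  return dummy_number_generator () + 1;
}
```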

This is the other change that needs to happen I think:
diff --git a/gcc/testsuite/gcc.target/aarch64/simd/vmulx.x
b/gcc/testsuite/gcc.target/aarch64/simd/vmulx.x
index 8968a64a95c..869e7485646 100644
--- a/gcc/testsuite/gcc.target/aarch64/simd/vmulx.x
+++ b/gcc/testsuite/gcc.target/aarch64/simd/vmulx.x
@@ -33,13 +33,13 @@
   while (0)\

 /* Functions used to return values that won't be optimised away.  */
-float32_t  __attribute__ ((noinline))
+float32_t  __attribute__ ((noipa))
 foo32 ()
 {
   return 1.0;
 }

-float64_t  __attribute__ ((noinline))
+float64_t  __attribute__ ((noipa))
 foo64 ()
 {
   return 1.0;


Thanks,
Andrew Pinski

>
> diff --git a/gcc/testsuite/gcc.target/aarch64/movk.c 
> b/gcc/testsuite/gcc.target/aarch64/movk.c
> index e6e4e3a8961..6b1f3f8ecf5 100644
> --- a/gcc/testsuite/gcc.target/aarch64/movk.c
> +++ b/gcc/testsuite/gcc.target/aarch64/movk.c
> @@ -1,8 +1,9 @@
>  /* { dg-do run } */
> -/* { dg-options "-O2 --save-temps -fno-inline" } */
> +/* { dg-options "-O2 --save-temps" } */
>
>  extern void abort (void);
>
> +__attribute__ ((noipa))
>  long long int
>  dummy_number_generator ()
>  {
>
> >
> > We have already sent you a notification for the regression on arm, but
> > it includes on vla-1.c and return-value-range-1.c.
> > The notification email contains a pointer to the page where we record
> > all the configurations that regress because of this patch:
> >
> > https://linaro.atlassian.net/browse/GNU-1025
> >
> > Can you have a look?
> >
> > Thanks,
> >
> > Christophe
> >
> >
> >
> >
> > > diff --git a/gcc/cgraph.cc b/gcc/cgraph.cc
> > > index e41e5ad3ae7..71dacf23ce1 100644
> > > --- a/gcc/cgraph.cc
> > > +++ b/gcc/cgraph.cc
> > > @@ -2629,6 +2629,54 @@ cgraph_node::set_malloc_flag (bool malloc_p)
> > >return changed;
> > >  }
> > >
> > > +/* Worker to set malloc flag.  */
> > > +static void
> > > +add_detected_attribute_1 (cgraph_node *node, const char *attr, bool 
> > > *changed)
> > > +{
> > > +  if (!lookup_attribute (attr, DECL_ATTRIBUTES (node->decl)))
> > > +{
> > > +  DECL_ATTRIBUTES (node->decl) = tree_cons (get_identifier (attr),
> > > +NULL_TREE, DECL_ATTRIBUTES 
> > > (node->decl));
> > > +  *changed = true;
> > > +}
> > > +
> > > +  ipa_ref *ref;
> > > +  FOR_EACH_ALIAS (node, ref)
> > > +{
> > > +  cgraph_node *alias = dyn_cast (ref->referring);
> > > +  if (alias->get_availability () > AVAIL_INTERPOSABLE)
> > > +   add_detected_attribute_1 (alias, attr, changed);
> > > +}
> > > +
> > > +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
> > > +if (e->caller->thunk
> > > +   && (e->caller->get_availability () > AVAIL_INTERPOSABLE))
> > > +  add_detected_attribute_1 (e->caller, attr, changed);
> > > +}
> > > +
> > > +/* Set DECL_IS_MALLOC on NODE's decl and on NODE's aliases if any.  */
> > > +
> > > +bool
> > > +cgraph_node::add_detected_attribute (const char *attr)
> > > +{
> > > +  bool changed = false;
> > > +
> > > +  if (get_availability () > AVAIL_INTERPOSABLE)
> > > +add_detected_attribute_1 (this, attr, );
> > > +  else
> > > +{
> > > +  ipa_ref *ref;
> > > +
> > > +  FOR_EACH_ALIAS (this, ref)
> > > +   {
> > > + cgraph_node *alias = dyn_cast (ref->referring);
> > > + if (alias->get_availability () > AVAIL_INTERPOSABLE)
> > > +   add_detected_attribute_1 (alias, attr, );
> > > +   }
> > > +}
> > > +  return changed;
> > > +}
> > > +
> > >  /* 

Re: [PATCH v3 1/5] LoongArch: Fix usage of LSX and LASX frint/ftint instructions [PR112578]

2023-11-22 Thread Xi Ruoyao
On Thu, 2023-11-23 at 14:35 +0800, chenglulu wrote:
> Hi,
> 
> I don’t quite understand this part. Is it because define_insn would be
> duplicated with the above implementation, so define_insn_and_split is
> used?

Yes, but if you think duplicating the above implementation is better I
can dup it as well (as it's just a single line).

(I wrote it as a define_expand but it didn't work, then I modified it to
define_insn_and_split).


> > +(define_insn_and_split "fix_trunc2"
> > +  [(set (match_operand: 0 "register_operand" "=f")
> > +   (fix: (match_operand:FVEC 1 "register_operand" "f")))]
> > +  ""
> > +  "#"
> > +  ""
> > +  [(const_int 0)]
> > +  {
> > +    emit_insn (gen__vftintrz__ (
> > +  operands[0], operands[1]));
> > +    DONE;
> > +  }
> > +  [(set_attr "type" "simd_fcvt")
> > +   (set_attr "mode" "")])

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [PATCH] gimple-vr-values:Add constraint for gimple-cond optimization

2023-11-22 Thread Andrew Pinski
On Wed, Nov 22, 2023 at 10:07 PM Feng Wang  wrote:
>
> This patch adds another condition for gimple-cond optimization. Refer to
> the following test case.
> int foo1 (int data, int res)
> {
>   res = data & 0xf;
>   res |= res << 4;
>   if (res < 0x22)
> return 0x22;
>   return res;
> }
> with the compilation flag "-march=rv64gc_zba_zbb -mabi=lp64d -O2",
> before this patch the compilation result is
> foo1:
>         andi    a0,a0,15
>         slliw   a5,a0,4
>         addw    a3,a5,a0
>         li      a4,33
>         add     a0,a5,a0
>         bleu    a3,a4,.L5
>         ret
> .L5:
>         li      a0,34
>         ret
> after this patch the compilation result is
> foo1:
>         andi    a0,a0,15
>         slliw   a5,a0,4
>         add     a5,a5,a0
>         li      a0,34
>         max     a0,a5,a0
>         ret
> The reason is that in pass_early_vrp the arg0 of the gimple_cond
> is replaced, but the PHI node still uses the original arg0.
> Some of the evrp pass logs are as follows:
>  gimple_assign 
>   gimple_assign 
>   gimple_cond 
> goto ; [INV]
>   else
> goto ; [INV]
>
>:
>   // predicted unlikely by early return (on trees) predictor.
>
>:
>   # gimple_phi <_2, 34(3), res_5(2)>
> The arg0 of gimple_cond is replaced by _9, but the gimple_phi still
> uses res_5, which prevents the PHI node from being optimized to MAX_EXPR.
> So next_use_is_phi is added to control the replacement.
>
> gcc/ChangeLog:
>
> * vr-values.cc (next_use_is_phi):
> (simplify_using_ranges::simplify_casted_compare):
> add new function next_use_is_phi to control the replacement.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zbb-min-max-04.c: New test.

One more comment, since this is a generic gimple change, you should
add a testcase that is not riscv specific that scans the tree dumps. I
would scan phiopt1 in this case to make sure MAX_EXPR is created
early on.
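
A target-independent test along these lines might look like the sketch
below. It assumes the fix makes phiopt1 produce the MAX_EXPR; the test
location and the exact options are a guess, while -fdump-tree-phiopt1 and
scan-tree-dump are the usual dump option and DejaGnu directive:

```c
/* Hypothetical gcc.dg/tree-ssa test; name and options are assumptions.  */
/* { dg-do compile } */
/* { dg-options "-O2 -fdump-tree-phiopt1" } */

int foo1 (int data, int res)
{
  res = data & 0xf;
  res |= res << 4;
  if (res < 0x22)
    return 0x22;
  return res;
}

/* { dg-final { scan-tree-dump "MAX_EXPR" "phiopt1" } } */
```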

Thanks,
Andrew Pinski


> ---
>  gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c | 14 ++
>  gcc/vr-values.cc| 15 ++-
>  2 files changed, 28 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c
>
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c 
> b/gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c
> new file mode 100644
> index 000..8c3e87a35e0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64d" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
> +
> +int foo1 (int data, int res)
> +{
> +  res = data & 0xf;
> +  res |= res << 4;
> +  if (res < 0x22)
> +return 0x22;
> +  return res;
> +}
> +
> +/* { dg-final { scan-assembler-times "max\t" 1 } } */
> \ No newline at end of file
> diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
> index ecb294131b0..1f7a727c638 100644
> --- a/gcc/vr-values.cc
> +++ b/gcc/vr-values.cc
> @@ -1263,6 +1263,18 @@ simplify_using_ranges::simplify_compare_using_ranges_1 
> (tree_code _code, tr
>return happened;
>  }
>
> +/* Return true if the next use of SSA_NAME is PHI node */
> +bool
> +next_use_is_phi (tree arg)
> +{
> +  use_operand_p imm = &(SSA_NAME_IMM_USE_NODE (arg));
> +  use_operand_p next = imm->next;
> +  if (next && next->loc.stmt
> +  && (gimple_code (next->loc.stmt) == GIMPLE_PHI))
> +return true;
> +  return false;
> +}
> +
>  /* Simplify OP0 code OP1 when OP1 is a constant and OP0 was a SSA_NAME
> defined by a type conversion. Replacing OP0 with RHS of the type 
> conversion.
> Doing so makes the conversion dead which helps subsequent passes.  */
> @@ -1305,7 +1317,8 @@ simplify_using_ranges::simplify_casted_compare 
> (tree_code &, tree , tree 
>if (TREE_CODE (innerop) == SSA_NAME
>   && !POINTER_TYPE_P (TREE_TYPE (innerop))
>   && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (innerop)
> - && desired_pro_or_demotion_p (TREE_TYPE (innerop), TREE_TYPE (op0)))
> + && desired_pro_or_demotion_p (TREE_TYPE (innerop), TREE_TYPE (op0))
> +  && !next_use_is_phi (op0))
> {
>   value_range vr;
>
> --
> 2.17.1
>


Re: [PATCH v3 1/5] LoongArch: Fix usage of LSX and LASX frint/ftint instructions [PR112578]

2023-11-22 Thread chenglulu



On 2023/11/20 at 8:47 AM, Xi Ruoyao wrote:

The usage of LSX and LASX frint/ftint instructions had some problems:

1. These instructions raise FE_INEXACT, which is not allowed with
-fno-fp-int-builtin-inexact for most C2x section F.10.6 functions
(the only exceptions are rint, lrint, and llrint).
2. The "frint" instruction without explicit rounding mode is used for
roundM2, this is incorrect because roundM2 is defined "rounding
operand 1 to the *nearest* integer, rounding away from zero in the
event of a tie".  We actually don't have such an instruction.  Our
frintrne instruction is roundevenM2 (unfortunately, this is not
documented).
3. These define_insn's are written in a way not so easy to hack.

So I removed these instructions and created a "simd.md" file, then added
them and the corresponding expanders there.  The advantage of the
simd.md file is we don't need to duplicate the RTL template twice (in
lsx.md and lasx.md).
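
To make item 2 concrete, here is a small C model (not GCC code; the helper
names are made up, and it only handles values that fit in long long) of the
two tie-breaking rules: roundM2 rounds halfway cases away from zero, while
roundevenM2 (what frintrne implements) rounds them to the even neighbour:

```c
/* Round to nearest, ties away from zero (the roundM2 semantics).  */
double
round_ties_away (double x)
{
  double t = (double) (long long) x;   /* truncate toward zero */
  double frac = x - t;
  if (frac >= 0.5)
    return t + 1.0;
  if (frac <= -0.5)
    return t - 1.0;
  return t;
}

/* Round to nearest, ties to even (the roundevenM2 semantics).  */
double
round_ties_even (double x)
{
  long long ti = (long long) x;        /* truncate toward zero */
  double t = (double) ti;
  double frac = x - t;
  int odd = (int) (ti & 1);            /* 1 if the truncated value is odd */
  if (frac > 0.5 || (frac == 0.5 && odd))
    return t + 1.0;
  if (frac < -0.5 || (frac == -0.5 && odd))
    return t - 1.0;
  return t;
}
```

The two functions only disagree on exact halfway cases such as 2.5 or -2.5,
which is why using one in place of the other is easy to miss in testing.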

/* snip */

+;; fix_trunc is allowed to raise inexact exception even if
+;; -fno-fp-int-builtin-inexact.  Because the middle end tries to match
+;; (FIX x) and it does not know (FIX (UNSPEC_SIMD_FRINTRZ x)), we need
+;; to use define_insn_and_split instead of define_expand (expanders are
+;; not considered during matching).


Hi,

 I don’t quite understand this part. Is it because define_insn would be
duplicated with the above implementation, so define_insn_and_split is used?


Thanks.


+(define_insn_and_split "fix_trunc2"
+  [(set (match_operand: 0 "register_operand" "=f")
+   (fix: (match_operand:FVEC 1 "register_operand" "f")))]
+  ""
+  "#"
+  ""
+  [(const_int 0)]
+  {
+emit_insn (gen__vftintrz__ (
+  operands[0], operands[1]));
+DONE;
+  }
+  [(set_attr "type" "simd_fcvt")
+   (set_attr "mode" "")])






Re: [PATCH] gimple-vr-values:Add constraint for gimple-cond optimization

2023-11-22 Thread Andrew Pinski
On Wed, Nov 22, 2023 at 10:07 PM Feng Wang  wrote:
>
> This patch adds another condition for gimple-cond optimization. Refer to
> the following test case.
> int foo1 (int data, int res)
> {
>   res = data & 0xf;
>   res |= res << 4;
>   if (res < 0x22)
> return 0x22;
>   return res;
> }
> with the compilation flag "-march=rv64gc_zba_zbb -mabi=lp64d -O2",
> before this patch the compilation result is
> foo1:
>         andi    a0,a0,15
>         slliw   a5,a0,4
>         addw    a3,a5,a0
>         li      a4,33
>         add     a0,a5,a0
>         bleu    a3,a4,.L5
>         ret
> .L5:
>         li      a0,34
>         ret
> after this patch the compilation result is
> foo1:
>         andi    a0,a0,15
>         slliw   a5,a0,4
>         add     a5,a5,a0
>         li      a0,34
>         max     a0,a5,a0
>         ret
> The reason is that in pass_early_vrp the arg0 of the gimple_cond
> is replaced, but the PHI node still uses the original arg0.
> Some of the evrp pass logs are as follows:
>  gimple_assign 
>   gimple_assign 
>   gimple_cond 
> goto ; [INV]
>   else
> goto ; [INV]
>
>:
>   // predicted unlikely by early return (on trees) predictor.
>
>:
>   # gimple_phi <_2, 34(3), res_5(2)>
> The arg0 of gimple_cond is replaced by _9, but the gimple_phi still
> uses res_5, which prevents the PHI node from being optimized to MAX_EXPR.
> So next_use_is_phi is added to control the replacement.

I don't think this is the correct approach here.
We end up with the same original issue if we had written it like:
```
int foo1 (int data, int res)
{
  res = data & 0xf;
  unsigned int r = res;
  r*=17;
  res = r;
  if (r < 0x22)
return 0x22;
  return res;
}
```
I suspect instead we should extend the match.pd patterns to match this max.
We should be able to extend:
```
(for cmp (lt le gt ge eq ne)
 (simplify
  (cond (cmp (convert1? @1) INTEGER_CST@3) (convert2? @1) INTEGER_CST@2)
  (with
```
To match instead by changing the second @1 with @4 and then using
bitwise_equal_p . If @1 != @4 but bitwise_equal_p is true, you need to
make sure the outer convert1/convert2 are nop conversions so that you
get the same extension I think ...

Note you could instead improve minmax_replacement but I have been in
the process of moving those changes to match.pd.
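
As a sanity check on why a MAX_EXPR is the expected result: for res in
[0, 15] the low and high nibbles do not overlap, so res | (res << 4) equals
res * 17, and both variants compute max (res * 17, 0x22). A small sketch
(the function names are made up for illustration):

```c
/* Original form from the patch description.  */
int
foo1_or (int data)
{
  int res = data & 0xf;
  res |= res << 4;
  if (res < 0x22)
    return 0x22;
  return res;
}

/* Variant with the unsigned multiply, as in the example above.  */
int
foo1_mul (int data)
{
  int res = data & 0xf;
  unsigned int r = res;
  r *= 17;
  res = r;
  if (r < 0x22)
    return 0x22;
  return res;
}

/* What both variants should fold to.  */
int
foo1_max (int data)
{
  int r = (data & 0xf) * 17;
  return r > 0x22 ? r : 0x22;
}
```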

Thanks,
Andrew Pinski

>
> gcc/ChangeLog:
>
> * vr-values.cc (next_use_is_phi):
> (simplify_using_ranges::simplify_casted_compare):
> add new function next_use_is_phi to control the replacement.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/zbb-min-max-04.c: New test.
> ---
>  gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c | 14 ++
>  gcc/vr-values.cc| 15 ++-
>  2 files changed, 28 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c
>
> diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c 
> b/gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c
> new file mode 100644
> index 000..8c3e87a35e0
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64d" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
> +
> +int foo1 (int data, int res)
> +{
> +  res = data & 0xf;
> +  res |= res << 4;
> +  if (res < 0x22)
> +return 0x22;
> +  return res;
> +}
> +
> +/* { dg-final { scan-assembler-times "max\t" 1 } } */
> \ No newline at end of file
> diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
> index ecb294131b0..1f7a727c638 100644
> --- a/gcc/vr-values.cc
> +++ b/gcc/vr-values.cc
> @@ -1263,6 +1263,18 @@ simplify_using_ranges::simplify_compare_using_ranges_1 
> (tree_code _code, tr
>return happened;
>  }
>
> +/* Return true if the next use of SSA_NAME is PHI node */
> +bool
> +next_use_is_phi (tree arg)
> +{
> +  use_operand_p imm = &(SSA_NAME_IMM_USE_NODE (arg));
> +  use_operand_p next = imm->next;
> +  if (next && next->loc.stmt
> +  && (gimple_code (next->loc.stmt) == GIMPLE_PHI))
> +return true;
> +  return false;
> +}
> +
>  /* Simplify OP0 code OP1 when OP1 is a constant and OP0 was a SSA_NAME
> defined by a type conversion. Replacing OP0 with RHS of the type 
> conversion.
> Doing so makes the conversion dead which helps subsequent passes.  */
> @@ -1305,7 +1317,8 @@ simplify_using_ranges::simplify_casted_compare 
> (tree_code &, tree , tree 
>if (TREE_CODE (innerop) == SSA_NAME
>   && !POINTER_TYPE_P (TREE_TYPE (innerop))
>   && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (innerop)
> - && desired_pro_or_demotion_p (TREE_TYPE (innerop), TREE_TYPE (op0)))
> + && desired_pro_or_demotion_p (TREE_TYPE (innerop), TREE_TYPE (op0))
> +  && !next_use_is_phi (op0))
> {
>   value_range vr;
>
> --
> 2.17.1
>


[PATCH] i386: Fix AVX512 and AVX10 option issues

2023-11-22 Thread Haochen Jiang
Hi all,

This patch should be able to fix the current issue mentioned in PR112643.

Also, I fixed some legacy issues in code related to AVX512/AVX10.

Ok for trunk?

Thx,
Haochen

gcc/ChangeLog:

PR target/112643
* config/i386/driver-i386.cc (check_avx10_avx512_features):
Renamed to ...
(check_avx512_features): this and remove avx10 check.
(host_detect_local_cpu): Never append -mno-avx10.1-{256,512} to
avoid emitting warnings when building GCC with native arch.
* config/i386/i386-builtin.def (BDESC): Add missing AVX512VL for
128/256 bit builtin for AVX512VP2INTERSECT.
* config/i386/i386-options.cc (ix86_option_override_internal):
Also check whether the AVX512 flags is set when trying to reset.
* config/i386/i386.h
(PTA_SKYLAKE_AVX512): Add missing PTA_EVEX512.
(PTA_ZNVER4): Ditto.
---
 gcc/config/i386/driver-i386.cc   | 19 +--
 gcc/config/i386/i386-builtin.def |  8 
 gcc/config/i386/i386-options.cc  |  8 +---
 gcc/config/i386/i386.h   |  4 ++--
 4 files changed, 20 insertions(+), 19 deletions(-)

diff --git a/gcc/config/i386/driver-i386.cc b/gcc/config/i386/driver-i386.cc
index ae67efc49c3..204600e128a 100644
--- a/gcc/config/i386/driver-i386.cc
+++ b/gcc/config/i386/driver-i386.cc
@@ -377,15 +377,10 @@ detect_caches_intel (bool xeon_mp, unsigned max_level,
enabled and the other disabled.  Add this function to avoid push "-mno-"
options under this scenario for -march=native.  */
 
-bool check_avx10_avx512_features (__processor_model _model,
- unsigned int 
(_features2)[SIZE_OF_CPU_FEATURES],
- const enum processor_features feature)
+bool check_avx512_features (__processor_model _model,
+   unsigned int (_features2)[SIZE_OF_CPU_FEATURES],
+   const enum processor_features feature)
 {
-  if (has_feature (FEATURE_AVX512F)
-  && ((feature == FEATURE_AVX10_1_256)
- || (feature == FEATURE_AVX10_1_512)))
-return false;
-
   if (has_feature (FEATURE_AVX10_1_256)
   && ((feature == FEATURE_AVX512F)
  || (feature == FEATURE_AVX512CD)
@@ -900,8 +895,12 @@ const char *host_detect_local_cpu (int argc, const char 
**argv)
  options = concat (options, " ",
isa_names_table[i].option, NULL);
  }
-   else if (check_avx10_avx512_features (cpu_model, cpu_features2,
- isa_names_table[i].feature))
+   /* Never push -mno-avx10.1-{256,512} under -march=native to
+  avoid unnecessary warnings when building libraries.  */
+   else if ((isa_names_table[i].feature != FEATURE_AVX10_1_256)
+&& (isa_names_table[i].feature != FEATURE_AVX10_1_512)
+&& check_avx512_features (cpu_model, cpu_features2,
+  isa_names_table[i].feature))
  options = concat (options, neg_option,
isa_names_table[i].option + 2, NULL);
  }
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 19fa5c107c7..7a5f2676999 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -301,10 +301,10 @@ BDESC (OPTION_MASK_ISA_AVX512BW, 
OPTION_MASK_ISA2_EVEX512, CODE_FOR_avx512bw_sto
 /* AVX512VP2INTERSECT */
 BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT | OPTION_MASK_ISA2_EVEX512, 
CODE_FOR_nothing, "__builtin_ia32_2intersectd512", IX86_BUILTIN_2INTERSECTD512, 
UNKNOWN, (int) VOID_FTYPE_PUHI_PUHI_V16SI_V16SI)
 BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT | OPTION_MASK_ISA2_EVEX512, 
CODE_FOR_nothing, "__builtin_ia32_2intersectq512", IX86_BUILTIN_2INTERSECTQ512, 
UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V8DI_V8DI)
-BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, 
"__builtin_ia32_2intersectd256", IX86_BUILTIN_2INTERSECTD256, UNKNOWN, (int) 
VOID_FTYPE_PUQI_PUQI_V8SI_V8SI)
-BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, 
"__builtin_ia32_2intersectq256", IX86_BUILTIN_2INTERSECTQ256, UNKNOWN, (int) 
VOID_FTYPE_PUQI_PUQI_V4DI_V4DI)
-BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, 
"__builtin_ia32_2intersectd128", IX86_BUILTIN_2INTERSECTD128, UNKNOWN, (int) 
VOID_FTYPE_PUQI_PUQI_V4SI_V4SI)
-BDESC (0, OPTION_MASK_ISA2_AVX512VP2INTERSECT, CODE_FOR_nothing, 
"__builtin_ia32_2intersectq128", IX86_BUILTIN_2INTERSECTQ128, UNKNOWN, (int) 
VOID_FTYPE_PUQI_PUQI_V2DI_V2DI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512VP2INTERSECT, 
CODE_FOR_nothing, "__builtin_ia32_2intersectd256", IX86_BUILTIN_2INTERSECTD256, 
UNKNOWN, (int) VOID_FTYPE_PUQI_PUQI_V8SI_V8SI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512VP2INTERSECT, 
CODE_FOR_nothing, "__builtin_ia32_2intersectq256", IX86_BUILTIN_2INTERSECTQ256, 

[PATCH] gimple-vr-values:Add constraint for gimple-cond optimization

2023-11-22 Thread Feng Wang
This patch adds another condition for gimple-cond optimization. Refer to
the following test case.
int foo1 (int data, int res)
{
  res = data & 0xf;
  res |= res << 4;
  if (res < 0x22)
return 0x22;
  return res;
}
with the compilation flag "-march=rv64gc_zba_zbb -mabi=lp64d -O2",
before this patch the compilation result is
foo1:
        andi    a0,a0,15
        slliw   a5,a0,4
        addw    a3,a5,a0
        li      a4,33
        add     a0,a5,a0
        bleu    a3,a4,.L5
        ret
.L5:
        li      a0,34
        ret
after this patch the compilation result is
foo1:
        andi    a0,a0,15
        slliw   a5,a0,4
        add     a5,a5,a0
        li      a0,34
        max     a0,a5,a0
        ret
The reason is that in pass_early_vrp the arg0 of the gimple_cond
is replaced, but the PHI node still uses the original arg0.
Some of the evrp pass logs are as follows:
 gimple_assign 
  gimple_assign 
  gimple_cond 
goto ; [INV]
  else
goto ; [INV]

   :
  // predicted unlikely by early return (on trees) predictor.

   :
  # gimple_phi <_2, 34(3), res_5(2)>
The arg0 of gimple_cond is replaced by _9, but the gimple_phi still
uses res_5, which prevents the PHI node from being optimized to MAX_EXPR.
So next_use_is_phi is added to control the replacement.

gcc/ChangeLog:

* vr-values.cc (next_use_is_phi):
(simplify_using_ranges::simplify_casted_compare):
add new function next_use_is_phi to control the replacement.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbb-min-max-04.c: New test.
---
 gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c | 14 ++
 gcc/vr-values.cc| 15 ++-
 2 files changed, 28 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c

diff --git a/gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c 
b/gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c
new file mode 100644
index 000..8c3e87a35e0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-min-max-04.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zba_zbb -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+int foo1 (int data, int res)
+{
+  res = data & 0xf;
+  res |= res << 4;
+  if (res < 0x22)
+return 0x22;
+  return res;
+}
+
+/* { dg-final { scan-assembler-times "max\t" 1 } } */
\ No newline at end of file
diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
index ecb294131b0..1f7a727c638 100644
--- a/gcc/vr-values.cc
+++ b/gcc/vr-values.cc
@@ -1263,6 +1263,18 @@ simplify_using_ranges::simplify_compare_using_ranges_1 
(tree_code _code, tr
   return happened;
 }
 
+/* Return true if the next use of SSA_NAME is PHI node */
+bool
+next_use_is_phi (tree arg)
+{
+  use_operand_p imm = &(SSA_NAME_IMM_USE_NODE (arg));
+  use_operand_p next = imm->next;
+  if (next && next->loc.stmt
+  && (gimple_code (next->loc.stmt) == GIMPLE_PHI))
+return true;
+  return false;
+}
+
 /* Simplify OP0 code OP1 when OP1 is a constant and OP0 was a SSA_NAME
defined by a type conversion. Replacing OP0 with RHS of the type conversion.
Doing so makes the conversion dead which helps subsequent passes.  */
@@ -1305,7 +1317,8 @@ simplify_using_ranges::simplify_casted_compare (tree_code 
&, tree , tree 
   if (TREE_CODE (innerop) == SSA_NAME
  && !POINTER_TYPE_P (TREE_TYPE (innerop))
  && !SSA_NAME_OCCURS_IN_ABNORMAL_PHI (innerop)
- && desired_pro_or_demotion_p (TREE_TYPE (innerop), TREE_TYPE (op0)))
+ && desired_pro_or_demotion_p (TREE_TYPE (innerop), TREE_TYPE (op0))
+  && !next_use_is_phi (op0))
{
  value_range vr;
 
-- 
2.17.1



Re: [PATCH v2] ifcvt: Remove obsolete code for subreg handling in noce_convert_multiple_sets

2023-11-22 Thread Philipp Tomsich
Applied to master, thanks!
Philipp


On Thu, 23 Nov 2023 at 04:48, Jeff Law  wrote:
>
>
>
> On 11/21/23 11:04, Manolis Tsamis wrote:
> > This code used to handle SUBREG for register replacement when ifcvt was 
> > doing
> > the replacements manually. This special handling is not needed anymore
> > because simplify_replace_rtx is used for the replacements and it properly
> > handles these cases.
> >
> > gcc/ChangeLog:
> >
> >   * ifcvt.cc (noce_convert_multiple_sets_1): Remove old code.
> OK.
> jeff


[PATCH v2] aarch64: Add support for Ampere-1B (-mcpu=ampere1b) CPU

2023-11-22 Thread Philipp Tomsich
This patch adds initial support for Ampere-1B core.

The Ampere-1B core implements ARMv8.7 with the following (compiler
visible) extensions:
 - CSSC (Common Short Sequence Compression instructions),
 - MTE (Memory Tagging Extension)
 - SM3/SM4

gcc/ChangeLog:

* config/aarch64/aarch64-cores.def (AARCH64_CORE): Add ampere-1b
* config/aarch64/aarch64-cost-tables.h: Add ampere1b_extra_costs
* config/aarch64/aarch64-tune.md: Regenerate
* config/aarch64/aarch64.cc: Include ampere1b tuning model
* doc/invoke.texi: Document -mcpu=ampere1b
* config/aarch64/tuning_models/ampere1b.h: New file.

Signed-off-by: Philipp Tomsich 
---

Changes in v2:
- moved ampere1b model to a separate file
- regenerated aarch64-tune.md after rebase

 gcc/config/aarch64/aarch64-cores.def|   1 +
 gcc/config/aarch64/aarch64-cost-tables.h| 107 ++
 gcc/config/aarch64/aarch64-tune.md  |   2 +-
 gcc/config/aarch64/aarch64.cc   |   1 +
 gcc/config/aarch64/tuning_models/ampere1b.h | 114 
 gcc/doc/invoke.texi |   2 +-
 6 files changed, 225 insertions(+), 2 deletions(-)
 create mode 100644 gcc/config/aarch64/tuning_models/ampere1b.h

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 16752b77f4b..ad896a80f1f 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -74,6 +74,7 @@ AARCH64_CORE("thunderxt83",   thunderxt83,   thunderx,  V8A,  
(CRC, CRYPTO), thu
 /* Ampere Computing ('\xC0') cores. */
 AARCH64_CORE("ampere1", ampere1, cortexa57, V8_6A, (F16, RNG, AES, SHA3), 
ampere1, 0xC0, 0xac3, -1)
 AARCH64_CORE("ampere1a", ampere1a, cortexa57, V8_6A, (F16, RNG, AES, SHA3, 
SM4, MEMTAG), ampere1a, 0xC0, 0xac4, -1)
+AARCH64_CORE("ampere1b", ampere1b, cortexa57, V8_7A, (F16, RNG, AES, SHA3, 
SM4, MEMTAG, CSSC), ampere1b, 0xC0, 0xac5, -1)
 /* Do not swap around "emag" and "xgene1",
this order is required to handle variant correctly. */
 AARCH64_CORE("emag",emag,  xgene1,V8A,  (CRC, CRYPTO), emag, 
0x50, 0x000, 3)
diff --git a/gcc/config/aarch64/aarch64-cost-tables.h 
b/gcc/config/aarch64/aarch64-cost-tables.h
index 0cb638f3a13..4c8da7f119b 100644
--- a/gcc/config/aarch64/aarch64-cost-tables.h
+++ b/gcc/config/aarch64/aarch64-cost-tables.h
@@ -882,4 +882,111 @@ const struct cpu_cost_table ampere1a_extra_costs =
   }
 };
 
+const struct cpu_cost_table ampere1b_extra_costs =
+{
+  /* ALU */
+  {
+0, /* arith.  */
+0, /* logical.  */
+0, /* shift.  */
+COSTS_N_INSNS (1), /* shift_reg.  */
+0, /* arith_shift.  */
+COSTS_N_INSNS (1), /* arith_shift_reg.  */
+0, /* log_shift.  */
+COSTS_N_INSNS (1), /* log_shift_reg.  */
+0, /* extend.  */
+COSTS_N_INSNS (1), /* extend_arith.  */
+0, /* bfi.  */
+0, /* bfx.  */
+0, /* clz.  */
+0, /* rev.  */
+0, /* non_exec.  */
+true   /* non_exec_costs_exec.  */
+  },
+  {
+/* MULT SImode */
+{
+  COSTS_N_INSNS (2),   /* simple.  */
+  COSTS_N_INSNS (2),   /* flag_setting.  */
+  COSTS_N_INSNS (2),   /* extend.  */
+  COSTS_N_INSNS (3),   /* add.  */
+  COSTS_N_INSNS (3),   /* extend_add.  */
+  COSTS_N_INSNS (12)   /* idiv.  */
+},
+/* MULT DImode */
+{
+  COSTS_N_INSNS (2),   /* simple.  */
+  0,   /* flag_setting (N/A).  */
+  COSTS_N_INSNS (2),   /* extend.  */
+  COSTS_N_INSNS (3),   /* add.  */
+  COSTS_N_INSNS (3),   /* extend_add.  */
+  COSTS_N_INSNS (18)   /* idiv.  */
+}
+  },
+  /* LD/ST */
+  {
+COSTS_N_INSNS (2), /* load.  */
+COSTS_N_INSNS (2), /* load_sign_extend.  */
+0, /* ldrd (n/a).  */
+0, /* ldm_1st.  */
+0, /* ldm_regs_per_insn_1st.  */
+0, /* ldm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (3), /* loadf.  */
+COSTS_N_INSNS (3), /* loadd.  */
+COSTS_N_INSNS (3), /* load_unaligned.  */
+0, /* store.  */
+0, /* strd.  */
+0, /* stm_1st.  */
+0, /* stm_regs_per_insn_1st.  */
+0, /* stm_regs_per_insn_subsequent.  */
+COSTS_N_INSNS (1), /* storef.  */
+COSTS_N_INSNS (1), /* stored.  */
+COSTS_N_INSNS (1), /* store_unaligned.  */
+COSTS_N_INSNS (3), /* loadv.  */
+COSTS_N_INSNS (3)  /* storev.  */
+  },
+  {
+/* FP SFmode */
+{
+  COSTS_N_INSNS (18),  /* div.  */
+  COSTS_N_INSNS (3),   /* 

Re: [PATCH] LoongArch: Fix runtime error in a gcc build with --with-build-config=bootstrap-ubsan

2023-11-22 Thread Xi Ruoyao
On Thu, 2023-11-23 at 11:05 +0800, Guo Jie wrote:
> gcc/ChangeLog:
> 
>   * config/loongarch/loongarch.cc (loongarch_split_plus_constant):
>   avoid left shift of negative value -0x8000.

> ---
>  gcc/config/loongarch/loongarch.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/loongarch/loongarch.cc 
> b/gcc/config/loongarch/loongarch.cc
> index 33357c670e1..81cd9fa1e7c 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -4249,7 +4249,7 @@ loongarch_split_plus_constant (rtx *op, machine_mode 
> mode)
>    else if (loongarch_addu16i_imm12_operand_p (v, mode))
>  a = (v & ~HWIT_UC_0xFFF) + ((v & 0x800) << 1);
>    else if (mode == DImode && DUAL_ADDU16I_OPERAND (v))
> -    a = (v > 0 ? 0x7fff : -0x8000) << 16;
> +    a = (v > 0 ? 0x7fff0000 : ~0x7fffffff);

LGTM.

"-0x8000 << 16" is allowed by C++20 [it allows x << 16 as long as x * (1
<< 16) does not overflow], but not C++11.

Unfortunately when I wrote the code I just used the C++20 specification
as a reference...  Thanks for the correction.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University
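
As an aside on the shift discussed above, here is a minimal sketch, assuming only standard C++11, of how the two intended constants (0x7fff times 2^16 and -0x8000 times 2^16) can be written without left-shifting a negative value. The helper names are hypothetical and are not taken from loongarch.cc.

```cpp
#include <cstdint>

// Hypothetical helper: the intended high part, computed by multiplication so
// that no negative value is ever left-shifted.
constexpr std::int64_t high_part(std::int64_t v) {
  return v > 0 ? static_cast<std::int64_t>(0x7fff) * 65536
               : static_cast<std::int64_t>(-0x8000) * 65536;
}

// The same two constants spelled as plain literals: no shift, no multiply.
constexpr std::int64_t high_part_literal(std::int64_t v) {
  return v > 0 ? 0x7fff0000 : ~0x7fffffff;
}

// Compile-time checks that both spellings agree.
static_assert(high_part(1) == high_part_literal(1), "positive case agrees");
static_assert(high_part(-1) == high_part_literal(-1), "negative case agrees");
static_assert(high_part_literal(-1) == -2147483648LL, "equals -0x8000 * 2^16");
```

Both spellings are well defined in every C++ revision, so a bootstrap-ubsan build has nothing to flag here.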


Re: [PATCH v2] LoongArch: Optimize the loading of immediate numbers with the same high and low 32-bit values

2023-11-22 Thread Xi Ruoyao
On Thu, 2023-11-23 at 11:04 +0800, Guo Jie wrote:
> For the following immediate load operation in 
> gcc/testsuite/gcc.target/loongarch/imm-load1.c:
> 
>   long long r = 0x0101010101010101;
> 
> Before this patch:
> 
>   lu12i.w     $r15,16842752>>12
>   ori     $r15,$r15,257
>   lu32i.d     $r15,0x10101>>32
>   lu52i.d     $r15,$r15,0x100>>52
> 
> After this patch:
> 
>   lu12i.w $r15,16842752>>12
>   ori $r15,$r15,257
>   bstrins.d   $r15,$r15,63,32
> 
> gcc/ChangeLog:
> 
>   * config/loongarch/loongarch.cc
>   (enum loongarch_load_imm_method): Add new method.
>   (loongarch_build_integer): Add relevant implementations for
>   new method.
>   (loongarch_move_integer): Ditto.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/loongarch/imm-load1.c: Change old check.
> 
> ---
> Update in v2:
>   1. Correct the format of ChangeLog.
>   2. Avoid left shift of negative value in loongarch_build_integer.

LGTM.

> 
> ---
>  gcc/config/loongarch/loongarch.cc | 22 ++-
>  .../gcc.target/loongarch/imm-load1.c  |  3 ++-
>  2 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/loongarch/loongarch.cc 
> b/gcc/config/loongarch/loongarch.cc
> index d05743bec87..f95507e2348 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -142,12 +142,16 @@ struct loongarch_address_info
>  
>     METHOD_LU52I:
>   Load 52-63 bit of the immediate number.
> +
> +   METHOD_MIRROR:
> + Copy 0-31 bit of the immediate number to 32-63bit.
>  */
>  enum loongarch_load_imm_method
>  {
>    METHOD_NORMAL,
>    METHOD_LU32I,
> -  METHOD_LU52I
> +  METHOD_LU52I,
> +  METHOD_MIRROR
>  };
>  
>  struct loongarch_integer_op
> @@ -1556,11 +1560,23 @@ loongarch_build_integer (struct loongarch_integer_op 
> *codes,
>  
>    int sign31 = (value & (HOST_WIDE_INT_1U << 31)) >> 31;
>    int sign51 = (value & (HOST_WIDE_INT_1U << 51)) >> 51;
> +
> +  uint32_t hival = (uint32_t) (value >> 32);
> +  uint32_t loval = (uint32_t) value;
> +
>    /* Determine whether the upper 32 bits are sign-extended from the lower
>    32 bits. If it is, the instructions to load the high order can be
>    ommitted.  */
>    if (lu32i[sign31] && lu52i[sign31])
>   return cost;
> +  /* If the lower 32 bits are the same as the upper 32 bits, just copy
> +  the lower 32 bits to the upper 32 bits.  */
> +  else if (loval == hival)
> + {
> +   codes[cost].method = METHOD_MIRROR;
> +   codes[cost].curr_value = value;
> +   return cost + 1;
> + }
>    /* Determine whether bits 32-51 are sign-extended from the lower 32
>    bits. If so, directly load 52-63 bits.  */
>    else if (lu32i[sign31])
> @@ -3230,6 +3246,10 @@ loongarch_move_integer (rtx temp, rtx dest, unsigned 
> HOST_WIDE_INT value)
>      gen_rtx_AND (DImode, x, GEN_INT (0xf)),
>      GEN_INT (codes[i].value));
>     break;
> + case METHOD_MIRROR:
> +   gcc_assert (mode == DImode);
> +   emit_insn (gen_insvdi (x, GEN_INT (32), GEN_INT (32), x));
> +   break;
>   default:
>     gcc_unreachable ();
>   }
> diff --git a/gcc/testsuite/gcc.target/loongarch/imm-load1.c 
> b/gcc/testsuite/gcc.target/loongarch/imm-load1.c
> index 2ff02971239..f64cc2956a3 100644
> --- a/gcc/testsuite/gcc.target/loongarch/imm-load1.c
> +++ b/gcc/testsuite/gcc.target/loongarch/imm-load1.c
> @@ -1,6 +1,7 @@
>  /* { dg-do compile } */
>  /* { dg-options "-mabi=lp64d -O2" } */
> -/* { dg-final { scan-assembler "test:.*lu52i\.d.*\n\taddi\.w.*\n\.L2:" } } */
> +/* { dg-final { scan-assembler-not "test:.*lu52i\.d.*\n\taddi\.w.*\n\.L2:" } 
> } */
> +/* { dg-final { scan-assembler "test:.*lu12i\.w.*\n\tbstrins\.d.*\n\.L2:" } 
> } */
>  
>  
>  extern long long b[10];

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[PATCH] LoongArch: Fix runtime error in a gcc build with --with-build-config=bootstrap-ubsan

2023-11-22 Thread Guo Jie
gcc/ChangeLog:

* config/loongarch/loongarch.cc (loongarch_split_plus_constant):
avoid left shift of negative value -0x8000.

---
 gcc/config/loongarch/loongarch.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index 33357c670e1..81cd9fa1e7c 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -4249,7 +4249,7 @@ loongarch_split_plus_constant (rtx *op, machine_mode mode)
   else if (loongarch_addu16i_imm12_operand_p (v, mode))
 a = (v & ~HWIT_UC_0xFFF) + ((v & 0x800) << 1);
   else if (mode == DImode && DUAL_ADDU16I_OPERAND (v))
-a = (v > 0 ? 0x7fff : -0x8000) << 16;
+a = (v > 0 ? 0x7fff0000 : ~0x7fffffff);
   else
 gcc_unreachable ();
 
-- 
2.20.1



[PATCH v2] LoongArch: Optimize the loading of immediate numbers with the same high and low 32-bit values

2023-11-22 Thread Guo Jie
For the following immediate load operation in 
gcc/testsuite/gcc.target/loongarch/imm-load1.c:

long long r = 0x0101010101010101;

Before this patch:

lu12i.w $r15,16842752>>12
ori $r15,$r15,257
lu32i.d $r15,0x10101>>32
lu52i.d $r15,$r15,0x100>>52

After this patch:

lu12i.w $r15,16842752>>12
ori $r15,$r15,257
bstrins.d   $r15,$r15,63,32

gcc/ChangeLog:

* config/loongarch/loongarch.cc
(enum loongarch_load_imm_method): Add new method.
(loongarch_build_integer): Add relevant implementations for
new method.
(loongarch_move_integer): Ditto.

gcc/testsuite/ChangeLog:

* gcc.target/loongarch/imm-load1.c: Change old check.

---
Update in v2:
1. Correct the format of ChangeLog.
2. Avoid left shift of negative value in loongarch_build_integer.

---
 gcc/config/loongarch/loongarch.cc | 22 ++-
 .../gcc.target/loongarch/imm-load1.c  |  3 ++-
 2 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/gcc/config/loongarch/loongarch.cc 
b/gcc/config/loongarch/loongarch.cc
index d05743bec87..f95507e2348 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -142,12 +142,16 @@ struct loongarch_address_info
 
METHOD_LU52I:
  Load 52-63 bit of the immediate number.
+
+   METHOD_MIRROR:
+ Copy 0-31 bit of the immediate number to 32-63bit.
 */
 enum loongarch_load_imm_method
 {
   METHOD_NORMAL,
   METHOD_LU32I,
-  METHOD_LU52I
+  METHOD_LU52I,
+  METHOD_MIRROR
 };
 
 struct loongarch_integer_op
@@ -1556,11 +1560,23 @@ loongarch_build_integer (struct loongarch_integer_op 
*codes,
 
   int sign31 = (value & (HOST_WIDE_INT_1U << 31)) >> 31;
   int sign51 = (value & (HOST_WIDE_INT_1U << 51)) >> 51;
+
+  uint32_t hival = (uint32_t) (value >> 32);
+  uint32_t loval = (uint32_t) value;
+
   /* Determine whether the upper 32 bits are sign-extended from the lower
 32 bits. If it is, the instructions to load the high order can be
 ommitted.  */
   if (lu32i[sign31] && lu52i[sign31])
return cost;
+  /* If the lower 32 bits are the same as the upper 32 bits, just copy
+the lower 32 bits to the upper 32 bits.  */
+  else if (loval == hival)
+   {
+ codes[cost].method = METHOD_MIRROR;
+ codes[cost].curr_value = value;
+ return cost + 1;
+   }
   /* Determine whether bits 32-51 are sign-extended from the lower 32
 bits. If so, directly load 52-63 bits.  */
   else if (lu32i[sign31])
@@ -3230,6 +3246,10 @@ loongarch_move_integer (rtx temp, rtx dest, unsigned 
HOST_WIDE_INT value)
   gen_rtx_AND (DImode, x, GEN_INT (0xf)),
   GEN_INT (codes[i].value));
  break;
+   case METHOD_MIRROR:
+ gcc_assert (mode == DImode);
+ emit_insn (gen_insvdi (x, GEN_INT (32), GEN_INT (32), x));
+ break;
default:
  gcc_unreachable ();
}
diff --git a/gcc/testsuite/gcc.target/loongarch/imm-load1.c 
b/gcc/testsuite/gcc.target/loongarch/imm-load1.c
index 2ff02971239..f64cc2956a3 100644
--- a/gcc/testsuite/gcc.target/loongarch/imm-load1.c
+++ b/gcc/testsuite/gcc.target/loongarch/imm-load1.c
@@ -1,6 +1,7 @@
 /* { dg-do compile } */
 /* { dg-options "-mabi=lp64d -O2" } */
-/* { dg-final { scan-assembler "test:.*lu52i\.d.*\n\taddi\.w.*\n\.L2:" } } */
+/* { dg-final { scan-assembler-not "test:.*lu52i\.d.*\n\taddi\.w.*\n\.L2:" } } 
*/
+/* { dg-final { scan-assembler "test:.*lu12i\.w.*\n\tbstrins\.d.*\n\.L2:" } } 
*/
 
 
 extern long long b[10];
-- 
2.36.0
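
For readers following the METHOD_MIRROR idea, a minimal sketch of the two pieces of logic, assuming that the bit-field insert copies bits 31..0 of the source into bits 63..32 of the destination. The function names are hypothetical and this models values only, not RTL.

```cpp
#include <cstdint>

// True when the upper and lower 32-bit halves of a 64-bit immediate match,
// mirroring the "loval == hival" test in the patch.
constexpr bool halves_match(std::uint64_t value) {
  return static_cast<std::uint32_t>(value >> 32)
         == static_cast<std::uint32_t>(value);
}

// Value-level model of the final insert step: keep bits 31..0 and copy them
// into bits 63..32, discarding whatever the upper half held before.
constexpr std::uint64_t mirror_low32(std::uint64_t x) {
  return (x << 32) | (x & 0xffffffffu);
}

static_assert(halves_match(0x0101010101010101ULL), "example from the patch");
static_assert(!halves_match(0x0000000101010101ULL), "halves differ");
static_assert(mirror_low32(0x01010101ULL) == 0x0101010101010101ULL,
              "low half copied into high half");
```

Because the insert overwrites the whole upper half, it does not matter whether the earlier instructions sign-extended the low 32 bits.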



Re: PING^1 [PATCH v3] sched: Change no_real_insns_p to no_real_nondebug_insns_p [PR108273]

2023-11-22 Thread Kewen.Lin
on 2023/11/22 18:25, Richard Biener wrote:
> On Wed, Nov 22, 2023 at 10:31 AM Kewen.Lin  wrote:
>>
>> on 2023/11/17 20:55, Alexander Monakov wrote:
>>>
>>> On Fri, 17 Nov 2023, Kewen.Lin wrote:
>>>>> I don't think you can run cleanup_cfg after sched_init. I would suggest
>>>>> to put it early in schedule_insns.
>>>>
>>>> Thanks for the suggestion, I placed it at the beginning of haifa_sched_init
>>>> instead, since schedule_insns invokes haifa_sched_init, although the
>>>> calls rgn_setup_common_sched_info and rgn_setup_sched_infos are executed
>>>> ahead but they are all "setup" functions, shouldn't affect or be affected
>>>> by this placement.
>>>
>>> I was worried because sched_init invokes df_analyze, and I'm not sure if
>>> cfg_cleanup can invalidate it.
>>
>> Thanks for further explaining!  By scanning cleanup_cfg, it seems that it
>> considers df, like compact_blocks checks df, try_optimize_cfg invokes
>> df_analyze etc., but I agree that moving cleanup_cfg before sched_init
>> makes more sense.
>>
>>>
>>>>> I suspect this may be caused by invoking cleanup_cfg too late.
>>>>
>>>> By looking into some failures, I found that although cleanup_cfg is executed
>>>> there would be still some empty blocks left, by analyzing a few failures there
>>>> are at least such cases:
>>>>   1. empty function body
>>>>   2. block holding a label for return.
>>>>   3. block without any successor.
>>>>   4. block which becomes empty after scheduling some other block.
>>>>   5. block which looks mergeable with its always successor but left.
>>>>   ...
>>>>
>>>> For 1,2, there is one single successor EXIT block, I think they don't affect
>>>> state transition, for 3, it's the same.  For 4, it depends on if we can have
>>>> the assumption this kind of empty block doesn't have the chance to have debug
>>>> insn (like associated debug insn should be moved along), I'm not sure.  For 5,
>>>> a reduced test case is:
>>>
>>> Oh, I should have thought of cases like these, really sorry about the slip
>>> of attention, and thanks for showing a testcase for item 5. As Richard is
>>> saying in his response, cfg_cleanup cannot be a fix here. The thing to check
>>> would be changing no_real_insns_p to always return false, and see if the
>>> situation looks recoverable (if it breaks bootstrap, regtest statistics of
>>> a non-bootstrapped compiler are still informative).
>>
>> As you suggested, I forced no_real_insns_p to return false all the time, some
>> issues got exposed, almost all of them are asserting NOTE_P insn shouldn't be
>> encountered in those places, so the adjustments for most of them are just to
>> consider NOTE_P or this kind of special block and so on.  One draft patch is
>> attached, it can be bootstrapped and regress-tested on ppc64{,le} and x86.
>> btw, it's without the previous cfg_cleanup adjustment (hope it can get more
>> empty blocks and expose more issues).  The draft isn't qualified for code
>> review but I hope it can provide some information on what kinds of changes
>> are needed for the proposal.  If this is the direction which we all agree on,
>> I'll further refine it and post a formal patch.  One thing I want to note is
>> that this patch disables one assertion below:
>>
>> diff --git a/gcc/sched-rgn.cc b/gcc/sched-rgn.cc
>> index e5964f54ead..abd334864fb 100644
>> --- a/gcc/sched-rgn.cc
>> +++ b/gcc/sched-rgn.cc
>> @@ -3219,7 +3219,7 @@ schedule_region (int rgn)
>>  }
>>
>>/* Sanity check: verify that all region insns were scheduled.  */
>> -  gcc_assert (sched_rgn_n_insns == rgn_n_insns);
>> +  // gcc_assert (sched_rgn_n_insns == rgn_n_insns);
>>
>>sched_finish_ready_list ();
>>
>> Some cases can cause this assertion to fail, due to a mismatch between the
>> to-be-scheduled and scheduled insn counts.  The reason is that a block which
>> previously had only one INSN_P gets that insn moved while some other block
>> is being scheduled, so we end up with an empty block whose only NOTE_P insn
>> is then counted; but since this block wasn't empty initially and NOTE_P
>> gets skipped in a normal block, the to-be-scheduled count doesn't include
>> it.  It can be fixed by special-casing this kind of block for counting, e.g.
>> initially recording which blocks are empty and, if a block wasn't recorded
>> before, fixing up the count for it accordingly.  I'm not sure if someone may
>> argue that all this complication makes this proposal inferior to the
>> previous approach of special-casing debug insns; looking forward to more
>> comments.
> 
> Just a comment that the NOTE_P thing is odd - do we only ever have those for
> otherwise empty BBs?  How are they skipped otherwise (and why does that not
> work for otherwise empty BBs)?

Yes, previously (bypassing empty BBs) there is no chance to encounter NOTE_P
when scheduling insns, as for notes in normal BBs, when setting up the head
and tail, some are skipped (like get_ebb_head_tail), and there are also 

Re: Re: [PATCH 0/5] Add support for operand-specific alignment requirements

2023-11-22 Thread 钟居哲
Hi, Richard.

Currently, define_mode_attr can only map a mode to a single attribute value.
I wonder whether we could map a mode to multiple values?

E.g. (define_mode_attr dest_constraint [(V16QI "")])

But I want it to be:

(define_mode_attr dest_constraint [(V16QI (TARGET_MIN_VLEN <= 128 "vr") 
(TARGET_MIN_VLEN > 128 "")) ])

It seems that we can't achieve this for now. Would it be possible to extend it
in GCC-15?


juzhe.zh...@rivai.ai
 
From: Richard Sandiford
Date: 2023-11-22 18:08
To: juzhe.zhong@rivai.ai
CC: gcc-patches; vmakarov@redhat.com; kito.cheng
Subject: Re: [PATCH 0/5] Add support for operand-specific alignment requirements
"juzhe.zh...@rivai.ai"  writes:
> Hi, Richard.
>
> Thanks for supporting register filter in IRA/LRA.
> I found it useful for RVV, since we have a set of widening operations that
> allow the source register to overlap the high part of the dest register group.
>
> For example, in vsext.vf2 v0, v1 the dest consumes v0 and v1, while the
> source consumes v1 only.
> I want to support the high-part overlap above. (Currently, we don't allow any
> overlap between source and dest in such instructions.)
>
> So, I wonder whether we can pass "machine_mode" into the register filter. OK, I
> think it's too late since stage 1 has closed. I wonder whether we can add it in
> GCC-15?
 
I think adding a mode would add too much overhead.  The mode would be
the mode of the operand, but with subregs, the mode of the operand can
be different from the mode of the RA allocno.  So it would no longer
be enough for the RA to calculate a bitmask of filters.  It would need
to remember which modes are used with those filters.
 
We'd also need to turn the current HARD_REG_SETs into [MAX_MACHINE_MODE]
arrays of HARD_REG_SETs.  (And there are now more than 256 machine modes
for riscv.)
 
The pattern that uses the constraints should already "know" the mode.
So if possible, I think it would be better to use different constraints
for different modes, using define_mode_attrs.
 
Thanks,
Richard
 


[PATCH] c++: alias template of non-template class [PR112633]

2023-11-22 Thread Patrick Palka
Bootstrapped and regtested on x86-64-pc-linux-gnu, does this look OK for
trunk/13?

-- >8 --

The entering_scope adjustment in tsubst_aggr_type assumes if an alias is
dependent, then so is the aliased type (and therefore it has template info)
but that's not true for the dependent alias template specialization ty1
below which aliases the non-template class A.

PR c++/112633

gcc/cp/ChangeLog:

* pt.cc (tsubst_aggr_type): Handle empty TYPE_TEMPLATE_INFO
in the entering_scope adjustment.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-75.C: New test.
---
 gcc/cp/pt.cc   |  1 +
 gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C | 13 +
 2 files changed, 14 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index ed681afb5d4..68ce4a87372 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -13976,6 +13976,7 @@ tsubst_aggr_type (tree t,
   if (entering_scope
  && CLASS_TYPE_P (t)
  && dependent_type_p (t)
+ && TYPE_TEMPLATE_INFO (t)
  && TYPE_CANONICAL (t) == TREE_TYPE (TYPE_TI_TEMPLATE (t)))
t = TYPE_CANONICAL (t);
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C 
b/gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C
new file mode 100644
index 00000000000..1a73a99856e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C
@@ -0,0 +1,13 @@
+// PR c++/112633
+// { dg-do compile { target c++11 } }
+
+struct A { using type = void; };
+
+template<class T>
+using ty1 = A;
+
+template<class T>
+using ty2 = typename ty1<T>::type;
+
+template<class T>
+ty2<T> f();
-- 
2.43.0.rc1



Re: [PATCH] AArch64/testsuite: Use non-capturing parentheses with ccmp_1.c

2023-11-22 Thread Richard Earnshaw (lists)
On 22/11/2023 15:21, Maciej W. Rozycki wrote:
> Use non-capturing parentheses for the subexpressions used with 
> `scan-assembler-times', to avoid a quirk with double-counting.
> 
>   gcc/testsuite/
>   * gcc.target/aarch64/ccmp_1.c: Use non-capturing parentheses 
>   with `scan-assembler-times'.

OK

R.

> ---
> Hi,
> 
>  Here's another one.  I realised my original regexp used to grep the tree 
> for `scan-assembler-times' with subexpressions was too strict and with an 
> updated pattern I found this second test case that does regress once the 
> `scan-assembler-times' double-counting quirk has been fixed.
> 
>  As with the ARM change we don't need capturing parentheses here, usually 
> used for back references, so let's just avoid the double-counting quirk 
> altogether and make our matching here work whether the quirk has been 
> fixed or not.
> 
>  Verified for the `aarch64-linux-gnu' target with the quirk fix submitted 
> as  
> and the aarch64.exp subset of the C language test suite.  OK to apply?
> 
>   Maciej
> ---
>  gcc/testsuite/gcc.target/aarch64/ccmp_1.c |4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> gcc-aarch64-test-ccmp_1-non-capturing.diff
> Index: gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
> ===================================================================
> --- gcc.orig/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
> +++ gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
> @@ -86,8 +86,8 @@ f13 (int a, int b)
>  /* { dg-final { scan-assembler "cmp\t(.)+35" } } */
>  
>  /* { dg-final { scan-assembler-times "\tcmp\tw\[0-9\]+, 0" 4 } } */
> -/* { dg-final { scan-assembler-times "fcmpe\t(.)+0\\.0" 2 } } */
> -/* { dg-final { scan-assembler-times "fcmp\t(.)+0\\.0" 2 } } */
> +/* { dg-final { scan-assembler-times "fcmpe\t(?:.)+0\\.0" 1 } } */
> +/* { dg-final { scan-assembler-times "fcmp\t(?:.)+0\\.0" 1 } } */
>  
>  /* { dg-final { scan-assembler "adds\t" } } */
>  /* { dg-final { scan-assembler-times "\tccmp\t" 11 } } */
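
To illustrate why capturing parentheses can inflate a match count, here is a minimal sketch using std::regex. It assumes the count is taken as the length of a flattened list that holds each whole match followed by its sub-matches; that is an assumption about the quirk described above, not a copy of the DejaGnu code, and the assembly text is made up.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Count entries the way a flattened "whole match plus sub-matches" list
// would: every match contributes 1 + (number of capture groups).
std::size_t flattened_count(const std::string& text, const std::string& pattern) {
  const std::regex re(pattern);
  std::size_t entries = 0;
  for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
    entries += it->size();  // size() is the group count plus one
  return entries;
}
```

With two matching lines, the pattern `fcmp\t(.)+0\.0` yields four entries under this model, while `fcmp\t(?:.)+0\.0` yields two, which is why the non-capturing form gives the same count whether or not the quirk is present.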



Re: [RFA] New pass for sign/zero extension elimination

2023-11-22 Thread Jeff Law




On 11/20/23 11:56, Dimitar Dimitrov wrote:

On Sun, Nov 19, 2023 at 05:47:56PM -0700, Jeff Law wrote:
...

+/* Process uses in INSN.  Set appropriate bits in LIVENOW for any chunks of
+   pseudos that become live, potentially filtering using bits from LIVE_TMP.
+
+   If MODIFIED is true, then optimize sign/zero extensions to SUBREGs when
+   the extended bits are never read and mark pseudos which had extensions
+   eliminated in CHANGED_PSEUDOS.  */
+
+static void
+ext_dce_process_uses (rtx insn, bitmap livenow, bitmap live_tmp,
+ bool modify, bitmap changed_pseudos)
+{
+  /* A nonlocal goto implicitly uses the frame pointer.  */
+  if (JUMP_P (insn) && find_reg_note (insn, REG_NON_LOCAL_GOTO, NULL_RTX))
+{
+  bitmap_set_range (livenow, FRAME_POINTER_REGNUM * 4, 4);
+  if (!HARD_FRAME_POINTER_IS_FRAME_POINTER)
+   bitmap_set_range (livenow, HARD_FRAME_POINTER_REGNUM * 4, 4);
+}
+
+  subrtx_var_iterator::array_type array_var;
+  rtx pat = PATTERN (insn);
+  FOR_EACH_SUBRTX_VAR (iter, array_var, pat, NONCONST)
+{
+  /* An EXPR_LIST (from call fusage) ends in NULL_RTX.  */
+  rtx x = *iter;
+  if (x == NULL_RTX)
+   continue;
+
+  /* So the basic idea in this FOR_EACH_SUBRTX_VAR loop is to
+handle SETs explicitly, possibly propagating live information
+into the uses.
+
+We may continue the loop at various points which will cause
+iteration into the next level of RTL.  Breaking from the loop
+is never safe as it can lead us to fail to process some of the
+RTL and thus not make objects live when necessary.  */
+  enum rtx_code xcode = GET_CODE (x);
+  if (xcode == SET)
+   {
+ const_rtx dst = SET_DEST (x);
+ rtx src = SET_SRC (x);
+ const_rtx y;
+ unsigned HOST_WIDE_INT bit = 0;
+
+ /* The code of the RHS of a SET.  */
+ enum rtx_code code = GET_CODE (src);
+
+ /* ?!? How much of this should mirror SET handling, potentially
+being shared?   */
+ if (SUBREG_BYTE (dst).is_constant () && SUBREG_P (dst))


Shouldn't SUBREG_P be checked first like:
  if (SUBREG_P (dst) && SUBREG_BYTE (dst).is_constant ())

Yes, absolutely. It'll be fixed in the next update.

This also highlighted that I never added pru-elf to the configurations 
in my tester.  I remember thinking that it needed to be added, but 
obviously that mental TODO got lost.  I've just fixed that.


jeff
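
The ordering point above is the usual "check the kind before using a kind-specific accessor" rule, relying on short-circuit evaluation of `&&`. A minimal analogy, assuming C++17 and using std::variant in place of RTL; the types and function below are hypothetical.

```cpp
#include <variant>

// Stand-ins for two operand kinds; only Subreg carries a byte offset.
struct Reg { unsigned regno; };
struct Subreg { unsigned regno; unsigned byte; };
using Operand = std::variant<Reg, Subreg>;

// The kind test comes first, so std::get<Subreg> is only evaluated when the
// operand really is a Subreg. With the operands swapped, std::get would throw
// std::bad_variant_access for a Reg.
bool is_subreg_at_byte_zero(const Operand& op) {
  return std::holds_alternative<Subreg>(op) && std::get<Subreg>(op).byte == 0;
}
```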



Re: [PATCH] c++, v4: Implement C++26 P2741R3 - user-generated static_assert messages [PR110348]

2023-11-22 Thread Jason Merrill

On 11/22/23 05:00, Jakub Jelinek wrote:

On Tue, Nov 21, 2023 at 10:51:36PM -0500, Jason Merrill wrote:

Actually, let's go back to the previous message, but change the tf_nones
above to 'complain' so that we see those errors and then this explanation.
Likewise with the conversion checks later in the function.


So like this?
Besides what you asked for I've separated the diagnostics for when size
member isn't found in lookup vs. when data isn't found, because it looked
weird to get 2 same errors e.g. in the udlit-error1.C case.

+  message_sz
+   = finish_class_member_access_expr (message,
+  get_identifier ("size"),
+  false, complain);
+  if (message_sz == error_mark_node)
+   {
+ error_at (location, "%<static_assert%> message must be a string "
+ "literal or object with %<size%> and "
+ "%<data%> members");
+ return;
+   }
+  message_data
+   = finish_class_member_access_expr (message,
+  get_identifier ("data"),
+  false, complain);
+  if (message_data == error_mark_node)
+   {
+ error_at (location, "%<static_assert%> message must be a string "
+ "literal or object with %<size%> and "
+ "%<data%> members");
+ return;
+   }


I agree it's weird to get two of the same error, but maybe instead of 
duplicating the error, we could look up data only if size succeeded, and 
then error once if either failed?


OK with that change.

Jason
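
The control flow suggested above (look up data only if size succeeded, then report once if either failed) can be sketched as follows. This is a standalone model, not front-end code: the set of member names stands in for the class being searched, and every name below is hypothetical.

```cpp
#include <set>
#include <string>

struct LookupResult {
  bool ok;      // both members were found
  int lookups;  // how many member lookups were performed
  int errors;   // how many diagnostics were issued
};

LookupResult check_message_members(const std::set<std::string>& members) {
  LookupResult r{false, 0, 0};
  auto lookup = [&](const char* name) {
    ++r.lookups;
    return members.count(name) != 0;
  };

  const bool have_size = lookup("size");
  // Short-circuit: "data" is only looked up when "size" was found.
  const bool have_data = have_size && lookup("data");

  if (!have_size || !have_data) {
    ++r.errors;  // a single diagnostic covers both failure modes
    return r;
  }
  r.ok = true;
  return r;
}
```

A class with neither member triggers one lookup and one error, and a class with only size triggers two lookups and still one error.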



Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Kito Cheng
I am less worried about thead vector combined with other zv extensions;
instead, we should reject those combinations altogether.

My reason is that thead vector is a transitional product: there won't be
further new products with it for much longer.  It is also, in theory, not
compatible with any other zv extension: a zv extension requires at least
zve32x, which is a subset of v1p0, and I don't think it's valid to use thead
vector as a replacement for the required extension - it should just introduce
another thead vector extension instead.



On Thu, Nov 23, 2023 at 06:27, Jeff Law wrote:

>
>
> On 11/22/23 07:24, Christoph Müllner wrote:
> > On Wed, Nov 22, 2023 at 2:52 PM 钟居哲  wrote:
> >>
> >> I am totally OK with approving theadvector for GCC-14 before stage 3
> >> closes, as long as it doesn't touch the current RVV code too much and
> >> binutils supports theadvector.
> >>
> >> I have provided the draft approach:
> >> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
> >> which turns out not to need any changes to vector.md.
> >> I strongly suggest following this draft. I can actively review
> >> theadvector during stage 3,
> >> and hopefully can help you land theadvector in GCC-14.
> >
> > I see now two approaches:
> > 1) Let GCC emit RVV instructions for XTheadVector for instructions
> > that are in both
> > 2) Use the ASM_OUTPUT_OPCODE hook to output "th." for these instructions
> >
> > No doubt, the ASM_OUTPUT_OPCODE hook approach is better than our
> > format-string approach, but would 1) not be the even better
> > solution? It would also mean, that not a single test case is required
> > for these overlapping instructions (only a few tests that ensure that
> > we don't emit RVV instructions that are not available in
> > XTheadVector). Besides that, letting GCC emit RVV instructions for
> > XTheadVector is a very clever idea, because it fully utilizes the
> > fact that both extensions overlap to a huge degree.
> >
> > The ASM_OUTPUT_OPCODE approach could lead to an issue if we enable
> > XTheadVector
> > with any other vector extension, say Zvfoo. In this case the Zvfoo
> > instructions will all be prefixed as well with "th.". I know that it
> > is not likely to run into this problem (such a machine does not exist
> > in real hardware), but it is possible to trigger this issue easily
> > and approach 1) would not have this potential issue.
> I'm not a big fan of the ASM_OUTPUT_OPCODE approach.  While it is
> simple, I worry a bit about it from a long term maintenance standpoint.
> As you note we could well end up at some point with an extension that
> has a mnemonic starting with "v" that would blow up.  But I certainly
> see the appeal of such a simple test to support thead vector.
>
> Given there are at least 3 approaches that can fix that problem (%^,
> assembler dialect or ASM_OUTPUT_OPCODE), maybe we could set that
> discussion aside in the immediate term and see if there are other issues
> that are potentially more substantial.
>
>
>
>
> --
>
>
>
> More generally, I think I need to soften my prior statement about
> deferring this to gcc-15.  This code was submitted in time for the
> gcc-14 deadline, so it should be evaluated just like we do anything else
> that makes the deadline.  There are various criteria we use to evaluate
> if something should get integrated and we should just work through this
> series like we always do and not treat it specially in any way.
>
>
> jeff
>


Adjust 'libgomp.c/declare-variant-{3,4}-[...]' for inter-procedural value range propagation (was: Propagate value ranges of return values)

2023-11-22 Thread Thomas Schwinge
Hi!

On 2023-11-19T16:05:42+0100, Jan Hubicka  wrote:
> this is an updated version which also adds the testsuite compensation
> I lost earlier while maintaining the patch in my testing tree.
> There are quite a few testcases that use constant return values to hide
> something from the optimizer.

One more: commit a53da3a213ee00866d132c228a0e89bd2f61d65c
"Adjust 'libgomp.c/declare-variant-{3,4}-[...]' for inter-procedural value 
range propagation"
pushed to master branch, see attached.  (Those regressions are only
visible in GCC offloading configurations.)  (And actually, all those test
cases have other issues; will install further patches later on.)

Jakub, Tobias, please let me know if it's not expected that *all* the
"variant" functions have to be tagged '__attribute__ ((noipa))' (as I've
done); just tagging the "dispatcher" function 'f' isn't sufficient.


Regards
 Thomas


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company; Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Commercial Register 
München, HRB 106955
From a53da3a213ee00866d132c228a0e89bd2f61d65c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 21 Nov 2023 22:42:49 +0100
Subject: [PATCH] Adjust 'libgomp.c/declare-variant-{3,4}-[...]' for
 inter-procedural value range propagation

..., that is, commit 53ba8d669550d3a1f809048428b97ca607f95cf5
"inter-procedural value range propagation", after which we see:

[-PASS:-]{+FAIL:+} libgomp.c/declare-variant-3-sm30.c scan-nvptx-none-offload-tree-dump optimized "= f30 \\(\\);"

Etc.  That's due to:

@@ -144,13 +144,11 @@
 __attribute__((omp target entrypoint, noclone))
 void main._omp_fn.0 (const struct .omp_data_t.3 & restrict .omp_data_i)
 {
-  int _3;
   int * _5;

[local count: 1073741824]:
-  _3 = f30 ();
   _5 = *.omp_data_i_4(D).v;
-  *_5 = _3;
+  *_5 = 30;
   return;

It's nice to see this optimization work here, too, but it does interfere with
how we're currently testing OpenMP 'declare variant'.

	libgomp/
	* testsuite/libgomp.c/declare-variant-3.h (f30, f35, f53, f70)
	(f75, f80, f): Add '__attribute__ ((noipa))'.
	* testsuite/libgomp.c/declare-variant-4.h (gfx803, gfx900, gfx906)
	(gfx908, gfx90a, f): Likewise.
---
 libgomp/testsuite/libgomp.c/declare-variant-3.h | 8 
 libgomp/testsuite/libgomp.c/declare-variant-4.h | 7 +++
 2 files changed, 15 insertions(+)

diff --git a/libgomp/testsuite/libgomp.c/declare-variant-3.h b/libgomp/testsuite/libgomp.c/declare-variant-3.h
index 772fc20a519..646e15e5311 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-3.h
+++ b/libgomp/testsuite/libgomp.c/declare-variant-3.h
@@ -1,34 +1,41 @@
 #pragma omp declare target
+
+__attribute__ ((noipa))
 int
 f30 (void)
 {
   return 30;
 }
 
+__attribute__ ((noipa))
 int
 f35 (void)
 {
   return 35;
 }
 
+__attribute__ ((noipa))
 int
 f53 (void)
 {
   return 53;
 }
 
+__attribute__ ((noipa))
 int
 f70 (void)
 {
   return 70;
 }
 
+__attribute__ ((noipa))
 int
 f75 (void)
 {
   return 75;
 }
 
+__attribute__ ((noipa))
 int
 f80 (void)
 {
@@ -41,6 +48,7 @@ f80 (void)
 #pragma omp declare variant (f70) match (device={isa("sm_70")})
 #pragma omp declare variant (f75) match (device={isa("sm_75")})
 #pragma omp declare variant (f80) match (device={isa("sm_80")})
+__attribute__ ((noipa))
 int
 f (void)
 {
diff --git a/libgomp/testsuite/libgomp.c/declare-variant-4.h b/libgomp/testsuite/libgomp.c/declare-variant-4.h
index 2d7c1ef1a5a..47517b75ee7 100644
--- a/libgomp/testsuite/libgomp.c/declare-variant-4.h
+++ b/libgomp/testsuite/libgomp.c/declare-variant-4.h
@@ -1,28 +1,34 @@
 #pragma omp declare target
+
+__attribute__ ((noipa))
 int
 gfx803 (void)
 {
   return 0x803;
 }
 
+__attribute__ ((noipa))
 int
 gfx900 (void)
 {
   return 0x900;
 }
 
+__attribute__ ((noipa))
 int
 gfx906 (void)
 {
   return 0x906;
 }
 
+__attribute__ ((noipa))
 int
 gfx908 (void)
 {
   return 0x908;
 }
 
+__attribute__ ((noipa))
 int
 gfx90a (void)
 {
@@ -38,6 +44,7 @@ gfx90a (void)
 #pragma omp declare variant(gfx906) match(device = {isa("gfx906")})
 #pragma omp declare variant(gfx908) match(device = {isa("gfx908")})
 #pragma omp declare variant(gfx90a) match(device = {isa("gfx90a")})
+__attribute__ ((noipa))
 int
 f (void)
 {
-- 
2.34.1
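
A minimal standalone sketch of the technique used in the patch above: tagging a constant-returning function with GCC's `noipa` attribute so that inter-procedural passes do not fold its return value into callers. The `NOIPA` macro and the function names are hypothetical; the guard is there only so the snippet also builds with compilers that do not know the attribute.

```cpp
// Expand to the attribute on GCC only; elsewhere this is a plain function.
#if defined(__GNUC__) && !defined(__clang__)
#  define NOIPA __attribute__ ((noipa))
#else
#  define NOIPA
#endif

// A "variant" function whose constant return value would otherwise be a
// candidate for propagation into its callers.
NOIPA int
variant_30 (void)
{
  return 30;
}

// The caller keeps a real call to variant_30 instead of a folded constant,
// which is what a dump scan for the call relies on.
NOIPA int
dispatch (void)
{
  return variant_30 ();
}
```

The observable result is unchanged; only the optimizer's view of the callee differs, so a scan for the call in the optimized dump can still match.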



Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Christoph Müllner
On Wed, Nov 22, 2023 at 11:48 PM Kito Cheng  wrote:
>
> I am less worried about thead vector combined with other zv extensions;
> instead, we should reject those combinations altogether.
>
> My reason is that thead vector is a transitional product: there won't be
> further new products with it for much longer.  It is also, in theory, not
> compatible with any other zv extension: a zv extension requires at least
> zve32x, which is a subset of v1p0, and I don't think it's valid to use thead
> vector as a replacement for the required extension - it should just introduce
> another thead vector extension instead.

The "transitional products" argument is probably enough to add this restriction,
so we will add this to the first patch of the series.

Further, we'll implement approach 1 (emitting no "th." prefix for
instructions in vector.md) with an additional patch on top, which
implements the ASM_OUTPUT_OPCODE hook (with a comment that clarifies
why "ptr[0] == 'v'" is sufficient there).
So the decision about this can be postponed and we can focus on the rest
of the patchset as Jeff suggested.
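
For illustration only, the prefixing rule discussed here can be sketched in
plain C as below. This is not the patch itself: `xtheadvector_enabled`,
`needs_th_prefix` and `emit_opcode` are hypothetical names, and the real
ASM_OUTPUT_OPCODE hook is a target macro that works on GCC's assembler
output stream.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for "XTheadVector is enabled for this compilation".  */
static bool xtheadvector_enabled = true;

/* Return true if the mnemonic at PTR would get the "th." prefix under the
   simple rule "the opcode starts with 'v'".  */
static bool
needs_th_prefix (const char *ptr)
{
  return xtheadvector_enabled && ptr[0] == 'v';
}

/* Write the (possibly prefixed) mnemonic to STREAM.  */
static void
emit_opcode (FILE *stream, const char *ptr)
{
  if (needs_th_prefix (ptr))
    fputs ("th.", stream);
  fputs (ptr, stream);
}
```

Under this rule a vector mnemonic such as vadd.vv is emitted with the "th."
prefix and a scalar one such as addi is left alone. The same test would also
prefix any other mnemonic that begins with 'v', which is why combinations
with other vector extensions get rejected up front.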

Thanks for the inputs!

>
>
>
> Jeff Law wrote on Thu, Nov 23, 2023 at 06:27:
>>
>>
>>
>> On 11/22/23 07:24, Christoph Müllner wrote:
>> > On Wed, Nov 22, 2023 at 2:52 PM 钟居哲  wrote:
>> >>
>> >> I am totally OK with approving theadvector for GCC-14 before stage 3
>> >> closes, as long as it doesn't touch the current RVV code too much and
>> >> binutils supports theadvector.
>> >>
>> >> I have provided a draft approach:
>> >> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
>> >> which turns out not to need any changes to vector.md.
>> >> I strongly suggest following this draft. I can actively review
>> >> theadvector during stage 3,
>> >> and hopefully can help you land theadvector in GCC-14.
>> >
>> > I see now two approaches:
>> > 1) Let GCC emit RVV instructions for XTheadVector for instructions
>> > that are in both
>> > 2) Use the ASM_OUTPUT_OPCODE hook to output "th." for these instructions
>> >
>> > No doubt, the ASM_OUTPUT_OPCODE hook approach is better than our
>> > format-string approach, but would 1) not be the even better
>> > solution? It would also mean, that not a single test case is required
>> > for these overlapping instructions (only a few tests that ensure that
>> > we don't emit RVV instructions that are not available in
>> > XTheadVector). Besides that, letting GCC emit RVV instructions for
>> > XTheadVector is a very clever idea, because it fully utilizes the
>> > fact that both extensions overlap to a huge degree.
>> >
>> > The ASM_OUTPUT_OPCODE approach could lead to an issue if we enable
>> > XTheadVector
>> > with any other vector extension, say Zvfoo. In this case the Zvfoo
>> > instructions will all be prefixed as well with "th.". I know that it
>> > is not likely to run into this problem (such a machine does not exist
>> > in real hardware), but it is possible to trigger this issue easily
>> > and approach 1) would not have this potential issue.
>> I'm not a big fan of the ASM_OUTPUT_OPCODE approach.  While it is
>> simple, I worry a bit about it from a long term maintenance standpoint.
>> As you note we could well end up at some point with an extension that
>> has a mnemonic starting with "v" that would blow up.  But I certainly
>> see the appeal of such a simple test to support thead vector.
>>
>> Given there are at least 3 approaches that can fix that problem (%^,
>> assembler dialect or ASM_OUTPUT_OPCODE), maybe we could set that
>> discussion aside in the immediate term and see if there are other issues
>> that are potentially more substantial.
>>
>>
>>
>>
>> --
>>
>>
>>
>> More generally, I think I need to soften my prior statement about
>> deferring this to gcc-15.  This code was submitted in time for the
>> gcc-14 deadline, so it should be evaluated just like we do anything else
>> that makes the deadline.  There are various criteria we use to evaluate
>> if something should get integrated and we should just work through this
>> series like we always do and not treat it specially in any way.
>>
>>
>> jeff


Re: [PATCH #2/4] c++: mark short-enums as packed

2023-11-22 Thread Jason Merrill

On 11/22/23 13:12, Jason Merrill wrote:

On 11/22/23 03:17, Alexandre Oliva wrote:

On Nov 20, 2023, Jason Merrill  wrote:


I think the warning is wrong here.


Interesting...  Yeah, your analysis makes perfect sense.

Still, we're left with a divergence WRT the TYPE_PACKED status of enum
types between C and C++.

It sort of kind of makes sense to mark short enums as packed, because,
well, they are.


The enum is conceptually packed into a smaller integer type, sure.


Even enum types with explicit attribute packed, which IIUC use the same
underlying type selection as -fshort-enums, are IIRC not marked with
TYPE_PACKED in C++, at least not at the place where I proposed to set
it.  Do you consider that behavior correct?


Since attribute ((packed)) has this meaning, it seems reasonable to set 
TYPE_PACKED to express it.



Even if the warning happens to be buggy in this regard, it is at best
(or worst) accessory to this patch, in that it makes that difference
between languages apparent, and I worry that there might be other middle
end tests involving TYPE_PACKED that would get things different in C vs
C++.  (admittedly, I haven't searched for occurrences of TYPE_PACKED in
the tree, but I could, to alleviate my concerns, in case there's a
decision to keep them different)


The middle-end doesn't seem to use TYPE_PACKED for anything other than 
structure layout.



In the analyzer testcase, we have a cast from an
enum pointer that we don't know what it points to, and even if it did
point to the obj_type member of struct connection, that wouldn't be a
problem because it's at offset 0.


Maybe I misunderstand the point of the warning, but ISTM that the
circumstance it's warning about is real: the member is not as aligned as
the enclosing struct, so the cast is risky.  Now, I suppose the idiom of
finding the enclosing struct given a member is common enough that we
don't want to warn about it in general.  I'm not sure what makes packed
structs special in this regard, though.  I don't really see much
difference, more laxly-aligned fields seem equally warn-worthy, whether
the enclosing struct is packed or not, but what do I know?


Exactly.  If we want to warn about casting from pointer to less-aligned 
type to pointer to more-aligned type, that's already 
-Wcast-align=strict; whether the lower alignment is due to TYPE_PACKED 
seems irrelevant.
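
As a small illustration of the kind of cast meant here (a sketch, not the
analyzer testcase): the first function below raises the required alignment
from that of char to that of int32_t, which is the pattern
-Wcast-align=strict diagnoses whether or not any packed attribute is
involved; the second avoids the cast. The function names are made up for
this example.

```c
#include <stdint.h>
#include <string.h>

/* Cast from a pointer to a less-aligned type (char) to a pointer to a
   more-aligned type (int32_t).  Only valid if P really points to a
   suitably aligned int32_t object; -Wcast-align=strict flags the cast
   itself.  */
static int32_t
load_via_cast (const char *p)
{
  return *(const int32_t *) p;
}

/* Alignment-agnostic alternative: copy the bytes instead of casting.  */
static int32_t
load_via_memcpy (const char *p)
{
  int32_t v;
  memcpy (&v, p, sizeof v);
  return v;
}
```

The memcpy variant works for any address, including a member of a packed
struct, so it is the usual way to keep such code warning-free.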


The observation that the type-based warning is a subset of 
-Wcast-align=strict was previously made in the discussion of the patch 
for PR88928.


And the motivating testcase for the warning was about converting from 
unaligned int* to aligned int*, not to a different type at all.  And 
that warning doesn't involve TYPE_PACKED.


The clang -Waddress-of-packed-member doesn't seem to include the 
type-based warning.



Also, -fshort-enums has nothing to do with structure packing


*nod*, it's about packing of the enum type itself.  It is some sort of a
degenerate aggregate type ;-) But yeah, I guess it doesn't fit the
circumstance the warning was meant to catch, and the fact that in C it
does is a consequence of marking C short enums as TYPE_PACKED.

Which might be a bug in C.

But wouldn't it be a bug in C++ if an enum with attribute packed weren't
marked as TYPE_PACKED?  Or is TYPE_PACKED really meant to say something
about the enclosing struct rather than about the enclosed type itself?
(am I getting too philosophical here? :-)


I'm coming to the conclusion that your C++ patch is fine but we should 
remove the TYPE_PACKED warning from 
check_address_or_pointer_of_packed_member.  And maybe add 
-Wcast-align=strict to -Wextra.


Since I seem to have opinions, I'm preparing a patch for this.

Jason



Re: [PATCH v5 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-22 Thread Jason Merrill

On 11/22/23 15:46, waffl3x wrote:

On Tuesday, November 21st, 2023 at 8:22 PM, Jason Merrill  
wrote:

On 11/21/23 08:04, waffl3x wrote:


/* Nonzero for FUNCTION_DECL means that this decl is a non-static
- member function. */
+ member function, use DECL_IOBJ_MEMBER_FUNC_P instead. */
#define DECL_NONSTATIC_MEMBER_FUNCTION_P(NODE) \
(TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)

+/* Nonzero for FUNCTION_DECL means that this decl is an implicit object
+ member function. */
+#define DECL_IOBJ_MEMBER_FUNC_P(NODE) \
+ (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)


I was thinking to rename DECL_NONSTATIC_MEMBER_FUNCTION_P rather than
add a new, equivalent one. And then go through all the current uses of
the old macro to decide whether they mean IOBJ or OBJECT.


I figure it would be easier to make that transition if there's a clear
line between old versus new. To be clear, my intention is for the old
macro to be removed once all the uses of it are changed over to the new
macro. I can still remove it for the patch if you like but having both
and removing the old one later seems better to me.


Hmm, I think changing all the uses is a necessary part of this change. 
I suppose it could happen before the main patch, if you'd prefer, but it 
seems more straightforward to include it.



+ else if (declarator->declarator->kind == cdk_pointer)
+ error_at (DECL_SOURCE_LOCATION (xobj_parm),
+ /* "a pointer to function type cannot "? */
+ "a function pointer type cannot "
+ "have an explicit object parameter");


"pointer to function type", yes.


+ /* The locations being used here are probably not correct. */


Why not?


I threw them in just so I could call inform, but it doesn't feel like
the messages should be pointing at the parameter, but rather at the
full type declaration. When I put those locations in I wasn't sure how
to get the full declaration location, and I'm still not 100% confident
in how to do it, so I just threw them in and moved on.


That would be more precise, but I think it's actually preferable for the 
inform to have the same location as the previous error to avoid 
redundant quoting of the source.



Let's clear xobj_parm after giving an error in the TYPENAME case


I don't like the spirit of this very much, what's your reasoning for
this? We're nearly at the end of the scope where it is last used, I
think it would be more unclear if we suddenly set it to NULL_TREE near
the end. It raises the question of whether that assignment actually
does anything, or if we are just trying to indicate that it isn't being
used anymore, but I already made sure to declare it in the deepest
scope possible. That much should be sufficient for indicating its
usage, no?


Hmm, I think I poked at that and changed my mind, but forgot to delete 
the suggestion.  Never mind.



if ((!methodp && !DECL_XOBJ_MEMBER_FUNC_P (decl))
|| DECL_STATIC_FUNCTION_P (decl))


I think this can just be if (DECL_OBJECT_MEMBER_FUNC_P (decl)).


Alright, and going forward I'll try to make more changes that are
consistent with this one. With that said I'm not sure it can, but I'll
take a close look and if you're right I'll make that change.


if (TREE_CODE (fntype) == METHOD_TYPE)
ctype = TYPE_METHOD_BASETYPE (fntype);
+ else if (DECL_XOBJ_MEMBER_FUNC_P (decl1))
+ ctype = DECL_CONTEXT (decl1);


All of this can be

if (DECL_CLASS_SCOPE_P (decl1))
ctype = DECL_CONTEXT (decl1);

I think I'm going to go ahead and clean that up now.


Sounds good to me, a lot of this stuff needs small cleanups and I'm
just concerned about making them too much.


My cleanup of the ctype logic is in now.


+ /* Error reporting here is a little awkward, if the type of the
+ object parameter is deduced, we should tell them the lambda
+ is effectively already const, or to make the param const if it is
+ not, but if it is deduced and taken by value shouldn't we say
+ that it's taken by copy and won't mutate?
+ Seems right to me, but it's a little strange. */


I think just omit the inform if dependent_type_p.


Maybe I don't understand what a dependent type is as well as I thought,
but doesn't this defeat every useful case? The most common being an
xobj parameter of lambda type, which will always be deduced. Unless a
template parameter does not count as a dependent type, which is not
something I've ever thought about before.


No, you're right.  A template parameter is certainly dependent.  I think 
the informs are fine as they are.



Mildly related, a lot of the stuff I hacked together with multiple
levels of accessing macros and predicates was due to not being able to
find a solution for what I needed. I think we would highly benefit from
better documentation of the accessors and predicates. I believe I've
seen some that appear to be duplicates, and some where they don't
appear to be implemented properly or match their description. If there
is such a document please direct me to it as I have spent an hour or so
each time I stumble on one of these problems.

In the 

Re: [PATCH v5 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-22 Thread waffl3x






On Wednesday, November 22nd, 2023 at 2:38 PM, Jason Merrill  
wrote:


> 
> 
> On 11/22/23 15:46, waffl3x wrote:
> 
> > On Tuesday, November 21st, 2023 at 8:22 PM, Jason Merrill ja...@redhat.com 
> > wrote:
> > 
> > > On 11/21/23 08:04, waffl3x wrote:
> > > 
> > > > /* Nonzero for FUNCTION_DECL means that this decl is a non-static
> > > > - member function. */
> > > > + member function, use DECL_IOBJ_MEMBER_FUNC_P instead. */
> > > > #define DECL_NONSTATIC_MEMBER_FUNCTION_P(NODE) \
> > > > (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
> > > > 
> > > > +/* Nonzero for FUNCTION_DECL means that this decl is an implicit object
> > > > + member function. */
> > > > +#define DECL_IOBJ_MEMBER_FUNC_P(NODE) \
> > > > + (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
> > > 
> > > I was thinking to rename DECL_NONSTATIC_MEMBER_FUNCTION_P rather than
> > > add a new, equivalent one. And then go through all the current uses of
> > > the old macro to decide whether they mean IOBJ or OBJECT.
> > 
> > I figure it would be easier to make that transition if there's a clear
> > line between old versus new. To be clear, my intention is for the old
> > macro to be removed once all the uses of it are changed over to the new
> > macro. I can still remove it for the patch if you like but having both
> > and removing the old one later seems better to me.
> 
> 
> Hmm, I think changing all the uses is a necessary part of this change.
> I suppose it could happen before the main patch, if you'd prefer, but it
> seems more straightforward to include it.
> 
> > > > + else if (declarator->declarator->kind == cdk_pointer)
> > > > + error_at (DECL_SOURCE_LOCATION (xobj_parm),
> > > > + /* "a pointer to function type cannot "? */
> > > > + "a function pointer type cannot "
> > > > + "have an explicit object parameter");
> > > 
> > > "pointer to function type", yes.
> > > 
> > > > + /* The locations being used here are probably not correct. */
> > > 
> > > Why not?
> > 
> > I threw them in just so I could call inform, but it doesn't feel like
> > the messages should be pointing at the parameter, but rather at the
> > full type declaration. When I put those locations in I wasn't sure how
> > to get the full declaration location, and I'm still not 100% confident
> > in how to do it, so I just threw them in and moved on.
> 
> 
> That would be more precise, but I think it's actually preferable for the
> inform to have the same location as the previous error to avoid
> redundant quoting of the source.

Yeah that makes sense, I'll revise the comment with that rationale and
we can maybe revisit it later.

> > > Let's clear xobj_parm after giving an error in the TYPENAME case
> > 
> > I don't like the spirit of this very much, what's your reasoning for
> > this? We're nearly at the end of the scope where it is last used, I
> > think it would be more unclear if we suddenly set it to NULL_TREE near
> > the end. It raises the question of whether that assignment actually
> > does anything, or if we are just trying to indicate that it isn't being
> > used anymore, but I already made sure to declare it in the deepest
> > scope possible. That much should be sufficient for indicating its
> > usage, no?
> 
> 
> Hmm, I think I poked at that and changed my mind, but forgot to delete
> the suggestion. Never mind.

Perfect.

> > > > if ((!methodp && !DECL_XOBJ_MEMBER_FUNC_P (decl))
> > > > || DECL_STATIC_FUNCTION_P (decl))
> > > 
> > > I think this can just be if (DECL_OBJECT_MEMBER_FUNC_P (decl)).
> > 
> > Alright, and going forward I'll try to make more changes that are
> > consistent with this one. With that said I'm not sure it can, but I'll
> > take a close look and if you're right I'll make that change.
> > 
> > > > if (TREE_CODE (fntype) == METHOD_TYPE)
> > > > ctype = TYPE_METHOD_BASETYPE (fntype);
> > > > + else if (DECL_XOBJ_MEMBER_FUNC_P (decl1))
> > > > + ctype = DECL_CONTEXT (decl1);
> > > 
> > > All of this can be
> > > 
> > > if (DECL_CLASS_SCOPE_P (decl1))
> > > ctype = DECL_CONTEXT (decl1);
> > > 
> > > I think I'm going to go ahead and clean that up now.
> > 
> > Sounds good to me, a lot of this stuff needs small cleanups and I'm
> > just concerned about making them too much.
> 
> 
> My cleanup of the ctype logic is in now.

I'll make sure to base the final patch off a newer commit then. I don't
think I'll do that right now because it takes a lot of time for me. I
still haven't gotten used to all the workflows with git, so doing
anything with it takes a lot out of me. Also, I would have to rerun the
testsuite on the newer commit to get an accurate baseline, which
takes a few hours.

> > > > + /* Error reporting here is a little awkward, if the type of the
> > > > + object parameter is deduced, we should tell them the lambda
> > > > + is effectively already const, or to make the param const if it is
> > > > + not, but if it is deduced and taken by value shouldn't we say
> > > > + that it's taken by copy and won't mutate?
> > > > + Seems 

Re: [committed] d: Merge upstream dmd ff57fec515, druntime ff57fec515, phobos 17bafda79.

2023-11-22 Thread Iain Buclaw
Excerpts from Rainer Orth's message of November 21, 2023 5:03 pm:
> Rainer Orth  writes:
> 
>> either this patch or the previous one broke D bootstrap with GCC 9.  On
>> both i386-pc-solaris2.11 with gdc 9.4.0 and sparc-sun-solaris2.11 with
>> gdc 9.3.0, stage 1 d21 fails to link with
>>
>> Undefined   first referenced
>>  symbol in file
>> _D3dmd4root11stringtable34__T11StringValueTC3dmd5mtype4TypeZ11StringValue7lstringMFNaNbNiNjZPa
>>  d/func.o
>> _D3dmd4root11stringtable34__T11StringValueTC3dmd5mtype4TypeZ11StringValue8toDcharsMxFNaNbNiNjZPxa
>>  d/func.o
>> _D3dmd4root11stringtable34__T11StringValueTC3dmd5mtype4TypeZ11StringValue8toStringMxFNaNbNiNjZAxa
>>  d/func.o
>> ld: fatal: symbol referencing errors
>> collect2: error: ld returned 1 exit status
>> make[3]: *** [/vol/gcc/src/hg/master/local/gcc/d/Make-lang.in:236: d21] 
>> Error 1
> 
> Same on i686-pc-linux-gnu, btw.
> 

Thanks, I've found the culprit.  There's been quite a few changes in the
import graph upstream.  This looks to have exposed some unfortunate
template emission bugs in older versions of the compiler that as you've
pointed out, work just fine with gdc-11.

I'm erring on the side of reverting the individual patches, though if I
get time later, maybe try a partial revert by restoring the old import
statements only.

Iain.


Re: [PATCH v3 02/11] aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c

2023-11-22 Thread Joseph Myers
On Mon, 20 Nov 2023, Florian Weimer wrote:

> This test looks like it intends to pass a small struct argument
> through both a non-variadic and variadic argument, but due to
> the typo, it does not achieve that.
> 
> gcc/testsuite/
> 
>   * gcc.target/aarch64/aapcs64/ice_1.c (foo): Call named.

OK in the absence of AArch64 maintainer objections within 48 hours.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v2] ifcvt: Remove obsolete code for subreg handling in noce_convert_multiple_sets

2023-11-22 Thread Jeff Law




On 11/21/23 11:04, Manolis Tsamis wrote:

This code used to handle SUBREG for register replacement when ifcvt was doing
the replacements manually. This special handling is not needed anymore
because simplify_replace_rtx is used for the replacements and it properly
handles these cases.

gcc/ChangeLog:

* ifcvt.cc (noce_convert_multiple_sets_1): Remove old code.

OK.
jeff


Re: [PATCH] c: Add __builtin_stdc_bit_{width,floor,ceil} builtins

2023-11-22 Thread Joseph Myers
On Mon, 20 Nov 2023, Jakub Jelinek wrote:

> On Mon, Nov 20, 2023 at 04:03:07PM +0100, Jakub Jelinek wrote:
> > > Note that stdc_bit_ceil now has defined behavior (return 0) on overflow: 
> > > CD2 comment FR-135 was accepted for the DIS at the June WG14 meeting.  
> > > This affects both the documentation and the implementation, as they need 
> > > to avoid an undefined shift by the width of the type.  That's why my 
> > > stdbit.h implementations have two shifts (not claiming that's necessarily 
> > > the optimal way of ensuring the correct result in the overflow case).
> > > 
> > >   return __x <= 1 ? 1 : ((uint64_t) 1) << (__bw64_inline (__x - 1) - 1) 
> > > << 1;
> > 
> > Given the feedback from Richi I've in the meantime reworked the patch to
> > add all 14 builtins (but because the enum rid is very close to 256 values
> > and with 14 new ones was already 7 too many, used one RID value for all 14
> > builtins (different spellings)).
> > 
> > Will need to rework it for CD2 FR-135 then...
> 
> Here it is updated to use that
> x <= 1 ? 1 : ((type) 2) << (prec - 1 - __builtin_clzg ((type) (x - 1)))
> I've mentioned.
> 
> 2023-11-20  Jakub Jelinek  
> 
> gcc/
>   * doc/extend.texi (__builtin_stdc_bit_ceil, __builtin_stdc_bit_floor,
>   __builtin_stdc_bit_width, __builtin_stdc_count_ones,
>   __builtin_stdc_count_zeros, __builtin_stdc_first_leading_one,
>   __builtin_stdc_first_leading_zero, __builtin_stdc_first_trailing_one,
>   __builtin_stdc_first_trailing_zero, __builtin_stdc_has_single_bit,
>   __builtin_stdc_leading_ones, __builtin_stdc_leading_zeros,
>   __builtin_stdc_trailing_ones, __builtin_stdc_trailing_zeros): Document.
> gcc/c-family/
>   * c-common.h (enum rid): Add RID_BUILTIN_STDC: New.
>   * c-common.cc (c_common_reswords): Add __builtin_stdc_bit_ceil,
>   __builtin_stdc_bit_floor, __builtin_stdc_bit_width,
>   __builtin_stdc_count_ones, __builtin_stdc_count_zeros,
>   __builtin_stdc_first_leading_one, __builtin_stdc_first_leading_zero,
>   __builtin_stdc_first_trailing_one, __builtin_stdc_first_trailing_zero,
>   __builtin_stdc_has_single_bit, __builtin_stdc_leading_ones,
>   __builtin_stdc_leading_zeros, __builtin_stdc_trailing_ones and
>   __builtin_stdc_trailing_zeros.  Move __builtin_assoc_barrier
>   alphabetically earlier.
> gcc/c/
>   * c-parser.cc (c_parser_postfix_expression): Handle RID_BUILTIN_STDC.
>   * c-decl.cc (names_builtin_p): Likewise.
> gcc/testsuite/
>   * gcc.dg/builtin-stdc-bit-1.c: New test.
>   * gcc.dg/builtin-stdc-bit-2.c: New test.

OK with tests added for unsigned _BitInt(1).  Specifically, unsigned 
_BitInt(1) is a bit of a degenerate case for stdc_bit_ceil (always 
returning 1 after evaluating the argument's side effects); I think the 
code that builds a constant 2 of that type (a constant only used in dead 
code) should still work (and produce a constant 0), and that the 
documentation is also still correct in the case where converting 2 to the 
type produces 0, but given those degeneracies I think it's worth testing 
unsigned _BitInt(1) with these functions to make sure they do behave as 
expected.
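
To make the overflow handling concrete, here is a minimal sketch for a
32-bit operand using the expression quoted above, with __builtin_clz
standing in for the type-generic __builtin_clzg. It assumes GCC or Clang
and a 32-bit unsigned int; bit_ceil_u32 is an illustrative name, not the
builtin itself.

```c
#include <stdint.h>

/* Smallest power of two >= X, or 0 if that value does not fit in 32 bits.
   Shifting 2 (rather than 1) keeps the shift count in [0, 31], so the
   overflow case wraps to 0 without an undefined shift by the type width.  */
static uint32_t
bit_ceil_u32 (uint32_t x)
{
  if (x <= 1)
    return 1;
  /* x >= 2 here, so x - 1 is nonzero and __builtin_clz is well defined.  */
  return ((uint32_t) 2) << (32 - 1 - __builtin_clz (x - 1));
}
```

For an input above 0x80000000 the shift count is 31, and 2 shifted left by
31 wraps to 0 in unsigned arithmetic, which is the result FR-135 asks for.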

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] c++: alias template of non-template class [PR112633]

2023-11-22 Thread Jason Merrill

On 11/22/23 12:26, Patrick Palka wrote:

Bootstrapped and regtested on x86-64-pc-linux-gnu, does this look OK for
trunk/13?


OK.


-- >8 --

The entering_scope adjustment in tsubst_aggr_type assumes if an alias is
dependent, then so is the aliased type (and therefore it has template info)
but that's not true for the dependent alias template specialization ty1
below which aliases the non-template class A.

PR c++/112633

gcc/cp/ChangeLog:

* pt.cc (tsubst_aggr_type): Handle empty TYPE_TEMPLATE_INFO
in the entering_scope adjustment.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/alias-decl-75.C: New test.
---
  gcc/cp/pt.cc   |  1 +
  gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C | 13 +
  2 files changed, 14 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index ed681afb5d4..68ce4a87372 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -13976,6 +13976,7 @@ tsubst_aggr_type (tree t,
if (entering_scope
  && CLASS_TYPE_P (t)
  && dependent_type_p (t)
+ && TYPE_TEMPLATE_INFO (t)
  && TYPE_CANONICAL (t) == TREE_TYPE (TYPE_TI_TEMPLATE (t)))
t = TYPE_CANONICAL (t);
  
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C b/gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C

new file mode 100644
index 000..1a73a99856e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-75.C
@@ -0,0 +1,13 @@
+// PR c++/112633
+// { dg-do compile { target c++11 } }
+
+struct A { using type = void; };
+
+template<class T>
+using ty1 = A;
+
+template<class T>
+using ty2 = typename ty1<T>::type;
+
+template<class T>
+ty2<T> f();




Re: [PATCH 2/2] bugzilla: remove `gcc-bugs@` mailing list address

2023-11-22 Thread Xi Ruoyao
On Wed, 2023-11-22 at 20:57 -0500, Ben Boeckel wrote:
> Is there a version of autoconf I should use? I have 2.71 laying around
> but see that these were generated with 2.69. If you want me to regen
> with 2.71, I'll do that as separate prep commits so that this diff is
> sensible. Or I can try and dig up a 2.69 in some container to do it.

Use 2.69.

A container is not needed, you can just install autoconf-2.69 with
--prefix=$HOME/ac269 (or another directory you like) and run
$HOME/ac269/bin/autoconf, $HOME/ac269/bin/autoheader, etc.

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


[pushed] wwwdocs: faq: Refer to gcc-testresults instead of buildstat.html

2023-11-22 Thread Gerald Pfeifer
This is the last obsolete reference to buildstat.html shared by Thomas 
and per my own `grep -r`.

Pushed.

Gerald

---
 htdocs/faq.html | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/htdocs/faq.html b/htdocs/faq.html
index 203661dc..5c713a70 100644
--- a/htdocs/faq.html
+++ b/htdocs/faq.html
@@ -99,8 +99,9 @@ about known problems with installing or using GCC on 
particular platforms.
 These are included in the sources for a release in INSTALL/specific.html,
 and the https://gcc.gnu.org/install/specific.html;>latest version
 is always available at the GCC web site.
-Reports of successful builds
-for several versions of GCC are also available at the web site.
+There you also find
+https://gcc.gnu.org/pipermail/gcc-testresults/;>reports around
+successful builds.
 
 
 Installation
-- 
2.42.1


Re: [PATCH v2 5/6] libgomp, nvptx: Cuda pinned memory

2023-11-22 Thread Tobias Burnus

(I have not fully thought about the 2/6, 3/6 and 4/6 patches, but I
think except for some patch apply issues, 1/6 + this 5/6 can be both
committed without needing 2-4.)

On 23.08.23 16:14, Andrew Stubbs wrote:

Use Cuda to pin memory, instead of Linux mlock, when available.

There are two advantages: firstly, this gives a significant speed boost for
NVPTX offloading, and secondly, it side-steps the usual OS ulimit/rlimit
setting.

I think both the ulimit issue for the 1/6 patch and the non-issue for
this variant should/could be mentioned in libgomp.texi (in the Memory
Management section and in the nvptx section, respectively.)

The design adds a device independent plugin API for allocating pinned memory,
and then implements it for NVPTX.  At present, the other supported devices do
not have equivalent capabilities (or requirements).


A note up front: starting with TR11 (alias OpenMP 6.0), OpenMP supports
handling multiple devices for allocation. It seems that after using:

  my_memspace = omp_get_device_and_host_memspace( 5 , omp_default_mem_space)
  my_alloc = omp_init_allocator (my_memspace, my_traits_with_pinning);

the pinning should be done via device '5', if possible.

 * * *

However, I believe that it shouldn't really matter for now, given that CUDA
has no special handling of NUMA hierarchy on the host nor for specific devices
and GCN has none.

It only becomes interesting if mmap/mlock memory is (measurably) faster than
CUDA allocated memory when accessed from the host or, for USM, from GCN.

* * *

Let's start with the patch itself:

--- a/libgomp/target.c
+++ b/libgomp/target.c
...
+static struct gomp_device_descr *
+get_device_for_page_locked (void)
+{
+  gomp_debug (0, "%s\n",
+              __FUNCTION__);
+
+  struct gomp_device_descr *device;
+#ifdef HAVE_SYNC_BUILTINS
+  device
+    = __atomic_load_n (&device_for_page_locked, MEMMODEL_RELAXED);
+  if (device == (void *) -1)
+    {
+      gomp_debug (0, " init\n");
+
+      gomp_init_targets_once ();
+
+      device = NULL;
+      for (int i = 0; i < num_devices; ++i)


Given that this function just sets a single variable based on whether the
page_locked_host_alloc_func function pointer exists, wouldn't it be much
simpler to just do all this handling in   gomp_target_init  ?


+      for (int i = 0; i < num_devices; ++i)
...
+          /* We consider only the first device of potentially several of the
+             same type as this functionality is not specific to an individual
+             offloading device, but instead relates to the host-side
+             implementation of the respective offloading implementation.  */
+          if (devices[i].target_id != 0)
+            continue;
+
+          if (!devices[i].page_locked_host_alloc_func)
+            continue;
...
+          if (device)
+            gomp_fatal ("Unclear how %s and %s libgomp plugins may"
+                        " simultaneously provide functionality"
+                        " for page-locked memory",
+                        device->name, devices[i].name);
+          else
+            device = &devices[i];


I find this a bit inconsistent: If - let's say - GCN does not provide its
own pinning, the code assumes that CUDA pinning is just fine.  However, if both
support it, CUDA pinning suddenly is not fine for GCN.

Additionally, all wording suggests that it does not matter for CUDA for which
device access we want to optimize the pinning. But the code above also fails if
I have a system with two Nvidia cards.  From the wording, it sounds as if just
checking whether the  device->type  is different would do.


But all in all, I wonder whether it wouldn't be much simpler to state something
like the following (where applicable):

The first device that provides pinning support is used; the assumption is
that all other devices and the host can access this memory without measurable
performance penalty compared to a normal page lock, and that the plugin does
not offer pinning support that is aware of multiple device types or of
host/device NUMA.
NOTE: For OpenMP 6.0's OMP_AVAILABLE_DEVICES environment variable and
device-set memory spaces, this might need to be revisited.

(The note is only meant for the *.c code / this review; the first sentence up
to the ';' should go in some form into libgomp.texi as well.)
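
A rough sketch of that selection rule, as it might be run once during
target initialization, is given below. It is not libgomp code: struct
device_descr is a simplified, hypothetical stand-in for struct
gomp_device_descr, the callback signature is assumed, and
dummy_page_locked_alloc exists only so the example has something to point
at.

```c
#include <stddef.h>

/* Simplified, hypothetical stand-in for libgomp's struct gomp_device_descr;
   only the fields needed for this sketch are modelled.  */
struct device_descr
{
  const char *name;
  int target_id;
  int (*page_locked_host_alloc_func) (void **ptr, size_t size);
};

/* Placeholder allocator, used only to populate a device table.  */
static int
dummy_page_locked_alloc (void **ptr, size_t size)
{
  (void) size;
  *ptr = NULL;
  return 0;
}

/* Return the first device whose plugin provides page-locked host
   allocation, or NULL if none does.  */
static struct device_descr *
first_device_for_page_locked (struct device_descr *devices, int num_devices)
{
  for (int i = 0; i < num_devices; ++i)
    if (devices[i].page_locked_host_alloc_func)
      return &devices[i];
  return NULL;
}
```

With the result stored in a plain global at init time, a second Nvidia card
or a second plugin with pinning support no longer needs a fatal error; the
first match simply wins.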

And document in libgomp.texi that the first device for which device-specific
host-memory pinning support is available is used → to be added in
https://gcc.gnu.org/onlinedocs/libgomp/Memory-allocation.html
The nvidia specific part - i.e. that it is supported and possibly more details -
can then be added to: https://gcc.gnu.org/onlinedocs/libgomp/nvptx.html

I think the @ref added to 'Offload-Target Specifics' for the third time for
this patch will be sufficient - if not, add some referring words as well.


 * * *


+gomp_page_locked_host_alloc (void **ptr, size_t size)
+{
+  gomp_debug (0, "%s: ptr=%p, size=%llu\n",
+              __FUNCTION__, ptr, (unsigned long long) size);
+
+  struct gomp_device_descr *device = get_device_for_page_locked ();


With the proposed changes above, we could just call
  device_for_page_locked
and then access the global variable nonatomically.

BTW: I think the global variable needs a _mem(ory) suffix, I find
  

[PATCH] c++: Implement P2582R1, CTAD from inherited constructors

2023-11-22 Thread Patrick Palka
Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

-- >8 --

This patch implements C++23 class template argument deduction from
inherited constructors, which is specified in terms of C++20 alias
CTAD which we already fully support.  The rule for transforming
the return type of an inherited guide is specified in terms of a
partially specialized class template, but this patch implements it
in a simpler way, performing ahead of time deduction instead of
instantiation time deduction.  I wasn't able to find an example for
which this implementation strategy makes a difference, but I didn't
look very hard.  Support seems good enough to advertise as complete,
and there should be no functional change before C++23 mode.

There's a couple of FIXMEs, one in inherited_ctad_tweaks for recognizing
more forms of inherited constructors, and one in deduction_guides_for for
making the cache aware of base-class dependencies.

There doesn't seem to be a feature-test macro update for this paper.

gcc/cp/ChangeLog:

* cp-tree.h (type_targs_deducible_from): Adjust declaration.
* pt.cc (alias_ctad_tweaks): Handle C++23 inherited CTAD.
(inherited_ctad_tweaks): Define.
(type_targs_deducible_from): Add defaulted 'targs_out' parameter.
Handle 'tmpl' being a TREE_LIST representing a synthetic alias
template.  Set 'targs_out' upon success.
(ctor_deduction_guides_for): Do inherited_ctad_tweaks for each
USING_DECL in C++23 mode.
(deduction_guides_for): Add FIXME for stale cache entries in
light of inherited CTAD.

gcc/testsuite/ChangeLog:

* g++.dg/cpp1z/class-deduction67.C: Accept in C++23 mode.
* g++.dg/cpp23/class-deduction-inherited1.C: New test.
* g++.dg/cpp23/class-deduction-inherited2.C: New test.
* g++.dg/cpp23/class-deduction-inherited3.C: New test.
---
 gcc/cp/cp-tree.h  |   2 +-
 gcc/cp/pt.cc  | 176 +++---
 .../g++.dg/cpp1z/class-deduction67.C  |   5 +-
 .../g++.dg/cpp23/class-deduction-inherited1.C |  36 
 .../g++.dg/cpp23/class-deduction-inherited2.C |  26 +++
 .../g++.dg/cpp23/class-deduction-inherited3.C |  16 ++
 6 files changed, 231 insertions(+), 30 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp23/class-deduction-inherited1.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/class-deduction-inherited2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp23/class-deduction-inherited3.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 1fa710d7154..633d58b1d12 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7434,7 +7434,7 @@ extern tree fn_type_unification   (tree, 
tree, tree,
 bool, bool);
 extern void mark_decl_instantiated (tree, int);
 extern int more_specialized_fn (tree, tree, int);
-extern bool type_targs_deducible_from  (tree, tree);
+extern bool type_targs_deducible_from  (tree, tree, tree * = nullptr);
 extern void do_decl_instantiation  (tree, tree);
 extern void do_type_instantiation  (tree, tree, tsubst_flags_t);
 extern bool always_instantiate_p   (tree);
diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 324f6f01555..75f5bc9bed5 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -223,6 +223,9 @@ static void instantiate_body (tree pattern, tree args, tree 
d, bool nested);
 static tree maybe_dependent_member_ref (tree, tree, tsubst_flags_t, tree);
 static void mark_template_arguments_used (tree, tree);
 static bool uses_outer_template_parms (tree);
+static tree alias_ctad_tweaks (tree, tree);
+static tree inherited_ctad_tweaks (tree, tree, tsubst_flags_t);
+static tree deduction_guides_for (tree, bool&, tsubst_flags_t);
 
 /* Make the current scope suitable for access checking when we are
processing T.  T can be FUNCTION_DECL for instantiated function
@@ -29753,8 +29756,6 @@ is_spec_or_derived (tree etype, tree tmpl)
   return !err;
 }
 
-static tree alias_ctad_tweaks (tree, tree);
-
 /* Return a C++20 aggregate deduction candidate for TYPE initialized from
INIT.  */
 
@@ -29859,7 +29860,13 @@ maybe_aggr_guide (tree tmpl, tree init, 
vec *args)
 }
 
 /* UGUIDES are the deduction guides for the underlying template of alias
-   template TMPL; adjust them to be deduction guides for TMPL.  */
+   template TMPL; adjust them to be deduction guides for TMPL.
+
+   This routine also handles C++23 inherited CTAD, in which case TMPL is a
+   TREE_LIST representing a synthetic alias template whose TREE_PURPOSE is
+   the template parameter list of the alias template (equivalently, of the
+   derived class) and TREE_VALUE the defining-type-id (equivalently, the
+   base whose guides we're inheriting).  UGUIDES are the base's guides.  */
 
 static tree
 alias_ctad_tweaks (tree tmpl, tree uguides)
@@ -29903,13 +29910,30 @@ alias_ctad_tweaks 

[PATCH, v4] Fortran: restrictions on integer arguments to SYSTEM_CLOCK [PR112609]

2023-11-22 Thread Harald Anlauf

Hi Mikael!

On 11/22/23 10:36, Mikael Morin wrote:

(...)


diff --git a/gcc/fortran/error.cc b/gcc/fortran/error.cc
index 2ac51e95e4d..be715b50469 100644
--- a/gcc/fortran/error.cc
+++ b/gcc/fortran/error.cc
@@ -980,7 +980,11 @@ char const*
 notify_std_msg(int std)
 {

-  if (std & GFC_STD_F2018_DEL)
+  if (std & GFC_STD_F2023_DEL)
+    return _("Fortran 2023 deleted feature:");


As there are officially no deleted features in F2023, maybe use a 
slightly different wording?  Say "Not allowed in Fortran 2023" or 
"forbidden in Fortran 2023" or similar?



+  else if (std & GFC_STD_F2023)
+    return _("Fortran 2023:");
+  else if (std & GFC_STD_F2018_DEL)
 return _("Fortran 2018 deleted feature:");
   else if (std & GFC_STD_F2018_OBS)
 return _("Fortran 2018 obsolescent feature:");


I skimmed over existing error messages, and since "forbidden" did
not show up and since "Not allowed" exists but not at the beginning
of a message, I found that

"Prohibited in Fortran 2023"

appeared to be a good alternative.

Not being a native speaker, I hope that someone speaks up if this
is not appropriate.  And since I do not explicitly verify that part
in the testcase, it can be changed.


diff --git a/gcc/fortran/libgfortran.h b/gcc/fortran/libgfortran.h
index bdddb317ab0..af7a170c2b1 100644
--- a/gcc/fortran/libgfortran.h
+++ b/gcc/fortran/libgfortran.h
@@ -19,9 +19,10 @@ along with GCC; see the file COPYING3.  If not see


 /* Flags to specify which standard/extension contains a feature.
-   Note that no features were obsoleted nor deleted in F2003 nor in 
F2023.

+   Note that no features were obsoleted nor deleted in F2003.


I think we can add a comment that F2023 has no deleted features, but some 
more stringent restrictions in F2023 forbid some previously valid code.



    Please remember to keep those definitions in sync with
    gfortran.texi.  */
+#define GFC_STD_F2023_DEL    (1<<13)    /* Deleted in F2023.  */
 #define GFC_STD_F2023    (1<<12)    /* New in F2023.  */
 #define GFC_STD_F2018_DEL    (1<<11)    /* Deleted in F2018.  */
 #define GFC_STD_F2018_OBS    (1<<10)    /* Obsolescent in F2018.  */
@@ -41,12 +42,13 @@ along with GCC; see the file COPYING3.  If not see
  * are allowed with a certain -std option.  */
 #define GFC_STD_OPT_F95    (GFC_STD_F77 | GFC_STD_F95 | 
GFC_STD_F95_OBS  \

 | GFC_STD_F2008_OBS | GFC_STD_F2018_OBS \
-    | GFC_STD_F2018_DEL)
+    | GFC_STD_F2018_DEL | GFC_STD_F2023_DEL)
 #define GFC_STD_OPT_F03    (GFC_STD_OPT_F95 | GFC_STD_F2003)
 #define GFC_STD_OPT_F08    (GFC_STD_OPT_F03 | GFC_STD_F2008)
 #define GFC_STD_OPT_F18    ((GFC_STD_OPT_F08 | GFC_STD_F2018) \
 & (~GFC_STD_F2018_DEL))
F03, F08 and F18 should have GFC_STD_F2023_DEL (and also F03 and F08 
should have GFC_STD_F2018_DEL).


Well, these macros do an incremental bitwise-or, so the bit representing
GFC_STD_F2023_DEL is included everywhere.  I also ran the testcases with
different -std= options to check.
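
To illustrate the incremental bitwise-or, here is a minimal standalone C
sketch.  The macro names and bits 10 to 13 follow the hunk quoted above; the
remaining bit positions and the helper std_allows are placeholders made up
for this illustration, not the real libgfortran.h values.

```c
#include <assert.h>

/* Bit positions as shown in the quoted hunk.  */
#define GFC_STD_F2023_DEL (1<<13)
#define GFC_STD_F2018_DEL (1<<11)
#define GFC_STD_F2018_OBS (1<<10)

/* Placeholder bit positions, assumed for this sketch only.  */
#define GFC_STD_F2018     (1<<9)
#define GFC_STD_F2008_OBS (1<<8)
#define GFC_STD_F2008     (1<<7)
#define GFC_STD_F2003     (1<<4)
#define GFC_STD_F95       (1<<3)
#define GFC_STD_F95_OBS   (1<<1)
#define GFC_STD_F77       (1<<0)

/* Each mask is built on top of the previous one, as in the patch.  */
#define GFC_STD_OPT_F95 (GFC_STD_F77 | GFC_STD_F95 | GFC_STD_F95_OBS \
                         | GFC_STD_F2008_OBS | GFC_STD_F2018_OBS    \
                         | GFC_STD_F2018_DEL | GFC_STD_F2023_DEL)
#define GFC_STD_OPT_F03 (GFC_STD_OPT_F95 | GFC_STD_F2003)
#define GFC_STD_OPT_F08 (GFC_STD_OPT_F03 | GFC_STD_F2008)
#define GFC_STD_OPT_F18 ((GFC_STD_OPT_F08 | GFC_STD_F2018) \
                         & (~GFC_STD_F2018_DEL))

/* Return nonzero if every bit of FLAG is set in MASK.  */
static int
std_allows (int mask, int flag)
{
  return (mask & flag) == flag;
}

/* GFC_STD_F2023_DEL propagates from F95 into every later mask, while
   GFC_STD_F2018_DEL is explicitly cleared again for F18.  */
void
check_std_masks (void)
{
  assert (std_allows (GFC_STD_OPT_F03, GFC_STD_F2023_DEL));
  assert (std_allows (GFC_STD_OPT_F08, GFC_STD_F2023_DEL));
  assert (std_allows (GFC_STD_OPT_F18, GFC_STD_F2023_DEL));
  assert (std_allows (GFC_STD_OPT_F08, GFC_STD_F2018_DEL));
  assert (!std_allows (GFC_STD_OPT_F18, GFC_STD_F2018_DEL));
}
```

Since only GFC_STD_OPT_F95 names the _DEL bits directly, adding the new bit
there is enough for F03, F08 and F18 to pick it up.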

OK with this fixed (and the previous comments as you wish), if Steve has 
no more comments.


Thanks for the patch.




If there are no further comments, I will commit once I am able to
fully build again with --disable-bootstrap and -march=native ...

Thanks,
Harald

From 56386f4f332cf8970a424ba67678335fa6186e4c Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 22 Nov 2023 20:57:59 +0100
Subject: [PATCH] Fortran: restrictions on integer arguments to SYSTEM_CLOCK
 [PR112609]

Fortran 2023 added restrictions on integer arguments to SYSTEM_CLOCK to
have a decimal exponent range at least as large as a default integer,
and that all integer arguments have the same kind type parameter.

gcc/fortran/ChangeLog:

	PR fortran/112609
	* check.cc (gfc_check_system_clock): Add checks on integer arguments
	to SYSTEM_CLOCK specific to F2023.
	* error.cc (notify_std_msg): Adjust to handle new features added
	in F2023.
	* gfortran.texi (_gfortran_set_options): Document GFC_STD_F2023_DEL,
	remove obsolete option GFC_STD_F2008_TS and fix enumeration values.
	* libgfortran.h (GFC_STD_F2023_DEL): Add and use in GFC_STD_OPT_F23.
	* options.cc (set_default_std_flags): Add GFC_STD_F2023_DEL.

gcc/testsuite/ChangeLog:

	PR fortran/112609
	* gfortran.dg/system_clock_1.f90: Add option -std=f2003.
	* gfortran.dg/system_clock_3.f08: Add option -std=f2008.
	* gfortran.dg/system_clock_4.f90: New test.
---
 gcc/fortran/check.cc | 50 
 gcc/fortran/error.cc |  6 ++-
 gcc/fortran/gfortran.texi| 10 ++--
 gcc/fortran/libgfortran.h|  7 ++-
 gcc/fortran/options.cc   |  6 ++-
 gcc/testsuite/gfortran.dg/system_clock_1.f90 |  1 +
 gcc/testsuite/gfortran.dg/system_clock_3.f08 |  1 +
 gcc/testsuite/gfortran.dg/system_clock_4.f90 | 24 ++
 8 files changed, 95 insertions(+), 10 deletions(-)
 create mode 100644 

Re: [PATCH #2/4] c++: mark short-enums as packed

2023-11-22 Thread Jason Merrill

On 11/22/23 03:17, Alexandre Oliva wrote:

On Nov 20, 2023, Jason Merrill  wrote:


I think the warning is wrong here.


Interesting...  Yeah, your analysis makes perfect sense.

Still, we're left with a divergence WRT the TYPE_PACKED status of enum
types between C and C++.

It sort of kind of makes sense to mark short enums as packed, because,
well, they are.


The enum is conceptually packed into a smaller integer type, sure.


Even enum types with explicit attribute packed, that IIUC uses the same
underlying type selection as -fshort-enums, IIRC are not marked with
TYPE_PACKED in C++, at least not at the place where I proposed to set
it.  Do you consider that behavior correct?


Since attribute ((packed)) has this meaning, it seems reasonable to set 
TYPE_PACKED to express it.
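
As a small user-level illustration of that meaning, the sketch below compares
a plain enum with one carrying attribute packed.  The type and enumerator
names are invented for this example, and the size check assumes GCC (or a
compatible compiler) on a common target; it says nothing about how the front
ends set TYPE_PACKED internally.

```c
#include <assert.h>

/* Ordinary enum: the compiler picks its usual underlying type.  */
enum plain_color { PLAIN_RED, PLAIN_GREEN, PLAIN_BLUE };

/* Packed enum: the smallest integer type that fits the values is used.  */
enum __attribute__ ((packed)) small_color
{
  SMALL_RED, SMALL_GREEN, SMALL_BLUE
};

void
check_packed_enum (void)
{
  /* Three small enumerators fit in a single byte.  */
  assert (sizeof (enum small_color) == 1);
  /* The packed enum is never larger than the plain one.  */
  assert (sizeof (enum small_color) <= sizeof (enum plain_color));
}
```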



Even if the warning happens to be buggy in this regard, it is at best
(or worst) accessory to this patch, in that it makes that difference
between languages apparent, and I worry that there might be other middle
end tests involving TYPE_PACKED that would get things different in C vs
C++.  (admittedly, I haven't searched for occurrences of TYPE_PACKED in
the tree, but I could, to alleviate my concerns, in case there's a
decision to keep them different)


The middle-end doesn't seem to use TYPE_PACKED for anything other than 
structure layout.



In the analyzer testcase, we have a cast from an
enum pointer that we don't know what it points to, and even if it did
point to the obj_type member of struct connection, that wouldn't be a
problem because it's at offset 0.


Maybe I misunderstand the point of the warning, but ISTM that the
circumstance it's warning about is real: the member is not as aligned as
the enclosing struct, so the cast is risky.  Now, I suppose the idiom of
finding the enclosing struct given a member is common enough that we
don't want to warn about it in general.  I'm not sure what makes packed
structs special in this regard, though.  I don't really see much
difference, more laxly-aligned fields seem equally warn-worthy, whether
the enclosing struct is packed or not, but what do I know?


Exactly.  If we want to warn about casting from pointer to less-aligned 
type to pointer to more-aligned type, that's already 
-Wcast-align=strict; whether the lower alignment is due to TYPE_PACKED 
seems irrelevant.


The observation that the type-based warning is a subset of 
-Wcast-align=strict was previously made in the discussion of the patch 
for PR88928.


And the motivating testcase for the warning was about converting from 
unaligned int* to aligned int*, not to a different type at all.  And 
that warning doesn't involve TYPE_PACKED.


The clang -Waddress-of-packed-member doesn't seem to include the 
type-based warning.



Also, -fshort-enums has nothing to do with structure packing


*nod*, it's about packing of the enum type itself.  It is some sort of a
degenerated aggregate type ;-) But yeah, I guess it doesn't fit the
circumstance the warning was meant to catch, and the fact that in C it
does is a consequence of marking C short enums as TYPE_PACKED.

Which might be a bug in C.

But wouldn't it be a bug in C++ if an enum with attribute packed weren't
marked as TYPE_PACKED?  Or is TYPE_PACKED really meant to say something
about the enclosing struct rather than about the enclosed type itself?
(am I getting too philosophical here? :-)


I'm coming to the conclusion that your C++ patch is fine but we should 
remove the TYPE_PACKED warning from 
check_address_or_pointer_of_packed_member.  And maybe add 
-Wcast-align=strict to -Wextra.


Jason



Re: [PATCH v5] Introduce attribute sym_alias

2023-11-22 Thread Joseph Myers
Is it OK to apply this attribute to a (file-scope or block-scope) static 
variable or function in C (and if it is, what's the linkage of the 
resulting alias)?  That doesn't seem very clear to me from the 
documentation, and I'd also expect a testcase of this, whatever the answer 
is.

What's the interaction with visibility attributes, pragmas and options?  
Do the aliases have the same visibility as the main declaration, or do 
they have the visibility that would be given to a declaration in the 
current context (command-line options and pragmas)?  If the latter, 
there's the question of what context is relevant when there are multiple 
declarations of an object or function, some of which might have sym_alias 
attributes and some of which might not.  Again, I think this should be 
documented and tested, whatever the decision about the desired semantics.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread 钟居哲
I prefer ASM_OUTPUT_OPCODE or an assembler dialect to %^, and I don't want to
see any change to vector.md.

%^ will impose a heavy maintenance burden in the future.

Besides, ASM_OUTPUT_OPCODE can see the whole string.  My patch is just a draft.
We can add exclusions; for example, for Zvbb we can skip adding "th." to the
vrev.v instruction.



juzhe.zh...@rivai.ai
 
From: Jeff Law
Date: 2023-11-23 06:27
To: Christoph Müllner; 钟居哲
CC: gcc-patches; kito.cheng; kito.cheng; cooper.joshua; rdapp.gcc; 
philipp.tomsich; Cooper Qu; jinma; Nelson Chu
Subject: Re: RISC-V: Support XTheadVector extensions
 
 
On 11/22/23 07:24, Christoph Müllner wrote:
> On Wed, Nov 22, 2023 at 2:52 PM 钟居哲  wrote:
>>
>> I am totally ok to approve theadvector on GCC-14 before stage 3 close
>> as long as it doesn't touch the current RVV codes too much and binutils 
>> supports theadvector.
>>
>> I have provided the draft approach:
>> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
>> which, it turns out, doesn't need to change any code in vector.md.
>> I strongly suggest following this draft. I can actively review theadvector 
>> during stage 3.
>> And hopefully can help you land theadvector on GCC-14.
> 
> I see now two approaches:
> 1) Let GCC emit RVV instructions for XTheadVector for instructions
> that are in both
> 2) Use the ASM_OUTPUT_OPCODE hook to output "th." for these instructions
> 
> No doubt, the ASM_OUTPUT_OPCODE hook approach is better than our
> format-string approach, but would 1) not be the even better
> solution? It would also mean, that not a single test case is required
> for these overlapping instructions (only a few tests that ensure that
> we don't emit RVV instructions that are not available in
> XTheadVector). Besides that, letting GCC emit RVV instructions for
> XTheadVector is a very clever idea, because it fully utilizes the
> fact that both extensions overlap to a huge degree.
> 
> The ASM_OUTPUT_OPCODE approach could lead to an issue if we enable
> XTheadVector with any other vector extension, say Zvfoo. In this case the Zvfoo 
> instructions will all be prefixed as well with "th.". I know that it
> is not likely to run into this problem (such a machine does not exist
> in real hardware), but it is possible to trigger this issue easily
> and approach 1) would not have this potential issue.
I'm not a big fan of the ASM_OUTPUT_OPCODE approach.  While it is 
simple, I worry a bit about it from a long term maintenance standpoint. 
As you note we could well end up at some point with an extension that 
has a mnemonic starting with "v" that would blow up.  But I certainly 
see the appeal of such a simple test to support thead vector.
 
Given there are at least 3 approaches that can fix that problem (%^, 
assembler dialect or ASM_OUTPUT_OPCODE), maybe we could set that 
discussion aside in the immediate term and see if there are other issues 
that are potentially more substantial.
 
 
 
 
--
 
 
 
More generally, I think I need to soften my prior statement about 
deferring this to gcc-15.  This code was submitted in time for the 
gcc-14 deadline, so it should be evaluated just like we do anything else 
that makes the deadline.  There are various criteria we use to evaluate 
if something should get integrated and we should just work through this 
series like we always do and not treat it specially in any way.
 
 
jeff
 


Re: [PATCH 1/2] testsuite/unroll-8: Avoid triggering undefined behavior

2023-11-22 Thread Jeff Law




On 11/21/23 16:27, Palmer Dabbelt wrote:

I was poking around with this test failure and noticed it was exercising
undefined behavior.  The return type doesn't matter for what's being
tested, so just mark it as void.

gcc/testsuite/ChangeLog:

* gcc.dg/unroll-8.c: Remove UB.
I just reviewed the history of unroll-8, I don't think this compromises 
the test's original intent.  OK for the trunk.


jeff


Re: [RFA] New pass for sign/zero extension elimination

2023-11-22 Thread Jeff Law




On 11/20/23 11:26, Richard Sandiford wrote:


+
+/* If we know the destination of CODE only uses some low bits
+   (say just the QI bits of an SI operation), then return true
+   if we can propagate the need for just the subset of bits
+   from the destination to the sources.  */
+
+static bool
+safe_for_live_propagation (rtx_code code)
+{
+  /* First handle rtx classes which as a whole are known to
+ be either safe or unsafe.  */
+  switch (GET_RTX_CLASS (code))
+{
+  case RTX_OBJ:
+   return true;
+
+  case RTX_COMPARE:
+  case RTX_COMM_COMPARE:
+  case RTX_TERNARY:


I suppose operands 1 and 2 of an IF_THEN_ELSE would be safe.
Yes.  The only downside is we'd need to special case IF_THEN_ELSE 
because it doesn't apply to operand 0.  Right now we're pretty 
conservative with anything other than binary codes.  Comment added about 
the possibility of handling I-T-E as well.






This made me wonder: is this safe for !TRULY_NOOP_TRUNCATION?  But I
suppose it is.  What !TRULY_NOOP_TRUNCATION models is that the target
mode has a canonical form that must be maintained, and wouldn't be by
a plain subreg.  So TRULY_NOOP_TRUNCATION is more of an issue for
consumers of the liveness information, rather than for computing the
liveness information itself.
Really interesting question.  I think ext-dce is safe.  As you note this 
is more a consumer side question and on the consumer side we don't muck 
with TRUNCATE at all.






+case SS_TRUNCATE:
+case US_TRUNCATE:
+case PLUS:
+case MULT:
+case SS_MULT:
+case US_MULT:
+case SMUL_HIGHPART:
+case UMUL_HIGHPART:
+case AND:
+case IOR:
+case XOR:
+case SS_PLUS:
+case US_PLUS:


I don't think it's safe to propagate through saturating ops.
They don't have the property that (x op y)%z == (x%z op y%z)%z

Yea, you're probably right.  Removed.
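
A minimal standalone C sketch of why saturating operations break the
propagation: two inputs that agree in their low 8 bits give the same low 8
result bits under a wrapping add, but not under an unsigned saturating add.
The helper names are made up for this illustration and are not part of the
patch.

```c
#include <assert.h>
#include <stdint.h>

/* Plain modular 32-bit addition (what PLUS computes).  */
static uint32_t
wrap_add (uint32_t x, uint32_t y)
{
  return x + y;
}

/* Unsigned saturating 32-bit addition: clamp to UINT32_MAX on overflow.  */
static uint32_t
sat_add (uint32_t x, uint32_t y)
{
  uint32_t sum = x + y;
  return sum < x ? UINT32_MAX : sum;
}

/* Keep only the low 8 bits, i.e. the part assumed to be live.  */
static uint32_t
low8 (uint32_t v)
{
  return v & 0xff;
}

void
check_low_bits (void)
{
  /* A and B differ only in bits that are supposedly dead.  */
  uint32_t a = 0xffffffffu, b = 0x000000ffu;
  assert (low8 (a) == low8 (b));

  /* Wrapping add: the low 8 bits of the result agree.  */
  assert (low8 (wrap_add (a, 1)) == low8 (wrap_add (b, 1)));

  /* Saturating add: the upper input bits decide whether we clamp,
     so the low 8 bits of the result differ.  */
  assert (low8 (sat_add (a, 1)) != low8 (sat_add (b, 1)));
}
```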




+
+ /* We don't support vector destinations or destinations
+wider than DImode.   It is safe to continue this loop.
+At worst, it will leave things live which could have
+been made dead.  */
+ if (VECTOR_MODE_P (GET_MODE (x)) || GET_MODE (x) > E_DImode)
+   continue;


The E_DImode comparison hard-codes an assumption about the order of
the mode enum.  How about using something like:

Guilty as charged :-)  Not surprised you called that out.





  scalar_int_mode outer_mode;
  if (!is_a <scalar_int_mode> (GET_MODE (x), &outer_mode)
      || GET_MODE_BITSIZE (outer_mode) > 64)
    continue;
Wouldn't we also want to verify that the size is constant, or is it the 
case that all the variable cases are vector (and would we want to 
actually depend on that)?




The other continues use iter.skip_subrtxes (); when continuing.
I don't think it matters for correctness whether we do that or not,
since SETs and CLOBBERs shouldn't be nested.  But skipping should
be faster.
My thought on not skipping the sub-rtxs in this case was to make sure we 
processed things like memory addresses which could have embedded side 
effects.  It probably doesn't matter in practice though.




Maybe it would be worth splitting the SET/CLOBBER code out into a 
subfunction, to make the loop iteration easier to handle?
Yea, it could use another round of such work.  In the original, set and 
use handling were one big function which drove me nuts.





+ /* Transfer all the LIVENOW bits for X into LIVE_TMP.  */
+ HOST_WIDE_INT rn = REGNO (SUBREG_REG (x));
+ for (HOST_WIDE_INT i = 4 * rn; i < 4 * rn + 4; i++)
+   if (bitmap_bit_p (livenow, i))
+ bitmap_set_bit (live_tmp, i);
+
+ /* The mode of the SUBREG tells us how many bits we can
+clear.  */
+ machine_mode mode = GET_MODE (x);
+ HOST_WIDE_INT size = GET_MODE_SIZE (mode).to_constant ();
+ bitmap_clear_range (livenow, 4 * rn, size);


Is clearing SIZE bytes correct?  Feels like it should be clearing
something like log2 (size) + 1 instead.

Yea, I think you're right.  Fixed.
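
A hedged sketch of the arithmetic behind that remark.  The quoted loop uses
four LIVENOW bits per register; assuming those bits stand for bits 0..7,
8..15, 16..31 and 32..63 of the register (an assumption of this sketch, not
something stated above), a mode of SIZE bytes covers floor(log2(size)) + 1 of
them.  groups_for_size is a hypothetical helper name.

```c
#include <assert.h>

/* Number of per-register liveness bits covered by a mode that is SIZE
   bytes wide, under the grouping assumed above: 1, 2, 4 and 8 bytes map
   to 1, 2, 3 and 4 bits.  This is floor(log2(size)) + 1 for size > 0.  */
static int
groups_for_size (unsigned int size)
{
  int groups = 0;
  while (size != 0)
    {
      groups++;
      size >>= 1;
    }
  return groups;
}

void
check_groups (void)
{
  assert (groups_for_size (1) == 1);   /* QImode-sized */
  assert (groups_for_size (2) == 2);   /* HImode-sized */
  assert (groups_for_size (4) == 3);   /* SImode-sized */
  assert (groups_for_size (8) == 4);   /* DImode-sized */
}
```

Clearing SIZE bits instead would, for an 8-byte mode, clear eight bits
starting at 4 * rn and so spill into the bits of the next register.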




+ bit = SUBREG_BYTE (x).to_constant () * BITS_PER_UNIT;
+ if (WORDS_BIG_ENDIAN)
+   bit = (GET_MODE_BITSIZE (GET_MODE (SUBREG_REG (x))).to_constant 
()
+  - GET_MODE_BITSIZE (GET_MODE (x)).to_constant () - bit);
+
+ /* Catch big endian correctness issues rather than triggering
+undefined behavior.  */
+ gcc_assert (bit < sizeof (HOST_WIDE_INT) * 8);


This could probably use subreg_lsb, to avoid the inline endianness adjustment.
That's the routine I was looking for!  The original totally mucked up 
the endianness adjustment and I kept thinking we must have an existing 
routine to do this for us but didn't find it immediately, so I just 
banged out a trivial endianness adjustment.





+
+ mask = GET_MODE_MASK (GET_MODE (SUBREG_REG (x))) << bit;
+ if 

Re: [PATCH, v3] Fortran: restrictions on integer arguments to SYSTEM_CLOCK [PR112609]

2023-11-22 Thread Steve Kargl
On Wed, Nov 22, 2023 at 10:36:00AM +0100, Mikael Morin wrote:
> 
> OK with this fixed (and the previous comments as you wish), if Steve has no
> more comments.
> 

No further comments.  Thanks for your patience, Harald.

As a side note, I found John Reid's "What's new" document
where it is noted that there are no new obsolescent or
deleted features.

https://wg5-fortran.org/N2201-N2250/N2212.pdf

-- 
Steve


Re: [PATCH] mingw: Exclude utf8 manifest [PR111170, PR108865]

2023-11-22 Thread Jonathan Yong

On 11/22/23 12:34, Costas Argyris wrote:

Attached a new patch.

A couple things to note:

1) I changed your

host_extra_objs=utf8-mingw32.o

to

host_extra_objs_mingw=utf8-mingw32.o

to match the other two, since I believe that's what you meant.

2) This approach has the complication that the variables
in configure.ac need to be set before it sources config.host.



I specifically asked for it to be done that way so users are aware of it 
with --help. Thanks, pushed to master.


Re: [PATCH v3 01/11] aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder

2023-11-22 Thread Joseph Myers
On Mon, 20 Nov 2023, Florian Weimer wrote:

>   * config/aarch64/linux-unwind.h
>   (aarch64_fallback_frame_state): Add cast to the expected type
>   in sc assignment.

OK in the absence of AArch64 maintainer objections within 48 hours.

-- 
Joseph S. Myers
jos...@codesourcery.com


[committed] hppa: Define MAX_FIXED_MODE_SIZE

2023-11-22 Thread John David Anglin
Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11.  Committed to
trunk.

Fixes FAIL: c-c++-common/pr111309-1.c ICE.

Dave
---

hppa: Define MAX_FIXED_MODE_SIZE

Replace default define.  We support TImode when TARGET_64BIT is true.

2023-11-22  John David Anglin  

gcc/ChangeLog:

PR target/112592
* config/pa/pa.h (MAX_FIXED_MODE_SIZE): Define.

diff --git a/gcc/config/pa/pa.h b/gcc/config/pa/pa.h
index aba2cec7357..d73428682e7 100644
--- a/gcc/config/pa/pa.h
+++ b/gcc/config/pa/pa.h
@@ -1310,3 +1310,7 @@ do {  
 \
 
 /* Output default function prologue for hpux.  */
 #define TARGET_ASM_FUNCTION_PROLOGUE pa_output_function_prologue
+
+/* An integer expression for the size in bits of the largest integer machine
+   mode that should actually be used.  We allow pairs of registers.  */
+#define MAX_FIXED_MODE_SIZE GET_MODE_BITSIZE (TARGET_64BIT ? TImode : DImode)




[PATCH, testsuite, fortran] fix invalid testcases (missing MOLD argument to NULL)

2023-11-22 Thread Harald Anlauf
Dear all,

testcases assumed_rank_8.f90 and assumed_rank_10.f90 are invalid:
NULL() is passed without MOLD to an assumed-rank dummy argument.

This is detected by NAG, but not yet by gfortran (see pr104819).
gfortran even ignores the MOLD argument; the dump-tree is identical
if MOLD is there or not.

Now these testcases are { dg-do run }.  Therefore I would like to
fix these testcases, independent of the work on fixing pr104819.

Comments?

Thanks,
Harald

From cbb0c61f9d6f06667666a33da6e6ce3213a92248 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Wed, 22 Nov 2023 21:45:46 +0100
Subject: [PATCH] testsuite: fortran: fix invalid testcases (missing MOLD
 argument to NULL)

The Fortran standard requires that NULL() passed to an assumed-rank
dummy argument has a MOLD argument.

gcc/testsuite/ChangeLog:

	PR fortran/104819
	* gfortran.dg/assumed_rank_10.f90: Add MOLD argument to NULL().
	* gfortran.dg/assumed_rank_8.f90: Likewise.
---
 gcc/testsuite/gfortran.dg/assumed_rank_10.f90 | 6 +++---
 gcc/testsuite/gfortran.dg/assumed_rank_8.f90  | 4 ++--
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gfortran.dg/assumed_rank_10.f90 b/gcc/testsuite/gfortran.dg/assumed_rank_10.f90
index 6a3cc94483e..f22d43ab955 100644
--- a/gcc/testsuite/gfortran.dg/assumed_rank_10.f90
+++ b/gcc/testsuite/gfortran.dg/assumed_rank_10.f90
@@ -50,9 +50,9 @@ program test

  is_present = .false.

- call fpa(null(), null()) ! No copy back
- call fpi(null(), null()) ! No copy back
- call fno(null(), null()) ! No copy back
+ call fpa(null(iip), null(jjp)) ! No copy back
+ call fpi(null(iip), null(jjp)) ! No copy back
+ call fno(null(iip), null(jjp)) ! No copy back

  call fno() ! No copy back

diff --git a/gcc/testsuite/gfortran.dg/assumed_rank_8.f90 b/gcc/testsuite/gfortran.dg/assumed_rank_8.f90
index 5873296a7a5..34ff42c0be2 100644
--- a/gcc/testsuite/gfortran.dg/assumed_rank_8.f90
+++ b/gcc/testsuite/gfortran.dg/assumed_rank_8.f90
@@ -22,13 +22,13 @@ program main
   call f (ii)
   call f (489)
   call f ()
-  call f (null())
+  call f (null(kk))
   call f (kk)
   if (j /= 2) STOP 1

   j = 0
   nullify (ll)
-  call g (null())
+  call g (null(ll))
   call g (ll)
   call g (ii)
   if (j /= 1) STOP 2
--
2.35.3



Re: [PATCH v3 03/11] gm2: Add missing declaration of m2pim_M2RTS_Terminate to test

2023-11-22 Thread Joseph Myers
On Mon, 20 Nov 2023, Florian Weimer wrote:

> gcc/testsuite/
> 
>   * gm2/link/externalscaffold/pass/scaffold.c (m2pim_M2RTS_Terminate):
>   Declare.

OK in the absence of Modula-2 maintainer objections within 48 hours.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH 2/2] bugzilla: remove `gcc-bugs@` mailing list address

2023-11-22 Thread Ben Boeckel
On Wed, Nov 22, 2023 at 23:15:56 +, Joseph Myers wrote:
> On Mon, 20 Nov 2023, Ben Boeckel wrote:
> 
> > Bugzilla is preferred today.
> > 
> > ChangeLog:
> > 
> > * config-ml.in: Replace gcc-bugs@ with Bugzilla link.
> > * symlink-tree: Replace gcc-bugs@ with Bugzilla link.
> 
> I don't think we should use a URL that redirects (i.e. 
> https://gcc.gnu.org/bugzilla should preferably have a trailing '/'), and 
> arguably we should use https://gcc.gnu.org/bugs/ as the URL; that's the 
> preferred one to point people to for bugs in the compilers themselves, 
> since it gives more instructions on bug reporting (though those 
> instructions may be less relevant for bugs in these files).

I'll update the URL.

> codingconventions.html claims that symlink-tree is "copied from mainline 
> automake".  That is, I think, out-of-date information: automake's 
> contrib/multilib/README says "The master (and probably more up-to-date) 
> copies of the 'config-ml.in' and 'symlink-tree' files are maintained in 
> the GCC development tree".  But it does indicate that 
> codingconventions.html itself should be updated to stop suggesting 
> symlink-tree is maintained elsewhere.

I'll also change that.

> > libcpp/ChangeLog:
> > 
> > * configure: Replace gcc-bugs@ with Bugzilla link.
> > * configure.ac: Replace gcc-bugs@ with Bugzilla link.
> > 
> > libdecnumber/ChangeLog:
> > 
> > * configure: Replace gcc-bugs@ with Bugzilla link.
> > * configure.ac: Replace gcc-bugs@ with Bugzilla link.
> 
> I hope the configure changes are the same as you get with regeneration 
> with the right autoconf version, and so should be described as 
> regeneration in the ChangeLog entries.

Is there a version of autoconf I should use? I have 2.71 laying around
but see that these were generated with 2.69. If you want me to regen
with 2.71, I'll do that as separate prep commits so that this diff is
sensible. Or I can try and dig up a 2.69 in some container to do it.

Thanks,

--Ben


Re: [PATCH 1/2] testsuite/unroll-8: Avoid triggering undefined behavior

2023-11-22 Thread Andrew Pinski
On Tue, Nov 21, 2023 at 3:29 PM Palmer Dabbelt  wrote:
>
> I was poking around with this test failure and noticed it was exercising
> undefined behavior.  The return type doesn't matter for what's being
> tested, so just mark it as void.

Just a quick note: in C it is NOT undefined behavior to return without
a value from a function which has a non-void return type. It is only
undefined if the caller uses the value.
In C++, though, falling off the end of such a function is undefined behavior.
Yes there is a difference in the language. As Jeff said it does not
change what the testcase was/is testing but we should be clear in the
changelog that this is NOT undefined behavior.
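
A minimal C sketch of the distinction (the function and variable names are
invented for this illustration): the function below has a non-void return
type and falls off its end, but the caller discards the result, so the
behavior is defined in C.  Compilers may still warn about it, e.g. with
-Wreturn-type.

```c
#include <assert.h>

static int calls;

/* Non-void return type, but control reaches the closing brace without
   a return statement.  */
static int
bump (int n)
{
  calls += n;
}

void
check_unused_result (void)
{
  /* The result is discarded, so this call is fine in C.  Writing
     "int r = bump (3);" and then using r would be undefined.  */
  bump (3);
  assert (calls == 3);
}
```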

Thanks,
Andrew Pinski

>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/unroll-8.c: Remove UB.
> ---
> I didn't test this, but it seems trivial enough that I'm just going to
> throw it at the bots and hope I'm right.
> ---
>  gcc/testsuite/gcc.dg/unroll-8.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/gcc.dg/unroll-8.c b/gcc/testsuite/gcc.dg/unroll-8.c
> index 4388f47d4c7..06d32e56893 100644
> --- a/gcc/testsuite/gcc.dg/unroll-8.c
> +++ b/gcc/testsuite/gcc.dg/unroll-8.c
> @@ -3,7 +3,7 @@
>  /* { dg-additional-options "-fno-tree-vectorize" { target amdgcn-*-* } } */
>
>  struct a {int a[7];};
> -int t(struct a *a, int n)
> +void t(struct a *a, int n)
>  {
>int i;
>for (i=0;i --
> 2.42.1
>


Re: [PATCH v3 00/11] : More warnings as errors by default

2023-11-22 Thread Florian Weimer
* Jeff Law:

> On 11/20/23 02:55, Florian Weimer wrote:
>> This revision addresses Marek's comment about handling
>> -Wdeclaration-missing-parameter-type properly in conjunction with
>> -fpermissive.  A new test (permerror-fpermissive-nowarning.c)
>> demonstrates the expected behavior.  I added a test for -std=gnu89
>> -fno-permissive, too.
>> I'm including the precursor cleanup patches in this posting.
>> Hopefully
>> this will make the aarch64 tester happy.
>> Thanks,
>> Florian
>> Florian Weimer (11):
>>aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder
>>aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c
>>gm2: Add missing declaration of m2pim_M2RTS_Terminate to test
>>Add tests for validating future C permerrors
>>c: Turn int-conversion warnings into permerrors
>>c: Turn -Wimplicit-function-declaration into a permerror
>>c: Turn -Wimplicit-int into a permerror
>>c: Do not ignore some forms of -Wimplicit-int in system headers
>>c: Turn -Wreturn-mismatch into a permerror
>>c: Turn -Wincompatible-pointer-types into a permerror
>>c: Add new -Wdeclaration-missing-parameter-type permerror

> The series is fine by me.

Thanks.

> But give Marek additional time to chime in, particularly given the
> holidays this week in the US.  Say through this time next week?

Yes, Marek and I spoke about it today.  I'll wait a bit longer for
feedback.

I'm also gathering some numbers regarding autoconf impact and potential
silent miscompilation.

Thanks,
Florian



Re: [PATCH v3 00/11] : More warnings as errors by default

2023-11-22 Thread Jeff Law




On 11/20/23 02:55, Florian Weimer wrote:

This revision addresses Marek's comment about handling
-Wdeclaration-missing-parameter-type properly in conjunction with
-fpermissive.  A new test (permerror-fpermissive-nowarning.c)
demonstrates the expected behavior.  I added a test for -std=gnu89
-fno-permissive, too.

I'm including the precursor cleanup patches in this posting.  Hopefully
this will make the aarch64 tester happy.

Thanks,
Florian

Florian Weimer (11):
   aarch64: Avoid -Wincompatible-pointer-types warning in Linux unwinder
   aarch64: Call named function in gcc.target/aarch64/aapcs64/ice_1.c
   gm2: Add missing declaration of m2pim_M2RTS_Terminate to test
   Add tests for validating future C permerrors
   c: Turn int-conversion warnings into permerrors
   c: Turn -Wimplicit-function-declaration into a permerror
   c: Turn -Wimplicit-int into a permerror
   c: Do not ignore some forms of -Wimplicit-int in system headers
   c: Turn -Wreturn-mismatch into a permerror
   c: Turn -Wincompatible-pointer-types into a permerror
   c: Add new -Wdeclaration-missing-parameter-type permerror
The series is fine by me.  But give Marek additional time to chime in, 
particularly given the holidays this week in the US.  Say through this 
time next week?


jeff


[PATCH 1/2] c-family: -Waddress-of-packed-member and casts

2023-11-22 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, OK for trunk?

-- 8< --

-Waddress-of-packed-member, in addition to the documented warning about
taking the address of a packed member, also warns about casting from
a pointer to a TYPE_PACKED type to a pointer to a type with greater
alignment.

This wrongly warns if the source is a pointer to enum when -fshort-enums
is on, since that is also represented by TYPE_PACKED.

And there's already -Wcast-align to catch casting from pointer to less
aligned type (packed or otherwise) to pointer to more aligned type; even
apart from the enum problem, this seems like a somewhat arbitrary subset of
that warning.  Though that isn't currently on by default.

So, this patch removes the undocumented type-based warning from
-Waddress-of-packed-member.  Some of the tests where the warning is
desirable I changed to use -Wcast-align=strict instead.  The ones that
require -Wno-incompatible-pointer-types, I just removed.
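
For readers less familiar with the documented half of the warning, the following is a rough sketch, not taken from the patch. The struct and helper names are invented. It shows why the address of a packed member can be misaligned, and one way to read the member without forming a plain int pointer to it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packed record: 'value' lands at offset 1, so it is not
   suitably aligned for an ordinary int access through a pointer.  */
struct __attribute__((packed)) rec {
    char tag;
    int value;
};

/* Writing "int *p = &r->value;" is the case the documented warning is
   about.  Copying the bytes out avoids creating such a pointer.  */
static int rec_value(const struct rec *r)
{
    int v;
    memcpy(&v, (const char *) r + offsetof(struct rec, value), sizeof v);
    return v;
}

/* Small helper that stores a value and reads it back.  */
static int rec_roundtrip(int x)
{
    struct rec r = { 'x', x };
    assert(offsetof(struct rec, value) == 1);
    return rec_value(&r);
}
```

The attribute syntax is the GNU extension accepted by GCC and Clang; other compilers may need a different spelling.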

gcc/c-family/ChangeLog:

* c-warn.cc (check_address_or_pointer_of_packed_member):
Remove warning based on TYPE_PACKED.

gcc/testsuite/ChangeLog:

* c-c++-common/Waddress-of-packed-member-1.c: Don't expect
a warning on the cast cases.
* c-c++-common/pr51628-35.c: Use -Wcast-align=strict.
* g++.dg/warn/Waddress-of-packed-member3.C: Likewise.
* gcc.dg/pr88928.c: Likewise.
* gcc.dg/pr51628-20.c: Removed.
* gcc.dg/pr51628-21.c: Removed.
* gcc.dg/pr51628-25.c: Removed.
---
 gcc/c-family/c-warn.cc                        | 58 +--
 .../Waddress-of-packed-member-1.c             | 12 ++--
 gcc/testsuite/c-c++-common/pr51628-35.c       |  6 +-
 .../g++.dg/warn/Waddress-of-packed-member3.C  |  8 +--
 gcc/testsuite/gcc.dg/pr51628-20.c             | 11
 gcc/testsuite/gcc.dg/pr51628-21.c             | 11
 gcc/testsuite/gcc.dg/pr51628-25.c             |  9 ---
 gcc/testsuite/gcc.dg/pr88928.c                |  6 +-
 delete mode 100644 gcc/testsuite/gcc.dg/pr51628-20.c
 delete mode 100644 gcc/testsuite/gcc.dg/pr51628-21.c
 delete mode 100644 gcc/testsuite/gcc.dg/pr51628-25.c

diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index d2938b91043..2a399ba6d14 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -2991,10 +2991,9 @@ check_alignment_of_packed_member (tree type, tree field, bool rvalue)
   return NULL_TREE;
 }
 
-/* Return struct or union type if the right hand value, RHS:
-   1. Is a pointer value which isn't aligned to a pointer type TYPE.
-   2. Is an address which takes the unaligned address of packed member
-  of struct or union when assigning to TYPE.
+/* Return struct or union type if the right hand value, RHS
+   is an address which takes the unaligned address of packed member
+   of struct or union when assigning to TYPE.
Otherwise, return NULL_TREE.  */
 
 static tree
@@ -3021,57 +3020,6 @@ check_address_or_pointer_of_packed_member (tree type, tree rhs)
 
   type = TREE_TYPE (type);
 
-  if (TREE_CODE (rhs) == PARM_DECL
-  || VAR_P (rhs)
-  || TREE_CODE (rhs) == CALL_EXPR)
-{
-  tree rhstype = TREE_TYPE (rhs);
-  if (TREE_CODE (rhs) == CALL_EXPR)
-   {
- rhs = CALL_EXPR_FN (rhs); /* Pointer expression.  */
- if (rhs == NULL_TREE)
-   return NULL_TREE;
- rhs = TREE_TYPE (rhs);/* Pointer type.  */
- /* We could be called while processing a template and RHS could be
-a functor.  In that case it's a class, not a pointer.  */
- if (!rhs || !POINTER_TYPE_P (rhs))
-   return NULL_TREE;
- rhs = TREE_TYPE (rhs);/* Function type.  */
- rhstype = TREE_TYPE (rhs);
- if (!rhstype || !POINTER_TYPE_P (rhstype))
-   return NULL_TREE;
- rvalue = true;
-   }
-  if (rvalue && POINTER_TYPE_P (rhstype))
-   rhstype = TREE_TYPE (rhstype);
-  while (TREE_CODE (rhstype) == ARRAY_TYPE)
-   rhstype = TREE_TYPE (rhstype);
-  if (TYPE_PACKED (rhstype))
-   {
- unsigned int type_align = min_align_of_type (type);
- unsigned int rhs_align = min_align_of_type (rhstype);
- if (rhs_align < type_align)
-   {
- auto_diagnostic_group d;
- location_t location = EXPR_LOC_OR_LOC (rhs, input_location);
- if (warning_at (location, OPT_Waddress_of_packed_member,
- "converting a packed %qT pointer (alignment %d) "
- "to a %qT pointer (alignment %d) may result in "
- "an unaligned pointer value",
- rhstype, rhs_align, type, type_align))
-   {
- tree decl = TYPE_STUB_DECL (rhstype);
- if (decl)
-   inform (DECL_SOURCE_LOCATION (decl), "defined here");
- decl = TYPE_STUB_DECL (type);
- if 

Re: [PATCH v5 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-22 Thread waffl3x






On Tuesday, November 21st, 2023 at 8:22 PM, Jason Merrill wrote:


>
>
> On 11/21/23 08:04, waffl3x wrote:
>
> > Bootstrapped and tested on x86_64-linux with no regressions.
> >
> > Hopefully this patch is legible enough for reviewing purposes, I've not
> > been feeling the greatest so it was a task to get this finished.
> > Tomorrow I will look at putting the diagnostics in
> > start_preparsed_function and also fixing up anything else.
> >
> > To reiterate in case it wasn't abundantly clear by the barren changelog
> > and commit message, this version is not intended as the final revision.
> >
> > Handling re-declarations was kind of nightmarish, so the comments in
> > there are lengthy, but I am fairly certain I implemented them correctly.
> >
> > I am going to get some sleep now, hopefully I will feel better tomorrow
> > and be ready to polish off the patch. Thanks for the patience.
>
>
> Great!
>
> > I stared at start_preparsed_function for a long while and couldn't
> > figure out where to start off at. So for now the error handling is
> > split up between instantiate_body and cp_parser_lambda_declarator_opt.
> > The latter is super not correct but I've been stuck on this for a long
> > time now though so I wanted to actually get something that works and
> > then try to make it better.
>
>
> I see what you mean, instantiate_body isn't prepared for
> start_preparsed_function to fail. It's ok to handle this in two places.
> Though actually, instantiate_body is too late for it to fail; I think
> for the template case it should fail in tsubst_lambda_expr, before we
> even start to consider the body.
>
> Incidentally, I notice this code in tsubst_function_decl seems to need
> adjusting for xobj:
>
> tree parms = DECL_ARGUMENTS (t);
> if (closure && !DECL_STATIC_FUNCTION_P (t))
> parms = DECL_CHAIN (parms);
> parms = tsubst (parms, args, complain, t);
> for (tree parm = parms; parm; parm = DECL_CHAIN (parm))
> DECL_CONTEXT (parm) = r;
> if (closure && !DECL_STATIC_FUNCTION_P (t))
> ...
>
> and this in tsubst_lambda_expr that assumes iobj:
>
> /* Fix the type of 'this'. */
> fntype = build_memfn_type (fntype, type,
> type_memfn_quals (fntype),
> type_memfn_rqual (fntype));

Originally I was going to say this doesn't look like a problem in
tsubst_lambda_expr, but after looking at tsubst_function_decl I'm
thinking it might be the source of some trouble. If it really was
causing problems I would think it would be working much worse than it
currently is, but it does feel like it might be the actual source of
the bug I was chasing yesterday. Assigning to a capture with a deduced
const xobj parameter is not being rejected right now, this might be
causing it. I'll look more thoroughly today.

> This also seems like the place to check for unrelated type.

It does feel that way, I agree.

> > /* Nonzero for FUNCTION_DECL means that this decl is a non-static
> > - member function. */
> > + member function, use DECL_IOBJ_MEMBER_FUNC_P instead. */
> > #define DECL_NONSTATIC_MEMBER_FUNCTION_P(NODE) \
> > (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
> >
> > +/* Nonzero for FUNCTION_DECL means that this decl is an implicit object
> > + member function. */
> > +#define DECL_IOBJ_MEMBER_FUNC_P(NODE) \
> > + (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
>
>
> I was thinking to rename DECL_NONSTATIC_MEMBER_FUNCTION_P rather than
> add a new, equivalent one. And then go through all the current uses of
> the old macro to decide whether they mean IOBJ or OBJECT.

I figure it would be easier to make that transition if there's a clear
line between old versus new. To be clear, my intention is for the old
macro to be removed once all the uses of it are changed over to the new
macro. I can still remove it for the patch if you like but having both
and removing the old one later seems better to me.
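
To illustrate the IOBJ versus OBJECT question in isolation, here is a toy model. It does not use GCC's tree representation, and every name in it is hypothetical. Each use of the old macro has to be sorted into "implicit object only" or "any object member function".

```c
#include <assert.h>

/* Toy classification of member functions; not GCC's representation.  */
enum toy_fn_kind { TOY_FN_STATIC, TOY_FN_IOBJ, TOY_FN_XOBJ };

/* True only for implicit object member functions.  */
static int toy_iobj_member_func_p(enum toy_fn_kind k)
{
    return k == TOY_FN_IOBJ;
}

/* True only for explicit object member functions.  */
static int toy_xobj_member_func_p(enum toy_fn_kind k)
{
    return k == TOY_FN_XOBJ;
}

/* True for either kind of object member function.  */
static int toy_object_member_func_p(enum toy_fn_kind k)
{
    assert(k == TOY_FN_STATIC || k == TOY_FN_IOBJ || k == TOY_FN_XOBJ);
    return toy_iobj_member_func_p(k) || toy_xobj_member_func_p(k);
}
```

A caller that only cares about "has an object parameter" would want the last predicate, while one that depends on the implicit 'this' would want the first.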

> > - (static or non-static). */
> > + (static or object). */
>
>
> Let's leave this comment as it was.

Okay.

> > + auto handle_arg = [fn, flags, complain](tree type,
> > + tree arg,
> > + int const param_index,
> > + conversion *conv,
> > + bool const conversion_warning)
>
>
> Let's move the conversion_warning logic into the handle_arg lambda
> rather than have it as a parameter. Yes, we don't need it for the xobj
> parm, but I think it's cleaner to have less in the loop.

I would argue that it's cleaner to have the lambda be concise, but I'll
make this change.

> Also, let's move handle_arg after the iobj 'this' handling so it's
> closer to the uses. For which the 'else xobj' needs to drop the 'else',
> or change to 'if (first_arg)'.

Agreed, I didn't like how far away it was.

> > + /* We currently handle for the case where first_arg is NULL_TREE
> > + in the future this should be changed and the assert reactivated. */
> > + #if 0
> > + gcc_assert (first_arg);
> > + #endif
>
>
> Let's leave this out.

Alright.

> > + val = handle_arg(TREE_VALUE (parm),
>
>
> Missing space before (.
>
> > - if (null_node_p (arg)
> > - && 

Re: [PATCH, v3] Fortran: restrictions on integer arguments to SYSTEM_CLOCK [PR112609]

2023-11-22 Thread Harald Anlauf

Hi Steve,

On 11/22/23 19:03, Steve Kargl wrote:

On Wed, Nov 22, 2023 at 10:36:00AM +0100, Mikael Morin wrote:


OK with this fixed (and the previous comments as you wish), if Steve has no
more comments.



No further comments.  Thanks for your patience, Harald.

As side note, I found John Reid's "What's new" document
where it is noted that there are no new obsolescent or
deleted features.

https://wg5-fortran.org/N2201-N2250/N2212.pdf



this is good to know.

There is an older version (still referring to F202x) on the wiki:

https://gcc.gnu.org/wiki/GFortranStandards

It would be great if someone with editing permission could update
the link and point to the above.

Thanks,
Harald



[committed] hppa: Fix integer REG+D address reloads

2023-11-22 Thread John David Anglin
Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11.  Fixes testcase
in PR.  Committed to trunk.

Dave
---

hppa: Fix integer REG+D address reloads

I made a mistake in the previous change to integer_store_memory_operand.
There is no support in pa_emit_move_sequence to handle secondary reloads of
integer REG+D instructions.  Further, the Q constraint is used for some
non-simple instructions (movb and addib).  Thus, we need to return true
when reload is in progress.

2023-11-22  John David Anglin  

gcc/ChangeLog:

PR target/112617
* config/pa/predicates.md (integer_store_memory_operand): Return
true for REG+D addresses when reload_in_progress is true.

diff --git a/gcc/config/pa/predicates.md b/gcc/config/pa/predicates.md
index 1b50020e1de..4c07c0a3828 100644
--- a/gcc/config/pa/predicates.md
+++ b/gcc/config/pa/predicates.md
@@ -308,6 +308,13 @@
 
   if (reg_plus_base_memory_operand (op, mode))
 {
+  /* There is no support for handling secondary reloads of integer
+REG+D instructions in pa_emit_move_sequence.  Further, the Q
+constraint is used in more than simple move instructions.  So,
+we must return true and let reload handle the reload.  */
+  if (reload_in_progress)
+   return true;
+
   /* Extract CONST_INT operand.  */
   if (GET_CODE (op) == SUBREG)
op = SUBREG_REG (op);




Re: [PATCH 2/2] bugzilla: remove `gcc-bugs@` mailing list address

2023-11-22 Thread Joseph Myers
On Mon, 20 Nov 2023, Ben Boeckel wrote:

> Bugzilla is preferred today.
> 
> ChangeLog:
> 
>   * config-ml.in: Replace gcc-bugs@ with Bugzilla link.
>   * symlink-tree: Replace gcc-bugs@ with Bugzilla link.

I don't think we should use a URL that redirects (i.e. 
https://gcc.gnu.org/bugzilla should preferably have a trailing '/'), and 
arguably we should use https://gcc.gnu.org/bugs/ as the URL; that's the 
preferred one to point people to for bugs in the compilers themselves, 
since it gives more instructions on bug reporting (though those 
instructions may be less relevant for bugs in these files).

codingconventions.html claims that symlink-tree is "copied from mainline 
automake".  That is, I think, out-of-date information: automake's 
contrib/multilib/README says "The master (and probably more up-to-date) 
copies of the 'config-ml.in' and 'symlink-tree' files are maintained in 
the GCC development tree".  But it does indicate that 
codingconventions.html itself should be updated to stop suggesting 
symlink-tree is maintained elsewhere.

> libcpp/ChangeLog:
> 
>   * configure: Replace gcc-bugs@ with Bugzilla link.
>   * configure.ac: Replace gcc-bugs@ with Bugzilla link.
> 
> libdecnumber/ChangeLog:
> 
>   * configure: Replace gcc-bugs@ with Bugzilla link.
>   * configure.ac: Replace gcc-bugs@ with Bugzilla link.

I hope the configure changes are the same as you get with regeneration 
with the right autoconf version, and so should be described as 
regeneration in the ChangeLog entries.

-- 
Joseph S. Myers
jos...@codesourcery.com


[pushed] wwwdocs: branching: No longer refer to buildstat.html

2023-11-22 Thread Gerald Pfeifer
Thomas spotted this (among others) not being necessary any longer and 
kindly reported it.

Pushed.
---
 htdocs/branching.html | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/htdocs/branching.html b/htdocs/branching.html
index 0d48dce1..23ff92e8 100644
--- a/htdocs/branching.html
+++ b/htdocs/branching.html
@@ -52,9 +52,6 @@ populate it with initial copies of changes.html and
 based on the previous release branch to the directory corresponding to
 the newly created release branch.

-Add buildstat.html and update the toplevel
-buildstat.html accordingly.
-
 Update the toplevel index.html page to show the new active
 release branch, the current release series, and active development
 (mainline).  Update the version and development stage for mainline.
-- 
2.42.1


Re: [PATCH] Clean up by_pieces_ninsns

2023-11-22 Thread Richard Sandiford
"Kewen.Lin"  writes:
> Hi,
>
> on 2023/11/15 10:26, HAO CHEN GUI wrote:
>> Hi,
>>   This patch cleans up by_pieces_ninsns and does the following things.
>> 1. Do the length and alignment adjustment for by pieces compare when
>> overlap operation is enabled.
>> 2. Remove unnecessary mov_optab checks.
>> 
>>   Bootstrapped and tested on x86 and powerpc64-linux BE and LE with
>> no regressions. Is this OK for trunk?
>> 
>> Thanks
>> Gui Haochen
>> 
>> ChangeLog
>> Clean up by_pieces_ninsns
>> 
>> The by pieces compare can be implemented by overlapped operations. So
>> it should be taken into consideration when doing the adjustment for
>> overlap operations.  The mode returned from
>> widest_fixed_size_mode_for_size is already checked with mov_optab in
>> by_pieces_mode_supported_p called by widest_fixed_size_mode_for_size.
>> So there is no need to check mov_optab again in by_pieces_ninsns.
>> The patch fixes these issues.
>> 
>> gcc/
>>  * expr.cc (by_pieces_ninsns): Include by pieces compare when
>>  doing the adjustment for overlap operations.  Remove unnecessary
>>  mov_optab check.
>> 
>> patch.diff
>> diff --git a/gcc/expr.cc b/gcc/expr.cc
>> index 3e2a678710d..7cb2c935177 100644
>> --- a/gcc/expr.cc
>> +++ b/gcc/expr.cc
>> @@ -1090,18 +1090,15 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned int align,
>>unsigned HOST_WIDE_INT n_insns = 0;
>>fixed_size_mode mode;
>> 
>> -  if (targetm.overlap_op_by_pieces_p () && op != COMPARE_BY_PIECES)
>> +  if (targetm.overlap_op_by_pieces_p ())
>>  {
>>/* NB: Round up L and ALIGN to the widest integer mode for
>>   MAX_SIZE.  */
>>mode = widest_fixed_size_mode_for_size (max_size, op);
>> -  if (optab_handler (mov_optab, mode) != CODE_FOR_nothing)
>
> These changes are on generic code, so not a review.  :)
>
> If it's guaranteed previously, maybe we can replace it with an assertion
> like: gcc_assert (optab_handler (mov_optab, mode) != CODE_FOR_nothing);

Yeah, sounds OK to me FWIW.  I suppose the counter-argument is that
nothing here directly relies on the move optab.  It's just checking on
behalf of later code, which is now done by widest_fixed_size_mode_for_size
instead.

So the patch as posted is OK for trunk too, except that:

>
>> -{
>> -  unsigned HOST_WIDE_INT up = ROUND_UP (l, GET_MODE_SIZE (mode));
>> -  if (up > l)
>> -l = up;
>> -  align = GET_MODE_ALIGNMENT (mode);
>> -}
>> +  unsigned HOST_WIDE_INT up = ROUND_UP (l, GET_MODE_SIZE (mode));
>> +if (up > l)
>> +  l = up;
>> +  align = GET_MODE_ALIGNMENT (mode);

the indentation looks off here (the "if" is indented differently from the
first and last statements).
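
As a rough illustration of what the rounding step buys, here is a toy model of the instruction count. It is an assumption-laden sketch, not the GCC function: it ignores alignment and mode availability and simply treats pieces as power-of-two sizes from max_piece down to 1. The names round_up and count_pieces are invented.

```c
#include <assert.h>
#include <stddef.h>

/* Round n up to the next multiple of m.  */
static size_t round_up(size_t n, size_t m)
{
    assert(m > 0);
    return (n + m - 1) / m * m;
}

/* Count pieces needed to cover len bytes.  With overlap enabled the
   length is first rounded up to a multiple of the widest piece, so the
   tail is handled by one overlapping wide operation instead of several
   narrow ones.  */
static size_t count_pieces(size_t len, size_t max_piece, int overlap)
{
    size_t n = 0;
    size_t piece;

    if (overlap)
        len = round_up(len, max_piece);

    for (piece = max_piece; piece >= 1 && len > 0; piece /= 2) {
        n += len / piece;
        len %= piece;
    }
    return n;
}
```

Under this model, 7 bytes with a widest piece of 4 take three operations (4 + 2 + 1) without overlap and two with it.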

Thanks,
Richard

>>  }
>> 
>>align = alignment_for_piecewise_move (MOVE_MAX_PIECES, align);
>> @@ -1109,12 +1106,10 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned int align,
>>while (max_size > 1 && l > 0)
>>  {
>>mode = widest_fixed_size_mode_for_size (max_size, op);
>> -  enum insn_code icode;
>> 
>>unsigned int modesize = GET_MODE_SIZE (mode);
>> 
>> -  icode = optab_handler (mov_optab, mode);
>
> ... likewise.
>
> BR,
> Kewen
>
>> -  if (icode != CODE_FOR_nothing && align >= GET_MODE_ALIGNMENT (mode))
>> +  if (align >= GET_MODE_ALIGNMENT (mode))
>>  {
>>unsigned HOST_WIDE_INT n_pieces = l / modesize;
>>l %= modesize;
>>


[PATCH V2 3/3] OpenMP: Use enumerators for names of trait-sets and traits

2023-11-22 Thread Sandra Loosemore
This patch introduces enumerators to represent trait-set names and
trait names, which makes it easier to use tables to control other
behavior and for switch statements to dispatch on the tags.  The tags
are stored in the same place in the TREE_LIST structure (OMP_TSS_ID or
OMP_TS_ID) and are encoded there as integer constants.
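
The general shape of the idea can be sketched as follows. This is a simplified stand-in rather than the code in the patch, and all identifiers here are hypothetical. An enum gives each trait-set a code, a table indexed by that code holds the names, and a lookup function maps a parsed name back to its code so later logic can switch on it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical trait-set codes; the last entry doubles as "not found".  */
enum toy_tss_code {
    TOY_TSS_CONSTRUCT,
    TOY_TSS_DEVICE,
    TOY_TSS_IMPLEMENTATION,
    TOY_TSS_USER,
    TOY_TSS_INVALID
};

/* Name table indexed by toy_tss_code.  */
static const char *const toy_tss_map[] = {
    "construct", "device", "implementation", "user"
};

/* Map a trait-set name to its code, or TOY_TSS_INVALID if unknown.  */
static enum toy_tss_code toy_lookup_tss_code(const char *name)
{
    int i;
    assert(name != NULL);
    for (i = 0; i < TOY_TSS_INVALID; i++)
        if (strcmp(name, toy_tss_map[i]) == 0)
            return (enum toy_tss_code) i;
    return TOY_TSS_INVALID;
}
```

Keeping the names in one table means the parser and any diagnostics can share it instead of each carrying its own string list.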

gcc/ChangeLog
* omp-general.h (enum omp_tss_code): New.
(enum omp_ts_code): New.
(enum omp_tp_type): New.
(omp_tss_map): New.
(struct omp_ts_info): New.
(omp_ts_map): New.
(OMP_TSS_CODE, OMP_TSS_NAME): New.
(OMP_TS_CODE, OMP_TS_NAME): New.
(make_trait_set_selector, make_trait_selector): Adjust declarations.
(omp_construct_traits_to_codes): Likewise.
(omp_context_selector_set_compare): Likewise.
(omp_get_context_selector): Likewise.
(omp_get_context_selector_list): New.
(omp_lookup_tss_code): New.
(omp_lookup_ts_code): New.
* omp-general.cc (omp_construct_traits_to_codes): Pass length in
as argument instead of returning it.  Make it table-driven.
(omp_tss_map): New.
(kind_properties, vendor_properties, extension_properties): New.
(atomic_default_mem_order_properties): New.
(omp_ts_map): New.
(omp_check_context_selector): Simplify lookup and dispatch logic.
(omp_mark_declare_variant): Adjust for new representation.
(make_trait_set_selector, make_trait_selector): Adjust for new
representations.
(omp_context_selector_matches): Simplify dispatch logic.  Avoid
fixed-sized buffers and adjust call to omp_construct_traits_to_codes.
(omp_context_selector_props_compare): Adjust for new representations
and simplify dispatch logic.
(omp_context_selector_set_compare): Likewise.
(omp_context_selector_compare): Likewise.
(omp_get_context_selector): Adjust for new representations, and split
out...
(omp_get_context_selector_list): New function.
(omp_lookup_tss_code): New.
(omp_lookup_ts_code): New.
(omp_context_compute_score): Adjust for new representations.  Avoid
fixed-sized buffers and magic numbers.  Adjust call to
omp_construct_traits_to_codes.
* gimplify.cc (omp_construct_selector_matches): Avoid use of
fixed-size buffer.  Adjust call to omp_construct_traits_to_codes.

gcc/c/ChangeLog
* c-parser.cc (omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(c_parser_omp_context_selector): Adjust for new representations
and simplify dispatch logic.
(c_parser_omp_context_selector_specification): Likewise.
(c_finish_omp_declare_variant): Adjust for new representations.

gcc/cp/ChangeLog
* decl.cc (omp_declare_variant_finalize_one): Adjust for new
representations.
* parser.cc (omp_construct_selectors): Delete.
(omp_device_selectors): Delete.
(omp_implementation_selectors): Delete.
(omp_user_selectors): Delete.
(cp_parser_omp_context_selector): Adjust for new representations
and simplify dispatch logic.
(cp_parser_omp_context_selector_specification): Likewise.
* pt.cc (tsubst_attribute): Adjust for new representations.

gcc/fortran/ChangeLog
* trans-openmp.cc (gfc_trans_omp_declare_variant): Adjust for
new representations.
---
 gcc/c/c-parser.cc   | 192 -
 gcc/cp/decl.cc  |   8 +-
 gcc/cp/parser.cc| 189 -
 gcc/cp/pt.cc|  15 +-
 gcc/fortran/trans-openmp.cc |  41 ++-
 gcc/gimplify.cc |  17 +-
 gcc/omp-general.cc  | 530 +++-
 gcc/omp-general.h   |  89 +-
 8 files changed, 590 insertions(+), 491 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index a2ff381e0c1..70c0e1828ca 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -24016,16 +24016,6 @@ c_parser_omp_declare_simd (c_parser *parser, enum pragma_context context)
 }
 }
 
-static const char *const omp_construct_selectors[] = {
-  "simd", "target", "teams", "parallel", "for", NULL };
-static const char *const omp_device_selectors[] = {
-  "kind", "isa", "arch", NULL };
-static const char *const omp_implementation_selectors[] = {
-  "vendor", "extension", "atomic_default_mem_order", "unified_address",
-  "unified_shared_memory", "dynamic_allocators", "reverse_offload", NULL };
-static const char *const omp_user_selectors[] = {
-  "condition", NULL };
-
 /* OpenMP 5.0:
 
trait-selector:
@@ -24038,7 +24028,8 @@ static const char *const omp_user_selectors[] = {
trait-selector-set SET.  */
 
 static tree
-c_parser_omp_context_selector (c_parser *parser, tree set, tree parms)
+c_parser_omp_context_selector (c_parser *parser, enum 

[PATCH 2/2] c-family: rename warn_for_address_or_pointer_of_packed_member

2023-11-22 Thread Jason Merrill
Following the last patch, let's rename the functions to reflect the change
in behavior.

gcc/c-family/ChangeLog:

* c-warn.cc (check_address_or_pointer_of_packed_member):
Rename to check_address_of_packed_member.
(check_and_warn_address_or_pointer_of_packed_member):
Rename to check_and_warn_address_of_packed_member.
(warn_for_address_or_pointer_of_packed_member):
Rename to warn_for_address_of_packed_member.
* c-common.h: Adjust.

gcc/c/ChangeLog:

* c-typeck.cc (convert_for_assignment): Adjust call to
warn_for_address_of_packed_member.

gcc/cp/ChangeLog:

* call.cc (convert_for_arg_passing)
* typeck.cc (convert_for_assignment): Adjust call to
warn_for_address_of_packed_member.
---
 gcc/c-family/c-common.h |  2 +-
 gcc/c-family/c-warn.cc  | 32 ++--
 gcc/c/c-typeck.cc   |  4 ++--
 gcc/cp/call.cc  |  2 +-
 gcc/cp/typeck.cc|  2 +-
 5 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index b57e83d7c5d..9380452a93b 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1482,7 +1482,7 @@ extern void warnings_for_convert_and_check (location_t, tree, tree, tree);
 extern void c_do_switch_warnings (splay_tree, location_t, tree, tree, bool);
 extern void warn_for_omitted_condop (location_t, tree);
 extern bool warn_for_restrict (unsigned, tree *, unsigned);
-extern void warn_for_address_or_pointer_of_packed_member (tree, tree);
+extern void warn_for_address_of_packed_member (tree, tree);
 extern void warn_parm_array_mismatch (location_t, tree, tree);
 extern void maybe_warn_sizeof_array_div (location_t, tree, tree, tree, tree);
 extern void do_warn_array_compare (location_t, tree_code, tree, tree);
diff --git a/gcc/c-family/c-warn.cc b/gcc/c-family/c-warn.cc
index 2a399ba6d14..abe66dd3030 100644
--- a/gcc/c-family/c-warn.cc
+++ b/gcc/c-family/c-warn.cc
@@ -2991,13 +2991,13 @@ check_alignment_of_packed_member (tree type, tree field, bool rvalue)
   return NULL_TREE;
 }
 
-/* Return struct or union type if the right hand value, RHS
+/* Return struct or union type if the right hand value, RHS,
is an address which takes the unaligned address of packed member
of struct or union when assigning to TYPE.
Otherwise, return NULL_TREE.  */
 
 static tree
-check_address_or_pointer_of_packed_member (tree type, tree rhs)
+check_address_of_packed_member (tree type, tree rhs)
 {
   bool rvalue = true;
   bool indirect = false;
@@ -3042,14 +3042,12 @@ check_address_or_pointer_of_packed_member (tree type, tree rhs)
   return context;
 }
 
-/* Check and warn if the right hand value, RHS:
-   1. Is a pointer value which isn't aligned to a pointer type TYPE.
-   2. Is an address which takes the unaligned address of packed member
-  of struct or union when assigning to TYPE.
- */
+/* Check and warn if the right hand value, RHS,
+   is an address which takes the unaligned address of packed member
+   of struct or union when assigning to TYPE.  */
 
 static void
-check_and_warn_address_or_pointer_of_packed_member (tree type, tree rhs)
+check_and_warn_address_of_packed_member (tree type, tree rhs)
 {
   bool nop_p = false;
   tree orig_rhs;
@@ -3067,11 +3065,11 @@ check_and_warn_address_or_pointer_of_packed_member (tree type, tree rhs)
   if (TREE_CODE (rhs) == COND_EXPR)
 {
   /* Check the THEN path.  */
-  check_and_warn_address_or_pointer_of_packed_member
+  check_and_warn_address_of_packed_member
(type, TREE_OPERAND (rhs, 1));
 
   /* Check the ELSE path.  */
-  check_and_warn_address_or_pointer_of_packed_member
+  check_and_warn_address_of_packed_member
(type, TREE_OPERAND (rhs, 2));
 }
   else
@@ -3095,7 +3093,7 @@ check_and_warn_address_or_pointer_of_packed_member (tree type, tree rhs)
}
 
   tree context
-   = check_address_or_pointer_of_packed_member (type, rhs);
+   = check_address_of_packed_member (type, rhs);
   if (context)
{
  location_t loc = EXPR_LOC_OR_LOC (rhs, input_location);
@@ -3107,14 +3105,12 @@ check_and_warn_address_or_pointer_of_packed_member (tree type, tree rhs)
 }
 }
 
-/* Warn if the right hand value, RHS:
-   1. Is a pointer value which isn't aligned to a pointer type TYPE.
-   2. Is an address which takes the unaligned address of packed member
-  of struct or union when assigning to TYPE.
-*/
+/* Warn if the right hand value, RHS,
+   is an address which takes the unaligned address of packed member
+   of struct or union when assigning to TYPE.  */
 
 void
-warn_for_address_or_pointer_of_packed_member (tree type, tree rhs)
+warn_for_address_of_packed_member (tree type, tree rhs)
 {
   if (!warn_address_of_packed_member)
 return;
@@ -3123,7 +3119,7 @@ warn_for_address_or_pointer_of_packed_member (tree type, tree rhs)
   if (!POINTER_TYPE_P (type))
 return;
 
-  

[pushed] wwwdocs: releasing: No longer refer to buildstat.html

2023-11-22 Thread Gerald Pfeifer
That's the counterpart to the branching.html change I just made, also 
reported by Thomas.

Pushed.

Gerald
---
 htdocs/releasing.html | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/htdocs/releasing.html b/htdocs/releasing.html
index 1cd56f72..c7365e64 100644
--- a/htdocs/releasing.html
+++ b/htdocs/releasing.html
@@ -85,9 +85,6 @@ web pages.
 Update the version numbers of the current and future releases on
 the main web page, and add a proper news item there as well.
 
-For a new major release, ensure that the build status page is present
-and add a link from the main buildstat.html page.
-
 Generate online documentation for the new release with
 update_web_docs_git.  The appropriate command to run (as gccadmin)
 to generate the documentation would be scripts/update_web_docs_git
-- 
2.42.1


Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Jeff Law




On 11/22/23 07:24, Christoph Müllner wrote:

On Wed, Nov 22, 2023 at 2:52 PM 钟居哲  wrote:


I am totally OK with approving theadvector for GCC-14 before stage 3 closes,
as long as it doesn't touch the current RVV code too much and binutils
supports theadvector.

I have provided a draft approach:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
which, as it turns out, doesn't need to change any code in vector.md.
I strongly suggest following this draft.  I can actively review theadvector
during stage 3, and hopefully help you land theadvector in GCC-14.


I see now two approaches:
1) Let GCC emit RVV instructions for XTheadVector for instructions
that are in both
2) Use the ASM_OUTPUT_OPCODE hook to output "th." for these instructions

No doubt, the ASM_OUTPUT_OPCODE hook approach is better than our
format-string approach, but would 1) not be the even better
solution? It would also mean, that not a single test case is required
for these overlapping instructions (only a few tests that ensure that
we don't emit RVV instructions that are not available in
XTheadVector). Besides that, letting GCC emit RVV instructions for
XTheadVector is a very clever idea, because it fully utilizes the
fact that both extensions overlap to a huge degree.

The ASM_OUTPUT_OPCODE approach could lead to an issue if we enable
XTheadVector with any other vector extension, say Zvfoo. In this case
the Zvfoo instructions will all be prefixed as well with "th.". I know
that it is not likely to run into this problem (such a machine does not
exist in real hardware), but it is possible to trigger this issue easily
and approach 1) would not have this potential issue.

I'm not a big fan of the ASM_OUTPUT_OPCODE approach.  While it is
simple, I worry a bit about it from a long term maintenance standpoint.
As you note we could well end up at some point with an extension that
has a mnemonic starting with "v" that would blow up.  But I certainly
see the appeal of such a simple test to support thead vector.
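
To make the maintenance worry concrete, here is a toy version of a "prefix anything that starts with v" rule. It is only a sketch of the concern, not the proposed hook: toy_opcode and the mnemonic "vfoo.vv" are invented for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy rule: when the (hypothetical) xtheadvector flag is set, any
   mnemonic starting with 'v' gets a "th." prefix.  Returns a pointer to
   a static buffer that is overwritten by the next call.  */
static const char *toy_opcode(const char *mnemonic, int xtheadvector)
{
    static char buf[64];
    assert(mnemonic != NULL);
    if (xtheadvector && mnemonic[0] == 'v')
        snprintf(buf, sizeof buf, "th.%s", mnemonic);
    else
        snprintf(buf, sizeof buf, "%s", mnemonic);
    return buf;
}
```

With the flag set, "vadd.vv" becomes "th.vadd.vv" as intended, but an instruction from some other extension such as the made-up "vfoo.vv" is rewritten to "th.vfoo.vv" as well, which is the failure mode described above.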


Given there are at least 3 approaches that can fix that problem (%^, 
assembler dialect or ASM_OUTPUT_OPCODE), maybe we could set that 
discussion aside in the immediate term and see if there are other issues 
that are potentially more substantial.





--



More generally, I think I need to soften my prior statement about 
deferring this to gcc-15.  This code was submitted in time for the 
gcc-14 deadline, so it should be evaluated just like we do anything else 
that makes the deadline.  There are various criteria we use to evaluate 
if something should get integrated and we should just work through this 
series like we always do and not treat it specially in any way.



jeff


Re: [PATCH v5 1/1] c++: Initial support for P0847R7 (Deducing This) [PR102609]

2023-11-22 Thread waffl3x
> > > > /* Nonzero for FUNCTION_DECL means that this decl is a non-static
> > > > - member function. */
> > > > + member function, use DECL_IOBJ_MEMBER_FUNC_P instead. */
> > > > #define DECL_NONSTATIC_MEMBER_FUNCTION_P(NODE) \
> > > > (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
> > > > 
> > > > +/* Nonzero for FUNCTION_DECL means that this decl is an implicit object
> > > > + member function. */
> > > > +#define DECL_IOBJ_MEMBER_FUNC_P(NODE) \
> > > > + (TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE)
> > > 
> > > I was thinking to rename DECL_NONSTATIC_MEMBER_FUNCTION_P rather than
> > > add a new, equivalent one. And then go through all the current uses of
> > > the old macro to decide whether they mean IOBJ or OBJECT.
> > 
> > I figure it would be easier to make that transition if there's a clear
> > line between old versus new. To be clear, my intention is for the old
> > macro to be removed once all the uses of it are changed over to the new
> > macro. I can still remove it for the patch if you like but having both
> > and removing the old one later seems better to me.
> 
> 
> Hmm, I think changing all the uses is a necessary part of this change.
> I suppose it could happen before the main patch, if you'd prefer, but it
> seems more straightforward to include it.
> 

I had meant to reply to this as well but forgot. I agree that it's
likely necessary, but I've only been changing them as I come across
things that don't work right rather than trying to evaluate them
through the code. Making changes to them without having a test case
that demonstrates that the case is definitely being handled incorrectly
is risky, especially for me since I don't have a full understanding of
the code base. I would rather only change ones that are evidently
wrong, and defer the rest to someone else who knows the code base
better.

With that said, I have been neglecting replacing uses of the old macro,
but I now realize that's just creating more work for whoever is
evaluating the rest of them. Going forward I will make sure I replace
the old macro when I am fairly certain it should be.

Alex



RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store

2023-11-22 Thread Li, Pan2
Committed, thanks all.

Pan

-Original Message-
From: Richard Sandiford  
Sent: Thursday, November 23, 2023 2:39 AM
To: Li, Pan2 
Cc: Richard Biener ; juzhe.zh...@rivai.ai; Wang, 
Yanzhang ; kito.ch...@gmail.com; Jeff Law 
; gcc-patches@gcc.gnu.org
Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
store

"Li, Pan2"  writes:
>> It looks like Jeff approved the patch?
>
> Yes, just would like to double check the way of this patch is expected as 
> following the suggestion of Richard S.

Yeah, it looks good to me, thanks.

Richard

> Pan
>
> -Original Message-
> From: Richard Biener  
> Sent: Wednesday, November 22, 2023 4:02 PM
> To: Li, Pan2 
> Cc: richard.sandif...@arm.com; juzhe.zh...@rivai.ai; Wang, Yanzhang 
> ; kito.ch...@gmail.com; Jeff Law 
> ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
> store
>
> On Wed, Nov 22, 2023 at 3:30 AM Li, Pan2  wrote:
>>
>> Hi Richard S,
>>
>> Thanks a lot for reviewing and comments. May I know is there any concern or 
>> further comments for landing this patch to GCC-14?
>
> It looks like Jeff approved the patch?
>
> Richard.
>
>> Pan
>>
>> -Original Message-
>> From: Li, Pan2
>> Sent: Wednesday, November 15, 2023 8:25 AM
>> To: gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
>> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandif...@arm.com; 
>> Jeff Law 
>> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read 
>> < store
>>
>> Sorry for disturbing, looks I have a typo for Richard S's email address, cc 
>> the right email address for awareness.
>>
>> Pan
>>
>> -Original Message-
>> From: Li, Pan2
>> Sent: Wednesday, November 15, 2023 8:18 AM
>> To: Jeff Law ; gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
>> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
>> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read 
>> < store
>>
>> > I wouldn't try to handle that case unless we had actual evidence it was
>> > useful to do so.  Just wanted to point out that unlike pseudos we can
>> > have multiple modes referencing the same memory location.
>>
>> Got the point here, thanks Jeff for emphasizing this.
>>
>> Pan
>>
>> -Original Message-
>> From: Jeff Law 
>> Sent: Tuesday, November 14, 2023 4:12 AM
>> To: Li, Pan2 ; gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
>> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
>> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read 
>> < store
>>
>>
>>
>> On 11/12/23 20:22, pan2...@intel.com wrote:
>> > From: Pan Li 
>> >
>> > Update in v4:
>> > * Merge upstream and removed some independent changes.
>> >
>> > Update in v3:
>> > * Take known_le instead of known_lt for vector size.
>> > * Return NULL_RTX when gap is not equal 0 and not constant.
>> >
>> > Update in v2:
>> > * Move vector type support to get_stored_val.
>> >
>> > Original log:
>> >
>> > This patch would like to allow the vector mode in the
>> > get_stored_val in the DSE. It is valid for the read
>> > rtx if and only if the read bitsize is less than the
>> > stored bitsize.
>> >
>> > Given below example code with
>> > --param=riscv-autovec-preference=fixed-vlmax.
>> >
>> > vuint8m1_t test () {
>> >   uint8_t arr[32] = {
>> >     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>> >     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>> >   };
>> >
>> >   return __riscv_vle8_v_u8m1(arr, 32);
>> > }
>> >
>> > Before this patch:
>> > test:
>> >   lui     a5,%hi(.LANCHOR0)
>> >   addi    sp,sp,-32
>> >   addi    a5,a5,%lo(.LANCHOR0)
>> >   li      a3,32
>> >   vl2re64.v       v2,0(a5)
>> >   vsetvli zero,a3,e8,m1,ta,ma
>> >   vs2r.v  v2,0(sp) <== Unnecessary store to stack
>> >   vle8.v  v1,0(sp) <== Ditto
>> >   vs1r.v  v1,0(a0)
>> >   addi    sp,sp,32
>> >   jr      ra
>> >
>> > After this patch:
>> > test:
>> >   lui     a5,%hi(.LANCHOR0)
>> >   addi    a5,a5,%lo(.LANCHOR0)
>> >   li      a4,32
>> >   addi    sp,sp,-32
>> >   vsetvli zero,a4,e8,m1,ta,ma
>> >   vle8.v  v1,0(a5)
>> >   vs1r.v  v1,0(a0)
>> >   addi    sp,sp,32
>> >   jr      ra
>> >
>> > Below tests are passed within this patch:
>> > * The risc-v regression test.
>> > * The x86 bootstrap and regression test.
>> > * The aarch64 regression test.
>> >
>> >   PR target/111720
>> >
>> > gcc/ChangeLog:
>> >
>> >   * dse.cc (get_stored_val): Allow vector mode if read size is
>> >   less than or equal to stored size.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >   * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
>> >   * 

Re: [PATCH] tree: Fix up try_catch_may_fallthru [PR112619]

2023-11-22 Thread Jakub Jelinek
On Wed, Nov 22, 2023 at 01:21:12PM +0100, Jakub Jelinek wrote:
> So, pedantically perhaps just assuming TRY_CATCH_EXPR where second argument
> is not STATEMENT_LIST to be the CATCH_EXPR/EH_FILTER_EXPR case could work
> for C++, but there are other FEs and it would be fragile (and weird, given
> that STATEMENT_LIST with single stmt in it vs. that stmt ought to be
> generally interchangeable).

Looking at other FEs, e.g. go/go-gcc.cc clearly has:
    stat_tree = build2_loc(location.gcc_location(), TRY_CATCH_EXPR,
                           void_type_node, stat_tree,
                           build2_loc(location.gcc_location(), CATCH_EXPR,
                                      void_type_node, NULL, except_tree));
so CATCH_EXPR is immediately the second operand of TRY_CATCH_EXPR.
d/toir.cc has:
    /* Back-end expects all catches in a TRY_CATCH_EXPR to be enclosed in a
       statement list, however pop_stmt_list may optimize away the list
       if there is only a single catch to push.  */
    if (TREE_CODE (catches) != STATEMENT_LIST)
      {
        tree stmt_list = alloc_stmt_list ();
        append_to_statement_list_force (catches, &stmt_list);
        catches = stmt_list;
      }

    add_stmt (build2 (TRY_CATCH_EXPR, void_type_node, trybody, catches));
so I assume it ran into the try_catch_may_fallthru issue (because the
gimplifier clearly doesn't require that).
rust/rust-gcc.cc copies go-gcc.cc and also creates CATCH_EXPR directly in
TRY_CATCH_EXPR's operand.

Note, the only time one runs into the ICE is when the first operand (i.e.
try body) doesn't fall thru, otherwise the function returns true early.
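
A minimal sketch of the point above, that a consumer should accept the
handler either wrapped in a statement list or as the bare operand.
This is a toy model and not GCC's tree API: the Code enum, the Node
struct and first_handler are all hypothetical names invented for the
illustration.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for tree codes (hypothetical, not GCC's enum tree_code).
enum class Code { StatementList, CatchExpr, EhFilterExpr, Other };

// Toy tree node: a code plus, for statement lists, the contained nodes.
struct Node
{
  Code code;
  std::vector<Node> stmts;
};

// Return the first handler of a try/catch, whether the front end wrapped
// it in a statement list (as D does) or passed it directly (as Go and
// Rust do).  Returns nullptr for an empty list.
const Node* first_handler(const Node& handlers)
{
  if (handlers.code == Code::StatementList)
    return handlers.stmts.empty() ? nullptr : &handlers.stmts.front();
  return &handlers;
}
```

Both shapes then classify the same way: a bare CatchExpr node and a
one-element list holding a CatchExpr yield a handler with the same code.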

Jakub



[PATCHv2] Clean up by_pieces_ninsns

2023-11-22 Thread HAO CHEN GUI
Hi,
  This patch cleans up by_pieces_ninsns and does the following things.
1. Do the length and alignment adjustment for by pieces compare when
overlap operation is enabled.
2. Replace unnecessary mov_optab checks with gcc assertions.

  Compared to the last version, the main change is to replace unnecessary
mov_optab checks with gcc assertions and fix the indentation.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with
no regressions. Is this OK for trunk?

Thanks
Gui Haochen

ChangeLog
Clean up by_pieces_ninsns

The by pieces compare can be implemented by overlapped operations. So
it should be taken into consideration when doing the adjustment for
overlap operations.  The mode returned from
widest_fixed_size_mode_for_size is already checked with mov_optab in
by_pieces_mode_supported_p called by widest_fixed_size_mode_for_size.
So there is no need to check mov_optab again in by_pieces_ninsns.  The
patch fixes these issues.
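
For readers unfamiliar with the overlap adjustment, the effect of
rounding the length up can be sketched as below.  This is a simplified
model and not the code in expr.cc: it assumes power-of-two piece sizes
from `widest` down to 1 byte and ignores alignment, and the names
round_up and count_pieces are made up for the example.

```cpp
#include <cassert>

// Round LEN up to a multiple of SIZE (SIZE must be non-zero).
unsigned long round_up(unsigned long len, unsigned long size)
{
  return (len + size - 1) / size * size;
}

// Simplified estimate of the number of by-pieces operations for LEN bytes.
// WIDEST is the widest piece size and must be a power of two.  With
// OVERLAP, LEN is first rounded up to a multiple of WIDEST, so only
// widest-size pieces are needed.
unsigned long count_pieces(unsigned long len, unsigned long widest,
                           bool overlap)
{
  assert(widest != 0 && (widest & (widest - 1)) == 0);
  if (overlap)
    len = round_up(len, widest);
  unsigned long n = 0;
  for (unsigned long size = widest; size >= 1 && len > 0; size /= 2)
    {
      n += len / size;
      len %= size;
    }
  return n;
}
```

For 15 bytes with 8-byte pieces this model gives 4 operations
(8 + 4 + 2 + 1) without overlap and 2 operations with overlap, since
15 is rounded up to 16.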

gcc/
* expr.cc (by_pieces_ninsns): Include by pieces compare when
doing the adjustment for overlap operations.  Replace mov_optab
checks with gcc assertions.

patch.diff
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 556bcf7ef59..ffd18fe43cc 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -1090,18 +1090,16 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned int align,
   unsigned HOST_WIDE_INT n_insns = 0;
   fixed_size_mode mode;

-  if (targetm.overlap_op_by_pieces_p () && op != COMPARE_BY_PIECES)
+  if (targetm.overlap_op_by_pieces_p ())
 {
   /* NB: Round up L and ALIGN to the widest integer mode for
 MAX_SIZE.  */
   mode = widest_fixed_size_mode_for_size (max_size, op);
-  if (optab_handler (mov_optab, mode) != CODE_FOR_nothing)
-   {
- unsigned HOST_WIDE_INT up = ROUND_UP (l, GET_MODE_SIZE (mode));
- if (up > l)
-   l = up;
- align = GET_MODE_ALIGNMENT (mode);
-   }
+  gcc_assert (optab_handler (mov_optab, mode) != CODE_FOR_nothing);
+  unsigned HOST_WIDE_INT up = ROUND_UP (l, GET_MODE_SIZE (mode));
+  if (up > l)
+   l = up;
+  align = GET_MODE_ALIGNMENT (mode);
 }

   align = alignment_for_piecewise_move (MOVE_MAX_PIECES, align);
@@ -1109,12 +1107,11 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned int align,
   while (max_size > 1 && l > 0)
 {
   mode = widest_fixed_size_mode_for_size (max_size, op);
-  enum insn_code icode;
+  gcc_assert (optab_handler (mov_optab, mode) != CODE_FOR_nothing);

   unsigned int modesize = GET_MODE_SIZE (mode);

-  icode = optab_handler (mov_optab, mode);
-  if (icode != CODE_FOR_nothing && align >= GET_MODE_ALIGNMENT (mode))
+  if (align >= GET_MODE_ALIGNMENT (mode))
{
  unsigned HOST_WIDE_INT n_pieces = l / modesize;
  l %= modesize;


Re: [committed] d: Merge upstream dmd ff57fec515, druntime ff57fec515, phobos 17bafda79.

2023-11-22 Thread Iain Buclaw
Excerpts from Rainer Orth's message of November 21, 2023 4:59 pm:
> Hi Iain,
> 
>> This patch merges the D front-end and runtime library with upstream dmd
>> ff57fec515, and the standard library with phobos 17bafda79.
>>
>> Synchronizing with the upstream release candidate of v2.106.0.
>>
>> D front-end changes:
>>
>> - Import dmd v2.106.0-rc.1.
>> - New'ing multi-dimensional arrays are now converted to a single
>>   template call `_d_newarraymTX'.
>>
>> D runtime changes:
>>
>> - Import druntime v2.106.0-rc.1.
>>
>> Phobos changes:
>>
>> - Import phobos v2.106.0-rc.1.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu/-m32, committed
>> to mainline.
> 
> either this patch or the previous one broke D bootstrap with GCC 9.  On
> both i386-pc-solaris2.11 with gdc 9.4.0 and sparc-sun-solaris2.11 with
> gdc 9.3.0, stage 1 d21 fails to link with
> 
> Undefined   first referenced
>  symbol in file
> _D3dmd4root11stringtable34__T11StringValueTC3dmd5mtype4TypeZ11StringValue7lstringMFNaNbNiNjZPa
>  d/func.o
> _D3dmd4root11stringtable34__T11StringValueTC3dmd5mtype4TypeZ11StringValue8toDcharsMxFNaNbNiNjZPxa
>  d/func.o
> _D3dmd4root11stringtable34__T11StringValueTC3dmd5mtype4TypeZ11StringValue8toStringMxFNaNbNiNjZAxa
>  d/func.o
> ld: fatal: symbol referencing errors
> collect2: error: ld returned 1 exit status
> make[3]: *** [/vol/gcc/src/hg/master/local/gcc/d/Make-lang.in:236: d21] Error 1
> 
> I'm now running bootstraps with gdc 11.1.0 instead, which seems to work:
> in both cases, stage 1 d21 did link.
> 
> If this is intentional, install.texi should be updated accordingly.
> 

Thanks Rainer,

I don't think this should happen if we can help it just yet.  I'll have
a look to see which specific upstream change might have caused it.

Iain.


Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < store

2023-11-22 Thread Richard Sandiford
"Li, Pan2"  writes:
>> It looks like Jeff approved the patch?
>
> Yes, just would like to double check the way of this patch is expected as 
> following the suggestion of Richard S.

Yeah, it looks good to me, thanks.

Richard

> Pan
>
> -Original Message-
> From: Richard Biener  
> Sent: Wednesday, November 22, 2023 4:02 PM
> To: Li, Pan2 
> Cc: richard.sandif...@arm.com; juzhe.zh...@rivai.ai; Wang, Yanzhang 
> ; kito.ch...@gmail.com; Jeff Law 
> ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read < 
> store
>
> On Wed, Nov 22, 2023 at 3:30 AM Li, Pan2  wrote:
>>
>> Hi Richard S,
>>
>> Thanks a lot for reviewing and comments. May I know is there any concern or 
>> further comments for landing this patch to GCC-14?
>
> It looks like Jeff approved the patch?
>
> Richard.
>
>> Pan
>>
>> -Original Message-
>> From: Li, Pan2
>> Sent: Wednesday, November 15, 2023 8:25 AM
>> To: gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
>> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandif...@arm.com; 
>> Jeff Law 
>> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read 
>> < store
>>
>> Sorry for disturbing, looks I have a typo for Richard S's email address, cc 
>> the right email address for awareness.
>>
>> Pan
>>
>> -Original Message-
>> From: Li, Pan2
>> Sent: Wednesday, November 15, 2023 8:18 AM
>> To: Jeff Law ; gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
>> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
>> Subject: RE: [PATCH v4] DSE: Allow vector type for get_stored_val when read 
>> < store
>>
>> > I wouldn't try to handle that case unless we had actual evidence it was
>> > useful to do so.  Just wanted to point out that unlike pseudos we can
>> > have multiple modes referencing the same memory location.
>>
>> Got the point here, thanks Jeff for emphasizing this.
>>
>> Pan
>>
>> -Original Message-
>> From: Jeff Law 
>> Sent: Tuesday, November 14, 2023 4:12 AM
>> To: Li, Pan2 ; gcc-patches@gcc.gnu.org
>> Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang ; 
>> kito.ch...@gmail.com; richard.guent...@gmail.com; richard.sandiford@arm.com2
>> Subject: Re: [PATCH v4] DSE: Allow vector type for get_stored_val when read 
>> < store
>>
>>
>>
>> On 11/12/23 20:22, pan2...@intel.com wrote:
>> > From: Pan Li 
>> >
>> > Update in v4:
>> > * Merge upstream and removed some independent changes.
>> >
>> > Update in v3:
>> > * Take known_le instead of known_lt for vector size.
>> > * Return NULL_RTX when gap is not equal 0 and not constant.
>> >
>> > Update in v2:
>> > * Move vector type support to get_stored_val.
>> >
>> > Original log:
>> >
>> > This patch would like to allow the vector mode in the
>> > get_stored_val in the DSE. It is valid for the read
>> > rtx if and only if the read bitsize is less than the
>> > stored bitsize.
>> >
>> > Given below example code with
>> > --param=riscv-autovec-preference=fixed-vlmax.
>> >
>> > vuint8m1_t test () {
>> >   uint8_t arr[32] = {
>> >     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>> >     1, 2, 7, 1, 3, 4, 5, 3, 1, 0, 1, 2, 4, 4, 9, 9,
>> >   };
>> >
>> >   return __riscv_vle8_v_u8m1(arr, 32);
>> > }
>> >
>> > Before this patch:
>> > test:
>> >   lui     a5,%hi(.LANCHOR0)
>> >   addi    sp,sp,-32
>> >   addi    a5,a5,%lo(.LANCHOR0)
>> >   li      a3,32
>> >   vl2re64.v       v2,0(a5)
>> >   vsetvli zero,a3,e8,m1,ta,ma
>> >   vs2r.v  v2,0(sp) <== Unnecessary store to stack
>> >   vle8.v  v1,0(sp) <== Ditto
>> >   vs1r.v  v1,0(a0)
>> >   addi    sp,sp,32
>> >   jr      ra
>> >
>> > After this patch:
>> > test:
>> >   lui     a5,%hi(.LANCHOR0)
>> >   addi    a5,a5,%lo(.LANCHOR0)
>> >   li      a4,32
>> >   addi    sp,sp,-32
>> >   vsetvli zero,a4,e8,m1,ta,ma
>> >   vle8.v  v1,0(a5)
>> >   vs1r.v  v1,0(a0)
>> >   addi    sp,sp,32
>> >   jr      ra
>> >
>> > Below tests are passed within this patch:
>> > * The risc-v regression test.
>> > * The x86 bootstrap and regression test.
>> > * The aarch64 regression test.
>> >
>> >   PR target/111720
>> >
>> > gcc/ChangeLog:
>> >
>> >   * dse.cc (get_stored_val): Allow vector mode if read size is
>> >   less than or equal to stored size.
>> >
>> > gcc/testsuite/ChangeLog:
>> >
>> >   * gcc.target/riscv/rvv/base/pr111720-0.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-1.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-10.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-2.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-3.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-4.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-5.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-6.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-7.c: New test.
>> >   * gcc.target/riscv/rvv/base/pr111720-8.c: 

Re: [PATCH 2/2] testsuite/unroll-8: Disable vectorization for varibale-factor targets

2023-11-22 Thread Jeff Law




On 11/21/23 16:27, Palmer Dabbelt wrote:

The vectorizer picks up these loops and disables unrolling on targets
with variable vector factors.  That results in better code here, but it
trips up the unrolling tests.  So just disable vectorization for these.

gcc/testsuite/ChangeLog:

PR target/112531
* gcc.dg/unroll-8.c: Disable vectorization on arm64 and riscv.

So probably the right check is to test for vector and
!vect_variable_length rather than doing something target specific for
aarch64/riscv.


Jeff


[PATCH] AArch64/testsuite: Use non-capturing parentheses with ccmp_1.c

2023-11-22 Thread Maciej W. Rozycki
Use non-capturing parentheses for the subexpressions used with 
`scan-assembler-times', to avoid a quirk with double-counting.

gcc/testsuite/
* gcc.target/aarch64/ccmp_1.c: Use non-capturing parentheses 
with `scan-assembler-times'.
---
Hi,

 Here's another one.  I realised my original regexp used to grep the tree 
for `scan-assembler-times' with subexpressions was too strict and with an 
updated pattern I found this second test case that does regress once the 
`scan-assembler-times' double-counting quirk has been fixed.

 As with the ARM change we don't need capturing parentheses here, usually 
used for back references, so let's just avoid the double-counting quirk 
altogether and make our matching here work whether the quirk has been 
fixed or not.
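
As an illustration of why a capturing group can double the count, here
is a small sketch in C++ rather than Tcl.  It assumes the quirk comes
from counting every element a regexp search returns (the whole match
plus one entry per capturing group); that is my reading of the
description above, and the helper count_returned_elements is
hypothetical, not part of the testsuite.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

// Sum, over all matches of PATTERN in TEXT, the number of elements each
// match reports: 1 for the whole match plus 1 per capturing group.
std::size_t count_returned_elements(const std::string& text,
                                    const std::string& pattern)
{
  const std::regex re(pattern);
  std::size_t total = 0;
  for (std::sregex_iterator it(text.begin(), text.end(), re), end;
       it != end; ++it)
    total += it->size();
  return total;
}
```

On a text with a single fcmpe line, the pattern with `(.)+` reports two
elements while the one with `(?:.)+` reports one, so a count based on
returned elements only equals the number of matches in the
non-capturing form.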

 Verified for the `aarch64-linux-gnu' target with the quirk fix submitted 
as  
and the aarch64.exp subset of the C language test suite.  OK to apply?

  Maciej
---
 gcc/testsuite/gcc.target/aarch64/ccmp_1.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

gcc-aarch64-test-ccmp_1-non-capturing.diff
Index: gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
===
--- gcc.orig/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
+++ gcc/gcc/testsuite/gcc.target/aarch64/ccmp_1.c
@@ -86,8 +86,8 @@ f13 (int a, int b)
 /* { dg-final { scan-assembler "cmp\t(.)+35" } } */
 
 /* { dg-final { scan-assembler-times "\tcmp\tw\[0-9\]+, 0" 4 } } */
-/* { dg-final { scan-assembler-times "fcmpe\t(.)+0\\.0" 2 } } */
-/* { dg-final { scan-assembler-times "fcmp\t(.)+0\\.0" 2 } } */
+/* { dg-final { scan-assembler-times "fcmpe\t(?:.)+0\\.0" 1 } } */
+/* { dg-final { scan-assembler-times "fcmp\t(?:.)+0\\.0" 1 } } */
 
 /* { dg-final { scan-assembler "adds\t" } } */
 /* { dg-final { scan-assembler-times "\tccmp\t" 11 } } */


Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Simon Wright


> On 22 Nov 2023, at 15:03, Iain Sandoe  wrote:
> 
> 
> 
>> On 22 Nov 2023, at 14:48, Iain Sandoe  wrote:
>> 
>> 
>> 
>>> On 22 Nov 2023, at 13:55, Arnaud Charlet  wrote:
>>> 
>> #if defined (__APPLE__)
>> -#include <unistd.h>
> 
> If removing unistd.h is intentional (i.e. you determined that it’s no 
> longer
> needed for Darwin), then we should make that a separate patch.
 
 I thought that I’d had to include unistd.h for the first patch in this 
 thread; clearly not!
 
 What I hope will be the final version:
>>> 
>>> OK here.
>> 
>> also OK here, thanks
> 
> I think this fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111909 ?
> if you agree then please add that to the commit.
> Iain

git format-patch does so much, I forgot this, sorry:

gcc/ada/ChangeLog:

2023-11-22  Simon Wright  <si...@pushface.org>

PR ada/111909

> 
>> Iain
>> 
>>> 
 ——— 8< .———
 
 In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
 assumption for __APPLE__ is that file names are case-insensitive
 unless __arm__ or __arm64__ are defined, in which case file names are
 declared case-sensitive.
 
 The associated comment is
 "By default, we suppose filesystems aren't case sensitive on
 Windows and Darwin (but they are on arm-darwin)."
 
 This means that on aarch64-apple-darwin, file names are treated as
 case-sensitive, which is not the default case.
 
 The true default position is that macOS file systems are
 case-insensitive, iOS file systems are case-sensitive.
 
 Apple provide a header file  which permits a
 compile-time check for the compiler target (e.g. OSX vs IOS); if
 TARGET_OS_IOS is defined as 1, this is a build for iOS.
 
 * gcc/ada/adaint.c
 (__gnat_get_file_names_case_sensitive): Split out the __APPLE__
 check and remove the checks for __arm__, __arm64__.
 For Apple, file names are by default case-insensitive unless
 TARGET_OS_IOS is set.
 
 Signed-off-by: Simon Wright 
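
For reference, the compile-time default described in the commit message
might look roughly like the sketch below.  It is not the code in
adaint.c: the function name default_file_names_case_sensitive is
hypothetical, and the use of Apple's TargetConditionals.h (which
defines TARGET_OS_IOS) is my assumption about which header is meant.

```cpp
#include <cassert>

#if defined(__APPLE__)
#include <TargetConditionals.h>  // assumed header; defines TARGET_OS_IOS
#endif

// Hypothetical helper: the default case sensitivity of file names,
// decided at compile time from the target.
constexpr bool default_file_names_case_sensitive()
{
#if defined(__APPLE__)
#  if defined(TARGET_OS_IOS) && TARGET_OS_IOS
  return true;   // iOS file systems are case-sensitive by default
#  else
  return false;  // macOS file systems are case-insensitive by default
#  endif
#elif defined(_WIN32)
  return false;  // Windows: case-insensitive by default
#else
  return true;   // other targets: assume case-sensitive
#endif
}
```

Note that this only captures the default; an individual volume can
still be formatted differently from the platform default.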
> 



Re: [PATCH v3 2/8] Unify implementations of print_hard_reg_set()

2023-11-22 Thread Vladimir Makarov



On 11/22/23 06:14, Maxim Kuvyrkov wrote:

We currently have 3 implementations of print_hard_reg_set()
(all with the same name!) in ira-color.cc, ira-conflicts.cc, and
sel-sched-dump.cc.  This patch generalizes implementation in
ira-color.cc, and uses it in all other places.  The declaration
is added to hard-reg-set.h.

The motivation for this patch is the [upcoming] need for
print_hard_reg_set() in sched-deps.cc.
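
As a rough picture of what a shared printer of this kind does, here is a
small sketch that formats a register set as ranges.  It is only an
illustration under assumed names: format_reg_set, kNumRegs and the use
of std::bitset are hypothetical and do not reflect the HARD_REG_SET
API or the exact output format of the real function.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>
#include <string>

constexpr std::size_t kNumRegs = 32;  // hypothetical register count

// Format the set as space-separated registers, collapsing consecutive
// runs into "first-last" ranges, e.g. {0,1,2,5} -> " 0-2 5".
std::string format_reg_set(const std::bitset<kNumRegs>& set)
{
  std::string out;
  std::size_t i = 0;
  while (i < kNumRegs)
    {
      if (!set.test(i))
        {
          ++i;
          continue;
        }
      const std::size_t start = i;
      // Extend the run while the next register is also in the set.
      while (i + 1 < kNumRegs && set.test(i + 1))
        ++i;
      out += ' ' + std::to_string(start);
      if (i > start)
        out += '-' + std::to_string(i);
      ++i;
    }
  return out;
}
```

Returning a string rather than writing to a FILE * is a choice made
here only to keep the sketch easy to check.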

gcc/ChangeLog:

* hard-reg-set.h (print_hard_reg_set): Declare.
* ira-color.cc (print_hard_reg_set): Generalize a bit.
(debug_hard_reg_set, print_hard_regs_subforest,)
(setup_allocno_available_regs_num): Update.
* ira-conflicts.cc (print_hard_reg_set): Remove.
(print_allocno_conflicts): Use global print_hard_reg_set().
* sel-sched-dump.cc (print_hard_reg_set): Remove.
(dump_hard_reg_set): Use global print_hard_reg_set().
* sel-sched-dump.h (dump_hard_reg_set): Mark as DEBUG_FUNCTION.


OK for me.  Thank you for consolidation of the print code, Maxim.




Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Iain Sandoe



> On 22 Nov 2023, at 14:48, Iain Sandoe  wrote:
> 
> 
> 
>> On 22 Nov 2023, at 13:55, Arnaud Charlet  wrote:
>> 
> #if defined (__APPLE__)
> -#include <unistd.h>
 
 If removing unistd.h is intentional (i.e. you determined that it’s no 
 longer
 needed for Darwin), then we should make that a separate patch.
>>> 
>>> I thought that I’d had to include unistd.h for the first patch in this 
>>> thread; clearly not!
>>> 
>>> What I hope will be the final version:
>> 
>> OK here.
> 
> also OK here, thanks

I think this fixes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111909 ?
if you agree then please add that to the commit.
Iain

> Iain
> 
>> 
>>> ——— 8< .———
>>> 
>>> In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
>>> assumption for __APPLE__ is that file names are case-insensitive
>>> unless __arm__ or __arm64__ are defined, in which case file names are
>>> declared case-sensitive.
>>> 
>>> The associated comment is
>>> "By default, we suppose filesystems aren't case sensitive on
>>> Windows and Darwin (but they are on arm-darwin)."
>>> 
>>> This means that on aarch64-apple-darwin, file names are treated as
>>> case-sensitive, which is not the default case.
>>> 
>>> The true default position is that macOS file systems are
>>> case-insensitive, iOS file systems are case-sensitive.
>>> 
>>> Apple provide a header file  which permits a
>>> compile-time check for the compiler target (e.g. OSX vs IOS); if
>>> TARGET_OS_IOS is defined as 1, this is a build for iOS.
>>> 
>>> * gcc/ada/adaint.c
>>> (__gnat_get_file_names_case_sensitive): Split out the __APPLE__
>>> check and remove the checks for __arm__, __arm64__.
>>> For Apple, file names are by default case-insensitive unless
>>> TARGET_OS_IOS is set.
>>> 
>>> Signed-off-by: Simon Wright 



Re: [PATCH RFC] c++: mangle function template constraints

2023-11-22 Thread Jonathan Wakely
On Wed, 22 Nov 2023 at 14:50, Jonathan Wakely  wrote:
>
> On Mon, 20 Nov 2023 at 02:56, Jason Merrill wrote:
> >
> > Tested x86_64-pc-linux-gnu.  Are the library bits OK?  Any comments before I
> > push this?
>
> The library parts are OK.
>
> The variable template is_trivially_copyable_v just uses
> __is_trivially_copyable so should be just as efficient, and the change
> to  is fine.
>
> The variable template is_trivially_destructible_v instantiates the
> is_trivially_destructible type trait, which instantiates
> __is_destructible_safe and __is_destructible_impl, which is probably
> why we used the built-in directly in . But that's an
> acceptable overhead to avoid using the built-in in a mangled context,
> and it would be good to optimize the variable template anyway, as a
> separate change.

For C++20 we could do:

#if __cpp_concepts
template <typename _Tp>
  inline constexpr bool is_trivially_destructible_v = false;
template <typename _Tp> requires requires (_Tp& __t) { __t.~_Tp(); }
  inline constexpr bool is_trivially_destructible_v<_Tp>
    = __has_trivial_destructor(_Tp);
#else
template <typename _Tp>
  inline constexpr bool is_trivially_destructible_v =
    is_trivially_destructible<_Tp>::value;
#endif

But that won't help C++17.
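
A self-contained variant of the same shape, for anyone who wants to
try it outside the library.  This is a sketch only: it lives in a
made-up namespace `sketch`, and it uses std::is_trivially_destructible
in both branches purely for portability, so it shows the
requires-expression gating but not the saving from calling the
compiler built-in directly.  The types Plain and WithUserDtor are
example types added for the demonstration.

```cpp
#include <cassert>
#include <type_traits>

namespace sketch
{
#if __cpp_concepts
  // Primary template: types that cannot be destroyed are not trivially
  // destructible.
  template <typename T>
    inline constexpr bool is_trivially_destructible_v = false;
  // Constrained partial specialization: only consulted when the
  // destructor call is well-formed.
  template <typename T> requires requires (T& t) { t.~T(); }
    inline constexpr bool is_trivially_destructible_v<T>
      = std::is_trivially_destructible<T>::value;
#else
  // Pre-concepts fallback: defer to the class template.
  template <typename T>
    inline constexpr bool is_trivially_destructible_v
      = std::is_trivially_destructible<T>::value;
#endif
}

// Example types for the demonstration.
struct Plain { int x; };
struct WithUserDtor { ~WithUserDtor() {} };
```

Both branches should agree for ordinary object types, for example
giving true for Plain and int and false for WithUserDtor.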



[pushed] testsuite: Update path to intl include.

2023-11-22 Thread Iain Sandoe
Tested on i686, x86_64 and aarch64 Darwin, aarch64 and x86_64 Linux,
pushed to master as obvious, thanks
Iain

--- 8< ---

When we are building libintl in-tree, we need to pass the path
to the generated libintl.h include to the plugin tests.  This
path has changed with the use of gettext directly.

gcc/testsuite/ChangeLog:

* lib/plugin-support.exp: Update the expected path to an
in-tree build of libintl.

Signed-off-by: Iain Sandoe 
---
 gcc/testsuite/lib/plugin-support.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/plugin-support.exp b/gcc/testsuite/lib/plugin-support.exp
index 378881b0f5d..8accf13fab6 100644
--- a/gcc/testsuite/lib/plugin-support.exp
+++ b/gcc/testsuite/lib/plugin-support.exp
@@ -85,7 +85,7 @@ proc plugin-test-execute { plugin_src plugin_tests } {
 set gcc_objdir "$objdir/../../.."
 set includes "-I. -I${srcdir} -I${gcc_srcdir}/gcc -I${gcc_objdir}/gcc \
   -I${gcc_srcdir}/include -I${gcc_srcdir}/libcpp/include \
-  $GMPINC -I${gcc_objdir}/intl"
+  $GMPINC -I${gcc_objdir}/gettext/intl"
 
 if { [ ishost *-*-darwin* ] } {
# -mdynamic-no-pic is incompatible with -fPIC.
-- 
2.39.2 (Apple Git-143)



Re: [PATCH v3] aarch64: SVE/NEON Bridging intrinsics

2023-11-22 Thread Richard Sandiford
Richard Ball  writes:
> ACLE has added intrinsics to bridge between SVE and Neon.
>
> The NEON_SVE Bridge adds intrinsics that allow conversions between NEON and
> SVE vectors.
>
> This patch adds support to GCC for the following 3 intrinsics:
> svset_neonq, svget_neonq and svdup_neonq
>
> gcc/ChangeLog:
>
>   * config.gcc: Adds new header to config.
>   * config/aarch64/aarch64-builtins.cc (enum aarch64_type_qualifiers):
>   Moved to header file.
>   (ENTRY): Likewise.
>   (enum aarch64_simd_type): Likewise.
>   (struct aarch64_simd_type_info): Make extern.
>   (GTY): Likewise.
>   * config/aarch64/aarch64-c.cc (aarch64_pragma_aarch64):
>   Defines pragma for arm_neon_sve_bridge.h.
>   * config/aarch64/aarch64-protos.h: New function.
>   * config/aarch64/aarch64-sve-builtins-base.h: New intrinsics.
>   * config/aarch64/aarch64-sve-builtins-base.cc
>   (class svget_neonq_impl): New intrinsic implementation.
>   (class svset_neonq_impl): Likewise.
>   (class svdup_neonq_impl): Likewise.
>   (NEON_SVE_BRIDGE_FUNCTION): New intrinsics.
>   * config/aarch64/aarch64-sve-builtins-functions.h
>   (NEON_SVE_BRIDGE_FUNCTION): Defines macro for NEON_SVE_BRIDGE
>   functions.
>   * config/aarch64/aarch64-sve-builtins-shapes.h: New shapes.
>   * config/aarch64/aarch64-sve-builtins-shapes.cc
>   (parse_element_type): Add NEON element types.
>   (parse_type): Likewise.
>   (struct get_neonq_def): Defines function shape for get_neonq.
>   (struct set_neonq_def): Defines function shape for set_neonq.
>   (struct dup_neonq_def): Defines function shape for dup_neonq.
>   * config/aarch64/aarch64-sve-builtins.cc (DEF_SVE_TYPE_SUFFIX):
>   (DEF_SVE_NEON_TYPE_SUFFIX): Defines 
> macro for NEON_SVE_BRIDGE type suffixes.
>   (DEF_NEON_SVE_FUNCTION): Defines 
> macro for NEON_SVE_BRIDGE functions.
>   (function_resolver::infer_neon128_vector_type): Infers type suffix
>   for overloaded functions.
>   (init_neon_sve_builtins): Initialise neon_sve_bridge_builtins for LTO.
>   (handle_arm_neon_sve_bridge_h): Handles #pragma arm_neon_sve_bridge.h.
>   * config/aarch64/aarch64-sve-builtins.def
>   (DEF_SVE_NEON_TYPE_SUFFIX): Macro for handling neon_sve type suffixes.
>   (bf16): Replace entry with neon-sve entry.
>   (f16): Likewise.
>   (f32): Likewise.
>   (f64): Likewise.
>   (s8): Likewise.
>   (s16): Likewise.
>   (s32): Likewise.
>   (s64): Likewise.
>   (u8): Likewise.
>   (u16): Likewise.
>   (u32): Likewise.
>   (u64): Likewise.
>   * config/aarch64/aarch64-sve-builtins.h
>   (GCC_AARCH64_SVE_BUILTINS_H): Include aarch64-builtins.h.
>   (ENTRY): Add aarch64_simd_type definition.
>   (enum aarch64_simd_type): Add neon information to type_suffix_info.
>   (struct type_suffix_info): New function.
>   * config/aarch64/aarch64-sve.md
>   (@aarch64_sve_get_neonq_): New intrinsic insn for big endian.
>   (@aarch64_sve_set_neonq_): Likewise.
>   (@aarch64_sve_dup_neonq_): Likewise.
>   * config/aarch64/aarch64.cc 
>   (aarch64_init_builtins): Add call to init_neon_sve_builtins.
> (aarch64_output_sve_set_neonq): asm output for Big Endian set_neonq.
>   * config/aarch64/iterators.md: Add UNSPEC_SET_NEONQ.
>   * config/aarch64/aarch64-builtins.h: New file.
>   * config/aarch64/aarch64-neon-sve-bridge-builtins.def: New file.
>   * config/aarch64/arm_neon_sve_bridge.h: New file.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/aarch64/sve/acle/asm/test_sve_acle.h: Add include 
>   arm_neon_sve_bridge header file
>   * gcc.dg/torture/neon-sve-bridge.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_bf16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_f16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_f32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_f64.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_s16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_s32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_s64.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_s8.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_u16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_u32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_u64.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/dup_neonq_u8.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_bf16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_f16.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_f32.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_f64.c: New test.
>   * gcc.target/aarch64/sve/acle/asm/get_neonq_s16.c: New test.
>   * 

Re: [PATCH RFC] c++: mangle function template constraints

2023-11-22 Thread Jonathan Wakely
On Mon, 20 Nov 2023 at 02:56, Jason Merrill wrote:
>
> Tested x86_64-pc-linux-gnu.  Are the library bits OK?  Any comments before I
> push this?

The library parts are OK.

The variable template is_trivially_copyable_v just uses
__is_trivially_copyable so should be just as efficient, and the change
to  is fine.

The variable template is_trivially_destructible_v instantiates the
is_trivially_destructible type trait, which instantiates
__is_destructible_safe and __is_destructible_impl, which is probably
why we used the built-in directly in . But that's an
acceptable overhead to avoid using the built-in in a mangled context,
and it would be good to optimize the variable template anyway, as a
separate change.



Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Iain Sandoe



> On 22 Nov 2023, at 13:55, Arnaud Charlet  wrote:
> 
>>>> #if defined (__APPLE__)
>>>> -#include 
>>> 
>>> If removing unistd.h is intentional (i.e. you determined that it’s no longer
>>> needed for Darwin), then we should make that a separate patch.
>> 
>> I thought that I’d had to include unistd.h for the first patch in this 
>> thread; clearly not!
>> 
>> What I hope will be the final version:
> 
> OK here.

also OK here, thanks
Iain

> 
>> ——— 8< .———
>> 
>> In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
>> assumption for __APPLE__ is that file names are case-insensitive
>> unless __arm__ or __arm64__ are defined, in which case file names are
>> declared case-sensitive.
>> 
>> The associated comment is
>>  "By default, we suppose filesystems aren't case sensitive on
>>  Windows and Darwin (but they are on arm-darwin)."
>> 
>> This means that on aarch64-apple-darwin, file names are treated as
>> case-sensitive, which is not the default case.
>> 
>> The true default position is that macOS file systems are
>> case-insensitive, iOS file systems are case-sensitive.
>> 
>> Apple provide a header file  which permits a
>> compile-time check for the compiler target (e.g. OSX vs IOS); if
>> TARGET_OS_IOS is defined as 1, this is a build for iOS.
>> 
>>  * gcc/ada/adaint.c
>>  (__gnat_get_file_names_case_sensitive): Split out the __APPLE__
>>  check and remove the checks for __arm__, __arm64__.
>>  For Apple, file names are by default case-insensitive unless
>>  TARGET_OS_IOS is set.
>> 
>> Signed-off-by: Simon Wright 
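
A minimal sketch of the compile-time check described in the quoted commit message, assuming the header meant there is Apple's <TargetConditionals.h>. The function name sketch_file_names_case_sensitive_default is hypothetical and this is not the code in adaint.c; on non-Apple, non-Windows hosts the sketch simply reports case-sensitive.

```c
/* Hypothetical sketch, not the adaint.c implementation.  */
#if defined (__APPLE__)
#include <TargetConditionals.h>
#endif

/* Return 1 if file names are assumed case-sensitive by default,
   0 otherwise.  */
int
sketch_file_names_case_sensitive_default (void)
{
#if defined (__APPLE__)
# if defined (TARGET_OS_IOS) && TARGET_OS_IOS
  return 1;   /* iOS: case-sensitive by default.  */
# else
  return 0;   /* macOS: case-insensitive by default.  */
# endif
#elif defined (_WIN32)
  return 0;   /* Windows: case-insensitive by default.  */
#else
  return 1;   /* Other hosts: assume case-sensitive.  */
#endif
}
```

This only captures the default; it says nothing about how a particular volume was formatted.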



Re: [PATCH 01/11] rtl-ssa: Support for inserting new insns

2023-11-22 Thread Alex Coplan
On 21/11/2023 11:51, Richard Sandiford wrote:
> Alex Coplan  writes:
> > N.B. this is just a rebased (but otherwise unchanged) version of the
> > same patch already posted here:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633348.html
> >
> > this is the only unreviewed dependency from the previous series, so it
> > seemed easier just to re-post it (not least to appease the pre-commit
> > CI).
> >
> > -- >8 --
> >
> > The upcoming aarch64 load pair pass needs to form store pairs, and can
> > re-order stores over loads when alias analysis determines this is safe.
> > In the case that both mem defs have uses in the RTL-SSA IR, and both
> > stores require re-ordering over their uses, we represent that as
> > (tentative) deletion of the original store insns and creation of a new
> > insn, to prevent requiring repeated re-parenting of uses during the
> > pass.  We then update all mem uses that require re-parenting in one go
> > at the end of the pass.
> >
> > To support this, RTL-SSA needs to handle inserting new insns (rather
> > than just changing existing ones), so this patch adds support for that.
> >
> > New insns (and new accesses) are temporaries, allocated above a temporary
> > obstack_watermark, such that the user can easily back out of a change
> > without awkward bookkeeping.
> >
> > Bootstrapped/regtested as a series on aarch64-linux-gnu, OK for trunk?
> >
> > gcc/ChangeLog:
> >
> > * rtl-ssa/accesses.cc (function_info::create_set): New.
> > * rtl-ssa/accesses.h (access_info::is_temporary): New.
> > * rtl-ssa/changes.cc (move_insn): Handle new (temporary) insns.
> > (function_info::finalize_new_accesses): Handle new/temporary
> > user-created accesses.
> > (function_info::apply_changes_to_insn): Ensure m_is_temp flag
> > on new insns gets cleared.
> > (function_info::change_insns): Handle new/temporary insns.
> > (function_info::create_insn): New.
> > * rtl-ssa/changes.h (class insn_change): Make function_info a
> > friend class.
> > * rtl-ssa/functions.h (function_info): Declare new entry points:
> > create_set, create_insn.  Declare new change_alloc helper.
> > * rtl-ssa/insns.cc (insn_info::print_full): Identify temporary insns in
> > dump.
> > * rtl-ssa/insns.h (insn_info): Add new m_is_temp flag and accompanying
> > is_temporary accessor.
> > * rtl-ssa/internals.inl (insn_info::insn_info): Initialize m_is_temp to
> > false.
> > * rtl-ssa/member-fns.inl (function_info::change_alloc): New.
> > * rtl-ssa/movement.h (restrict_movement_for_defs_ignoring): Add
> > handling for temporary defs.
> 
> Looks good, but there were a couple of things I didn't understand:

Thanks for the review.
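
The watermark idea from the quoted description (allocate temporaries above a remembered level, then back out by returning to that level) can be sketched as follows. This is only an illustration under stated assumptions: struct sketch_arena and all sketch_* functions are hypothetical, alignment is ignored, and GCC's real obstack and obstack_watermark interfaces differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical fixed-size bump arena; GCC's real obstack API differs.  */
struct sketch_arena
{
  unsigned char buf[1024];
  size_t used;
};

/* Remember the current allocation level.  */
size_t
sketch_watermark (const struct sketch_arena *a)
{
  return a->used;
}

/* Allocate SIZE bytes above the current level, or return NULL if full.  */
void *
sketch_alloc (struct sketch_arena *a, size_t size)
{
  if (size > sizeof (a->buf) - a->used)
    return NULL;
  void *p = a->buf + a->used;
  a->used += size;
  return p;
}

/* Discard everything allocated since MARK was taken.  */
void
sketch_rollback (struct sketch_arena *a, size_t mark)
{
  assert (mark <= a->used);
  a->used = mark;
}

/* Demo: allocate two temporaries, back out, and report the level.  */
size_t
sketch_demo (void)
{
  struct sketch_arena a = { { 0 }, 0 };
  sketch_alloc (&a, 16);                 /* longer-lived data */
  size_t mark = sketch_watermark (&a);
  sketch_alloc (&a, 64);                 /* stands in for a temporary insn */
  sketch_alloc (&a, 32);                 /* stands in for a temporary access */
  sketch_rollback (&a, mark);            /* back out of the change */
  return a.used;
}
```

Backing out costs one assignment regardless of how many temporaries were created, which is the "no awkward bookkeeping" property the description refers to.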

> 
> > ---
> >  gcc/rtl-ssa/accesses.cc| 10 ++
> >  gcc/rtl-ssa/accesses.h |  4 +++
> >  gcc/rtl-ssa/changes.cc | 74 +++---
> >  gcc/rtl-ssa/changes.h  |  2 ++
> >  gcc/rtl-ssa/functions.h| 14 
> >  gcc/rtl-ssa/insns.cc   |  5 +++
> >  gcc/rtl-ssa/insns.h|  7 +++-
> >  gcc/rtl-ssa/internals.inl  |  1 +
> >  gcc/rtl-ssa/member-fns.inl | 12 +++
> >  gcc/rtl-ssa/movement.h |  8 -
> >  10 files changed, 123 insertions(+), 14 deletions(-)
> >
> > diff --git a/gcc/rtl-ssa/accesses.cc b/gcc/rtl-ssa/accesses.cc
> > index 510545a8bad..76d70fd8bd3 100644
> > --- a/gcc/rtl-ssa/accesses.cc
> > +++ b/gcc/rtl-ssa/accesses.cc
> > @@ -1456,6 +1456,16 @@ function_info::make_uses_available 
> > (obstack_watermark ,
> >return use_array (new_uses, num_uses);
> >  }
> >  
> > +set_info *
> > +function_info::create_set (obstack_watermark ,
> > +  insn_info *insn,
> > +  resource_info resource)
> > +{
> > +  auto set = change_alloc (watermark, insn, resource);
> > +  set->m_is_temp = true;
> > +  return set;
> > +}
> > +
> >  // Return true if ACCESS1 can represent ACCESS2 and if ACCESS2 can
> >  // represent ACCESS1.
> >  static bool
> > diff --git a/gcc/rtl-ssa/accesses.h b/gcc/rtl-ssa/accesses.h
> > index fce31d46717..7e7a90ece97 100644
> > --- a/gcc/rtl-ssa/accesses.h
> > +++ b/gcc/rtl-ssa/accesses.h
> > @@ -204,6 +204,10 @@ public:
> >// in the main instruction pattern.
> >bool only_occurs_in_notes () const { return m_only_occurs_in_notes; }
> >  
> > +  // Return true if this is a temporary access, e.g. one created for
> > +  // an insn that is about to be inserted.
> > +  bool is_temporary () const { return m_is_temp; }
> > +
> >  protected:
> >access_info (resource_info, access_kind);
> >  
> > diff --git a/gcc/rtl-ssa/changes.cc b/gcc/rtl-ssa/changes.cc
> > index aab532b9f26..da2a61d701a 100644
> > --- a/gcc/rtl-ssa/changes.cc
> > +++ b/gcc/rtl-ssa/changes.cc
> > @@ -394,14 +394,20 @@ move_insn (insn_change , insn_info *after)
> >// At the moment we don't support moving 

Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Paul Koning



> On Nov 22, 2023, at 8:54 AM, Simon Wright  wrote:
> 
> On 21 Nov 2023, at 23:13, Iain Sandoe  wrote:
> 
>>> #if defined (__APPLE__)
>>> -#include 
>> 
>> If removing unistd.h is intentional (i.e. you determined that it’s no longer
>> needed for Darwin), then we should make that a separate patch.
> 
> I thought that I’d had to include unistd.h for the first patch in this 
> thread; clearly not!
> 
> What I hope will be the final version:
> 
> ——— 8< .———
> 
> In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
> assumption for __APPLE__ is that file names are case-insensitive
> unless __arm__ or __arm64__ are defined, in which case file names are
> declared case-sensitive.
> 
> The associated comment is
>  "By default, we suppose filesystems aren't case sensitive on
>  Windows and Darwin (but they are on arm-darwin)."
> 
> This means that on aarch64-apple-darwin, file names are treated as
> case-sensitive, which is not the default case.
> 
> The true default position is that macOS file systems are
> case-insensitive, iOS file systems are case-sensitive.

Sort of.  The most common choices for Mac OS file system type are indeed case 
insensitive, but it also allows case sensitive file systems. 

paul
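
Since the volume format matters, a runtime query is one way to account for it. The following is a minimal sketch, assuming Darwin's pathconf selector _PC_CASE_SENSITIVE is available; the function name is hypothetical and this is not necessarily what the patch in this thread does. Where the selector is missing, the sketch returns -1.

```c
#include <unistd.h>

/* Return 1 if the file system holding PATH is case-sensitive, 0 if it is
   case-insensitive, and -1 if this cannot be determined.  */
int
sketch_path_is_case_sensitive (const char *path)
{
#if defined (_PC_CASE_SENSITIVE)
  long r = pathconf (path, _PC_CASE_SENSITIVE);
  if (r < 0)
    return -1;
  return r != 0;
#else
  (void) path;   /* No such selector on this host.  */
  return -1;
#endif
}
```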




[PATCH] tree-optimization/112344 - wrong final value replacement

2023-11-22 Thread Richard Biener
When performing final value replacement, chrec_apply, which is used to
compute the overall effect of niters on a CHREC, doesn't consider that
the overall increment of { -2147483648, +, 2 } doesn't fit in
a signed integer when the loop iterates until the IV reaches the
value 20.  The following fixes this mistake, carrying out the multiply
and add in an unsigned type instead, avoiding undefined overflow
and thus later miscompilation by path range analysis.
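
The arithmetic behind the fix can be sketched in plain C. This is an illustration only, assuming a 32-bit IV; sketch_final_value is a hypothetical name and not the tree-level code in chrec_apply.

```c
#include <stdint.h>

/* Value of the affine evolution {a, +, b} after x iterations, computed
   as a + b*x in an unsigned type so that intermediate results wrap
   instead of invoking undefined signed overflow.  */
int32_t
sketch_final_value (int32_t a, int32_t b, uint32_t x)
{
  uint32_t res = (uint32_t) a + (uint32_t) b * x;
  /* Converting back is implementation-defined for out-of-range values;
     GCC reduces modulo 2^32.  */
  return (int32_t) res;
}
```

For the IV in the testcase, sketch_final_value (INT32_MIN, 2, 1073741834u) yields 20, whereas the signed product 2 * 1073741834 alone would already overflow a 32-bit int.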

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/112344
* tree-chrec.cc (chrec_apply): Perform the overall increment
calculation and increment in an unsigned type.

* gcc.dg/torture/pr112344.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr112344.c | 20 
 gcc/tree-chrec.cc   | 32 -
 2 files changed, 41 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr112344.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr112344.c b/gcc/testsuite/gcc.dg/torture/pr112344.c
new file mode 100644
index 000..c52d2c8304b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr112344.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-require-effective-target int32plus } */
+
+int
+main ()
+{
+  long long b = 2036854775807LL;
+  signed char c = 3;
+  short d = 0;
+  int e = -2147483647 - 1, f;
+  for (f = 0; f < 7; f++)
+while (e < 20)
+  {
+   e += 2;
+   d = c -= b;
+  }
+  if (d != 13)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/tree-chrec.cc b/gcc/tree-chrec.cc
index 2f67581591a..f4ba130ba20 100644
--- a/gcc/tree-chrec.cc
+++ b/gcc/tree-chrec.cc
@@ -613,32 +613,42 @@ chrec_apply (unsigned var,
   if (evolution_function_is_affine_p (chrec))
{
  tree chrecr = CHREC_RIGHT (chrec);
+ tree chrecl = CHREC_LEFT (chrec);
  if (CHREC_VARIABLE (chrec) != var)
-   res = build_polynomial_chrec
- (CHREC_VARIABLE (chrec),
-  chrec_apply (var, CHREC_LEFT (chrec), x),
-  chrec_apply (var, chrecr, x));
+   res = build_polynomial_chrec (CHREC_VARIABLE (chrec),
+ chrec_apply (var, chrecl, x),
+ chrec_apply (var, chrecr, x));
 
- /* "{a, +, b} (x)"  ->  "a + b*x".  */
- else if (operand_equal_p (CHREC_LEFT (chrec), chrecr)
+ /* "{a, +, a}" (x-1) -> "a*x".  */
+ else if (operand_equal_p (chrecl, chrecr)
   && TREE_CODE (x) == PLUS_EXPR
   && integer_all_onesp (TREE_OPERAND (x, 1))
   && !POINTER_TYPE_P (type)
   && TYPE_PRECISION (TREE_TYPE (x))
  >= TYPE_PRECISION (type))
{
- /* We know the number of iterations can't be negative.
-So {a, +, a} (x-1) -> "a*x".  */
+ /* We know the number of iterations can't be negative.  */
  res = build_int_cst (TREE_TYPE (x), 1);
  res = chrec_fold_plus (TREE_TYPE (x), x, res);
  res = chrec_convert_rhs (type, res, NULL);
  res = chrec_fold_multiply (type, chrecr, res);
}
+ /* "{a, +, b} (x)"  ->  "a + b*x".  */
  else
{
- res = chrec_convert_rhs (TREE_TYPE (chrecr), x, NULL);
- res = chrec_fold_multiply (TREE_TYPE (chrecr), chrecr, res);
- res = chrec_fold_plus (type, CHREC_LEFT (chrec), res);
+ /* The overall increment might not fit in a signed type so
+use an unsigned computation to get at the final value
+and avoid undefined signed overflow.  */
+ tree utype = TREE_TYPE (chrecr);
+ if (INTEGRAL_TYPE_P (utype) && !TYPE_OVERFLOW_WRAPS (utype))
+   utype = unsigned_type_for (TREE_TYPE (chrecr));
+ res = chrec_convert_rhs (utype, x, NULL);
+ res = chrec_fold_multiply (utype,
+chrec_convert (utype, chrecr, NULL),
+res);
+ res = chrec_fold_plus (utype,
+chrec_convert (utype, chrecl, NULL), res);
+ res = chrec_convert (type, res, NULL);
}
}
   else if (TREE_CODE (x) == INTEGER_CST
-- 
2.35.3


[PATCH] libgcc: mark __hardcfr_check_fail as always_inline

2023-11-22 Thread Jose E. Marchesi
The function __hardcfr_check_fail in hardcfr.c is internal and static
inline.  It receives many arguments, which require more than five
registers to be passed on bpf-unknown-none targets.  BPF is limited to
that number of registers for passing arguments, and therefore libgcc fails
to build for that target.  This patch marks the function with the
always_inline attribute, fixing the bpf build.
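
A minimal sketch of the technique on a made-up function: sketch_sum6 and sketch_caller are hypothetical names and are not taken from hardcfr.c. The attribute asks GCC to inline the function even when not optimizing, so no out-of-line call with six arguments has to be emitted.

```c
/* Hypothetical helper with more parameters than fit in five argument
   registers; always_inline requests inlining at every call site.  */
static inline __attribute__ ((__always_inline__)) long
sketch_sum6 (long a, long b, long c, long d, long e, long f)
{
  return a + b + c + d + e + f;
}

/* The only caller; after inlining, no six-argument call remains.  */
long
sketch_caller (void)
{
  return sketch_sum6 (1, 2, 3, 4, 5, 6);
}
```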

Tested in bpf-unknown-none target and x86_64-linux-gnu host.

libgcc/ChangeLog:

* hardcfr.c (__hardcfr_check_fail): Mark as always_inline.
---
 libgcc/hardcfr.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libgcc/hardcfr.c b/libgcc/hardcfr.c
index 25ff06742cb..48a87a5a87a 100644
--- a/libgcc/hardcfr.c
+++ b/libgcc/hardcfr.c
@@ -206,7 +206,8 @@ __hardcfr_debug_cfg (size_t const blocks,
enabled, it also forces __hardcfr_debug_cfg (above) to be compiled into an
out-of-line function, that could be called from a debugger.
*/
-static inline void
+
+static inline  __attribute__((__always_inline__)) void
 __hardcfr_check_fail (size_t const blocks ATTRIBUTE_UNUSED,
  vword const *const visited ATTRIBUTE_UNUSED,
  vword const *const cfg ATTRIBUTE_UNUSED,
-- 
2.30.2



[committed] amdgcn: Fix vector TImode reload loop

2023-11-22 Thread Andrew Stubbs
This patch fixes a reload bug that's hard to reproduce reliably (so far 
I've only observed it on the OG13 branch, with testcase 
gcc.c-torture/compile/pr70355.c), but causes an infinite loop in reload 
when it fails.


For some reason it wants to save a value from AVGPRs to memory.  This
can't happen directly on CDNA1, so secondary reload moves the value to
VGPRs, but instead of proceeding to memory, LRA just goes and moves the
value right back into AVGPRs.  Disparaging this move (when a reload is
needed) fixes the issue, but I don't know if this is the intended or
optimal solution in these cases.


Andrew

amdgcn: Fix vector TImode reload loop

I've only observed the problem on the devel/omp/gcc-13 branch, but this
could theoretically affect mainline also.  The mov insns for the other modes
already have '$', so this completes the set.

gcc/ChangeLog:

* config/gcn/gcn-valu.md (*mov_4reg): Disparage AVGPR use when a
reload is required.

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 23f2bbe454b..a928decd408 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -566,10 +566,10 @@ (define_insn "*mov_4reg"
(match_operand:V_4REG 1 "general_operand"))]
   ""
   {@ [cons: =0, 1; attrs: type, length, gcn_version]
-  [v,vDB;vmult,16,*]   v_mov_b32\t%L0, %L1\;  
v_mov_b32\t%H0, %H1\;  v_mov_b32\t%J0, %J1\;  v_mov_b32\t%K0, 
%K1
-  [v,a  ;vmult,32,*]  v_accvgpr_read_b32\t%L0, %L1\; 
v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; 
v_accvgpr_read_b32\t%K0, %K1
-  [a,v  ;vmult,32,*] v_accvgpr_write_b32\t%L0, 
%L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, 
%J1\;v_accvgpr_write_b32\t%K0, %K1
-  [a,a  ;vmult,32,cdna2]   v_accvgpr_mov_b32\t%L0, %L1\;  
v_accvgpr_mov_b32\t%H0, %H1\;  v_accvgpr_mov_b32\t%J0, %J1\;  
v_accvgpr_mov_b32\t%K0, %K1
+  [v ,vDB;vmult,16,*]   v_mov_b32\t%L0, %L1\;  
v_mov_b32\t%H0, %H1\;  v_mov_b32\t%J0, %J1\;  v_mov_b32\t%K0, 
%K1
+  [v ,a  ;vmult,32,*]  v_accvgpr_read_b32\t%L0, %L1\; 
v_accvgpr_read_b32\t%H0, %H1\; v_accvgpr_read_b32\t%J0, %J1\; 
v_accvgpr_read_b32\t%K0, %K1
+  [$a,v  ;vmult,32,*] v_accvgpr_write_b32\t%L0, 
%L1\;v_accvgpr_write_b32\t%H0, %H1\;v_accvgpr_write_b32\t%J0, 
%J1\;v_accvgpr_write_b32\t%K0, %K1
+  [a ,a  ;vmult,32,cdna2]   v_accvgpr_mov_b32\t%L0, %L1\;  
v_accvgpr_mov_b32\t%H0, %H1\;  v_accvgpr_mov_b32\t%J0, %J1\;  
v_accvgpr_mov_b32\t%K0, %K1
   })
 
 (define_insn "mov_exec"


Re: [PATCH v2 1/6] libgomp: basic pinned memory on Linux

2023-11-22 Thread Tobias Burnus

Hi Andrew,

Side remark:


-#define MEMSPACE_CALLOC(MEMSPACE, SIZE) \
-  calloc (1, (((void)(MEMSPACE), (SIZE))))


This fits a bit better with the previous patch, but I wonder whether that should
use (MEMSPACE, NMEMB, SIZE) instead, to match the actual calloc arguments.

I think the main/only difference between passing SIZE alone and passing
NMEMB and SIZE is that
"If the multiplication of nmemb and size would result in integer overflow,
then calloc() returns an error." (Linux manpage)

However, this wording seems to be neither in POSIX nor in the OpenMP
spec. There was some alignment discussion at https://gcc.gnu.org/PR112364
regarding whether C (since C23) has a different alignment for
calloc(1, n) vs. calloc(n,1) but Joseph believes it doesn't.

Thus, this is more bikeshedding than making a real difference.
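
To make the NMEMB/SIZE point concrete, here is a minimal sketch of a wrapper that takes both arguments and rejects an overflowing product itself rather than relying on the C library. sketch_checked_calloc is a hypothetical name, not a libgomp function.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical wrapper taking NMEMB and SIZE separately, rejecting a
   product that would overflow size_t before calling calloc.  */
void *
sketch_checked_calloc (size_t nmemb, size_t size)
{
  if (nmemb != 0 && size > SIZE_MAX / nmemb)
    return NULL;
  return calloc (nmemb, size);
}
```

A macro that only receives the already-multiplied SIZE cannot perform this check, because the overflow has happened before the macro sees the value.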

* * *

[somehow my email program caused some odd formatting issues when I
hit some odd key combo. I am not sure whether I fully fixed it or not;
sorry if some parts look odd.]


On 23.08.23 16:14, Andrew Stubbs wrote:

Implement the OpenMP pinned memory trait on Linux hosts using the
mlock syscall.  Pinned allocations are performed using mmap, not
malloc, to ensure that they can be unpinned safely when freed.

This implementation will work OK for page-scale allocations, and
finer-grained allocations will be implemented in a future patch.


Can you also update libgomp.texi, i.e. 
https://gcc.gnu.org/onlinedocs/libgomp/Memory-allocation.html
to document that and how pinning works on Linux?

I think I proposed in the low-latency patch to add a @ref to
https://gcc.gnu.org/onlinedocs/libgomp/Offload-Target-Specifics.html and
add there the nvptx and gcn specific memory-allocation handling.

* * *

I think the following is not ideal in the pinning-is-not-supported case:


@@ -434,10 +435,6 @@ omp_init_allocator (omp_memspace_handle_t memspace, int ntraits,
     }
 #endif

-  /* No support for this so far.  */
-  if (data.pinned)
-    return omp_null_allocator;
-
   ret = gomp_malloc (sizeof (struct omp_allocator_data));
   *ret = data;
 #ifndef HAVE_SYNC_BUILTINS

which continues as:
  gomp_mutex_init (>lock);
#endif
  return (omp_allocator_handle_t) ret;
}


Therefore:

This code will always return a handle, even if pinning is not supported.
I had expected that the following happens:

"Otherwise if an allocator based on the requirements cannot be created
then the special omp_null_allocator handle is returned."

Using this allocator on a system where libgomp does not support pinning
will always fail with the fallback, which could be either of:

default_mem_fb (= omp_atv_default), null_fb, abort_fb, allocator_fb (+ fb_data).

Thus, the current code kind of works if the fallback is (explicitly or 
implicitly)
omp_atv_default or (explicitly) default_mem_fb – but otherwise, allocations will
always fail, most prominently with "abort_fb".

* * *

   The following definitions (ab)use comma operators to avoid unused
   variable errors.  */
 #ifndef MEMSPACE_ALLOC
-#define MEMSPACE_ALLOC(MEMSPACE, SIZE) \
-  malloc (((void)(MEMSPACE), (SIZE)))
+#define MEMSPACE_ALLOC(MEMSPACE, SIZE, PIN) \
+  (PIN ? NULL : malloc (((void)(MEMSPACE), (SIZE))))


I wonder whether the comment should note something like: All of the
following will return NULL or are a no-op when pinning is enabled
(unless overridden).

And the following looks odd:


+#define MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
+  (PIN ? NULL :  free (((void)(MEMSPACE), (void)(SIZE), (ADDR))))
 #endif


Contrary to the other functions that return a value (pointer), 'free' "returns"
'void'.

And, indeed, the compiler might complain:

test.c:5:14: warning: ISO C forbids conditional expr with only one void side [-Wpedantic]
    5 |   pin ? NULL : free(p);
      |              ^

While (void)NULL works, I think the simplest is to just use an
'if (!pin) free(...)'.
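
A minimal sketch of the statement form, freeing only when the allocation is not pinned (matching the quoted macro, where the pinned case is a no-op). SKETCH_MEMSPACE_FREE and sketch_free_demo are hypothetical names; the actual libgomp macro may end up looking different.

```c
#include <stdlib.h>

/* Statement-style variant: no conditional expression, so no operand of
   type void.  The unused arguments are still evaluated as (void).  */
#define SKETCH_MEMSPACE_FREE(MEMSPACE, ADDR, SIZE, PIN) \
  do \
    { \
      (void) (MEMSPACE); \
      (void) (SIZE); \
      if (!(PIN)) \
        free (ADDR); \
    } \
  while (0)

/* Demo: allocate and release an unpinned block; returns 0 on success.  */
int
sketch_free_demo (void)
{
  void *p = malloc (16);
  if (p == NULL)
    return -1;
  SKETCH_MEMSPACE_FREE (0, p, 16, 0);
  return 0;
}
```

The do/while wrapper keeps the macro usable as a single statement, at the cost of no longer being usable inside an expression.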

* * *

+linux_memspace_alloc (omp_memspace_handle_t memspace, size_t size, int pin)
+{
+  (void)memspace;
+
+  if (pin)
+    {
+      void *addr = mmap (NULL, size, PROT_READ | PROT_WRITE,
+                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);



Maybe add a comment noting that mmap returns zero-initialized memory, as
required for the calloc call.

(The Linux man page states for MAP_ANONYMOUS: "The mapping ...; its contents are
initialized to zero.", while POSIX has: "The system shall always zero-fill any
partial page at the end of an object.", which should be all in case of
addr = NULL.)


+      if (mlock (addr, size))
+        {
+          gomp_debug (0, "libgomp: failed to pin memory (ulimit too low?)\n");


I wonder whether the size should be included in the output; when debugging, it
might help to know whether "just" 1 kiB or 20 GB were to be pinned.

For the comment, I wonder whether it should mention RLIMIT_MEMLOCK or 'lockable
memory' instead of, or in addition to, ulimit to be clearer.
(csh uses 'limit' instead of ulimit, but POSIX has both as function and as 
shell (sh, bash) ulimit,
i.e. using 'ulimit' is fine. Albeit 'ulimit()' has been deprecated in favour of 

Re: Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread Christoph Müllner
On Wed, Nov 22, 2023 at 2:52 PM 钟居哲  wrote:
>
> I am totally OK with approving theadvector for GCC-14 before stage 3 closes,
> as long as it doesn't touch the current RVV code too much and binutils
> supports theadvector.
>
> I have provided a draft approach:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
> which, as it turns out, doesn't need to change any code in vector.md.
> I strongly suggest following this draft. I can actively review theadvector
> during stage 3,
> and hopefully help you land theadvector in GCC-14.

I now see two approaches:
1) Let GCC emit RVV instructions for XTheadVector for instructions
that are in both extensions
2) Use the ASM_OUTPUT_OPCODE hook to output "th." for these instructions

No doubt, the ASM_OUTPUT_OPCODE hook approach is better than
our format-string approach, but wouldn't 1) be an even better solution?
It would also mean that not a single test case is required for these
overlapping instructions (only a few tests that ensure that we don't emit
RVV instructions that are not available in XTheadVector).
Besides that, letting GCC emit RVV instructions for XTheadVector is a
very clever idea, because it fully utilizes the fact that both extensions
overlap to a huge degree.

The ASM_OUTPUT_OPCODE approach could lead to an issue if we enable XTheadVector
together with any other vector extension, say Zvfoo. In this case the Zvfoo
instructions will all be prefixed with "th." as well. I know that it is not
likely to run into this problem (such a machine does not exist in real
hardware), but it is easy to trigger this issue, and approach 1) would not
have it.
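
The concern can be illustrated with a toy model of a prefixing hook. This is only a sketch under stated assumptions: sketch_emit_opcode is hypothetical, the rule "prefix every mnemonic starting with 'v'" is an assumed simplification, "vfoo.vv" is a made-up Zvfoo instruction, and none of this is GCC's ASM_OUTPUT_OPCODE implementation.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical model of a prefixing hook: when XTheadVector is enabled,
   every mnemonic starting with 'v' gets a "th." prefix.  */
void
sketch_emit_opcode (char *out, size_t n, const char *mnemonic,
                    int xtheadvector)
{
  if (xtheadvector && mnemonic[0] == 'v')
    snprintf (out, n, "th.%s", mnemonic);
  else
    snprintf (out, n, "%s", mnemonic);
}

/* Return 1 if emitting MNEMONIC with XTheadVector enabled yields EXPECTED.  */
int
sketch_emits (const char *mnemonic, const char *expected)
{
  char buf[64];
  sketch_emit_opcode (buf, sizeof buf, mnemonic, 1);
  return strcmp (buf, expected) == 0;
}
```

In this model "vmulh.vv" becomes "th.vmulh.vv" as intended, but the made-up "vfoo.vv" is prefixed too, because the hook sees only the mnemonic and not which extension the instruction belongs to.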

Thanks,
Christoph


>
> Thanks.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: Christoph Müllner
> Date: 2023-11-22 18:07
> To: juzhe.zh...@rivai.ai
> CC: gcc-patches; kito.cheng; Kito.cheng; cooper.joshua; Robin Dapp; 
> jeffreyalaw; Philipp Tomsich; Cooper Qu; Jin Ma; Nelson Chu
> Subject: Re: RISC-V: Support XTheadVector extensions
> Hi Juzhe,
>
> Sorry for the late reply, but I was not on CC, so I missed this email.
>
> On Fri, Nov 17, 2023 at 2:41 PM juzhe.zh...@rivai.ai
>  wrote:
> >
> > Ok. I just read the theadvector extension.
> >
> > https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
> >
> > Theadvector is not a custom extension, just a uarch that disables some of
> > the RVV 1.0 extension.
> > Theadvector can be considered a subextension of the 'V' extension that
> > disables some of the instructions and adds some new thead vector target
> > load/store instructions (this is another story).
> >
> > So, to disable the instructions that theadvector doesn't support,
> > you don't need to touch so much code.
> >
> > Here is a much simpler approach to do (I think it's definitely working):
> > 1. Don't change any codes in vector.md and keep GCC generates ASM with 
> > "th." prefix.
> > 2. Add !TARGET_THEADVECTOR into vector-iterator.md to disable the mode you 
> > don't want.
> > For example , theadvector doesn't support fractional vector.
> >
> > Then it's pretty simple:
> >
> > RVVMF2SI "TARGET_VECTOR && !TARGET_THEADVECTOR".
> >
> > 3. Remove all the tests you add in this patch.
> > 4. You can add theadvector specific load/store for example, th.vlb 
> > instructions they are allowed.
> > 5. Modify binutils, and make th.vmulh.vv as the pseudo instruction of 
> > vmulh.vv
> > 6. So with compile option "-S", you will still see ASM as  "vmulh.vv". but 
> > with objdump, you will see th.vmulh.vv.
>
> Yes, all these points sound reasonable, to minimize the patchset size.
> I believe in point 1 you meant "without th. prefix".
>
> I've added Jin Ma (who is the main author of the Binutils patchset) so
> he is also aware
> of the proposal to use pseudo instructions to avoid duplication in Binutils.
>
> Thank you very much!
> Christoph
>
>
> >
> > After this change, you can send V2, then I can continue to review on GCC-15.
> >
> > Thanks.
> >
> > 
> > juzhe.zh...@rivai.ai
> >
> >
> > From: juzhe.zh...@rivai.ai
> > Date: 2023-11-17 19:39
> > To: gcc-patches
> > CC: kito.cheng; kito.cheng; cooper.joshua; Robin Dapp; jeffreyalaw
> > Subject: RISC-V: Support XTheadVector extensions
> > 90% theadvector extension reusing current RVV 1.0 instructions patterns:
> > Just change ASM, For example:
> >
> > @@ -2923,7 +2923,7 @@ (define_insn "*pred_mulh_scalar"
> >   (match_operand:VFULLI_D 3 "register_operand"  "vr,vr, vr, vr")] VMULH)
> >(match_operand:VFULLI_D 2 "vector_merge_operand" "vu, 0, vu,  0")))]
> >"TARGET_VECTOR"
> > -  "vmulh.vx\t%0,%3,%z4%p1"
> > +  "%^vmulh.vx\t%0,%3,%z4%p1"
> >[(set_attr "type" "vimul")
> > (set_attr "mode" "")])
> >
> > +  if (letter == '^')
> > +{
> > +  if (TARGET_XTHEADVECTOR)
> > + fputs ("th.", file);
> > +  return;
> > +}
> >
> >
> > For almost all patterns, you just simply append "th." in the ASM prefix.
> > like change "vmulh.vv" -> "th.vmulh.vv"
> >
> 

Re: [PATCH v2] gcov: Fix integer types in gen_counter_update()

2023-11-22 Thread Sebastian Huber

On 22.11.23 15:22, Christophe Lyon wrote:

> On Tue, 21 Nov 2023 at 12:22, Sebastian Huber
>  wrote:
>> On 21.11.23 11:46, Jakub Jelinek wrote:
>>> On Tue, Nov 21, 2023 at 11:42:06AM +0100, Sebastian Huber wrote:
>>>> On 21.11.23 11:34, Jakub Jelinek wrote:
>>>>>> --- a/gcc/tree-profile.cc
>>>>>> +++ b/gcc/tree-profile.cc
>>>>>> @@ -281,10 +281,13 @@ gen_assign_counter_update (gimple_stmt_iterator *gsi, gcall *call, tree func,
>>>>>>   if (result)
>>>>>> {
>>>>>>   tree result_type = TREE_TYPE (TREE_TYPE (func));
>>>>>> -  tree tmp = make_temp_ssa_name (result_type, NULL, name);
>>>>>> -  gimple_set_lhs (call, tmp);
>>>>>> +  tree tmp1 = make_temp_ssa_name (result_type, NULL, name);
>>>>>> +  gimple_set_lhs (call, tmp1);
>>>>>>   gsi_insert_after (gsi, call, GSI_NEW_STMT);
>>>>>> -  gassign *assign = gimple_build_assign (result, tmp);
>>>>>> +  tree tmp2 = make_ssa_name (TREE_TYPE (result));
>>>>>> +  gassign *assign = gimple_build_assign (tmp2, NOP_EXPR, tmp1);
>>>>>> +  gsi_insert_after (gsi, assign, GSI_NEW_STMT);
>>>>>> +  assign = gimple_build_assign (result, gimple_assign_lhs (assign));
>>>>> When you use a temporary tmp2 for the lhs of the conversion, you can just
>>>>> use it here,
>>>>>  assign = gimple_build_assign (result, tmp2);
>>>>>
>>>>> Ok for trunk with that change.
>>>> Just a question, could I also use
>>>>
>>>> tree tmp2 = make_temp_ssa_name (TREE_TYPE (result), NULL, name);
>>>>
>>>> ?
>>>>
>>>> This make_temp_ssa_name() is used throughout the file and the new
>>>> make_ssa_name() would be the first use in this file.
>>> Yes.  The only difference is that it won't be _234 = (type) something;
>>> but PROF_time_profile_234 = (type) something; in the dumps, but sure,
>>> consistency is useful.
>> Thanks for your help. I checked in an updated version.
>
> Our CI bisected a regression to this commit:
> Running gcc:gcc.dg/tree-prof/tree-prof.exp ...
> FAIL: gcc.dg/tree-prof/time-profiler-3.c scan-ipa-dump-times profile
> "Read tp_first_run: 0" 1
> FAIL: gcc.dg/tree-prof/time-profiler-3.c scan-ipa-dump-times profile
> "Read tp_first_run: 2" 1
>
> (on aarch64)
>
> Can you check?


Yes, I will have a look at it.

--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registergericht: Amtsgericht München
Registernummer: HRB 157899
Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
Unsere Datenschutzerklärung finden Sie hier:
https://embedded-brains.de/datenschutzerklaerung/


Re: [PATCH v2] gcov: Fix integer types in gen_counter_update()

2023-11-22 Thread Christophe Lyon
Hi,

On Tue, 21 Nov 2023 at 12:22, Sebastian Huber
 wrote:
>
> On 21.11.23 11:46, Jakub Jelinek wrote:
> > On Tue, Nov 21, 2023 at 11:42:06AM +0100, Sebastian Huber wrote:
> >>
> >> On 21.11.23 11:34, Jakub Jelinek wrote:
> >>>> --- a/gcc/tree-profile.cc
> >>>> +++ b/gcc/tree-profile.cc
> >>>> @@ -281,10 +281,13 @@ gen_assign_counter_update (gimple_stmt_iterator *gsi, gcall *call, tree func,
> >>>>   if (result)
> >>>> {
> >>>>   tree result_type = TREE_TYPE (TREE_TYPE (func));
> >>>> -  tree tmp = make_temp_ssa_name (result_type, NULL, name);
> >>>> -  gimple_set_lhs (call, tmp);
> >>>> +  tree tmp1 = make_temp_ssa_name (result_type, NULL, name);
> >>>> +  gimple_set_lhs (call, tmp1);
> >>>>   gsi_insert_after (gsi, call, GSI_NEW_STMT);
> >>>> -  gassign *assign = gimple_build_assign (result, tmp);
> >>>> +  tree tmp2 = make_ssa_name (TREE_TYPE (result));
> >>>> +  gassign *assign = gimple_build_assign (tmp2, NOP_EXPR, tmp1);
> >>>> +  gsi_insert_after (gsi, assign, GSI_NEW_STMT);
> >>>> +  assign = gimple_build_assign (result, gimple_assign_lhs (assign));
> >>> When you use a temporary tmp2 for the lhs of the conversion, you can just
> >>> use it here,
> >>> assign = gimple_build_assign (result, tmp2);
> >>>
> >>> Ok for trunk with that change.
> >> Just a question, could I also use
> >>
> >> tree tmp2 = make_temp_ssa_name (TREE_TYPE (result), NULL, name);
> >>
> >> ?
> >>
> >> This make_temp_ssa_name() is used throughout the file and the new
> >> make_ssa_name() would be the first use in this file.
> > Yes.  The only difference is that it won't be _234 = (type) something;
> > but PROF_time_profile_234 = (type) something; in the dumps, but sure,
> > consistency is useful.
>
> Thanks for your help. I checked in an updated version.
>

Our CI bisected a regression to this commit:
Running gcc:gcc.dg/tree-prof/tree-prof.exp ...
FAIL: gcc.dg/tree-prof/time-profiler-3.c scan-ipa-dump-times profile
"Read tp_first_run: 0" 1
FAIL: gcc.dg/tree-prof/time-profiler-3.c scan-ipa-dump-times profile
"Read tp_first_run: 2" 1

(on aarch64)

Can you check?

Thanks,

Christophe

> --
> embedded brains GmbH
> Herr Sebastian HUBER
> Dornierstr. 4
> 82178 Puchheim
> Germany
> email: sebastian.hu...@embedded-brains.de
> phone: +49-89-18 94 741 - 16
> fax:   +49-89-18 94 741 - 08
>
> Registergericht: Amtsgericht München
> Registernummer: HRB 157899
> Vertretungsberechtigte Geschäftsführer: Peter Rasmussen, Thomas Dörfler
> Unsere Datenschutzerklärung finden Sie hier:
> https://embedded-brains.de/datenschutzerklaerung/


Re: [PATCH v2] A new copy propagation and PHI elimination pass

2023-11-22 Thread Filip Kastl
Hi Richard,

> Can you name the new file gimple-ssa-sccopy.cc please?

Yes, no problem.

Btw, I thought that it is standard that gimple ssa passes have the tree-ssa-
prefix. Do I understand it correctly that this is not true and many
tree-ssa-*.cc passes should actually be named gimple-ssa-*.cc but remain
tree-ssa-*.cc for historical reasons?

>> +   3 A set of PHI statements that only refer to each other or to one other
>> + value.
>> +
>> +   _8 = PHI <_9, _10>;
>> +   _9 = PHI <_8, _10>;
>> +   _10 = PHI <_8, _9, _1>;
> 
> this case necessarily involves a cyclic CFG, so maybe say
> 
> "This is a lightweight SSA copy propagation pass that is able to handle
> cycles optimistically, eliminating PHIs within those."
> 
> ?  Or is this a mis-characterization?

I'm not sure what you mean here. Yes, this case always involves a cyclic CFG.
Is it weird that a lightweight pass is able to handle cyclic CFG and therefore
you suggest to comment this fact and say that the pass handles cycles
optimistically?

I'm not sure if optimistic is a good word to characterize the pass. I'd expect
an "optimistic" pass to make assumptions which may not be true and therefore
not always remove all the redundancies it could. This pass however should
achieve all that it sets out to do.
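
To make case 3 from the quoted comment concrete, here is a minimal standalone
sketch. It is not the pass's code: SSA names are modelled as plain integers,
and PhiGraph, find_sccs and scc_single_outside_value are names made up for
this illustration. It runs Tarjan's algorithm over the PHIs and then checks
whether an SCC refers to exactly one value outside itself.

```cpp
#include <algorithm>
#include <functional>
#include <map>
#include <set>
#include <vector>

// Toy model: each PHI is keyed by its SSA version and lists the versions
// of its arguments.  Versions that are not keys are "outside" values.
using PhiGraph = std::map<int, std::vector<int>>;

// Tarjan's SCC algorithm restricted to the PHI vertices of the graph.
std::vector<std::set<int>> find_sccs(const PhiGraph &g) {
  std::map<int, int> index, low;
  std::vector<int> stack;
  std::set<int> on_stack;
  std::vector<std::set<int>> sccs;
  int next = 0;

  std::function<void(int)> visit = [&](int v) {
    index[v] = low[v] = next++;
    stack.push_back(v);
    on_stack.insert(v);
    for (int w : g.at(v)) {
      if (!g.count(w))
        continue;  // outside value, not a vertex
      if (!index.count(w)) {
        visit(w);
        low[v] = std::min(low[v], low[w]);
      } else if (on_stack.count(w)) {
        low[v] = std::min(low[v], index[w]);
      }
    }
    if (low[v] == index[v]) {
      // v is the root of an SCC: pop its members off the stack.
      std::set<int> scc;
      int w;
      do {
        w = stack.back();
        stack.pop_back();
        on_stack.erase(w);
        scc.insert(w);
      } while (w != v);
      sccs.push_back(scc);
    }
  };

  for (const auto &kv : g)
    if (!index.count(kv.first))
      visit(kv.first);
  return sccs;
}

// Returns true and sets *value if every argument of the SCC's PHIs is
// either inside the SCC or one single outside value.
bool scc_single_outside_value(const PhiGraph &g, const std::set<int> &scc,
                              int *value) {
  bool found = false;
  for (int v : scc)
    for (int w : g.at(v)) {
      if (scc.count(w))
        continue;
      if (found && *value != w)
        return false;
      found = true;
      *value = w;
    }
  return found;
}
```

Fed the three PHIs _8, _9 and _10 from the comment, the sketch finds one SCC
containing all of them, with _1 as the only outside value, which is the
situation in which the whole set can be replaced by _1.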

> It might be nice to optimize SCCs of size 1 somehow, not sure how
> many times these appear - possibly prevent them from even entering
> the SCC discovery?

Maybe that could be done. I would have to think about it and make sure it
doesn't break anything. I'd prefer to get this version into upstream and then
possibly post this upgrade later.

Btw, SCCs of size 1 appear all the time. Those are the cases 1 and 2
described in the comment at the beginning of the file.

> I'll note that while you are working with stmts everywhere that
> you are really bound to using SSA defs and those would already
> nicely have numbers (the SSA_NAME_VERSION).  In principle the
> SCC lattice could be pre-allocated once, indexed by
> SSA_NAME_VERSION and you could keep a "generation" number
> indicating what SCC discovery round it belongs to (aka the
> set_using).

I see. I could allocate a vertex struct for each statement only once when the
pass is invoked instead of allocating the structs each time tarjan_compute_sccs
is called. Will do that.

I'm not sure if I want to use SSA_NAME_VERSION for indexing a vec/array with
all those vertex structs. Many SSA names will be defined neither by PHI nor by
a copy assignment statement. If I created a vertex struct for every SSA name I
would allocate a lot of extra memory.
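
For reference, the "generation" idea from the review could look roughly like
the sketch below. This is only a model under assumptions: Slot and SccLattice
are hypothetical names, and a plain size_t stands in for SSA_NAME_VERSION. The
array is allocated once, and starting a new SCC discovery round is a counter
increment rather than a clear of the whole array.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical per-SSA-name slot, indexed by version number.
struct Slot {
  unsigned generation = 0;  // discovery round this slot was last used in
  int index = -1;           // Tarjan index, valid only in that round
  int lowlink = -1;         // Tarjan lowlink, valid only in that round
};

class SccLattice {
 public:
  // Allocate one slot per SSA name, once per pass invocation.
  explicit SccLattice(std::size_t num_ssa_names) : slots_(num_ssa_names) {}

  // Start a new discovery round in O(1); stale slots are not cleared.
  void new_round() { ++current_; }

  // A slot counts as visited only if it was touched in the current round.
  bool visited(std::size_t version) const {
    return slots_[version].generation == current_;
  }

  // Return the slot for VERSION, lazily resetting it if it is stale.
  Slot &touch(std::size_t version) {
    Slot &s = slots_[version];
    if (s.generation != current_) {
      s = Slot();
      s.generation = current_;
    }
    return s;
  }

 private:
  std::vector<Slot> slots_;
  unsigned current_ = 1;  // 0 is reserved for "never touched"
};
```

The memory concern above still applies to this layout: every SSA name gets a
slot, whether or not it is defined by a PHI or a copy.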

> There's a old SCC finding algorithm working on the SSA graph
> in the old SCC based value-numbering, for example on the
> gcc 7 branch in tree-ssa-sccvn.c:DFS

> For reading it would be nice to put the SCC finding in its
> own class.

Okay, I'll do that.

> > +   }
> > +}
> > +
> > +  if (!stack.is_empty ())
> > +gcc_unreachable ();
> > +
> > +  /* Clear copy stmts' 'using' flags.  */
> > +  for (vertex v : vs)
> > +{
> > +  gimple *s = v.stmt;
> > +  tarjan_clear_using (s);
> > +}
> > +
> > +  return sccs;
> > +}
> > +
> > +/* Could this statement potentially be a copy statement?
> > +
> > +   This pass only considers statements for which this function returns 
> > 'true'.
> > +   Those are basically PHI functions and assignment statements similar to
> > +
> > +   _2 = _1;
> > +   or
> > +   _2 = 5;  */
> > +
> > +static bool
> > +stmt_may_generate_copy (gimple *stmt)
> > +{
> > +  if (gimple_code (stmt) == GIMPLE_PHI)
> > +{
> > +  gphi *phi = as_a <gphi *> (stmt);
> > +
> > +  /* No OCCURS_IN_ABNORMAL_PHI SSA names in lhs nor rhs.  */
> > +  if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi)))
> > +   return false;
> > +
> > +  unsigned i;
> > +  for (i = 0; i < gimple_phi_num_args (phi); i++)
> > +   {
> > + tree op = gimple_phi_arg_def (phi, i);
> > + if (TREE_CODE (op) == SSA_NAME
> > + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
> > +   return false;
> > +   }
> 
> When there's more than one non-SSA PHI argument and they are not
> the same then the stmt also cannot be a copy, right?
> 
> > +  return true;
> > +}

Do I understand you correctly that you propose to put another check here?
Something like

unsigned nonssa_args_num = 0;
unsigned i;
for (i = 0; i < gimple_phi_num_args (phi); i++)
  {
tree op = gimple_phi_arg_def (phi, i);
if (TREE_CODE (op) == SSA_NAME)
  {
nonssa_args_num++;
if (nonssa_args_num >= 2)
  return false;

if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op))
  return false;
  }
  }
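
Note that the snippet as posted increments nonssa_args_num when the operand
*is* an SSA_NAME, and rejects on the count alone. A standalone sketch of the
check as the review describes it (reject only when there are two or more
non-SSA arguments that differ) might look like this. PhiArg and
phi_may_generate_copy are hypothetical names, and a long stands in for a
constant operand; in the real pass the comparison would presumably be
something like operand_equal_p, which is an assumption on my side.

```cpp
#include <vector>

// Hypothetical model of one PHI argument.
struct PhiArg {
  bool is_ssa_name;
  bool occurs_in_abnormal_phi;  // meaningful only for SSA names
  long constant;                // meaningful only for non-SSA arguments
};

// Could a PHI with these arguments potentially be a copy?
bool phi_may_generate_copy(const std::vector<PhiArg> &args) {
  bool seen_constant = false;
  long first_constant = 0;
  for (const PhiArg &arg : args) {
    if (arg.is_ssa_name) {
      // Same restriction as in the posted patch.
      if (arg.occurs_in_abnormal_phi)
        return false;
      continue;
    }
    // Two non-SSA arguments that differ can never collapse to one value.
    if (seen_constant && arg.constant != first_constant)
      return false;
    seen_constant = true;
    first_constant = arg.constant;
  }
  return true;
}
```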

> > +
> > +  if (gimple_code (stmt) != GIMPLE_ASSIGN)
> > +return false;
> > +
> > +  /* If the statement has volatile operands, it won't generate a
> > + useful copy.  */
> > +  if (gimple_has_volatile_ops (stmt))
> > +return false;
> > +
> > +  /* Statements with loads and/or stores will never generate a useful 
> > copy.  

Re: [PATCH v4] Introduce strub: machine-independent stack scrubbing

2023-11-22 Thread Richard Biener
On Mon, Nov 20, 2023 at 1:40 PM Alexandre Oliva  wrote:
>
> On Oct 26, 2023, Alexandre Oliva  wrote:
>
> >> This is a refreshed and improved version of the version posted back in
> >> June.  https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621936.html
>
> > Ping? https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633675.html
> > I'm combining the gcc/ipa-strub.cc bits from
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633526.html
>
> Ping?
> Retested on x86_64-linux-gnu, with and without -fstrub=all.

@@ -898,7 +899,24 @@ decl_attributes (tree *node, tree attributes, int flags,
   TYPE_NAME (tt) = *node;
 }

-  *anode = cur_and_last_decl[0];
+  if (*anode != cur_and_last_decl[0])
+{
+  /* Even if !spec->function_type_required, allow the attribute
+ handler to request the attribute to be applied to the function
+ type, rather than to the function pointer type, by setting
+ cur_and_last_decl[0] to the function type.  */
+  if (!fn_ptr_tmp
+  && POINTER_TYPE_P (*anode)
+  && TREE_TYPE (*anode) == cur_and_last_decl[0]
+  && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (*anode)))
+ {
+  fn_ptr_tmp = TREE_TYPE (*anode);
+  fn_ptr_quals = TYPE_QUALS (*anode);
+  anode = &fn_ptr_tmp;
+ }
+  *anode = cur_and_last_decl[0];
+}
+

what is this a workaround for?  Isn't there a suitable parsing position
for placing the attribute?

+#ifndef STACK_GROWS_DOWNWARD
+# define STACK_TOPS GT
+#else
+# define STACK_TOPS LT
+#endif

according to docs this is defined to 0 or 1 so the above looks wrong
(it's always defined).
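
A small standalone illustration of the difference, assuming the macro is
defined to 0 as it would be on a target whose stack grows upward. The Cmp
enum and the two STACK_TOPS_* spellings are made up for this sketch; only the
#ifndef form comes from the patch.

```cpp
// Stand-in for the target definition: defined to 0 or 1, never undefined.
#define STACK_GROWS_DOWNWARD 0

enum Cmp { LT, GT };

// As posted: tests whether the macro is defined, so the #else arm is
// always taken regardless of the macro's value.
#ifndef STACK_GROWS_DOWNWARD
# define STACK_TOPS_IFNDEF GT
#else
# define STACK_TOPS_IFNDEF LT
#endif

// Testing the value instead selects the arm that matches the target.
#if STACK_GROWS_DOWNWARD
# define STACK_TOPS_IF LT
#else
# define STACK_TOPS_IF GT
#endif

constexpr Cmp stack_tops_ifndef = STACK_TOPS_IFNDEF;
constexpr Cmp stack_tops_if = STACK_TOPS_IF;
```

With the value 0 the #ifndef form still yields LT, while the #if form yields
GT.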

+  if (optimize < 2 || optimize_size || flag_no_inline)
+return NULL_RTX;

I'm wondering about these checks in the expansions of the builtins,
I think this is about inline expanding or emitting a libcall, right?
I wonder if you should use optimize_function_for_speed (cfun) instead?
Usually -fno-inline shouldn't affect such calls, but -fno-builtin-FOO would.
I have no strong opinion here though.

The new builtins seem undocumented - usually those are documented
within extend.texi - I guess placing __builtin___strub_enter calls in
the code manually will break in interesting ways - if that's not supposed
to happen the trick is to embed a space in the name of the built-in.
__builtin_stack_address looks like something users will pick up though
(and thus should be documented)?

-symtab_node::reset (void)
+symtab_node::reset (bool preserve_comdat_group)

not sure what for, I'll leave Honza to comment.

+/* Create a distinct copy of the type of NODE's function, and change
+   the fntype of all calls to it with the same main type to the new
+   type.  */
+
+static void
+distinctify_node_type (cgraph_node *node)
+{
+  tree old_type = TREE_TYPE (node->decl);
+  tree new_type = build_distinct_type_copy (old_type);
+  tree new_ptr_type = NULL_TREE;
+
+  /* Remap any calls to node->decl that use old_type, or a variant
+ thereof, to new_type as well.  We don't look for aliases, their
+ declarations will have their types changed independently, and
+ we'll adjust their fntypes then.  */
+  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
+{
+  if (!e->call_stmt)
+ continue;
+  tree fnaddr = gimple_call_fn (e->call_stmt);
+  gcc_checking_assert (TREE_CODE (fnaddr) == ADDR_EXPR
+   && TREE_OPERAND (fnaddr, 0) == node->decl);
+  if (strub_call_fntype_override_p (e->call_stmt))
+ continue;
+  if (!new_ptr_type)
+ new_ptr_type = build_pointer_type (new_type);
+  TREE_TYPE (fnaddr) = new_ptr_type;
+  gimple_call_set_fntype (e->call_stmt, new_type);
+}
+
+  TREE_TYPE (node->decl) = new_type;

it does feel like there's IPA mechanisms to deal with what you are trying to do
here (or in the caller(s)).


+unsigned int
+pass_ipa_strub_mode::execute (function *)
+{
+  last_cgraph_order = 0;
+  ipa_strub_set_mode_for_new_functions ();
+
+  /* Verify before any inlining or other transformations.  */
+  verify_strub ();

if  (flag_checking) verify_strub ();

please.  I guess we talked about this last year - what's the reason to have both
an IPA pass and a simple IPA pass?  IIRC the simple IPA pass is a simple
one because it wants to see inlined bodies and "fixes" those up?  Some toplevel
comments explaining both passes in the ipa-strub.cc pass would be nice to
have.  I guess I also asked before - did you try it with -flto?

+/* Decide which of the wrapped function's parms we want to turn into
+   references to the argument passed to the wrapper.  In general, we want to
+   copy small arguments, and avoid copying large ones.  Variable-sized array
+   lengths given by other arguments, as in 20020210-1.c, would lead to
+   problems if passed by value, after resetting the original function and
+   dropping the length computation; passing them by reference works.
+   DECL_BY_REFERENCE is *not* a substitute for this: it involves copying
+   anyway, but performed at the caller.  */
+indirect_parms_t indirect_nparms (3, 

[pushed] [PR112610] [IRA]: Fix using undefined dump file in IRA code during insn scheduling

2023-11-22 Thread Vladimir Makarov

The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112610

The patch was successfully tested and bootstrapped on x86-64.

commit 95f61de95bbcc2e4fb7020e27698140abea23788
Author: Vladimir N. Makarov 
Date:   Wed Nov 22 09:01:02 2023 -0500

[IRA]: Fix using undefined dump file in IRA code during insn scheduling

Part of IRA code is used for register pressure sensitive insn
scheduling and live range shrinkage.  Numerous changes of IRA resulted
in that this IRA code uses dump file passed by the scheduler and
internal ira dump file (in called functions) which can be undefined or
freed by the scheduler during compiling previous functions.  The patch
fixes this problem.  To reproduce the error valgrind should be used
and GCC should be compiled with valgrind annotations.  Therefore the
patch does not contain the test case.

gcc/ChangeLog:

PR rtl-optimization/112610
* ira-costs.cc: (find_costs_and_classes): Remove arg.
Use ira_dump_file for printing.
(print_allocno_costs, print_pseudo_costs): Ditto.
(ira_costs): Adjust call of find_costs_and_classes.
(ira_set_pseudo_classes): Set up and restore ira_dump_file.
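
The "set up and restore" item in the ChangeLog is the usual save/restore of a
module-level variable around a call made on behalf of an outside client. A
minimal model, with every name below invented for illustration (this is not
the ira-costs.cc code), could be:

```cpp
#include <cstdio>

// Stand-in for the module-level dump file read by the printing helpers.
static std::FILE *ira_dump_file_model = nullptr;

// Counts how often the helper actually printed, for demonstration only.
static int lines_printed = 0;

// Helper that no longer takes a FILE * argument and reads the global.
static void print_costs_model() {
  if (ira_dump_file_model) {
    std::fprintf(ira_dump_file_model, "\n");
    ++lines_printed;
  }
}

// Entry point called by an outside client with its own dump file.
void set_pseudo_classes_model(std::FILE *dump_file) {
  std::FILE *saved = ira_dump_file_model;  // remember the previous value
  ira_dump_file_model = dump_file;         // helpers now see the caller's file
  print_costs_model();
  ira_dump_file_model = saved;             // never leave a foreign file behind
}
```

The point of restoring is that the global never keeps pointing at a file the
caller may close later.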

diff --git a/gcc/ira-costs.cc b/gcc/ira-costs.cc
index e0528e76a64..c3efd295e54 100644
--- a/gcc/ira-costs.cc
+++ b/gcc/ira-costs.cc
@@ -1662,16 +1662,16 @@ scan_one_insn (rtx_insn *insn)
 
 
 
-/* Print allocnos costs to file F.  */
+/* Print allocnos costs to the dump file.  */
 static void
-print_allocno_costs (FILE *f)
+print_allocno_costs (void)
 {
   int k;
   ira_allocno_t a;
   ira_allocno_iterator ai;
 
   ira_assert (allocno_p);
-  fprintf (f, "\n");
+  fprintf (ira_dump_file, "\n");
   FOR_EACH_ALLOCNO (a, ai)
 {
   int i, rclass;
@@ -1681,32 +1681,34 @@ print_allocno_costs (FILE *f)
   enum reg_class *cost_classes = cost_classes_ptr->classes;
 
   i = ALLOCNO_NUM (a);
-  fprintf (f, "  a%d(r%d,", i, regno);
+  fprintf (ira_dump_file, "  a%d(r%d,", i, regno);
   if ((bb = ALLOCNO_LOOP_TREE_NODE (a)->bb) != NULL)
-	fprintf (f, "b%d", bb->index);
+	fprintf (ira_dump_file, "b%d", bb->index);
   else
-	fprintf (f, "l%d", ALLOCNO_LOOP_TREE_NODE (a)->loop_num);
-  fprintf (f, ") costs:");
+	fprintf (ira_dump_file, "l%d", ALLOCNO_LOOP_TREE_NODE (a)->loop_num);
+  fprintf (ira_dump_file, ") costs:");
   for (k = 0; k < cost_classes_ptr->num; k++)
 	{
 	  rclass = cost_classes[k];
-	  fprintf (f, " %s:%d", reg_class_names[rclass],
+	  fprintf (ira_dump_file, " %s:%d", reg_class_names[rclass],
 		   COSTS (costs, i)->cost[k]);
 	  if (flag_ira_region == IRA_REGION_ALL
 	  || flag_ira_region == IRA_REGION_MIXED)
-	fprintf (f, ",%d", COSTS (total_allocno_costs, i)->cost[k]);
+	fprintf (ira_dump_file, ",%d",
+		 COSTS (total_allocno_costs, i)->cost[k]);
 	}
-  fprintf (f, " MEM:%i", COSTS (costs, i)->mem_cost);
+  fprintf (ira_dump_file, " MEM:%i", COSTS (costs, i)->mem_cost);
   if (flag_ira_region == IRA_REGION_ALL
 	  || flag_ira_region == IRA_REGION_MIXED)
-	fprintf (f, ",%d", COSTS (total_allocno_costs, i)->mem_cost);
-  fprintf (f, "\n");
+	fprintf (ira_dump_file, ",%d",
+		 COSTS (total_allocno_costs, i)->mem_cost);
+  fprintf (ira_dump_file, "\n");
 }
 }
 
-/* Print pseudo costs to file F.  */
+/* Print pseudo costs to the dump file.  */
 static void
-print_pseudo_costs (FILE *f)
+print_pseudo_costs (void)
 {
   int regno, k;
   int rclass;
@@ -1714,21 +1716,21 @@ print_pseudo_costs (FILE *f)
   enum reg_class *cost_classes;
 
   ira_assert (! allocno_p);
-  fprintf (f, "\n");
+  fprintf (ira_dump_file, "\n");
   for (regno = max_reg_num () - 1; regno >= FIRST_PSEUDO_REGISTER; regno--)
 {
   if (REG_N_REFS (regno) <= 0)
 	continue;
   cost_classes_ptr = regno_cost_classes[regno];
   cost_classes = cost_classes_ptr->classes;
-  fprintf (f, "  r%d costs:", regno);
+  fprintf (ira_dump_file, "  r%d costs:", regno);
   for (k = 0; k < cost_classes_ptr->num; k++)
 	{
 	  rclass = cost_classes[k];
-	  fprintf (f, " %s:%d", reg_class_names[rclass],
+	  fprintf (ira_dump_file, " %s:%d", reg_class_names[rclass],
 		   COSTS (costs, regno)->cost[k]);
 	}
-  fprintf (f, " MEM:%i\n", COSTS (costs, regno)->mem_cost);
+  fprintf (ira_dump_file, " MEM:%i\n", COSTS (costs, regno)->mem_cost);
 }
 }
 
@@ -1939,7 +1941,7 @@ calculate_equiv_gains (void)
and their best costs.  Set up preferred, alternative and allocno
classes for pseudos.  */
 static void
-find_costs_and_classes (FILE *dump_file)
+find_costs_and_classes (void)
 {
   int i, k, start, max_cost_classes_num;
   int pass;
@@ -1991,8 +1993,8 @@ find_costs_and_classes (FILE *dump_file)
  classes to guide the selection.  */
   for (pass = start; pass <= flag_expensive_optimizations; pass++)
 {
-  if ((!allocno_p || internal_flag_ira_verbose > 0) && dump_file)
-	fprintf 

Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Arnaud Charlet
> >> #if defined (__APPLE__)
> >> -#include <unistd.h>
> > 
> > If removing unistd.h is intentional (i.e. you determined that it’s no longer
> > needed for Darwin), then we should make that a separate patch.
> 
> I thought that I’d had to include unistd.h for the first patch in this 
> thread; clearly not!
> 
> What I hope will be the final version:

OK here.

> ——— 8< .———
> 
> In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
> assumption for __APPLE__ is that file names are case-insensitive
> unless __arm__ or __arm64__ are defined, in which case file names are
> declared case-sensitive.
> 
> The associated comment is
>   "By default, we suppose filesystems aren't case sensitive on
>   Windows and Darwin (but they are on arm-darwin)."
> 
> This means that on aarch64-apple-darwin, file names are treated as
> case-sensitive, which is not the default case.
> 
> The true default position is that macOS file systems are
> case-insensitive, iOS file systems are case-sensitive.
> 
> Apple provide a header file <TargetConditionals.h> which permits a
> compile-time check for the compiler target (e.g. OSX vs IOS); if
> TARGET_OS_IOS is defined as 1, this is a build for iOS.
> 
>   * gcc/ada/adaint.c
>   (__gnat_get_file_names_case_sensitive): Split out the __APPLE__
>   check and remove the checks for __arm__, __arm64__.
>   For Apple, file names are by default case-insensitive unless
>   TARGET_OS_IOS is set.
> 
> Signed-off-by: Simon Wright 


Re: [PATCH] Fix PR ada/111909 On Darwin, determine filesystem case sensitivity at runtime

2023-11-22 Thread Simon Wright
On 21 Nov 2023, at 23:13, Iain Sandoe  wrote:

>> #if defined (__APPLE__)
>> -#include <unistd.h>
> 
> If removing unistd.h is intentional (i.e. you determined that it’s no longer
> needed for Darwin), then we should make that a separate patch.

I thought that I’d had to include unistd.h for the first patch in this thread; 
clearly not!

What I hope will be the final version:

——— 8< .———

In gcc/ada/adaint.c(__gnat_get_file_names_case_sensitive), the current
assumption for __APPLE__ is that file names are case-insensitive
unless __arm__ or __arm64__ are defined, in which case file names are
declared case-sensitive.

The associated comment is
  "By default, we suppose filesystems aren't case sensitive on
  Windows and Darwin (but they are on arm-darwin)."

This means that on aarch64-apple-darwin, file names are treated as
case-sensitive, which is not the default case.

The true default position is that macOS file systems are
case-insensitive, iOS file systems are case-sensitive.

Apple provide a header file <TargetConditionals.h> which permits a
compile-time check for the compiler target (e.g. OSX vs IOS); if
TARGET_OS_IOS is defined as 1, this is a build for iOS.

  * gcc/ada/adaint.c
  (__gnat_get_file_names_case_sensitive): Split out the __APPLE__
  check and remove the checks for __arm__, __arm64__.
  For Apple, file names are by default case-insensitive unless
  TARGET_OS_IOS is set.

Signed-off-by: Simon Wright 
---
 gcc/ada/adaint.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/gcc/ada/adaint.c b/gcc/ada/adaint.c
index bb4ed2607e5..2e9c59ae958 100644
--- a/gcc/ada/adaint.c
+++ b/gcc/ada/adaint.c
@@ -85,6 +85,7 @@
 
 #if defined (__APPLE__)
 #include <unistd.h>
+#include <TargetConditionals.h>
 #endif
 
 #if defined (__hpux__)
@@ -613,11 +614,18 @@ __gnat_get_file_names_case_sensitive (void)
   else
{
  /* By default, we suppose filesystems aren't case sensitive on
-Windows and Darwin (but they are on arm-darwin).  */
-#if defined (WINNT) || defined (__DJGPP__) \
-  || (defined (__APPLE__) && !(defined (__arm__) || defined (__arm64__)))
+Windows or DOS.  */
+#if defined (WINNT) || defined (__DJGPP__)
  file_names_case_sensitive_cache = 0;
+#elif defined (__APPLE__)
+ /* By default, macOS volumes are case-insensitive, iOS
+volumes are case-sensitive.  */
+#if TARGET_OS_IOS
+ file_names_case_sensitive_cache = 1;
 #else
+ file_names_case_sensitive_cache = 0;
+#endif   
+#else /* Neither Windows nor Apple.  */
  file_names_case_sensitive_cache = 1;
 #endif
}
-- 
2.37.1 (Apple Git-137.1)



Re: Re: RISC-V: Support XTheadVector extensions

2023-11-22 Thread 钟居哲
I am totally OK with approving theadvector for GCC-14 before stage 3 closes,
as long as it doesn't touch the current RVV code too much and binutils
supports theadvector.

I have provided a draft approach:
https://gcc.gnu.org/pipermail/gcc-patches/2023-November/637349.html
which turns out not to need any changes to vector.md.
I strongly suggest following this draft. I can actively review theadvector
during stage 3,
and hopefully help you land theadvector in GCC-14.

Thanks.



juzhe.zh...@rivai.ai
 
From: Christoph Müllner
Date: 2023-11-22 18:07
To: juzhe.zh...@rivai.ai
CC: gcc-patches; kito.cheng; Kito.cheng; cooper.joshua; Robin Dapp; 
jeffreyalaw; Philipp Tomsich; Cooper Qu; Jin Ma; Nelson Chu
Subject: Re: RISC-V: Support XTheadVector extensions
Hi Juzhe,
 
Sorry for the late reply, but I was not on CC, so I missed this email.
 
On Fri, Nov 17, 2023 at 2:41 PM juzhe.zh...@rivai.ai
 wrote:
>
> Ok. I just read the theadvector extension.
>
> https://github.com/T-head-Semi/thead-extension-spec/blob/master/xtheadvector.adoc
>
> Theadvector is not custom extension. Just a uarch to disable some of the 
> RVV1.0 extension
> Theadvector can be considered as subextension of 'V' extension with disabling 
> some of the
> instructions and adding some new thead vector target load/store (This is 
> another story).
>
> So, for disabling the instruction that theadvector doesn't support.
> You don't need to touch such many codes.
>
> Here is a much simpler approach to do (I think it's definitely working):
> 1. Don't change any codes in vector.md and keep GCC generates ASM with "th." 
> prefix.
> 2. Add !TARGET_THEADVECTOR into vector-iterator.md to disable the mode you 
> don't want.
> For example , theadvector doesn't support fractional vector.
>
> Then it's pretty simple:
>
> RVVMF2SI "TARGET_VECTOR && !TARGET_THEADVECTOR".
>
> 3. Remove all the tests you add in this patch.
> 4. You can add theadvector specific load/store for example, th.vlb 
> instructions they are allowed.
> 5. Modify binutils, and make th.vmulh.vv as the pseudo instruction of vmulh.vv
> 6. So with compile option "-S", you will still see ASM as  "vmulh.vv". but 
> with objdump, you will see th.vmulh.vv.
 
Yes, all these points sound reasonable, to minimize the patchset size.
I believe in point 1 you meant "without th. prefix".
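
For comparison, the '%^' punctuation handling from the original patch, quoted
further down in this mail, can be modelled outside the compiler as below.
This is only a sketch: expand_prefix and its bool parameter are hypothetical
and stand in for the print-operand hook and TARGET_XTHEADVECTOR. With the
pseudo-instruction approach none of this would be needed on the GCC side.

```cpp
#include <cstddef>
#include <string>

// Expand "%^" in an assembler template to "th." when XTHEADVECTOR is true
// and to nothing otherwise.  All other '%' sequences are left untouched.
std::string expand_prefix(const std::string &tmpl, bool xtheadvector) {
  std::string out;
  for (std::size_t i = 0; i < tmpl.size(); ++i) {
    if (tmpl[i] == '%' && i + 1 < tmpl.size() && tmpl[i + 1] == '^') {
      if (xtheadvector)
        out += "th.";
      ++i;  // skip the '^'
    } else {
      out += tmpl[i];
    }
  }
  return out;
}
```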
 
I've added Jin Ma (who is the main author of the Binutils patchset) so
he is also aware
of the proposal to use pseudo instructions to avoid duplication in Binutils.
 
Thank you very much!
Christoph
 
 
>
> After this change, you can send V2, then I can continue to review on GCC-15.
>
> Thanks.
>
> 
> juzhe.zh...@rivai.ai
>
>
> From: juzhe.zh...@rivai.ai
> Date: 2023-11-17 19:39
> To: gcc-patches
> CC: kito.cheng; kito.cheng; cooper.joshua; Robin Dapp; jeffreyalaw
> Subject: RISC-V: Support XTheadVector extensions
> 90% theadvector extension reusing current RVV 1.0 instructions patterns:
> Just change ASM, For example:
>
> @@ -2923,7 +2923,7 @@ (define_insn "*pred_mulh_scalar"
>   (match_operand:VFULLI_D 3 "register_operand"  "vr,vr, vr, vr")] VMULH)
>(match_operand:VFULLI_D 2 "vector_merge_operand" "vu, 0, vu,  0")))]
>"TARGET_VECTOR"
> -  "vmulh.vx\t%0,%3,%z4%p1"
> +  "%^vmulh.vx\t%0,%3,%z4%p1"
>[(set_attr "type" "vimul")
> (set_attr "mode" "")])
>
> +  if (letter == '^')
> +{
> +  if (TARGET_XTHEADVECTOR)
> + fputs ("th.", file);
> +  return;
> +}
>
>
> For almost all patterns, you just simply append "th." in the ASM prefix.
> like change "vmulh.vv" -> "th.vmulh.vv"
>
> Almost all theadvector instructions are not new features,  all same as RVV1.0.
> Why do you invent the such ISA doesn't include any features that RVV1.0 
> doesn't satisfy ?
>
> I am not explicitly object this patch. But I should know the reason.
>
> Btw, stage 1 will close soon.  So I will review this patch on GCC-15 as long 
> as all other RISC-V maintainers agree.
>
>
> 
> juzhe.zh...@rivai.ai
 


Re: [PATCH v5] Introduce attribute sym_alias (was: Last call for bikeshedding on attribute sym/exalias/reverse_alias)

2023-11-22 Thread Jan Hubicka
Hi,
it seems that interface to symbol table is fairly minimal here reduced
to...
>   (create_sym_alias_decl, create_sym_alias_decls): New.
>   * cgraphunit.cc (cgraph_node::analyze): Create alias_target
>   node if needed.
called from here...
>   (analyze_functions): Fixup visibility of implicit alias only
>   after its node is analyzed.

> +  if (VAR_P (replaced))
> + varpool_node::create_extra_name_alias (sym_node->decl, replacement);
> +  else
> + cgraph_node::create_same_body_alias (sym_node->decl, replacement);

I wonder why you use same body aliases, which are kind of special to C++
frontend (and come with fixup code working around its quirks you had to
disable above).

Why do you not produce the usual alias attribute once you know the symbol
table, so it goes through the cgraph_node::create_alias or
varpool_node::create_alias path?

Honza


Re: [PATCH] mingw: Exclude utf8 manifest [PR111170, PR108865]

2023-11-22 Thread Costas Argyris
Attached a new patch.

A couple things to note:

1) I changed your

host_extra_objs=utf8-mingw32.o

to

host_extra_objs_mingw=utf8-mingw32.o

to match the other two, since I believe that's what you meant.

2) This approach has the complication that the variables
in configure.ac need to be set before it sources config.host.

On Wed, 22 Nov 2023 at 01:17, Jonathan Yong <10wa...@gmail.com> wrote:

> On 11/21/23 18:07, Costas Argyris wrote:
> > This patch makes the inclusion of the utf8 manifest on the
> > mingw hosts optional by introducing the configure option
> > --disable-win32-utf8-manifest (has no effect on non-mingw
> > hosts).
> >
> > Bootstrapped OK on i686-w64-mingw32 and x86_64-w64-mingw32
> > with and without --disable-win32-utf8-manifest.
> >
> > Costas
> >
>
> I would prefer a AC_ARG_ENABLE to document the option in configure.ac,
> so it would show with configure --help. It should set new variables to
> i386/x-mingw32-utf8, utf8rc-mingw32.o and utf8-mingw32.o respectively
> unless disabled, like so:
>
> host_xmake_mingw=i386/x-mingw32-utf8
> host_extra_gcc_objs_mingw=utf8rc-mingw32.o
> host_extra_objs=utf8-mingw32.o
>
> And then entries in config.host would be:
>
> >   i[34567]86-*-mingw32* | x86_64-*-mingw*)
> > host_xm_file=i386/xm-mingw32.h
> > host_xmake_file="${host_xmake_file} ${host_xmake_mingw}
> i386/x-mingw32"
> > host_extra_gcc_objs="${host_extra_gcc_objs}
> ${host_extra_gcc_objs_mingw} driver-mingw32. >
>  host_extra_objs="${host_extra_objs} ${host_extra_objs_mingw}"
>
>


Exclude-win32-utf8-manifest.patch
Description: Binary data


Re: [PATCH] RISC-V: Fix incorrect use of vcompress in permutation auto-vectorization

2023-11-22 Thread juzhe.zh...@rivai.ai
Committed, as it is an obvious bug fix.



juzhe.zh...@rivai.ai
 
From: Juzhe-Zhong
Date: 2023-11-22 18:53
To: gcc-patches
CC: kito.cheng; kito.cheng; jeffreyalaw; rdapp.gcc; Juzhe-Zhong
Subject: [PATCH] RISC-V: Fix incorrect use of vcompress in permutation 
auto-vectorization
This patch fixes the following FAILs on zvl512b of an RV32 system:
 
FAIL: gcc.target/riscv/rvv/autovec/struct/struct_vect_run-12.c execution test
FAIL: gcc.target/riscv/rvv/autovec/struct/struct_vect_run-9.c execution test
 
The root cause is that the vcompress optimization is used for the permutation
indices {0,3,7,0}, which is incorrect. Fix this vcompress optimization bug.
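
The two conditions the patch ends up with can be restated on plain constant
indices as in the sketch below. It is a simplified model, not the riscv-v.cc
code: compress_checks_pass is a made-up name, the compress point is taken as
an input because its computation is not part of the quoted hunk, and plain
ints replace the poly_int comparisons (maybe_le, maybe_ne).

```cpp
#include <vector>

// PERM is the constant permutation selector; COMPRESS_POINT is assumed to
// have been found by the caller.
bool compress_checks_pass(const std::vector<int> &perm, int compress_point) {
  const int vlen = static_cast<int>(perm.size());
  if (compress_point < 0 || compress_point >= vlen)
    return false;

  // All index values from 0 to the compress point must be increasing.
  for (int i = 1; i < compress_point; i++)
    if (perm[i] <= perm[i - 1])
      return false;

  // From the compress point on, the indices must form a series that
  // increases by exactly 1.
  for (int i = compress_point + 1; i < vlen; i++)
    if (perm[i] != perm[i - 1] + 1)
      return false;

  return true;
}
```

Assuming a compress point of 2 for the selector {0,3,7,0}, the second loop
rejects it because the last element does not continue the series from 7.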
 
PR target/112598
 
gcc/ChangeLog:
 
* config/riscv/riscv-v.cc (shuffle_compress_patterns): Fix vcompress bug.
 
gcc/testsuite/ChangeLog:
 
* gcc.target/riscv/rvv/autovec/pr112598-3.c: New test.
 
---
gcc/config/riscv/riscv-v.cc   | 15 ++---
.../gcc.target/riscv/rvv/autovec/pr112598-3.c | 21 +++
2 files changed, 29 insertions(+), 7 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c
 
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 7d6d0821d87..7d3e8038dab 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3005,14 +3005,15 @@ shuffle_compress_patterns (struct expand_vec_perm_d *d)
   if (compress_point < 0)
 return false;
-  /* It must be series increasing from compress point.  */
-  if (!d->perm.series_p (compress_point, 1, d->perm[compress_point], 1))
-return false;
-
   /* We can only apply compress approach when all index values from 0 to
  compress point are increasing.  */
   for (int i = 1; i < compress_point; i++)
-if (known_le (d->perm[i], d->perm[i - 1]))
+if (maybe_le (d->perm[i], d->perm[i - 1]))
+  return false;
+
+  /* It must be series increasing from compress point.  */
+  for (int i = 1 + compress_point; i < vlen; i++)
+if (maybe_ne (d->perm[i], d->perm[i - 1] + 1))
   return false;
   /* Success!  */
@@ -3080,10 +3081,10 @@ shuffle_compress_patterns (struct expand_vec_perm_d *d)
   if (need_slideup_p)
 {
   int slideup_cnt = vlen - (d->perm[vlen - 1].to_constant () % vlen) - 1;
-  rtx ops[] = {d->target, d->op1, gen_int_mode (slideup_cnt, Pmode)};
+  merge = gen_reg_rtx (vmode);
+  rtx ops[] = {merge, d->op1, gen_int_mode (slideup_cnt, Pmode)};
   insn_code icode = code_for_pred_slide (UNSPEC_VSLIDEUP, vmode);
   emit_vlmax_insn (icode, BINARY_OP, ops);
-  merge = d->target;
 }
   insn_code icode = code_for_pred_compress (vmode);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c
new file mode 100644
index 000..231a068c680
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr112598-3.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv32gcv_zvfh_zfh_zvl512b -mabi=ilp32d -O3 
-ftree-vectorize -std=c99 -fno-vect-cost-model" } */
+
+#include 
+#define TYPE uint64_t
+#define ITYPE int64_t
+
+void __attribute__ ((noinline, noclone))
+foo (TYPE *__restrict a, TYPE *__restrict b, TYPE *__restrict c,
+TYPE *__restrict d, ITYPE n)
+{
+  for (ITYPE i = 0; i < n; ++i)
+{
+  d[i * 3] = a[i];
+  d[i * 3 + 1] = b[i];
+  d[i * 3 + 2] = c[i];
+}
+}
+
+/* We don't want vcompress.vv.  */
+/* { dg-final { scan-assembler-not {vcompress\.vv} } } */
-- 
2.36.3
 


Re: [PATCH] tree: Fix up try_catch_may_fallthru [PR112619]

2023-11-22 Thread Jakub Jelinek
On Wed, Nov 22, 2023 at 01:06:28PM +0100, Jakub Jelinek wrote:
> Looking at a trivial example
> void bar ();
> void
> foo (void)
> {
>   try { bar (); } catch (int) {}
> }
> it seems it is even more complicated, because what e.g. the gimplification
> sees is not TRY_CATCH_EXPR with CATCH_EXPR second operand, but
> TRY_BLOCK with HANDLER second operand (note, certainly not wrapped in a
> STATEMENT_LIST, one would need another catch (long) {} for it after it),
> C++ FE specific trees.
> And cp_gimplify_expr then on the fly turns the TRY_BLOCK into TRY_CATCH_EXPR
> (in genericize_try_block) and HANDLER into CATCH_EXPR
> (genericize_catch_block).
> When gimplifying EH_SPEC_BLOCK, genericize_eh_spec_block ->
> build_gimple_eh_filter_tree even creates a TRY_CATCH_EXPR with
> EH_FILTER_EXPR as its second operand
> (without intervening STATEMENT_LIST).

Ah, and the difference between the above where TRY_BLOCK is turned into
TRY_CATCH_EXPR and HANDLER into CATCH_EXPR vs. the ICE on the testcase from
the PR is that in that case it isn't TRY_BLOCK, but CLEANUP_STMT which is
not changed during gimplification but already during cp genericization.
So, pedantically perhaps just assuming TRY_CATCH_EXPR where second argument
is not STATEMENT_LIST to be the CATCH_EXPR/EH_FILTER_EXPR case could work
for C++, but there are other FEs and it would be fragile (and weird, given
that STATEMENT_LIST with single stmt in it vs. that stmt ought to be
generally interchangeable).

Plus of course question whether we want to handle TRY_BLOCK/EH_SPEC_BLOCK in
cxx_block_may_fallthru in addition to that remains (it apparently already
handles CLEANUP_STMT, but strangely just the try/finally special case of it
- I'd assume the CLEANUP_EH_ONLY case would be
(block_may_fallthru (CLEANUP_BODY (stmt))
 || block_may_fallthru (CLEANUP_EXPR (stmt)))
because if the body can fallthru, everything can, and if there is an
exception and cleanup can fallthru, then it could fallthru as well).

Jakub



Re: [PATCH v5] Introduce attribute sym_alias

2023-11-22 Thread Richard Biener
On Mon, Nov 20, 2023 at 1:54 PM Alexandre Oliva  wrote:
>
> On Sep 20, 2023, Alexandre Oliva  wrote:
>
> > This patch introduces an attribute to add extra asm names (aliases)
> > for a decl when its definition is output.
>
> Ping?
> https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630971.html
>
> Re-regstrapped on x86_64-linux-gnu.  Ok to install?

OK if Honza or C/C++ maintainers do not request additional changes
this week.

Thanks,
Richard.

> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive



Re: [PATCH] tree: Fix up try_catch_may_fallthru [PR112619]

2023-11-22 Thread Jakub Jelinek
On Wed, Nov 22, 2023 at 11:32:10AM +, Richard Biener wrote:
> > hack in gcc 13 and triggered on hundreds of tests there within just 5
> > seconds of running make check-g++ -j32 (and in cases I looked at had nothing
> > to do with the r14-5086 backports), so I believe this is just bad
> > assumption on the try_catch_may_fallthru side, gimplify.cc certainly doesn't
> > care, it just calls gimplify_and_add (TREE_OPERAND (*expr_p, 1), );
> > on it.  So, IMHO non-STATEMENT_LIST in the second operand is equivalent to
> > a STATEMENT_LIST containing a single statement.
> 
> Did you check if there's ever a CATCH_EXPR or EH_FILTER_EXPR not wrapped
> inside a STATEMENT_LIST?  That is, does
> 
>  if (TREE_CODE (TREE_OPERAND (stmt, 1)) != STATEMENT_LIST)
>{
>  gcc_checking_assert (code != CATCH_EXPR && code != EH_FILTER_EXPR);
>  return false;
>}
> 
> work?

Looking at a trivial example
void bar ();
void
foo (void)
{
  try { bar (); } catch (int) {}
}
it seems it is even more complicated, because what e.g. the gimplification
sees is not TRY_CATCH_EXPR with CATCH_EXPR second operand, but
TRY_BLOCK with HANDLER second operand (note, certainly not wrapped in a
STATEMENT_LIST, one would need another catch (long) {} for it after it),
C++ FE specific trees.
And cp_gimplify_expr then on the fly turns the TRY_BLOCK into TRY_CATCH_EXPR
(in genericize_try_block) and HANDLER into CATCH_EXPR
(genericize_catch_block).
When gimplifying EH_SPEC_BLOCK, genericize_eh_spec_block ->
build_gimple_eh_filter_tree even creates TRY_CATCH_EXPR with
EH_FILTER_EXPR as its second operand (without intervening STATEMENT_LIST).

So, I believe the patch is correct but for C++ it might be hard to see it
actually trigger because most often one will see the C++ FE specific trees
of TRY_BLOCK (with HANDLER) and EH_SPEC_BLOCK instead.
So, I wonder why cxx_block_may_fallthru doesn't handle TRY_BLOCK and
EH_SPEC_BLOCK as well.  Given the genericization, I think
TRY_BLOCK should be handled similarly to TRY_CATCH_EXPR in tree.cc,
if second operand is HANDLER or STATEMENT_LIST starting with HANDLER,
check if any of the handler bodies can fall thru, dunno if TRY_BLOCK without
HANDLERs is possible, and for EH_SPEC_BLOCK see if the failure can fall
through.
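
Roughly what I have in mind for TRY_BLOCK, sketched on a toy model rather
than on real trees. ToyHandler, ToyTryBlock and toy_try_block_may_fallthru
are hypothetical names for illustration only; a single HANDLER is modelled
as a one-element list, and returning false for a TRY_BLOCK without handlers
is an arbitrary choice here, since it is unclear whether that case exists.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a HANDLER: only records whether its body
// may fall through.
struct ToyHandler {
  bool body_may_fallthru;
};

// Hypothetical stand-in for a TRY_BLOCK with its handlers.
struct ToyTryBlock {
  bool try_body_may_fallthru;
  std::vector<ToyHandler> handlers;  // a single HANDLER is a one-element list
};

bool toy_try_block_may_fallthru(const ToyTryBlock &t) {
  // If the try body can fall through, the whole construct can.
  if (t.try_body_may_fallthru)
    return true;
  // Otherwise it can fall through only if some handler body can.
  for (const ToyHandler &h : t.handlers)
    if (h.body_may_fallthru)
      return true;
  // No handlers, or none that can fall through (arbitrary for the
  // no-handler case, see above).
  return false;
}
```

For the foo example above, the try body calls bar () and may fall through,
so the toy predicate returns true without looking at the handler at all.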

Jakub


