[Bug target/111755] The built-in memset function in GCC inadvertently generates code like "vst1.8 {d8-d9}, [sp:64]", which assumes an 8-byte alignment on the stack pointer $sp, leading to alignment vi

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111755

--- Comment #2 from Andrew Pinski  ---
Also can you attach the testcase where this happens?

Please read https://gcc.gnu.org/bugs/ on what information we need.

[Bug target/111755] The built-in memset function in GCC inadvertently generates code like "vst1.8 {d8-d9}, [sp:64]", which assumes an 8-byte alignment on the stack pointer $sp, leading to alignment vi

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111755

Andrew Pinski  changed:

           What    |Removed                     |Added
   Last reconfirmed|                            |2023-10-10
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |WAITING

--- Comment #1 from Andrew Pinski  ---
> which assumes an 8-byte alignment on the stack pointer $sp, leading to 
> alignment violations.

Isn't that the ABI?

What target is this for?

[Bug c/111755] New: The built-in memset function in GCC inadvertently generates code like "vst1.8 {d8-d9}, [sp:64]", which assumes an 8-byte alignment on the stack pointer $sp, leading to alignment vi

2023-10-09 Thread kuzume at axell dot co.jp via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111755

Bug ID: 111755
   Summary: The built-in memset function in GCC inadvertently
generates code like "vst1.8 {d8-d9}, [sp:64]", which
assumes an 8-byte alignment on the stack pointer $sp,
leading to alignment violations
   Product: gcc
   Version: 9.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kuzume at axell dot co.jp
  Target Milestone: ---

The built-in memset function in GCC inadvertently generates code like 

"vst1.8 {d8-d9}, [sp:64]"

which assumes an 8-byte alignment on the stack pointer $sp, leading to
alignment violations.

The issue can be worked around temporarily with the -fno-builtin-memset
option, which inhibits use of the built-in function. However, the stack
pointer $sp is only 4-byte aligned during C function calls, so this might be a
bug in GCC's built-in function handling.

The problem can also be resolved by generating assembly without an alignment
specifier, such as "vst1.8 {d8-d9}, [sp]", although this is not the optimal
solution for performance from an alignment perspective.
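
The report does not include a testcase, so the following is only a minimal
sketch of the kind of code involved, not the reporter's source. It assumes that
a fixed-size memset of a local buffer is the sort of call a compiler may expand
inline, and it shows how an address can be checked against the 8-byte (64-bit)
alignment that the ":64" qualifier asserts. The names is_aligned and
sum_cleared_buffer are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if P is a multiple of ALIGN bytes.  */
int is_aligned (const void *p, size_t align)
{
  assert (align != 0);
  return ((uintptr_t) p % align) == 0;
}

/* Hypothetical example: zero a 16-byte local buffer with memset.
   A compiler is free to expand a small fixed-size memset like this
   inline instead of calling the library function.  Returns the sum
   of the cleared bytes, which is always 0.  */
unsigned sum_cleared_buffer (void)
{
  unsigned char buf[16];
  unsigned sum = 0;
  size_t i;

  memset (buf, 0, sizeof buf);
  for (i = 0; i < sizeof buf; i++)
    sum += buf[i];
  return sum;
}
```

An address that is 4-byte aligned but not 8-byte aligned, such as 68, fails
the check for 8 and passes it for 4, which is the situation the report
describes for $sp.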

[Bug rtl-optimization/111754] New: [14 Regression] ICE: in decompose, at rtl.h:2313 at -O

2023-10-09 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754

Bug ID: 111754
   Summary: [14 Regression] ICE: in decompose, at rtl.h:2313 at -O
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 56088
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56088&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O testcase.c
during RTL pass: expand
testcase.c: In function 'foo':
testcase.c:14:10: internal compiler error: in decompose, at rtl.h:2313
   14 |   return bar ((F){9}, (F){});
  |  ^~~
0x7ea1ba wi::int_traits >::decompose(long*,
unsigned int, std::pair const&)
/repo/gcc-trunk/gcc/rtl.h:2313
0x7ea1ba wide_int_ref_storage::wide_int_ref_storage
>(std::pair const&)
/repo/gcc-trunk/gcc/wide-int.h:1030
0x7ea1ba generic_wide_int
>::generic_wide_int >(std::pair const&)
/repo/gcc-trunk/gcc/wide-int.h:788
0x7ea1ba poly_int<1u, generic_wide_int >
>::poly_int >(poly_int_full,
std::pair const&)
/repo/gcc-trunk/gcc/poly-int.h:453
0x7ea1ba poly_int<1u, generic_wide_int >
>::poly_int >(std::pair const&)
/repo/gcc-trunk/gcc/poly-int.h:439
0x7ea1ba wi::to_poly_wide(rtx_def const*, machine_mode)
/repo/gcc-trunk/gcc/rtl.h:2382
0x7ea1ba rtx_vector_builder::step(rtx_def*, rtx_def*) const
/repo/gcc-trunk/gcc/rtx-vector-builder.h:122
0x143d95b vector_builder::elt(unsigned int) const
/repo/gcc-trunk/gcc/vector-builder.h:254
0x143d841 rtx_vector_builder::build()
/repo/gcc-trunk/gcc/rtx-vector-builder.cc:73
0x107c7a1 const_vector_from_tree
/repo/gcc-trunk/gcc/expr.cc:13494
0x10856ce expand_expr_real_1(tree_node*, rtx_def*, machine_mode,
expand_modifier, rtx_def**, bool)
/repo/gcc-trunk/gcc/expr.cc:11066
0xf50792 expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
/repo/gcc-trunk/gcc/expr.h:310
0xf50792 expand_return
/repo/gcc-trunk/gcc/cfgexpand.cc:3809
0xf50792 expand_gimple_stmt_1
/repo/gcc-trunk/gcc/cfgexpand.cc:3918
0xf50792 expand_gimple_stmt
/repo/gcc-trunk/gcc/cfgexpand.cc:4044
0xf51106 expand_gimple_basic_block
/repo/gcc-trunk/gcc/cfgexpand.cc:6100
0xf5378e execute
/repo/gcc-trunk/gcc/cfgexpand.cc:6835
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-4521-20231009151152-g08d0f840dc7-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-4521-20231009151152-g08d0f840dc7-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231009 (experimental) (GCC)

[PATCH] x86: set spincount 1 for x86 hybrid platform [PR109812]

2023-10-09 Thread Jun Zhang
From: "Zhang, Jun" 

Testing shows that a spin count of 1 performs better on hybrid platforms.

With '-march=native -Ofast -funroll-loops -flto', the results are as
follows:

spec2017 speed   RPL      ADL
657.xz_s         0.00%    0.50%
603.bwaves_s     10.90%   26.20%
607.cactuBSSN_s  5.50%    72.50%
619.lbm_s        2.40%    2.50%
621.wrf_s        -7.70%   2.40%
627.cam4_s       0.50%    0.70%
628.pop2_s       48.20%   153.00%
638.imagick_s    -0.10%   0.20%
644.nab_s        2.30%    1.40%
649.fotonik3d_s  8.00%    13.80%
654.roms_s       1.20%    1.10%
Geomean-int      0.00%    0.50%
Geomean-fp       6.30%    21.10%
Geomean-all      5.70%    19.10%

omp2012          RPL      ADL
350.md           -1.81%   -1.75%
351.bwaves       7.72%    12.50%
352.nab          14.63%   19.71%
357.bt331        -0.20%   1.77%
358.botsalgn     0.00%    0.00%
359.botsspar     0.00%    0.65%
360.ilbdc        0.00%    0.25%
362.fma3d        2.66%    -0.51%
363.swim         10.44%   0.00%
367.imagick      0.00%    0.12%
370.mgrid331     2.49%    25.56%
371.applu331     1.06%    4.22%
372.smithwa      0.74%    3.34%
376.kdtree       10.67%   16.03%
GEOMEAN          3.34%    5.53%

include/ChangeLog:

* omphook.h: define RUNOMPHOOK macro.

libgomp/ChangeLog:

* env.c (initialize_env): add RUNOMPHOOK macro.
* config/linux/x86/omphook.h: define RUNOMPHOOK macro.
---
 include/omphook.h  |  1 +
 libgomp/config/linux/x86/omphook.h | 19 +++
 libgomp/env.c  |  3 +++
 3 files changed, 23 insertions(+)
 create mode 100644 include/omphook.h
 create mode 100644 libgomp/config/linux/x86/omphook.h

diff --git a/include/omphook.h b/include/omphook.h
new file mode 100644
index 000..2ebe3ad57e6
--- /dev/null
+++ b/include/omphook.h
@@ -0,0 +1 @@
+#define RUNOMPHOOK()
diff --git a/libgomp/config/linux/x86/omphook.h 
b/libgomp/config/linux/x86/omphook.h
new file mode 100644
index 000..aefb311cc07
--- /dev/null
+++ b/libgomp/config/linux/x86/omphook.h
@@ -0,0 +1,19 @@
+#ifdef __x86_64__
+#include "cpuid.h"
+
+/* only for x86 hybrid platform */
+#define RUNOMPHOOK()  \
+  do \
+{ \
+  unsigned int eax, ebx, ecx, edx; \
+  if ((getenv ("GOMP_SPINCOUNT") == NULL) && (wait_policy < 0) \
+ && __get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx) \
+ && ((edx >> 15) & 1)) \
+   gomp_spin_count_var = 1LL; \
+  if (gomp_throttled_spin_count_var > gomp_spin_count_var) \
+   gomp_throttled_spin_count_var = gomp_spin_count_var; \
+} \
+  while (0)
+#else
+# include "../../../../include/omphook.h"
+#endif
diff --git a/libgomp/env.c b/libgomp/env.c
index a21adb3fd4b..1f13a148694 100644
--- a/libgomp/env.c
+++ b/libgomp/env.c
@@ -61,6 +61,7 @@
 
 #include "secure_getenv.h"
 #include "environ.h"
+#include "omphook.h"
 
 /* Default values of ICVs according to the OpenMP standard,
except for default-device-var.  */
@@ -2496,5 +2497,7 @@ initialize_env (void)
   goacc_runtime_initialize ();
 
   goacc_profiling_initialize ();
+
+  RUNOMPHOOK ();
 }
 #endif /* LIBGOMP_OFFLOADED_ONLY */
-- 
2.31.1
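
The hook in the patch above tests bit 15 of EDX from CPUID leaf 7, subleaf 0,
which is the hybrid flag. The following standalone sketch separates that test
from the policy decision so that each can be exercised on its own. It is an
illustration under assumptions, not libgomp code: edx_reports_hybrid,
cpu_is_hybrid and choose_spin_count are hypothetical names, and the user_set
parameter stands in for the patch's checks on GOMP_SPINCOUNT and wait_policy.

```c
#include <assert.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Same bit test as in the patch: EDX bit 15 of CPUID leaf 7, subleaf 0.  */
int edx_reports_hybrid (unsigned int edx)
{
  return (edx >> 15) & 1;
}

/* Hypothetical helper: 1 if the CPU reports the hybrid flag, 0 if it
   does not, if leaf 7 is unavailable, or if this is not an x86 build.  */
int cpu_is_hybrid (void)
{
#if defined(__x86_64__) || defined(__i386__)
  unsigned int eax, ebx, ecx, edx;
  if (__get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx))
    return edx_reports_hybrid (edx);
#endif
  return 0;
}

/* Hypothetical stand-in for the decision the hook makes: use a spin
   count of 1 only when the user expressed no preference (USER_SET is 0)
   and the CPU is hybrid; otherwise keep DEFAULT_COUNT.  */
long long choose_spin_count (long long default_count, int user_set, int hybrid)
{
  assert (default_count >= 0);
  return (!user_set && hybrid) ? 1LL : default_count;
}
```

A caller would combine the two, for example
choose_spin_count (current_count, user_set, cpu_is_hybrid ()).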



[Bug rtl-optimization/111753] New: [14 Regression] ICE: in extract_constrain_insn, at recog.cc:2692 insn does not satisfy its constraints: {*movsf_internal} with -O2 -mavx512bw -fno-tree-ter

2023-10-09 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111753

Bug ID: 111753
   Summary: [14 Regression] ICE: in extract_constrain_insn, at
recog.cc:2692 insn does not satisfy its constraints:
{*movsf_internal} with -O2 -mavx512bw -fno-tree-ter
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 56087
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56087&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -O2 -mavx512bw -fno-tree-ter testcase.c 
testcase.c: In function 'foo':
testcase.c:35:9: warning: division by zero [-Wdiv-by-zero]
   35 |   f32_0 /= 0;
  | ^~
testcase.c:38:13: warning: division by zero [-Wdiv-by-zero]
   38 |   v256f32_0 /= 0;
  | ^~
testcase.c:66:1: error: insn does not satisfy its constraints:
   66 | }
  | ^
(insn 713 222 227 2 (set (reg:SF 52 xmm16 [473])
(const_double:SF 0.0 [0x0.0p+0])) "testcase.c":45:13 160
{*movsf_internal}
 (expr_list:REG_EQUAL (const_double:SF 0.0 [0x0.0p+0])
(nil)))
during RTL pass: cprop_hardreg
testcase.c:66:1: internal compiler error: in extract_constrain_insn, at
recog.cc:2692
0x7e2e60 _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/repo/gcc-trunk/gcc/rtl-error.cc:108
0x7e2ee7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/repo/gcc-trunk/gcc/rtl-error.cc:118
0x7d359b extract_constrain_insn(rtx_insn*)
/repo/gcc-trunk/gcc/recog.cc:2692
0x13fdd85 copyprop_hardreg_forward_1
/repo/gcc-trunk/gcc/regcprop.cc:836
0x13ff199 execute
/repo/gcc-trunk/gcc/regcprop.cc:1423
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-4521-20231009151152-g08d0f840dc7-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu
--with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-4521-20231009151152-g08d0f840dc7-checking-yes-rtl-df-extra-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231009 (experimental) (GCC)

[PATCH v2 3/4] RISC-V: Extend riscv_subset_list, preparatory for target attribute support

2023-10-09 Thread Kito Cheng
riscv_subset_list previously accepted only a full arch string, but we need to
parse a single extension when supporting the target attribute. We may also set
a riscv_subset_list directly rather than re-parsing the ISA string.
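
As an illustration of what parsing a single extension involves, the sketch
below splits a name such as "zba1p0" into the extension name and an optional
<major>p<minor> version by scanning backwards from the end of the string. It is
a simplified, standalone approximation and not the code in this patch;
split_ext_version is a hypothetical name, and it does none of the error
reporting that riscv_subset_list performs.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return the length of the extension name in EXT
   and store the trailing version, if any, in *MAJOR and *MINOR.
   Accepted suffixes are "<digits>" and "<digits>p<digits>"; both
   outputs are 0 when there is no suffix.  */
size_t split_ext_version (const char *ext, unsigned *major, unsigned *minor)
{
  assert (ext != NULL && major != NULL && minor != NULL);
  size_t n = strlen (ext);
  size_t i = n;

  *major = 0;
  *minor = 0;

  /* Skip the trailing run of digits, never consuming the first char.  */
  while (i > 1 && isdigit ((unsigned char) ext[i - 1]))
    i--;

  if (i < n && i > 2 && ext[i - 1] == 'p'
      && isdigit ((unsigned char) ext[i - 2]))
    {
      /* Form <major>p<minor>: the digits just skipped are the minor.  */
      size_t k = i - 1; /* Index of the 'p' separator.  */
      *minor = (unsigned) strtoul (ext + i, NULL, 10);
      while (k > 1 && isdigit ((unsigned char) ext[k - 1]))
        k--;
      *major = (unsigned) strtoul (ext + k, NULL, 10);
      return k;
    }

  /* Only a major version, or no version at all.  */
  if (i < n)
    *major = (unsigned) strtoul (ext + i, NULL, 10);
  return i;
}
```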

gcc/ChangeLog:

* config/riscv/riscv-subset.h (riscv_subset_list::parse_single_std_ext):
New.
(riscv_subset_list::parse_single_multiletter_ext): Ditto.
(riscv_subset_list::clone): Ditto.
(riscv_subset_list::parse_single_ext): Ditto.
(riscv_subset_list::set_loc): Ditto.
(riscv_set_arch_by_subset_list): Ditto.
* common/config/riscv/riscv-common.cc
(riscv_subset_list::parse_single_std_ext): New.
(riscv_subset_list::parse_single_multiletter_ext): Ditto.
(riscv_subset_list::clone): Ditto.
(riscv_subset_list::parse_single_ext): Ditto.
(riscv_subset_list::set_loc): Ditto.
(riscv_set_arch_by_subset_list): Ditto.
---
 gcc/common/config/riscv/riscv-common.cc | 203 
 gcc/config/riscv/riscv-subset.h |  11 ++
 2 files changed, 214 insertions(+)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 9a0a68fe5db..25630d5923e 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1036,6 +1036,41 @@ riscv_subset_list::parse_std_ext (const char *p)
   return p;
 }
 
+/* Parsing function for one standard extensions.
+
+   Return Value:
+ Points to the end of extensions.
+
+   Arguments:
+ `p`: Current parsing position.  */
+
+const char *
+riscv_subset_list::parse_single_std_ext (const char *p)
+{
+  if (*p == 'x' || *p == 's' || *p == 'z')
+{
+  error_at (m_loc,
+   "%<-march=%s%>: Not single-letter extension. "
+   "%<%c%>",
+   m_arch, *p);
+  return nullptr;
+}
+
+  unsigned major_version = 0;
+  unsigned minor_version = 0;
+  bool explicit_version_p = false;
+  char subset[2] = {0, 0};
+
+  subset[0] = *p;
+
+  p++;
+
+  p = parsing_subset_version (subset, p, &major_version, &minor_version,
+ /* std_ext_p= */ true, &explicit_version_p);
+
+  add (subset, major_version, minor_version, explicit_version_p, false);
+  return p;
+}
 
 /* Check any implied extensions for EXT.  */
 void
@@ -1138,6 +1173,102 @@ riscv_subset_list::handle_combine_ext ()
 }
 }
 
+/* Parsing function for multi-letter extensions.
+
+   Return Value:
+ Points to the end of extensions.
+
+   Arguments:
+ `p`: Current parsing position.
+ `ext_type`: What kind of extensions, 's', 'z' or 'x'.
+ `ext_type_str`: Full name for kind of extension.  */
+
+
+const char *
+riscv_subset_list::parse_single_multiletter_ext (const char *p,
+const char *ext_type,
+const char *ext_type_str)
+{
+  unsigned major_version = 0;
+  unsigned minor_version = 0;
+  size_t ext_type_len = strlen (ext_type);
+
+  if (strncmp (p, ext_type, ext_type_len) != 0)
+return NULL;
+
+  char *subset = xstrdup (p);
+  const char *end_of_version;
+  bool explicit_version_p = false;
+  char *ext;
+  char backup;
+  size_t len = strlen (p);
+  size_t end_of_version_pos, i;
+  bool found_any_number = false;
+  bool found_minor_version = false;
+
+  end_of_version_pos = len;
+  /* Find the begin of version string.  */
+  for (i = len -1; i > 0; --i)
+{
+  if (ISDIGIT (subset[i]))
+   {
+ found_any_number = true;
+ continue;
+   }
+  /* Might be version seperator, but need to check one more char,
+we only allow p, so we could stop parsing if found
+any more `p`.  */
+  if (subset[i] == 'p' &&
+ !found_minor_version &&
+ found_any_number && ISDIGIT (subset[i-1]))
+   {
+ found_minor_version = true;
+ continue;
+   }
+
+  end_of_version_pos = i + 1;
+  break;
+}
+
+  backup = subset[end_of_version_pos];
+  subset[end_of_version_pos] = '\0';
+  ext = xstrdup (subset);
+  subset[end_of_version_pos] = backup;
+
+  end_of_version
+= parsing_subset_version (ext, subset + end_of_version_pos, &major_version,
+ &minor_version, /* std_ext_p= */ false,
+ &explicit_version_p);
+  free (ext);
+
+  if (end_of_version == NULL)
+return NULL;
+
+  subset[end_of_version_pos] = '\0';
+
+  if (strlen (subset) == 1)
+{
+  error_at (m_loc, "%<-march=%s%>: name of %s must be more than 1 letter",
+   m_arch, ext_type_str);
+  free (subset);
+  return NULL;
+}
+
+  add (subset, major_version, minor_version, explicit_version_p, false);
+  p += end_of_version - subset;
+  free (subset);
+
+  if (*p != '\0' && *p != '_')
+{
+  error_at (m_loc, "%<-march=%s%>: %s must separate with %<_%>",
+   m_arch, ext_type_str);
+  return NULL;
+}
+
+  return p;
+
+}
+
 /* Parsing function for 

[PATCH v2 4/4] RISC-V: Implement target attribute

2023-10-09 Thread Kito Cheng
The target attribute proposed in [1] allows the user to specify local
settings on a per-function basis.

The syntax of the target attribute is `__attribute__((target("<ATTR-STRING>")))`.

The syntax of `<ATTR-STRING>` is described below:
```
ATTR-STRING := ATTR-STRING ';' ATTR
             | ATTR

ATTR        := ARCH-ATTR
             | CPU-ATTR
             | TUNE-ATTR

ARCH-ATTR   := 'arch=' EXTENSIONS-OR-FULLARCH

EXTENSIONS-OR-FULLARCH := <EXTENSIONS>
                        | <FULLARCHSTR>

EXTENSIONS             := <EXTENSION> ',' <EXTENSIONS>
                        | <EXTENSION>

FULLARCHSTR            := <full-arch-string>

EXTENSION              := <OP> <EXTENSION-NAME> <VERSION>

OP                     := '+'

VERSION                := [0-9]+ 'p' [0-9]+
                        | [1-9][0-9]*
                        |

EXTENSION-NAME         := Naming rule is defined in RISC-V ISA manual

CPU-ATTR    := 'cpu=' <valid-cpu-name>
TUNE-ATTR   := 'tune=' <valid-tune-name>
```
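
To illustrate the top level of the grammar above, here is a small standalone
sketch that splits an attribute string on ';' and classifies each ATTR by its
'arch=', 'cpu=' or 'tune=' prefix. It is not the parser added by this patch:
classify_attr and count_attrs are hypothetical names, and the sketch checks
only the prefix and that a value follows it, not the extension, CPU or tune
names themselves.

```c
#include <assert.h>
#include <string.h>

enum attr_kind { ATTR_ARCH, ATTR_CPU, ATTR_TUNE, ATTR_INVALID };

/* Hypothetical helper: classify one ATTR of LEN characters by prefix.
   An ATTR with nothing after the '=' is treated as invalid.  */
enum attr_kind classify_attr (const char *attr, size_t len)
{
  assert (attr != NULL);
  if (len > 5 && strncmp (attr, "arch=", 5) == 0)
    return ATTR_ARCH;
  if (len > 4 && strncmp (attr, "cpu=", 4) == 0)
    return ATTR_CPU;
  if (len > 5 && strncmp (attr, "tune=", 5) == 0)
    return ATTR_TUNE;
  return ATTR_INVALID;
}

/* Hypothetical helper: count the ATTRs of KIND in the ';'-separated
   string STR.  Returns -1 if any ATTR is not recognized.  */
int count_attrs (const char *str, enum attr_kind kind)
{
  int n = 0;
  for (;;)
    {
      size_t len = strcspn (str, ";");
      enum attr_kind k = classify_attr (str, len);
      if (k == ATTR_INVALID)
        return -1;
      if (k == kind)
        n++;
      if (str[len] == '\0')
        return n;
      str += len + 1; /* Step over the ';' separator.  */
    }
}
```

Under the grammar, a declaration using the attribute would look like
`__attribute__((target("arch=+zba"))) int f (void);`, where zba is only an
example extension name.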

[1] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/35

gcc/ChangeLog:

* config.gcc (riscv): Add riscv-target-attr.o.
* config/riscv/riscv-opts.h (TARGET_MIN_VLEN_OPTS): New.
* config/riscv/riscv-protos.h (riscv_declare_function_size) New.
(riscv_option_valid_attribute_p): New.
(riscv_override_options_internal): New.
(struct riscv_tune_info): New.
(riscv_parse_tune): New.
* config/riscv/riscv-target-attr.cc
(class riscv_target_attr_parser): New.
(struct riscv_attribute_info): New.
(riscv_attributes): New.
(riscv_target_attr_parser::parse_arch):
(riscv_target_attr_parser::handle_arch):
(riscv_target_attr_parser::handle_cpu):
(riscv_target_attr_parser::handle_tune):
(riscv_target_attr_parser::update_settings):
(riscv_process_one_target_attr):
(num_occurences_in_str):
(riscv_process_target_attr):
(riscv_option_valid_attribute_p):
* config/riscv/riscv.cc: Include target-globals.h and
riscv-subset.h.
(struct riscv_tune_info): Move to riscv-protos.h.
(get_tune_str):
(riscv_parse_tune):
(riscv_declare_function_size):
(riscv_option_override): Build target_option_default_node and
target_option_current_node.
(riscv_save_restore_target_globals):
(riscv_option_restore):
(riscv_previous_fndecl):
(riscv_set_current_function): Apply the target attribute.
(TARGET_OPTION_RESTORE): Define.
(TARGET_OPTION_VALID_ATTRIBUTE_P): Ditto.
* config/riscv/riscv.h (SWITCHABLE_TARGET): Define to 1.
(ASM_DECLARE_FUNCTION_SIZE) Define.
* config/riscv/riscv.opt (mtune=): Add Save attribute.
(mcpu=): Ditto.
(mcmodel=): Ditto.
* config/riscv/t-riscv: Add build rule for riscv-target-attr.o
* doc/extend.texi: Add doc for target attribute.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/target-attr-01.c: New.
* gcc.target/riscv/target-attr-02.c: Ditto.
* gcc.target/riscv/target-attr-03.c: Ditto.
* gcc.target/riscv/target-attr-04.c: Ditto.
* gcc.target/riscv/target-attr-05.c: Ditto.
* gcc.target/riscv/target-attr-06.c: Ditto.
* gcc.target/riscv/target-attr-07.c: Ditto.
* gcc.target/riscv/target-attr-bad-01.c: Ditto.
* gcc.target/riscv/target-attr-bad-02.c: Ditto.
* gcc.target/riscv/target-attr-bad-03.c: Ditto.
* gcc.target/riscv/target-attr-bad-04.c: Ditto.
* gcc.target/riscv/target-attr-bad-05.c: Ditto.
* gcc.target/riscv/target-attr-bad-06.c: Ditto.
* gcc.target/riscv/target-attr-bad-07.c: Ditto.
* gcc.target/riscv/target-attr-warning-01.c: Ditto.
* gcc.target/riscv/target-attr-warning-02.c: Ditto.
* gcc.target/riscv/target-attr-warning-03.c: Ditto.
---
 gcc/config.gcc|   2 +-
 gcc/config/riscv/riscv-opts.h |   6 +
 gcc/config/riscv/riscv-protos.h   |  21 +
 gcc/config/riscv/riscv-target-attr.cc | 395 ++
 gcc/config/riscv/riscv.cc | 192 +++--
 gcc/config/riscv/riscv.h  |   6 +
 gcc/config/riscv/riscv.opt|   6 +-
 gcc/config/riscv/t-riscv  |   5 +
 gcc/doc/extend.texi   |  58 +++
 .../gcc.target/riscv/target-attr-01.c |  31 ++
 .../gcc.target/riscv/target-attr-02.c |  31 ++
 .../gcc.target/riscv/target-attr-03.c |  26 ++
 .../gcc.target/riscv/target-attr-04.c |  28 ++
 .../gcc.target/riscv/target-attr-05.c |  27 ++
 .../gcc.target/riscv/target-attr-06.c |  27 ++
 .../gcc.target/riscv/target-attr-07.c |  25 ++
 .../gcc.target/riscv/target-attr-bad-01.c |  13 +
 .../gcc.target/riscv/target-attr-bad-02.c |  13 +
 .../gcc.target/riscv/target-attr-bad-03.c |  13 +
 .../gcc.target/riscv/target-attr-bad-04.c |  13 +
 .../gcc.target/riscv/target-attr-bad-05.c |  13 +
 

[PATCH v2 2/4] RISC-V: Refactor riscv_option_override and riscv_convert_vector_bits. [NFC]

2023-10-09 Thread Kito Cheng
Allow those functions to be applied to a local gcc_options rather than the
global options.

Preparatory for the target attribute; this change is separated out for easier
review since it is NFC.

gcc/ChangeLog:

* config/riscv/riscv.cc (riscv_convert_vector_bits): Get setting
from argument rather than get setting from global setting.
(riscv_override_options_internal): New, split from
riscv_override_options, also take a gcc_options argument.
(riscv_option_override): Split most of it out to
riscv_override_options_internal.
---
 gcc/config/riscv/riscv.cc | 93 ++-
 1 file changed, 52 insertions(+), 41 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index b7acf836d02..c7d0d300345 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -8066,10 +8066,11 @@ riscv_init_machine_status (void)
 /* Return the VLEN value associated with -march.
TODO: So far we only support length-agnostic value. */
 static poly_uint16
-riscv_convert_vector_bits (void)
+riscv_convert_vector_bits (struct gcc_options *opts)
 {
   int chunk_num;
-  if (TARGET_MIN_VLEN > 32)
+  int min_vlen = TARGET_MIN_VLEN_OPTS (opts);
+  if (min_vlen > 32)
 {
   /* When targetting minimum VLEN > 32, we should use 64-bit chunk size.
 Otherwise we can not include SEW = 64bits.
@@ -8087,7 +8088,7 @@ riscv_convert_vector_bits (void)
   - TARGET_MIN_VLEN = 2048bit: [256,256]
   - TARGET_MIN_VLEN = 4096bit: [512,512]
   FIXME: We currently DON'T support TARGET_MIN_VLEN > 4096bit.  */
-  chunk_num = TARGET_MIN_VLEN / 64;
+  chunk_num = min_vlen / 64;
 }
   else
 {
@@ -8106,10 +8107,10 @@ riscv_convert_vector_bits (void)
  to set RVV mode size. The RVV machine modes size are run-time constant if
  TARGET_VECTOR is enabled. The RVV machine modes size remains default
  compile-time constant if TARGET_VECTOR is disabled.  */
-  if (TARGET_VECTOR)
+  if (TARGET_VECTOR_OPTS_P (opts))
 {
-  if (riscv_autovec_preference == RVV_FIXED_VLMAX)
-   return (int) TARGET_MIN_VLEN / (riscv_bytes_per_vector_chunk * 8);
+  if (opts->x_riscv_autovec_preference == RVV_FIXED_VLMAX)
+   return (int) min_vlen / (riscv_bytes_per_vector_chunk * 8);
   else
return poly_uint16 (chunk_num, chunk_num);
 }
@@ -8117,40 +8118,33 @@ riscv_convert_vector_bits (void)
 return 1;
 }
 
-/* Implement TARGET_OPTION_OVERRIDE.  */
-
-static void
-riscv_option_override (void)
+/* 'Unpack' up the internal tuning structs and update the options
+in OPTS.  The caller must have set up selected_tune and selected_arch
+as all the other target-specific codegen decisions are
+derived from them.  */
+void
+riscv_override_options_internal (struct gcc_options *opts)
 {
   const struct riscv_tune_info *cpu;
 
-#ifdef SUBTARGET_OVERRIDE_OPTIONS
-  SUBTARGET_OVERRIDE_OPTIONS;
-#endif
-
-  flag_pcc_struct_return = 0;
-
-  if (flag_pic)
-g_switch_value = 0;
-
   /* The presence of the M extension implies that division instructions
  are present, so include them unless explicitly disabled.  */
-  if (TARGET_MUL && (target_flags_explicit & MASK_DIV) == 0)
-target_flags |= MASK_DIV;
-  else if (!TARGET_MUL && TARGET_DIV)
+  if (TARGET_MUL_OPTS_P (opts) && (target_flags_explicit & MASK_DIV) == 0)
+opts->x_target_flags |= MASK_DIV;
+  else if (!TARGET_MUL_OPTS_P (opts) && TARGET_DIV_OPTS_P (opts))
 error ("%<-mdiv%> requires %<-march%> to subsume the %<M%> extension");
 
   /* Likewise floating-point division and square root.  */
   if ((TARGET_HARD_FLOAT || TARGET_ZFINX) && (target_flags_explicit & 
MASK_FDIV) == 0)
-target_flags |= MASK_FDIV;
+opts->x_target_flags |= MASK_FDIV;
 
   /* Handle -mtune, use -mcpu if -mtune is not given, and use default -mtune
  if both -mtune and -mcpu are not given.  */
-  cpu = riscv_parse_tune (riscv_tune_string ? riscv_tune_string :
- (riscv_cpu_string ? riscv_cpu_string :
+  cpu = riscv_parse_tune (opts->x_riscv_tune_string ? 
opts->x_riscv_tune_string :
+ (opts->x_riscv_cpu_string ? opts->x_riscv_cpu_string :
   RISCV_TUNE_STRING_DEFAULT));
   riscv_microarchitecture = cpu->microarchitecture;
-  tune_param = optimize_size ? &optimize_size_tune_info : cpu->tune_param;
+  tune_param = opts->x_optimize_size ? &optimize_size_tune_info : 
cpu->tune_param;
 
   /* Use -mtune's setting for slow_unaligned_access, even when optimizing
  for size.  For architectures that trap and emulate unaligned accesses,
@@ -8166,15 +8160,38 @@ riscv_option_override (void)
 
   if ((target_flags_explicit & MASK_STRICT_ALIGN) == 0
   && cpu->tune_param->slow_unaligned_access)
-target_flags |= MASK_STRICT_ALIGN;
+opts->x_target_flags |= MASK_STRICT_ALIGN;
 
   /* If the user hasn't specified a branch cost, use the processor's
  default.  */
-  if (riscv_branch_cost == 0)
-

[PATCH v2 1/4] options: Define TARGET_<NAME>_P and TARGET_<NAME>_OPTS_P macro for Mask and InverseMask

2023-10-09 Thread Kito Cheng
We have the TARGET_<NAME>_P macro to test a Mask or InverseMask against a
user-specified target_variable; however, we may want to test against a specific
gcc_options variable rather than a target_variable.

For example, RISC-V defines many Masks with TargetVariable, which are not easy
to use because we need to know which Mask is associated with which
TargetVariable, so taking a gcc_options variable is a better interface for such
use cases.
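
As a rough sketch of the three macro forms for one option, the block below
writes out by hand what the generated definitions would look like for a
made-up mask named FOO attached to target_flags. The struct gcc_options shown
here is a one-field stand-in, and MASK_FOO, TARGET_FOO and foo_enabled_in are
hypothetical names used for illustration only.

```c
#include <assert.h>

/* Hypothetical stand-ins for the real GCC definitions.  */
struct gcc_options { int x_target_flags; };
int target_flags;
#define MASK_FOO (1 << 0)

/* Tests the global target_flags.  */
#define TARGET_FOO ((target_flags & MASK_FOO) != 0)
/* Tests a flags value supplied by the caller.  */
#define TARGET_FOO_P(target_flags) (((target_flags) & MASK_FOO) != 0)
/* Tests the flags stored in a gcc_options structure.  */
#define TARGET_FOO_OPTS_P(opts) (((opts->x_target_flags) & MASK_FOO) != 0)

/* Hypothetical caller: query the option in a local gcc_options
   without needing to know which variable holds the mask.  */
int foo_enabled_in (const struct gcc_options *opts)
{
  assert (opts != 0);
  return TARGET_FOO_OPTS_P (opts);
}
```

With the _OPTS_P form the caller passes only the options structure, so the
choice of underlying variable stays inside the macro.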

gcc/ChangeLog:

* doc/options.texi (Mask): Document TARGET_<NAME>_P and
TARGET_<NAME>_OPTS_P.
(InverseMask): Ditto.
* opth-gen.awk (Mask): Generate TARGET_<NAME>_P and
TARGET_<NAME>_OPTS_P macro.
(InverseMask): Ditto.
---
 gcc/doc/options.texi | 23 ---
 gcc/opth-gen.awk | 13 -
 2 files changed, 28 insertions(+), 8 deletions(-)

diff --git a/gcc/doc/options.texi b/gcc/doc/options.texi
index 1f7c15b8eb4..715f0a1479c 100644
--- a/gcc/doc/options.texi
+++ b/gcc/doc/options.texi
@@ -404,18 +404,27 @@ You may also specify @code{Var} to select a variable 
other than
 The options-processing script will automatically allocate a unique bit
 for the option.  If the option is attached to @samp{target_flags} or @code{Var}
 which is defined by @code{TargetVariable},  the script will set the macro
-@code{MASK_@var{name}} to the appropriate bitmask.  It will also declare a 
-@code{TARGET_@var{name}} macro that has the value 1 when the option is active
-and 0 otherwise.  If you use @code{Var} to attach the option to a different 
variable
-which is not defined by @code{TargetVariable}, the bitmask macro with be
-called @code{OPTION_MASK_@var{name}}.
+@code{MASK_@var{name}} to the appropriate bitmask.  It will also declare a
+@code{TARGET_@var{name}}, @code{TARGET_@var{name}_P} and
+@code{TARGET_@var{name}_OPTS_P}: @code{TARGET_@var{name}} macros that has the
+value 1 when the option is active and 0 otherwise, @code{TARGET_@var{name}_P} 
is
+similar to @code{TARGET_@var{name}} but take an argument as @samp{target_flags}
+or @code{TargetVariable}, and @code{TARGET_@var{name}_OPTS_P} also similar to
+@code{TARGET_@var{name}} but take an argument as @code{gcc_options}.
+If you use @code{Var} to attach the option to a different variable which is not
+defined by @code{TargetVariable}, the bitmask macro with be called
+@code{OPTION_MASK_@var{name}}.
 
 @item InverseMask(@var{othername})
 @itemx InverseMask(@var{othername}, @var{thisname})
 The option is the inverse of another option that has the
 @code{Mask(@var{othername})} property.  If @var{thisname} is given,
-the options-processing script will declare a @code{TARGET_@var{thisname}}
-macro that is 1 when the option is active and 0 otherwise.
+the options-processing script will declare @code{TARGET_@var{thisname}},
+@code{TARGET_@var{name}_P} and @code{TARGET_@var{name}_OPTS_P} macros:
+@code{TARGET_@var{thisname}} is 1 when the option is active and 0 otherwise,
+@code{TARGET_@var{name}_P} is similar to @code{TARGET_@var{name}} but take an
+argument as @samp{target_flags}, and and @code{TARGET_@var{name}_OPTS_P} also
+similar to @code{TARGET_@var{name}} but take an argument as @code{gcc_options}.
 
 @item Enum(@var{name})
 The option's argument is a string from the set of strings associated
diff --git a/gcc/opth-gen.awk b/gcc/opth-gen.awk
index c4398be2f3a..26551575d55 100644
--- a/gcc/opth-gen.awk
+++ b/gcc/opth-gen.awk
@@ -439,6 +439,10 @@ for (i = 0; i < n_target_vars; i++)
{
print "#define TARGET_" other_masks[i "," j] \
  " ((" target_vars[i] " & MASK_" other_masks[i "," j] ") 
!= 0)"
+   print "#define TARGET_" other_masks[i "," j] "_P(" 
target_vars[i] ")" \
+ " (((" target_vars[i] ") & MASK_" other_masks[i "," j] ") 
!= 0)"
+   print "#define TARGET_" other_masks[i "," j] "_OPTS_P(opts)" \
+ " (((opts->x_" target_vars[i] ") & MASK_" other_masks[i 
"," j] ") != 0)"
}
 }
 print ""
@@ -469,15 +473,22 @@ for (i = 0; i < n_opts; i++) {
  " ((" vname " & " mask original_name ") != 0)"
print "#define TARGET_" name "_P(" vname ")" \
  " (((" vname ") & " mask original_name ") != 0)"
+   print "#define TARGET_" name "_OPTS_P(opts)" \
+ " (((opts->x_" vname ") & " mask original_name ") != 0)"
print "#define TARGET_EXPLICIT_" name "_P(opts)" \
  " ((opts->x_" vname "_explicit & " mask original_name ") 
!= 0)"
print "#define SET_TARGET_" name "(opts) opts->x_" vname " |= " 
mask original_name
}
 }
 for (i = 0; i < n_extra_masks; i++) {
-   if (extra_mask_macros[extra_masks[i]] == 0)
+   if (extra_mask_macros[extra_masks[i]] == 0) {
print "#define TARGET_" extra_masks[i] \
  " ((target_flags & MASK_" extra_masks[i] ") != 0)"
+   print "#define TARGET_" extra_masks[i] "_P(target_flags)" \
+   

[PATCH v2 0/4] RISC-V target attribute

2023-10-09 Thread Kito Cheng
This patch set implements the target attribute for the RISC-V target. Similar
to other targets such as x86 or ARM, it lets the user apply local settings per
function without changing the global settings.

We support arch, tune and cpu first and will support other target attributes
later. This version DOES NOT include multi-version function support yet; that
is future work, probably for GCC 15.

The full proposal is in the RISC-V C-API document[1], which has been discussed
with the RISC-V LLVM community, so we have consistent syntax and semantics.

[1] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/35

v2 changelog:
- Resolve awk multi-dimensional issue.
- Tweak code format
- Tweak testcases




[Bug middle-end/111752] New: -Wfree-nonheap-object (vec.h:347:10: warning: 'free' called on unallocated object 'dest_bbs') during bootstrap with LTO

2023-10-09 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111752

Bug ID: 111752
   Summary: -Wfree-nonheap-object (vec.h:347:10: warning: 'free'
called on unallocated object 'dest_bbs') during
bootstrap with LTO
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sjames at gcc dot gnu.org
  Target Milestone: ---

I'm not sure this was always there - I think I would've noticed if it was a
long-standing thing. I get this -Wfree-nonheap-object warning during bootstrap.

I can reproduce it with:
```
./configure --disable-analyzer --disable-bootstrap --disable-cet
--disable-default-pie --disable-default-ssp --disable-fixincludes
--disable-gcov --disable-libada --disable-libatomic --disable-libgomp
--disable-libitm --disable-libquadmath --disable-libsanitizer --disable-libssp
--disable-libstdcxx-pch --disable-libvtv --disable-lto --disable-multilib
--disable-nls --disable-objc-gc --disable-systemtap --disable-werror
--enable-languages=c,c++ --prefix=/tmp/bisect --without-isl --without-zstd
--with-system-zlib --enable-bootstrap --enable-lto
make BUILD_CONFIG=bootstrap-lto -j$(nproc)
```

I can only reproduce when building with bootstrap-lto.

On trunk at r14-4523-gfb124f2a23e92b, I get this:
```
/home/sam/git/gcc/host-x86_64-pc-linux-gnu/prev-gcc/xg++
-B/home/sam/git/gcc/host-x86_64-pc-linux-gnu/prev-gcc/
-B/tmp/bisect/x86_64-pc-linux-gnu/bin/ -nostdinc++
-B/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-B/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs 
-I/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu
-I/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc++-v3/include
-I/home/sam/git/gcc/libstdc++-v3/libsupc++
-L/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs
-L/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs
-no-pie   -g -O2 -fno-checking -flto=jobserver
-frandom-seed=1 -DIN_GCC -fno-exceptions -fno-rtti
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wmissing-format-attribute -Wconditionally-supported
-Woverloaded-virtual -pedantic
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common 
-DHAVE_CONFIG_H -no-pie -static-libstdc++ -static-libgcc
  -o cc1plus \
  cp/cp-lang.o c-family/stub-objc.o cp/call.o cp/class.o cp/constexpr.o
cp/constraint.o cp/coroutines.o cp/cp-gimplify.o cp/cp-objcp-common.o
cp/cp-ubsan.o cp/cvt.o cp/contracts.o cp/cxx-pretty-print.o
cp/decl.o cp/decl2.o cp/dump.o cp/error.o cp/except.o cp/expr.o cp/friend.o
cp/init.o cp/lambda.o cp/lex.o cp/logic.o cp/mangle.o cp/mapper-client.o
cp/mapper-resolver.o cp/method.o cp/module.o cp/name-lookup.o
 cp/optimize.o cp/parser.o cp/pt.o cp/ptree.o cp/rtti.o cp/search.o
cp/semantics.o cp/tree.o cp/typeck.o cp/typeck2.o cp/vtable-class-hierarchy.o
attribs.o c-family/c-common.o c-family/c-cppbuiltin.o
c-family/c-dump.o c-family/c-format.o c-family/c-gimplify.o c-family/c-indentation.o
c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o
c-family/c-ppoutput.o c-family/c-pragma.o
c-family/c-pretty-print.o c-family/c-semantics.o c-family/c-ada-spec.o c-family/c-ubsan.o
c-family/known-headers.o c-family/c-attribs.o c-family/c-warn.o
c-family/c-spellcheck.o i386-c.o glibc-c.o cc1plus-checksum.o libbackend.a
 main.o libcommon-target.a libcommon.a ../libcpp/libcpp.a
../libdecnumber/libdecnumber.a ../libcody/libcody.a  \
libcommon.a ../libcpp/libcpp.a   ../libbacktrace/.libs/libbacktrace.a
../libiberty/libiberty.a ../libdecnumber/libdecnumber.a   -lmpc -lmpfr -lgmp
-rdynamic  -lz
../.././gcc/spellcheck.cc: In function '_Z17get_edit_distancePKciS0_i.part.0':
../.././gcc/spellcheck.cc:71:61: warning: argument 1 value
'18446744073709551615' exceeds maximum object size 9223372036854775807
[-Walloc-size-larger-than=]
   71 |   edit_distance_t *v_two_ago = new edit_distance_t[len_s + 1];
  | ^
/home/sam/git/gcc/libstdc++-v3/libsupc++/new:133:26: note: in a call to
allocation function 'operator new []' declared here
  133 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW
(std::bad_alloc)
  |  ^
../.././gcc/spellcheck.cc:72:61: warning: argument 1 value
'18446744073709551615' exceeds maximum object size 9223372036854775807
[-Walloc-size-larger-than=]
   72 |   edit_distance_t *v_one_ago = new edit_distance_t[len_s + 1];
  | ^
/home/sam/git/gcc/libstdc++-v3/libsupc++/new:133:26: note: in a call to
allocation function 'operator new []' declared here
  133 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW
(std::bad_alloc)
  |  

[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization

2023-10-09 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

--- Comment #7 from JuzheZhong  ---
(In reply to Andrew Pinski from comment #6)
> (In reply to JuzheZhong from comment #5)
> > (In reply to Andrew Pinski from comment #4)
> > > The issue for aarch64 with SVE is that MASK_LOAD is not optimized:
> > > 
> > >   ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
> > >   ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
> > >   vect__1.7_9 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1,
> > > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
> > > 0,
> > > ... });
> > >   vect__2.10_35 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, 
> > > -1,
> > > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
> > > 0,
> > > ... });
> > 
> > I don't think ARM SVE has issues ...
> 
> It does as I mentioned if you use -fno-vect-cost-model, you get the above
> issue which should be optimized really to a constant vector ...

After investigation:

I found it failed to recognize its CONST_VECTOR value in FRE

/* Visit a load from a reference operator RHS, part of STMT, value number it,
   and return true if the value number of the LHS has changed as a result.  */

static bool
visit_reference_op_load (tree lhs, tree op, gimple *stmt)
{
  bool changed = false;
  tree result;
  vn_reference_t res;

  tree vuse = gimple_vuse (stmt);
  tree last_vuse = vuse;
  result = vn_reference_lookup (op, vuse, default_vn_walk_kind, &res, true,
                                &last_vuse);

  /* We handle type-punning through unions by value-numbering based
 on offset and size of the access.  Be prepared to handle a
 type-mismatch here via creating a VIEW_CONVERT_EXPR.  */
  if (result
      && !useless_type_conversion_p (TREE_TYPE (result), TREE_TYPE (op)))
    {
      /* Avoid the type punning in case the result mode has padding where
         the op we lookup has not.  */
      if (maybe_lt (GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (result))),
                    GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op)))))
        result = NULL_TREE;




The result is BLKmode, op is V16QImode

Then we reach:
  /* Avoid the type punning in case the result mode has padding where
     the op we lookup has not.  */
  if (maybe_lt (GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (result))),
                GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op)))))
    result = NULL_TREE;

If I delete this code, RVV can optimize it.

Do you have any suggestion ?

This is my observation:

Breakpoint 6, visit_reference_op_load (lhs=0x768364c8, op=0x76874410,
stmt=0x76872640) at ../../../../gcc/gcc/tree-ssa-sccvn.cc:5740
5740  result = vn_reference_lookup (op, vuse, default_vn_walk_kind, &res,
true, &last_vuse);
(gdb) c
Continuing.

Breakpoint 6, visit_reference_op_load (lhs=0x768364c8, op=0x76874410,
stmt=0x76872640) at ../../../../gcc/gcc/tree-ssa-sccvn.cc:5740
5740  result = vn_reference_lookup (op, vuse, default_vn_walk_kind, &res,
true, &last_vuse);
(gdb) n
5746  && !useless_type_conversion_p (TREE_TYPE (result), TREE_TYPE
(op)))
(gdb) p debug (result)
"\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"
$9 = void
(gdb) p op->typed.type->type_common.mode
$10 = E_V16QImode

[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

--- Comment #6 from Andrew Pinski  ---
(In reply to JuzheZhong from comment #5)
> (In reply to Andrew Pinski from comment #4)
> > The issue for aarch64 with SVE is that MASK_LOAD is not optimized:
> > 
> >   ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
> >   ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
> >   vect__1.7_9 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1,
> > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > ... });
> >   vect__2.10_35 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1,
> > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > ... });
> 
> I don't think ARM SVE has issues ...

It does as I mentioned if you use -fno-vect-cost-model, you get the above issue
which should be optimized really to a constant vector ...

[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization

2023-10-09 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

--- Comment #5 from JuzheZhong  ---
(In reply to Andrew Pinski from comment #4)
> The issue for aarch64 with SVE is that MASK_LOAD is not optimized:
> 
>   ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
>   ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
>   vect__1.7_9 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1,
> -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> ... });
>   vect__2.10_35 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1,
> -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> ... });

I don't think ARM SVE has issues ...
If we can choose a fixed-length vector mode to vectorize it, it will be well
optimized.

I think this is a RISC-V target-dependent issue.

[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

--- Comment #4 from Andrew Pinski  ---
The issue for aarch64 with SVE is that MASK_LOAD is not optimized:

  ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
  ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-";
  vect__1.7_9 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... });
  vect__2.10_35 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1,
-1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ...
});

[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

--- Comment #3 from Andrew Pinski  ---
If you add `-fno-vect-cost-model` to aarch64 compiling, then it uses SVE and
does not optimize to just `return 0`.

[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2023-10-10
 Ever confirmed|0   |1

--- Comment #2 from Andrew Pinski  ---
Confirmed.

[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

--- Comment #1 from Andrew Pinski  ---
AARCH64 did vectorize the code, just using non-SVE, which then allowed it to be
optimized too.

RE: [PATCH] RISC-V: Add available vector size for RVV

2023-10-09 Thread Li, Pan2
Committed, thanks Kito.

Pan

-----Original Message-----
From: Kito Cheng  
Sent: Tuesday, October 10, 2023 11:20 AM
To: Juzhe-Zhong 
Cc: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; jeffreya...@gmail.com; 
rdapp@gmail.com
Subject: Re: [PATCH] RISC-V: Add available vector size for RVV

LGTM

On Mon, Oct 9, 2023 at 4:23 PM Juzhe-Zhong  wrote:
>
> For RVV, we have VLS modes enabled according to TARGET_MIN_VLEN
> from M1 to M8.
>
> For example, when TARGET_MIN_VLEN = 128 bits, we enable
> 128/256/512/1024 bits VLS modes.
>
> This patch fixes the following FAILs:
> FAIL: gcc.dg/vect/bb-slp-subgroups-2.c -flto -ffat-lto-objects  
> scan-tree-dump-times slp2 "optimized: basic block" 2
> FAIL: gcc.dg/vect/bb-slp-subgroups-2.c scan-tree-dump-times slp2 "optimized: 
> basic block" 2
>
> gcc/testsuite/ChangeLog:
>
> * lib/target-supports.exp: Add 256/512/1024
>
> ---
>  gcc/testsuite/lib/target-supports.exp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index af52c38433d..dc366d35a0a 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -8881,7 +8881,7 @@ proc available_vector_sizes { } {
> lappend result 4096 2048 1024 512 256 128 64 32 16 8 4 2
>  } elseif { [istarget riscv*-*-*] } {
> if { [check_effective_target_riscv_v] } {
> -   lappend result 0 32 64 128
> +   lappend result 0 32 64 128 256 512 1024
> }
> lappend result 128
>  } else {
> --
> 2.36.3
>


[PATCH 2/2] c++: note other candidates when diagnosing deletedness

2023-10-09 Thread Patrick Palka
With the previous improvements in place, we can easily extend our
deletedness diagnostic to note the other candidates:

  deleted16.C: In function ‘int main()’:
  deleted16.C:10:4: error: use of deleted function ‘void f(int)’
 10 |   f(0);
|   ~^~~
  deleted16.C:5:6: note: declared here
  5 | void f(int) = delete;
|  ^
  deleted16.C:5:6: note: candidate: ‘void f(int)’ (deleted)
  deleted16.C:6:6: note: candidate: ‘void f(...)’
  6 | void f(...);
|  ^
  deleted16.C:7:6: note: candidate: ‘void f(int, int)’
  7 | void f(int, int);
|  ^
  deleted16.C:7:6: note:   candidate expects 2 arguments, 1 provided

These notes are disabled when a deleted special member function is
selected primarily because it introduces a lot of new "cannot bind
reference" errors in the testsuite when noting non-viable candidates,
e.g. in cpp0x/initlist-opt1.C we would need to expect an error at
A(A&&).

gcc/cp/ChangeLog:

* call.cc (build_over_call): Call print_z_candidates when
diagnosing deletedness.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/deleted16.C: New test.
---
 gcc/cp/call.cc | 10 +-
 gcc/testsuite/g++.dg/cpp0x/deleted16.C | 11 +++
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/deleted16.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 648d383ca4e..55fd71636b1 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -9873,7 +9873,15 @@ build_over_call (struct z_candidate *cand, int flags, 
tsubst_flags_t complain)
   if (DECL_DELETED_FN (fn))
 {
   if (complain & tf_error)
-   mark_used (fn);
+   {
+ mark_used (fn);
+ /* Note the other candidates we considered unless we selected a
+special member function since the mismatch reasons for other
+candidates are usually uninteresting, e.g. rvalue vs lvalue
+reference binding .  */
+ if (cand->next && !special_memfn_p (fn))
+   print_z_candidates (input_location, cand, /*only_viable_p=*/false);
+   }
   return error_mark_node;
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/deleted16.C 
b/gcc/testsuite/g++.dg/cpp0x/deleted16.C
new file mode 100644
index 00000000000..9fd2fbb1465
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/deleted16.C
@@ -0,0 +1,11 @@
+// Verify we note other candidates when a deleted function is
+// selected by overload resolution.
+// { dg-do compile { target c++11 } }
+
+void f(int) = delete; // { dg-message "declared here|candidate" }
+void f(...); // { dg-message "candidate" }
+void f(int, int); // { dg-message "candidate" }
+
+int main() {
+  f(0); // { dg-error "deleted" }
+}
-- 
2.42.0.325.g3a06386e31



[PATCH 1/2] c++: sort candidates according to viability

2023-10-09 Thread Patrick Palka
This patch:

  * changes splice_viable to move the non-viable candidates to the end
of the list instead of removing them outright
  * makes tourney move the best candidate to the front of the candidate
list
  * adjusts print_z_candidates to preserve our behavior of printing only
viable candidates when diagnosing ambiguity
  * adds a parameter to print_z_candidates to control this default behavior
(the follow-up patch will want to print all candidates when diagnosing
deletedness)

Thus after this patch we have access to the entire candidate list through
the best viable candidate.

This change also happens to fix diagnostics for the below testcase where
we currently neglect to note the third candidate, since the presence of
the two unordered non-strictly viable candidates causes splice_viable to
prematurely get rid of the non-viable third candidate.
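
The reordering described above amounts to a stable three-way partition of a
singly linked list. The following is only a minimal sketch of that idea, not
the code from the patch: `candidate` and `sort_by_viability` are hypothetical
stand-ins, and the encoding of `viable` (1 strictly viable, -1 viable but not
strictly, 0 not viable) is an assumption made for illustration.

```cpp
// Hypothetical stand-in for a candidate node; not GCC's z_candidate.
struct candidate {
  int id;
  int viable;       // assumed encoding: 1 strict, -1 non-strict, 0 unviable
  candidate *next;
};

// Stable three-way partition: strictly viable candidates first, then
// non-strictly viable ones, then unviable ones.  Each bucket keeps a
// head pointer and a pointer to its current tail slot.
candidate *sort_by_viability(candidate *cands, bool strict_p,
                             bool *any_viable_p) {
  candidate *strict = nullptr, **strict_tail = &strict;
  candidate *loose = nullptr, **loose_tail = &loose;
  candidate *unviable = nullptr, **unviable_tail = &unviable;

  for (candidate *c = cands; c; c = c->next) {
    // Pick the tail slot of the bucket this candidate belongs to.
    candidate **&tail = (c->viable == 1)    ? strict_tail
                        : (c->viable == -1) ? loose_tail
                                            : unviable_tail;
    *tail = c;         // append to the bucket
    tail = &c->next;   // c->next itself is only rewritten by a later append
  }

  // Decide viability before splicing, since splicing rewrites the heads.
  *any_viable_p = strict_p ? (strict != nullptr)
                           : (strict != nullptr || loose != nullptr);

  // Splice the buckets together: strict -> loose -> unviable.
  *unviable_tail = nullptr;
  *loose_tail = unviable;
  *strict_tail = loose;
  return strict;
}
```

Because nothing is dropped, a caller holding the returned head can still walk
to the non-viable candidates, which is what lets a later diagnostic print
them.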

gcc/cp/ChangeLog:

* call.cc: Include "tristate.h".
(splice_viable): Sort the candidate list according to viability.
Don't remove non-viable candidates from the list.
(print_z_candidates): Add defaulted only_viable_p parameter.
By default only print non-viable candidates if there is no
viable candidate.
(tourney): Make 'candidates' parameter a reference.  Ignore
non-viable candidates.  Move the true champ to the front
of the candidates list, and update 'candidates' to point to
the front.

gcc/testsuite/ChangeLog:

* g++.dg/overload/error5.C: New test.
---
 gcc/cp/call.cc | 161 +++--
 gcc/testsuite/g++.dg/overload/error5.C |  11 ++
 2 files changed, 111 insertions(+), 61 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/overload/error5.C

diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
index 15079ddf6dc..648d383ca4e 100644
--- a/gcc/cp/call.cc
+++ b/gcc/cp/call.cc
@@ -43,6 +43,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "decl.h"
 #include "gcc-rich-location.h"
+#include "tristate.h"
 
 /* The various kinds of conversion.  */
 
@@ -160,7 +161,7 @@ static struct obstack conversion_obstack;
 static bool conversion_obstack_initialized;
 struct rejection_reason;
 
-static struct z_candidate * tourney (struct z_candidate *, tsubst_flags_t);
+static struct z_candidate * tourney (struct z_candidate *&, tsubst_flags_t);
 static int equal_functions (tree, tree);
 static int joust (struct z_candidate *, struct z_candidate *, bool,
  tsubst_flags_t);
@@ -176,7 +177,8 @@ static void op_error (const op_location_t &, enum 
tree_code, enum tree_code,
 static struct z_candidate *build_user_type_conversion_1 (tree, tree, int,
 tsubst_flags_t);
 static void print_z_candidate (location_t, const char *, struct z_candidate *);
-static void print_z_candidates (location_t, struct z_candidate *);
+static void print_z_candidates (location_t, struct z_candidate *,
+   tristate = tristate::unknown ());
 static tree build_this (tree);
 static struct z_candidate *splice_viable (struct z_candidate *, bool, bool *);
 static bool any_strictly_viable (struct z_candidate *);
@@ -3718,68 +3720,60 @@ add_template_conv_candidate (struct z_candidate 
**candidates, tree tmpl,
 }
 
 /* The CANDS are the set of candidates that were considered for
-   overload resolution.  Return the set of viable candidates, or CANDS
-   if none are viable.  If any of the candidates were viable, set
+   overload resolution.  Sort CANDS so that the strictly viable
+   candidates appear first, followed by non-strictly viable candidates,
+   followed by unviable candidates.  Returns the first candidate
+   in this sorted list.  If any of the candidates were viable, set
*ANY_VIABLE_P to true.  STRICT_P is true if a candidate should be
-   considered viable only if it is strictly viable.  */
+   considered viable only if it is strictly viable when setting
+   *ANY_VIABLE_P.  */
 
 static struct z_candidate*
 splice_viable (struct z_candidate *cands,
   bool strict_p,
   bool *any_viable_p)
 {
-  struct z_candidate *viable;
-  struct z_candidate **last_viable;
-  struct z_candidate **cand;
-  bool found_strictly_viable = false;
+  z_candidate *strictly_viable = nullptr;
+  z_candidate **strictly_viable_tail = &strictly_viable;
+
+  z_candidate *non_strictly_viable = nullptr;
+  z_candidate **non_strictly_viable_tail = &non_strictly_viable;
+
+  z_candidate *unviable = nullptr;
+  z_candidate **unviable_tail = &unviable;
 
   /* Be strict inside templates, since build_over_call won't actually
  do the conversions to get pedwarns.  */
   if (processing_template_decl)
 strict_p = true;
 
-  viable = NULL;
-  last_viable = &viable;
-  *any_viable_p = false;
-
-  cand = &cands;
-  while (*cand)
+  for (z_candidate *cand = cands; cand; cand = cand->next)
 {
-  struct z_candidate *c = *cand;
   if (!strict_p
- && (c->viable 

Re: [PATCH] RISC-V: Add available vector size for RVV

2023-10-09 Thread Kito Cheng
LGTM

On Mon, Oct 9, 2023 at 4:23 PM Juzhe-Zhong  wrote:
>
> For RVV, we have VLS modes enabled according to TARGET_MIN_VLEN
> from M1 to M8.
>
> For example, when TARGET_MIN_VLEN = 128 bits, we enable
> 128/256/512/1024 bits VLS modes.
>
> This patch fixes the following FAILs:
> FAIL: gcc.dg/vect/bb-slp-subgroups-2.c -flto -ffat-lto-objects  
> scan-tree-dump-times slp2 "optimized: basic block" 2
> FAIL: gcc.dg/vect/bb-slp-subgroups-2.c scan-tree-dump-times slp2 "optimized: 
> basic block" 2
>
> gcc/testsuite/ChangeLog:
>
> * lib/target-supports.exp: Add 256/512/1024
>
> ---
>  gcc/testsuite/lib/target-supports.exp | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/testsuite/lib/target-supports.exp 
> b/gcc/testsuite/lib/target-supports.exp
> index af52c38433d..dc366d35a0a 100644
> --- a/gcc/testsuite/lib/target-supports.exp
> +++ b/gcc/testsuite/lib/target-supports.exp
> @@ -8881,7 +8881,7 @@ proc available_vector_sizes { } {
> lappend result 4096 2048 1024 512 256 128 64 32 16 8 4 2
>  } elseif { [istarget riscv*-*-*] } {
> if { [check_effective_target_riscv_v] } {
> -   lappend result 0 32 64 128
> +   lappend result 0 32 64 128 256 512 1024
> }
> lappend result 128
>  } else {
> --
> 2.36.3
>


[Bug c/111751] New: RISC-V: RVV unexpected vectorization

2023-10-09 Thread juzhe.zhong at rivai dot ai via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751

Bug ID: 111751
   Summary: RISC-V: RVV unexpected vectorization
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: juzhe.zhong at rivai dot ai
  Target Milestone: ---

#include 

#define N 16

int main ()
{
  int i;
  char ia[N];
  char ic[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};
  char ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45};

  /* Not vectorizable, multiplication */
  for (i = 0; i < N; i++)
{
  ia[i] = ib[i] * ic[i];
}

  /* check results:  */
  for (i = 0; i < N; i++)
{
  if (ia[i] != (char) (ib[i] * ic[i]))
abort ();
}

  return 0;
}

RVV GCC ASM:

main:
lui a5,%hi(.LANCHOR0)
addi    a5,a5,%lo(.LANCHOR0)
addi    sp,sp,-48
ld  a4,0(a5)
ld  a5,8(a5)
sd  a5,8(sp)
sd  a5,24(sp)
sd  ra,40(sp)
addi    a5,sp,16
sd  a4,0(sp)
sd  a4,16(sp)
vsetivli    zero,16,e8,m1,ta,ma
vle8.v  v1,0(a5)
vle8.v  v2,0(sp)
vmul.vv v1,v1,v2
vmv.x.s a5,v1
andi    a5,a5,0xff
bne a5,zero,.L2
vslidedown.vi   v2,v1,1
li  a4,9
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,2
li  a4,36
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,3
li  a4,81
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,4
li  a4,144
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,5
li  a4,225
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,6
li  a4,68
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,7
li  a4,185
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,8
li  a4,64
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,9
li  a4,217
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,10
li  a4,132
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,11
li  a4,65
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,12
li  a4,16
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,13
li  a4,241
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v2,v1,14
li  a4,228
vmv.x.s a5,v2
andi    a5,a5,0xff
bne a5,a4,.L2
vslidedown.vi   v1,v1,15
li  a4,233
vmv.x.s a5,v1
andi    a5,a5,0xff
bne a5,a4,.L2
ld  ra,40(sp)
li  a0,0
addi    sp,sp,48
jr  ra
.L2:
call    abort


ARM SVE GCC:

main:
mov w0, 0
ret
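
As a sanity check on the RVV listing above: the immediates loaded before each
`bne` (0 via `zero`, then 9, 36, 81, ..., 233) should be the low 8 bits of
`ib[i] * ic[i]`, since each element is masked with `andi a5,a5,0xff` before
the compare. The sketch below recomputes them; `expected_bytes` is a
hypothetical helper name, and it assumes `ib[i] == ic[i] == 3 * i` as in the
testcase.

```cpp
#include <array>
#include <cstdint>

// Low 8 bits of ib[i] * ic[i] for the testcase arrays, where both arrays
// hold 0, 3, 6, ..., 45 (that is, 3 * i for i in 0..15).
std::array<std::uint8_t, 16> expected_bytes() {
  std::array<std::uint8_t, 16> out{};
  for (int i = 0; i < 16; ++i) {
    int v = 3 * i;
    out[i] = static_cast<std::uint8_t>(v * v);  // keep only bits 0..7
  }
  return out;
}
```

For example, element 6 is 18 * 18 = 324, and 324 mod 256 = 68, which matches
the `li a4,68` in the listing. All sixteen values are compile-time constants,
which is why the whole check can in principle fold to `return 0`.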

Re: [PATCH] RISC-V: Make xtheadcondmov-indirect tests robust against instruction reordering

2023-10-09 Thread Kito Cheng
I guess you may also want to clean up those bodies for "check-function-bodies"?

On Mon, Oct 9, 2023 at 3:47 PM Christoph Muellner
 wrote:
>
> From: Christoph Müllner 
>
> Fixes: c1bc7513b1d7 ("RISC-V: const: hide mvconst splitter from IRA")
>
> A recent change broke the xtheadcondmov-indirect tests, because the order of
> emitted instructions changed. Since the test is too strict when testing for
> a fixed instruction order, let's change the tests to simply count instructions,
> as is done for similar tests.
>
> Reported-by: Patrick O'Neill 
> Signed-off-by: Christoph Müllner 
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/xtheadcondmov-indirect.c: Make robust against
> instruction reordering.
>
> Signed-off-by: Christoph Müllner 
> ---
>  .../gcc.target/riscv/xtheadcondmov-indirect.c | 11 ---
>  1 file changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c 
> b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
> index c3253ba5239..eba1b86137b 100644
> --- a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
> +++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
> @@ -1,8 +1,7 @@
>  /* { dg-do compile } */
> -/* { dg-options "-march=rv32gc_xtheadcondmov -fno-sched-pressure" { target { 
> rv32 } } } */
> -/* { dg-options "-march=rv64gc_xtheadcondmov -fno-sched-pressure" { target { 
> rv64 } } } */
> +/* { dg-options "-march=rv32gc_xtheadcondmov" { target { rv32 } } } */
> +/* { dg-options "-march=rv64gc_xtheadcondmov" { target { rv64 } } } */
>  /* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
> -/* { dg-final { check-function-bodies "**" "" } } */
>
>  /*
>  ** ConEmv_imm_imm_reg:
> @@ -116,3 +115,9 @@ int ConNmv_reg_reg_reg(int x, int y, int z, int n)
>  return z;
>return n;
>  }
> +
> +/* { dg-final { scan-assembler-times "addi\t" 5 } } */
> +/* { dg-final { scan-assembler-times "li\t" 4 } } */
> +/* { dg-final { scan-assembler-times "sub\t" 4 } } */
> +/* { dg-final { scan-assembler-times "th.mveqz\t" 4 } } */
> +/* { dg-final { scan-assembler-times "th.mvnez\t" 4 } } */
> --
> 2.41.0
>


[PATCH] RISC-V Regression: Fix FAIL of predcom-2.c

2023-10-09 Thread Juzhe-Zhong
Like GCN, add -fno-tree-vectorize.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/predcom-2.c: Add riscv.

---
 gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c
index f19edd4cd74..681ff7c696b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
 /* { dg-options "-O2 -funroll-loops --param max-unroll-times=8 
-fpredictive-commoning -fdump-tree-pcom-details-blocks -fno-tree-pre" } */
-/* { dg-additional-options "-fno-tree-vectorize" { target amdgcn-*-* } } */
+/* { dg-additional-options "-fno-tree-vectorize" { target amdgcn-*-* 
riscv*-*-* } } */
 
 void abort (void);
 
-- 
2.36.3



[PATCH] use get_range_query to replace get_global_range_query

2023-10-09 Thread Jiufu Guo
Hi,

With "get_global_range_query", SSA_NAME_RANGE_INFO can be queried.
"get_range_query" can return more context-aware range info.
Looking at the implementation of "get_range_query", it returns the
global range query if there is no local function info.

So, if not querying for SSA_NAME, it would be ok to use get_range_query
to replace get_global_range_query.
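
The fallback behaviour described above can be modelled in a few lines. This
is a toy sketch under stated assumptions, not GCC's code: the `range_query`
and `function` structs here are simplified stand-ins, and the field name
`x_range_query` is assumed for illustration.

```cpp
// Simplified stand-ins; GCC's real range_query and function are far richer.
struct range_query {
  const char *name;
};

struct function {
  range_query *x_range_query;  // assumed: non-null while a local query is active
};

static range_query global_ranges{"global"};

range_query *get_global_range_query() { return &global_ranges; }

// Return the function-specific query when one is active, otherwise fall
// back to the global query.  A null FUN takes the fallback path as well.
range_query *get_range_query(const function *fun) {
  return (fun && fun->x_range_query) ? fun->x_range_query
                                     : get_global_range_query();
}
```

With a helper of this shape, a caller that writes
`if (cfun) get_range_query (cfun) else get_global_range_query ()` gets the
same result from a single `get_range_query (cfun)` call, which is the
simplification several hunks of the patch make.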

As the patch at
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630389.html
shows, using get_range_query can handle more cases.

This patch replaces get_global_range_query with get_range_query in
most possible code pieces (but does not draft new test cases).

Passes bootstrap & regtest on ppc64{,le} and x86_64.
Is this ok for trunk?


BR,
Jeff (Jiufu Guo)

gcc/ChangeLog:

* builtins.cc (expand_builtin_strnlen): Replace get_global_range_query
by get_range_query.
* fold-const.cc (expr_not_equal_to): Likewise.
* gimple-fold.cc (size_must_be_zero_p): Likewise.
* gimple-range-fold.cc (fur_source::fur_source): Likewise.
* gimple-ssa-warn-access.cc (check_nul_terminated_array): Likewise.
* tree-dfa.cc (get_ref_base_and_extent): Likewise.
* tree-ssa-loop-split.cc (split_at_bb_p): Likewise.
* tree-ssa-loop-unswitch.cc (evaluate_control_stmt_using_entry_checks):
Likewise.

---
 gcc/builtins.cc   | 2 +-
 gcc/fold-const.cc | 6 +-
 gcc/gimple-fold.cc| 6 ++
 gcc/gimple-range-fold.cc  | 4 +---
 gcc/gimple-ssa-warn-access.cc | 2 +-
 gcc/tree-dfa.cc   | 5 +
 gcc/tree-ssa-loop-split.cc| 2 +-
 gcc/tree-ssa-loop-unswitch.cc | 2 +-
 8 files changed, 9 insertions(+), 20 deletions(-)

diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index cb90bd03b3e..4e0a77ff8e0 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -3477,7 +3477,7 @@ expand_builtin_strnlen (tree exp, rtx target, 
machine_mode target_mode)
 
   wide_int min, max;
   value_range r;
-  get_global_range_query ()->range_of_expr (r, bound);
+  get_range_query (cfun)->range_of_expr (r, bound);
   if (r.varying_p () || r.undefined_p ())
 return NULL_RTX;
   min = r.lower_bound ();
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 4f8561509ff..15134b21b9f 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -11056,11 +11056,7 @@ expr_not_equal_to (tree t, const wide_int &w)
   if (!INTEGRAL_TYPE_P (TREE_TYPE (t)))
return false;
 
-  if (cfun)
-   get_range_query (cfun)->range_of_expr (vr, t);
-  else
-   get_global_range_query ()->range_of_expr (vr, t);
-
+  get_range_query (cfun)->range_of_expr (vr, t);
   if (!vr.undefined_p () && !vr.contains_p (w))
return true;
   /* If T has some known zero bits and W has any of those bits set,
diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index dc89975270c..853edd9e5d4 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -876,10 +876,8 @@ size_must_be_zero_p (tree size)
   wide_int zero = wi::zero (TYPE_PRECISION (type));
   value_range valid_range (type, zero, ssize_max);
   value_range vr;
-  if (cfun)
-get_range_query (cfun)->range_of_expr (vr, size);
-  else
-get_global_range_query ()->range_of_expr (vr, size);
+  get_range_query (cfun)->range_of_expr (vr, size);
+
   if (vr.undefined_p ())
 vr.set_varying (TREE_TYPE (size));
   vr.intersect (valid_range);
diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc
index d1945ccb554..6e9530c3d7f 100644
--- a/gcc/gimple-range-fold.cc
+++ b/gcc/gimple-range-fold.cc
@@ -50,10 +50,8 @@ fur_source::fur_source (range_query *q)
 {
   if (q)
 m_query = q;
-  else if (cfun)
-m_query = get_range_query (cfun);
   else
-m_query = get_global_range_query ();
+m_query = get_range_query (cfun);
   m_gori = NULL;
 }
 
diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index fcaff128d60..e439d1b9b68 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -332,7 +332,7 @@ check_nul_terminated_array (GimpleOrTree expr, tree src, 
tree bound)
 {
   Value_Range r (TREE_TYPE (bound));
 
-  get_global_range_query ()->range_of_expr (r, bound);
+  get_range_query (cfun)->range_of_expr (r, bound);
 
   if (r.undefined_p () || r.varying_p ())
return true;
diff --git a/gcc/tree-dfa.cc b/gcc/tree-dfa.cc
index af8e9243947..5355af2c869 100644
--- a/gcc/tree-dfa.cc
+++ b/gcc/tree-dfa.cc
@@ -531,10 +531,7 @@ get_ref_base_and_extent (tree exp, poly_int64 *poffset,
 
value_range vr;
range_query *query;
-   if (cfun)
- query = get_range_query (cfun);
-   else
- query = get_global_range_query ();
+   query = get_range_query (cfun);
 
if (TREE_CODE (index) == SSA_NAME
&& (low_bound = array_ref_low_bound (exp),
diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
index 64464802c1e..e85a1881526 100644
--- 

[PATCH] RISC-V Regression: Make match patterns more accurate

2023-10-09 Thread Juzhe-Zhong
This patch fixes the following 2 FAILs in the RVV regression run, since the 
check is not accurate.

It's inspired by Robin's previous patch:
https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac...@gmail.com/

gcc/testsuite/ChangeLog:

* gcc.dg/vect/no-scevccp-outer-7.c: Adjust regex pattern.
* gcc.dg/vect/no-scevccp-vect-iv-3.c: Ditto.

---
 gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c   | 2 +-
 gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c 
b/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c
index 543ee98b5a4..058d1d2db2d 100644
--- a/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c
+++ b/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c
@@ -77,4 +77,4 @@ int main (void)
 }
 
 /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { 
target vect_widen_mult_hi_to_si } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 
1 "vect" } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: 
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1 "vect" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c 
b/gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c
index 7049e4936b9..6f2b2210b11 100644
--- a/gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c
+++ b/gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c
@@ -30,4 +30,4 @@ unsigned int main1 ()
 }
 
 /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target 
vect_widen_sum_hi_to_si } } } */
-/* { dg-final { scan-tree-dump-times "vect_recog_widen_sum_pattern: detected" 
1 "vect" { target vect_widen_sum_hi_to_si } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_widen_sum_pattern: 
detected(?:(?!failed)(?!Re-trying).)*succeeded" 1 "vect" { target 
vect_widen_sum_hi_to_si } } } */
-- 
2.36.3
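
For readers unfamiliar with the lookahead construct used in the new dg-final
patterns above, here is a minimal sketch in Python. It assumes Python's `re`
behaves like the Tcl regex engine for this pattern, with `re.DOTALL` added so
that "." can cross newlines. The dump text is invented for illustration and is
not real vectorizer output.

```python
import re

# Old check: counts every "detected" line, including attempts that later fail.
OLD = re.compile(r"vect_recog_widen_mult_pattern: detected")

# New check: "detected" must reach "succeeded" without crossing the words
# "failed" or "Re-trying".  re.DOTALL is an assumption of this sketch.
NEW = re.compile(
    r"vect_recog_widen_mult_pattern: detected"
    r"(?:(?!failed)(?!Re-trying).)*succeeded",
    re.DOTALL,
)

# Hypothetical dump text: the first attempt fails, the retry succeeds.
DUMP = (
    "vect_recog_widen_mult_pattern: detected: first attempt\n"
    "analysis failed\n"
    "Re-trying analysis with another vector mode\n"
    "vect_recog_widen_mult_pattern: detected: second attempt\n"
    "analysis succeeded\n"
)

old_count = len(OLD.findall(DUMP))
new_count = len(NEW.findall(DUMP))
print(old_count, new_count)
```

In this sketch the old pattern counts both attempts, while the new one counts
only the attempt that actually reaches "succeeded".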



[Bug tree-optimization/111734] [14 Regression] wrong code with '-O3 -fno-inline-functions-called-once -fno-inline-small-functions -fno-omit-frame-pointer -fno-toplevel-reorder -fno-tree-fre'

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111734

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
Summary|wrong code with '-O3|[14 Regression] wrong code
   |-fno-inline-functions-calle |with '-O3
   |d-once  |-fno-inline-functions-calle
   |-fno-inline-small-functions |d-once
   |-fno-omit-frame-pointer |-fno-inline-small-functions
   |-fno-toplevel-reorder   |-fno-omit-frame-pointer
   |-fno-tree-fre'  |-fno-toplevel-reorder
   ||-fno-tree-fre'
   Last reconfirmed||2023-10-10
  Component|c   |tree-optimization
   Target Milestone|--- |14.0
 Status|UNCONFIRMED |NEW

--- Comment #3 from Andrew Pinski  ---
PRE does:

Processing block 0: BB2
Value numbering stmt = *m_1(D) = 
RHS  simplified to 
No store match
Value numbering store *m_1(D) to 
Setting value number of .MEM_3 to .MEM_

...

Starting insert iteration 1
Deleted redundant store *m_1(D) = 
Removing dead stmt *m_1(D) = 


Better reduced testcase:
```
struct a {};
struct {
  unsigned b;
  unsigned short c;
} d, f = {9, 1};
int e;
static void g(unsigned, __SIZE_TYPE__, int **m);
static void h() {
  int *i = 
  g(0, (__SIZE_TYPE__)i, );
  if (*i)
f = d;
}
void g(unsigned a, __SIZE_TYPE__ b, int **m) {
  *m = 
}
int main() {
  h();
  if (f.c != 1)
__builtin_abort();
}
```

[Bug target/111745] [14 Regression] ICE: in extract_insn, at recog.cc:2791 (unrecognizable insn) with -ffloat-store -mavx512fp16 -mavx512vl

2023-10-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111745

Hongtao.liu  changed:

   What|Removed |Added

 CC||crazylht at gmail dot com

--- Comment #1 from Hongtao.liu  ---
Mine, I'll take a look.

[PATCH V1] introduce light expander sra

2023-10-09 Thread Jiufu Guo
Hi,

There are a few PRs (meta-bug PR101926) on various targets.
The root causes of them are similar: aggregate parameters/
returns are passed in multiple registers, but they are first
stored from those registers to the stack, and the parameter
is then accessed through the stack slot.

A general idea to enhance this: accessing the aggregate
parameters/returns directly through registers.  This idea
would be a kind of SRA (using the scalar registers to
access the aggregate parameter/returns).

This experimental patch for light-expander-sra contains
the parts below:

a. Check if the parameters/returns are ok/profitable to
   scalarize, and set the scalar pseudos for the
   parameter/return.
  - This is done in "expand_function_start", after the
    incoming/outgoing hard registers are determined for the
    parameter(s)/return.
    The scalarized registers are recorded in DECL_RTL for
    the parameter/return in parallel form.
  - At the time when setting DECL_RTL, "scalarizable_aggregate"
    is called to check that the accesses are ok/profitable to
    scalarize.
    We can continue to enhance this function to support
    more cases.  For example:
    - 'reverse storage order'.
    - 'TImode/vector-mode from multi-regs'.
    - some cases on 'writing to parameter'/'overlap accesses'.

b. When expanding the accesses of the parameters/returns,
   according to the info of the access (e.g. bitpos, bitsize,
   mode), the scalar pseudos can be figured out to expand
   the access.  This may happen when expanding the accesses below:
  - The component access of a parameter: "_1 = arg.f1".
    Or whole parameter access: rhs of "_2 = arg"
  - The assignment to a return val:
    "D.xx = yy; or D.xx.f = zz" where D.xx occurs on the return
    stmt.
  - This is mainly done in expr.cc (expand_expr_real_1, and
    expand_assignment).  Function "extract_sub_member" is
    used to figure out the scalar rtxs (pseudos).
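
To make part b more concrete, here is a rough sketch of the lookup idea in
Python. This is not the code in the patch: the function name `find_scalar`,
the register names and the struct layout are all made up for illustration.
It only shows how an access described by (bitpos, bitsize) could be mapped to
one scalar of a parallel, and when that mapping is not possible.

```python
from typing import List, Optional, Tuple

# Each entry models one scalar pseudo of the "parallel":
# (name, bit offset in the aggregate, bit size).  Invented layout.
Parallel = List[Tuple[str, int, int]]


def find_scalar(parallel: Parallel, bitpos: int,
                bitsize: int) -> Optional[Tuple[str, int]]:
    """Return (pseudo name, bit offset inside that pseudo) for an access that
    is fully contained in one pseudo, or None if it straddles or misses them."""
    for name, start, size in parallel:
        if start <= bitpos and bitpos + bitsize <= start + size:
            return name, bitpos - start
    return None


# Assumed example: struct S { long f1; int f2; int f3; } passed in two
# 64-bit registers.
arg = [("reg0", 0, 64), ("reg1", 64, 64)]

print(find_scalar(arg, 0, 64))    # whole-register access to arg.f1
print(find_scalar(arg, 96, 32))   # arg.f3 lives in the upper half of reg1
print(find_scalar(arg, 32, 64))   # straddles both registers, not scalarizable here
```

An access that returns None in this sketch corresponds to the kind of case
that would have to fall back to the stack slot.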

Besides the above two parts, some work is done in the GIMPLE
tree:  collect SRA candidates for parameters/returns, and
collect the SRA access info.
This is mainly done at the beginning of the expander pass by
the class (named expand_sra) and its member functions.
Below are two major items of this part.
 - Collect light-expand-sra candidates.
   Each parameter is checked to see if it has a proper aggregate type.
   Collect the return val (VAR_P) on each return stmt if the
   function is returning via registers.
   This is implemented in expand_sra::collect_sra_candidates.

 - Build/collect/manage all the accesses on the candidates.
   The function "scan_function" is used to do this work; it
   goes through all basic blocks, and all interesting stmts
   (phi, return, assign, call, asm) are checked.
   If there is an interesting expression (e.g. COMPONENT_REF
   or PARM_DECL), then the required info for the access is
   recorded (e.g. pos, size, type, base).
   And if it is risky to do SRA, the candidates may be removed,
   e.g. when address-taken and accessed via memory:
   "foo(struct S arg) {bar ();}"

This patch also tries to share common code among light-expand-sra,
tree-sra, and ipa-sra.
We can continue refactoring to share similar functionality.

Compared with the previous version, this version avoids storing
the parameter to the stack if it is scalarized.
https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631177.html

This patch is tested on ppc64{,le} and x86_64.
Is this ok for trunk?

BR,
Jeff (Jiufu Guo)

PR target/65421

gcc/ChangeLog:

* cfgexpand.cc (struct access): New class.
(struct expand_sra): New class.
(expand_sra::collect_sra_candidates): New member function.
(expand_sra::add_sra_candidate): Likewise.
(expand_sra::build_access): Likewise.
(expand_sra::analyze_phi): Likewise.
(expand_sra::analyze_assign): Likewise.
(expand_sra::visit_base): Likewise.
(expand_sra::protect_mem_access_in_stmt): Likewise.
(expand_sra::expand_sra):  Class constructor.
(expand_sra::~expand_sra): Class destructor.
(expand_sra::scalarizable_access):  New member function.
(expand_sra::scalarizable_accesses):  Likewise.
(scalarizable_aggregate):  New function.
(set_scalar_rtx_for_returns):  New function.
(expand_value_return): Updated.
(expand_debug_expr): Updated.
(pass_expand::execute): Updated to use expand_sra.
* cfgexpand.h (scalarizable_aggregate): New declare.
(set_scalar_rtx_for_returns): New declare.
* expr.cc (expand_assignment): Updated.
(expand_constructor): Updated.
(query_position_in_parallel): New function.
(extract_sub_member): New function.
(expand_expr_real_1): Updated.
* expr.h (query_position_in_parallel): New declare.
* function.cc (assign_parm_setup_block): Updated.
(assign_parms): Updated.
(expand_function_start): Updated.
* tree-sra.h (struct sra_base_access): New class.
(struct sra_default_analyzer): New class.

[Bug tree-optimization/111738] incorrect code when PGO is enabled

2023-10-09 Thread iamanonymous.cs at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111738

--- Comment #3 from Anonymous  ---
(In reply to Richard Biener from comment #1)
> I can't reproduce.  Your git version is quite old, it translates to
> r14-2634-g85da0b40538fb0 for me.  It doesn't reproduce with r14-2282 either
> though.
> 
> Current is r14-4486-g873586ebc565b6

Hi, Richard. According to your suggestion, we have updated our gcc to the
latest trunk as:
$ gcc -v
Using built-in specs.
COLLECT_GCC=/root/gcc_set/202310092007/bin/gcc
COLLECT_LTO_WRAPPER=/root/gcc_set/202310092007/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/root/gcc_set/202310092007
--with-gmp=/root/build_essential --with-mpfr=/root/build_essential
--with-mpc=/root/build_essential --enable-languages=c,c++ --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.0 20231009 (experimental) (GCC)

git version: dee55cf59ceea989f47e7605205c6644b27a1f78


Then, we compiled the same test program with/without PGO enabled and found that
the results are inconsistent as:
$ gcc -O3 -w -fprofile-generate=profile a.c -o a.out
$ ./a.out
4
$ gcc -O3 -w -fprofile-use=profile -Wno-missing-profile -fprofile-correction
a.c -o a.out
$ ./a.out
32765

[Bug libstdc++/111747] Problem with large float list initialization

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111747

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #1 from Andrew Pinski  ---


32bit floating point has the following characteristics:
Sign bit: 1 bit
Exponent width: 8 bits
Significand precision: 24 bits (23 explicitly stored)


50000000 is 0x2faf080 which is more than 24 bits in precision, which means it
cannot be represented exactly, and when you start to add 1 to something which is
greater than 0xffffff (which is what 1.67772e+07 is), the value stays the same
and you start to lose precision.
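
The "value stays the same" effect can be reproduced with a small sketch in
Python. It emulates 32-bit float arithmetic by rounding through `struct`; the
helper names `to_f32` and `f32_add` are made up for this example, and it
assumes the platform uses IEEE 754 binary32 with round-to-nearest-even.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def f32_add(a: float, b: float) -> float:
    """Add two values as if both operands and the result were binary32."""
    return to_f32(to_f32(a) + to_f32(b))


LIMIT = float(2 ** 24)  # 16777216, shown as 1.67772e+07 with 6 digits

below = f32_add(LIMIT - 1, 1.0)  # 16777215 + 1 is still exact
stuck = f32_add(LIMIT, 1.0)      # 16777217 is not representable, rounds back
print(below == LIMIT, stuck == LIMIT)
```

Once the running value reaches 2**24, adding 1.0 in 32-bit precision no longer
changes it, which matches the behaviour described above.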

[PATCH] RISC-V Regression: Fix FAIL of bb-slp-pr65935.c for RVV

2023-10-09 Thread Juzhe-Zhong
Here is a reference comparing the dump IR between ARM SVE and RVV:

https://godbolt.org/z/zqess8Gss

We can see that RVV has one more line in the dump IR:
optimized: basic block part vectorized using 128 byte vectors
since RVV has 1024 bit vectors.

The codegen is reasonably good.

However, I saw that GCN also has 1024 bit vectors.
Might this patch cause this case to FAIL in the GCN port?

Hi, GCN folks, could you check this patch on the GCN port for me?

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-pr65935.c: Add vect1024 variant.
* lib/target-supports.exp: Ditto.

---
 gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c | 3 ++-
 gcc/testsuite/lib/target-supports.exp  | 6 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
index 8df35327e7a..9ef1330b47c 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c
@@ -67,7 +67,8 @@ int main()
 
 /* We should also be able to use 2-lane SLP to initialize the real and
imaginary components in the first loop of main.  */
-/* { dg-final { scan-tree-dump-times "optimized: basic block" 10 "slp1" } } */
+/* { dg-final { scan-tree-dump-times "optimized: basic block" 10 "slp1" { 
target {! { vect1024 } } } } } */
+/* { dg-final { scan-tree-dump-times "optimized: basic block" 11 "slp1" { 
target { { vect1024 } } } } } */
 /* We should see the s->phase[dir] operand splatted and no other operand built
from scalars.  See PR97334.  */
 /* { dg-final { scan-tree-dump "Using a splat" "slp1" } } */
diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index dc366d35a0a..95c489d7f76 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8903,6 +8903,12 @@ proc check_effective_target_vect_variable_length { } {
 return [expr { [lindex [available_vector_sizes] 0] == 0 }]
 }
 
+# Return 1 if the target supports vectors of 1024 bits.
+
+proc check_effective_target_vect1024 { } {
+return [expr { [lsearch -exact [available_vector_sizes] 1024] >= 0 }]
+}
+
 # Return 1 if the target supports vectors of 512 bits.
 
 proc check_effective_target_vect512 { } {
-- 
2.36.3



[Bug tree-optimization/111750] Spurious -Warray-bounds warning when using member function pointers

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111750

--- Comment #1 from Andrew Pinski  ---
> That this source produces a -Warray-bounds warning is somewhat surprising 
> since it contains no arrays, no array indexing, and no pointer arithmetic

Well technically there is pointer arithmetic because the pointer to member
function could have a delta for the function call at `(c.*func)();`
Also there is an "array" because all variables/decls are arrays in C++ with a
size of 1 (that allows you to pass  + 1 as the end for iterators).

Anyways the problem here is the optimizer optimized  into `(c.*func)();` but
had not optimized the  ::my_method part yet when the warning happened.

[Bug debug/111749] Kk

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111749

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2023-10-10

--- Comment #1 from Andrew Pinski  ---
This bug has no information in it?
Was that by accident?

[PATCH] RISC-V Regression: Fix dump check of bb-slp-68.c

2023-10-09 Thread Juzhe-Zhong
Like GCN, RVV also has 64-byte vectors (512 bits), which cause this test to
FAIL.

It's more reasonable to use "vect512" instead of AMDGCN.

gcc/testsuite/ChangeLog:

* gcc.dg/vect/bb-slp-68.c: Use vect512.

---
 gcc/testsuite/gcc.dg/vect/bb-slp-68.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-68.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-68.c
index e7573a14933..2dd3d8ee90c 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-68.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-68.c
@@ -20,4 +20,4 @@ void foo ()
 
 /* We want to have the store group split into 4, 2, 4 when using 32byte 
vectors.
Unfortunately it does not work when 64-byte vectors are available.  */
-/* { dg-final { scan-tree-dump-not "from scalars" "slp2" { xfail amdgcn-*-* } 
} } */
+/* { dg-final { scan-tree-dump-not "from scalars" "slp2" { xfail vect512 } } } 
*/
-- 
2.36.3



Re: Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'

2023-10-09 Thread juzhe.zh...@rivai.ai
Oh. I realize this patch brings back a FAIL that I recently fixed:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632247.html

This fails because RVV doesn't have vec_pack_trunc_optab (the loop vectorizer
fails the first time but succeeds the second time),
so RVV dumps FOLD_EXTRACT_LAST 4 times instead of 2 (ARM SVE dumps it 2 times
because it has vec_pack_trunc_optab).

I think the root cause of RVV failing multiple "vect" tests is that we
don't enable the vec_pack/vec_unpack/... stuff.
We still succeed at vectorization and we want to enable those tests
(mostly we just use a different approach to vectorize, which causes dump FAILs,
because of some changes I previously made in the middle-end).

So enabling "vec_pack" for RVV will fix some FAILs but introduce some other
FAILs.

CC to Richi to see more reasonable suggestions.



juzhe.zh...@rivai.ai
 
From: Maciej W. Rozycki
Date: 2023-10-10 06:38
To: 钟居哲
CC: gcc-patches; Jeff Law; rdapp.gcc; kito.cheng
Subject: Re: 回复: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
On Tue, 10 Oct 2023, 钟居哲 wrote:
 
> Btw, could you rebase to the trunk and run regression again?
 
Full regression-testing takes roughly 40 hours here and I do not normally
update the tree midway through my work so as not to add variables and end 
up chasing a moving target, especially with such an unstable state that we 
have ended up with recently with the RISC-V port.  Since I'm done with 
this part I can refresh and schedule another run if you are curious as to 
how it looks like from my side.  For the C subset alone it'll take less.
 
  Maciej
 


[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

--- Comment #6 from Andrew Pinski  ---
(In reply to Andi Kleen from comment #5)
> config/i386/i386.h:#define SLOW_BYTE_ACCESS 0
> 
> You mean it doesn't define it?

The default is 1.
Anyways in this case I was wrong but defining it to 0 causes other issues.

[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

--- Comment #5 from Andi Kleen  ---

config/i386/i386.h:#define SLOW_BYTE_ACCESS 0

You mean it doesn't define it?

Re: Odd Python errors in the G++ testsuite

2023-10-09 Thread Ben Boeckel via Gcc
On Mon, Oct 09, 2023 at 20:12:01 +0100, Maciej W. Rozycki wrote:
> Hi,
> 
>  I'm seeing these tracebacks for several cases across the G++ testsuite:
> 
> Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)" 
>(timeout = 300)
> spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6)
> rules/0/primary-output is ok: p1689-1.o
> rules/0/provides/0/logical-name is ok: foo
> rules/0/provides/0/is-interface is ok: True
> Traceback (most recent call last):
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 218, in 
> is_ok = validate_p1689(actual, expect)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 182, in 
> validate_p1689
> return compare_json([], actual_json, expect_json)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in 
> compare_json
> is_ok = _compare_object(path, actual, expect)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in 
> _compare_object
> sub_error = compare_json(path + [key], actual[key], expect[key])
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 151, in 
> compare_json
> is_ok = _compare_array(path, actual, expect)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 87, in 
> _compare_array
> sub_error = compare_json(path + [str(idx)], a, e)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in 
> compare_json
> is_ok = _compare_object(path, actual, expect)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in 
> _compare_object
> sub_error = compare_json(path + [key], actual[key], expect[key])
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 149, in 
> compare_json
> actual = set(actual)
> TypeError: unhashable type: 'dict'

So looking at the source…the 3.6 check is not from me. Not sure what's
up there; it's probably not immediately related to the backtrace.

But this backtrace means that we have a list of objects that do not
expect a given ordering that is of JSON objects. I'm not sure why this
never showed up before as all of the existing uses of it are indeed of
objects. Can you try removing `"__P1689_unordered__"` from the
`p1689-1.exp.ddi` file's `requires` array? The
`p1689-file-default.exp.ddi` and `p1689-target-default.exp.ddi` files
need the same treatment.
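
For anyone following along who is not a Python user, here is a minimal sketch
of why `set(actual)` raises for a list of JSON objects, plus one possible
order-insensitive comparison. This is not the logic in test-p1689.py; the
helper `canonical` and the sample data are made up for illustration.

```python
import json

# Hypothetical JSON arrays of objects, invented for illustration.
actual = [{"logical-name": "bar"}, {"logical-name": "foo"}]
expect = [{"logical-name": "foo"}, {"logical-name": "bar"}]

# Dicts are mutable and therefore unhashable, so they cannot be set members.
try:
    set(actual)
    raised = False
except TypeError:
    raised = True


def canonical(items):
    """Serialize each item with sorted keys, then sort the strings, to get
    a form that does not depend on the order of the array."""
    return sorted(json.dumps(item, sort_keys=True) for item in items)


same = canonical(actual) == canonical(expect)
print(raised, same)
```

Strings are hashable and sortable, so comparing the canonical forms avoids
the TypeError while still ignoring the order of the array.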

--Ben


Re: Odd Python errors in the G++ testsuite

2023-10-09 Thread Ben Boeckel via Gcc
On Mon, Oct 09, 2023 at 19:46:37 -0400, Paul Koning wrote:
> 
> 
> > On Oct 9, 2023, at 7:42 PM, Ben Boeckel via Gcc  wrote:
> > 
> > On Mon, Oct 09, 2023 at 20:12:01 +0100, Maciej W. Rozycki wrote:
> >> I'm seeing these tracebacks for several cases across the G++ testsuite:
> >> 
> >> Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 
> >> 6)"(timeout = 300)
> >> spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 
> >> 6)
> > 
> > What version of Python3 do you have? The test suite might not actually
> > properly handle not having 3.7 (i.e., skip the tests that require it).
> 
> But the rule that you can't put a dict in a set is as old as set support (2.x 
> for some small x).

Yes. I just wonder how a dictionary got in there in the first place. I'm
not sure if some *other* 3.7-related change makes that work.

--Ben


[Bug tree-optimization/111750] New: Spurious -Warray-bounds warning when using member function pointers

2023-10-09 Thread abbeyj+gcc at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111750

Bug ID: 111750
   Summary: Spurious -Warray-bounds warning when using member
function pointers
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: abbeyj+gcc at gmail dot com
  Target Milestone: ---

Created attachment 56086
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56086&action=edit
Reproducer

The source code below generates a -Warray-bounds warning which I believe is
incorrect.  Compile with `g++ -c -Wall -O2`.

```
struct MyClass {
    void my_method();
};

MyClass g;

void pre();

inline void FetchValue(MyClass& c, void(MyClass::*func)()) {
    pre();
    (c.*func)();
}

int get_int();

inline int Check() {
    static const int ret = get_int();
    return ret;
}

inline void ReadValue(MyClass& c, void(MyClass::*func)()) {
    Check();
    FetchValue(c, func);
}

void Main() {
    ReadValue(g, &MyClass::my_method);
}
```

This produces:
```
In function 'void FetchValue(MyClass&, void (MyClass::*)())',
inlined from 'void ReadValue(MyClass&, void (MyClass::*)())' at
:23:15,
inlined from 'void Main()' at :27:14:
:11:14: warning: array subscript 'int (**)(...)[0]' is partly outside
array bounds of 'MyClass [1]' [-Warray-bounds=]
   11 | (c.*func)();
  | ~^~
: In function 'void Main()':
:5:9: note: object 'g' of size 1
5 | MyClass g;
  | ^
```

Godbolt link: https://godbolt.org/z/6YsWd9xhr

That this source produces a -Warray-bounds warning is somewhat surprising since
it contains no arrays, no array indexing, and no pointer arithmetic.  Small
changes such as removing the static variable or manually inlining a function
into its caller make the warning go away.  

The earliest version that I've been able to reproduce this with is GCC 11.1 and
it still reproduces on the trunk version that's currently available on
godbolt.org.

Re: Odd Python errors in the G++ testsuite

2023-10-09 Thread Paul Koning via Gcc



> On Oct 9, 2023, at 7:42 PM, Ben Boeckel via Gcc  wrote:
> 
> On Mon, Oct 09, 2023 at 20:12:01 +0100, Maciej W. Rozycki wrote:
>> I'm seeing these tracebacks for several cases across the G++ testsuite:
>> 
>> Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 
>> 6)"(timeout = 300)
>> spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6)
> 
> What version of Python3 do you have? The test suite might not actually
> properly handle not having 3.7 (i.e., skip the tests that require it).

But the rule that you can't put a dict in a set is as old as set support (2.x 
for some small x).

paul



Re: Odd Python errors in the G++ testsuite

2023-10-09 Thread Ben Boeckel via Gcc
On Mon, Oct 09, 2023 at 20:12:01 +0100, Maciej W. Rozycki wrote:
>  I'm seeing these tracebacks for several cases across the G++ testsuite:
> 
> Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)" 
>(timeout = 300)
> spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6)

What version of Python3 do you have? The test suite might not actually
properly handle not having 3.7 (i.e., skip the tests that require it).

> rules/0/primary-output is ok: p1689-1.o

I wrote these tests.

> rules/0/provides/0/logical-name is ok: foo
> rules/0/provides/0/is-interface is ok: True
> Traceback (most recent call last):
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 218, in 
> is_ok = validate_p1689(actual, expect)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 182, in 
> validate_p1689
> return compare_json([], actual_json, expect_json)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in 
> compare_json
> is_ok = _compare_object(path, actual, expect)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in 
> _compare_object
> sub_error = compare_json(path + [key], actual[key], expect[key])
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 151, in 
> compare_json
> is_ok = _compare_array(path, actual, expect)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 87, in 
> _compare_array
> sub_error = compare_json(path + [str(idx)], a, e)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in 
> compare_json
> is_ok = _compare_object(path, actual, expect)
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in 
> _compare_object
> sub_error = compare_json(path + [key], actual[key], expect[key])
>   File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 149, in 
> compare_json
> actual = set(actual)
> TypeError: unhashable type: 'dict'

I'm not sure how this ends up with a dictionary in it… Can you
`print(actual)` before this?

> and also these intermittent failures for other cases:
> 
> Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)" 
>(timeout = 300)
> spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6)
> rules/0/primary-output is ok: p1689-2.o
> rules/0/provides/0/logical-name is ok: foo:part1
> rules/0/provides/0/is-interface is ok: True
> ERROR: length mismatch at rules/0/requires: actual: "1" expect: "0"
> version is ok: 0
> revision is ok: 0
> FAIL: ERROR: length mismatch at rules/0/requires: actual: "1" expect: "0"
> 
> This does seem to me like something not working as intended.  As a Python 
> non-expert I have troubles concluding what is going on here and whether 
> these tracebacks are indeed supposed to be there, or whether it is a sign 
> of a problem.  And these failures I don't even know where they come from.  
> 
>  Does anyone know?  Is there a way to run the offending commands by hand?  
> The relevant invocation lines do not appear in the test log file for one 
> to copy and paste, which I think is not the right way of doing things in 
> our environment.
> 
>  These issues seem independent from the test host environment as I can see 
> them on both a `powerpc64le-linux-gnu' and an `x86_64-linux-gnu' machine 
> in `riscv64-linux-gnu' target testing.

Do they all have pre-3.7 Python3 versions?

--Ben


[PATCH] RISC-V: Add available vector size for RVV

2023-10-09 Thread Juzhe-Zhong
For RVV, we have VLS modes enabled according to TARGET_MIN_VLEN
from M1 to M8.

For example, when TARGET_MIN_VLEN = 128 bits, we enable
128/256/512/1024 bits VLS modes.
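
As a quick illustration of the arithmetic, here is a small Python sketch. It
assumes the VLS sizes are simply TARGET_MIN_VLEN multiplied by the LMUL values
1, 2, 4 and 8 (M1 to M8); the function name `vls_sizes` is made up for this
example.

```python
# LMUL register-group multipliers M1, M2, M4 and M8.
LMULS = (1, 2, 4, 8)


def vls_sizes(min_vlen_bits: int) -> list:
    """Vector sizes in bits reachable from M1 to M8 for a given minimum VLEN."""
    return [min_vlen_bits * lmul for lmul in LMULS]


print(vls_sizes(128))  # [128, 256, 512, 1024]
```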

This patch fixes the following FAILs:
FAIL: gcc.dg/vect/bb-slp-subgroups-2.c -flto -ffat-lto-objects  
scan-tree-dump-times slp2 "optimized: basic block" 2
FAIL: gcc.dg/vect/bb-slp-subgroups-2.c scan-tree-dump-times slp2 "optimized: 
basic block" 2

gcc/testsuite/ChangeLog:

* lib/target-supports.exp: Add 256/512/1024

---
 gcc/testsuite/lib/target-supports.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/target-supports.exp 
b/gcc/testsuite/lib/target-supports.exp
index af52c38433d..dc366d35a0a 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8881,7 +8881,7 @@ proc available_vector_sizes { } {
lappend result 4096 2048 1024 512 256 128 64 32 16 8 4 2
 } elseif { [istarget riscv*-*-*] } {
if { [check_effective_target_riscv_v] } {
-   lappend result 0 32 64 128
+   lappend result 0 32 64 128 256 512 1024
}
lappend result 128
 } else {
-- 
2.36.3



[Bug debug/111749] New: Kk

2023-10-09 Thread molono1386 at dixiser dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111749

Bug ID: 111749
   Summary: Kk
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: molono1386 at dixiser dot com
  Target Milestone: ---

Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA

2023-10-09 Thread Christoph Müllner
On Mon, Oct 9, 2023 at 10:48 PM Vineet Gupta  wrote:
>
> On 10/9/23 13:46, Christoph Müllner wrote:
> > Given that this causes repeated issues, I think that a fall-back to
> > counting occurrences is the right thing to do. I can do that if that's ok.
>
> Thanks Christoph.

Tested patch on list:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632393.html

>
> -Vineet


[PATCH] RISC-V: Make xtheadcondmov-indirect tests robust against instruction reordering

2023-10-09 Thread Christoph Muellner
From: Christoph Müllner 

Fixes: c1bc7513b1d7 ("RISC-V: const: hide mvconst splitter from IRA")

A recent change broke the xtheadcondmov-indirect tests, because the order of
emitted instructions changed. Since the test is too strict when testing for
a fixed instruction order, let's change the tests to simply count instructions,
as is done for similar tests.

Reported-by: Patrick O'Neill 
Signed-off-by: Christoph Müllner 

gcc/testsuite/ChangeLog:

* gcc.target/riscv/xtheadcondmov-indirect.c: Make robust against
instruction reordering.

Signed-off-by: Christoph Müllner 
---
 .../gcc.target/riscv/xtheadcondmov-indirect.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c 
b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
index c3253ba5239..eba1b86137b 100644
--- a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
+++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c
@@ -1,8 +1,7 @@
 /* { dg-do compile } */
-/* { dg-options "-march=rv32gc_xtheadcondmov -fno-sched-pressure" { target { 
rv32 } } } */
-/* { dg-options "-march=rv64gc_xtheadcondmov -fno-sched-pressure" { target { 
rv64 } } } */
+/* { dg-options "-march=rv32gc_xtheadcondmov" { target { rv32 } } } */
+/* { dg-options "-march=rv64gc_xtheadcondmov" { target { rv64 } } } */
 /* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */
-/* { dg-final { check-function-bodies "**" "" } } */
 
 /*
 ** ConEmv_imm_imm_reg:
@@ -116,3 +115,9 @@ int ConNmv_reg_reg_reg(int x, int y, int z, int n)
 return z;
   return n;
 }
+
+/* { dg-final { scan-assembler-times "addi\t" 5 } } */
+/* { dg-final { scan-assembler-times "li\t" 4 } } */
+/* { dg-final { scan-assembler-times "sub\t" 4 } } */
+/* { dg-final { scan-assembler-times "th.mveqz\t" 4 } } */
+/* { dg-final { scan-assembler-times "th.mvnez\t" 4 } } */
-- 
2.41.0



Re: 回复: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'

2023-10-09 Thread Maciej W. Rozycki
On Tue, 10 Oct 2023, 钟居哲 wrote:

> Btw, could you rebase to the trunk and run regression again?

 Full regression-testing takes roughly 40 hours here and I do not normally
update the tree midway through my work so as not to add variables and end 
up chasing a moving target, especially with such an unstable state that we 
have ended up with recently with the RISC-V port.  Since I'm done with 
this part I can refresh and schedule another run if you are curious as to 
how it looks like from my side.  For the C subset alone it'll take less.

  Maciej


Re: Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'

2023-10-09 Thread 钟居哲
I know you want vect_int to block the test for rv64gc.
But unfortunately it failed.

And I have changed everything to run the vect testsuite with "riscv_v".
[PATCH] RISC-V: Enable more tests of "vect" for RVV (gnu.org)

So to be consistent, please add "riscv_v".



juzhe.zh...@rivai.ai
 
From: Maciej W. Rozycki
Date: 2023-10-10 06:29
To: 钟居哲
CC: gcc-patches; Jeff Law; rdapp.gcc; kito.cheng
Subject: Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
On Tue, 10 Oct 2023, 钟居哲 wrote:
 
>  && [check_effective_target_arm_little_endian])
>   || ([istarget mips*-*-*]
>  && [et-is-effective-target mips_msa])
> +  || [istarget riscv*-*-*]
>   || ([istarget s390*-*-*]
>  && [check_effective_target_s390_vx])
>   || [istarget amdgcn*-*-*] }}]
> 
> You should change it into:
> 
> || ([istarget riscv*-*-*]
>  && [check_effective_target_riscv_v])
> 
> Then, these additional FAILs will be removed:
> 
> with no changes (except for intermittent Python failures for C++) with the 
> remaining testsuites.  There are a few of regressions in `-march=rv64gc' 
> testing:
> +FAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP"
> +FAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing 
> stmts using SLP" 3
> +FAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts 
> using SLP" 3
> +FAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects  scan-tree-dump vect 
> "vectorizing stmts using SLP"
> +FAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> +FAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorizing stmts using SLP" 3
 
I explained in the change description why the check for `riscv_v' isn't 
needed here: the tests mustn't run in the first place, so naturally they 
cannot fail either.  If I missed anything, then please elaborate.
 
  Maciej
 


Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'

2023-10-09 Thread Maciej W. Rozycki
On Tue, 10 Oct 2023, 钟居哲 wrote:

>&& [check_effective_target_arm_little_endian])
>|| ([istarget mips*-*-*]
>&& [et-is-effective-target mips_msa])
> +  || [istarget riscv*-*-*]
>|| ([istarget s390*-*-*]
>&& [check_effective_target_s390_vx])
>   || [istarget amdgcn*-*-*] }}]
> 
> You should change it into:
> 
> || ([istarget riscv*-*-*]
>  && [check_effective_target_riscv_v])
> 
> Then, these additional FAILs will be removed:
> 
> with no changes (except for intermittent Python failures for C++) with the 
> remaining testsuites.  There are a few of regressions in `-march=rv64gc' 
> testing:
> +FAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP"
> +FAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing 
> stmts using SLP" 3
> +FAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts 
> using SLP" 3
> +FAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects  scan-tree-dump vect 
> "vectorizing stmts using SLP"
> +FAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects  
> scan-tree-dump-times vect "vectorizing stmts using SLP" 3
> +FAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects  scan-tree-dump-times 
> vect "vectorizing stmts using SLP" 3

 I explained in the change description why the check for `riscv_v' isn't 
needed here: the tests mustn't run in the first place, so naturally they 
cannot fail either.  If I missed anything, then please elaborate.

  Maciej


[Bug c++/111748] New: GCC does not understand partial ordering between non-constrained and constrained templates for specialization

2023-10-09 Thread jeanmichael.celerier at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111748

Bug ID: 111748
   Summary: GCC does not understand partial ordering between
non-constrained and constrained templates for
specialization
   Product: gcc
   Version: 13.1.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jeanmichael.celerier at gmail dot com
  Target Milestone: ---

Consider: 

#include

template
void foo() { }

template 
void foo() { }

template<> 
void foo() { }

int main() { foo(); }


According to the answers I got in https://stackoverflow.com/questions/77261120/
GCC should be able to compile this code, yet it fails due to a supposed
ambiguity between 

template
void foo() { }

and

template 
void foo() { }

as the base of foo

Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'

2023-10-09 Thread 钟居哲
Btw, could you rebase to the trunk and run regression again?

I saw your report 670 FAILs:
# of expected passes   187616
# of unexpected failures   672
# of unexpected successes  14
# of expected failures 1436
# of unresolved testcases  615
# of unsupported tests 4731

I have recently been working on fixing FAILs in the RISC-V regression run, and
your report looks odd.
This is my report:

# of expected passes183613
# of unexpected failures92
# of unexpected successes   12
# of expected failures  1383
# of unresolved testcases   4
# of unsupported tests  4223

It should be fewer than 100 FAILs.


juzhe.zh...@rivai.ai
 
From: 钟居哲
Date: 2023-10-10 06:17
To: gcc-patches
CC: macro; Jeff Law; rdapp.gcc; kito.cheng
Subject: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
 && [check_effective_target_arm_little_endian])
 || ([istarget mips*-*-*]
 && [et-is-effective-target mips_msa])
+|| [istarget riscv*-*-*]
 || ([istarget s390*-*-*]
 && [check_effective_target_s390_vx])
  || [istarget amdgcn*-*-*] }}]

You should change it into:

|| ([istarget riscv*-*-*]
 && [check_effective_target_riscv_v])

Then, these additional FAILs will be removed:

with no changes (except for intermittent Python failures for C++) with the 
remaining testsuites.  There are a few of regressions in `-march=rv64gc' 
testing:
+FAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP"
+FAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 3
+FAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts using 
SLP" 3
+FAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects  scan-tree-dump vect 
"vectorizing stmts using SLP"
+FAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorizing stmts using SLP" 3
+FAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorizing stmts using SLP" 3


juzhe.zh...@rivai.ai


[PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'

2023-10-09 Thread 钟居哲
 && [check_effective_target_arm_little_endian])
 || ([istarget mips*-*-*]
 && [et-is-effective-target mips_msa])
+|| [istarget riscv*-*-*]
 || ([istarget s390*-*-*]
 && [check_effective_target_s390_vx])
  || [istarget amdgcn*-*-*] }}]

You should change it into:

|| ([istarget riscv*-*-*]
 && [check_effective_target_riscv_v])

Then, these additional FAILs will be removed:

with no changes (except for intermittent Python failures for C++) with the 
remaining testsuites.  There are a few of regressions in `-march=rv64gc' 
testing:
+FAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP"
+FAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 3
+FAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts using 
SLP" 3
+FAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects  scan-tree-dump vect 
"vectorizing stmts using SLP"
+FAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorizing stmts using SLP" 3
+FAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorizing stmts using SLP" 3


juzhe.zh...@rivai.ai


[Bug tree-optimization/111519] [13/14 Regression] Wrong code at -O3 on x86_64-linux-gnu since r13-455-g1fe04c497d

2023-10-09 Thread roger at nextmovesoftware dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111519

--- Comment #2 from Roger Sayle  ---
Complicated.  Things have gone wrong before the strlen pass which is given:

  _73 = e;
  _72 = *_73;
...
  *_73 = prephitmp_23;
  d = _72;

Here the assignment to *_73 overwrites the value of f (at *e) which then
invalidates the use of _72 resulting in the wrong value for d.  But figuring
out which pass is at fault (perhaps complete loop unrolling?) is tricky.

[Bug libstdc++/111747] New: Problem with large float list initialization

2023-10-09 Thread oplata.kes1 at mail dot ru via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111747

Bug ID: 111747
   Summary: Problem with large float list initialization
   Product: gcc
   Version: 11.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: oplata.kes1 at mail dot ru
  Target Milestone: ---

Created attachment 56085
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56085&action=edit
*ii file and output

I create a vector on the stack of 50 mln (or 500 mln) floats equal to 1.0 and
simply add them.
The sum is not equal to 50 mln (or 5 billions).

5 mln floats initialize and sum fine.

The command is

g++ -v -save-temps gcc.cpp && ./a.out

If it is a stack overflow, shouldn't the code fail with stack overflow error?
If not, what is it?

I use GCC 11.4.0 in Ubuntu 22.04 under WSL 2 (!)
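
One plausible reading of the symptom, offered as an assumption rather than a confirmed diagnosis of the attachment: if the running sum is held in a `float`, it stops growing at 2^24 = 16777216, because 16777217 is not representable in IEEE 754 single precision and each further `+ 1.0f` rounds back to the same value. A minimal C sketch of that effect follows; the function names are hypothetical and not taken from the attached file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: add 1.0f to a float accumulator n times.
   Assumes IEEE 754 binary32; the volatile store forces rounding to
   float precision on every step, even on targets with excess precision. */
float sum_ones_float(size_t n)
{
    volatile float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum = sum + 1.0f;
    return sum;
}

/* Same loop with a double accumulator, which stays exact far beyond
   50 million additions of 1.0. */
double sum_ones_double(size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += 1.0;
    return sum;
}

/* Self-check of the claim made above. */
void check_float_saturation(void)
{
    assert(sum_ones_float(5000000) == 5000000.0f);    /* below 2^24: exact */
    assert(sum_ones_float(50000000) == 16777216.0f);  /* stalls at 2^24 */
    assert(sum_ones_double(50000000) == 50000000.0);
}
```

Under these assumptions 5 million ones sum exactly, while 50 million ones stall at 16777216. That would match "5 mln is fine, 50 mln is wrong" without any stack overflow being involved.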

[PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'

2023-10-09 Thread Maciej W. Rozycki
Despite not defining `vec_pack_trunc_' standard named patterns the 
backend provides vector pack operations via its own `@pred_trunc' 
set of patterns and they do trigger in vectorization producing narrowing 
VNCVT.X.X.W assembly instructions as expected.

Enable the `vect_pack_trunc' setting for RISC-V targets then, improving
GCC C test results in `-march=rv64gcv' testing as follows:

-FAIL: gcc.dg/vect/pr57705.c scan-tree-dump-times vect "vectorized 1 loop" 2
+PASS: gcc.dg/vect/pr57705.c scan-tree-dump-times vect "vectorized 1 loop" 3
+PASS: gcc.dg/vect/pr59354.c scan-tree-dump vect "vectorized 1 loop"
-UNSUPPORTED: gcc.dg/vect/pr97678.c
+PASS: gcc.dg/vect/pr97678.c (test for excess errors)
+PASS: gcc.dg/vect/pr97678.c execution test
+XFAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP"
-UNSUPPORTED: gcc.dg/vect/vect-bool-cmp.c
+PASS: gcc.dg/vect/vect-bool-cmp.c (test for excess errors)
+PASS: gcc.dg/vect/vect-bool-cmp.c execution test
+PASS: gcc.dg/vect/vect-iv-4.c scan-tree-dump-times vect "vectorized 1 loops" 1
+PASS: gcc.dg/vect/vect-multitypes-14.c scan-tree-dump-times vect "vectorized 1 
loops" 1
+PASS: gcc.dg/vect/vect-multitypes-8.c scan-tree-dump-times vect "vectorized 1 
loops" 1
+PASS: gcc.dg/vect/vect-reduc-dot-u16b.c scan-tree-dump-times vect "vectorized 
1 loops" 1
+PASS: gcc.dg/vect/vect-strided-store-u16-i4.c scan-tree-dump-times vect 
"vectorized 1 loops" 2
-PASS: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 2
+XFAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 3
-PASS: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts using 
SLP" 2
+XFAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts using 
SLP" 3
+PASS: gcc.dg/vect/slp-multitypes-10.c scan-tree-dump-times vect "vectorized 1 
loops" 1
+PASS: gcc.dg/vect/slp-multitypes-10.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 1
+PASS: gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorized 1 
loops" 1
+PASS: gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 1
+PASS: gcc.dg/vect/slp-multitypes-6.c scan-tree-dump-times vect "vectorized 1 
loops" 1
+PASS: gcc.dg/vect/slp-multitypes-6.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 1
+PASS: gcc.dg/vect/slp-multitypes-9.c scan-tree-dump-times vect "vectorized 1 
loops" 1
+PASS: gcc.dg/vect/slp-multitypes-9.c scan-tree-dump-times vect "vectorizing 
stmts using SLP" 1
-UNSUPPORTED: gcc.dg/vect/slp-perm-12.c
+PASS: gcc.dg/vect/slp-perm-12.c (test for excess errors)
+PASS: gcc.dg/vect/slp-perm-12.c execution test
-UNSUPPORTED: gcc.dg/vect/bb-slp-11.c
+PASS: gcc.dg/vect/bb-slp-11.c (test for excess errors)
+PASS: gcc.dg/vect/bb-slp-11.c execution test
-FAIL: gcc.dg/vect/pr57705.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorized 1 loop" 2
+PASS: gcc.dg/vect/pr57705.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorized 1 loop" 3
+PASS: gcc.dg/vect/pr59354.c -flto -ffat-lto-objects  scan-tree-dump vect 
"vectorized 1 loop"
-UNSUPPORTED: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects
+PASS: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects (test for excess errors)
+PASS: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects execution test
+XFAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects  scan-tree-dump vect 
"vectorizing stmts using SLP"
-UNSUPPORTED: gcc.dg/vect/vect-bool-cmp.c -flto -ffat-lto-objects
+PASS: gcc.dg/vect/vect-bool-cmp.c -flto -ffat-lto-objects (test for excess 
errors)
+PASS: gcc.dg/vect/vect-bool-cmp.c -flto -ffat-lto-objects execution test
+PASS: gcc.dg/vect/vect-iv-4.c -flto -ffat-lto-objects  scan-tree-dump-times 
vect "vectorized 1 loops" 1
+PASS: gcc.dg/vect/vect-multitypes-14.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorized 1 loops" 1
+PASS: gcc.dg/vect/vect-multitypes-8.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorized 1 loops" 1
+PASS: gcc.dg/vect/vect-reduc-dot-u16b.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorized 1 loops" 1
+PASS: gcc.dg/vect/vect-strided-store-u16-i4.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorized 1 loops" 2
-PASS: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorizing stmts using SLP" 2
+XFAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorizing stmts using SLP" 3
-PASS: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorizing stmts using SLP" 2
+XFAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects  scan-tree-dump-times vect 
"vectorizing stmts using SLP" 3
+PASS: gcc.dg/vect/slp-multitypes-10.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorized 1 loops" 1
+PASS: gcc.dg/vect/slp-multitypes-10.c -flto -ffat-lto-objects  
scan-tree-dump-times vect "vectorizing stmts using SLP" 1
+PASS: gcc.dg/vect/slp-multitypes-5.c -flto 

[Bug tree-optimization/111715] Missed optimization in FRE because of weak TBAA

2023-10-09 Thread sjames at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111715

--- Comment #6 from Sam James  ---
I started hitting the original warning Jakub hit with 13.2.1 20231007 but I've
not tried to figure out which backported change caused it to appear.

[Bug tree-optimization/111679] `(~a) | (a ^ b)` is not simplified to `~(a & b)`

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111679

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||patch
URL||https://gcc.gnu.org/piperma
   ||il/gcc-patches/2023-October
   ||/632386.html

--- Comment #2 from Andrew Pinski  ---
Patch posted:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632386.html

[PATCH] MATCH: [PR111679] Add alternative simplification of `a | ((~a) ^ b)`

2023-10-09 Thread Andrew Pinski
So currently we have a simplification for `a | ~(a ^ b)`, but
it does not match the case where we originally had `(~a) | (a ^ b)`,
so we need to add a new pattern that matches that and uses
bitwise_inverted_equal_p, which also catches comparisons.
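
As a sanity check of the algebra only (not of the match.pd pattern itself), the two identities involved can be verified exhaustively on 8-bit values; since the operators act bit by bit, this covers wider types too. The sketch below is illustrative and its function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns the number of 8-bit operand pairs for which either identity fails.
   Identity 1: (~a) | (a ^ b)  ==  ~(a & b)
   Identity 2:  a | ((~a) ^ b) ==  a | (~b)   */
unsigned count_identity_failures(void)
{
    unsigned failures = 0;
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            uint8_t lhs1 = (uint8_t)(~a | (a ^ b));
            uint8_t rhs1 = (uint8_t)~(a & b);
            uint8_t lhs2 = (uint8_t)(a | (~a ^ b));
            uint8_t rhs2 = (uint8_t)(a | ~b);
            if (lhs1 != rhs1 || lhs2 != rhs2)
                failures++;
        }
    }
    return failures;
}

/* Self-check: both identities hold for every operand pair. */
void check_identities(void)
{
    assert(count_identity_failures() == 0);
}
```

Identity 1 is the form in the bug title; identity 2 is the form in the new pattern's comment, with the roles of `a` and `~a` swapped.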

OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

PR tree-optimization/111679

gcc/ChangeLog:

* match.pd (`a | ((~a) ^ b)`): New pattern.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/bitops-5.c: New test.
---
 gcc/match.pd |  8 +++
 gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c | 27 
 2 files changed, 35 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 31bfd8b6b68..49740d189a7 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1350,6 +1350,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   && TYPE_PRECISION (TREE_TYPE (@0)) == 1)
   (bit_ior @0 (bit_xor @1 { build_one_cst (type); }
 
+/* a | ((~a) ^ b)  -->  a | (~b) (alt version of the above 2) */
+(simplify
+ (bit_ior:c @0 (bit_xor:cs @1 @2))
+ (with { bool wascmp; }
+ (if (bitwise_inverted_equal_p (@0, @1, wascmp)
+  && (!wascmp || element_precision (type) == 1))
+  (bit_ior @0 (bit_not @2)))))
+
 /* (a | b) | (a &^ b)  -->  a | b  */
 (for op (bit_and bit_xor)
  (simplify
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c b/gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c
new file mode 100644
index 000..990610e3002
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+/* PR tree-optimization/111679 */
+
+int f1(int a, int b)
+{
+return (~a) | (a ^ b); // ~(a & b) or (~a) | (~b)
+}
+
+_Bool fb(_Bool c, _Bool d)
+{
+return (!c) | (c ^ d); // ~(c & d) or (~c) | (~d)
+}
+
+_Bool fb1(int x, int y)
+{
+_Bool a = x == 10,  b = y > 100;
+return (!a) | (a ^ b); // ~(a & b) or (~a) | (~b)
+// or (x != 10) | (y <= 100)
+}
+
+/* { dg-final { scan-tree-dump-not   "bit_xor_expr, "   "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_not_expr, " 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_and_expr, " 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "bit_ior_expr, " 1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "ne_expr, _\[0-9\]+, x_\[0-9\]+"  1 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "le_expr, _\[0-9\]+, y_\[0-9\]+"  1 "optimized" } } */
-- 
2.39.3



[RFC] RISC-V: Handle new types in scheduling descriptions

2023-10-09 Thread Edwin Lu
Now that every insn is guaranteed a type, we want to ensure the types are 
handled by the existing scheduling descriptions. 

There are 2 approaches I see:
1. Create a new pipeline intended to eventually abort (sifive-7.md) 
2. Add the types to an existing pipeline (generic.md)

Which approach do we want to go with? If there is a different approach we
want to take instead, please let me know as well.

Additionally, should types associated with specific extensions 
(vector, crypto, etc) have specific pipelines dedicated to them? 

* config/riscv/generic.md: update pipeline
* config/riscv/sifive-7.md (sifive_7): update pipeline
(sifive_7_other):

Signed-off-by: Edwin Lu 
---
 gcc/config/riscv/generic.md  | 3 ++-
 gcc/config/riscv/sifive-7.md | 7 +++
 2 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/config/riscv/generic.md b/gcc/config/riscv/generic.md
index 57d3c3b4adc..338d2e85b77 100644
--- a/gcc/config/riscv/generic.md
+++ b/gcc/config/riscv/generic.md
@@ -27,7 +27,8 @@ (define_cpu_unit "fdivsqrt" "pipe0")
 
 (define_insn_reservation "generic_alu" 1
   (and (eq_attr "tune" "generic")
-   (eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop,logical,move,bitmanip,min,max,minu,maxu,clz,ctz,cpop"))
+   (eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop,
+ logical,move,bitmanip,min,max,minu,maxu,clz,ctz,cpop,trap,cbo"))
   "alu")
 
 (define_insn_reservation "generic_load" 3
diff --git a/gcc/config/riscv/sifive-7.md b/gcc/config/riscv/sifive-7.md
index 526278e46d4..e76d82614d6 100644
--- a/gcc/config/riscv/sifive-7.md
+++ b/gcc/config/riscv/sifive-7.md
@@ -12,6 +12,8 @@ (define_cpu_unit "sifive_7_B" "sifive_7")
 (define_cpu_unit "sifive_7_idiv" "sifive_7")
 (define_cpu_unit "sifive_7_fpu" "sifive_7")
 
+(define_cpu_unit "sifive_7_abort" "sifive_7")
+
 (define_insn_reservation "sifive_7_load" 3
   (and (eq_attr "tune" "sifive_7")
(eq_attr "type" "load"))
@@ -106,6 +108,11 @@ (define_insn_reservation "sifive_7_f2i" 3
(eq_attr "type" "mfc"))
   "sifive_7_A")
 
+(define_insn_reservation "sifive_7_other" 3
+  (and (eq_attr "tune" "sifive_7")
+   (eq_attr "type" "trap,cbo"))
+  "sifive_7_abort")
+
 (define_bypass 1 "sifive_7_load,sifive_7_alu,sifive_7_mul,sifive_7_f2i,sifive_7_sfb_alu"
   "sifive_7_alu,sifive_7_branch")
 
-- 
2.34.1



[Bug target/111746] [14 Regression] ICE: infinite recursion in try_split (emit-rtl.cc:3972) at -O2

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111746

Andrew Pinski  changed:

   What|Removed |Added

   Target Milestone|--- |14.0

Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA

2023-10-09 Thread Vineet Gupta

On 10/9/23 13:46, Christoph Müllner wrote:
> Given that this causes repeated issues, I think that a fall-back to
> counting occurrences is the right thing to do. I can do that if that's ok.


Thanks Christoph.

-Vineet


Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA

2023-10-09 Thread Christoph Müllner
On Mon, Oct 9, 2023 at 10:36 PM Vineet Gupta  wrote:
>
> Hi Christoph,
>
> On 10/9/23 12:06, Patrick O'Neill wrote:
> >
> > Hi Vineet,
> >
> > We're seeing a regression on all riscv targets after this patch:
> >
> > FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O2
> > check-function-bodies ConNmv_imm_imm_reg
> > FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O3 -g
> > check-function-bodies ConNmv_imm_imm_reg
> >
> > Debug log output:
> > body: \taddi    a[0-9]+,a[0-9]+,-1000+
> > \tli    a[0-9]+,9998336+
> > \taddi    a[0-9]+,a[0-9]+,1664+
> > \tth.mveqz    a[0-9]+,a[0-9]+,a[0-9]+
> > \tret
> >
> > against:     li    a5,9998336
> >     addi    a4,a0,-1000
> >     addi    a0,a5,1664
> >     th.mveqz    a0,a1,a4
> >     ret
> >
> > https://github.com/patrick-rivos/gcc-postcommit-ci/issues/8
> > https://github.com/ewlu/riscv-gnu-toolchain/issues/286
> >
>
> It seems with my patch, exactly same instructions get out of order (for
> -O2/-O3) tripping up the test results and differ from say O1 for exact
> same build.
>
> -O2 w/ patch
> ConNmv_imm_imm_reg:
>      li    a5,9998336
>      addi    a4,a0,-1000
>      addi    a0,a5,1664
>      th.mveqz    a0,a1,a4
>      ret
>
> -O1 w/ patch
> ConNmv_imm_imm_reg:
>      addi    a4,a0,-1000
>      li    a5,9998336
>      addi    a0,a5,1664
>      th.mveqz    a0,a1,a4
>      ret
>
> I'm not sure if there is an easy way to handle that.
> Is there a real reason for testing the full sequences verbatim, or is
> testing number of occurrences of th.mv{eqz,nez} enough.

I did not write the test cases, I just merged two non-functional test files
into one that works without changing the actual test approach.

Given that this causes repeated issues, I think that a fall-back to counting
occurrences is the right thing to do.

I can do that if that's ok.

BR
Christoph



> It seems Jeff recently added -fno-sched-pressure to avoid similar issues
> but that apparently is no longer sufficient.
>
> Thx,
> -Vineet
>
> > Thanks,
> > Patrick
> >
> > On 10/6/23 11:22, Vineet Gupta wrote:
> >> Vlad recently introduced a new gate @ira_in_progress, similar to
> >> counterparts @{reload,lra}_in_progress.
> >>
> >> Use this to hide the constant synthesis splitter from being recog* ()
> >> by IRA register equivalence logic which is eager to undo the splits,
> >> generating worse code for constants (and sometimes no code at all).
> >>
> >> See PR/109279 (large constant), PR/110748 (const -0.0) ...
> >>
> >> Granted the IRA logic is subsided with -fsched-pressure which is now
> >> enabled for RISC-V backend, the gate makes this future-proof in
> >> addition to helping with -O1 etc.
> >>
> >> This fixes 1 additional test
> >>
> >> = Summary of gcc testsuite =
> >>  | # of unexpected case / # of unique unexpected case
> >>  |  gcc |  g++ | gfortran |
> >>
> >> rv32imac/  ilp32/ medlow |  416 /   103 |   13 / 6 |   67 /12 |
> >>   rv32imafdc/ ilp32d/ medlow |  416 /   103 |   13 / 6 |   24 / 4 |
> >> rv64imac/   lp64/ medlow |  417 /   104 |9 / 3 |   67 /12 |
> >>   rv64imafdc/  lp64d/ medlow |  416 /   103 |5 / 2 |6 / 1 |
> >>
> >> Also similar to v1, this doesn't move RISC-V SPEC scores at all.
> >>
> >> gcc/ChangeLog:
> >>  * config/riscv/riscv.md (mvconst_internal): Add !ira_in_progress.
> >>
> >> Suggested-by: Jeff Law
> >> Signed-off-by: Vineet Gupta
> >> ---
> >>   gcc/config/riscv/riscv.md | 9 ++---
> >>   1 file changed, 6 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> >> index 1ebe8f92284d..da84b9357bd3 100644
> >> --- a/gcc/config/riscv/riscv.md
> >> +++ b/gcc/config/riscv/riscv.md
> >> @@ -1997,13 +1997,16 @@
> >>
> >>   ;; Pretend to have the ability to load complex const_int in order to get
> >>   ;; better code generation around them.
> >> -;;
> >>   ;; But avoid constants that are special cased elsewhere.
> >> +;;
> >> +;; Hide it from IRA register equiv recog* () to elide potential undoing of split
> >> +;;
> >>   (define_insn_and_split "*mvconst_internal"
> >> [(set (match_operand:GPR 0 "register_operand" "=r")
> >>   (match_operand:GPR 1 "splittable_const_int_operand" "i"))]
> >> -  "!(p2m1_shift_operand (operands[1], mode)
> >> - || high_mask_shift_operand (operands[1], mode))"
> >> +  "!ira_in_progress
> >> +   && !(p2m1_shift_operand (operands[1], mode)
> >> +|| high_mask_shift_operand (operands[1], mode))"
> >> "#"
> >> "&& 1"
> >> [(const_int 0)]
>


[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

Andrew Pinski  changed:

   What|Removed |Added

   See Also||https://gcc.gnu.org/bugzill
   ||a/show_bug.cgi?id=107601

--- Comment #4 from Andrew Pinski  ---
x86_64 defines SLOW_BYTE_ACCESS which causes some (if not all) of the issues
here:
```
;; _3 = bf.c;

(insn 9 8 10 (parallel [
(set (reg:DI 106)
(lshiftrt:DI (reg/v:DI 104 [ bf ])
(const_int 32 [0x20])))
(clobber (reg:CC 17 flags))
]) "/app/example.cpp":5:58 -1
 (nil))

(insn 10 9 0 (parallel [
(set (reg:HI 100 [ _3 ])
(and:HI (subreg:HI (reg:DI 106) 0)
(const_int 1023 [0x3ff])))
(clobber (reg:CC 17 flags))
]) "/app/example.cpp":5:58 -1
 (nil))

;; _4 = (unsigned int) _3;

(insn 11 10 0 (set (reg:SI 101 [ _4 ])
(zero_extend:SI (reg:HI 100 [ _3 ]))) "/app/example.cpp":5:46 -1
 (nil))
```
Uses HImode (short) here due to SLOW_BYTE_ACCESS being defined rather than the
SImode (int).

Re: [PATCH v4] c++: Check for indirect change of active union member in constexpr [PR101631,PR102286]

2023-10-09 Thread Jason Merrill

On 10/8/23 21:03, Nathaniel Shead wrote:

> Ping for https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631203.html
>
> + && (TREE_CODE (t) == MODIFY_EXPR
> + /* Also check if initializations have implicit change of active
> +member earlier up the access chain.  */
> + || !refs->is_empty())


I'm not sure what the cumulative point of these two tests is.  TREE_CODE 
(t) will be either MODIFY_EXPR or INIT_EXPR, and either should be OK.


As I understand it, the problematic case is something like 
constexpr-union2.C, where we're also looking at a MODIFY_EXPR.  So what 
is this check doing?


Incidentally, I think constexpr-union6.C could use a test where we pass 
 to a function other than construct_at, and then try (and fail) to 
assign to the b member from that function.


Jason



Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA

2023-10-09 Thread Jeff Law




On 10/9/23 14:36, Vineet Gupta wrote:

> Hi Christoph,
>
> On 10/9/23 12:06, Patrick O'Neill wrote:
> >
> > Hi Vineet,
> >
> > We're seeing a regression on all riscv targets after this patch:
> >
> > FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O2
> > check-function-bodies ConNmv_imm_imm_reg
> > FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O3 -g
> > check-function-bodies ConNmv_imm_imm_reg
> >
> > Debug log output:
> > body: \taddi    a[0-9]+,a[0-9]+,-1000+
> > \tli    a[0-9]+,9998336+
> > \taddi    a[0-9]+,a[0-9]+,1664+
> > \tth.mveqz    a[0-9]+,a[0-9]+,a[0-9]+
> > \tret
> >
> > against:     li    a5,9998336
> >     addi    a4,a0,-1000
> >     addi    a0,a5,1664
> >     th.mveqz    a0,a1,a4
> >     ret
> >
> > https://github.com/patrick-rivos/gcc-postcommit-ci/issues/8
> > https://github.com/ewlu/riscv-gnu-toolchain/issues/286
>
> It seems with my patch, exactly same instructions get out of order (for
> -O2/-O3) tripping up the test results and differ from say O1 for exact
> same build.
>
> -O2 w/ patch
> ConNmv_imm_imm_reg:
>      li    a5,9998336
>      addi    a4,a0,-1000
>      addi    a0,a5,1664
>      th.mveqz    a0,a1,a4
>      ret
>
> -O1 w/ patch
> ConNmv_imm_imm_reg:
>      addi    a4,a0,-1000
>      li    a5,9998336
>      addi    a0,a5,1664
>      th.mveqz    a0,a1,a4
>      ret
>
> I'm not sure if there is an easy way to handle that.
> Is there a real reason for testing the full sequences verbatim, or is
> testing number of occurrences of th.mv{eqz,nez} enough.
> It seems Jeff recently added -fno-sched-pressure to avoid similar issues
> but that apparently is no longer sufficient.

I'd suggest doing a count test rather than an exact match.

Verify you get a single li, two addis and one th.mveqz
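
A minimal sketch of what such a count-based test might look like, for a file containing only this one function. The function body is inferred from the quoted assembly (9998336 + 1664 = 10000000, and `th.mveqz` selects `a1` when `a0 - 1000` is zero), so it is an assumption and not the committed test; the `dg-options` and `dg-skip-if` lines are copied from the existing test in this thread. Note that `scan-assembler-times` counts over the whole assembly output, not per function.

```c
/* { dg-do compile } */
/* { dg-options "-march=rv64gc_xtheadcondmov" { target { rv64 } } } */
/* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */

/* Body inferred from the assembly quoted above (an assumption):
   return y when x == 1000, otherwise the constant 10000000.  */
int ConNmv_imm_imm_reg(int x, int y)
{
  if (x != 1000)
    return 10000000;
  return y;
}

/* Count occurrences instead of matching the exact instruction order,
   so scheduling differences between -O1 and -O2 no longer matter.  */
/* { dg-final { scan-assembler-times "li\t" 1 } } */
/* { dg-final { scan-assembler-times "addi\t" 2 } } */
/* { dg-final { scan-assembler-times "th.mveqz\t" 1 } } */
```

The trade-off is that a count test no longer checks operand order or register usage, only that the expected instructions appear the expected number of times.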

Jeff


[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

Andrew Pinski  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 CC||pinskia at gcc dot gnu.org
   Severity|normal  |enhancement
   Last reconfirmed||2023-10-09
 Status|UNCONFIRMED |NEW

--- Comment #3 from Andrew Pinski  ---
RTL wise we have:
Trying 6, 8 -> 9:
6: {r108:DI=r105:DI 0>>0x20;clobber flags:CC;}
  REG_UNUSED flags:CC
8: {r110:SI=r108:DI#0&0x3ff;clobber flags:CC;}
  REG_UNUSED flags:CC
  REG_DEAD r108:DI
9: {r111:SI=r110:SI<<0x14;clobber flags:CC;}
  REG_DEAD r110:SI
  REG_UNUSED flags:CC
Failed to match this instruction:
(parallel [
(set (reg:SI 111)
(and:SI (ashift:SI (subreg:SI (zero_extract:DI (reg/v:DI 105 [ bf ])
(const_int 32 [0x20])
(const_int 32 [0x20])) 0)
(const_int 20 [0x14]))
(const_int 1072693248 [0x3ff00000])))
(clobber (reg:CC 17 flags))
])

This should have been simplified.
Anyways bitfields have issues even on the gimple level as they are not lowered
until expand ...
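
To make the arithmetic concrete, here is a small C sketch of what the three insns compute and of one equivalent combined form (a single right shift by 32 - 20 = 12 followed by a mask of 0x3ff << 20 = 0x3ff00000). The function names are hypothetical, and whether combine ought to produce exactly this form is an assumption, not something stated in the comment.

```c
#include <assert.h>
#include <stdint.h>

/* Step by step, as in insns 6, 8 and 9 of the dump. */
uint32_t extract_then_shift(uint64_t bf)
{
    uint64_t hi = bf >> 32;                 /* insn 6: r108 = r105 >> 0x20 */
    uint32_t field = (uint32_t)hi & 0x3ff;  /* insn 8: r110 = r108 & 0x3ff */
    return field << 20;                     /* insn 9: r111 = r110 << 0x14 */
}

/* One combined form: shift right by 12, keep bits 20..29. */
uint32_t combined(uint64_t bf)
{
    return (uint32_t)(bf >> 12) & 0x3ff00000u;
}

/* Self-check on a few sample values. */
void check_shift_forms(void)
{
    assert(extract_then_shift(0x000003ff00000000ULL) == 0x3ff00000u);
    assert(combined(0xffffffffffffffffULL) == extract_then_shift(0xffffffffffffffffULL));
    assert(combined(0x0000012345678abcULL) == extract_then_shift(0x0000012345678abcULL));
}
```

Both functions move bits 32..41 of the input to bits 20..29 of the result and clear everything else.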

xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA

2023-10-09 Thread Vineet Gupta

Hi Christoph,

On 10/9/23 12:06, Patrick O'Neill wrote:


Hi Vineet,

We're seeing a regression on all riscv targets after this patch:

FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O2 
check-function-bodies ConNmv_imm_imm_reg
FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O3 -g 
check-function-bodies ConNmv_imm_imm_reg


Debug log output:
body: \taddi    a[0-9]+,a[0-9]+,-1000+
\tli    a[0-9]+,9998336+
\taddi    a[0-9]+,a[0-9]+,1664+
\tth.mveqz    a[0-9]+,a[0-9]+,a[0-9]+
\tret

against:     li    a5,9998336
    addi    a4,a0,-1000
    addi    a0,a5,1664
    th.mveqz    a0,a1,a4
    ret

https://github.com/patrick-rivos/gcc-postcommit-ci/issues/8
https://github.com/ewlu/riscv-gnu-toolchain/issues/286



It seems with my patch, exactly same instructions get out of order (for 
-O2/-O3) tripping up the test results and differ from say O1 for exact 
same build.


-O2 w/ patch
ConNmv_imm_imm_reg:
    li    a5,9998336
    addi    a4,a0,-1000
    addi    a0,a5,1664
    th.mveqz    a0,a1,a4
    ret

-O1 w/ patch
ConNmv_imm_imm_reg:
    addi    a4,a0,-1000
    li    a5,9998336
    addi    a0,a5,1664
    th.mveqz    a0,a1,a4
    ret

I'm not sure if there is an easy way to handle that.
Is there a real reason for testing the full sequences verbatim, or is 
testing number of occurrences of th.mv{eqz,nez} enough.
It seems Jeff recently added -fno-sched-pressure to avoid similar issues 
but that apparently is no longer sufficient.


Thx,
-Vineet


Thanks,
Patrick

On 10/6/23 11:22, Vineet Gupta wrote:

Vlad recently introduced a new gate @ira_in_progress, similar to
counterparts @{reload,lra}_in_progress.

Use this to hide the constant synthesis splitter from being recog* ()
by IRA register equivalence logic which is eager to undo the splits,
generating worse code for constants (and sometimes no code at all).

See PR/109279 (large constant), PR/110748 (const -0.0) ...

Granted the IRA logic is subsided with -fsched-pressure which is now
enabled for RISC-V backend, the gate makes this future-proof in
addition to helping with -O1 etc.

This fixes 1 additional test

= Summary of gcc testsuite =
 | # of unexpected case / # of unique unexpected case
 |  gcc |  g++ | gfortran |

rv32imac/  ilp32/ medlow |  416 /   103 |   13 / 6 |   67 /12 |
  rv32imafdc/ ilp32d/ medlow |  416 /   103 |   13 / 6 |   24 / 4 |
rv64imac/   lp64/ medlow |  417 /   104 |9 / 3 |   67 /12 |
  rv64imafdc/  lp64d/ medlow |  416 /   103 |5 / 2 |6 / 1 |

Also similar to v1, this doesn't move RISC-V SPEC scores at all.

gcc/ChangeLog:
* config/riscv/riscv.md (mvconst_internal): Add !ira_in_progress.

Suggested-by: Jeff Law
Signed-off-by: Vineet Gupta
---
  gcc/config/riscv/riscv.md | 9 ++---
  1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 1ebe8f92284d..da84b9357bd3 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1997,13 +1997,16 @@
  
  ;; Pretend to have the ability to load complex const_int in order to get

  ;; better code generation around them.
-;;
  ;; But avoid constants that are special cased elsewhere.
+;;
+;; Hide it from IRA register equiv recog* () to elide potential undoing of split
+;;
  (define_insn_and_split "*mvconst_internal"
[(set (match_operand:GPR 0 "register_operand" "=r")
  (match_operand:GPR 1 "splittable_const_int_operand" "i"))]
-  "!(p2m1_shift_operand (operands[1], mode)
- || high_mask_shift_operand (operands[1], mode))"
+  "!ira_in_progress
+   && !(p2m1_shift_operand (operands[1], mode)
+|| high_mask_shift_operand (operands[1], mode))"
"#"
"&& 1"
[(const_int 0)]




[Bug fortran/67740] Wrong association status of allocatable character pointer in derived types

2023-10-09 Thread anlauf at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67740

--- Comment #10 from anlauf at gcc dot gnu.org ---
(In reply to anlauf from comment #9)

Addendum:

> I was suspecting gfc_conv_variable as a possibly further place for a fix:
> it has a loop over ref's that looks incomplete for REF_COMPONENT.

I tried my version of a patch in that place, which worked for the testcases
here but gave wrong code already for slightly more complex pointer assignments,
like

  type(pointer_typec0_t) :: co, xo
...
  xo%data1 => co%data1

so let's go with your patch.

[Bug target/111745] [14 Regression] ICE: in extract_insn, at recog.cc:2791 (unrecognizable insn) with -ffloat-store -mavx512fp16 -mavx512vl

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111745

Andrew Pinski  changed:

   What|Removed |Added

   Keywords||needs-bisection
   Target Milestone|--- |14.0

Re: [PATCH] c++: Improve diagnostics for constexpr cast from void*

2023-10-09 Thread Jason Merrill

On 10/9/23 06:03, Nathaniel Shead wrote:

Bootstrapped and regtested on x86_64-pc-linux-gnu with
GXX_TESTSUITE_STDS=98,11,14,17,20,23,26,impcx.

-- >8 --

This patch improves the errors given when casting from void* in C++26 to
include the expected type if the type of the pointed-to object was
not similar to the casted-to type.

It also ensures (for all standard modes) that void* casts are checked
even for DECL_ARTIFICIAL declarations, such as lifetime-extended
temporaries, and is only ignored for cases where we know it's OK (heap
identifiers and source_location::current). This provides more accurate
diagnostics when using the pointer and ensures that some other casts
from void* are now correctly rejected.

gcc/cp/ChangeLog:

* constexpr.cc (is_std_source_location_current): New.
(cxx_eval_constant_expression): Only ignore cast from void* for
specific cases and improve other diagnostics.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-cast4.C: New test.

Signed-off-by: Nathaniel Shead 
---
  gcc/cp/constexpr.cc  | 83 +---
  gcc/testsuite/g++.dg/cpp0x/constexpr-cast4.C |  7 ++
  2 files changed, 78 insertions(+), 12 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-cast4.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 0f948db7c2d..f38d541a662 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -2301,6 +2301,36 @@ is_std_allocator_allocate (const constexpr_call *call)
  && is_std_allocator_allocate (call->fundef->decl));
  }
  
+/* Return true if FNDECL is std::source_location::current.  */

+
+static inline bool
+is_std_source_location_current (tree fndecl)
+{
+  if (!decl_in_std_namespace_p (fndecl))
+return false;
+
+  tree name = DECL_NAME (fndecl);
+  if (name == NULL_TREE || !id_equal (name, "current"))
+return false;
+
+  tree ctx = DECL_CONTEXT (fndecl);
+  if (ctx == NULL_TREE || !CLASS_TYPE_P (ctx) || !TYPE_MAIN_DECL (ctx))
+return false;
+
+  name = DECL_NAME (TYPE_MAIN_DECL (ctx));
+  return name && id_equal (name, "source_location");
+}
+
+/* Overload for the above taking constexpr_call*.  */
+
+static inline bool
+is_std_source_location_current (const constexpr_call *call)
+{
+  return (call
+ && call->fundef
+ && is_std_source_location_current (call->fundef->decl));
+}
+
  /* Return true if FNDECL is __dynamic_cast.  */
  
  static inline bool

@@ -7850,33 +7880,62 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t,
if (TYPE_PTROB_P (type)
&& TYPE_PTR_P (TREE_TYPE (op))
&& VOID_TYPE_P (TREE_TYPE (TREE_TYPE (op)))
-   /* Inside a call to std::construct_at or to
-  std::allocator::{,de}allocate, we permit casting from void*
+   /* Inside a call to std::construct_at,
+  std::allocator::{,de}allocate, or
+  std::source_location::current, we permit casting from void*
   because that is compiler-generated code.  */
&& !is_std_construct_at (ctx->call)
-   && !is_std_allocator_allocate (ctx->call))
+   && !is_std_allocator_allocate (ctx->call)
+   && !is_std_source_location_current (ctx->call))
  {
/* Likewise, don't error when casting from void* when OP is
uninit and similar.  */
tree sop = tree_strip_nop_conversions (op);
-   if (TREE_CODE (sop) == ADDR_EXPR
-   && VAR_P (TREE_OPERAND (sop, 0))
-   && DECL_ARTIFICIAL (TREE_OPERAND (sop, 0)))
+   tree decl = NULL_TREE;
+   if (TREE_CODE (sop) == ADDR_EXPR)
+ decl = TREE_OPERAND (sop, 0);
+   if (decl
+   && VAR_P (decl)
+   && DECL_ARTIFICIAL (decl)
+   && (DECL_NAME (decl) == heap_identifier
+   || DECL_NAME (decl) == heap_uninit_identifier
+   || DECL_NAME (decl) == heap_vec_identifier
+   || DECL_NAME (decl) == heap_vec_uninit_identifier))
  /* OK */;
/* P2738 (C++26): a conversion from a prvalue P of type "pointer to
   cv void" to a pointer-to-object type T unless P points to an
   object whose type is similar to T.  */
-   else if (cxx_dialect > cxx23
-&& (sop = cxx_fold_indirect_ref (ctx, loc,
- TREE_TYPE (type), sop)))
+   else if (cxx_dialect > cxx23)
  {
-   r = build1 (ADDR_EXPR, type, sop);
-   break;
+   r = cxx_fold_indirect_ref (ctx, loc, TREE_TYPE (type), sop);
+   if (r)
+ {
+   r = build1 (ADDR_EXPR, type, r);
+   break;
+ }
+   if (!ctx->quiet)
+ {
+   if (TREE_CODE (sop) == ADDR_EXPR)
+ {
+   

Re: [PATCH v1 1/4] options: Define TARGET__P and TARGET__OPTS_P macro for Mask and InverseMask

2023-10-09 Thread Kito Cheng
> Doesn't this need to be updated to avoid multi-dimensional arrays in awk
> and rebased?

Oh, yeah, I should update that; it was posted before that issue was
reported. Let me send v2 soon :P


[Bug tree-optimization/111694] [13/14 Regression] Wrong behavior for signbit of negative zero when optimizing

2023-10-09 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111694

--- Comment #9 from Andrew Macleod  ---
(In reply to Andrew Macleod from comment #8)
> (In reply to Alexander Monakov from comment #7)
> > No backport for gcc-13 planned?
> 
> mmm, didn't realize we were propagating floating point equivalences around
> in 13.  similar patch should work there

Testing same patch on gcc13. will let it settle on trunk for a day or two
first, then check it in if nothing shows up. which it shouldn't :-)
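
For context, a small illustration of why equality-based propagation of floating-point values is delicate. This is Python, not the testcase from the PR, and signbit is a helper defined here: 0.0 and -0.0 compare equal, yet their sign bits differ, so replacing one by the other on the strength of x == y can change the result of a signbit test.

```python
import math

def signbit(x):
    # True when the sign bit of x is set, including for -0.0.
    return math.copysign(1.0, x) < 0

pos, neg = 0.0, -0.0
print(pos == neg)
print(signbit(pos), signbit(neg))
```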

[Bug target/111746] New: [14 Regression] ICE: infinite recursion in try_split (emit-rtl.cc:3972) at -O2

2023-10-09 Thread zsojka at seznam dot cz via Gcc-bugs
--with-cloog --with-ppl --with-isl
--with-sysroot=/usr/powerpc64le-unknown-linux-gnu --build=x86_64-pc-linux-gnu
--host=x86_64-pc-linux-gnu --target=powerpc64le-unknown-linux-gnu
--with-ld=/usr/bin/powerpc64le-unknown-linux-gnu-ld
--with-as=/usr/bin/powerpc64le-unknown-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231009 (experimental) (GCC) 


The build breaks at:
$ make
/bin/sh ../libtool --tag CC --tag disable-shared  --mode=compile
/repo/build-gcc-trunk-powerpc64le/./gcc/xgcc
-B/repo/build-gcc-trunk-powerpc64le/./gcc/
-B/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/bin/
-B/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/lib/
-isystem
/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/include
-isystem
/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/sys-include
   -DHAVE_CONFIG_H -I.. -I/repo/gcc-trunk/libstdc++-v3/../libiberty
-I/repo/gcc-trunk/libstdc++-v3/../include -prefer-pic -D_GLIBCXX_SHARED
-I/repo/build-gcc-trunk-powerpc64le/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu
-I/repo/build-gcc-trunk-powerpc64le/powerpc64le-unknown-linux-gnu/libstdc++-v3/include
-I/repo/gcc-trunk/libstdc++-v3/libsupc++ -g -O2  -DIN_GLIBCPP_V3 -Wno-error
-c cp-demangle.c
libtool: compile:  /repo/build-gcc-trunk-powerpc64le/./gcc/xgcc
-B/repo/build-gcc-trunk-powerpc64le/./gcc/
-B/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/bin/
-B/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/lib/
-isystem
/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/include
-isystem
/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/sys-include
-DHAVE_CONFIG_H -I.. -I/repo/gcc-trunk/libstdc++-v3/../libiberty
-I/repo/gcc-trunk/libstdc++-v3/../include -D_GLIBCXX_SHARED
-I/repo/build-gcc-trunk-powerpc64le/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu
-I/repo/build-gcc-trunk-powerpc64le/powerpc64le-unknown-linux-gnu/libstdc++-v3/include
-I/repo/gcc-trunk/libstdc++-v3/libsupc++ -g -O2 -DIN_GLIBCPP_V3 -Wno-error -c
cp-demangle.c  -fPIC -DPIC -o cp-demangle.o
xgcc: internal compiler error: Segmentation fault signal terminated program cc1
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
See <https://gcc.gnu.org/bugs/> for instructions.
make: *** [Makefile:970: cp-demangle.lo] Error 1

[Bug target/111745] New: [14 Regression] ICE: in extract_insn, at recog.cc:2791 (unrecognizable insn) with -ffloat-store -mavx512fp16 -mavx512vl

2023-10-09 Thread zsojka at seznam dot cz via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111745

Bug ID: 111745
   Summary: [14 Regression] ICE: in extract_insn, at recog.cc:2791
(unrecognizable insn) with -ffloat-store -mavx512fp16
-mavx512vl
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: target
  Assignee: unassigned at gcc dot gnu.org
  Reporter: zsojka at seznam dot cz
  Target Milestone: ---
  Host: x86_64-pc-linux-gnu
Target: x86_64-pc-linux-gnu

Created attachment 56083
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56083&action=edit
reduced testcase

Compiler output:
$ x86_64-pc-linux-gnu-gcc -ffloat-store -mavx512fp16 -mavx512vl testcase.c 
testcase.c: In function 'foo':
testcase.c:8:1: error: unrecognizable insn:
8 | }
  | ^
(insn 54 53 55 2 (set (reg:V8HF 136)
(vec_concat:V8HF (mem:V4HF (plus:DI (reg/f:DI 93 virtual-stack-vars)
(const_int -16 [0xfff0])) [1  S8 A64])
(reg:V4HF 139))) "testcase.c":7:5 -1
 (nil))
during RTL pass: vregs
testcase.c:8:1: internal compiler error: in extract_insn, at recog.cc:2791
0x7e765b _fatal_insn(char const*, rtx_def const*, char const*, int, char
const*)
/repo/gcc-trunk/gcc/rtl-error.cc:108
0x7e76d8 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
/repo/gcc-trunk/gcc/rtl-error.cc:116
0x7d63fd extract_insn(rtx_insn*)
/repo/gcc-trunk/gcc/recog.cc:2791
0x10e7995 instantiate_virtual_regs_in_insn
/repo/gcc-trunk/gcc/function.cc:1610
0x10e7995 instantiate_virtual_regs
/repo/gcc-trunk/gcc/function.cc:1983
0x10e7995 execute
/repo/gcc-trunk/gcc/function.cc:2030
Please submit a full bug report, with preprocessed source (by using
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

$ x86_64-pc-linux-gnu-gcc -v
Using built-in specs.
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc
COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++
--enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra
--disable-bootstrap --with-cloog --with-ppl --with-isl
--build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu
--target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld
--with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch
--prefix=/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-nobootstrap-amd64
Thread model: posix
Supported LTO compression algorithms: zlib zstd
gcc version 14.0.0 20231009 (experimental) (GCC)

[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

--- Comment #2 from Andi Kleen  ---
Okay, then it doesn't understand that SHL_signed and SHR_unsigned can be
combined when one of the values came from a shorter unsigned.

[Bug rtl-optimization/111744] New: Missed optimization when casting rdtsc into uint32_t and computing difference

2023-10-09 Thread stefan.sakalik at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111744

Bug ID: 111744
   Summary: Missed optimization when casting rdtsc into uint32_t
and computing difference
   Product: gcc
   Version: 14.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: stefan.sakalik at gmail dot com
  Target Milestone: ---

This is similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92180 where

this code: https://godbolt.org/z/7W9nqTsjE

#include <cstdint>
#include <x86intrin.h>

uint32_t rdtsc32() { return static_cast<uint32_t>(__rdtsc()); }

uint64_t rdtsc_delta(uint64_t x) {
return rdtsc32() - rdtsc32();
}

Produces 

rdtsc_delta(unsigned long):
rdtsc
mov rcx, rax
sal rdx, 32
or  rcx, rdx
rdtsc
sub ecx, eax
mov rax, rcx
ret

as opposed to clang version

rdtsc_delta(unsigned long):
rdtsc
mov rcx, rax
rdtsc
sub ecx, eax
mov rax, rcx
ret
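
The reason the sal/or pair is redundant can be checked with plain modular arithmetic. The sketch below is Python, not compiler output, and the function names are made up. It models the first rdtsc result as (lo1, hi1) and the second as lo2, and shows that merging hi1 into the 64-bit value has no effect once the subtraction is done in 32 bits.

```python
MASK32 = 0xFFFFFFFF

def delta_with_merge(lo1, hi1, lo2):
    # Mirrors the GCC sequence: build hi:lo first, then a 32-bit subtract.
    full = (hi1 << 32) | lo1
    return (full - lo2) & MASK32

def delta_low_only(lo1, lo2):
    # Mirrors the clang sequence: only the low halves are used.
    return (lo1 - lo2) & MASK32

print(delta_with_merge(5, 0xDEADBEEF, 7) == delta_low_only(5, 7))
```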

Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.

2023-10-09 Thread Jeff Law




On 10/9/23 00:18, Jin Ma wrote:


+;; The conversion of DF to BF needs to be done with SF if there is a
+;; chance to generate at least one instruction, otherwise just using
+;; libfunc __truncdfbf2.
+(define_expand "truncdfbf2"
+  [(set (match_operand:BF 0 "register_operand" "=f")
+   (float_truncate:BF
+   (match_operand:DF 1 "register_operand" " f")))]
+  "TARGET_DOUBLE_FLOAT || TARGET_ZDINX"
+  {
+convert_move (operands[0],
+ convert_modes (SFmode, DFmode, operands[1], 0), 0);
+DONE;
+  })

So for conversions to/from BFmode, doesn't generic code take care of
this for us?  Search for convert_mode_scalar in expr.cc. That code will
utilize SFmode as an intermediate step just like your expander.   Is
there some reason that generic code is insufficient?

Similarly for the other conversions.


As far as I can see, the function 'convert_mode_scalar' doesn't seem to be
perfect for dealing with the conversions to/from BFmode. It can only handle
BF to HF, SF, DF and SF to BF well, but the rest of the conversions are left
without any processing, directly using the libcall.

Maybe I should choose to enhance its functionality? This seems to be a
good choice, I'm not sure.

My recollection was that BF could be converted to/from SF trivially and
if we wanted BF->DF we'd first convert to SF, then to DF.

Direct BF<->DF conversions aren't actually important from a performance 
standpoint.  So it's OK if they have an extra step IMHO.
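
To illustrate why BF<->SF is the cheap step: a bfloat16 value is the upper 16 bits of an IEEE binary32 value, so widening is a 16-bit shift. The sketch below is Python with made-up helper names. Note the narrowing helper truncates, which is an assumption made for brevity; real conversions round.

```python
import struct

def bf16_bits_to_float(bits):
    """Widen a bfloat16 bit pattern to binary32 by appending 16 zero bits."""
    return struct.unpack("<f", struct.pack("<I", (bits & 0xFFFF) << 16))[0]

def float_to_bf16_bits_truncate(x):
    """Narrow to bfloat16 by dropping the low 16 bits of the binary32
    encoding (truncation only, no rounding)."""
    return struct.unpack("<I", struct.pack("<f", x))[0] >> 16

# DF -> BF in two steps: Python floats are doubles, struct.pack("<f", ...)
# performs the DF -> SF step, and the shift performs SF -> BF.
print(hex(float_to_bf16_bits_truncate(1.0)))
print(bf16_bits_to_float(0x4049))
```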


jeff


Re: [pushed] analyzer: improvements to out-of-bounds diagrams [PR111155]

2023-10-09 Thread David Malcolm
On Mon, 2023-10-09 at 17:01 +0200, Tobias Burnus wrote:
> Hi David,
> 
> On 09.10.23 16:08, David Malcolm wrote:
> > On Mon, 2023-10-09 at 12:09 +0200, Tobias Burnus wrote:
> > > The following works:
> > > (A) Using "kind == boundaries::kind::HARD" - i.e. adding
> > > "boundaries::"
> > > (B) Renaming the parameter name "kind" to something else - like
> > > "k"
> > > as used
> > >   in the other functions.
> > > 
> > > Can you fix it?
> > Sorry about the breakage, and thanks for the investigation.
> Well, without an older compiler, one does not see it. It also worked
> flawlessly on my laptop today.
> > Does the following patch fix the build for you?
> 
> Yes – as mentioned either of the variants above should work and (A)
> is
> what you have in your patch.
> 
> And it is what I actually tried for the full build. Hence, yes, it
> works :-)

Thanks!

I've pushed this to trunk as r14-4521-g08d0f840dc7ad2.



Odd Python errors in the G++ testsuite

2023-10-09 Thread Maciej W. Rozycki
Hi,

 I'm seeing these tracebacks for several cases across the G++ testsuite:

Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)"   
 (timeout = 300)
spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6)
rules/0/primary-output is ok: p1689-1.o
rules/0/provides/0/logical-name is ok: foo
rules/0/provides/0/is-interface is ok: True
Traceback (most recent call last):
  File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 218, in <module>
    is_ok = validate_p1689(actual, expect)
  File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 182, in validate_p1689
    return compare_json([], actual_json, expect_json)
  File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in compare_json
    is_ok = _compare_object(path, actual, expect)
  File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in _compare_object
    sub_error = compare_json(path + [key], actual[key], expect[key])
  File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 151, in compare_json
    is_ok = _compare_array(path, actual, expect)
  File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 87, in _compare_array
    sub_error = compare_json(path + [str(idx)], a, e)
  File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in compare_json
    is_ok = _compare_object(path, actual, expect)
  File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in _compare_object
    sub_error = compare_json(path + [key], actual[key], expect[key])
  File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 149, in compare_json
    actual = set(actual)
TypeError: unhashable type: 'dict'

and also these intermittent failures for other cases:

Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)"   
 (timeout = 300)
spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6)
rules/0/primary-output is ok: p1689-2.o
rules/0/provides/0/logical-name is ok: foo:part1
rules/0/provides/0/is-interface is ok: True
ERROR: length mismatch at rules/0/requires: actual: "1" expect: "0"
version is ok: 0
revision is ok: 0
FAIL: ERROR: length mismatch at rules/0/requires: actual: "1" expect: "0"

This does seem to me like something not working as intended.  As a Python
non-expert I have trouble concluding what is going on here and whether
these tracebacks are indeed supposed to be there, or whether it is a sign 
of a problem.  And these failures I don't even know where they come from.  

 Does anyone know?  Is there a way to run the offending commands by hand?  
The relevant invocation lines do not appear in the test log file for one 
to copy and paste, which I think is not the right way of doing things in 
our environment.

 These issues seem independent from the test host environment as I can see 
them on both a `powerpc64le-linux-gnu' and an `x86_64-linux-gnu' machine 
in `riscv64-linux-gnu' target testing.
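
The TypeError itself is easy to reproduce outside the testsuite. In the sketch below, try_set is a made-up helper and not part of test-p1689.py. It shows that set() accepts a list of strings but not a list of dicts, which would be consistent with compare_json calling set(actual) on an array of JSON objects; that reading of the script is an assumption based only on the traceback.

```python
def try_set(items):
    # Return the set on success, or the error text if an element is unhashable.
    try:
        return set(items)
    except TypeError as exc:
        return str(exc)

print(try_set(["foo", "bar"]))
print(try_set([{"logical-name": "foo"}]))
```

Running a snippet like this by hand with python3 is also a quick way to confirm that the interpreter, rather than the test host, is what produces the message.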

  Maciej


[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

--- Comment #1 from Andrew Pinski  ---
Remember that types smaller than int are promoted to int.

[Bug middle-end/111743] New: shifts in bit field accesses don't combine with other shifts

2023-10-09 Thread andi-gcc at firstfloor dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

Bug ID: 111743
   Summary: shifts in bit field accesses don't combine with other
shifts
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: andi-gcc at firstfloor dot org
  Target Milestone: ---

(not sure it's the middle-end, picked arbitrarily)

The following code

struct bf { 
unsigned a : 10, b : 20, c : 10;
};
unsigned fbc(struct bf bf) { return bf.b | (bf.c << 20); }


generates:

        movq    %rdi, %rax
        shrq    $10, %rdi
        shrq    $32, %rax
        andl    $1048575, %edi
        andl    $1023, %eax
        sall    $20, %eax
        orl     %edi, %eax
        ret

It doesn't understand that the shift right can be combined with the shift left.
Also not sure why the shift left is arithmetic (this should be all unsigned) 

clang does the simplification which ends up one instruction shorter:
        movl    %edi, %eax
        shrl    $10, %eax
        andl    $1048575, %eax          # imm = 0xFFFFF
        shrq    $12, %rdi
        andl    $1072693248, %edi       # imm = 0x3FF00000
        orl     %edi, %eax
        retq
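
The two sequences can be checked against each other with a short Python model. It assumes the layout implied by the assembly above: the struct arrives in one 64-bit register, with b in bits 10..29 and c in bits 32..41. The function names are made up.

```python
def gcc_form(x):
    # Extract b and c separately, then shift c into place.
    b = (x >> 10) & 1048575
    c = (x >> 32) & 1023
    return b | (c << 20)

def clang_form(x):
    # Fold "shift right by 32, shift left by 20" into one shift right by 12
    # with the mask moved accordingly (1072693248 == 1023 << 20).
    return ((x >> 10) & 1048575) | ((x >> 12) & 1072693248)

samples = [0, (1 << 64) - 1, 0x123456789ABCDEF0, 1023 << 32, 1048575 << 10]
print(all(gcc_form(x) == clang_form(x) for x in samples))
```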

Re: [PATCH] wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]

2023-10-09 Thread Jakub Jelinek
On Mon, Oct 09, 2023 at 03:44:10PM +0200, Jakub Jelinek wrote:
> Thanks, just quick answers, will work on patch adjustments after trying to
> get rid of rwide_int (seems dwarf2out has very limited needs from it, just
> some routine to construct it in GCed memory (and never change afterwards)
> from const wide_int_ref & or so, and then working operator ==,
> get_precision, elt, get_len and get_val methods, so I think we could just
> have a struct dw_wide_int { unsigned int prec, len; HOST_WIDE_INT val[1]; };
> and perform the methods on it after converting to a storage ref.

Now in patch form (again, incremental).

> > Does the variable-length memcpy pay for itself?  If so, perhaps that's a
> > sign that we should have a smaller inline buffer for this class (say 2 
> > HWIs).
> 
> Guess I'll try to see what results in smaller .text size.

I've left the memcpy changes into a separate patch (incremental, attached).
Seems that second patch results in .text growth by 16256 bytes (0.04%),
though I'd bet it probably makes compile time tiny bit faster because it
replaces an out of line memcpy (caused by variable length) with inlined one.

With even the third one it shrinks by 84544 bytes (0.21% down), but the
extra statistics patch then shows massive number of allocations after
running make check-gcc check-g++ check-gfortran for just a minute or two.
On the widest_int side, I see (first number from sort | uniq -c | sort -nr,
second the estimated or final len)
7289034 4
 173586 5
  21819 6
i.e. there are tons of widest_ints which need len 4 (or perhaps just
have it as upper estimation), maybe even 5 would be nice.
On the wide_int side, I see
 155291 576
(supposedly because of bound_wide_int, where we create wide_int_ref from
the 576-bit precision bound_wide_int and then create 576-bit wide_int when
using unary or binary operation on that).

So, perhaps we could get away with say WIDEST_INT_MAX_INL_ELTS of 5 or 6
instead of 9 but keep WIDE_INT_MAX_INL_ELTS at 9 (or whatever is computed
from MAX_BITSIZE_MODE_ANY_INT?).  Or keep it at 9 for both (i.e. without
the third patch).

--- gcc/poly-int.h.jj   2023-10-09 14:37:45.883940062 +0200
+++ gcc/poly-int.h  2023-10-09 17:05:26.629828329 +0200
@@ -96,7 +96,7 @@ struct poly_coeff_traits
-struct poly_coeff_traits
+struct poly_coeff_traits
 {
   typedef WI_UNARY_RESULT (T) result;
   typedef int int_type;
@@ -110,14 +110,13 @@ struct poly_coeff_traits
-struct poly_coeff_traits
+struct poly_coeff_traits
 {
   typedef WI_UNARY_RESULT (T) result;
   typedef int int_type;
   /* These types are always signed.  */
   static const int signedness = 1;
   static const int precision = wi::int_traits::precision;
-  static const int inl_precision = wi::int_traits::inl_precision;
   static const int rank = precision * 2 / CHAR_BIT;
 
   template
--- gcc/double-int.h.jj 2023-01-02 09:32:22.747280053 +0100
+++ gcc/double-int.h2023-10-09 17:06:03.446317336 +0200
@@ -440,7 +440,7 @@ namespace wi
   template <>
   struct int_traits 
   {
-static const enum precision_type precision_type = CONST_PRECISION;
+static const enum precision_type precision_type = INL_CONST_PRECISION;
 static const bool host_dependent_precision = true;
 static const unsigned int precision = HOST_BITS_PER_DOUBLE_INT;
 static unsigned int get_precision (const double_int &);
--- gcc/wide-int.h.jj   2023-10-09 16:06:39.326805176 +0200
+++ gcc/wide-int.h  2023-10-09 17:29:20.016951691 +0200
@@ -343,8 +343,8 @@ template  class widest_int_storag
 
 typedef generic_wide_int  wide_int;
 typedef FIXED_WIDE_INT (ADDR_MAX_PRECISION) offset_int;
-typedef generic_wide_int  > 
widest_int;
-typedef generic_wide_int  
> widest2_int;
+typedef generic_wide_int  > 
widest_int;
+typedef generic_wide_int  > 
widest2_int;
 
 /* wi::storage_ref can be a reference to a primitive type,
so this is the conservatively-correct setting.  */
@@ -394,13 +394,13 @@ namespace wi
 /* The integer has a variable precision but no defined signedness.  */
 VAR_PRECISION,
 
-/* The integer has a constant precision (known at GCC compile time)
-   and is signed.  */
-CONST_PRECISION,
-
-/* Like CONST_PRECISION, but with WIDEST_INT_MAX_PRECISION or larger
-   precision where not all elements of arrays are always present.  */
-WIDEST_CONST_PRECISION
+/* The integer has a constant precision (known at GCC compile time),
+   is signed and all elements are in inline buffer.  */
+INL_CONST_PRECISION,
+
+/* Like INL_CONST_PRECISION, but elements can be heap allocated for
+   larger lengths.  */
+CONST_PRECISION
   };
 
   /* This class, which has no default implementation, is expected to
@@ -410,15 +410,10 @@ namespace wi
Classifies the type of T.
 
  static const unsigned int precision;
-   Only defined if precision_type == CONST_PRECISION or
-   precision_type == WIDEST_CONST_PRECISION.  Specifies the
+   Only defined if precision_type == 

[Bug c++/111742] Misaligned generated code with MI using aligned virtual base

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111742

--- Comment #3 from Andrew Pinski  ---
Then it is a dup of bug 71644.

*** This bug has been marked as a duplicate of bug 71644 ***

[Bug c++/71644] gcc 6.1 generates movaps for unaligned memory

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71644

Andrew Pinski  changed:

   What|Removed |Added

 CC||cuzdav at gmail dot com

--- Comment #3 from Andrew Pinski  ---
*** Bug 111742 has been marked as a duplicate of this bug. ***

[Bug c++/111742] Misaligned generated code with MI using aligned virtual base

2023-10-09 Thread cuzdav at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111742

--- Comment #2 from Chris Uzdavinis  ---
No, this is not a ubsan report.
Code *crashes* and I thought showing the UBsan warning was enough to
demonstrate it.
A minimal change to make the code crash instead of just report ubsan errors:


struct X {
  void * a = nullptr;
  void * b = nullptr;
};

struct alignas(16) AlignedData { };

struct A : virtual AlignedData {
int x = 0;   // << add this
  X xxx;
int& ref = x;// << and this
};

struct B : virtual AlignedData {};

struct Test : B, A {};

Test* t = new Test;

int main() {}


*** SEGFAULT ***

https://godbolt.org/z/f57vs7jxP

Re: [PATCH] sso-string@gnu-versioned-namespace [PR83077]

2023-10-09 Thread François Dumont



On 09/10/2023 16:42, Iain Sandoe wrote:

Hi François,


On 7 Oct 2023, at 20:32, François Dumont  wrote:

I've been told that previous patch generated with 'git diff -b' was not 
applying properly so here is the same patch again with a simple 'git diff'.

Thanks, that did fix it - There are some trailing whitespaces in the config 
files, but I suspect that they need to be there since those have values 
appended during the configuration.


You're talking about the ones coming from regenerated Makefile.in and 
configure I guess. I prefer not to edit those, those trailing 
whitespaces are already in.





Anyway, with this + the coroutines and contract v2 (weak def) fix, plus a local 
patch to enable versioned namespace on Darwin, I get results comparable with 
the non-versioned case - but one more patchlet is needed on yours (to allow
for targets using emulated TLS):

diff --git a/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver 
b/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
index 9fab8bead15..b7167fc0c2f 100644
--- a/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
+++ b/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver
@@ -78,6 +78,7 @@ GLIBCXX_8.0 {
  
  # thread/mutex/condition_variable/future

  __once_proxy;
+__emutls_v._ZNSt3__81?__once_call*;


I can add this one, sure, even if it could be part of a dedicated patch. 
I'm surprised that we do not need the __once_callable emul symbol too, 
it would be more consistent with the non-versioned mode.


I'm pretty sure there are a bunch of other symbols missing, but this 
mode is seldom tested...


  
  # std::__convert_to_v

  _ZNSt3__814__convert_to_v*;


thanks
Iain



On 07/10/2023 14:25, François Dumont wrote:

Hi

Here is a rebased version of this patch.

There are few test failures when running 'make check-c++' but nothing new.

Still, there are 2 patches awaiting validation to fix some of them, PR 
c++/111524 to fix another bunch and I fear that we will have to live with the 
others.

 libstdc++: [_GLIBCXX_INLINE_VERSION] Use cxx11 abi [PR83077]

 Use cxx11 abi when activating versioned namespace mode. To do so, support
 a new configuration mode where !_GLIBCXX_USE_DUAL_ABI and
 _GLIBCXX_USE_CXX11_ABI.

 The main change is that std::__cow_string is now defined whenever 
_GLIBCXX_USE_DUAL_ABI
 or _GLIBCXX_USE_CXX11_ABI is true. Implementation is using available 
std::string in
 case of dual abi and a subset of it when it's not.

 On the other side std::__sso_string is defined only when 
_GLIBCXX_USE_DUAL_ABI is true
 and _GLIBCXX_USE_CXX11_ABI is false. Meaning that std::__sso_string is a 
typedef for the
 cow std::string implementation when dual abi is disabled and cow string is 
being used.

 libstdcxx-v3/ChangeLog:

 PR libstdc++/83077
 * acinclude.m4 [GLIBCXX_ENABLE_LIBSTDCXX_DUAL_ABI]: Default to 
"new" libstdcxx abi
 when enable_symvers is gnu-versioned-namespace.
 * config/locale/dragonfly/monetary_members.cc 
[!_GLIBCXX_USE_DUAL_ABI]: Define money_base
 members.
 * config/locale/generic/monetary_members.cc 
[!_GLIBCXX_USE_DUAL_ABI]: Likewise.
 * config/locale/gnu/monetary_members.cc [!_GLIBCXX_USE_DUAL_ABI]: 
Likewise.
 * config/locale/gnu/numeric_members.cc
 [!_GLIBCXX_USE_DUAL_ABI](__narrow_multibyte_chars): Define.
 * configure: Regenerate.
 * include/bits/c++config
 [_GLIBCXX_INLINE_VERSION](_GLIBCXX_NAMESPACE_CXX11, 
_GLIBCXX_BEGIN_NAMESPACE_CXX11):
 Define empty.
[_GLIBCXX_INLINE_VERSION](_GLIBCXX_END_NAMESPACE_CXX11, 
_GLIBCXX_DEFAULT_ABI_TAG):
 Likewise.
 * include/bits/cow_string.h [!_GLIBCXX_USE_CXX11_ABI]: Define a 
light version of COW
 basic_string as __std_cow_string for use in stdexcept.
 * include/std/stdexcept [_GLIBCXX_USE_CXX11_ABI]: Define 
__cow_string.
 (__cow_string(const char*)): New.
 (__cow_string::c_str()): New.
 * python/libstdcxx/v6/printers.py (StdStringPrinter::__init__): 
Set self.new_string to True
 when std::__8::basic_string type is found.
 * src/Makefile.am 
[ENABLE_SYMVERS_GNU_NAMESPACE](ldbl_alt128_compat_sources): Define empty.
 * src/Makefile.in: Regenerate.
 * src/c++11/Makefile.am (cxx11_abi_sources): Rename into...
 (dual_abi_sources): ...this. Also move cow-local_init.cc, 
cxx11-hash_tr1.cc,
 cxx11-ios_failure.cc entries to...
 (sources): ...this.
 (extra_string_inst_sources): Move cow-fstream-inst.cc, 
cow-sstream-inst.cc, cow-string-inst.cc,
 cow-string-io-inst.cc, cow-wstring-inst.cc, cow-wstring-io-inst.cc, 
cxx11-locale-inst.cc,
 cxx11-wlocale-inst.cc entries to...
 (inst_sources): ...this.
 * src/c++11/Makefile.in: 

[Bug tree-optimization/111694] [13/14 Regression] Wrong behavior for signbit of negative zero when optimizing

2023-10-09 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111694

Andrew Macleod  changed:

   What|Removed |Added

 Resolution|FIXED   |---
 Status|RESOLVED|REOPENED

--- Comment #8 from Andrew Macleod  ---
(In reply to Alexander Monakov from comment #7)
> No backport for gcc-13 planned?

mmm, didn't realize we were propagating floating point equivalences around in
13.  similar patch should work there

[Bug c/111741] gcc long double precision

2023-10-09 Thread bernardwidynski at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741

--- Comment #3 from bernardwidynski at gmail dot com ---
Thanks for the quick response.

That explains it.

On Mon, Oct 9, 2023 at 10:20 AM pinskia at gcc dot gnu.org <
gcc-bugzi...@gcc.gnu.org> wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741
>
> Andrew Pinski  changed:
>
>What|Removed |Added
>
> 
>  Status|UNCONFIRMED |RESOLVED
>  Resolution|--- |INVALID
>
> --- Comment #2 from Andrew Pinski  ---
> 80bit is the full precision and that 80bits includes 1 bit sign bit,
> 64bits
> for the mantissa and 15bits for the exponent.
>
> So anything above 64bits will start to lose precision in the last digits.
>
> --
> You are receiving this mail because:
> You reported the bug.

[Bug sanitizer/83780] False positive alignment error with -fsanitize=undefined with virtual base

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83780

Andrew Pinski  changed:

   What|Removed |Added

 CC||cuzdav at gmail dot com

--- Comment #6 from Andrew Pinski  ---
*** Bug 111742 has been marked as a duplicate of this bug. ***

[Bug c++/111742] Misaligned generated code with MI using aligned virtual base

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111742

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Andrew Pinski  ---
It is just a sanitizer issue. Dup of bug 83780.

*** This bug has been marked as a duplicate of bug 83780 ***

[Bug c/111741] gcc long double precision

2023-10-09 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741

Andrew Pinski  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |INVALID

--- Comment #2 from Andrew Pinski  ---
80bit is the full precision and that 80bits includes 1 bit sign bit, 64bits
for the mantissa and 15bits for the exponent.

So anything above 64bits will start to lose precision in the last digits.

[Bug c++/111742] New: Misaligned generated code with MI using aligned virtual base

2023-10-09 Thread cuzdav at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111742

Bug ID: 111742
   Summary: Misaligned generated code with MI using aligned
virtual base
   Product: gcc
   Version: 13.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: cuzdav at gmail dot com
  Target Milestone: ---

Generated code is misaligned (and crashes in slightly more complex code), in
trunk all the way back to gcc 8.1, when built in c++11 or higher, with O3. 
(Linux, x86)

Complete code:
//
struct X {
  void * a = nullptr;
  void * b = nullptr;
};

struct alignas(16) AlignedData { };

struct A : virtual AlignedData {
  X xxx;
};

struct B : virtual AlignedData {};

struct Test : B, A {};

Test* t = new Test;

int main() {}
//

Compiler Explorer demo:
https://godbolt.org/z/aodTdaedW

Running with UB-san reports this:
/app/example.cpp:14:8: runtime error: constructor call on misaligned address
0x0227f2b8 for type 'struct A', which requires 16 byte alignment
0x0227f2b8: note: pointer points here
 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00
00 00 00  00 00 00 00
  ^ 
/app/example.cpp:8:8: runtime error: member access within misaligned address
0x0227f2b8 for type 'struct A', which requires 16 byte alignment
0x0227f2b8: note: pointer points here
 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  00 00 00 00 00
00 00 00  00 00 00 00
  ^

[Bug tree-optimization/111694] [13/14 Regression] Wrong behavior for signbit of negative zero when optimizing

2023-10-09 Thread amonakov at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111694

--- Comment #7 from Alexander Monakov  ---
No backport for gcc-13 planned?

[Bug c/111741] gcc long double precision

2023-10-09 Thread bernardwidynski at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741

--- Comment #1 from bernardwidynski at gmail dot com ---
Created attachment 56082
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56082&action=edit
Output file

[Bug c/111741] New: gcc long double precision

2023-10-09 Thread bernardwidynski at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741

Bug ID: 111741
   Summary: gcc long double precision
   Product: gcc
   Version: 11.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bernardwidynski at gmail dot com
  Target Milestone: ---

Created attachment 56081
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56081&action=edit
C program to compute sum of numbers 1, 2, 3, ... N

It is my understanding the long double in gcc has 80 bits precision.

I've run a simple program which shows that it is less than 80 bits precision.

The numbers 1, 2, 3, ... N are summed and compared with N*(N+1)/2

For the case where N = 2^32, the sums compare correctly.

For the case where N = 2^33, the sums are different.

2^33*(2^33-1)/2 is less than 80 bits in precision.

Why doesn't the long double have the capacity for this computation?

See attached program and output file.

This was run on Cygwin64 using gcc version 11.4.0 on an Intel Core i7-9700

[Bug tree-optimization/111694] [13/14 Regression] Wrong behavior for signbit of negative zero when optimizing

2023-10-09 Thread amacleod at redhat dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111694

Andrew Macleod  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED

--- Comment #6 from Andrew Macleod  ---
fixed

[COMMITTED] PR tree-optimization/111694 - Ensure float equivalences include + and - zero.

2023-10-09 Thread Andrew MacLeod
When ranger propagates ranges in the on-entry cache, it also checks for 
equivalences and incorporates the equivalence into the range for a name 
if it is known.


With floating point values, the equivalence that is generated by 
comparison must also take into account that if the equivalence contains 
zero, both positive and negative zeros could be in the range.


This PR demonstrates that once we establish an equivalence, even though 
we know one value may only have a positive zero, the equivalence may 
have been formed earlier and included a negative zero.  This patch 
pessimistically assumes that if the equivalence contains zero, we should 
include both + and - 0 in the equivalence that we utilize.


I audited the other places, and found no other place where this issue 
might arise.  Cache propagation is the only place where we augment the 
range with random equivalences.


Bootstrapped on x86_64-pc-linux-gnu with no regressions. Pushed.

Andrew
From b0892b1fc637fadf14d7016858983bc5776a1e69 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod 
Date: Mon, 9 Oct 2023 10:15:07 -0400
Subject: [PATCH 2/2] Ensure float equivalences include + and - zero.

A floating point equivalence may not properly reflect both signs of
zero, so be pessimistic and ensure both signs are included.

	PR tree-optimization/111694
	gcc/
	* gimple-range-cache.cc (ranger_cache::fill_block_cache): Adjust
	equivalence range.
	* value-relation.cc (adjust_equivalence_range): New.
	* value-relation.h (adjust_equivalence_range): New prototype.

	gcc/testsuite/
	* gcc.dg/pr111694.c: New.
---
 gcc/gimple-range-cache.cc   |  3 +++
 gcc/testsuite/gcc.dg/pr111694.c | 19 +++
 gcc/value-relation.cc   | 19 +++
 gcc/value-relation.h|  3 +++
 4 files changed, 44 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/pr111694.c

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 3c819933c4e..89c0845457d 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -1470,6 +1470,9 @@ ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb)
 		{
 		  if (rel != VREL_EQ)
 		range_cast (equiv_range, type);
+		  else
+		adjust_equivalence_range (equiv_range);
+
 		  if (block_result.intersect (equiv_range))
 		{
 		  if (DEBUG_RANGE_CACHE)
diff --git a/gcc/testsuite/gcc.dg/pr111694.c b/gcc/testsuite/gcc.dg/pr111694.c
new file mode 100644
index 00000000000..a70b03069dc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr111694.c
@@ -0,0 +1,19 @@
+/* PR tree-optimization/111009 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+#define signbit(x) __builtin_signbit(x)
+
+static void test(double l, double r)
+{
+  if (l == r && (signbit(l) || signbit(r)))
+    ;
+  else
+    __builtin_abort();
+}
+
+int main()
+{
+  test(0.0, -0.0);
+}
+
diff --git a/gcc/value-relation.cc b/gcc/value-relation.cc
index a2ae39692a6..0326fe7cde6 100644
--- a/gcc/value-relation.cc
+++ b/gcc/value-relation.cc
@@ -183,6 +183,25 @@ relation_transitive (relation_kind r1, relation_kind r2)
   return relation_kind (rr_transitive_table[r1][r2]);
 }
 
+// When one name is an equivalence of another, ensure the equivalence
+// range is correct.  Specifically for floating point, a +0 is also
+// equivalent to a -0 which may not be reflected.  See PR 111694.
+
+void
+adjust_equivalence_range (vrange &range)
+{
+  if (range.undefined_p () || !is_a<frange> (range))
+    return;
+
+  frange fr = as_a<frange> (range);
+  // If range includes 0 make sure both signs of zero are included.
+  if (fr.contains_p (dconst0) || fr.contains_p (dconstm0))
+    {
+      frange zeros (range.type (), dconstm0, dconst0);
+      range.union_ (zeros);
+    }
+ }
+
 // This vector maps a relation to the equivalent tree code.
 
 static const tree_code relation_to_code [VREL_LAST] = {
diff --git a/gcc/value-relation.h b/gcc/value-relation.h
index be6e277421b..31d48908678 100644
--- a/gcc/value-relation.h
+++ b/gcc/value-relation.h
@@ -91,6 +91,9 @@ inline bool relation_equiv_p (relation_kind r)
 
 void print_relation (FILE *f, relation_kind rel);
 
+// Adjust range as an equivalence.
+void adjust_equivalence_range (vrange &range);
+
 class relation_oracle
 {
 public:
-- 
2.41.0


