[Bug target/111755] The built-in memset function in GCC inadvertently generates code like "vst1.8 {d8-d9}, [sp:64]", which assumes an 8-byte alignment on the stack pointer $sp, leading to alignment vi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111755 --- Comment #2 from Andrew Pinski --- Also can you attach the testcase where this happens? Please read https://gcc.gnu.org/bugs/ on what information we need.
[Bug target/111755] The built-in memset function in GCC inadvertently generates code like "vst1.8 {d8-d9}, [sp:64]", which assumes an 8-byte alignment on the stack pointer $sp, leading to alignment vi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111755 Andrew Pinski changed: What|Removed |Added Last reconfirmed||2023-10-10 Ever confirmed|0 |1 Status|UNCONFIRMED |WAITING --- Comment #1 from Andrew Pinski --- > which assumes an 8-byte alignment on the stack pointer $sp, leading to > alignment violations. Isn't that the ABI? What target is this for?
[Bug c/111755] New: The built-in memset function in GCC inadvertently generates code like "vst1.8 {d8-d9}, [sp:64]", which assumes an 8-byte alignment on the stack pointer $sp, leading to alignment vi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111755 Bug ID: 111755 Summary: The built-in memset function in GCC inadvertently generates code like "vst1.8 {d8-d9}, [sp:64]", which assumes an 8-byte alignment on the stack pointer $sp, leading to alignment violations Product: gcc Version: 9.2.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: kuzume at axell dot co.jp Target Milestone: --- The built-in memset function in GCC inadvertently generates code like "vst1.8 {d8-d9}, [sp:64]", which assumes that the stack pointer $sp is 8-byte aligned and therefore leads to alignment violations, because the stack pointer $sp is 4-byte aligned during C function calls. The issue can be temporarily worked around with the -fno-builtin-memset option, which inhibits the use of the built-in function. This might be a bug related to GCC's built-in function handling. The problem can also be resolved by generating the store without an alignment specification, like "vst1.8 {d8-d9}, [sp]", although dropping the alignment hint is not the optimal solution for performance.
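The report does not include a testcase. As a rough illustration only, the sketch below shows the kind of code involved: a small fixed-size memset on a stack buffer, plus a helper that checks pointer alignment at run time. The names is_aligned and zero_local_buffer are hypothetical and not taken from the report, and whether GCC expands this memset inline with an aligned NEON store depends on target, version, and options.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if P is aligned to ALIGN bytes.
   ALIGN must be a power of two.  */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return ((uintptr_t)p & (align - 1)) == 0;
}

/* Zero a 16-byte stack buffer with memset, copy it to OUT, and report
   whether the stack buffer happened to be 8-byte aligned.  A fixed-size
   memset like this is a candidate for inline expansion by the compiler.  */
static int zero_local_buffer(unsigned char out[16])
{
    unsigned char buf[16];
    memset(buf, 0, sizeof buf);
    memcpy(out, buf, sizeof buf);
    return is_aligned(buf, 8);
}
```

Building such a function with and without -fno-builtin-memset and comparing the generated assembly would show whether the "[sp:64]" alignment hint appears; the return value only reports what alignment the buffer had at run time.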
[Bug rtl-optimization/111754] New: [14 Regression] ICE: in decompose, at rtl.h:2313 at -O
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111754 Bug ID: 111754 Summary: [14 Regression] ICE: in decompose, at rtl.h:2313 at -O Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 56088 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56088=edit reduced testcase Compiler output: $ x86_64-pc-linux-gnu-gcc -O testcase.c during RTL pass: expand testcase.c: In function 'foo': testcase.c:14:10: internal compiler error: in decompose, at rtl.h:2313 14 | return bar ((F){9}, (F){}); | ^~~ 0x7ea1ba wi::int_traits >::decompose(long*, unsigned int, std::pair const&) /repo/gcc-trunk/gcc/rtl.h:2313 0x7ea1ba wide_int_ref_storage::wide_int_ref_storage >(std::pair const&) /repo/gcc-trunk/gcc/wide-int.h:1030 0x7ea1ba generic_wide_int >::generic_wide_int >(std::pair const&) /repo/gcc-trunk/gcc/wide-int.h:788 0x7ea1ba poly_int<1u, generic_wide_int > >::poly_int >(poly_int_full, std::pair const&) /repo/gcc-trunk/gcc/poly-int.h:453 0x7ea1ba poly_int<1u, generic_wide_int > >::poly_int >(std::pair const&) /repo/gcc-trunk/gcc/poly-int.h:439 0x7ea1ba wi::to_poly_wide(rtx_def const*, machine_mode) /repo/gcc-trunk/gcc/rtl.h:2382 0x7ea1ba rtx_vector_builder::step(rtx_def*, rtx_def*) const /repo/gcc-trunk/gcc/rtx-vector-builder.h:122 0x143d95b vector_builder::elt(unsigned int) const /repo/gcc-trunk/gcc/vector-builder.h:254 0x143d841 rtx_vector_builder::build() /repo/gcc-trunk/gcc/rtx-vector-builder.cc:73 0x107c7a1 const_vector_from_tree /repo/gcc-trunk/gcc/expr.cc:13494 0x10856ce expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool) /repo/gcc-trunk/gcc/expr.cc:11066 0xf50792 expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier) /repo/gcc-trunk/gcc/expr.h:310 0xf50792 expand_return 
/repo/gcc-trunk/gcc/cfgexpand.cc:3809 0xf50792 expand_gimple_stmt_1 /repo/gcc-trunk/gcc/cfgexpand.cc:3918 0xf50792 expand_gimple_stmt /repo/gcc-trunk/gcc/cfgexpand.cc:4044 0xf51106 expand_gimple_basic_block /repo/gcc-trunk/gcc/cfgexpand.cc:6100 0xf5378e execute /repo/gcc-trunk/gcc/cfgexpand.cc:6835 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-4521-20231009151152-g08d0f840dc7-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-4521-20231009151152-g08d0f840dc7-checking-yes-rtl-df-extra-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.0 20231009 (experimental) (GCC)
[PATCH] x86: set spincount 1 for x86 hybrid platform [PR109812]
From: "Zhang, Jun"

Testing shows that a spin count of 1 performs better on hybrid platforms. With '-march=native -Ofast -funroll-loops -flto', the results are as follows:

spec2017 speed     RPL       ADL
657.xz_s           0.00%     0.50%
603.bwaves_s      10.90%    26.20%
607.cactuBSSN_s    5.50%    72.50%
619.lbm_s          2.40%     2.50%
621.wrf_s         -7.70%     2.40%
627.cam4_s         0.50%     0.70%
628.pop2_s        48.20%   153.00%
638.imagick_s     -0.10%     0.20%
644.nab_s          2.30%     1.40%
649.fotonik3d_s    8.00%    13.80%
654.roms_s         1.20%     1.10%
Geomean-int        0.00%     0.50%
Geomean-fp         6.30%    21.10%
Geomean-all        5.70%    19.10%

omp2012            RPL       ADL
350.md            -1.81%    -1.75%
351.bwaves         7.72%    12.50%
352.nab           14.63%    19.71%
357.bt331         -0.20%     1.77%
358.botsalgn       0.00%     0.00%
359.botsspar       0.00%     0.65%
360.ilbdc          0.00%     0.25%
362.fma3d          2.66%    -0.51%
363.swim          10.44%     0.00%
367.imagick        0.00%     0.12%
370.mgrid331       2.49%    25.56%
371.applu331       1.06%     4.22%
372.smithwa        0.74%     3.34%
376.kdtree        10.67%    16.03%
GEOMEAN            3.34%     5.53%

include/ChangeLog: * omphook.h: define RUNOMPHOOK macro. libgomp/ChangeLog: * env.c (initialize_env): add RUNOMPHOOK macro. * config/linux/x86/omphook.h: define RUNOMPHOOK macro.
--- include/omphook.h | 1 + libgomp/config/linux/x86/omphook.h | 19 +++ libgomp/env.c | 3 +++ 3 files changed, 23 insertions(+) create mode 100644 include/omphook.h create mode 100644 libgomp/config/linux/x86/omphook.h diff --git a/include/omphook.h b/include/omphook.h new file mode 100644 index 000..2ebe3ad57e6 --- /dev/null +++ b/include/omphook.h @@ -0,0 +1 @@ +#define RUNOMPHOOK() diff --git a/libgomp/config/linux/x86/omphook.h b/libgomp/config/linux/x86/omphook.h new file mode 100644 index 000..aefb311cc07 --- /dev/null +++ b/libgomp/config/linux/x86/omphook.h @@ -0,0 +1,19 @@ +#ifdef __x86_64__ +#include "cpuid.h" + +/* only for x86 hybrid platform */ +#define RUNOMPHOOK() \ + do \ +{ \ + unsigned int eax, ebx, ecx, edx; \ + if ((getenv ("GOMP_SPINCOUNT") == NULL) && (wait_policy < 0) \ + && __get_cpuid_count (7, 0, &eax, &ebx, &ecx, &edx) \ + && ((edx >> 15) & 1)) \ + gomp_spin_count_var = 1LL; \ + if (gomp_throttled_spin_count_var > gomp_spin_count_var) \ + gomp_throttled_spin_count_var = gomp_spin_count_var; \ +} \ + while (0) +#else +# include "../../../../include/omphook.h" +#endif diff --git a/libgomp/env.c b/libgomp/env.c index a21adb3fd4b..1f13a148694 100644 --- a/libgomp/env.c +++ b/libgomp/env.c @@ -61,6 +61,7 @@ #include "secure_getenv.h" #include "environ.h" +#include "omphook.h" /* Default values of ICVs according to the OpenMP standard, except for default-device-var. */ @@ -2496,5 +2497,7 @@ initialize_env (void) goacc_runtime_initialize (); goacc_profiling_initialize (); + + RUNOMPHOOK (); } #endif /* LIBGOMP_OFFLOADED_ONLY */ -- 2.31.1
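The decision made by RUNOMPHOOK can be sketched outside libgomp as follows. This is a simplified stand-alone illustration, not the patch itself: hybrid_flag_from_edx, cpu_is_hybrid and choose_spin_count are hypothetical names, and the libgomp variables are modelled as plain function arguments. It relies only on __get_cpuid_count from GCC's <cpuid.h> and on bit 15 of EDX in CPUID leaf 7, subleaf 0, which is the bit the patch tests.

```c
#include <assert.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>
#endif

/* Extract the bit that the patch tests: EDX bit 15 of CPUID leaf 7,
   subleaf 0.  */
static int hybrid_flag_from_edx(unsigned int edx)
{
    return (edx >> 15) & 1;
}

/* Query the running CPU.  Returns 0 on non-x86 targets or when leaf 7
   is not available.  */
static int cpu_is_hybrid(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;
    if (__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return hybrid_flag_from_edx(edx);
#endif
    return 0;
}

/* Model of the hook's policy: force a spin count of 1 only when the user
   did not set GOMP_SPINCOUNT, no wait policy was given (wait_policy < 0),
   and the CPU is hybrid.  Otherwise keep the current value.  */
static long long choose_spin_count(int spincount_env_set, int wait_policy,
                                   int hybrid, long long current)
{
    assert(current >= 0);
    if (!spincount_env_set && wait_policy < 0 && hybrid)
        return 1;
    return current;
}
```

A caller would combine the two, for example choose_spin_count(getenv("GOMP_SPINCOUNT") != NULL, -1, cpu_is_hybrid(), current), and then clamp the throttled spin count to the result as the patch does.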
[Bug rtl-optimization/111753] New: [14 Regression] ICE: in extract_constrain_insn, at recog.cc:2692 insn does not satisfy its constraints: {*movsf_internal} with -O2 -mavx512bw -fno-tree-ter
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111753 Bug ID: 111753 Summary: [14 Regression] ICE: in extract_constrain_insn, at recog.cc:2692 insn does not satisfy its constraints: {*movsf_internal} with -O2 -mavx512bw -fno-tree-ter Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 56087 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56087=edit reduced testcase Compiler output: $ x86_64-pc-linux-gnu-gcc -O2 -mavx512bw -fno-tree-ter testcase.c testcase.c: In function 'foo': testcase.c:35:9: warning: division by zero [-Wdiv-by-zero] 35 | f32_0 /= 0; | ^~ testcase.c:38:13: warning: division by zero [-Wdiv-by-zero] 38 | v256f32_0 /= 0; | ^~ testcase.c:66:1: error: insn does not satisfy its constraints: 66 | } | ^ (insn 713 222 227 2 (set (reg:SF 52 xmm16 [473]) (const_double:SF 0.0 [0x0.0p+0])) "testcase.c":45:13 160 {*movsf_internal} (expr_list:REG_EQUAL (const_double:SF 0.0 [0x0.0p+0]) (nil))) during RTL pass: cprop_hardreg testcase.c:66:1: internal compiler error: in extract_constrain_insn, at recog.cc:2692 0x7e2e60 _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) /repo/gcc-trunk/gcc/rtl-error.cc:108 0x7e2ee7 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) /repo/gcc-trunk/gcc/rtl-error.cc:118 0x7d359b extract_constrain_insn(rtx_insn*) /repo/gcc-trunk/gcc/recog.cc:2692 0x13fdd85 copyprop_hardreg_forward_1 /repo/gcc-trunk/gcc/regcprop.cc:836 0x13ff199 execute /repo/gcc-trunk/gcc/regcprop.cc:1423 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. 
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-4521-20231009151152-g08d0f840dc7-checking-yes-rtl-df-extra-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-4521-20231009151152-g08d0f840dc7-checking-yes-rtl-df-extra-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.0 20231009 (experimental) (GCC)
[PATCH v2 3/4] RISC-V: Extend riscv_subset_list, preparatory for target attribute support
Previously riscv_subset_list only accepted a full arch string, but supporting the target attribute requires parsing a single extension. We may also want to set a riscv_subset_list directly rather than re-parsing the ISA string. gcc/ChangeLog: * config/riscv/riscv-subset.h (riscv_subset_list::parse_single_std_ext): New. (riscv_subset_list::parse_single_multiletter_ext): Ditto. (riscv_subset_list::clone): Ditto. (riscv_subset_list::parse_single_ext): Ditto. (riscv_subset_list::set_loc): Ditto. (riscv_set_arch_by_subset_list): Ditto. * common/config/riscv/riscv-common.cc (riscv_subset_list::parse_single_std_ext): New. (riscv_subset_list::parse_single_multiletter_ext): Ditto. (riscv_subset_list::clone): Ditto. (riscv_subset_list::parse_single_ext): Ditto. (riscv_subset_list::set_loc): Ditto. (riscv_set_arch_by_subset_list): Ditto. --- gcc/common/config/riscv/riscv-common.cc | 203 gcc/config/riscv/riscv-subset.h | 11 ++ 2 files changed, 214 insertions(+) diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc index 9a0a68fe5db..25630d5923e 100644 --- a/gcc/common/config/riscv/riscv-common.cc +++ b/gcc/common/config/riscv/riscv-common.cc @@ -1036,6 +1036,41 @@ riscv_subset_list::parse_std_ext (const char *p) return p; } +/* Parsing function for one standard extensions. + + Return Value: + Points to the end of extensions. + + Arguments: + `p`: Current parsing position. */ + +const char * +riscv_subset_list::parse_single_std_ext (const char *p) +{ + if (*p == 'x' || *p == 's' || *p == 'z') +{ + error_at (m_loc, + "%<-march=%s%>: Not single-letter extension. 
" + "%<%c%>", + m_arch, *p); + return nullptr; +} + + unsigned major_version = 0; + unsigned minor_version = 0; + bool explicit_version_p = false; + char subset[2] = {0, 0}; + + subset[0] = *p; + + p++; + + p = parsing_subset_version (subset, p, &major_version, &minor_version, + /* std_ext_p= */ true, &explicit_version_p); + + add (subset, major_version, minor_version, explicit_version_p, false); + return p; +} /* Check any implied extensions for EXT. */ void @@ -1138,6 +1173,102 @@ riscv_subset_list::handle_combine_ext () } } +/* Parsing function for multi-letter extensions. + + Return Value: + Points to the end of extensions. + + Arguments: + `p`: Current parsing position. + `ext_type`: What kind of extensions, 's', 'z' or 'x'. + `ext_type_str`: Full name for kind of extension. */ + + +const char * +riscv_subset_list::parse_single_multiletter_ext (const char *p, +const char *ext_type, +const char *ext_type_str) +{ + unsigned major_version = 0; + unsigned minor_version = 0; + size_t ext_type_len = strlen (ext_type); + + if (strncmp (p, ext_type, ext_type_len) != 0) +return NULL; + + char *subset = xstrdup (p); + const char *end_of_version; + bool explicit_version_p = false; + char *ext; + char backup; + size_t len = strlen (p); + size_t end_of_version_pos, i; + bool found_any_number = false; + bool found_minor_version = false; + + end_of_version_pos = len; + /* Find the begin of version string. */ + for (i = len -1; i > 0; --i) +{ + if (ISDIGIT (subset[i])) + { + found_any_number = true; + continue; + } + /* Might be version seperator, but need to check one more char, +we only allow p, so we could stop parsing if found +any more `p`. 
*/ + if (subset[i] == 'p' && + !found_minor_version && + found_any_number && ISDIGIT (subset[i-1])) + { + found_minor_version = true; + continue; + } + + end_of_version_pos = i + 1; + break; +} + + backup = subset[end_of_version_pos]; + subset[end_of_version_pos] = '\0'; + ext = xstrdup (subset); + subset[end_of_version_pos] = backup; + + end_of_version + = parsing_subset_version (ext, subset + end_of_version_pos, &major_version, + &minor_version, /* std_ext_p= */ false, + &explicit_version_p); + free (ext); + + if (end_of_version == NULL) +return NULL; + + subset[end_of_version_pos] = '\0'; + + if (strlen (subset) == 1) +{ + error_at (m_loc, "%<-march=%s%>: name of %s must be more than 1 letter", + m_arch, ext_type_str); + free (subset); + return NULL; +} + + add (subset, major_version, minor_version, explicit_version_p, false); + p += end_of_version - subset; + free (subset); + + if (*p != '\0' && *p != '_') +{ + error_at (m_loc, "%<-march=%s%>: %s must separate with %<_%>", + m_arch, ext_type_str); + return NULL; +} + + return p; + +} + /* Parsing function for
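The backward scan in parse_single_multiletter_ext, which locates where a trailing version suffix such as "1p0" begins, can be sketched on its own in plain C. This is an illustration only: find_version_start is a hypothetical name, it uses isdigit in place of GCC's ISDIGIT, and it adds an empty-string guard that the quoted loop does not have.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-alone version of the scan: return the index at
   which the trailing version suffix of NAME starts, or strlen (NAME)
   if there is none.  Walks backwards over digits and accepts at most
   one 'p' separator, and only when digits sit on both sides of it.  */
static size_t find_version_start(const char *name)
{
    assert(name != NULL);
    size_t len = strlen(name);
    size_t pos = len;
    int found_any_number = 0;
    int found_minor_version = 0;

    if (len == 0)
        return 0;

    /* As in the patch, index 0 is never examined.  */
    for (size_t i = len - 1; i > 0; --i) {
        unsigned char c = (unsigned char)name[i];
        if (isdigit(c)) {
            found_any_number = 1;
            continue;
        }
        if (c == 'p' && !found_minor_version && found_any_number
            && isdigit((unsigned char)name[i - 1])) {
            found_minor_version = 1;
            continue;
        }
        pos = i + 1;
        break;
    }
    return pos;
}
```

For "zba1p0" the scan stops at the 'a' and reports index 3, so "zba" is the extension name and "1p0" the version. A name ending in a letter, such as "zvl128b", has no version suffix and yields its full length.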
[PATCH v2 4/4] RISC-V: Implement target attribute
The target attribute was proposed in [1]; it allows the user to specify local settings on a per-function basis. The syntax of the target attribute is `__attribute__((target("<ATTR-STRING>")))`, and the syntax of `<ATTR-STRING>` is described below:
```
ATTR-STRING := ATTR-STRING ';' ATTR
             | ATTR
ATTR        := ARCH-ATTR
             | CPU-ATTR
             | TUNE-ATTR
ARCH-ATTR   := 'arch=' EXTENSIONS-OR-FULLARCH
EXTENSIONS-OR-FULLARCH := <EXTENSIONS>
                        | <FULLARCHSTR>
EXTENSIONS             := <EXTENSION> ',' <EXTENSIONS>
                        | <EXTENSION>
FULLARCHSTR            :=
EXTENSION              := <OP> <EXTENSION-NAME> <VERSION>
OP                     := '+'
VERSION                := [0-9]+ 'p' [0-9]+
                        | [1-9][0-9]*
                        |
EXTENSION-NAME         := Naming rule is defined in RISC-V ISA manual
CPU-ATTR    := 'cpu='
TUNE-ATTR   := 'tune='
```
[1] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/35 gcc/ChangeLog: * config.gcc (riscv): Add riscv-target-attr.o. * config/riscv/riscv-opts.h (TARGET_MIN_VLEN_OPTS): New. * config/riscv/riscv-protos.h (riscv_declare_function_size) New. (riscv_option_valid_attribute_p): New. (riscv_override_options_internal): New. (struct riscv_tune_info): New. (riscv_parse_tune): New. * config/riscv/riscv-target-attr.cc (class riscv_target_attr_parser): New. (struct riscv_attribute_info): New. (riscv_attributes): New. (riscv_target_attr_parser::parse_arch): (riscv_target_attr_parser::handle_arch): (riscv_target_attr_parser::handle_cpu): (riscv_target_attr_parser::handle_tune): (riscv_target_attr_parser::update_settings): (riscv_process_one_target_attr): (num_occurences_in_str): (riscv_process_target_attr): (riscv_option_valid_attribute_p): * config/riscv/riscv.cc: Include target-globals.h and riscv-subset.h. (struct riscv_tune_info): Move to riscv-protos.h. (get_tune_str): (riscv_parse_tune): (riscv_declare_function_size): (riscv_option_override): Build target_option_default_node and target_option_current_node. (riscv_save_restore_target_globals): (riscv_option_restore): (riscv_previous_fndecl): (riscv_set_current_function): Apply the target attribute. (TARGET_OPTION_RESTORE): Define. (TARGET_OPTION_VALID_ATTRIBUTE_P): Ditto. * config/riscv/riscv.h (SWITCHABLE_TARGET): Define to 1. 
(ASM_DECLARE_FUNCTION_SIZE) Define. * config/riscv/riscv.opt (mtune=): Add Save attribute. (mcpu=): Ditto. (mcmodel=): Ditto. * config/riscv/t-riscv: Add build rule for riscv-target-attr.o * doc/extend.texi: Add doc for target attribute. gcc/testsuite/ChangeLog: * gcc.target/riscv/target-attr-01.c: New. * gcc.target/riscv/target-attr-02.c: Ditto. * gcc.target/riscv/target-attr-03.c: Ditto. * gcc.target/riscv/target-attr-04.c: Ditto. * gcc.target/riscv/target-attr-05.c: Ditto. * gcc.target/riscv/target-attr-06.c: Ditto. * gcc.target/riscv/target-attr-07.c: Ditto. * gcc.target/riscv/target-attr-bad-01.c: Ditto. * gcc.target/riscv/target-attr-bad-02.c: Ditto. * gcc.target/riscv/target-attr-bad-03.c: Ditto. * gcc.target/riscv/target-attr-bad-04.c: Ditto. * gcc.target/riscv/target-attr-bad-05.c: Ditto. * gcc.target/riscv/target-attr-bad-06.c: Ditto. * gcc.target/riscv/target-attr-bad-07.c: Ditto. * gcc.target/riscv/target-attr-warning-01.c: Ditto. * gcc.target/riscv/target-attr-warning-02.c: Ditto. * gcc.target/riscv/target-attr-warning-03.c: Ditto. --- gcc/config.gcc| 2 +- gcc/config/riscv/riscv-opts.h | 6 + gcc/config/riscv/riscv-protos.h | 21 + gcc/config/riscv/riscv-target-attr.cc | 395 ++ gcc/config/riscv/riscv.cc | 192 +++-- gcc/config/riscv/riscv.h | 6 + gcc/config/riscv/riscv.opt| 6 +- gcc/config/riscv/t-riscv | 5 + gcc/doc/extend.texi | 58 +++ .../gcc.target/riscv/target-attr-01.c | 31 ++ .../gcc.target/riscv/target-attr-02.c | 31 ++ .../gcc.target/riscv/target-attr-03.c | 26 ++ .../gcc.target/riscv/target-attr-04.c | 28 ++ .../gcc.target/riscv/target-attr-05.c | 27 ++ .../gcc.target/riscv/target-attr-06.c | 27 ++ .../gcc.target/riscv/target-attr-07.c | 25 ++ .../gcc.target/riscv/target-attr-bad-01.c | 13 + .../gcc.target/riscv/target-attr-bad-02.c | 13 + .../gcc.target/riscv/target-attr-bad-03.c | 13 + .../gcc.target/riscv/target-attr-bad-04.c | 13 + .../gcc.target/riscv/target-attr-bad-05.c | 13 +
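To make the VERSION production of the attribute grammar in this patch concrete, here is a small stand-alone checker for it, followed by a guarded usage example of the attribute. Both are sketches under assumptions: is_valid_version is a hypothetical helper and not the parser added by the patch, and the "arch=+zba" string and the GCC 14 version guard are assumptions about a toolchain that carries this series.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical checker for the VERSION production:
     [0-9]+ 'p' [0-9]+  |  [1-9][0-9]*  |  (empty)
   Returns 1 if S matches, 0 otherwise.  */
static int is_valid_version(const char *s)
{
    assert(s != NULL);
    size_t i = 0;

    if (s[0] == '\0')
        return 1;                 /* empty alternative */

    while (isdigit((unsigned char)s[i]))
        i++;
    if (i == 0)
        return 0;                 /* must start with a digit */
    if (s[i] == '\0')
        return s[0] != '0';       /* [1-9][0-9]* */
    if (s[i] != 'p')
        return 0;

    size_t minor_start = ++i;     /* digits required after 'p' */
    while (isdigit((unsigned char)s[i]))
        i++;
    return i > minor_start && s[i] == '\0';
}

#if defined(__riscv) && defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 14
/* Assumed usage: enable one extra extension for a single function.  */
__attribute__((target("arch=+zba")))
int scaled_index(int base, int i)
{
    return base + (i << 2);
}
#endif
```

Under this reading of the grammar, "2p1", "0p7", "10" and the empty string are accepted, while "0", "01", "2p" and "p1" are rejected.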
[PATCH v2 2/4] RISC-V: Refactor riscv_option_override and riscv_convert_vector_bits. [NFC]
Allow these functions to operate on a local gcc_options rather than the global options. This is preparatory work for the target attribute, separated out for easier review since it is NFC. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_convert_vector_bits): Get setting from argument rather than get setting from global setting. (riscv_override_options_internal): New, split from riscv_override_options, also take a gcc_options argument. (riscv_option_override): Split most of it into riscv_override_options_internal. --- gcc/config/riscv/riscv.cc | 93 ++- 1 file changed, 52 insertions(+), 41 deletions(-) diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index b7acf836d02..c7d0d300345 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -8066,10 +8066,11 @@ riscv_init_machine_status (void) /* Return the VLEN value associated with -march. TODO: So far we only support length-agnostic value. */ static poly_uint16 -riscv_convert_vector_bits (void) +riscv_convert_vector_bits (struct gcc_options *opts) { int chunk_num; - if (TARGET_MIN_VLEN > 32) + int min_vlen = TARGET_MIN_VLEN_OPTS (opts); + if (min_vlen > 32) { /* When targetting minimum VLEN > 32, we should use 64-bit chunk size. Otherwise we can not include SEW = 64bits. @@ -8087,7 +8088,7 @@ riscv_convert_vector_bits (void) - TARGET_MIN_VLEN = 2048bit: [256,256] - TARGET_MIN_VLEN = 4096bit: [512,512] FIXME: We currently DON'T support TARGET_MIN_VLEN > 4096bit. */ - chunk_num = TARGET_MIN_VLEN / 64; + chunk_num = min_vlen / 64; } else { @@ -8106,10 +8107,10 @@ riscv_convert_vector_bits (void) to set RVV mode size. The RVV machine modes size are run-time constant if TARGET_VECTOR is enabled. The RVV machine modes size remains default compile-time constant if TARGET_VECTOR is disabled. 
*/ - if (TARGET_VECTOR) + if (TARGET_VECTOR_OPTS_P (opts)) { - if (riscv_autovec_preference == RVV_FIXED_VLMAX) - return (int) TARGET_MIN_VLEN / (riscv_bytes_per_vector_chunk * 8); + if (opts->x_riscv_autovec_preference == RVV_FIXED_VLMAX) + return (int) min_vlen / (riscv_bytes_per_vector_chunk * 8); else return poly_uint16 (chunk_num, chunk_num); } @@ -8117,40 +8118,33 @@ riscv_convert_vector_bits (void) return 1; } -/* Implement TARGET_OPTION_OVERRIDE. */ - -static void -riscv_option_override (void) +/* 'Unpack' up the internal tuning structs and update the options +in OPTS. The caller must have set up selected_tune and selected_arch +as all the other target-specific codegen decisions are +derived from them. */ +void +riscv_override_options_internal (struct gcc_options *opts) { const struct riscv_tune_info *cpu; -#ifdef SUBTARGET_OVERRIDE_OPTIONS - SUBTARGET_OVERRIDE_OPTIONS; -#endif - - flag_pcc_struct_return = 0; - - if (flag_pic) -g_switch_value = 0; - /* The presence of the M extension implies that division instructions are present, so include them unless explicitly disabled. */ - if (TARGET_MUL && (target_flags_explicit & MASK_DIV) == 0) -target_flags |= MASK_DIV; - else if (!TARGET_MUL && TARGET_DIV) + if (TARGET_MUL_OPTS_P (opts) && (target_flags_explicit & MASK_DIV) == 0) +opts->x_target_flags |= MASK_DIV; + else if (!TARGET_MUL_OPTS_P (opts) && TARGET_DIV_OPTS_P (opts)) error ("%<-mdiv%> requires %<-march%> to subsume the % extension"); /* Likewise floating-point division and square root. */ if ((TARGET_HARD_FLOAT || TARGET_ZFINX) && (target_flags_explicit & MASK_FDIV) == 0) -target_flags |= MASK_FDIV; +opts->x_target_flags |= MASK_FDIV; /* Handle -mtune, use -mcpu if -mtune is not given, and use default -mtune if both -mtune and -mcpu are not given. */ - cpu = riscv_parse_tune (riscv_tune_string ? riscv_tune_string : - (riscv_cpu_string ? riscv_cpu_string : + cpu = riscv_parse_tune (opts->x_riscv_tune_string ? 
opts->x_riscv_tune_string : + (opts->x_riscv_cpu_string ? opts->x_riscv_cpu_string : RISCV_TUNE_STRING_DEFAULT)); riscv_microarchitecture = cpu->microarchitecture; - tune_param = optimize_size ? &optimize_size_tune_info : cpu->tune_param; + tune_param = opts->x_optimize_size ? &optimize_size_tune_info : cpu->tune_param; /* Use -mtune's setting for slow_unaligned_access, even when optimizing for size. For architectures that trap and emulate unaligned accesses, @@ -8166,15 +8160,38 @@ riscv_option_override (void) if ((target_flags_explicit & MASK_STRICT_ALIGN) == 0 && cpu->tune_param->slow_unaligned_access) -target_flags |= MASK_STRICT_ALIGN; +opts->x_target_flags |= MASK_STRICT_ALIGN; /* If the user hasn't specified a branch cost, use the processor's default. */ - if (riscv_branch_cost == 0) -
[PATCH v2 1/4] options: Define TARGET_<NAME>_P and TARGET_<NAME>_OPTS_P macro for Mask and InverseMask
We have the TARGET_<NAME>_P macro to test a Mask and InverseMask against a user-specified target_variable; however, we may want to test against a specific gcc_options variable rather than target_variable. For example, RISC-V defines many Masks with TargetVariable, which is not easy to use because we need to know which Mask is associated with which TargetVariable, so taking a gcc_options variable is a better interface for this use case. gcc/ChangeLog: * doc/options.texi (Mask): Document TARGET_<NAME>_P and TARGET_<NAME>_OPTS_P. (InverseMask): Ditto. * opth-gen.awk (Mask): Generate TARGET_<NAME>_P and TARGET_<NAME>_OPTS_P macro. (InverseMask): Ditto. --- gcc/doc/options.texi | 23 --- gcc/opth-gen.awk | 13 - 2 files changed, 28 insertions(+), 8 deletions(-) diff --git a/gcc/doc/options.texi b/gcc/doc/options.texi index 1f7c15b8eb4..715f0a1479c 100644 --- a/gcc/doc/options.texi +++ b/gcc/doc/options.texi @@ -404,18 +404,27 @@ You may also specify @code{Var} to select a variable other than The options-processing script will automatically allocate a unique bit for the option. If the option is attached to @samp{target_flags} or @code{Var} which is defined by @code{TargetVariable}, the script will set the macro -@code{MASK_@var{name}} to the appropriate bitmask. It will also declare a -@code{TARGET_@var{name}} macro that has the value 1 when the option is active -and 0 otherwise. If you use @code{Var} to attach the option to a different variable -which is not defined by @code{TargetVariable}, the bitmask macro with be -called @code{OPTION_MASK_@var{name}}. +@code{MASK_@var{name}} to the appropriate bitmask. 
It will also declare a +@code{TARGET_@var{name}}, @code{TARGET_@var{name}_P} and +@code{TARGET_@var{name}_OPTS_P}: @code{TARGET_@var{name}} macros that has the +value 1 when the option is active and 0 otherwise, @code{TARGET_@var{name}_P} is +similar to @code{TARGET_@var{name}} but take an argument as @samp{target_flags} +or @code{TargetVariable}, and @code{TARGET_@var{name}_OPTS_P} also similar to +@code{TARGET_@var{name}} but take an argument as @code{gcc_options}. +If you use @code{Var} to attach the option to a different variable which is not +defined by @code{TargetVariable}, the bitmask macro with be called +@code{OPTION_MASK_@var{name}}. @item InverseMask(@var{othername}) @itemx InverseMask(@var{othername}, @var{thisname}) The option is the inverse of another option that has the @code{Mask(@var{othername})} property. If @var{thisname} is given, -the options-processing script will declare a @code{TARGET_@var{thisname}} -macro that is 1 when the option is active and 0 otherwise. +the options-processing script will declare @code{TARGET_@var{thisname}}, +@code{TARGET_@var{name}_P} and @code{TARGET_@var{name}_OPTS_P} macros: +@code{TARGET_@var{thisname}} is 1 when the option is active and 0 otherwise, +@code{TARGET_@var{name}_P} is similar to @code{TARGET_@var{name}} but take an +argument as @samp{target_flags}, and and @code{TARGET_@var{name}_OPTS_P} also +similar to @code{TARGET_@var{name}} but take an argument as @code{gcc_options}. 
@item Enum(@var{name}) The option's argument is a string from the set of strings associated diff --git a/gcc/opth-gen.awk b/gcc/opth-gen.awk index c4398be2f3a..26551575d55 100644 --- a/gcc/opth-gen.awk +++ b/gcc/opth-gen.awk @@ -439,6 +439,10 @@ for (i = 0; i < n_target_vars; i++) { print "#define TARGET_" other_masks[i "," j] \ " ((" target_vars[i] " & MASK_" other_masks[i "," j] ") != 0)" + print "#define TARGET_" other_masks[i "," j] "_P(" target_vars[i] ")" \ + " (((" target_vars[i] ") & MASK_" other_masks[i "," j] ") != 0)" + print "#define TARGET_" other_masks[i "," j] "_OPTS_P(opts)" \ + " (((opts->x_" target_vars[i] ") & MASK_" other_masks[i "," j] ") != 0)" } } print "" @@ -469,15 +473,22 @@ for (i = 0; i < n_opts; i++) { " ((" vname " & " mask original_name ") != 0)" print "#define TARGET_" name "_P(" vname ")" \ " (((" vname ") & " mask original_name ") != 0)" + print "#define TARGET_" name "_OPTS_P(opts)" \ + " (((opts->x_" vname ") & " mask original_name ") != 0)" print "#define TARGET_EXPLICIT_" name "_P(opts)" \ " ((opts->x_" vname "_explicit & " mask original_name ") != 0)" print "#define SET_TARGET_" name "(opts) opts->x_" vname " |= " mask original_name } } for (i = 0; i < n_extra_masks; i++) { - if (extra_mask_macros[extra_masks[i]] == 0) + if (extra_mask_macros[extra_masks[i]] == 0) { print "#define TARGET_" extra_masks[i] \ " ((target_flags & MASK_" extra_masks[i] ") != 0)" + print "#define TARGET_" extra_masks[i] "_P(target_flags)" \ +
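The three macro forms generated by opth-gen.awk in this patch can be modelled in a few lines of C. This is a sketch with stand-in names: MASK_FOO, struct mock_options and foo_enabled_in are hypothetical, and the real macros are generated into options.h against GCC's struct gcc_options.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for generated definitions (hypothetical names).  */
#define MASK_FOO (1 << 0)

struct mock_options {
    int x_target_flags;           /* mirrors the x_ prefix used in gcc_options */
};

static int target_flags;          /* stands in for the global variable */

/* Tests the global variable.  */
#define TARGET_FOO ((target_flags & MASK_FOO) != 0)
/* Tests a caller-supplied flags value.  */
#define TARGET_FOO_P(flags) (((flags) & MASK_FOO) != 0)
/* Tests the field inside a caller-supplied options struct.  */
#define TARGET_FOO_OPTS_P(opts) (((opts->x_target_flags) & MASK_FOO) != 0)

/* Query a local options struct without consulting the global state.  */
static int foo_enabled_in(const struct mock_options *opts)
{
    assert(opts != NULL);
    return TARGET_FOO_OPTS_P(opts);
}
```

With the _OPTS_P form, a caller holding a per-function options struct does not need to know which variable a given mask lives in. In this sketch the macro argument is not parenthesized, matching the shape printed by the awk script, so it should be passed a plain pointer name rather than an expression such as &local.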
[PATCH v2 0/4] RISC-V target attribute
This patch set implements the target attribute for the RISC-V target. As on other targets such as x86 or ARM, it lets the user apply local settings per function without changing the global settings. We support arch, tune and cpu first and will support other target attributes later. This version DOES NOT include multi-version function support yet; that is future work, probably for GCC 15. The full proposal is in the RISC-V C-API document [1] and has been discussed with the RISC-V LLVM community, so we have consistent syntax and semantics. [1] https://github.com/riscv-non-isa/riscv-c-api-doc/pull/35 v2 changelog: - Resolve awk multi-dimensional issue. - Tweak code format - Tweak testcases
[Bug middle-end/111752] New: -Wfree-nonheap-object (vec.h:347:10: warning: 'free' called on unallocated object 'dest_bbs') during bootstrap with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111752 Bug ID: 111752 Summary: -Wfree-nonheap-object (vec.h:347:10: warning: 'free' called on unallocated object 'dest_bbs') during bootstrap with LTO Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: sjames at gcc dot gnu.org Target Milestone: --- I'm not sure this was always there - I think I would've noticed if it was a long-standing thing. I get this -Wfree-nonheap-object warning during bootstrap. I can reproduce it with: ``` ./configure --disable-analyzer --disable-bootstrap --disable-cet --disable-default-pie --disable-default-ssp --disable-fixincludes --disable-gcov --disable-libada --disable-libatomic --disable-libgomp --disable-libitm --disable-libquadmath --disable-libsanitizer --disable-libssp --disable-libstdcxx-pch --disable-libvtv --disable-lto --disable-multilib --disable-nls --disable-objc-gc --disable-systemtap --disable-werror --enable-languages=c,c++ --prefix=/tmp/bisect --without-isl --without-zstd --with-system-zlib --enable-bootstrap --enable-lto make BUILD_CONFIG=bootstrap-lto -j$(nproc) ``` I can only reproduce when building with bootstrap-lto. 
On trunk at r14-4523-gfb124f2a23e92b, I get this: ``` /home/sam/git/gcc/host-x86_64-pc-linux-gnu/prev-gcc/xg++ -B/home/sam/git/gcc/host-x86_64-pc-linux-gnu/prev-gcc/ -B/tmp/bisect/x86_64-pc-linux-gnu/bin/ -nostdinc++ -B/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/ libstdc++-v3/src/.libs -B/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc++-v3/libsupc++/.libs -I/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu -I/home/sam/git/gcc/pre v-x86_64-pc-linux-gnu/libstdc++-v3/include -I/home/sam/git/gcc/libstdc++-v3/libsupc++ -L/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc++-v3/src/.libs -L/home/sam/git/gcc/prev-x86_64-pc-linux-gnu/libstdc+ +-v3/libsupc++/.libs -no-pie -g -O2 -fno-checking -flto=jobserver -frandom-seed=1 -DIN_GCC-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmis sing-format-attribute -Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -DHAVE_CONFIG_H -no-pie -static-libstdc++ -static-libgcc -o cc1plus \ cp/cp-lang.o c-family/stub-objc.o cp/call.o cp/class.o cp/constexpr.o cp/constraint.o cp/coroutines.o cp/cp-gimplify.o cp/cp-objcp-common.o cp/cp-ubsan.o cp/cvt.o cp/contracts.o cp/cxx-pretty-print.o cp /decl.o cp/decl2.o cp/dump.o cp/error.o cp/except.o cp/expr.o cp/friend.o cp/init.o cp/lambda.o cp/lex.o cp/logic.o cp/mangle.o cp/mapper-client.o cp/mapper-resolver.o cp/method.o cp/module.o cp/name-lookup.o cp/optimize.o cp/parser.o cp/pt.o cp/ptree.o cp/rtti.o cp/search.o cp/semantics.o cp/tree.o cp/typeck.o cp/typeck2.o cp/vtable-class-hierarchy.o attribs.o c-family/c-common.o c-family/c-cppbuiltin.o c-family /c-dump.o c-family/c-format.o c-family/c-gimplify.o c-family/c-indentation.o c-family/c-lex.o c-family/c-omp.o c-family/c-opts.o c-family/c-pch.o c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-pr int.o c-family/c-semantics.o c-family/c-ada-spec.o 
c-family/c-ubsan.o c-family/known-headers.o c-family/c-attribs.o c-family/c-warn.o c-family/c-spellcheck.o i386-c.o glibc-c.o cc1plus-checksum.o libbackend.a main.o libcommon-target.a libcommon.a ../libcpp/libcpp.a ../libdecnumber/libdecnumber.a ../libcody/libcody.a \ libcommon.a ../libcpp/libcpp.a ../libbacktrace/.libs/libbacktrace.a ../libiberty/libiberty.a ../libdecnumber/libdecnumber.a -lmpc -lmpfr -lgmp -rdynamic -lz ../.././gcc/spellcheck.cc: In function '_Z17get_edit_distancePKciS0_i.part.0': ../.././gcc/spellcheck.cc:71:61: warning: argument 1 value '18446744073709551615' exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=] 71 | edit_distance_t *v_two_ago = new edit_distance_t[len_s + 1]; | ^ /home/sam/git/gcc/libstdc++-v3/libsupc++/new:133:26: note: in a call to allocation function 'operator new []' declared here 133 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc) | ^ ../.././gcc/spellcheck.cc:72:61: warning: argument 1 value '18446744073709551615' exceeds maximum object size 9223372036854775807 [-Walloc-size-larger-than=] 72 | edit_distance_t *v_one_ago = new edit_distance_t[len_s + 1]; | ^ /home/sam/git/gcc/libstdc++-v3/libsupc++/new:133:26: note: in a call to allocation function 'operator new []' declared here 133 | _GLIBCXX_NODISCARD void* operator new[](std::size_t) _GLIBCXX_THROW (std::bad_alloc) |
[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751 --- Comment #7 from JuzheZhong --- (In reply to Andrew Pinski from comment #6) > (In reply to JuzheZhong from comment #5) > > (In reply to Andrew Pinski from comment #4) > > > The issue for aarch64 with SVE is that MASK_LOAD is not optimized: > > > > > > ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; > > > ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; > > > vect__1.7_9 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, > > > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > > > 0, > > > ... }); > > > vect__2.10_35 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, > > > -1, > > > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > > > 0, > > > ... }); > > > > I don't ARM SVE has issues ... > > It does as I mentioned if you use -fno-vect-cost-model, you get the above > issue which should be optimized really to a constant vector ... After investigation: I found it failed to recognize its CONST_VECTOR value in FRE /* Visit a load from a reference operator RHS, part of STMT, value number it, and return true if the value number of the LHS has changed as a result. */ static bool visit_reference_op_load (tree lhs, tree op, gimple *stmt) { bool changed = false; tree result; vn_reference_t res; tree vuse = gimple_vuse (stmt); tree last_vuse = vuse; result = vn_reference_lookup (op, vuse, default_vn_walk_kind, , true, _vuse); /* We handle type-punning through unions by value-numbering based on offset and size of the access. Be prepared to handle a type-mismatch here via creating a VIEW_CONVERT_EXPR. */ if (result && !useless_type_conversion_p (TREE_TYPE (result), TREE_TYPE (op))) { /* Avoid the type punning in case the result mode has padding where the op we lookup has not. 
*/ if (maybe_lt (GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (result))), GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op))))) result = NULL_TREE;

The result is BLKmode and op is V16QImode, so we reach:

/* Avoid the type punning in case the result mode has padding where the op we lookup has not. */
if (maybe_lt (GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (result))), GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (op)))))
  result = NULL_TREE;

If I delete this code, RVV can optimize it. Do you have any suggestions?

This is my observation:

Breakpoint 6, visit_reference_op_load (lhs=0x768364c8, op=0x76874410, stmt=0x76872640) at ../../../../gcc/gcc/tree-ssa-sccvn.cc:5740
5740 result = vn_reference_lookup (op, vuse, default_vn_walk_kind, &res, true, &last_vuse);
(gdb) c
Continuing.
Breakpoint 6, visit_reference_op_load (lhs=0x768364c8, op=0x76874410, stmt=0x76872640) at ../../../../gcc/gcc/tree-ssa-sccvn.cc:5740
5740 result = vn_reference_lookup (op, vuse, default_vn_walk_kind, &res, true, &last_vuse);
(gdb) n
5746 && !useless_type_conversion_p (TREE_TYPE (result), TREE_TYPE (op)))
(gdb) p debug (result)
"\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"
$9 = void
(gdb) p op->typed.type->type_common.mode
$10 = E_V16QImode
[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751 --- Comment #6 from Andrew Pinski --- (In reply to JuzheZhong from comment #5) > (In reply to Andrew Pinski from comment #4) > > The issue for aarch64 with SVE is that MASK_LOAD is not optimized: > > > > ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; > > ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; > > vect__1.7_9 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, > > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > > ... }); > > vect__2.10_35 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, > > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > > ... }); > > I don't ARM SVE has issues ... It does as I mentioned if you use -fno-vect-cost-model, you get the above issue which should be optimized really to a constant vector ...
[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751 --- Comment #5 from JuzheZhong --- (In reply to Andrew Pinski from comment #4) > The issue for aarch64 with SVE is that MASK_LOAD is not optimized: > > ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; > ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; > vect__1.7_9 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > ... }); > vect__2.10_35 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, > -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, > ... }); I don't think ARM SVE has issues ... If we can choose a fixed-length vector mode to vectorize it, it will be well optimized. I think this is a RISC-V target-dependent issue.
[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751 --- Comment #4 from Andrew Pinski --- The issue for aarch64 with SVE is that MASK_LOAD is not optimized: ic = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; ib = "\x00\x03\x06\t\f\x0f\x12\x15\x18\x1b\x1e!$\'*-"; vect__1.7_9 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... }); vect__2.10_35 = .MASK_LOAD (, 8B, { -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, ... });
[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751 --- Comment #3 from Andrew Pinski --- If you add `-fno-vect-cost-model` when compiling for aarch64, then it uses SVE and does not optimize to just `return 0`.
[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-10-10 Ever confirmed|0 |1 --- Comment #2 from Andrew Pinski --- Confirmed.
[Bug tree-optimization/111751] RISC-V: RVV unexpected vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751 --- Comment #1 from Andrew Pinski --- AArch64 did vectorize the code, just using non-SVE vectors, which then allowed it to be optimized too.
RE: [PATCH] RISC-V: Add available vector size for RVV
Committed, thanks Kito. Pan -Original Message- From: Kito Cheng Sent: Tuesday, October 10, 2023 11:20 AM To: Juzhe-Zhong Cc: gcc-patches@gcc.gnu.org; kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp@gmail.com Subject: Re: [PATCH] RISC-V: Add available vector size for RVV LGTM On Mon, Oct 9, 2023 at 4:23 PM Juzhe-Zhong wrote: > > For RVV, we have VLS modes enable according to TARGET_MIN_VLEN > from M1 to M8. > > For example, when TARGET_MIN_VLEN = 128 bits, we enable > 128/256/512/1024 bits VLS modes. > > This patch fixes following FAIL: > FAIL: gcc.dg/vect/bb-slp-subgroups-2.c -flto -ffat-lto-objects > scan-tree-dump-times slp2 "optimized: basic block" 2 > FAIL: gcc.dg/vect/bb-slp-subgroups-2.c scan-tree-dump-times slp2 "optimized: > basic block" 2 > > gcc/testsuite/ChangeLog: > > * lib/target-supports.exp: Add 256/512/1024 > > --- > gcc/testsuite/lib/target-supports.exp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/testsuite/lib/target-supports.exp > b/gcc/testsuite/lib/target-supports.exp > index af52c38433d..dc366d35a0a 100644 > --- a/gcc/testsuite/lib/target-supports.exp > +++ b/gcc/testsuite/lib/target-supports.exp > @@ -8881,7 +8881,7 @@ proc available_vector_sizes { } { > lappend result 4096 2048 1024 512 256 128 64 32 16 8 4 2 > } elseif { [istarget riscv*-*-*] } { > if { [check_effective_target_riscv_v] } { > - lappend result 0 32 64 128 > + lappend result 0 32 64 128 256 512 1024 > } > lappend result 128 > } else { > -- > 2.36.3 >
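The size list in the quoted patch follows from the rule it states: with LMUL running from M1 to M8, a minimum VLEN of 128 bits gives 128/256/512/1024-bit VLS modes. A minimal sketch of that arithmetic, assuming one mode per LMUL in {1, 2, 4, 8}; the helper name `vls_mode_bits` is hypothetical and not part of GCC or the testsuite:

```cpp
#include <vector>

// Hypothetical helper (not a GCC function): list the VLS mode sizes, in
// bits, enabled for a given minimum VLEN, assuming LMUL = 1, 2, 4, 8.
std::vector<int> vls_mode_bits(int min_vlen_bits) {
    std::vector<int> sizes;
    for (int lmul = 1; lmul <= 8; lmul *= 2)
        sizes.push_back(min_vlen_bits * lmul);
    return sizes;
}
```

Calling `vls_mode_bits(128)` yields the four sizes named in the patch description, which is why 256, 512 and 1024 are appended after 128 in `available_vector_sizes`.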
[PATCH 2/2] c++: note other candidates when diagnosing deletedness
With the previous improvements in place, we can easily extend our deletedness diagnostic to note the other candidates: deleted16.C: In function ‘int main()’: deleted16.C:10:4: error: use of deleted function ‘void f(int)’ 10 | f(0); | ~^~~ deleted16.C:5:6: note: declared here 5 | void f(int) = delete; | ^ deleted16.C:5:6: note: candidate: ‘void f(int)’ (deleted) deleted16.C:6:6: note: candidate: ‘void f(...)’ 6 | void f(...); | ^ deleted16.C:7:6: note: candidate: ‘void f(int, int)’ 7 | void f(int, int); | ^ deleted16.C:7:6: note: candidate expects 2 arguments, 1 provided These notes are disabled when a deleted special member function is selected primarily because it introduces a lot of new "cannot bind reference" errors in the testsuite when noting non-viable candidates, e.g. in cpp0x/initlist-opt1.C we would need to expect an error at A(A&&). gcc/cp/ChangeLog: * call.cc (build_over_call): Call print_z_candidates when diagnosing deletedness. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/deleted16.C: New test. --- gcc/cp/call.cc | 10 +- gcc/testsuite/g++.dg/cpp0x/deleted16.C | 11 +++ 2 files changed, 20 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/deleted16.C diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc index 648d383ca4e..55fd71636b1 100644 --- a/gcc/cp/call.cc +++ b/gcc/cp/call.cc @@ -9873,7 +9873,15 @@ build_over_call (struct z_candidate *cand, int flags, tsubst_flags_t complain) if (DECL_DELETED_FN (fn)) { if (complain & tf_error) - mark_used (fn); + { + mark_used (fn); + /* Note the other candidates we considered unless we selected a +special member function since the mismatch reasons for other +candidates are usually uninteresting, e.g. rvalue vs lvalue +reference binding . 
*/ + if (cand->next && !special_memfn_p (fn)) + print_z_candidates (input_location, cand, /*only_viable_p=*/false); + } return error_mark_node; } diff --git a/gcc/testsuite/g++.dg/cpp0x/deleted16.C b/gcc/testsuite/g++.dg/cpp0x/deleted16.C new file mode 100644 index 000..9fd2fbb1465 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp0x/deleted16.C @@ -0,0 +1,11 @@ +// Verify we note other candidates when a deleted function is +// selected by overload resolution. +// { dg-do compile { target c++11 } } + +void f(int) = delete; // { dg-message "declared here|candidate" } +void f(...); // { dg-message "candidate" } +void f(int, int); // { dg-message "candidate" } + +int main() { + f(0); // { dg-error "deleted" } +} -- 2.42.0.325.g3a06386e31
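The situation that deleted16.C exercises can also be checked at compile time. The sketch below declares the same three overloads as the test and uses C++11 SFINAE to show that a one-argument call is ill-formed (overload resolution selects the deleted `f(int)` even though `f(...)` is viable), while a two-argument call is fine. The detector `can_call_f` is a hypothetical helper written for this illustration, not something from the patch:

```cpp
#include <type_traits>
#include <utility>

void f(int) = delete;   // best match for f(0), but deleted
void f(...);            // viable for f(0), but a worse match
void f(int, int);       // not viable for a single argument

// Hypothetical detector: true_type if the call f(args...) is well-formed.
template <typename... Args>
auto can_call_f(int)
    -> decltype(f(std::declval<Args>()...), std::true_type{});
template <typename... Args>
std::false_type can_call_f(...);

// f(0): the deleted overload wins, so the call is ill-formed.
static_assert(!decltype(can_call_f<int>(0))::value,
              "f(int) is selected and deleted");
// f(0, 0): only f(int, int) matches exactly and it is usable.
static_assert(decltype(can_call_f<int, int>(0))::value,
              "f(int, int) is usable");
```

This mirrors why the new notes are useful: the viable `f(...)` candidate exists but loses to the deleted one, which is easy to miss from the error alone.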
[PATCH 1/2] c++: sort candidates according to viability
This patch: * changes splice_viable to move the non-viable candidates to the end of the list instead of removing them outright * makes tourney move the best candidate to the front of the candidate list * adjusts print_z_candidates to preserve our behavior of printing only viable candidates when diagnosing ambiguity * adds a parameter to print_z_candidates to control this default behavior (the follow-up patch will want to print all candidates when diagnosing deletedness) Thus after this patch we have access to the entire candidate list through the best viable candidate. This change also happens to fix diagnostics for the below testcase where we currently neglect to note the third candidate, since the presence of the two unordered non-strictly viable candidates causes splice_viable to prematurely get rid of the non-viable third candidate. gcc/cp/ChangeLog: * call.cc: Include "tristate.h". (splice_viable): Sort the candidate list according to viability. Don't remove non-viable candidates from the list. (print_z_candidates): Add defaulted only_viable_p parameter. By default only print non-viable candidates if there is no viable candidate. (tourney): Make 'candidates' parameter a reference. Ignore non-viable candidates. Move the true champ to the front of the candidates list, and update 'candidates' to point to the front. gcc/testsuite/ChangeLog: * g++.dg/overload/error5.C: New test. --- gcc/cp/call.cc | 161 +++-- gcc/testsuite/g++.dg/overload/error5.C | 11 ++ 2 files changed, 111 insertions(+), 61 deletions(-) create mode 100644 gcc/testsuite/g++.dg/overload/error5.C diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc index 15079ddf6dc..648d383ca4e 100644 --- a/gcc/cp/call.cc +++ b/gcc/cp/call.cc @@ -43,6 +43,7 @@ along with GCC; see the file COPYING3. If not see #include "attribs.h" #include "decl.h" #include "gcc-rich-location.h" +#include "tristate.h" /* The various kinds of conversion. 
*/ @@ -160,7 +161,7 @@ static struct obstack conversion_obstack; static bool conversion_obstack_initialized; struct rejection_reason; -static struct z_candidate * tourney (struct z_candidate *, tsubst_flags_t); +static struct z_candidate * tourney (struct z_candidate *&, tsubst_flags_t); static int equal_functions (tree, tree); static int joust (struct z_candidate *, struct z_candidate *, bool, tsubst_flags_t); @@ -176,7 +177,8 @@ static void op_error (const op_location_t &, enum tree_code, enum tree_code, static struct z_candidate *build_user_type_conversion_1 (tree, tree, int, tsubst_flags_t); static void print_z_candidate (location_t, const char *, struct z_candidate *); -static void print_z_candidates (location_t, struct z_candidate *); +static void print_z_candidates (location_t, struct z_candidate *, + tristate = tristate::unknown ()); static tree build_this (tree); static struct z_candidate *splice_viable (struct z_candidate *, bool, bool *); static bool any_strictly_viable (struct z_candidate *); @@ -3718,68 +3720,60 @@ add_template_conv_candidate (struct z_candidate **candidates, tree tmpl, } /* The CANDS are the set of candidates that were considered for - overload resolution. Return the set of viable candidates, or CANDS - if none are viable. If any of the candidates were viable, set + overload resolution. Sort CANDS so that the strictly viable + candidates appear first, followed by non-strictly viable candidates, + followed by unviable candidates. Returns the first candidate + in this sorted list. If any of the candidates were viable, set *ANY_VIABLE_P to true. STRICT_P is true if a candidate should be - considered viable only if it is strictly viable. */ + considered viable only if it is strictly viable when setting + *ANY_VIABLE_P. 
*/ static struct z_candidate* splice_viable (struct z_candidate *cands, bool strict_p, bool *any_viable_p) { - struct z_candidate *viable; - struct z_candidate **last_viable; - struct z_candidate **cand; - bool found_strictly_viable = false; + z_candidate *strictly_viable = nullptr; + z_candidate **strictly_viable_tail = _viable; + + z_candidate *non_strictly_viable = nullptr; + z_candidate **non_strictly_viable_tail = _strictly_viable; + + z_candidate *unviable = nullptr; + z_candidate **unviable_tail = /* Be strict inside templates, since build_over_call won't actually do the conversions to get pedwarns. */ if (processing_template_decl) strict_p = true; - viable = NULL; - last_viable = - *any_viable_p = false; - - cand = - while (*cand) + for (z_candidate *cand = cands; cand; cand = cand->next) { - struct z_candidate *c = *cand; if (!strict_p - && (c->viable
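The reordering described for splice_viable (strictly viable first, then non-strictly viable, then unviable) is a stable three-way partition of a singly linked list using tail pointers. A minimal, self-contained sketch of that technique follows; the `Cand` type, its `viable` encoding (1, -1, 0) and the function name are assumptions made for illustration and are not GCC's actual definitions:

```cpp
// Hypothetical stand-in for z_candidate; only the fields the sketch needs.
struct Cand {
    int viable;   // assumed: 1 strictly viable, -1 non-strictly viable, 0 unviable
    int id;       // illustrative payload used to observe the ordering
    Cand *next;
};

// Stable three-way partition of a singly linked list via tail pointers.
Cand *sort_by_viability(Cand *cands) {
    Cand *strict = nullptr, *non_strict = nullptr, *unviable = nullptr;
    Cand **strict_tail = &strict;
    Cand **non_strict_tail = &non_strict;
    Cand **unviable_tail = &unviable;

    for (Cand *c = cands; c; ) {
        Cand *next = c->next;
        // Pick the tail pointer of the bucket this candidate belongs to.
        Cand **&tail = (c->viable == 1) ? strict_tail
                     : (c->viable == -1) ? non_strict_tail
                     : unviable_tail;
        *tail = c;          // append to that bucket
        tail = &c->next;    // advance the bucket's tail
        c->next = nullptr;
        c = next;
    }

    // Concatenate: strict -> non_strict -> unviable. An empty bucket's
    // tail still points at its head variable, so this handles empties.
    *non_strict_tail = unviable;
    *strict_tail = non_strict;
    return strict;
}
```

Because nothing is dropped from the list, a caller holding the head can still walk to every candidate, which is the property the follow-up patch relies on.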
Re: [PATCH] RISC-V: Add available vector size for RVV
LGTM On Mon, Oct 9, 2023 at 4:23 PM Juzhe-Zhong wrote: > > For RVV, we have VLS modes enable according to TARGET_MIN_VLEN > from M1 to M8. > > For example, when TARGET_MIN_VLEN = 128 bits, we enable > 128/256/512/1024 bits VLS modes. > > This patch fixes following FAIL: > FAIL: gcc.dg/vect/bb-slp-subgroups-2.c -flto -ffat-lto-objects > scan-tree-dump-times slp2 "optimized: basic block" 2 > FAIL: gcc.dg/vect/bb-slp-subgroups-2.c scan-tree-dump-times slp2 "optimized: > basic block" 2 > > gcc/testsuite/ChangeLog: > > * lib/target-supports.exp: Add 256/512/1024 > > --- > gcc/testsuite/lib/target-supports.exp | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/testsuite/lib/target-supports.exp > b/gcc/testsuite/lib/target-supports.exp > index af52c38433d..dc366d35a0a 100644 > --- a/gcc/testsuite/lib/target-supports.exp > +++ b/gcc/testsuite/lib/target-supports.exp > @@ -8881,7 +8881,7 @@ proc available_vector_sizes { } { > lappend result 4096 2048 1024 512 256 128 64 32 16 8 4 2 > } elseif { [istarget riscv*-*-*] } { > if { [check_effective_target_riscv_v] } { > - lappend result 0 32 64 128 > + lappend result 0 32 64 128 256 512 1024 > } > lappend result 128 > } else { > -- > 2.36.3 >
[Bug c/111751] New: RISC-V: RVV unexpected vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111751 Bug ID: 111751 Summary: RISC-V: RVV unexpected vectorization Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: juzhe.zhong at rivai dot ai Target Milestone: --- #include #define N 16 int main () { int i; char ia[N]; char ic[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45}; char ib[N] = {0,3,6,9,12,15,18,21,24,27,30,33,36,39,42,45}; /* Not vectorizable, multiplication */ for (i = 0; i < N; i++) { ia[i] = ib[i] * ic[i]; } /* check results: */ for (i = 0; i < N; i++) { if (ia[i] != (char) (ib[i] * ic[i])) abort (); } return 0; } RVV GCC ASM: main: lui a5,%hi(.LANCHOR0) addia5,a5,%lo(.LANCHOR0) addisp,sp,-48 ld a4,0(a5) ld a5,8(a5) sd a5,8(sp) sd a5,24(sp) sd ra,40(sp) addia5,sp,16 sd a4,0(sp) sd a4,16(sp) vsetivlizero,16,e8,m1,ta,ma vle8.v v1,0(a5) vle8.v v2,0(sp) vmul.vv v1,v1,v2 vmv.x.s a5,v1 andia5,a5,0xff bne a5,zero,.L2 vslidedown.vi v2,v1,1 li a4,9 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,2 li a4,36 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,3 li a4,81 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,4 li a4,144 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,5 li a4,225 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,6 li a4,68 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,7 li a4,185 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,8 li a4,64 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,9 li a4,217 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,10 li a4,132 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,11 li a4,65 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,12 li a4,16 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,13 li a4,241 vmv.x.s a5,v2 andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v2,v1,14 li a4,228 vmv.x.s a5,v2 
andia5,a5,0xff bne a5,a4,.L2 vslidedown.vi v1,v1,15 li a4,233 vmv.x.s a5,v1 andia5,a5,0xff bne a5,a4,.L2 ld ra,40(sp) li a0,0 addisp,sp,48 jr ra .L2: callabort ARM SVE GCC: main: mov w0, 0 ret
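The constants loaded with `li a4,...` in the RVV listing are the low 8 bits of each product `ib[i] * ic[i]`, where both arrays hold `3*i`. The following sketch recomputes them; it is only an arithmetic check of the listing (the function name is made up for this note) and illustrates that the whole comparison is a compile-time constant, which is why the SVE-free AArch64 output collapses to `return 0`:

```cpp
#include <array>

// Low 8 bits of ib[i] * ic[i] for the arrays in the testcase, where
// element i of both arrays is 3*i. Hypothetical helper for illustration.
std::array<unsigned, 16> expected_low_bytes() {
    std::array<unsigned, 16> out{};
    for (unsigned i = 0; i < 16; ++i) {
        unsigned v = 3 * i;
        out[i] = (v * v) & 0xff;   // same masking as "andi a5,a5,0xff"
    }
    return out;
}
```

The resulting sequence is 0, 9, 36, 81, 144, 225, 68, 185, 64, 217, 132, 65, 16, 241, 228, 233, matching the immediates in the assembly element by element.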
Re: [PATCH] RISC-V: Make xtheadcondmov-indirect tests robust against instruction reordering
I guess you may also want to clean up those bodies for "check-function-bodies"? On Mon, Oct 9, 2023 at 3:47 PM Christoph Muellner wrote: > > From: Christoph Müllner > > Fixes: c1bc7513b1d7 ("RISC-V: const: hide mvconst splitter from IRA") > > A recent change broke the xtheadcondmov-indirect tests, because the order of > emitted instructions changed. Since the test is too strict when testing for > a fixed instruction order, let's change the tests to simply count instruction, > like it is done for similar tests. > > Reported-by: Patrick O'Neill > Signed-off-by: Christoph Müllner > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/xtheadcondmov-indirect.c: Make robust against > instruction reordering. > > Signed-off-by: Christoph Müllner > --- > .../gcc.target/riscv/xtheadcondmov-indirect.c | 11 --- > 1 file changed, 8 insertions(+), 3 deletions(-) > > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c > b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c > index c3253ba5239..eba1b86137b 100644 > --- a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c > +++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c > @@ -1,8 +1,7 @@ > /* { dg-do compile } */ > -/* { dg-options "-march=rv32gc_xtheadcondmov -fno-sched-pressure" { target { > rv32 } } } */ > -/* { dg-options "-march=rv64gc_xtheadcondmov -fno-sched-pressure" { target { > rv64 } } } */ > +/* { dg-options "-march=rv32gc_xtheadcondmov" { target { rv32 } } } */ > +/* { dg-options "-march=rv64gc_xtheadcondmov" { target { rv64 } } } */ > /* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */ > -/* { dg-final { check-function-bodies "**" "" } } */ > > /* > ** ConEmv_imm_imm_reg: > @@ -116,3 +115,9 @@ int ConNmv_reg_reg_reg(int x, int y, int z, int n) > return z; >return n; > } > + > +/* { dg-final { scan-assembler-times "addi\t" 5 } } */ > +/* { dg-final { scan-assembler-times "li\t" 4 } } */ > +/* { dg-final { scan-assembler-times "sub\t" 4 } } */ > +/* { dg-final { 
scan-assembler-times "th.mveqz\t" 4 } } */ > +/* { dg-final { scan-assembler-times "th.mvnez\t" 4 } } */ > -- > 2.41.0 >
[PATCH] RISC-V Regression: Fix FAIL of predcom-2.c
Like GCN, add -fno-tree-vectorize. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/predcom-2.c: Add riscv. --- gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c b/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c index f19edd4cd74..681ff7c696b 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/predcom-2.c @@ -1,6 +1,6 @@ /* { dg-do run } */ /* { dg-options "-O2 -funroll-loops --param max-unroll-times=8 -fpredictive-commoning -fdump-tree-pcom-details-blocks -fno-tree-pre" } */ -/* { dg-additional-options "-fno-tree-vectorize" { target amdgcn-*-* } } */ +/* { dg-additional-options "-fno-tree-vectorize" { target amdgcn-*-* riscv*-*-* } } */ void abort (void); -- 2.36.3
[PATCH] use get_range_query to replace get_global_range_query
Hi,

With "get_global_range_query", SSA_NAME_RANGE_INFO can be queried. "get_range_query" can provide more context-aware range info. Looking at the implementation of "get_range_query", it returns the global range query if there is no function-local range info.

So, if not querying specifically for SSA_NAME_RANGE_INFO, it would be OK to use get_range_query to replace get_global_range_query.

The patch at https://gcc.gnu.org/pipermail/gcc-patches/2023-September/630389.html uses get_range_query so that it can handle more cases.

This patch replaces get_global_range_query with get_range_query in most of the code where that is possible (but does not add new test cases).

Bootstrapped and regtested on ppc64{,le} and x86_64. Is this OK for trunk?

BR, Jeff (Jiufu Guo)

gcc/ChangeLog:
* builtins.cc (expand_builtin_strnlen): Replace get_global_range_query by get_range_query.
* fold-const.cc (expr_not_equal_to): Likewise.
* gimple-fold.cc (size_must_be_zero_p): Likewise.
* gimple-range-fold.cc (fur_source::fur_source): Likewise.
* gimple-ssa-warn-access.cc (check_nul_terminated_array): Likewise.
* tree-dfa.cc (get_ref_base_and_extent): Likewise.
* tree-ssa-loop-split.cc (split_at_bb_p): Likewise.
* tree-ssa-loop-unswitch.cc (evaluate_control_stmt_using_entry_checks): Likewise.
--- gcc/builtins.cc | 2 +- gcc/fold-const.cc | 6 +- gcc/gimple-fold.cc| 6 ++ gcc/gimple-range-fold.cc | 4 +--- gcc/gimple-ssa-warn-access.cc | 2 +- gcc/tree-dfa.cc | 5 + gcc/tree-ssa-loop-split.cc| 2 +- gcc/tree-ssa-loop-unswitch.cc | 2 +- 8 files changed, 9 insertions(+), 20 deletions(-) diff --git a/gcc/builtins.cc b/gcc/builtins.cc index cb90bd03b3e..4e0a77ff8e0 100644 --- a/gcc/builtins.cc +++ b/gcc/builtins.cc @@ -3477,7 +3477,7 @@ expand_builtin_strnlen (tree exp, rtx target, machine_mode target_mode) wide_int min, max; value_range r; - get_global_range_query ()->range_of_expr (r, bound); + get_range_query (cfun)->range_of_expr (r, bound); if (r.varying_p () || r.undefined_p ()) return NULL_RTX; min = r.lower_bound (); diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 4f8561509ff..15134b21b9f 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -11056,11 +11056,7 @@ expr_not_equal_to (tree t, const wide_int ) if (!INTEGRAL_TYPE_P (TREE_TYPE (t))) return false; - if (cfun) - get_range_query (cfun)->range_of_expr (vr, t); - else - get_global_range_query ()->range_of_expr (vr, t); - + get_range_query (cfun)->range_of_expr (vr, t); if (!vr.undefined_p () && !vr.contains_p (w)) return true; /* If T has some known zero bits and W has any of those bits set, diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc index dc89975270c..853edd9e5d4 100644 --- a/gcc/gimple-fold.cc +++ b/gcc/gimple-fold.cc @@ -876,10 +876,8 @@ size_must_be_zero_p (tree size) wide_int zero = wi::zero (TYPE_PRECISION (type)); value_range valid_range (type, zero, ssize_max); value_range vr; - if (cfun) -get_range_query (cfun)->range_of_expr (vr, size); - else -get_global_range_query ()->range_of_expr (vr, size); + get_range_query (cfun)->range_of_expr (vr, size); + if (vr.undefined_p ()) vr.set_varying (TREE_TYPE (size)); vr.intersect (valid_range); diff --git a/gcc/gimple-range-fold.cc b/gcc/gimple-range-fold.cc index d1945ccb554..6e9530c3d7f 100644 --- a/gcc/gimple-range-fold.cc +++ 
b/gcc/gimple-range-fold.cc @@ -50,10 +50,8 @@ fur_source::fur_source (range_query *q) { if (q) m_query = q; - else if (cfun) -m_query = get_range_query (cfun); else -m_query = get_global_range_query (); +m_query = get_range_query (cfun); m_gori = NULL; } diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc index fcaff128d60..e439d1b9b68 100644 --- a/gcc/gimple-ssa-warn-access.cc +++ b/gcc/gimple-ssa-warn-access.cc @@ -332,7 +332,7 @@ check_nul_terminated_array (GimpleOrTree expr, tree src, tree bound) { Value_Range r (TREE_TYPE (bound)); - get_global_range_query ()->range_of_expr (r, bound); + get_range_query (cfun)->range_of_expr (r, bound); if (r.undefined_p () || r.varying_p ()) return true; diff --git a/gcc/tree-dfa.cc b/gcc/tree-dfa.cc index af8e9243947..5355af2c869 100644 --- a/gcc/tree-dfa.cc +++ b/gcc/tree-dfa.cc @@ -531,10 +531,7 @@ get_ref_base_and_extent (tree exp, poly_int64 *poffset, value_range vr; range_query *query; - if (cfun) - query = get_range_query (cfun); - else - query = get_global_range_query (); + query = get_range_query (cfun); if (TREE_CODE (index) == SSA_NAME && (low_bound = array_ref_low_bound (exp), diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc index 64464802c1e..e85a1881526 100644 ---
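The simplification in these hunks rests on the fallback the description mentions: get_range_query hands back the global query when there is no function-local one. The mock below sketches that reasoning with stand-in types; none of it is GCC's real code, the field name and the null-tolerant behavior are assumptions for illustration, and `pick_old`/`pick_new` are hypothetical names for the before/after patterns:

```cpp
// Mock types standing in for GCC's; everything here is illustrative.
struct range_query { const char *name; };
struct function { range_query *x_range_query; };

static range_query global_ranges{"global"};

range_query *get_global_range_query() { return &global_ranges; }

// Assumed behavior: use the function-local query when there is one,
// otherwise fall back to the global query (also for a null function).
range_query *get_range_query(const function *fun) {
    return (fun && fun->x_range_query) ? fun->x_range_query
                                       : get_global_range_query();
}

// The pattern removed by the patch ...
range_query *pick_old(const function *cfun) {
    if (cfun)
        return get_range_query(cfun);
    else
        return get_global_range_query();
}

// ... and its replacement.
range_query *pick_new(const function *cfun) {
    return get_range_query(cfun);
}
```

Under that assumption the two helpers agree for every input, so the explicit `if (cfun)` test adds nothing.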
[PATCH] RISC-V Regression: Make match patterns more accurate
This patch fixes 2 FAILs in RVV regression testing that occur because the existing check is not accurate. It's inspired by Robin's previous patch: https://patchwork.sourceware.org/project/gcc/patch/dde89b9e-49a0-d70b-0906-fb3022cac...@gmail.com/ gcc/testsuite/ChangeLog: * gcc.dg/vect/no-scevccp-outer-7.c: Adjust regex pattern. * gcc.dg/vect/no-scevccp-vect-iv-3.c: Ditto. --- gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c | 2 +- gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c b/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c index 543ee98b5a4..058d1d2db2d 100644 --- a/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c +++ b/gcc/testsuite/gcc.dg/vect/no-scevccp-outer-7.c @@ -77,4 +77,4 @@ int main (void) } /* { dg-final { scan-tree-dump-times "OUTER LOOP VECTORIZED." 1 "vect" { target vect_widen_mult_hi_to_si } } } */ -/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 1 "vect" } } */ +/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded" 1 "vect" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c b/gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c index 7049e4936b9..6f2b2210b11 100644 --- a/gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c +++ b/gcc/testsuite/gcc.dg/vect/no-scevccp-vect-iv-3.c @@ -30,4 +30,4 @@ unsigned int main1 () } /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { target vect_widen_sum_hi_to_si } } } */ -/* { dg-final { scan-tree-dump-times "vect_recog_widen_sum_pattern: detected" 1 "vect" { target vect_widen_sum_hi_to_si } } } */ +/* { dg-final { scan-tree-dump-times "vect_recog_widen_sum_pattern: detected(?:(?!failed)(?!Re-trying).)*succeeded" 1 "vect" { target vect_widen_sum_hi_to_si } } } */ -- 2.36.3
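The adjusted pattern uses negative lookahead so that a "detected" only counts when it reaches "succeeded" without passing a "failed" or "Re-trying" on the way. A rough sketch of the same idea with `std::regex` follows. It is an adaptation, not the testsuite's Tcl matcher: `[\s\S]` stands in for `.` because `std::regex`'s `.` does not cross newlines, and the function name is hypothetical:

```cpp
#include <iterator>
#include <regex>
#include <string>

// Count non-overlapping matches of "detected ... succeeded" with no
// "failed" or "Re-trying" between them. Each consumed character is
// guarded by the two negative lookaheads.
long count_clean_detections(const std::string &dump) {
    static const std::regex re(
        R"(detected(?:(?!failed)(?!Re-trying)[\s\S])*succeeded)");
    return std::distance(
        std::sregex_iterator(dump.begin(), dump.end(), re),
        std::sregex_iterator());
}
```

With the old pattern, a detection in a vectorization attempt that later fails and is retried would still be counted; the lookaheads stop the match at the first "failed" or "Re-trying", so only the attempt that succeeds contributes.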
[Bug tree-optimization/111734] [14 Regression] wrong code with '-O3 -fno-inline-functions-called-once -fno-inline-small-functions -fno-omit-frame-pointer -fno-toplevel-reorder -fno-tree-fre'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111734 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Summary|wrong code with '-O3|[14 Regression] wrong code |-fno-inline-functions-calle |with '-O3 |d-once |-fno-inline-functions-calle |-fno-inline-small-functions |d-once |-fno-omit-frame-pointer |-fno-inline-small-functions |-fno-toplevel-reorder |-fno-omit-frame-pointer |-fno-tree-fre' |-fno-toplevel-reorder ||-fno-tree-fre' Last reconfirmed||2023-10-10 Component|c |tree-optimization Target Milestone|--- |14.0 Status|UNCONFIRMED |NEW --- Comment #3 from Andrew Pinski --- PRE does: Processing block 0: BB2 Value numbering stmt = *m_1(D) = RHS simplified to No store match Value numbering store *m_1(D) to Setting value number of .MEM_3 to .MEM_ ... Starting insert iteration 1 Deleted redundant store *m_1(D) = Removing dead stmt *m_1(D) = Better reduced testcase: ``` struct a {}; struct { unsigned b; unsigned short c; } d, f = {9, 1}; int e; static void g(unsigned, __SIZE_TYPE__, int **m); static void h() { int *i = g(0, (__SIZE_TYPE__)i, ); if (*i) f = d; } void g(unsigned a, __SIZE_TYPE__ b, int **m) { *m = } int main() { h(); if (f.c != 1) __builtin_abort(); } ```
[Bug target/111745] [14 Regression] ICE: in extract_insn, at recog.cc:2791 (unrecognizable insn) with -ffloat-store -mavx512fp16 -mavx512vl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111745 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #1 from Hongtao.liu --- Mine, I'll take a look.
[PATCH V1] introduce light expander sra
Hi, There are a few PRs (meta-bug PR101926) on various targets. The root causes of them are similar: the aggeragte param/ returns are passed by multi-registers, but they are stored to stack from registers first; and then, access the parameter through stack slot. A general idea to enhance this: accessing the aggregate parameters/returns directly through registers. This idea would be a kind of SRA (using the scalar registers to access the aggregate parameter/returns). This experimental patch for light-expander-sra contains below parts: a. Check if the parameters/returns are ok/profitable to scalarize, and set the scalar pseudos for the parameter/return. - This is done in "expand_function_start", after the incoming/outgoing hard registers are determined for the paramter(s)/return. The scalarized registers are recorded in DECL_RTL for the parameter/return in parallel form. - At the time when setting DECL_RTL, "scalarizable_aggregate" is called to check the accesses are ok/profitable to scalarize. We can continue to enhance this function, to support more cases. For example: - 'reverse storage order'. - 'TImode/vector-mode from multi-regs'. - some cases on 'writing to parameter'/'overlap accesses'. b. When expanding the accesses of the parameters/returns, according to the info of the access(e.g. bitpos,bitsize, mode), the scalar(pseudos) can be figured out to expand the access. This may happen when expand below accesses: - The component access of a parameter: "_1 = arg.f1". Or whole parameter access: rhs of "_2 = arg" - The assignment to a return val: "D.xx = yy; or D.xx.f = zz" where D.xx occurs on return stmt. - This is mainly done in expr.cc(expand_expr_real_1, and expand_assignment). Function "extract_sub_member" is used to figure out the scalar rtxs(pseudos). Besides the above two parts, some work are done in the GIMPLE tree: collect sra candidates for parameters/returns, and collect the SRA access info. 
This is mainly done at the beginning of the expander pass by the class (named expand_sra) and its member functions. Below are two major items of this part. - Collect light-expand-sra candidates. Each parameter is checked if it has the proper aggregate type. Collect return val (VAR_P) on each return stmt if the function is returning via registers. This is implemented in expand_sra::collect_sra_candidates. - Build/collect/manage all the accesses on the candidates. The function "scan_function" is used to do this work; it goes through all basic blocks, and all interesting stmts (phi, return, assign, call, asm) are checked. If there is an interesting expression (e.g. COMPONENT_REF or PARM_DECL), then record the required info for the access (e.g. pos, size, type, base). And if it is risky to do SRA, the candidates may be removed, e.g. address-taken and accessed via memory: "foo(struct S arg) {bar ();}" This patch also tries to share common code between light-expand-sra, tree-sra, and ipa-sra. We can continue refactoring to share similar functionality. Compared with the previous version, this version avoids storing the parameter to the stack if it is scalarized. https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631177.html This patch is tested on ppc64{,le} and x86_64. Is this ok for trunk? BR, Jeff (Jiufu Guo) PR target/65421 gcc/ChangeLog: * cfgexpand.cc (struct access): New class. (struct expand_sra): New class. (expand_sra::collect_sra_candidates): New member function. (expand_sra::add_sra_candidate): Likewise. (expand_sra::build_access): Likewise. (expand_sra::analyze_phi): Likewise. (expand_sra::analyze_assign): Likewise. (expand_sra::visit_base): Likewise. (expand_sra::protect_mem_access_in_stmt): Likewise. (expand_sra::expand_sra): Class constructor. (expand_sra::~expand_sra): Class destructor. (expand_sra::scalarizable_access): New member function. (expand_sra::scalarizable_accesses): Likewise. (scalarizable_aggregate): New function. 
(set_scalar_rtx_for_returns): New function. (expand_value_return): Updated. (expand_debug_expr): Updated. (pass_expand::execute): Updated to use expand_sra. * cfgexpand.h (scalarizable_aggregate): New declare. (set_scalar_rtx_for_returns): New declare. * expr.cc (expand_assignment): Updated. (expand_constructor): Updated. (query_position_in_parallel): New function. (extract_sub_member): New function. (expand_expr_real_1): Updated. * expr.h (query_position_in_parallel): New declare. * function.cc (assign_parm_setup_block): Updated. (assign_parms): Updated. (expand_function_start): Updated. * tree-sra.h (struct sra_base_access): New class. (struct sra_default_analyzer): New class.
[Bug tree-optimization/111738] incorrect code when PGO is enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111738 --- Comment #3 from Anonymous --- (In reply to Richard Biener from comment #1) > I can't reproduce. Your git version is quite old, it translates to > r14-2634-g85da0b40538fb0 for me. It doesn't reproduce with r14-2282 either > though. > > Current is r14-4486-g873586ebc565b6 Hi, Richard. According to your suggestion, we have updated our gcc to the latest trunk as: $ gcc -v Using built-in specs. COLLECT_GCC=/root/gcc_set/202310092007/bin/gcc COLLECT_LTO_WRAPPER=/root/gcc_set/202310092007/libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../gcc/configure --prefix=/root/gcc_set/202310092007 --with-gmp=/root/build_essential --with-mpfr=/root/build_essential --with-mpc=/root/build_essential --enable-languages=c,c++ --disable-multilib Thread model: posix Supported LTO compression algorithms: zlib gcc version 14.0.0 20231009 (experimental) (GCC) git version: dee55cf59ceea989f47e7605205c6644b27a1f78 Then, we compiled the same test program with/without PGO enabled and found that the results are inconsistent as: $ gcc -O3 -w -fprofile-generate=profile a.c -o a.out $ ./a.out 4 $ gcc -O3 -w -fprofile-use=profile -Wno-missing-profile -fprofile-correction a.c -o a.out $ ./a.out 32765
[Bug libstdc++/111747] Problem with large float list initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111747 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from Andrew Pinski --- 32bit floating point has the following characteristics: Sign bit: 1 bit Exponent width: 8 bits Significand precision: 24 bits (23 explicitly stored) 50000000 is 0x2faf080, which is more than 24 bits wide, so not every integer on the way up to it can be represented exactly; once you start adding 1 to something that has reached 0x1000000 (16777216, which is what 1.67772e+07 is), the value stays the same and you start to lose precision.
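The saturation described above can be reproduced without a compiler. The following is a minimal sketch, assuming IEEE 754 binary32 and round-to-nearest-even; the helper names `to_f32` and `f32_add` are made up for illustration and are not part of the bug report. It starts the accumulator just below 2^24 rather than summing millions of ones, so it runs instantly.

```python
import struct


def to_f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def f32_add(a, b):
    """Add two binary32 values and round the sum back to binary32,
    mimicking what a 32-bit float accumulator does."""
    return to_f32(a + b)


LIMIT = 2.0 ** 24  # 16777216, the first point where adding 1.0f has no effect

# Start three steps below the limit and keep adding 1.0, as the summation
# loop in the report would eventually do.
acc = to_f32(LIMIT - 3)
for _ in range(10):
    acc = f32_add(acc, 1.0)

print(acc)  # the accumulator is stuck at the limit
```

The loop performs ten additions but only the first three change the value; after that, 16777216 + 1 lies exactly halfway between two representable floats and rounds back down to 16777216. The usual ways around this are accumulating in `double` or using a compensated (Kahan) sum.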
[PATCH] RISC-V Regression: Fix FAIL of bb-slp-pr65935.c for RVV
Here is a reference comparing the dump IR between ARM SVE and RVV: https://godbolt.org/z/zqess8Gss We can see that RVV has one more dump IR line: optimized: basic block part vectorized using 128 byte vectors since RVV has 1024 bit vectors. The codegen is reasonably good. However, I saw that GCN also has 1024-bit vectors, so this patch may cause this case to FAIL on the GCN port. Hi, GCN folks, could you check this patch on the GCN port for me? gcc/testsuite/ChangeLog: * gcc.dg/vect/bb-slp-pr65935.c: Add vect1024 variant. * lib/target-supports.exp: Ditto. --- gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c | 3 ++- gcc/testsuite/lib/target-supports.exp | 6 ++ 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c index 8df35327e7a..9ef1330b47c 100644 --- a/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-pr65935.c @@ -67,7 +67,8 @@ int main() /* We should also be able to use 2-lane SLP to initialize the real and imaginary components in the first loop of main. */ -/* { dg-final { scan-tree-dump-times "optimized: basic block" 10 "slp1" } } */ +/* { dg-final { scan-tree-dump-times "optimized: basic block" 10 "slp1" { target {! { vect1024 } } } } } */ +/* { dg-final { scan-tree-dump-times "optimized: basic block" 11 "slp1" { target { { vect1024 } } } } } */ /* We should see the s->phase[dir] operand splatted and no other operand built from scalars. See PR97334. */ /* { dg-final { scan-tree-dump "Using a splat" "slp1" } } */ diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index dc366d35a0a..95c489d7f76 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -8903,6 +8903,12 @@ proc check_effective_target_vect_variable_length { } { return [expr { [lindex [available_vector_sizes] 0] == 0 }] } +# Return 1 if the target supports vectors of 1024 bits. 
+ +proc check_effective_target_vect1024 { } { +return [expr { [lsearch -exact [available_vector_sizes] 1024] >= 0 }] +} + # Return 1 if the target supports vectors of 512 bits. proc check_effective_target_vect512 { } { -- 2.36.3
[Bug tree-optimization/111750] Spurious -Warray-bounds warning when using member function pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111750 --- Comment #1 from Andrew Pinski --- > That this source produces a -Warray-bounds warning is somewhat surprising > since it contains no arrays, no array indexing, and no pointer arithmetic Well, technically there is pointer arithmetic, because the pointer to member function could have a delta for the function call at `(c.*func)();` Also there is an "array", because all variables/decls are arrays in C++ with a size of 1 (that allows you to pass the address + 1 as the end for iterators). Anyway, the problem here is that the optimizer had optimized into `(c.*func)();` but had not yet optimized the ::my_method part when the warning happened.
[Bug debug/111749] Kk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111749 Andrew Pinski changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |WAITING Last reconfirmed||2023-10-10 --- Comment #1 from Andrew Pinski --- This bug has no information in it? Was that by accident?
[PATCH] RISC-V Regression: Fix dump check of bb-slp-68.c
Like GCN, RVV also has 64-byte (512-bit) vectors, which causes this test to FAIL. It is more reasonable to use "vect512" instead of amdgcn. gcc/testsuite/ChangeLog: * gcc.dg/vect/bb-slp-68.c: Use vect512. --- gcc/testsuite/gcc.dg/vect/bb-slp-68.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-68.c b/gcc/testsuite/gcc.dg/vect/bb-slp-68.c index e7573a14933..2dd3d8ee90c 100644 --- a/gcc/testsuite/gcc.dg/vect/bb-slp-68.c +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-68.c @@ -20,4 +20,4 @@ void foo () /* We want to have the store group split into 4, 2, 4 when using 32byte vectors. Unfortunately it does not work when 64-byte vectors are available. */ -/* { dg-final { scan-tree-dump-not "from scalars" "slp2" { xfail amdgcn-*-* } } } */ +/* { dg-final { scan-tree-dump-not "from scalars" "slp2" { xfail vect512 } } } */ -- 2.36.3
Re: Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
Oh. I realize this patch increases the FAILs that I recently fixed: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632247.html These fail because RVV doesn't have vec_pack_trunc_optab (the loop vectorizer fails the first time but succeeds the second time), so RVV dumps FOLD_EXTRACT_LAST 4 times instead of 2 (ARM SVE dumps it 2 times because it has vec_pack_trunc_optab). I think the root cause of RVV failing multiple "vect" tests is that we don't enable the vec_pack/vec_unpack/... patterns; we still succeed at vectorization and we want to enable those tests (mostly they just vectorize with a different approach, which causes dump FAILs, because of some changes I made previously in the middle-end). So enabling "vec_pack" for RVV will fix some FAILs but add some other FAILs. CC'ing Richi for more reasonable suggestions. juzhe.zh...@rivai.ai From: Maciej W. Rozycki Date: 2023-10-10 06:38 To: 钟居哲 CC: gcc-patches; Jeff Law; rdapp.gcc; kito.cheng Subject: Re: Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc' On Tue, 10 Oct 2023, 钟居哲 wrote: > Btw, could you rebase to the trunk and run regression again? Full regression-testing takes roughly 40 hours here and I do not normally update the tree midway through my work so as not to add variables and end up chasing a moving target, especially with such an unstable state that we have ended up with recently with the RISC-V port. Since I'm done with this part I can refresh and schedule another run if you are curious as to how it looks like from my side. For the C subset alone it'll take less. Maciej
[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743 --- Comment #6 from Andrew Pinski --- (In reply to Andi Kleen from comment #5) > config/i386/i386.h:#define SLOW_BYTE_ACCESS 0 > > You mean it doesn't define it? The default is 1. Anyways in this case I was wrong but defining it to 0 causes other issues.
[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743 --- Comment #5 from Andi Kleen --- config/i386/i386.h:#define SLOW_BYTE_ACCESS 0 You mean it doesn't define it?
Re: Odd Python errors in the G++ testsuite
On Mon, Oct 09, 2023 at 20:12:01 +0100, Maciej W. Rozycki wrote: > Hi, > > I'm seeing these tracebacks for several cases across the G++ testsuite: > > Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)" >(timeout = 300) > spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6) > rules/0/primary-output is ok: p1689-1.o > rules/0/provides/0/logical-name is ok: foo > rules/0/provides/0/is-interface is ok: True > Traceback (most recent call last): > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 218, in > is_ok = validate_p1689(actual, expect) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 182, in > validate_p1689 > return compare_json([], actual_json, expect_json) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in > compare_json > is_ok = _compare_object(path, actual, expect) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in > _compare_object > sub_error = compare_json(path + [key], actual[key], expect[key]) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 151, in > compare_json > is_ok = _compare_array(path, actual, expect) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 87, in > _compare_array > sub_error = compare_json(path + [str(idx)], a, e) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in > compare_json > is_ok = _compare_object(path, actual, expect) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in > _compare_object > sub_error = compare_json(path + [key], actual[key], expect[key]) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 149, in > compare_json > actual = set(actual) > TypeError: unhashable type: 'dict' So looking at the source…the 3.6 check is not from me. Not sure what's up there; it's probably not immediately related to the backtrace. But this backtrace means that we have a list of objects that do not expect a given ordering that is of JSON objects. 
I'm not sure why this never showed up before as all of the existing uses of it are indeed of objects. Can you try removing `"__P1689_unordered__"` from the `p1689-1.exp.ddi` file's `requires` array? The `p1689-file-default.exp.ddi` and `p1689-target-default.exp.ddi` files need the same treatment. --Ben
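The `TypeError` in the traceback is plain Python behaviour and can be shown in isolation. The sketch below also shows one possible way to compare JSON arrays while ignoring order when their elements are objects; it is not the code in `test-p1689.py`, and the helper names `canonical` and `unordered_equal` as well as the sample data are hypothetical.

```python
import json


def canonical(value):
    """Hypothetical helper: turn any JSON value into a hashable,
    order-stable string so it can be sorted or put in a set."""
    return json.dumps(value, sort_keys=True)


def unordered_equal(actual, expect):
    """Compare two JSON arrays while ignoring element order."""
    return sorted(canonical(v) for v in actual) == sorted(
        canonical(v) for v in expect
    )


# Made-up stand-ins for a `requires` array holding JSON objects.
requires_actual = [{"logical-name": "bar"}, {"logical-name": "foo"}]
requires_expect = [{"logical-name": "foo"}, {"logical-name": "bar"}]

# set() needs hashable elements; dicts are not hashable, so this raises.
try:
    set(requires_actual)
    raised = False
except TypeError:
    raised = True

print(raised, unordered_equal(requires_actual, requires_expect))
```

Serializing each element with `sort_keys=True` sidesteps hashing altogether, at the cost of treating key order inside objects as irrelevant, which is usually what a JSON comparison wants.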
Re: Odd Python errors in the G++ testsuite
On Mon, Oct 09, 2023 at 19:46:37 -0400, Paul Koning wrote: > > > > On Oct 9, 2023, at 7:42 PM, Ben Boeckel via Gcc wrote: > > > > On Mon, Oct 09, 2023 at 20:12:01 +0100, Maciej W. Rozycki wrote: > >> I'm seeing these tracebacks for several cases across the G++ testsuite: > >> > >> Executing on host: python3 -c "import sys; assert sys.version_info >= (3, > >> 6)"(timeout = 300) > >> spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, > >> 6) > > > > What version of Python3 do you have? The test suite might not actually > > properly handle not having 3.7 (i.e., skip the tests that require it). > > But the rule that you can't put a dict in a set is as old as set support (2.x > for some small x). Yes. I just wonder how a dictionary got in there in the first place. I'm not sure if some *other* 3.7-related change makes that work. --Ben
[Bug tree-optimization/111750] New: Spurious -Warray-bounds warning when using member function pointers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111750 Bug ID: 111750 Summary: Spurious -Warray-bounds warning when using member function pointers Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: abbeyj+gcc at gmail dot com Target Milestone: --- Created attachment 56086 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56086=edit Reproducer

The source code below generates a -Warray-bounds warning which I believe is incorrect. Compile with `g++ -c -Wall -O2`.

```
struct MyClass {
    void my_method();
};

MyClass g;

void pre();

inline void FetchValue(MyClass& c, void(MyClass::*func)()) {
    pre();
    (c.*func)();
}

int get_int();

inline int Check() {
    static const int ret = get_int();
    return ret;
}

inline void ReadValue(MyClass& c, void(MyClass::*func)()) {
    Check();
    FetchValue(c, func);
}

void Main() {
    ReadValue(g, &MyClass::my_method);
}
```

This produces:

```
In function 'void FetchValue(MyClass&, void (MyClass::*)())',
    inlined from 'void ReadValue(MyClass&, void (MyClass::*)())' at :23:15,
    inlined from 'void Main()' at :27:14:
:11:14: warning: array subscript 'int (**)(...)[0]' is partly outside array bounds of 'MyClass [1]' [-Warray-bounds=]
   11 |     (c.*func)();
      |     ~^~
: In function 'void Main()':
:5:9: note: object 'g' of size 1
    5 | MyClass g;
      |         ^
```

Godbolt link: https://godbolt.org/z/6YsWd9xhr

That this source produces a -Warray-bounds warning is somewhat surprising since it contains no arrays, no array indexing, and no pointer arithmetic. Small changes such as removing the static variable or manually inlining a function into its caller make the warning go away. The earliest version that I've been able to reproduce this with is GCC 11.1 and it still reproduces on the trunk version that's currently available on godbolt.org.
Re: Odd Python errors in the G++ testsuite
> On Oct 9, 2023, at 7:42 PM, Ben Boeckel via Gcc wrote: > > On Mon, Oct 09, 2023 at 20:12:01 +0100, Maciej W. Rozycki wrote: >> I'm seeing these tracebacks for several cases across the G++ testsuite: >> >> Executing on host: python3 -c "import sys; assert sys.version_info >= (3, >> 6)"(timeout = 300) >> spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6) > > What version of Python3 do you have? The test suite might not actually > properly handle not having 3.7 (i.e., skip the tests that require it). But the rule that you can't put a dict in a set is as old as set support (2.x for some small x). paul
Re: Odd Python errors in the G++ testsuite
On Mon, Oct 09, 2023 at 20:12:01 +0100, Maciej W. Rozycki wrote: > I'm seeing these tracebacks for several cases across the G++ testsuite: > > Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)" >(timeout = 300) > spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6) What version of Python3 do you have? The test suite might not actually properly handle not having 3.7 (i.e., skip the tests that require it). > rules/0/primary-output is ok: p1689-1.o I wrote these tests. > rules/0/provides/0/logical-name is ok: foo > rules/0/provides/0/is-interface is ok: True > Traceback (most recent call last): > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 218, in > is_ok = validate_p1689(actual, expect) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 182, in > validate_p1689 > return compare_json([], actual_json, expect_json) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in > compare_json > is_ok = _compare_object(path, actual, expect) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in > _compare_object > sub_error = compare_json(path + [key], actual[key], expect[key]) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 151, in > compare_json > is_ok = _compare_array(path, actual, expect) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 87, in > _compare_array > sub_error = compare_json(path + [str(idx)], a, e) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in > compare_json > is_ok = _compare_object(path, actual, expect) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in > _compare_object > sub_error = compare_json(path + [key], actual[key], expect[key]) > File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 149, in > compare_json > actual = set(actual) > TypeError: unhashable type: 'dict' I'm not sure how this ends up with a dictionary in it… Can you `print(actual)` before this? 
> and also these intermittent failures for other cases: > > Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)" >(timeout = 300) > spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6) > rules/0/primary-output is ok: p1689-2.o > rules/0/provides/0/logical-name is ok: foo:part1 > rules/0/provides/0/is-interface is ok: True > ERROR: length mismatch at rules/0/requires: actual: "1" expect: "0" > version is ok: 0 > revision is ok: 0 > FAIL: ERROR: length mismatch at rules/0/requires: actual: "1" expect: "0" > > This does seem to me like something not working as intended. As a Python > non-expert I have troubles concluding what is going on here and whether > these tracebacks are indeed supposed to be there, or whether it is a sign > of a problem. And these failures I don't even know where they come from. > > Does anyone know? Is there a way to run the offending commands by hand? > The relevant invocation lines do not appear in the test log file for one > to copy and paste, which I think is not the right way of doing things in > our environment. > > These issues seem independent from the test host environment as I can see > them on both a `powerpc64le-linux-gnu' and an `x86_64-linux-gnu' machine > in `riscv64-linux-gnu' target testing. Do they all have pre-3.7 Python3 versions? --Ben
[PATCH] RISC-V: Add available vector size for RVV
For RVV, we have VLS modes enabled according to TARGET_MIN_VLEN, from M1 to M8. For example, when TARGET_MIN_VLEN = 128 bits, we enable 128/256/512/1024-bit VLS modes. This patch fixes the following FAILs: FAIL: gcc.dg/vect/bb-slp-subgroups-2.c -flto -ffat-lto-objects scan-tree-dump-times slp2 "optimized: basic block" 2 FAIL: gcc.dg/vect/bb-slp-subgroups-2.c scan-tree-dump-times slp2 "optimized: basic block" 2 gcc/testsuite/ChangeLog: * lib/target-supports.exp: Add 256/512/1024 --- gcc/testsuite/lib/target-supports.exp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp index af52c38433d..dc366d35a0a 100644 --- a/gcc/testsuite/lib/target-supports.exp +++ b/gcc/testsuite/lib/target-supports.exp @@ -8881,7 +8881,7 @@ proc available_vector_sizes { } { lappend result 4096 2048 1024 512 256 128 64 32 16 8 4 2 } elseif { [istarget riscv*-*-*] } { if { [check_effective_target_riscv_v] } { - lappend result 0 32 64 128 + lappend result 0 32 64 128 256 512 1024 } lappend result 128 } else { -- 2.36.3
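The arithmetic behind the 128/256/512/1024 list can be sketched as follows, assuming that M1 to M8 stand for the register-group multipliers 1, 2, 4 and 8 applied to TARGET_MIN_VLEN. The function name `vls_mode_bits` is made up for illustration and does not exist in GCC or its testsuite.

```python
def vls_mode_bits(min_vlen_bits, lmuls=(1, 2, 4, 8)):
    """Return the fixed-length vector sizes (in bits) obtained by grouping
    1, 2, 4 or 8 registers of min_vlen_bits each (M1 .. M8)."""
    return [min_vlen_bits * lmul for lmul in lmuls]


# The example from the patch description: TARGET_MIN_VLEN = 128 bits.
sizes = vls_mode_bits(128)
print(sizes)

# A check in the spirit of an effective-target test for 1024-bit vectors:
# is 1024 among the available sizes?
has_1024 = 1024 in sizes
print(has_1024)
```

Under this assumption a different TARGET_MIN_VLEN would yield a different list, whereas the patch hard-codes the sizes for the 128-bit case.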
[Bug debug/111749] New: Kk
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111749 Bug ID: 111749 Summary: Kk Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: debug Assignee: unassigned at gcc dot gnu.org Reporter: molono1386 at dixiser dot com Target Milestone: ---
Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA
On Mon, Oct 9, 2023 at 10:48 PM Vineet Gupta wrote: > > On 10/9/23 13:46, Christoph Müllner wrote: > > Given that this causes repeated issues, I think that a fall-back to > > counting occurrences is the right thing to do. I can do that if that's ok. > > Thanks Christoph. Tested patch on list: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632393.html > > -Vineet
[PATCH] RISC-V: Make xtheadcondmov-indirect tests robust against instruction reordering
From: Christoph Müllner Fixes: c1bc7513b1d7 ("RISC-V: const: hide mvconst splitter from IRA") A recent change broke the xtheadcondmov-indirect tests, because the order of emitted instructions changed. Since the test is too strict when testing for a fixed instruction order, let's change the tests to simply count instructions, as is done for similar tests. Reported-by: Patrick O'Neill Signed-off-by: Christoph Müllner gcc/testsuite/ChangeLog: * gcc.target/riscv/xtheadcondmov-indirect.c: Make robust against instruction reordering. Signed-off-by: Christoph Müllner --- .../gcc.target/riscv/xtheadcondmov-indirect.c | 11 --- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c index c3253ba5239..eba1b86137b 100644 --- a/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c +++ b/gcc/testsuite/gcc.target/riscv/xtheadcondmov-indirect.c @@ -1,8 +1,7 @@ /* { dg-do compile } */ -/* { dg-options "-march=rv32gc_xtheadcondmov -fno-sched-pressure" { target { rv32 } } } */ -/* { dg-options "-march=rv64gc_xtheadcondmov -fno-sched-pressure" { target { rv64 } } } */ +/* { dg-options "-march=rv32gc_xtheadcondmov" { target { rv32 } } } */ +/* { dg-options "-march=rv64gc_xtheadcondmov" { target { rv64 } } } */ /* { dg-skip-if "" { *-*-* } {"-O0" "-Os" "-Og" "-Oz" "-flto" } } */ -/* { dg-final { check-function-bodies "**" "" } } */ /* ** ConEmv_imm_imm_reg: @@ -116,3 +115,9 @@ int ConNmv_reg_reg_reg(int x, int y, int z, int n) return z; return n; } + +/* { dg-final { scan-assembler-times "addi\t" 5 } } */ +/* { dg-final { scan-assembler-times "li\t" 4 } } */ +/* { dg-final { scan-assembler-times "sub\t" 4 } } */ +/* { dg-final { scan-assembler-times "th.mveqz\t" 4 } } */ +/* { dg-final { scan-assembler-times "th.mvnez\t" 4 } } */ -- 2.41.0
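Why counting is more robust than matching a fixed body can be illustrated outside DejaGnu. The sketch below is only an analogy for what `scan-assembler-times` does, not its implementation; the two assembly snippets are invented, and `count_insns` is a hypothetical helper.

```python
import re


def count_insns(asm_text, patterns):
    """Count how often each regex pattern occurs in the assembly text."""
    return {p: len(re.findall(p, asm_text)) for p in patterns}


# Two made-up instruction sequences that differ only in the order of two
# independent instructions, as a scheduler change might produce.
asm_before = "\tli\ta5,10\n\taddi\ta1,a1,-1\n\tth.mveqz\ta0,a5,a1\n"
asm_after = "\taddi\ta1,a1,-1\n\tli\ta5,10\n\tth.mveqz\ta0,a5,a1\n"

patterns = [r"addi\t", r"li\t", r"th\.mveqz\t"]

same_text = asm_before == asm_after
same_counts = count_insns(asm_before, patterns) == count_insns(asm_after, patterns)
print(same_text, same_counts)
```

An exact body comparison would reject the reordered sequence, while the per-instruction counts are unchanged. The trade-off is that counting no longer checks operands or data flow, and an unanchored pattern such as `li\t` would also match the tail of a mnemonic like `slli`.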
Re: Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
On Tue, 10 Oct 2023, 钟居哲 wrote: > Btw, could you rebase to the trunk and run regression again? Full regression-testing takes roughly 40 hours here and I do not normally update the tree midway through my work so as not to add variables and end up chasing a moving target, especially with such an unstable state that we have ended up with recently with the RISC-V port. Since I'm done with this part I can refresh and schedule another run if you are curious as to how it looks like from my side. For the C subset alone it'll take less. Maciej
Re: Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
I know you want vect_int to block the test for rv64gc. But unfortunately it failed. And I have changed everything to run vect testsuite with "riscv_v". [PATCH] RISC-V: Enable more tests of "vect" for RVV (gnu.org) So to be consistent, plz add "riscv_v". juzhe.zh...@rivai.ai From: Maciej W. Rozycki Date: 2023-10-10 06:29 To: 钟居哲 CC: gcc-patches; Jeff Law; rdapp.gcc; kito.cheng Subject: Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc' On Tue, 10 Oct 2023, 钟居哲 wrote: > && [check_effective_target_arm_little_endian]) > || ([istarget mips*-*-*] > && [et-is-effective-target mips_msa]) > + || [istarget riscv*-*-*] > || ([istarget s390*-*-*] > && [check_effective_target_s390_vx]) > || [istarget amdgcn*-*-*] }}] > > You should change it into: > > || ([istarget riscv*-*-*] > && [check_effective_target_riscv_v]) > > Then, these additional FAILs will be removed: > > with no changes (except for intermittent Python failures for C++) with the > remaining testsuites. There are a few of regressions in `-march=rv64gc' > testing: > +FAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP" > +FAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing > stmts using SLP" 3 > +FAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts > using SLP" 3 > +FAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects scan-tree-dump vect > "vectorizing stmts using SLP" > +FAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects > scan-tree-dump-times vect "vectorizing stmts using SLP" 3 > +FAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects scan-tree-dump-times > vect "vectorizing stmts using SLP" 3 I explained in the change description why the check for `riscv_v' isn't needed here: the tests mustn't run in the first place, so naturally they cannot fail either. If I missed anything, then please elaborate. Maciej
Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
On Tue, 10 Oct 2023, 钟居哲 wrote: >&& [check_effective_target_arm_little_endian]) >|| ([istarget mips*-*-*] >&& [et-is-effective-target mips_msa]) > + || [istarget riscv*-*-*] >|| ([istarget s390*-*-*] >&& [check_effective_target_s390_vx]) > || [istarget amdgcn*-*-*] }}] > > You should change it into: > > || ([istarget riscv*-*-*] > && [check_effective_target_riscv_v]) > > Then, these additional FAILs will be removed: > > with no changes (except for intermittent Python failures for C++) with the > remaining testsuites. There are a few of regressions in `-march=rv64gc' > testing: > +FAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP" > +FAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing > stmts using SLP" 3 > +FAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts > using SLP" 3 > +FAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects scan-tree-dump vect > "vectorizing stmts using SLP" > +FAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects > scan-tree-dump-times vect "vectorizing stmts using SLP" 3 > +FAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects scan-tree-dump-times > vect "vectorizing stmts using SLP" 3 I explained in the change description why the check for `riscv_v' isn't needed here: the tests mustn't run in the first place, so naturally they cannot fail either. If I missed anything, then please elaborate. Maciej
[Bug c++/111748] New: GCC does not understand partial ordering between non-constrained and constrained templates for specialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111748 Bug ID: 111748 Summary: GCC does not understand partial ordering between non-constrained and constrained templates for specialization Product: gcc Version: 13.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: jeanmichael.celerier at gmail dot com Target Milestone: --- Consider: #include template void foo() { } template void foo() { } template<> void foo() { } int main() { foo(); } According to the answers I got in https://stackoverflow.com/questions/77261120/ GCC should be able to compile this code, yet it fails due to a supposed ambiguity between template void foo() { } and template void foo() { } as the base of foo
Re: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
Btw, could you rebase to the trunk and run regression again? I saw that your report has about 670 FAILs: # of expected passes 187616 # of unexpected failures 672 # of unexpected successes 14 # of expected failures 1436 # of unresolved testcases 615 # of unsupported tests 4731 I have recently been working on fixing FAILs in the RISC-V regression tests, and your report looks odd. This is my report: # of expected passes 183613 # of unexpected failures 92 # of unexpected successes 12 # of expected failures 1383 # of unresolved testcases 4 # of unsupported tests 4223 It should be fewer than 100 FAILs. juzhe.zh...@rivai.ai From: 钟居哲 Date: 2023-10-10 06:17 To: gcc-patches CC: macro; Jeff Law; rdapp.gcc; kito.cheng Subject: [PATCH] RISC-V/testsuite: Enable `vect_pack_trunc' && [check_effective_target_arm_little_endian]) || ([istarget mips*-*-*] && [et-is-effective-target mips_msa]) +|| [istarget riscv*-*-*] || ([istarget s390*-*-*] && [check_effective_target_s390_vx]) || [istarget amdgcn*-*-*] }}] You should change it into: || ([istarget riscv*-*-*] && [check_effective_target_riscv_v]) Then, these additional FAILs will be removed: with no changes (except for intermittent Python failures for C++) with the remaining testsuites. There are a few of regressions in `-march=rv64gc' testing: +FAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP" +FAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3 +FAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3 +FAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects scan-tree-dump vect "vectorizing stmts using SLP" +FAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3 +FAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3 juzhe.zh...@rivai.ai
[PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
&& [check_effective_target_arm_little_endian]) || ([istarget mips*-*-*] && [et-is-effective-target mips_msa]) +|| [istarget riscv*-*-*] || ([istarget s390*-*-*] && [check_effective_target_s390_vx]) || [istarget amdgcn*-*-*] }}] You should change it into: || ([istarget riscv*-*-*] && [check_effective_target_riscv_v]) Then, these additional FAILs will be removed: with no changes (except for intermittent Python failures for C++) with the remaining testsuites. There are a few of regressions in `-march=rv64gc' testing: +FAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP" +FAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3 +FAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3 +FAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects scan-tree-dump vect "vectorizing stmts using SLP" +FAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3 +FAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3 juzhe.zh...@rivai.ai
[Bug tree-optimization/111519] [13/14 Regression] Wrong code at -O3 on x86_64-linux-gnu since r13-455-g1fe04c497d
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111519 --- Comment #2 from Roger Sayle --- Complicated. Things have gone wrong before the strlen pass which is given: _73 = e; _72 = *_73; ... *_73 = prephitmp_23; d = _72; Here the assignment to *_73 overwrites the value of f (at *e) which then invalidates the use of _72 resulting in the wrong value for d. But figuring out which pass is at fault (perhaps complete loop unrolling?) is tricky.
[Bug libstdc++/111747] New: Problem with large float list initialization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111747

Bug ID: 111747
Summary: Problem with large float list initialization
Product: gcc
Version: 11.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: libstdc++
Assignee: unassigned at gcc dot gnu.org
Reporter: oplata.kes1 at mail dot ru
Target Milestone: ---

Created attachment 56085 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56085=edit *ii file and output

I create a vector on the stack of 50 mln (or 500 mln) floats equal to 1.0 and simply add them. The sum is not equal to 50 mln (or 5 billions). 5 mln floats initialize and sum fine.

The command is g++ -v -save-temps gcc.cpp && ./a.out

If it is a stack overflow, shouldn't the code fail with a stack overflow error? If not, what is it?

I use GCC 11.4.0 in Ubuntu 22.04 under WSL 2 (!)
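One plausible explanation for the question above, which the report itself does not confirm, is ordinary single-precision rounding rather than a stack overflow. A `float` has a 24-bit significand, so once a running sum reaches 2^24 = 16777216, adding 1.0f no longer changes it. That would be consistent with 5 mln elements summing fine and 50 mln not. The sketch below is in C and assumes the reporter's code accumulates into a `float`; the function names are made up for illustration and the attachment was not inspected.

```c
/* Add 1.0f to a float accumulator n times.  The volatile qualifier forces
   each partial sum to be stored, and therefore rounded, as a float, so the
   result does not depend on extended-precision registers.  */
float sum_ones_float(long n)
{
    volatile float sum = 0.0f;
    for (long i = 0; i < n; ++i)
        sum = sum + 1.0f;
    return sum;
}

/* The same loop with a double accumulator, which represents every
   integer up to 2^53 exactly.  */
double sum_ones_double(long n)
{
    double sum = 0.0;
    for (long i = 0; i < n; ++i)
        sum += 1.0;
    return sum;
}
```

Under these assumptions `sum_ones_float(50000000L)` stops growing at 16777216, while `sum_ones_double(50000000L)` gives the expected 50000000. Accumulating in a wider type is the usual way around this.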
[PATCH] RISC-V/testsuite: Enable `vect_pack_trunc'
Despite not defining `vec_pack_trunc_' standard named patterns the backend provides vector pack operations via its own `@pred_trunc' set of patterns and they do trigger in vectorization producing narrowing VNCVT.X.X.W assembly instructions as expected. Enable the `vect_pack_trunc' setting for RISC-V targets then, improving GCC C test results in `-march=rv64gcv' testing as follows: -FAIL: gcc.dg/vect/pr57705.c scan-tree-dump-times vect "vectorized 1 loop" 2 +PASS: gcc.dg/vect/pr57705.c scan-tree-dump-times vect "vectorized 1 loop" 3 +PASS: gcc.dg/vect/pr59354.c scan-tree-dump vect "vectorized 1 loop" -UNSUPPORTED: gcc.dg/vect/pr97678.c +PASS: gcc.dg/vect/pr97678.c (test for excess errors) +PASS: gcc.dg/vect/pr97678.c execution test +XFAIL: gcc.dg/vect/pr97678.c scan-tree-dump vect "vectorizing stmts using SLP" -UNSUPPORTED: gcc.dg/vect/vect-bool-cmp.c +PASS: gcc.dg/vect/vect-bool-cmp.c (test for excess errors) +PASS: gcc.dg/vect/vect-bool-cmp.c execution test +PASS: gcc.dg/vect/vect-iv-4.c scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/vect-multitypes-14.c scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/vect-multitypes-8.c scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/vect-reduc-dot-u16b.c scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/vect-strided-store-u16-i4.c scan-tree-dump-times vect "vectorized 1 loops" 2 -PASS: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2 +XFAIL: gcc.dg/vect/slp-13-big-array.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3 -PASS: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts using SLP" 2 +XFAIL: gcc.dg/vect/slp-13.c scan-tree-dump-times vect "vectorizing stmts using SLP" 3 +PASS: gcc.dg/vect/slp-multitypes-10.c scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/slp-multitypes-10.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1 +PASS: 
gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/slp-multitypes-5.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1 +PASS: gcc.dg/vect/slp-multitypes-6.c scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/slp-multitypes-6.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1 +PASS: gcc.dg/vect/slp-multitypes-9.c scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/slp-multitypes-9.c scan-tree-dump-times vect "vectorizing stmts using SLP" 1 -UNSUPPORTED: gcc.dg/vect/slp-perm-12.c +PASS: gcc.dg/vect/slp-perm-12.c (test for excess errors) +PASS: gcc.dg/vect/slp-perm-12.c execution test -UNSUPPORTED: gcc.dg/vect/bb-slp-11.c +PASS: gcc.dg/vect/bb-slp-11.c (test for excess errors) +PASS: gcc.dg/vect/bb-slp-11.c execution test -FAIL: gcc.dg/vect/pr57705.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loop" 2 +PASS: gcc.dg/vect/pr57705.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loop" 3 +PASS: gcc.dg/vect/pr59354.c -flto -ffat-lto-objects scan-tree-dump vect "vectorized 1 loop" -UNSUPPORTED: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects +PASS: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects (test for excess errors) +PASS: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects execution test +XFAIL: gcc.dg/vect/pr97678.c -flto -ffat-lto-objects scan-tree-dump vect "vectorizing stmts using SLP" -UNSUPPORTED: gcc.dg/vect/vect-bool-cmp.c -flto -ffat-lto-objects +PASS: gcc.dg/vect/vect-bool-cmp.c -flto -ffat-lto-objects (test for excess errors) +PASS: gcc.dg/vect/vect-bool-cmp.c -flto -ffat-lto-objects execution test +PASS: gcc.dg/vect/vect-iv-4.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/vect-multitypes-14.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/vect-multitypes-8.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: 
gcc.dg/vect/vect-reduc-dot-u16b.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/vect-strided-store-u16-i4.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 2 -PASS: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 2 +XFAIL: gcc.dg/vect/slp-13-big-array.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3 -PASS: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 2 +XFAIL: gcc.dg/vect/slp-13.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 3 +PASS: gcc.dg/vect/slp-multitypes-10.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorized 1 loops" 1 +PASS: gcc.dg/vect/slp-multitypes-10.c -flto -ffat-lto-objects scan-tree-dump-times vect "vectorizing stmts using SLP" 1 +PASS: gcc.dg/vect/slp-multitypes-5.c -flto
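As an illustration of the kind of loop the `vect_pack_trunc' setting is about, here is a minimal sketch of a narrowing copy. The function name is hypothetical and the code is not taken from any of the tests listed above. The patch description says such narrowing is expected to be vectorized on `-march=rv64gcv' using VNCVT.X.X.W instructions; that outcome is not verified here.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy N elements from SRC to DST, keeping only the low 16 bits of each
   32-bit value.  A vectorizer with pack/truncate support can process
   several elements at once by narrowing whole vectors.  */
void narrow_u32_to_u16(uint16_t *restrict dst,
                       const uint32_t *restrict src, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = (uint16_t) src[i];
}
```

Unsigned types are used so that the truncation is fully defined by the C standard (reduction modulo 2^16).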
[Bug tree-optimization/111715] Missed optimization in FRE because of weak TBAA
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111715 --- Comment #6 from Sam James --- I started hitting the original warning Jakub hit with 13.2.1 20231007 but I've not tried to figure out which backported change caused it to appear.
[Bug tree-optimization/111679] `(~a) | (a ^ b)` is not simplified to `~(a & b)`
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111679

Andrew Pinski changed:

What     |Removed |Added
Keywords |        |patch
URL      |        |https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632386.html

--- Comment #2 from Andrew Pinski ---
Patch posted: https://gcc.gnu.org/pipermail/gcc-patches/2023-October/632386.html
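The identity in this PR's title, `(~a) | (a ^ b)` equals `~(a & b)`, can be checked by brute force. The following is only a sketch with hypothetical function names. Because bitwise operators act on each bit independently, comparing all pairs of 8-bit values already covers every per-bit combination. The second check covers the related form `a | ((~a) ^ b)` equals `a | (~b)`.

```c
#include <stdbool.h>
#include <stdint.h>

/* The expression from the PR title and the form it should simplify to.  */
static uint32_t title_form(uint32_t a, uint32_t b)      { return (~a) | (a ^ b); }
static uint32_t title_simplified(uint32_t a, uint32_t b) { return ~(a & b); }

/* The related form and its simplification.  */
static uint32_t alt_form(uint32_t a, uint32_t b)        { return a | ((~a) ^ b); }
static uint32_t alt_simplified(uint32_t a, uint32_t b)  { return a | (~b); }

/* Return true if both identities hold for every pair of 8-bit inputs.  */
bool identities_hold(void)
{
    for (uint32_t a = 0; a < 256; ++a)
        for (uint32_t b = 0; b < 256; ++b) {
            if (title_form(a, b) != title_simplified(a, b))
                return false;
            if (alt_form(a, b) != alt_simplified(a, b))
                return false;
        }
    return true;
}
```

Per bit: when `a` is 0 the title form is 1 and so is `~(a & b)`; when `a` is 1 both reduce to `~b`.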
[PATCH] MATCH: [PR111679] Add alternative simplification of `a | ((~a) ^ b)`
So currently we have a simplification for `a | ~(a ^ b)` but that does not match the case where we had originally `(~a) | (a ^ b)` so we need to add a new pattern that matches that and uses bitwise_inverted_equal_p that also catches comparisons too. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. PR tree-optimization/111679 gcc/ChangeLog: * match.pd (`a | ((~a) ^ b)`): New pattern. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/bitops-5.c: New test. --- gcc/match.pd | 8 +++ gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c | 27 2 files changed, 35 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c diff --git a/gcc/match.pd b/gcc/match.pd index 31bfd8b6b68..49740d189a7 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -1350,6 +1350,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) && TYPE_PRECISION (TREE_TYPE (@0)) == 1) (bit_ior @0 (bit_xor @1 { build_one_cst (type); } +/* a | ((~a) ^ b) --> a | (~b) (alt version of the above 2) */ +(simplify + (bit_ior:c @0 (bit_xor:cs @1 @2)) + (with { bool wascmp; } + (if (bitwise_inverted_equal_p (@0, @1, wascmp) + && (!wascmp || element_precision (type) == 1)) + (bit_ior @0 (bit_not @2) + /* (a | b) | (a &^ b) --> a | b */ (for op (bit_and bit_xor) (simplify diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c b/gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c new file mode 100644 index 000..990610e3002 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/bitops-5.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-optimized-raw" } */ +/* PR tree-optimization/111679 */ + +int f1(int a, int b) +{ +return (~a) | (a ^ b); // ~(a & b) or (~a) | (~b) +} + +_Bool fb(_Bool c, _Bool d) +{ +return (!c) | (c ^ d); // ~(c & d) or (~c) | (~d) +} + +_Bool fb1(int x, int y) +{ +_Bool a = x == 10, b = y > 100; +return (!a) | (a ^ b); // ~(a & b) or (~a) | (~b) +// or (x != 10) | (y <= 100) +} + +/* { dg-final { scan-tree-dump-not "bit_xor_expr, " "optimized" } } */ +/* { dg-final { scan-tree-dump-times 
"bit_not_expr, " 2 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "bit_and_expr, " 2 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "bit_ior_expr, " 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "ne_expr, _\[0-9\]+, x_\[0-9\]+" 1 "optimized" } } */ +/* { dg-final { scan-tree-dump-times "le_expr, _\[0-9\]+, y_\[0-9\]+" 1 "optimized" } } */ -- 2.39.3
[RFC] RISC-V: Handle new types in scheduling descriptions
Now that every insn is guaranteed a type, we want to ensure the types are handled by the existing scheduling descriptions. There are 2 approaches I see: 1. Create a new pipeline intended to eventually abort (sifive-7.md) 2. Add the types to an existing pipeline (generic.md) Which approach do we want to go with? If there is a different approach we want to take instead, please let me know as well. Additionally, should types associated with specific extensions (vector, crypto, etc) have specific pipelines dedicated to them? * config/riscv/generic.md: update pipeline * config/riscv/sifive-7.md (sifive_7): update pipeline (sifive_7_other): Signed-off-by: Edwin Lu --- gcc/config/riscv/generic.md | 3 ++- gcc/config/riscv/sifive-7.md | 7 +++ 2 files changed, 9 insertions(+), 1 deletion(-) diff --git a/gcc/config/riscv/generic.md b/gcc/config/riscv/generic.md index 57d3c3b4adc..338d2e85b77 100644 --- a/gcc/config/riscv/generic.md +++ b/gcc/config/riscv/generic.md @@ -27,7 +27,8 @@ (define_cpu_unit "fdivsqrt" "pipe0") (define_insn_reservation "generic_alu" 1 (and (eq_attr "tune" "generic") - (eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop,logical,move,bitmanip,min,max,minu,maxu,clz,ctz,cpop")) + (eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop, + logical,move,bitmanip,min,max,minu,maxu,clz,ctz,cpop,trap,cbo")) "alu") (define_insn_reservation "generic_load" 3 diff --git a/gcc/config/riscv/sifive-7.md b/gcc/config/riscv/sifive-7.md index 526278e46d4..e76d82614d6 100644 --- a/gcc/config/riscv/sifive-7.md +++ b/gcc/config/riscv/sifive-7.md @@ -12,6 +12,8 @@ (define_cpu_unit "sifive_7_B" "sifive_7") (define_cpu_unit "sifive_7_idiv" "sifive_7") (define_cpu_unit "sifive_7_fpu" "sifive_7") +(define_cpu_unit "sifive_7_abort" "sifive_7") + (define_insn_reservation "sifive_7_load" 3 (and (eq_attr "tune" "sifive_7") (eq_attr "type" "load")) @@ -106,6 +108,11 @@ (define_insn_reservation "sifive_7_f2i" 3 (eq_attr "type" "mfc")) "sifive_7_A") 
+(define_insn_reservation "sifive_7_other" 3 + (and (eq_attr "tune" "sifive_7") + (eq_attr "type" "trap,cbo")) + "sifive_7_abort") + (define_bypass 1 "sifive_7_load,sifive_7_alu,sifive_7_mul,sifive_7_f2i,sifive_7_sfb_alu" "sifive_7_alu,sifive_7_branch") -- 2.34.1
[Bug target/111746] [14 Regression] ICE: infinite recursion in try_split (emit-rtl.cc:3972) at -O2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111746

Andrew Pinski changed:

What             |Removed |Added
Target Milestone |---     |14.0
Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA
On 10/9/23 13:46, Christoph Müllner wrote: Given that this causes repeated issues, I think that a fall-back to counting occurrences is the right thing to do. I can do that if that's ok. Thanks Christoph. -Vineet
Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA
On Mon, Oct 9, 2023 at 10:36 PM Vineet Gupta wrote: > > Hi Christoph, > > On 10/9/23 12:06, Patrick O'Neill wrote: > > > > Hi Vineet, > > > > We're seeing a regression on all riscv targets after this patch:| > > > > FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O2 > > check-function-bodies ConNmv_imm_imm_reg|| > > FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O3 -g > > check-function-bodies ConNmv_imm_imm_reg > > > > Debug log output: > > body: \taddia[0-9]+,a[0-9]+,-1000+ > > \tlia[0-9]+,9998336+ > > \taddia[0-9]+,a[0-9]+,1664+ > > \tth.mveqza[0-9]+,a[0-9]+,a[0-9]+ > > \tret > > > > against: lia5,9998336 > > addia4,a0,-1000 > > addia0,a5,1664 > > th.mveqza0,a1,a4 > > ret| > > > > https://github.com/patrick-rivos/gcc-postcommit-ci/issues/8 > > https://github.com/ewlu/riscv-gnu-toolchain/issues/286 > > > > It seems with my patch, exactly same instructions get out of order (for > -O2/-O3) tripping up the test results and differ from say O1 for exact > same build. > > -O2 w/ patch > ConNmv_imm_imm_reg: > lia5,9998336 > addia4,a0,-1000 > addia0,a5,1664 > th.mveqza0,a1,a4 > ret > > -O1 w/ patch > ConNmv_imm_imm_reg: > addia4,a0,-1000 > lia5,9998336 > addia0,a5,1664 > th.mveqza0,a1,a4 > ret > > I'm not sure if there is an easy way to handle that. > Is there a real reason for testing the full sequences verbatim, or is > testing number of occurrences of th.mv{eqz,nez} enough. I did not write the test cases, I just merged two non-functional test files into one that works without changing the actual test approach. Given that this causes repeated issues, I think that a fall-back to counting occurrences is the right thing to do. I can do that if that's ok. BR Christoph > It seems Jeff recently added -fno-sched-pressure to avoid similar issues > but that apparently is no longer sufficient. 
> > Thx, > -Vineet > > > Thanks, > > Patrick > > > > On 10/6/23 11:22, Vineet Gupta wrote: > >> Vlad recently introduced a new gate @ira_in_progress, similar to > >> counterparts @{reload,lra}_in_progress. > >> > >> Use this to hide the constant synthesis splitter from being recog* () > >> by IRA register equivalence logic which is eager to undo the splits, > >> generating worse code for constants (and sometimes no code at all). > >> > >> See PR/109279 (large constant), PR/110748 (const -0.0) ... > >> > >> Granted the IRA logic is subsided with -fsched-pressure which is now > >> enabled for RISC-V backend, the gate makes this future-proof in > >> addition to helping with -O1 etc. > >> > >> This fixes 1 addition test > >> > >> = Summary of gcc testsuite = > >> | # of unexpected case / # of unique > >> unexpected case > >> | gcc | g++ | gfortran | > >> > >> rv32imac/ ilp32/ medlow | 416 / 103 | 13 / 6 | 67 /12 | > >> rv32imafdc/ ilp32d/ medlow | 416 / 103 | 13 / 6 | 24 / 4 | > >> rv64imac/ lp64/ medlow | 417 / 104 |9 / 3 | 67 /12 | > >> rv64imafdc/ lp64d/ medlow | 416 / 103 |5 / 2 |6 / 1 | > >> > >> Also similar to v1, this doesn't move RISC-V SPEC scores at all. > >> > >> gcc/ChangeLog: > >> * config/riscv/riscv.md (mvconst_internal): Add !ira_in_progress. > >> > >> Suggested-by: Jeff Law > >> Signed-off-by: Vineet Gupta > >> --- > >> gcc/config/riscv/riscv.md | 9 ++--- > >> 1 file changed, 6 insertions(+), 3 deletions(-) > >> > >> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > >> index 1ebe8f92284d..da84b9357bd3 100644 > >> --- a/gcc/config/riscv/riscv.md > >> +++ b/gcc/config/riscv/riscv.md > >> @@ -1997,13 +1997,16 @@ > >> > >> ;; Pretend to have the ability to load complex const_int in order to get > >> ;; better code generation around them. > >> -;; > >> ;; But avoid constants that are special cased elsewhere. 
> >> +;; > >> +;; Hide it from IRA register equiv recog* () to elide potential undoing > >> of split > >> +;; > >> (define_insn_and_split "*mvconst_internal" > >> [(set (match_operand:GPR 0 "register_operand" "=r") > >> (match_operand:GPR 1 "splittable_const_int_operand" "i"))] > >> - "!(p2m1_shift_operand (operands[1], mode) > >> - || high_mask_shift_operand (operands[1], mode))" > >> + "!ira_in_progress > >> + && !(p2m1_shift_operand (operands[1], mode) > >> +|| high_mask_shift_operand (operands[1], mode))" > >> "#" > >> "&& 1" > >> [(const_int 0)] >
[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

Andrew Pinski changed:

What     |Removed |Added
See Also |        |https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107601

--- Comment #4 from Andrew Pinski ---
x86_64 defines SLOW_BYTE_ACCESS, which causes some (if not all) of the issues here:
```
;; _3 = bf.c;
(insn 9 8 10 (parallel [
            (set (reg:DI 106)
                (lshiftrt:DI (reg/v:DI 104 [ bf ])
                    (const_int 32 [0x20])))
            (clobber (reg:CC 17 flags))
        ]) "/app/example.cpp":5:58 -1
     (nil))
(insn 10 9 0 (parallel [
            (set (reg:HI 100 [ _3 ])
                (and:HI (subreg:HI (reg:DI 106) 0)
                    (const_int 1023 [0x3ff])))
            (clobber (reg:CC 17 flags))
        ]) "/app/example.cpp":5:58 -1
     (nil))
;; _4 = (unsigned int) _3;
(insn 11 10 0 (set (reg:SI 101 [ _4 ])
        (zero_extend:SI (reg:HI 100 [ _3 ]))) "/app/example.cpp":5:46 -1
     (nil))
```
It uses HImode (short) here, rather than SImode (int), because SLOW_BYTE_ACCESS is defined.
Re: [PATCH v4] c++: Check for indirect change of active union member in constexpr [PR101631,PR102286]
On 10/8/23 21:03, Nathaniel Shead wrote: Ping for https://gcc.gnu.org/pipermail/gcc-patches/2023-September/631203.html + && (TREE_CODE (t) == MODIFY_EXPR + /* Also check if initializations have implicit change of active +member earlier up the access chain. */ + || !refs->is_empty()) I'm not sure what the cumulative point of these two tests is. TREE_CODE (t) will be either MODIFY_EXPR or INIT_EXPR, and either should be OK. As I understand it, the problematic case is something like constexpr-union2.C, where we're also looking at a MODIFY_EXPR. So what is this check doing? Incidentally, I think constexpr-union6.C could use a test where we pass to a function other than construct_at, and then try (and fail) to assign to the b member from that function. Jason
Re: xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA
On 10/9/23 14:36, Vineet Gupta wrote: Hi Christoph, On 10/9/23 12:06, Patrick O'Neill wrote: Hi Vineet, We're seeing a regression on all riscv targets after this patch:| FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O2 check-function-bodies ConNmv_imm_imm_reg|| FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O3 -g check-function-bodies ConNmv_imm_imm_reg Debug log output: body: \taddi a[0-9]+,a[0-9]+,-1000+ \tli a[0-9]+,9998336+ \taddi a[0-9]+,a[0-9]+,1664+ \tth.mveqz a[0-9]+,a[0-9]+,a[0-9]+ \tret against: li a5,9998336 addi a4,a0,-1000 addi a0,a5,1664 th.mveqz a0,a1,a4 ret| https://github.com/patrick-rivos/gcc-postcommit-ci/issues/8 https://github.com/ewlu/riscv-gnu-toolchain/issues/286 It seems with my patch, exactly same instructions get out of order (for -O2/-O3) tripping up the test results and differ from say O1 for exact same build. -O2 w/ patch ConNmv_imm_imm_reg: li a5,9998336 addi a4,a0,-1000 addi a0,a5,1664 th.mveqz a0,a1,a4 ret -O1 w/ patch ConNmv_imm_imm_reg: addi a4,a0,-1000 li a5,9998336 addi a0,a5,1664 th.mveqz a0,a1,a4 ret I'm not sure if there is an easy way to handle that. Is there a real reason for testing the full sequences verbatim, or is testing number of occurrences of th.mv{eqz,nez} enough. It seems Jeff recently added -fno-sched-pressure to avoid similar issues but that apparently is no longer sufficient. I'd suggest doing a count test rather than an exact match. Verify you get a single li, two addis and one th.mveqz Jeff
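A count-based check along the lines Jeff suggests could look roughly like the sketch below. The function is a hypothetical stand-in reconstructed from the assembly quoted in this thread, not the actual body in xtheadcondmov-indirect.c, and the `dg-options` string is an assumption. The `scan-assembler-times` directives count occurrences of each mnemonic, so the test no longer depends on the order the scheduler picks.

```c
/* { dg-do compile } */
/* Assumed options; the real test may use different ones.  */
/* { dg-options "-march=rv64gc_xtheadcondmov -mabi=lp64d -O2" } */

/* Hypothetical stand-in: return Y when X is 1000, else a constant that
   needs an li/addi pair to synthesize.  */
long cond_move(long x, long y)
{
  if (x == 1000)
    return y;
  return 10000000;
}

/* Count instructions instead of matching the exact sequence.  */
/* { dg-final { scan-assembler-times {\mli\M} 1 } } */
/* { dg-final { scan-assembler-times {\maddi\M} 2 } } */
/* { dg-final { scan-assembler-times {\mth\.mveqz\M} 1 } } */
```

The `\m` and `\M` markers are Tcl word-boundary anchors, which keep `li` from matching inside another mnemonic. Both the -O1 and the -O2 orderings shown above would satisfy these counts.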
[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743

Andrew Pinski changed:

What             |Removed     |Added
Ever confirmed   |0           |1
CC               |            |pinskia at gcc dot gnu.org
Severity         |normal      |enhancement
Last reconfirmed |            |2023-10-09
Status           |UNCONFIRMED |NEW

--- Comment #3 from Andrew Pinski ---
RTL wise we have:

Trying 6, 8 -> 9:
    6: {r108:DI=r105:DI 0>>0x20;clobber flags:CC;}
      REG_UNUSED flags:CC
    8: {r110:SI=r108:DI#0&0x3ff;clobber flags:CC;}
      REG_UNUSED flags:CC
      REG_DEAD r108:DI
    9: {r111:SI=r110:SI<<0x14;clobber flags:CC;}
      REG_DEAD r110:SI
      REG_UNUSED flags:CC
Failed to match this instruction:
(parallel [
        (set (reg:SI 111)
            (and:SI (ashift:SI (subreg:SI (zero_extract:DI (reg/v:DI 105 [ bf ])
                            (const_int 32 [0x20])
                            (const_int 32 [0x20])) 0)
                    (const_int 20 [0x14]))
                (const_int 1072693248 [0x3ff00000])))
        (clobber (reg:CC 17 flags))
    ])

This should have been simplified. Anyway, bitfields have issues even at the gimple level, as they are not lowered until expand ...
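For readers without the original testcase, the RTL above (shift right by 32, mask with 0x3ff, shift left by 20) is what one would expect from source shaped like the sketch below. The struct layout and names are guesses made for illustration, not the reporter's code. Bit-field ordering and the use of `uint64_t` as a bit-field type are implementation-defined, so the layout comment applies to GCC on x86_64 only.

```c
#include <stdint.h>

/* Hypothetical layout: with GCC on x86_64 the 10-bit field C starts at
   bit 32 of the 64-bit storage unit.  */
struct bf {
    uint64_t lo : 32;
    uint64_t c  : 10;
    uint64_t hi : 22;
};

/* Read the bit-field and shift it; the extraction shift and the
   following left shift are the ones that ideally would be combined.  */
unsigned int field_shifted(struct bf bf)
{
    return (unsigned int) bf.c << 20;
}

/* The same computation written out on the raw 64-bit value.  */
unsigned int field_shifted_manual(uint64_t raw)
{
    return (unsigned int) ((raw >> 32) & 0x3ff) << 20;
}
```

The largest possible result is 0x3ff shifted left by 20, that is 0x3ff00000 (1072693248), which matches the mask constant in the failed combine pattern.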
xthead regression with [COMMITTED] RISC-V: const: hide mvconst splitter from IRA
Hi Christoph, On 10/9/23 12:06, Patrick O'Neill wrote: Hi Vineet, We're seeing a regression on all riscv targets after this patch:| FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O2 check-function-bodies ConNmv_imm_imm_reg|| FAIL: gcc.target/riscv/xtheadcondmov-indirect.c -O3 -g check-function-bodies ConNmv_imm_imm_reg Debug log output: body: \taddi a[0-9]+,a[0-9]+,-1000+ \tli a[0-9]+,9998336+ \taddi a[0-9]+,a[0-9]+,1664+ \tth.mveqz a[0-9]+,a[0-9]+,a[0-9]+ \tret against: li a5,9998336 addi a4,a0,-1000 addi a0,a5,1664 th.mveqz a0,a1,a4 ret| https://github.com/patrick-rivos/gcc-postcommit-ci/issues/8 https://github.com/ewlu/riscv-gnu-toolchain/issues/286 It seems with my patch, exactly same instructions get out of order (for -O2/-O3) tripping up the test results and differ from say O1 for exact same build. -O2 w/ patch ConNmv_imm_imm_reg: li a5,9998336 addi a4,a0,-1000 addi a0,a5,1664 th.mveqz a0,a1,a4 ret -O1 w/ patch ConNmv_imm_imm_reg: addi a4,a0,-1000 li a5,9998336 addi a0,a5,1664 th.mveqz a0,a1,a4 ret I'm not sure if there is an easy way to handle that. Is there a real reason for testing the full sequences verbatim, or is testing number of occurrences of th.mv{eqz,nez} enough. It seems Jeff recently added -fno-sched-pressure to avoid similar issues but that apparently is no longer sufficient. Thx, -Vineet Thanks, Patrick On 10/6/23 11:22, Vineet Gupta wrote: Vlad recently introduced a new gate @ira_in_progress, similar to counterparts @{reload,lra}_in_progress. Use this to hide the constant synthesis splitter from being recog* () by IRA register equivalence logic which is eager to undo the splits, generating worse code for constants (and sometimes no code at all). See PR/109279 (large constant), PR/110748 (const -0.0) ... Granted the IRA logic is subsided with -fsched-pressure which is now enabled for RISC-V backend, the gate makes this future-proof in addition to helping with -O1 etc. 
This fixes 1 additional test

= Summary of gcc testsuite =
                            | # of unexpected case / # of unique unexpected case
                            |    gcc    |   g++   | gfortran |
rv32imac/   ilp32/  medlow  | 416 / 103 | 13 / 6  | 67 / 12  |
rv32imafdc/ ilp32d/ medlow  | 416 / 103 | 13 / 6  | 24 / 4   |
rv64imac/   lp64/   medlow  | 417 / 104 |  9 / 3  | 67 / 12  |
rv64imafdc/ lp64d/  medlow  | 416 / 103 |  5 / 2  |  6 / 1   |

Also similar to v1, this doesn't move RISC-V SPEC scores at all.

gcc/ChangeLog:
	* config/riscv/riscv.md (mvconst_internal): Add !ira_in_progress.

Suggested-by: Jeff Law
Signed-off-by: Vineet Gupta
---
 gcc/config/riscv/riscv.md | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 1ebe8f92284d..da84b9357bd3 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -1997,13 +1997,16 @@
 ;; Pretend to have the ability to load complex const_int in order to get
 ;; better code generation around them.
-;;
 ;; But avoid constants that are special cased elsewhere.
+;;
+;; Hide it from IRA register equiv recog* () to elide potential undoing of split
+;;
 (define_insn_and_split "*mvconst_internal"
   [(set (match_operand:GPR 0 "register_operand" "=r")
 	(match_operand:GPR 1 "splittable_const_int_operand" "i"))]
-  "!(p2m1_shift_operand (operands[1], mode)
-     || high_mask_shift_operand (operands[1], mode))"
+  "!ira_in_progress
+   && !(p2m1_shift_operand (operands[1], mode)
+	|| high_mask_shift_operand (operands[1], mode))"
   "#"
   "&& 1"
   [(const_int 0)]
[Bug fortran/67740] Wrong association status of allocatable character pointer in derived types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=67740 --- Comment #10 from anlauf at gcc dot gnu.org --- (In reply to anlauf from comment #9) Addendum: > I was suspecting gfc_conv_variable as a possibly further place for a fix: > it has a loop over ref's that looks incomplete for REF_COMPONENT. I tried my version of a patch in that place, which worked for the testcases here but gave wrong code already for slightly more complex pointer assignments, like type(pointer_typec0_t) :: co, xo ... xo%data1 => co%data1 so let's go with your patch.
[Bug target/111745] [14 Regression] ICE: in extract_insn, at recog.cc:2791 (unrecognizable insn) with -ffloat-store -mavx512fp16 -mavx512vl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111745

Andrew Pinski changed:

What             |Removed |Added
Keywords         |        |needs-bisection
Target Milestone |---     |14.0
Re: [PATCH] c++: Improve diagnostics for constexpr cast from void*
On 10/9/23 06:03, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu with GXX_TESTSUITE_STDS=98,11,14,17,20,23,26,impcx. -- >8 -- This patch improves the errors given when casting from void* in C++26 to include the expected type if the type of the pointed-to object was not similar to the casted-to type. It also ensures (for all standard modes) that void* casts are checked even for DECL_ARTIFICIAL declarations, such as lifetime-extended temporaries, and is only ignored for cases where we know it's OK (heap identifiers and source_location::current). This provides more accurate diagnostics when using the pointer and ensures that some other casts from void* are now correctly rejected. gcc/cp/ChangeLog: * constexpr.cc (is_std_source_location_current): New. (cxx_eval_constant_expression): Only ignore cast from void* for specific cases and improve other diagnostics. gcc/testsuite/ChangeLog: * g++.dg/cpp0x/constexpr-cast4.C: New test. Signed-off-by: Nathaniel Shead --- gcc/cp/constexpr.cc | 83 +--- gcc/testsuite/g++.dg/cpp0x/constexpr-cast4.C | 7 ++ 2 files changed, 78 insertions(+), 12 deletions(-) create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-cast4.C diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc index 0f948db7c2d..f38d541a662 100644 --- a/gcc/cp/constexpr.cc +++ b/gcc/cp/constexpr.cc @@ -2301,6 +2301,36 @@ is_std_allocator_allocate (const constexpr_call *call) && is_std_allocator_allocate (call->fundef->decl)); } +/* Return true if FNDECL is std::source_location::current. 
*/ + +static inline bool +is_std_source_location_current (tree fndecl) +{ + if (!decl_in_std_namespace_p (fndecl)) +return false; + + tree name = DECL_NAME (fndecl); + if (name == NULL_TREE || !id_equal (name, "current")) +return false; + + tree ctx = DECL_CONTEXT (fndecl); + if (ctx == NULL_TREE || !CLASS_TYPE_P (ctx) || !TYPE_MAIN_DECL (ctx)) +return false; + + name = DECL_NAME (TYPE_MAIN_DECL (ctx)); + return name && id_equal (name, "source_location"); +} + +/* Overload for the above taking constexpr_call*. */ + +static inline bool +is_std_source_location_current (const constexpr_call *call) +{ + return (call + && call->fundef + && is_std_source_location_current (call->fundef->decl)); +} + /* Return true if FNDECL is __dynamic_cast. */ static inline bool @@ -7850,33 +7880,62 @@ cxx_eval_constant_expression (const constexpr_ctx *ctx, tree t, if (TYPE_PTROB_P (type) && TYPE_PTR_P (TREE_TYPE (op)) && VOID_TYPE_P (TREE_TYPE (TREE_TYPE (op))) - /* Inside a call to std::construct_at or to - std::allocator::{,de}allocate, we permit casting from void* + /* Inside a call to std::construct_at, + std::allocator::{,de}allocate, or + std::source_location::current, we permit casting from void* because that is compiler-generated code. */ && !is_std_construct_at (ctx->call) - && !is_std_allocator_allocate (ctx->call)) + && !is_std_allocator_allocate (ctx->call) + && !is_std_source_location_current (ctx->call)) { /* Likewise, don't error when casting from void* when OP is uninit and similar. 
*/ tree sop = tree_strip_nop_conversions (op); - if (TREE_CODE (sop) == ADDR_EXPR - && VAR_P (TREE_OPERAND (sop, 0)) - && DECL_ARTIFICIAL (TREE_OPERAND (sop, 0))) + tree decl = NULL_TREE; + if (TREE_CODE (sop) == ADDR_EXPR) + decl = TREE_OPERAND (sop, 0); + if (decl + && VAR_P (decl) + && DECL_ARTIFICIAL (decl) + && (DECL_NAME (decl) == heap_identifier + || DECL_NAME (decl) == heap_uninit_identifier + || DECL_NAME (decl) == heap_vec_identifier + || DECL_NAME (decl) == heap_vec_uninit_identifier)) /* OK */; /* P2738 (C++26): a conversion from a prvalue P of type "pointer to cv void" to a pointer-to-object type T unless P points to an object whose type is similar to T. */ - else if (cxx_dialect > cxx23 -&& (sop = cxx_fold_indirect_ref (ctx, loc, - TREE_TYPE (type), sop))) + else if (cxx_dialect > cxx23) { - r = build1 (ADDR_EXPR, type, sop); - break; + r = cxx_fold_indirect_ref (ctx, loc, TREE_TYPE (type), sop); + if (r) + { + r = build1 (ADDR_EXPR, type, r); + break; + } + if (!ctx->quiet) + { + if (TREE_CODE (sop) == ADDR_EXPR) + { +
Re: [PATCH v1 1/4] options: Define TARGET__P and TARGET__OPTS_P macro for Mask and InverseMask
> Doesn't this need to be updated to avoid multi-dimensional arrays in awk
> and rebased?

Oh, yeah, I should update that; it was posted before that issue was reported. Let me send a v2 soon :P
[Bug tree-optimization/111694] [13/14 Regression] Wrong behavior for signbit of negative zero when optimizing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111694

--- Comment #9 from Andrew Macleod ---
(In reply to Andrew Macleod from comment #8)
> (In reply to Alexander Monakov from comment #7)
> > No backport for gcc-13 planned?
>
> mmm, didn't realize we were propagating floating point equivalences around
> in 13. A similar patch should work there.

Testing the same patch on gcc13. I will let it settle on trunk for a day or two first, then check it in if nothing shows up, which it shouldn't :-)
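As background on why propagating floating point equivalences is delicate here: under IEEE 754, negative zero compares equal to positive zero, yet `signbit` tells them apart. Knowing `x == 0.0` therefore does not allow the compiler to substitute the constant `0.0` for `x`. The helper below is a hypothetical illustration of that property and is not the testcase from the bug.

```c
#include <math.h>

/* Return nonzero only for negative zero.  The comparison alone cannot
   distinguish -0.0 from +0.0, so the sign has to be read from X itself
   and not from a constant that merely compares equal to it.  */
int is_negative_zero(double x)
{
    return x == 0.0 && signbit(x) != 0;
}
```

If an optimizer replaced `x` with `0.0` inside the `signbit` call on the strength of the comparison, this function would wrongly return 0 for `-0.0`.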
[Bug target/111746] New: [14 Regression] ICE: infinite recursion in try_split (emit-rtl.cc:3972) at -O2
--with-cloog --with-ppl --with-isl --with-sysroot=/usr/powerpc64le-unknown-linux-gnu --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=powerpc64le-unknown-linux-gnu --with-ld=/usr/bin/powerpc64le-unknown-linux-gnu-ld --with-as=/usr/bin/powerpc64le-unknown-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.0 20231009 (experimental) (GCC) The build breaks at: $ make /bin/sh ../libtool --tag CC --tag disable-shared --mode=compile /repo/build-gcc-trunk-powerpc64le/./gcc/xgcc -B/repo/build-gcc-trunk-powerpc64le/./gcc/ -B/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/bin/ -B/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/lib/ -isystem /repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/include -isystem /repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/sys-include -DHAVE_CONFIG_H -I.. 
-I/repo/gcc-trunk/libstdc++-v3/../libiberty -I/repo/gcc-trunk/libstdc++-v3/../include -prefer-pic -D_GLIBCXX_SHARED -I/repo/build-gcc-trunk-powerpc64le/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu -I/repo/build-gcc-trunk-powerpc64le/powerpc64le-unknown-linux-gnu/libstdc++-v3/include -I/repo/gcc-trunk/libstdc++-v3/libsupc++-g -O2 -DIN_GLIBCPP_V3 -Wno-error -c cp-demangle.c libtool: compile: /repo/build-gcc-trunk-powerpc64le/./gcc/xgcc -B/repo/build-gcc-trunk-powerpc64le/./gcc/ -B/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/bin/ -B/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/lib/ -isystem /repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/include -isystem /repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-powerpc64le/powerpc64le-unknown-linux-gnu/sys-include -DHAVE_CONFIG_H -I.. -I/repo/gcc-trunk/libstdc++-v3/../libiberty -I/repo/gcc-trunk/libstdc++-v3/../include -D_GLIBCXX_SHARED -I/repo/build-gcc-trunk-powerpc64le/powerpc64le-unknown-linux-gnu/libstdc++-v3/include/powerpc64le-unknown-linux-gnu -I/repo/build-gcc-trunk-powerpc64le/powerpc64le-unknown-linux-gnu/libstdc++-v3/include -I/repo/gcc-trunk/libstdc++-v3/libsupc++ -g -O2 -DIN_GLIBCPP_V3 -Wno-error -c cp-demangle.c -fPIC -DPIC -o cp-demangle.o xgcc: internal compiler error: Segmentation fault signal terminated program cc1 Please submit a full bug report, with preprocessed source (by using -freport-bug). See <https://gcc.gnu.org/bugs/> for instructions. make: *** [Makefile:970: cp-demangle.lo] Error 1
[Bug target/111745] New: [14 Regression] ICE: in extract_insn, at recog.cc:2791 (unrecognizable insn) with -ffloat-store -mavx512fp16 -mavx512vl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111745 Bug ID: 111745 Summary: [14 Regression] ICE: in extract_insn, at recog.cc:2791 (unrecognizable insn) with -ffloat-store -mavx512fp16 -mavx512vl Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: ice-on-valid-code Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Target: x86_64-pc-linux-gnu Created attachment 56083 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56083=edit reduced testcase Compiler output: $ x86_64-pc-linux-gnu-gcc -ffloat-store -mavx512fp16 -mavx512vl testcase.c testcase.c: In function 'foo': testcase.c:8:1: error: unrecognizable insn: 8 | } | ^ (insn 54 53 55 2 (set (reg:V8HF 136) (vec_concat:V8HF (mem:V4HF (plus:DI (reg/f:DI 93 virtual-stack-vars) (const_int -16 [0xfff0])) [1 S8 A64]) (reg:V4HF 139))) "testcase.c":7:5 -1 (nil)) during RTL pass: vregs testcase.c:8:1: internal compiler error: in extract_insn, at recog.cc:2791 0x7e765b _fatal_insn(char const*, rtx_def const*, char const*, int, char const*) /repo/gcc-trunk/gcc/rtl-error.cc:108 0x7e76d8 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*) /repo/gcc-trunk/gcc/rtl-error.cc:116 0x7d63fd extract_insn(rtx_insn*) /repo/gcc-trunk/gcc/recog.cc:2791 0x10e7995 instantiate_virtual_regs_in_insn /repo/gcc-trunk/gcc/function.cc:1610 0x10e7995 instantiate_virtual_regs /repo/gcc-trunk/gcc/function.cc:1983 0x10e7995 execute /repo/gcc-trunk/gcc/function.cc:2030 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. 
COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/14.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-r14-4520-20231009121517-gb0892b1fc63-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 14.0.0 20231009 (experimental) (GCC)
[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743 --- Comment #2 from Andi Kleen --- Okay, then it doesn't understand that SHL_signed and SHR_unsigned can be combined when one of the values came from a shorter unsigned.
[Bug rtl-optimization/111744] New: Missed optimization when casting rdtsc into uint32_t and computing difference
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111744 Bug ID: 111744 Summary: Missed optimization when casting rdtsc into uint32_t and computing difference Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: stefan.sakalik at gmail dot com Target Milestone: --- This is similar to https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92180 where this code: https://godbolt.org/z/7W9nqTsjE #include #include uint32_t rdtsc32() { return static_cast(__rdtsc()); } uint64_t rdtsc_delta(uint64_t x) { return rdtsc32() - rdtsc32(); } Produces rdtsc_delta(unsigned long): rdtsc mov rcx, rax sal rdx, 32 or rcx, rdx rdtsc sub ecx, eax mov rax, rcx ret as opposed to clang version rdtsc_delta(unsigned long): rdtsc mov rcx, rax rdtsc sub ecx, eax mov rax, rcx ret
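A minimal sketch of why the sal/or pair in the GCC output is dead work, assuming a 32-bit int so that uint32_t arithmetic wraps modulo 2^32. The helper name delta32 is hypothetical, and plain integers stand in for the two rdtsc results:

```cpp
#include <cstdint>

// Hypothetical stand-in for "rdtsc32() - rdtsc32()": each 64-bit counter
// value is truncated to 32 bits before the subtraction, so the upper halves
// (the part GCC assembles with sal/or) cannot influence the result.
constexpr std::uint64_t delta32(std::uint64_t first, std::uint64_t second) {
  return static_cast<std::uint32_t>(first) - static_cast<std::uint32_t>(second);
}

// Same low halves, different high halves: the difference is unchanged.
static_assert(delta32(0x0000AAAA00000005ULL, 0x0000BBBB00000003ULL) ==
                  delta32(5, 3),
              "upper halves are ignored");
```

Since only the low 32 bits of each read are ever used, combining edx into the 64-bit value first is unnecessary, which is what the clang output reflects.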
Re: [RFC 1/2] RISC-V: Add support for _Bfloat16.
On 10/9/23 00:18, Jin Ma wrote: +;; The conversion of DF to BF needs to be done with SF if there is a +;; chance to generate at least one instruction, otherwise just using +;; libfunc __truncdfbf2. +(define_expand "truncdfbf2" + [(set (match_operand:BF 0 "register_operand" "=f") + (float_truncate:BF + (match_operand:DF 1 "register_operand" " f")))] + "TARGET_DOUBLE_FLOAT || TARGET_ZDINX" + { +convert_move (operands[0], + convert_modes (SFmode, DFmode, operands[1], 0), 0); +DONE; + }) So for conversions to/from BFmode, doesn't generic code take care of this for us? Search for convert_mode_scalar in expr.cc. That code will utilize SFmode as an intermediate step just like your expander. Is there some reason that generic code is insufficient? Similarly for the the other conversions. As far as I can see, the function 'convert_mode_scalar' doesn't seem to be perfect for dealing with the conversions to/from BFmode. It can only handle BF to HF, SF, DF and SF to BF well, but the rest of the conversion without any processing, directly using the libcall. Maybe I should choose to enhance its functionality? This seems to be a good choice, I'm not sure.My recollection was that BF could be converted to/from SF trivially and if we wanted BF->DF we'd first convert to SF, then to DF. Direct BF<->DF conversions aren't actually important from a performance standpoint. So it's OK if they have an extra step IMHO. jeff
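As a rough illustration of the "BF to SF is trivial, BF to DF goes through SF" point, here is a sketch in plain C++ rather than RTL. It assumes float is IEEE binary32 and stores a bfloat16 value as a raw 16-bit pattern; the function names are hypothetical and none of this is GCC's expander code:

```cpp
#include <cstdint>
#include <cstring>

// bfloat16 keeps the sign, the 8 exponent bits and the top 7 mantissa bits
// of a binary32 value, so widening only appends 16 zero bits.
constexpr std::uint32_t bf16_to_f32_bits(std::uint16_t bf) {
  return static_cast<std::uint32_t>(bf) << 16;
}

// BF -> SF: reinterpret the widened bit pattern as a float
// (assumes float is IEEE binary32).
inline float bf16_to_float(std::uint16_t bf) {
  const std::uint32_t bits = bf16_to_f32_bits(bf);
  float f;
  std::memcpy(&f, &bits, sizeof f);
  return f;
}

// BF -> DF in two steps, with SF as the intermediate format.
inline double bf16_to_double(std::uint16_t bf) {
  return static_cast<double>(bf16_to_float(bf));
}
```

Both widening steps are exact, so the extra hop costs an instruction but no accuracy. The narrowing direction (DF to SF to BF) rounds twice and is not covered by this sketch.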
Re: [pushed] analyzer: improvements to out-of-bounds diagrams [PR111155]
On Mon, 2023-10-09 at 17:01 +0200, Tobias Burnus wrote: > Hi David, > > On 09.10.23 16:08, David Malcolm wrote: > > On Mon, 2023-10-09 at 12:09 +0200, Tobias Burnus wrote: > > > The following works: > > > (A) Using "kind == boundaries::kind::HARD" - i.e. adding > > > "boundaries::" > > > (B) Renaming the parameter name "kind" to something else - like > > > "k" > > > as used > > > in the other functions. > > > > > > Can you fix it? > > Sorry about the breakage, and thanks for the investigation. > Well, without an older compiler, one does not see it. It also worked > flawlessly on my laptop today. > > Does the following patch fix the build for you? > > Yes – as mentioned either of the variants above should work and (A) > is > what you have in your patch. > > And it is what I actually tried for the full build. Hence, yes, it > works :-) Thanks! I've pushed this to trunk as r14-4521-g08d0f840dc7ad2.
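For readers who did not follow the earlier messages, this is a hypothetical miniature of the name clash being discussed; it is not the analyzer's actual class, only the same shape (a nested enum named kind and a parameter also named kind):

```cpp
// Hypothetical miniature; the real code lives in the analyzer sources.
struct boundaries {
  enum class kind { HARD, SOFT };

  // The parameter is deliberately named like the enumeration. Inside the
  // body, plain `kind` is the parameter, so the enumerator is spelled with
  // the class-qualified name, as in variant (A).
  static constexpr bool is_hard(enum kind kind) {
    return kind == boundaries::kind::HARD;
  }
};
```

Variant (B), renaming the parameter to something like k, avoids the question entirely.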
Odd Python errors in the G++ testsuite
Hi, I'm seeing these tracebacks for several cases across the G++ testsuite: Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)" (timeout = 300) spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6) rules/0/primary-output is ok: p1689-1.o rules/0/provides/0/logical-name is ok: foo rules/0/provides/0/is-interface is ok: True Traceback (most recent call last): File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 218, in is_ok = validate_p1689(actual, expect) File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 182, in validate_p1689 return compare_json([], actual_json, expect_json) File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in compare_json is_ok = _compare_object(path, actual, expect) File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in _compare_object sub_error = compare_json(path + [key], actual[key], expect[key]) File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 151, in compare_json is_ok = _compare_array(path, actual, expect) File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 87, in _compare_array sub_error = compare_json(path + [str(idx)], a, e) File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 145, in compare_json is_ok = _compare_object(path, actual, expect) File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 66, in _compare_object sub_error = compare_json(path + [key], actual[key], expect[key]) File ".../gcc/testsuite/g++.dg/modules/test-p1689.py", line 149, in compare_json actual = set(actual) TypeError: unhashable type: 'dict' and also these intermittent failures for other cases: Executing on host: python3 -c "import sys; assert sys.version_info >= (3, 6)" (timeout = 300) spawn -ignore SIGHUP python3 -c import sys; assert sys.version_info >= (3, 6) rules/0/primary-output is ok: p1689-2.o rules/0/provides/0/logical-name is ok: foo:part1 rules/0/provides/0/is-interface is ok: True ERROR: length mismatch at rules/0/requires: 
actual: "1" expect: "0" version is ok: 0 revision is ok: 0 FAIL: ERROR: length mismatch at rules/0/requires: actual: "1" expect: "0" This does seem to me like something not working as intended. As a Python non-expert I have troubles concluding what is going on here and whether these tracebacks are indeed supposed to be there, or whether it is a sign of a problem. And these failures I don't even know where they come from. Does anyone know? Is there a way to run the offending commands by hand? The relevant invocation lines do not appear in the test log file for one to copy and paste, which I think is not the right way of doing things in our environment. These issues seem independent from the test host environment as I can see them on both a `powerpc64le-linux-gnu' and an `x86_64-linux-gnu' machine in `riscv64-linux-gnu' target testing. Maciej
[Bug middle-end/111743] shifts in bit field accesses don't combine with other shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743 --- Comment #1 from Andrew Pinski --- Remember that types smaller than int are promoted to int.
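The promotion rule can be checked at compile time. The sketch below assumes a platform where int is wider than short; the helper names are made up for illustration, and the struct is the one from the report:

```cpp
#include <type_traits>
#include <utility>

struct bf { unsigned a : 10, b : 20, c : 10; };  // layout from the report

// True when shifting a T yields int, i.e. T was promoted before the shift.
template <typename T>
constexpr bool shift_yields_int() {
  return std::is_same<decltype(std::declval<T>() << 1), int>::value;
}

// A 10-bit unsigned bit-field fits in int, so it is promoted to int as well,
// which makes "bf.c << 20" a signed shift.
constexpr bool bitfield_shift_yields_int =
    std::is_same<decltype(bf{}.c << 20), int>::value;
```

This is why the left shift in the generated code operates on a signed value even though every declared type in the source is unsigned.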
[Bug middle-end/111743] New: shifts in bit field accesses don't combine with other shifts
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111743 Bug ID: 111743 Summary: shifts in bit field accesses don't combine with other shifts Product: gcc Version: 13.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: andi-gcc at firstfloor dot org Target Milestone: --- (not sure it's the middle-end, picked arbitrarily)

The following code

    struct bf { unsigned a : 10, b : 20, c : 10; };
    unsigned fbc(struct bf bf) { return bf.b | (bf.c << 20); }

generates:

    movq %rdi, %rax
    shrq $10, %rdi
    shrq $32, %rax
    andl $1048575, %edi
    andl $1023, %eax
    sall $20, %eax
    orl %edi, %eax
    ret

It doesn't understand that the shift right can be combined with the shift left. Also not sure why the shift left is arithmetic (this should be all unsigned).

clang does the simplification, which ends up one instruction shorter:

    movl %edi, %eax
    shrl $10, %eax
    andl $1048575, %eax # imm = 0xFFFFF
    shrq $12, %rdi
    andl $1072693248, %edi # imm = 0x3FF00000
    orl %edi, %eax
    retq
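To make the two instruction sequences easier to compare, here is a sketch that redoes both on a raw 64-bit value. The field positions (a in bits 0..9, b in bits 10..29, c in bits 32..41) are read off the shift counts and masks in the listings and are an assumption about the x86-64 layout; the function names are hypothetical:

```cpp
#include <cstdint>

// GCC's sequence: move c down to bit 0, mask it, then shift it back up to 20.
constexpr std::uint32_t fbc_two_shifts(std::uint64_t raw) {
  return static_cast<std::uint32_t>((raw >> 10) & 0xFFFFF) |
         (static_cast<std::uint32_t>((raw >> 32) & 0x3FF) << 20);
}

// clang's sequence: a single net shift of 32 - 20 = 12, masked in place.
constexpr std::uint32_t fbc_one_shift(std::uint64_t raw) {
  return ((static_cast<std::uint32_t>(raw) >> 10) & 0xFFFFF) |
         (static_cast<std::uint32_t>(raw >> 12) & 0x3FF00000u);
}

// Both forms agree, including when the padding bits are set.
static_assert(fbc_two_shifts(~0ULL) == fbc_one_shift(~0ULL),
              "the two sequences compute the same value");
```

The combined form works because shifting right by 32 and then left by 20 moves the same ten bits as shifting right by 12, provided the mask is moved along with them.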
Re: [PATCH] wide-int: Allow up to 16320 bits wide_int and change widest_int precision to 32640 bits [PR102989]
On Mon, Oct 09, 2023 at 03:44:10PM +0200, Jakub Jelinek wrote: > Thanks, just quick answers, will work on patch adjustments after trying to > get rid of rwide_int (seems dwarf2out has very limited needs from it, just > some routine to construct it in GCed memory (and never change afterwards) > from const wide_int_ref & or so, and then working operator ==, > get_precision, elt, get_len and get_val methods, so I think we could just > have a struct dw_wide_int { unsigned int prec, len; HOST_WIDE_INT val[1]; }; > and perform the methods on it after converting to a storage ref. Now in patch form (again, incremental). > > Does the variable-length memcpy pay for itself? If so, perhaps that's a > > sign that we should have a smaller inline buffer for this class (say 2 > > HWIs). > > Guess I'll try to see what results in smaller .text size. I've left the memcpy changes into a separate patch (incremental, attached). Seems that second patch results in .text growth by 16256 bytes (0.04%), though I'd bet it probably makes compile time tiny bit faster because it replaces an out of line memcpy (caused by variable length) with inlined one. With even the third one it shrinks by 84544 bytes (0.21% down), but the extra statistics patch then shows massive number of allocations after running make check-gcc check-g++ check-gfortran for just a minute or two. On the widest_int side, I see (first number from sort | uniq -c | sort -nr, second the estimated or final len) 7289034 4 173586 5 21819 6 i.e. there are tons of widest_ints which need len 4 (or perhaps just have it as upper estimation), maybe even 5 would be nice. On the wide_int side, I see 155291 576 (supposedly because of bound_wide_int, where we create wide_int_ref from the 576-bit precision bound_wide_int and then create 576-bit wide_int when using unary or binary operation on that). 
So, perhaps we could get away with say WIDEST_INT_MAX_INL_ELTS of 5 or 6 instead of 9 but keep WIDE_INT_MAX_INL_ELTS at 9 (or whatever is computed from MAX_BITSIZE_MODE_ANY_INT?). Or keep it at 9 for both (i.e. without the third patch). --- gcc/poly-int.h.jj 2023-10-09 14:37:45.883940062 +0200 +++ gcc/poly-int.h 2023-10-09 17:05:26.629828329 +0200 @@ -96,7 +96,7 @@ struct poly_coeff_traits -struct poly_coeff_traits +struct poly_coeff_traits { typedef WI_UNARY_RESULT (T) result; typedef int int_type; @@ -110,14 +110,13 @@ struct poly_coeff_traits -struct poly_coeff_traits +struct poly_coeff_traits { typedef WI_UNARY_RESULT (T) result; typedef int int_type; /* These types are always signed. */ static const int signedness = 1; static const int precision = wi::int_traits::precision; - static const int inl_precision = wi::int_traits::inl_precision; static const int rank = precision * 2 / CHAR_BIT; template --- gcc/double-int.h.jj 2023-01-02 09:32:22.747280053 +0100 +++ gcc/double-int.h2023-10-09 17:06:03.446317336 +0200 @@ -440,7 +440,7 @@ namespace wi template <> struct int_traits { -static const enum precision_type precision_type = CONST_PRECISION; +static const enum precision_type precision_type = INL_CONST_PRECISION; static const bool host_dependent_precision = true; static const unsigned int precision = HOST_BITS_PER_DOUBLE_INT; static unsigned int get_precision (const double_int &); --- gcc/wide-int.h.jj 2023-10-09 16:06:39.326805176 +0200 +++ gcc/wide-int.h 2023-10-09 17:29:20.016951691 +0200 @@ -343,8 +343,8 @@ template class widest_int_storag typedef generic_wide_int wide_int; typedef FIXED_WIDE_INT (ADDR_MAX_PRECISION) offset_int; -typedef generic_wide_int > widest_int; -typedef generic_wide_int > widest2_int; +typedef generic_wide_int > widest_int; +typedef generic_wide_int > widest2_int; /* wi::storage_ref can be a reference to a primitive type, so this is the conservatively-correct setting. 
*/ @@ -394,13 +394,13 @@ namespace wi /* The integer has a variable precision but no defined signedness. */ VAR_PRECISION, -/* The integer has a constant precision (known at GCC compile time) - and is signed. */ -CONST_PRECISION, - -/* Like CONST_PRECISION, but with WIDEST_INT_MAX_PRECISION or larger - precision where not all elements of arrays are always present. */ -WIDEST_CONST_PRECISION +/* The integer has a constant precision (known at GCC compile time), + is signed and all elements are in inline buffer. */ +INL_CONST_PRECISION, + +/* Like INL_CONST_PRECISION, but elements can be heap allocated for + larger lengths. */ +CONST_PRECISION }; /* This class, which has no default implementation, is expected to @@ -410,15 +410,10 @@ namespace wi Classifies the type of T. static const unsigned int precision; - Only defined if precision_type == CONST_PRECISION or - precision_type == WIDEST_CONST_PRECISION. Specifies the + Only defined if precision_type ==
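For orientation on the element counts quoted above, a back-of-the-envelope sketch, assuming a 64-bit HOST_WIDE_INT. The names below are illustrative only and do not exist in wide-int.h:

```cpp
// Assumption: one element (HOST_WIDE_INT) holds 64 bits on the host.
constexpr unsigned host_bits_per_elt = 64;

// Number of elements needed to hold a value of the given precision.
constexpr unsigned elts_for_precision(unsigned precision) {
  return (precision + host_bits_per_elt - 1) / host_bits_per_elt;
}

// Hypothetical predicate: a value of length `len` stays in the inline buffer
// when it needs no more than `max_inline_elts` elements; anything longer
// would have to be allocated elsewhere.
constexpr bool fits_inline(unsigned len, unsigned max_inline_elts) {
  return len <= max_inline_elts;
}
```

Under that assumption a 576-bit value needs 9 elements, which matches the inline size of 9 mentioned in the message, while a length of 4 (by far the most common widest_int case in the statistics) would still fit an inline buffer of 5.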
[Bug c++/111742] Misaligned generated code with MI using aligned virtual base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111742 --- Comment #3 from Andrew Pinski --- Then it is a dup of bug 71644. *** This bug has been marked as a duplicate of bug 71644 ***
[Bug c++/71644] gcc 6.1 generates movaps for unaligned memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71644 Andrew Pinski changed: What|Removed |Added CC||cuzdav at gmail dot com --- Comment #3 from Andrew Pinski --- *** Bug 111742 has been marked as a duplicate of this bug. ***
[Bug c++/111742] Misaligned generated code with MI using aligned virtual base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111742 --- Comment #2 from Chris Uzdavinis --- No, this is not a ubsan report. Code *crashes* and I thought showing the UBsan warning was enough to demonstrate it. A minimal change to make the code crash instead of just report ubsan errors: struct X { void * a = nullptr; void * b = nullptr; }; struct alignas(16) AlignedData { }; struct A : virtual AlignedData { int x = 0; // << add this X xxx; int& ref = x;// << and this }; struct B : virtual AlignedData {}; struct Test : B, A {}; Test* t = new Test; int main() {} *** SEGFAULT *** https://godbolt.org/z/f57vs7jxP
Re: [PATCH] sso-string@gnu-versioned-namespace [PR83077]
On 09/10/2023 16:42, Iain Sandoe wrote: Hi François, On 7 Oct 2023, at 20:32, François Dumont wrote: I've been told that previous patch generated with 'git diff -b' was not applying properly so here is the same patch again with a simple 'git diff'. Thanks, that did fix it - There are some trailing whitespaces in the config files, but I suspect that they need to be there since those have values appended during the configuration. You're talking about the ones coming from regenerated Makefile.in and configure I guess. I prefer not to edit those, those trailing whitespaces are already in. Anyway, with this + the coroutines and contract v2 (weak def) fix, plus a local patch to enable versioned namespace on Darwin, I get results comparable with the non-versioned case - but one more patchlet is needed on yours (to allow for targets using emulated TLS): diff --git a/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver b/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver index 9fab8bead15..b7167fc0c2f 100644 --- a/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver +++ b/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver @@ -78,6 +78,7 @@ GLIBCXX_8.0 { # thread/mutex/condition_variable/future __once_proxy; +__emutls_v._ZNSt3__81?__once_call*; I can add this one, sure, even if it could be part of a dedicated patch. I'm surprised that we do not need the __once_callable emul symbol too, it would be more consistent with the non-versioned mode. I'm pretty sure there are a bunch of other symbols missing, but this mode is seldom tested... # std::__convert_to_v _ZNSt3__814__convert_to_v*; thanks Iain On 07/10/2023 14:25, François Dumont wrote: Hi Here is a rebased version of this patch. There are few test failures when running 'make check-c++' but nothing new. Still, there are 2 patches awaiting validation to fix some of them, PR c++/111524 to fix another bunch and I fear that we will have to live with the others.
libstdc++: [_GLIBCXX_INLINE_VERSION] Use cxx11 abi [PR83077] Use cxx11 abi when activating versioned namespace mode. To do support a new configuration mode where !_GLIBCXX_USE_DUAL_ABI and _GLIBCXX_USE_CXX11_ABI. The main change is that std::__cow_string is now defined whenever _GLIBCXX_USE_DUAL_ABI or _GLIBCXX_USE_CXX11_ABI is true. Implementation is using available std::string in case of dual abi and a subset of it when it's not. On the other side std::__sso_string is defined only when _GLIBCXX_USE_DUAL_ABI is true and _GLIBCXX_USE_CXX11_ABI is false. Meaning that std::__sso_string is a typedef for the cow std::string implementation when dual abi is disabled and cow string is being used. libstdcxx-v3/ChangeLog: PR libstdc++/83077 * acinclude.m4 [GLIBCXX_ENABLE_LIBSTDCXX_DUAL_ABI]: Default to "new" libstdcxx abi when enable_symvers is gnu-versioned-namespace. * config/locale/dragonfly/monetary_members.cc [!_GLIBCXX_USE_DUAL_ABI]: Define money_base members. * config/locale/generic/monetary_members.cc [!_GLIBCXX_USE_DUAL_ABI]: Likewise. * config/locale/gnu/monetary_members.cc [!_GLIBCXX_USE_DUAL_ABI]: Likewise. * config/locale/gnu/numeric_members.cc [!_GLIBCXX_USE_DUAL_ABI](__narrow_multibyte_chars): Define. * configure: Regenerate. * include/bits/c++config [_GLIBCXX_INLINE_VERSION](_GLIBCXX_NAMESPACE_CXX11, _GLIBCXX_BEGIN_NAMESPACE_CXX11): Define empty. [_GLIBCXX_INLINE_VERSION](_GLIBCXX_END_NAMESPACE_CXX11, _GLIBCXX_DEFAULT_ABI_TAG): Likewise. * include/bits/cow_string.h [!_GLIBCXX_USE_CXX11_ABI]: Define a light version of COW basic_string as __std_cow_string for use in stdexcept. * include/std/stdexcept [_GLIBCXX_USE_CXX11_ABI]: Define __cow_string. (__cow_string(const char*)): New. (__cow_string::c_str()): New. * python/libstdcxx/v6/printers.py (StdStringPrinter::__init__): Set self.new_string to True when std::__8::basic_string type is found. * src/Makefile.am [ENABLE_SYMVERS_GNU_NAMESPACE](ldbl_alt128_compat_sources): Define empty. 
* src/Makefile.in: Regenerate. * src/c++11/Makefile.am (cxx11_abi_sources): Rename into... (dual_abi_sources): ...this. Also move cow-local_init.cc, cxx11-hash_tr1.cc, cxx11-ios_failure.cc entries to... (sources): ...this. (extra_string_inst_sources): Move cow-fstream-inst.cc, cow-sstream-inst.cc, cow-string-inst.cc, cow-string-io-inst.cc, cow-wtring-inst.cc, cow-wstring-io-inst.cc, cxx11-locale-inst.cc, cxx11-wlocale-inst.cc entries to... (inst_sources): ...this. * src/c++11/Makefile.in:
[Bug tree-optimization/111694] [13/14 Regression] Wrong behavior for signbit of negative zero when optimizing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111694 Andrew Macleod changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED|REOPENED --- Comment #8 from Andrew Macleod --- (In reply to Alexander Monakov from comment #7) > No backport for gcc-13 planned? mmm, didn't realize we were propagating floating point equivalences around in 13. similar patch should work there
[Bug c/111741] gcc long double precision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741 --- Comment #3 from bernardwidynski at gmail dot com --- Thanks for the quick response. That explains it. On Mon, Oct 9, 2023 at 10:20 AM pinskia at gcc dot gnu.org < gcc-bugzi...@gcc.gnu.org> wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741 > > Andrew Pinski changed: > >What|Removed |Added > > > Status|UNCONFIRMED |RESOLVED > Resolution|--- |INVALID > > --- Comment #2 from Andrew Pinski --- > 80bit is the full precision and that 80bits includes 1 bit sign bit, > 64bits > for the mantissa and 15bits for the exponent. > > So anything above 64bits will start to lose precision in the last digits.
[Bug sanitizer/83780] False positive alignment error with -fsanitize=undefined with virtual base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83780 Andrew Pinski changed: What|Removed |Added CC||cuzdav at gmail dot com --- Comment #6 from Andrew Pinski --- *** Bug 111742 has been marked as a duplicate of this bug. ***
[Bug c++/111742] Misaligned generated code with MI using aligned virtual base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111742 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Andrew Pinski --- It is just a sanitizer issue. Dup of bug 83780. *** This bug has been marked as a duplicate of bug 83780 ***
[Bug c/111741] gcc long double precision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #2 from Andrew Pinski --- 80 bits is the full width of the format, and those 80 bits include 1 sign bit, 64 bits for the mantissa and 15 bits for the exponent. So any value that needs more than 64 significant bits will start to lose precision in the last digits.
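A small sketch of the effect described in comment #2, written against LDBL_MANT_DIG so that it does not hard-code 64. It assumes long double is a single binary floating-point format with round-to-nearest (as with the x87 extended type); it does not hold for double-double implementations. The function name is hypothetical:

```cpp
#include <cfloat>

// Returns true when 2^exponent + 1 rounds back to 2^exponent in long double,
// i.e. when a unit step can no longer be represented at that magnitude.
constexpr bool unit_is_lost_at(int exponent) {
  long double x = 1.0L;
  for (int i = 0; i < exponent; ++i)
    x *= 2.0L;  // exact: only the exponent changes
  return x + 1.0L == x;
}
```

With a 64-bit mantissa, 2^63 + 1 is still exact but 2^64 + 1 is not. In the reported program the running sum for N = 2^33 climbs to about 2^65, so once it passes 2^64 some of the odd terms being added can no longer be represented exactly.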
[Bug c++/111742] New: Misaligned generated code with MI using aligned virtual base
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111742 Bug ID: 111742 Summary: Misaligned generated code with MI using aligned virtual base Product: gcc Version: 13.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: cuzdav at gmail dot com Target Milestone: --- Generated code is misaligned (and crashes in slightly more complex code), in trunk all the way back to gcc 8.1, when built in c++11 or higher, with O3. (Linux, x86) Complete code: // struct X { void * a = nullptr; void * b = nullptr; }; struct alignas(16) AlignedData { }; struct A : virtual AlignedData { X xxx; }; struct B : virtual AlignedData {}; struct Test : B, A {}; Test* t = new Test; int main() {} // Compiler Explorer demo: https://godbolt.org/z/aodTdaedW Running with UB-san reports this: /app/example.cpp:14:8: runtime error: constructor call on misaligned address 0x0227f2b8 for type 'struct A', which requires 16 byte alignment 0x0227f2b8: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ /app/example.cpp:8:8: runtime error: member access within misaligned address 0x0227f2b8 for type 'struct A', which requires 16 byte alignment 0x0227f2b8: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^
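The UBSan message can be checked by hand with a one-line helper. This is only arithmetic on the address quoted in the report; is_aligned is a hypothetical name, and the sketch says nothing about why the object ended up at that address:

```cpp
#include <cstddef>
#include <cstdint>

// True when `address` is a multiple of `alignment`.
constexpr bool is_aligned(std::uintptr_t address, std::size_t alignment) {
  return address % alignment == 0;
}

// 0x0227f2b8 ends in ...b8, so it is 8-byte aligned but 8 bytes short of a
// 16-byte boundary, which is what the sanitizer complains about.
static_assert(is_aligned(0x0227f2b8, 8), "8-byte aligned");
static_assert(!is_aligned(0x0227f2b8, 16), "not 16-byte aligned");
```

An address like this is harmless for ordinary 8-byte loads and stores but can fault if the compiler emits an aligned 16-byte vector store for the subobject.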
[Bug tree-optimization/111694] [13/14 Regression] Wrong behavior for signbit of negative zero when optimizing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111694 --- Comment #7 from Alexander Monakov --- No backport for gcc-13 planned?
[Bug c/111741] gcc long double precision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741 --- Comment #1 from bernardwidynski at gmail dot com --- Created attachment 56082 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56082=edit Output file
[Bug c/111741] New: gcc long double precision
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111741 Bug ID: 111741 Summary: gcc long double precision Product: gcc Version: 11.4.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: bernardwidynski at gmail dot com Target Milestone: --- Created attachment 56081 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56081=edit C program to compute sum of numbers 1, 2, 3, ... N It is my understanding the long double in gcc has 80 bits precision. I've run a simple program which shows that it is less than 80 bits precision. The numbers 1, 2, 3, ... N are summed and compared with N*(N+1)/2 For the case where N = 2^32, the sums compare correctly. For the case where N = 2^33, the sums are different. 2^33*(2^33-1)/2 is less than 80 bits in precision. Why doesn't the long double have the capacity for this computation? See attached program and output file. This was run on Cygwin64 using gcc version 11.4.0 on an Intel Core i7-9700
[Bug tree-optimization/111694] [13/14 Regression] Wrong behavior for signbit of negative zero when optimizing
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111694 Andrew Macleod changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #6 from Andrew Macleod --- fixed
[COMMITTED] PR tree-optimization/111694 - Ensure float equivalences include + and - zero.
When ranger propagates ranges in the on-entry cache, it also checks for equivalences and incorporates the equivalence into the range for a name if it is known. With floating point values, the equivalence that is generated by comparison must also take into account that if the equivalence contains zero, both positive and negative zeros could be in the range. This PR demonstrates that once we establish an equivalence, even though we know one value may only have a positive zero, the equivalence may have been formed earlier and included a negative zero. This patch pessimistically assumes that if the equivalence contains zero, we should include both + and - 0 in the equivalence that we utilize. I audited the other places, and found no other place where this issue might arise. Cache propagation is the only place where we augment the range with random equivalences. Bootstrapped on x86_64-pc-linux-gnu with no regressions. Pushed. Andrew From b0892b1fc637fadf14d7016858983bc5776a1e69 Mon Sep 17 00:00:00 2001 From: Andrew MacLeod Date: Mon, 9 Oct 2023 10:15:07 -0400 Subject: [PATCH 2/2] Ensure float equivalences include + and - zero. A floating point equivalence may not properly reflect both signs of zero, so be pessimistic and ensure both signs are included. PR tree-optimization/111694 gcc/ * gimple-range-cache.cc (ranger_cache::fill_block_cache): Adjust equivalence range. * value-relation.cc (adjust_equivalence_range): New. * value-relation.h (adjust_equivalence_range): New prototype. gcc/testsuite/ * gcc.dg/pr111694.c: New. 
--- gcc/gimple-range-cache.cc | 3 +++ gcc/testsuite/gcc.dg/pr111694.c | 19 +++ gcc/value-relation.cc | 19 +++ gcc/value-relation.h | 3 +++ 4 files changed, 44 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/pr111694.c diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc index 3c819933c4e..89c0845457d 100644 --- a/gcc/gimple-range-cache.cc +++ b/gcc/gimple-range-cache.cc @@ -1470,6 +1470,9 @@ ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb) { if (rel != VREL_EQ) range_cast (equiv_range, type); + else + adjust_equivalence_range (equiv_range); + if (block_result.intersect (equiv_range)) { if (DEBUG_RANGE_CACHE) diff --git a/gcc/testsuite/gcc.dg/pr111694.c b/gcc/testsuite/gcc.dg/pr111694.c new file mode 100644 index 000..a70b03069dc --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr111694.c @@ -0,0 +1,19 @@ +/* PR tree-optimization/111009 */ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +#define signbit(x) __builtin_signbit(x) + +static void test(double l, double r) +{ + if (l == r && (signbit(l) || signbit(r))) +; + else +__builtin_abort(); +} + +int main() +{ + test(0.0, -0.0); +} + diff --git a/gcc/value-relation.cc b/gcc/value-relation.cc index a2ae39692a6..0326fe7cde6 100644 --- a/gcc/value-relation.cc +++ b/gcc/value-relation.cc @@ -183,6 +183,25 @@ relation_transitive (relation_kind r1, relation_kind r2) return relation_kind (rr_transitive_table[r1][r2]); } +// When one name is an equivalence of another, ensure the equivalence +// range is correct. Specifically for floating point, a +0 is also +// equivalent to a -0 which may not be reflected. See PR 111694. + +void +adjust_equivalence_range (vrange &range) +{ + if (range.undefined_p () || !is_a <frange> (range)) +return; + + frange fr = as_a <frange> (range); + // If range includes 0 make sure both signs of zero are included. 
+ if (fr.contains_p (dconst0) || fr.contains_p (dconstm0)) +{ + frange zeros (range.type (), dconstm0, dconst0); + range.union_ (zeros); +} + } + // This vector maps a relation to the equivalent tree code. static const tree_code relation_to_code [VREL_LAST] = { diff --git a/gcc/value-relation.h b/gcc/value-relation.h index be6e277421b..31d48908678 100644 --- a/gcc/value-relation.h +++ b/gcc/value-relation.h @@ -91,6 +91,9 @@ inline bool relation_equiv_p (relation_kind r) void print_relation (FILE *f, relation_kind rel); +// Adjust range as an equivalence. +void adjust_equivalence_range (vrange &range); + class relation_oracle { public: -- 2.41.0
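The property the patch has to respect can be stated in a few lines. The sketch mirrors the condition in the new testcase and uses the same GCC/Clang builtin; the function name is made up for illustration, and constant evaluation of __builtin_signbit is assumed to be supported by the compiler in use:

```cpp
// Mirrors the condition in the new testcase: the operands compare equal,
// yet at least one of them carries a set sign bit.
constexpr bool equal_with_negative_zero(double l, double r) {
  return l == r && (__builtin_signbit(l) || __builtin_signbit(r));
}

// +0.0 and -0.0 compare equal, so "l == r" alone says nothing about the sign.
static_assert(equal_with_negative_zero(0.0, -0.0),
              "equality does not pin down the sign of zero");
```

This is why an equivalence whose range contains zero cannot be used to conclude that the other name has a particular sign of zero, and why the patch widens such ranges to [-0.0, +0.0].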