[Bug c/112702] New: C23, C++23: Extended characters not valid in an identifier with -pedantic
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112702 Bug ID: 112702 Summary: C23, C++23: Extended characters not valid in an identifier with -pedantic Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c Assignee: unassigned at gcc dot gnu.org Reporter: stammark at gcc dot gnu.org Target Milestone: --- Hi all, This is likely a symptom of the WIP-ness of C23 and C++23 support in the frontends, but see here: https://godbolt.org/z/78KeK1fnG The use of extended characters in identifiers with -pedantic stopped working * For C, in GCC13 * For C++ in GCC12 Removing -pedantic makes the compilation succeed. Is this expected behaviour with -pedantic or a bug? Thanks,
[Bug target/112337] arm: ICE in arm_effective_regno when compiling for MVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337 --- Comment #4 from Stam Markianos-Wright --- Bisected to f55cdce3f8dd8503e080e35be59c5f5390f6d95e Attached preprocessed source and a creduced-reproducer of it
[Bug target/112337] arm: ICE in arm_effective_regno when compiling for MVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337 --- Comment #3 from Stam Markianos-Wright --- Created attachment 56493 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56493=edit Full preprocessor reproducer
[Bug target/112337] arm: ICE in arm_effective_regno when compiling for MVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337 --- Comment #2 from Stam Markianos-Wright --- Created attachment 56492 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56492=edit creduced reproducer
[Bug target/112337] New: arm: ICE in arm_effective_regno
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112337 Bug ID: 112337 Summary: arm: ICE in arm_effective_regno Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: stammark at gcc dot gnu.org Target Milestone: ---

Hi all,

I found this ICE when compiling CMSIS-NN with latest trunk:

./build-arm-eabi-armv8.1-m.main+mve.fp+fp.dp/install/bin/arm-eabi-gcc -mcpu=cortex-m55 ~/gnu/CMSIS-NN/Source/NNSupportFunctions/arm_nn_depthwise_conv_nt_t_padded_s8.c -I ~/gnu/CMSIS-NN/Include/ -O3 -S

during RTL pass: ira
/home/stamar01/gnu/CMSIS-NN/Source/NNSupportFunctions/arm_nn_depthwise_conv_nt_t_padded_s8.c: In function 'arm_nn_depthwise_conv_nt_t_padded_s8':
/home/stamar01/gnu/CMSIS-NN/Source/NNSupportFunctions/arm_nn_depthwise_conv_nt_t_padded_s8.c:172:1: internal compiler error: in arm_effective_regno, at config/arm/arm.cc:13671
  172 | }
      | ^
0x1b590f2 arm_effective_regno
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/config/arm/arm.cc:13671
0x1b5923a mve_vector_mem_operand(machine_mode, rtx_def*, bool)
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/config/arm/arm.cc:13701
0x23015c2 mve_memory_operand(rtx_def*, machine_mode)
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/config/arm/predicates.md:39
0x23d79fa recog_235
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/config/arm/mve.md:3636
0x241db9c recog_287
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/config/arm/neon.md:6161
0x24540af recog_344
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/config/arm/mve.md:6390
0x2459355 recog(rtx_def*, rtx_insn*, int*)
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/config/arm/sync.md:462
0x15663f5 insn_invalid_p(rtx_insn*, bool)
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/recog.cc:358
0x15667ad verify_changes(int)
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/recog.cc:469
0x1350f89 equiv_can_be_consumed_p
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/ira-costs.cc:1767
0x13518f0 calculate_equiv_gains
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/ira-costs.cc:1887
0x1351fbe find_costs_and_classes
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/ira-costs.cc:2007
0x135404c ira_costs()
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/ira-costs.cc:2564
0x1347e82 ira_build()
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/ira-build.cc:3481
0x133d895 ira
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/ira.cc:5793
0x133e215 execute
        /home/stamar01/gnu/v8.X-M/src/gcc/gcc/ira.cc:6117
Please submit a full bug report, with preprocessed source (by using -freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.

Opening this up in GDB I see that:

#1 0x01b590f3 in arm_effective_regno (op=0x76e130a8, strict=false) at /home/stamar01/gnu/v8.X-M/src/gcc/gcc/config/arm/arm.cc:13671
13671 gcc_assert (REG_P (op));
(gdb) p debug_rtx (op)
(mem/f/c:SI (plus:SI (reg/f:SI 103 afp) (const_int 28 [0x1c])) [2 output_bias+0 S4 A32])

And slightly further up:

#3 0x023015c3 in mve_memory_operand (op=0x76e24b70, mode=E_V4SImode) at /home/stamar01/gnu/v8.X-M/src/gcc/gcc/config/arm/predicates.md:39
39 && mve_vector_mem_operand (GET_MODE (op), XEXP (op, 0),
(gdb) p debug_rtx (op)
(mem:V4SI (post_inc:SI (mem/f/c:SI (plus:SI (reg/f:SI 103 afp) (const_int 28 [0x1c])) [2 output_bias+0 S4 A32])) [0 MEM[(int[4] *)bias_176]+0 S16 A32])

I've started a bisect.
[Bug target/110255] arm: MVE intrinsics C++ polymorphism with -flax-vector-conversions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110255 Stam Markianos-Wright changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED |RESOLVED --- Comment #3 from Stam Markianos-Wright --- Aha! Thanks, Andrew, that makes sense. I'll go back to the original authors for this and check if there's any good reason why they are using -flax-vector-conversions and if they can just change their code :) Also woops, the godbolt link I gave above was in the middle of me messing around with casts. Here is a clean one: https://godbolt.org/z/c9vaas6P8 . And indeed, casting to the "correct" scalar type for the intrinsic (in this case uint16_t) does make this work. So it sounds like this bugzilla should go to RESOLVED INVALID. Sorry for the false alarm!
[Bug target/110255] New: arm: MVE intrinsics C++ polymorphism with -flax-vector-conversions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110255 Bug ID: 110255 Summary: arm: MVE intrinsics C++ polymorphism with -flax-vector-conversions Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: stammark at gcc dot gnu.org Target Milestone: --- Hi all, See: https://godbolt.org/z/53ME1fGfM The compiler with the error is the one that is using -flax-vector-conversions through the C++ frontend. Unsure if this is something to do with the C++ front-end or something in the target backend (and how the builtins are registered with the front-end). This seems to happen regardless of whether the vaddq intrinsic has been "restructured" by Christophe's https://gcc.gnu.org/pipermail/gcc-patches/2023-April/615997.html (so going back to older GCC12,13 still gives the error), but another intrinsic, like vbicq, doesn't give the error at all (although that has a different context of the `int` immediate having to be a compile-time constant). (clang handles all this fine FWIW) Has anyone seen this kind of thing before, have any ideas on workarounds, or have any insight on whether this is invalid C++ to begin with?
[Bug target/107515] MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515 Stam Markianos-Wright changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #13 from Stam Markianos-Wright --- Fixed on GCC12 onwards.
[Bug target/109697] New: arm: lack of MVE instruction costing causing worse codegen on a vec_duplicate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109697 Bug ID: 109697 Summary: arm: lack of MVE instruction costing causing worse codegen on a vec_duplicate Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: stammark at gcc dot gnu.org Target Milestone: ---

Hi all,

In the arm backend, for MVE targets we previously had this bug on the vcmp patterns: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107987

The fix is fine, but it resulted in some failing tests (after Andrea improved these tests in GCC13):
* gcc.target/arm/mve/intrinsics/vcmpcsq_n_u16.c
* gcc.target/arm/mve/intrinsics/vcmpcsq_n_u32.c
* gcc.target/arm/mve/intrinsics/vcmpcsq_n_u8.c
* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c
* gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c
* gcc.target/arm/mve/intrinsics/vcmpeqq_n_u16.c
* gcc.target/arm/mve/intrinsics/vcmpeqq_n_u32.c
* gcc.target/arm/mve/intrinsics/vcmpeqq_n_u8.c
* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c
* gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c
* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c
* gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c
* gcc.target/arm/mve/intrinsics/vcmphiq_n_u16.c
* gcc.target/arm/mve/intrinsics/vcmphiq_n_u32.c
* gcc.target/arm/mve/intrinsics/vcmphiq_n_u8.c
* gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c
* gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c
* gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c
* gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c
* gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c
* gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c
* gcc.target/arm/mve/intrinsics/vcmpneq_n_u16.c
* gcc.target/arm/mve/intrinsics/vcmpneq_n_u32.c
* gcc.target/arm/mve/intrinsics/vcmpneq_n_u8.c

The testcases that are failing are the ones that compare against a scalar immediate (e.g. "vcmpeqq (a, 1.1)"), because the compiler prefers to do:
```
vldr.64 d6, .L5
vldr.64 d7, .L5+8
vcmp.f16 eq, q0, q3
```
When previously we would much more simply emit:
```
movs r3, #1
vcmp.u16 cs, q0, r3
```
The underlying reason for this change is a known deficiency of the MVE implementation: the lack of proper instruction costing. The compiler falls back to calculating costs based on the operands, and the new vec_duplicate in the patterns (mve_vcmpq_n_, etc) gets given a cost of 32 (when instead it should know that the vec duplicate is free and this is all just one instruction...), so the "literal load + vector-vector compare" wins out against the "put the immediate in a GP reg + vector-scalar compare".

For now, I plan on simply XFAIL-ing the tests.
[Bug target/107674] [11/12/13 Regressions] arm: MVE codegen regressions on VCTP and vector LDR/STR instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107674 Stam Markianos-Wright changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #5 from Stam Markianos-Wright --- Thanks Richard! I believe this is now fixed. This is likely not applicable for backporting as Andre's changes included some mid-end additions and it's only a missed-optimization regression -- hence closing this ticket.
[Bug target/108177] MVE predicated stores to same address get optimized away
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108177 --- Comment #5 from Stam Markianos-Wright --- With the fix to MVE auto_inc having gone in as ddc9b5ee13cd686c8674f92d46045563c06a23ea I have found that this fix keeps the auto-inc on these predicated stores broken. It seems to fail in auto_inc_dec at this condition:
```
  /* Make sure this reg appears only once in this insn.  */
  if (count_occurrences (PATTERN (mem_insn.insn), mem_insn.reg0, 1) != 1)
    {
      if (dump_file)
        fprintf (dump_file, "mem count failure\n");
      return false;
    }
```
(which makes sense with the pattern now having the MEM appear twice)

I guess this is not urgent since this is only a performance impact on one instruction. Also if the change needs to be in the auto-inc pass instead of the backend, then likely something for GCC14, but I thought this would be a good place to record this ;)

Does anyone have any ideas on this? Or I wonder what the AVX case does for this?
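To make the failing condition a bit more concrete, here is a toy model in plain C. It is only a sketch and not GCC code: an insn pattern is reduced to a flat list of the register numbers it mentions, and `count_reg_uses` and `reg_appears_once` are hypothetical names standing in for the `count_occurrences (...) != 1` test quoted above. The assumption is that a pattern which mentions the MEM twice also mentions its base register twice.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an insn pattern: only the register numbers it mentions.
   Hypothetical helper, not part of GCC.  */
static size_t count_reg_uses(const int *regs, size_t n, int reg)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++)
        if (regs[i] == reg)
            count++;
    return count;
}

/* Mirrors the shape of the quoted check: the candidate is rejected
   unless the register appears exactly once in the pattern.  */
static bool reg_appears_once(const int *regs, size_t n, int reg)
{
    assert(regs != NULL || n == 0);
    return count_reg_uses(regs, n, reg) == 1;
}
```

Under this model a store that mentions its base register once passes the check, while a predicated store whose pattern mentions the same memory (and so the same base register) a second time is rejected, which matches the "mem count failure" path described above.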
[Bug target/107674] [11/12/13 Regressions] arm: MVE codegen regressions on VCTP and vector LDR/STR instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107674 --- Comment #3 from Stam Markianos-Wright --- Thank you, Andre for fixing the Part 1 in this ticket :) The part 2 we've found to be a regression since r13-416-g485a0ae0982abe and is also the reason why the mve_*_memory_nodes tests are currently failing.
[Bug target/108443] arm: MVE wrongly re-interprets predicate constants
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108443 Stam Markianos-Wright changed: What|Removed |Added CC||stammark at gcc dot gnu.org --- Comment #2 from Stam Markianos-Wright --- What are people's thoughts on getting this (and the rest of the patch series the fix was part of) backported to GCC12? One of the changes in the series is arguably a mid-end optimisation (the change to simplify-rtx), but it is a pre-requisite to getting this wrong-code bug resolved.
[Bug target/109158] New: arm: errors when mixing __attribute__((pcs("aapcs-vfp"))) with +nofp
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109158 Bug ID: 109158 Summary: arm: errors when mixing __attribute__((pcs("aapcs-vfp"))) with +nofp Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: stammark at gcc dot gnu.org Target Milestone: ---

I've detected a couple of minor issues when using __attribute__((pcs("aapcs-vfp"))) in no-fp architectures, resulting in wrong code or an ICE.

Reproducer code:

__attribute__((pcs("aapcs-vfp"))) a(__fp16);
void foo() { a(0.0); }

Issue 1: when compiled as `-march=armv8.1-m.main+mve+nofp -mfloat-abi=softfp -mfp16-format=ieee` we emit an invalid FP instruction `vmov.f16` to put the immediate into `s0` before returning from the function. I believe this is a simple case of the `*mov_vfp_16` assuming that all variants are valid with any `TARGET_VFP_FP16INST || TARGET_HAVE_MVE` when actually the `vmov.f16` cases are only valid with TARGET_VFP_FP16INST, but not with MVE.

Issue 2: when compiled as `-march=armv8.1-m.main+nofp -mfloat-abi=softfp` (i.e. no MVE, no FP at all) we ICE with `maximum number of generated reload insns per insn achieved` -- this class of error also happens with `float` and `double` types. IMO this is an invalid configuration and we should be emitting a user error instead of the ICE, similar to what we do if the user requests -mfloat-abi=hard without any MVE or FP present:
```
  else if (TARGET_HARD_FLOAT_ABI)
    {
      arm_pcs_default = ARM_PCS_AAPCS_VFP;
      if (!bitmap_bit_p (arm_active_target.isa, isa_bit_vfpv2)
          && !bitmap_bit_p (arm_active_target.isa, isa_bit_mve))
        error ("%<-mfloat-abi=hard%>: selected architecture lacks an FPU");
    }
```
[Bug target/100000] non-leaf epologue/prologue used if MVE v4sf is used for load/return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=10 Stam Markianos-Wright changed: What|Removed |Added CC||stammark at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #4 from Stam Markianos-Wright --- I tried out Richard's suggestion in arm_vector_mode_supported_p (allowing V8HF, V4SF and V2DF unconditionally) and it seems to have worked! After a testsuite run I found a few ICEs due to a number of patterns that needed enabling: @mve_vpselq_ which was only enabled for mve.fp And then all the patterns that were conditional on: - "((TARGET_HAVE_MVE && VALID_MVE_SI_MODE (mode)) -|| (TARGET_HAVE_MVE_FLOAT && VALID_MVE_SF_MODE (mode))) mve_vec_extract *mve_vec_extract_sext_internal *mve_vec_extract_zext_internal mve_vec_set_internal *movmisalign_mve_store *movmisalign_mve_load These weren't causing any ICEs but also made sense to enable: mve_vst2q mve_vld2q mve_vld4q No regressions after that, but I think I will hold off until GCC14 Stage 1 to post this patch, just to be safe.
[Bug target/107714] MVE: Invalid addressing mode generated for VLD2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714 Stam Markianos-Wright changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #9 from Stam Markianos-Wright --- This should now be resolved on all active branches (11, 12, 13)
[Bug target/107714] MVE: Invalid addressing mode generated for VLD2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714 Stam Markianos-Wright changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Version|12.2.0 |13.0 Known to fail||11.3.1, 12.2.1 Ever confirmed|0 |1 Last reconfirmed|2022-11-17 00:00:00 |2022-12-09
[Bug target/107515] MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515 --- Comment #9 from Stam Markianos-Wright --- > Clearly Helium+Linux on Godbolt is a bit confused Yea, I agree -- it still shouldn't be an unintuitive front-end type clash error, though! I've posted another patch: https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607675.html (see there for what the error was --- interestingly it was coming from `__ARM_mve_coerce3` and it didn't directly have anything to do with the float types)
[Bug target/107515] MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515 --- Comment #7 from Stam Markianos-Wright ---
(In reply to Kevin Bracey from comment #6)
> Retesting the Godbolt on trunk, it's now worse - every line produces
> multiple not-very-informative errors:
>
> <source>:7:9: error: '_Generic' specifies two compatible types
> 7 | x = vmulq(x, 0.5); // ok
> | ^
> <source>:7:9: note: compatible type is here
> 7 | x = vmulq(x, 0.5); // ok
> | ^
>
> (repeated 6 times per source line)

Interesting... Thanks for spotting this. I didn't see it in my testing, because it doesn't seem to happen on baremetal `arm-none-eabi` (and I still can't replicate it there), but I do see this on the linux target (let me know if you are seeing anything different). I am investigating further!
[Bug target/107515] MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515 Stam Markianos-Wright changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #3 from Stam Markianos-Wright --- (In reply to Kevin Bracey from comment #2) > I've just spotted another apparent generic selection problem in my > reproducer for bug 107714 - should I create a new issue for it? Our patch series has gone up to the mailing list! For _Float16 see: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606587.html For `vmulq` this: https://gcc.gnu.org/pipermail/gcc-patches/2022-November/606575.html works on my side, so if it gets merged and works for your case, too, then I guess we don't have to worry about a new bug report :)
[Bug target/107714] MVE: Invalid addressing mode generated for VLD2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107714 Stam Markianos-Wright changed: What|Removed |Added Last reconfirmed||2022-11-17 --- Comment #3 from Stam Markianos-Wright --- Thanks for finding this! Confirmed the `vld2` bug on latest trunk -- my guess would be either a GCC backend or a gas bug (I'm not familiar with these instructions). For the `vmulq` issue I'll reply under the other thread, but I believe that very soon we'll be able to put that down as "already fixed", so we can keep this thread for `vld2` only.
[Bug target/107674] New: [11/12/13 Regressions] arm: MVE codegen regressions on VCTP and vector LDR/STR instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107674 Bug ID: 107674 Summary: [11/12/13 Regressions] arm: MVE codegen regressions on VCTP and vector LDR/STR instructions Product: gcc Version: 12.2.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: stammark at gcc dot gnu.org Target Milestone: ---

We've found a couple of performance regressions with Arm MVE. These can be seen here: https://godbolt.org/z/onPjfW4zj

* Between GCC 11 and 12 we seem to have started emitting a strange vmrs/sxth/vmsr instruction sequence after the vctp instruction. I suspect this is something to do with the introduction of MODE_VECTOR_BOOL during that period.
* Between GCC 12 and 13 we are no longer merging the pointer increments by #16 into the ldr/strs and we have some random movs that aren't needed either. This also happened in GCC 11, but we want to keep the improved codegen of GCC 12 here ;) This looks like a change in register allocation.

Bad:

Choosing alt 0 in insn 24: (0) =w (1) Ux (2) Up {mve_vldrhq_z_sv8hi}
Creating newreg=149, assigning class CORE_REGS to INC/DEC result r149
Creating newreg=150 from oldreg=134, assigning class VPR_REG to r150

Good:

Choosing alt 0 in insn 24: (0) =w (1) Ux (2) Up {mve_vldrhq_z_sv8hi}
Creating newreg=149 from oldreg=134, assigning class VPR_REG to r149

Does anyone have any further ideas on why these may have changed or how to fix them? Thanks!
[Bug target/107515] MVE: Generic functions do not accept _Float16 scalars
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515 Stam Markianos-Wright changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |stammark at gcc dot gnu.org CC||stammark at gcc dot gnu.org Last reconfirmed||2022-11-10 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #1 from Stam Markianos-Wright --- Also confirmed on trunk and assigning to myself, because I believe I've found a fix:
```
diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 2d213e12304..ce20a6fcd24 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -35582,6 +35582,9 @@ enum {
 	short: __ARM_mve_type_int_n, \
 	int: __ARM_mve_type_int_n, \
 	long: __ARM_mve_type_int_n, \
+	_Float16: __ARM_mve_type_fp_n, \
+	__fp16: __ARM_mve_type_fp_n, \
+	float: __ARM_mve_type_fp_n, \
 	double: __ARM_mve_type_fp_n, \
 	long long: __ARM_mve_type_int_n, \
 	unsigned char: __ARM_mve_type_int_n, \
```
Still untested, though, and I'll likely send it to the mailing list within the next week or two (we've got a couple more arm_mve.h changes in the pipeline, so I'll test them all together).
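For readers less familiar with `_Generic`, the effect of a missing association can be sketched in portable C11. This is an illustration only, not the arm_mve.h code: the `scalar_kind` enum, the `CLASSIFY_BEFORE` and `CLASSIFY_AFTER` macros and the `check_classification` helper are all hypothetical, and `float` is used instead of `_Float16` or `__fp16` because those types are target-dependent. The `default:` branch is also an assumption, added so the sketch compiles and can report an unmatched type.

```c
#include <assert.h>

/* Hypothetical tags, loosely modelled on the __ARM_mve_type_* names.  */
enum scalar_kind { KIND_INT_N, KIND_FP_N, KIND_UNKNOWN };

/* "Before": double is the only floating-point type with an association.  */
#define CLASSIFY_BEFORE(x) _Generic((x), \
    int: KIND_INT_N,                     \
    long: KIND_INT_N,                    \
    double: KIND_FP_N,                   \
    default: KIND_UNKNOWN)

/* "After": float gets its own association, as in the proposed diff.  */
#define CLASSIFY_AFTER(x) _Generic((x),  \
    int: KIND_INT_N,                     \
    long: KIND_INT_N,                    \
    float: KIND_FP_N,                    \
    double: KIND_FP_N,                   \
    default: KIND_UNKNOWN)

/* Small self-check: a float scalar is only recognised as a
   floating-point scalar once the extra association exists.  */
static void check_classification(void)
{
    assert(CLASSIFY_BEFORE(0.5) == KIND_FP_N);       /* double literal */
    assert(CLASSIFY_BEFORE(0.5f) == KIND_UNKNOWN);   /* float falls through */
    assert(CLASSIFY_AFTER(0.5f) == KIND_FP_N);       /* float now matches */
    assert(CLASSIFY_AFTER(1) == KIND_INT_N);
}
```

`_Generic` matches on the exact type of the controlling expression and does not apply the usual arithmetic promotions, so a `float` argument never selects the `double` branch. That is presumably why each scalar floating-point type needs its own line in the selection list.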
[Bug libstdc++/100017] [11 regression] error: 'fenv_t' has not been declared in '::' -- canadian compilation fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100017 Stam Markianos-Wright changed: What|Removed |Added CC||stammark at gcc dot gnu.org --- Comment #78 from Stam Markianos-Wright --- Thank you, Jonathan! Is that commit OK to be backported to gcc-11, as well?
[Bug tree-optimization/103247] New: graphite: Wrong code when at -O1 or higher and -floop-nest-optimize is given without an earlier tree-cunrolli pass
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103247 Bug ID: 103247 Summary: graphite: Wrong code when at -O1 or higher and -floop-nest-optimize is given without an earlier tree-cunrolli pass Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: stammark at gcc dot gnu.org Target Milestone: ---

Hi all,

I'm coming across a strange optimisation bug that happens if I try to combine -O1 with the Graphite -floop-nest-optimize on test interchange-8.

I initially spotted the symptom when compiling as:

./bin/aarch64-none-elf-gcc ../src/gcc/gcc/testsuite/gcc.dg/graphite/interchange-8.c -march=armv8-a -specs=aem-validation.specs -o test.out -O1 -floop-nest-optimize

Running the result on a simulator ended with the abort () function being called. I then replicated it on x86 as:

gcc ../src/gcc/gcc/testsuite/gcc.dg/graphite/interchange-8.c -o test.out -O1 -floop-nest-optimize ; ./test.out

Which gives an `Aborted (core dumped)`.

Some further digging showed that, for this test's source code, the graphite/loop-nest-optimize pass seems to require a tree-cunrolli pass to have been done earlier, so:

# The test always runs with -floop-nest-optimize
# O2 and O3 contain tree-cunrolli
# O0 and O1 don't contain tree-cunrolli
-O0 -floop-nest-optimize: works (no abort () was called)
-O0 -fenable-tree-cunrolli -floop-nest-optimize: works
-O1 -floop-nest-optimize: broken
-O1 -fenable-tree-cunrolli -floop-nest-optimize: works
-O2 -floop-nest-optimize: works
-O2 -fdisable-tree-cunrolli -floop-nest-optimize: broken
-O3 -floop-nest-optimize: works
-O3 -fdisable-tree-cunrolli -floop-nest-optimize: broken

Is anyone aware of such a dependency between these passes and/or is this a real bug or am I missing something?

Replicated on latest trunk and GCC-11.

Thanks!
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 Stam Markianos-Wright changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |stammark at gcc dot gnu.org Status|NEW |ASSIGNED
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 --- Comment #8 from Stam Markianos-Wright --- I have a liiitle bit more progress here, but I have a question about vect_get_smallest_scalar_type. If we look at the comment before the function:

>/* Return the smallest scalar part of STMT_INFO.
> This is used to determine the vectype of the stmt. We generally set the
> vectype according to the type of the result (lhs). For stmts whose
> result-type is different than the type of the arguments (e.g., demotion,
> promotion), vectype will be reset appropriately (later). Note that we have
> to visit the smallest datatype in this function, because that determines the
> VF. If the smallest datatype in the loop is present only as the rhs of a
> promotion operation - we'd miss it.

Would this be "smallest datatype in all cases", or is this more like "the smallest datatype within the same promotion/demotion chain"? i.e. how should we react if we detect a smallest datatype on the rhs of "float" when everything else in the stmt has been in the integer chain (int or, like in this case, long int)?

> Such a case, where a variable of this datatype does not appear in the lhs
> anywhere in the loop, can only occur if it's an invariant: e.g.:
> 'int_x = (int) short_inv', which we'd expect to have been optimized away by
> invariant motion. However, we cannot rely on invariant motion to always
> take invariants out of the loop, and so in the case of promotion we also
> have to check the rhs.
> LHS_SIZE_UNIT and RHS_SIZE_UNIT contain the sizes of the corresponding
> types. */

I have found that this is why we end up with a smaller number in:
TYPE_VECTOR_SUBPARTS (nunits_vectype) == 4
than in:
TYPE_VECTOR_SUBPARTS (*stmt_vectype_out) == 8

So I'm thinking that either
A) We shouldn't allow this, and add in some check maybe for "GET_MODE_CLASS (x) == GET_MODE_CLASS (y)"
or
B) Some of the logic that generates stmt_vectype_out is deficient and it should also be detecting the existence of a "float" type to get the VF.
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 --- Comment #7 from Stam Markianos-Wright ---
(In reply to rsand...@gcc.gnu.org from comment #6)
> (In reply to Stam Markianos-Wright from comment #5)
> > I'm tempted to try and add a reverse:
> >
> > || multiple_p (*stmt_vectype_out, nunits_vectype)
> >
> > And then regtest, but I probably need to do more reading around to figure
> > out
> > what we really should be expecting each case!
> I don't think that's right. If nunits_vectype is not a multiple
> of stmt_vectype then the stmt_vectype contains (or might contain)
> unused elements. The vectoriser isn't set up to work like that:
> all operations are currently supposed to be full-vector operations
> (possibly predicated, on SVE and AVX).
>
> AFAICT the assert is correct and it's showing up a problem elsewhere.

Cool, thank you for the info and the confirmation! I will carry on investigating to try and find the actual problem.
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 --- Comment #5 from Stam Markianos-Wright ---
(In reply to rsand...@gcc.gnu.org from comment #4)
> (In reply to Stam Markianos-Wright from comment #3)
> > Just started looking at this. I've narrowed it as the bug appearing with
> > commit 9b75f56d4b7951c60a6563964a65787b95bc.
> >
> > I have yet to fire this up in gdb to see what's happening, but one test I
> > did do was to try commenting out the assert that is causing the ICE and it
> > then ran to completion.
> >
> > So one _total speculation_ would be that with these latest changes that
> > enable groups of different sizes, this condition in the assert is now too
> > strict:
> >
> >
> > multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype),
> > TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)))
> What are nunits_vectype and *stmt_vectype_out at the point
> that the assert fails?

Hmm so looking at this on commit 9b75f56d4b7951c60a6563964a65787b95bc, we have:

nunits_vectype is a:
vector(SUBPARTS {coeffs = {4, 0}}) float
*stmt_vectype_out is a:
vector(SUBPARTS {coeffs = {8, 0}}) long int

In this case we are checking multiple_p (4, 8) == false (and also group_size == 9 here which is expected)

Before the commit we'd get here with:

nunits_vectype is a:
vector(SUBPARTS {coeffs = {4, 0}}) float
*stmt_vectype_out is a:
vector(SUBPARTS {coeffs = {2, 0}}) long int

And here we were checking multiple_p (4, 2) == true

I'm tempted to try and add a reverse:

|| multiple_p (*stmt_vectype_out, nunits_vectype)

And then regtest, but I probably need to do more reading around to figure out what we really should be expecting each case!
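The two checks in the comment above can be reproduced with a tiny model of the divisibility test. This is a simplified sketch, not GCC's implementation: the real `multiple_p` operates on poly_int values such as `{4, 0}`, whereas the hypothetical `multiple_p_const` below only handles the case where the second coefficient is zero, so each size is an ordinary compile-time constant.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for multiple_p, valid only for constant sizes:
   returns true when A is a whole multiple of B.  Hypothetical helper,
   not the poly_int-aware function GCC uses.  */
static bool multiple_p_const(unsigned a, unsigned b)
{
    assert(b != 0);   /* a zero divisor is outside this sketch's scope */
    return a % b == 0;
}
```

With this model, `multiple_p_const(4, 8)` is false (the failing assert after the commit) and `multiple_p_const(4, 2)` is true (the passing case before it). The proposed reverse check would correspond to `multiple_p_const(8, 4)`, which is true; the reply quoted in comment #7 explains why accepting that case is not valid for the vectoriser.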
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 --- Comment #3 from Stam Markianos-Wright --- Just started looking at this. I've narrowed it as the bug appearing with commit 9b75f56d4b7951c60a6563964a65787b95bc. I have yet to fire this up in gdb to see what's happening, but one test I did do was to try commenting out the assert that is causing the ICE and it then ran to completion. So one _total speculation_ would be that with these latest changes that enable groups of different sizes, this condition in the assert is now too strict: multiple_p (TYPE_VECTOR_SUBPARTS (nunits_vectype), TYPE_VECTOR_SUBPARTS (*stmt_vectype_out)))
[Bug target/91816] [8/9 Regression] Arm generates out of range conditional branches in Thumb2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91816 Stam Markianos-Wright changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #9 from Stam Markianos-Wright --- Patch is now on all active branches, so moving to RESOLVED. Thanks to all for their reviews!
[Bug rtl-optimization/90249] [9/10/11 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249 --- Comment #14 from Stam Markianos-Wright --- Was reminded that this was still open after many months. Have fixed the commit and am in the process of backporting to gcc-8,9.
(In reply to Stam Markianos-Wright from comment #12)
> Was reminded that this was still open after many months. Have fixed the
> commit and am in the process of backporting to gcc-8,9.

Excuse that, am an idiot and commented in entirely the wrong place!
[Bug rtl-optimization/90249] [9/10/11 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249 Stam Markianos-Wright changed: What|Removed |Added Assignee|stammark at gcc dot gnu.org|unassigned at gcc dot gnu.org --- Comment #13 from Stam Markianos-Wright --- Was reminded that this was still open after many months. Have fixed the commit and am in the process of backporting to gcc-8,9.
(In reply to Stam Markianos-Wright from comment #12)
> Was reminded that this was still open after many months. Have fixed the
> commit and am in the process of backporting to gcc-8,9.

Excuse that, am an idiot and commented in entirely the wrong place!
[Bug rtl-optimization/90249] [9/10/11 Regression] Code size regression on thumb2 due to sub-optimal register allocation starting with r265398
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90249 Stam Markianos-Wright changed: What|Removed |Added CC||stammark at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |stammark at gcc dot gnu.org --- Comment #12 from Stam Markianos-Wright --- Was reminded that this was still open after many months. Have fixed the commit and am in the process of backporting to gcc-8,9.
[Bug tree-optimization/96974] [10/11 Regression] ICE in vect_get_vector_types_for_stmt compiling for SVE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96974 Stam Markianos-Wright changed: What|Removed |Added Host||x86_64-linux-gnu CC||stammark at gcc dot gnu.org Last reconfirmed||2020-10-15 --- Comment #1 from Stam Markianos-Wright --- Also confirmed on today's trunk for aarch64-none-elf. It seems to relate to the use of that specific builtin or type of builtin (I tested some other math builtins and they were ok, but e.g. __builtin_llrintf produced the same ICE), and only inside the `d ? b() : 0` condition (i.e. changing that to `coeffs[e] = b()` makes the ICE go away).
[Bug target/93300] ICE in convert_mode_scalar, at expr.c:325
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93300 Stam Markianos-Wright changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #2 from Stam Markianos-Wright --- Have not seen any further instances of this issue since my patch, so moving to RESOLVED.
[Bug target/93300] ICE in convert_mode_scalar, at expr.c:325
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93300 Stam Markianos-Wright changed: What|Removed |Added Status|NEW |ASSIGNED