[Bug target/104689] aarch64: libgcc: DW_CFA_val_expression is not supported for RA_SIGN_STATE register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104689 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #3 from Ramana Radhakrishnan --- (In reply to nsz from comment #2) > fixed for gcc-13 In an AArch64 Ubuntu 22.04 VM on my Apple Silicon M1 I see : Features: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 asimddp sha512 asimdfhm dit uscat ilrcpc flagm ssbs sb paca pacg dcpodp flagm2 frint i.e. PACA / PACG , I see this test failing with trunk as of 136029059686fed2d99c755baf35f98553fc0232 simply bootstrapped with $srcdir/configure --enable-languages=c,c++ . I'll see if I can pull something out when I have some time. Ramana
[Bug other/107620] Build errors when using sphinx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107620 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #3 from Ramana Radhakrishnan --- I worked around this by installing everything in $SRCDIR/doc/requirements.txt. pip3 install -r requirements.txt I found that using that allowed me to build html documentation. However I seemed to need in addition a whole bunch of stuff to build latex and pdf documentation. Ramana
[Bug tree-optimization/107326] [13 Regression] ICE: verify_gimple failed (error: type mismatch in binary expression) since r13-3219-g25413fdb2ac249
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107326 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #4 from Ramana Radhakrishnan --- Fixed ?
[Bug target/92999] [armhf] struct with adjacent __fp16's copies wrongly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92999 Ramana Radhakrishnan changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |ramana at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #3 from Ramana Radhakrishnan --- I think I have a patch but might need some help testing it.
[Bug target/105929] [AArch64] armv8.4-a allows atomic stp. 64-bit constants can use 2 32-bit halves with _Atomic or volatile
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105929 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2022-11-05 --- Comment #1 from Ramana Radhakrishnan --- Confirmed as an 8 byte aligned store is always going to be fully within a 16 byte block aligned to a 16 byte aligned address.
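The alignment argument in the confirmation above can be checked with plain address arithmetic. The following is only a minimal sketch: the helper names are hypothetical and it says nothing about what any particular core guarantees for stp. It just verifies that the 8 bytes starting at any 8-byte-aligned address never straddle a 16-byte-aligned boundary.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the 8 bytes starting at addr all lie
   inside one naturally aligned 16-byte block, 0 otherwise.  */
static int within_one_16_byte_block(uint64_t addr)
{
    return (addr / 16) == ((addr + 7) / 16);
}

/* Hypothetical self-check: every 8-byte-aligned address below limit must
   keep its 8-byte access inside a single 16-byte block.  */
static void check_aligned_stores(uint64_t limit)
{
    for (uint64_t addr = 0; addr < limit; addr += 8)
        assert(within_one_16_byte_block(addr));
}
```

Calling check_aligned_stores(4096) exercises the aligned case; an address such as 12, which is not 8-byte aligned, is an example where the access does cross a 16-byte boundary.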
[Bug debug/53135] Duplicates cause size explosion (vta/dwarf)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53135 --- Comment #20 from Ramana Radhakrishnan --- (In reply to Jeffrey A. Law from comment #19) > I think it's just workaround that got installed in 2012, not a real fix. > Of course, 10 years later one could ask if the workaround has become the > "real fix". That is of course a jolly good question :P Ramana
[Bug target/107533] New: Inefficient code sequence for fp16 testcase on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107533

            Bug ID: 107533
           Summary: Inefficient code sequence for fp16 testcase on aarch64
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ramana at gcc dot gnu.org
  Target Milestone: ---

Derived from PR92999

struct phalf {
    __fp16 first;
    __fp16 second;
};

struct phalf phalf_copy(struct phalf* src) __attribute__((noinline));
struct phalf phalf_copy(struct phalf* src) {
    return *src;
}

Compiling for AArch64 with a recent enough compiler produces:

phalf_copy:
        ldr     w0, [x0]
        ubfx    x1, x0, 0, 16
        lsr     w0, w0, 16
        dup     v0.4h, w1
        dup     v1.4h, w0
        ret

Couldn't it just be

        ldr     h0, [x0]
        ldr     h1, [x0, 2]

IIRC this is in base v8 rather than v8.2

regards
Ramana
[Bug target/92999] [armhf] struct with adjacent __fp16's copies wrongly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92999 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 Known to fail||13.0 Last reconfirmed||2022-11-05 --- Comment #2 from Ramana Radhakrishnan --- confirmed on trunk. I think this is an issue in the way the registers are allocated for the VFP PCS.
[Bug debug/100523] [11/12/13 Regression] armv8.1-m.main -fcompare-debug failure with -O -fmodulo-sched -mtune=cortex-a53
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100523

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ramana at gcc dot gnu.org

--- Comment #3 from Ramana Radhakrishnan ---
It does fall over, but isn't this a bit of an undefined testcase, given that crc is an uninitialised local variable used before initialisation? What am I missing?

Ramana
[Bug target/94604] support for the ETSI basic operations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94604 Ramana Radhakrishnan changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |WAITING Last reconfirmed||2022-11-04 CC||ramana at gcc dot gnu.org --- Comment #3 from Ramana Radhakrishnan --- I'd suggest moving this to Waiting given the time without a response and the correct links to documentation and that this should be covered really by arm_acle.h . Ramana
[Bug debug/53135] Duplicates cause size explosion (vta/dwarf)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53135 --- Comment #18 from Ramana Radhakrishnan --- Since the fix got installed in 2012 this really should have been fixed from 4.8.0 onwards. Should we really keep this still open or can we close this out ? Ramana
[Bug target/97726] simd intrinsics tests fail on armeb
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97726 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #2 from Ramana Radhakrishnan --- Fixed now ?
[Bug target/96372] [11 regression] arm/ivopts.c fails since r11-2012
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96372 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #4 from Ramana Radhakrishnan --- Is this now fixed ?
[Bug tree-optimization/88709] Improve store-merging
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88709

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ramana at gcc dot gnu.org

--- Comment #13 from Ramana Radhakrishnan ---
(In reply to Richard Earnshaw from comment #9)
> (In reply to Jakub Jelinek from comment #7)
> > (In reply to Christophe Lyon from comment #6)
> > > I've noticed that the new test store_merging_29.c fails on
> > > arm-none-eabi --with-cpu cortex-a9
> > > FAIL: gcc.dg/store_merging_29.c scan-tree-dump store-merging "New sequence
> > > of 3 stores to replace old one of 6 stores"
> >
> > That is because target-supports.exp lies on arm, even for -mcpu=cortex-a9
> > STRICT_ALIGNMENT is 1 on arm (it is 1 unconditionally), but
> > check_effective_target_non_strict_align returns true on arm anyway.
> > I've already added hacks for it in r256783 in another testcase though, guess
> > I'll do something similar now, but I must say I'm not very excited about
> > that.
>
> Support for misaligned accesses is a three(.5!)-valued problem:
>
> 1) There's no support in the architecture at all
> 2) There's some support with a limited set of instructions
> 3) There's full support: any memory access can handle any alignment.
> 3.5) There's full support: but some accesses may be very slow
>
> I would think that these days most CPU architectures actually fall into
> either 1 or 2. Many architectures have limitations, for example on atomic
> accesses that are unaligned.
>
> STRICT_ALIGNMENT only covers, in reality case 3. I'm not even sure if it
> would be defined on a machine with case 3.5.
>
> I think the real problem here is that it's not clear what question this
> target-supports macro is really asking - does the CPU have the capability to
> do (some) unaligned accesses? or can it arbitrarily support casts from
> unaligned pointers to standard types?

I agree - it sounds like this should be put in a comment next to the target-supports query.
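The two questions quoted above (can the CPU do some unaligned accesses, versus can code cast unaligned pointers to standard types) differ at the source level too. As a minimal sketch, assuming nothing about the testsuite itself, the helpers below (hypothetical names) show the portable way to access a 32-bit value at an arbitrary byte offset. Going through memcpy avoids the unaligned pointer cast, and the compiler is free to lower it to a single load or store on targets where that is safe.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from a possibly misaligned
   address without dereferencing a misaligned uint32_t pointer.  */
static uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t value;
    memcpy(&value, p, sizeof value);
    return value;
}

/* Hypothetical helper: the matching store.  */
static void store_u32_unaligned(unsigned char *p, uint32_t value)
{
    memcpy(p, &value, sizeof value);
}

/* Round-trip a value through an odd offset and confirm that the bytes
   on either side of the 4-byte window are left alone.  */
static void check_unaligned_round_trip(void)
{
    unsigned char buffer[8] = {0};

    store_u32_unaligned(buffer + 1, 0x11223344u);
    assert(load_u32_unaligned(buffer + 1) == 0x11223344u);
    assert(buffer[0] == 0 && buffer[5] == 0);
}
```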
[Bug target/89400] [7/8/9/10 Regression] ICE: output_operand: invalid %-code with -march=armv6kz -mthumb -munaligned-access
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89400 Ramana Radhakrishnan changed: What|Removed |Added CC||marxin at gcc dot gnu.org --- Comment #4 from Ramana Radhakrishnan --- *** Bug 90308 has been marked as a duplicate of this bug. ***
[Bug target/90308] ICE in output_operand: invalid %-code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90308 Ramana Radhakrishnan changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |DUPLICATE --- Comment #2 from Ramana Radhakrishnan --- Certainly looks like it. *** This bug has been marked as a duplicate of bug 89400 ***
[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538 Ramana Radhakrishnan changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #8 from Ramana Radhakrishnan --- Now fixed on trunk and all release branches.
[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

--- Comment #7 from Ramana Radhakrishnan ---
Author: ramana
Date: Wed May 1 15:27:40 2019
New Revision: 270770

URL: https://gcc.gnu.org/viewcvs?rev=270770&root=gcc&view=rev
Log:
[Patch AArch64] Add __ARM_FEATURE_ATOMICS

This keeps coming up repeatedly and the ACLE has finally added __ARM_FEATURE_ATOMICS for the LSE feature in GCC. This is now part of the latest ACLE release (https://developer.arm.com/docs/101028/latest/5-feature-test-macros)

I know it's late for GCC-9 but this is a simple macro which need not wait for another year.

Ok for trunk and to backport to all release branches ?

Tested with a simple build and a smoke test.

Backport from mainline.

PR target/86538
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define __ARM_FEATURE_ATOMICS

Modified:
    branches/gcc-7-branch/gcc/ChangeLog
    branches/gcc-7-branch/gcc/config/aarch64/aarch64-c.c
[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

--- Comment #6 from Ramana Radhakrishnan ---
Author: ramana
Date: Tue Apr 30 14:57:50 2019
New Revision: 270702

URL: https://gcc.gnu.org/viewcvs?rev=270702&root=gcc&view=rev
Log:
[Patch AArch64] Add __ARM_FEATURE_ATOMICS

This keeps coming up repeatedly and the ACLE has finally added __ARM_FEATURE_ATOMICS for the LSE feature in GCC. This is now part of the latest ACLE release (https://developer.arm.com/docs/101028/latest/5-feature-test-macros)

I know it's late for GCC-9 but this is a simple macro which need not wait for another year.

Ok for trunk and to backport to all release branches ?

Tested with a simple build and a smoke test.

Backport from mainline.

PR target/86538
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define __ARM_FEATURE_ATOMICS

Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/config/aarch64/aarch64-c.c
[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

--- Comment #5 from Ramana Radhakrishnan ---
Author: ramana
Date: Tue Apr 30 12:02:30 2019
New Revision: 270689

URL: https://gcc.gnu.org/viewcvs?rev=270689&root=gcc&view=rev
Log:
[Patch AArch64] Add __ARM_FEATURE_ATOMICS

This keeps coming up repeatedly and the ACLE has finally added __ARM_FEATURE_ATOMICS for the LSE feature in GCC. This is now part of the latest ACLE release (https://developer.arm.com/docs/101028/latest/5-feature-test-macros)

I know it's late for GCC-9 but this is a simple macro which need not wait for another year.

Ok for trunk and to backport to all release branches ?

Tested with a simple build and a smoke test.

Backport from mainline.

PR target/86538
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define __ARM_FEATURE_ATOMICS

Modified:
    branches/gcc-9-branch/gcc/ChangeLog
    branches/gcc-9-branch/gcc/config/aarch64/aarch64-c.c
[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538

--- Comment #4 from Ramana Radhakrishnan ---
Author: ramana
Date: Tue Apr 30 11:22:11 2019
New Revision: 270686

URL: https://gcc.gnu.org/viewcvs?rev=270686&root=gcc&view=rev
Log:
[Patch AArch64] Add __ARM_FEATURE_ATOMICS

This keeps coming up repeatedly and the ACLE has finally added __ARM_FEATURE_ATOMICS for the LSE feature in GCC. This is now part of the latest ACLE release (https://developer.arm.com/docs/101028/latest/5-feature-test-macros)

I know it's late for GCC-9 but this is a simple macro which need not wait for another year.

Ok for trunk and to backport to all release branches ?

Tested with a simple build and a smoke test.

PR target/86538
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Define __ARM_FEATURE_ATOMICS

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/aarch64/aarch64-c.c
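For readers wondering how user code would consume the macro added by this commit, here is a minimal sketch. Only __ARM_FEATURE_ATOMICS comes from the commit and the ACLE; the function names and the counter are hypothetical. The C11 atomic itself is portable, and the macro only tells the program whether the compiler was allowed to use LSE instructions for it (for example when an armv8.1-a or later -march value was selected, as an assumption about the build flags).

```c
#include <assert.h>
#include <stdatomic.h>

/* Returns 1 when the compiler advertises LSE through the ACLE macro,
   0 otherwise (including on non-AArch64 targets).  */
static int lse_enabled_at_compile_time(void)
{
#ifdef __ARM_FEATURE_ATOMICS
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical shared counter; the same source is used either way.  */
static atomic_int event_count;

/* Atomically bump the counter and return the new value.  */
static int record_event(void)
{
    return atomic_fetch_add(&event_count, 1) + 1;
}

/* Tiny smoke test in the spirit of the commit message.  */
static void smoke_test_atomics(void)
{
    int before = atomic_load(&event_count);
    int flag = lse_enabled_at_compile_time();

    assert(record_event() == before + 1);
    assert(flag == 0 || flag == 1);
}
```

A typical use is to select a hand-written LSE path or a fallback at preprocessing time rather than probing the CPU at run time.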
[Bug target/86538] GCC should define a macro to specify if LSE is enabled or not
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86538 Ramana Radhakrishnan changed: What|Removed |Added Status|RESOLVED|ASSIGNED Last reconfirmed||2019-04-30 CC||ramana at gcc dot gnu.org Resolution|WONTFIX |--- Assignee|unassigned at gcc dot gnu.org |ramana at gcc dot gnu.org Target Milestone|--- |7.5 Ever confirmed|0 |1 --- Comment #3 from Ramana Radhakrishnan --- reopening and taking.
[Bug middle-end/90075] [7/8 Regression] [AArch64] ICE during RTL pass when member of union passed to copysignf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90075 Ramana Radhakrishnan changed: What|Removed |Added CC||rearnsha at gcc dot gnu.org --- Comment #3 from Ramana Radhakrishnan --- Seems to have been "fixed" by the commit to fix PR87369, Richard, is this something to backport ? Prima-facie , it appears not and we will need an appropriate fix for the release branches.
[Bug middle-end/90075] [7/8 Regression] [AArch64] ICE during RTL pass when member of union passed to copysignf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90075 Ramana Radhakrishnan changed: What|Removed |Added Status|NEW |ASSIGNED CC||ramana at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |ramana at gcc dot gnu.org --- Comment #2 from Ramana Radhakrishnan --- I'll take a look.
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

--- Comment #45 from Ramana Radhakrishnan ---
(In reply to Jakub Jelinek from comment #42)
> Thanks for the explanation.
> In that case, I think it would be better to just add
> __attribute__((target("general-regs-only")))
> to the
> #ifdef __ARM_EABI_UNWINDER__
> _Unwind_Reason_Code
> PERSONALITY_FUNCTION (_Unwind_State, struct _Unwind_Exception *,
>                       struct _Unwind_Context *);
> decl in unwind-c.c and similarly for eh_personality.cc and to other
> personality routines that use CONTINUE_UNWINDING as well (plus to
> unwind-arm.c and pr-support.c using pragma for everything).

Thanks for all the analysis, this is what I had. I've been swamped this week on a few other things; let me get this wrapped up soonish (read that as during next week).

(In reply to Bernd Edlinger from comment #44)
> Comment on attachment 46013 [details]
> updated patch.
>
> @@ -122,12 +122,21 @@ extern tree arm_fp16_type_node;
>  #define TARGET_32BIT_P(flags) (TARGET_ARM_P (flags) || TARGET_THUMB2_P
> (flags))
>
>  /* Run-time Target Specification. */
> -/* Use hardware floating point instructions. */
> +/* Use hardware floating point instructions. -mgeneral-regs-only prevents
> +the use of floating point instructions and registers but does not prevent
> +emission of floating point pcs attributes. */
>  #define TARGET_HARD_FLOAT (arm_float_abi != ARM_FLOAT_ABI_SOFT \
> +                           && bitmap_bit_p (arm_active_target.isa, \
> +                                            isa_bit_vfpv2) \
> +                           && TARGET_32BIT \
> +                           && !TARGET_GENERAL_REGS_ONLY)
> +
> +#define TARGET_HARD_FLOAT_SUB (arm_float_abi != ARM_FLOAT_ABI_SOFT
> \
>                             && bitmap_bit_p (arm_active_target.isa, \
>                                              isa_bit_vfpv2) \
>                             && TARGET_32BIT)
>
>
> BTW, you could define TARGET_HARD_FLOAT in terms of TARGET_HARD_FLOAT_SUB and
> !TARGET_GENERAL_REGS_ONLY.

Yep, I could. I've been traveling quite a lot and I haven't managed to find someone else to catch this. I will pick this up next week. My fault, apologies.

Ramana
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #45552|0                           |1
        is obsolete|                            |
  Attachment #45580|0                           |1
        is obsolete|                            |

--- Comment #32 from Ramana Radhakrishnan ---
Created attachment 46013
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46013&action=edit
updated patch.

Having discussed this further with Richard, instead of adding -mfpu=none we would prefer an -mgeneral-regs-only option, which keeps the two separate.

I'm sorry about the time it has taken to get back, but this is what I have right now. The hunk in eh_personality.cc isn't very pleasing to me, but that's because of the warning in the ABI code, which triggers on a hard float build because we do have inline functions consuming floating point.

Either I drop the warning or I keep the hunk in eh_personality.cc - any preferences / thoughts ?
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093 --- Comment #30 from Ramana Radhakrishnan --- (In reply to Jakub Jelinek from comment #29) > Ramana, any progress on this? I'm still trying to get the various spec files and the t-multilib bits sorted and half-term has intervened here in the UK.
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

--- Comment #27 from Ramana Radhakrishnan ---
(In reply to Bernd Edlinger from comment #25)
> you might consider adding something like that to your patch:
>
> Index: elf.h
> ===================================================================
> --- elf.h (revision 268337)
> +++ elf.h (working copy)
> @@ -64,7 +64,7 @@
>     %{mapcs-*:-mapcs-%*} \
>     %(subtarget_asm_float_spec) \
>     %{mthumb-interwork:-mthumb-interwork} \
> -   %{mfloat-abi=*} %{!mfpu=auto: %{mfpu=*}} \
> +   %{mfloat-abi=*} %{!mfpu=auto: %{!mfpu=none: %{mfpu=*}}} \
>     %(subtarget_extra_asm_spec)"
>  #endif
>
>
>
> otherwise using -mfpu=none won't work on the command line.
> because gas does not understand it.

Yes, that's what I've been playing with. I've run out of time this week because of other work commitments; I hope to get back to this early next week.

Ramana
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
           Assignee|unassigned at gcc dot gnu.org|ramana at gcc dot gnu.org

--- Comment #21 from Ramana Radhakrishnan ---
(In reply to Jakub Jelinek from comment #19)
> (In reply to Florian Weimer from comment #18)
> > (In reply to Ramana Radhakrishnan from comment #15)
> > > Testing this and would be grateful for a test run.
> >
> > Is this hunk needed as well, or will the unwinding information take care of
> > this? (__cxa_call_unexpected has another d8 register spill.)
>
> No idea here.

I'll try and analyse that. The key is ensuring that there is absolutely no floating point code in eh_call.cc; if there is likely to be floating point anywhere, this isn't correct.

> > --- libstdc++-v3/libsupc++/eh_call.cc (revision 268364)
> > +++ libstdc++-v3/libsupc++/eh_call.cc (working copy)
> > @@ -22,6 +22,11 @@
> >  // see the files COPYING3 and COPYING.RUNTIME respectively. If not, see
> >  // <http://www.gnu.org/licenses/>.
> >
> > +#ifdef __arm__
> > +#pragma GCC target ("fpu=none")
> > +#pragma GCC push_options
> > +#endif
>
> But why the #pragma GCC push_options? That makes no sense.
> Either you need to push options before GCC target and pop later on, but if
> you pop at the end of TU and don't really expect anything else to be emitted
> there, only #pragma GCC target should be enough (that applies to the other
> patch too).

I think that has just percolated through from my quick hack to discuss the issue. It should be enough to do #pragma GCC target. The final patch I have does that.

Thanks for confirming that the patch in its essence fixes up the issue.

regards
Ramana
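The ordering point in the exchange above (push the options first, change the target, pop afterwards) can be sketched as follows. This is not the patch under discussion: the function is a hypothetical stand-in, and the sketch uses the general-regs-only target string mentioned later in this bug rather than fpu=none, on the assumption that only a sufficiently new GCC for arm accepts it (the version guard is likewise an assumption). On other compilers and targets the restriction is simply skipped.

```c
#include <assert.h>

/* Save the current options first, so that the restriction below can be
   undone for the rest of the translation unit.  */
#pragma GCC push_options
#if defined(__arm__) && defined(__GNUC__) && !defined(__clang__) && __GNUC__ >= 10
/* Assumption: this GCC accepts general-regs-only as a target string.  */
#pragma GCC target ("general-regs-only")
#endif

/* Hypothetical stand-in for unwinder code that must stay integer-only.  */
static int next_frame_offset(int offset, int frame_size)
{
    return offset + frame_size;
}

/* Restore whatever options were in effect before the push.  */
#pragma GCC pop_options

/* Code from here on is compiled with the original options again.  */
static void check_next_frame_offset(void)
{
    assert(next_frame_offset(16, 8) == 24);
}
```

If the restriction is meant to cover the whole translation unit, the push and pop can be dropped and the target pragma alone is enough, which is the simplification agreed on in the reply.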
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

--- Comment #16 from Ramana Radhakrishnan ---
(In reply to Jakub Jelinek from comment #14)
> We require GNU make, so one can use something like:
> unwind-arm.o unwind-c.o libunwind.o pr-support.o: CFLAGS += -mfpu=none
> or similar in libgcc/config/arm/t-arm (or similar) with a comment explaining
> the reason. For eh_personality.o that needs to be done elsewhere and there
> are no such makefile fragments (and libtool is used).

Sadly that doesn't work for -mfpu=none in t-arm, because we still need gcc-9 to build with older binutils that don't necessarily support -mfpu=none. For now, let's hide this with target pragmas.
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #45547|0                           |1
        is obsolete|                            |

--- Comment #15 from Ramana Radhakrishnan ---
Created attachment 45552
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45552&action=edit
new patch.

Testing this and would be grateful for a test run.
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-01-29
     Ever confirmed|0                           |1

--- Comment #13 from Ramana Radhakrishnan ---
(In reply to Jakub Jelinek from comment #12)
> If one appends -mfloat-abi=soft to command lines of those files, does that
> imply incompatible ABI even if nothing is passed in float/VFP etc. registers
> nor there is any floating point code?

-mfloat-abi=soft is an interesting option: it means use floating point emulation code with the base PCS, and also use the base parameter passing conventions for passing floating point parameters to functions, so it would end up failing at link time.

That's why we need an -mfpu=none option which is silent, and I've not liked it for a while.

(In reply to Jakub Jelinek from comment #11)
> Comment on attachment 45547 [details]
> untested prototype patch.
>
> Doesn't the whole unwinder (so eh_personality.cc (whole, not just one
> function in it), unwind-arm.c, unwind-c.c, maybe some other unwind-*.c))
> need that?

Yes, that would be needed. Reading the EHABI again suggests that - I don't see a macro that would help with that everywhere.
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

--- Comment #10 from Ramana Radhakrishnan ---
Created attachment 45547
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45547&action=edit
untested prototype patch.

Not sure if this is complete yet but it gives a framework to dig further.
[Bug target/89093] [9 Regression] C++ exception handling clobbers d8 VFP register
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89093

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ramana at gcc dot gnu.org

--- Comment #7 from Ramana Radhakrishnan ---
(In reply to Florian Weimer from comment #0)
> In glibc, we have a test, nptl/tst-thread-exit-clobber, that attempts to
> verify if registers are properly restored by unwinding. (The actual target
> of the test is pthread_exit, but it covers more than that.)
>
> This test fails when running with GCC 9 libstdc++, even if glibc and the
> test were built with GCC 8, and libgcc_s is replaced with the version for
> GCC 8 (which works when running against GCC 8 libstdc++).
>
> In the test, the d8 register is not restored properly during unwinding, it
> is set to zero. d9, d10 etc. are restored.
>
> I noticed that in GCC 9, __gxx_personality_v0 saves the d8 VFP register:
>
> 0007b620 <__gxx_personality_v0@@CXXABI_1.3>:
>    7b620:       e92d4ff0        push    {r4, r5, r6, r7, r8, r9, sl, fp, lr}
>    7b624:       ed2d8b02        vpush   {d8}
>    7b628:       e3a03000        mov     r3, #0
>    7b62c:       e1a08001        mov     r8, r1
>
> And it actually uses s16 and s17, apparently for spilling integer registers.
> Perhaps the unwinder is not prepared to deal with that.

d8 is composed of s16 and s17. That should just be fine. The single precision FP registers are packed into double precision registers in the VFP architecture.

Ramana
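The register packing described in the reply above follows a simple rule for the registers that have single precision views: d<n> overlays s<2n> and s<2n+1>. A small sketch with hypothetical helper names, working on register numbers only:

```c
#include <assert.h>

/* Single precision register holding the low half of d<d> (d0..d15).  */
static int low_s_reg_for_d_reg(int d)
{
    return 2 * d;
}

/* Single precision register holding the high half of d<d> (d0..d15).  */
static int high_s_reg_for_d_reg(int d)
{
    return 2 * d + 1;
}

/* Double precision register that overlaps s<s> (s0..s31).  */
static int d_reg_for_s_reg(int s)
{
    return s / 2;
}

/* The case from this bug: d8 is made up of s16 and s17.  */
static void check_d8_overlap(void)
{
    assert(low_s_reg_for_d_reg(8) == 16);
    assert(high_s_reg_for_d_reg(8) == 17);
    assert(d_reg_for_s_reg(16) == 8);
    assert(d_reg_for_s_reg(17) == 8);
}
```

So a vpush {d8} in the prologue is what one would expect when the function body uses s16 and s17.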
[Bug target/84923] [8 regression] gcc.dg/attr-weakref-1.c failed on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84923 Ramana Radhakrishnan changed: What|Removed |Added Status|RESOLVED|NEW Resolution|FIXED |--- --- Comment #12 from Ramana Radhakrishnan --- I don't think this is fixed on GCC-8 as the commit to trunk happened on May 21 18 after the release was made Thanks, Ramana
[Bug target/88734] [8 Regression] AArch64's ACLE intrinsics give an ICE instead of compile error when option mismatch.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88734 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #11 from Ramana Radhakrishnan --- (In reply to Tamar Christina from comment #10) > Thanks Jakub! testing hasn't shown any breakages. I would prefer this to be backported to GCC-8 if it has baked reasonably on trunk.
[Bug target/88510] GCC generates inefficient U64x2/v2di scalar multiply for NEON32
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88510 Ramana Radhakrishnan changed: What|Removed |Added Target|armv7-a |arm, aarch64 Status|UNCONFIRMED |NEW Last reconfirmed||2019-01-14 CC||ramana at gcc dot gnu.org Target Milestone|--- |10.0 Ever confirmed|0 |1 --- Comment #3 from Ramana Radhakrishnan --- We are in stage4 at this point of time and a patch for this between now and when GCC9 releases isn't appropriate (i.e. April). Hopefully someone will pick this up afterwards for both backends as the logic required for the expansion should be pretty much identical give or take backend integration issues. Though I wonder if this is better handled in the fall back path for expansion of v2di multiplications instead of duplicating this logic in both arm and aarch64 backends.
[Bug rtl-optimization/87871] [9 Regression] testcases fail after r265398 on arm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-12-14 Ever confirmed|0 |1 --- Comment #7 from Ramana Radhakrishnan --- Confirmed.
[Bug rtl-optimization/87871] [9 Regression] testcases fail after r265398 on arm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87871

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ramana at gcc dot gnu.org

--- Comment #6 from Ramana Radhakrishnan ---
(In reply to Segher Boessenkool from comment #5)
> The first one just needs an xfail. I don't know if it should be *-*-* there
> or only arm*-*-* should be added.
>
> The other two need some debugging by someone who knows the target and/or
> these tests.

For the addr-modes-float.c case there are additional vmovs being generated, so this is certainly a regression.

--- 8.s 2018-12-14 09:41:04.367843079 +
+++ addr-modes-float.s 2018-12-14 09:40:39.907980812 +
@@ -139,10 +139,13 @@
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
+       vmov    q8, q0  @ ti
        mov     r3, r0
+       vmov    q9, q1  @ ti
        add     r0, r0, #48
-       vst3.8  {d0, d2, d4}, [r3]!
-       vst3.8  {d1, d3, d5}, [r3]
+       vmov    q10, q2  @ ti
+       vst3.8  {d16, d18, d20}, [r3]!
+       vst3.8  {d17, d19, d21}, [r3]
[Bug target/88013] can't vectorize rgb to grayscale conversion code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88013

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-12-14
                 CC|                            |ramana at gcc dot gnu.org
     Ever confirmed|0                           |1

--- Comment #8 from Ramana Radhakrishnan ---
> vshr.u16        q9, q9, #8
> vshr.u16        q8, q8, #8
> vmovn.i16       d20, q9
> vmovn.i16       d21, q8

Isn't that "just" a missing combine pattern to get us vshrn in both backends ?

Ramana
[Bug tree-optimization/88259] vectorization failure for a typical loop for getting max value and index
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88259

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ramana at gcc dot gnu.org

--- Comment #2 from Ramana Radhakrishnan ---
(In reply to ktkachov from comment #1)
> Confirmed.
> Trying to find just the index (not the max value) vectorises as well:
> void test_vec(int *data, int n) {
>         int best_i, best = 0;
>
>         for (int i = 0; i < n; i++) {
>                 if (data[i] > best) {
>                         //best = data[i];
>                         best_i = i;
>                 }
>         }
>
>         data[best_i] = data[0];
>         data[0] = best;
> }
>
>
> -O3:
> .L4:
>         ldr     q1, [x2], 16
>         mov     v3.16b, v2.16b
>         add     v2.4s, v2.4s, v4.4s
>         cmle    v1.4s, v1.4s, #0
>         cmp     x2, x3
>         bif     v0.16b, v3.16b, v1.16b
>         bne     .L4
>         smaxv   s0, v0.4s
>         and     w3, w1, -4
>         umov    w2, v0.s[0]
>         cmn     w2, #1
>         csel    w2, w2, wzr, ne
>         tst     x1, 3
>         beq     .L2
> .L3:
>
> But their combination seems like it's throwing the machinery off. I'm
> guessing the index-finding needs some if-conversion and masking to happen in
> the vectoriser

ISTR there is some limit in if-conversion around the vectorizer where it only works on very simple if-blocks. But this is from memory and it's a bit fuzzy now.
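For reference, the combined pattern named in the bug title (tracking both the maximum and its index in the same if-block) looks roughly like the scalar sketch below. This is an illustration only, not the reporter's testcase: the function name and signature are hypothetical, and unlike the quoted snippet it seeds the search from data[0] so that the index is always initialised.

```c
#include <assert.h>

/* Hypothetical helper: returns the index of the first maximum element
   and stores its value in *best_out.  Assumes n >= 1.  */
static int max_with_index(const int *data, int n, int *best_out)
{
    int best = data[0];
    int best_i = 0;

    for (int i = 1; i < n; i++) {
        /* Two values are updated under one condition; this is the
           pairing that the discussion says trips up the vectoriser.  */
        if (data[i] > best) {
            best = data[i];
            best_i = i;
        }
    }

    *best_out = best;
    return best_i;
}

/* Small self-check with a repeated maximum.  */
static void check_max_with_index(void)
{
    int data[] = {3, 9, 2, 9, 5};
    int best = 0;

    assert(max_with_index(data, 5, &best) == 1);
    assert(best == 9);
}
```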
[Bug debug/65771] ICE (in loc_list_from_tree, at dwarf2out.c:14964) on arm-linux-gnueabihf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65771

Ramana Radhakrishnan changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|5.5                         |6.0

--- Comment #20 from Ramana Radhakrishnan ---
Fixed for GCC 6 from the timeline here. Won't fix for GCC 5.
[Bug target/53440] [arm] generic thunk code fails for method which uses '...'
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53440 Ramana Radhakrishnan changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|--- |7.0 --- Comment #10 from Ramana Radhakrishnan --- Fixed for GCC 7.
[Bug tree-optimization/43721] Failure to optimise (a/b) and (a%b) into single __aeabi_idivmod call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43721 Ramana Radhakrishnan changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|--- |7.0 --- Comment #12 from Ramana Radhakrishnan --- Fixed in GCC7 then.
[Bug target/87867] [7/8 regression] ICE on virtual destructor (-mlong-calls -ffunction-sections) on arm-none-eabi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87867

--- Comment #2 from Ramana Radhakrishnan ---
Author: ramana
Date: Fri Nov 9 12:50:51 2018
New Revision: 265965

URL: https://gcc.gnu.org/viewcvs?rev=265965&root=gcc&view=rev
Log:
[PATCH, arm] Backport -- Fix ICE during thunk generation with -mlong-calls

For Mihail Ionescu.

2018-11-09 Mihail Ionescu

PR target/87867
Backport from mainline
2018-09-17 Eric Botcazou

* g++.dg/other/thunk2a.C: New test.
* g++.dg/other/thunk2b.C: Likewise.

2018-11-09 Mihail Ionescu

Backport from mainline
2018-09-17 Eric Botcazou

* g++.dg/other/thunk2a.C: New test.
* g++.dg/other/thunk2b.C: Likewise.
* g++.dg/other/vthunk1.C: Rename as thunk1.C

Added:
    branches/gcc-8-branch/gcc/testsuite/g++.dg/other/thunk1.C
      - copied unchanged from r265964, branches/gcc-8-branch/gcc/testsuite/g++.dg/other/vthunk1.C
    branches/gcc-8-branch/gcc/testsuite/g++.dg/other/thunk2a.C
    branches/gcc-8-branch/gcc/testsuite/g++.dg/other/thunk2b.C
Removed:
    branches/gcc-8-branch/gcc/testsuite/g++.dg/other/vthunk1.C
Modified:
    branches/gcc-8-branch/gcc/ChangeLog
    branches/gcc-8-branch/gcc/config/arm/arm.c
    branches/gcc-8-branch/gcc/testsuite/ChangeLog
[Bug target/87330] ICE in scan_rtx_reg, at regrename.c:1097
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87330 Ramana Radhakrishnan changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |9.0 --- Comment #9 from Ramana Radhakrishnan --- Fixed then ?
[Bug middle-end/86815] [8/9 regression] ICE on valid code on armhf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86815 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2018-10-11 Ever confirmed|0 |1 --- Comment #10 from Ramana Radhakrishnan --- Tip of gcc-8 isn't failing. So until we have more info - this one is waiting I'm afraid.
[Bug target/87565] suboptimal memory-indirect tailcalls on arm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87565 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #3 from Ramana Radhakrishnan --- (In reply to Alexander Monakov from comment #2) > PLT trampolines all end with 'ldr pc, [ip, xxx]!', so do all calls via PLT > suffer from poor branch prediction of such indirect jumps? IIRC you still need to use that in the PLT trampoline so that folks can run a Linux-like userland on StrongARM, which still has a small user constituency.
[Bug target/82227] ARM thumb inefficient tailcall return sequence (multiple pops)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82227 Ramana Radhakrishnan changed: What|Removed |Added Priority|P3 |P4 Status|UNCONFIRMED |NEW Last reconfirmed||2018-10-10 CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 Severity|normal |enhancement --- Comment #1 from Ramana Radhakrishnan --- Confirmed.(In reply to Peter Cordes from comment #0) > int ext(); > int tailcall_external() { return ext(); } > // https://godbolt.org/g/W43fxw > > gcc6.3 -Os -mthumb > > push{r4, lr} > bl ext > pop {r4} > pop {r1}# two separate pop instructions isn't optimal > bx r1 > > gcc6.3 -Os -mthumb -mno-thumb-interwork > > push{r4, lr} > bl ext > pop {r4, pc} > > A 16-bit thumb pop instruction can only pop "lo" registers and PC, not back > into LR. That's why it can't pop {r4, lr} / bx lr like it does in -marm > mode. > > But there is a more efficient way: > > pop {r1, r2} > bx r2 Yep. > > We never needed a call-preserved register; r4 was pushed only to keep the > stack aligned. So as long as we have 2 call-clobbered regs available, we > can pop the padding that came from r4, and pop the saved lr, both into > call-clobbered regs. > > If we did need a call-preserved register for anything, two separate pop > instructions are presumably better than any combination of pop-multiple and > reg-reg moves. > > > > This also happens with two identical functions with different names, with > -Os. One compiles into a call to the other, done exactly the same way as to > an external function. (See the godbolt link above). > > In that case, I don't understand why we can't just tail-call with a `b` > instruction (like we get with -marm). Both functions are compiled to Thumb2 > code, so we can jump to the other and let it do an interworking return, > right? Especially with -mno-thumb-interwork, I don't understand why > tail-calls aren't optimized to a jump. You need to read up on the various levels of the architecture and the command line options. 
Thumb2 doesn't show up at the default level of the architecture and needs at least -mthumb -march=armv6t2. Try reading this for a beginner's guide to the architecture: https://community.arm.com/tools/b/blog/posts/arm-cortex-a-processors-and-gcc-command-lines We don't tail call in general for Thumb1, which is what your options imply, because the branches are just too short (encoded in 16 bits) IIRC. > > (I'm not an expert on ARM / Thumb stuff, so there might be a reason I'm > missing.)
[Bug bootstrap/84199] Error building gcc 7.3.0 on Odroid XU4 (ARM, Ubuntu): cannot load liblto_plugin.so
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84199 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||ramana at gcc dot gnu.org Resolution|--- |INVALID --- Comment #1 from Ramana Radhakrishnan --- I don't think anyone is going to go fetch an odroid for this. It sounds like a problem in your environment, as many folks are able to build gcc 7.x on an armhf ubuntu system. Looking at the build log, try your build with appropriate --with-arch, --with-float and --with-fpu options. In general, on armhf this is --with-arch=armv7-a --with-fpu=neon --with-float=hard, though you could get better options specifically for the odroid.
[Bug sanitizer/86755] [ASAN] Libasan failed to be build for arm with -mthumb and -fno-omit-frame-pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86755 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-10-10 CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #2 from Ramana Radhakrishnan --- Confirmed.
[Bug middle-end/86815] [8/9 regression] ICE on valid code on armhf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86815 --- Comment #9 from Ramana Radhakrishnan --- (In reply to Martin Liška from comment #8) > Unfortunately I can't reproduce that with cross compiler. Me neither today. Gianfranco , could you check if you are running out of memory on the machine that you are doing this on with GCC-8 ? Is there a chance that the OOM killer came along when building this file on the native arm machine that you were running this on ? I've tried running this with stock gcc 8 in debian on an armhf docker image and will try building something up later today on my machine , but it may be worth double checking that something like an OOM killer or swap isn't what's throttling the build here.
[Bug middle-end/86815] [8/9 regression] ICE on valid code on armhf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86815 Ramana Radhakrishnan changed: What|Removed |Added Keywords||needs-reduction CC||ramana at gcc dot gnu.org --- Comment #7 from Ramana Radhakrishnan --- (In reply to Gianfranco from comment #6) > Created attachment 44485 [details] > another failing output > > I'm attaching another file suffering from the same issue (mostly every cpp > file has this failure) > this file is only ~2Mb, so maybe reducing it might be easier Needs reduction.
[Bug target/86968] Unaligned big-endian (scalar_storage_order) access on armv7-a yields 4 ldrb instructions rather than ldr+rev
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86968 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #5 from Ramana Radhakrishnan --- (In reply to jos...@codesourcery.com from comment #4) > Any unaligned access things that don't work for big-endian ARM are > probably fallout from the issues with big-endian NEON (NEON architectural > lane numbers are different from the architecture-independent lane numbers > in GNU C vector extensions and GCC IR, and GCC expects each machine mode > to have a single defined memory layout and a single defined layout in any > given register, and to be able to move between core and NEON registers, > and between core registers and memory, in the respective layouts used for > those registers, but some NEON loads and stores for big-endian don't work > with those expectations, so unaligned vector operations are limited for > big-endian ARM). Correct, we don't allow misaligned access for Neon for exactly the reasons mentioned above. However, I would have expected misaligned access to work with -march=armv7-a -munaligned-access -mfpu=vfpv3-d16 -mfloat-abi=softfp/hard on the command line for the aforementioned testcase, as we do have a movmisalign pattern in arm.md that should kick in, overriding the movmisalign pattern in neon.md. It probably needs a more detailed investigation.
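For readers who have not seen the attribute named in the bug title, here is a minimal sketch of an unaligned big-endian scalar load. It is not the testcase attached to the PR; the names `be_u32` and `load_be32` are made up for illustration, and the non-GCC fallback is only there so the sketch builds elsewhere. On armv7-a with -munaligned-access one would hope for something like ldr+rev here rather than four ldrb instructions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#if defined(__GNUC__) && !defined(__clang__)
/* GCC extension: the field is stored big-endian whatever the target
   endianness is; packed drops the required alignment to 1 byte.  */
struct __attribute__((packed, scalar_storage_order("big-endian"))) be_u32 {
    uint32_t value;
};
#endif

/* Hypothetical helper: read a big-endian 32-bit value from a pointer
   that may be misaligned.  */
uint32_t load_be32(const unsigned char *p)
{
    assert(p != NULL);
#if defined(__GNUC__) && !defined(__clang__)
    struct be_u32 tmp;
    memcpy(&tmp, p, sizeof tmp);   /* byte copy, no alignment assumed */
    return tmp.value;              /* the compiler inserts the byte swap */
#else
    /* Portable fallback: assemble the value byte by byte.  */
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8) | (uint32_t)p[3];
#endif
}
```

Calling `load_be32` on a pointer one byte past an aligned address exercises the misaligned path that the comment above is discussing.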
[Bug c/85870] [6/7/8/9 Regression][LTO1] ICE in linemap_line_start, at libcpp/line-map.c:794
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85870 --- Comment #8 from Ramana Radhakrishnan --- (In reply to Martin Liška from comment #5) > (In reply to Ramana Radhakrishnan from comment #4) > > (In reply to Martin Liška from comment #3) > > > Can't reproduce with GCC 7.3.0 on x86_64: > > > > > > + gcc-7 -O2 -flto -c test_1.i -o test_1.o > > > + gcc-7 -O2 -flto -c test_2.i -o test_2.o > > > + gcc-7 test_1.o test_2.o > > > /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: > > > /usr/lib64/gcc/x86_64-suse-linux/7/../../../../lib64/crt1.o: in function > > > `_start': > > > /home/abuild/rpmbuild/BUILD/glibc-2.27/csu/../sysdeps/x86_64/start.S:104: > > > undefined reference to `main' > > > collect2: error: ld returned 1 exit status > > > > > > Richi how did you achieve to reproduce that? > > > > It's still failing on aarch64-none-linux-gnu. So that doesn't mean this goes > > waiting. > > Native or cross compiler? Because cross compiler works fine for me: > > $ aarch64-suse-linux-g++-8 -c test_1.i -c -flto > $ aarch64-suse-linux-g++-8 -c test_2.i -c -flto > $ /usr/lib64/gcc/aarch64-suse-linux/8/lto1 test_1.o test_2.o > Reading object files: test_1.o test_2.o {GC start 1697k} > Reading the callgraph > Merging declarations > Reading summaries > Reading function bodies: > Performing interprocedural optimizations > > > Assembling functions: > init_xyz_0 init_xyz_1 > Time variable usr sys > wall GGC > phase setup: 0.00 ( 0%) 0.00 ( 0%) 0.00 ( > 0%)1847 kB ( 1%) > phase opt and generate : 2.11 (100%) 0.12 ( 92%) 2.23 > (100%) 188629 kB ( 99%) > phase finalize : 0.00 ( 0%) 0.01 ( 8%) 0.01 ( > 0%) 0 kB ( 0%) > lto stream inflate : 0.12 ( 6%) 0.03 ( 23%) 0.15 ( > 7%) 0 kB ( 0%) > ipa lto constructors in: 0.65 ( 31%) 0.03 ( 23%) 0.69 ( > 31%) 188513 kB ( 99%) > TOTAL : 2.11 0.13 2.24 > 190523 kB cross-compiler built with revision r264905 and note that we have --enable-checking=yes turned on. Maybe that makes a difference ?
[Bug c/85870] [6/7/8/9 Regression][LTO1] ICE in linemap_line_start, at libcpp/line-map.c:794
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85870 Ramana Radhakrishnan changed: What|Removed |Added Status|WAITING |NEW --- Comment #4 from Ramana Radhakrishnan --- (In reply to Martin Liška from comment #3) > Can't reproduce with GCC 7.3.0 on x86_64: > > + gcc-7 -O2 -flto -c test_1.i -o test_1.o > + gcc-7 -O2 -flto -c test_2.i -o test_2.o > + gcc-7 test_1.o test_2.o > /usr/lib64/gcc/x86_64-suse-linux/7/../../../../x86_64-suse-linux/bin/ld: > /usr/lib64/gcc/x86_64-suse-linux/7/../../../../lib64/crt1.o: in function > `_start': > /home/abuild/rpmbuild/BUILD/glibc-2.27/csu/../sysdeps/x86_64/start.S:104: > undefined reference to `main' > collect2: error: ld returned 1 exit status > > Richi how did you achieve to reproduce that? It's still failing on aarch64-none-linux-gnu. So that doesn't mean this goes waiting.
[Bug target/87563] [9 regression ] ICE with -march=armv8-a+sve
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87563 Ramana Radhakrishnan changed: What|Removed |Added Target||aarch64-none-elf Target Milestone|--- |9.0 --- Comment #2 from Ramana Radhakrishnan --- Fix target and milestone.
[Bug target/87563] [9 regression ] ICE with -march=armv8-a+sve
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87563 Ramana Radhakrishnan changed: What|Removed |Added Keywords||ice-on-valid-code Status|UNCONFIRMED |NEW Last reconfirmed||2018-10-09 Ever confirmed|0 |1 --- Comment #1 from Ramana Radhakrishnan --- Confirmed.
[Bug target/87563] New: [9 regression ] ICE with -march=armv8-a+sve
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87563 Bug ID: 87563 Summary: [9 regression ] ICE with -march=armv8-a+sve Product: gcc Version: 9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: ramana at gcc dot gnu.org Target Milestone: --- Somewhere between r261702 and r262881 the following testcase ICEs with -Ofast -O3 -march=armv8-a+sve. int a, b, c, *e; int d[2]; void f() { while (c) { d[0] = 4; d[1] = 4; *e = b == 0 ? 0 : a / b; } } /tmp/sve.c:7:21: internal compiler error: in maybe_gen_insn, at optabs.c:7307 *e = b == 0 ? 0 : a / b; ~~~^~~ 0xb06c73 maybe_gen_insn(insn_code, unsigned int, expand_operand*) /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/optabs.c:7307 0xb072be maybe_expand_insn(insn_code, unsigned int, expand_operand*) /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/optabs.c:7351 0xb095ef expand_insn(insn_code, unsigned int, expand_operand*) /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/optabs.c:7382 0x9d586a expand_direct_optab_fn /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/internal-fn.c:2921 0x9d6143 expand_COND_DIV /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/internal-fn.def:155 0x9d76bd expand_internal_call(internal_fn, gcall*) /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/internal-fn.c:3524 0x9d76eb expand_internal_call(gcall*) /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/internal-fn.c:3532 0x757bb2 expand_call_stmt /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:2596 0x757bb2 expand_gimple_stmt_1 
/tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:3575 0x757bb2 expand_gimple_stmt /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:3734 0x75b8a5 expand_gimple_basic_block /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:5769 0x75f950 execute /tmp/dgboter/bbs/bc-b1-2-11--rhe6x86_64/buildbot/rhe6x86_64--aarch64-none-elf/build/src/gcc/gcc/cfgexpand.c:6372
[Bug target/86673] [8/9 regression] inline asm sometimes ignores 'register asm("reg")' declarations
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86673 Ramana Radhakrishnan changed: What|Removed |Added Target||arm-none-linux-gnueabi , ||arm-none-eabi Status|UNCONFIRMED |NEW Last reconfirmed||2018-07-25 CC||ramana at gcc dot gnu.org Known to work||7.2.0 Summary|inline asm sometimes|[8/9 regression] inline asm |ignores 'register |sometimes ignores 'register |asm("reg")' declarations|asm("reg")' declarations Ever confirmed|0 |1 Known to fail||8.1.0, 9.0 --- Comment #3 from Ramana Radhakrishnan --- Confirmed.
[Bug middle-end/86640] [8/9 regression] ICE in combine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86640 Ramana Radhakrishnan changed: What|Removed |Added Keywords||ice-on-valid-code Target||arm-none-linux-gnueabihf Status|UNCONFIRMED |NEW Last reconfirmed||2018-07-23 Ever confirmed|0 |1 --- Comment #1 from Ramana Radhakrishnan --- confirmed.
[Bug middle-end/86640] New: [8/9 regression] ICE in combine
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86640 Bug ID: 86640 Summary: [8/9 regression] ICE in combine Product: gcc Version: 8.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: ramana at gcc dot gnu.org Target Milestone: --- char fn1() { long long b[5]; for (int a = 0; a < 5; a++) b[a] = ~0ULL; return b[3]; } $> arm-none-linux-gnueabihf-gcc -c -O3 -mfpu=neon -mfloat-abi=hard -march=armv7-a /tmp/crash.c during RTL pass: combine /tmp/crash.c: In function ‘fn1’: /tmp/crash.c:11:1: internal compiler error: in do_SUBST, at combine.c:731 } ^ 0x12e637c do_SUBST /tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:730 0x12f913e subst /tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:5589 0x12fb2d1 try_combine /tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:3359 0x1301398 combine_instructions /tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:1299 0x1301398 rest_of_handle_combine /tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:14898 0x1301398 execute /tmp/dgboter/bbs/bc-b3-3-13--rhe6x86_64/buildbot/rhe6x86_64--arm-none-linux-gnueabihf/build/src/gcc/gcc/combine.c:14943
[Bug target/86555] unaligned address for ldrd/strd on armv5e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86555 --- Comment #4 from Ramana Radhakrishnan --- (In reply to Khem Raj from comment #2) > we can avoid the problem by altering the structure, thats not an issue, but > do you think compiler is right here by assuming to generate LDRD on a 4byte > aligned address when it is told that architecture (-march=armv5te) its > building for does not support 4byte aligned address for LDRD but only 8-byte > aligned ? It is correct for the compiler to be doing this - the compiler has just not been given enough information. buf can only get aligned to 8 bytes if there is an attribute in the source setting the alignment properly; otherwise it's a char array and the compiler is within its rights not to force the alignment upwards to 8 bytes in this case. When the compiler is dereferencing de->d_off it expects it to be naturally 8-byte aligned. Fix the source.
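A minimal sketch of the kind of source fix meant here, assuming a layout similar to the one in the report. `struct entry`, `raw_buf`, `fill_entry` and `entry_offset` are hypothetical names, not the reporter's code. The point is only that the char buffer carries an explicit alignment attribute, so the 8-byte member can legitimately be accessed with a doubleword load.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record with a 64-bit member, standing in for the
   directory entry whose d_off field is read in the report.  */
struct entry {
    int64_t d_off;
    unsigned short d_reclen;
};

/* Without the attribute a char array is only guaranteed 1-byte
   alignment; with it the buffer starts on an 8-byte boundary.  */
static char raw_buf[64] __attribute__((aligned(8)));

/* Copy one entry into the buffer and hand back the raw bytes.  */
char *fill_entry(int64_t off)
{
    struct entry e = { off, sizeof e };
    memcpy(raw_buf, &e, sizeof e);
    return raw_buf;
}

/* Mirrors the cast-and-dereference pattern from the report.  The
   assert documents the alignment the compiler is entitled to assume;
   strictly portable code would memcpy the field out instead.  */
int64_t entry_offset(const char *buf)
{
    assert((uintptr_t)buf % _Alignof(struct entry) == 0);
    const struct entry *de = (const struct entry *)buf;
    return de->d_off;
}
```

With the attribute removed, the assert in `entry_offset` is allowed to fail, which is the situation the comment describes.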
[Bug tree-optimization/80641] missed optimization with with std::vector resize in loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80641 Ramana Radhakrishnan changed: What|Removed |Added Known to fail||7.3.1 --- Comment #14 from Ramana Radhakrishnan --- 7.3.1 appears to fail the original testcase for an aarch64 cross compiler to Linux with -O3 and -Wall.
[Bug tree-optimization/80641] missed optimization with with std::vector resize in loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80641 Ramana Radhakrishnan changed: What|Removed |Added Known to work||6.4.1, 8.1.0 Known to fail||7.2.1 --- Comment #13 from Ramana Radhakrishnan --- With the original testcase I can still see a warning come out for a reasonably recent GCC 7 snapshot on aarch64, while it appears to work fine on gcc 8 and gcc 6. Thanks Ramana
[Bug tree-optimization/80641] missed optimization with with std::vector resize in loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80641 --- Comment #12 from Ramana Radhakrishnan --- (In reply to Martin Sebor from comment #11) > *** Bug 86516 has been marked as a duplicate of this bug. *** (In reply to Paul Gotch from comment #10) > I'm afraid the changes made to libstdc++ have only solved part of the > regression if you say something like > > std::vector v; > > if(c.size() > 0) > c.resize(c.size() - 1); > > then you no longer get a warning in 7.3 however if instead you do > > if(! c.empty()) > c.resize(c.size() -1); > > the warning is produced just as in early 7.x releases. No warning is > produced in 6.x so this is still a regression. > > I presume this happens as empty wasn't annotated in libstdc++ and the > underlying data flow analysis bug is yet to be fixed. So why is this not a regression ? It's quite clear that the annotations did not do enough to work around the issue.
[Bug tree-optimization/85804] [8/9 Regression][AArch64] Mis-compilation of loop with strided array access and xor reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85804 --- Comment #3 from Ramana Radhakrishnan --- (In reply to Ramana Radhakrishnan from comment #2) > Patch being discussed here. > https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01026.html Bin are you still working on this ?
[Bug target/85854] Performance regression from gcc 4.9.2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85854 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2018-07-11 CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Ramana Radhakrishnan --- I'm unable to build the pre-processed file with 4.9 - is it possible for you to attach a non-preprocessed version as well ? The reason this happens compared to the usual instructions is that the implementation of the intrinsic can well change between compiler versions. regards Ramana
[Bug target/86209] Peephole does not happen because the type of zero/sign extended operands is not the same.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86209 --- Comment #13 from Ramana Radhakrishnan --- Sameera, If you are working on this , can you please assign this to yourself ? Ramana
[Bug target/86209] Peephole does not happen because the type of zero/sign extended operands is not the same.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86209 Ramana Radhakrishnan changed: What|Removed |Added Keywords||missed-optimization Status|UNCONFIRMED |NEW Last reconfirmed||2018-07-11 Ever confirmed|0 |1 --- Comment #12 from Ramana Radhakrishnan --- Confirmed then.
[Bug target/85910] config/aarch64/aarch64.c:15653:12: warning: duplicated ‘if’ condition
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85910 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-07-11 CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Ramana Radhakrishnan --- Confirmed.
[Bug libgcc/85967] [ARM] No unwinding support for division functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85967 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-07-11 CC||ramana at gcc dot gnu.org Assignee|unassigned at gcc dot gnu.org |ramana at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #3 from Ramana Radhakrishnan --- This patch would fit under the 10 line rule but for the future can I also confirm that you have a copyright assignment in place with the FSF ? It would be good to test this and push this into the tree after a testrun, I can do that for you. Ramana
[Bug middle-end/83623] [8 Regression] ICE: in convert_move, at expr.c:248 with -march=knl and 16bit vector bswap/rotate
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83623 Ramana Radhakrishnan changed: What|Removed |Added Status|RESOLVED|REOPENED CC||ramana at gcc dot gnu.org Resolution|FIXED |--- --- Comment #8 from Ramana Radhakrishnan --- Seems to need a fix for gcc 6 branch based on PR86166
[Bug target/86209] Peephole does not happen because the type of zero/sign extended operands is not the same.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86209 --- Comment #3 from Ramana Radhakrishnan --- (In reply to sameerad from comment #2) > Ramana, it is another peephole that I am trying to explore for falkor. It > combines loads/stores of shorter types (QI/HI/SI) into single load/store of > larger type (SI/DI). Ah I see. Sorry , not enough coffee yet.
[Bug target/86209] Peephole does not happen because the type of zero/sign extended operands is not the same.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86209 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #1 from Ramana Radhakrishnan --- (In reply to sameerad from comment #0) > While implementing peephole2 for combining shorter types loads/stores into > larger type load/store, following testcase was found for aarch64 for which > peephole does not happen because the type of zero/sign extended operands is > not the same. > > Test program: > unsigned short > subus (unsigned short *array) > { > return array[0] + array[1]; > } > > Expander generated RTL: > (insn 6 3 7 2 (set (reg:HI 96) > (mem:HI (reg/v/f:DI 94 [ array ]) [1 *array_4(D)+0 S2 A16])) > (nil)) > (insn 7 6 8 2 (set (reg:HI 97) > (mem:HI (plus:DI (reg/v/f:DI 94 [ array ]) > (const_int 2 [0x2])) [1 MEM[(short unsigned int *)array_4(D) > + 2B]+0 S2 A16])) > (nil)) > (insn 8 7 9 2 (set (reg:SI 99) > (subreg:SI (reg:HI 97) 0)) > (nil)) > (insn 9 8 10 2 (set (reg:SI 98) > (plus:SI (subreg:SI (reg:HI 96) 0) > (reg:SI 99))) > (expr_list:REG_EQUAL (plus:SI (subreg:SI (reg:HI 96) 0) > (subreg:SI (reg:HI 97) 0)) > (nil))) > > The combiner combines insn 7 and 8 to generate zero extension to SI mode. > > (insn 8 7 9 2 (set (reg:SI 99 [ MEM[(short unsigned int *)array_4(D) + 2B] ]) > (zero_extend:SI (mem:HI (plus:DI (reg/v/f:DI 94 [ array ]) > (const_int 2 [0x2])) [1 MEM[(short unsigned int > *)array_4(D) + 2B]+0 S2 A16]))) {*zero_extendhisi2_aarch64} > (expr_list:REG_DEAD (reg/v/f:DI 94 [ array ]) > (nil))) > > The reload pass removes SUBREGs, which holds information about desired > type, because of which HImode regs are zero extended to DImode. 
> > (insn 8 7 6 2 (set (reg:SI 1 x1 [orig:99 MEM[(short unsigned int > *)array_4(D) + 2B] ] [99]) > (zero_extend:SI (mem:HI (plus:DI (reg/v/f:DI 0 x0 [orig:94 array ] > [94]) > (const_int 2 [0x2])) [1 MEM[(short unsigned int > *)array_4(D) + 2B]+0 S2 A16]))) {*zero_extendhisi2_aarch64} > (nil)) > (insn 6 8 9 2 (set (reg:DI 0 x0) > (zero_extend:DI (mem:HI (reg/v/f:DI 0 x0 [orig:94 array ] [94]) [1 > *array_4(D)+0 S2 A16]))) {*zero_extendhidi2_aarch64} > (nil)) > (insn 9 6 14 2 (set (reg:SI 0 x0 [98]) > (plus:SI (reg:SI 0 x0 [orig:96 *array_4(D) ] [96]) > (reg:SI 1 x1 [orig:99 MEM[(short unsigned int *)array_4(D) + 2B] > ] [99]))){*addsi3_aarch64} > (nil)) > (insn 14 9 15 2 (set (reg/i:HI 0 x0) > (reg:HI 0 x0 [98])) {*movhi_aarch64} > (nil)) > (insn 15 14 17 2 (use (reg/i:HI 0 x0)) > (nil)) > (note 17 15 18 NOTE_INSN_DELETED) > (note 18 17 0 NOTE_INSN_DELETED) > > Now as both memory accesses have different extended types, they cannot be > combined by peephole. > > Because of this, even when sched_fusion has brought the loads/stores closer, > they cannot be merged. Hmmm, "ldr w0, [x0]; ldr w1, [x0, 2]" is not the same as "ldp w0, w1, [x0]". "ldp w0, w1, [x0]" is the same as merging "ldr w0, [x0]; ldr w1, [x0, 4]". Am I missing something ? That would mean it isn't possible to merge this combination. Thoughts ...
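To make the alternative concrete: the reporter's reply on this bug says the peephole is meant to combine loads of shorter types into a single load of a larger type, rather than form an ldp. A rough C-level sketch of that transformation on the test program is below. `subus_merged` is a hypothetical hand-written equivalent, not compiler output, and it assumes `unsigned short` is 16 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The test program from the report: two separate 16-bit loads.  */
unsigned short subus(const unsigned short *array)
{
    assert(array != NULL);
    return array[0] + array[1];
}

/* Hand-merged form: one 32-bit load covering both halfwords, which are
   then split apart in registers.  Addition is commutative, so the
   result does not depend on which half came from array[0].  */
unsigned short subus_merged(const unsigned short *array)
{
    uint32_t word;

    assert(array != NULL);
    memcpy(&word, array, sizeof word);   /* single 4-byte access */
    return (unsigned short)((word & 0xffffu) + (word >> 16));
}
```

Both functions truncate the sum to 16 bits on return, so they agree even when the addition wraps.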
[Bug tree-optimization/64946] [AArch64] gcc.target/aarch64/vect-abs-compile.c - "abs" vectorization fails for char/short types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64946 Ramana Radhakrishnan changed: What|Removed |Added Target Milestone|--- |9.0
[Bug tree-optimization/64946] [AArch64] gcc.target/aarch64/vect-abs-compile.c - "abs" vectorization fails for char/short types
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64946 --- Comment #25 from Ramana Radhakrishnan --- (In reply to kugan from comment #24) > Author: kugan > Date: Sat Jun 16 21:34:29 2018 > New Revision: 261681 > > URL: https://gcc.gnu.org/viewcvs?rev=261681=gcc=rev > Log: > gcc/ChangeLog: > > 2018-06-16 Kugan Vivekanandarajah > > PR middle-end/64946 > * cfgexpand.c (expand_debug_expr): Hande ABSU_EXPR. > * config/i386/i386.c (ix86_add_stmt_cost): Likewise. > * dojump.c (do_jump): Likewise. > * expr.c (expand_expr_real_2): Check operand type's sign. > * fold-const.c (const_unop): Handle ABSU_EXPR. > (fold_abs_const): Likewise. > * gimple-pretty-print.c (dump_unary_rhs): Likewise. > * gimple-ssa-backprop.c (backprop::process_assign_use): Likesie. > (strip_sign_op_1): Likesise. > * match.pd: Add new pattern to generate ABSU_EXPR. > * optabs-tree.c (optab_for_tree_code): Handle ABSU_EXPR. > * tree-cfg.c (verify_gimple_assign_unary): Likewise. > * tree-eh.c (operation_could_trap_helper_p): Likewise. > * tree-inline.c (estimate_operator_cost): Likewise. > * tree-pretty-print.c (dump_generic_node): Likewise. > * tree-vect-patterns.c (vect_recog_sad_pattern): Likewise. > * tree.def (ABSU_EXPR): New. > > gcc/c-family/ChangeLog: > > 2018-06-16 Kugan Vivekanandarajah > > * c-common.c (c_common_truthvalue_conversion): Handle ABSU_EXPR. > > gcc/c/ChangeLog: > > 2018-06-16 Kugan Vivekanandarajah > > * c-typeck.c (build_unary_op): Handle ABSU_EXPR; > * gimple-parser.c (c_parser_gimple_statement): Likewise. > (c_parser_gimple_unary_expression): Likewise. > > gcc/cp/ChangeLog: > > 2018-06-16 Kugan Vivekanandarajah > > * constexpr.c (potential_constant_expression_1): Handle ABSU_EXPR. > * cp-gimplify.c (cp_fold): Likewise. > > gcc/testsuite/ChangeLog: > > 2018-06-16 Kugan Vivekanandarajah > > PR middle-end/64946 > * gcc.dg/absu.c: New test. > * gcc.dg/gimplefe-29.c: New test. > * gcc.target/aarch64/pr64946.c: New test. 
> > > Added: > trunk/gcc/testsuite/gcc.dg/absu.c > trunk/gcc/testsuite/gcc.dg/gimplefe-29.c > trunk/gcc/testsuite/gcc.target/aarch64/pr64946.c > Modified: > trunk/gcc/ChangeLog > trunk/gcc/c-family/ChangeLog > trunk/gcc/c-family/c-common.c > trunk/gcc/c/ChangeLog > trunk/gcc/c/c-typeck.c > trunk/gcc/c/gimple-parser.c > trunk/gcc/cfgexpand.c > trunk/gcc/config/i386/i386.c > trunk/gcc/cp/ChangeLog > trunk/gcc/cp/constexpr.c > trunk/gcc/cp/cp-gimplify.c > trunk/gcc/dojump.c > trunk/gcc/expr.c > trunk/gcc/fold-const.c > trunk/gcc/gimple-pretty-print.c > trunk/gcc/gimple-ssa-backprop.c > trunk/gcc/match.pd > trunk/gcc/optabs-tree.c > trunk/gcc/testsuite/ChangeLog > trunk/gcc/tree-cfg.c > trunk/gcc/tree-eh.c > trunk/gcc/tree-inline.c > trunk/gcc/tree-pretty-print.c > trunk/gcc/tree-vect-patterns.c > trunk/gcc/tree.def Doesn't this mean we unxfail the vect-abs-compile.c test ?
[Bug tree-optimization/85804] [8/9 Regression][AArch64] Mis-compilation of loop with strided array access and xor reduction
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85804 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #2 from Ramana Radhakrishnan --- Patch being discussed here. https://gcc.gnu.org/ml/gcc-patches/2018-05/msg01026.html
[Bug debug/84342] Location views breaks cross builds of arm including gnueabihf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84342 --- Comment #13 from Ramana Radhakrishnan --- (In reply to Jeffrey A. Law from comment #12) > I'm not familiar enough with the ccfsm bits to know if there's something we > ought to be doing generically to improve CC handling further. I think > downgrading to P2 certainly makes sense though. > > However, I wouldn't be surprised if we find other instances of this kind of > problem confusing the hell out of the location view support. So I wouldn't > dig at the ccfsm stuff just to allow the compiler to handle view support > slightly more efficiently -- removal of ccfsm should stand on its own. Agreed about the removal of ccfsm support standing on its own. I do wonder if doing that will have the side benefit of not having these kinds of issues. IIRC arc (from the days I worked on it) has ccfsm similar to arm, so maybe the same problem will bite us on other ports as well.
[Bug debug/84342] Location views breaks cross builds of arm including gnueabihf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84342 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #11 from Ramana Radhakrishnan --- (In reply to Jakub Jelinek from comment #10) > GCC 8.1 has been released. (In reply to Jeffrey A. Law from comment #9) > Alex: I realize that's the point of the hook. But I'm pretty sure there's > no way to fix the ARM port given the point at which lengths are set and the > point at which ccfsm is valid are at two different times. We'd either need > a revamp of ccfsm or some layering violations to allow dwarf2out to access > the underlying routines for length query and bypass the cache. > > It's my view this BZ is resolved. But if you want to keep it open to track > the incorrect lengths in the ARM port, that's fine. But it's certainly no > longer a regression for gcc-8. In which case should this still retain the P1 status ? The builds are ok since Alex's patch and we need to look at this at some point of time. Richard E and I have been talking about whether ccfsm actually makes sense in 2018 and whether we should just rip it out anyway and make sure that the rtl optimizers get it right rather than carry this in the far future. It's only used in A32 state, it's probably not got a huge amount going for it and maybe we should just rip it out.
[Bug target/85733] [8 regression] ARM -mbe8 behaviour doesn't match documentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85733 Ramana Radhakrishnan changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |rearnsha at gcc dot gnu.org
[Bug target/85733] [8 regression] ARM -mbe8 behaviour doesn't match documentation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85733 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-05-11 CC||ramana at gcc dot gnu.org Target Milestone|--- |8.2 Summary|ARM -mbe8 behaviour doesn't |[8 regression] ARM -mbe8 |match documentation |behaviour doesn't match ||documentation Ever confirmed|0 |1
[Bug target/85593] [5,6,7,8 Regression] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-05-04 Version|9.0 |5.4.1 Summary|GCC on ARM allocates R3 for |[5,6,7,8 Regression] GCC on |local variable when calling |ARM allocates R3 for local |naked function with O2 |variable when calling naked |optimizations enabled |function with O2 ||optimizations enabled Ever confirmed|0 |1 --- Comment #5 from Ramana Radhakrishnan --- (In reply to Austin Morton from comment #4) > In my particular case I was able to work around the issue by removing the > naked attribute and using extended assembly with a clobbers list. Removing the naked attribute and using the extended assembler with a clobbers list is absolutely the correct thing to do. > > The resulting code is nearly identical (allowing GCC to generate the correct > pro/epilog instead of hand writing it), and gcc correctly allocates R4 > instead of R3. > > This still feels like a bug in GCC. In the example I gave, if you compiled > the naked function in a separate C file and linked them it would generate > the correct code. The issue is that GCC is able to "see" the naked function > and is performing optimizations that it shouldn't as a result. > > I believe that GCC should treat naked functions as opaque as far as > optimizations are concerned. > At the very least, there should be a note about this kind of issue included > in the documentation of the naked attribute. Yes, it should be opaque as far as this IPA-RA optimization is concerned - I don't think there are many other optimizations that need to treat this as opaque. That's what I alluded to in my previous comment > IIRC there is a hook for ipa-ra that says what > registers can be clobbered : can't find it immediately. I suppose for naked > functions it is *all* registers. 
I wasn't looking in the backend when I responded earlier; there is no such hook. I think the correct fix would be to get arm_emit_call_insn to mark *all* registers as clobbered if the target of the call insn is a naked function, i.e. effectively disabling ipa-ra for calls to naked functions. You'd have to figure out that the DECL for the target of the call had a "naked" attribute attached to it ... Do you feel up to writing a patch, assuming you have copyright assignments et al sewn up? > > > > regards > Ramana
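For illustration, the workaround discussed above (dropping the naked attribute and using extended asm with a clobbers list) might look roughly like the sketch below. This is not the reporter's code: the names touch_value and check_touch_value are hypothetical, and the asm template is deliberately empty so that the example builds on any GCC target. A real ARM version would list the registers the assembly actually modifies (for example "r3") in the clobber list.

```c
#include <assert.h>

/* Hypothetical helper, not taken from the bug report.
   The asm statement emits no instructions; it only shows the
   extended-asm layout: template : outputs : inputs : clobbers.  */
int touch_value(int x)
{
    /* "+r"(x) says x is both read and written and lives in a register.
       The clobber list ("memory", "cc") tells GCC that the asm may
       change memory and the condition flags, so values must not be
       cached across it.  On ARM, registers such as "r3" would be named
       here if the assembly modified them.  */
    __asm__ volatile("" : "+r"(x) : : "memory", "cc");
    return x + 1;
}

/* Small self-check: the empty asm must leave x unchanged. */
void check_touch_value(void)
{
    assert(touch_value(41) == 42);
}
```

Because the compiler still generates the prologue and epilogue here, it knows which registers the call may clobber, which is the information ipa-ra lacks for a naked function.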
[Bug target/85593] GCC on ARM allocates R3 for local variable when calling naked function with O2 optimizations enabled
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85593 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #3 from Ramana Radhakrishnan --- Changing the testcase to indicate the clobbered register r3 makes the test pass, but that is wrong in a naked function. Extended inline assembler *cannot* be used within a naked function because you could inadvertently end up requiring a stack slot, which would cause the function to require a prologue and epilogue, exactly contrary to what we want from a naked function! I *think* the real problem is that ipa-ra comes along and decides that r3 isn't used at all in the naked function. You can see the problem disappear with -fno-ipa-ra, but that is not a workaround I would recommend in your general C flags, because you would be using a hammer, disabling a nice optimization to make something like the example "work". Thus I think what you want is to get rid of naked functions and write the whole thing in assembler rather than faffing about with naked functions. IIRC there is a hook for ipa-ra that says what registers can be clobbered : can't find it immediately. I suppose for naked functions it is *all* registers. regards Ramana
[Bug target/84923] [8/9 regression] gcc.dg/attr-weakref-1.c failed on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84923 --- Comment #4 from Ramana Radhakrishnan --- (In reply to Richard Biener from comment #3) > For x86_64 if I append > > const int *dat[] = { , }; > > the testcase links fine irrespective of where I place the > > .weakrefWv12,wv12 > .weak wv12 > > assembler declarations. > > When I look at the assembler generated by a cross from x86_64 I do not > see any .weak wv12 directive which is likely the issue. Yep, the .weak wv12 directive is the one that disappears. Putting that back in by hand fixes the issue.
[Bug target/68256] Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68256 --- Comment #12 from Ramana Radhakrishnan --- (In reply to Steve Ellcey from comment #11) > FYI: This caused a regression on aarch64. > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84923 I have marked 84923 as an 8 regression, as that wasn't done earlier and it probably slipped through the cracks. The regression was noticed and a patch was posted, but it appears that the patch wasn't reviewed. Ramana
[Bug ada/85380] gnatbind fails with small executable & restricted runtime
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85380 Ramana Radhakrishnan changed: What|Removed |Added CC||ebotcazou at gcc dot gnu.org, ||ramana at gcc dot gnu.org --- Comment #1 from Ramana Radhakrishnan --- Adding Eric to the CC list as someone who could comment on this ?
[Bug target/85203] cmse_nonsecure_caller intrinsic returns incorrect results
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85203 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||ramana at gcc dot gnu.org Resolution|--- |FIXED Target Milestone|--- |7.4 --- Comment #4 from Ramana Radhakrishnan --- Fixed, I'm assuming?
[Bug target/85261] __builtin_arm_set_fpscr ICEs with constant input
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85261 Ramana Radhakrishnan changed: What|Removed |Added CC||ramana at gcc dot gnu.org --- Comment #2 from Ramana Radhakrishnan --- What about earlier branches ?
[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-03-28 Ever confirmed|0 |1 --- Comment #3 from Ramana Radhakrishnan --- Confirmed.
[Bug target/81863] [7 regression] -mword-relocations is unreliable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81863 --- Comment #21 from Ramana Radhakrishnan --- Author: ramana Date: Tue Mar 27 14:06:20 2018 New Revision: 258886 URL: https://gcc.gnu.org/viewcvs?rev=258886&root=gcc&view=rev Log: [Patch ARM] Fix PR target/81863 This has been in my patch stack for quite some time. The problem here was that we weren't handling arm_word_relocations in arm_valid_symbolic_address; handling it there is the surest fix for GCC8 and GCC7. Regression tested on arm-none-linux-gnueabihf. Applying to trunk and backporting to GCC-7 in a day or so. regards Ramana 2018-03-27 Ramana Radhakrishnan PR target/81863 * config/arm/arm.c (arm_valid_symbolic_address): Handle arm_word_relocations. 2018-03-27 Ramana Radhakrishnan PR target/81863 * gcc.target/arm/pr81863.c: New test. Added: trunk/gcc/testsuite/gcc.target/arm/pr81863.c Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/arm.c trunk/gcc/testsuite/ChangeLog
[Bug middle-end/84877] Local stack copy of BLKmode parameter on the stack is not aligned when the requested alignment exceeds MAX_SUPPORTED_STACK_ALIGNMENT
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84877 Ramana Radhakrishnan changed: What|Removed |Added Target||arm-none-eabi CC||ramana at gcc dot gnu.org --- Comment #1 from Ramana Radhakrishnan --- Isn't this something you said you could see from 6.x ?
[Bug target/68256] Defining TARGET_USE_CONSTANT_BLOCKS_P causes go bootstrap failure on aarch64.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68256 Ramana Radhakrishnan changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Target Milestone|7.4 |8.0 --- Comment #10 from Ramana Radhakrishnan --- Fixed for gcc-8
[Bug target/59833] ARM soft-float extendsfdf2 fails to quiet signaling NaN
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59833 Ramana Radhakrishnan changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |7.0 --- Comment #17 from Ramana Radhakrishnan --- Fixed for GCC-7
[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521 --- Comment #15 from Ramana Radhakrishnan --- Author: ramana Date: Mon Feb 26 09:25:21 2018 New Revision: 257984 URL: https://gcc.gnu.org/viewcvs?rev=257984&root=gcc&view=rev Log: [Patch AArch64] Turn on frame pointer / partial fix for PR84521 This fixes a GCC-8 regression in which we accidentally switched off frame pointers in the AArch64 backend when changing the defaults in the common parts of the code. This breaks an ABI decision that was made in GCC at the dawn of the port with respect to having a frame pointer at all times. If we really want to turn this off, let's have a discussion around that separately. For now turn this back on; I believe this will leave PR84521 latent again with -fomit-frame-pointer and (hopefully) make the ruby issue go away. I'm asking Sudi to pick that up. Bootstrapped and regression tested on AArch64-none-linux-gnu, but I see one regression in gcc.c-torture/execute/960419-2.c which needs to be looked at next (PR84528, thanks Kyrill). Ok to put in and then look at PR84528? 2018-02-26 Ramana Radhakrishnan PR target/84521 * common/config/aarch64/aarch64-common.c (aarch_option_optimization_table[]): Switch off fomit-frame-pointer. 2018-02-26 Ramana Radhakrishnan PR target/84521 * gcc.target/aarch64/lr_free_2.c: Revert changes in r254814 disabling -fomit-frame-pointer by default. * gcc.target/aarch64/spill_1.c: Likewise. * gcc.target/aarch64/test_frame_11.c: Likewise. * gcc.target/aarch64/test_frame_12.c: Likewise. * gcc.target/aarch64/test_frame_13.c: Likewise. * gcc.target/aarch64/test_frame_14.c: Likewise. * gcc.target/aarch64/test_frame_15.c: Likewise. * gcc.target/aarch64/test_frame_3.c: Likewise. * gcc.target/aarch64/test_frame_5.c: Likewise. * gcc.target/aarch64/test_frame_9.c: Likewise.
Modified: trunk/gcc/ChangeLog trunk/gcc/common/config/aarch64/aarch64-common.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.target/aarch64/lr_free_2.c trunk/gcc/testsuite/gcc.target/aarch64/spill_1.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_11.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_12.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_13.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_14.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_15.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_3.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_5.c trunk/gcc/testsuite/gcc.target/aarch64/test_frame_9.c
[Bug target/84528] [8 Regression] gcc.c-torture/execute/960419-2.c -O3 fails with -fno-omit-frame-pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84528 Ramana Radhakrishnan changed: What|Removed |Added Priority|P1 |P3 Status|UNCONFIRMED |NEW Last reconfirmed||2018-02-23 CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Ramana Radhakrishnan --- We are about to turn -fno-omit-frame-pointer back on for gcc-8, so this is probably important. Usually the RMs need to set this to P1, so I am keeping this at P3 until an RM confirms whether it should go up to P1 or P2.
[Bug target/84521] [8 Regression] aarch64: Frame-pointer corruption with __builtin_setjmp/__builtin_longjmp and -fomit-frame-pointer
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84521 --- Comment #10 from Ramana Radhakrishnan --- (In reply to Jakub Jelinek from comment #4) > Is the requirement just for functions that contain setjmp? If so, the > backend could just force frame pointers in cfun->calls_setjmp functions. I think we should flip -fno-omit-frame-pointer back on for gcc-8, as omitting the frame pointer by default breaks the guarantee that we've had in the port for quite a while. I'm currently testing a patch to turn this back on, which I will get out first thing tomorrow. If we want to turn it off, that should be a conscious decision. > > If not, even if the default is tweaked again to be -fno-omit-frame-pointer > on aarch64, the code is still wrong with explicit -fno-omit-frame-pointer, > even before that change. I think we should treat that as a separate but related issue. Ramana
[Bug tree-optimization/83543] strlen of a local array member not optimized on some targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83543 Ramana Radhakrishnan changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-02-20 CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #4 from Ramana Radhakrishnan --- (In reply to Martin Sebor from comment #0) > Bug 83462 reports (among others) a failure in the new > c-c++-common/Warray-bounds-4.c test on powerpc64le. The failure is due to a > strlen optimization that's for some reason not working on this target (and > on some others, including arm-none-eabi) but that works fine on > x86_64-linux. The test case below shows the difference in cross-compiler > output between these three architectures. Is this because arm-none-eabi by default is a STRICT_ALIGNMENT target ? What happens if for instance you try this with -march=armv7-a where we allow some limited misaligned access ? Ramana > > $ (set -x && cat z.c && for arch in '' arm-none-eabi powerpc64le-linux; do > /ssd/build/$arch/gcc-git/gcc/xgcc -B /ssd/build/$arch/gcc-git/gcc -O2 -S > -Wall -fdump-tree-optimized=/dev/stdout z.c; done) > + cat z.c > struct S { char a[7]; }; > > void f (void) > { > struct S s = { "12345" }; > if (__builtin_strlen (s.a) != 5) > __builtin_abort (); > } > + for arch in ''\'''\''' arm-none-eabi powerpc64le-linux > + /ssd/build//gcc-git/gcc/xgcc -B /ssd/build//gcc-git/gcc -O2 -S -Wall > -fdump-tree-optimized=/dev/stdout z.c > > ;; Function f (f, funcdef_no=0, decl_uid=1894, cgraph_uid=0, symbol_order=0) > > f () > { >[local count: 1073741825]: > return; > > } > > > + for arch in ''\'''\''' arm-none-eabi powerpc64le-linux > + /ssd/build/arm-none-eabi/gcc-git/gcc/xgcc -B > /ssd/build/arm-none-eabi/gcc-git/gcc -O2 -S -Wall > -fdump-tree-optimized=/dev/stdout z.c > > ;; Function f (f, funcdef_no=0, decl_uid=4155, cgraph_uid=0, symbol_order=0) > > f () > { > struct S s; > unsigned int _1; > >[local count: 1073741825]: > s = *.LC0; > _1 = __builtin_strlen (); > if (_1 != 5) > goto ; [0.00%] > else > goto ; 
[99.96%] > >[count: 0]: > __builtin_abort (); > >[local count: 1073312327]: > s ={v} {CLOBBER}; > return; > > } > > > + for arch in ''\'''\''' arm-none-eabi powerpc64le-linux > + /ssd/build/powerpc64le-linux/gcc-git/gcc/xgcc -B > /ssd/build/powerpc64le-linux/gcc-git/gcc -O2 -S -Wall > -fdump-tree-optimized=/dev/stdout z.c > > ;; Function f (f, funcdef_no=0, decl_uid=2784, cgraph_uid=0, symbol_order=0) > > f () > { > struct S s; > long unsigned int _1; > >[local count: 1073741825]: > s = *.LC0; > _1 = __builtin_strlen (); > if (_1 != 5) > goto ; [0.00%] > else > goto ; [99.96%] > >[count: 0]: > __builtin_abort (); > >[local count: 1073312327]: > s ={v} {CLOBBER}; > return; > > }
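The z.c test case quoted in the report can be turned into a small standalone check, sketched below. The struct and the initializer are taken from the report; the function name local_member_strlen is hypothetical, and the sketch uses the standard strlen rather than __builtin_strlen so that it builds with any hosted C compiler. It only checks the runtime result; whether the call is folded away at compile time is what differs between targets.

```c
#include <assert.h>
#include <string.h>

/* Same layout as in the report: a 7-byte array holding "12345". */
struct S { char a[7]; };

/* Hypothetical wrapper around the reported pattern.  On targets where
   the strlen optimization works, the compiler can fold this to the
   constant 5; elsewhere it copies the initializer and calls strlen.  */
size_t local_member_strlen(void)
{
    struct S s = { "12345" };
    return strlen(s.a);
}
```

Compiling this with -O2 -S -fdump-tree-optimized=/dev/stdout, as in the quoted command line, should show whether the call survives into the optimized dump on a given target.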
[Bug ipa/83178] [8 regression] g++.dg/ipa/devirt-22.C fail
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=83178 Ramana Radhakrishnan changed: What|Removed |Added Keywords||missed-optimization Status|UNCONFIRMED |NEW Last reconfirmed||2017-12-12 CC||ramana at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #3 from Ramana Radhakrishnan --- Confirmed then.
[Bug target/82248] probe_stack can generate unpredictable STR on arm
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82248 --- Comment #6 from Ramana Radhakrishnan --- Author: ramana Date: Tue Dec 5 16:32:55 2017 New Revision: 255428 URL: https://gcc.gnu.org/viewcvs?rev=255428&root=gcc&view=rev Log: [Patch ARM] Fix probe_stack constraint. The probe_stack pattern uses r0 as a fixed register. This can cause issues if we have auto-increment instructions coming out that have r0 as the base register. Tested with a bootstrap and regression run. richi reports that the original issue was fixed in the run. I did consider whether probe_stack_range was affected, but it all comes back to the probe_stack pattern, so I think we are ok. I don't have a testcase that provokes this, but stack probing seems to be on by default in most distributions, so I'm expecting the test coverage to come from there. Applied. Ramana PR target/82248 * config/arm/arm.md (probe_stack): Use the 'o' constraint. Modified: trunk/gcc/ChangeLog trunk/gcc/config/arm/arm.md