[Bug target/78660] [7 Regression] 7.0 bootstrap fail on mips64el-unknow-linux: configure-stage2-target-libgcc' failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78660 Matthew Fortune changed: What|Removed |Added CC||ebotcazou at gcc dot gnu.org --- Comment #9 from Matthew Fortune --- @eric: adding you following our discussion on PR rtl-optimization/59461 I have added my findings so far on this ticket.
[Bug target/78660] [7 Regression] 7.0 bootstrap fail on mips64el-unknow-linux: configure-stage2-target-libgcc' failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78660 --- Comment #8 from Matthew Fortune --- Created attachment 40518 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=40518=edit testcase I have narrowed this bug down to a mis-compilation of gcc/c/c-decl.c where there are a few code differences after applying yunqiang's partial revert. A reduced and pre-processed c-decl.c file is attached. mips64el-linux-gnu-g++ -fno-PIE -c -DIN_GCC_FRONTEND -O2 -DIN_GCC -fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common -o c-decl.me.o c-decl.me.c The key (I think) is that the following sequence of 3 instructions ends up being combined into 1 but the resulting instruction leaves the upper 32-bits of reg 316 entirely undefined. Eventually this leads to reg 316 being spilled to the stack where it is allocated a 64-bit slot but this spill only writes 32-bits whereas consumers read 64-bit and even if the value will only ever be operated on as 32-bit or less then logical and branch operations on the reloaded value will go wrong and normal 32-bit operations will be (strictly) undefined. (insn 56 55 57 3 (set (reg:SI 470) (ne:SI (reg/f:DI 469 [ current_scope.1_1->bindings ]) (const_int 0 [0]))) "/althome/mips/v6/src/gcc/gcc/c/c-decl.me.c":915 501 {*sne_zero_disi} (expr_list:REG_DEAD (reg/f:DI 469 [ current_scope.1_1->bindings ]) (nil))) (insn 57 56 58 3 (set (reg:QI 468) (subreg:QI (reg:SI 470) 0)) "/althome/mips/v6/src/gcc/gcc/c/c-decl.me.c":915 360 {*movqi_internal} (expr_list:REG_DEAD (reg:SI 470) (nil))) (insn 58 57 59 3 (set (reg:DI 316 [ iftmp.3_114 ]) (zero_extend:DI (reg:QI 468))) "/althome/mips/v6/src/gcc/gcc/c/c-decl.me.c":915 214 {*zero_extendqidi2} (expr_list:REG_DEAD (reg:QI 468) (nil))) combines to... (insn 58 57 59 3 (set (subreg:SI (reg:DI 316 [ iftmp.3_114 ]) 0) (ne:SI (reg/f:DI 469 [ current_scope.1_1->bindings ]) (const_int 0 [0]))) "/althome/mips/v6/src/gcc/gcc/c/c-decl.me.c":915 501 {*sne_zero_disi} (expr_list:REG_DEAD (reg/f:DI 469 [ current_scope.1_1->bindings ]) (nil))) The relevant fragment of combine is: Trying 57 -> 58: Successfully matched this instruction: (set (reg:DI 316 [ iftmp.3_114 ]) (subreg:DI (reg:SI 470) 0)) allowing combination of insns 57 and 58 original costs 4 + 4 = 8 replacement cost 4 deferring deletion of insn with uid = 57. modifying insn i358: r316:DI=r470:SI#0 REG_DEAD r470:SI deferring rescan insn with uid = 58. Trying 56 -> 58: Successfully matched this instruction: (set (subreg:SI (reg:DI 316 [ iftmp.3_114 ]) 0) (ne:SI (reg/f:DI 469 [ current_scope.1_1->bindings ]) (const_int 0 [0]))) allowing combination of insns 56 and 58 original costs 4 + 4 = 8 replacement cost 4 deferring deletion of insn with uid = 56. modifying insn i358: r316:DI#0=r469:DI!=0 REG_DEAD r469:DI deferring rescan insn with uid = 58. I've not started investigating why exactly this decision is made.
[Bug rtl-optimization/59461] missed zero-extension elimination in the combiner
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59461 --- Comment #10 from Matthew Fortune --- (In reply to Eric Botcazou from comment #9) > > This is a notoriously hard topic to address. All instructions affect the > > full 64-bit register including those that do 32-bit arithmetic i.e. they > > will set/clear the upper bits to replicate bit-31. > > So there are different 32-bit and 64-bit 'add' instructions for example? > That might nevertheless be OK for WORD_REGISTER_OPERATIONS. Yes. But logical operations are the same machine instructions regardless of whether it is being used for a 64-bit logical op or 32-bit logical op. > > In terms of instruction definition we therefore have instructions that > > operate on DImode and instructions that operate on SImode. The SImode > > instructions just don't need to worry about what is happening with the upper > > bits. > > Likewise for SPARC, which is WORD_REGISTER_OPERATIONS too, but doesn't care > about the upper bits in 32-bit mode and has a single 'add' instruction (but > maintains 32-bit _and_ 64-bit condition codes for every instruction). So are the upper bits for SPARC completely undefined? That would then be the major distinction between MIPS and SPARC. The upper bits are defined for MIPS > > I don't know if any of that subtlety affects this yet. > > Bugs in the nonzero_bits machinery are not to be ruled out either. I'm > willing to help of course but I don't have access to MIPS64 hardware. I'm yet to even narrow down to an object file affected owing to the bug only showing in stage2 build so far. Thanks for the offer of help, I'll add more when I find it.
[Bug rtl-optimization/59461] missed zero-extension elimination in the combiner
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59461 --- Comment #8 from Matthew Fortune --- (In reply to Eric Botcazou from comment #7) > > I'm yet to get my head around what the issue is but if anyone has a pointer > > based on the potential impact on MIPS64 as described above then I'd be > > grateful. > > Is WORD_REGISTER_OPERATIONS correct for MIPS64, i.e. do all instructions > operate on the full 64-bit integer registers? This is a notoriously hard topic to address. All instructions affect the full 64-bit register including those that do 32-bit arithmetic i.e. they will set/clear the upper bits to replicate bit-31. However, according to the architecture they logically operate on 32-bits and require that all inputs are canonical (sign bit replicated) otherwise the operation is invalid. So it would not matter whether the register was 33 bits or 1000 bits wide as long as all bits from 32 upwards replicate bit-31. The upper bits only become relevant once a 32-bit value is cast to a 64-bit value where sign extension is free and zero extension is an operation. truncation from 64-bit to 32-bit is a sign extension from bit-31 regardless of whether it is truncating to signed or unsigned. In terms of instruction definition we therefore have instructions that operate on DImode and instructions that operate on SImode. The SImode instructions just don't need to worry about what is happening with the upper bits. I don't know if any of that subtlety affects this yet.
[Bug rtl-optimization/59461] missed zero-extension elimination in the combiner
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59461 Matthew Fortune changed: What|Removed |Added CC||matthew.fortune at imgtec dot com --- Comment #6 from Matthew Fortune --- (In reply to Eric Botcazou from comment #5) > Fixed in GCC 7. There is a reasonable chance that this patch broke mips64 n64 but I do not have confirmation yet. See PR target/78660. I can vaguely see how this patch may affect MIPS64 in terms of how 32-bit values are handled on a MIPS64 architecture with the sign bit replicated to the upper-32 bits. This presumably is somehow not accounted for in the nonzero_bits logic. I'm yet to get my head around what the issue is but if anyone has a pointer based on the potential impact on MIPS64 as described above then I'd be grateful.
[Bug tree-optimization/15826] don't use "if" to extract a single bit bit-field.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=15826 Matthew Fortune changed: What|Removed |Added CC||matthew.fortune at imgtec dot com --- Comment #17 from Matthew Fortune --- (In reply to Bill Schmidt from comment #15) > ;; Function andrew (andrew, funcdef_no=2, decl_uid=2361, cgraph_uid=2, > symbol_order=2) > > andrew (struct s * p) > { > unsigned int _4; > unsigned int _5; > unsigned int _6; > > : > _4 = BIT_FIELD_REF <*p_3(D), 32, 0>; > _5 = _4 & 1; > _6 = _5; > return _6; > > } MIPS sees this same spurious failure. The issue (on a GCC 6 branch) is that the optimization has successfully happened it just looks like the check is wrong in the testcase. The presence of a logical and is not harmful as it is a legitimate way to truncate: (from match.pd) /* A truncation to an unsigned type (a zero-extension) should be canonicalized as bitwise and of a mask. */ This simplification kicks in for MIPS and presumably powerpc but does not (I don't know why) for x86_64. There is no 'goto' demonstrating that a single bit extract does not use an IF. I therefore propose we just update the check to verify there is no 'goto'. Matthew
[Bug target/78660] [7 Regression] 7.0 bootstrap fail on mips64el-unknow-linux: configure-stage2-target-libgcc' failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78660 --- Comment #3 from Matthew Fortune --- (In reply to Aldy Hernandez from comment #2) > I can't reproduce on a cross build. Is there a mips64el box on the compile > farm or somewhere public so someone can look at this? The following machines were added to the farm relatively recently. gcc24 is not accepting my login currently though to gcc22 or gcc23 are usable. They will not be fast though as they are modified routers. They are 32-bit but with 64-bit multilibs I suspect that is good enough to reproduce. I won't find time to look at this until the new year at least, any help is appreciated. Operating system are currently configured as follows: * gcc22 - kernel 3.14.10 (EB), Debian 7.10 (Wheezy), gcc 4.6.3 * gcc23 - kernel 4.1.4 (EL), Debian 8.4 (Jessie), gcc 4.9.2 * gcc24 - kernel 4.1.4 (EL), Debian 8.4 (Jessie), gcc 4.9.2 (EB = big endian, EL = little endian)
[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176 --- Comment #9 from Matthew Fortune --- (In reply to Eric Botcazou from comment #8) > > The expansion looks like an acceptable transformation to me i.e. it is not > > introducing the overflow for the offending pointer just maintaining what is > > already in the tree. > > Wrap around for unsigned types is OK but, if expansion does implicit > extensions to larger types, then things can easily go wrong. Sure, and in this case there are no implicit extensions to larger types. The bug does require an implicit extension to occur but this only happens at runtime when on 64-bit hardware and the core is not configured to be limited to a 32-bit address space which is always true for a 64-bit kernel as it turns out. > > I'm still not sure if there is really a bug. Should reassoc not be doing > > this for 'sizetype'? Should ivopts not have created the mess in the first > > place? Would changing either of these actually introduce an assurance that > > this situation won't occur from other optimisations? > > sizetype is unsigned so all the transformations looks valid. I'm leaning heavily towards not-a-compiler-bug but may need to offer a workaround until we can solve this in the Linux kernel. The only workaround possible is to provide an option to disable register+index addressing i.e. remove [ls][wd]xc1 instructions.
[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176 --- Comment #7 from Matthew Fortune --- (In reply to Eric Botcazou from comment #6) > > The issue may stem from the C front end where the dumps start off as below. > > Note that the '-1' in kappa-1 has ended up being represented as 1073741823 > > (0x3fff) presumably owing to the fact it will be multiplied by 4 > > eventually and hence be 'safe'. > > All pointer calculations are done in sizetype and it is unsigned. This kind > of issues generally come from the RTL level, maybe expansion here. The expansion looks like an acceptable transformation to me i.e. it is not introducing the overflow for the offending pointer just maintaining what is already in the tree. I dug back to reassoc2 based on Andrew's comment which gets the following as input: _48 = (sizetype) kappa_6(D); _49 = _48 * 8; _50 = s_37(D) + _49; ivtmp.19_47 = (unsigned int) _50 and then in the loop: _54 = (sizetype) s_37(D); _55 = ivtmp.19_3 - _54; _56 = _55 + 4294967280; # RANGE [0, 4294967295] NONZERO 4294967288 _19 = _56; # PT = nonlocal escaped _20 = _17 + _19; # VUSE <.MEM_4> _21 = *_20 This includes a pointer subtraction of 's' (somewhat pointless but valid I believe). reassoc2 changes it to this: _48 = (sizetype) kappa_6(D); _49 = _48 * 8; _50 = s_37(D) + _49; ivtmp.19_47 = (unsigned int) _50; and in the loop: _54 = (sizetype) s_37(D); _38 = -_54; _25 = 4294967280 - _54; _56 = _25 + ivtmp.19_3; # RANGE [0, 4294967295] NONZERO 4294967288 _19 = _56; # PT = nonlocal escaped _20 = _17 + _19; # VUSE <.MEM_4> _21 = *_20; And _56 is where we get the overflow. I'm still not sure if there is really a bug. Should reassoc not be doing this for 'sizetype'? Should ivopts not have created the mess in the first place? Would changing either of these actually introduce an assurance that this situation won't occur from other optimisations?
[Bug target/78176] [MIPS] miscompiles ldxc1 with large pointers on 32-bits
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=78176 Matthew Fortune changed: What|Removed |Added CC||matthew.fortune at imgtec dot com --- Comment #5 from Matthew Fortune --- I'm struggling to see what can be fixed in GCC for this (and I'm not sure I follow why there is a problem with GCC's behaviour) but... The issue may stem from the C front end where the dumps start off as below. Note that the '-1' in kappa-1 has ended up being represented as 1073741823 (0x3fff) presumably owing to the fact it will be multiplied by 4 eventually and hence be 'safe'. I can't figure out so far what creates this value but the behavior started some time between gcc 4.7 and 4.8 I believe. The questionably bad code then shows up dependent on optimisations and instruction availability. If the calculation had been done with ADDU instead of in the LDXC1 then all would be well. Regardless of whether this is a real bug for the compiler then the Linux kernel should be making use of the status.UX bit in order to properly emulate a 32-bit environment for o32 on 64-bit kernel/hardware and had this been run using a 32-bit kernel (on 64-bit hardware) then it would run correctly. There is an issue for N32 code with the UX bit in that you also need the PX bit available to independently control 64-bit instruction availability vs address space but this is only missing in MIPSIV cores. ;; Function ldxc1_test (null) ;; enabled by -tree-original { int kappa2 = kappa; double tmp = 0.0; int kappa2 = kappa; double tmp = 0.0; :; kappa-- ; if (zeros + 1 < kappa) { tmp = *(*(r->rows + ((sizetype) kappa + 1073741823) * 4) + ((sizetype) kappa + 536870911) * 8) * ctt; tmp = ldexp (tmp, *(expo + ((sizetype) kappa + 1073741823) * 4) - *(expo + (sizetype) ((unsigned int) kappa2 * 4))); } if (zeros + 1 < kappa && *(s + ((sizetype) kappa + 536870911) * 8) <= tmp) goto ; else goto ; :; return tmp;
[Bug java/71917] [7 regression] libjava.jar/ReturnProxyTest.jar etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71917 --- Comment #12 from Matthew Fortune --- Created attachment 39593 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39593=edit Proposed fix Attached fix should resolve the issue on sparc64 BE. The original attempt at the fix for mips64el is reverted and instead the rvalue conversion for the native FFI interface now copes with 64-bit little endian targets ensuring sign/zero extension is performed appropriately. The fix is for all architectures but presumably others do not suffer if the sign/zero extension is missing (or have not noticed).
[Bug java/71917] [7 regression] libjava.jar/ReturnProxyTest.jar etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71917 --- Comment #10 from Matthew Fortune --- (In reply to Eric Botcazou from comment #9) > > I'll certainly check on this but I did run the fix on both big and little > > endian MIPS which seems to suggest there isn't a double adjustment overall. > > So this was broken in 64-bit big-endian too before your fix? Ah, I did BE/LE 32-bit and LE 64-bit. The bug was seen on LE 64-bit. I'm getting my head around this now! The conversion functions will need to perform the 32-bit to 64-bit sign extension but do nothing for the reverse. I think this means that only the raw-to-rvalue needs updating based on the function names. I'll give it a go.
[Bug java/71917] [7 regression] libjava.jar/ReturnProxyTest.jar etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71917 --- Comment #8 from Matthew Fortune --- (In reply to Eric Botcazou from comment #7) > > 2016-07-13 Matthew Fortune> > > > * interpret-run.cc: Use ffi_arg for FFI integer return types. > > so we now have a double adjustment on 64-bit big-endian and this breaks. > Maybe there is something missing for little-endian MIPS in java_raw_api.c. I'll certainly check on this but I did run the fix on both big and little endian MIPS which seems to suggest there isn't a double adjustment overall. The layers involved in this have however exceeded what I can fit in my head so I'll have to draw things out to try and understand what was/should-be/is happening. Is there any definition for the raw API? I've seen some reference to arguments being passed 'in the way a java stack works' but am not sure how that relates to FFI return values passed in registers.
[Bug target/68273] [5/6/7 Regression] Wrong code on mips/mipsel due to (invalid?) peeking at alignments in function_arg.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68273 --- Comment #37 from Matthew Fortune --- (In reply to rguent...@suse.de from comment #36) > On Fri, 3 Jun 2016, matthew.fortune at imgtec dot com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68273 > > > > --- Comment #35 from Matthew Fortune --- > > (In reply to Aurelien Jarno from comment #33) > > > (In reply to Hector Oron from comment #32) > > > > (In reply to Richard Biener from comment #31) > > > > > eipa_sra introduces the remaining SSA name with non-default alignment > > > > > via > > > > > > > > [PATCH] > > > > > > > > > > For the record, Debian now have the patch from comment #32 in its version > > > of > > > GCC 5 and 6. This fixes a few more issues than reported here, in addition > > > to > > > graphviz, it also fixes subversion and jq. > > > > Richard: Is the updated patch from comment #32 in-keeping with your original > > changes in comment #20? If so then I'll post it on gcc-patches. > > > > I do accept that there are MIPS fixes to be made in this area but I'd like > > to > > get the existing code to 'work' again first if possible. > > Note that the tree-ssa.c changes from comment#32 do not work as-is > (on the branch you don't notice that because checking is disabled). > Even the relaxed variant that we don't have over-aligned types in the IL > breaks AFAIR. > > Given the analysis in comment #31 I'd rather patch up either > turn_representatives_into_adjustments or ipa_modify_formal_parameters > (which already seems to have code to avoid under-aligned types). > > That is, does > > Index: gcc/ipa-prop.c > === > --- gcc/ipa-prop.c (revision 237053) > +++ gcc/ipa-prop.c (working copy) > @@ -3910,7 +3909,7 @@ ipa_modify_formal_parameters (tree fndec > if (is_gimple_reg_type (ptype)) > { > unsigned malign = GET_MODE_ALIGNMENT (TYPE_MODE > (ptype)); > - if (TYPE_ALIGN (ptype) < malign) > + if (TYPE_ALIGN (ptype) != malign) > ptype = build_aligned_type (ptype, malign); > } > } > > fix the issue as well? Sorry for the very slow reply, got busy with libjava bugs. This does fix jq as well so it looks good. I haven't done a GCC testsuite run with the change in place yet though.
[Bug java/71917] [7 regression] libjava.jar/ReturnProxyTest.jar etc. FAIL
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71917 Matthew Fortune changed: What|Removed |Added CC||matthew.fortune at imgtec dot com --- Comment #2 from Matthew Fortune --- Hi Rainer, Sorry for the bugs, I am however tempted to say this will turn out to be a libffi bug. You have described the same failure mode as I fixed for MIPS but libjava is now following the ffi return type rules so my assumption is that it is sparc ffi that is not. I've had a quick read of sparc ffi code and it seems there is a chance that integer return types are not being promoted to word size. I.e. following the ffi rule that integers smaller than a word are returned as type ffi_arg. Do you know if the two new testcases fail if built against a gij built without my changes to java/lang/reflect/natVMProxy.cc and interpret-run.cc? I suspect they may actually pass before but for the wrong reasons. Does 64-bit sparc have problems with other codebases using ffi? libguile, python being some notable examples? Matthew
[Bug target/68273] [5/6/7 Regression] Wrong code on mips/mipsel due to (invalid?) peeking at alignments in function_arg.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68273 --- Comment #35 from Matthew Fortune --- (In reply to Aurelien Jarno from comment #33) > (In reply to Hector Oron from comment #32) > > (In reply to Richard Biener from comment #31) > > > eipa_sra introduces the remaining SSA name with non-default alignment via > > > > [PATCH] > > > > For the record, Debian now have the patch from comment #32 in its version of > GCC 5 and 6. This fixes a few more issues than reported here, in addition to > graphviz, it also fixes subversion and jq. Richard: Is the updated patch from comment #32 in-keeping with your original changes in comment #20? If so then I'll post it on gcc-patches. I do accept that there are MIPS fixes to be made in this area but I'd like to get the existing code to 'work' again first if possible.
[Bug rtl-optimization/65862] [MIPS] IRA/LRA issue: integers spilled to floating-point registers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65862 --- Comment #6 from Matthew Fortune matthew.fortune at imgtec dot com --- (In reply to Robert Suchanek from comment #5) I am not sure, that the result code is better as we access memory 3 times instead of access to $f20. On one hand, yes, it seems good but it's not always desirable to use FP regs until absolutely necessary. For instance, compiling the dynamic linker that uses FP regs does not seem to be right. On the costs front then while it is true that moves between FPR and GPR are usually cheaper than moving to memory and back there is a secondary cost which is that simply turning on the FPU is costly. So, the reason for using FPRs needs to be that the floating point instructions are used rather than registers. Ideally we would not spill to FPRs unless there has been actual floating point code used, this suggests it would be good to set the cost of GPR-FPR higher than memory if no floating point code is present in the function. Otherwise if the FPU is in use anyway then using FPRs as scratch/spill for integer mode data is fine and the costs can be lower than memory. Matthew
[Bug target/64569] New: [MIPS] Unable to build soft-float in conjunction with binutils 2.25
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64569 Bug ID: 64569 Summary: [MIPS] Unable to build soft-float in conjunction with binutils 2.25 Product: gcc Version: 4.9.3 Status: UNCONFIRMED Severity: major Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: matthew.fortune at imgtec dot com binutils 2.25 for MIPS includes more aggressive checks on ABI usage and in particular warns when given code that claims to be soft-float via .gnu_attribute 4,3 but is assembled without -msoft-float. binutils 2.25 will also infer an ABI from the floating-point related command line options. No GCC compiler drivers prior to GCC 5.0 pass the -msoft-float option through to the assembler which means that the assembler is given an inconsistent view of the intended ABI. For compiled code the issue will show itself as a warning from the assembler: Warning: .gnu_attribute 4,3 requires `softfloat' For hand-crafted assembler modules assembled via the compiler driver then the ABI inferred by the assembler will be hard-float instead of soft-float owing to the missing -msoft-float option. A combination of these two problems will then lead to a link failure stating that incompatible ABIs are in use. A number of other relocation errors can also occur as a side effect. (This bug was also filed as a binutils issue but is actually a compiler driver issue: https://sourceware.org/bugzilla/show_bug.cgi?id=17219)
[Bug target/63848] [5.0 regression] FAIL: c-c++-common/torture/builtin-arith-overflow-17.c -O0 execution test
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63848 Matthew Fortune matthew.fortune at imgtec dot com changed: What|Removed |Added CC||matthew.fortune at imgtec dot com --- Comment #1 from Matthew Fortune matthew.fortune at imgtec dot com --- I am seeing multiple failures for the builtin-arith-overflow tests on MIPS64 as well. I have not yet had time to investigate much but it may be due to some missing sign extension and/or performing a range test using bitwise comparison.
[Bug target/51729] dspr2-MULT.c and dspr2-MULTU.c fail for MIPS
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51729 Matthew Fortune matthew.fortune at imgtec dot com changed: What|Removed |Added CC||clm at codesourcery dot com, ||matthew.fortune at imgtec dot com, ||sandra at codesourcery dot com --- Comment #3 from Matthew Fortune matthew.fortune at imgtec dot com --- The affected tests appear to have started passing again since this commit: svn: r211959 2014-06-24 Catherine Moore c...@codesourcery.com Sandra Loosemore san...@codesourcery.com gcc/ * config/mips/mips.c (mips_order_regs_for_local_alloc): Delete. * config/mips/mips.h (ADJUST_REG_ALLOC_ORDER): Delete. * config/mips/mips-protos.h (mips_order_regs_for_local_alloc): Delete. Irritatingly they now pass for -O2 and -O3 but still fail for -O1 and -Os. I was half tempted to submit a patch to just be content that we get multiple accumulators now used at -O2 and -O3 and leave it at that but I think that is probably the wrong attitude. Has anyone else noticed the change in state for these tests and looked into it, planning to look into it or have any thoughts? I'll take a look when I get chance.
[Bug rtl-optimization/58461] [MIPS] Using LRA instead of reload increases code size for mips16
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461 --- Comment #3 from Matthew Fortune matthew.fortune at imgtec dot com --- (In reply to rsand...@gcc.gnu.org from comment #2) I think it'd be wrong for the backend to say that moves between MIPS16 registers and other general registers are more expensive than memory though. I'm not surprised by that comment. I agree. It was just an option I found that avoided the need for changes to IRA. But of course, although the moves are cheap, registers other than $2-$7, $16, $17 and $24 are only useful as spill space. So if it's better for IRA to ignore them and LRA to use them via TARGET_SPILL_CLASS, perhaps we should enforce that directly, by hiding other registers from IRA. I suppose that's like restoring the old cover classes hook, but only as an optional feature. One of the problems here is that IRA is coming up with M16_REGS as the preferred class and GR_REGS as the alternate class for most pseudos. If there is no cover class to translate GR_REGS into then that seems like a problem unless it would work to use M16_REGS as a cover class for GR_REGS. My initial attempt at this problem was to stop IRA from using the superunion of preferred and alternate classes for an allocno and instead just using the preferred class. This has the same effect as the change to register move costs which was to get the alternate class to be NO_REGS (or ignored). I realise this isn't the point of the bug report or the attachment, but just FWIW: the constraints shouldn't be matching the fake FRAME_POINTER_REGNUM. They should wait for it to be eliminated to either STACK_POINTER_REGNUM or HARD_FRAME_POINTER_REGNUM and match that. I'll look at that when putting an 'enable LRA' patch together. There's lots of testing to do yet.
[Bug rtl-optimization/58461] [MIPS] Using LRA instead of reload increases code size for mips16
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461 --- Comment #1 from Matthew Fortune matthew.fortune at imgtec dot com --- Created attachment 30853 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30853action=edit Patch to enable LRA for mips16
[Bug rtl-optimization/58461] New: [MIPS] Using LRA instead of reload increases code size for mips16
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58461 Bug ID: 58461 Summary: [MIPS] Using LRA instead of reload increases code size for mips16 Product: gcc Version: 4.9.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: matthew.fortune at imgtec dot com Created attachment 30852 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=30852action=edit Test case to trigger LRA reload issue While working on enabling LRA for MIPS (mips16 in particular) I have observed that LRA is not producing as optimal code as classic reload. The underlying cause of this is that the register allocation decisions made in IRA were sub-optimal but classic reload 'fixes' them whereas LRA does not. Regardless of fixing the IRA issues there is probably something to fix in LRA. I have attached a patch that applies to top of tree to enable LRA for mips/mips16 and exposes two options to demonstrate how LRA differs from classic reload. I have also attached a test case (reload_test_mips16.i) which is a function from libjpeg. The two options added by the patch are -mreload and -mfix-regalloc. LRA is default on with the patch applied: * mips-sde-elf-gcc -Os -mips16 -march=m4kec reload_test_mips16.i LRA introduces a number of reloads that involve $24 which is inaccessible to most mips16 instructions leading to an increase in code size. * mips-sde-elf-gcc -Os -mips16 -march=m4kec -mreload ... Classic reload manages to avoid the reloads of $24 as its reloads converge to use the same reload register and eliminate $24 altogether. * mips-sde-elf-gcc -Os -mips16 -march=m4kec -mfix-regalloc ... LRA now outperforms classic reload as the initial register allocation by IRA is better so LRA does not hit the problem I am reporting. The original register allocation from IRA is: Disposition: 26:r245 l0 22:r246 l0 4 28:r249 l0 23:r250 l016 4:r251 l0175:r252 l0 86:r253 l0 9 19:r260 l011 15:r266 l024 12:r275 l0 3 11:r276 l0 2 10:r278 l010 27:r281 l0 47:r282 l0 58:r283 l0 61:r284 l0 7 29:r285 l0 mem 24:r286 l024 25:r287 l0 2 23:r288 l024 22:r289 l024 21:r290 l024 20:r291 l024 18:r292 l024 17:r293 l011 16:r294 l011 14:r295 l011 13:r296 l024 9:r297 l0240:r298 l024 The fixed register allocation from IRA is as follows, note the mems instead of hard registers 8-11,24: Disposition: 26:r245 l0 22:r246 l0 4 28:r249 l0 23:r250 l0 mem 4:r251 l0 mem5:r252 l0166:r253 l017 19:r260 l0 mem 15:r266 l0 mem 12:r275 l0 3 11:r276 l0 2 10:r278 l0 mem 27:r281 l0 47:r282 l0 58:r283 l0 61:r284 l0 7 29:r285 l0 mem 24:r286 l0 mem 25:r287 l0 2 23:r288 l0 mem 22:r289 l0 mem 21:r290 l0 mem 20:r291 l0 mem 18:r292 l0 mem 17:r293 l0 mem 16:r294 l0 mem 14:r295 l0 mem 13:r296 l0 mem 9:r297 l0240:r298 l024 So the issue (I believe) is that reloads from LRA do not converge as well as reloads introduced by classic reload. While this example can (and should) be fixed in the original register allocation it feels as though there is a problem to fix in LRA. == [mfortune@mfortune-linux lra_bugreport]$ /althome/mips/tk/bin/mips-sde-elf-gcc -v Using built-in specs. COLLECT_GCC=/althome/mips/tk/bin/mips-sde-elf-gcc COLLECT_LTO_WRAPPER=/althome/mips/tk/libexec/gcc/mips-sde-elf/4.9.0/lto-wrapper Target: mips-sde-elf Configured with: /althome/mips/git_br/gcc/configure --prefix=/althome/mips/tk --target=mips-sde-elf --with-gnu-as --with-gnu-ld --with-arch=mips32r2 --with-mips-plt --with-synci --with-llsc --with-newlib target_alias=mips-sde-elf --enable-languages=c,c++,lto Thread model: single gcc version 4.9.0 20130918 (experimental) (GCC)