[Bug target/39635] [avr] integer wrong code bug
--- Comment #5 from hutchinsonandy at aim dot com 2009-09-13 16:14 --- It looks like most of the AVR shifts/rotates are messed up. For the case where we have non-constant shifts, the peephole may grab a scratch register. In this case it looks like it grabs one that is free afterwards and not before, hence the overlap issue. The rotate split pattern problem is different, as noted in the message links. In this case it is not apparent why the split is used only for different source/destination registers. If the split were constrained so that src=dest, the overlap would be much easier to handle and it would seemingly produce better code for the common case where src=dest. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39635
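The src=dest case described in the comment above can be illustrated in plain C. This is only a sketch of why an in-place byte rotate gets by with a single scratch byte; it is not the AVR split pattern itself. The name `rotate_left_8_in_place` is hypothetical, and the value is assumed to be held as little-endian bytes, one per 8-bit register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a little-endian multi-byte value left by 8 bits, in place.
   Hypothetical helper: models the src == dest case, where one scratch
   byte is enough to avoid overwriting a byte before it has been read. */
void rotate_left_8_in_place(uint8_t *bytes, size_t len)
{
    assert(bytes != NULL && len > 0);
    uint8_t scratch = bytes[len - 1];   /* top byte wraps around to the bottom */
    for (size_t i = len - 1; i > 0; i--)
        bytes[i] = bytes[i - 1];        /* move each byte one position up */
    bytes[0] = scratch;
}
```

With distinct source and destination registers that partially overlap, the safe copy order depends on how they overlap, which is the harder case the comment refers to.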
[Bug target/36494] Char arrays gets corrupted in avr programs.
--- Comment #4 from hutchinsonandy at aim dot com 2008-06-11 22:05 --- I'm sure Eric will weigh in again to verify that the code posted executes correctly (it looks correct to me). I suspect you have some config or memory issue. For example, unoptimized, the string is stored at locations 0x100 and 0x101. So if this gets trashed (by another part of the software), the result will be 2. Also watch your stack space. Unoptimized, it will be large and overflow data areas. The optimization, of course, will bypass such problems, and that works. This also shows that the front end and middle end of gcc are ok. The code contains no tricky or seldom-used instructions, so it appears highly unlikely to be a problem with the back end, which is specific to avr. If Eric confirms this he may mark the bug as invalid, but you can pursue a solution on other forums (such as http://lists.nongnu.org/mailman/listinfo/avr-gcc-list) -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36494
[Bug target/36336] ICE push_reload - psuedo reg_equiv_constant
--- Comment #5 from hutchinsonandy at aim dot com 2008-06-06 20:18 --- Subject: Re: ICE push_reload - psuedo reg_equiv_constant
The patch for http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31786 removes one problematic part of LEGITIMIZE_RELOAD_ADDRESS. This is applied to WinAVR 20080512 (patched 4.3.0), but is still waiting for approval on avr-gcc 4.3/4.4 HEAD. Even with that patch, the other parts of L_R_A are bad and need fixing with an added check of reg_equiv_constant. So if the register is equivalent to a constant, LEGITIMIZE_RELOAD_ADDRESS should do nothing. The assert that triggers is an explicit check for this. I have not posted a patch due to overlap with the pending patch. BTW the gcc list has a short discussion on this bug, including an explanation for the AVR code. Andy
-Original Message- From: eric dot weddington at atmel dot com Sent: Fri, 6 Jun 2008 3:48 pm Subject: [Bug target/36336] ICE push_reload - psuedo reg_equiv_constant
--- Comment #4 from eric dot weddington at atmel dot com 2008-06-06 19:48 --- Test case passes with -O[0123s], with WinAVR 20080512 (patched 4.3.0).
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36336
[Bug target/36336] ICE push_reload - psuedo reg_equiv_constant
--- Comment #3 from hutchinsonandy at aim dot com 2008-06-06 19:42 --- Subject: Re: ICE push_reload - psuedo reg_equiv_constant
O2
-Original Message- From: eric dot weddington at atmel dot com Sent: Fri, 6 Jun 2008 3:24 pm Subject: [Bug target/36336] ICE push_reload - psuedo reg_equiv_constant
--- Comment #2 from eric dot weddington at atmel dot com 2008-06-06 19:24 --- Andy, I'm having difficulty trying to reproduce this bug. I use this command line:
  avr-gcc -O1 -mmcu=atmega128 -w -std=gnu99 -c memcpy-chk.c -o memcpy-chk.o
But I'm using WinAVR 20080512, which is patched, and it does not give an ICE. Are you also getting this ICE with HEAD?
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36336
[Bug middle-end/36447] simplify_subreg ICE with right shift more than length type AVR
--- Comment #3 from hutchinsonandy at aim dot com 2008-06-06 11:55 --- Subject: Re: simplify_subreg ICE with right shift more than length type AVR
Thanks for the quick response. I will give this a try and no doubt it will work. I was trying to think of how the other case should be simplified, rather than left as a shift. Or, put another way, how should we take the sign? Andy
-Original Message- From: bonzini at gnu dot org Sent: Fri, 6 Jun 2008 1:04 am Subject: [Bug middle-end/36447] simplify_subreg ICE with right shift more than length type AVR
--- Comment #2 from bonzini at gnu dot org 2008-06-06 05:04 --- Can you try and possibly submit this patch:
Index: /Users/bonzinip/cvs/gcc/gcc/simplify-rtx.c
===
--- /Users/bonzinip/cvs/gcc/gcc/simplify-rtx.c (revision 134435)
+++ /Users/bonzinip/cvs/gcc/gcc/simplify-rtx.c (working copy)
@@ -5247,6 +5247,7 @@ simplify_subreg (enum machine_mode outer
       && GET_MODE_BITSIZE (innermode) >= (2 * GET_MODE_BITSIZE (outermode))
       && GET_CODE (XEXP (op, 1)) == CONST_INT
       && (INTVAL (XEXP (op, 1)) & (GET_MODE_BITSIZE (outermode) - 1)) == 0
+      && INTVAL (XEXP (op, 1)) < GET_MODE_BITSIZE (innermode)
       && byte == subreg_lowpart_offset (outermode, innermode))
     {
       int shifted_bytes = INTVAL (XEXP (op, 1)) / BITS_PER_UNIT;
Thanks!
-- bonzini at gnu dot org changed:
  What                 |Removed        |Added
  Status               |UNCONFIRMED    |NEW
  Ever Confirmed       |0              |1
  Last reconfirmed date|-00-00 00:00:00|2008-06-06 05:04:36
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36447
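The guard in the patch above can be modelled outside GCC as a small predicate. This is a sketch only: `shift_is_word_extraction` is a hypothetical name, sizes are plain bit counts rather than machine modes, and the `byte == subreg_lowpart_offset (...)` test is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the guard, with all sizes given in bits.
   The last condition corresponds to the line added by the patch:
   a shift count at or beyond the inner width is not a word extraction. */
bool shift_is_word_extraction(unsigned inner_bits, unsigned outer_bits,
                              unsigned shift_count)
{
    /* The mask test below only works for power-of-two outer sizes. */
    assert(outer_bits > 0 && (outer_bits & (outer_bits - 1)) == 0);
    return inner_bits >= 2 * outer_bits
        && (shift_count & (outer_bits - 1)) == 0
        && shift_count < inner_bits;
}
```

For the failing test case, a 16-bit inner value with an 8-bit outer part and a shift count of 24, the first two conditions hold and only the added third one rejects the simplification.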
[Bug middle-end/36447] simplify_subreg ICE with right shift more than length type AVR
--- Comment #1 from hutchinsonandy at aim dot com 2008-06-06 03:08 --- rev 132971 appears to have created this problem.
  Revision: 132971
  Author: bonzini
  Date: 8:30:10 AM, Thursday, March 06, 2008
  Message: 2008-03-06 Paolo Bonzini <[EMAIL PROTECTED]> * simplify-rtx.c (simplify_subreg): Remove useless shifts from word-extractions out of a multi-word object.
  Modified : /trunk/gcc/ChangeLog
  Modified : /trunk/gcc/simplify-rtx.c
It fails because subreg simplification tries to extract byte 3 of the HImode int 'a' at simplify-rtx.c:5271. No check is performed here to see if the shift count >= length. For this case, simplification should give the sign of the value (-1 or 0). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36447
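The "sign of the value (-1 or 0)" result suggested above can be written down as a small C model. This is a sketch of the intended arithmetic, not GCC code: `ashiftrt16_model` is a hypothetical name, a 16-bit int as on AVR is assumed, and the oversized count is handled explicitly because the C `>>` operator is undefined for counts at or above the type width.

```c
#include <assert.h>
#include <stdint.h>

/* Model of an arithmetic right shift on a 16-bit value.
   For counts >= 16 every value bit is shifted out and only the
   sign remains, so the result is -1 or 0. */
int16_t ashiftrt16_model(int16_t value, unsigned count)
{
    if (count >= 16)
        return (int16_t)(value < 0 ? -1 : 0);
    if (value >= 0)
        return (int16_t)(value >> count);
    /* Avoid shifting a negative value directly: complement, shift, complement. */
    return (int16_t)~(~(int)value >> count);
}

/* Small demonstration using the shift count from the failing test case. */
void ashiftrt16_model_demo(void)
{
    assert(ashiftrt16_model(-2, 24) == -1);
    assert(ashiftrt16_model(1234, 24) == 0);
}
```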
[Bug c/36447] New: simplify_subreg ICE with right shift more than length type AVR
Works on avr-gcc (GCC) 4.2.2 (WinAVR 20071221). Does not on 4.4 HEAD. Posted test results show this test failing since AT LEAST SVN rev 132993 on AVR (March 3 2008); before that the test was not run, so I don't know when it started.
gcc.c-torture/unsorted/shm.c
  foo (int *p) { int a = *p; return a >> 24; }
/home/hutchia/Desktop/gcc/gcc/testsuite/gcc.c-torture/unsorted/shm.c:5: internal compiler error: in simplify_subreg, at simplify-rtx.c:4962
test.c: In function 'foo':
test.c:4: warning: right shift count >= width of type
Analyzing compilation unit
Performing interprocedural optimizations
Assembling functions: foo
Breakpoint 1, fancy_abort (file=0xb030f2 "../../gcc/gcc/simplify-rtx.c", line=4962, function=0xb03158 "simplify_subreg") at ../../gcc/gcc/diagnostic.c:654
654 internal_error ("in %s, at %s:%d", function, trim_filename (file), line);
(gdb) where
#0 fancy_abort (file=0xb030f2 "../../gcc/gcc/simplify-rtx.c", line=4962, function=0xb03158 "simplify_subreg") at ../../gcc/gcc/diagnostic.c:654
#1 0x007a1143 in simplify_subreg (outermode=QImode, op=0x7ff24f20, innermode=HImode, byte=3) at ../../gcc/gcc/simplify-rtx.c:4937
#2 0x007a1672 in simplify_gen_subreg (outermode=QImode, op=0x7ff24f20, innermode=HImode, byte=3) at ../../gcc/gcc/simplify-rtx.c:5287
#3 0x007a0b83 in simplify_subreg (outermode=QImode, op=0x7ff25100, innermode=HImode, byte=0) at ../../gcc/gcc/simplify-rtx.c:5271
#4 0x007a1672 in simplify_gen_subreg (outermode=QImode, op=0x7ff25100, innermode=HImode, byte=0) at ../../gcc/gcc/simplify-rtx.c:5287
#5 0x0098f86a in propagate_rtx_1 (px=0x22caf4, old=0x4, new=0x7ff25100, flags=2) at ../../gcc/gcc/fwprop.c:333
#6 0x009907da in forward_propagate_into (use=0x0) at ../../gcc/gcc/fwprop.c:457
#7 0x00990bc2 in fwprop () at ../../gcc/gcc/fwprop.c:1055
#8 0x00624a0e in execute_one_pass (pass=0x0) at ../../gcc/gcc/passes.c:1292
#9 0x00624b33 in execute_pass_list (pass=0xa8c320) at ../../gcc/gcc/passes.c:1342
#10 0x00624b46 in execute_pass_list (pass=0xa8c5c0) at ../../gcc/gcc/passes.c:1343
#11 0x0084d899 in tree_rest_of_compilation (fndecl=0x7fdbf1f0) at ../../gcc/gcc/tree-optimize.c:421
#12 0x00625d4b in cgraph_expand_function (node=0x7ff40280) at ../../gcc/gcc/cgraphunit.c:1148
#13 0x0062793e in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1211
#14 0x0041bee7 in c_write_global_declarations () at ../../gcc/gcc/c-decl.c:8062
#15 0x0062c95b in toplev_main (argc=3, argv=0x1c014c0) at ../../gcc/gcc/toplev.c:976
#16 0x0049467a in main (argc=3, argv=0x1c014c0) at ../../gcc/gcc/main.c:35
--
Summary: simplify_subreg ICE with right shift more than length type AVR
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com
GCC build triplet: i686-pc-cygwin
GCC host triplet: i686-pc-cygwin
GCC target triplet: avr-unknown-none
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36447
[Bug target/27386] AVR: wrong code generated when passing three uint64_t arguments to function
--- Comment #14 from hutchinsonandy at aim dot com 2008-06-01 15:22 --- It appears emit_single_push_insn() is BROKEN for targets with:
a) STACK_GROWS_DOWNWARD + POST_DEC push
b) Upwards + POST_INC push.
So if any target has this combo and #defines PUSH_ROUNDING, it is broken. Fortunately for AVR the whole mess goes away when we #undef PUSH_ROUNDING, which so far appears to be unnecessary. It also cleared some compat test failures! For reference, here is what I did sort out for emit_single_push_insn. There are two problems.
1) For downwards padding, there is a MISTAKE in the original (H8) change that, for (a), adds the size of the slot to SP to get the address. That would be the highest byte of the mem! This crashes AVR.
  /* We have already decremented the stack pointer, so get the previous value. */
  offset += (HOST_WIDE_INT) rounded_size;
I AM VERY WRONG
POST_DEC leaves the stack pointer at BOS-1. We must add the smallest addressable unit of the stack (byte, word?) to get the address of the MEM. Or perhaps use STACK_POINTER_OFFSET. So the above becomes:
  offset += (HOST_WIDE_INT) STACK_POINTER_OFFSET;
2) For upwards padding, both POST_INC and POST_DEC have PRE_MODIFY sequences created. This is questioned in the code and is clearly incorrect. I assume no target has this combo. One possible way to fix the issue is to use POST_MODIFY. However, that would assume the final instructions would not be split and cause stack corruption during an interrupt. This matter I have not checked. Someone might consider adding some asserts here! -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27386
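The address arithmetic described in problem 1 can be checked with a toy model of a downward-growing, post-decrement stack. This is a sketch under stated assumptions: the structure and function names are hypothetical, memory is a 64-byte array, and the offset between SP and the first used byte is taken to be 1, standing in for STACK_POINTER_OFFSET.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for the model only. */
enum { MODEL_STACK_SIZE = 64, MODEL_STACK_POINTER_OFFSET = 1 };

struct model_stack {
    uint8_t memory[MODEL_STACK_SIZE];
    unsigned sp;    /* index of the next free byte (post-decrement convention) */
};

void model_stack_init(struct model_stack *stack)
{
    memset(stack->memory, 0, sizeof stack->memory);
    stack->sp = MODEL_STACK_SIZE - 1;
}

/* Reserve `size` bytes by moving SP first, then copy the object in,
   as a target without a push instruction would. Returns the index of
   the lowest byte of the stored object. */
unsigned model_push_block(struct model_stack *stack, const void *object,
                          unsigned size)
{
    assert(size > 0 && size <= stack->sp);
    stack->sp -= size;                      /* SP now sits just below the object */
    unsigned address = stack->sp + MODEL_STACK_POINTER_OFFSET;
    memcpy(&stack->memory[address], object, size);
    return address;
}
```

Starting from SP = 63, pushing a 4-byte object leaves SP = 59 and the object in bytes 60..63. Adding the rounded size to SP instead of the offset would give 63, the highest byte of the object, which matches the mistake described in the comment.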
[Bug target/27386] AVR: wrong code generated when passing three uint64_t arguments to function
--- Comment #13 from hutchinsonandy at aim dot com 2008-06-01 02:40 --- expr.c appears all messed up in emit_single_push_insn. This bad code gets executed when there is no push instruction available. As well as getting the address of the mem created completely wrong, it does not account for any offset between SP and the top/bottom of the stack, aka STACK_POINTER_OFFSET. Any comment before I try and fix this mess? First example, ironically without the warning mentioned in later code.
  else if (FUNCTION_ARG_PADDING (mode, type) == downward)
    {
      unsigned padding_size = rounded_size - GET_MODE_SIZE (mode);
      HOST_WIDE_INT offset;
      emit_move_insn (stack_pointer_rtx,
                      expand_binop (Pmode,
  #ifdef STACK_GROWS_DOWNWARD
                                    sub_optab,
  #else
                                    add_optab,
  #endif
                                    stack_pointer_rtx,
                                    GEN_INT (rounded_size),
                                    NULL_RTX, 0, OPTAB_LIB_WIDEN));
      offset = (HOST_WIDE_INT) padding_size;
  #ifdef STACK_GROWS_DOWNWARD
      if (STACK_PUSH_CODE == POST_DEC)
        /* We have already decremented the stack pointer, so get the previous value. */
        ///NEXT LINE IS WRONG We are pointing just below value so we need SP + STACK_POINTER_OFFSET
        offset += (HOST_WIDE_INT) rounded_size;
      //For PRE_DEC we already point directly to mem so code OK
  #else
      if (STACK_PUSH_CODE == POST_INC)
        /* We have already incremented the stack pointer, so get the previous value. */
        //NEXT LINE IS CORRECT
        offset -= (HOST_WIDE_INT) rounded_size;
      //For PRE_INC we now add STACK_POINTER_OFFSET or SP will be one lower than mem address
  #endif
      dest_addr = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offset));
    }
  else
The rest of the code is even worse!
-- hutchinsonandy at aim dot com changed:
  What|Removed|Added
  CC  |       |hutchinsonandy at aim dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27386
[Bug testsuite/36285] gcc.dg/compat/struct-by-value-xxx improper test for AVR target
--- Comment #1 from hutchinsonandy at aim dot com 2008-06-01 01:02 --- I have reduced the number of failures slightly by setting higher optimisation and skipping complex int using:
  set COMPAT_SKIPS [list {VA} {COMPLEX_INT}]
  set COMPAT_OPTIONS [list [list {-Os -mcall-prologues} {-Os -mcall-prologues}]]
But complex float, double and long double are not avoidable and take way too much code size to link. Additionally, there appears to be no way of skipping these tests or even marking xfail for the link/run stages.
-- hutchinsonandy at aim dot com changed:
  What|Removed|Added
  CC  |       |janis187 at us dot ibm dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36285
[Bug testsuite/36285] New: gcc.dg/compat/struct-by-value-xxx improper test for AVR target
Most of the 21 variants of this test fail for the AVR target. The issue noted appears to be excessive memory need and thus failure at link time. For example:
PASS: gcc.dg/compat/struct-by-value-11 c_compat_y_tst.o compile
Executing on host: /home/hutchia/Desktop/awhconf/gcc/xgcc -B/home/hutchia/Desktop/awhconf/gcc/ c_compat_main_tst.o c_compat_x_tst.o c_compat_y_tst.o -DSTACK_SIZE=2048 -DNO_TRAMPOLINES -fno-show-column -DSIGNAL_SUPPRESS -mmcu=atmega128 /home/hutchia/Desktop/dejagnuboards/exit.c -Wl,-u,vfprintf -lprintf_flt -Wl,-Tbss=0x802000,--defsym=__heap_end=0x80 -lm -o gcc-dg-compat-struct-by-value-11-01 (timeout = 300)
/home/hutchia/local/avr/lib/gcc/avr/4.4.0/../../../../avr/bin/ld: gcc-dg-compat-struct-by-value-11-01 section .text will not fit in region text
/home/hutchia/local/avr/lib/gcc/avr/4.4.0/../../../../avr/bin/ld: region text overflowed by 353496 bytes
compiler exited with status 1
output is:
/home/hutchia/local/avr/lib/gcc/avr/4.4.0/../../../../avr/bin/ld: gcc-dg-compat-struct-by-value-11-01 section .text will not fit in region text
/home/hutchia/local/avr/lib/gcc/avr/4.4.0/../../../../avr/bin/ld: region text overflowed by 353496 bytes
--
Summary: gcc.dg/compat/struct-by-value-xxx improper test for AVR target
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: testsuite
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: avr-unknown-none
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36285
[Bug testsuite/36284] gcc.dg-struct-layout fails AVR target - multiple reasons
--- Comment #1 from hutchinsonandy at aim dot com 2008-05-20 22:41 --- Created an attachment (id=15658) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15658&action=view) Extract from gcc.log Extract from gcc.log showing failure details. For economy, not all 28 tests are shown. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36284
[Bug testsuite/36284] New: gcc.dg-struct-layout fails AVR target - multiple reasons
gcc/testsuite/gcc/gcc.dg-struct-layout-1 fails multiple times for the AVR target due to a non-portable testcase. This test has 28 generated variants; all fail. Problems include:
1) Assumes ints are 32 bit
   gcc/gcc.dg-struct-layout-1//t001_test.h:119: error: width of 'a'
2) Assumes availability of DF mode
   gcc.dg/compat/vector-defs.h:9: error: unable to emulate 'DF'
3) Some undefined problem - maybe size >32767?
   short_enums30131.c:3: error: size of array 's' is negative
I will post a snippet of the log file to aid correction.
--
Summary: gcc.dg-struct-layout fails AVR target - multiple reasons
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: testsuite
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com
GCC host triplet: i686-pc-linux-gnu
GCC target triplet: avr-unknown-none
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36284
[Bug target/32871] [avr] Bad optimisation - gcc is pushing too many registers
--- Comment #7 from hutchinsonandy at aim dot com 2008-04-28 00:59 --- Attached is an INCOMPLETE attempt to fix this issue. Register saves appear to be ok. But the same function is required for the argument pointer elimination offset. It would appear that DF chain info is not maintained when global.c uses this. So the offset used to access arguments on the stack does not reflect the final value required and will fail. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32871
[Bug target/32871] [avr] Bad optimisation - gcc is pushing too many registers
--- Comment #6 from hutchinsonandy at aim dot com 2008-04-28 00:58 --- Created an attachment (id=15540) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15540&action=view) Partial solution using DF defs. -- hutchinsonandy at aim dot com changed: What|Removed |Added Attachment #15254|0 |1 is obsolete|| http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32871
[Bug target/35860] [4.3 Regression] [avr] code bloat caused by -fsplit-wide-types
--- Comment #8 from hutchinsonandy at aim dot com 2008-04-16 13:10 --- Subject: Re: [4.3 Regression] [avr] code bloat caused by -fsplit-wide-types
Yes, indeed, I have patches in progress for AVR that do split operations to take more advantage of lowering, but the "bug" is still an issue then. For example, if the testcase were using PLUS instead of OR, I would not be able to split the instruction (anything with carried "status" is problematic with reload and, as yet, cannot be split). The problem will merely propagate backwards until it gets blocked by an unsplit wide mode operation (PLUS, COMPARE, SUB, MULT and probably calls). Simply put, it will occur wherever a wide mode value meets a set of subregs. Here it will determine there is a conflict, even if there is not one.
-Original Message- From: steven at gcc dot gnu dot org Sent: Wed, 16 Apr 2008 4:59 am Subject: [Bug target/35860] [4.3 Regression] [avr] code bloat caused by -fsplit-wide-types
--- Comment #7 from steven at gcc dot gnu dot org 2008-04-16 08:59 --- I agree with Paolo in comment #6. One purpose of the lower-subreg pass was to allow backends to *not* define insns that they don't have. The expanders will generate inline code for such patterns at expand time, with sets to subregs. Before GCC had lower-subreg, this would lead to awful code, but now that we split the subregs out to pseudos it ought to work just fine. Sadly, even i386 still hasn't been modified to benefit from this work...
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35860
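The difference between OR and PLUS mentioned above shows up when a 32-bit value is handled as four separate bytes, which is roughly what splitting a wide type on an 8-bit target amounts to. The following is a sketch in plain C with hypothetical function names, not the AVR machine description.

```c
#include <assert.h>
#include <stdint.h>

/* OR on a 32-bit value held as four bytes: every byte is independent,
   so the operation splits cleanly into four 8-bit operations. */
void or32_bytewise(uint8_t dest[4], const uint8_t src[4])
{
    assert(dest && src);
    for (int i = 0; i < 4; i++)
        dest[i] |= src[i];
}

/* PLUS on the same representation: each byte needs the carry produced
   by the byte below it, so the four steps are chained through status. */
void add32_bytewise(uint8_t dest[4], const uint8_t src[4])
{
    assert(dest && src);
    unsigned carry = 0;
    for (int i = 0; i < 4; i++) {
        unsigned sum = (unsigned)dest[i] + src[i] + carry;
        dest[i] = (uint8_t)sum;
        carry = sum >> 8;
    }
}
```

The `carry` variable plays the role of the carried "status": anything that disturbs it between two byte steps breaks the addition, while the OR steps can be reordered or interleaved freely.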
[Bug target/35860] code bloat caused by -fsplit-wide-types
--- Comment #3 from hutchinsonandy at aim dot com 2008-04-09 19:24 --- Subject: Re: code bloat caused by -fsplit-wide-types
Try the fwprop patch, it might well help. I can't tell from the report where the opportunities are missed. But anything split at combine/split won't get any benefit, as fwprop passes only occur before (much to my dismay). Register allocation has a more limited forward propagation ability (it does not simplify, for one) and simplistically will remove one level of redundant moves. If we try to split before combine (expanded RTL), then combine does not work so well and it's a net loss. Combine on split types does not work well as it is not possible to split all instructions (like compare, add). We can't split due to the use of CC0. We use CC0 because I can't figure out how to prevent reloads destroying status. Dang it!
-Original Message- From: eric dot weddington at atmel dot com Sent: Wed, 9 Apr 2008 3:04 pm Subject: [Bug target/35860] code bloat caused by -fsplit-wide-types
--- Comment #2 from eric dot weddington at atmel dot com 2008-04-09 19:04 --- I'll see about testing with Andy Hutchinson's fwprop patch at bug #35542.
-- eric dot weddington at atmel dot com changed:
  What            |Removed                |Added
  CC              |                       |hutchinsonandy at aim dot com, eric dot weddington at atmel dot com
  GCC host triplet|winavr 20080402 release|mingw
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35860
[Bug target/34916] [4.3/4.4 Regression] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os
--- Comment #11 from hutchinsonandy at aim dot com 2008-04-08 17:23 --- Subject: Re: [4.3/4.4 Regression] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os
I believe the rules allow for this after a suitable grace period. Remind me towards the end of the week and I will post for approval. Andy
-Original Message- From: eric dot weddington at atmel dot com Sent: Tue, 8 Apr 2008 11:16 am Subject: [Bug target/34916] [4.3/4.4 Regression] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os
--- Comment #10 from eric dot weddington at atmel dot com 2008-04-08 15:16 --- Andy, since this was a 4.3 regression, is there any way we can backport this and commit it on the 4.3 branch?
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916
[Bug rtl-optimization/35542] [4.3 Regression] fwprop only propagates one operand
--- Comment #6 from hutchinsonandy at aim dot com 2008-04-02 15:44 --- Subject: Re: [4.3 Regression] fwprop only propagates one operand
Eric, it's difficult to give you a specific example as the propagation is very sensitive to generated code. I found this looking at other AVR bugs and discovered it was not working. (I'll look up that example later for you.) It's not an obvious thing, as other (good) changes due to the DF merge rearrange code enough to obscure the omission. But the result is an extra move or redundant instruction here and there. (It's also not at all specific to AVR.) Andy
-Original Message- From: eric dot weddington at atmel dot com Sent: Wed, 2 Apr 2008 10:21 am Subject: [Bug rtl-optimization/35542] [4.3 Regression] fwprop only propagates one operand
--- Comment #4 from eric dot weddington at atmel dot com 2008-04-02 15:21 --- (In reply to comment #3)
> committed to trunk, will backport to 4.3 in due time (causes regressions for
> AVR)
Could you list what fails? Thanks!
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35542
[Bug target/21080] Excecution test failure for avr for pr17377 test case.
--- Comment #3 from hutchinsonandy at aim dot com 2008-03-29 12:55 --- Created an attachment (id=15396) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15396&action=view) Patch to correct return_address
The attached patch fixes this problem and PR21078. The AVR target support for builtin_return_address only returned the value of frame_pointer+1, so it would only be correct if the stack and frame were empty. The attached patch calculates the stack usage in the function prologue. This is placed in the symbol stack_usage using an UNSPEC instruction pattern. Builtin return address uses RETURN_ADDR_RTX(count, tem) to add this to the frame pointer to get to the correct address. This only supports level 0 (same function). Other levels (i.e. upper functions) return 0, which is the correct response if not supported. The address is that read from the stack, i.e. a word address. Testsuite torture/execute/20010122-1.c and PR17377.c both pass with this patch applied. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21080
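For reference, the user-visible side of this is the GCC builtin `__builtin_return_address`. A minimal sketch of a level 0 request, the only level the comment above says the patch supports, might look like the following; `current_return_address` is a hypothetical name, and the code relies on GCC extensions (the builtin and the `noinline` attribute).

```c
#include <assert.h>

/* Returns the address this call will return to. Only level 0 (the
   current function) is requested. noinline keeps the function a real
   call so that it has a return address of its own. */
__attribute__((noinline)) void *current_return_address(void)
{
    void *address = __builtin_return_address(0);
    assert(address != 0);
    return address;
}
```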
[Bug target/34879] __builtin_setjmp / __builtin_longjmp fails stack frame address with O2, O3 and Os
--- Comment #1 from hutchinsonandy at aim dot com 2008-03-29 11:37 --- Created an attachment (id=15395) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15395&action=view) Setjmp patch for AVR
The attached patch is a fix for the AVR target. MIPS does something similar to get around the same issue. The real problem is with the gcc builtin setjmp receiver being removed by optimizers. Optimizers think that the frame_pointer load in the receiver is unneeded and remove it! The patch loads the frame pointer in the nonlocal_goto, making the receiver (where it jumps to) empty, so bad optimization cannot remove it. Additionally, it avoids the unnecessary arithmetic around frame pointer offsets. This patch was tested and the testcase passes. Further changes may be required in the future if AVR 24-bit jumps are to be supported. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34879
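The builtins in question can be exercised with a round trip like the one below. This is a sketch, not the testcase from the bug report: the function names are hypothetical, and it relies on the GCC builtins `__builtin_setjmp` and `__builtin_longjmp`, which use a five-word buffer and require the jump to be made from a different function than the one that called `__builtin_setjmp`.

```c
#include <assert.h>

static void *jump_buffer[5];    /* __builtin_setjmp uses a five-word buffer */

__attribute__((noinline)) static void jump_back(void)
{
    __builtin_longjmp(jump_buffer, 1);    /* the second argument must be 1 */
}

/* Returns 1 when control arrives through the receiver, that is, after
   the longjmp. The code after the receiver still needs a valid frame
   pointer, which is what the removed load was supposed to restore. */
int builtin_setjmp_round_trip(void)
{
    if (__builtin_setjmp(jump_buffer) == 0) {
        jump_back();
        assert(!"jump_back() should not return");
        return 0;
    }
    return 1;
}
```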
[Bug target/35508] [avr] 4.3.0: undefined reference to `__ffshi2'
--- Comment #2 from hutchinsonandy at aim dot com 2008-03-23 00:24 --- Patch posted http://gcc.gnu.org/ml/gcc-patches/2008-03/msg01341.html -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35508
[Bug target/34210] ffs builtin calls undefined __ffshi2
--- Comment #5 from hutchinsonandy at aim dot com 2008-03-23 00:24 --- Patch posted: http://gcc.gnu.org/ml/gcc-patches/2008-03/msg01341.html -- hutchinsonandy at aim dot com changed: What|Removed |Added CC||hutchinsonandy at aim dot ||com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34210
[Bug target/35508] [avr] 4.3.0: undefined reference to `__ffshi2'
--- Comment #1 from hutchinsonandy at aim dot com 2008-03-22 23:51 --- This is same bug as: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34210 working on fix - to be posted soon. -- hutchinsonandy at aim dot com changed: What|Removed |Added CC||hutchinsonandy at aim dot ||com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35508
[Bug target/34932] [avr] ICE in reload
--- Comment #4 from hutchinsonandy at aim dot com 2008-03-21 22:52 --- Created an attachment (id=15357) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15357&action=view) FIX for ICE
This patch disables the instruction pattern that causes the ICE. This pattern is used for the case of addition where both operands are zero_extended. Since zero extension of this type must still load a zero into one register anyway, there appears to be no benefit from this pattern over separate patterns for addhi3_zero_extend1 and zero_extend. So rather than trying to get reload to figure it out, the problem instruction can be removed. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34932
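At the source level, the case described above is a 16-bit addition whose two operands both start out as 8-bit unsigned values. A sketch of such source code follows; `add_zero_extended` is a hypothetical name, and whether a given compiler version actually selects the disabled pattern for it is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Both 8-bit operands are zero-extended to 16 bits before the add. */
uint16_t add_zero_extended(uint8_t a, uint8_t b)
{
    uint16_t wide_a = a;    /* zero_extend */
    uint16_t wide_b = b;    /* zero_extend */
    uint16_t sum = (uint16_t)(wide_a + wide_b);
    assert(sum >= wide_a);  /* 255 + 255 still fits in 16 bits, so no wrap */
    return sum;
}
```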
[Bug target/30243] [4.1/4.2/4.3/4.4 Regression][avr] signbit() causes an internal compiler error
--- Comment #8 from hutchinsonandy at aim dot com 2008-03-17 23:10 --- Fails 4.3 on recently added testcase for same bug. /cygdrive/e/gcc/gcc/testsuite/gcc.c-torture/execute/pr35456.c:17: internal compiler error: in gen_lowpart_general, at rtlhooks.c:53 Please submit a full bug report -- hutchinsonandy at aim dot com changed: What|Removed |Added CC||hutchinsonandy at aim dot ||com http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30243
[Bug middle-end/35519] COMBINE repeating same matches and can SEG fault
--- Comment #4 from hutchinsonandy at aim dot com 2008-03-15 23:49 --- This bug also causes incorrect code and appears to be a regression from 4.2: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916 The good news is that the fix is effective. Is there anything else I can do to help expedite the implementation of the patch or an alternate fix?
-- hutchinsonandy at aim dot com changed:
  What|Removed|Added
  CC  |       |hutchinsonandy at aim dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519
[Bug target/34916] [4.3/4.4 Regression] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os
--- Comment #8 from hutchinsonandy at aim dot com 2008-03-15 23:40 --- This appears to be the same bug, where combine is erroneously assuming all DF register references are to different instructions. So it tries combining instructions with themselves and stuff gets lost. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519
The above testcase still fails with gcc version 4.4.0 20080305. However, with the patch from PR35519 it produces correct code:
  23 /* prologue: function */
  24 /* frame size = 0 */
  25 .LM2:
  26 2BE0 ldi r18,lo8(11)
  27 0002 30E0 ldi r19,hi8(11)
  28 0004 40E0 ldi r20,hlo8(11)
  29 0006 50E0 ldi r21,hhi8(11)
  30 0008 0E94 call __mulsi3
  31 .LVL1:
  32 /* epilogue start */
  33 .LM3:
  34 000c 0895 ret
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916
[Bug rtl-optimization/35542] fwprop only propagates one operand
--- Comment #1 from hutchinsonandy at aim dot com 2008-03-11 20:34 --- Created an attachment (id=15300) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15300&action=view) Patch to search modified instruction for register. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35542
[Bug rtl-optimization/35542] New: fwprop only propagates one operand
fwprop.c currently has a bug where a successful propagation to one operand of an instruction will prevent propagation to any remaining operands. The cause is the use of loc_mentioned_in_p() to check that a reference, provided by an earlier DF scan, still exists in an instruction. The test is intended to check that an earlier propagation and simplification has not removed D/U references. However, loc_mentioned_in_p() compares addresses of rtx to determine equivalence. If an instruction has already been modified and simplified, this will no longer apply, even if the def/use is still valid. This problem was already noted in Nov, but no bug report seems to have been filed. http://gcc.gnu.org/ml/gcc-patches/2007-11/msg00170.html
I will attach a patch that uses reg_mentioned_p() as a substitute. The only differences I noted were:
a) reg_mentioned_p() does not match physical sub registers of longer hard registers. This seems to have no consequence since fwprop entirely rejects hard registers in later code. Perhaps for clarity, hard registers could be ignored earlier.
b) Registers in asm_operands are found. This seems beneficial if they are pseudos, and they are again ignored if they are hard registers.
I am only able to test this with the AVR port. There it was 100% successful, with no regressions in the torture/execute testsuite.
--
Summary: fwprop only propagates one operand
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35542
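The distinction between comparing locations and comparing registers can be shown with a toy expression tree. This is a sketch only: `toy_expr` stands in for an rtx and the two functions are simplified, hypothetical analogues of loc_mentioned_in_p() and reg_mentioned_p(), not the GCC implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy expression node, a stand-in for an rtx. */
struct toy_expr {
    bool is_reg;
    unsigned regno;                 /* meaningful when is_reg is true */
    struct toy_expr *operand[2];    /* children when is_reg is false */
};

/* Identity test: is this exact operand slot still part of the expression? */
bool toy_loc_mentioned(struct toy_expr *const *loc, const struct toy_expr *in)
{
    assert(loc != NULL);
    if (in == NULL || in->is_reg)
        return false;
    for (int i = 0; i < 2; i++) {
        if (&in->operand[i] == loc || toy_loc_mentioned(loc, in->operand[i]))
            return true;
    }
    return false;
}

/* Structural test: does the expression refer to this register number? */
bool toy_reg_mentioned(unsigned regno, const struct toy_expr *in)
{
    if (in == NULL)
        return false;
    if (in->is_reg)
        return in->regno == regno;
    return toy_reg_mentioned(regno, in->operand[0])
        || toy_reg_mentioned(regno, in->operand[1]);
}
```

If an expression is rebuilt after a first propagation, a slot address recorded earlier no longer appears in the new tree, so the identity test fails even though the register is still used. The structural test still finds it.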
[Bug middle-end/35519] COMBINE repeating same matches and can SEG fault
--- Comment #3 from hutchinsonandy at aim dot com 2008-03-10 22:24 --- Subject: Re: COMBINE repeating same matches and can SEG fault
The quadratic nature does not seem to be a particular problem with the data involved. The log_links list is built up incrementally (with duplicates at present). The patch does a "do over" to check that a potential link is not one already recorded. So initially the list is 0 and grows (hence quadratic). However, it is only truly quadratic if there are no duplicates. Duplicates are grouped together and, if I understand correctly, the last element added will always be checked first, so the inner loop will terminate after 1 iteration if a duplicate is present. The final list is quite small (typically 2-3 elements in length), directly reflecting the number of links between RTL instructions.
If instead the list is created and then pruned afterwards, we have a longer list. So the quadratic nature of the loop is now replaced by check/sorting. This could be better, but only if the final list is of a significant size that can exploit non-quadratic searching. Given the overhead of adding/removing linked items, and the small length of the list, I believe the patch is better. If, however, the number of RTL operands could be much larger than I have assumed, then perhaps a change is needed. Personally I can't think of many insns that refer to more than 3 other instructions. But I could be wrong. Andy
steven at gcc dot gnu dot org wrote:
> --- Comment #2 from steven at gcc dot gnu dot org 2008-03-10 20:04 ---
> The patch makes adding log use an algorithm quadratic in the number of log
> links per insn. It is probably better to:
> 1. build the log links.
> 2. filter out the duplicates as a post pass (and maybe sort them while at it?)
-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519
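The check-before-insert approach argued for above can be sketched with an ordinary singly linked list. This is an illustration, not the attached patch: `toy_link` and the helper functions are hypothetical, and an insn is represented by a plain integer id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy link: records that an insn depends on the insn with id from_insn. */
struct toy_link {
    int from_insn;
    struct toy_link *next;
};

/* Prepend a link unless one to the same insn is already recorded.
   New links go at the head, so a run of duplicates is caught on the
   first comparison. Returns true when a new link was added. */
bool toy_add_link_unique(struct toy_link **head, int from_insn)
{
    assert(head != NULL);
    for (const struct toy_link *link = *head; link != NULL; link = link->next)
        if (link->from_insn == from_insn)
            return false;
    struct toy_link *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        abort();
    fresh->from_insn = from_insn;
    fresh->next = *head;
    *head = fresh;
    return true;
}

size_t toy_link_count(const struct toy_link *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

void toy_free_links(struct toy_link *head)
{
    while (head != NULL) {
        struct toy_link *next = head->next;
        free(head);
        head = next;
    }
}
```

With lists of two or three entries, as estimated in the comment, the scan costs a handful of comparisons per insertion.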
[Bug middle-end/35519] COMBINE repeating same matches and can SEG fault
--- Comment #1 from hutchinsonandy at aim dot com 2008-03-09 23:52 --- Created an attachment (id=15287) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15287&action=view) Patch for consideration towards a solution Patch that removes duplicates when LOG_LINKS is created. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519
[Bug middle-end/35519] New: COMBINE repeating same matches and can SEG fault
This problem potentially affects all targets. The data flow information that combine uses can cause a segmentation fault. I have found this with an AVR experimental build but it would seem that it can affect any target. The problem is that the LOG_LINKS that combine creates from DF can include multiple references between instruction pairs. DF will produce multiple references between instructions if they share a register that decomposes into several smaller registers. The multiple cross references are then used by combine to select instruction pairs and triples to match. This results in repeat trials of the same instructions! By way of example, the RTL that triggered my problem was: (insn 45 42 46 4 920625-1.c:55 (set (reg:SI 22 r22 [ temp.24 ]) (mem:SI (reg/v/f:HI 71 [ alpha ]) [2 S4 A8])) 19 {*movsi} (nil)) (insn 46 45 47 4 920625-1.c:55 (set (reg:SI 18 r18) (mem:SI (plus:HI (reg:HI 68 [ ivtmp.18 ]) (const_int 4 [0x4])) [2 S4 A8])) 19 {*movsi} (nil)) (insn 47 46 48 4 920625-1.c:55 (parallel [ (set (reg:SI 22 r22) (mult:SI (reg:SI 22 r22) (reg:SI 18 r18))) (clobber (reg:HI 26 r26)) (clobber (reg:HI 30 r30)) ]) 43 {*mulsi3_call} (expr_list:REG_DEAD (reg:SI 18 r18) (expr_list:REG_UNUSED (reg:HI 30 r30) (expr_list:REG_UNUSED (reg:HI 26 r26) (nil) This is a call to a library function, and the parameters for instruction 47 are hard registers like SI:R22, which is decomposed in DF as R22, 23, 24 and 25. DF marks all 4 sub parts in def/use chains (which seems entirely correct). When DF information is transferred into LOG_LINKS we still have 4 references back to the definition in instructions 45 and 47.
From gdb this was: (gdb) print uid_log_links[47] $8 = (rtx) 0x7ff140d0 (gdb) pr (insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 46 (insn_list:REG_DEP_TRUE 46 (insn_list:REG_DEP_TRUE 46 (insn_list:REG_DEP_TRUE 46 (nil) These multiple references cause combine to try the same combination of instructions 45 and 47 multiple times (thinking they are different instructions). In this case the match is tried 4 times, 3 more than needed. This part appears mostly benign, except for processing time/memory used. BUT when combine tries three instructions, it can crash. In this example, combine ends up trying to combine 2 duplicates of instruction 45 with 47: I1= (insn 45 42 46 4 920625-1.c:55 (set (reg:SI 22 r22 [ temp.24 ]) (mem:SI (reg/v/f:HI 71 [ alpha ]) [2 S4 A8])) 19 {*movsi} (nil)) I2= (insn 45 42 46 4 920625-1.c:55 (set (reg:SI 22 r22 [ temp.24 ]) (mem:SI (reg/v/f:HI 71 [ alpha ]) [2 S4 A8])) 19 {*movsi} (nil)) I3= (insn 47 46 48 4 920625-1.c:55 (parallel [ (set (reg:SI 22 r22) (mult:SI (reg:SI 22 r22) (reg:SI 18 r18))) (clobber (reg:HI 26 r26)) (clobber (reg:HI 30 r30)) ]) 43 {*mulsi3_call} (expr_list:REG_DEAD (reg:SI 18 r18) (expr_list:REG_UNUSED (reg:HI 30 r30) (expr_list:REG_UNUSED (reg:HI 26 r26) (nil) Combine merges I1 into I3 and deletes I1. Combine notes that the life of R22 terminates in I2 and attempts to put a REG_DEAD note on I2, except of course the deletion of I1 also deletes the identical I2. A segmentation fault occurs. -- Summary: COMBINE repeating same matches and can SEG fault Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: hutchinsonandy at aim dot com GCC host triplet: i686-pc-cygwin GCC target triplet: avr-unknown-none http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519
[Bug target/35507] [avr] 4.3.0: size of small funcion increases from 2 to 29 words
--- Comment #4 from hutchinsonandy at aim dot com 2008-03-09 18:36 --- The problem is not commutation knowledge to the backend. First, the use notes were a red herring. Reversing them did not help. After much chasing thru call.c and optabs.c, it looks like neither creates nor corrects the issue. But I can fix (hide) the issue by commutating the back end expander to get optimal code. (Of course that does not fix ALL binop libcalls!) So I tried a non-commutative operator (% or /), and that was optimal, with no expander changes. So it would appear that the default "don't care" order presented to expand_binop() is wrong. Where it does critically matter (non-commutative functions), the order is ideal. The effect on chained arithmetic or higher modes such as DImode is horrendous, and may explain other noted problems with optimizations. FYI here is the expander used to temporarily fix (hide) the problem. NOTE the operand numbering, relative to RTL order.

(define_expand "mulsi3"
  [(set (reg:SI 18) (match_operand:SI 1 "register_operand" "r"))
   (set (reg:SI 22) (match_operand:SI 2 "register_operand" "r"))
   (parallel [(set (reg:SI 22) (mult:SI (reg:SI 22) (reg:SI 18)))
              (clobber (reg:HI 26))
              (clobber (reg:HI 30))])
   (set (match_operand:SI 0 "register_operand" "=&r") (reg:SI 22))]
  "AVR_HAVE_MUL"
  "")

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35507
[Bug target/35507] [avr] 4.3.0: size of small funcion increases from 2 to 29 words
--- Comment #2 from hutchinsonandy at aim dot com 2008-03-09 12:23 --- Here is more info: Testcase: static long foo99(long b,long a) { return b * a; } long foo2(long b, long a) { return foo99(b,a); } Looking at RTL, the USE of the respective libcalls are reversed. That is the RTL generated for call to MULSI3 is reversed from a normal C function that has same arguments and calling conventions. (call_insn/u 9 8 10 920625-1.c:45 (set (reg:SI 22 r22) (call (mem:HI (symbol_ref:HI ("__mulsi3") [flags 0x41]) [0 S2 A8]) (const_int 0 [0x0]))) -1 (expr_list:REG_EH_REGION (const_int -1 [0x]) (nil)) (expr_list:REG_DEP_TRUE (use (reg:SI 18 r18)) (expr_list:REG_DEP_TRUE (use (reg:SI 22 r22)) (nil (call_insn/u 9 8 10 920625-1.c:51 (set (reg:SI 22 r22) (call (mem:HI (symbol_ref:HI ("foo99") [flags 0x3] ) [0 S2 A8]) (const_int 0 [0x0]))) -1 (expr_list:REG_EH_REGION (const_int 0 [0x0]) (nil)) (expr_list:REG_DEP_TRUE (use (reg:SI 22 r22)) (expr_list:REG_DEP_TRUE (use (reg:SI 18 r18)) (nil -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35507
[Bug target/35507] [avr] 4.3.0: size of small funcion increases from 2 to 29 words
--- Comment #1 from hutchinsonandy at aim dot com 2008-03-09 04:35 --- I can confirm this regression. There appears to be something strange in commutation of operands before RTL is created, which may well explain why it used to work. The default expanders are creating calls to commutative functions that have the opposite argument order from normal function parameters. Functionally this does not matter, but it twists up the data flow. The argument order presented by the RTL is backwards from the ideal. This happens with both the target AVR expander and the default expander (the original PR was using a default expander). Clearly this is avoidable. It is not a function of the original code: gcc is purposely ordering the operands. Regardless of operand ordering (x*y or y*x), the RTL is always backwards. The other factor is the lack of forward propagation in the RTL stages. If this was effective, RTL ordering would not matter. The limited propagation that is present avoids hard registers, so it never connects the arguments with the library function. Also combine can't exploit the commutation of the operands. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35507
[Bug target/32871] [avr] Bad optimisation - gcc is pushing too many registers
--- Comment #4 from hutchinsonandy at aim dot com 2008-03-02 23:32 --- Created an attachment (id=15254) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15254&action=view) Patch to fix bug. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32871
[Bug target/32871] [avr] Bad optimisation - gcc is pushing too many registers
--- Comment #3 from hutchinsonandy at aim dot com 2008-03-02 17:22 --- Problem is caused by a bug in gcc DF, or at least incorrect documentation regarding prolog/epilog register save/restores. As specified in the internals manual, the AVR prolog/epilog uses df_regs_ever_live_p(reg) to determine which registers should be saved on the stack (if they are not call_used_registers). However, if a function has arguments that are stored in non call_used_registers (R8-R17), then this test gives an incorrect result and these registers will always be saved/restored by the prolog/epilog. (Argument registers never need to be saved/restored.) This problem only applies to targets that pass arguments in non call_used_registers. Unfortunately no part of gcc, including DF, appears to have proper information to use directly. In the absence of a change to gcc, the target can determine which registers are REALLY used as arguments and exclude these from save/restores. So it requires going thru all function arguments again using the target argument macros. Will post patch when it's finished testing. But here is the key routine:

/* Sets in *SET the hard registers that are used for arguments.  */
static void
avr_args (HARD_REG_SET *set)
{
  int reg;
  int i;
  rtx arg;
  CUMULATIVE_ARGS cum;
  tree decl = DECL_ARGUMENTS (current_function_decl);

  INIT_CUMULATIVE_ARGS (cum, TREE_TYPE (current_function_decl), NULL_RTX, decl, -1);
  for (; decl; decl = TREE_CHAIN (decl))
    {
      if (TREE_CODE (decl) == PARM_DECL
          && DECL_NAME (decl)
          && !DECL_ARTIFICIAL (decl))
        {
          enum machine_mode mode = DECL_MODE (decl);
          /* Get argument RTX.  */
          /* This target does not use named attribute.  */
          arg = FUNCTION_ARG (cum, mode, DECL_ARG_TYPE (decl), 1);
          FUNCTION_ARG_ADVANCE (cum, mode, DECL_ARG_TYPE (decl), 1);
          /* ARG is NULL_RTX when the argument is passed on the stack.  */
          if (arg && REG_P (arg))
            {
              reg = REGNO (arg);
              for (i = 0; i < HARD_REGNO_NREGS (reg, mode); i++)
                {
                  if (set)
                    SET_HARD_REG_BIT (*set, reg + i);
                }
            }
        }
    }
}

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32871
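How the resulting set would be used can be sketched with plain bitmasks. This is only an illustration of the reasoning in the comment above, not code from the patch: mark_regs() and regs_to_save() are hypothetical names, and a uint32_t stands in for HARD_REG_SET (AVR has 32 general registers).

```c
#include <stdint.h>

/* Set NREGS consecutive register bits starting at FIRST, as the inner
   loop of an avr_args-style walk would do for a multi-byte argument. */
uint32_t mark_regs(uint32_t set, unsigned first, unsigned nregs)
{
    for (unsigned i = 0; i < nregs && first + i < 32; i++)
        set |= UINT32_C(1) << (first + i);
    return set;
}

/* Registers the prologue has to push: ever live, not call-used, and,
   under the assumption stated above, not holding an incoming argument. */
uint32_t regs_to_save(uint32_t ever_live, uint32_t call_used,
                      uint32_t arg_regs)
{
    return ever_live & ~call_used & ~arg_regs;
}
```

For example, with a 2-byte argument arriving in R16/R17 and the function also using R28/R29, only R28/R29 remain in the save set once the argument registers are masked out.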
[Bug target/34790] [avr] no sibling call optimisation
--- Comment #2 from hutchinsonandy at aim dot com 2008-02-22 01:43 --- We have not gotten around to adding support for tail calls for avr. So nothing happens. So it is not a bug, but still a valid feature request. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34790
[Bug target/34789] [avr] sometimes the compiler keeps addresses in registers unnecessarily
--- Comment #1 from hutchinsonandy at aim dot com 2008-02-22 01:22 --- This appears to be due to avr_rtx_costs not assigning the same cost to SYMBOL_REF and CONST_INT. So SYMBOL_REF looks expensive, and is held in a register to avoid "recalculating" it. A quick change to make SYMBOL_REF the same cost as CONST_INT in both avr_rtx_cost and avr_operand_rtx_cost gave the desired result:

  22 /* prologue: function */
  23 /* frame size = 0 */
  24 .LM2:
  25 E091 lds r30,a
  26 0004 F091 lds r31,(a)+1
  27 0008 EE0F lsl r30
  28 000a FF1F rol r31
  29 000c E050 subi r30,lo8(-(data))
  30 000e F040 sbci r31,hi8(-(data))
  31 0010 8081 ld r24,Z
  32 0012 9181 ldd r25,Z+1
  33 0014 0E94 call foo
  34 .LM3:
  35 0018 FC01 movw r30,r24
  36 001a EE0F lsl r30
  37 001c FF1F rol r31
  38 001e E050 subi r30,lo8(-(data))
  39 0020 F040 sbci r31,hi8(-(data))
  40 0022 8081 ld r24,Z
  41 0024 9181 ldd r25,Z+1
  42 0026 0E94 call foo
  43 /* epilogue start */
  44 .LM4:
  45 002a 0895 ret

CONST and LABEL_REF might also have the same problem as they cost the same as MEM. But note, if SYMBOL_REF were part of a memory address, then it might be better held in a register (like Y or Z). This should be done by checking the outer code. With an outer code of "MEM", SYMBOL_REF would be more expensive than a register (which might save a few LDS/STS that appear in code). Avr_rtx_cost needs some serious work done to correct these and other anomalies in cost assumptions and the recursion on operand costs. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34789
[Bug target/35013] Incomplete check in RTL for "pm()" annotation
--- Comment #1 from hutchinsonandy at aim dot com 2008-02-16 22:06 --- Created an attachment (id=15169) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15169&action=view) Patch The attached patch allows function address expressions of the form address+k to be correctly recognized as program memory addresses and thus forces use of pm() assembler syntax. This has not been extensively tested, but the assembler appears to be correct for this bug's testcase and a similar issue found in: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27192 Note, odd addresses will be accepted by C and only cause a linker warning. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35013
[Bug target/11180] [avr-gcc] Optimization decrease performance of struct assignment.
--- Comment #28 from hutchinsonandy at aim dot com 2008-02-02 15:44 --- The patch and suggestions on this are valid. However, memory moves, particularly with base pointers, may require additional instructions to be added to reach the required displacements. Splitting such moves may well incur the overhead per BYTE! So we need to get the pointers sorted so that (for example) 4 separate QI bytes is just as good as 1 access to 4 SI bytes. As for other pattern removal: YES! The reason to change MOVE_MAX is not made clear. I understood this to control the threshold for using movmem rather than inline RTL. movmem wins (in size) at about 8+ bytes. Does it have another use related to this problem? --

hutchinsonandy at aim dot com changed:

What |Removed |Added
CC   |        |hutchinsonandy at aim dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11180
[Bug target/34932] [avr] ICE in reload
--- Comment #3 from hutchinsonandy at aim dot com 2008-01-23 02:50 --- The pattern requires operand 1 to be the same register as operand 0. Operands 1 & 2 share 2 subregs of the same HImode register R22. But this should have been solvable without any problem, since HI24 is just right! QI:21 -> QI:24 HI24 = Zex:QI24 + ZexQI22 voila! QI22 could have been in the top half of HI24. So this also works: HI:22 -> HI:22 HI:24 = Zex:QI24 + ZexQI24 -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34932
[Bug target/34916] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os
--- Comment #4 from hutchinsonandy at aim dot com 2008-01-22 23:41 --- The WRONG CODE is still present on 4.3.0 20080121 HEAD. This is a regression from 4.2 (a big one too!)

4.2.2 20071221 (WinAVR) OK
4.3.0 20071213 FAILS
4.3.0 HEAD 20080121 FAILS

avr-gcc -c -mmcu=atmega128 -g -w -Q -O2 -DSTACK_SIZE=400 -da -DNO_TRAMPOLINES -fno-show-column -DSIGNAL_SUPPRESS -lm -Wa,-adhlns=xpr27364.lst -lm -std=gnu99 xpr27364.c -o xpr27364.o

-- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916
[Bug target/34916] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os
--- Comment #3 from hutchinsonandy at aim dot com 2008-01-22 00:52 --- Assembler of short testcase. Constant load (11L) missing.

  16 .Ltext0:
  17 .global f2
  19 f2:
  20 .LFB2:
  21 .LM1:
  22 .LVL0:
  23 /* prologue: function */
  24 /* frame size = 0 */
  25 .LM2:
  26 0E94 call __mulsi3
  27 .LVL1:
  28 /* epilogue start */
  29 .LM3:
  30 0004 0895 ret
  31 .LFE2:
  57 .Letext0:
  DEFINED SYMBOLS

--

hutchinsonandy at aim dot com changed:

What |Removed |Added
CC   |        |hutchinsonandy at aim dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916
[Bug target/34916] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os
--- Comment #2 from hutchinsonandy at aim dot com 2008-01-22 00:26 --- Created an attachment (id=14992) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14992&action=view) dce pass RTL dump file (before combine) Posted two RTL dump files of a smaller testcase: long f2(long number_of_digits_to_use) { return ( number_of_digits_to_use * 11L ) ; } This fails on GCC HEAD 4.3.0 of 13/12/2007. Combine decides the load of the constant is not needed. This seems to happen after combine tries combining the load with the multiply (i.e. multiply by constant). -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916
[Bug target/34916] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os
--- Comment #1 from hutchinsonandy at aim dot com 2008-01-22 00:23 --- Created an attachment (id=14991) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14991&action=view) Combine pass RTL dump file -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916
[Bug target/34888] New: Stack patterns for AVR not optimal
There are several instruction patterns related to stack pointer operations. These are not quite right: 1) The popqi and pophi patterns use post_inc codes when in fact they are pre_inc. This could fail if gcc ever used them outside prolog/epilog. 2) Stack moves such as push/pop should be placed before mov patterns, to provide best matching. 3) Stack adjustment (SP=SP+c) is matching with the *addhi pattern, which causes reloads of output and input. We really want it to match "*addhi3_sp_R_pc2" when the offset is small, so *addhi3_sp_R_pc2 needs to be placed before *addhi. 4) "*addhi3_sp_R_pc2" does not provide the full range of optimal adjustment. For example, SP=SP+8 takes 4 instructions using rcall but around 8 using addition through a register. This functionality needs extending accordingly as SP=SP+8 is very common. -- Summary: Stack patterns for AVR not optimal Product: gcc Version: 4.3.0 Status: UNCONFIRMED Severity: minor Priority: P3 Component: target AssignedTo: unassigned at gcc dot gnu dot org ReportedBy: hutchinsonandy at aim dot com GCC target triplet: avr-*-* http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34888
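Point 1 above can be made concrete with a small model of the AVR stack in plain C. The hardware PUSH stores at SP and then decrements it (post-decrement), and POP increments SP and then loads (pre-increment). The struct and function names below are hypothetical and only model a 256-byte RAM; they are not taken from avr.md.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 256-byte RAM with an 8-bit stack pointer. */
struct avr_stack {
    uint8_t mem[256];
    uint8_t sp;
};

void avr_stack_init(struct avr_stack *s)
{
    memset(s->mem, 0, sizeof s->mem);
    s->sp = 0xFF;               /* empty stack: SP points at the free slot */
}

/* PUSH: store, then decrement (post_dec addressing). */
void avr_push(struct avr_stack *s, uint8_t value)
{
    s->mem[s->sp] = value;
    s->sp--;
}

/* POP as the hardware does it: increment, then load (pre_inc). */
uint8_t avr_pop(struct avr_stack *s)
{
    s->sp++;
    return s->mem[s->sp];
}

/* What a post_inc description would mean: load, then increment.
   This reads the free slot below the last pushed byte. */
uint8_t pop_as_post_inc(struct avr_stack *s)
{
    uint8_t value = s->mem[s->sp];
    s->sp++;
    return value;
}
```

After pushing one byte, avr_pop() returns it while pop_as_post_inc() returns whatever happens to be in the unused slot, which is the mismatch the report warns about if the patterns were ever used outside prolog/epilog.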
[Bug target/34412] ICE in extract_insn, at recog.c:1990
--- Comment #5 from hutchinsonandy at aim dot com 2008-01-11 23:40 --- An instant workaround for Tiny targets is to optimise at a higher level (-Os). This will most likely remove the need for a frame pointer and skirt around the bug, though it will still happen if there are more auto variables than registers free. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34412
[Bug target/34412] ICE in extract_insn, at recog.c:1990
--- Comment #4 from hutchinsonandy at aim dot com 2008-01-11 23:32 --- Created an attachment (id=14928) --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14928&action=view) Fix expander patch Prior analysis is correct. Typo resulted in QI addition to HI mode frame pointer, when Tiny series target was selected (256 byte stack). The addition was intended to just change LSB as MSB is always fixed. Patch corrects prolog expander typo so that 8 bit (QI) increment is made to 8 bit representation of frame pointer. -- http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34412
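The intended behaviour, an 8-bit adjustment that touches only the low byte of the frame pointer, can be sketched in plain C. This is an illustration of the idea only: adjust_fp_low_byte() is a hypothetical name and the code is not the expander from the patch.

```c
#include <stdint.h>

/* Add DELTA to the low byte of a 16-bit frame pointer and leave the
   high byte untouched. Any carry out of the low byte is discarded,
   which is the assumption on targets whose stack fits in 256 bytes. */
uint16_t adjust_fp_low_byte(uint16_t fp, int8_t delta)
{
    uint8_t lo = (uint8_t) ((fp & 0xFFu) + (uint8_t) delta);
    return (uint16_t) ((fp & 0xFF00u) | lo);
}
```

For instance, reserving 4 bytes from a frame pointer of 0x00F0 gives 0x00EC, with the high byte never entering the arithmetic.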