[Bug target/39635] [avr] integer wrong code bug

2009-09-13 Thread hutchinsonandy at aim dot com


--- Comment #5 from hutchinsonandy at aim dot com  2009-09-13 16:14 ---
It looks like most of the AVR shifts/rotates are messed up.
For the case where we have non-constant shifts, the peephole may grab a
scratch register. In this case it looks like it grabs one that is free
afterwards but not before. Hence the overlap issue.

The rotate split pattern problem is different, as noted in the message links. In
this case it is not apparent why the split is used only for different
source/destination registers. If the split were constrained so that src=dest,
the overlap would be much easier to handle and it would seemingly produce
better code for the common case where src=dest.
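
To illustrate why overlap matters for a split rotate, here is a toy model in C.
It is an assumption made for illustration only, not the AVR backend's actual
split code. It rotates a 4-byte little-endian value right by 8 bits inside a
hypothetical register file: the naive byte-by-byte copy goes wrong when
destination and source partially overlap, while the src=dest case needs only one
scratch byte. The names rot8_naive, rot8_in_place and NREGS are made up for this
sketch.

```c
#include <assert.h>
#include <stdint.h>

#define NREGS 32  /* hypothetical register file size */

/* Naive split: copy byte by byte from src to dst, rotating right by
   one byte.  Only safe when the two 4-byte ranges do not overlap.  */
static void
rot8_naive (uint8_t regs[NREGS], int dst, int src)
{
  assert (dst >= 0 && dst + 4 <= NREGS);
  assert (src >= 0 && src + 4 <= NREGS);
  for (int i = 0; i < 4; i++)
    regs[dst + i] = regs[src + (i + 1) % 4];
}

/* In-place variant (src == dst): one scratch byte is enough and no
   source byte is overwritten before it has been read.  */
static void
rot8_in_place (uint8_t regs[NREGS], int base)
{
  assert (base >= 0 && base + 4 <= NREGS);
  uint8_t scratch = regs[base];
  regs[base] = regs[base + 1];
  regs[base + 1] = regs[base + 2];
  regs[base + 2] = regs[base + 3];
  regs[base + 3] = scratch;
}
```

With dst one register below src, rot8_naive overwrites the lowest source byte
before the last step reads it, which is the kind of clobber an overlap check has
to rule out.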


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=39635



[Bug target/36494] Char arrays gets corrupted in avr programs.

2008-06-11 Thread hutchinsonandy at aim dot com


--- Comment #4 from hutchinsonandy at aim dot com  2008-06-11 22:05 ---
I'm sure Eric will weigh in again to verify that the code posted executes correctly (it
looks correct to me).

I suspect you have some config or memory issue.

For example, unoptimized, the string is stored at locations 0x100 and 0x101. So
if this gets trashed (by another part of the software), the result will be 2. Also
watch your stack space. Unoptimized, it will be large and may overflow the data areas.

The optimization, of course, will bypass such problems, and that works. This
also shows that the front end and middle end of gcc are OK.

The code contains no tricky or seldom-used instructions, so it appears highly
unlikely to be a problem with the back end, which is specific to avr.

If Eric confirms this he may mark bug as invalid - but you can pursue a
solution on other forums (such as
http://lists.nongnu.org/mailman/listinfo/avr-gcc-list)


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36494



[Bug middle-end/36447] simplify_subreg ICE with right shift more than length type AVR

2008-06-06 Thread hutchinsonandy at aim dot com


--- Comment #3 from hutchinsonandy at aim dot com  2008-06-06 11:55 ---
Subject: Re:  simplify_subreg ICE with right shift more than
 length type AVR

Thanks for quick response,

I will give this a try and no doubt it will work.
I was trying to think of how the other case should be simplified, 
rather than left as shift.

or put another way, how should we take sign?


Andy



--
Sent from my Dingleberry wired device.


-Original Message-
From: bonzini at gnu dot org [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Fri, 6 Jun 2008 1:04 am
Subject: [Bug middle-end/36447] simplify_subreg ICE with right shift 
more than length type AVR




--- Comment #2 from bonzini at gnu dot org  2008-06-06 05:04 ---
Can you try and possibly submit this patch:

Index: /Users/bonzinip/cvs/gcc/gcc/simplify-rtx.c
===
--- /Users/bonzinip/cvs/gcc/gcc/simplify-rtx.c  (revision 134435)
+++ /Users/bonzinip/cvs/gcc/gcc/simplify-rtx.c  (working copy)
@@ -5247,6 +5247,7 @@ simplify_subreg (enum machine_mode outer
       && GET_MODE_BITSIZE (innermode) >= (2 * GET_MODE_BITSIZE (outermode))
       && GET_CODE (XEXP (op, 1)) == CONST_INT
       && (INTVAL (XEXP (op, 1)) & (GET_MODE_BITSIZE (outermode) - 1)) == 0
+      && INTVAL (XEXP (op, 1)) < GET_MODE_BITSIZE (innermode)
       && byte == subreg_lowpart_offset (outermode, innermode))
 {
   int shifted_bytes = INTVAL (XEXP (op, 1)) / BITS_PER_UNIT;

Thanks!


--

bonzini at gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
     Ever Confirmed|0                           |1
   Last reconfirmed|0000-00-00 00:00:00         |2008-06-06 05:04:36
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36447

--- You are receiving this mail because: ---
You reported the bug, or are watching the reporter.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36447



[Bug target/36336] ICE push_reload - psuedo reg_equiv_constant

2008-06-06 Thread hutchinsonandy at aim dot com


--- Comment #3 from hutchinsonandy at aim dot com  2008-06-06 19:42 ---
Subject: Re:  ICE push_reload - psuedo reg_equiv_constant

O2





-Original Message-
From: eric dot weddington at atmel dot com [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Fri, 6 Jun 2008 3:24 pm
Subject: [Bug target/36336] ICE push_reload - psuedo reg_equiv_constant




--- Comment #2 from eric dot weddington at atmel dot com  2008-06-06 19:24 ---
Andy, I'm having difficulty trying to reproduce this bug. I use this
command line:
avr-gcc -O1 -mmcu=atmega128 -w -std=gnu99 -c memcpy-chk.c -o memcpy-chk.o
But I'm using WinAVR 20080512, which is patched, and it does not give an ICE.

Are you also getting this ICE with HEAD?


--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36336



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36336



[Bug target/36336] ICE push_reload - psuedo reg_equiv_constant

2008-06-06 Thread hutchinsonandy at aim dot com


--- Comment #5 from hutchinsonandy at aim dot com  2008-06-06 20:18 ---
Subject: Re:  ICE push_reload - psuedo reg_equiv_constant


The patch for

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31786

removes one problematic part of LEGITIMIZE_RELOAD_ADDRESS. This is
applied to WinAVR 20080512 (patched 4.3.0), but is still waiting for
approval on avr-gcc 4.3/4.4 HEAD.

Even with that patch, the other parts of L_R_A are bad and need fixing
with an added check of reg_equiv_constant.
So if a register is equivalent to a constant, LEGITIMIZE_RELOAD_ADDRESS should
do nothing.

The assert that triggers is an explicit check for this.

I have not posted patch due to overlap with pending patch.

BTW gcc list has short discussion on this bug, including explanation 
for the AVR code.

Andy






-Original Message-
From: eric dot weddington at atmel dot com [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Fri, 6 Jun 2008 3:48 pm
Subject: [Bug target/36336] ICE push_reload - psuedo reg_equiv_constant




--- Comment #4 from eric dot weddington at atmel dot com  2008-06-06 19:48 ---
Test case passes with -O[0123s], with WinAVR 20080512 (patched 4.3.0).


--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36336



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36336



[Bug c/36447] New: simplify_subreg ICE with right shift more than length type AVR

2008-06-05 Thread hutchinsonandy at aim dot com
Works on avr-gcc (GCC) 4.2.2 (WinAVR 20071221). Does not on 4.4 HEAD.

Test results posts show this test failing since AT LEAST SVN rev 132993 on AVR
(March 3 2008). Before that the test was not run, so I don't know when it started.


gcc.c-torture/unsorted/shm.c

foo (int *p)
{
  int a = *p;
  return a >> 24;
}

/home/hutchia/Desktop/gcc/gcc/testsuite/gcc.c-torture/unsorted/shm.c:5:
internal compiler error: in simplify_subreg, at simplify-rtx.c:4962

test.c: In function 'foo':
test.c:4: warning: right shift count >= width of type

Analyzing compilation unit
Performing interprocedural optimizations
 visibility early_local_cleanups summary generate inline static-var pure-const
Assembling functions:
 foo
Breakpoint 1, fancy_abort (file=0xb030f2 "../../gcc/gcc/simplify-rtx.c",
line=4962, function=0xb03158 "simplify_subreg")
at ../../gcc/gcc/diagnostic.c:654
654   internal_error ("in %s, at %s:%d", function, trim_filename (file), line);
(gdb) where
#0  fancy_abort (file=0xb030f2 "../../gcc/gcc/simplify-rtx.c", line=4962,
function=0xb03158 "simplify_subreg") at ../../gcc/gcc/diagnostic.c:654
#1  0x007a1143 in simplify_subreg (outermode=QImode, op=0x7ff24f20,
innermode=HImode, byte=3) at ../../gcc/gcc/simplify-rtx.c:4937
#2  0x007a1672 in simplify_gen_subreg (outermode=QImode, op=0x7ff24f20,
innermode=HImode, byte=3) at ../../gcc/gcc/simplify-rtx.c:5287
#3  0x007a0b83 in simplify_subreg (outermode=QImode, op=0x7ff25100,
innermode=HImode, byte=0) at ../../gcc/gcc/simplify-rtx.c:5271
#4  0x007a1672 in simplify_gen_subreg (outermode=QImode, op=0x7ff25100,
innermode=HImode, byte=0) at ../../gcc/gcc/simplify-rtx.c:5287
#5  0x0098f86a in propagate_rtx_1 (px=0x22caf4, old=0x4, new=0x7ff25100,
flags=2) at ../../gcc/gcc/fwprop.c:333
#6  0x009907da in forward_propagate_into (use=0x0)
at ../../gcc/gcc/fwprop.c:457
#7  0x00990bc2 in fwprop () at ../../gcc/gcc/fwprop.c:1055
#8  0x00624a0e in execute_one_pass (pass=0x0) at ../../gcc/gcc/passes.c:1292
#9  0x00624b33 in execute_pass_list (pass=0xa8c320)
at ../../gcc/gcc/passes.c:1342
#10 0x00624b46 in execute_pass_list (pass=0xa8c5c0)
at ../../gcc/gcc/passes.c:1343
#11 0x0084d899 in tree_rest_of_compilation (fndecl=0x7fdbf1f0)
at ../../gcc/gcc/tree-optimize.c:421
#12 0x00625d4b in cgraph_expand_function (node=0x7ff40280)
at ../../gcc/gcc/cgraphunit.c:1148
#13 0x0062793e in cgraph_optimize () at ../../gcc/gcc/cgraphunit.c:1211
#14 0x0041bee7 in c_write_global_declarations ()
at ../../gcc/gcc/c-decl.c:8062
#15 0x0062c95b in toplev_main (argc=3, argv=0x1c014c0)
at ../../gcc/gcc/toplev.c:976
#16 0x0049467a in main (argc=3, argv=0x1c014c0) at ../../gcc/gcc/main.c:35


-- 
   Summary: simplify_subreg ICE with right shift more than length
type AVR
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com
 GCC build triplet: i686-pc-cygwin
  GCC host triplet: i686-pc-cygwin
GCC target triplet: avr-unknown-none


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36447



[Bug middle-end/36447] simplify_subreg ICE with right shift more than length type AVR

2008-06-05 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-06-06 03:08 ---
rev 132971 appears to have created this  problem.

Revision: 132971
Author: bonzini
Date: 8:30:10 AM, Thursday, March 06, 2008
Message:
2008-03-06  Paolo Bonzini  [EMAIL PROTECTED]

* simplify-rtx.c (simplify_subreg): Remove useless shifts from
word-extractions out of a multi-word object.


Modified : /trunk/gcc/ChangeLog
Modified : /trunk/gcc/simplify-rtx.c


It fails because subreg simplification tries to extract byte 3 of the HImode
int 'a' at simplify-rtx.c:5271. No check is performed here to see whether the shift
count >= length. For this case simplification should give the sign of the value (-1 or 0).
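
To make the expected result concrete, here is a minimal sketch in plain C (not
GCC internals) of what an arithmetic right shift of a 16-bit value should yield
once the count reaches or exceeds the width: every result bit is a copy of the
sign bit, so the value is -1 or 0. The helper name sar16_saturating is
hypothetical. In C itself such a shift is undefined, which is why the helper
clamps the count instead of shifting.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: arithmetic right shift of a 16-bit value
   (HImode-sized on AVR) that stays well defined for any count >= 0.  */
static int16_t
sar16_saturating (int16_t value, int count)
{
  assert (count >= 0);

  /* Once the count reaches the width, only copies of the sign bit
     remain: -1 for negative inputs, 0 otherwise.  */
  if (count >= 16)
    return value < 0 ? -1 : 0;

  /* Avoid relying on implementation-defined >> of negative values:
     shift the (non-negative) complement and complement back.  */
  if (value < 0)
    return (int16_t) ~(~(int32_t) value >> count);
  return (int16_t) ((int32_t) value >> count);
}
```

For the testcase above, a 16-bit int shifted right by 24 falls into the first
branch, so the simplified result would depend only on the sign of 'a'.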


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36447



[Bug target/27386] AVR: wrong code generated when passing three uint64_t arguments to function

2008-06-01 Thread hutchinsonandy at aim dot com


--- Comment #14 from hutchinsonandy at aim dot com  2008-06-01 15:22 ---
It appears emit_single_push_insn() is BROKEN for targets with:

a) STACK_GROWS_DOWNWARD + POST_DEC push
b) Upwards + POST_INC push.

So if any target has this combo and #defines PUSH_ROUNDING, it is broken.

Fortunately for AVR the whole mess goes away when we #undef PUSH_ROUNDING,
which so far appears to be unnecessary. It also cleared some compat test
failures!


For reference, here is what I did sort out for emit_single_push_insn.

There are two problems.

1) For downwards padding, there is a MISTAKE in the original (H8) change, which for (a) adds
the size of the slot to SP to get the address - that would be the highest byte of mem! This
crashes AVR.

/* We have already decremented the stack pointer, so get the
   previous value.  */
offset += (HOST_WIDE_INT) rounded_size; /* I AM VERY WRONG */

POST_DEC leaves the stack pointer at BOS-1.  We must add the smallest addressable unit
of stack (byte, word?) to get the address of MEM. Or perhaps use
STACK_POINTER_OFFSET. So the above becomes:

offset += (HOST_WIDE_INT) STACK_POINTER_OFFSET;

2) For upwards padding, both POST_INC and POST_DEC have PRE_MODIFY sequences
created. This is questioned in the code and is clearly incorrect.
I assume no target has this combo.

One possible way to fix the issue is to use POST_MODIFY. However, that would assume
the final instructions would not be split and cause stack corruption during
an interrupt. This matter I have not checked.

Someone might consider adding some asserts here!
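
The address arithmetic in problem 1 can be checked with a toy model. This is a
sketch under stated assumptions, not GCC code: a byte-addressed stack that grows
downward with a post-decrement push, where the smallest addressable unit is one
byte (the role STACK_POINTER_OFFSET would play on AVR). All names (toy_stack,
toy_push_block, UNIT_OFFSET and so on) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define STACK_BYTES 64
#define UNIT_OFFSET 1  /* assumed smallest addressable stack unit */

struct toy_stack
{
  uint8_t mem[STACK_BYTES];
  size_t sp;  /* index of the next free slot */
};

static void
toy_init (struct toy_stack *s)
{
  memset (s->mem, 0, sizeof s->mem);
  s->sp = STACK_BYTES - 1;
}

/* Post-decrement push of one byte: store at SP, then decrement SP.  */
static void
toy_push_byte (struct toy_stack *s, uint8_t b)
{
  assert (s->sp > 0);
  s->mem[s->sp] = b;
  s->sp--;
}

/* Push a block, last byte first, and return the lowest address it
   occupies.  After the pushes SP sits one unit below the block, so
   the block starts at SP + UNIT_OFFSET.  */
static size_t
toy_push_block (struct toy_stack *s, const uint8_t *data, size_t n)
{
  for (size_t i = n; i > 0; i--)
    toy_push_byte (s, data[i - 1]);
  return s->sp + UNIT_OFFSET;
}

/* The questioned computation: SP + size of the slot.  This lands on
   the highest byte of the block, not its start.  */
static size_t
toy_wrong_address (const struct toy_stack *s, size_t n)
{
  return s->sp + n;
}
```

In this model the two computations agree only for a one-byte push, which may be
why the problem shows up with multi-byte arguments.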


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27386



[Bug testsuite/36285] gcc.dg/compat/struct-by-value-xxx improper test for AVR target

2008-05-31 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-06-01 01:02 ---
I have reduced number of failures slightly by setting higher optimisation and
skipping complex int using 

set COMPAT_SKIPS [list {VA} {COMPLEX_INT}] 
set COMPAT_OPTIONS [list [list {-Os -mcall-prologues} {-Os -mcall-prologues}]] 

But complex float, double and long double are not avoidable and take way too
much code size to link.

Additionally, there appears to be no way of skipping these tests or even marking
xfail for the link/run stages.


-- 

hutchinsonandy at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |janis187 at us dot ibm dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36285



[Bug target/27386] AVR: wrong code generated when passing three uint64_t arguments to function

2008-05-31 Thread hutchinsonandy at aim dot com


--- Comment #13 from hutchinsonandy at aim dot com  2008-06-01 02:40 ---
expr.c appears all messed up on emit_single_push_insn.

This bad code gets executed when there is no push instruction available.

As well as getting address of the mem created completely wrong, it does not
account for any offset between SP and Top/Bottom of Stack aka
STACK_POINTER_OFFSET

Any comment before I try and fix this mess?


First example, ironically without the warning mentioned in latter code.

  else if (FUNCTION_ARG_PADDING (mode, type) == downward)
    {
      unsigned padding_size = rounded_size - GET_MODE_SIZE (mode);
      HOST_WIDE_INT offset;

      emit_move_insn (stack_pointer_rtx,
                      expand_binop (Pmode,
#ifdef STACK_GROWS_DOWNWARD
                                    sub_optab,
#else
                                    add_optab,
#endif
                                    stack_pointer_rtx,
                                    GEN_INT (rounded_size),
                                    NULL_RTX, 0, OPTAB_LIB_WIDEN));

      offset = (HOST_WIDE_INT) padding_size;
#ifdef STACK_GROWS_DOWNWARD
      if (STACK_PUSH_CODE == POST_DEC)
        /* We have already decremented the stack pointer, so get the
           previous value.  */
        /* NEXT LINE IS WRONG: we are pointing just below the value, so we
           need SP + STACK_POINTER_OFFSET.  */
        offset += (HOST_WIDE_INT) rounded_size;
      /* For PRE_DEC we already point directly to mem, so the code is OK.  */
#else
      if (STACK_PUSH_CODE == POST_INC)
        /* We have already incremented the stack pointer, so get the
           previous value.  */
        /* NEXT LINE IS CORRECT.  */
        offset -= (HOST_WIDE_INT) rounded_size;
      /* For PRE_INC we now add STACK_POINTER_OFFSET, or SP will be one lower
         than the mem address.  */
#endif
      dest_addr = gen_rtx_PLUS (Pmode, stack_pointer_rtx, GEN_INT (offset));
    }
  else

The rest of code is even worse!










-- 

hutchinsonandy at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hutchinsonandy at aim dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27386



[Bug testsuite/36284] New: gcc.dg-struct-layout fails AVR target - multiple reasons

2008-05-20 Thread hutchinsonandy at aim dot com
gcc/testsuite/gcc/gcc.dg-struct-layout-1 fails multiple times for AVR target
due to non-portable testcase.

This test has 28 generated variants, all fail.
Problems include:

1) Assumes int is 32 bits

gcc/gcc.dg-struct-layout-1//t001_test.h:119: error: width of 'a'

2) Assumes availability of DF mode
gcc.dg/compat/vector-defs.h:9: error: unable to emulate 'DF'

3) Some undefined problem - maybe size 32767?
short_enums30131.c:3: error: size of array 's' is negative


I will post snippet of log file to aid correction.


-- 
   Summary: gcc.dg-struct-layout fails AVR target - multiple reasons
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: avr-unknown-none


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36284



[Bug testsuite/36284] gcc.dg-struct-layout fails AVR target - multiple reasons

2008-05-20 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-05-20 22:41 ---
Created an attachment (id=15658)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15658&action=view)
Extract from gcc.log

Extract from gcc.log showing failure details. For economy, not all 28 tests are
shown.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36284



[Bug testsuite/36285] New: gcc.dg/compat/struct-by-value-xxx improper test for AVR target

2008-05-20 Thread hutchinsonandy at aim dot com
Most of the 21 variants of this test fail for AVR target. Issue noted appears
to be excessive memory need and thus failure at link time. For example.

PASS: gcc.dg/compat/struct-by-value-11 c_compat_y_tst.o compile
Executing on host: /home/hutchia/Desktop/awhconf/gcc/xgcc
-B/home/hutchia/Desktop/awhconf/gcc/ c_compat_main_tst.o c_compat_x_tst.o
c_compat_y_tst.o-DSTACK_SIZE=2048 -DNO_TRAMPOLINES -fno-show-column 
-DSIGNAL_SUPPRESS -mmcu=atmega128  /home/hutchia/Desktop/dejagnuboards/exit.c
-Wl,-u,vfprintf -lprintf_flt -Wl,-Tbss=0x802000,--defsym=__heap_end=0x80 
-lm   -o gcc-dg-compat-struct-by-value-11-01(timeout = 300)
/home/hutchia/local/avr/lib/gcc/avr/4.4.0/../../../../avr/bin/ld:
gcc-dg-compat-struct-by-value-11-01 section .text will not fit in region text
/home/hutchia/local/avr/lib/gcc/avr/4.4.0/../../../../avr/bin/ld: region text
overflowed by 353496 bytes
compiler exited with status 1
output is:
/home/hutchia/local/avr/lib/gcc/avr/4.4.0/../../../../avr/bin/ld:
gcc-dg-compat-struct-by-value-11-01 section .text will not fit in region text
/home/hutchia/local/avr/lib/gcc/avr/4.4.0/../../../../avr/bin/ld: region text
overflowed by 353496 bytes


-- 
   Summary: gcc.dg/compat/struct-by-value-xxx improper test for AVR
target
   Product: gcc
   Version: 4.4.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: testsuite
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com
  GCC host triplet: i686-pc-linux-gnu
GCC target triplet: avr-unknown-none


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=36285



[Bug target/32871] [avr] Bad optimisation - gcc is pushing too many registers

2008-04-27 Thread hutchinsonandy at aim dot com


--- Comment #6 from hutchinsonandy at aim dot com  2008-04-28 00:58 ---
Created an attachment (id=15540)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15540&action=view)
Partial solution using DF defs.


-- 

hutchinsonandy at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #15254|0                           |1
        is obsolete|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32871



[Bug target/32871] [avr] Bad optimisation - gcc is pushing too many registers

2008-04-27 Thread hutchinsonandy at aim dot com


--- Comment #7 from hutchinsonandy at aim dot com  2008-04-28 00:59 ---
Attached is INCOMPLETE attempt to fix this issue.

Register saves appear to be OK. But the same function is required for the argument
pointer elimination offset. It would appear DF chain info is not maintained
when global.c uses this. So the offset used to access arguments on the stack does not
reflect the final value required and will fail.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32871



[Bug target/35860] [4.3 Regression] [avr] code bloat caused by -fsplit-wide-types

2008-04-16 Thread hutchinsonandy at aim dot com


--- Comment #8 from hutchinsonandy at aim dot com  2008-04-16 13:10 ---
Subject: Re:  [4.3 Regression] [avr] code bloat caused by
 -fsplit-wide-types

Yes, indeed, I have patches in progress for AVR that do split
operations to take more advantage of lowering, but the bug is still an
issue then.

For example, if the testcase was using PLUS instead of OR, I would not
be able to split the instruction. (Anything with carried status is
problematic with reload and - as yet - cannot be split.)

The problem will merely propagate backwards until it gets blocked by an
unsplit wide mode operation (PLUS, COMPARE, SUB, MULT and probably
calls). Simply put, it will occur wherever a wide mode value meets a
set of subregs. Here it will determine there is a conflict - even if
there is not one.





-Original Message-
From: steven at gcc dot gnu dot org [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wed, 16 Apr 2008 4:59 am
Subject: [Bug target/35860] [4.3 Regression] [avr] code bloat caused by 
-fsplit-wide-types




--- Comment #7 from steven at gcc dot gnu dot org  2008-04-16 08:59 ---
I agree with Paolo in comment #6.  One purpose of the lower-subreg path was to
allow backends to *not* define insns that it doesn't have.  The expanders will
generate inline code for such patterns at expand time, with sets to subregs.
Before GCC had lower-subreg, this would lead to awful code, but now that we
split the subregs out to pseudos it ought to work just fine.

Sadly, even i386 still hasn't been modified to benefit from this work...


--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35860



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35860



[Bug target/35860] code bloat caused by -fsplit-wide-types

2008-04-09 Thread hutchinsonandy at aim dot com


--- Comment #3 from hutchinsonandy at aim dot com  2008-04-09 19:24 ---
Subject: Re:  code bloat caused by -fsplit-wide-types

Try the fwprop patch; it might well help.

I can't tell from the report where the opportunities are missed.

But anything split at combine/split won't get any benefit, as fwprop
passes only occur before (much to my dismay).

Register allocation has a more limited forward propagation ability (it
does not simplify, for one) and simplistically will remove one level of
redundant moves.

If we try to split before combine (expanded RTL), then combine does not work
so well and it's a net loss.

Combine on split types does not work well, as it is not possible to split all
instructions (like compare, add).

We can't split due to use of CC0. We use CC0 because I can't figure out
how to prevent reloads destroying status.

Dang it!



-Original Message-
From: eric dot weddington at atmel dot com [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wed, 9 Apr 2008 3:04 pm
Subject: [Bug target/35860] code bloat caused by -fsplit-wide-types




--- Comment #2 from eric dot weddington at atmel dot com  2008-04-09 19:04 ---
I'll see about testing with Andy Hutchinson's fwprop patch at bug #35542.


--

eric dot weddington at atmel dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hutchinsonandy at aim dot
                   |                            |com, eric dot weddington at
                   |                            |atmel dot com
   GCC host triplet|winavr 20080402 release     |mingw


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35860



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35860



[Bug target/34916] [4.3/4.4 Regression] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os

2008-04-08 Thread hutchinsonandy at aim dot com


--- Comment #11 from hutchinsonandy at aim dot com  2008-04-08 17:23 ---
Subject: Re:  [4.3/4.4 Regression] gcc.c-torture/execute/pr27364.c
 fails with -O1, -O2 and -Os

I believe the rules allow for this after a suitable grace period.

Remind me towards end of week and I will post for approval.

Andy



-Original Message-
From: eric dot weddington at atmel dot com [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Tue, 8 Apr 2008 11:16 am
Subject: [Bug target/34916] [4.3/4.4 Regression] 
gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os




--- Comment #10 from eric dot weddington at atmel dot com  2008-04-08 15:16 ---
Andy, since this was a 4.3 regression is there any way we can back port this
and commit it on the 4.3 branch?


--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916



[Bug rtl-optimization/35542] [4.3 Regression] fwprop only propagates one operand

2008-04-02 Thread hutchinsonandy at aim dot com


--- Comment #6 from hutchinsonandy at aim dot com  2008-04-02 15:44 ---
Subject: Re:  [4.3 Regression] fwprop only propagates
 one operand

Eric,

it's difficult to give you a specific example as the propagation is very
sensitive to generated code. I found this looking at other AVR bugs and
discovered it was not working. (I'll look up that example later for
you).

It's not an obvious thing, as other (good) changes due to the DF merge
re-arrange code enough to obscure the omission. But the result is an
extra move or redundant instruction here and there.

(It's also not at all specific to AVR.)

Andy





-Original Message-
From: eric dot weddington at atmel dot com [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Wed, 2 Apr 2008 10:21 am
Subject: [Bug rtl-optimization/35542] [4.3 Regression] fwprop only 
propagates one operand




--- Comment #4 from eric dot weddington at atmel dot com  2008-04-02 15:21 ---
(In reply to comment #3)
> committed to trunk, will backport to 4.3 in due time (causes regressions for
> AVR)


Could you list what fails?
Thanks!


--


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35542



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35542



[Bug target/34879] __builtin_setjmp / __builtin_longjmp fails stack frame address with O2, O3 and Os

2008-03-29 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-03-29 11:37 ---
Created an attachment (id=15395)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15395&action=view)
Setjmp patch for AVR

The attached patch is a fix for AVR target. MIPS does something similar to get
around same issue.

The real problem is with the gcc builtin setjmp receiver being removed by
optimizers.
Optimizers think that the frame_pointer load in the receiver is unneeded and remove it!

The patch loads the frame pointer in the nonlocal_goto, making the receiver
(where it jumps to) empty, so bad optimization cannot remove it. Additionally,
it avoids the unnecessary arithmetic around frame pointer offsets.

This patch was tested and the testcase passes.

Further changes may be required in the future if AVR 24bit jumps are to be
supported.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34879



[Bug target/21080] Excecution test failure for avr for pr17377 test case.

2008-03-29 Thread hutchinsonandy at aim dot com


--- Comment #3 from hutchinsonandy at aim dot com  2008-03-29 12:55 ---
Created an attachment (id=15396)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15396&action=view)
Patch to correct return_address

The attached patch fixes this problem and PR21078.
The AVR target support for builtin_return_address only returned the value of
frame_pointer+1 - so it would only be correct if stack and frame were empty.

The attached patch calculates the stack usage in the function prolog. This is
placed in symbol stack_usage using an UNSPEC instruction pattern. Builtin return
address uses RETURN_ADDR_RTX(count, tem) to add this to the frame pointer to get to
the correct address. This only supports level 0 (same function). Other levels (i.e.
upper functions) return 0 - which is the correct response if not supported.
The address is that read from the stack - i.e. a word address.

Testsuite torture/execute/20010122-1.c and PR17377.c both pass with this patch
applied.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21080



[Bug target/35508] [avr] 4.3.0: undefined reference to `__ffshi2'

2008-03-22 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-03-22 23:51 ---
This is same bug as:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34210

working on fix - to be posted soon.


-- 

hutchinsonandy at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hutchinsonandy at aim dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35508



[Bug target/34210] ffs builtin calls undefined __ffshi2

2008-03-22 Thread hutchinsonandy at aim dot com


--- Comment #5 from hutchinsonandy at aim dot com  2008-03-23 00:24 ---
Patch posted:

http://gcc.gnu.org/ml/gcc-patches/2008-03/msg01341.html


-- 

hutchinsonandy at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hutchinsonandy at aim dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34210



[Bug target/35508] [avr] 4.3.0: undefined reference to `__ffshi2'

2008-03-22 Thread hutchinsonandy at aim dot com


--- Comment #2 from hutchinsonandy at aim dot com  2008-03-23 00:24 ---
Patch posted

http://gcc.gnu.org/ml/gcc-patches/2008-03/msg01341.html


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35508



[Bug target/34932] [avr] ICE in reload

2008-03-21 Thread hutchinsonandy at aim dot com


--- Comment #4 from hutchinsonandy at aim dot com  2008-03-21 22:52 ---
Created an attachment (id=15357)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15357&action=view)
FIX for ICE

This patch disables the instruction pattern that causes the ICE. This pattern is used
for the case of addition where both operands are zero_extended.

Since zero extension of this type must still  load a zero into one register
anyway, there appears to be no benefit from this pattern over separate patterns
for addhi3_zero_extend1 and zero_extend.

So rather than trying to get reload to figure it out, the problem instruction
can be removed.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34932



[Bug target/30243] [4.1/4.2/4.3/4.4 Regression][avr] signbit() causes an internal compiler error

2008-03-17 Thread hutchinsonandy at aim dot com


--- Comment #8 from hutchinsonandy at aim dot com  2008-03-17 23:10 ---
Fails 4.3 on recently added testcase for same bug.

/cygdrive/e/gcc/gcc/testsuite/gcc.c-torture/execute/pr35456.c:17: internal
compiler error: in gen_lowpart_general, at rtlhooks.c:53
Please submit a full bug report


-- 

hutchinsonandy at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hutchinsonandy at aim dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30243



[Bug target/34916] [4.3/4.4 Regression] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os

2008-03-15 Thread hutchinsonandy at aim dot com


--- Comment #8 from hutchinsonandy at aim dot com  2008-03-15 23:40 ---
This appears to be the same bug, where combine is erroneously assuming all DF
register references are to different instructions. So it tries combining
instructions with themselves and stuff gets lost.

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519

The above testcase still fails with gcc version 4.4.0 20080305.

However, with patch from PR35519 it produces correct code:

  23/* prologue: function */
  24/* frame size = 0 */
  25.LM2:
  26  2BE0  ldi r18,lo8(11)
  27 0002 30E0  ldi r19,hi8(11)
  28 0004 40E0  ldi r20,hlo8(11)
  29 0006 50E0  ldi r21,hhi8(11)
  30 0008 0E94  call __mulsi3
  31.LVL1:
  32/* epilogue start */
  33.LM3:
  34 000c 0895  ret


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916



[Bug middle-end/35519] COMBINE repeating same matches and can SEG fault

2008-03-15 Thread hutchinsonandy at aim dot com


--- Comment #4 from hutchinsonandy at aim dot com  2008-03-15 23:49 ---
This bug also causes incorrect code and appears to be regression from 4.2

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916

The good news is that the fix is effective.

Anything else I can do to help expedite the implementation of the patch or
alternate fix? 


-- 

hutchinsonandy at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hutchinsonandy at aim dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519



[Bug rtl-optimization/35542] New: fwprop only propagates one operand

2008-03-11 Thread hutchinsonandy at aim dot com
fwprop.c  currently has a bug where a successful  propagation to one operand 
of an instruction will prevent propagation to any remaining operands.


The cause is the use of loc_mentioned_in_p() to check that a reference,
provided by an earlier DF scan, still exists in an instruction.

The test is intended to check that an earlier propagation and simplification 
has not removed D/U references.

However, loc_mentioned_in_p(), compares addresses of rtx to determine 
equivalence. If an instruction has already been modified and simplified, this
will
longer apply - even if the def/use is still valid.

This problem was already noted in Nov, but no bug report seems to have been 
filed.

http://gcc.gnu.org/ml/gcc-patches/2007-11/msg00170.html


I will attach a patch that uses reg_mentioned_p() as a substitute. The only
differences I noted were:

a) reg_mentioned_p() does not match physical sub registers of longer hard
registers.
This seems to have no consequence since fwprop entirely rejects hard registers
in later code. Perhaps for clarity, hard registers could be ignored earlier.

b) registers in asm_operands are found. This seems beneficial if they are
pseudos, and they are again ignored if they are hard registers.
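
To illustrate why an address comparison stops matching after an instruction is rewritten while a register-number comparison keeps matching, here is a minimal sketch. It is a toy model only: struct toy_pattern, struct toy_insn, toy_loc_mentioned() and toy_reg_mentioned() are hypothetical names and are not GCC's rtx types or its loc_mentioned_in_p()/reg_mentioned_p() functions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy pattern with two operand slots holding register numbers.
   Hypothetical type - GCC's real rtx is far richer. */
struct toy_pattern {
    unsigned regno[2];
};

/* Toy instruction that points at its current pattern. */
struct toy_insn {
    struct toy_pattern *pat;
};

/* Address-based check: true only if LOC is one of the operand slots
   of the pattern the insn holds right now. */
static bool toy_loc_mentioned(const unsigned *loc, const struct toy_insn *insn)
{
    assert(insn != NULL && insn->pat != NULL);
    for (size_t i = 0; i < 2; i++)
        if (&insn->pat->regno[i] == loc)
            return true;
    return false;
}

/* Value-based check: true if any operand slot names register REGNO. */
static bool toy_reg_mentioned(unsigned regno, const struct toy_insn *insn)
{
    assert(insn != NULL && insn->pat != NULL);
    for (size_t i = 0; i < 2; i++)
        if (insn->pat->regno[i] == regno)
            return true;
    return false;
}
```

If the pattern is rebuilt after the first propagation, a slot address recorded earlier no longer belongs to the instruction, so the address test fails even though the register is still used.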


I am only able to test this with the AVR port. There it was 100% successful,
with no regressions in the torture/execute testsuite.


-- 
   Summary: fwprop only propagates one operand
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: rtl-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35542



[Bug rtl-optimization/35542] fwprop only propagates one operand

2008-03-11 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-03-11 20:34 ---
Created an attachment (id=15300)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15300&action=view)
Patch to search modified instruction for register.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35542



[Bug middle-end/35519] COMBINE repeating same matches and can SEG fault

2008-03-10 Thread hutchinsonandy at aim dot com


--- Comment #3 from hutchinsonandy at aim dot com  2008-03-10 22:24 ---
Subject: Re:  COMBINE repeating same matches and can
 SEG fault

The quadratic nature does not seem to be a particular problem with the
data involved.

The log_links list is built up incrementally (with duplicates at present).

The patch loops over the existing list to check that a potential link is not
one already recorded. So initially the list is empty and grows (hence
quadratic). However, it is only truly quadratic if there are no duplicates.

Duplicates are grouped together and, if I understand correctly, the
last element added will always be checked first - so the inner loop
will terminate after 1 iteration if a duplicate is present.
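
As a rough sketch of that argument (not the attached patch itself), the following toy list checks the newest entry first and reports how many comparisons each insertion needed. struct toy_links, TOY_MAX_LINKS and toy_add_link() are hypothetical names, and a fixed array stands in for the real insn_list chain.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LINKS 16

/* Toy log-link list: instruction UIDs in insertion order. */
struct toy_links {
    int insn_uid[TOY_MAX_LINKS];
    size_t count;
};

/* Record UID unless it is already present. The scan starts at the most
   recently added entry. Returns the number of comparisons performed. */
static size_t toy_add_link(struct toy_links *l, int uid)
{
    assert(l != NULL);
    size_t compares = 0;
    for (size_t i = l->count; i > 0; i--) {
        compares++;
        if (l->insn_uid[i - 1] == uid)
            return compares;            /* duplicate: nothing to add */
    }
    if (l->count < TOY_MAX_LINKS)
        l->insn_uid[l->count++] = uid;
    return compares;
}
```

With grouped duplicates such as 45, 45, 45, 45, 46, 46, 46, 46, every repeated entry is rejected after a single comparison, so the cost stays close to linear in this model.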

The final list is quite small (typically 2-3 elements in length),
directly reflecting the number of links between RTL instructions.

If instead the list is created and then pruned afterwards, we have a
longer list. So the quadratic nature of the loop is now replaced by
checking/sorting. This could be better - but only if the final list is of a
significant size that can exploit non-quadratic searching.

Given the overhead of adding/removing linked items, and the small length
of the list, I believe the patch is better.

If, however, the number of RTL operands could be much larger than I have
assumed, then perhaps a change is needed. Personally I can't think of
many insns that refer to more than 3 other instructions. But I could be
wrong.

Andy




steven at gcc dot gnu dot org wrote:
> --- Comment #2 from steven at gcc dot gnu dot org  2008-03-10 20:04 ---
> The patch makes adding log use an algorithm quadratic in the number of log
> links per insn.  It is probably better to:
> 1. build the log links.
> 2. filter out the duplicates as a post pass (and maybe sort them while at it?)


   


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519



[Bug target/35507] [avr] 4.3.0: size of small funcion increases from 2 to 29 words

2008-03-09 Thread hutchinsonandy at aim dot com


--- Comment #2 from hutchinsonandy at aim dot com  2008-03-09 12:23 ---
Here is more info:

Testcase:

static long foo99(long b, long a)
{
    return b * a;
}

long foo2(long b, long a)
{
    return foo99(b, a);
}

Looking at the RTL, the USE notes of the respective calls are reversed. That
is, the USE order generated for the call to MULSI3 is reversed from that of a
normal C function that has the same arguments and calling conventions.

(call_insn/u 9 8 10 920625-1.c:45 (set (reg:SI 22 r22)
        (call (mem:HI (symbol_ref:HI ("__mulsi3") [flags 0x41]) [0 S2 A8])
            (const_int 0 [0x0]))) -1 (expr_list:REG_EH_REGION (const_int -1 [0x])
        (nil))
    (expr_list:REG_DEP_TRUE (use (reg:SI 18 r18))
        (expr_list:REG_DEP_TRUE (use (reg:SI 22 r22))
            (nil))))

(call_insn/u 9 8 10 920625-1.c:51 (set (reg:SI 22 r22)
        (call (mem:HI (symbol_ref:HI ("foo99") [flags 0x3] <function_decl 0x7fdcf030 foo99>) [0 S2 A8])
            (const_int 0 [0x0]))) -1 (expr_list:REG_EH_REGION (const_int 0 [0x0])
        (nil))
    (expr_list:REG_DEP_TRUE (use (reg:SI 22 r22))
        (expr_list:REG_DEP_TRUE (use (reg:SI 18 r18))
            (nil))))


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35507



[Bug target/35507] [avr] 4.3.0: size of small funcion increases from 2 to 29 words

2008-03-09 Thread hutchinsonandy at aim dot com


--- Comment #4 from hutchinsonandy at aim dot com  2008-03-09 18:36 ---
The problem is not a lack of commutation knowledge in the back end.

First - the use notes were a red herring. Reversing them did not help.

After much chasing through call.c and optabs.c, it looks like neither creates
nor corrects the issue.

But I can fix (hide) the issue by commuting the operands in the back end
expander to get optimal code. (Of course that does not fix ALL binop libcalls!)

So I tried a non-commutative operator (% or /) - and that was optimal, with no
expander changes.

So it would appear that the default "don't care" order presented to
expand_binop() is wrong. Where it does critically matter (non-commutative
functions), the order is ideal.

The effect on chained arithmetic or higher modes such as DImode is horrendous,
and may explain other noted problems with optimizations.


FYI here is the expander used to temporarily fix (hide) the problem - NOTE the
operand numbering, relative to RTL order.

(define_expand "mulsi3"
  [(set (reg:SI 18) (match_operand:SI 1 "register_operand" "r"))
   (set (reg:SI 22) (match_operand:SI 2 "register_operand" "r"))
   (parallel [(set (reg:SI 22) (mult:SI (reg:SI 22) (reg:SI 18)))
              (clobber (reg:HI 26))
              (clobber (reg:HI 30))])
   (set (match_operand:SI 0 "register_operand" "=r") (reg:SI 22))]
  "AVR_HAVE_MUL"
  "")


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35507



[Bug middle-end/35519] New: COMBINE repeating same matches and can SEG fault

2008-03-09 Thread hutchinsonandy at aim dot com
This problem potentially affects all targets.

The data flow information that combine uses can cause a segmentation fault. I
have found this with an AVR experimental build but it would seem that it can
affect any target.

The problem is that the LOG_LINKS that combine creates from DF can include
multiple references between instruction pairs.

DF will produce multiple references between instructions if they share a
register that decomposes into several smaller registers.

The multiple cross references are then used by combine to select instruction
pairs and triples to match. This results in repeat trials of the same
instructions!


By way of an example, the RTL that triggered my problem was:

(insn 45 42 46 4 920625-1.c:55 (set (reg:SI 22 r22 [ temp.24 ])
   (mem:SI (reg/v/f:HI 71 [ alpha ]) [2 S4 A8])) 19 {*movsi} (nil))


(insn 46 45 47 4 920625-1.c:55 (set (reg:SI 18 r18)
   (mem:SI (plus:HI (reg:HI 68 [ ivtmp.18 ])
   (const_int 4 [0x4])) [2 S4 A8])) 19 {*movsi} (nil))


(insn 47 46 48 4 920625-1.c:55 (parallel [
   (set (reg:SI 22 r22)
   (mult:SI (reg:SI 22 r22)
   (reg:SI 18 r18)))
   (clobber (reg:HI 26 r26))
   (clobber (reg:HI 30 r30))
   ]) 43 {*mulsi3_call} (expr_list:REG_DEAD (reg:SI 18 r18)
   (expr_list:REG_UNUSED (reg:HI 30 r30)
   (expr_list:REG_UNUSED (reg:HI 26 r26)
   (nil)))))



This is a call to a library function, and the parameters for instruction 47 are
hard registers like SI:R22 - which is decomposed in DF as R22, R23, R24 and R25.
DF marks all 4 sub parts in def/use chains (which seems entirely correct).
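
A small sketch of that decomposition, assuming byte-wide hard registers as on AVR. toy_decompose() is a hypothetical helper written for illustration and is not a DF or GCC function.

```c
#include <assert.h>
#include <stddef.h>

/* Write the byte-wide hard registers covered by a value of MODE_BYTES
   bytes starting at FIRST_REGNO into OUT (at most OUT_LEN entries).
   Returns how many register numbers were written. */
static size_t toy_decompose(unsigned first_regno, size_t mode_bytes,
                            unsigned *out, size_t out_len)
{
    assert(out != NULL);
    size_t n = 0;
    while (n < mode_bytes && n < out_len) {
        out[n] = first_regno + (unsigned)n;
        n++;
    }
    return n;
}
```

For a 4-byte (SImode) value in R22 this yields four entries, R22 to R25. If each entry produces its own def/use reference, one register operand becomes four references to the same defining instruction.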

When DF information is transferred into LOG_LINKS we still have 4 references
from instruction 47 back to each of the definitions in instructions 45 and 46.
From gdb this was:

(gdb) print uid_log_links[47]
$8 = (rtx) 0x7ff140d0
(gdb) pr
(insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 45
(insn_list:REG_DEP_TRUE 45 (insn_list:REG_DEP_TRUE 46 (insn_list:REG_DEP_TRUE 46
(insn_list:REG_DEP_TRUE 46 (insn_list:REG_DEP_TRUE 46 (nil)))))))))


These multiple references cause combine to try the same combination of
instructions 45 and 47 multiple times (thinking they are different
instructions). In this case the match is tried 4 times - 3 more than needed.
This part appears mostly benign - except for processing time/memory used.

BUT when combine tries three instructions, it can crash. In this example,
combine ends up trying to combine 2 duplicates of instruction 45 with 47:

I1=
(insn 45 42 46 4 920625-1.c:55 (set (reg:SI 22 r22 [ temp.24 ])
   (mem:SI (reg/v/f:HI 71 [ alpha ]) [2 S4 A8])) 19 {*movsi} (nil))
I2=
(insn 45 42 46 4 920625-1.c:55 (set (reg:SI 22 r22 [ temp.24 ])
   (mem:SI (reg/v/f:HI 71 [ alpha ]) [2 S4 A8])) 19 {*movsi} (nil))

I3=
(insn 47 46 48 4 920625-1.c:55 (parallel [
   (set (reg:SI 22 r22)
   (mult:SI (reg:SI 22 r22)
   (reg:SI 18 r18)))
   (clobber (reg:HI 26 r26))
   (clobber (reg:HI 30 r30))
   ]) 43 {*mulsi3_call} (expr_list:REG_DEAD (reg:SI 18 r18)
   (expr_list:REG_UNUSED (reg:HI 30 r30)
   (expr_list:REG_UNUSED (reg:HI 26 r26)
   (nil)))))


Combine merges I1 into I3 and deletes I1. Combine notes that the life of R22
terminates in I2 and attempts to put a REG_DEAD note on I2 - except of course
the deletion of I1 also deletes the identical I2. A segmentation fault occurs.


-- 
   Summary: COMBINE repeating same matches and can SEG fault
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com
  GCC host triplet: i686-pc-cygwin
GCC target triplet: avr-unknown-none


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519



[Bug middle-end/35519] COMBINE repeating same matches and can SEG fault

2008-03-09 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-03-09 23:52 ---
Created an attachment (id=15287)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15287&action=view)
Patch for consideration towards a solution


Patch that removes duplicates when LOG_LINKS is created.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35519



[Bug target/35507] [avr] 4.3.0: size of small funcion increases from 2 to 29 words

2008-03-08 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-03-09 04:35 ---
I can confirm this regression.

There appears to be something strange in the commutation of operands before RTL
is created, which may well explain why it used to work.

The default expanders are creating calls to commutative functions that have
the opposite argument order from normal function parameters. Functionally this
does not matter, but it twists up the data flow.

The argument order presented by the RTL is backwards from the ideal. This
happens with both target AVR expander and the default expander (the original PR
was using a default expander). Clearly this is avoidable.

It is not a function of the original code - gcc is purposely ordering the
operands. Regardless of operand ordering (x*y or y*x), the RTL is always
backwards. 


The other factor is the lack of forward propagation in the RTL stages. If this
was effective, RTL ordering would not matter. The limited propagation that is
present avoids hard registers - so never connects the arguments with the
library function. Also combine can't exploit the commutation of the operands.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35507



[Bug target/32871] [avr] Bad optimisation - gcc is pushing too many registers

2008-03-02 Thread hutchinsonandy at aim dot com


--- Comment #3 from hutchinsonandy at aim dot com  2008-03-02 17:22 ---
The problem is caused by a bug in gcc DF, or at least incorrect documentation
regarding prolog/epilog register save/restores.

As specified in the internals manual, the AVR prolog/epilog uses
df_regs_ever_live_p(reg) to determine which registers should be saved on the
stack (if reg is not a call_used_register).

However, if a function has arguments that are stored in non call_used_registers
(R8-R17), then this test gives an incorrect result and these registers will
always be saved/restored by the prolog/epilog. (Argument registers never need
to be saved/restored.)


This problem only applies to targets that pass arguments in non
call_used_registers.

Unfortunately no part of gcc including DF appears to have proper information to
use directly.

In the absence of a change to gcc, the target can determine which registers are
REALLY used as arguments and exclude these from save/restores. This requires
going through all function arguments again using the target argument macros.
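
The intended save decision can be sketched with plain bit masks. This is a toy model under the assumptions stated in this comment: toy_regset, toy_reg() and toy_regs_to_save() are hypothetical names and not GCC's HARD_REG_SET API.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register set: bit n set means hard register Rn. */
typedef uint32_t toy_regset;

/* Single-register set for Rn (n must be below 32). */
static toy_regset toy_reg(unsigned n)
{
    assert(n < 32);
    return (toy_regset)1u << n;
}

/* Registers the prologue would push in this model: used somewhere in
   the function, not call-used, and not holding an incoming argument. */
static toy_regset toy_regs_to_save(toy_regset ever_live,
                                   toy_regset call_used,
                                   toy_regset arg_regs)
{
    return ever_live & ~call_used & ~arg_regs;
}
```

Passing an empty arg_regs set reproduces the current behaviour, where R16/R17 holding arguments are pushed anyway. Passing the real argument registers removes them from the save set.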

Will post patch when it's finished testing. But here is key routine:

/* Set a bit in *SET for each hard register used to pass an argument
   to the current function.  */

static void
avr_args (HARD_REG_SET *set)
{
  int reg;
  int i;
  rtx arg;
  CUMULATIVE_ARGS cum;

  tree decl = DECL_ARGUMENTS (current_function_decl);
  INIT_CUMULATIVE_ARGS (cum, TREE_TYPE (current_function_decl), NULL_RTX,
                        decl, -1);

  for (; decl; decl = TREE_CHAIN (decl))
    {
      if (TREE_CODE (decl) == PARM_DECL
          && DECL_NAME (decl) && !DECL_ARTIFICIAL (decl))
        {
          enum machine_mode mode = DECL_MODE (decl);
          /* Get argument RTX.  */
          /* This target does not use the named attribute.  */
          arg = FUNCTION_ARG (cum, mode, DECL_ARG_TYPE (decl), 1);
          FUNCTION_ARG_ADVANCE (cum, mode, DECL_ARG_TYPE (decl), 1);
          /* ARG is NULL_RTX when the argument is not passed in registers.  */
          if (arg && REG_P (arg))
            {
              reg = REGNO (arg);
              for (i = 0; i < HARD_REGNO_NREGS (reg, mode); i++)
                {
                  if (set)
                    SET_HARD_REG_BIT (*set, reg + i);
                }
            }
        }
    }
}


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32871



[Bug target/32871] [avr] Bad optimisation - gcc is pushing too many registers

2008-03-02 Thread hutchinsonandy at aim dot com


--- Comment #4 from hutchinsonandy at aim dot com  2008-03-02 23:32 ---
Created an attachment (id=15254)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15254&action=view)
Patch to fix bug.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=32871



[Bug target/34789] [avr] sometimes the compiler keeps addresses in registers unnecessarily

2008-02-21 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-02-22 01:22 ---
This appears to be due to avr_rtx_costs not assigning the same cost to
SYMBOL_REF and CONST_INT. So SYMBOL_REF looks expensive and is held in a
register to avoid recalculating it.

A quick change to make SYMBOL_REF the same cost as CONST_INT in both
avr_rtx_costs and avr_operand_rtx_cost gave the desired result:

  22              /* prologue: function */
  23              /* frame size = 0 */
  24              .LM2:
  25 0000 E091      lds r30,a
  26 0004 F091      lds r31,(a)+1
  27 0008 EE0F      lsl r30
  28 000a FF1F      rol r31
  29 000c E050      subi r30,lo8(-(data))
  30 000e F040      sbci r31,hi8(-(data))
  31 0010 8081      ld r24,Z
  32 0012 9181      ldd r25,Z+1
  33 0014 0E94      call foo
  34              .LM3:
  35 0018 FC01      movw r30,r24
  36 001a EE0F      lsl r30
  37 001c FF1F      rol r31
  38 001e E050      subi r30,lo8(-(data))
  39 0020 F040      sbci r31,hi8(-(data))
  40 0022 8081      ld r24,Z
  41 0024 9181      ldd r25,Z+1
  42 0026 0E94      call foo
  43              /* epilogue start */
  44              .LM4:
  45 002a 0895      ret

CONST and LABEL_REF might also have the same problem as they cost the same as
MEM.

But note, if SYMBOL_REF were part of a memory address, then it might be better
held in a register (like Y or Z) - this should be done by checking the outer
code. With an outer code of MEM, SYMBOL_REF would be more expensive than a
register (which might save a few LDS/STS that appear in the code).

avr_rtx_costs needs some serious work done to correct these and other anomalies
in cost assumptions and the recursion on operand costs.
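
A minimal sketch of the costing idea described above, with the outer code deciding how a SYMBOL_REF is priced. The enum, the function toy_operand_cost() and the numeric costs are all assumptions made for illustration. They are not the values or the structure of the real avr_rtx_costs.

```c
#include <assert.h>

/* Toy RTL codes, hypothetical and much reduced. */
enum toy_code {
    TOY_CODE_REG,
    TOY_CODE_CONST_INT,
    TOY_CODE_SYMBOL_REF,
    TOY_CODE_MEM,
    TOY_CODE_SET
};

/* Cost of an operand with code CODE appearing directly under OUTER.
   The numbers are arbitrary relative weights. */
static int toy_operand_cost(enum toy_code code, enum toy_code outer)
{
    switch (code) {
    case TOY_CODE_REG:
        return 0;
    case TOY_CODE_CONST_INT:
        return 1;
    case TOY_CODE_SYMBOL_REF:
        /* Same as CONST_INT in general, but dearer than a register
           when used as a memory address. */
        return outer == TOY_CODE_MEM ? 2 : 1;
    case TOY_CODE_MEM:
        return 4;
    default:
        assert(!"not an operand code in this sketch");
        return 0;
    }
}
```

In this model a SYMBOL_REF used as a plain value costs the same as a CONST_INT, so there is no incentive to keep it in a register, while inside a MEM a pointer register still compares favourably.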


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34789



[Bug target/34790] [avr] no sibling call optimisation

2008-02-21 Thread hutchinsonandy at aim dot com


--- Comment #2 from hutchinsonandy at aim dot com  2008-02-22 01:43 ---
We have not gotten around to adding support for tail calls for avr, so nothing
happens. It is not a bug - but it is still a valid feature request.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34790



[Bug target/35013] Incomplete check in RTL for pm() annotation

2008-02-16 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-02-16 22:06 ---
Created an attachment (id=15169)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15169&action=view)
Patch

The attached patch allows function address expressions of the form address+k
to be correctly recognized as program memory addresses and thus forces use of
pm() assembler syntax.

This has not been extensively tested but the assembler output appears to be
correct for this bug's testcase and a similar issue found in:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=27192

Note, odd addresses will be accepted by C and only cause linker warning.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35013



[Bug target/11180] [avr-gcc] Optimization decrease performance of struct assignment.

2008-02-02 Thread hutchinsonandy at aim dot com


--- Comment #28 from hutchinsonandy at aim dot com  2008-02-02 15:44 ---
The patch and suggestions on this are valid. However, memory moves -
particularly with base pointers - may require additional instructions to be
added to reach the required displacements. Splitting such moves may well incur
the overhead per BYTE!

So we need to get the pointers sorted so that (for example) 4 separate QI byte
accesses are just as good as 1 SI access to 4 bytes.

As for other pattern removal - YES!

The reason to change MOVE_MAX is not made clear. I understood this to control
the threshold for using movmem rather than inline RTL. movmem wins (in size) at
about 8+ bytes. Does it have another use related to this problem?


-- 

hutchinsonandy at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hutchinsonandy at aim dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11180



[Bug target/34916] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os

2008-01-22 Thread hutchinsonandy at aim dot com


--- Comment #4 from hutchinsonandy at aim dot com  2008-01-22 23:41 ---
The WRONG CODE is still present on 4.3.0 20080121 HEAD.
This is a regression from 4.2 (a big one too!)

4.2.2 20071221 (WinAVR) OK
4.3.0 20071213 FAILS
4.3.0 HEAD 20080121 FAILS

avr-gcc -c -mmcu=atmega128  -g -w  -Q -O2 -DSTACK_SIZE=400 -da  -DNO_TRAMPOLINES
-fno-show-column  -DSIGNAL_SUPPRESS -lm -Wa,-adhlns=xpr27364.lst -lm
-std=gnu99 xpr27364.c -o xpr27364.o


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916



[Bug target/34932] [avr] ICE in reload

2008-01-22 Thread hutchinsonandy at aim dot com


--- Comment #3 from hutchinsonandy at aim dot com  2008-01-23 02:50 ---
The pattern requires operand 1 to be the same register as operand 0.

Operands 1 and 2 share 2 subregs of the same HImode register R22.

But this should have been solvable without any problem, since HI24 is just right!

QI:21 - QI:24

HI24 = Zex:QI24 + ZexQI22 voila!

QI22 could have been in top half of HI24. So this also works

HI:22 -  HI:22

HI:24 = Zex:QI24 + ZexQI24


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34932



[Bug target/34916] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os

2008-01-21 Thread hutchinsonandy at aim dot com


--- Comment #1 from hutchinsonandy at aim dot com  2008-01-22 00:23 ---
Created an attachment (id=14991)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14991&action=view)
Combine pass RTL dump file


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916



[Bug target/34916] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os

2008-01-21 Thread hutchinsonandy at aim dot com


--- Comment #2 from hutchinsonandy at aim dot com  2008-01-22 00:26 ---
Created an attachment (id=14992)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14992&action=view)
dce pass RTL dump file (before combine)

Posted two RTL dump files of a smaller testcase:


long f2(long number_of_digits_to_use)
{

  return ( number_of_digits_to_use * 11L ) ;
}

This fails with GCC 4.3.0 head of 13/12/2007.

Combine decides the load of the constant is not needed. This seems to happen
after combine tries combining the load with the multiply (i.e. multiply by
constant).


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916



[Bug target/34916] gcc.c-torture/execute/pr27364.c fails with -O1, -O2 and -Os

2008-01-21 Thread hutchinsonandy at aim dot com


--- Comment #3 from hutchinsonandy at aim dot com  2008-01-22 00:52 ---
Assembler output for the short testcase. The constant load (11L) is missing:

  16              .Ltext0:
  17              .global f2
  19              f2:
  20              .LFB2:
  21              .LM1:
  22              .LVL0:
  23              /* prologue: function */
  24              /* frame size = 0 */
  25              .LM2:
  26 0000 0E94      call __mulsi3
  27              .LVL1:
  28              /* epilogue start */
  29              .LM3:
  30 0004 0895      ret
  31              .LFE2:
  57              .Letext0:
DEFINED SYMBOLS


-- 

hutchinsonandy at aim dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |hutchinsonandy at aim dot
                   |                            |com


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34916



[Bug target/34888] New: Stack patterns for AVR not optimal

2008-01-20 Thread hutchinsonandy at aim dot com
There are several instruction patterns related to stack pointer operations.
These are not quite right:

1) The popqi and pophi patterns use post_inc codes when in fact they are
pre_inc. This could fail if gcc ever used them outside prolog/epilog.

2) Stack moves such as push/pop should be placed before mov patterns, to
provide best matching.

3) Stack adjustment (SP=SP+c) is matching with the *addhi pattern, which causes
reloads of output and input. We really want it to match *addhi3_sp_R_pc2 when
the offset is small, so *addhi3_sp_R_pc2 needs to be placed before *addhi.

4) *addhi3_sp_R_pc2 does not provide the full range of optimal adjustment. For
example, SP=SP+8 takes 4 instructions using rcall but around 8 using addition
through a register. This functionality needs extending accordingly, as SP=SP+8
is very common.


-- 
   Summary: Stack patterns for AVR not optimal
   Product: gcc
   Version: 4.3.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: target
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: hutchinsonandy at aim dot com
GCC target triplet: avr-*-*


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34888



[Bug target/34412] ICE in extract_insn, at recog.c:1990

2008-01-11 Thread hutchinsonandy at aim dot com


--- Comment #4 from hutchinsonandy at aim dot com  2008-01-11 23:32 ---
Created an attachment (id=14928)
 -- (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14928&action=view)
Fix expander patch

The prior analysis is correct. A typo resulted in a QI addition to the HImode
frame pointer when a Tiny series target was selected (256 byte stack). The
addition was intended to change just the LSB, as the MSB is always fixed.

The patch corrects the prolog expander typo so that an 8 bit (QI) increment is
made to the 8 bit representation of the frame pointer.
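
The intended arithmetic can be sketched as an 8 bit add on the low byte only. toy_add_low_byte() is a hypothetical helper for illustration, not code from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Add DELTA to the low byte of the 16 bit frame pointer FP as an 8 bit
   operation: any carry out of the low byte is discarded and the high
   byte is left unchanged. */
static uint16_t toy_add_low_byte(uint16_t fp, uint8_t delta)
{
    uint8_t low = (uint8_t)((fp & 0xFFu) + delta);
    uint16_t result = (uint16_t)((fp & 0xFF00u) | low);
    assert((result & 0xFF00u) == (fp & 0xFF00u));
    return result;
}
```

This only makes sense when the whole stack fits in one 256 byte page, which is the Tiny case described above.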


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34412



[Bug target/34412] ICE in extract_insn, at recog.c:1990

2008-01-11 Thread hutchinsonandy at aim dot com


--- Comment #5 from hutchinsonandy at aim dot com  2008-01-11 23:40 ---
An immediate workaround for Tiny targets is to optimise at a higher level (-Os).

This will most likely remove the need for a frame pointer and skirt around the
bug, though it will still happen if there are more auto variables than
registers free.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34412