[Bug target/11180] [avr-gcc] Optimization decrease performance of struct assignment.

2005-03-27 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-03-27 
14:33 ---
The problem here is that gcc is using a DImode register to handle a 6 byte
(int+long) structure. Why, I have no idea!

Since the target has no insn for DI move, gcc turns this into individual QImode
byte moves (subregs all over the place!). 

The 'stacked' 6 byte structure is 'popped' into the DI register (6 bytes). Two
other byte registers are explicitly cleared (making up our 8 byte DI register).

What then follows is a large amount of shuffling, i.e. moving from the
intermediate virtual DI register (8 bytes) into the correct place for a 6 byte
return. This seems to surpass the abilities of the register allocator (the DI
and return registers overlap).

Smaller structures (<=4 bytes) are optimally handled. Larger structures (>8
bytes) are also much better since they are returned in memory.

So in summary, it would appear that the root cause is allocation of a DI mode
register for structures >4 and <=8 bytes.

A secondary factor is the use of QImode moves (when SImode and HImode moves are
available and more efficient).

The problem can be partially alleviated by defining DImode moves (that's a hell
of a change though). Poor code still remains - for example clearing unused
padding bytes and extra register usage.

PS -fpack-struct does not change this bug.
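
For anyone wanting to reproduce the shape of the problem, here is a minimal
sketch of the kind of code involved. It is not the PR's testcase. The names
pair6, make_pair6 and assign_pair6 are made up for illustration, and the
fixed-width types plus packing are only there so that a host compiler gives
the same 6 byte size that int+long has on AVR (16 bit int, 32 bit long).

```c
#include <stdint.h>

/* Hypothetical stand-in for the structure in the report: on AVR, int is
   16 bits and long is 32 bits, so int+long occupies 6 bytes.  Packing
   reproduces that size on a host compiler. */
#pragma pack(push, 1)
struct pair6 {
    int16_t a;   /* plays the role of AVR 'int'  */
    int32_t b;   /* plays the role of AVR 'long' */
};
#pragma pack(pop)

/* Returning by value: the case the comment says goes through a DImode
   (8 byte) temporary on avr-gcc. */
struct pair6 make_pair6(int16_t a, int32_t b)
{
    struct pair6 p;
    p.a = a;
    p.b = b;
    return p;
}

/* Plain struct assignment: the case named in the bug title. */
void assign_pair6(struct pair6 *dst, const struct pair6 *src)
{
    *dst = *src;
}
```

Compiling something like this for avr at -Os and reading the assembler should
show whether the byte shuffling described above is present.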



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11180


[Bug bootstrap/20452] New: HEAD ICE during make install

2005-03-13 Thread andrewhutchinson at cox dot net
gcc build fails with:

gcc -c   -g -O2 -DIN_GCC -DCROSS_COMPILE  -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes  -fno-common   -DHAVE_CONFIG_H    -I. -I. -I../../gcc/gcc -I../../gcc/gcc/. -I../../gcc/gcc/../include -I../../gcc/gcc/../libcpp/include  ../../gcc/gcc/c-lex.c -o c-lex.o
../../gcc/gcc/c-lex.c: In function `c_lex_with_flags':
../../gcc/gcc/c-lex.c:428: error: too many arguments to function `cpp_spell_token'
make[1]: *** [c-lex.o] Error 1
make[1]: Leaving directory `/home/cvsroot/awhconf/gcc'

make: *** [all-gcc] Error 2

Apparently due to:

*cpp_spell_token (parse_in, tok, name, true) = 0;

in c-lex.c where cpp_spell_token appears to now only have 3 arguments elsewhere.

configured using:
$ ../gcc/configure --prefix=/avrdev --enable-languages=c,c++ --target=avr --disable-nls

-- 
   Summary: HEAD ICE during make install
   Product: gcc
   Version: 4.1.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: bootstrap
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: andrewhutchinson at cox dot net
CC: gcc-bugs at gcc dot gnu dot org
  GCC host triplet: i686-pc-cygwin
GCC target triplet: avr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20452


[Bug target/18251] unable to find a register to spill in class `POINTER_REGS'

2005-03-12 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-03-12 
21:20 ---
(In reply to comment #18)
> (In reply to comment #17)

I think it is always true, but the original used the same predicate and test (so
I played safe).

The pattern only helps if it is a constant.  I also thought it should handle
variable block sizes. However, I found gcc already produces optimal code for
that case without any help.

> > Marek, can you review this bug, the attached patches, and possibly approve
> > committing the fix?
>
> I'm looking into it right now.  I'm not sure about one thing: should movmemhi
> handle only constant block sizes, or variable block sizes too?  If variable -
> is it safe to assume nonzero?  (now 0 means 65536)
>
> Operand 2 (block size) has the const_int_operand predicate - doesn't this
> mean that (GET_CODE(operands[2]) == CONST_INT) is always true?
>
> Thanks,
> Marek



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18251


[Bug target/18251] unable to find a register to spill in class `POINTER_REGS'

2005-03-12 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-03-13 
01:19 ---
Subject: Re:  unable to find a register to spill in class
 `POINTER_REGS'

The concerns have merit but can be discounted.

The reload problem occurs because the original pattern demands two 
pointers in parallel existence. (Actually two pointers and a 
counter!)  The current register allocation is imperfect and with this 
constraint (and only 3 pointers incl. frame) fails to find a solution.

The new RTL expansion does not demand both pointer registers at the same 
time - indeed they could be the same register in extreme circumstances. 
Breaking up the RTL reveals this to GCC and allows the register 
allocator to find a solution. So that is why it works.

The SAME result can also be realised by deleting the offending pattern - 
in this situation GCC generates its own solution, which happens to be 
identical RTL to the proposed solution (with a 16 bit counter). And 
indeed there is no reload failure.

Since the proposed pattern FAILs in the variable case, we will still end 
up with GCC's solution and we can conclude there will be no hidden 
reload issue. (It should also be noted that a variable count is also 
less restrictive on hard register use than a constant.)

Now here is the neat bit! Since the GCC middle end generates the detailed 
RTL loop for a variable count, we can and should rely on it to 
consider any restriction on the variable (i.e. variable count=0). If not, 
it's very very broken.

I was very tempted to submit a patch that just deletes the pattern; 
however, that would produce worse code for the very common case where 
the fixed count is <255.

I hope this clarifies things.
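
As a rough illustration of why the zero count matters (Marek's "now 0 means
65536"), here is a C model of a decrement-then-test copy loop with an 8 bit
counter. This is only a sketch of the loop shape, not the RTL the patch emits,
and copy_loop_u8 is a made-up name. The 16 bit counter behaves the same way
with 65536 in place of 256.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of a decrement-then-test copy loop with an 8 bit counter
   (the QImode case).  Returns the number of bytes copied. */
size_t copy_loop_u8(uint8_t *dst, const uint8_t *src, uint8_t count)
{
    size_t copied = 0;
    do {
        *dst++ = *src++;
        copied++;
    } while (--count != 0);   /* 0 wraps to 255, so count == 0 runs 256 times */
    return copied;
}
```

A constant count lets the expander rule out zero up front, which is why the
loop needs no extra test in that case.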



marekm at amelek dot gda dot pl wrote:

--- Additional Comments From marekm at amelek dot gda dot pl  2005-03-13 
00:30 ---
Subject: Re:  unable to find a register to spill in class `POINTER_REGS'

On Sat, Mar 12, 2005 at 09:20:18PM -, andrewhutchinson at cox dot net 
wrote:

  

The pattern only helps if it is a constant.  I also thought it should handle
variable block size. However,  I found gcc already produces optimal code for
that case without any help.



See below for revised patch (currently for mainline):
 - FAIL if count is not a CONST_INT
 - handle count == 0 (nothing to do)
 - handle count > 32767 (negative in RTL, mask with 0x)
 - minor formatting fixes

But, I'm still concerned a little about the variable block size:
 - __tmp_reg__ will not be used (some other register will)
 - more importantly, can the problem from this PR (unable to find a register
   to spill in class POINTER_REGS) still occur in the variable size case?
   (only with a different, not yet known test case - this means we are
   perhaps trying to hide the real bug instead of fixing it...)

If we have to handle the variable count case too, one more insn will
be needed (initially jump to decrementing the counter; test for carry
instead of zero).  Some other targets handle this by calling a subroutine
in libgcc.S - smaller (but slower) than generating the loop inline.

Marek


2005-03-12  Andy Hutchinson  [EMAIL PROTECTED]

   PR target/18251
   * config/avr/avr.md (movmemhi): Rewrite as RTL loop.
   (*movmemqi_insn): Delete.
   (*movmemhi): Delete.

Index: avr.md
===
RCS file: /cvs/gcc/gcc/gcc/config/avr/avr.md,v
retrieving revision 1.50
diff -c -3 -p -r1.50 avr.md
*** avr.md 6 Mar 2005 21:50:36 -   1.50
--- avr.md 12 Mar 2005 23:51:57 -
***
*** 346,421 
  
  ;;=
  ;; move string (like memcpy)
  
  (define_expand "movmemhi"
    [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
!                    (match_operand:BLK 1 "memory_operand" ""))
!               (use (match_operand:HI 2 "const_int_operand" ""))
!               (use (match_operand:HI 3 "const_int_operand" ""))
!               (clobber (match_scratch:HI 4 ""))
!               (clobber (match_scratch:HI 5 ""))
!               (clobber (match_dup 6))])]

{
!   rtx addr0, addr1;
!   int cnt8;
enum machine_mode mode;
  
if (GET_CODE (operands[2]) != CONST_INT)
  FAIL;
-   cnt8 = byte_immediate_operand (operands[2], GET_MODE (operands[2]));
-   mode = cnt8 ? QImode : HImode;
-   operands[6] = gen_rtx_SCRATCH (mode);
-   operands[2] = copy_to_mode_reg (mode,
-   gen_int_mode (INTVAL (operands[2]), mode));
-   addr0 = copy_to_mode_reg (Pmode, XEXP (operands[0], 0));
-   addr1 = copy_to_mode_reg (Pmode, XEXP (operands[1], 0));
  
!   operands[0] = gen_rtx_MEM (BLKmode, addr0);
!   operands[1] = gen_rtx_MEM (BLKmode, addr1);
  })
  
- (define_insn "*movmemqi_insn"
-   [(set (mem:BLK (match_operand:HI 0 "register_operand" "e"))
-         (mem:BLK (match_operand:HI 1 "register_operand" "e")))
-    (use (match_operand:QI 2 "register_operand" "r"))
-    (use (match_operand:QI 3 "const_int_operand" "i

[Bug target/18251] unable to find a register to spill in class `POINTER_REGS'

2005-03-12 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-03-13 
02:44 ---
Subject: Re:  unable to find a register to spill in class
 `POINTER_REGS'

This is a define_expand.  Predicates (such as const_int_operand) and 
the pattern have no effect at all on generated code or matching. This 
pattern always ends in DONE or FAIL.

That is why you need to test the operand in the body.

ANDing with 0x is wrong, as is trying to handle the variable count 
case. That is fixing something that is not broken.

So it looks like my patch is ok?




schlie at comcast dot net wrote:

--- Additional Comments From schlie at comcast dot net  2005-03-13 02:06 
---
(In reply to comment #20)

with reference to the most recent patch:
- ANDing with 0x may turn negatives positive, so it seems wrong.
- there's no need to limit byte counts to below 0x100 for bytes, as 0xFF is a
   count as long as it was originally verified that the integer value is 
 positive.

And just as a heads up, from when I was fooling with a different variant, I
discovered that (use (match_operand:HI 2 "const_int_operand" "")) apparently
also matches variable operands when compiling avrlibc:

Apparently failing as no code is generated:

../../../../libc/stdlib/realloc.c:154: error: unrecognizable insn:
(insn 235 232 236 31 ../../../../libc/stdlib/realloc.c:151 (parallel [
(set (mem:BLK (reg/v/f:HI 49 [ memp ]) [0 A8])
 (mem:BLK (reg/v/f:HI 60 [ ptr ]) [0 A8]))
(use (reg:HI 81 [ variable.sz ]))
(use (const_int 1 [0x1]))
]) -1 (insn_list:REG_DEP_TRUE 232 (nil))
(expr_list:REG_DEAD (reg:HI 81 [ variable.sz ])
(expr_list:REG_DEAD (reg/v/f:HI 49 [ memp ])
(nil

From the following yet another version of Andy's patch:

(and for the hell of it, enclosed at the end, a version which
 attempts to handle variable counts, but couldn't figure out
 how to get the conditional insertion of a forward branch
 label generated correctly:)

- won't emit code unless (count > 0).

- removes code for non-constant count moves; as it would have
  generated incorrect code for move count = 0.

- allocates a temporary, rather than presuming r0 is safe to use.
  (and seems to generate just as good code, as a step to freeing r0)

-- def --

;; move string (like memcpy)
;; implement as RTL loop

(define_expand "movmemhi"
  [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
                   (match_operand:BLK 1 "memory_operand" ""))
              (use (match_operand:HI 2 "const_int_operand" ""))
              (use (match_operand:HI 3 "const_int_operand" ""))])]
  ""
  {
  int cnt8, prob;
  enum machine_mode mode;

  rtx loop_reg;
  rtx label = gen_label_rtx ();

  /* Copy pointers into new pseudos - they will be changed.  */
  rtx addr0 = copy_to_mode_reg (Pmode, XEXP (operands[0], 0));
  rtx addr1 = copy_to_mode_reg (Pmode, XEXP (operands[1], 0));

  /* If loop count is constant, try to use QImode counter.  */
  if ((GET_CODE (operands[2]) == CONST_INT) && (INTVAL (operands[2]) > 0))
  {
    /* See if constant fits 8 bits.  */
    cnt8 = byte_immediate_operand (operands[2], GET_MODE (operands[2]));
    mode = cnt8 ? QImode : HImode;

    /* Create loop counter register.  */
    loop_reg = copy_to_mode_reg (mode, gen_int_mode (INTVAL (operands[2]),
                                                     mode));

    /* Create RTL code for move loop, with label at top of loop.  */
    emit_label (label);

    /* Move one byte into scratch and inc pointer.  */
    rtx tmp_reg = copy_to_mode_reg (QImode, gen_rtx_MEM (QImode, addr1));
    emit_move_insn (addr1, gen_rtx_PLUS (Pmode, addr1, const1_rtx));

    /* Move scratch into mem, and inc other pointer.  */
    emit_move_insn (gen_rtx_MEM (QImode, addr0), tmp_reg);
    emit_move_insn (addr0, gen_rtx_PLUS (Pmode, addr0, const1_rtx));

    /* Decrement count.  */
    emit_move_insn (loop_reg, gen_rtx_PLUS (mode, loop_reg, constm1_rtx));

    /* Compare with zero and jump if not equal.  */
    emit_cmp_and_jump_insns (loop_reg, const0_rtx, NE, NULL_RTX, mode, 1,
                             label);

    /* Set jump probability based on loop count.  */
    rtx jump = get_last_insn ();
    prob = REG_BR_PROB_BASE - (REG_BR_PROB_BASE / INTVAL (operands[2]));
    REG_NOTES (jump) = gen_rtx_EXPR_LIST (REG_BR_PROB, GEN_INT (prob),
                                          REG_NOTES (jump));
    DONE;
  }})

This time attempting to handle variable counts:

;; move string (like memcpy)
;; implement as RTL loop

(define_expand "movmemhi"
  [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
                   (match_operand:BLK 1 "memory_operand" ""))
              (use (match_operand:HI 2 "const_int_operand" ""))
              (use (match_operand:HI 3 "const_int_operand" ""))])]
  ""
  {
  enum machine_mode mode = HImode;
  int prob = (REG_BR_PROB_BASE * 95) / 100;
  rtx test_label = 0; /* Initial no-test value.  */
  
  /* Specify default variable loop count initial value.  */
  rtx loop_cnt

[Bug target/18251] unable to find a register to spill in class `POINTER_REGS'

2005-03-12 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-03-13 
04:05 ---
Subject: Re:  unable to find a register to spill in class
 `POINTER_REGS'

You answered your own question. GCC handles variable moves just like 
anything else, dealing with the range of possible values, size etc. and 
constructing appropriate RTL.

GCC does not need this backend define or expand. It is quite happy 
working out moves by itself.

The pattern is *only* defined when the target can do a better job - 
i.e. when we have a constant byte count - but not otherwise.  I have 
found it's a really bad idea to second guess compiler optimisations.
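
To make the decision concrete, here is a small C model of the choice being
argued over in this thread. It is only a sketch: the enum and the function
choose_copy_strategy are hypothetical names, not GCC APIs, and the 255
boundary assumes the 8 bit test accepts 0..255.

```c
/* Hypothetical model of the choice discussed in this thread; the names are
   illustrative and are not GCC APIs. */
enum copy_strategy {
    FALL_BACK_TO_GCC,    /* expander gives up; middle end builds the loop */
    LOOP_8BIT_COUNTER,   /* constant count fits in a QImode counter */
    LOOP_16BIT_COUNTER   /* constant count needs an HImode counter */
};

enum copy_strategy choose_copy_strategy(int count_is_constant, long count)
{
    /* Variable or non-positive counts are left to the compiler. */
    if (!count_is_constant || count <= 0)
        return FALL_BACK_TO_GCC;
    /* Assumed boundary: counts up to 255 fit an 8 bit counter. */
    if (count <= 255)
        return LOOP_8BIT_COUNTER;
    return LOOP_16BIT_COUNTER;
}
```

The point of keeping the fall-back branch is that the backend only steps in
where it knows it can beat the generic expansion.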

> - how about in the case of a variable count = 0 ?
>   (or since only constants are handled, it falls back to letting gcc figure
>   it out?)


  





-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18251


[Bug target/19684] avr-gcc 4.0 (and 3.3.4): wrong size in asm comment

2005-03-02 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-03-03 
01:57 ---
This is almost certainly caused by code peepholes doing last minute optimisation
of the code just before the assembler is generated. 

Prior to that, all RTL instructions have a length (in 16 bit words) that is
*solely* used to select the appropriate jump and branch instructions. The size
comments are generated from those lengths. If a peephole does indeed change some
instructions, the lengths will quite likely overestimate the final size (as is
the case presented here).

The length (and so size) is fine if it overestimates the actual size
(jumps/branches can always work over shorter distances). The actual jump
displacements are based on labels so they are unaffected.

There may be some other areas of the backend that apply worst-case estimates of
asm instruction size to avoid the complexity of calculating every situation.
However, peepholes definitely do this!

I would suggest this is a non-bug as the size is an internal compiler debug
comment and there is no regression, misoptimisation or similar downside.

If the size ever underestimates the true size THAT IS A BUG!
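
The asymmetry can be shown with a small C model. This is illustrative only and
is not GCC code: pick_jump, max_reach and jump_is_safe are made-up names, the
model looks at forward distances only, and the limits assume that AVR
conditional branches reach 63 words forward and RJMP reaches 2047 words
forward.

```c
/* Illustrative model, not GCC code.  Distances are forward offsets in
   16 bit words. */
enum jump_kind { SHORT_BRANCH, RELATIVE_JUMP, ABSOLUTE_JUMP };

/* Choose an encoding from the estimated distance. */
enum jump_kind pick_jump(int estimated_words)
{
    if (estimated_words <= 63)
        return SHORT_BRANCH;
    if (estimated_words <= 2047)
        return RELATIVE_JUMP;
    return ABSOLUTE_JUMP;
}

/* Largest forward distance each encoding can actually reach. */
int max_reach(enum jump_kind kind)
{
    switch (kind) {
    case SHORT_BRANCH:  return 63;
    case RELATIVE_JUMP: return 2047;
    default:            return 0x7fffffff; /* no limit in this model */
    }
}

/* A choice made from the estimate is safe when the real distance fits. */
int jump_is_safe(int estimated_words, int actual_words)
{
    return actual_words <= max_reach(pick_jump(estimated_words));
}
```

With an estimate of 70 words and a real distance of 60, the longer encoding
still reaches the target. Swap the two numbers and the short branch chosen
from the estimate cannot reach, which is the case that would be a real bug.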




-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19684


[Bug c/20222] New: [AVR] Double load of volatile operand

2005-02-26 Thread andrewhutchinson at cox dot net
 007CCBD0 i1)
[2 i1+0 S2 A8])) -1 (nil)
(nil))

(insn 14 13 15 (set (reg:HI 43)
(mem/v/f:HI (symbol_ref:HI (i1) [flags 0x40] var_decl 007CCBD0 i1)
[2 i1+0 S2 A8])) -1 (nil)
(nil))

(insn 15 14 16 (set (cc0)
(reg:HI 43)) -1 (nil)
(nil))

(jump_insn 16 15 17 (set (pc)
(if_then_else (ge (cc0)
(const_int 0 [0x0]))
(label_ref 18)
(pc))) -1 (nil)
(nil))

(insn 17 16 18 (set (reg:HI 43)
(neg:HI (reg:HI 43))) -1 (nil)
(nil))

(code_label 18 17 19 3  [0 uses])

(insn 19 18 20 (parallel [
(set (cc0)
(compare (reg:HI 43)
(const_int 1 [0x1])))
(clobber (scratch:QI))
]) -1 (nil)
(nil))

(jump_insn 20 19 21 (set (pc)
(if_then_else (eq (cc0)
(const_int 0 [0x0]))
(label_ref 27)
(pc))) -1 (nil)
(nil))

(note 21 20 22 00729208 NOTE_INSN_BLOCK_BEG)

(note 22 21 23 NOTE_INSN_DELETED)

(note 23 22 24 (testabs-2.c) 12)

(call_insn 24 23 25 (call (mem:HI (symbol_ref:HI (abort) [flags 0x41]
function_decl 00779690 abort) [0 S2 A8])
(const_int 0 [0x0])) -1 (nil)
(expr_list:REG_NORETURN (const_int 0 [0x0])
(expr_list:REG_EH_REGION (const_int 0 [0x0])
(nil)))
(nil))

(barrier 25 24 26)

(note 26 25 27 00729208 NOTE_INSN_BLOCK_END)

(code_label 27 26 28 2  [0 uses])

(note 28 27 29 00729230 NOTE_INSN_BLOCK_END)

(note 29 28 30 00729280 NOTE_INSN_BLOCK_BEG)

(note 30 29 31 NOTE_INSN_DELETED)

(note 31 30 32 (testabs-2.c) 13)

(insn 32 31 33 (set (reg:HI 44)
(mem/f:HI (symbol_ref:HI (xi1) [flags 0x40] var_decl 007CCAF0 xi1)
[2 xi1+0 S2 A8])) -1 (nil)
(nil))

(insn 33 32 34 (set (reg:HI 45)
(mem/f:HI (symbol_ref:HI (xi1) [flags 0x40] var_decl 007CCAF0 xi1)
[2 xi1+0 S2 A8])) -1 (nil)
(nil))

(insn 34 33 35 (set (cc0)
(reg:HI 45)) -1 (nil)
(nil))

(jump_insn 35 34 36 (set (pc)
(if_then_else (ge (cc0)
(const_int 0 [0x0]))
(label_ref 37)
(pc))) -1 (nil)
(nil))

(insn 36 35 37 (set (reg:HI 45)
(neg:HI (reg:HI 45))) -1 (nil)
(nil))

(code_label 37 36 38 5  [0 uses])

(insn 38 37 39 (parallel [
(set (cc0)
(compare (reg:HI 45)
(const_int 1 [0x1])))
(clobber (scratch:QI))
]) -1 (nil)
(nil))

(jump_insn 39 38 40 (set (pc)
(if_then_else (eq (cc0)
(const_int 0 [0x0]))
(label_ref 46)
(pc))) -1 (nil)
(nil))

(note 40 39 41 00729258 NOTE_INSN_BLOCK_BEG)

(note 41 40 42 NOTE_INSN_DELETED)

(note 42 41 43 (testabs-2.c) 14)

(call_insn 43 42 44 (call (mem:HI (symbol_ref:HI (abort) [flags 0x41]
function_decl 00779690 abort) [0 S2 A8])
(const_int 0 [0x0])) -1 (nil)
(expr_list:REG_NORETURN (const_int 0 [0x0])
(expr_list:REG_EH_REGION (const_int 0 [0x0])
(nil)))
(nil))

(barrier 44 43 45)

(note 45 44 46 00729258 NOTE_INSN_BLOCK_END)

(code_label 46 45 47 4  [0 uses])

(note 47 46 48 00729280 NOTE_INSN_BLOCK_END)

(note 48 47 49 007292A8 NOTE_INSN_BLOCK_END)

(note 49 48 50 NOTE_INSN_FUNCTION_END)

(note 50 49 51 (testabs-2.c) 16)

(code_label 51 50 0 1  [0 uses])
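
The start of this report, including the test source, is missing from the
archived message. Judging only from the dump above (two loads of i1 marked
mem/v, then a sign test, a negate and a compare with 1 before the call to
abort), the source was probably of roughly the following shape. This is a
guess, not the original testcase, and abs_of_volatile_once and check_i1 are
made-up names. The helper shows the behaviour one would expect: a single read
of the volatile object.

```c
#include <stdlib.h>

volatile int i1 = -1;   /* assumed declaration; only the name is in the dump */

/* Hypothetical helper: read the volatile object exactly once, then take
   the absolute value of the local copy. */
int abs_of_volatile_once(volatile int *p)
{
    int v = *p;          /* the single volatile access */
    return v < 0 ? -v : v;
}

/* Shape of the check suggested by the dump: abort unless |i1| == 1. */
void check_i1(void)
{
    if (abs_of_volatile_once(&i1) != 1)
        abort();
}
```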

-- 
   Summary: [AVR] Double load of volatile operand
   Product: gcc
   Version: 3.4.3
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: andrewhutchinson at cox dot net
CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: avr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=20222


[Bug target/18251] unable to find a register to spill in class `POINTER_REGS'

2005-02-22 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-02-22 
12:31 ---
Subject: Re:  unable to find a register to spill in class
 `POINTER_REGS'

If you can wait 12 hrs I'll create a 3.4 version.

Alternatively, cut and paste from a 4.0 avr.md;
the change is local to one area.




dieterbmeier at yahoo dot com wrote:

--- Additional Comments From dieterbmeier at yahoo dot com  2005-02-22 
10:32 ---
Andy's patch works great for HEAD, but I get

patching file avr.md
Hunk #1 FAILED at 344.
1 out of 1 hunk FAILED -- saving rejects to file avr.md.rej

when patching 3_4 branch.

  






-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18251


[Bug target/18251] unable to find a register to spill in class `POINTER_REGS'

2005-02-12 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-02-12 
13:50 ---
A sub-optimal fix is to disable movmemhi expansion. Either delete it or use the
less draconian:
(define_expand "movmemhi"
  [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
                   (match_operand:BLK 1 "memory_operand" ""))
              (use (match_operand:HI 2 "const_int_operand" ""))
              (use (match_operand:HI 3 "const_int_operand" ""))
              (clobber (match_scratch:HI 4 ""))
              (clobber (match_scratch:HI 5 ""))
              (clobber (match_dup 6))])]
  "(0)"
etc

A better solution is currently being tested.



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18251


[Bug c/19924] New: [AVR] MODES_TIEABLE incorrect

2005-02-12 Thread andrewhutchinson at cox dot net
MODES_TIEABLE in avr is set such that access to subreg is prevented.

This results in significantly sub-optimal code.

Attached patch changes this. No regressions have been found with the test suite -
indeed things got better!

Note this is related to PR/19815 documentation error.

-- 
   Summary: [AVR] MODES_TIEABLE incorrect
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: andrewhutchinson at cox dot net
CC: gcc-bugs at gcc dot gnu dot org
GCC target triplet: avr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19924


[Bug c/19924] [AVR] MODES_TIEABLE incorrect

2005-02-12 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-02-12 
15:35 ---
Created an attachment (id=8186)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8186&action=view)
Patch to change MODES_TIEABLE


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19924


[Bug target/19636] Can't compile ethernut OS (avr-gcc)

2005-02-12 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-02-13 
02:07 ---
Try the patch attached to PR 18251. Good chance it will fix this.
If not, pass me the source for a look.




-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19636


[Bug target/19636] Can't compile ethernut OS (avr-gcc)

2005-02-12 Thread andrewhutchinson at cox dot net


-- 
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andrewhutchinson at cox dot
                   |                            |net


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19636


[Bug c/19835] New: [AVR] Loop variable gets widened to LONG instead of int

2005-02-08 Thread andrewhutchinson at cox dot net
GNU C version 4.0.0 20041205 (experimental) (avr)

Loop variable gets widened to LONG instead of unsigned int (or perhaps even
int). Seems we forgot how big the target is?

Testcase:

struct S19 { unsigned char i[19]; };
void
init (struct S19 *p, int i) 
{   
  int j;
  for (j = 0; j < 19; j++)  
    p->i[j] = i + j;
}   


tree dump:


;; Function init (init)

init (p, i)
{
  long unsigned int ivtmp.3;

bb 0:
  ivtmp.3 = 0;

L0:;
  ((unsigned char *) ivtmp.3 + p-i[0])-i[0] = (unsigned char) ivtmp.3 +
(unsigned char) (signed char) i;
  ivtmp.3 = ivtmp.3 + 1;
  if (ivtmp.3 != 19) goto L0; else goto L2;

L2:;
  return;

}

Not surprisingly, backend code reflects long (SImode) decrement and compare :

/* prologue: frame size=0 */
/* prologue end (size=0) */
movw r26,r24
ldi r18,lo8(0)
ldi r19,hi8(0)
ldi r20,hlo8(0)
ldi r21,hhi8(0)
.L2:
movw r30,r26
add r30,r18
adc r31,r19
mov r24,r18
add r24,r22
st Z,r24
subi r18,lo8(-(1))
sbci r19,hi8(-(1))
sbci r20,hlo8(-(1))
sbci r21,hhi8(-(1))
cpi r18,lo8(19)
cpc r19,__zero_reg__
cpc r20,__zero_reg__
cpc r21,__zero_reg__
brne .L2
/* epilogue: frame size=0 */
ret
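
For comparison, here is a sketch showing that a narrow counter is enough for
this loop. init_int mirrors the testcase with the truncation written out;
init_narrow is a hypothetical hand-narrowed version, not something the
compiler produced. Both fill the array identically, so nothing in the source
calls for a 32 bit induction variable.

```c
#include <stdint.h>

struct S19 { unsigned char i[19]; };

/* The loop as written in the testcase, with an int index. */
void init_int(struct S19 *p, int i)
{
    int j;
    for (j = 0; j < 19; j++)
        p->i[j] = (unsigned char)(i + j);
}

/* Hypothetical hand-narrowed form: 19 iterations fit in 8 bits, so a
   16 bit (or even 8 bit) counter is sufficient. */
void init_narrow(struct S19 *p, int i)
{
    uint8_t j;
    for (j = 0; j < 19; j++)
        p->i[j] = (unsigned char)(i + j);
}
```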

-- 
   Summary: [AVR] Loop variable gets widened to LONG instead of int
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: andrewhutchinson at cox dot net
CC: gcc-bugs at gcc dot gnu dot org
  GCC host triplet: cygwin
GCC target triplet: avr


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19835


[Bug tree-optimization/19686] [4.0 Regression] loop performance decrease, not comparing against 0

2005-02-07 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-02-08 
01:48 ---
I ran testcase with proposed avr_costs patch applied. The result is unchanged. 

The initially generated RTL is unfortunately beyond what can be fixed by the
backend. I don't think this problem is avr specific; it should appear on other
targets.

I have attached initially generated RTL. It is alarmingly complex given starting
point. (This is not so apparent in the assembler as the backend has done a
rather good job of tidying up what it can.)

Perhaps somebody could glance at this to see what exactly went off the rails. It
might just be a manifestation of a known problem.

Wish I could help more - but trees are beyond me at the moment.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19686


[Bug tree-optimization/18219] [4.0 Regression] gcc-4.0.0 bloats code by 31%

2005-02-07 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-02-08 
02:12 ---
Bad post ignore


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18219


[Bug other/19815] New: Documentation change - GCC Internals MODES_TIEABLE_P

2005-02-07 Thread andrewhutchinson at cox dot net
Documentation change - GCC Internals

The definition of MODES_TIEABLE_P is incorrect and has resulted in reduced
optimisation for the avr target (and perhaps others)

The definition is currently:

A C expression that is nonzero if a value of mode mode1 is accessible in mode
mode2 without copying.

This part would be ok but is then detailed as :

If HARD_REGNO_MODE_OK (r, mode1) and HARD_REGNO_MODE_OK (r, mode2) are always
the same for any r, then MODES_TIEABLE_P (mode1, mode2) should be nonzero. If
they differ for any r, you should define this macro to return zero unless some
other mechanism ensures the accessibility of the value in a narrower mode.

This second paragraph is too restrictive. 

MODES_TIEABLE_P may also be nonzero if r is accessible in any SMALLER mode.

In the particular example of the avr target, word or larger registers are
assigned even numbered registers ONLY. Byte registers have no such restriction.
Because this does indeed fail the second paragraph criteria, MODES_TIEABLE_P has
been set 0=FALSE, preventing byte operations on the word register and causing
unneeded register moves. It should have been set TRUE.
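
The difference between the two readings can be sketched in C. This is a model
only: the functions below are made-up stand-ins for HARD_REGNO_MODE_OK and
MODES_TIEABLE_P, modes are represented by their size in bytes, and the 32
register file with an even-register rule for wide modes is an assumed
simplification of the avr target.

```c
/* Illustrative model only; these are not the GCC macros themselves.
   A mode is represented by its size in bytes. */

/* Assumed AVR-like rule: bytes fit in any register, wider values must
   start in an even-numbered register. */
int model_hard_regno_mode_ok(int regno, int mode_bytes)
{
    if (mode_bytes == 1)
        return 1;
    return (regno % 2) == 0;
}

/* Documented rule: tieable only if both modes are OK in exactly the same
   registers.  Checked here over registers 0..31. */
int model_tieable_strict(int mode1_bytes, int mode2_bytes)
{
    int r;
    for (r = 0; r < 32; r++)
        if (model_hard_regno_mode_ok(r, mode1_bytes)
            != model_hard_regno_mode_ok(r, mode2_bytes))
            return 0;
    return 1;
}

/* Relaxed rule proposed in this report: every register that can hold the
   wider mode can also be accessed in the narrower one. */
int model_tieable_relaxed(int mode1_bytes, int mode2_bytes)
{
    int wide = mode1_bytes > mode2_bytes ? mode1_bytes : mode2_bytes;
    int narrow = mode1_bytes > mode2_bytes ? mode2_bytes : mode1_bytes;
    int r;
    for (r = 0; r < 32; r++)
        if (model_hard_regno_mode_ok(r, wide)
            && !model_hard_regno_mode_ok(r, narrow))
            return 0;
    return 1;
}
```

Under the strict rule a byte mode and a word mode are never tieable in this
model, because odd registers accept one and not the other. Under the relaxed
rule they are, since any register holding a word can also be read as a byte.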

I was tempted to report this as AVR target bug - but the code is not really the
problem.

Note that the definition is often included in target header files as well as gcc
internal manual.

-- 
   Summary: Documentation change - GCC Internals MODES_TIEABLE_P
   Product: gcc
   Version: 3.4.3
Status: UNCONFIRMED
  Severity: minor
  Priority: P2
 Component: other
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: andrewhutchinson at cox dot net
CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19815


[Bug tree-optimization/19686] [4.0 Regression] loop performance decrease, not comparing against 0

2005-02-06 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-02-06 
23:06 ---
Taking X as the initial value of x on function entry.

The loop is defined as i=X to 0, step -1. Which is a simple do loop.

It gets optimized as i=0 to -X, step -1. (Which is something bizarre!)

The code increase is due to 1) Computation of -X and  2) compare said -X
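
In C terms the two forms look roughly like this. It is a hypothetical
reconstruction, since the PR's testcase is not quoted in this comment, and the
function names are made up. Both count the same number of iterations for
non-negative X; the second needs -X computed up front and compared against on
every pass.

```c
/* Hypothetical reconstruction; both functions return the number of loop
   iterations and assume x >= 0. */

/* Loop as described: i runs from X down to 0 in steps of -1. */
int trips_as_written(int x)
{
    int n = 0;
    int i;
    for (i = x; i != 0; i--)
        n++;
    return n;
}

/* Loop as optimised, per the comment above: i runs from 0 down to -X. */
int trips_as_optimised(int x)
{
    int n = 0;
    int limit = -x;      /* extra computation */
    int i;
    for (i = 0; i != limit; i--)   /* compare against a non-zero value */
        n++;
    return n;
}
```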



-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19686


[Bug c/19703] New: Poor optimisation of loop test

2005-01-29 Thread andrewhutchinson at cox dot net
Missed optimization

gcc version 4.0.0 20041205 (experimental)
4.0.0/cc1.exe -quiet -v looprv.c -quiet -dumpbase
looprv.c -mmcu=atmega169 -auxbase looprv -Os -Wall -version -funsigned-char
-funsigned-bitfields -fpack-struct -fshort-enums -o looprv.s

Down counting loop, uses expensive compare EQ (-n) instead of compare >=0.
Testcase as follows:

volatile char  value6;
extern void foo6(char);
void testloop6(void)
{
    int i;
    for (i=100;i>= 0;i-=10)
    {
        if (!value6)
        {
            foo6(i);
        }
    }
}

The loop test in RTL is compare NE -10. It should be compare GE 0 - which
is (generally) free. The first dump of expanded RTL shows the compare.
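
The two exit tests are equivalent for this loop, which is why the choice
between them is purely a cost question. A minimal sketch, with made-up
function names, that counts iterations under each test:

```c
/* Sketch: count the iterations of the testcase loop under each exit test. */

/* Exit test as written in the source: continue while i >= 0. */
int trips_ge_zero(void)
{
    int n = 0;
    int i;
    for (i = 100; i >= 0; i -= 10)
        n++;
    return n;
}

/* Exit test as reported in the RTL: continue while i != -10. */
int trips_ne_minus_ten(void)
{
    int n = 0;
    int i;
    for (i = 100; i != -10; i -= 10)
        n++;
    return n;
}
```

Both run 11 times (i = 100, 90, ..., 0), so nothing is gained by comparing
against -10 when the sign of the subtraction result is already available.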

-- 
   Summary: Poor optimisation of loop test
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: minor
  Priority: P2
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: andrewhutchinson at cox dot net
CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19703


[Bug tree-optimization/19703] Poor optimisation of loop test

2005-01-29 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-01-30 
04:58 ---
Subject: Re:  Poor optimisation of loop test

I am not sure what makes you think that. Compare with ZERO is 
invariably cheaper than compare with n.
The former is the free sign status following any condition-setting 
instruction - like subtract!
It's even the sign bit of the result!

subi r28,10
cpi   r28, -10
brpl  looptop

subi r28,10
brpl looptop

or did I miss something?

pinskia at gcc dot gnu dot org wrote:

--- Additional Comments From pinskia at gcc dot gnu dot org  2005-01-30 
03:17 ---
Hmm, on most targets it is true that != is the same case as >=.

  






-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19703


[Bug c/19676] New: Loop optimizer fails to reverse simple loop

2005-01-28 Thread andrewhutchinson at cox dot net
AVR Target 20041205 snapshot

gcc version 4.0.0 20041205 (experimental)
 /avrdev/libexec/gcc/avr/4.0.0/cc1.exe -quiet -v looprv.c -quiet -dumpbase
looprv.c -mmcu=atmega169 -auxbase looprv -Os -Wall -version -funsigned-char
-funsigned-bitfields -fpack-struct -fshort-enums -o looprv.s


Loop optimiser fails to reverse simple loop. Example

void testloop5(void)
{
    int i;
    for (i=0;i<100;i++)
    {
        if (!value)
        {
            foo();
        }
    }
}

generates RTL setting index to 100 then using decrement/branch at end of loop as
expected. However, adding any kind of while/for loop inside outer loop leaves
index unoptimised. For example

void testloop3(void)
{
    int i;
    for (i=0;i<100;i++)
    {
        while (!value)
        {
            foo();
        }
    }
}

Here index starts at 0 and increments to 99.

Problem seems to be related to maybe_multiple being set in loop scan. However,
since 'i' is never used inside loop there would seem to be no need to check for
multiple setting.
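
For reference, this is the transformation I mean, written out by hand in C. It
is a sketch only: body_fn, count_body and the two run_ functions are made-up
names, with the callback standing in for whatever the loop body does.

```c
/* Sketch of loop reversal; 'body' stands in for the work done per
   iteration and is a hypothetical callback. */
typedef void (*body_fn)(void);

int body_calls = 0;

/* Example body that just counts how often it is called. */
void count_body(void)
{
    body_calls++;
}

/* Count-up form, as written in the source. */
void run_counting_up(body_fn body)
{
    int i;
    for (i = 0; i < 100; i++)
        body();
}

/* Reversed form: legal because the body never reads i.  The exit test
   becomes a decrement followed by a compare with zero. */
void run_counting_down(body_fn body)
{
    int i = 100;
    do {
        body();
    } while (--i != 0);
}
```

Since the body cannot observe i, both forms call it 100 times, and an inner
loop that also ignores i should not change that.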

This was tested with AVR target but looks like it will affect any target - I can
provide RTL etc on demand.

-- 
   Summary: Loop optimizer fails to reverse simple loop
   Product: gcc
   Version: 4.0.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P2
 Component: c
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: andrewhutchinson at cox dot net
CC: gcc-bugs at gcc dot gnu dot org


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19676


[Bug c/19676] Loop optimizer fails to reverse simple loop

2005-01-28 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-01-28 
19:12 ---
Created an attachment (id=8092)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8092&action=view)
Testcase c source 

Testloop3() is NOT reversed. Others for reference are.


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19676


[Bug c/19676] Loop optimizer fails to reverse simple loop

2005-01-28 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-01-28 
19:13 ---
Created an attachment (id=8093)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8093&action=view)
Expanded RTL

Expanded RTL from looprv testcase source

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19676


[Bug target/19676] Loop optimizer fails to reverse simple loop

2005-01-28 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-01-28 
19:14 ---
Created an attachment (id=8094)
 --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=8094&action=view)
Optimised RTL

Final Optimised RTL before asm code generation.

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19676


[Bug tree-optimization/19676] Loop optimizer fails to reverse simple loop

2005-01-28 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2005-01-28 
20:15 ---
Subject: Re:  Loop optimizer fails to reverse
 simple loop

GCC 3.3.1 did reverse testloop3 but not testloop2() or testloop4().  So 
4.0 gets 4/5 right and 3.3.1 3/5 right.

It's complicated by other optimisations on my inner loop code though, so I 
could not say if testloop3 is a regression. I'm trying to get some 
results from 3.4.x.

The only issue with inconsistent patterns is that it makes matching 
backend patterns more likely to fail, as we need to catch GE 0 or EQ -1 
after decrement for almost identical code structure. I am not sure if 
this is a real problem or one that gcc will take care of by alternate 
pattern substitutions.


pinskia at gcc dot gnu dot org wrote:

--- Additional Comments From pinskia at gcc dot gnu dot org  2005-01-28 
19:54 ---
Confirmed, we should be able to do this on the tree level but don't for 
testloop2, testloop3, testloop4.

To answer this question:
*  - why is gcc inconsistent in loop reversal bounds
Because sometimes we do loop reversal on the tree level or the rtl level.  See 
above about where we 
don't do it on the tree level.

Do you know if all of these loops were loop reversal for say 3.4.0?

  






-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=19676


[Bug rtl-optimization/14151] [new-ra] new-ra get frame size incorrect

2004-12-24 Thread andrewhutchinson at cox dot net

--- Additional Comments From andrewhutchinson at cox dot net  2004-12-25 
02:33 ---
Problem still present on gcc (GCC) 4.0.0 20041205 (experimental) SNAPSHOT *sigh*

-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=14151