from:"Ben Jackson"

[avr-gcc-list] Re: loop deleted using optimization

2007-03-06 Thread Ben Jackson

 Why does avr-gcc delete my empty for loops if I compile with 
 optimization on?
 I am able to preserve the loop if I add a NOP in the loop but that will 
 eat up one clock cycle.  Is there a way to preserve the empty loops 
 without adding any NOP clock cycles?

It would probably work to add an empty asm, as long as you used the
__volatile keyword to scare gcc away from deleting or reordering code in
the loop.  So, for (...) { __asm __volatile (""); }

(I note I wasn't able to get gcc 3.3 to optimize away the loop on i386)
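
A minimal sketch of the empty-asm idea, assuming GCC-style inline asm.  The
helper name spin_delay and its return value are made up for illustration.
The asm body is an empty string, so it adds no instruction of its own, but
because it is volatile gcc should keep it, and therefore the loop around it:

```c
#include <stdint.h>

/* Hypothetical helper: spin for `count` iterations and return how many
   iterations actually ran. */
static uint16_t spin_delay(uint16_t count)
{
    uint16_t i;

    for (i = 0; i < count; i++) {
        /* Empty volatile asm: emits nothing itself, but gcc is not
           supposed to delete it, so the loop has to execute. */
        __asm__ __volatile__ ("");
    }
    return i;
}
```

The time per iteration still depends on the loop overhead the compiler
generates, so it is worth checking the disassembly before relying on it.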

-- 
Ben Jackson AD7GD
[EMAIL PROTECTED]
http://www.ben.com/


___
AVR-GCC-list mailing list
AVR-GCC-list@nongnu.org
http://lists.nongnu.org/mailman/listinfo/avr-gcc-list

Re: [avr-gcc-list] Using PORTB with variables, PORTB = var

2006-04-22 Thread Ben Jackson

 void change() {
   var <<= 1;
 }

That's a shift, not a rotation, so var is going to become 0.

   while (1) {
     int var = 1;
     while (var != 0) {
       PORTB = var;
       var <<= 1;
     }
   }

Also, 'int var' is going to be 16 bits wide, so in this version which
rotates the bit it's going to spend half its time in high bits and
PORTB is going to be 0.
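
To make the difference concrete, here is a small host-side sketch.  There is
no real PORTB in it, uint16_t stands in for AVR's 16-bit int, and the names
rotl8 and visible_steps_16bit are hypothetical:

```c
#include <stdint.h>

/* Rotate an 8-bit value left by one: the bit shifted out of bit 7
   re-enters at bit 0, so a single set bit never disappears. */
static uint8_t rotl8(uint8_t v)
{
    return (uint8_t)((v << 1) | (v >> 7));
}

/* Walk one bit through a 16-bit variable and count how many of the
   16 steps would put a nonzero value on an 8-bit port. */
static unsigned visible_steps_16bit(void)
{
    unsigned visible = 0;
    uint16_t var;

    for (var = 1; var != 0; var <<= 1) {
        uint8_t port = (uint8_t)var;  /* only the low byte reaches the port */
        if (port != 0)
            visible++;
    }
    return visible;
}
```

Declaring the variable as an 8-bit type and using a rotate like rotl8 keeps
exactly one bit lit at every step.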




Re: [avr-gcc-list] multiply by constant always expands to int

2005-12-06 Thread Ben Jackson

On Tue, Dec 06, 2005 at 07:33:51PM +, Paulo Marques wrote:
 
 I thought this would just happen with any multiply operation, and tried 
 to build a simple test case, and to my surprise everything was just fine.

I just did the same thing.  Turns out there are at least two things
required for my version:  one of the operands has to be `const', and
you must use `-Os':

void
g(void)
{
extern char f(char);
const char y = 20;
char x, z;

z = f(x * y);
}

Turn off `const' or `-Os' and you get __mulqi3, else __mulhi3

You can probably also reproduce it by just substituting the literal '20' for y.

-- 
Ben Jackson
[EMAIL PROTECTED]
http://www.ben.com/



[avr-gcc-list] multiply by constant always expands to int

2005-12-03 Thread Ben Jackson

Using 3.4.4 (I'm off 4.0.1 for now because I haven't had time to figure
out why all loop variables are promoted to ints) I had a strange experience
where code like:

char x = 10, y;

loop {
f(x * y);
}

got *longer* when I made it 'const char x = 10' or just hardcoded 10.
Turns out to be because mulqi was replaced with mulhi which is longer.
Now I'm not 100% sure what the integral type promotion rules say for
multiply, but I'd be surprised if they were different for const char
vs char, especially if the *const* version widens *more*.
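
For what it's worth, a quick check of what the language itself says, as a
sketch (the function names are made up).  In standard C both operands of `*`
undergo the integer promotions, so the product has type int whether or not
one operand is const.  Which library routine gets called is then up to the
optimizer, which may narrow the multiply when it can prove only the low byte
is used:

```c
#include <stddef.h>

/* Size of the type of a product of two plain char operands. */
static size_t product_size_plain(void)
{
    char a = 10, b = 3;

    return sizeof(a * b);   /* both operands are promoted to int first */
}

/* Same thing with one operand declared const. */
static size_t product_size_const(void)
{
    const char a = 10;
    char b = 3;

    return sizeof(a * b);   /* const does not change the promotion */
}
```

Both functions return sizeof(int), so any difference in the generated code
comes from optimization, not from the promotion rules.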

Where's most of the development going on now, 3.x or 4.x?  I guess I
need to buckle down and read the gcc internals docs and start
submitting patches.

-- 
Ben Jackson
[EMAIL PROTECTED]
http://www.ben.com/



[avr-gcc-list] Saving space in interrupt handler prologues

2005-08-19 Thread Ben Jackson

I don't know if this has been covered already, but one way to save space
in interrupt handlers is to inline all the functions they call.  Obviously
it is not always practical, but if the functions are split out mainly for
clarity this can be a big win.  The main reason is that if the interrupt
handler doesn't call *any* other functions it isn't obligated to save the
caller-saved registers if it doesn't use them.

Also, I think someone else wanted to know why things sometimes did/did not
get inlined.  gcc tries to be clever about this, so if you want the final
say, try this:

#define NOINLINE __attribute__ ((__noinline__))
#define YESINLINE inline __attribute__ ((__always_inline__))

Make sure to declare the YESINLINE functions static if you don't want a
callable copy for external references.
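
A short usage sketch of the two macros.  The function names are invented,
and handler_body is only a stand-in for whatever the interrupt handler does:

```c
#define NOINLINE __attribute__ ((__noinline__))
#define YESINLINE inline __attribute__ ((__always_inline__))

/* Small helper that should always be folded into its caller.  `static`
   keeps gcc from also emitting a separate callable copy. */
static YESINLINE unsigned char low_nibble(unsigned char v)
{
    return v & 0x0F;
}

/* Larger routine that should stay as one shared out-of-line copy. */
static NOINLINE unsigned char sum_bytes(const unsigned char *p, unsigned char n)
{
    unsigned char sum = 0;

    while (n--)
        sum += *p++;
    return sum;
}

/* Stand-in for an interrupt handler body: after inlining it makes no
   calls at all. */
static unsigned char handler_body(unsigned char status)
{
    return low_nibble(status);
}
```

Since handler_body ends up with no real calls in it, the compiler only has
to save the registers the handler itself touches.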

Also, as far as general prologue bloat goes:  I may have mentioned this
before, but I think a lot of it comes from the fact that AVR GCC generates
16 and 32 bit math in parallel instead of in series.  For example, something
like:

int32 a, b, c;

a = b | c;

becomes:
lds r24,b
lds r25,(b)+1
lds r26,(b)+2
lds r27,(b)+3
lds r18,c
lds r19,(c)+1
lds r20,(c)+2
lds r21,(c)+3
or r24,r18
or r25,r19
or r26,r20
or r27,r21
sts a,r24
sts (a)+1,r25
sts (a)+2,r26
sts (a)+3,r27

when it could be:

lds r24,b
lds r25,c
or r24,r25
sts a,r24
...

using only 2 registers instead of 8.  None of the load/store instructions
set flag bits, so even math with carry can be done this way.  I don't
think it's a legitimate peephole optimization (though I haven't pondered
it deeply) since qualifiers like 'volatile' would prevent the re-ordering
of the memory accesses.
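
The serial ordering can be modelled in plain C, as a sketch only.  The 32-bit
values are held as four little-endian bytes and the function names are
hypothetical.  Each step needs just two bytes in flight, and for addition the
carry is the only state passed from one byte to the next:

```c
#include <stdint.h>

/* a = b | c, one byte at a time (load, load, or, store). */
static void or32_bytewise(uint8_t a[4], const uint8_t b[4], const uint8_t c[4])
{
    int i;

    for (i = 0; i < 4; i++)
        a[i] = b[i] | c[i];
}

/* a = b + c, one byte at a time, least significant byte first.  The
   carry survives between steps, as the carry flag would on the AVR
   since loads and stores leave the flags alone. */
static void add32_bytewise(uint8_t a[4], const uint8_t b[4], const uint8_t c[4])
{
    unsigned carry = 0;
    int i;

    for (i = 0; i < 4; i++) {
        unsigned sum = (unsigned)b[i] + c[i] + carry;

        a[i] = (uint8_t)sum;
        carry = sum >> 8;
    }
}
```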

-- 
Ben Jackson
[EMAIL PROTECTED]
http://www.ben.com/



[avr-gcc-list] Zeroing longs: use memset for -Os?

2005-07-29 Thread Ben Jackson

Due to the size of sts instructions, it's actually fewer bytes to
memset(&somelong, 0, sizeof(long)).  The memset is 12 bytes (count,
hi, lo, sts, dec, loop, all 2 bytes) vs 16 for 4 straight sts's.
If you're doing more than one variable of course the savings are
bigger.
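
In source form the idea looks like the sketch below (the variable and
function names are made up).  Whether it really comes out smaller depends on
the compiler version and flags, since gcc may expand a small fixed-size
memset inline, so it is worth comparing the object size both ways:

```c
#include <string.h>

static long total;   /* hypothetical variable to be cleared */

/* Clear the variable through memset instead of a plain `total = 0`. */
static void clear_total(void)
{
    memset(&total, 0, sizeof total);
}
```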

-- 
Ben Jackson
[EMAIL PROTECTED]
http://www.ben.com/



[avr-gcc-list] Shift by multiple of 8 is really shifting...

2005-07-29 Thread Ben Jackson

Here I'm doing something like

l = ((ulong)i << 16) + ...;

It's actually loading the i, expanding it, then shifting it.  Once
again I'm sure I've seen gcc do better (on i386 or PPC, not sure).
If it's not useful for me to point out these problems as I run
across them, feel free to tell me to shut up.  :-)  Better yet,
give me an idea of where in gcc I should fix this (clearly it could
be fixed with a peephole rule of some kind, but I bet it's also
possible to just generate better code).

BTW, this is not purely academic for me, I'm working on a LC meter
that greatly benefits from the float math etc and I've managed to
get the float-using version down from 2046 to 1662, which is going
to leave space to finish the code.  This asm got generated when I
started tracking timer0/1 overflows and adding them in, scaled, at
the end.  It's still a win, but it'd be even better if both copies
of this saved 4 more instructions!

b2:   20 91 6a 00 lds r18, 0x006A
b6:   30 91 6b 00 lds r19, 0x006B
ba:   44 27   eor r20, r20
bc:   55 27   eor r21, r21
be:   53 2f   mov r21, r19
c0:   42 2f   mov r20, r18
c2:   33 27   eor r19, r19
c4:   22 27   eor r18, r18
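
For reference, the kind of source that leads to this, as a self-contained
sketch.  The function and parameter names are hypothetical and the real code
has more terms in the sum.  Ideally the shift by 16 costs nothing beyond
loading the two bytes straight into the upper half of the result and
clearing the lower half:

```c
#include <stdint.h>

/* Combine a 16-bit overflow count with a 16-bit timer reading into one
   32-bit tick count. */
static uint32_t combine_ticks(uint16_t overflows, uint16_t timer)
{
    /* Widen first, then shift, so no bits are lost on a 16-bit int. */
    return ((uint32_t)overflows << 16) + timer;
}
```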

-- 
Ben Jackson
[EMAIL PROTECTED]
http://www.ben.com/



[avr-gcc-list] Don't use gcc 3.4.4, use 4.0.1

2005-07-28 Thread Ben Jackson

When I built gcc for AVR I wasn't sure what the Right version of GCC to
use was.  I've been using a cross-compiling GCC 3.4.4 for another platform
with success so I went with that.  However, even with -morder1 and -fnew-ra,
the code is not nearly as good as 4.0.1 for AVR.  My main test file .o text
(at the moment mostly poking bits and doing integer math) got 12% shorter
with -Os (due in no small part to the fact that it's smarter with registers,
resulting in less spilling, resulting in no need for the prologue/epilogue
callouts).

Of course I've got nearly zero experience with both, so I welcome dissenting
opinions, but I wanted to put the recommendation out there for any other
newbies.

-- 
Ben Jackson
[EMAIL PROTECTED]
http://www.ben.com/



[avr-gcc-list] Should -mtiny-stack be the default for small devices?

2005-07-28 Thread Ben Jackson

The AT90S2313, for example, has only 128 bytes of memory.  Is there any
reason why -mtiny-stack shouldn't be the default for these devices?

Looks like there are half a dozen ATtiny devices with >0 and <=256 bytes
of SRAM and several of the old AT90S* devices.

-- 
Ben Jackson
[EMAIL PROTECTED]
http://www.ben.com/


