Re: [avr-gcc-list] dev and mod may not be optimized
Dmitry K. wrote: On Wednesday 12 December 2007 01:16, Albert Andras wrote: - Original Message - From: Nathan Moore [EMAIL PROTECTED] unsigned char a, b, c; c = some_input_function(); LABEL: a = c/10; b = c%10; ... * LABEL: mov r24,r18 ldi r22,lo8(10) call __udivmodqi4 mov r19,r24 .LM436: mov r24,r18 call __udivmodqi4 If I'm understanding __udivmodqi4 it produces the results of both of these with one call. Acording to: http://www.nongnu.org/avr-libc/user-manual/structdiv__t.html you could do: div_t t; t=div(c/10); a=t.quot; b=t.rem; Alas, the usage of div() function is more space and more time expansive. The reason is that div() uses an another function: __divmodhi4(), i.e. 16-bits. Compare two programs: prog1: 150 bytes, 238 clocks prog2: 212 bytes, 316 clocks Regards, Dmitry. - volatile unsigned char a, b, v = 123; int main () { unsigned char c = v; a = c / 10; b = c % 10; return 0; } - #include stdlib.h volatile unsigned char a, b, v = 123; int main () { div_t d; d = div (v, 10); a = d.quot; b = d.rem; return 0; } -- How about: volatile unsigned char a, b, v = 123; int main () { unsigned char c = v; a = c / 10; b = c - a * 10; return 0; } It's a little inelegant, but multiplying by 10 and then subtracting is much smaller and faster than doing a second division. mvh., David ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] dev and mod may not be optimized
Gre7g Luterman wrote: [...] Well if size speed are getting you down, try this: http://pastie.textmate.org/127218 The calculation takes 22 bytes and executes in 40 clocks. As an assembly-language programmer, having that extra MOV in the beginning annoys me, but hey, I try not to break the asm rules. Regardless, you'd be hard-pressed to get it much smaller or faster. Actually, getting it faster (and possibly smaller) shouldn't be that hard for a _constant_ division like that in a AVR that supports mul instructions. To do: a = c/10; b = c%10; you can do: a = (c * 65534) 16; b = c - (a * 10); leaving you with no divisions to perform. The main problem is that the first multiplication should be done with some hand optimized assembly to perform a 8x16 bit multiplication, leaving just the upper 8 bits of a 24 bit result, but it should be doable with just 2 mul instructions and one 16 bit addition. -- Paulo Marques Software Development Department - Grupo PIE, S.A. Phone: +351 252 290600, Fax: +351 252 290601 Web: www.grupopie.com The impossible often has a kind of integrity to it which the merely improbable lacks. Douglas Adams ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] dev and mod may not be optimized
Thank you for your explination Bjoern. I have looked at GCC internals before, but never worked on them, so I'm likely to say something stupid. I would have thought that an optimization pattern for just this type of thing would have already been in GCC, especially since on the internal helper function is not unlike the divide instruction on x86, returning 2 values rather than one. I would think that peephole optimization would be all over this. Nathan -Original Message- From: Haase Bjoern (PT/EMM1) [mailto:[EMAIL PROTECTED] Sent: Tuesday, December 11, 2007 9:16 AM To: Nathan Moore; avr-gcc-list@nongnu.org Subject: AW: [avr-gcc-list] dev and mod may not be optimized The source of this problem is a deeply buried internal issue of the RTL handling within the compiler. Description of the internal problem: The expand pass generates strange RTL sequences refering to hard registers for the div and mod operations and uses specific insn RTL for calling the respective functions. This way the compiler does not need to save and restore all of the call-clobbered registers but only those that the functions alter. Unfortunately div and mod could not be optimized this way because the combine and CSE passes do not operate on hard registers :-(. Unfortunately there is no easy solution of this issue :-(. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] dev and mod may not be optimized
Paulo Marques wrote: [...] a = (c * 65534) 16; b = c - (a * 10); leaving you with no divisions to perform. The main problem is that the first multiplication should be done with some hand optimized assembly to perform a 8x16 bit multiplication, leaving just the upper 8 bits of a 24 bit result, but it should be doable with just 2 mul instructions and one 16 bit addition. Actually after a few more digging, we can do this all in plain C, by using a smaller reciprocal multiplier: unsigned char a,b,c; unsigned int r; r = c * 205; a = r / 2048; b = c - (a * 10); generates this code with avr-gcc 4.2.1: r = c * 205; a = r / 2048; d8: 2d ec ldi r18, 0xCD ; 205 da: 82 9f mul r24, r18 dc: 90 01 movwr18, r0 de: 11 24 eor r1, r1 e0: 23 2f mov r18, r19 e2: 33 27 eor r19, r19 e4: 26 95 lsr r18 e6: 26 95 lsr r18 e8: 26 95 lsr r18 b = c - (a * 10); ea: 9a e0 ldi r25, 0x0A ; 10 ec: 92 9f mul r25, r18 ee: 90 2d mov r25, r0 f0: 11 24 eor r1, r1 f2: 89 1b sub r24, r25 This is slightly larger than 22 bytes (28 bytes), but it should execute in just 16 cycles. We could certainly improve the size a little with hand optimized assembly, though. -- Paulo Marques Software Development Department - Grupo PIE, S.A. Phone: +351 252 290600, Fax: +351 252 290601 Web: www.grupopie.com God is real, unless declared integer. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
RE: [avr-gcc-list] dev and mod may not be optimized
-Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] org] On Behalf Of Nathan Moore Sent: Tuesday, December 11, 2007 9:26 AM To: avr-gcc-list@nongnu.org Subject: RE: [avr-gcc-list] dev and mod may not be optimized Thank you for your explination Bjoern. I have looked at GCC internals before, but never worked on them, so I'm likely to say something stupid. I would have thought that an optimization pattern for just this type of thing would have already been in GCC, especially since on the internal helper function is not unlike the divide instruction on x86, returning 2 values rather than one. I would think that peephole optimization would be all over this. But it requires someone to write that peephole optimization for the AVR. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] dev and mod may not be optimized
--- Paulo Marques [EMAIL PROTECTED] wrote: Actually after a few more digging, we can do this all in plain C, by using a smaller reciprocal multiplier: snipped Wow, that's freakin' inspired! And you're right it does optimize nicely. Here's a 20 byte version that executes in 12 cycles: http://pastie.textmate.org/127541 Gre7g __ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list
Re: [avr-gcc-list] dev and mod may not be optimized
Gre7g Luterman wrote: --- Paulo Marques [EMAIL PROTECTED] wrote: Actually after a few more digging, we can do this all in plain C, by using a smaller reciprocal multiplier: snipped Wow, that's freakin' inspired! Thanks :) It's not really a new idea, though. Using multiply by the reciprocal to do divisions by constants has been around for a while. For an in-depth analysis of the technique I suggest reading Hacker's Delight. The funny thing is that gcc for x86 does this optimization itself, replacing division by constants with multiplications when optimizing for speed. Unfortunately I don't know enough (or have time to get involved) to port the same optimization to the avr backend... :( And you're right it does optimize nicely. Here's a 20 byte version that executes in 12 cycles: http://pastie.textmate.org/127541 Very nice :) -- Paulo Marques Software Development Department - Grupo PIE, S.A. Phone: +351 252 290600, Fax: +351 252 290601 Web: www.grupopie.com All I ask is a chance to prove that money can't make me happy. ___ AVR-GCC-list mailing list AVR-GCC-list@nongnu.org http://lists.nongnu.org/mailman/listinfo/avr-gcc-list