
Re: [avr-gcc-list] Inversion of logic improves size speed

2007-09-11 Thread Bernard Fouché


Hi All.

Do you know if this patch will make it into the 4.2.2 release, expected
around September 18th?


Thanks!

 Bernard

Anatoly Sokolov wrote:


Hi.

This patch optimizes logical right shift of unsigned char by 4, 5, and 6,
eliminating the double 'andi' instructions in some cases.

  


Patch.

Anatoly.



  
___

AVR-GCC-list mailing list
AVR-GCC-list@nongnu.org
http://lists.nongnu.org/mailman/listinfo/avr-gcc-list






RE: [avr-gcc-list] Inversion of logic improves size speed

2007-09-11 Thread Eric Weddington

 -Original Message-
 From:
 [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]
 org] On Behalf Of Bernard Fouché
 Sent: Tuesday, September 11, 2007 4:33 AM
 To: Anatoly Sokolov
 Cc: avr-gcc-list@nongnu.org
 Subject: Re: [avr-gcc-list] Inversion of logic improves size speed

 Hi All.

 Do you know if this patch will make it into the 4.2.2 release, expected
 around September 18th?

I doubt that it will make it into the GCC tree itself by 4.2.2, but that is
totally up to Anatoly, if he has time.

However, I will be including it in the next release of WinAVR, which will
hopefully include 4.2.2. GCC bug #29524
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29524) needs a fix, though,
before I feel comfortable releasing it.


Re: [avr-gcc-list] Inversion of logic improves size speed

2007-08-27 Thread Wouter van Gulik


Anatoly Sokolov wrote:

Hi.

This patch optimizes logical right shift of unsigned char by 4, 5, and 6,
eliminating the double 'andi' instructions in some cases.




snip




Now:

0092 getBit4InvShift:
  92: 82 95   swap r24
  94: 81 70   andi r24, 0x01 ; 1
  96: 08 95   ret

0098 getBit5InvShift:
  98: 82 95   swap r24
  9a: 86 95   lsr r24
  9c: 81 70   andi r24, 0x01 ; 1
  9e: 08 95   ret

00a0 getBit6InvShift:
  a0: 82 95   swap r24
  a2: 86 95   lsr r24
  a4: 86 95   lsr r24
  a6: 81 70   andi r24, 0x01 ; 1
  a8: 08 95   ret




That's good news! No more 'clr r25' and no more double 'and'!
Does this fix the double 'and' in more situations? Is this because the
swap/and pair is now exposed to the upper layers?


One thing: the patch is not attached to this e-mail on the list, and I did
not receive your e-mail at my private address. Maybe it was filtered; I will
check my junk folder.


Thanks,

Wouter



RE: [avr-gcc-list] Inversion of logic improves size speed

2007-08-27 Thread Eric Weddington

 -Original Message-
 From:
 [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]
 org] On Behalf Of Wouter van Gulik
 Sent: Monday, August 27, 2007 3:25 AM
 To: Anatoly Sokolov
 Cc: avr-gcc-list@nongnu.org
 Subject: Re: [avr-gcc-list] Inversion of logic improves size speed

 One thing: the patch is not attached to this e-mail on the list, and I did
 not receive your e-mail at my private address. Maybe it was filtered; I will
 check my junk folder.

Patch was not attached to email. However, Anatoly attached the patch to the
bug report.

Eric Weddington


Re: [avr-gcc-list] Inversion of logic improves size speed

2007-08-27 Thread Wouter van Gulik


Eric Weddington wrote:


Patch was not attached to email. However, Anatoly attached the patch to the
bug report.



What bug report?
I looked at:

Non optimal bit extraction
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33049

No register save:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33050

Double and:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11259

I can't find it there, or I need some more coffee... it is still Monday,
after all ;)


Wouter




Re: [avr-gcc-list] Inversion of logic improves size speed

2007-08-27 Thread Anatoly Sokolov


 Anatoly Sokolov wrote:
 Hi.

 This patch optimizes logical right shift of unsigned char by 4, 5, and 6,
 eliminating the double 'andi' instructions in some cases.


Patch.

Anatoly.



[Attachment: gcc_fix_11259_33028.txt (uuencoded patch; the encoded body is corrupted in the archive and cannot be decoded)]

RE: [avr-gcc-list] Inversion of logic improves size speed

2007-08-27 Thread Eric Weddington



 -Original Message-
 From: Wouter van Gulik [mailto:[EMAIL PROTECTED]
 Sent: Monday, August 27, 2007 7:04 AM
 To: Eric Weddington
 Cc: 'Anatoly Sokolov'; avr-gcc-list@nongnu.org
 Subject: Re: [avr-gcc-list] Inversion of logic improves size speed

 Eric Weddington wrote:
 
  Patch was not attached to email. However, Anatoly attached
 the patch to the
  bug report.
 

 What bug report?
 I looked at:

 Non optimal bit extraction
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33049

 No register save:
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33050

 Double and:
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11259

 I can't find it there, or I need some more coffee... it is still Monday,
 after all ;)

Bug #33028.

But Anatoly has now kindly sent this to the list. :-)

Eric Weddington





RE: [avr-gcc-list] Inversion of logic improves size speed

2007-08-27 Thread Eric Weddington

 -Original Message-
 From:
 [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED]
 org] On Behalf Of Anatoly Sokolov
 Sent: Monday, August 27, 2007 7:14 AM
 To: Wouter van Gulik
 Cc: avr-gcc-list@nongnu.org
 Subject: Re: [avr-gcc-list] Inversion of logic improves size speed

  Anatoly Sokolov wrote:
  Hi.

  This patch optimizes logical right shift of unsigned char by 4, 5, and 6,
  eliminating the double 'andi' instructions in some cases.

 Patch.

 Anatoly.

Hi Anatoly,

I built gcc 4.2.1 with this patch.

Interestingly, for bug #33028 there is no difference between 4.1.2, and
4.2.1 with this patch. I see the bug show up in 4.3.0 snapshot, but I have
not tried the patch with 4.3.x yet.

For bug #11259, the testcase in question generates this:

in r24,50-0x20
swap r24
andi r24,0x0f
andi r24,lo8(12)

With this patch it is transformed to:

in r24,50-0x20
swap r24
andi r24,lo8(12)

Which looks to be correct.
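
One way to check that dropping the 'andi r24,0x0f' is safe is to model both sequences in C and compare them for every 8-bit input. This is only a host-side sketch, not something from the patch: the helper names (avr_swap, before_patch, after_patch, sequences_agree) are made up for illustration, and the 'in r24,50-0x20' load is replaced by a plain function argument.

```c
#include <assert.h>
#include <stdint.h>

/* Model of the AVR SWAP instruction: exchange the two nibbles of a byte. */
uint8_t avr_swap(uint8_t r)
{
    return (uint8_t)((r << 4) | (r >> 4));
}

/* Sequence without the patch: swap; andi 0x0f; andi 12 */
uint8_t before_patch(uint8_t r)
{
    r = avr_swap(r);
    r &= 0x0f;
    r &= 12;
    return r;
}

/* Sequence with the patch: swap; andi 12 */
uint8_t after_patch(uint8_t r)
{
    r = avr_swap(r);
    r &= 12;
    return r;
}

/* Returns 1 if both sequences agree for all 256 possible inputs. */
int sequences_agree(void)
{
    for (unsigned v = 0; v < 256; v++) {
        if (before_patch((uint8_t)v) != after_patch((uint8_t)v))
            return 0;
    }
    return 1;
}

/* Hypothetical self-check helper. */
void self_check(void)
{
    assert(sequences_agree());
}
```

Every bit set in 12 (0x0c) is also set in 0x0f, so ANDing with 0x0f first cannot change the result; that is why the two sequences agree.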

Thanks,
Eric Weddington


Re: [avr-gcc-list] Inversion of logic improves size speed

2007-08-26 Thread Anatoly Sokolov

Hi.

This patch optimizes logical right shift of unsigned char by 4, 5, and 6,
eliminating the double 'andi' instructions in some cases.


...
 uint8_t getBit4InvShift(uint8_t temp) { uint8_t r = 0; if((temp>>4)&1)
 r|=0x1; return r; }
 uint8_t getBit5InvShift(uint8_t temp) { uint8_t r = 0; if((temp>>5)&1)
 r|=0x1; return r; }
 uint8_t getBit6InvShift(uint8_t temp) { uint8_t r = 0; if((temp>>6)&1)
 r|=0x1; return r; }
...


 This results in:


 0146 getBit4InvShift:
 uint8_t getBit4InvShift(uint8_t temp) { uint8_t r = 0; if((temp>>4)&1)
 r|=0x1; return r; }
 146: 82 95   swap r24
 148: 8f 70   andi r24, 0x0F ; 15
 14a: 81 70   andi r24, 0x01 ; 1
 14c: 99 27   eor r25, r25
 14e: 08 95   ret

 0150 getBit5InvShift:
 uint8_t getBit5InvShift(uint8_t temp) { uint8_t r = 0; if((temp>>5)&1)
 r|=0x1; return r; }
 150: 82 95   swap r24
 152: 86 95   lsr r24
 154: 87 70   andi r24, 0x07 ; 7
 156: 81 70   andi r24, 0x01 ; 1
 158: 99 27   eor r25, r25
 15a: 08 95   ret

 015c getBit6InvShift:
 uint8_t getBit6InvShift(uint8_t temp) { uint8_t r = 0; if((temp>>6)&1)
 r|=0x1; return r; }
 15c: 82 95   swap r24
 15e: 86 95   lsr r24
 160: 86 95   lsr r24
 162: 83 70   andi r24, 0x03 ; 3
 164: 81 70   andi r24, 0x01 ; 1
 166: 99 27   eor r25, r25
 168: 08 95   ret


Now:

0092 getBit4InvShift:
  92: 82 95   swap r24
  94: 81 70   andi r24, 0x01 ; 1
  96: 08 95   ret

0098 getBit5InvShift:
  98: 82 95   swap r24
  9a: 86 95   lsr r24
  9c: 81 70   andi r24, 0x01 ; 1
  9e: 08 95   ret

00a0 getBit6InvShift:
  a0: 82 95   swap r24
  a2: 86 95   lsr r24
  a4: 86 95   lsr r24
  a6: 81 70   andi r24, 0x01 ; 1
  a8: 08 95   ret
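
For readers who want to check the shortened sequences, here is a small host-side model in C. It is a sketch under assumptions: avr_swap and avr_lsr are hypothetical helpers that mimic only the data result of the SWAP and LSR instructions (flags are ignored), and the get_bitN_model functions mirror the three listings above instruction by instruction.

```c
#include <assert.h>
#include <stdint.h>

/* Data result of SWAP: exchange high and low nibble. */
uint8_t avr_swap(uint8_t r)
{
    return (uint8_t)((r << 4) | (r >> 4));
}

/* Data result of LSR: logical shift right by one bit. */
uint8_t avr_lsr(uint8_t r)
{
    return (uint8_t)(r >> 1);
}

/* swap; andi 0x01 */
uint8_t get_bit4_model(uint8_t r)
{
    r = avr_swap(r);
    return r & 0x01;
}

/* swap; lsr; andi 0x01 */
uint8_t get_bit5_model(uint8_t r)
{
    r = avr_swap(r);
    r = avr_lsr(r);
    return r & 0x01;
}

/* swap; lsr; lsr; andi 0x01 */
uint8_t get_bit6_model(uint8_t r)
{
    r = avr_swap(r);
    r = avr_lsr(r);
    r = avr_lsr(r);
    return r & 0x01;
}

/* Returns 1 if every model matches (temp >> n) & 1 for all inputs. */
int patched_sequences_match(void)
{
    for (unsigned v = 0; v < 256; v++) {
        uint8_t t = (uint8_t)v;
        if (get_bit4_model(t) != ((t >> 4) & 1)) return 0;
        if (get_bit5_model(t) != ((t >> 5) & 1)) return 0;
        if (get_bit6_model(t) != ((t >> 6) & 1)) return 0;
    }
    return 1;
}

/* Hypothetical self-check helper. */
void self_check(void)
{
    assert(patched_sequences_match());
}
```

The intermediate 'andi' with 0x0F, 0x07, or 0x03 only clears bits that the final 'andi r24, 0x01' clears anyway, so leaving it out does not change the returned value.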


Anatoly. 





Re: [avr-gcc-list] Inversion of logic improves size speed

2007-08-07 Thread Wouter van Gulik


Anatoly Sokolov wrote:

Hi,

Bug #11259 [avr] gcc Double 'andi' missed optimization:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11259

Bug #29560 Poor optimization for character shifts on Atmel AVR:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560




Bug #29560 seems to be a little different. The bug report is about shifting
with a variable shift count, and the loop that performs this shift is not
optimal (the high byte is shifted too, because of int promotion or something similar).
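
As a side note on the int promotion mentioned here: in C, an unsigned char operand of a shift is promoted to int before the shift takes place, so the shift expression itself has type int. The sketch below only demonstrates that language rule; whether it is the actual cause of the code reported in bug #29560 is an assumption, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Shift an unsigned char by a run-time count. 'value' is promoted to int
   before the shift; the result is truncated again by the cast. */
unsigned char shift_right_var(unsigned char value, unsigned char count)
{
    return (unsigned char)(value >> count);
}

/* Size in bytes of the type of the shift expression itself. */
size_t shift_expr_size(void)
{
    unsigned char value = 0x80;
    unsigned char count = 3;
    return sizeof(value >> count);
}

/* Hypothetical self-check helper. */
void self_check(void)
{
    assert(shift_expr_size() == sizeof(int));
}
```

On AVR an int is 16 bits wide, so a compiler that does not narrow the operation back to 8 bits ends up shifting two registers instead of one.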


My example, by contrast, uses fixed shift counts; it is really bit extraction
implemented as shifting.
My concern is that rewriting/inverting my logic gives much better
(optimal in most cases) results, so the compiler apparently does not choose
the best path. It seems to have two ways of doing the
shift? Maybe it's some hidden 8-bit/16-bit variable difference?



Testcase:


snip



  There are two 'and' insns (#24 and #12), but they are still not merged. Why?
The probable reason is that the 'lshiftrt' insn is split into 'rotate' and 'and' insns in
the 'pass_split_after_reload' pass of the compiler, but the optimization passes
(combine and cse) that could merge the two 'and' insns run earlier.



I see, too bad...


It is possible to add a peephole to merge the two 'and' insns, but I do not think
that this solution is optimal.



Why not? I agree it does not solve the root of the problem, but it helps
anyway. I am a total noob on GCC internals, so this might be a stupid
question...


Thanks for all the explanations! Really interesting stuff.

Greetings,

Wouter



Re: [avr-gcc-list] Inversion of logic improves size speed

2007-08-06 Thread Anatoly Sokolov

Hi,

From: Wouter van Gulik [EMAIL PROTECTED]
Sent: Sunday, August 05, 2007 11:46 PM


 After some testing I found out that inverting the shift and 'and'
 operations can significantly improve speed and size. In the first
 case the compiler misses that it can optimise the shifts for bits 4..7
 by first swapping nibbles, which it does figure out when the code is
 rewritten as in the lower part.
 
 Is this a (known?) bug or am I missing something?
 

 Yes:

Bug #11259 [avr] gcc Double 'andi' missed optimization:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=11259

Bug #29560 Poor optimization for character shifts on Atmel AVR:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=29560
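
To make the "inversion" concrete: the source from the first message is not reproduced in this thread, so the following is only a guess at what the two formulations might look like. get_bit_shift_first follows the getBitNInvShift test case; get_bit_mask_first is an assumed mask-and-test variant. Both names are hypothetical. The sketch shows only that the two forms are equivalent in C, so any difference in the generated code comes from the optimizer.

```c
#include <assert.h>
#include <stdint.h>

/* Shift first, then mask: the form used in the getBitNInvShift test case. */
uint8_t get_bit_shift_first(uint8_t temp, uint8_t n)
{
    uint8_t r = 0;
    if ((temp >> n) & 1)
        r |= 0x1;
    return r;
}

/* Mask first, then test: an assumed alternative formulation. */
uint8_t get_bit_mask_first(uint8_t temp, uint8_t n)
{
    uint8_t r = 0;
    if (temp & (uint8_t)(1u << n))
        r |= 0x1;
    return r;
}

/* Returns 1 if both forms agree for every byte value and bit number 0..7. */
int forms_agree(void)
{
    for (unsigned n = 0; n < 8; n++) {
        for (unsigned v = 0; v < 256; v++) {
            if (get_bit_shift_first((uint8_t)v, (uint8_t)n) !=
                get_bit_mask_first((uint8_t)v, (uint8_t)n))
                return 0;
        }
    }
    return 1;
}

/* Hypothetical self-check helper. */
void self_check(void)
{
    assert(forms_agree());
}
```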


Testcase:

unsigned char 
getBit4InvShift(unsigned char  temp) 
{ 
  unsigned char r = 0; 
  if((temp>>4)&1) r|=0x1; 
  return r; 
}

This code is compiled into the following insns:

/* frame size = 0 */
.LM2:
 ; (insn 6 3 28 demo.c:4 (set (reg:QI 24 r24 [44])
 ; (lshiftrt:QI (reg:QI 24 r24 [ temp ])
 ; (const_int 4 [0x4]))) 66 {lshrqi3} (nil))
swap r24 ;  6   lshrqi3/5   [length = 2]
andi r24,0x0f
.LVL1:
.LM3:
 ; (insn 12 7 18 demo.c:8 (set (reg/i:QI 24 r24 [ result ])
 ; (and:QI (reg:QI 24 r24 [44])
 ; (const_int 1 [0x1]))) 41 {andqi3} (nil))
andi r24,lo8(1)  ;  12  andqi3/2    [length = 1]
/* epilogue start */


The lshrqi3 patterns are defined as opaque macro sequences, so the 'andi'
instruction from the lshrqi3 insn (#6) is never exposed to GCC's RTL optimizers.

I tried implementing the 'lshrqi3' insn for r >> 4 as a 'define_insn_and_split
"*lshrqi3_const4"':

(define_insn "rotlqi3"
  [(set (match_operand:QI 0 "register_operand" "=r")
        (rotate:QI (match_operand:QI 1 "register_operand" "0")
                   (const_int 4)))]
  ""
  "swap %0"
  [(set_attr "length" "1")
   (set_attr "cc" "none")])

;;
;; logical shift right

(define_expand "lshrqi3"
  [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,!d,r,r")
        (lshiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0,0,0")
                     (match_operand:QI 2 "general_operand" "r,L,P,K,n,n,Qm")))]
  ""
  "")

(define_insn_and_split "*lshrqi3_const4"
  [(set (match_operand:QI 0 "d_register_operand" "=d")
        (lshiftrt:QI (match_operand:QI 1 "d_register_operand" "0")
                     (const_int 4)))]
  ""
  "#"
  ""
  [(set (match_dup 0) (rotate:QI (match_dup 0) (const_int 4)))
   (set (match_dup 0) (and:QI (match_dup 0) (const_int 15)))]
  "")

(define_insn "*lshrqi3"
  [(set (match_operand:QI 0 "register_operand" "=r,r,r,r,!d,r,r")
        (lshiftrt:QI (match_operand:QI 1 "register_operand" "0,0,0,0,0,0,0")
                     (match_operand:QI 2 "general_operand" "r,L,P,K,n,n,Qm")))]
  ""
  "* return lshrqi3_out (insn, operands, NULL);"
  [(set_attr "length" "5,0,1,2,4,6,9")
   (set_attr "cc" "clobber,none,set_czn,set_czn,set_czn,set_czn,clobber")])


As a result, the following code is now generated:

.LM2:
 ; (insn 23 3 30 demo.c:4 (set (reg:QI 24 r24 [44])
 ; (rotate:QI (reg:QI 24 r24 [44])
 ; (const_int 4 [0x4]))) 66 {rotlqi3} (nil))
swap r24 ;  23  rotlqi3 [length = 1]
.LVL1:
 ; (insn 24 30 7 demo.c:4 (set (reg:QI 24 r24 [44])
 ; (and:QI (reg:QI 24 r24 [44])
 ; (const_int 15 [0xf]))) 41 {andqi3} (nil))
andi r24,lo8(15) ;  24  andqi3/2    [length = 1]
.LM3:
 ; (insn 12 7 18 demo.c:8 (set (reg/i:QI 24 r24 [ result ])
 ; (and:QI (reg:QI 24 r24 [44])
 ; (const_int 1 [0x1]))) 41 {andqi3} (nil))
andi r24,lo8(1)  ;  12  andqi3/2    [length = 1]


  There are two 'and' insns (#24 and #12), but they are still not merged. Why?
The probable reason is that the 'lshiftrt' insn is split into 'rotate' and 'and' insns in
the 'pass_split_after_reload' pass of the compiler, but the optimization passes
(combine and cse) that could merge the two 'and' insns run earlier.

It is possible to add a peephole to merge the two 'and' insns, but I do not think
that this solution is optimal.
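
To illustrate what such a peephole would do, here is a toy model in C. It is not GCC code: struct insn, enum opcode, and merge_adjacent_andi are hypothetical stand-ins for the real RTL machinery, and the sketch handles only two directly adjacent 'andi' instructions on the same register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified instruction representation (not GCC's RTL). */
enum opcode { OP_SWAP, OP_LSR, OP_ANDI };

struct insn {
    enum opcode op;
    uint8_t reg;  /* register number */
    uint8_t imm;  /* immediate value, used by OP_ANDI only */
};

/* Merge adjacent "andi Rd,K1; andi Rd,K2" into "andi Rd,K1&K2", in place.
   Returns the new number of instructions. */
size_t merge_adjacent_andi(struct insn *code, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        if (out > 0 &&
            code[i].op == OP_ANDI &&
            code[out - 1].op == OP_ANDI &&
            code[out - 1].reg == code[i].reg) {
            /* Fold the second mask into the previous instruction. */
            code[out - 1].imm &= code[i].imm;
        } else {
            code[out++] = code[i];
        }
    }
    return out;
}

/* Runs the merge on "swap r24; andi r24,0x0f; andi r24,0x01" and reports
   the resulting length and the immediate of the last instruction. */
size_t example_merged_length(uint8_t *merged_imm)
{
    struct insn code[] = {
        { OP_SWAP, 24, 0 },
        { OP_ANDI, 24, 0x0f },
        { OP_ANDI, 24, 0x01 },
    };
    size_t n = merge_adjacent_andi(code, sizeof code / sizeof code[0]);
    *merged_imm = code[n - 1].imm;
    return n;
}

/* Hypothetical self-check helper. */
void self_check(void)
{
    uint8_t imm;
    assert(example_merged_length(&imm) == 2);
}
```

A merge of this kind works on the instruction stream that is already there, which fits the concern above: it treats the symptom after the split instead of exposing the 'and' to combine and cse earlier.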

Mine...

Anatoly.






Re: [avr-gcc-list] Inversion of logic improves size speed

2007-08-05 Thread Joerg Wunsch

Wouter van Gulik [EMAIL PROTECTED] wrote:

 Is this a (known?) bug or am I missing something?

It's not strictly a bug but a missed optimization.  Could you file
a bugzilla report on GCC for this?  If you replace uint8_t with
unsigned char, no further preprocessing is needed, so you can
attach that example directly to the report.  Please also attach
the *generated* assembly code (rather than the disassembly listing);
this is what you get by using the -S compiler option.

-- 
cheers, Jorg   .-.-.   --... ...--   -.. .  DL8DTL

http://www.sax.de/~joerg/                        NIC: JW11-RIPE
Never trust an operating system you don't have sources for. ;-)


