I agree that optimization needs to be optional. Not only because it slows
down compilation time, but also
because of possible bugs (operators &&, || and ? generate jumps inside
expressions and might rely
on longer generated code variants)

There are numerous patterns that can be simplified

1. This is mentioned in my first post. In a sequences like
5+a+b  (sum of two variables as a part of larger expression)
we remark patterns like
MOV ECX,[address]
ADD EAX,ECX
(other registers are also possible)
it can be replaced with
ADD EAX,[address]

Similary, there are replaceable patterns
MOV ECX,[address]
AND EAX,ECX

MOV ECX,[address]
OR EAX,ECX

MOV ECX,[address]
XOR EAX,ECX

MOV ECX,[address]
SUB EAX,ECX

2) Comparison expressions
a==800
is translated as
MOV EAX,[address]
CMP EAX,800
MOV EAX,0
SETE AL

we can save 2 bytes by replacing last two instructions with
SETE AL
MOVZX EAX,AL

if constant we compare with is larger than 255, we can save 2 more bytes by
replacing
first two instructions with

CMP DWORD[address],800

3) Single instruction which occurs in almost every loop
i++
generates
MOV EAX,[address]
MOV ECX,EAX
INC EAX
MOV [address],EAX

and we can save 9 bytes by replacing it into

INC DWORD [address]

4) Conditional expression:
a==5?1:2

generates
    CMP EAX,5
    MOV EAX,0
    SETE AL
    CMP EAX,0
    JE S1
    JMP S2
S1: MOV EAX,2
    JMP S3
S2: MOV EAX,1
S3:

We can save 16 bytes for similar cases:
   CMP EAX,5
   JE S2
S1: MOV EAX,2
    JMP S3
S2: MOV EAX,1
S3:

5) Constant assignement
  a=10
  generates
  MOV EAX,10
  MOV [address],EAX

  One byte can be saved by replacing it with
  MOV DWORD [address],10
* but beware of conditional expressions

6) Function prologue generates
  PUSH EBP
  MOV EBP,ESP
  SUB ESP,8
  NOP

we can save 6 bytes with
  ENTER 0,8
(however code latency will increase from 3 to 12 cycles)

7) Constant and variable parameter passing to function
  p(2)
parameter is passed using two instructions
  MOV EAX,2
  PUSH EAX

One byte saving with single instruction
 PUSH 2

8) Variable is already in register:
Sometimes (for example in
a=expression;
p(a)

we can see the code
MOV [address],EAX
MOV EAX,[address]
Second instruction is redundant (6 bytes save) but ...
* beware of goto to the second instruction

On Sun, May 24, 2020 at 7:16 AM Christian Jullien <eli...@orange.fr> wrote:

> Hi, there is certainly a tradeoff between compilation time and execution
> time.
>
> A possible way to add this peephole optimization and possibly others we’ll
> find later it to include this code only if –O2 is supplied.
>
> Now, removing 400 similar sequences like this is certainly something we
> would like to have and on other backend if it applies.
>
>
>
> How long it makes tcc slower when compiling with and without this option?
>
>
>
> Wdyt?
>
>
>
> *From:* Tinycc-devel [mailto:tinycc-devel-bounces+eligis=
> orange...@nongnu.org] *On Behalf Of *Samir Ribic
> *Sent:* Sunday, May 24, 2020 01:09
> *To:* tinycc-devel@nongnu.org
> *Subject:* [Tinycc-devel] Simple peephole optimization
>
>
>
> Tcc was never intended to be a real optimizing compiler, because it is
> focused on compilation speed and memory savings. However, it seems that
> some improvements on generated code can be done, without significant
> compiler speed degradation.As tcc generates machine code, rather than
> assembly, pattern searching  for peephole optimization might be quite fast.
>
>
>
> I remarked that sequences in 386 generated code like this (Intel syntax)
>
> MOV ECX,[memory location]
>
> ADD EAX,ECX
>
> appear quite often, and they can be replaced with
>
> ADD EAX, [memory location]
>
> which saves two bytes. Together with variants using AND, OR, XOR, ADC, SUB
> and SBB instructions, ]iIn a generated code from tcc.c there are about 400
> similar sequences.
>
>
>
> To do the replacement In a file i386-gen.c I have changed the function
>
> void gen_opi(int op) about 20 lines below the label gen_op8:.
>
> .....
>
>         } else {
>             gv2(RC_INT, RC_INT);
>             r = vtop[-1].r;
>             fr = vtop[0].r;
>             o((opc << 3) | 0x01);
>             o(0xc0 + r + fr * 8);
> // Added code******
>             unsigned char * peep;
>             peep=cur_text_section->data+ind;
>
> // op with global var
>
>             if (peep[-8] == 0x8B &&      /*   MOV reg,regm */
>                 ((peep[-7] & 0xC7) == 5) &&  /* modrm= [disp32] */
>                 ((peep[-7] & 0x38) == (peep[-1] & 0x38 ))  /* dest reg of
> first==srcreg of second */
>                ) {
>                 peep[-8]=(opc << 3) | 0x03;  /* first instruction change
> mov to op and swap registers */
>                 peep[-7] &= 0xC7;   /* clear reg field of first
> instruction */
>                 peep[-7] |= r*8;    /* set reg field of first instruction
> */
>                 ind -= 2;           /* two bytes saved */
>             }
> // op with local var
>             else {
>             if (peep[-5] == 0x8B &&      /*   MOV reg,regm */
>                 ((peep[-4] & 0xC7) == 0x45) &&  /* modrm= [EBP+disp8] */
>                 ((peep[-4] & 0x38) == (peep[-1] & 0x38 ))  /* dest reg of
> first==srcreg of second */
>                ) {
>                 peep[-5]=(opc << 3) | 0x03;  /* first instruction change
> mov to op and swap registers */
>                 peep[-4] &= 0xC7;   /* clear reg field of first
> instruction */
>                 peep[-4] |= r*8;    /* set reg field of first instruction
> */
>                 ind -= 2;           /* two bytes saved */
>             }
>
>             }
>    //*************************
>         }
>
>
>
>
> _______________________________________________
> Tinycc-devel mailing list
> Tinycc-devel@nongnu.org
> https://lists.nongnu.org/mailman/listinfo/tinycc-devel
>
_______________________________________________
Tinycc-devel mailing list
Tinycc-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/tinycc-devel

Reply via email to