Re: [fpc-devel] Registers reloading

2019-07-19 Thread Marco Borsari via fpc-devel

On 18/07/2019 21:17, J. Gareth Moreton wrote:

Hi Marco,

That is actually something I've been researching myself. Currently the 
peephole optimiser has no facility for detecting whether a register 
already contains the value that is being assigned to it. I call that 
facility the "Deep Optimizer", but it is really just basic data-flow 
analysis.


Thank you for your interest and your work,
greetings Marco Borsari
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] Registers reloading

2019-07-18 Thread J. Gareth Moreton

Hi Marco,

That is actually something I've been researching myself. Currently the 
peephole optimiser has no facility for detecting whether a register 
already contains the value that is being assigned to it. I call that 
facility the "Deep Optimizer", but it is really just basic data-flow 
analysis. I have been overhauling the peephole optimiser for i386 and 
x86-64 so that it is more efficient and needs fewer passes, although 
it's taking time to get it into a form that the administrators are 
happy to merge into trunk, since it is quite a major change.
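
A minimal sketch of the kind of data-flow tracking described above, 
written in Python for brevity. It is not FPC's optimiser code: the tuple 
encoding of instructions ("load", "store", "modmem", "clobber", 
"barrier") and the function name are assumptions made purely for 
illustration, and the block is a hand-made abstraction of the listing 
for lines [727]-[730] in the original question.

```python
def drop_redundant_loads(block):
    """Forward pass over one basic block that drops a load when the
    target register is already known to hold that stack slot's value."""
    mirrors = {}  # register -> set of stack slots whose value it holds
    kept = []
    for ins in block:
        op = ins[0]
        if op == "load":            # ("load", slot, reg): reg := [slot]
            _, slot, reg = ins
            if slot in mirrors.get(reg, set()):
                continue            # reg already holds [slot]: skip it
            mirrors[reg] = {slot}
        elif op == "store":         # ("store", reg, slot): [slot] := reg
            _, reg, slot = ins
            for slots in mirrors.values():
                slots.discard(slot)  # stale copies of the old [slot]
            mirrors.setdefault(reg, set()).add(slot)
        elif op == "modmem":        # ("modmem", slot): [slot] changed in place
            for slots in mirrors.values():
                slots.discard(ins[1])
        elif op == "clobber":       # ("clobber", reg): reg overwritten
            mirrors.pop(ins[1], None)
        elif op == "barrier":       # write through a pointer: forget all
            mirrors.clear()
        kept.append(ins)
    return kept


H6, H7 = -256, -260  # stack offsets of H6 and H7 in the listing

block = [
    ("clobber", "eax"),    # leal   -8(%edx,%eax,8),%eax
    ("store", "eax", H6),  # movl   %eax,-256(%ebp)
    ("store", "eax", H7),  # movl   %eax,-260(%ebp)   (1)
    ("modmem", H7),        # addl   $8,-260(%ebp)
    ("load", H6, "eax"),   # movl   -256(%ebp),%eax   (2)
    ("clobber", "ecx"),    # movswl 4(%eax),%ecx
    ("load", H7, "eax"),   # movl   -260(%ebp),%eax
    ("clobber", "edx"),    # movswl 4(%eax),%edx
    ("clobber", "eax"),    # leal   (%ecx,%edx),%eax
    ("clobber", "edx"),    # movw   %ax,%dx
    ("load", H6, "eax"),   # movl   -256(%ebp),%eax   (3)
    ("barrier",),          # movw   %dx,4(%eax)
]

optimized = drop_redundant_loads(block)
```

Under these assumptions the pass drops only the reload marked (2). The 
reload at (3) is kept, because eax has been overwritten in between; 
removing it would mean keeping the address in a second register, which 
this sketch does not attempt.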


Yours sincerely,

J. Gareth "Kit" Moreton


On 18/07/2019 14:33, Marco Borsari via fpc-devel wrote:

Hi all,
does the compiler have an optimization that avoids reloading an
address (or a value) into a register when it is already present
there?
I ask because of this code fragment, compiled with the -O4
option:

.Lj380:
# Register eax allocated
# [726] T := T-1;
subw    $1,2(%ebx)
# Register ax released
# Register edx allocated
# [727] H6 := @STK^[T];
movl    U_$GLOBALS_$$_STK,%edx
# Register eax allocated
movzwl    2(%ebx),%eax
leal    -8(%edx,%eax,8),%eax
# Register edx released
movl    %eax,-256(%ebp)
# [728] H7 := H6;
movl    %eax,-260(%ebp) <-(1)
# Register eax released
# [729] Inc(H7, 1);
addl    $8,-260(%ebp)
# Register eax allocated
# [730] H6^.I := H6^.I+H7^.I;
movl    -256(%ebp),%eax <-(2)
# Register ecx allocated
movswl    4(%eax),%ecx
movl    -260(%ebp),%eax
# Register edx allocated
movswl    4(%eax),%edx
leal    (%ecx,%edx),%eax
# Register ecx released
movw    %ax,%dx
movl    -256(%ebp),%eax <-(3)
movw    %dx,4(%eax)
# Register eax,dx released

At (1) the compiler knows that eax holds the correct address,
but at (2) it reloads it anyway (eax was released too soon).
Both cases seem to need the same level of analysis. Avoiding the
reload at (3) would need a deeper analysis, since it requires an
additional register such as si or di (they are used in other
parts of the code).

Thank you for any response, Marco Borsari






[fpc-devel] Registers reloading

2019-07-18 Thread Marco Borsari via fpc-devel

Hi all,
does the compiler have an optimization that avoids reloading an
address (or a value) into a register when it is already present
there?
I ask because of this code fragment, compiled with the -O4
option:

.Lj380:
# Register eax allocated
# [726] T := T-1;
subw    $1,2(%ebx)
# Register ax released
# Register edx allocated
# [727] H6 := @STK^[T];
movl    U_$GLOBALS_$$_STK,%edx
# Register eax allocated
movzwl    2(%ebx),%eax
leal    -8(%edx,%eax,8),%eax
# Register edx released
movl    %eax,-256(%ebp)
# [728] H7 := H6;
movl    %eax,-260(%ebp) <-(1)
# Register eax released
# [729] Inc(H7, 1);
addl    $8,-260(%ebp)
# Register eax allocated
# [730] H6^.I := H6^.I+H7^.I;
movl    -256(%ebp),%eax <-(2)
# Register ecx allocated
movswl    4(%eax),%ecx
movl    -260(%ebp),%eax
# Register edx allocated
movswl    4(%eax),%edx
leal    (%ecx,%edx),%eax
# Register ecx released
movw    %ax,%dx
movl    -256(%ebp),%eax <-(3)
movw    %dx,4(%eax)
# Register eax,dx released

At (1) the compiler knows that eax holds the correct address,
but at (2) it reloads it anyway (eax was released too soon).
Both cases seem to need the same level of analysis. Avoiding the
reload at (3) would need a deeper analysis, since it requires an
additional register such as si or di (they are used in other
parts of the code).

Thank you for any response, Marco Borsari