Re: [fpc-devel] Question about memory alignment (again!)

J. Gareth Moreton via fpc-devel Thu, 18 Aug 2022 01:06:26 -0700

Thanks for the additional research. I may be reinventing the wheel a bit with some of these discoveries.


Still, it feels good to beat gcc occasionally!


Gareth aka. Kit

On 18/08/2022 08:48, Stefan Glienke via fpc-devel wrote:

Interestingly this is what clang also does:

https://godbolt.org/z/Y4v14f9s3

On 17/08/2022 02:21 CEST J. Gareth Moreton via fpc-devel 
<fpc-devel@lists.freepascal.org> wrote:

Hi everyone,


Recently I've made some optimisations centred around the SHR instruction
on x86, and there was one pair of instructions that caught my attention:

movl (%rbx),%eax
shrl $24,%eax

Is it permissible to optimise this to (x86 is little-endian):

movzbl 3(%rbx),%eax?

(You could also optimise "movl; sarl" into a "movsbl" instruction this way)

Logically the result is the same and it removes an instruction and a
pipeline stall, but will there be a performance hit that comes from
reading an unaligned byte of memory like that?
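
To illustrate the logical equivalence only (not the performance question), here is a minimal C sketch. It assumes a little-endian host such as x86, and the function names are hypothetical, made up for this example. The signed variant also assumes that right-shifting a negative value is an arithmetic shift, which is implementation-defined in C but is what gcc and clang do on x86.

```c
#include <stdint.h>
#include <string.h>

/* Models: movl (%rbx),%eax ; shrl $24,%eax */
static uint32_t top_byte_via_shift(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);      /* 32-bit load without alignment assumptions */
    return v >> 24;               /* logical shift keeps only the top byte */
}

/* Models: movzbl 3(%rbx),%eax (valid on little-endian targets only) */
static uint32_t top_byte_via_byte_load(const void *p)
{
    return ((const uint8_t *)p)[3];   /* zero-extended byte at offset 3 */
}

/* Models: movl (%rbx),%eax ; sarl $24,%eax */
static int32_t top_byte_signed_via_shift(const void *p)
{
    int32_t v;
    memcpy(&v, p, sizeof v);
    return v >> 24;               /* assumed arithmetic shift (gcc/clang behaviour) */
}

/* Models: movsbl 3(%rbx),%eax (valid on little-endian targets only) */
static int32_t top_byte_signed_via_byte_load(const void *p)
{
    return ((const int8_t *)p)[3];    /* sign-extended byte at offset 3 */
}
```

For any 4-byte buffer, each pair of functions should return the same value on a little-endian machine. Note that a single-byte load cannot itself be misaligned, since a byte has no alignment requirement.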

I did make a similar optimisation once before with QWords using the
implicit zero-extension of the 32-bit MOV instruction - that is:

movq (%rbx),%rax
shrq $32,%rax

To:

movl 4(%rbx),%eax

This one is a little nicer though because it's still on a 32-bit
boundary and so was permissible.
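
The QWord case can be sketched the same way in C. Again this assumes a little-endian host, and the function names are hypothetical. Returning the 32-bit value as a 64-bit one stands in for the implicit zero-extension of the 32-bit MOV.

```c
#include <stdint.h>
#include <string.h>

/* Models: movq (%rbx),%rax ; shrq $32,%rax */
static uint64_t high_dword_via_shift(const void *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);      /* 64-bit load */
    return v >> 32;               /* keep the upper 32 bits */
}

/* Models: movl 4(%rbx),%eax (valid on little-endian targets only) */
static uint64_t high_dword_via_dword_load(const void *p)
{
    uint32_t v;
    memcpy(&v, (const unsigned char *)p + 4, sizeof v);  /* 32-bit load at offset 4 */
    return v;                     /* widening to 64 bits zero-extends */
}
```

If the original 64-bit load was 8-byte aligned, the 32-bit load at offset 4 is 4-byte aligned, which matches the point about staying on a 32-bit boundary.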

Gareth aka. Kit

_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel
