Re: [fpc-devel] i386-linux switched to a 16 byte aligned stack

2019-09-16 Thread Henry Vermaak
On Mon, 16 Sep 2019 at 14:58, Ben Grasset wrote:
> On Sun, Sep 15, 2019 at 1:36 PM Florian Klämpfl wrote:
>> In r43005 to 43014 I committed a couple of patches so FPC generates
>> stack frames aligned to 16 byte boundaries on i386-linux
>
> Good change! Means, for example, the long-standing issues with popular 
> libraries like SDL2 on 32-bit Linux won't be a problem anymore.

Wow, I almost forgot about this train wreck.  For us it was OpenCV, and
luckily we had our own library sitting between FPC and OpenCV, so we
could build that with -mstackrealign.

Henry
___
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


Re: [fpc-devel] i386-linux switched to a 16 byte aligned stack

2019-09-16 Thread Ben Grasset
On Sun, Sep 15, 2019 at 1:36 PM Florian Klämpfl wrote:

> In r43005 to 43014 I committed a couple of patches so FPC generates
> stack frames aligned to 16 byte boundaries on i386-linux
>

Good change! Means, for example, the long-standing issues with popular
libraries like SDL2 on 32-bit Linux won't be a problem anymore.


Re: [fpc-devel] i386-linux switched to a 16 byte aligned stack

2019-09-16 Thread J. Gareth Moreton
Ah, whoops, I misunderstood: this is only for i386-linux, not
i386-win32.  Would there be benefits to aligning the stack on that
platform as well, though?


Gareth aka. Kit




Re: [fpc-devel] i386-linux switched to a 16 byte aligned stack

2019-09-16 Thread J. Gareth Moreton
It's a useful feature as far as hand-written and generated assembly
language is concerned.  The Intel SIMD instruction sets work far better
with aligned memory (e.g. you can use MOVAPS instead of MOVUPS; the
former is faster on older CPUs but triggers a segmentation fault if
the memory is unaligned).  Granted, vectorcall currently only works on
x86_64-win64, because there I was able to re-use the code for the
System V ABI, but an aligned stack might make it easier to port it to
i386-win32 eventually.  Under Microsoft Visual C++, __vectorcall is
supported on 32-bit platforms by using only ECX and EDX as the integer
registers, the same as __fastcall.  Speaking of 'fastcall', I do wonder
if it's worth implementing that calling convention in case one wants to
communicate with an external library that uses it.
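
To make the aligned/unaligned distinction concrete, here is a small
Python model (a sketch only, not real SIMD code).  It assumes the usual
rule that MOVAPS needs a 16-byte aligned memory operand while MOVUPS
accepts any address.  The names is_aligned and pick_load are made up
for this illustration.

```python
# Toy model of choosing between MOVAPS and MOVUPS for a 16-byte load.
# Assumption: MOVAPS needs a 16-byte aligned address, MOVUPS does not.
# All names here are hypothetical and exist only for this sketch.

SSE_ALIGNMENT = 16  # bytes; the size of one XMM register


def is_aligned(address: int, alignment: int = SSE_ALIGNMENT) -> bool:
    """Return True if address is a multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return address & (alignment - 1) == 0


def pick_load(address: int) -> str:
    """Name the load instruction that is safe for this address."""
    return "MOVAPS" if is_aligned(address) else "MOVUPS"
```

With a 16-byte aligned stack, a local placed at an offset from esp that
is a multiple of 16 always lands in the MOVAPS case, e.g.
pick_load(0xBFFFF010), whereas pick_load(0xBFFFF00C) has to fall back
to MOVUPS.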


Gareth aka. Kit

On 15/09/2019 21:07, Florian Klämpfl wrote:

On 15.09.19 at 19:35, Florian Klämpfl wrote:
In r43005 to 43014 I committed a couple of patches so that FPC
generates stack frames aligned to 16-byte boundaries on i386-linux
(before a call instruction, esp is divisible by 16). This is done
because it seems that Linux libraries are starting to depend on this
property, which gcc has ensured for around 20 years. To ensure this,
FPC uses the same approach as clang (and as FPC already uses for
i386-darwin): esp keeps a fixed value fulfilling the alignment
requirement during the whole procedure. Outgoing parameters are copied
onto the stack by mov instead of push instructions. The consequences
of these changes are:
- For pure Pascal programs, this does not change anything. The
resulting code might be slightly bigger, but in turn floating-point
code might be faster, as double values can now be properly aligned.
- Most assembler code is not affected by the change. Only code that
uses constant offsets to access the stack via esp might be affected;
such code is rare.
- Assembler code calling other procedures should be adapted to keep
the stack aligned to 16-byte boundaries as well. Assembler code
working on i386-darwin fulfills this requirement already. The define
FPC_STACKALIGNMENT contains the alignment of the stack (16 in the
case of i386-linux).
- To test whether the stack is always properly aligned, compile with
-Ct: the stack checking code for i386-linux now checks the stack
alignment as well.
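
The arithmetic behind a fixed, aligned esp can be sketched in a few
lines of Python.  This is only a model under stated assumptions, not
what the compiler actually emits: it assumes a 4-byte return address,
saved_bytes of pushed registers (4 means just ebp), and 4-byte stack
slots for outgoing parameters.  All function names are hypothetical.

```python
# Toy model of a fixed-esp stack frame on i386 with 16-byte alignment.
# Assumptions (not taken from the FPC sources): a 4-byte return address,
# saved_bytes of pushed registers (4 = just ebp), and 4-byte stack slots
# for outgoing parameters. All function names are hypothetical.

STACK_ALIGNMENT = 16   # value of FPC_STACKALIGNMENT on i386-linux
RETURN_ADDRESS = 4     # bytes pushed by the call instruction
SLOT = 4               # size of one stack slot on i386


def outgoing_offsets(arg_sizes):
    """Return (offsets from esp, total bytes) for mov-stored parameters."""
    offsets, offset = [], 0
    for size in arg_sizes:
        offsets.append(offset)
        offset += (size + SLOT - 1) // SLOT * SLOT  # round up to a slot
    return offsets, offset


def frame_size(locals_bytes, outgoing_bytes, saved_bytes=4):
    """Bytes to subtract from esp once so that esp is aligned afterwards."""
    used = RETURN_ADDRESS + saved_bytes
    needed = used + locals_bytes + outgoing_bytes
    total = -(-needed // STACK_ALIGNMENT) * STACK_ALIGNMENT  # round up
    return total - used


def esp_in_body(esp_before_call, locals_bytes, outgoing_bytes, saved_bytes=4):
    """esp inside the procedure body, given the caller's esp at the call."""
    entry = esp_before_call - RETURN_ADDRESS
    return entry - saved_bytes - frame_size(locals_bytes, outgoing_bytes,
                                            saved_bytes)
```

For example, frame_size(10, 16) gives 40 in this model: 4 bytes of
return address, 4 bytes of saved ebp and 40 bytes of frame add up to
48, so esp is a multiple of 16 again at every call site, and the
parameters are written with mov to [esp+0], [esp+4], ... without
moving esp.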


One thing (and actually an important one) I forgot to mention: this
also means that the regcall calling convention we use by default on
i386-linux now uses a caller-cleared stack. I forgot about it because
even our regression tests did not catch this. OTOH, it means that
probably little code out there is affected by this; an exception
might be PascalScript.
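
Why a mismatch between caller-cleared and callee-cleared code matters
can be shown with a tiny Python model of esp across one call.  It is a
sketch under one assumption: "callee-cleared" is taken to mean that
the callee returns with `ret n`, and "caller-cleared" that it returns
with a plain `ret`.  The function name is hypothetical.

```python
# Toy model of how esp changes across one call on i386.
# Assumption: "callee-cleared" means the callee returns with `ret n`,
# "caller-cleared" means a plain `ret`. Names are hypothetical.

RETURN_ADDRESS = 4  # bytes pushed by call and popped by ret


def net_esp_change(arg_bytes, caller_pushes, callee_pops):
    """Net change of the caller's esp after the call has returned."""
    delta = 0
    if caller_pushes:
        delta -= arg_bytes       # push instructions move esp down
    delta -= RETURN_ADDRESS      # call pushes the return address
    delta += RETURN_ADDRESS      # ret pops it again
    if callee_pops:
        delta += arg_bytes       # `ret n` also releases the parameters
    return delta
```

In this model both matching pairs, push plus `ret n` and mov plus
plain `ret`, leave esp unchanged.  A caller that stores parameters
with mov but calls hand-written code that still ends in `ret n` gets
esp back shifted by arg_bytes, which is the kind of breakage code that
builds calls by hand would have to watch for.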

