Re: [fpc-devel] Difficulty in specifying record alignment... and more compiler optimisation shenanigans!

J. Gareth Moreton Sun, 27 Oct 2019 03:24:29 -0700

The following passes everything through XMM0:

#include <cmath>
#include <emmintrin.h>

double Mod(__m128d z)
{
  return sqrt((z[0] * z[0]) + (z[1] * z[1]));
}

int main()
{
  __m128d z;
  z[0] = 0; z[1] = 1;
  double d = Mod(z);
}

I will admit that it's very fiddly to get right. All of my attempts to map an anonymous struct to __m128d via a union (so you could call z.re and z.im rather than access the array elements) were unsuccessful. C++ is not very friendly with vector types and you have to go out of your way to get the compiler to be efficient with them, but the System V ABI does support utilising the full vector registers.

It took me a while to work out how passing a record type with two single-precision elements into just XMM0 is correct, but this is because the record type as a whole has a size of eight bytes, and gets passed as a single argument of class SSE. If the function parameters are instead two separate arguments, then they get passed individually through XMM0 and XMM1. It seems you have to interpret this document very literally to get it right: https://www.uclibc.org/docs/psABI-x86_64.pdf
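
To make the difference concrete, here is a minimal sketch that can be pasted into a compiler explorer to compare the two calling patterns. The names ComplexF, ModRecord and ModScalars are made up for illustration and do not come from this thread; the register comments restate the reading of the ABI described above and assume an x86-64 System V target.

```cpp
#include <cmath>

// Hypothetical record type: two single-precision fields, eight bytes in total.
struct ComplexF
{
  float re;
  float im;
};

static_assert(sizeof(ComplexF) == 8, "record must fit in a single eightbyte");

// One 8-byte aggregate of class SSE: both fields are expected to arrive
// packed together in XMM0.
float ModRecord(ComplexF z)
{
  return std::sqrt((z.re * z.re) + (z.im * z.im));
}

// Two separate scalar arguments: re is expected in XMM0 and im in XMM1.
float ModScalars(float re, float im)
{
  return std::sqrt((re * re) + (im * im));
}
```

Both functions compute the same value; only the way the arguments reach the callee should differ in the generated assembly.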


Gareth aka. Kit

On 27/10/2019 08:13, Florian Klämpfl wrote:

On 23.10.19 at 22:36, J. Gareth Moreton wrote:
So I did a bit of reading after finding the "mpx-linux64-abi.pdf" document. As I suspected, the System V ABI is like vectorcall when it comes to using the XMM registers... only the types __m128, __float128 and _Decimal128 use the "SSEUP" class and hence use the entire register. The types are opaque, but both their size and alignment are 16 bytes, so I think anything that abides by those rules can be considered equivalent.
If the complex type is unaligned, the two fields get their own XMM register. If aligned, they both go into %xmm0. At least that is what I gathered from reading the document - it's a little unclear sometimes.
I briefly tested with Godbolt (https://godbolt.org/): records of two doubles are passed in two XMM registers regardless of the alignment, while records of two floats (so single) are passed in one XMM register.
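
A minimal sketch of the kind of test that can be pasted into Godbolt to check the double-precision case. The names ComplexD, ComplexDAligned, ModD and ModDAligned are hypothetical and not from this thread; the register comments restate the observation above for an x86-64 System V target rather than a guarantee about any particular compiler.

```cpp
#include <cmath>

// Hypothetical record of two doubles with default alignment.
struct ComplexD
{
  double re;
  double im;
};

// The same record, forced to 16-byte alignment.
struct alignas(16) ComplexDAligned
{
  double re;
  double im;
};

static_assert(sizeof(ComplexD) == 16, "two eightbytes");
static_assert(sizeof(ComplexDAligned) == 16, "two eightbytes");
static_assert(alignof(ComplexDAligned) == 16, "explicitly aligned to 16 bytes");

// Per the observation above: re in XMM0, im in XMM1.
double ModD(ComplexD z)
{
  return std::sqrt((z.re * z.re) + (z.im * z.im));
}

// Per the observation above: still re in XMM0 and im in XMM1,
// despite the 16-byte alignment.
double ModDAligned(ComplexDAligned z)
{
  return std::sqrt((z.re * z.re) + (z.im * z.im));
}
```

Comparing the assembly of the two functions should show identical argument handling if alignment indeed plays no part in the classification.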
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel


