So I've done some testing: make_not_regable is NOT being called at all.  When the array is already on the stack this is no problem, but if it's in an MM register, problems start occurring.  If I force the call to make_not_regable, the bad code disappears.  I'm still learning how this part of the code works, especially where the node location changes.

program m128test;

function Test3(V1, V2: __m128d): __m128d; vectorcall;
begin
  Test3[1] := V1[1] + V2[1];
end;

begin
end.

Output:

    # Register rsp,rbp allocated
    pushq    %rbp
.seh_pushreg %rbp
    movq    %rsp,%rbp
    leaq    -64(%rsp),%rsp
.seh_stackalloc 64
# Temp -64,16 allocated
.seh_endprologue
# Temp -16,16 allocated
# Temp -32,16 allocated
# Temp -48,16 allocated
    # Register xmm0,xmm1 allocated
    movdqa    %xmm0,-16(%rbp)
    # Register xmm0 released
    movdqa    %xmm1,-32(%rbp)
    # Register xmm1 released
    # Register mreg32 allocated
    movsd    -8(%rbp),%mreg32md
    addsd    -24(%rbp),%mreg32md
    # Register mreg32 released
    movsd    %mreg32md,-40(%rbp)
# Temp -16,16 released
# Temp -32,16 released
    # Register xmm0 allocated
    movdqa    -48(%rbp),%xmm0
# Temp -48,16 released
# Temp -64,16 released
    leaq    (%rbp),%rsp
    popq    %rbp
    # Register rbp,rsp released
    ret
.seh_endproc
    # Register xmm0 released

No uninitialised registers are being written (although -48(%rbp) is uninitialised because I don't write to Test3[0]).  Of course I don't want to force the call to make

On 08/04/2022 20:58, Jonas Maebe via fpc-devel wrote:
> On 08/04/2022 20:31, J. Gareth Moreton via fpc-devel wrote:
>> That might explain a few things.  The problem is that under vectorcall and the System V ABI (the default x86_64 calling convention for Linux), vector types are supposed to be fully supported, like an aligned array of 4 Singles should be passed in a single XMM register.
>
> That's no problem in itself. Normally, make_not_regable will ensure that such values will be stored in memory on procedure entry and kept there. Various architectures require that records are also passed in registers (even if they're larger than 1 register), which also work fine even though the compiler only supports record regvars occupying at most one register (or perhaps two, I don't remember).


Jonas
_______________________________________________
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
https://lists.freepascal.org/cgi-bin/mailman/listinfo/fpc-devel

