On 2019-10-29 at 12:23, J. Gareth Moreton wrote:
When it comes to testing vectorcall, uComplex isn't the best example actually because most of the operators are inlined.  There are a number of tests under "tests/test/cg" that test vectorcall and the System V ABI using a Pascal implementation of the opaque __m128 type (the two ABIs should behave exactly the same when dealing with simple vectors).

The last time I checked, it didn't vectorize anything at all. So even just native vectorization of a record of two singles would be nice.

Last time I checked in 2017, complexadd inlined looked something like this:

    leal    32(%eax),%edx
    leal    8(%eax),%ecx
    vmovss    (%ecx),%xmm0
    vaddss    (%edx),%xmm0,%xmm0
    vmovss    %xmm0,-8(%ebp)
    vmovss    4(%ecx),%xmm0
    vaddss    4(%edx),%xmm0,%xmm0
    vmovss    %xmm0,-4(%ebp)

And I realize that quite a bit of rearrangement would be needed.
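
To make the target concrete, here is a rough sketch in C (not Pascal, and not compiler output) of what the listing does today next to what a vectorized version could do. The type and function names are hypothetical stand-ins for the record of two singles; the packed path assumes an x86 target with SSE and falls back to the scalar form elsewhere.

```c
#if defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Hypothetical stand-in for the record of two singles; not the uComplex type. */
typedef struct { float re, im; } complex_s;

/* Scalar form: two independent single-precision adds, matching the
   pair of vaddss instructions in the listing. */
complex_s complex_add_scalar(complex_s a, complex_s b) {
    complex_s r = { a.re + b.re, a.im + b.im };
    return r;
}

/* Packed form: load both fields at once into the low two lanes of an
   XMM register, do one packed add, store both fields at once. */
complex_s complex_add_packed(complex_s a, complex_s b) {
#if defined(__SSE__)
    __m128 va = _mm_loadl_pi(_mm_setzero_ps(), (const __m64 *)&a);
    __m128 vb = _mm_loadl_pi(_mm_setzero_ps(), (const __m64 *)&b);
    complex_s r;
    _mm_storel_pi((__m64 *)&r, _mm_add_ps(va, vb));
    return r;
#else
    /* Assumption: no SSE available, so keep the scalar behaviour. */
    return complex_add_scalar(a, b);
#endif
}
```

The upper two lanes are zeroed before the add, so the packed version only ever touches the 8 bytes of the record.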

If anything though, the example function you gave (I'll need to double-check what ComplexScl does though, if it isn't a simple multiplication)

It is a simple multiplication of both the real and imaginary parts by a scalar (as opposed to complex*complex, which has more terms).
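
For illustration, the difference between the two operations can be sketched in C as follows. The names `cpx`, `cpx_scale` and `cpx_mul` are made up for this example and are not the uComplex declarations; the record of two singles is likewise an assumption.

```c
/* Hypothetical record of two singles; not the actual uComplex type. */
typedef struct { float re, im; } cpx;

/* Scaling: both parts are multiplied by the same scalar, so the work
   is two independent multiplies (one packed multiply when vectorized). */
cpx cpx_scale(cpx z, float s) {
    cpx r = { z.re * s, z.im * s };
    return r;
}

/* Full complex product:
   (a + bi)(c + di) = (ac - bd) + (ad + bc)i
   Four multiplies plus an add and a subtract, with cross terms. */
cpx cpx_mul(cpx x, cpx y) {
    cpx r = { x.re * y.re - x.im * y.im,
              x.re * y.im + x.im * y.re };
    return r;
}
```

The cross terms in the full product are what make it harder to vectorize than the scaling case.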

would be a pretty solid and heavy-duty test of the compiler attempting to vectorise the code - in an ideal world, individual calls to ComplexAdd and ComplexSub (which are simple + and - operations in uComplex) will compile into a single line of assembly language (ADDPD and SUBPD respectively).  Nevertheless, one could disable the inlining to see how well the compiler handles the function chaining, since with aligned data, the result from XMM0 should be easily transposed in one go to another XMM register if not just left alone as parameter data for the next function.

Yes, it is just a somewhat real-world codebase to play with. It is even MPL-licensed.
fpc-devel maillist  -  fpc-devel@lists.freepascal.org
