https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Dmitry Kazakov <dimula73 at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |dimula73 at gmail dot com

--- Comment #25 from Dmitry Kazakov <dimula73 at gmail dot com> ---
Hi, all!

I would like to add one more test case related to this problem. When GCC calls
a function that takes a __m256 argument by value, it copies the argument onto
the stack using an **aligned** move (vmovaps). A 256-bit vmovaps requires a
32-byte-aligned address, but the stack alignment guaranteed on Windows is only
16 bytes. The application therefore crashes with an unaligned memory access
whenever the stack slot happens not to be 32-byte aligned.
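
To make the alignment gap concrete, here is a minimal sketch in plain C++ (no
AVX needed). The helper is_aligned and the two sample stack-pointer values are
hypothetical, chosen only for illustration; they are not taken from a real run.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when addr is a multiple of alignment (a power of two).
constexpr bool is_aligned(std::uint64_t addr, std::uint64_t alignment) {
    return (addr & (alignment - 1)) == 0;
}

// Two hypothetical stack-pointer values. Both satisfy a 16-byte guarantee.
constexpr std::uint64_t rsp_a = 0x1000; // also a multiple of 32
constexpr std::uint64_t rsp_b = 0x1010; // a multiple of 16 only

static_assert(is_aligned(rsp_a, 16) && is_aligned(rsp_b, 16),
              "both values are valid 16-byte-aligned stack pointers");

// A slot at rsp + 32 inherits rsp's alignment, so it is 32-byte aligned
// only when rsp itself happens to be.
static_assert(is_aligned(rsp_a + 32, 32), "aligned store would succeed");
static_assert(!is_aligned(rsp_b + 32, 32), "aligned store would fault");
```

Whether the store faults thus depends only on the value rsp happens to hold at
the call site, which is why the crash can look intermittent.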

Affected versions: GCC 7.3.0 (MinGW64), GCC 8.1.0 (MinGW64)

Here is the test source (also attached):

#include <intrin.h>

struct X { 
alignas(32) __m256 d;
};

void g1(X);
void g2(const X&);
void g3(const void *);

void f(float *ptr) {
    X x = {_mm256_load_ps(ptr)};
    g1(x);  // BUG: passes via a stack slot that only has rsp's alignment
    g2(x);  // OK: passes via aligned stack location
    g3(&x); // OK: passes via aligned stack location
}


Compiled result (-O2 -march=skylake):

_Z1fPf:
.LFB5135:
        pushq   %rbx
        .seh_pushreg    %rbx
        addq    $-128, %rsp
        .seh_stackalloc 128
        .seh_endprologue
        vmovaps (%rcx), %ymm0
        leaq    95(%rsp), %rbx
        leaq    32(%rsp), %rcx
        andq    $-32, %rbx
        vmovaps %ymm0, (%rbx)    # %rbx is properly aligned 
        vmovaps %ymm0, 32(%rsp)  # %rsp may be unaligned
        vzeroupper
        call    _Z2g11X
        movq    %rbx, %rcx
        call    _Z2g2RK1X
        movq    %rbx, %rcx
        call    _Z2g3PKv
        nop
        subq    $-128, %rsp
        popq    %rbx
        ret
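
The address arithmetic in the listing above can be checked with a small sketch.
The function names aligned_slot and argument_slot and the sample rsp values are
hypothetical; only the constants 95, -32 and 32 come from the assembly.

```cpp
#include <cassert>
#include <cstdint>

// Mirrors: leaq 95(%rsp), %rbx ; andq $-32, %rbx
// As a 64-bit mask, -32 clears the low five bits, so the result is always a
// multiple of 32 and lies between rsp + 64 and rsp + 95.
constexpr std::uint64_t aligned_slot(std::uint64_t rsp) {
    return (rsp + 95) & ~std::uint64_t{31};
}

// Mirrors the address used by: vmovaps %ymm0, 32(%rsp)
// No masking is applied, so the result is only as aligned as rsp is.
constexpr std::uint64_t argument_slot(std::uint64_t rsp) {
    return rsp + 32;
}

constexpr bool is_aligned32(std::uint64_t addr) {
    return (addr & 31) == 0;
}

// Hypothetical rsp that is 16-byte but not 32-byte aligned.
static_assert(aligned_slot(0x1010) == 0x1060, "local copy of x");
static_assert(is_aligned32(aligned_slot(0x1010)), "safe for vmovaps");
static_assert(argument_slot(0x1010) == 0x1030, "copy passed to g1");
static_assert(!is_aligned32(argument_slot(0x1010)), "vmovaps would fault");
```

With an rsp of 0x1000 instead, both slots are 32-byte aligned and the bug
stays hidden.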

Related bug in Vc library: https://github.com/VcDevel/Vc/issues/241
Related bug in Krita: https://bugs.kde.org/show_bug.cgi?id=406209
