[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2024-03-28 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #44 from Eric Botcazou  ---
> Thank you, Dmitry, but that particular solution may not be possible for me.
> When I try compiling with -mstackrealign -mpreferred-stack-boundary=5
> -mincoming-stack-boundary=5 instead of forcing unaligned moves I get
> "cc1.exe: error: '-mpreferred-stack-boundary=5' is not between 3 and 4". Is
> that this bug in a different form, something that should be filed
> separately, or known and intended behavior?

No, it's the same issue: 32-byte stack alignment is not supported with SEH.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2024-03-28 Thread avraham.adler at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #43 from Avraham Adler  ---
Thank you, Dmitry, but that particular solution may not be possible for me.
When I try compiling with -mstackrealign -mpreferred-stack-boundary=5
-mincoming-stack-boundary=5 instead of forcing unaligned moves I get "cc1.exe:
error: '-mpreferred-stack-boundary=5' is not between 3 and 4". Is that this bug
in a different form, something that should be filed separately, or known and
intended behavior?

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2024-03-27 Thread dimula73 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #42 from Dmitry Kazakov  ---
Hi, Avraham!

> Does it remain true that the only option to get around this bug without 
> killing all AVX2 is to pass "-Wa,-muse-unaligned-vector-move" when compiling 
> using GCC on Windows 64? Thank you

I'm not sure about your particular issue, but in our case we managed to work
around it by passing AVX2-related structures by reference (or const-reference,
when possible).

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2024-03-26 Thread avraham.adler at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #41 from Avraham Adler  ---
It has been a few years since the last comment. I recently got hit by this bug
for the first time in about a decade and a half of compiling R for Windows 64
using GCC 13.2.0 as packaged in Rtools44 [1].

Does it remain true that the only option to get around this bug without killing
all AVX2 is to pass "-Wa,-muse-unaligned-vector-move" when compiling using GCC
on Windows 64? Thank you.

[1] https://stat.ethz.ch/pipermail/r-sig-windows/2024q1/000113.html

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2024-01-09 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Eric Botcazou  changed:

   What    |Removed                     |Added

   Assignee|ebotcazou at gcc dot gnu.org|unassigned at gcc dot gnu.org
     Status|ASSIGNED                    |NEW

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2022-04-20 Thread hjl.tools at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #40 from H.J. Lu  ---
(In reply to Eric Botcazou from comment #37)
> > If the Windows ABI doesn't align stack or not as much as gcc assumes, then a
> > fix would ensure only automatic vars on Windows are accessed always using
> > unaligned vector instructions provided dynamic stack realignment is not an
> > option.
> 
> It's classical double-word alignment, i.e. 16 bytes, and AVX requires 32
> bytes.
> The implementation of dynamic stack realignment is too much of a kludge to
> be safely used on Windows IMO so, yes, the way out is probably unaligned
> vector instructions.

Assembler in binutils 2.38 supports:

 -muse-unaligned-vector-move
  encode aligned vector move as unaligned vector move

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2022-04-01 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #39 from Eric Botcazou  ---
> If SEH is the problem, can alignment be accounted for in cases where SEH is
> not in use (if there are any such cases)? I'm thinking of -fno-exceptions,
> and dwarf (on x86) or setjmp/longjmp exceptions.

The hitch is that Setjmp/Longjmp is implemented on top of SEH on 64-bit
Windows, which means that SEH information must always be generated, even in
plain C.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2022-04-01 Thread rcopley at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #38 from R Copley  ---
(A patch to emit unaligned instructions should probably resolve bug 49001
instead of this one, 54412.)

Could dynamic alignment be achieved, not for automatic variables and function
parameters, but for registers spilled to the stack (due to register exhaustion,
or because they may be clobbered)? So that users can write code that stores
over-aligned objects on the heap only.

If SEH is the problem, can alignment be accounted for in cases where SEH is not
in use (if there are any such cases)? I'm thinking of -fno-exceptions, and
dwarf (on x86) or setjmp/longjmp exceptions.

Sorry if those are stupid questions.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2022-04-01 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #37 from Eric Botcazou  ---
> If the Windows ABI doesn't align stack or not as much as gcc assumes, then a
> fix would ensure only automatic vars on Windows are accessed always using
> unaligned vector instructions provided dynamic stack realignment is not an
> option.

It's classical double-word alignment, i.e. 16 bytes, and AVX requires 32 bytes.
The implementation of dynamic stack realignment is too much of a kludge to be
safely used on Windows IMO so, yes, the way out is probably unaligned vector
instructions.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2022-04-01 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #36 from Jakub Jelinek  ---
That patch is certainly unacceptable, not only because it affects non-Windows
too, but even on Windows it will unnecessarily pessimize e.g. accesses to data
sections or heap that can be aligned.
If the Windows ABI doesn't align stack or not as much as gcc assumes, then a
fix would ensure only automatic vars on Windows are accessed always using
unaligned vector instructions provided dynamic stack realignment is not an
option.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2022-04-01 Thread steve at sk2 dot org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Stephen Kitt  changed:

   What|Removed |Added

 CC||steve at sk2 dot org

--- Comment #35 from Stephen Kitt  ---
Created attachment 52737
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52737&action=edit
Use unaligned VMOV instructions (for Windows targets)

The reason I didn't submit the Debian patch is that it unconditionally replaces
V...{U,A} with V...U instructions. That's fine when we know the target needs
something like that, which is the case when we're building a Windows
cross-compiler; but I don't think it's suitable for general use as-is. It would
need a build-time conditional at the very least.

Anyway, I'll add it as an attachment here; I'll try to find time to make it
generally applicable. I haven't filed copyright assignment paperwork for me
personally; if the patch needs it, consider it submitted by sk...@redhat.com
under the corporate copyright assignment agreement.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2022-04-01 Thread ebotcazou at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #34 from Eric Botcazou  ---
> It's unfortunate that the best and most common advice for using avx2 with
> gcc/mingw is to use a patched compiler. Might it be possible to accept
> Debian's patch upstream?

Sure, but they need to submit it first, we cannot do it for them.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2022-04-01 Thread lists at coryfields dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Cory Fields  changed:

   What|Removed |Added

 CC||lists at coryfields dot com

--- Comment #33 from Cory Fields  ---
Adding another +1. Still present in 10.3.0.

Bitcoin Core's sha2 code uses avx2 when possible. We ran into this bug when
bumping our toolchain:
https://github.com/bitcoin/bitcoin/pull/24736

and opted to take Debian's patch:
https://salsa.debian.org/mingw-w64-team/gcc-mingw-w64/-/blob/master/debian/patches/vmov-alignment.patch

It's unfortunate that the best and most common advice for using avx2 with
gcc/mingw is to use a patched compiler. Might it be possible to accept Debian's
patch upstream?

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2021-09-26 Thread mehdi.chinoune at hotmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Chinoune  changed:

   What|Removed |Added

 CC||mehdi.chinoune at hotmail dot com

--- Comment #32 from Chinoune  ---
Still present in GCC 11.2.0

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2021-08-24 Thread dimula73 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Dmitry Kazakov  changed:

   What|Removed |Added

 CC||dimula73 at gmail dot com

--- Comment #31 from Dmitry Kazakov  ---
Hi, all!

Just wanted to note that the bug is still present in GCC 10.3.0 on Windows
(from MSYS-MinGW64 packages).

> gcc (Rev5, Built by MSYS2 project) 10.3.0

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2021-08-22 Thread arthur200126 at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Mingye Wang  changed:

   What|Removed |Added

 CC||arthur200126 at gmail dot com

--- Comment #30 from Mingye Wang  ---
One of the weird probably SEH-related things is that the lack-of-alignment
behavior of comment 28 and attachment 1 is not reproduced on a "normal" Linux
GCC with __attribute__((ms_abi)) sprinkled all over to get the right calling
convention. The code takes the same shape, uses mostly the same registers, but
the `and rsp, -32` is just either not there or placed wrong.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2019-08-25 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #29 from Yichao Yu  ---
See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412#c25

GCC is fully capable of aligning the stack. It just seems that different parts
of it disagree on what the current stack alignment is and whether a
realignment is needed.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2019-08-25 Thread john_platts at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

John Platts  changed:

   What|Removed |Added

 CC||john_platts at hotmail dot com

--- Comment #28 from John Platts  ---
The correct way to align the stack to a 32-byte or 64-byte boundary on 64-bit
Windows is to use a frame pointer in a function that requires stack realignment
and then realign the stack to the required alignment once the frame pointer is
set and all of the non-volatile registers used in the function are saved.

class Avx2VectorGenerator {
public:
virtual __m256i NextVector() = 0;
};

__m256i Example_AVX2_Func(Avx2VectorGenerator* generator, size_t iterations);

Example_AVX2_Func:
pushq %rbp
.seh_pushreg %rbp
pushq %rbx
.seh_pushreg %rbx
pushq %rdi
.seh_pushreg %rdi
movq %rsp, %rbp
.seh_setframe %rbp, 0
.seh_endprologue

/* Set rbx to generator and rdi to iterations */
movq %rcx, %rbx
movq %rdx, %rdi

/* It is okay to allocate additional stack memory */
/* and re-align the stack pointer outside of the */
/* SEH prologue as there is a frame pointer in this */
/* function */
subq $64, %rsp
andq $-32, %rsp

/* Zero out the result vector */
vpxor %ymm0, %ymm0, %ymm0

test %rdi, %rdi
jz .loop_complete
.loop_iteration_start:
/* Save the result vector to 32(%rsp) */
vmovdqa %ymm0, 32(%rsp)

/* Move generator into rcx */
movq %rbx, %rcx
/* Move the pointer to the NextVector() virtual member func */
/* into rax */
movq (%rbx), %rax
/* Call generator->NextVector() */
call *(%rax)

/* Add the result of generator->NextVector() to the result vector */
vpaddb 32(%rsp), %ymm0, %ymm0

/* Decrement iterations by 1 */
sub $1, %rdi

/* Jump back to the beginning of the loop if iterations is non-zero */
jnz .loop_iteration_start
.loop_complete:
lea (%rbp), %rsp
pop %rdi
pop %rbx
pop %rbp
ret
.seh_endproc

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2019-04-10 Thread dimula73 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #27 from Dmitry Kazakov  ---
As a workaround, one can either use __attribute__((always_inline)) for *all*
the functions accepting __m256 or pass *all* arguments by const-ref. Const-ref
arguments are passed correctly.
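
A minimal sketch of the const-ref workaround follows. It assumes a hypothetical
`Vec8` struct as a portable stand-in for a type wrapping `__m256` (real code
would hold the vector itself), and the helper names are made up for
illustration. The point is only that the callee receives the address of the
caller's over-aligned object, so no by-value copy has to be written to the
outgoing argument area.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a struct that wraps an AVX vector such as __m256.
struct alignas(32) Vec8 {
    float lane[8];
};

// Pass by const reference: only the address of the caller's object is passed,
// so the compiler does not have to copy the 32-byte value into argument slots.
inline float sum(const Vec8 &v) {
    float total = 0.0f;
    for (float x : v.lane) total += x;
    return total;
}

// Reports whether the referenced object sits on a 32-byte boundary.
inline bool is_aligned32(const Vec8 &v) {
    return reinterpret_cast<std::uintptr_t>(&v) % 32 == 0;
}

// Usage helper: builds a local Vec8 and hands it to sum() by reference.
inline float demo_sum() {
    Vec8 v = {{1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f, 8.0f}};
    float total = sum(v);
    assert(total == 36.0f);
    return total;
}
```

The same signature change (`void g(const X&)` instead of `void g(X)`) is what
the workaround above amounts to; whether it is acceptable depends on whether
the affected interfaces can be changed.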

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2019-04-10 Thread dimula73 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #26 from Dmitry Kazakov  ---
Created attachment 46133
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46133&action=edit
Test source for unaligned pass-by-value crash

Test file for the comment above

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2019-04-10 Thread dimula73 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Dmitry Kazakov  changed:

   What|Removed |Added

 CC||dimula73 at gmail dot com

--- Comment #25 from Dmitry Kazakov  ---
Hi, all!

I would like to add one more test file related to the problem. If GCC calls a
function that accepts a __m256 value as a parameter, it spills this parameter
to the stack using an **aligned** move (vmovaps), but the stack alignment
guarantee on Windows is only 16 bytes. The application therefore crashes
because of an unaligned memory access.

Affected versions: GCC 7.3.0 (MinGW64), GCC 8.1.0 (MinGW64)

Here is the testing source (see also in an attachment):

#include <immintrin.h>

struct X { 
alignas(32) __m256 d;
};

void g1(X);
void g2(const X&);
void g3(const void *);

void f(float *ptr) {
X x = {_mm256_load_ps(ptr)};
g1(x);  // BUG: passes via unaligned (whatever rsp alignment is) stack
g2(x);  // OK: passes via aligned stack location
g3(&x); // OK: passes via aligned stack location
}


Compiled result (-O2 -march=skylake):

_Z1fPf:
.LFB5135:
pushq   %rbx
.seh_pushreg    %rbx
addq    $-128, %rsp
.seh_stackalloc 128
.seh_endprologue
vmovaps (%rcx), %ymm0
leaq    95(%rsp), %rbx
leaq    32(%rsp), %rcx
andq    $-32, %rbx
vmovaps %ymm0, (%rbx)      # %rbx is properly aligned
vmovaps %ymm0, 32(%rsp)    # %rsp may be unaligned
vzeroupper
call    _Z2g11X
movq    %rbx, %rcx
call    _Z2g2RK1X
movq    %rbx, %rcx
call    _Z2g3PKv
nop
subq    $-128, %rsp
popq    %rbx
ret
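
The `leaq 95(%rsp), %rbx` / `andq $-32, %rbx` pair in the listing above is the
usual round-up-to-alignment idiom: 95 is 64 + 31, so `%rbx` ends up on the
first 32-byte boundary at or above `%rsp + 64`. A small sketch of that
arithmetic, with hypothetical function names and an assumed example stack
pointer value that is only 16-byte aligned:

```cpp
#include <cstdint>

// Rounds addr up to the next multiple of alignment (must be a power of two).
constexpr std::uint64_t align_up(std::uint64_t addr, std::uint64_t alignment) {
    return (addr + alignment - 1) & ~(alignment - 1);
}

// Mirrors `leaq 95(%rsp), %rbx; andq $-32, %rbx` from the listing:
// the aligned local copy lives at the first 32-byte boundary >= rsp + 64.
constexpr std::uint64_t aligned_local(std::uint64_t rsp) {
    return align_up(rsp + 64, 32);
}

// 0x1010 is an assumed stack pointer that is 16-byte but not 32-byte aligned.
static_assert(aligned_local(0x1010) == 0x1060, "rounded up to a 32-byte boundary");
static_assert(aligned_local(0x1010) % 32 == 0, "safe target for vmovaps");
static_assert((0x1010 + 32) % 32 != 0, "32(%rsp) is misaligned for this rsp");
```

With such a stack pointer the manually aligned copy is fine, while the
by-value argument slot at `32(%rsp)` is not, which matches the two commented
`vmovaps` stores.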

Related bug in Vc library: https://github.com/VcDevel/Vc/issues/241
Related bug in Krita: https://bugs.kde.org/show_bug.cgi?id=406209

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2019-02-27 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #24 from Yichao Yu  ---
Oh, and the test case above was compiled with -O3 (and -g -Wall -Wextra).

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2019-02-27 Thread yyc1992 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Yichao Yu  changed:

   What|Removed |Added

 CC||yyc1992 at gmail dot com

--- Comment #23 from Yichao Yu  ---
> The issue is that GCC does not realign the stack at all.

I hit another related issue that might confirm this as well.

I noticed this when I tried to manually align the stack with inline assembly.

C++ code reduced from my test case,

```
#include <immintrin.h>
#include <stdint.h>
#include <stdio.h>

__attribute__((target("avx")))
__attribute__((noinline)) __m256d f(__m256d x, uint32_t a, const double *p)
{
__m256d res;
asm volatile ("vxorpd %0, %0, %0" :
  "=x"(res), "+x"(x), "+r"(a), "+r"(p) ::
  "memory", "rax", "rcx", "rdx", "r8", "r9", "r10",
  "r11", "rbp");
return res;
}

__attribute__((target("avx")))
__attribute__((noinline)) __m256d f2(__m256d x, uint32_t a, const double *p)
{
__m256d res;
asm volatile ("vxorpd %0, %0, %0" :
  "=x"(res), "+x"(x), "+r"(a), "+r"(p) ::
  "memory", "rax", "rcx", "rdx", "r8", "r9", "r10",
  "r11", "rbp");
return res;
}

__attribute__((target("avx")))
__attribute__((noinline)) __m256d f(__m256d x, __m256d y, __m256d z,
uint32_t a, const double *p)
{
__m256d res;
asm volatile ("vxorpd %0, %0, %0" :
  "=x"(res), "+x"(x), "+x"(y), "+x"(z), "+r"(a), "+r"(p) ::
  "memory", "rax", "rcx", "rdx", "r8", "r9", "r10",
  "r11", "rbp");
return res;
}

const double points[] = {0, 0.1, 0.2, 0.6};

__attribute__((target("avx"))) void test_avx()
{
f(__m256d{0, 0, 0, 0}, __m256d{0, 0, 0, 0},
   __m256d{0, 0, 0, 0}, 4, points);
f(__m256d{0, 0, 0, 0}, 4, points);
}

__attribute__((target("avx"))) void test_avx2()
{
f2(__m256d{0, 0, 0, 0}, 4, points);
}

static void call_aligned_stack(void (*p)(void))
{
asm volatile ("movq %%rsp, %%rbp\n"
  "andq $-64, %%rsp\n"
  "subq $64, %%rsp\n"
  "callq *%0\n"
  "movq %%rbp, %%rsp\n"
  :: "r"(p)
  : "memory", "rax", "rcx", "rdx", "r8", "r9", "r10", "r11",
"rbp");
}

int main()
{
call_aligned_stack(test_avx);
fprintf(stderr, "\n");
fflush(stderr);
call_aligned_stack(test_avx2);
return 0;
}
```

(The `fprintf` is there only to make it easier to see when the crash happens.)
The stack alignment code makes sure that the stack is aligned to 64 bytes
before making the `call`, which I verified in the debugger. However, when
compiled with GCC 8.2.1 on msys2 (using the mingw-w64-x86_64-gcc package), the
`test_avx` function runs fine while the `test_avx2` function crashes.

Looking at the generated code, for the crashing function:

```
004015c0 <_Z9test_avx2v>:
  4015c0:   48 83 ec 68             sub    $0x68,%rsp
  4015c4:   c5 f9 57 c0             vxorpd %xmm0,%xmm0,%xmm0
  4015c8:   4c 8d 0d 51 7a 00 00    lea    0x7a51(%rip),%r9        # 409020 <_ZL6points>
  4015cf:   41 b8 04 00 00 00       mov    $0x4,%r8d
  4015d5:   48 8d 4c 24 40          lea    0x40(%rsp),%rcx
  4015da:   48 8d 54 24 20          lea    0x20(%rsp),%rdx
  4015df:   c5 fd 29 44 24 20       vmovapd %ymm0,0x20(%rsp)
  4015e5:   c5 f8 77                vzeroupper
  4015e8:   e8 a3 ff ff ff          callq  401590 <_Z2f2Dv4_djPKd>
  4015ed:   90                      nop
  4015ee:   48 83 c4 68             add    $0x68,%rsp
  4015f2:   c3                      retq
```

which tries to do a 32-byte-aligned write at a stack offset, relative to the
stack pointer before the initial call instruction, of -8 - 0x68 + 0x20 = -80,
which is not a multiple of 32.
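
The offset arithmetic can be checked mechanically. The sketch below uses
hypothetical helper names and assumes, as in the test case, that the stack
pointer is a multiple of 32 just before the `call`; the frame sizes and
displacements are read from the two disassembly listings in this comment.

```cpp
#include <cstdint>

// Offset of a stack slot relative to the stack pointer value just before the
// `call` into the function: return address, then the frame, then the
// displacement used by the store.
constexpr std::int64_t store_offset(std::int64_t return_address_size,
                                    std::int64_t frame_size,
                                    std::int64_t displacement) {
    return -return_address_size - frame_size + displacement;
}

// If the base address is a multiple of `alignment`, base + offset stays
// aligned exactly when offset is a multiple of `alignment` too.
constexpr bool keeps_alignment(std::int64_t offset, std::int64_t alignment) {
    return offset % alignment == 0;
}

// test_avx2: `sub $0x68,%rsp`, then `vmovapd %ymm0,0x20(%rsp)`.
constexpr std::int64_t kTestAvx2Offset = store_offset(8, 0x68, 0x20);
static_assert(kTestAvx2Offset == -80, "-8 - 0x68 + 0x20");
static_assert(!keeps_alignment(kTestAvx2Offset, 32), "aligned 32-byte store faults");
static_assert(keeps_alignment(kTestAvx2Offset, 16), "slot is only 16-byte aligned");

// test_avx: three pushes plus `sub $0xb0,%rsp`, stores at 0x70, 0x50, 0x30.
constexpr std::int64_t kTestAvxFrame = 3 * 8 + 0xb0;
static_assert(keeps_alignment(store_offset(8, kTestAvxFrame, 0x70), 32), "0x70 slot");
static_assert(keeps_alignment(store_offset(8, kTestAvxFrame, 0x50), 32), "0x50 slot");
static_assert(keeps_alignment(store_offset(8, kTestAvxFrame, 0x30), 32), "0x30 slot");
```

Under that assumption the slots used by `test_avx` happen to land on 32-byte
boundaries, while the one used by `test_avx2` does not, which is consistent
with only the latter crashing.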

OTOH, for the "good" function,

```
00401640 <_Z8test_avxv>:
  401640:   57                      push   %rdi
  401641:   56                      push   %rsi
  401642:   53                      push   %rbx
  401643:   48 81 ec b0 00 00 00    sub    $0xb0,%rsp
  40164a:   c5 d9 57 e4             vxorpd %xmm4,%xmm4,%xmm4
  40164e:   48 8d 3d cb 79 00 00    lea    0x79cb(%rip),%rdi        # 409020 <_ZL6points>
  401655:   48 8d 74 24 70          lea    0x70(%rsp),%rsi
  40165a:   4c 8d 4c 24 30          lea    0x30(%rsp),%r9
  40165f:   48 89 7c 24 28          mov    %rdi,0x28(%rsp)
  401664:   48 8d 9c 24 90 00 00    lea    0x90(%rsp),%rbx
  40166b:   00
  40166c:   4c 8d 44 24 50          lea    0x50(%rsp),%r8
  401671:   48 89 f2                mov    %rsi,%rdx
  401674:   c5 fd 29 64 24 70       vmovapd %ymm4,0x70(%rsp)
  40167a:   48 89 d9                mov    %rbx,%rcx
  40167d:   c5 fd 29 64 24 50       vmovapd %ymm4,0x50(%rsp)
  401683:   c5 fd 29 64 24 30       vmovapd %ymm4,0x30(%rsp)
  401689:   c7 44 24 20 04

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2018-08-28 Thread royiavital at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #22 from Royi  ---
Hello,

Any progress on this on GCC 8.x?

We really want GCC + AVX on Windows.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2018-04-18 Thread lh_mouse at 126 dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

Liu Hao  changed:

   What|Removed |Added

 CC||lh_mouse at 126 dot com

--- Comment #21 from Liu Hao  ---
This comment could be important:



> mstorsjo commented 10 days ago
> However, this only seems to be an issue when passing such variables by value. 
> Local variables seem to be properly aligned even with GCC:

If the `__m256` in question in the original post were passed by reference, the
crash would go away. From the assembly code following that reply we can also
conclude that the issue is not that realigning the stack at run time is
impossible (RSP was aligned in that snippet of code, and I believe that code
was correct). The issue is that GCC does not realign the stack at all.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2018-04-10 Thread ebotcazou at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #20 from Eric Botcazou  ---
> This comment could be important:
> 
> https://stackoverflow.com/questions/30928265/mingw64-is-incapable-of-32-byte-
> stack-alignment-required-for-avx-on-windows-
> x64?noredirect=1#comment86499640_30928265.

As already said, MSVC does something completely different (it realigns the
frame instead of the stack) and we cannot do that; the model must be Clang
instead.

[Bug target/54412] minimal 32-byte stack alignment with -mavx on 64-bit Windows

2018-04-09 Thread royiavital at yahoo dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412

--- Comment #19 from Royi  ---
This comment could be important:

https://stackoverflow.com/questions/30928265/mingw64-is-incapable-of-32-byte-stack-alignment-required-for-avx-on-windows-x64?noredirect=1#comment86499640_30928265.

Hopefully you'll find a way to bring AVX to Windows 64 using GCC.

Thank You.