[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 Royi changed:
What | Removed | Added
CC   |         | royiavital at yahoo dot com
--- Comment #16 from Royi --- Hello, more details about this bug (with a simple test case) can be seen at: https://github.com/Alexpux/MSYS2-packages/issues/1209#issuecomment-379576367 It is also discussed on the MinGW-w64 mailing list: https://sourceforge.net/p/mingw-w64/mailman/message/36287627/ Regarding Kai Tietz's comment, see: https://stackoverflow.com/a/30929086/195787 Is there any chance this gets fixed? Thank you.
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 Joseph Coffland changed:
What | Removed | Added
CC   |         | gcc at joe dot coffland.com
--- Comment #15 from Joseph Coffland --- After compiling and running the test case, I can confirm that this bug still exists in ``gcc (Debian 5.3.1-11) 5.3.1 20160307``. It crashes both under Wine and 64-bit Windows 7. I would love to see this fixed. It's the only thing keeping me from building all of the Folding@home software for Windows under Linux. We need AVX for our protein folding simulations. The extra performance gained by using AVX is significant. My other options are clang or building on Windows using MSVC or Intel compilers.
```
$ gcc -v
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Debian 5.3.1-11' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --with-arch-32=i586 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.3.1 20160307 (Debian 5.3.1-11)
```
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 c...@spin-digital.com changed:
What | Removed | Added
CC   |         | c...@spin-digital.com
--- Comment #14 from c...@spin-digital.com --- A solution would be to use unaligned loads and stores to stack variables for 256-bit variables and spilled registers. Likely the other compilers are doing this to make it work. I would really appreciate such a solution.
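To illustrate what "unaligned loads and stores" means at the source level, here is a minimal sketch. It assumes GCC or Clang on x86-64 and a CPU with AVX; the function name `add4_unaligned` is hypothetical and not taken from the bug's attachments. It only shows the unaligned access instructions being requested by hand. It does not change how the compiler itself spills 256-bit values to the stack, which is what this comment asks GCC to do.

```c
#include <immintrin.h>

/* Assumption: GCC or Clang on x86-64. target("avx") enables AVX for this
   one function, so the file does not have to be built with -mavx. */
__attribute__((target("avx")))
static void add4_unaligned(double *dst, const double *a, const double *b)
{
    /* _mm256_loadu_pd / _mm256_storeu_pd map to vmovupd, which has no
       32-byte alignment requirement, unlike the vmovapd seen in the crash. */
    __m256d va = _mm256_loadu_pd(a);
    __m256d vb = _mm256_loadu_pd(b);
    _mm256_storeu_pd(dst, _mm256_add_pd(va, vb));
}
```

The pointers passed in only need the normal alignment of `double`; the result is the element-wise sum of four doubles.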
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #13 from Roland Schulz --- But this problem is limited to GCC. ICC, Clang and MSVC don't have this problem when compiling 64-bit AVX code. Thus they must have some kind of work-around for the ABI, and GCC should be able to use a work-around too (at least in theory).
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 Kai Tietz changed:
What             | Removed     | Added
Status           | UNCONFIRMED | SUSPENDED
Last reconfirmed |             | 2015-09-22
Ever confirmed   | 0           | 1
--- Comment #12 from Kai Tietz --- It is good to hear that the issue is fixed for 32-bit. But for 64-bit, as I already explained in a comment above, this issue isn't fixable for stack variables. The problem is that for the x64 ABI we are tightly bound to the SEH prologue information, and this can't express alignment operations. The x64 ABI guarantees 16-byte alignment on function entry, so SSE 128-bit operands can be placed fully aligned on the stack, but higher alignment is simply not expressible. Therefore I need to set this bug to suspended. If this information is extended in the future to allow the prologue information we need for alignment, then we will be able to fix this. Until then it stays suspended.
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #11 from R Copley --- On 20 September 2014 07:08, roland at rschulz dot eu wrote:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412
>
> --- Comment #10 from Roland Schulz ---
> Created attachment 33520
>   --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33520&action=edit
> Slightly modified testcase
>
> This slightly modified testcase in which the return value isn't stored, still
> segfaults for me. With the 32bit mingw64 binary ((i686-win32-dwarf-rev1, Built
> by MinGW-W64 project) 4.9.1) it is OK, but with the 64bit binary
> ((x86_64-win32-seh-rev1, Built by MinGW-W64 project) 4.9.1) it segfaults.
Confirmed (with the same compiler, in the mingw-builds toolchain). I compiled your testcase with command "gcc -O0 -g -ggdb -m64 -mavx bug.c". It segfaults on the instruction marked "=>" below.
(gdb) disassemble /m
Dump of assembler code for function f:
6       {
   0x004014f0 <+0>:     push   %rbp
   0x004014f1 <+1>:     mov    %rsp,%rbp
   0x004014f4 <+4>:     mov    %rcx,0x10(%rbp)
   0x004014f8 <+8>:     sub    $0x40,%rsp
   0x004014fc <+12>:    mov    %rsp,%rax
   0x004014ff <+15>:    add    $0x1f,%rax
   0x00401503 <+19>:    shr    $0x5,%rax
   0x00401507 <+23>:    shl    $0x5,%rax
7         v4d x __attribute__ ((aligned (32))) = { 1.0, 2.0, 3.0, 4.0, };
   0x0040150b <+27>:    vmovapd 0x2aed(%rip),%ymm0        # 0x404000
   0x00401513 <+35>:    vmovapd %ymm0,(%rax)
8         return x;
   0x00401517 <+39>:    mov    0x10(%rbp),%rdx
   0x0040151b <+43>:    vmovapd (%rax),%ymm0
=> 0x0040151f <+47>:    vmovapd %ymm0,(%rdx)
9       }
   0x00401523 <+51>:    mov    0x10(%rbp),%rax
   0x00401527 <+55>:    mov    %rbp,%rsp
   0x0040152a <+58>:    pop    %rbp
   0x0040152b <+59>:    retq
End of assembler dump.
(gdb) print $rdx % 32
$1 = 16
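The arithmetic in the disassembly above can be sketched in C. The `add $0x1f` / `shr $0x5` / `shl $0x5` sequence rounds the address in `%rax` up to a multiple of 32, so the local `x` is aligned. The faulting store instead uses the address in `%rdx`, which gdb reports as 16 modulo 32. The helper names `align_up` and `misalignment` and the sample address in the usage note are illustrative assumptions, not part of the testcase.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next multiple of align (align must be a power of two).
   With align == 32 this computes the same value as
   add $0x1f,%rax; shr $0x5,%rax; shl $0x5,%rax. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~(align - 1);
}

/* Number of bytes by which addr exceeds the previous multiple of align;
   0 means addr is aligned. Mirrors "print $rdx % 32" in gdb. */
static uintptr_t misalignment(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return addr & (align - 1);
}
```

For a made-up address such as `0x22fe10`, which is 16-byte but not 32-byte aligned, `misalignment(0x22fe10, 32)` is 16 and `align_up(0x22fe10, 32)` is `0x22fe20`. An aligned `vmovapd` with a 256-bit operand faults on the former and works on the latter.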
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #10 from Roland Schulz --- Created attachment 33520 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33520&action=edit Slightly modified testcase This slightly modified testcase in which the return value isn't stored, still segfaults for me. With the 32bit mingw64 binary ((i686-win32-dwarf-rev1, Built by MinGW-W64 project) 4.9.1) it is OK, but with the 64bit binary ((x86_64-win32-seh-rev1, Built by MinGW-W64 project) 4.9.1) it segfaults.
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #9 from R Copley --- Heh, sorry. I don't really know C, I assumed it had an implicit "return 0;" like C++. Apparently C99 has this but earlier C standards do not. So, not bizarre at all.
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #8 from R Copley --- No, I use the mingw-builds distro too.
gcc --version
gcc (x86_64-win32-seh-rev0, Built by MinGW-W64 project) 4.9.1
Bizarrely, the attached program exits with a random error code unless I add a "return 0;" statement to the end of the main function. But it doesn't segfault.
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #7 from Roland Schulz --- For me the problem isn't fixed with gcc 4.9.1. I tried two builds: a) http://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win32/Personal%20Builds/mingw-builds/installer/mingw-w64-install.exe/download and b) http://nuwen.net/mingw.html. Did you use a special distribution, or special flags if you compiled gcc yourself?
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #6 from R Copley --- As I mentioned in the description, this request was indeed related to that bug. The test-case no longer crashes with recent MinGW-W64 toolchains (GCC 4.9.1).
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #5 from Roland Schulz --- This seems to me to be a duplicate of 49001.
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #4 from R Copley --- (In reply to Kai Tietz from comment #1)
> MS' ABI doesn't allow this. So I doubt we will be able to implement that
> for this target. If we want to re-align the stack on a per-function basis,
> we will run into trouble with the SEH information.
You might be right, I'm not sure. Are you aware that on 64-bit Windows, SEH is table-based, not frame-based (see, e.g., http://www.osronline.com/article.cfm?article=469)?
> Doesn't it work to explicitly align the variable itself?
No (see attachments 2 and 3). If I understand correctly, the alignment specification is redundant anyway, because the variables are supposed to be naturally aligned, on their size. Assembling attachment 3 with "-g" and running it in gdb gives:
Program received signal SIGSEGV, Segmentation fault.
main () at bug1.s:46
46      vmovapd %ymm0, -96(%rbp)
Thanks.
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #3 from R Copley --- Created attachment 30794 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30794&action=edit Assembly-language code compiled from attachment 1
Compiled with GCC 4.7.2 from the MinGW-w64 toolchain. Compile command: "gcc -O0 -m64 -mavx -S bug1.c -o bug1.s". gcc -v output:
Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=c:/mingw64/bin/../libexec/gcc/x86_64-w64-mingw32/4.7.2/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: /home/ruben/mingw-w64/src/gcc/configure --host=x86_64-w64-mingw32 --build=x86_64-linux-gnu --target=x86_64-w64-mingw32 --with-sysroot=/home/ruben/mingw-w64/mingw64mingw64/mingw64 --prefix=/home/ruben/mingw-w64/mingw64mingw64/mingw64 --with-gmp=/home/ruben/mingw-w64/prereq/x86_64-w64-mingw32/install --with-mpfr=/home/ruben/mingw-w64/prereq/x86_64-w64-mingw32/install --with-mpc=/home/ruben/mingw-w64/prereq/x86_64-w64-mingw32/install --with-ppl=/home/ruben/mingw-w64/prereq/x86_64-w64-mingw32/install --with-cloog=/home/ruben/mingw-w64/prereq/x86_64-w64-mingw32/install --disable-ppl-version-check --disable-cloog-version-check --enable-cloog-backend=isl --with-host-libstdcxx='-static -lstdc++ -lm' --enable-shared --enable-static --enable-threads=win32 --enable-plugins --disable-multilib --enable-languages=c,lto,c++,objc,obj-c++,fortran,java --enable-libgomp --enable-fully-dynamic-string --enable-libstdcxx-time --disable-nls --disable-werror --enable-checking=release --with-gnu-as --with-gnu-ld --disable-win32-registry --disable-rpath --disable-werror --with-libiconv-prefix=/home/ruben/mingw-w64/prereq/x86_64-w64-mingw32/install --with-pkgversion=rubenvb-4.7.2-release --with-bugurl=mingw-w64-pub...@lists.sourceforge.net CC= CFLAGS='-O2 -march=nocona -mtune=core2 -fomit-frame-pointer -momit-leaf-frame-pointer' LDFLAGS=
Thread model: win32
gcc version 4.7.2 (rubenvb-4.7.2-release)
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 --- Comment #2 from R Copley --- Created attachment 30793 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30793&action=edit As before, but with explicitly 32-byte aligned variables
[Bug target/54412] Request for 32-byte stack alignment with -mavx on Windows
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54412 Kai Tietz changed:
What | Removed | Added
CC   |         | ktietz at gcc dot gnu.org
--- Comment #1 from Kai Tietz --- MS' ABI doesn't allow this. So I doubt we will be able to implement that for this target. If we want to re-align the stack on a per-function basis, we will run into trouble with the SEH information. Doesn't it work to explicitly align the variable itself?