https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116046
Bug ID: 116046
Summary: vmovdqa64 is used on unaligned memory caused by
unaligned %rsp/%rbp
Product: gcc
Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: haochen.jiang at intel dot com
Target Milestone: ---
On x86_64-pc-linux-gnu, when I compiled the test avx512f-vec-set-2.c with -O0
and ran the resulting executable, I got a segmentation fault:
$ /export/users/haochenj/env/build_no_bootstrap_future/gcc/xgcc
-B/export/users/haochenj/env/build_no_bootstrap_future/gcc/
/export/users/haochenj/src/gcc/future/gcc/testsuite/gcc.target/i386/avx512f-vec-set-2.c
-m32 -fdiagnostics-plain-output -O0 -mavx512f -mno-avx512bw -lm -o
./avx512f-vec-set-2.exe
$ ./avx512f-vec-set-2.exe
Segmentation fault (core dumped)
The test passes with -O1, and also when built with clang.
Derived testcase: https://godbolt.org/z/dTxn7TG1c
The segmentation fault happens at the vmovdqa64 here:
.L4:
...
leaq -192(%rbp), %rdx
vmovdqa64 (%rdx), %zmm0
movl $50, %esi
movl %eax, %edi
call foo_v64qi(char __vector(64), char, unsigned int)
...
The %rbp here is not 64-byte aligned, because of the code at the beginning of test_512():
test_512():
leaq 8(%rsp), %r10
andq $-64, %rsp
pushq -8(%r10)
pushq %rbp
movq %rsp, %rbp
pushq %r10
...
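After %rsp is aligned to 64 bytes, the two pushes that follow break the
alignment before %rsp is copied into %rbp. %rbp therefore ends up 16 bytes below
a 64-byte boundary, and -192(%rbp) is not 64-byte aligned.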
With -O1 the %rbp/%rsp setup is similar, but the vmovdqa64 uses an offset
that makes the address 64-byte aligned:
.L3:
...
vmovdqa64 -176(%rbp), %zmm0
call foo_v64qi(char __vector(64), char, unsigned int)
...
clang aligns %rsp only after all the pushes, and it uses vmovdqu64 for the
load, which does not require aligned memory:
test_512(): # @test_512()
pushq %rbp
movq %rsp, %rbp
andq $-64, %rsp
subq $256, %rsp # imm = 0x100
...
.LBB1_5:
...
vmovdqu64 160(%rsp), %zmm0
movl 232(%rsp), %eax
movl $50, %esi
movsbl %al, %edi
callq foo_v64qi(char vector[64], char, unsigned int)
...
The issue has probably existed since vmovdqa64 was introduced.