On Sep 10, 2013, at 18:34, Tijl Coosemans <t...@freebsd.org> wrote:
> On Tue, 10 Sep 2013 18:16:01 +0200 Tijl Coosemans wrote:
>> I've attached a small test program extracted from multimedia/gstreamer-ffmpeg
>> (libavcodec/h264_cabac.c:ff_h264_init_cabac_states(H264Context *h)).
>>
>> When you compile and run it like this on FreeBSD/i386, it results in a
>> SIGBUS:
>>
>> % cc -o paddd paddd.c -O3 -msse2 -fPIE -fomit-frame-pointer
>> % ./paddd
>> Bus error
>>
>> The reason is this instruction where %esp isn't 16-byte aligned:
>> paddd (%esp), %xmm7
Hmm, as far as I can see, the problem is related to position independent
code, in combination with omitting the frame pointer:

$ cc -o paddd paddd.c -O3 -msse2 -fomit-frame-pointer
$ ./paddd
$
$ cc -o paddd paddd.c -O3 -msse2 -fPIE -fomit-frame-pointer
$ ./paddd
Bus error (core dumped)
$
$ cc -o paddd paddd.c -O3 -msse2 -fPIE -fno-omit-frame-pointer
$ ./paddd
$

>> Is this an upstream bug or is this because of local changes (to make the
>> stack 4 byte aligned by default or something)?

The changes to 4 byte stack alignment on i386 are from upstream, but we
initiated them after a bit of discussion (see
http://llvm.org/viewvc/llvm-project?view=revision&revision=167632 ).

Note the problem only occurs at -O3, which enables the vectorizer, so
there might be an issue with it in combination with position independent
code generation and omitting frame pointers.

If you check what clang passes to its cc1 stage with your original
command line, it gives:

"/usr/bin/cc" -cc1 -triple i386-unknown-freebsd10.0 -emit-obj -disable-free -main-file-name paddd.c -mrelocation-model pic -pic-level 2 -pie-level 2 -masm-verbose -mconstructor-aliases -target-cpu i486 -target-feature +sse2 -v -resource-dir /usr/bin/../lib/clang/3.3 -O3 -fdebug-compilation-dir /home/dim/bugs/paddd -ferror-limit 19 -fmessage-length 130 -mstackrealign -fobjc-runtime=gnustep -fobjc-default-synthesize-properties -fdiagnostics-show-option -fcolor-diagnostics -backend-option -vectorize-loops -o /tmp/paddd-zdRbKM.o -x c paddd.c

So it does pass -mstackrealign, but for some reason it isn't always
effective.
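To make concrete what realignment is expected to achieve: paddd with a
memory operand needs a 16-byte aligned address, and realigning the stack
typically amounts to masking off the low four bits of the stack pointer.
Below is a minimal sketch of that arithmetic in plain C. The helper names
align_down() and is_aligned16() are made up for illustration; they are
not part of the test program or of clang.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round addr down to a multiple of alignment, which must be a power of
 * two.  With alignment == 16 this clears the low four bits, i.e. the
 * same masking a prolog would do to realign the stack pointer.
 * (Hypothetical helper, for illustration only.)
 */
static uintptr_t align_down(uintptr_t addr, uintptr_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return addr & ~(alignment - 1);
}

/*
 * Return nonzero if addr is 16-byte aligned, which is what an SSE2
 * instruction such as paddd requires of its memory operand.
 * (Hypothetical helper, for illustration only.)
 */
static int is_aligned16(uintptr_t addr)
{
    return (addr & (uintptr_t)15) == 0;
}
```

For example, an invented stack pointer value of 0xbfbfe7ac is only 4-byte
aligned; align_down(0xbfbfe7ac, 16) gives 0xbfbfe7a0, which is safe to
use as a paddd memory operand.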
For the -fPIE -fomit-frame-pointer case, the prolog for init_states()
becomes:

init_states:                            # @init_states
# BB#0:                                 # %vector.ph
        pushl   %ebp
        pushl   %ebx
        pushl   %edi
        pushl   %esi
        subl    $28, %esp
        calll   .L0$pb
.L0$pb:
        popl    %edx

If you remove -fPIE, the data is directly accessed via its (properly
16 byte aligned) symbol, so there is no alignment problem:

        paddd   .LCPI0_0, %xmm7

but the stack is not realigned in the prolog either:

init_states:                            # @init_states
# BB#0:                                 # %vector.ph
        pushl   %ebx
        pushl   %edi
        pushl   %esi
        movd    16(%esp), %xmm0
        ...

Then, if you use -fPIE, but add -fno-omit-frame-pointer:

init_states:                            # @init_states
# BB#0:                                 # %vector.ph
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ebx
        pushl   %edi
        pushl   %esi
        andl    $-16, %esp
        subl    $48, %esp
        calll   .L0$pb
.L0$pb:
        popl    %edx
.Ltmp0:

I.e., here the stack is properly realigned, and the function works fine.

In any case: yes, I think this is a bug, and we should report it
upstream. This is a very nice test case to do so.

-Dimitry