Package: gcc-4.5 Version: 4.5.2-8 Severity: normal Hi
See Debian #621137; gcc-4.5 4.5.2-8 misbuilds busybox: r5 is clobbered in tryexec() as gcc believes it doesn't return, but it does. This doesn't happen with Linaro GCC 4.5 or GCC 4.6, but it does happen with plain FSF 4.5.2. I've asked on the Linaro Toolchain ML which patch fixes this: http://lists.linaro.org/pipermail/linaro-toolchain/2011-April/001047.html But I didn't open an upstream ticket as there might be one already. See IRC log below for details. 00:52 < lool> michaelh1: if I remove references to applet_no from the function and call sites (it's unused anyway) the issue goes away 00:52 < lool> michaelh1: it's flagged __unused_param__ 00:53 < lool> michaelh1: So I have ash.o and am looking at a disassembled dump, and I can't figure what to look for 00:53 < michaelh1> lool: do you have a preprocessed version of it? What compiler and build options does the fault occur at? 00:53 < michaelh1> lool: I'm suspicious that you're going from a all-register version to a also-something-on-the-stack by adding the extra parameter. 00:54 < lool> michaelh1: Right now I'm doing this with Debian's gcc-4.5 00:54 < lool> no Linaro patchset 00:55 < lool> I was using it earlier, but the busybox versions I was using were not xbuild friendly, or did not reproduce 00:55 < lool> Now that I know that this Debian patch triggers it, I might be able to reproduce with Linaro toolchain 00:55 < michaelh1> lool: yip. Can you do a -save-temps run and send me ash.i and ash.s? 00:56 < lool> michaelh1: BTW I can feel how it would have been useful to have "fsf" toolchains in Ubuntu or Debian and Linaro toolchains in Debian :-) 00:56 < michaelh1> lool: yeah, I want to have a plain linaro PPA in the future. Add a plain fsf as well using the same infrastructure. Helps in tracking down what introduced a problem... 00:57 -!- NishanthMenon [~nmenon@nat/ti/x-iiyvawwlphubdqsc] has quit [Quit: Ex-Chat] 00:57 < lool> michaelh1: I'll look at saving the temps; objdump -D isn't useful I guess? 00:58 < lool> http://paste.ubuntu.com/590485/ has one 00:58 < lool> it's awfully big 00:58 < michaelh1> lool: it's easiest to compare outputs from different compilers by running over the .i file. 00:58 * michaelh1 looks 00:59 < michaelh1> lool: is this the broken one? 01:00 < lool> michaelh1: http://people.linaro.org/~lool/ash.s 01:00 < lool> michaelh1: That one is broken yes 01:01 < lool> michaelh1: http://people.linaro.org/~lool/ash.i 01:03 < michaelh1> lool: what options did you pass to GCC? -Os? -O2? 01:05 < lool> michaelh1: Hmm oddly it seems to be -Os 01:05 < lool> gcc -save-temps -Wp,-MD,shell/.ash.o.d -std=gnu99 -Iinclude -Ilibbb -include include/autoconf.h -D_GNU_SOURCE -DNDEBUG -D_LARGEFILE_SOURCE -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -D"BB_VER=KBUILD_STR(1.18.4)" -D"BB_BT=AUTOCONF_TIMESTAMP" -Wall -Wshadow -Wwrite-strings -Wundef -Wstrict-prototypes -Wunused -Wunused-parameter -Wunused-function -Wunused-value -Wmissing-prototypes -Wmissing-declarations -Wdeclaration-after-statement ... 01:05 < lool> ... -Wold-style-definition -fno-builtin-strlen -finline-limit=0 -fomit-frame-pointer -ffunction-sections -fdata-sections -fno-guess-branch-probability -funsigned-char -static-libgcc -falign-functions=1 -falign-jumps=1 -falign-labels=1 -falign-loops=1 -g -Os -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(ash)" -D"KBUILD_MODNAME=KBUILD_STR(ash)" -c -o shell/ash.o shell/ash.c 01:05 < lool> But it should have been -O0 :-( 01:05 < lool> michaelh1: I tried cross-building with Ubuntu's cross and Debian's patch applied, but I can't reproduce with that toolchain 01:05 -!- robher [~robher@173.226.190.126] has quit [Ping timeout: 246 seconds] 01:05 < lool> it might also be other differences between my tree and the Debian one 01:06 < michaelh1> lool: but you can reproduce the problem? 01:06 -!- marciom [~marciom@209.119.62.66] has joined #linaro 01:07 < lool> michaelh1: I can in a Debian gcc-4.5 context 01:07 < michaelh1> lool: and is this for armv4t or armv7-a 01:07 < lool> michaelh1: Debian is armv4t, when cross-building I passed CC="arm-linux-gnueabi-gcc -marm -march=armv4t -mtune=arm9tdmi" and verified that this was used 01:07 < lool> but the cross-build didn't expose the issue 01:08 < lool> I'm trying another cross-build with the exact Debian tree rather than busybox git + Debian patch 01:08 < michaelh1> lool: OK. so tryexec.clone.0 seems to be tryexec with the first argument removed. 01:09 -!- arun_ [~arun@unaffiliated/sindian] has joined #linaro 01:09 < lool> michaelh1: BTW "successes" don't mean anything above as the issue is slightly random 01:10 < lool> michaelh1: probably memory initialization or stack state 01:11 < lool> So I tried the exact Debian tree, and couldn't reproduce with a cross-build using Ubuntu's cross toolchain 01:11 -!- rcn-ee [~voo...@thief-pool8-147.mncable.net] has quit [Remote host closed the connection] 01:12 < lool> michaelh1: Ah I failed to mention that we had a similar issue in Ubuntu at some point, and it went away by changing build options :-/ 01:12 < lool> michaelh1: https://bugs.launchpad.net/ubuntu/+source/klibc/+bug/683683 01:12 < lool> actually maybe I need to pass -marm 01:13 < lool> ah no I did 01:13 -!- rcn-ee [~voo...@thief-pool8-147.mncable.net] has joined #linaro 01:14 < michaelh1> lool: OK. The assembly up until just after xmalloc is identical with Linaro GCC. Let's see if the code following is correct... 01:19 -!- rcn-ee [~voo...@thief-pool8-147.mncable.net] has quit [Ping timeout: 276 seconds] 01:19 -!- mpoirier [~quas...@s0106002369de4dac.cg.shawcable.net] has quit [Ping timeout: 260 seconds] 01:23 < lool> michaelh1: BTW I tride running in valgrind, but got a fury of warnings; also, valgrind isnt happy when the underlying program calls exec(0 01:23 < lool> exec() 01:24 < michaelh1> lool: on first pass, the code looks OK. A strange thing is that Debian GCC thinks the function never returns... 01:24 < lool> michaelh1: I think it's because the parent function doesn't return 01:24 < lool> michaelh1: and it's static 01:24 < lool> michaelh1: I wondered the same 01:25 < michaelh1> lool: Linaro GCC thinks that it does return... 01:25 -!- jadamcze [~jadam...@ppp118-208-199-129.lns20.hba1.internode.on.net] has quit [Read error: Operation timed out] 01:25 < lool> michaelh1: that is a good lead 01:26 < lool> michaelh1: gcc-4.5 os 4.5.2 in Debian 01:26 < michaelh1> lool: yip. I see the same as your ash.s when compiling with plain 4.5.2 01:27 -!- jadamcze [~jadam...@ppp118-208-206-25.lns20.hba1.internode.on.net] has joined #linaro 01:27 < michaelh1> lool: so: Linaro GCC 4.5 says the function returns, FSF 4.6 says it returns, FSF 4.5.2 says it doesn't 01:28 < lool> michaelh1: Ah, I wanted to try with gcc-4.6, but didn't have easy access to it 01:28 < lool> It would have been a good test 01:29 < lool> michaelh1: Is this something you can spot easily in Changelog.linaro? 01:29 -!- marciom [~marciom@209.119.62.66] has quit [Quit: This computer has gone to sleep] 01:29 < lool> michaelh1: I'm some days behind on Debian gcc-4.5, I see Matthias uploaded a bunch of things, trying this now 01:29 < michaelh1> lool: I remember there being problems with these .clone. functions in Qt. 01:30 -!- rcn-ee [~voo...@thief-pool8-147.mncable.net] has joined #linaro 01:32 < michaelh1> lool: here's another hint: with applet_no removed, gcc 4.5.2 now thinks that the function returns. 01:32 < lool> michaelh1: haha awesome 01:33 < lool> michaelh1: So probably a combination with __unused__ param 01:35 < lool> Hmm I can't put __attribute__ ((__noreturn__)) in my tentative test case 01:35 < lool> michaelh1: I'm trying to convert this issue in a test case 01:35 < lool> but gcc doesn't like my //static void foo(void) __attribute__ ((__noreturn__)) 01:36 < lool> ah I need to declare it standalone 01:39 < lool> michaelh1: http://paste.ubuntu.com/590492/ doesn't give the same detailed output in the .s 01:39 < michaelh1> lool: so tryexec is at least clobbering the callers r5 01:40 < lool> michaelh1: Hmm which gcc wouldn't see as a problem for a noreturn function I guess 01:42 -!- rcn-ee [~voo...@thief-pool8-147.mncable.net] has quit [Remote host closed the connection] 01:45 < lool> michaelh1: Still an issue with latest Debian gcc-4.5 01:45 < michaelh1> lool: hmm. applet_no seems to be uninitialised in shellexec() 01:46 < lool> Yup; the initial code was setting it from a function call 01:46 < lool> michaelh1: trying a build setting that to 0 01:47 < michaelh1> lool: removing the 'static' from tryexec turns it back into a normal call, drops the clone, and now 4.5.2 thinks the function returns... 01:48 < lool> michaelh1: BTW these are my first builds on pandas, and gosh is that nice 01:49 < michaelh1> lool: 2.3 x faster on a GCC build... 01:49 < lool> plus I've setup ccache and am building the package with -j2, it's quite impressive 01:50 < lool> michaelh1: I still see the issue after initializing applet_no to 0 01:51 -!- jadamcze [~jadam...@ppp118-208-206-25.lns20.hba1.internode.on.net] has quit [Ping timeout: 246 seconds] 01:51 < michaelh1> lool: here's a minimal version of tryexec: http://pastebin.ubuntu.com/590498/ 01:52 < michaelh1> lool: delete the variable and GCC changes it's mind from 'doesn't return' to 'does return' 01:53 -!- jadamcze [~jadam...@ppp118-208-249-88.lns20.hba2.internode.on.net] has joined #linaro 01:53 < lool> michaelh1: Now I feel a bit bad because it's not affecting gcc-linaro, but it does affect FSF 4.5 and Debian 01:54 < lool> michaelh1: Do you think there is a chance that it's just hidden upstream and could bite use again in the future/ 01:54 < lool> ? 01:54 < michaelh1> lool: all bugs are worth looking into as toolchain bugs are rare... 01:54 < michaelh1> lool: I'm pretty sure we've seen and fixed a similar bug but I can't track it down. Send an email to linaro-toolchain? 01:55 < lool> michaelh1: How do you trigger the output of "return" in the gcc output? 01:55 < lool> arm-linux-gnueabi-gcc -save-temps -std=gnu99 -D_GNU_SOURCE -o bar.o -c bar.c doesn't show it with your tryexec snippet 01:55 < michaelh1> lool: oh, and GCC 4.5.2 gets it wrong at -O2 as well... 01:56 < michaelh1> lool: I don't understand. 01:56 < lool> michaelh1: I confirm that removing the static works around the issue as well, busybox works again 01:56 < michaelh1> lool: I shrunk tryexec in ash.i down to that snippet and kept the rest of the file. 01:56 < lool> michaelh1: I am trying to get the same interesting "no return" output 01:57 < lool> Oh you're building the whole .i, I was trying to come down to a .c file + build opts 01:57 < lool> but I guess it's just simpler to keep using the .i file 01:57 < michaelh1> I'd be happy with a minimal version of the .i file containing, say, tryexec() and shellexec(). Both are already small... Thanks, -- Loïc Minier -- To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org