Hi Matthew

On Tue, Oct 9, 2012 at 11:37 PM, Matthew Gretton-Dann <matthew.gretton-d...@linaro.org> wrote:
> On 9 October 2012 14:44, Jubi Taneja <jubitan...@gmail.com> wrote:
> >
> >
> > On Tue, Oct 9, 2012 at 5:21 PM, Matthew Gretton-Dann
> > <matthew.gretton-d...@linaro.org> wrote:
> >>
> >> >> /* arm-none-linux-gnueabi-gcc -mcpu=cortex-a15 -mfpu=vfpv4 -S -o-
> >> >> /tmp/fma.c -mfloat-abi=hard -O2 */
> >> >> float f(float a, float b, float c)
> >> >> {
> >> >>   return a * b + c;
> >> >> }
> >> >> /* end of tmp.c */
> >> >>
> >> >> (Note that -mfloat-abi=softfp will also work in this example. Which
> >> >> one you want to use depends on whether you have configured your system
> >> >> for hard or soft-float ABIs).
> >> >>
> >> > I checked both with -mfpu=vfpv3 and -mfpu=vfpv4 and it generates the same
> >> > assembly code. VMLA insn is emitted for both the cases. I was wondering if I
> >> > can get any test case so that I may observe the difference in the two
> >> > objdumps.
> >>
> >> Which compiler are you using? VFMA support is only in trunk FSF GCC.
> >> Linaro has not yet backported support to 4.7.
> >
> >
> > I am using FSF GCC only.
>
> What version of GCC (what does arm-none-linux-gneabi-gcc -v report?).
>

# arm-none-linux-gneabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-none-linux-gneabi-gcc
COLLECT_LTO_WRAPPER=/opt/toolchains/arm/bin/../libexec/gcc/arm-none-linux-gneabi/4.6.3/lto-wrapper
Target: arm-none-linux-gneabi
Configured with: /home/user/arm-src/build/sources/gcc_1/configure --build=i686-pc-linux-gnu --host=i686-pc-linux-gnu --target=arm-none-linux-gneabi --prefix=/opt/arm --with-sysroot=/opt/arm/arm-none-linux-gneabi/sys-root --disable-libmudflap --disable-libssp --disable-libgomp --disable-nls --disable-libstdcxx-pch --with-interwork --with-mode=arm --with-fpu=vfp3 --with-cpu=cortex-a9 --with-tune=cortex-a9 --with-float=softfp --enable-extra-vd-multilibs --enable-poison-system-directories --enable-long-long --enable-threads --enable-languages=c,c++ --enable-shared --enable-lto --enable-symvers=gnu --enable-__cxa_atexit --with-pkgversion=arm-toolchain.v1 --with-gnu-as --with-gnu-ld --with-host-libstdcxx='-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-build-time-tools=/opt/arm/bin --with-gmp=/opt/arm --with-mpfr=/opt/arm --with-ppl=/opt/arm --with-cloog=/opt/arm --with-libelf=/opt/arm
Thread model: posix
gcc version 4.6.3 (arm-toolchain.v1)

> When I compile the test case above with a recent (within last month or
> so) trunk GCC I get the following output which uses vfma:
>
> $ /work/builds/gcc-fsf-arm-none-linux-gnueabi/tools/bin/arm-none-linux-gnueabi-gcc
> -mcpu=cortex-a15 -mfpu=vfpv4 -S -o- /tmp/fma.c -mfloat-abi=hard -O2
>         .cpu cortex-a15
>         .eabi_attribute 27, 3
>         .eabi_attribute 28, 1
>         .fpu vfpv4
>         .eabi_attribute 20, 1
>         .eabi_attribute 21, 1
>         .eabi_attribute 23, 3
>         .eabi_attribute 24, 1
>         .eabi_attribute 25, 1
>         .eabi_attribute 26, 2
>         .eabi_attribute 30, 2
>         .eabi_attribute 34, 1
>         .eabi_attribute 18, 4
>         .file "fma.c"
>         .text
>         .align 2
>         .global f
>         .type f, %function
> f:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         vfma.f32 s2, s0, s1
>         fcpys s0, s2
>         bx lr
>         .size f, .-f
>         .ident "GCC: (GNU) 4.8.0 20120913 (experimental)"
>         .section .note.GNU-stack,"",%progbits
>
> --
>

$ arm-none-linux-gnueabi-gcc -mcpu=cortex-a15 -mfpu=vfpv4 -S -o- prog.c -O2
        .cpu cortex-a15
        .eabi_attribute 27, 3
        .fpu vfpv4
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 2
        .eabi_attribute 30, 2
        .eabi_attribute 34, 0
        .eabi_attribute 18, 4
        .file "prog.c"
        .section .text.f,"ax",%progbits
        .align 2
        .global f
        .type f, %function
f:
        .fnstart
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        fmsr s14, r0
        fmsr s13, r2
        fmsr s15, r1
        fmacs s13, s14, s15
        fmrs r0, s13
        bx lr
        .fnend
        .size f, .-f
        .section .text.startup.main,"ax",%progbits
        .align 2
        .global main
        .type main, %function
main:
        .fnstart
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        bx lr
        .fnend
        .size main, .-main
        .ident "GCC: (VDLinux.GA1.2012-10-03) 4.6.4"
        .section .note.GNU-stack,"",%progbits

I could not work out the difference between the two results, or what it
means for my original query. Can you please guide me on how to dig deeper
into it?

Jubi

> Matthew Gretton-Dann
> Linaro Toolchain Working Group
> matthew.gretton-d...@linaro.org
>
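On the difference between the two listings: vfma.f32 is a fused multiply-add that rounds once, whereas vmla.f32 (fmacs is its older, pre-UAL mnemonic) rounds the product before the add. The C sketch below tries to show that difference on any host. It assumes IEEE 754 single precision and round-to-nearest. The function names are hypothetical, and the fused path is only emulated with double arithmetic rather than a real vfma, so treat it as an illustration and not as a replacement for fmaf:

```c
#include <assert.h>
#include <stdio.h>

/* Multiply-accumulate with the product rounded to float before the add,
   which is how vmla.f32 behaves. The volatile store forces that
   intermediate rounding and stops the compiler from fusing the two steps. */
float mla_unfused(float a, float b, float c)
{
    volatile float product = a * b;
    return product + c;
}

/* Hypothetical stand-in for vfma.f32: the product of two floats is exact
   in double, so the product is not rounded before the add. The double
   addition can itself round, so this is not a general fmaf. */
float mla_fused_emulated(float a, float b, float c)
{
    double exact_product = (double)a * (double)b;
    return (float)(exact_product + (double)c);
}

/* Inputs chosen so that a * a = 1 + 2^-11 + 2^-24 sits exactly halfway
   between two floats; rounding the product alone drops the 2^-24 term. */
void mla_demo(void)
{
    const float a = 1.0f + 1.0f / 4096.0f;      /* 1 + 2^-12 */
    const float c = -(1.0f + 1.0f / 2048.0f);   /* -(1 + 2^-11) */
    float unfused = mla_unfused(a, a, c);
    float fused = mla_fused_emulated(a, a, c);

    assert(unfused == 0.0f);                    /* 2^-24 term was lost */
    assert(fused == 1.0f / 16777216.0f);        /* 2^-24 term survived */
    printf("unfused = %g, fused = %g\n", unfused, fused);
}
```

For most inputs the two paths agree, which is why the small f() test case looks the same either way unless you compare the emitted instruction. Only the last rounding step differs, and only for inputs like the ones in mla_demo().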
_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain