Hi Matthew,

On Tue, Oct 9, 2012 at 11:37 PM, Matthew Gretton-Dann <
matthew.gretton-d...@linaro.org> wrote:

> On 9 October 2012 14:44, Jubi Taneja <jubitan...@gmail.com> wrote:
> >
> >
> > On Tue, Oct 9, 2012 at 5:21 PM, Matthew Gretton-Dann
> > <matthew.gretton-d...@linaro.org> wrote:
> >>
> >> >> /* arm-none-linux-gnueabi-gcc -mcpu=cortex-a15 -mfpu=vfpv4 -S -o-
> >> >> /tmp/fma.c -mfloat-abi=hard -O2 */
> >> >> float f(float a, float b, float c)
> >> >> {
> >> >>   return a * b + c;
> >> >> }
> >> >> /* end of fma.c */
> >> >>
> >> >> (Note that -mfloat-abi=softfp will also work in this example.  Which
> >> >> one you want to use depends on whether you have configured your system
> >> >> for hard or soft-float ABIs).
> >> >>
> >> > I checked both with -mfpu=vfpv3 and -mfpu=vfpv4 and it generates the same
> >> > assembly code. VMLA insn is emitted for both the cases. I was wondering if I
> >> > can get any test case so that I may observe the difference in the two
> >> > objdumps.
> >>
> >> Which compiler are you using?  VFMA support is only in trunk FSF GCC.
> >> Linaro has not yet backported support to 4.7.
> >
> >
> > I am using FSF GCC only.
>
> What version of GCC (what does arm-none-linux-gneabi-gcc -v report?).
>
# arm-none-linux-gneabi-gcc -v
Using built-in specs.
COLLECT_GCC=arm-none-linux-gneabi-gcc
COLLECT_LTO_WRAPPER=/opt/toolchains/arm/bin/../libexec/gcc/arm-none-linux-gneabi/4.6.3/lto-wrapper
Target: arm-none-linux-gneabi
Configured with: /home/user/arm-src/build/sources/gcc_1/configure
--build=i686-pc-linux-gnu --host=i686-pc-linux-gnu
--target=arm-none-linux-gneabi --prefix=/opt/arm
--with-sysroot=/opt/arm/arm-none-linux-gneabi/sys-root --disable-libmudflap
--disable-libssp --disable-libgomp --disable-nls --disable-libstdcxx-pch
--with-interwork --with-mode=arm --with-fpu=vfp3 --with-cpu=cortex-a9
--with-tune=cortex-a9 --with-float=softfp --enable-extra-vd-multilibs
--enable-poison-system-directories --enable-long-long --enable-threads
--enable-languages=c,c++ --enable-shared --enable-lto --enable-symvers=gnu
--enable-__cxa_atexit --with-pkgversion=arm-toolchain.v1  --with-gnu-as
--with-gnu-ld --with-host-libstdcxx='-static-libgcc
-Wl,-Bstatic,-lstdc++,-Bdynamic -lm' --with-build-time-tools=/opt/arm/bin
--with-gmp=/opt/arm --with-mpfr=/opt/arm --with-ppl=/opt/arm
--with-cloog=/opt/arm --with-libelf=/opt/arm
Thread model: posix
gcc version 4.6.3 (arm-toolchain.v1)

> When I compile the test case above with a recent (within last month or
> so) trunk GCC I get the following output which uses vfma:
>
> $
> /work/builds/gcc-fsf-arm-none-linux-gnueabi/tools/bin/arm-none-linux-gnueabi-gcc
> -mcpu=cortex-a15 -mfpu=vfpv4 -S -o- /tmp/fma.c -mfloat-abi=hard -O2
>         .cpu cortex-a15
>         .eabi_attribute 27, 3
>         .eabi_attribute 28, 1
>         .fpu vfpv4
>         .eabi_attribute 20, 1
>         .eabi_attribute 21, 1
>         .eabi_attribute 23, 3
>         .eabi_attribute 24, 1
>         .eabi_attribute 25, 1
>         .eabi_attribute 26, 2
>         .eabi_attribute 30, 2
>         .eabi_attribute 34, 1
>         .eabi_attribute 18, 4
>         .file   "fma.c"
>         .text
>         .align  2
>         .global f
>         .type   f, %function
> f:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         vfma.f32        s2, s0, s1
>         fcpys   s0, s2
>         bx      lr
>         .size   f, .-f
>         .ident  "GCC: (GNU) 4.8.0 20120913 (experimental)"
>         .section        .note.GNU-stack,"",%progbits
>
> --
>


$ arm-none-linux-gnueabi-gcc -mcpu=cortex-a15 -mfpu=vfpv4 -S -o- prog.c -O2
    .cpu cortex-a15
    .eabi_attribute 27, 3
    .fpu vfpv4
    .eabi_attribute 20, 1
    .eabi_attribute 21, 1
    .eabi_attribute 23, 3
    .eabi_attribute 24, 1
    .eabi_attribute 25, 1
    .eabi_attribute 26, 2
    .eabi_attribute 30, 2
    .eabi_attribute 34, 0
    .eabi_attribute 18, 4
    .file    "prog.c"
    .section    .text.f,"ax",%progbits
    .align    2
    .global    f
    .type    f, %function
f:
    .fnstart
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    fmsr    s14, r0
    fmsr    s13, r2
    fmsr    s15, r1
    fmacs    s13, s14, s15
    fmrs    r0, s13
    bx    lr
    .fnend
    .size    f, .-f
    .section    .text.startup.main,"ax",%progbits
    .align    2
    .global    main
    .type    main, %function
main:
    .fnstart
    @ args = 0, pretend = 0, frame = 0
    @ frame_needed = 0, uses_anonymous_args = 0
    @ link register save eliminated.
    bx    lr
    .fnend
    .size    main, .-main
    .ident    "GCC: (VDLinux.GA1.2012-10-03) 4.6.4"
    .section    .note.GNU-stack,"",%progbits

I could not work out what the difference between the two results means,
or what it implies for my original query. Could you please guide me on
how to dig deeper into it?

Jubi

>  Matthew Gretton-Dann
> Linaro Toolchain Working Group
> matthew.gretton-d...@linaro.org
>
_______________________________________________
linaro-toolchain mailing list
linaro-toolchain@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-toolchain
