Re: Performance of floating point instructions

Laurent Desnogues Wed, 10 Mar 2010 12:02:07 -0800

On Wed, Mar 10, 2010 at 8:54 PM, Siarhei Siamashka
<siarhei.siamas...@gmail.com> wrote:
[...]
> I wonder why the compiler does not use real NEON instructions with -ffast-math
> option, it should be quite useful even for scalar code.
>
> something like:
>
> vld1.32  {d0[0]}, [r0]
> vadd.f32 d0, d0, d0
> vst1.32  {d0[0]}, [r0]
>
> instead of:
>
> flds     s0, [r0]
> fadds    s0, s0, s0
> fsts     s0, [r0]
>
> for:
>
> *float_ptr = *float_ptr + *float_ptr;
>
> At least NEON is pipelined and should be a lot faster on more complex code
> examples where it can actually benefit from pipelining. On x86, SSE2 is used
> quite nicely for floating point math.


Even if fast-math is known to break some rules, it only
breaks C rules IIRC.  OTOH, NEON FP is not fully
IEEE754 compliant: denormals are flushed to zero and
NaN handling is non-standard (default NaN only).

Anyway you're perhaps looking for -mfpu=neon, no?
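
For anyone who wants to check the IEEE754 point on their own target, here is
a minimal C sketch built around the statement from the quoted message. The
function names (double_in_place, keeps_subnormals) are made up for this
example and are not from any library. It only probes whether a subnormal
input survives the doubling; which instructions the compiler actually emits
(VFP or NEON) depends on the compiler version and on flags such as
-mfpu=neon and -ffast-math, so inspect the generated assembly to find out.

```c
#include <float.h>

/* The statement from the quoted message, wrapped in a function.
   The name double_in_place is hypothetical, chosen for this sketch. */
static void double_in_place(float *float_ptr)
{
    *float_ptr = *float_ptr + *float_ptr;
}

/* Returns 1 if doubling a subnormal value gives a nonzero result,
   i.e. the FP unit in use did NOT flush the subnormal to zero.
   Returns 0 if the value was flushed (flush-to-zero behaviour). */
static int keeps_subnormals(void)
{
    /* volatile keeps the compiler from folding the addition away */
    volatile float tiny = FLT_MIN / 2.0f;  /* subnormal under IEEE 754 */
    float x = tiny;

    double_in_place(&x);
    return x != 0.0f;
}
```

Calling double_in_place() on 1.5f should leave 3.0f in place on any target.
The result of keeps_subnormals() is the interesting part: a fully IEEE754
unit returns 1, while a unit running in flush-to-zero mode returns 0.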


Laurent
_______________________________________________
maemo-developers mailing list
maemo-developers@maemo.org
https://lists.maemo.org/mailman/listinfo/maemo-developers
