Hans, the info about NEON is relevant for armv7 (Beagleboard,
Cubieboard, PengPod...). But Raspberry Pi doesn't have NEON. Float
processing is done on the vfpv2 coprocessor. As far as I can see, vfpv2
has hardly any SIMD instructions (except for moving data between ARM
and vfp). It is said to process a maximum of 8 single precision floats
in parallel, but Raspberry Pi shows no sign of profiting from data
alignment, at least not when the code is compiled with gcc.
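
On the RunFast question from my earlier mail (quoted below): since gcc
doesn't seem to offer a switch for it, one could try setting the mode by
hand from C. What follows is only a sketch, not tested on the Pi. It
assumes that RunFast means flush-to-zero (FZ, bit 24 of FPSCR) plus
default-NaN (DN, bit 25) with the exception trap enable bits cleared,
and that the build actually uses the VFP. The names enable_runfast()
and denormals_survive() are made up for this example; they are not
part of Pd or gcc.

```c
#include <float.h>
#include <stdint.h>

/* Hypothetical helper: try to put the VFP into RunFast mode.
   Returns 1 if FPSCR was written, 0 if this build has no VFP to
   talk to (x86, soft-float, ...), in which case nothing is changed. */
static int enable_runfast(void)
{
#if defined(__arm__) && defined(__VFP_FP__) && !defined(__SOFTFP__)
    uint32_t fpscr;
    __asm__ volatile ("fmrx %0, fpscr" : "=r" (fpscr));  /* read FPSCR */
    fpscr |= (1u << 24) | (1u << 25);  /* assumed: FZ = bit 24, DN = bit 25 */
    fpscr &= ~0x00009f00u;             /* assumed: trap enables, bits 8-12 and 15 */
    __asm__ volatile ("fmxr fpscr, %0" : : "r" (fpscr)); /* write it back */
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if halving the smallest normal float still gives a
   nonzero (subnormal) value, 0 if the result is flushed to zero.
   'volatile' keeps the compiler from folding this at compile time. */
static int denormals_survive(void)
{
    volatile float x = FLT_MIN;
    x = x / 2.0f;
    return x != 0.0f;
}
```

The idea would be to call enable_runfast() once at the start of the
thread that does the dsp (as far as I understand, FPSCR is kept per
thread) and then use denormals_survive() to see whether it had any
effect: it should return 1 before the call and 0 afterwards if the
mode was really switched.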

Katja


On Sun, Jan 20, 2013 at 5:12 PM, Hans-Christoph Steiner <[email protected]> wrote:
>
> I think this is what you want, from 'man gcc'.  It's interesting to note that
> the NEON mode, which provides SIMD, also does not do denormals:
>
> -mfpu=name
> -mfpe=number
> -mfp=number
>     This specifies what floating point hardware (or hardware emulation) is
>     available on the target.  Permissible names are: fpa, fpe2, fpe3, maverick,
>     vfp, vfpv3, vfpv3-fp16, vfpv3-d16, vfpv3-d16-fp16, vfpv3xd, vfpv3xd-fp16,
>     neon, neon-fp16, vfpv4, vfpv4-d16, fpv4-sp-d16 and neon-vfpv4.  -mfp and
>     -mfpe are synonyms for -mfpu=fpenumber, for compatibility with older
>     versions of GCC.
>
>     If -msoft-float is specified this specifies the format of floating point
>     values.
>
>     If the selected floating-point hardware includes the NEON extension (e.g.
>     -mfpu=neon), note that floating-point operations will not be used by GCC's
>     auto-vectorization pass unless -funsafe-math-optimizations is also
>     specified.  This is because NEON hardware does not fully implement the IEEE
>     754 standard for floating-point arithmetic (in particular denormal values
>     are treated as zero), so the use of NEON instructions may lead to a loss of
>     precision.
>
>
> .hc
>
> On 01/20/2013 06:54 AM, katja wrote:
>> I was assuming, or maybe just hoping, that Raspberry Pi (and ARM
>> devices in general) would not suffer from denormals disease like
>> Intel processors do. But guess what: Pi's float coprocessor is IEEE
>> 754 compliant and handles all denormals by default (you can check with
>> the attached denorm-test.pd). Bummer! As if one would use an ARM device
>> to calculate the size of a Majorana particle, rather than doing simple
>> dsp. Do we really need to enable PD_BIGORSMALL() checks for this poor
>> little processor? There seems to be something called 'RunFast mode'
>> for Pi's float processor vfpv2, but I see no way to enable this
>> via gcc. Option -ffast-math is allowed but doesn't do the trick. I can't
>> find an option to set vfpv2 specifically in the gcc docs.
>>
>> Katja
>>
>>
>>
>> _______________________________________________
>> [email protected] mailing list
>> UNSUBSCRIBE and account-management -> 
>> http://lists.puredata.info/listinfo/pd-list
>>