On 13/09/16 12:35, Wilco Dijkstra wrote:
> Jakub wrote:
>> On Mon, Sep 12, 2016 at 04:19:32PM +0000, Tamar Christina wrote:
>>> This patch adds an optimized route to the fpclassify builtin
>>> for floating point numbers which are similar to IEEE-754 in format.
>>> The goal is to make it faster by:
>>> 1. Trying to determine the most common case first
>>> (e.g. the float is a Normal number) and then the
>>> rest. The amount of code generated at -O2 is
>>> about the same +/- 1 instruction, but the code
>>> is much better.
>>> 2. Using integer operations in the optimized path.
>> Is it generally preferable to use integer operations for this instead
>> of floating point operations? I mean various targets have quite high costs
>> of moving data in between the general purpose and floating point register
>> file, often it has to go through memory etc.
> It is generally preferable indeed - there was a *very* long discussion about
> integer vs FP on the GLIBC mailing list when I updated math.h to use the GCC
> builtins a while back (the GLIBC implementation used a non-inlined unoptimized
> integer implementation, so an inlined FP implementation seemed a good
> intermediate solution).
> Integer operations are generally lower latency and enable bit manipulation
> tricks like the fast early exit. The FP version requires execution of 5
> branches for a "normal" FP value and loads several floating point immediates.
> There are also many targets with emulated floating point types, so 5 calls to
> the comparison lib function would be seriously slow.
> Note that using so many FP comparisons is not just slow, it is also incorrect
> for signalling NaNs, so this patch also fixes bug 66462 for fpclassify.
And don't forget that getting the results of a floating-point comparison
back to the branch unit may be no faster than transferring the value in
the first place.
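
To make the integer approach concrete, here is a minimal sketch of the kind
of classification being discussed. It is not the code from the patch:
classify_f32 is a hypothetical name, and it assumes float is IEEE-754
binary32. The single unsigned comparison on the exponent is the early exit
for the common "normal" case, and no FP comparison is executed at all.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assumption: float is IEEE-754 binary32 (1 sign, 8 exponent, 23 mantissa bits). */
static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Hypothetical helper for illustration only, not the patch's implementation. */
static int classify_f32(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);          /* FP -> integer move, no aliasing UB */

    uint32_t exp  = (bits >> 23) & 0xFFu;    /* biased exponent, sign discarded */
    uint32_t mant = bits & 0x7FFFFFu;        /* fraction field */

    /* Early exit: exp in 1..254 means normal. Subtracting 1 wraps exp == 0
       to UINT32_MAX, so one unsigned compare rejects both 0 and 255. */
    if (exp - 1u < 254u)
        return FP_NORMAL;

    if (exp == 0)
        return mant ? FP_SUBNORMAL : FP_ZERO;

    /* exp == 255: infinity or NaN (quiet or signalling, no exception raised) */
    return mant ? FP_NAN : FP_INFINITE;
}
```

Whether this beats the FP sequence on a given target still depends on the
cost of the move into a general purpose register, which is the open question
here.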
> I would suggest that someone with access to a machine with slow FP moves
> (POWER?) benchmark this using the fpclassify test so we know for sure.