Re: [PATCH] Optimise the fpclassify builtin to perform integer operations when possible

Richard Earnshaw (lists) Wed, 21 Sep 2016 07:50:04 -0700

On 13/09/16 12:35, Wilco Dijkstra wrote:
> Jakub wrote:
>> On Mon, Sep 12, 2016 at 04:19:32PM +0000, Tamar Christina wrote:
>>> This patch adds an optimized route to the fpclassify builtin
>>> for floating point numbers which are similar to IEEE-754 in format.
>>>
>>> The goal is to make it faster by:
>>> 1. Trying to determine the most common case first
>>>    (e.g. the float is a Normal number) and then the
>>>    rest. The amount of code generated at -O2 is
>>>    about the same +/- 1 instruction, but the code
>>>    is much better.
>>> 2. Using integer operations in the optimized path.
>>
>> Is it generally preferable to use integer operations for this instead
>> of floating point operations?  I mean various targets have quite high costs
>> of moving data in between the general purpose and floating point register
>> file, often it has to go through memory etc.
> 
> It is generally preferable indeed - there was a *very* long discussion about
> integer vs FP on the GLIBC mailing list when I updated math.h to use the GCC
> builtins a while back (the GLIBC implementation used a non-inlined unoptimized
> integer implementation, so an inlined FP implementation seemed a good
> intermediate solution).
> 
> Integer operations are generally lower latency and enable bit manipulation
> tricks like the fast early exit. The FP version requires execution of 5
> branches for a "normal" FP value and loads several floating point immediates.
> There are also many targets with emulated floating point types, so 5 calls to
> the comparison lib function would be seriously slow.
> Note that using so many FP comparisons is not just slow, it is also incorrect
> for signalling NaNs, so this patch also fixes bug 66462 for fpclassify.
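
To make the "fast early exit" concrete, here is a minimal sketch of an
integer-only classification for an IEEE-754 binary64 double. It is an
illustration, not the code from the patch: the function name fpclassify_int
is hypothetical, and it assumes double is a 64-bit IEEE-754 type with the
usual 1/11/52 sign/exponent/mantissa layout.

```c
#include <math.h>    /* FP_NORMAL, FP_ZERO, FP_SUBNORMAL, FP_INFINITE, FP_NAN */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not the patch's implementation): classify a double
   using only integer operations. Assumes IEEE-754 binary64. */
static int fpclassify_int(double x)
{
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);            /* well-defined type pun */

    uint64_t exp  = (bits >> 52) & 0x7ff;      /* 11-bit biased exponent */
    uint64_t mant = bits & 0x000fffffffffffffULL;  /* 52-bit mantissa */

    /* Fast early exit: exp - 1 wraps to a huge unsigned value when exp == 0,
       so one unsigned compare accepts exactly exp in [1, 0x7fe]. */
    if (exp - 1 < 0x7fe)
        return FP_NORMAL;

    /* Only exp == 0 and exp == 0x7ff reach this point. */
    if (exp == 0)
        return mant == 0 ? FP_ZERO : FP_SUBNORMAL;
    return mant == 0 ? FP_INFINITE : FP_NAN;
}
```

In this sketch a normal value is classified after a single branch, and no
floating-point comparison is ever executed on the value, which is the
property that matters for signalling NaNs.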


And don't forget that getting the results of a floating-point comparison
back to the branch unit may be no faster than transferring the value in
the first place.

R.

> 
> I would suggest someone with access to a machine with slow FP moves (POWER?)
> to benchmark this using the fpclassify test
> (glibc/benchtests/bench-math-inlines.c) so we know for sure.
> 
> Wilco
>
