On Oct 18, 2014, at 3:04 AM, Chandler Carruth <[email protected]> wrote:

> On Fri, Oct 17, 2014 at 2:23 AM, Steve Canon <[email protected]> wrote:
> > Apologies for delay in looking at this, I'm on vacation this week.
> 
> Not a problem. =] 
> 
> > I don't love this approach because (a) it doesn't get us fully to where we 
> > want to be in performance, and (b) it's going to trash the floating-point 
> > flag state.  The performance issue is that we still have two comparisons and 
> > one or two branches for every complex op outside of no-nans, and the flags 
> > issue is as follows:
> > 
> > The intention of IEEE-754 is that anything that is conceptually a single 
> > "operation" should raise at most one of divide-by-zero, invalid, overflow, or 
> > underflow.  A complex multiplication implemented with lazy checking may cause 
> > two of these to be raised:
> > 
> >     (tiny, huge) * (tiny, huge) --> underflow + overflow
> >     (0, huge) * (inf, huge) --> invalid + overflow, no flags
> > 
> > My preferred approach would be to implement limited-range semantics as an 
> > option (via either pragma or flag), and have it implied by fast-math.
> 
> I don't really understand what you want here.
> 
> In the case of fast-math, the comparisons should vanish and I think we're 
> left with a minimal amount of math. If there is some more minimal way to 
> compute the result in the case of fast-math, please let me know?

I agree; what you have is perfectly fine for fast-math, and should generate 
fast code.

> In the case of *not* having fast-math and needing to be correct, I'm just not 
> in a position to come up with a more efficient but still numerically correct 
> implementation. I have no idea how to do it. And I'm not really willing to 
> sign up to do it because I don't have the time. =/ I don't think that hoping 
> for a future better world should obstruct getting this into the tree as it 
> (to the extent I'm aware) is a strict improvement on the status quo.

What I'm saying is that in the long-term, we'd like to support two modes for 
these operations:

limited-range: In this mode, we use the simple "usual" mathematical 
formulations for multiplication and division (no careful handling of overflow 
or underflow or invalid cases).  This is like finite-math restricted to complex 
arithmetic expressions (in particular, we don't want to require users to enable 
finite-math to get this behavior; we may want this behavior to be the default).

no-limited-range: We unconditionally call into compiler-rt for complex mul and 
div operations, and make the compiler-rt implementations correct w.r.t. flags.
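
To make the two modes concrete for multiplication, here is a rough sketch.
Everything in it is illustrative: the cplx pair type and the names
mul_limited_range, mul_full, box, and nan_to_zero are made up for this
example, and mul_full is only a stand-in modelled on the recovery example in
C99 Annex G, not the actual compiler-rt source.  It also says nothing about
flags.

```c
#include <math.h>

/* Hypothetical pair type; real code would use double _Complex. */
typedef struct { double re, im; } cplx;

/* limited-range: the textbook formula, with no special-case handling. */
cplx mul_limited_range(cplx x, cplx y) {
    cplx r = { x.re * y.re - x.im * y.im,
               x.re * y.im + x.im * y.re };
    return r;
}

/* An infinity becomes +/-1; anything else becomes a zero of the same sign. */
static double box(double v) {
    if (isinf(v)) return v > 0 ? 1.0 : -1.0;
    return signbit(v) ? -0.0 : 0.0;
}

/* A NaN becomes a zero of the same sign; other values pass through. */
static double nan_to_zero(double v) {
    if (!isnan(v)) return v;
    return signbit(v) ? -0.0 : 0.0;
}

/* no-limited-range stand-in: textbook product plus Annex G style recovery
   of infinities that the textbook formula turned into NaN + NaN*i. */
cplx mul_full(cplx x, cplx y) {
    double a = x.re, b = x.im, c = y.re, d = y.im;
    double ac = a * c, bd = b * d, ad = a * d, bc = b * c;
    cplx r = { ac - bd, ad + bc };
    if (!(isnan(r.re) && isnan(r.im)))
        return r;                          /* common case: nothing to fix */

    int recalc = 0;
    if (isinf(a) || isinf(b)) {            /* x is infinite */
        a = box(a); b = box(b);
        c = nan_to_zero(c); d = nan_to_zero(d);
        recalc = 1;
    }
    if (isinf(c) || isinf(d)) {            /* y is infinite */
        c = box(c); d = box(d);
        a = nan_to_zero(a); b = nan_to_zero(b);
        recalc = 1;
    }
    if (!recalc && (isinf(ac) || isinf(bd) || isinf(ad) || isinf(bc))) {
        /* a partial product overflowed */
        a = nan_to_zero(a); b = nan_to_zero(b);
        c = nan_to_zero(c); d = nan_to_zero(d);
        recalc = 1;
    }
    if (recalc) {
        r.re = INFINITY * (a * c - b * d);
        r.im = INFINITY * (a * d + b * c);
    }
    return r;
}
```

For example, (inf, NaN) * (1, 1) comes out as (NaN, NaN) from
mul_limited_range, while mul_full recovers an infinite result.  For ordinary
finite inputs the two agree.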

The current state of affairs is similar to supporting only no-limited-range, 
except that the compiler-rt implementations may need to be fixed up (I'm happy 
to do that work).  This patch puts us somewhere in between the two modes, which 
is a better place for most users, but still slightly worse than where I'd 
really like to be headed.  My only real concern is of building up too much 
machinery that needs to be undone to get to the "really right" place.

I'm not *so* concerned with this patch in particular.  My comments are more of 
an effort to establish a record of where we'd like to be going with this stuff 
for future reference.  LGTM.

– Steve

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
