----- Original Message -----
> From: "Stephen Canon" <[email protected]>
> To: "Chandler Carruth" <[email protected]>
> Cc: [email protected], "Owen Anderson" <[email protected]>,
> "Hal Finkel" <[email protected]>, "llvm cfe" <[email protected]>
> Sent: Saturday, October 18, 2014 1:41:02 AM
> Subject: Re: [PATCH] [complex] Teach the complex math IR gen to emit direct
> math and a NaN-test prior to the call to the library function.
>
> On Oct 18, 2014, at 3:04 AM, Chandler Carruth < [email protected] > wrote:
>
> On Fri, Oct 17, 2014 at 2:23 AM, Steve Canon < [email protected] > wrote:
>
> Apologies for delay in looking at this, I'm on vacation this week.
>
> Not a problem. =]
>
> I don't love this approach because (a) it doesn't get us fully to
> where we want to be in performance, and (b) it's going to trash the
> floating-point flag state. The performance issue is that we still
> have two comparisons and one or two branches for every complex op
> outside of no-nans, and the flags issue is as follows:
>
> The intention of IEEE-754 is that anything that is conceptually a
> single "operation" should raise at most one of divide-by-zero,
> invalid, overflow, or underflow. A complex multiplication
> implemented with lazy checking may cause two of these to be raised:
>
> (tiny, huge) * (tiny, huge) --> underflow + overflow
> (0, huge) * (inf, huge) --> invalid + overflow
>
> My preferred approach would be to implement limited-range semantics
> as an option (via either pragma or flag), and have it implied by
> fast-math.
>
> I don't really understand what you want here.
>
> In the case of fast-math, the comparisons should vanish and I think
> we're left with a minimal amount of math. If there is some more
> minimal way to compute the result in the case of fast-math, please
> let me know?
>
> I agree; what you have is perfectly fine for fast-math, and should
> generate fast code.
>
> In the case of *not* having fast-math and needing to be correct, I'm
> just not in a position to come up with a more efficient but still
> numerically correct implementation. I have no idea how to do it. And
> I'm not really willing to sign up to do it because I don't have the
> time. =/ I don't think that hoping for a future better world should
> obstruct getting this into the tree as it (to the extent I'm aware)
> is a strict improvement on the status quo.
>
> What I'm saying is that in the long-term, we'd like to support two
> modes for these operations:
>
> limited-range: In this mode, we use the simple "usual" mathematical
> formulations for multiplication and division (no careful handling of
> overflow or underflow or invalid cases). This is like finite-math
> restricted to complex arithmetic expressions (in particular, we
> don't want to require users to enable finite-math to get this behavior;
> we may want this behavior to be the default).

Do you think that we'd be able to use limited-range as the default mode?

 -Hal

>
> no-limited-range: We unconditionally call to compiler-rt for complex
> mul and div operations, and make the compiler-rt implementations
> correct w.r.t. flags.
>
> The current state of affairs is similar to supporting only
> no-limited-range, except that the compiler-rt implementations may
> need to be fixed up (I'm happy to do that work). This patch puts us
> somewhere in between the two modes, which is a better place for most
> users, but still slightly worse than where I'd really like to be
> headed. My only real concern is of building up too much machinery
> that needs to be undone to get to the "really right" place.
>
> I'm not *so* concerned with this patch in particular. My comments are
> more of an effort to establish a record of where we'd like to be
> going with this stuff for future reference. LGTM.
>
> – Steve
>

--
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
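
As a rough illustration of the strategies discussed in this thread, the C
sketch below contrasts a limited-range multiply (the plain textbook formula)
with the approach named in the patch subject: do the direct math first and
call the library routine only if a NaN test on the result fails. The cplx
struct and the names mul_limited_range and mul_nan_checked are hypothetical
and exist only for this example. The sketch assumes the NaN test is "both
result parts are NaN", and it declares __muldc3 (the double-precision complex
multiply helper exported by compiler-rt and libgcc) by hand on the assumption
that one of those runtimes is linked. It is a source-level approximation, not
the IR the patch emits.

```c
#include <complex.h>
#include <math.h>

/* Runtime helper from compiler-rt / libgcc: (a + ib) * (c + id) with
   infinity/NaN recovery. Declared by hand here (assumption: one of
   those runtimes is linked into the program). */
double _Complex __muldc3(double a, double b, double c, double d);

/* Hypothetical value type for this sketch only. */
typedef struct {
    double re;
    double im;
} cplx;

/* limited-range: the "usual" formula, with no special handling of
   overflow, underflow, or invalid cases. */
cplx mul_limited_range(cplx x, cplx y)
{
    cplx r;
    r.re = x.re * y.re - x.im * y.im;
    r.im = x.re * y.im + x.im * y.re;
    return r;
}

/* In-between mode: direct math first, then a NaN test, then the
   library call only when both parts of the result are NaN. The two
   isnan() checks are the extra comparisons paid on every multiply. */
cplx mul_nan_checked(cplx x, cplx y)
{
    cplx r = mul_limited_range(x, y);
    if (isnan(r.re) && isnan(r.im)) {
        double _Complex z = __muldc3(x.re, x.im, y.re, y.im);
        r.re = creal(z);
        r.im = cimag(z);
    }
    return r;
}
```

For finite inputs such as (1, 2) * (3, 4) both functions return (-5, 10).
For (inf, inf) * (0, 1) the limited-range formula yields (NaN, NaN), while
the checked version falls through to the library call and returns
(-inf, +inf). Note that the direct math has already run by the time the
library call happens, so any floating-point flags it raised stay raised,
which is the flag-state concern described in the quoted message.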
