On Fri, Oct 17, 2014 at 2:23 AM, Steve Canon <[email protected]> wrote:
> Apologies for delay in looking at this, I'm on vacation this week.
>

Not a problem. =]

> I don't love this approach because (a) it doesn't get us fully to where we
> want to be in performance, and (b) it's going to trash the floating-point
> flag state.  The performance issue is that we still have two comparisons
> and one or two branches for every complex op outside of no-nans, and the
> flags issue is as follows:
>
> The intention of IEEE-754 is that anything that is conceptually a single
> "operation" should raise at most one of divide-by-zero, invalid, overflow,
> or underflow.  A complex multiplication implemented with lazy checking
> may cause two of these to be raised:
>
>     (tiny, huge) * (tiny, huge) --> underflow + overflow
>     (0, huge) * (inf, huge) --> invalid + overflow, no flags
>
> My preferred approach would be to implement limited-range semantics as an
> option (via either pragma or flag), and have it implied by fast-math.
>

I don't really understand what you want here. In the case of fast-math,
the comparisons should vanish and I think we're left with a minimal amount
of math. If there is some more minimal way to compute the result in the
case of fast-math, please let me know.

In the case of *not* having fast-math and needing to be correct, I'm just
not in a position to come up with a more efficient but still numerically
correct implementation. I have no idea how to do it. And I'm not really
willing to sign up to do it because I don't have the time. =/

I don't think that hoping for a future better world should obstruct
getting this into the tree, as it is (to the extent I'm aware) a strict
improvement on the status quo.

>
> Now, all that being said, I haven't checked if today's compiler-rt
> implementations are even correct w.r.t. flags in this sense,
>

So, the code I am generating here is *exactly* the code we have in
compiler-rt. I don't know the first thing about actually implementing this
stuff and am completely leveraging the compiler-rt implementation. I'm
also not a numerics expert and not setting out to improve that
implementation, but if you or anyone else has a better implementation,
I'm all ears.

> so it's not immediately obvious that this change makes anything worse
> today, and it will address //some// of the performance concerns of the
> earlier patch.
>

I'm pretty sure this is essentially just inlining the code from
compiler-rt around the call to the library function. =]

> It just seems contrary to the direction that we really want to be going in
> the longer-term w.r.t. numerical correctness.
>

I don't really know why this is less *correct*... but I'll take your word
on it. However, I also think that this future you're describing is
somewhat hypothetical really. Is there any hope of getting there? Is
anyone working on it?
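
For anyone following along, the lazy check under discussion has roughly the shape sketched below in C. This is an illustration only, not the patch and not the compiler-rt source: `mul_lazy` and `make_complex` are hypothetical names, and the slow path simply reuses the compiler's own complex multiply (which GCC and Clang typically lower to a runtime routine such as `__muldc3` when fast-math is off) as a stand-in for the library call.

```c
#include <complex.h>
#include <math.h>

/* Hypothetical helper: build a complex value from its parts without doing
   any arithmetic, so infinities and NaNs are stored exactly as given.
   Relies on the C99 guarantee that a complex type has the same layout as
   an array of two of the corresponding real type. */
static double complex make_complex(double re, double im) {
    double complex z;
    ((double *)&z)[0] = re;
    ((double *)&z)[1] = im;
    return z;
}

/* Hypothetical sketch of the lazy scheme: compute (a + bi) * (c + di) with
   the naive formula, and only take the slow, fully checked path when both
   parts of the naive result are NaN. */
static double complex mul_lazy(double a, double b, double c, double d) {
    double re = a * c - b * d;
    double im = a * d + b * c;

    /* The two comparisons paid on every multiply outside of no-nans mode. */
    if (isnan(re) && isnan(im)) {
        /* Stand-in for the library call: let the compiler's complex
           multiply apply the C99 Annex G recovery rules. */
        return make_complex(a, b) * make_complex(c, d);
    }
    return make_complex(re, im);
}
```

With this sketch, `mul_lazy(1.0, 2.0, 3.0, 4.0)` stays on the fast path and yields -5 + 10i, while `mul_lazy(INFINITY, NAN, 1.0, 1.0)` produces NaN in both naive parts and so goes through the slow path, which under the Annex G rules should recover an infinite result. Note that the naive products are computed before any check, which is where the extra flag traffic described above would come from.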
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
