Apologies for the delay in looking at this; I'm on vacation this week.
I don't love this approach because (a) it doesn't get us fully to where we want
to be in performance, and (b) it's going to trash the floating-point flag
state. The performance issue is that, outside of no-nans mode, we still have
two comparisons and one or two branches for every complex op. The flags issue
is as follows:
The intention of IEEE-754 is that anything that is conceptually a single
"operation" should raise at most one of divide-by-zero, invalid, overflow, or
underflow. A complex multiplication implemented with lazy checking may cause
two of these to be raised:
(tiny, huge) * (tiny, huge) --> underflow + overflow
(0, huge) * (inf, huge) --> invalid + overflow, no flags
My preferred approach would be to implement limited-range semantics as an
option (via either pragma or flag), and have it implied by fast-math.
Now, all that being said, I haven't checked if today's compiler-rt
implementations are even correct w.r.t. flags in this sense, so it's not
immediately obvious that this change makes anything worse today, and it will
address //some// of the performance concerns of the earlier patch. It just
seems contrary to the direction that we really want to be going in over the
longer term w.r.t. numerical correctness.
http://reviews.llvm.org/D5756
_______________________________________________
cfe-commits mailing list
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits