On Oct 19, 2014, at 9:36 PM, Hal Finkel <[email protected]> wrote:
> ----- Original Message -----
>> From: "Steve Canon" <[email protected]>
>> To: [email protected], [email protected], [email protected], [email protected]
>> Cc: [email protected]
>> Sent: Friday, October 17, 2014 4:23:28 AM
>> Subject: Re: [PATCH] [complex] Teach the complex math IR gen to emit direct
>> math and a NaN-test prior to the call to
>> the library function.
>>
>> Apologies for delay in looking at this, I'm on vacation this week.
>>
>> I don't love this approach because (a) it doesn't get us fully to
>> where we want to be in performance, and (b) it's going to trash the
>> floating-point flag state. The performance issue is that we still
>> have two comparisons and one or two branches for every complex op
>> outside of no-nans, and the flags issue is as follows:
>>
>> The intention of IEEE-754 is that anything that is conceptually a
>> single "operation" should raise at most one of divide-by-zero,
>> invalid, overflow, or underflow. A complex multiplication
>> implemented with lazy checking may cause two of these to be raised:
>>
>> (tiny, huge) * (tiny, huge) --> underflow + overflow
>> (0, huge) * (inf, huge) --> invalid + overflow, no flags
>
> Thinking about this, this can only matter if we actually permit access to the
> FP environment, which we currently don't. So, if we were to ever allow
> "#pragma STDC FENV_ACCESS on", then we'd want to disable this optimization.
> But for now this is irrelevant (at least from the C perspective). Is this
> right?

That's basically correct, though it's a bit strong to say that we don't permit
access to the FP environment. More accurately, we don't make the ordering
guarantees necessary to support FENV_ACCESS, but we also don't (generally)
deliberately trash the flag state. We allow it to be accessed, and the result
of accessing it should (generally) be correct up to sequencing issues. Getting
the complex ops right is a necessary step toward supporting FENV_ACCESS someday
if we want to, and it confers *some* benefit now, even without FENV_ACCESS
support.

Basically, I don't want things to get dramatically worse than they are. We
shouldn't e.g. introduce type conversion sequences that get the flags wrong
because "we don't support FENV_ACCESS". If we hold that line, then it at least
will remain feasible for someone to implement FENV_ACCESS. If they also have
to fix all the lowerings and all the libcalls, it starts to become pretty
daunting.
– Steve
_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits