Re: [PATCH] [complex] Teach the complex math IR gen to emit direct math and a NaN-test prior to the call to the library function.

Stephen Canon Mon, 20 Oct 2014 03:21:05 -0700

On Oct 19, 2014, at 9:36 PM, Hal Finkel <[email protected]> wrote:

> ----- Original Message -----
>> From: "Steve Canon" <[email protected]>
>> To: [email protected], [email protected], [email protected], [email protected]
>> Cc: [email protected]
>> Sent: Friday, October 17, 2014 4:23:28 AM
>> Subject: Re: [PATCH] [complex] Teach the complex math IR gen to emit direct 
>> math and a NaN-test prior to the call to
>> the library function.
>> 
>> Apologies for delay in looking at this, I'm on vacation this week.
>> 
>> I don't love this approach because (a) it doesn't get us fully to
>> where we want to be in performance, and (b) it's going to trash the
>> floating-point flag state.  The performance issue is that we still
>> have two comparisons and one or two branches for every complex op
>> outside of no-nans, and the flags issue is as follows:
>> 
>> The intention of IEEE-754 is that anything that is conceptually a
>> single "operation" should raise at most one of divide-by-zero,
>> invalid, overflow, or underflow.  A complex multiplication
>> implemented with lazy checking may cause two of these to be raised:
>> 
>>    (tiny, huge) * (tiny, huge) --> underflow + overflow
>>    (0, huge) * (inf, huge) --> invalid + overflow, no flags
> 
> Thinking about this, this can only matter if we actually permit access to the 
> FP environment, which we currently don't. So, if we were to ever allow 
> "#pragma STDC FENV_ACCESS on", then we'd want to disable this optimization. 
> But for now this is irrelevant (at least from the C perspective). Is this 
> right?


That's basically correct, though it's a bit strong to say that we don't permit 
access to the FP environment.  More accurately, we don't make the necessary 
ordering guarantees to support FENV_ACCESS, but we also don't (generally) 
deliberately trash the flag state; we allow it to be accessed, and the result 
of accessing it should (generally) be correct up to sequencing issues.  Getting 
the complex ops right is a necessary step to supporting FENV_ACCESS someday if 
we want to, and does confer *some* benefit now, even without FENV_ACCESS 
support.

Basically, I don't want things to get dramatically worse than they are.  We 
shouldn't e.g. introduce type conversion sequences that get the flags wrong 
because "we don't support FENV_ACCESS".  If we hold that line, then it at least 
will remain feasible for someone to implement FENV_ACCESS.  If they also have 
to fix all the lowerings and all the libcalls, it starts to become pretty 
daunting.

– Steve

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits