andrew.w.kaylor added a comment.

In https://reviews.llvm.org/D53157#1302724, @uweigand wrote:

> A couple of comments on the previous discussion:
>
> 1. Instead of defining a new command line option, I'd prefer to use the 
> existing options -frounding-math and -ftrapping-math to set the default 
> behavior of math operations w.r.t. rounding modes and exception status.  (For 
> compatibility with GCC if nothing else.)


I agree that it's preferable to re-use these existing options if possible. I 
have some concerns that -ftrapping-math has a partial implementation in place 
that doesn't seem to be well aligned with the way fast-math flags are handled, 
so it might take some work to get it behaving as expected without breaking 
existing users. In general, though, these options seem like they should do what 
we need.

Regarding GCC compatibility, I notice that GCC defaults to trapping math being 
enabled and I don't think that's what we want with clang. It also seems to 
imply something more than I think we need for constrained handling. For 
example, the GCC documentation says that -fno-trapping-math "can result in 
incorrect output for programs that depend on an exact implementation of IEEE or 
ISO rules/specifications for math functions", so it sounds like it may also 
imply (for GCC) something like LLVM's "afn" fast-math flag.

So if we are going to use these options, I think we need to have a discussion 
about whether or not it's OK to diverge from GCC's interpretation of them.

In https://reviews.llvm.org/D53157#1302724, @uweigand wrote:

> 2. I also read the C standard to imply that it is a requirement of **user 
> code** to reset the status flag to default before switching back to 
> FENV_ACCESS OFF.  The fundamental characterization of the pragma says "The 
> FENV_ACCESS pragma provides a means **to inform the implementation** when a 
> program might access the floating-point environment to test floating-point 
> status flags or run under non-default floating-point control modes."  There 
> is no mention anywhere that using the pragma, on its own, will ever 
> **change** those control modes.   The last sentence about "... the 
> floating-point control modes have their default setting", while indeed a bit 
> ambiguous, is still consistent with an interpretation that it is the 
> responsibility of user code to ensure that state, there is no explicit 
> statement that the implementation will do so.


I definitely agree with this interpretation of the standard. My understanding 
is that behavior is undefined if the user has not left the FP environment in 
the default state when transitioning to an FENV_ACCESS OFF region.

In https://reviews.llvm.org/D53157#1302724, @uweigand wrote:

> 3. I agree that we need to be careful about intermixing "normal" 
> floating-point operations with strict ones.  However, I'm still not convinced 
> that the pragma itself must be the scheduling barrier.  It seems to me that 
> the compiler already knows where FP control flags are ever modified directly 
> (this can only happen with intrinsics or the like), so the main issue is 
> whether function calls need to be considered.  This is where the pragma comes 
> in: in my mind, the primary difference between FENV_ACCESS ON and FENV_ACCESS 
> OFF regions is that where the pragma is ON, function calls need to be 
> considered (unless otherwise known for sure) to access FP control flags, 
> while where the pragma is OFF, function calls can be considered to never 
> touch FP control flags.  So the real scheduling barrier would be any 
> **function call within a FENV_ACCESS ON region**.  Those would have to be 
> marked by the front-end in the IR, presumably using a function attribute.  
> The common LLVM optimizers would then need to respect that scheduling barrier 
> (here is where we likely still have an open issue, there doesn't appear to be 
> any way to express that at the IR level for regular floating-point operations 
> ...), and likewise the back-ends (but that looks straightforward: a back-end 
> typically will model FP status as residing in a register or in a 
> pseudo-memory slot, and those can simply be considered used/clobbered by 
> function calls marked as within FENV_ACCESS ON regions).


I'm a bit confused by this. The constrained intrinsics will cause all calls to 
act as barriers to motion of the FP operations represented by the intrinsics 
(at least before instruction selection). So it's not clear to me what you are 
saying is needed here.


https://reviews.llvm.org/D53157


