> On Feb 18, 2015, at 3:23 AM, Joseph Myers <jos...@codesourcery.com> wrote:
> 
>> On Tue, 17 Feb 2015, Kenneth Zadeck wrote:
>> 
>> The fp exceptions raise some very tricky issues with respect to gcc and 
>> optimization.  On many machines, noisy does not mean to throw an 
>> exception, it means that you set a bit and then check later.  If you try 
>> to model this kind of behavior in gcc, you end up pinning the code so 
>> that nothing can be moved or reordered.
> 
> When I say exception here, I'm always referring to that flag bit setting, 
> not to processor-level exceptions.  In IEEE 754 terms, an exception is 
> *signaled*, and the default exception handling is to *raise* a flag and 
> deliver a default result (except for exact underflow which doesn't raise 
> the flag).
> 
> To quote Annex F, "This specification does not require support for trap 
> handlers that maintain information about the order or count of 
> floating-point exceptions. Therefore, between function calls, 
> floating-point exceptions need not be precise: the actual order and number 
> of occurrences of floating-point exceptions (> 1) may vary from what the 
> source code expresses.".  So it is not necessary to be concerned about 
> configurations where trap handlers may be called.
> 
> There is as yet no public draft of TS 18661-5 (Supplementary attributes).  
> That will provide C bindings for alternate exception handling as described 
> in IEEE 754-2008 clause 8.  I suspect such bindings will not readily be 
> efficiently implementable using processor-level exception handlers; SIGFPE 
> is an awkward interface for implementing such things at the C language 
> level, some processors do not support such trap handlers at all (e.g. many 
> ARM processors), and where traps are supported they may be asynchronous 
> rather than occurring immediately on execution of the relevant 
> instruction.  In addition, at least x86 does not support raising exception 
> flags without running trap handlers on the next floating-point instruction 
> (raiseFlags operation, fesetexcept in TS 18661-1); that is, if trap 
> handlers were used to implement standard functionality, it would need to 
> be in a way such that this x86 peculiarity is not visible.
my point here is that what you want to be able to do is freely reorder the fp 
operations (within the rules for reordering fp operations) between places where 
those bits are explicitly read or cleared.   we have no way to model that 
chain of modify operations in gcc.
> 
>> to get this right gcc needs something like a monotonic dependency which 
>> would allow reordering and gcc has nothing like this.  essentially, you 
>> need way to say that all of these insns modify the same variable, but 
>> they all just move the value in the same direction so you do not care 
>> what order the operations are performed in.  that does not mean that 
>> this could not be added but gcc has nothing like this.
> 
> Indeed, this is one of the things about defining the default mode that I 
> referred to; the present default is -ftrapping-math, but we may wish to 
> distinguish between strict trapping-math (whenever exception flags might 
> be tested / raised / lowered, exactly the computations specified by the 
> abstract machine have occurred, which might mean rather more limits on 
> code movement in the absence of monotonic dependencies) and loose trapping 
> math (like the present default; maybe don't transform expressions locally 
> in ways that add or remove exceptions, but don't treat an expression as 
> having side effects or reading global state purely because of possible 
> raising of floating-point exceptions).
> 
>> going back to the rounding modes issue, there is a huge range in the 
>> architectural implementation space.  you have a few that are pure 
>> dynamic, a few that are pure static and some in the middle that are just 
>> a mess.  a lot of machines would have liked to support fully static, but 
>> could not fit the bits to specify the rounding modes into the 
>> instruction.  my point here is you do need to at least have a plan that 
>> will support the full space even if you do this with a 1000 small 
>> patches.
> 
> I think the norm is dynamic, because that's what was in IEEE 754-1985, 
> with static rounding added more recently on some processors, because of 
> IEEE 754-2008.  (There are other variants - IA64 having multiple dynamic 
> rounding mode registers and allowing instructions to specify which one the 
> rounding mode is taken from.)
the first ieee standard only allowed the dynamic model.   the second allows the 
static model.   while dynamic is more common, there are/were architectures that 
are fully static.   i believe that the first SPARCs were fully static and this 
was why the standard changed (i could be completely wrong about which arch was 
the first fully static).  the private port that i am working on is currently 
fully static, but i am trying to change that.   code generation of a dynamic 
program on a fully static machine is gruesome. 

my point here is that there are fully static machines so do not do anything 
that precludes this.

also remember that constant prop on the rounding mode can be a win.   without 
knowing the rounding mode precisely, you cannot really do constant prop on the 
data.  also the constant prop on the rounding mode can let you avoid a lot of 
code which sets that register.   this can be important if the machine requires 
a cycle or two to settle that setting before the next fp operation.
> 
> -- 
> Joseph S. Myers
> jos...@codesourcery.com
