On Fri, 15 Nov 2013, H.J. Lu wrote:

> Hi,
>
> float.h has
>
> /* Addition rounds to 0: zero, 1: nearest, 2: +inf, 3: -inf, -1: unknown. */
> /* ??? This is supposed to change with calls to fesetround in <fenv.h>. */
> #undef FLT_ROUNDS
> #define FLT_ROUNDS 1
>
> Clang introduces __builtin_flt_rounds and
>
> #define FLT_ROUNDS (__builtin_flt_rounds())
>
> I am not sure if it is the correct approach. Is there any plan to
> address this in GCC and glibc?

This is GCC bug 30569. It's one of the more straightforward of the
various floating-point conformance issues: it is fixable with more or
less self-contained local changes, whereas the general issues with
exceptions and rounding-mode support (I don't know if you are
interested in those) would need much more complicated and wide-ranging
changes, and much more initial design work.

FLT_ROUNDS can't involve a call to a libm function - or to any function
outside the reserved and C90 namespaces (it can't call fegetround, even
if a particular system has that in libc, because FLT_ROUNDS is in C90
and fegetround is in the user's namespace for C90). So GCC needs to
expand it inline (the expansion might involve a call to a
reserved-namespace library function). Given that it is expanded inline,
a built-in function __builtin_flt_rounds that returns an int with the
correct value is the natural interface.

The following are my thoughts about how this might be implemented.

First, there seems no point in "optimizing" it to 1 in the
-fno-rounding-math (default) case; if people use FLT_ROUNDS they'll
expect accurate information even if not using -frounding-math.

Second, the default for targets not providing the relevant facilities
will of course be to return 1.

Third, one might imagine expanding this either through a
flt_rounds<mode> insn pattern, or through a target hook to expand
earlier to trees or GIMPLE. My inclination is the latter.

My reasoning is: a typical __builtin_flt_rounds implementation would
probably use an appropriate instruction to access a floating-point
control register, mask out two bits from that register, and then have a
switch statement to map the two bits to the values specified for
FLT_ROUNDS. (You can easily enough do arbitrary permutations of 0-3
without a switch, but a switch is what it is in logical terms.) A
typical user would probably be doing "if (FLT_ROUNDS == N)" or
"switch (FLT_ROUNDS)".

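To make the mask-and-switch mapping concrete, here is a minimal sketch
in C. It is only an illustration, not GCC's implementation: the function
and macro names are hypothetical, and the bit layout is assumed to be
that of the two-bit rounding-control field of the x86 x87 control word
(bits 10-11). The register value is passed in as an argument rather than
read from hardware, so the mapping can be checked in isolation.

```c
/* Assumed encoding (x87 control word, RC field in bits 10-11):
   0 = to nearest, 1 = toward -inf, 2 = toward +inf, 3 = toward zero.
   Other targets use different positions and encodings.  */
#define HYPOTHETICAL_RC_SHIFT 10
#define HYPOTHETICAL_RC_MASK  0x3u

/* Hypothetical helper: map a raw control-register value to the
   FLT_ROUNDS convention (0: zero, 1: nearest, 2: +inf, 3: -inf).  */
static int
flt_rounds_from_control_word (unsigned int cw)
{
  switch ((cw >> HYPOTHETICAL_RC_SHIFT) & HYPOTHETICAL_RC_MASK)
    {
    case 0:
      return 1;   /* to nearest */
    case 1:
      return 3;   /* toward -inf */
    case 2:
      return 2;   /* toward +inf */
    default:
      return 0;   /* field value 3: toward zero */
    }
}
```

With this assumed encoding, the usual x87 default control word 0x037F
has a rounding-control field of 0 and so maps to 1 (to nearest).
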
If the mapping from hardware bits to FLT_ROUNDS values is represented
as a switch at GIMPLE level, the GIMPLE optimizers should be able to
eliminate the conversion to FLT_ROUNDS convention and turn things into
a simple switch on the masked register value. (NB I haven't tested
that - but if they don't, it's a clear missed optimization of more
general use.) Since the GIMPLE level is where such optimizations
generally take place in GCC, it's best to represent the conversion to
FLT_ROUNDS convention at the GIMPLE level.

Thus, I'd imagine a hook or hooks would specify (a) an
architecture-specific built-in function to get the floating-point
control register value (or appropriate code other than a built-in
function call), (b) a mask to apply to that value, (c) the resulting
values in the register for each rounding mode. And generic code would
take care of generating the switch to convert from machine values to
FLT_ROUNDS values. (And if the mapping happens to be the identity map,
it could avoid generating the switch. That accommodates architectures
that for any reason need to do their own expansion not based on
extracting two bits and using a switch.)

Cases needing something special include powerpc-linux-gnu soft-float
where the rounding mode is in libc rather than hardware. Once
<https://sourceware.org/ml/libc-alpha/2013-11/msg00180.html> and
<https://sourceware.org/ml/libc-alpha/2013-11/msg00189.html> are
reviewed I intend to follow up to those by adding __flt_rounds,
__atomic_feholdexcept, __atomic_feclearexcept and __atomic_feupdateenv
functions to powerpc-linux-gnu soft-float glibc for GCC to use in
expansion of FLT_ROUNDS and atomic compound assignment. Any hooks
should accommodate targets wishing in some cases just to generate their
own libc call like this.

-- 
Joseph S. Myers
jos...@codesourcery.com