[Michael Hudson]
>>>>> In what way does C99's fenv.h fail?  Is it just insufficiently
>>>>> available, or is there some conceptual lack?
[Tim Peters]
>>>> Just that it's not universally supported.  Look at fpectlmodule.c for
>>>> a sample of the wildly different ways it _is_ spelled across some
>>>> platforms.

[Michael]
>>> C'mon, fpectlmodule.c is _old_.  Maybe I'm stupidly optimistic, but
>>> perhaps in the last near-decade things have got a little better here.

[Tim]
>> Ah, but as I've said before, virtually all C compilers on 754 boxes
>> support _some_ way to get at this stuff.  This includes gcc before C99
>> and fenv.h -- if the platforms represented in fpectlmodule.c were
>> happy to use gcc, they all could have used the older gcc spellings
>> (which are in fpectlmodule.c, BTW, under the __GLIBC__ #ifdef).

[Michael]
> Um, well, no, not really.  The stuff under __GLIBC__ unsurprisingly
> applies to platforms using the GNU project's implementation of the C
> library, and GCC is used on many more platforms than just that
> (e.g. OS X, FreeBSD).

Good point taken:  pairings of C compilers and C runtime libraries are
somewhat fluid.  So if all the platforms represented in fpectlmodule.c
were happy to use glibc, they all could have used the older glibc
spellings.  Apparently the people who cared enough on those platforms
to contribute code to fpectlmodule.c did not want to use glibc, though.

In the end, I still see no reason to hope that an endless variety of
other libms will standardize on the C99 spellings.  For backward
compatibility they have to keep supporting their old spellings anyway,
so what's in it for them to supply C99 aliases?  Say I'm SGI, struggling
as often as not just to stay in business.  I'm unlikely to spend what
little cash I have to make it easier for customers to jump ship <wink>.

> ...
> Even given that, the glibc section looks mighty Intel-specific to me (I
> don't see why 0x1372 should have any x-architecture meaning).

Why not?  I don't know whether glibc ever did this, but Microsoft's
spelling of this stuff used to, on Alphas (when MS compilers still
supported Alphas), pick apart the bits and rearrange them into the bits
needed for the Alpha's FPU control registers.  Saying that bit 0x10
(whatever) is "the overflow flag" (whatever) is as much an x-platform
API as saying that the expansion of the macro FE_OVERFLOW is "the
overflow flag".  Fancy-pants symbolic names are favored by "computer
science" types these days, but real numeric programmers have always
been delighted to wallow in raw bits <wink>.

...

> One thing GCC doesn't yet support, it turns out, is the "#pragma STDC
> FENV_ACCESS ON" gumpf, which means the optimiser is all too willing to
> reorder
>
>     feclearexcept(FE_ALL_EXCEPT);
>     r = x * y;
>     fe = fetestexcept(FE_ALL_EXCEPT);
>
> into
>
>     feclearexcept(FE_ALL_EXCEPT);
>     fe = fetestexcept(FE_ALL_EXCEPT);
>     r = x * y;
>
> Argh!  Declaring r 'volatile' made it work.

Oh, sigh.  One of the lovely ironies in all this is that CPython _could_
make for an excellent 754 environment, precisely because it does such
WYSIWYG code generation.  Optimizing-compiler writers hate hidden side
effects, and every fp operation in 754 is swimming in them -- but Python
couldn't care much less.

Anyway, you're rediscovering the primary reason you have to pass a
double lvalue to the PyFPE_END_PROTECT() macro.  PyFPE_END_PROTECT(v)
expands to an expression including the subexpression

    PyFPE_dummy(&(v))

where PyFPE_dummy() is an extern that ignores its double* argument.
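Modulo details, the machinery looks something like this (a sketch from
memory, not a paste from CPython's pyfpe.h):

    /* In its own translation unit, so no optimizer can see the body: */
    double PyFPE_dummy(void *dummy)
    {
        return 1.0;
    }

    /* In the header.  v must be a double lvalue.  Folding the result
     * into the global nesting count (which PyFPE_START_PROTECT bumped)
     * keeps the int result live, so the double->int conversion can't
     * be optimized away either -- more on why that matters below. */
    extern double PyFPE_dummy(void *);
    extern int PyFPE_counter;
    #define PyFPE_END_PROTECT(v)  (PyFPE_counter -= (int)PyFPE_dummy(&(v)))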
The point is that this dance prevents C optimizers from moving the code
that computes v below the code generated for PyFPE_END_PROTECT(v).
Since v is usually used soon after in the routine, it also discourages
the optimizer from moving code up above the PyFPE_END_PROTECT(v)
(unless the C compiler does cross-file analysis, it has to assume that
PyFPE_dummy(&(v)) may change the value of v).  These tricks may be
useful here too -- fighting C compilers to the death is part of this
game, alas.

PyFPE_END_PROTECT() incorporates an even stranger trick, and I wonder
how gcc deals with it.  The Pentium architecture made an agonizing (for
users who care) choice:  if you have a particular FP trap enabled
(let's say overflow), and you do an fp operation that overflows, the
trap doesn't actually fire until the _next_ fp operation (of any kind)
occurs.  You can honest-to-God have, e.g., an overflowing fp add on an
Intel box, and not learn about it until a billion cycles after it
happened (if you don't do more FP operations over the next billion
cycles).  So "the other thing" PyFPE_END_PROTECT does is force a
seemingly pointless double->int conversion (it always coerces 1.0 to an
int), just to make sure that a Pentium will act on any enabled trap
that occurred before it.

If you have in mind just testing flags (and staying away from enabling
HW traps -- and this is the course I recommend), this shouldn't matter:
the sticky status flag is set immediately; it's only triggering the
corresponding trap that's delayed.  I haven't studied C99 deeply enough
to determine whether it has weasel words allowing traps to be delayed
indefinitely, but that kind of HW-driven compromise is common in the C
standards.

Not to imply that this isn't all dead easy <wink>.
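PS, for concreteness:  the flag-testing course with the C99 spellings
looks like this -- a sketch assuming a box where fenv.h exists at all;
the volatile is there to sidestep the reordering you hit, since gcc
ignores the FENV_ACCESS pragma:

    #include <fenv.h>
    #include <stdio.h>

    int main(void)
    {
        volatile double x = 1e300, y = 1e300, r;

        feclearexcept(FE_ALL_EXCEPT);   /* clear the sticky flags */
        r = x * y;                      /* overflows */
        if (fetestexcept(FE_OVERFLOW))  /* flag is set immediately */
            printf("overflow happened; r = %g\n", r);
        return 0;
    }

On most boxes the fe* functions live in libm, so link with -lm.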