Re: RFC: changing precision control setting in initial FPU context
And Kevin Buhr writes:
> > What Linux does presently on x86 is as right as right can be on
> > this platform.
>
> I'm not so sure.

Let me rephrase: According to a designer of the x87 and one of the
IEEE 754 authors, the behavior currently in Linux and glibc is
reasonable on x86. Reasonable is the best you can hope for in
floating-point.

Double-rounding from intermediate spills isn't reasonable, but that's
neither a kernel nor a C library issue. Tackling that issue in the
compiler is difficult. MS punted, and gcc is trying to get things
right (or has; I've lost track -- search for `XF', `mode', and `spill'
in the archives). If you want plain single- or double-precision
arithmetic, use a recent IA-32 with SSE2 instructions.

What I should have done in my first response was to refer you to Doug
Priest's supplement to David Goldberg's ``What Every Computer
Scientist Should Know about Floating-Point Arithmetic''. Of course,
you first need to read the paper itself. You can find a copy at
http://www.validgh.com/. Read it with paper, pencil, and calculator
handy. You'll want to work out some examples for yourself. The
supplement covers the issues well.

If you really want to get upset at operating systems, complain about
their lack of support for efficient floating-point exception
handling. ;) (Or search for wmexcp, which will kill that complaint on
x86 Linux.)

Jason
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: changing precision control setting in initial FPU context
[EMAIL PROTECTED] (Kevin Buhr) writes:
> > You want people's existing applications to suddenly and magically
> > change their results. Umm problem.
>
> So, how would you feel about a mechanism whereby the kernel could be
> passed a default FPU control word by the binary (with old binaries, by
> default,

There will be no change whatsoever with me. The existing ABI is fixed.
If you want your programs to behave differently, set the mode
appropriately.

I have not the slightest interest in seeing applications (including
the libc) being broken just because of this stupid idea. No kernel and
no libc modifications are necessary. This is the end of the story as
far as I'm concerned.

--
Ulrich Drepper, Red Hat
1325 Chesapeake Terrace, Sunnyvale, CA 94089 USA
drepper at redhat.com
Re: RFC: changing precision control setting in initial FPU context
Alan Cox <[EMAIL PROTECTED]> writes:
> You want people's existing applications to suddenly and magically
> change their results. Umm problem.

So, how would you feel about a mechanism whereby the kernel could be
passed a default FPU control word by the binary (with old binaries, by
default, using the old default control word)?

There's already an ELF AT_FPUCW auxv entry type. What if this was used
by the kernel, rather than the C library (as it is now), to set a
default to be used in "init_fpu()" when and if the program executed a
floating point instruction? Then, a compiler startup-code writer would
be able to specify a default control word for binaries that was
appropriate for (new) programs generated by that compiler *WITHOUT*
worrying about whether he was accidentally turning a non-FP program
into an FP program by introducing "fnstcw" as its only FPU
instruction.

The C library is already trying to do this (setting the CW based on
the AT_FPUCW vector). It just can't do it *right* because it doesn't
know if the program is really FP. It just guesses: if the AT_FPUCW
vector contains something other than the hard-coded _FPU_DEFAULT
(which is supposed to be equal to the kernel default: it isn't, but
it's close enough), the control word is set; otherwise, it's left
alone.

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
> with GCC's 64-bit doubles (and its 64-bit clean but 80-bit dirty
> floating point optimizations), so I'm proposing adding an instruction
> to "init_fpu()" to change the default hardware control word.

You want people's existing applications to suddenly and magically
change their results. Umm problem.

If your app needs a specific control word then just force it in the
app.
Re: RFC: changing precision control setting in initial FPU context
"Adam J. Richter" <[EMAIL PROTECTED]> writes:
> IEEE-754 floating point is available under glibc-based systems,
> including most current GNU/Linux distributions, by linking with
> -lieee. Your example program produces the "9 10" result you wanted
> when linked this way, even when compiled with -O2.

No, you've got it backwards. The "9 10" result is the *wrong* result.
IEEE 64-bit arithmetic should give the result "10 10".

Also, I can't duplicate your outcome. I see no difference linking with
"-lieee" versus linking without it, at least under glibc-2.1.3:

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.2/specs
gcc version 2.95.2 2220 (Debian GNU/Linux)
$ cat modified.c
#include <stdio.h>
#include <fpu_control.h>

int main()
{
        int a = 10;
        fpu_control_t foo;

        _FPU_GETCW(foo);
        printf("%04x %d %d\n", foo,
               (int)( a*.3 +  a*.7),    /* first expression */
               (int)(10*.3 + 10*.7));   /* second expression */
        return 0;
}
$ gcc modified.c && ./a.out
037f 9 10
$ gcc -O2 modified.c && ./a.out
037f 10 10
$ gcc modified.c -lieee && ./a.out
037f 9 10
$ gcc -O2 modified.c -lieee && ./a.out
037f 10 10
$

As you can see, linking with "ieee" has no effect on the control word
setting or the results. Perhaps this has changed post-glibc 2.1.3?
Looking at the 2.1.3 code, it appears that all "ieee" does is set a
variable that's referenced in the math library innards. It has no
effect on startup code right now.

> When not linked with "-lieee", Linux personality ELF x86 binaries
> start with Precision Control set to 3, just because that is how the
> x86 fninit instruction sets it.

Yes. I know. In fact, the "fninit" instruction is executed in the
kernel's "init_fpu()" when the first FPU instruction is executed by
the program.
I just think the hardware default happens to be a bad default on a
system where most floating-point software is GCC-compiled with GCC's
64-bit doubles (and its 64-bit clean but 80-bit dirty floating point
optimizations), so I'm proposing adding an instruction to "init_fpu()"
to change the default hardware control word.

> In general, I think most real uses of floating point are for "fast
> and sloppy" purposes, and programs that want to use floating point
> and care about exact reproducibility will link with "-lieee".

However, this doesn't seem to work. Nor does "-ffloat-store".

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
Jason Riedy <[EMAIL PROTECTED]> writes:
> Note that getting what some people want to call `true' IEEE 754
> arithmetic on an x86 is frightfully tricky. Changing the precision
> does not shorten the exponent field, and that can have, um, fun
> effects on and around under/overflow.

Whoops. This is an important point and something I'd missed.

> What Linux does presently on x86 is as right as right can be on
> this platform.

I'm not so sure. If most floating point programs and math libraries
used 80-bit "long double"s (and if GCC did 80-bit arithmetic
correctly, as you seem to imply it doesn't), then I would argue that
the current default is a perfect default.

As it is, I think most C floating point software (that isn't written
by i386 FPU gurus) is written with "double"s, written without
attention to the FPU control word, and compiled with no special
options. These programs can be made, at least, predictable with
respect to compiler optimizations and compatible with many other
architectures if we change the default to the *BSD choice.

> The *BSD choice is valid by some lines of thought, but it also
> denies people the happy accident of computing with more precision
> and range than they thought they needed.

If this "accident" happened reliably when the program was compiled
with and without "-O2", or if this "accident" couldn't be affected by,
say, which branch of an if-else was taken (by means of causing a
reload from a 64-bit memory location in one case and not the other,
for example), and if this accident was compatible with other i386
Unixish operating systems, it would, indeed, be a *happy* accident.
Here, I think it's just an accident.

Someone whose code actually benefits from extra mantissa precision
beyond 53 bits without them understanding the intricacies of i387
programming needs to be pummeled with a stick. Of course, someone
whose code *breaks* from extra mantissa precision *also* needs to be
pummeled with a stick.
But, in between beatings, I'd still like to get the default changed.

It seems to me that this issue is a little different from, say, the
"Linux modifies the timeout field in select calls" kind of
incompatibility. If an FP program under Linux behaved differently but,
at least, reliably and predictably, I wouldn't be bringing this up. An
incompatible implementation that *also* leads to bizarre surprises
(with any change to compilation flags, program flow, phase of the
moon, whatever), especially when the alternative, compatible
implementation *doesn't* lead to surprises: well, that's what has
gotten me up in arms.

> Overall, computing with x86 double-extended is a good thing so long
> as you don't introduce multiple roundings. That's a compiler issue,
> not a kernel one.

Yes, maybe it is. The issue as I see it is to set a reasonable default
floating-point policy without compromising Linux's lazy FPU context
switching: it can't be done in the C library startup code without a
kernel change. It *could* be done by the compiler (which would clearly
know when a particular function used floating point and what control
word setting was appropriate). It's something to think about, at any
rate.

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
"Albert D. Cahalan" <[EMAIL PROTECTED]> writes:
> > Well, yes, but I'll try not to cry myself to sleep over it. I'm
> > tempted to say that someone who chooses to use "float"s has given
> > up all pretense of caring about the answers they get. And, if they
> > really want to do predictable math with floats they can change the
> > FPU control word from whatever its default is to PC==0.
>
> There are algorithms which work fine using 32-bit floating-point,
> but which become unstable when you get unpredictable precision.
> It is reasonable to use such an algorithm and some 64-bit math in
> the same program. So there isn't any correct x86 setting.

So what? Of course there's no "correct" x86 setting for all
situations. In this particular situation, you will need to change the
PC on a function-by-function basis. I'm just suggesting there might be
a better *default* PC than the current one.

> That would be an awful idea. There are two main useful behaviors:
>
> 1. Pure IEEE for 32-bit, 64-bit, and 80-bit floating-point values.
>    The compiler rounds intermediate values by writing to memory
>    or by adjusting the precision control before each operation.
>
> 2. Extra precision when it comes free. The precision control is set
>    to 80-bit and the compiler tries to keep values in registers.
>    This is usually the more useful behavior, and it performs better.

I find it difficult to believe that anyone would find the second
alternative even remotely comparable in "usefulness" to the first. The
extra precision isn't free; it comes at the expense of predictable
program behavior and compatibility with other i386 and non-i386
architectures.

> What you are suggesting is a gross hybrid. You claim it has something
> to do with IEEE, but it doesn't handle 32-bit math correctly. Your
> proposal is NOT true IEEE math.
What I am suggesting would permit IEEE 64-bit math to be done, in the
default configuration, in any GCC-compiled C program (with or without
optimization) that used only doubles for floating point arithmetic.
The current default PC allows no IEEE compliant GCC-compiled math in
any mode under any circumstances. It also gives unexpected anomalous
results, *and* these results differ from the behavior under FreeBSD,
NetBSD, and most non-Intel platforms.

> Woah, what kind of crap is that? You can not get true IEEE math
> by setting the precision control word at startup.

You don't; it turns out linking with "ieee" doesn't change the control
word. At one time it did, but the point was never to change the
precision control; it was to switch from POSIX to IEEE exception
handling. And it wasn't my idea.

> Check the archives: the x86 Linux ABI specifies 80-bit precision.
> This will never change. The library is supposed to assume this,
> rather than try to allow for a change that will never happen.
> Linus dished out some nice toasty flames for the libc developers
> over this.

Okay, fine. The Linux ABI can specify whatever the hell it wants.
Then, we should have a way for the library to communicate a preferred
default value to the kernel *WITHOUT* turning on the lazy FPU context
switching for every program.

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
And "Albert D. Cahalan" writes:
> 2. Extra precision when it comes free. The precision control is set
>    to 80-bit and the compiler tries to keep values in registers.
>    This is usually the more useful behavior, and it performs better.

Even better is for gcc to spill intermediate results to 80 bits.
Unfortunately, these 80 bits have to be expanded to 128 for alignment,
and this eats cache. IIRC, this has been discussed many times by gcc
developers. I don't recall the final verdict.

The original intent with the 8087 was that the compiler and/or OS
could transparently extend the stack into memory, but one necessary
feature was left out until the 80387. By that point, it was too late.
And then came caches...

> What you are suggesting is a gross hybrid. You claim it has something
> to do with IEEE, but it doesn't handle 32-bit math correctly. Your
> proposal is NOT true IEEE math.

Note that getting what some people want to call `true' IEEE 754
arithmetic on an x86 is frightfully tricky. Changing the precision
does not shorten the exponent field, and that can have, um, fun
effects on and around under/overflow. The mantissa and exponent
lengths were chosen carefully to protect against those effects in many
computations.

What Linux does presently on x86 is as right as right can be on this
platform. Compare with what MS's compilers do (die when you run out of
the fp stack slots, telling users to simplify the expressions in the
source code) and be happy. The *BSD choice is valid by some lines of
thought, but it also denies people the happy accident of computing
with more precision and range than they thought they needed. Overall,
computing with x86 double-extended is a good thing so long as you
don't introduce multiple roundings. That's a compiler issue, not a
kernel one.
Historical note: According to one of the x87 designers, this all boils
down to the simple fact that there's no time when a pair of
collaborators in California and Israel can be both awake and lucid
enough to explain things well over a noisy telephone line. Amazing
that it really wasn't long ago.

And if anyone's really interested, keep checking
http://www.cs.berkeley.edu/~wkahan/ as some of Dr. Kahan's older
papers are slowly converted and added. They give a great deal of
insight into the choices that eventually became the accepted IEEE 754
standard.

Jason
Re: RFC: changing precision control setting in initial FPU context
Kevin Buhr writes:
> "Albert D. Cahalan" <[EMAIL PROTECTED]> writes:
>> So you change it to 2... but what about the "float" type? It gets
>> a mixture of 64-bit and 32-bit IEEE arithmetic depending rather
>> unpredictably on compiler register allocations and optimizations???
>
> Well, yes, but I'll try not to cry myself to sleep over it. I'm
> tempted to say that someone who chooses to use "float"s has given up
> all pretense of caring about the answers they get. And, if they
> really want to do predictable math with floats they can change the
> FPU control word from whatever its default is to PC==0.

There are algorithms which work fine using 32-bit floating-point,
but which become unstable when you get unpredictable precision.
It is reasonable to use such an algorithm and some 64-bit math in
the same program. So there isn't any correct x86 setting.

>> If a "float" will have excess precision, then a "double" might
>> as well have it too. Usually it helps, but sometimes it hurts.
>> This is life with C on x86.
>
> That's the way I initially felt, and it looks silly when it's written
> down, so I'm glad I changed my mind.
>
> I don't think extra precision that is unpredictable is ever helpful.
> Extra precision that might be gained or lost depending on, say, which
> branch of an if-statement is taken, is of no use to anyone. It just
> causes confusion. The excess precision on "float" is a nuisance. The
> excess precision on "double" is another nuisance. It would be nice to
> eliminate one of those nuisances, at least by default.

That would be an awful idea. There are two main useful behaviors:

1. Pure IEEE for 32-bit, 64-bit, and 80-bit floating-point values.
   The compiler rounds intermediate values by writing to memory
   or by adjusting the precision control before each operation.

2. Extra precision when it comes free. The precision control is set
   to 80-bit and the compiler tries to keep values in registers.
   This is usually the more useful behavior, and it performs better.

What you are suggesting is a gross hybrid. You claim it has something
to do with IEEE, but it doesn't handle 32-bit math correctly. Your
proposal is NOT true IEEE math.

>> Ugh, more start-up crud.
>
> The startup crud is already there. It's used to allow linking with
> "-lieee" to set a new control word value, for example, and it's

Woah, what kind of crap is that? You can not get true IEEE math by
setting the precision control word at startup. This is a bug. The
compiler must save values to memory or adjust the precision control as
needed. For example, the precision control could be loaded on function
entry. This may be optimized away for some "static" or "inline"
functions.

> To me, a system call (not necessarily a *new* system call, but some
> way to get the desired FPU control word to the kernel) seems like a
> more elegant solution.
>
> On the other hand, I'm not married to the idea. I'd rather just get
> the default control word changed in the kernel.

Check the archives: the x86 Linux ABI specifies 80-bit precision.
This will never change. The library is supposed to assume this,
rather than try to allow for a change that will never happen.
Linus dished out some nice toasty flames for the libc developers
over this.
Re: RFC: changing precision control setting in initial FPU context
IEEE-754 floating point is available under glibc-based systems,
including most current GNU/Linux distributions, by linking with
-lieee. Your example program produces the "9 10" result you wanted
when linked this way, even when compiled with -O2.

When not linked with "-lieee", Linux personality ELF x86 binaries
start with Precision Control set to 3, just because that is how the
x86 fninit instruction sets it.

I thought that libieee was also available at run time for dynamic
executables by doing something like
"LD_PRELOAD=/usr/lib/libieee.so my_dynamic_executable", so you could
set it in your .bashrc if you wanted, but that apparently is not the
case, at least under glibc-2.2.2. I will have to try to figure out why
this is not available.

I am a bit out of my depth when discussing the advantages of
occasional 80 bit precision over 64 bit, but I think that there are
situations where getting gratuitously more accurate results helps,
like getting faster convergence in some scientific numerical methods,
such as Newton's method. (You'll still find the same point of
convergence if there is only one, but the program will run faster.)
Another example would be things like 3D lighting calculations (used in
games?) where you want to produce the best images that you can within
that CPU budget. I don't know of any sound encodings where a fully
optimized implementation would use floating point, but it's possible.

In general, I think most real uses of floating point are for "fast and
sloppy" purposes, and programs that want to use floating point and
care about exact reproducibility will link with "-lieee".

On the other hand, if a GNU/Linux-x86 distribution did want to change
the initial floating point control word in Linux to PC=2, I think you
would still want old programs to run in their old PC=3 environment,
just in case one relied on it.
Your sys_setfpcw suggestion would do the job (setting the default
floating point control word without flagging the process as one that
was definitely going to use floating point), but I think a simpler
approach would be to assign a different magic number argument to
setpersonality() for programs that expect to be initialized with
floating point precision control set to 2.

Adam J. Richter
[EMAIL PROTECTED]
Yggdrasil, 4880 Stevens Creek Blvd, Suite 104
San Jose, California 95129-1034, United States of America
+1 408 261-6630, fax +1 408 261-6631
"Free Software For The Rest Of Us."
Re: RFC: changing precision control setting in initial FPU context
"Albert D. Cahalan" <[EMAIL PROTECTED]> writes:
> So you change it to 2... but what about the "float" type? It gets
> a mixture of 64-bit and 32-bit IEEE arithmetic depending rather
> unpredictably on compiler register allocations and optimizations???

Well, yes, but I'll try not to cry myself to sleep over it. I'm
tempted to say that someone who chooses to use "float"s has given up
all pretense of caring about the answers they get. And, if they really
want to do predictable math with floats they can change the FPU
control word from whatever its default is to PC==0.

I guess if I had to choose between two default control word settings
so that either (A) "float" arithmetic is unpredictable but "double"
arithmetic is predictable, corresponds to 64-bit IEEE arithmetic, is
invariant under different compiler optimization settings, matches the
compiler's handling of constant folding, and mimics the behavior on
i386 FreeBSD and NetBSD and most modern, non-i386 architectures; or
(B) "float" and "double" arithmetic are both unpredictable and
non-IEEE; I'd choose (A).

> If a "float" will have excess precision, then a "double" might
> as well have it too. Usually it helps, but sometimes it hurts.
> This is life with C on x86.

That's the way I initially felt, and it looks silly when it's written
down, so I'm glad I changed my mind.

I don't think extra precision that is unpredictable is ever helpful.
Extra precision that might be gained or lost depending on, say, which
branch of an if-statement is taken, is of no use to anyone. It just
causes confusion. The excess precision on "float" is a nuisance. The
excess precision on "double" is another nuisance. It would be nice to
eliminate one of those nuisances, at least by default.

> Ugh, more start-up crud.

The startup crud is already there. It's used to allow linking with
"-lieee" to set a new control word value, for example, and it's
inelegant and ugly.
Because we don't want to set the control word on a non-FPU program and
defeat the lazy FPU context initialization, we compare the value of
the control word we want with a value hard-coded into the library
that's supposed to match the value hard-coded into the kernel. If the
two values differ, we set the control word to the new value (whether
the program actually ends up ever executing an FPU instruction or
not).

To me, a system call (not necessarily a *new* system call, but some
way to get the desired FPU control word to the kernel) seems like a
more elegant solution.

On the other hand, I'm not married to the idea. I'd rather just get
the default control word changed in the kernel.

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
Kevin Buhr writes:
> It boils down to the fact that, under i386 Linux, the FPU control
> word has its precision control (PC) set to 3 (for 80-bit extended
> precision) while under i386 FreeBSD, NetBSD, and others, it's set to
> 2 (for 64-bit double precision). On other architectures, I assume
> there's usually no mismatch between the C "double" precision and the
> FPU's default internal precision.
...
> Initially, I was quick to dismiss the whole thing as symptomatic of a
> severe floating-point-related cluon shortage. However, the more I
> think about it, the better the case seems for changing the Linux
> default:
>
> 1. First, PC=3 is a dangerous setting. A floating point program
>    using "double"s, compiled with GCC without attention to
>    FPU-related compilation options, won't do IEEE arithmetic running
>    under this setting. Instead, it will use a mixture of 80-bit and
>    64-bit IEEE arithmetic depending rather unpredictably on compiler
>    register allocations and optimizations.
>
> 2. Second, PC=3 is a mostly *useless* setting for GCC-compiled
>    programs. There can obviously be no way to guarantee reliable
>    IEEE 80-bit arithmetic in GCC-compiled code when "double"s are
>    only 64 bits, so our only hope is to guarantee reliable IEEE
>    64-bit arithmetic. But, then we should have set PC=2 in the first
>    place.

So you change it to 2... but what about the "float" type? It gets
a mixture of 64-bit and 32-bit IEEE arithmetic depending rather
unpredictably on compiler register allocations and optimizations???

If a "float" will have excess precision, then a "double" might
as well have it too. Usually it helps, but sometimes it hurts.
This is life with C on x86.

> So, on a related note, is it reasonable to consider resurrecting the
> "sys_setfpucw" idea at this point, to push the decision on the
> correct initial control word up to the C library level where it
> belongs?
> (For those who don't remember the proposal, the idea is that the C
> library can use "sys_setfpucw" to set the desired initial control
> word.

Ugh, more start-up crud.
Re: RFC: changing precision control setting in initial FPU context
Kevin Buhr writes: It boils down to the fact that, under i386 Linux, the FPU control word has its precision control (PC) set to 3 (for 80-bit extended precision) while under i386 FreeBSD, NetBSD, and others, it's set to 2 (for 64-bit double precision). On other architectures, I assume there's usually no mismatch between the C "double" precision and the FPU's default internal precision. ... Initially, I was quick to dismiss the whole thing as symptomatic of a severe floating-point-related cluon shortage. However, the more I think about it, the better the case seems for changing the Linux default: 1. First, PC=3 is a dangerous setting. A floating point program using "double"s, compiled with GCC without attention to FPU-related compilation options, won't do IEEE arithmetic running under this setting. Instead, it will use a mixture of 80-bit and 64-bit IEEE arithmetic depending rather unpredictably on compiler register allocations and optimizations. 2. Second, PC=3 is a mostly *useless* setting for GCC-compiled programs. There can obviously be no way to guarantee reliable IEEE 80-bit arithmetic in GCC-compiled code when "double"s are only 64 bits, so our only hope is to guarantee reliable IEEE 64-bit arithmetic. But, then we should have set PC=2 in the first place. So you change it to 2... but what about the "float" type? It gets a mixture of 64-bit and 32-bit IEEE arithmetic depending rather unpredictably on compiler register allocations and optimizations??? If a "float" will have excess precision, then a "double" might as well have it too. Usually it helps, but sometimes it hurts. This is life with C on x86. So, on a related note, is it reasonable to consider resurrecting the "sys_setfpucw" idea at this point, to push the decision on the correct initial control word up to the C library level where it belongs? (For those who don't remember the proposal, the idea is that the C library can use "sys_setfpucw" to set the desired initial control word. 
Ugh, more start-up crud. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: changing precision control setting in initial FPU context
"Albert D. Cahalan" [EMAIL PROTECTED] writes: So you change it to 2... but what about the "float" type? It gets a mixture of 64-bit and 32-bit IEEE arithmetic depending rather unpredictably on compiler register allocations and optimizations??? Well, yes, but I'll try not to cry myself to sleep over it. I'm tempted to say that someone who chooses to use "float"s has given up all pretense of caring about the answers they get. And, if they really want to do predictable math with floats they can change the FPU control word from whatever its default is to PC==0. I guess if I had to choose between two default control word settings so that either (A) "float" arithmetic is unpredictable but "double" arithmetic is predictable, corresponds to 64-bit IEEE arithmetic, is invariant under different compiler optimization settings, matches the compiler's handling of constant folding, and mimics the behavior on i386 FreeBSD and NetBSD and most modern, non-i386 architectures; or (B) "float" and "double" arithmetic are both unpredictable and non-IEEE; I'd choose (A). If a "float" will have excess precision, then a "double" might as well have it too. Usually it helps, but sometimes it hurts. This is life with C on x86. That's the way I initially felt, and it looks silly when it's written down, so I'm glad I changed my mind. I don't think extra precision that is unpredictable is ever helpful. Extra precision that might be gained or lost depending on, say, which branch of an if-statement is taken, is of no use to anyone. It just causes confusion. The excess precision on "float" is a nuisance. The excess precision on "double" is another nuisance. It would be nice to eliminate one of those nuisances, at least by default. Ugh, more start-up crud. The startup crud is already there. It's used to allow linking with "-lieee" to set a new control word value, for example, and it's inelegant and ugly. 
Because we don't want to set the control word on a non-FPU program
and defeat the lazy FPU context initialization, we compare the value
of the control word we want with a value hard-coded into the library
that's supposed to match the value hard-coded into the kernel.  If
the two values differ, we set the control word to the new value
(whether the program actually ends up ever executing an FPU
instruction or not).

To me, a system call (not necessarily a *new* system call, but some
way to get the desired FPU control word to the kernel) seems like a
more elegant solution.  On the other hand, I'm not married to the
idea.  I'd rather just get the default control word changed in the
kernel.

Kevin
[EMAIL PROTECTED]
Re: RFC: changing precision control setting in initial FPU context
IEEE-754 floating point is available under glibc-based systems,
including most current GNU/Linux distributions, by linking with
-lieee.  Your example program produces the "9 10" result you wanted
when linked this way, even when compiled with -O2.

When not linked with "-lieee", Linux personality ELF x86 binaries
start with Precision Control set to 3, just because that is how the
x86 fninit instruction sets it.

I thought that libieee was also available at run time for dynamic
executables by doing something like
"LD_PRELOAD=/usr/lib/libieee.so my_dynamic_executable", so you could
set it in your .bashrc if you wanted, but that apparently is not the
case, at least under glibc-2.2.2.  I will have to try to figure out
why this is not available.

I am a bit out of my depth when discussing the advantages of
occasional 80-bit precision over 64-bit, but I think that there are
situations where getting gratuitously more accurate results helps,
like getting faster convergence in some scientific numerical methods,
such as Newton's method.  (You'll still find the same point of
convergence if there is only one, but the program will run faster.)
Another example would be things like 3D lighting calculations (used
in games?) where you want to produce the best images that you can
within that CPU budget.  I don't know of any sound encodings where a
fully optimized implementation would use floating point, but it's
possible.

In general, I think most real uses of floating point are for "fast
and sloppy" purposes, and programs that want to use floating point
and care about exact reproducibility will link with "-lieee".

On the other hand, if a GNU/Linux-x86 distribution did want to change
the initial floating point control word in Linux to PC=2, I think you
would still want old programs to run in their old PC=3 environment,
just in case one relied on it.
Your sys_setfpcw suggestion could do that (set the default floating
point control word without flagging the process as one that was
definitely going to use floating point), but I think a simpler
approach would be to assign a different magic number argument to
personality() for programs that expect to be initialized with
floating point precision control set to 2.

Adam J. Richter   __  __   4880 Stevens Creek Blvd, Suite 104
[EMAIL PROTECTED]   \ /    San Jose, California 95129-1034
+1 408 261-6630   | g g d r a s i l   United States of America
fax +1 408 261-6631   "Free Software For The Rest Of Us."
Re: RFC: changing precision control setting in initial FPU context
Kevin Buhr writes:

> "Albert D. Cahalan" [EMAIL PROTECTED] writes:
>
>> So you change it to 2... but what about the "float" type?  It gets
>> a mixture of 64-bit and 32-bit IEEE arithmetic depending rather
>> unpredictably on compiler register allocations and optimizations???
>
> Well, yes, but I'll try not to cry myself to sleep over it.  I'm
> tempted to say that someone who chooses to use "float"s has given up
> all pretense of caring about the answers they get.  And, if they
> really want to do predictable math with floats they can change the
> FPU control word from whatever its default is to PC==0.

There are algorithms which work fine using 32-bit floating-point, but
which become unstable when you get unpredictable precision.  It is
reasonable to use such an algorithm and some 64-bit math in the same
program.  So there isn't any correct x86 setting.

>> If a "float" will have excess precision, then a "double" might as
>> well have it too.  Usually it helps, but sometimes it hurts.  This
>> is life with C on x86.
>
> That's the way I initially felt, and it looks silly when it's
> written down, so I'm glad I changed my mind.  I don't think extra
> precision that is unpredictable is ever helpful.  Extra precision
> that might be gained or lost depending on, say, which branch of an
> if-statement is taken is of no use to anyone.  It just causes
> confusion.
>
> The excess precision on "float" is a nuisance.  The excess precision
> on "double" is another nuisance.  It would be nice to eliminate one
> of those nuisances, at least by default.

That would be an awful idea.  There are two main useful behaviors:

1. Pure IEEE for 32-bit, 64-bit, and 80-bit floating-point values.
   The compiler rounds intermediate values by writing to memory or by
   adjusting the precision control before each operation.

2. Extra precision when it comes free.  The precision control is set
   to 80-bit and the compiler tries to keep values in registers.
   This is usually the more useful behavior, and it performs better.

What you are suggesting is a gross hybrid.
You claim it has something to do with IEEE, but it doesn't handle
32-bit math correctly.  Your proposal is NOT true IEEE math.

>> Ugh, more start-up crud.
>
> The startup crud is already there.  It's used to allow linking with
> "-lieee" to set a new control word value, for example, and it's

Whoa, what kind of crap is that?  You cannot get true IEEE math by
setting the precision control word at startup.  This is a bug.  The
compiler must save values to memory or adjust the precision control
as needed.  For example, the precision control could be loaded on
function entry.  This may be optimized away for some "static" or
"inline" functions.

> To me, a system call (not necessarily a *new* system call, but some
> way to get the desired FPU control word to the kernel) seems like a
> more elegant solution.  On the other hand, I'm not married to the
> idea.  I'd rather just get the default control word changed in the
> kernel.

Check the archives: the x86 Linux ABI specifies 80-bit precision.
This will never change.  The library is supposed to assume this,
rather than try to allow for a change that will never happen.  Linus
dished out some nice toasty flames for the libc developers over this.
Re: RFC: changing precision control setting in initial FPU context
And "Albert D. Cahalan" writes:
 -
 - 2. Extra precision when it comes free.  The precision control is set
 -    to 80-bit and the compiler tries to keep values in registers.
 -    This is usually the more useful behavior, and it performs better.

Even better is for gcc to spill intermediate results to 80 bits.
Unfortunately, these 80 bits have to be expanded to 128 for
alignment, and this eats cache.  IIRC, this has been discussed many
times by gcc developers.  I don't recall the final verdict.

The original intent with the 8087 was that the compiler and/or OS
could transparently extend the stack into memory, but one necessary
feature was left out until the 80387.  By that point, it was too
late.  And then came caches...

 - What you are suggesting is a gross hybrid.  You claim it has something
 - to do with IEEE, but it doesn't handle 32-bit math correctly.  Your
 - proposal is NOT true IEEE math.

Note that getting what some people want to call `true' IEEE 754
arithmetic on an x86 is frightfully tricky.  Changing the precision
does not shorten the exponent field, and that can have, um, fun
effects on and around under/overflow.  The mantissa and exponent
lengths were chosen carefully to protect against those effects in
many computations.

What Linux does presently on x86 is as right as right can be on this
platform.  Compare with what MS's compilers do (die when you run out
of the fp stack slots, telling users to simplify the expressions in
the source code) and be happy.

The *BSD choice is valid by some lines of thought, but it also denies
people the happy accident of computing with more precision and range
than they thought they needed.  Overall, computing with x86
double-extended is a good thing so long as you don't introduce
multiple roundings.  That's a compiler issue, not a kernel one.
Historical note:  According to one of the x87 designers, this all
boils down to the simple fact that there's no time when a pair of
collaborators in California and Israel can both be awake and lucid
enough to explain things well over a noisy telephone line.  Amazing
that it really wasn't long ago.

And if anyone's really interested, keep checking
http://www.cs.berkeley.edu/~wkahan/ as some of Dr. Kahan's older
papers are slowly converted and added.  They give a great deal of
insight into the choices that eventually became the accepted IEEE 754
standard.

Jason
Re: RFC: changing precision control setting in initial FPU context
"Albert D. Cahalan" [EMAIL PROTECTED] writes:

>> Well, yes, but I'll try not to cry myself to sleep over it.  I'm
>> tempted to say that someone who chooses to use "float"s has given
>> up all pretense of caring about the answers they get.  And, if they
>> really want to do predictable math with floats they can change the
>> FPU control word from whatever its default is to PC==0.
>
> There are algorithms which work fine using 32-bit floating-point,
> but which become unstable when you get unpredictable precision.  It
> is reasonable to use such an algorithm and some 64-bit math in the
> same program.  So there isn't any correct x86 setting.

So what?  Of course there's no "correct" x86 setting for all
situations.  In this particular situation, you will need to change
the PC on a function-by-function basis.  I'm just suggesting there
might be a better *default* PC than the current one.

> That would be an awful idea.  There are two main useful behaviors:
>
> 1. Pure IEEE for 32-bit, 64-bit, and 80-bit floating-point values.
>    The compiler rounds intermediate values by writing to memory or
>    by adjusting the precision control before each operation.
>
> 2. Extra precision when it comes free.  The precision control is
>    set to 80-bit and the compiler tries to keep values in
>    registers.  This is usually the more useful behavior, and it
>    performs better.

I find it difficult to believe that anyone would find the second
alternative even remotely comparable in "usefulness" to the first.
The extra precision isn't free; it comes at the expense of
predictable program behavior and compatibility with other i386 and
non-i386 architectures.

> What you are suggesting is a gross hybrid.  You claim it has
> something to do with IEEE, but it doesn't handle 32-bit math
> correctly.  Your proposal is NOT true IEEE math.

What I am suggesting would permit IEEE 64-bit math to be done, in the
default configuration, in any GCC-compiled C program (with or without
optimization) that used only doubles for floating point arithmetic.
The current default PC allows no IEEE-compliant GCC-compiled math in
any mode under any circumstances.  It also gives unexpected anomalous
results, *and* these results differ from the behavior under FreeBSD,
NetBSD, and most non-Intel platforms.

> Whoa, what kind of crap is that?  You cannot get true IEEE math by
> setting the precision control word at startup.

You don't; it turns out linking with "-lieee" doesn't change the
control word.  At one time it did, but the point was never to change
the precision control; it was to switch from POSIX to IEEE exception
handling.  And it wasn't my idea.

> Check the archives: the x86 Linux ABI specifies 80-bit precision.
> This will never change.  The library is supposed to assume this,
> rather than try to allow for a change that will never happen.  Linus
> dished out some nice toasty flames for the libc developers over
> this.

Okay, fine.  The Linux ABI can specify whatever the hell it wants.
Then, we should have a way for the library to communicate a preferred
default value to the kernel *WITHOUT* defeating lazy FPU context
switching for every program.

Kevin
[EMAIL PROTECTED]
Re: RFC: changing precision control setting in initial FPU context
Jason Riedy [EMAIL PROTECTED] writes:

> Note that getting what some people want to call `true' IEEE 754
> arithmetic on an x86 is frightfully tricky.  Changing the precision
> does not shorten the exponent field, and that can have, um, fun
> effects on and around under/overflow.

Whoops.  This is an important point and something I'd missed.

> What Linux does presently on x86 is as right as right can be on
> this platform.

I'm not so sure.  If most floating point programs and math libraries
used 80-bit "long double"s (and if GCC did 80-bit arithmetic
correctly, as you seem to imply it doesn't), then I would argue that
the current default is a perfect default.

As it is, I think most C floating point software (that isn't written
by i386 FPU gurus) is written with "double"s, written without
attention to the FPU control word, and compiled with no special
options.  These programs can be made, at least, predictable with
respect to compiler optimizations and compatible with many other
architectures if we change the default to the *BSD choice.

> The *BSD choice is valid by some lines of thought, but it also
> denies people the happy accident of computing with more precision
> and range than they thought they needed.

If this "accident" happened reliably when the program was compiled
with and without "-O2", or if this "accident" couldn't be affected
by, say, which branch of an if-else was taken (by means of causing a
reload from a 64-bit memory location in one case and not the other,
for example), and if this accident was compatible with other i386
Unixish operating systems, it would, indeed, be a *happy* accident.
Here, I think it's just an accident.

Someone whose code actually benefits from extra mantissa precision
beyond 53 bits without them understanding the intricacies of i387
programming needs to be pummeled with a stick.  Of course, someone
whose code *breaks* from extra mantissa precision *also* needs to be
pummeled with a stick.
But, in between beatings, I'd still like to get the default changed.

It seems to me that this issue is a little different from, say, the
"Linux modifies the timeout field in select calls" kind of
incompatibility.  If an FP program under Linux behaved differently
but, at least, reliably and predictably, I wouldn't be bringing this
up.  An incompatible implementation that *also* leads to bizarre
surprises (with any change to compilation flags, program flow, phase
of the moon, whatever), especially when the alternative, compatible
implementation *doesn't* lead to surprises... well, that's what has
gotten me up in arms.

> Overall, computing with x86 double-extended is a good thing so long
> as you don't introduce multiple roundings.  That's a compiler
> issue, not a kernel one.

Yes, maybe it is.  The issue as I see it is to set a reasonable
default floating-point policy without compromising Linux's lazy FPU
context switching---it can't be done in the C library startup code
without a kernel change.  It *could* be done by the compiler (which
would clearly know when a particular function used floating point and
what control word setting was appropriate).  It's something to think
about, at any rate.

Kevin
[EMAIL PROTECTED]
Re: RFC: changing precision control setting in initial FPU context
"Adam J. Richter" [EMAIL PROTECTED] writes:

> IEEE-754 floating point is available under glibc-based systems,
> including most current GNU/Linux distributions, by linking with
> -lieee.  Your example program produces the "9 10" result you wanted
> when linked this way, even when compiled with -O2.

No, you've got it backwards.  The "9 10" result is the *wrong*
result.  IEEE 64-bit arithmetic should give the result "10 10".

Also, I can't duplicate your outcome.  I see no difference linking
with "-lieee" versus linking without it, at least under glibc-2.1.3:

    $ gcc -v
    Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.2/specs
    gcc version 2.95.2 2220 (Debian GNU/Linux)
    $ cat modified.c
    #include <stdio.h>
    #include <fpu_control.h>

    int main() {
            int a = 10;
            fpu_control_t foo;
            _FPU_GETCW(foo);
            printf("%04x %d %d\n", foo,
                   (int)( a*.3 +  a*.7),   /* first expression */
                   (int)(10*.3 + 10*.7));  /* second expression */
            return 0;
    }
    $ gcc modified.c
    $ ./a.out
    037f 9 10
    $ gcc -O2 modified.c
    $ ./a.out
    037f 10 10
    $ gcc modified.c -lieee
    $ ./a.out
    037f 9 10
    $ gcc -O2 modified.c -lieee
    $ ./a.out
    037f 10 10
    $

As you can see, linking with "-lieee" has no effect on the control
word setting or the results.  Perhaps this has changed
post-glibc-2.1.3?  Looking at the 2.1.3 code, it appears that all
"-lieee" does is set a variable that's referenced in the math library
innards.  It has no effect on startup code right now.

> When not linked with "-lieee", Linux personality ELF x86 binaries
> start with Precision Control set to 3, just because that is how the
> x86 fninit instruction sets it.

Yes.  I know.  In fact, the "fninit" instruction is executed in the
kernel's "init_fpu()" when the first FPU instruction is executed by
the program.
I just think the hardware default happens to be a bad default on a
system where most floating-point software is GCC-compiled with GCC's
64-bit doubles (and its 64-bit clean but 80-bit dirty floating point
optimizations), so I'm proposing adding an instruction to
"init_fpu()" to change the default hardware control word.

> In general, I think most real uses of floating point are for "fast
> and sloppy" purposes, and programs that want to use floating point
> and care about exact reproducibility will link with "-lieee".

However, this doesn't seem to work.  Nor does "-ffloat-store".

Kevin
[EMAIL PROTECTED]
Re: RFC: changing precision control setting in initial FPU context
> with GCC's 64-bit doubles (and its 64-bit clean but 80-bit dirty
> floating point optimizations), so I'm proposing adding an
> instruction to "init_fpu()" to change the default hardware control
> word.

You want people's existing applications to suddenly and magically
change their results.  Umm, problem.

If your app needs a specific control word, then just force it in the
app.
Re: RFC: changing precision control setting in initial FPU context
Alan Cox [EMAIL PROTECTED] writes:

> You want people's existing applications to suddenly and magically
> change their results.  Umm, problem.

So, how would you feel about a mechanism whereby the kernel could be
passed a default FPU control word by the binary (with old binaries,
by default, using the old default control word)?

There's already an ELF AT_FPUCW auxv entry type.  What if this was
used by the kernel, rather than the C library (as it is now), to set
a default to be used in "init_fpu()" when and if the program executed
a floating point instruction?

Then, a compiler startup-code writer would be able to specify a
default control word for binaries that was appropriate for (new)
programs generated by that compiler *WITHOUT* worrying about whether
he was accidentally turning a non-FP program into an FP program by
introducing "fnstcw" as its only FPU instruction.

The C library is already trying to do this (setting the CW based on
the AT_FPUCW vector).  It just can't do it *right* because it doesn't
know if the program is really FP.  It just guesses that if the
AT_FPUCW vector contains something other than the hard-coded
_FPU_DEFAULT (which is supposed to be equal to the kernel default: it
isn't, but it's close enough), it must be set; otherwise, it's left
alone.

Kevin
[EMAIL PROTECTED]
RFC: changing precision control setting in initial FPU context
A question recently came up in "c.o.l.d.s"; actually, it was a
comment on Slashdot that had been cross-posted to 15 Usenet groups by
some ignoramus.  It concerned a snippet of C code that cast a double
to int in such a way as to get a different answer under i386 Linux
than under the i386 free BSDs and most non-i386 architectures.  In
fact, the exact same assembly, running under Linux and under FreeBSD
on the same machine, reportedly gave different results.

For those who might care,

    #include <stdio.h>

    int main() {
            int a = 10;
            printf("%d %d\n",           /* now for some BAD CODE! */
                   (int)( a*.3 +  a*.7),   /* first expression */
                   (int)(10*.3 + 10*.7));  /* second expression */
            return 0;
    }

when compiled under GCC *without optimization*, will print "9 10" on
i386 Linux and "10 10" most every place else.  (And, by the way, if
you sit down with a pencil and paper, you'll find that IEEE 754
arithmetic in 32-bit, 64-bit, or 80-bit precision tells us that
floor(10*.3 + 10*.7) == 10, not 9.)

It boils down to the fact that, under i386 Linux, the FPU control
word has its precision control (PC) set to 3 (for 80-bit extended
precision) while under i386 FreeBSD, NetBSD, and others, it's set to
2 (for 64-bit double precision).  On other architectures, I assume
there's usually no mismatch between the C "double" precision and the
FPU's default internal precision.

To be specific, under Linux, the first expression takes 64-bit
versions of the constants 0.3 and 0.7 (each slightly less than the
true values of 0.3 and 0.7), and does 80-bit multiplies and an add to
get a number slightly less than 10.  This gets truncated to 9.  On
the other hand, under the BSDs, the 64-bit add rounds upward before
the truncation, giving the answer "10".
The second expression always produces 10 (and, with -O2, the first
also produces 10), probably because GCC itself either does all the
constant optimization arithmetic (including forming the constants 0.3
and 0.7) in 80 bits or stores the interim results often enough in
64-bit registers to make it come out "right".

Initially, I was quick to dismiss the whole thing as symptomatic of a
severe floating-point-related cluon shortage.  However, the more I
think about it, the better the case seems for changing the Linux
default:

1. First, PC=3 is a dangerous setting.  A floating point program
   using "double"s, compiled with GCC without attention to
   FPU-related compilation options, won't do IEEE arithmetic running
   under this setting.  Instead, it will use a mixture of 80-bit and
   64-bit IEEE arithmetic depending rather unpredictably on compiler
   register allocations and optimizations.

2. Second, PC=3 is a mostly *useless* setting for GCC-compiled
   programs.  There can obviously be no way to guarantee reliable
   IEEE 80-bit arithmetic in GCC-compiled code when "double"s are
   only 64 bits, so our only hope is to guarantee reliable IEEE
   64-bit arithmetic.  But then we should have set PC=2 in the first
   place.  Worse yet, I don't know of any compilation flags that
   *can* guarantee IEEE 64-bit arithmetic.  I would have thought
   -ffloat-store would do the trick, but it doesn't change the
   assembly generated for the above example, at least on my Debian
   potato build of gcc 2.95.2.  The only use for PC=3 is in
   hand-assembled code (or perhaps using GCC "long double"s); in
   those cases, the people doing the coding (or the compiler) should
   know enough to set the required control word value.

3. Finally, the setting is incompatible with other Unixish platforms.
   As mentioned, Free/NetBSD both use PC=2, and most non-i386 FPU
   architectures appear to use a floating point representation that
   matches their C "double" precision, which prevents these kinds of
   surprises.
The case against, as I see it, boils down to this:

1. The current setting is the hardware default, so it somehow "makes
   sense" to leave it.

2. It could potentially break existing code.  Can anyone guess how
   badly?

3. Implementation is a bit of a pain.  It requires both kernel and
   libc changes.

On the third point, Ulrich and Adam hashed out weirdness with the FPU
control word setting some time ago in the context of selecting IEEE
or POSIX error handling behavior with "-lieee" without thwarting the
kernel's lazy FPU context initialization scheme.  So, on a related
note, is it reasonable to consider resurrecting the "sys_setfpucw"
idea at this point, to push the decision on the correct initial
control word up to the C library level where it belongs?

(For those who don't remember the proposal, the idea is that the C
library can use "sys_setfpucw" to set the desired initial control
word.  If the C program actually executes an FPU instruction, the
kernel will use that saved control word to initialize the FPU context
in "init_fpu()"; otherwise, lazy FPU initialization proceeds exactly
as it does now.)