Re: RFC: changing precision control setting in initial FPU context
And Kevin Buhr writes:
> > What Linux does presently on x86 is as right as right can be on
> > this platform.
>
> I'm not so sure.

Let me rephrase: According to a designer of the x87 and one of the
IEEE 754 authors, the behavior currently in Linux and glibc is
reasonable on x86. Reasonable is the best you can hope for in
floating-point.

Double-rounding from intermediate spills isn't reasonable, but that's
neither a kernel nor a C library issue. Tackling that issue in the
compiler is difficult. MS punted, and gcc is trying to get things
right (or has; I've lost track -- search for `XF', `mode', and `spill'
in the archives). If you want plain single- or double-precision
arithmetic, use a recent IA-32 with SSE2 instructions.

What I should have done in my first response was to refer you to Doug
Priest's supplement to David Goldberg's ``What Every Computer
Scientist Should Know about Floating-Point Arithmetic''. Of course,
you first need to read the paper itself. You can find a copy at
http://www.validgh.com/. Read it with paper, pencil, and calculator
handy. You'll want to work out some examples for yourself. The
supplement covers the issues well.

If you really want to get upset at operating systems, complain about
their lack of support for efficient floating-point exception
handling. ;) (Or search for wmexcp, which will kill that complaint on
x86 Linux.)

Jason
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel"
in the body of a message to [EMAIL PROTECTED]
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: changing precision control setting in initial FPU context
[EMAIL PROTECTED] (Kevin Buhr) writes:
> > You want people's existing applications to suddenly and magically
> > change their results. Umm problem.
>
> So, how would you feel about a mechanism whereby the kernel could be
> passed a default FPU control word by the binary (with old binaries, by
> default,

There will be no change whatsoever with me. The existing ABI is fixed.
If you want your programs to behave differently, set the mode
appropriately.

I have not the slightest interest in seeing applications (including
the libc) being broken just because of this stupid idea. No kernel and
no libc modifications are necessary. This is the end of the story as
far as I'm concerned.

--
Ulrich Drepper, Red Hat
1325 Chesapeake Terrace, Sunnyvale, CA 94089 USA
drepper at redhat.com
Re: RFC: changing precision control setting in initial FPU context
Alan Cox <[EMAIL PROTECTED]> writes:
> You want people's existing applications to suddenly and magically
> change their results. Umm problem.

So, how would you feel about a mechanism whereby the kernel could be
passed a default FPU control word by the binary (with old binaries, by
default, using the old default control word)?

There's already an ELF AT_FPUCW auxv entry type. What if this was used
by the kernel, rather than the C library (as it is now), to set a
default to be used in "init_fpu()" when and if the program executed a
floating point instruction? Then, a compiler startup-code writer would
be able to specify a default control word for binaries that was
appropriate for (new) programs generated by that compiler *WITHOUT*
worrying about whether he was accidentally turning a non-FP program
into an FP program by introducing "fnstcw" as its only FPU
instruction.

The C library is already trying to do this (setting the CW based on
the AT_FPUCW vector). It just can't do it *right* because it doesn't
know if the program is really FP. It just guesses: if the AT_FPUCW
vector contains something other than the hard-coded _FPU_DEFAULT
(which is supposed to be equal to the kernel default: it isn't, but
it's close enough), the control word is set; otherwise, it's left
alone.

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
> with GCC's 64-bit doubles (and its 64-bit clean but 80-bit dirty
> floating point optimizations), so I'm proposing adding an instruction
> to "init_fpu()" to change the default hardware control word.

You want people's existing applications to suddenly and magically
change their results. Umm problem.

If your app needs a specific control word then just force it in the
app.
Re: RFC: changing precision control setting in initial FPU context
"Adam J. Richter" <[EMAIL PROTECTED]> writes:
> IEEE-754 floating point is available under glibc-based systems,
> including most current GNU/Linux distributions, by linking with
> -lieee. Your example program produces the "9 10" result you wanted
> when linked this way, even when compiled with -O2.

No, you've got it backwards. The "9 10" result is the *wrong* result.
IEEE 64-bit arithmetic should give the result "10 10".

Also, I can't duplicate your outcome. I see no difference linking with
"-lieee" versus linking without it, at least under glibc-2.1.3:

$ gcc -v
Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.2/specs
gcc version 2.95.2 2220 (Debian GNU/Linux)
$ cat modified.c
#include <stdio.h>
#include <fpu_control.h>

int main()
{
        int a = 10;
        fpu_control_t foo;

        _FPU_GETCW(foo);
        printf("%04x %d %d\n", foo,
               (int)( a*.3 +  a*.7),    /* first expression */
               (int)(10*.3 + 10*.7));   /* second expression */
        return 0;
}
$ gcc modified.c && ./a.out
037f 9 10
$ gcc -O2 modified.c && ./a.out
037f 10 10
$ gcc modified.c -lieee && ./a.out
037f 9 10
$ gcc -O2 modified.c -lieee && ./a.out
037f 10 10
$

As you can see, linking with "ieee" has no effect on the control word
setting or the results. Perhaps this has changed post-glibc 2.1.3?
Looking at the 2.1.3 code, it appears that all "ieee" does is set a
variable that's referenced in the math library innards. It has no
effect on startup code right now.

> When not linked with "-lieee", Linux personality ELF x86 binaries
> start with Precision Control set to 3, just because that is how the
> x86 fninit instruction sets it.

Yes. I know. In fact, the "fninit" instruction is executed in the
kernel's "init_fpu()" when the first FPU instruction is executed by
the program.
I just think the hardware default happens to be a bad default on a
system where most floating-point software is GCC-compiled with GCC's
64-bit doubles (and its 64-bit clean but 80-bit dirty floating point
optimizations), so I'm proposing adding an instruction to "init_fpu()"
to change the default hardware control word.

> In general, I think most real uses of floating point are for "fast
> and sloppy" purposes, and programs that want to use floating point
> and care about exact reproducibility will link with "-lieee".

However, this doesn't seem to work. Nor does "-ffloat-store".

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
Jason Riedy <[EMAIL PROTECTED]> writes:
> Note that getting what some people want to call `true' IEEE 754
> arithmetic on an x86 is frightfully tricky. Changing the precision
> does not shorten the exponent field, and that can have, um, fun
> effects on and around under/overflow.

Whoops. This is an important point and something I'd missed.

> What Linux does presently on x86 is as right as right can be on
> this platform.

I'm not so sure. If most floating point programs and math libraries
used 80-bit "long double"s (and if GCC did 80-bit arithmetic
correctly, as you seem to imply it doesn't), then I would argue that
the current default is a perfect default.

As it is, I think most C floating point software (that isn't written
by i386 FPU gurus) is written with "double"s, written without
attention to the FPU control word, and compiled with no special
options. These programs can be made, at least, predictable with
respect to compiler optimizations and compatible with many other
architectures if we change the default to the *BSD choice.

> The *BSD choice is valid by some lines of thought, but it also
> denies people the happy accident of computing with more precision
> and range than they thought they needed.

If this "accident" happened reliably when the program was compiled
with and without "-O2", or if this "accident" couldn't be affected by,
say, which branch of an if-else was taken (by means of causing a
reload from a 64-bit memory location in one case and not the other,
for example), and if this accident was compatible with other i386
Unixish operating systems, it would, indeed, be a *happy* accident.
Here, I think it's just an accident.

Someone whose code actually benefits from extra mantissa precision
beyond 53 bits without them understanding the intricacies of i387
programming needs to be pummeled with a stick. Of course, someone
whose code *breaks* from extra mantissa precision *also* needs to be
pummeled with a stick.
But, in between beatings, I'd still like to get the default changed.

It seems to me that this issue is a little different from, say, the
"Linux modifies the timeout field in select calls" kind of
incompatibility. If an FP program under Linux behaved differently but,
at least, reliably and predictably, I wouldn't be bringing this up. An
incompatible implementation that *also* leads to bizarre surprises
(with any change to compilation flags, program flow, phase of the
moon, whatever), especially when the alternative, compatible
implementation *doesn't* lead to surprises: well, that's what has
gotten me up in arms.

> Overall, computing with x86 double-extended is a good thing so long
> as you don't introduce multiple roundings. That's a compiler issue,
> not a kernel one.

Yes, maybe it is. The issue as I see it is to set a reasonable default
floating-point policy without compromising Linux's lazy FPU context
switching: it can't be done in the C library startup code without a
kernel change. It *could* be done by the compiler (which would clearly
know when a particular function used floating point and what control
word setting was appropriate). It's something to think about, at any
rate.

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
"Albert D. Cahalan" <[EMAIL PROTECTED]> writes:
> > Well, yes, but I'll try not to cry myself to sleep over it. I'm
> > tempted to say that someone who chooses to use "float"s has given
> > up all pretense of caring about the answers they get. And, if they
> > really want to do predictable math with floats they can change the
> > FPU control word from whatever its default is to PC==0.
>
> There are algorithms which work fine using 32-bit floating-point,
> but which become unstable when you get unpredictable precision.
> It is reasonable to use such an algorithm and some 64-bit math in
> the same program. So there isn't any correct x86 setting.

So what? Of course there's no "correct" x86 setting for all
situations. In this particular situation, you will need to change the
PC on a function-by-function basis. I'm just suggesting there might be
a better *default* PC than the current one.

> That would be an awful idea. There are two main useful behaviors:
>
> 1. Pure IEEE for 32-bit, 64-bit, and 80-bit floating-point values.
>    The compiler rounds intermediate values by writing to memory
>    or by adjusting the precision control before each operation.
>
> 2. Extra precision when it comes free. The precision control is set
>    to 80-bit and the compiler tries to keep values in registers.
>    This is usually the more useful behavior, and it performs better.

I find it difficult to believe that anyone would find the second
alternative even remotely comparable in "usefulness" to the first. The
extra precision isn't free; it comes at the expense of predictable
program behavior and compatibility with other i386 and non-i386
architectures.

> What you are suggesting is a gross hybrid. You claim it has something
> to do with IEEE, but it doesn't handle 32-bit math correctly. Your
> proposal is NOT true IEEE math.
What I am suggesting would permit IEEE 64-bit math to be done, in the
default configuration, in any GCC-compiled C program (with or without
optimization) that used only doubles for floating point arithmetic.
The current default PC allows no IEEE compliant GCC-compiled math in
any mode under any circumstances. It also gives unexpected anomalous
results, *and* these results differ from the behavior under FreeBSD,
NetBSD, and most non-Intel platforms.

> Woah, what kind of crap is that? You can not get true IEEE math
> by setting the precision control word at startup.

You don't; it turns out linking with "ieee" doesn't change the control
word. At one time it did, but the point was never to change the
precision control; it was to switch from POSIX to IEEE exception
handling. And it wasn't my idea.

> Check the archives: the x86 Linux ABI specifies 80-bit precision.
> This will never change. The library is supposed to assume this,
> rather than try to allow for a change that will never happen.
> Linus dished out some nice toasty flames for the libc developers
> over this.

Okay, fine. The Linux ABI can specify whatever the hell it wants.
Then, we should have a way for the library to communicate a preferred
default value to the kernel *WITHOUT* turning on the lazy FPU context
switching for every program.

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
And "Albert D. Cahalan" writes:
> 2. Extra precision when it comes free. The precision control is set
>    to 80-bit and the compiler tries to keep values in registers.
>    This is usually the more useful behavior, and it performs better.

Even better is for gcc to spill intermediate results to 80 bits.
Unfortunately, these 80 bits have to be expanded to 128 for alignment,
and this eats cache. IIRC, this has been discussed many times by gcc
developers. I don't recall the final verdict.

The original intent with the 8087 was that the compiler and/or OS
could transparently extend the stack into memory, but one necessary
feature was left out until the 80387. By that point, it was too late.
And then came caches...

> What you are suggesting is a gross hybrid. You claim it has something
> to do with IEEE, but it doesn't handle 32-bit math correctly. Your
> proposal is NOT true IEEE math.

Note that getting what some people want to call `true' IEEE 754
arithmetic on an x86 is frightfully tricky. Changing the precision
does not shorten the exponent field, and that can have, um, fun
effects on and around under/overflow. The mantissa and exponent
lengths were chosen carefully to protect against those effects in many
computations.

What Linux does presently on x86 is as right as right can be on this
platform. Compare with what MS's compilers do (die when you run out of
the fp stack slots, telling users to simplify the expressions in the
source code) and be happy. The *BSD choice is valid by some lines of
thought, but it also denies people the happy accident of computing
with more precision and range than they thought they needed. Overall,
computing with x86 double-extended is a good thing so long as you
don't introduce multiple roundings. That's a compiler issue, not a
kernel one.
Historical note: According to one of the x87 designers, this all boils
down to the simple fact that there's no time when a pair of
collaborators in California and Israel can be both awake and lucid
enough to explain things well over a noisy telephone line. Amazing
that it really wasn't long ago.

And if anyone's really interested, keep checking
http://www.cs.berkeley.edu/~wkahan/ as some of Dr. Kahan's older
papers are slowly converted and added. They give a great deal of
insight into the choices that eventually became the accepted IEEE 754
standard.

Jason
Re: RFC: changing precision control setting in initial FPU context
Kevin Buhr writes:
> "Albert D. Cahalan" <[EMAIL PROTECTED]> writes:
>> So you change it to 2... but what about the "float" type? It gets
>> a mixture of 64-bit and 32-bit IEEE arithmetic depending rather
>> unpredictably on compiler register allocations and optimizations???
>
> Well, yes, but I'll try not to cry myself to sleep over it. I'm
> tempted to say that someone who chooses to use "float"s has given up
> all pretense of caring about the answers they get. And, if they
> really want to do predictable math with floats they can change the
> FPU control word from whatever its default is to PC==0.

There are algorithms which work fine using 32-bit floating-point,
but which become unstable when you get unpredictable precision.
It is reasonable to use such an algorithm and some 64-bit math in
the same program. So there isn't any correct x86 setting.

>> If a "float" will have excess precision, then a "double" might
>> as well have it too. Usually it helps, but sometimes it hurts.
>> This is life with C on x86.
>
> That's the way I initially felt, and it looks silly when it's written
> down, so I'm glad I changed my mind.
>
> I don't think extra precision that is unpredictable is ever helpful.
> Extra precision that might be gained or lost depending on, say, which
> branch of an if-statement is taken, is of no use to anyone. It just
> causes confusion. The excess precision on "float" is a nuisance. The
> excess precision on "double" is another nuisance. It would be nice to
> eliminate one of those nuisances, at least by default.

That would be an awful idea. There are two main useful behaviors:

1. Pure IEEE for 32-bit, 64-bit, and 80-bit floating-point values.
   The compiler rounds intermediate values by writing to memory
   or by adjusting the precision control before each operation.

2. Extra precision when it comes free. The precision control is set
   to 80-bit and the compiler tries to keep values in registers.
   This is usually the more useful behavior, and it performs better.

What you are suggesting is a gross hybrid. You claim it has something
to do with IEEE, but it doesn't handle 32-bit math correctly. Your
proposal is NOT true IEEE math.

>> Ugh, more start-up crud.
>
> The startup crud is already there. It's used to allow linking with
> "-lieee" to set a new control word value, for example, and it's

Woah, what kind of crap is that? You can not get true IEEE math by
setting the precision control word at startup. This is a bug. The
compiler must save values to memory or adjust the precision control as
needed. For example, the precision control could be loaded on function
entry. This may be optimized away for some "static" or "inline"
functions.

> To me, a system call (not necessarily a *new* system call, but some
> way to get the desired FPU control word to the kernel) seems like a
> more elegant solution.
>
> On the other hand, I'm not married to the idea. I'd rather just get
> the default control word changed in the kernel.

Check the archives: the x86 Linux ABI specifies 80-bit precision.
This will never change. The library is supposed to assume this,
rather than try to allow for a change that will never happen.
Linus dished out some nice toasty flames for the libc developers
over this.
Re: RFC: changing precision control setting in initial FPU context
IEEE-754 floating point is available under glibc-based systems,
including most current GNU/Linux distributions, by linking with
-lieee. Your example program produces the "9 10" result you wanted
when linked this way, even when compiled with -O2.

When not linked with "-lieee", Linux personality ELF x86 binaries
start with Precision Control set to 3, just because that is how the
x86 fninit instruction sets it.

I thought that libieee was also available at run time for dynamic
executables by doing something like
"LD_PRELOAD=/usr/lib/libieee.so my_dynamic_executable", so you could
set it in your .bashrc if you wanted, but that apparently is not the
case, at least under glibc-2.2.2. I will have to try to figure out why
this is not available.

I am a bit out of my depth when discussing the advantages of
occasional 80 bit precision over 64 bit, but I think that there are
situations where getting gratuitously more accurate results helps,
like getting faster convergence in some scientific numerical methods,
such as Newton's method. (You'll still find the same point of
convergence if there is only one, but the program will run faster.)
Another example would be things like 3D lighting calculations (used in
games?) where you want to produce the best images that you can within
that CPU budget. I don't know of any sound encodings where a fully
optimized implementation would use floating point, but it's possible.

In general, I think most real uses of floating point are for "fast and
sloppy" purposes, and programs that want to use floating point and
care about exact reproducibility will link with "-lieee".

On the other hand, if a GNU/Linux-x86 distribution did want to change
the initial floating point control word in Linux to PC=2, I think you
would still want old programs to run in their old PC=3 environment,
just in case one relied on it.
Your sys_setfpcw suggestion would do the job (setting the default
floating point control word without flagging the process as one that
was definitely going to use floating point), but I think a simpler
approach would be to assign a different magic number argument to
setpersonality() for programs that expect to be initialized with
floating point precision control set to 2.

Adam J. Richter
[EMAIL PROTECTED]
Yggdrasil, 4880 Stevens Creek Blvd, Suite 104
San Jose, California 95129-1034, United States of America
+1 408 261-6630, fax +1 408 261-6631
"Free Software For The Rest Of Us."
Re: RFC: changing precision control setting in initial FPU context
"Albert D. Cahalan" <[EMAIL PROTECTED]> writes:
> So you change it to 2... but what about the "float" type? It gets
> a mixture of 64-bit and 32-bit IEEE arithmetic depending rather
> unpredictably on compiler register allocations and optimizations???

Well, yes, but I'll try not to cry myself to sleep over it. I'm
tempted to say that someone who chooses to use "float"s has given up
all pretense of caring about the answers they get. And, if they really
want to do predictable math with floats they can change the FPU
control word from whatever its default is to PC==0.

I guess if I had to choose between two default control word settings
so that either (A) "float" arithmetic is unpredictable but "double"
arithmetic is predictable, corresponds to 64-bit IEEE arithmetic, is
invariant under different compiler optimization settings, matches the
compiler's handling of constant folding, and mimics the behavior on
i386 FreeBSD and NetBSD and most modern, non-i386 architectures; or
(B) "float" and "double" arithmetic are both unpredictable and
non-IEEE; I'd choose (A).

> If a "float" will have excess precision, then a "double" might
> as well have it too. Usually it helps, but sometimes it hurts.
> This is life with C on x86.

That's the way I initially felt, and it looks silly when it's written
down, so I'm glad I changed my mind.

I don't think extra precision that is unpredictable is ever helpful.
Extra precision that might be gained or lost depending on, say, which
branch of an if-statement is taken, is of no use to anyone. It just
causes confusion. The excess precision on "float" is a nuisance. The
excess precision on "double" is another nuisance. It would be nice to
eliminate one of those nuisances, at least by default.

> Ugh, more start-up crud.

The startup crud is already there. It's used to allow linking with
"-lieee" to set a new control word value, for example, and it's
inelegant and ugly.
Because we don't want to set the control word on a non-FPU program and
defeat the lazy FPU context initialization, we compare the value of
the control word we want with a value hard-coded into the library
that's supposed to match the value hard-coded into the kernel. If the
two values differ, we set the control word to the new value (whether
the program actually ends up ever executing an FPU instruction or
not).

To me, a system call (not necessarily a *new* system call, but some
way to get the desired FPU control word to the kernel) seems like a
more elegant solution.

On the other hand, I'm not married to the idea. I'd rather just get
the default control word changed in the kernel.

Kevin <[EMAIL PROTECTED]>
Re: RFC: changing precision control setting in initial FPU context
Kevin Buhr writes:
> It boils down to the fact that, under i386 Linux, the FPU control
> word has its precision control (PC) set to 3 (for 80-bit extended
> precision) while under i386 FreeBSD, NetBSD, and others, it's set to
> 2 (for 64-bit double precision). On other architectures, I assume
> there's usually no mismatch between the C "double" precision and the
> FPU's default internal precision.
...
> Initially, I was quick to dismiss the whole thing as symptomatic of a
> severe floating-point-related cluon shortage. However, the more I
> think about it, the better the case seems for changing the Linux
> default:
>
> 1. First, PC=3 is a dangerous setting. A floating point program
>    using "double"s, compiled with GCC without attention to
>    FPU-related compilation options, won't do IEEE arithmetic running
>    under this setting. Instead, it will use a mixture of 80-bit and
>    64-bit IEEE arithmetic depending rather unpredictably on compiler
>    register allocations and optimizations.
>
> 2. Second, PC=3 is a mostly *useless* setting for GCC-compiled
>    programs. There can obviously be no way to guarantee reliable
>    IEEE 80-bit arithmetic in GCC-compiled code when "double"s are
>    only 64 bits, so our only hope is to guarantee reliable IEEE
>    64-bit arithmetic. But, then we should have set PC=2 in the first
>    place.

So you change it to 2... but what about the "float" type? It gets
a mixture of 64-bit and 32-bit IEEE arithmetic depending rather
unpredictably on compiler register allocations and optimizations???

If a "float" will have excess precision, then a "double" might
as well have it too. Usually it helps, but sometimes it hurts.
This is life with C on x86.

> So, on a related note, is it reasonable to consider resurrecting the
> "sys_setfpucw" idea at this point, to push the decision on the
> correct initial control word up to the C library level where it
> belongs?
> (For those who don't remember the proposal, the idea is that the C
> library can use "sys_setfpucw" to set the desired initial control
> word.

Ugh, more start-up crud.
Re: RFC: changing precision control setting in initial FPU context
Kevin Buhr writes: It boils down to the fact that, under i386 Linux, the FPU control word has its precision control (PC) set to 3 (for 80-bit extended precision) while under i386 FreeBSD, NetBSD, and others, it's set to 2 (for 64-bit double precision). On other architectures, I assume there's usually no mismatch between the C "double" precision and the FPU's default internal precision. ... Initially, I was quick to dismiss the whole thing as symptomatic of a severe floating-point-related cluon shortage. However, the more I think about it, the better the case seems for changing the Linux default: 1. First, PC=3 is a dangerous setting. A floating point program using "double"s, compiled with GCC without attention to FPU-related compilation options, won't do IEEE arithmetic running under this setting. Instead, it will use a mixture of 80-bit and 64-bit IEEE arithmetic depending rather unpredictably on compiler register allocations and optimizations. 2. Second, PC=3 is a mostly *useless* setting for GCC-compiled programs. There can obviously be no way to guarantee reliable IEEE 80-bit arithmetic in GCC-compiled code when "double"s are only 64 bits, so our only hope is to guarantee reliable IEEE 64-bit arithmetic. But, then we should have set PC=2 in the first place. So you change it to 2... but what about the "float" type? It gets a mixture of 64-bit and 32-bit IEEE arithmetic depending rather unpredictably on compiler register allocations and optimizations??? If a "float" will have excess precision, then a "double" might as well have it too. Usually it helps, but sometimes it hurts. This is life with C on x86. So, on a related note, is it reasonable to consider resurrecting the "sys_setfpucw" idea at this point, to push the decision on the correct initial control word up to the C library level where it belongs? (For those who don't remember the proposal, the idea is that the C library can use "sys_setfpucw" to set the desired initial control word. 
Ugh, more start-up crud. - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to [EMAIL PROTECTED] More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: RFC: changing precision control setting in initial FPU context
"Albert D. Cahalan" [EMAIL PROTECTED] writes: So you change it to 2... but what about the "float" type? It gets a mixture of 64-bit and 32-bit IEEE arithmetic depending rather unpredictably on compiler register allocations and optimizations??? Well, yes, but I'll try not to cry myself to sleep over it. I'm tempted to say that someone who chooses to use "float"s has given up all pretense of caring about the answers they get. And, if they really want to do predictable math with floats they can change the FPU control word from whatever its default is to PC==0. I guess if I had to choose between two default control word settings so that either (A) "float" arithmetic is unpredictable but "double" arithmetic is predictable, corresponds to 64-bit IEEE arithmetic, is invariant under different compiler optimization settings, matches the compiler's handling of constant folding, and mimics the behavior on i386 FreeBSD and NetBSD and most modern, non-i386 architectures; or (B) "float" and "double" arithmetic are both unpredictable and non-IEEE; I'd choose (A). If a "float" will have excess precision, then a "double" might as well have it too. Usually it helps, but sometimes it hurts. This is life with C on x86. That's the way I initially felt, and it looks silly when it's written down, so I'm glad I changed my mind. I don't think extra precision that is unpredictable is ever helpful. Extra precision that might be gained or lost depending on, say, which branch of an if-statement is taken, is of no use to anyone. It just causes confusion. The excess precision on "float" is a nuisance. The excess precision on "double" is another nuisance. It would be nice to eliminate one of those nuisances, at least by default. Ugh, more start-up crud. The startup crud is already there. It's used to allow linking with "-lieee" to set a new control word value, for example, and it's inelegant and ugly. 
Because we don't want to set the control word on a non-FPU program
and defeat the lazy FPU context initialization, we compare the value
of the control word we want with a value hard-coded into the library
that's supposed to match the value hard-coded into the kernel.  If
the two values differ, we set the control word to the new value
(whether the program actually ends up ever executing an FPU
instruction or not).

To me, a system call (not necessarily a *new* system call, but some
way to get the desired FPU control word to the kernel) seems like a
more elegant solution.  On the other hand, I'm not married to the
idea.  I'd rather just get the default control word changed in the
kernel.

Kevin
[EMAIL PROTECTED]
Re: RFC: changing precision control setting in initial FPU context
IEEE-754 floating point is available under glibc-based systems,
including most current GNU/Linux distributions, by linking with
-lieee.  Your example program produces the "9 10" result you wanted
when linked this way, even when compiled with -O2.

When not linked with "-lieee", Linux personality ELF x86 binaries
start with Precision Control set to 3, just because that is how the
x86 fninit instruction sets it.

I thought that libieee was also available at run time for dynamic
executables by doing something like
"LD_PRELOAD=/usr/lib/libieee.so my_dynamic_executable", so you could
set it in your .bashrc if you wanted, but that apparently is not the
case, at least under glibc-2.2.2.  I will have to try to figure out
why this is not available.

I am a bit out of my depth when discussing the advantages of
occasional 80-bit precision over 64-bit, but I think that there are
situations where getting gratuitously more accurate results helps,
like getting faster convergence in some scientific numerical methods,
such as Newton's method.  (You'll still find the same point of
convergence if there is only one, but the program will run faster.)
Another example would be things like 3D lighting calculations (used
in games?) where you want to produce the best images that you can
within that CPU budget.  I don't know of any sound encodings where a
fully optimized implementation would use floating point, but it's
possible.

In general, I think most real uses of floating point are for "fast
and sloppy" purposes, and programs that want to use floating point
and care about exact reproducibility will link with "-lieee".

On the other hand, if a GNU/Linux-x86 distribution did want to change
the initial floating point control word in Linux to PC=2, I think you
would still want old programs to run in their old PC=3 environment,
just in case one relied on it.
Your sys_setfpcw suggestion could do that (set the default floating
point control word without flagging the process as one that was
definitely going to use floating point), but I think a simpler
approach would be to assign a different magic number argument to
personality() for programs that expect to be initialized with
floating point precision control set to 2.

Adam J. Richter   __  __   4880 Stevens Creek Blvd, Suite 104
[EMAIL PROTECTED]   \ /    San Jose, California 95129-1034
+1 408 261-6630   | g g d r a s i l   United States of America
fax +1 408 261-6631   "Free Software For The Rest Of Us."
Re: RFC: changing precision control setting in initial FPU context
Kevin Buhr writes:

> "Albert D. Cahalan" [EMAIL PROTECTED] writes:
>
>> So you change it to 2... but what about the "float" type?  It gets
>> a mixture of 64-bit and 32-bit IEEE arithmetic depending rather
>> unpredictably on compiler register allocations and optimizations???
>
> Well, yes, but I'll try not to cry myself to sleep over it.  I'm
> tempted to say that someone who chooses to use "float"s has given up
> all pretense of caring about the answers they get.  And, if they
> really want to do predictable math with floats they can change the
> FPU control word from whatever its default is to PC==0.

There are algorithms which work fine using 32-bit floating-point, but
which become unstable when you get unpredictable precision.  It is
reasonable to use such an algorithm and some 64-bit math in the same
program.  So there isn't any correct x86 setting.

>> If a "float" will have excess precision, then a "double" might as
>> well have it too.  Usually it helps, but sometimes it hurts.  This
>> is life with C on x86.
>
> That's the way I initially felt, and it looks silly when it's
> written down, so I'm glad I changed my mind.  I don't think extra
> precision that is unpredictable is ever helpful.  Extra precision
> that might be gained or lost depending on, say, which branch of an
> if-statement is taken is of no use to anyone.  It just causes
> confusion.
>
> The excess precision on "float" is a nuisance.  The excess precision
> on "double" is another nuisance.  It would be nice to eliminate one
> of those nuisances, at least by default.

That would be an awful idea.  There are two main useful behaviors:

1. Pure IEEE for 32-bit, 64-bit, and 80-bit floating-point values.
   The compiler rounds intermediate values by writing to memory or by
   adjusting the precision control before each operation.

2. Extra precision when it comes free.  The precision control is set
   to 80-bit and the compiler tries to keep values in registers.
   This is usually the more useful behavior, and it performs better.

What you are suggesting is a gross hybrid.
You claim it has something to do with IEEE, but it doesn't handle
32-bit math correctly.  Your proposal is NOT true IEEE math.

>> Ugh, more start-up crud.
>
> The startup crud is already there.  It's used to allow linking with
> "-lieee" to set a new control word value, for example, and it's

Whoa, what kind of crap is that?  You cannot get true IEEE math by
setting the precision control word at startup.  This is a bug.  The
compiler must save values to memory or adjust the precision control
as needed.  For example, the precision control could be loaded on
function entry.  This may be optimized away for some "static" or
"inline" functions.

> To me, a system call (not necessarily a *new* system call, but some
> way to get the desired FPU control word to the kernel) seems like a
> more elegant solution.  On the other hand, I'm not married to the
> idea.  I'd rather just get the default control word changed in the
> kernel.

Check the archives: the x86 Linux ABI specifies 80-bit precision.
This will never change.  The library is supposed to assume this,
rather than try to allow for a change that will never happen.  Linus
dished out some nice toasty flames for the libc developers over this.
Re: RFC: changing precision control setting in initial FPU context
And "Albert D. Cahalan" writes:
 -
 - 2. Extra precision when it comes free.  The precision control is set
 -    to 80-bit and the compiler tries to keep values in registers.
 -    This is usually the more useful behavior, and it performs better.

Even better is for gcc to spill intermediate results to 80 bits.
Unfortunately, these 80 bits have to be expanded to 128 for
alignment, and this eats cache.  IIRC, this has been discussed many
times by gcc developers.  I don't recall the final verdict.

The original intent with the 8087 was that the compiler and/or OS
could transparently extend the stack into memory, but one necessary
feature was left out until the 80387.  By that point, it was too
late.  And then came caches...

 - What you are suggesting is a gross hybrid.  You claim it has something
 - to do with IEEE, but it doesn't handle 32-bit math correctly.  Your
 - proposal is NOT true IEEE math.

Note that getting what some people want to call `true' IEEE 754
arithmetic on an x86 is frightfully tricky.  Changing the precision
does not shorten the exponent field, and that can have, um, fun
effects on and around under/overflow.  The mantissa and exponent
lengths were chosen carefully to protect against those effects in
many computations.

What Linux does presently on x86 is as right as right can be on this
platform.  Compare with what MS's compilers do (die when you run out
of the fp stack slots, telling users to simplify the expressions in
the source code) and be happy.

The *BSD choice is valid by some lines of thought, but it also denies
people the happy accident of computing with more precision and range
than they thought they needed.  Overall, computing with x86
double-extended is a good thing so long as you don't introduce
multiple roundings.  That's a compiler issue, not a kernel one.
Historical note:  According to one of the x87 designers, this all
boils down to the simple fact that there's no time when a pair of
collaborators in California and Israel can both be awake and lucid
enough to explain things well over a noisy telephone line.  Amazing
that it really wasn't long ago.

And if anyone's really interested, keep checking
http://www.cs.berkeley.edu/~wkahan/ as some of Dr. Kahan's older
papers are slowly converted and added.  They give a great deal of
insight into the choices that eventually became the accepted IEEE 754
standard.

Jason
Re: RFC: changing precision control setting in initial FPU context
"Albert D. Cahalan" [EMAIL PROTECTED] writes:

>> Well, yes, but I'll try not to cry myself to sleep over it.  I'm
>> tempted to say that someone who chooses to use "float"s has given
>> up all pretense of caring about the answers they get.  And, if they
>> really want to do predictable math with floats they can change the
>> FPU control word from whatever its default is to PC==0.
>
> There are algorithms which work fine using 32-bit floating-point,
> but which become unstable when you get unpredictable precision.  It
> is reasonable to use such an algorithm and some 64-bit math in the
> same program.  So there isn't any correct x86 setting.

So what?  Of course there's no "correct" x86 setting for all
situations.  In this particular situation, you will need to change
the PC on a function-by-function basis.  I'm just suggesting there
might be a better *default* PC than the current one.

> That would be an awful idea.  There are two main useful behaviors:
>
> 1. Pure IEEE for 32-bit, 64-bit, and 80-bit floating-point values.
>    The compiler rounds intermediate values by writing to memory or
>    by adjusting the precision control before each operation.
>
> 2. Extra precision when it comes free.  The precision control is
>    set to 80-bit and the compiler tries to keep values in
>    registers.  This is usually the more useful behavior, and it
>    performs better.

I find it difficult to believe that anyone would find the second
alternative even remotely comparable in "usefulness" to the first.
The extra precision isn't free; it comes at the expense of
predictable program behavior and compatibility with other i386 and
non-i386 architectures.

> What you are suggesting is a gross hybrid.  You claim it has
> something to do with IEEE, but it doesn't handle 32-bit math
> correctly.  Your proposal is NOT true IEEE math.

What I am suggesting would permit IEEE 64-bit math to be done, in the
default configuration, in any GCC-compiled C program (with or without
optimization) that used only doubles for floating point arithmetic.
The current default PC allows no IEEE-compliant GCC-compiled math in
any mode under any circumstances.  It also gives unexpected anomalous
results, *and* these results differ from the behavior under FreeBSD,
NetBSD, and most non-Intel platforms.

> Whoa, what kind of crap is that?  You cannot get true IEEE math by
> setting the precision control word at startup.

You don't; it turns out linking with "-lieee" doesn't change the
control word.  At one time it did, but the point was never to change
the precision control; it was to switch from POSIX to IEEE exception
handling.  And it wasn't my idea.

> Check the archives: the x86 Linux ABI specifies 80-bit precision.
> This will never change.  The library is supposed to assume this,
> rather than try to allow for a change that will never happen.  Linus
> dished out some nice toasty flames for the libc developers over
> this.

Okay, fine.  The Linux ABI can specify whatever the hell it wants.
Then, we should have a way for the library to communicate a preferred
default value to the kernel *WITHOUT* defeating lazy FPU context
switching for every program.

Kevin
[EMAIL PROTECTED]
Re: RFC: changing precision control setting in initial FPU context
Jason Riedy [EMAIL PROTECTED] writes:

> Note that getting what some people want to call `true' IEEE 754
> arithmetic on an x86 is frightfully tricky.  Changing the precision
> does not shorten the exponent field, and that can have, um, fun
> effects on and around under/overflow.

Whoops.  This is an important point and something I'd missed.

> What Linux does presently on x86 is as right as right can be on
> this platform.

I'm not so sure.  If most floating point programs and math libraries
used 80-bit "long double"s (and if GCC did 80-bit arithmetic
correctly, as you seem to imply it doesn't), then I would argue that
the current default is a perfect default.

As it is, I think most C floating point software (that isn't written
by i386 FPU gurus) is written with "double"s, written without
attention to the FPU control word, and compiled with no special
options.  These programs can be made, at least, predictable with
respect to compiler optimizations and compatible with many other
architectures if we change the default to the *BSD choice.

> The *BSD choice is valid by some lines of thought, but it also
> denies people the happy accident of computing with more precision
> and range than they thought they needed.

If this "accident" happened reliably when the program was compiled
with and without "-O2", or if this "accident" couldn't be affected
by, say, which branch of an if-else was taken (by means of causing a
reload from a 64-bit memory location in one case and not the other,
for example), and if this accident was compatible with other i386
Unixish operating systems, it would, indeed, be a *happy* accident.
Here, I think it's just an accident.

Someone whose code actually benefits from extra mantissa precision
beyond 53 bits without them understanding the intricacies of i387
programming needs to be pummeled with a stick.  Of course, someone
whose code *breaks* from extra mantissa precision *also* needs to be
pummeled with a stick.
But, in between beatings, I'd still like to get the default changed.

It seems to me that this issue is a little different from, say, the
"Linux modifies the timeout field in select calls" kind of
incompatibility.  If an FP program under Linux behaved differently
but, at least, reliably and predictably, I wouldn't be bringing this
up.  An incompatible implementation that *also* leads to bizarre
surprises (with any change to compilation flags, program flow, phase
of the moon, whatever), especially when the alternative, compatible
implementation *doesn't* lead to surprises... well, that's what has
gotten me up in arms.

> Overall, computing with x86 double-extended is a good thing so long
> as you don't introduce multiple roundings.  That's a compiler
> issue, not a kernel one.

Yes, maybe it is.  The issue as I see it is to set a reasonable
default floating-point policy without compromising Linux's lazy FPU
context switching---it can't be done in the C library startup code
without a kernel change.  It *could* be done by the compiler (which
would clearly know when a particular function used floating point and
what control word setting was appropriate).  It's something to think
about, at any rate.

Kevin
[EMAIL PROTECTED]
Re: RFC: changing precision control setting in initial FPU context
"Adam J. Richter" [EMAIL PROTECTED] writes:

> IEEE-754 floating point is available under glibc-based systems,
> including most current GNU/Linux distributions, by linking with
> -lieee.  Your example program produces the "9 10" result you wanted
> when linked this way, even when compiled with -O2.

No, you've got it backwards.  The "9 10" result is the *wrong*
result.  IEEE 64-bit arithmetic should give the result "10 10".

Also, I can't duplicate your outcome.  I see no difference linking
with "-lieee" versus linking without it, at least under glibc-2.1.3:

    $ gcc -v
    Reading specs from /usr/lib/gcc-lib/i386-linux/2.95.2/specs
    gcc version 2.95.2 2220 (Debian GNU/Linux)
    $ cat modified.c
    #include <stdio.h>
    #include <fpu_control.h>

    int main() {
            int a = 10;
            fpu_control_t foo;
            _FPU_GETCW(foo);
            printf("%04x %d %d\n", foo,
                   (int)( a*.3 +  a*.7),   /* first expression */
                   (int)(10*.3 + 10*.7));  /* second expression */
            return 0;
    }
    $ gcc modified.c
    $ ./a.out
    037f 9 10
    $ gcc -O2 modified.c
    $ ./a.out
    037f 10 10
    $ gcc modified.c -lieee
    $ ./a.out
    037f 9 10
    $ gcc -O2 modified.c -lieee
    $ ./a.out
    037f 10 10
    $

As you can see, linking with "-lieee" has no effect on the control
word setting or the results.  Perhaps this has changed
post-glibc-2.1.3?  Looking at the 2.1.3 code, it appears that all
"-lieee" does is set a variable that's referenced in the math library
innards.  It has no effect on startup code right now.

> When not linked with "-lieee", Linux personality ELF x86 binaries
> start with Precision Control set to 3, just because that is how the
> x86 fninit instruction sets it.

Yes.  I know.  In fact, the "fninit" instruction is executed in the
kernel's "init_fpu()" when the first FPU instruction is executed by
the program.
I just think the hardware default happens to be a bad default on a
system where most floating-point software is GCC-compiled with GCC's
64-bit doubles (and its 64-bit clean but 80-bit dirty floating point
optimizations), so I'm proposing adding an instruction to
"init_fpu()" to change the default hardware control word.

> In general, I think most real uses of floating point are for "fast
> and sloppy" purposes, and programs that want to use floating point
> and care about exact reproducibility will link with "-lieee".

However, this doesn't seem to work.  Nor does "-ffloat-store".

Kevin
[EMAIL PROTECTED]
Re: RFC: changing precision control setting in initial FPU context
> with GCC's 64-bit doubles (and its 64-bit clean but 80-bit dirty
> floating point optimizations), so I'm proposing adding an
> instruction to "init_fpu()" to change the default hardware control
> word.

You want people's existing applications to suddenly and magically
change their results.  Umm, problem.

If your app needs a specific control word, then just force it in the
app.
Re: RFC: changing precision control setting in initial FPU context
Alan Cox [EMAIL PROTECTED] writes:

> You want people's existing applications to suddenly and magically
> change their results.  Umm, problem.

So, how would you feel about a mechanism whereby the kernel could be
passed a default FPU control word by the binary (with old binaries,
by default, using the old default control word)?

There's already an ELF AT_FPUCW auxv entry type.  What if this was
used by the kernel, rather than the C library (as it is now), to set
a default to be used in "init_fpu()" when and if the program executed
a floating point instruction?

Then, a compiler startup-code writer would be able to specify a
default control word for binaries that was appropriate for (new)
programs generated by that compiler *WITHOUT* worrying about whether
he was accidentally turning a non-FP program into an FP program by
introducing "fnstcw" as its only FPU instruction.

The C library is already trying to do this (setting the CW based on
the AT_FPUCW vector).  It just can't do it *right* because it doesn't
know if the program is really FP.  It just guesses that if the
AT_FPUCW vector contains something other than the hard-coded
_FPU_DEFAULT (which is supposed to be equal to the kernel default: it
isn't, but it's close enough), it must be set; otherwise, it's left
alone.

Kevin
[EMAIL PROTECTED]
RFC: changing precision control setting in initial FPU context
A question recently came up in "c.o.l.d.s"; actually, it was a
comment on Slashdot that had been cross-posted to 15 Usenet groups by
some ignoramus.  It concerned a snippet of C code that cast a double
to int in such a way as to get a different answer under i386 Linux
than under the i386 free BSDs and most non-i386 architectures.  In
fact, the exact same assembly, running under Linux and under FreeBSD
on the same machine, reportedly gave different results.

For those who might care,

    #include <stdio.h>

    int main() {
            int a = 10;
            printf("%d %d\n",           /* now for some BAD CODE! */
                   (int)( a*.3 +  a*.7),   /* first expression */
                   (int)(10*.3 + 10*.7));  /* second expression */
            return 0;
    }

when compiled under GCC *without optimization*, will print "9 10" on
i386 Linux and "10 10" most every place else.  (And, by the way, if
you sit down with a pencil and paper, you'll find that IEEE 754
arithmetic in 32-bit, 64-bit, or 80-bit precision tells us that
floor(10*.3 + 10*.7) == 10, not 9.)

It boils down to the fact that, under i386 Linux, the FPU control
word has its precision control (PC) set to 3 (for 80-bit extended
precision) while under i386 FreeBSD, NetBSD, and others, it's set to
2 (for 64-bit double precision).  On other architectures, I assume
there's usually no mismatch between the C "double" precision and the
FPU's default internal precision.

To be specific, under Linux, the first expression takes 64-bit
versions of the constants 0.3 and 0.7 (each slightly less than the
true values of 0.3 and 0.7), and does 80-bit multiplies and an add to
get a number slightly less than 10.  This gets truncated to 9.  On
the other hand, under the BSDs, the 64-bit add rounds upward before
the truncation, giving the answer "10".
The second expression always produces 10 (and, with -O2, the first
also produces 10), probably because GCC itself either does all the
constant optimization arithmetic (including forming the constants 0.3
and 0.7) in 80 bits or stores the interim results often enough in
64-bit registers to make it come out "right".

Initially, I was quick to dismiss the whole thing as symptomatic of a
severe floating-point-related cluon shortage.  However, the more I
think about it, the better the case seems for changing the Linux
default:

1. First, PC=3 is a dangerous setting.  A floating point program
   using "double"s, compiled with GCC without attention to
   FPU-related compilation options, won't do IEEE arithmetic running
   under this setting.  Instead, it will use a mixture of 80-bit and
   64-bit IEEE arithmetic depending rather unpredictably on compiler
   register allocations and optimizations.

2. Second, PC=3 is a mostly *useless* setting for GCC-compiled
   programs.  There can obviously be no way to guarantee reliable
   IEEE 80-bit arithmetic in GCC-compiled code when "double"s are
   only 64 bits, so our only hope is to guarantee reliable IEEE
   64-bit arithmetic.  But then we should have set PC=2 in the first
   place.  Worse yet, I don't know of any compilation flags that
   *can* guarantee IEEE 64-bit arithmetic.  I would have thought
   -ffloat-store would do the trick, but it doesn't change the
   assembly generated for the above example, at least on my Debian
   potato build of gcc 2.95.2.  The only use for PC=3 is in
   hand-assembled code (or perhaps using GCC "long double"s); in
   those cases, the people doing the coding (or the compiler) should
   know enough to set the required control word value.

3. Finally, the setting is incompatible with other Unixish platforms.
   As mentioned, Free/NetBSD both use PC=2, and most non-i386 FPU
   architectures appear to use a floating point representation that
   matches their C "double" precision, which prevents these kinds of
   surprises.
The case against, as I see it, boils down to this:

1. The current setting is the hardware default, so it somehow "makes
   sense" to leave it.

2. It could potentially break existing code.  Can anyone guess how
   badly?

3. Implementation is a bit of a pain.  It requires both kernel and
   libc changes.

On the third point, Ulrich and Adam hashed out weirdness with the FPU
control word setting some time ago in the context of selecting IEEE
or POSIX error handling behavior with "-lieee" without thwarting the
kernel's lazy FPU context initialization scheme.  So, on a related
note, is it reasonable to consider resurrecting the "sys_setfpucw"
idea at this point, to push the decision on the correct initial
control word up to the C library level where it belongs?

(For those who don't remember the proposal, the idea is that the C
library can use "sys_setfpucw" to set the desired initial control
word.  If the C program actually executes an FPU instruction, the
kernel will use that saved control word to initialize the FPU context
in "init_fpu()"; otherwise, lazy FPU initialization proceeds exactly
as it does now.)