On 09 Nov 2006 22:50:44 -0800, A. M. Archibald <[EMAIL PROTECTED]> wrote:
On 09/11/06, Paul Dubois <[EMAIL PROTECTED]> wrote:
> Since the function of old retired persons is to tell youngsters stories
> around the campfire:
I'll pull up a log. But since I'm uppity, and not especially young, I
hope you won't mind if I heckle.
> A supercomputer hardware designer told me that when forced to add IEEE
> arithmetic to his designs that it decreased performance substantially, maybe
> 25-30%; it wasn't that doing the operations took so much longer, it was that
> the increased physical space needed for that circuitry pushed the memory
> farther away. Doubtless this inspired doing some of it in software instead.
The goal of IEEE floats is not to be faster but to make doing correct
numerical programming easier. (Numerical analysts can write robust
numerical code for almost any kind of float, but the vast majority of
numerical code is written by scientists who know the bare minimum or
less about numerical analysis, so a good system makes such code work
as well as possible.)
This is an urban legend propagated to support the IEEE argument. I've worked intimately with scientific programmers for 30 years and very few of them have the characteristics you describe. It is their professional business to know the basics of numerical analysis. They use packages written by the numerical analysis community a lot, too.
The other reason IEEE floats are good is, no disrespect to your
hardware designer friend intended, that they were designed with some
care. Reimplementations are liable to get some of the finicky details
wrong (round-to-even, denormalized numbers, what have you...). I found
Kahan's "How Java's Floating-point Hurts Everyone Everywhere" (
http://www.cs.berkeley.edu/~wkahan/JAVAhurt.pdf ) very educational
when it comes to "what can go wrong with platform-dependent
floating-point".
Well, Kahan at one point went around showing how using an early HP calculator to navigate your plane was a bad idea. The pilots in the audience were smirking -- he had an extremely long thin triangle in his example. They aren't that dumb. I think these alleged difficulties don't exist in practice.
> No standard for controlling the behaviors exists, either, so you can find
> out the hard way that underflow-to-zero is being done in software by
> default, and that you are doing a lot of it. Or that your code doesn't have
> the same behavior on different platforms.
Well, if the platform is IEEE-compliant, you can trust it to behave
the same way logically, even if some applications are slower on some
systems than others. Or did you mean that there's no standard way to
set the various IEEE flags? That seems like a language/compiler issue
to me (which is not to minimize the pain it causes!).
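In numpy, at least, the error-handling state is settable from Python; a
minimal sketch using numpy.seterr (though the underflow may of course
still be handled in software underneath):

    import numpy as np

    # Warn whenever an operation underflows, instead of silently
    # flushing the result to zero.
    old = np.seterr(under='warn')

    a = np.array([1e-300])
    b = a * a          # 1e-600 underflows: RuntimeWarning, result is 0.0

    np.seterr(**old)   # restore the previous error-handling state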
I meant both. It may seem like a language issue to you (and me) but it doesn't to the language standard committees; just like they won't take ownership of the command line to run the compiler no matter how much the variances hurt people.
> To my mind, all that was really accomplished was to convince said youngsters
> that somehow this NaN stuff was the solution to some problem. In reality,
> computing for 3 days and having it print out 1000 NaNs is not exactly
> satisfying. I think this stuff was essentially a mistake in the sense that
> it is a nice ivory-tower idea that costs more in practice than it is worth.
> I do not think a properly thought-out and coded algorithm ought to do
> anything that this stuff is supposed to 'help' with, and if it does do it,
> the code should stop executing.
Well, then turn on floating-point exceptions. I find a few NaNs in my
calculations relatively benign. If I'm doing a calculation where
they'll propagate and destroy everything, I trap them. But just as
often I launch a job, a NaN appears early on, and I come back in the
morning to find that instead of aborting, the job has completed except
for a few bad data points.
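In numpy, trapping them is one call (a minimal sketch using
numpy.seterr):

    import numpy as np

    # Raise immediately on invalid operations (the ones that make NaNs),
    # rather than letting them propagate through a three-day run.
    np.seterr(invalid='raise')

    try:
        np.sqrt(np.array([-1.0]))
    except FloatingPointError as err:
        print('caught:', err)   # abort, log, or patch up and continue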
This sounds like "the software is done but we just haven't gotten it running yet."
If you want an algorithm that takes advantage of NaNs, I have one. I
was simulating light rays around a black hole, generating a map of the
bending for various view angles. That meant a lot of exploring of
grazing incidence, where the calculations would often yield NaNs. I
used those NaNs to fill in the gap between the internal bending of
light and the external bending of light; no extra programming effort
was required, since that was what came out of the ray tracer anyway.
The rendering step just returned NaNs for the image pixels that came
from interpolating in the NaN-filled regions, and those came out
black: just what I wanted. I wrote the program without worrying about
what would happen if a NaN appeared, and it came out right. Which is,
I think, the goal -- to save the programmer having to think too much
about numerical-analysis headaches.
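Schematically it looked something like this (a toy sketch, with a
made-up bending formula standing in for the real ray tracer):

    import numpy as np

    def bending_angle(b):
        # Stand-in for the ray tracer: grazing/captured rays come out
        # NaN because the square root goes negative. (Hypothetical
        # formula, not the real one.)
        with np.errstate(invalid='ignore'):
            return np.sqrt(b - 3.0)

    impact = np.linspace(0.0, 6.0, 13)
    angles = bending_angle(impact)  # NaNs appear, no special-casing

    # Rendering: pixels interpolated from NaN regions come out black.
    image = np.where(np.isnan(angles), 0.0, angles)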
How do you know it came out right? (:->
Anyway, costing those who can think 30% of performance to save you a few headaches is not worth it. That's what I meant, not that there were no cases of benefit.
> Anyway, if I thought it would do the job I wouldn't have written MA in the
> first place.
NaNs are certainly not always a good solution; in particular it's
hard to control the behaviour of things like where(). There's also no
NaN for integer types, which makes MaskedArray more useful.
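For example (a small sketch with numpy.ma; the -1 here is just a
placeholder value behind the mask):

    import numpy as np

    # Integer data has no NaN to mark missing values, but a mask does
    # the job and travels through arithmetic and reductions.
    counts = np.ma.masked_array([3, 7, -1, 12],
                                mask=[False, False, True, False])
    print(counts.mean())   # 7.33..., the masked entry is ignored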
Anyway, I am pleased that numpy has both controllable IEEE floats and
MaskedArray, and I apologize for a basically off-topic post.
A. M. Archibald
I guess it was me who was off topic, so I'll be quiet now.