[Tim Peters]
>>>> All Python behavior in the presence of infinities, NaNs, and signed
>>>> zeroes is a platform-dependent accident, mostly inherited from the
>>>> fact that all C89 behavior in the presence of infinities, NaNs, and
>>>> signed zeroes is a platform-dependent crapshoot.
[Michael Hudson]
>>> As you may have noticed by now, I'd kind of like to stop you saying
>>> this :) -- at least on platforms where doubles are good old-fashioned
>>> 754 8-byte values.

[Tim]
>> Nope, I hadn't noticed!  I'll stop saying it when it stops being true,
>> though <wink>.  Note that since there's not even an alpha out for 2.5
>> yet, none of the good stuff you did in CVS counts for users yet.

[Michael]
> Well, obviously.  OTOH, there's nothing I CAN do that will be useful
> for users until 2.5 actually comes out.

Sure.  I was explaining why I keep saying what you say you don't want me
to say: until 2.5 actually comes out, what purpose would it serve to
stop warning people that 754 special-value behavior is an x-platform
crapshoot?  Much of it (albeit less so) will remain a crapshoot after
2.5 comes out too.

>>> But first, I'm going to whinge a bit, and lay out some stuff that Tim
>>> at least already knows (and maybe get some stuff wrong, we'll see).
>>>
>>> Floating point standards lay out a number of "conditions": Overflow
>>> (number too large in magnitude to represent), Underflow (non-zero
>>> number too small in magnitude to represent), Subnormal (non-zero
>>> number too small in magnitude to represent in a normalized way), ...

>> The 754 standard has five of them: underflow, overflow, invalid
>> operation, inexact, and "divide by 0" (which should be understood more
>> generally as a singularity; e.g., divide-by-0 is also appropriate for
>> log(0)).

> OK, the decimal standard has more, which confused me for a bit
> (presumably it has more because it doesn't normalize after each
> operation).

The "conditions" in IBM's decimal standard map, many-to-one, onto a
smaller collection of "signals" in that standard.  It has 8 signals:
the 5 I named above from 754, plus "clamped", "rounded", and
"subnormal".
Distinctions are excruciatingly subtle; e.g., "rounded" and "inexact"
would be the same thing in 754, but, as you suggest, in the decimal
standard a result can be exact yet also rounded (if it "rounds away" one
or more trailing zeroes), due to the unnormalized model.

>>> For each condition, it should (at some level) be possible to trap
>>> each condition, or continue in some standard-mandated way (e.g.
>>> return 0 for Underflow).

>> 754 requires that, yes.

>>> While ignoring the issue of allowing the user to control this, I do
>>> wish sometimes that Python would make up its mind about what it does
>>> for each condition.

>> Guido and I agreed long ago that Python "should", by default, raise an
>> exception on overflow, invalid operation, and divide by 0, and "should
>> not", by default, raise an exception on underflow or inexact.

And, I'll add, "should not" on rounded, clamped and subnormal too.

> OK.

OK <wink>.

>> Such defaults favor non-expert use.  Experts may or may not be happy
>> with them, so Python "should" also allow changing the set.

> Later :)

That's a problem, though.  754 subsets are barely an improvement over
what Python does today: the designers knew darned well that each
default is going to make some non-trivial group of users horridly
unhappy.  That's why such extensive machinery for detecting signals,
and for trapping or not trapping on signals, is mandated.  That's a
very important part of these standards.

> (In the mean time can we just kill fpectl, please?)

Has it been marked as deprecated yet (entered into the PEP for
deprecated modules, raises deprecation warnings, etc)?  I don't know.
IMO it should become deprecated, but I don't have time to push that.

>>> There are a bunch of conditions which we shouldn't and don't trap by
>>> default -- Underflow for example.
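[The "rounded yet exact" case Tim describes is directly observable in
Python's decimal module, which exposes the decimal standard's signals as
per-context flags.  A small sketch; the unary plus applies the context's
precision to a literal:]

```python
from decimal import Decimal, getcontext, Inexact, Rounded

ctx = getcontext()
ctx.prec = 3          # keep only three significant digits

# Discarding nonzero digits signals both Rounded and Inexact.
ctx.clear_flags()
+Decimal("1.2345")    # -> Decimal('1.23')
assert ctx.flags[Rounded] and ctx.flags[Inexact]

# Discarding only trailing zeroes signals Rounded, but the result
# is still exact, so Inexact stays clear.
ctx.clear_flags()
+Decimal("12300")     # -> Decimal('1.23E+4'): rounded, yet exact
assert ctx.flags[Rounded] and not ctx.flags[Inexact]
```

In 754's normalized model the second case cannot arise, which is why the
binary standard gets away with a single "inexact" signal.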
>>> For the conditions that probably should result in an exception,
>>> there are inconsistencies galore:
>>>
>>> >>> inf = 1e300 * 1e300  # <- Overflow, no exception
>>> >>> nan = inf/inf        # <- InvalidOperation, no exception

>> Meaning you're running on a 754 platform whose C runtime arranged to
>> disable the overflow and invalid operation traps.

> Isn't that the standard-mandated startup environment?

The 754 standard mandates "non-stop" mode (all traps disabled) at
startup, but who in this chain is _claiming_ to implement the 754
standard?  Your platform C may or may not, and your OS may or may not.

>> You're seeing native HW fp behavior then.

> But anyway, shouldn't we try to raise exceptions in these cases?

I believe Python should raise exceptions in these cases by default,
because, as above, they correspond to the overflow and invalid-operation
signals respectively, and Python should raise exceptions on the
overflow, invalid-operation, and divide-by-0 signals by default.  But I
also believe Python _dare not_ do so unless it also supplies sane
machinery for disabling traps on specific signals (along the lines of
the relevant standards here).  Many serious numeric programmers would be
livid, and justifiably so, if they couldn't get non-stop mode back.  The
most likely x-platform accident so far is that they've been getting
non-stop mode in Python since its beginning.

> I don't think it's a particularly good idea to try to utilize the fp
> hardware's ability to do this at this stage, btw, but to add some kind
> of check after each operation.

That's possible, anyway.  Doing it in software is slow, but if you look
at the horrid contortions fpectlmodule.c endures to try to exploit HW
facilities, it's not clear that the latter is any faster.

>>> >>> pow(1e100, 100)  # <- Overflow, exception
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in ?
>>> OverflowError: (34, 'Numerical result out of range')
>>> >>> math.sqrt(-1)  # <- InvalidOperation, exception
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in ?
>>> ValueError: math domain error

>> Unlike the first two examples, these call libm functions.

> And the user cares about this why?

Didn't say the user did care.  Why doesn't Python already supply a
fully 754-conforming arithmetic on 754 boxes?  It's got almost
everything to do with implementation headaches, and very little to do
with what users care about.

Because all the C facilities are an x-platform mess, the difference
between calling and not calling libm can be the difference between
using the platform libm or Python needing to write its own libm.  For
example, there's no guarantee that math.sqrt(-1) will raise ValueError
in Python, because Python currently relies on the platform libm sqrt to
detect _and report_ errors.  The C standards don't require much of
anything there.

As is, Python does all sorts of dumbass sniffing on libm return values,
trying to see whether errno was set, trying to guess whether libm
"should have" griped about overflow, and trying to guess whether libm
should have griped about an invalid operation.  Even then, it's trying
to squash this into the ancient C ERANGE/EDOM model, not trying to do
something sensible wrt 754 semantics.  There are many issues here, and
they're mostly concerned with implementation headaches.

>> Then it's an x-platform crapshoot whether and when the libm functions
>> set errno to ERANGE or EDOM, and somewhat of a mystery whether it's
>> better to reproduce what the native libm considers to be "an error",
>> or try to give the same results across platforms.  Python makes a
>> weak attempt at the latter.

> Well, you can at least be pretty sure that an infinite result is the
> result of an overflow condition, I guess.
There are at least two other causes: some cases of divide-by-0 (like
1/0 returns +Inf), and non-exceptional production of an infinite result
from infinite operands (like sqrt(+Inf) should return +Inf, and there's
nothing exceptional about that).

>>> At least we're fairly consistent on DivisionByZero...

>> When it's a division by 0, yes.  It's cheap and easy to test for
>> that.  However, many expert uses strongly favor getting back an
>> infinity then instead, so it's not good that Python doesn't support a
>> choice about x/0.

> Indeed.  But I'd rather work on non-settable predictability first.

OTOH, I bet I could find people willing to pay for the ability to "turn
off" divide-by-0 exceptions (or the people talking my ear off at PyCon
this year were just bluffing <wink>).

BTW, since there's so little the HW can help us with here in reality
(since there's no portable way to get at it), any predictability is
going to be supplied by software, and software for that will work
better over the long run if it's designed from the start realizing that
it needs to consult some form of context to decide what is and is not
"an exception".

>>> If we're going to trap Overflow consistently, we really need a way
>>> of getting the special values reliably -- which is what PEP 754 is
>>> about, and its implementation may actually work more reliably in 2.5
>>> since my recent work...

>> I don't know what you have in mind.

> Well, the reason I headbutted into this stuff again recently was
> writing (grotty) string-to-float parsing code for PyPy.  If you write
> (where 'e' is the exponent parsed from, say, '1e309'):
>
>     while e > 0:
>         result *= 10
>         e -= 1
>
> you get an infinite result in the large-e case.  If instead you write
> the seemingly much more sensible:
>
>     result *= pow(10, e)
>
> you don't; you get an overflow error instead.

That last line looks like a (large) integer operation, so I'm guessing

    result *= pow(10.0, e)

(or moral equivalent) was intended.

> This makes me sad.
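[The asymmetry Michael describes is easy to reproduce on a current
CPython on a 754 box, with the float version of pow that Tim guesses was
intended; 309 here stands in for the parsed exponent:]

```python
import math

# Repeated multiplication overflows silently to +inf, because CPython
# leaves the hardware in 754 non-stop mode for raw float arithmetic.
result, e = 1.0, 309
while e > 0:
    result *= 10.0
    e -= 1
assert math.isinf(result)

# The single pow() call takes the libm-backed path, where CPython's
# error sniffing turns the same overflow into an exception.
try:
    pow(10.0, 309)
except OverflowError as exc:
    print("pow raised:", exc)
```

Same mathematical value, two different behaviors, depending only on
which implementation path the computation happens to take.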
> Whether my code should have returned an infinite value or raised an
> exception for "1e1000" is an open question, but I'd like the answer to
> be less accidental.

It's a delicate house of cards right now for sure.  In some ways,
experts have learned to live with, and exploit, it: when they _want_ an
overflow exception they'll use pow(), and when they don't they'll just
multiply.  It would be a mistake to assume that everyone is unhappy
about that.  The only way to approach making everyone happy is to give
them a choice about whether overflow raises an exception.  Then
consistency would be an unqualified good thing.  As is, exploiting
inconsistencies is sometimes the only way people have now to get the
exception behavior they want.

>> For example, checking the result of x*y to see whether it's an
>> infinity is not a reliable way to detect overflow, and it fails in
>> more than one way (e.g., one of the inputs may have been an infinity
>> (in which case OverflowError is inappropriate),

> Well, yeah.  What other ways can it fail?

The rounding-mode choice mentioned later.  For example, under
to-minus-infinity rounding, 1e300 * 1e300 does overflow, but does not
return +Inf (it returns the largest representable finite float
instead).

>> and overflow doesn't always result in an infinity either (depends on
>> the rounding mode in effect)).

> If we're allowing the user to alter rounding mode, I imagine we'll
> have a much better idea about this sort of thing.  For now, I think we
> can assume a default rounding that doesn't do that, and someone who
> has a platform like that has his or her own problem.

IME the rounding modes in 754 are almost never changed from the default
to-nearest-even, and I can live with that.  The rounding modes in the
Decimal module will be changed a lot, though (due to the insanity of
politically mandated rounding rules in currency calculations).
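[There's no portable way to change the 754 rounding mode from Python
floats, but the decimal module makes Tim's overflow-without-infinity
point visible; a sketch, with a tiny Emax chosen purely to force
overflow cheaply, and all traps disabled to get non-stop mode:]

```python
from decimal import Context, Decimal, ROUND_FLOOR, ROUND_HALF_EVEN

big = Decimal("1E+60")

# Non-stop mode: Overflow is signalled but never raises.
ctx = Context(prec=3, Emax=99, Emin=-99, traps=[])

ctx.rounding = ROUND_HALF_EVEN
print(ctx.multiply(big, big))   # Infinity

# Under round-toward-minus-infinity, a positive overflow yields the
# largest representable finite number instead of +Infinity.
ctx.rounding = ROUND_FLOOR
print(ctx.multiply(big, big))   # 9.99E+99
```

Both calls overflow; only the rounding mode decides whether the result
is an infinity, which is exactly why "is the result infinite?" is not a
reliable overflow test.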
>>> On the issue of platforms that start up processes with traps
>>> enabled, I think the correct solution is to find the incantation to
>>> turn them off again and use that in Py_Initialize(), though that
>>> might upset embedders.

>> Hard to know.  Python currently has a hack to disable traps on
>> FreeBSD, in python.c's main().

> Oh, right.  Well, I don't much care about embedding, anyway.

So far, I think FreeBSD remains the only known Python platform that
enables some FP traps by default.  So it's OK if you _just_ don't care
much about embedding on FreeBSD <wink>.
--
http://mail.python.org/mailman/listinfo/python-list