On Thu, 02 Jun 2011 09:54:30 +0000, Steven D'Aprano wrote:

>> Exceptions allow you to write more natural code by ignoring the awkward
>> cases. E.g. writing "x * y + z" rather than first determining whether "x
>> * y" is even defined then using a conditional.
>
> You've quoted me out of context. I wasn't asking for justification for
> exceptions in general. There's no doubt that they're useful. We were
> specifically talking about NAN == NAN raising an exception rather than
> returning False.

It's arguable that NaN itself simply shouldn't exist in Python; if the FPU
ever generates a NaN, Python should raise an exception at that point.

But given that NaNs propagate in almost the same manner as exceptions, you
could "optimise" this by treating a NaN as a special-case implementation of
exceptions, and turn it into a real exception at the point where you can no
longer use a NaN (e.g. when using a comparison operator).

This would produce the same end result as raising an exception immediately,
but would reduce the number of isnan() tests.

>> NaN itself is an exceptional condition which arises when a result is
>> undefined or not representable. When an operation normally returns a
>> number but a specific case cannot do so, it returns not-a-number.
>
> I'm not sure what "not representable" is supposed to mean,

Consider sqrt(-1). This is defined (as "i" aka "j"), but not representable
as a floating-point "real". Making root/log/trig/etc functions return
complex numbers when necessary would probably be inappropriate for a
language such as Python.

> but if you "undefined" you mean "invalid", then correct.

I mean undefined, in the sense that 0/0 is undefined (I note that Python
actually raises an exception for "0.0/0.0").

>> The usual semantics for NaNs are practically identical to those for
>> exceptions. If any intermediate result in a floating-point expression is
>> NaN, the overall result is NaN.
>
> Not necessarily. William Kahan gives an example where passing a NAN to
> hypot can justifiably return INF instead of NAN.

Hmm. Is that still true if the NaN signifies "not representable" (e.g.
known but complex) rather than undefined (e.g. unknown value but known to
be real)?

> While it's certainly
> true that *mostly* any intermediate NAN results in a NAN, that's not a
> guarantee or requirement of the standard. A function is allowed to
> convert NANs back to non-NANs, if it is appropriate for that function.
>
> Another example is the Kronecker delta:
>
> def kronecker(x, y):
>     if x == y: return 1
>     return 0
>
> This will correctly consume NAN arguments. If either x or y is a NAN, it
> will return 0. (As an aside, this demonstrates that having NAN != any
> NAN, including itself, is useful, as kronecker(x, x) will return 0 if x
> is a NAN.)

How is this useful? On the contrary, I'd suggest that the fact that
kronecker(x, x) can return 0 is an argument against the "NaN != NaN" axiom.

A case where the semantics of exceptions differ from those of NaN is:

    def cond(t, x, y):
        if t:
            return x
        else:
            return y

as cond(True, x, nan()) will return x, while cond(True, x, raise()) will
raise an exception.

But this is a specific instance of a more general problem with strict
languages, i.e. strict functions violate referential transparency.

This is why even strict languages (i.e. almost everything except for a
handful of functional languages which value mathematical purity, e.g.
Haskell) have non-strict conditionals. If you remove the conditional from
the function and write it in-line, then:

    if True:
        return x
    else:
        raise()

behaves like NaN.

Also, note that the "convenience" of NaN (e.g. not propagating from the
untaken branch of a conditional) is only available for floating-point
types. If it's such a good idea, why don't we have it for other types?

> Equality comparison is another such function. There's no need for
> NAN == NAN to fail, because the equality operation is perfectly well
> defined for NANs.

The definition is entirely arbitrary. You could just as easily define that
(NaN == NaN) is True. You could just as easily define that "1 + NaN" is 27.

Actually, "NaN == NaN" makes more sense than "NaN != NaN", as the former
upholds the equivalence axioms and is consistent with the normal behaviour
of "is" (i.e. "x is y" => "x == y", even if the converse isn't necessarily
true).

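For reference, here is a minimal sketch of how these cases behave in
CPython today. kronecker() is the function quoted above, rewritten as a
one-liner; everything else is the standard library. The two container
results rely on the built-in containers checking identity before
equality, which is exactly the "x is y" => "x == y" assumption that the
float comparison itself breaks.

```python
import math

nan = float("nan")

def kronecker(x, y):
    # Kronecker delta as quoted above: 1 if the arguments compare equal.
    return 1 if x == y else 0

# IEEE-style comparisons: a NaN is unequal to everything, itself included.
print(nan == nan)            # False
print(nan != nan)            # True
print(nan is nan)            # True: the same object, yet not "equal"
print(kronecker(nan, nan))   # 0

# Built-in containers test identity before equality, so "is" wins here.
print(nan in [nan])          # True
print([nan] == [nan])        # True

# hypot turns a NaN back into a non-NaN when the other argument is infinite.
print(math.hypot(float("inf"), nan))   # inf
```

So the language already treats "x is x" as implying equality in some
places and not in others.
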
If you're going to argue that "NaN == NaN" should be False on the basis
that the values are sentinels for unrepresentable values (which may be
*different* unrepresentable values), it follows that "NaN != NaN" should
also be False for the same reason.

>> But only the floating-point types have a NaN value, while
>> bool doesn't. However, all types have exceptions.
>
> What relevance does bool have?

The result of comparisons is a bool.

>> Why should there be a correct answer? What does NaN actually mean?
>
> NAN means "this is a sentinel marking that an invalid calculation was
> attempted". For the purposes of numeric calculation, it is often useful
> to allow those sentinels to propagate through your calculation rather
> than to halt the program, perhaps because you hope to find that the
> invalid marker ends up not being needed and can be ignored, or because
> you can't afford to halt the program.
>
> Does INVALID == INVALID?

Either True or INVALID. You can make a reasonable argument for either.
Making a reasonable argument that it should be False is much harder.

> If you can cope with the question "Is an apple equal to a puppy dog?"

It depends upon your definition of equality, but it's not a particularly
hard question. And completely irrelevant here.

> So what should NAN == NAN equal? Consider the answer to the apple and
> puppy dog comparison. Chances are that anyone asked that will give you a
> strange look and say "Of course not, you idiot". (In my experience, and
> believe it or not I have actually tried this, some people will ask you to
> define equality. But they're a distinct minority.)
>
> If you consider "equal to" to mean "the same as", then the answer is
> clear and obvious: apples do not equal puppies,

This is "equality" as opposed to "equivalence", i.e. x and y are equal if
and only if f(x) and f(y) are equal for all f.

> and any INVALID sentinel is not equal to any other INVALID.

This does not follow.
Unless you explicitly define the sentinel to be unequal to itself, the
strict equality definition holds, as NaN tends to be a specific bit pattern
(multiple bit patterns are interpreted as NaN, but operations which result
in a NaN will use a specific pattern, possibly modulo the sign bit).

If you want to argue that "NaN == NaN" should be False, then do so. Simply
asserting that it should be False won't suffice (nor will citing the IEEE
FP standard *unless* you're arguing that "because the standard says so" is
the only reason required).

> (Remember, NAN is not a value itself, it's a sentinel representing the
> fact that you don't have a valid number.)

I'm aware of that.

> So NAN == NAN should return False,

Why?

> just like the standard states, and NAN != NAN should return True.

Why? In both cases, the more obvious result should be some kind of sentinel
indicating that we don't have a valid boolean. Why should this sentinel
propagate through arithmetic operations but not through logical operations?

>> Apart from anything else, defining "NaN == NaN" as False means that "x
>> == x" is False if x is NaN, which violates one of the fundamental axioms
>> of an equivalence relation (and, in every other regard, "==" is normally
>> intended to be an equivalence relation).
>
> Yes, that's a consequence of NAN behaviour. Another consequence:
>
> x = float("nan")
> x is x
> True
> x == x
> False

Ordinarily, you would consider this behaviour a bug in the class' __eq__
method.

> I can live with that.

I can *live* with it (not that I have much choice), but that doesn't mean
that it's correct or even anything short of downright stupid.

>> The creation of NaN was a pragmatic decision on how to handle
>> exceptional conditions in hardware. It is not holy writ, and there's no
>> fundamental reason why a high-level language should export the
>> hardware's behaviour verbatim.
>
> There is a good, solid reason: it's a *useful* standard

Debatable.

> that *works*,

Debatable.

> proven in practice,

If anything, it has proven to be a major nuisance. It takes a lot of effort
to create (or even specify) code which does the right thing in the presence
of NaNs.

Turning NaNs into exceptions at their source wouldn't make it significantly
harder to write correct code (there are a handful of cases where the
existing behaviour produces the right answer almost by accident, far more
where it doesn't), and would mean that "simple" code (where NaN hasn't been
explicitly considered) raises an exception rather than silently producing a
wrong answer.

> invented by people who have forgotten more about
> floating point than you or I will ever learn, and we dismiss their
> conclusions at our peril.

I'm not aware that they made any conclusions about Python. I don't consider
any conclusions about the most appropriate behaviour for hardware (which
may have no choice beyond exactly /which/ bit pattern to put into a
register) to automatically determine what is the most appropriate behaviour
for a high-level language.

> A less good reason: its a standard. Better to stick to a not-very-good
> standard than to have the Wild West, where everyone chooses their own
> behaviour. You have NAN == NAN raise ValueError, Fred has it return True,
> George has it return False, Susan has it return a NAN, Michelle makes it
> raise MathError, somebody else returns Maybe ...

This isn't an issue if you have the language deal with it.

>> A result of NaN means that the result of the calculation is undefined,
>> so the value is "unknown".
>
> Incorrect. NANs are not "unknowns", or missing values.

You're contradicting yourself here.
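
To make the two alternatives discussed in this post concrete (raise where
the NaN is produced, or let it propagate and raise only when it reaches a
comparison), here is a rough sketch in plain Python. NaNError, checked()
and strict_eq() are hypothetical names invented for this illustration;
nothing like them exists in the language or the standard library, and a
real implementation would live inside the float type rather than in
wrapper functions.

```python
import math

class NaNError(ArithmeticError):
    """Hypothetical exception type; not part of Python."""

def checked(value):
    # Raise at the source: refuse to hand a NaN back to the caller.
    if isinstance(value, float) and math.isnan(value):
        raise NaNError("operation produced NaN")
    return value

def strict_eq(x, y):
    # Raise at the point of use: with a NaN operand there is no
    # meaningful bool to return, so the comparison raises instead.
    checked(x)
    checked(y)
    return x == y

inf = float("inf")

try:
    checked(inf - inf)          # inf - inf is a NaN under IEEE 754
except NaNError as exc:
    print("raised:", exc)       # raised: operation produced NaN

print(strict_eq(1.0, 1.0))      # True
```

With either variant, code which never considered NaN fails loudly
instead of quietly comparing unequal.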