On 12/29/19 12:20 AM, Brendan Barnwell wrote:
On 2019-12-28 21:11, Richard Damon wrote:
<not addressing me>
You seem to understand Pure Math, but not the Applied Mathematics of
computers. The Applied Mathematics of Computing is based on the concept
of finite approximation, which is something that Pure Math, like the
type that builds up the Number line starting with Set Theory, doesn't
handle well. In Pure Math, something that is just mostly
true, and can be proved to not always be true, is considered False, but
to applied math, it can be good enough.

    But that is the problem.  "The applied mathematics of computing" is floating point, and in floating point, NaN is a number (despite its name).  You can't have your cake and eat it too.  You can take the pure mathematical view that NaN isn't a number, or you can take the applied view that computers work with things (like floats) that aren't actually numbers but are similar to them.  But what you seem to be trying to say is that computers work with things that aren't actually numbers (like floats), but that even so, NaN isn't one of those things, and that position just strikes me as senseless.  The things that computers work with are floats, and NaN is a float, so in any relevant sense it is a number; it is an instance of a numerical type.

    With regard to the larger issue of what statistics.median should do, I think the simplest solution is to just put an explicit warning in the docs that says "Results may be meaningless if your data contain NaN".  I think it's fine to adopt a GIGO approach, but the problem is that the current docs aren't explicit enough about what counts as garbage, especially for the naive user.  We don't have to worry about whether it's logically sufficient to say "the data must be orderable" or "the data must be numbers".  The only case of practical relevance is the specific case of the single value NaN, so just put a specific note in the docs to specifically warn people about that specific value.
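
To make the GIGO point concrete, here is a minimal sketch of one way a caller could guard against NaN before calling statistics.median. The helper name median_ignoring_nan is hypothetical (it is not part of the statistics module), and silently dropping NaNs is only one possible policy; raising an error would be another.

```python
import math
import statistics


def median_ignoring_nan(data):
    """Median of the non-NaN values in data.

    Hypothetical helper for illustration only; it is not part of the
    statistics module.
    """
    # NaN compares False against everything, so sorting data that contains
    # it is unreliable. Filter it out before handing the data to median().
    cleaned = [x for x in data if not math.isnan(x)]
    if not cleaned:
        raise statistics.StatisticsError("no non-NaN data points")
    return statistics.median(cleaned)


nan = float("nan")
odd_case = median_ignoring_nan([1.0, nan, 3.0, 2.0])   # median of 1.0, 2.0, 3.0
even_case = median_ignoring_nan([nan, 4.0, 1.0])       # median of 1.0, 4.0
```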

Per the IEEE standard for floating point, which I think would be the controlling document here, Floating Point Values (or Representations, which is what every bit pattern is) are divided into 3 major classes:

1) Finite Numbers, which for binary floating point are subdivided into Normalized and Denormalized Numbers, and which are our representations for the set of the Real Numbers we can express.

2) Infinities, which extend the representation into the Extended Real Numbers

3) Not A Numbers, which are the result of exceptions in computations when we don't get a numeric result.

Thus, in IEEE-754, NaN is NOT a Number (and the infinities might or might not be numbers too, depending on which definition of number you are using).
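
The three classes above can be checked from Python with the math module. The following is only a rough sketch: the function name classify and the label strings are my own, and treating zero as its own label is a choice made here for readability (it is still a finite number).

```python
import math
import sys


def classify(x: float) -> str:
    """Return a rough IEEE-754 class label for a Python float.

    The labels are illustrative names, not terms defined by any library.
    """
    if math.isnan(x):
        return "nan"          # class 3: Not A Number
    if math.isinf(x):
        return "infinity"     # class 2: +inf or -inf
    if x == 0.0:
        return "zero"         # finite; covers both +0.0 and -0.0
    # sys.float_info.min is the smallest positive *normalized* float,
    # so any nonzero magnitude below it is a denormalized (subnormal) value.
    if abs(x) < sys.float_info.min:
        return "subnormal"
    return "normal"
```

For example, classify(1.0) gives "normal", classify(5e-324) gives "subnormal", and classify(float("nan")) gives "nan".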

Also, in the applied mathematics dealing with computing, NaN is NOT thought of as a "Number", but just a "Value". It has none of the properties of a Number, so to treat it as such is a mistake. I suppose that is part of the heart of the trickiness of IEEE floating point: not every value it represents is really a number.
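
A short illustration of what "none of the properties of a Number" looks like in practice, using nothing beyond built-in float behavior. The variable names are just for this sketch.

```python
import math

nan = float("nan")

# Equality is not reflexive: NaN is not equal even to itself.
reflexive = (nan == nan)

# Trichotomy fails: NaN is neither less than, equal to, nor greater
# than an ordinary number, so it cannot be placed in a sorted order.
ordered = (nan < 1.0) or (nan == 1.0) or (nan > 1.0)

# Arithmetic does not produce a numeric result; NaN just propagates.
propagates = math.isnan(nan + 1.0) and math.isnan(nan * 0.0)
```

Here reflexive and ordered are both False, while propagates is True.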

Note that the number values obey, to the limits of precision, the normal axioms of mathematics (the confusion sometimes is that this is ONLY to the limits of precision: the basic axioms don't hold to the last bit, but the results match 'close enough'). This is a proper Engineering equivalence, even if not a precise Pure Mathematical one.
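
As a small example of axioms holding only "to the limits of precision", consider addition of ordinary decimal-looking values. This sketch uses math.isclose with its default tolerance as a stand-in for 'close enough'; the right tolerance for a real application is a separate question.

```python
import math

a, b, c = 0.1, 0.2, 0.3

# Exact equality fails in the last bit ...
exact_sum = (a + b == c)
# ... but the values agree within the limits of precision.
close_sum = math.isclose(a + b, c)

# Associativity of addition behaves the same way.
left = (a + b) + c
right = a + (b + c)
associative_exactly = (left == right)
associative_closely = math.isclose(left, right)
```

Both exact comparisons come out False and both isclose comparisons come out True.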

I would have no problem with the idea that the statistics module, or even the more general Python documentation, include a warning about NaN values (and there are actually a whole lot of values that are NaNs) behaving in unexpected ways.
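
On the "whole lot of values" point: in the 64-bit binary format, any bit pattern with the exponent field all ones and a nonzero fraction field is a NaN. A rough sketch of that follows, assuming the usual 64-bit layout for Python floats; the helper names double_from_bits and is_nan_pattern are made up for illustration.

```python
import math
import struct


def double_from_bits(bits: int) -> float:
    """Reinterpret a 64-bit integer as an IEEE-754 double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def is_nan_pattern(bits: int) -> bool:
    """True if the 64-bit pattern encodes a NaN (exponent all ones, fraction nonzero)."""
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return exponent == 0x7FF and fraction != 0


# A few different bit patterns, all of which are NaNs.
samples = [
    0x7FF8000000000000,
    0x7FF8000000000001,
    0xFFF0000000000001,
    0x7FFFFFFFFFFFFFFF,
]
all_nan = all(math.isnan(double_from_bits(b)) for b in samples)

# Two signs times every nonzero 52-bit fraction.
nan_pattern_count = 2 * ((1 << 52) - 1)
```

That count works out to 2**53 - 2 distinct NaN bit patterns, against only two patterns for the infinities.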

--
Richard Damon
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/6QXEFHEJQUYSBPJDZC33BFZPCGD77TDK/
Code of Conduct: http://python.org/psf/codeofconduct/
