On 12/29/19 12:20 AM, Brendan Barnwell wrote:
On 2019-12-28 21:11, Richard Damon wrote:
<not addressing me>
You seem to understand Pure Math, but not the Applied Mathematics of
computers. The Applied Mathematics of Computing is based on the concept
of finite approximation, which is something that Pure Math, like the
type that builds up the Number line starting with Set Theory, doesn't
handle well. In Pure Math, something that is just mostly
true, and can be proved to not always be true, is considered False, but
to applied math, it can be good enough.
But that is the problem. "The applied mathematics of computing"
is floating point, and in floating point, NaN is a number (despite its
name). You can't have your cake and eat it too. You can take the
pure mathematical view that NaN isn't a number, or you can
take the applied view that computers work with things (like floats)
that aren't actually numbers but are similar to them. But what you
seem to be trying to say is that computers work with things that
aren't actually numbers (like floats), but that even so, NaN isn't one
of those things, and that position just strikes me as senseless. The
things that computers work with are floats, and NaN is a float, so in
any relevant sense it is a number; it is an instance of a numerical type.
With regard to the larger issue of what statistics.median should
do, I think the simplest solution is to just put an explicit warning
in the docs that says "Results may be meaningless if your data contain
NaN". I think it's fine to adopt a GIGO approach, but the problem is
that the current docs aren't explicit enough about what counts as
garbage, especially for the naive user. We don't have to worry about
whether it's logically sufficient to say "the data must be orderable"
or "the data must be numbers". The only case of practical relevance
is the specific case of the single value NaN, so just put a specific
note in the docs to specifically warn people about that specific value.
Per the IEEE standard for floating point, which I think would be the
controlling document here, Floating Point Values (or Representations,
which is what every bit pattern is) are divided into 3 major classes:
1) Finite Numbers, which for binary floating point are subdivided into
Normalized and Denormalized Numbers, which are our representations for
the set of the Real Numbers we can express.
2) Infinities, which extend the representation into the Extended Real
Numbers
3) Not A Numbers, which are the result of exceptions in computations
when we don't get a numeric result.
Thus, in IEEE-754, NaN is NOT a Number (and the infinities might or
might not be numbers too, depending on which definition of number you
are using).
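As a rough illustration of that three-way split, here is a minimal
Python sketch using only the standard math module. The helper name
ieee_class is made up for this example, and it lumps normalized and
denormalized values together as "finite".

```python
import math

def ieee_class(x: float) -> str:
    """Sort a float into one of the three broad classes (hypothetical helper)."""
    if math.isnan(x):
        return "nan"        # class 3: Not A Number
    if math.isinf(x):
        return "infinity"   # class 2: +inf or -inf
    return "finite"         # class 1: normalized or denormalized

# 5e-324 is the smallest positive denormalized double.
samples = [1.5, 0.0, 5e-324, float("inf"), -float("inf"), float("nan")]
classes = [ieee_class(x) for x in samples]
print(classes)
```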
Also, in the applied mathematics dealing with computing, NaN is NOT
thought of as a "Number", but just a "Value"; it has none of the
properties of a Number, so to treat it as such is a mistake. I suppose
that is part of the heart of the trickiness of IEEE floating point,
that not every value it represents is really a number.
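A small Python sketch of what "none of the properties of a Number"
looks like in practice. It only uses built-in float behaviour plus
math.isnan; the variable names are just for illustration.

```python
import math

nan = float("nan")

# Equality is not reflexive: a NaN does not compare equal to itself.
reflexive = (nan == nan)

# It has no place in the ordering: every comparison against 1.0 fails.
ordered = (nan < 1.0) or (nan > 1.0) or (nan == 1.0)

# Arithmetic does not recover a number: the result stays NaN.
absorbing = math.isnan(nan + 1.0) and math.isnan(nan * 0.0)

print(reflexive, ordered, absorbing)
```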
Note that the number values obey, to the limits of precision, the normal
axioms of mathematics (the usual confusion is that they hold ONLY to
the limits of precision: the basic axioms don't hold to the last bit,
but the results match 'close enough').
This is a proper Engineering equivalence, even if not a precise Pure
Mathematical one.
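To make "close enough" concrete, here is a short Python sketch. It
assumes the usual IEEE-754 binary64 floats that CPython uses, and it
relies on math.isclose with its default relative tolerance as one
reasonable stand-in for "within the limits of precision".

```python
import math

# Addition is not exact to the last bit...
total = 0.1 + 0.2
exact = (total == 0.3)
close = math.isclose(total, 0.3)

# ...and associativity only holds approximately.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
assoc_exact = (left == right)
assoc_close = math.isclose(left, right)

print(exact, close, assoc_exact, assoc_close)
```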
I would have no problem with the statistics module docs, or even the
more general Python documentation, including a warning about NaN values
(and there are actually a whole lot of values that are NaNs) behaving in
unexpected ways.
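For anyone who wants to see the kind of surprise such a warning would
cover, here is a minimal sketch. It feeds statistics.median the same
three values in different orders; because sorting relies on
comparisons that NaN always fails, the answers need not agree. The
filtering step at the end is only one possible workaround and assumes
the caller wants NaNs ignored.

```python
import math
import statistics

nan = float("nan")

# The same three values, in three different orders.
orderings = [
    [nan, 1.0, 2.0],
    [1.0, nan, 2.0],
    [1.0, 2.0, nan],
]
medians = [statistics.median(data) for data in orderings]
print(medians)

# One possible workaround: drop NaNs before taking the median.
clean = [x for x in orderings[0] if not math.isnan(x)]
clean_median = statistics.median(clean)
print(clean_median)
```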
--
Richard Damon
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/6QXEFHEJQUYSBPJDZC33BFZPCGD77TDK/