On 3/4/20 11:07 PM, Andrew Barnert via Python-ideas wrote:
On Mar 4, 2020, at 19:12, Richard Damon <rich...@damon-family.org> wrote:
Yes, because of the NaN issue, you sort of need an 'Almost Total Order' and
'Really Truly a Total Order', the first allowing the small exception of a very
limited (maybe only one) special value that breaks the true definition of Total
Order so that we could call Floats to be an Almost Total Order.
The obvious answer is to have PartialOrder < NaNOrder < TotalOrder, where a
type is NaN-ordered if it’s partially ordered, and its subset of all self-equal
objects is totally ordered.
I can’t imagine when a generic AlmostTotalOrder that didn’t specify how it
fails could be useful; a NaNOrder is useful all over the place. A median that
ignores NaN values requires NaNOrdered input. Even a median that does something
stupid with NaN—see below.
Of course there might be other kinds of almost total orders that might be
useful. If anyone runs into one, they can write their own ABC with the proper
relationships to the stdlib ones. Or even propose it for the stdlib. But they
couldn’t do anything useful with a general AlmostTotalOrder if it had to handle
both NaN and whatever their different case was.
I think your NaNOrder is my Almost Total Order, except that the name
NaNOrder implies that it is only NaN is the special case, and thus we
are dealing with numbers, but there may well be some other user type
that has a similar 'exceptional' value with properties like NaN, but
since the items aren't numbers, the Not-A-Number tag isn't really
appropriate.
This wouldn't help the median issue as having median reject being given Float
values because they don't form a true Total Order would be much worse than the
issue of it getting confused with NaNs.
Well, if you want to handle floats and Decimals and do something stupid with
NaN values so the user has to be careful, it seems to me the right thing to do
is require NaNOrder but document that you do something stupid with NaN values
so the user has to be careful.
Yes, that is the idea of AlmostTotalOrder, to have algorithms that
really require a total order (like sorting) but we really need to use a
type that has these exceptional values. Imagine that sort/median was
defined to type check its parameter, and that meant that you couldn't
take the median of a list of floats (because float has the NaN value
that breaks TotalOrder).
The presence of NaN in the float system does add a significant complexity to
dealing with floating point numbers. Sometimes I wonder if since Python
supports dynamic typing of results, might not do better by removing the NaN
value from Floats and Decimals, and make the operations that generate the NaN
generate an object of a special NaN type.
The thing is, in most cases you’d be duck typing and get no benefit, because
float and floatnan both have the same methods, etc. Sure, when you’re
type-checking with isinstance you could choose whether to check float|floatnan
or just float, but how often do you write such type checks? And if you really
want that, you could write a Float ABC that does the same thing, although IIRC
ABCs only have a subclasshook rather than an instancehook so you need a new
metaclass:
class FloatMeta(ABCMeta):
def __instancecheck__(self, instance):
return isinstance(instance, float) and not math.isnan(instance)
class Float(metaclass=FloatMeta):
pass
People have experimented a lot with similar things and beyond in static type
systems: you can define a new type which is just the non-NaN floats, or just the
finite floats, or just the floats from -1 to 1, or just the three floats -1 or 0 or
1, and the compiler can check that you’re always using it safely. (If you call a
function that takes one of those non-NaN floats with a float, unless the compiler
can see an if not isnan(x) or a pattern match that excludes NaN or something else
that guarantees correctness, it’s a TypeError.) Fitting this into the type algebra
(& and | and -) is pretty easy. I’m not aware of anyone translating that idea
to dynamic type systems, but it could be interesting.
The problem with trying to make NaN a special type in a static typed
language is that suddenly you can't define any of your variables to be
of type 'float' if they might have a NaN value, as that would have the
wrong type to store it. Your basic fundamental float type suddenly needs
to use the dynamic typing system within the static type system that
handles type hierarchies. That would be a BIG efficiency hit in a
language like C++. A Dynamically typed language like Python, since it
already takes that hit (and leverages that it always takes the hit to
add functionality) has much more flexibility here.
I will add, that the whole concept of the NaN value was an invention to
handle the problem that a totally statically typed language, like C,
couldn't efficiently handle those sorts of errors except by a funny
value that didn't really act like a floating point number. Except for
the fact that so many people have gotten used to the fact that certain
erroneous math computation create this weird NaN value that we think of
as a number but also not really a number, there is no reason in a
dynamically typed language that this error code needs to be considered
to actually be a Float value.
--
Richard Damon
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at
https://mail.python.org/archives/list/python-ideas@python.org/message/IQLQV3MNIEHWKD7QP5TTT7YY7HCFX2FE/
Code of Conduct: http://python.org/psf/codeofconduct/