New submission from Nick Coghlan <[email protected]>:
The question of the way Python handles NaN came up again on python-dev
recently. The current semantics have been assessed as a reasonable compromise,
but a poorly explained and inconsistently implemented one.
Based on a suggestion from Terry Reedy [1] I propose that a new glossary entry
be added for "Reflexive Equality":
"Part of the standard mathematical definition of equality is that it is
reflexive, that is, ``x is y`` necessarily implies that ``x == y``. This is an
essential property that is relied upon when designing and implementing
container classes such as ``list`` and ``dict``.
However, the IEEE754 committee defined the float Not_a_Number (NaN) values as
being unequal to all other floats, including themselves. While this design
choice violates the basic mathematical definition of equality, it is still
considered desirable to be able to correctly implement IEEE754 floating point
semantics, and those of similar types such as ``decimal.Decimal``, directly in
Python.
Accordingly, Python makes the following compromise in order to cope with types
that use non-reflexive definitions of equality without breaking the invariants
of container classes that rely on reflexive definitions of equality:
1. Direct equality comparisons involving ``NaN``, such as ``nan=float('NaN');
nan == nan``, follow the IEEE754 rule and return False (or True in the case of
``!=``). This rule applies to ``float`` and ``decimal.Decimal`` within the
builtins and standard library.
2. Indirect comparisons conducted internally by container classes, such as ``x
in someset`` or ``seq.count(x)`` or ``somedict[x]``, enforce reflexivity by
using the expressions ``x is y or x == y`` and ``x is not y and x != y``
respectively rather than assuming that ``x == y`` and ``x != y`` will always
respect the reflexivity requirement. This rule applies to all container types
within the builtins and standard library that may contain values of arbitrary
types.
Also see [1] for a more comprehensive theoretical discussion of this topic.
[1]
http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/"
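For reference, the two rules in the proposed entry can be sketched with builtins only. The variable names below are illustrative and are not part of the proposed glossary text.

```python
nan = float('nan')

# Rule 1: direct comparisons follow the IEEE754 rule.
direct_eq = (nan == nan)          # False
direct_ne = (nan != nan)          # True

# Rule 2: containers check identity first, then fall back to ==.
in_list = nan in [nan]            # True, same object
in_set = nan in {nan}             # True, same object
count = [nan, nan].count(nan)     # 2
lookup = {nan: 'found'}[nan]      # 'found'

# A second NaN is a different object, so the identity check fails
# and the == fallback also fails.
other = float('nan')
other_in_list = other in [nan]    # False
```

The last line shows the limit of the compromise: reflexivity is enforced per object, not per value.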
Specific container methods that have currently been identified as relying on
the reflexivity assumption are:
- __contains__() (for x in c: assert x in c)
- __eq__() (assert [x] == [x])
- __ne__() (assert not [x] != [x])
- index() (for x in c: assert 0 <= c.index(x) < len(c))
- count() (for x in c: assert c.count(x) > 0)
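The invariants above could be bundled into a small checker along these lines. The helper name ``holds_reflexive_invariants`` is hypothetical, and the sketch assumes the container is a sequence providing ``index()`` and ``count()``.

```python
from decimal import Decimal


def holds_reflexive_invariants(c):
    """Return True if sequence c satisfies the five invariants listed above."""
    for x in c:
        if x not in c:                        # __contains__()
            return False
        if not ([x] == [x]):                  # __eq__()
            return False
        if [x] != [x]:                        # __ne__()
            return False
        if not (0 <= c.index(x) < len(c)):    # index()
            return False
        if not (c.count(x) > 0):              # count()
            return False
    return True


# Both NaN values compare unequal to themselves when compared directly.
samples = [float('nan'), Decimal('NaN'), 1.0]
list_ok = holds_reflexive_invariants(samples)
tuple_ok = holds_reflexive_invariants(tuple(samples))
```

With ``list`` and ``tuple`` both results should be true, since those types already follow the second guideline.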
collections.Sequence and array.array (with the 'f' or 'd' type indicators) have
already been identified as container classes in the standard library that fail
to follow the second guideline and hence fail to correctly implement the above
invariants in the presence of non-reflexive definitions of equality. They will
be fixed as part of implementing this patch. Other container types that fail to
correctly enforce reflexivity can be fixed as they are identified.
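As a rough illustration of what such a fix looks like in pure Python, here is a sketch of a sequence that applies the ``x is y or x == y`` rule in its search methods. The class ``ReflexiveSequence`` is hypothetical and is not the actual patch for collections.Sequence or array.array.

```python
class ReflexiveSequence:
    """Hypothetical sequence wrapper that checks identity before equality."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, i):
        return self._items[i]

    def __iter__(self):
        return iter(self._items)

    def __contains__(self, value):
        # Identity first, so an object is always found in a
        # container that holds it, even if it is a NaN.
        return any(v is value or v == value for v in self)

    def index(self, value):
        for i, v in enumerate(self):
            if v is value or v == value:
                return i
        raise ValueError('value not in sequence')

    def count(self, value):
        return sum(1 for v in self if v is value or v == value)


nan = float('nan')
seq = ReflexiveSequence([1.0, nan])
```

Here ``nan in seq``, ``seq.index(nan)`` and ``seq.count(nan)`` all succeed, whereas a version using plain ``v == value`` would report the NaN as missing.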
[1] http://mail.python.org/pipermail/python-dev/2011-April/110962.html
----------
assignee: docs@python
components: Documentation, Library (Lib)
messages: 134639
nosy: docs@python, ncoghlan
priority: normal
severity: normal
status: open
title: Adopt and document consistent semantics for handling NaN values in
containers
type: behavior
versions: Python 2.7, Python 3.2, Python 3.3
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue11945>
_______________________________________