date:20110427

[Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Hrvoje Niksic


The other day I was surprised to learn this:

>>> nan = float('nan')
>>> nan == nan
False
>>> [nan] == [nan]
True  # also True in tuples, dicts, etc.

# also:
>>> l = [nan]
>>> nan in l
True
>>> l.index(nan)
0
>>> l[0] == nan
False

The identity test is not in container comparators, but in 
PyObject_RichCompareBool:


/* Quick result when objects are the same.
   Guarantees that identity implies equality. */
if (v == w) {
    if (op == Py_EQ)
        return 1;
    else if (op == Py_NE)
        return 0;
}

The guarantee referred to in the comment is not only (AFAICT) 
undocumented, but contradicts the documentation, which states that the 
result should be the "equivalent of o1 op o2".


Calling PyObject_RichCompareBool is inconsistent with calling 
PyObject_RichCompare and converting its result to bool manually, 
something that wrappers (C++) and generators (cython) might reasonably 
want to do themselves, for various reasons.
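
To make the inconsistency concrete, here is a minimal Python model of the two
C calls, for the Py_EQ case only. The names rich_compare and rich_compare_bool
are hypothetical stand-ins, not the real C API, and the sketch assumes the
identity shortcut quoted above is the only difference between the two.

```python
def rich_compare(v, w):
    """Model of PyObject_RichCompare with Py_EQ: simply evaluate v == w."""
    return v == w


def rich_compare_bool(v, w):
    """Model of PyObject_RichCompareBool with Py_EQ, identity shortcut included."""
    if v is w:
        # Quick result when objects are the same.
        return True
    return bool(v == w)


nan = float('nan')
manual = bool(rich_compare(nan, nan))   # convert the result to bool by hand
shortcut = rich_compare_bool(nan, nan)  # identity check happens first
print(manual, shortcut)
```

A wrapper that takes the first route and one that takes the second will
disagree for any object that is not equal to itself.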


If this is considered a bug, I can open an issue.

Hrvoje
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Łukasz Langa

Hrvoje Niksic wrote on 2011-04-27 at 11:37:

> The other day I was surprised to learn this:
> 
> >>> nan = float('nan')
> >>> nan == nan
> False
> >>> [nan] == [nan]
> True  # also True in tuples, dicts, etc.
> 
> # also:
> >>> l = [nan]
> >>> nan in l
> True
> >>> l.index(nan)
> 0
> >>> l[0] == nan
> False
> 

This surprises me as well. I guess this is all related to the fact that:
>>> nan is nan
True

Have a look at this as well:

>>> inf = float('inf')
>>> inf == inf
True
>>> [inf] == [inf]
True
>>> l = [inf]
>>> inf in l
True
>>> l.index(inf)
0
>>> l[0] == inf
True

# Or even:
>>> inf+1 == inf-1
True

For the infinity part, I believe this is related to the funky IEEE 754 
standard. I found
some discussion about this here: 
http://compilers.iecc.com/comparch/article/98-07-134

-- 
Best regards,
Łukasz Langa

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Nick Coghlan

2011/4/27 Łukasz Langa :
> # Or even:
> >>> inf+1 == inf-1
> True
>
> For the infinity part, I believe this is related to the funky IEEE 754 
> standard. I found
> some discussion about this here: 
> http://compilers.iecc.com/comparch/article/98-07-134

The inf behaviour is fine (inf != inf only when you start talking
about aleph levels, and IEEE 754 doesn't handle those).

It's specifically `nan` that is problematic, as it is one of the very
few cases that breaks the reflexivity of equality.

That said, the current behaviour was chosen deliberately so that
containers could cope with `nan` at least somewhat gracefully:
http://bugs.python.org/issue4296

Issue 10912 added an explicit note about this behaviour to the 3.x
series documentation, but that has not as yet been backported to 2.7
(I reopened the issue to request such a backport).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Raymond Hettinger

On Apr 27, 2011, at 2:37 AM, Hrvoje Niksic wrote:

> The other day I was surprised to learn this:
> 
> >>> nan = float('nan')
> >>> nan == nan
> False
> >>> [nan] == [nan]
> True  # also True in tuples, dicts, etc.

Would you also be surprised if you put an object in a dictionary but couldn't
get it out?  Or added it to a list but its count was zero?

Identity-implies-equality is necessary so that classes can maintain their 
invariants and so that programmers can reason about their code.  It is not just 
in PyObject_RichCompareBool, it is deeply embedded in the language (the logic 
inside dicts for example).  It is not a short-cut, it is a way of making sure 
that internally we can count on equality relations being reflexive, symmetric, 
and transitive.  A programmer needs to be able to make basic deductions such as 
the relationship between the two forms of the in-operator:

    for elem in somelist:
        assert elem in somelist  # this should never fail.
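
A short sketch of those invariants holding for a NaN stored in the builtin
containers. It assumes CPython's builtin dict and list, which check identity
before equality; the variable names are only for illustration.

```python
nan = float('nan')

# An object put into a dictionary can be retrieved with the same object.
d = {nan: 'payload'}
retrieved = d[nan]

# An object added to a list is counted, even though nan != nan.
somelist = [nan, 1.0, 2.0]
nan_count = somelist.count(nan)

# The relationship between the two forms of the in-operator holds.
for elem in somelist:
    assert elem in somelist
```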

What surprises me is that anyone gets surprised by anything when experimenting 
with an object that isn't equal to itself.  It is roughly in the same category 
as creating a __hash__ that has no relationship to __eq__ or making 
self-referencing sets or setting False,True=1,0 in python 2.  See 
http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/
 for a nice blog post on the subject.

Raymond


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Guido van Rossum

On Wed, Apr 27, 2011 at 7:39 AM, Raymond Hettinger
 wrote:
>
> On Apr 27, 2011, at 2:37 AM, Hrvoje Niksic wrote:
>
> The other day I was surprised to learn this:
>
> >>> nan = float('nan')
> >>> nan == nan
> False
> >>> [nan] == [nan]
> True  # also True in tuples, dicts, etc.
>
> Would also be surprised if you put an object in a dictionary but couldn't
> get it out?  Or added it to a list but its count was zero?
> Identity-implies-equality is necessary so that classes can maintain their
> invariants and so that programmers can reason about their code.  It is not
> just in PyObject_RichCompareBool, it is deeply embedded in the language (the
> logic inside dicts for example).  It is not a short-cut, it is a way of
> making sure that internally we can count on equality relations reflexive,
> symmetric, and transitive.  A programmer needs to be able to make basic
> deductions such as the relationship between the two forms of the
> in-operator:   for elem in somelist:  assert elem in somelist  # this should
> never fail.
> What surprises me is that anyone gets surprised by anything when
> experimenting with an object that isn't equal to itself.  It is roughly in
> the same category as creating a __hash__ that has no relationship to __eq__
> or making self-referencing sets or setting False,True=1,0 in python 2.
>  See http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/ for
> a nice blog post on the subject.

Maybe we should just call off the odd NaN comparison behavior?

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Nick Coghlan

On Thu, Apr 28, 2011 at 12:53 AM, Guido van Rossum  wrote:
>> What surprises me is that anyone gets surprised by anything when
>> experimenting with an object that isn't equal to itself.  It is roughly in
>> the same category as creating a __hash__ that has no relationship to __eq__
>> or making self-referencing sets or setting False,True=1,0 in python 2.
>>  See http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/ for
>> a nice blog post on the subject.
>
> Maybe we should just call off the odd NaN comparison behavior?

Rereading Meyer's article (I read it last time this came up, but it's
a nice piece, so I ended up going over it again this time) the quote
that leapt out at me was this one:

"""A few of us who had to examine the issue recently think that —
whatever the standard says at the machine level — a programming
language should support the venerable properties that equality is
reflexive and that assignment yields equality.

Every programming language should decide this on its own; for Eiffel
we think this should be the specification. Do you agree?"""

Currently, Python tries to split the difference: "==" and "!=" follow
IEEE754 for NaN, but most other operations involving builtin types
rely on the assumption that equality is always reflexive (and IEEE754
be damned).

What that means is that "correct" implementations of methods like
__contains__, __eq__, __ne__, index() and count() on containers should
be using "x is y or x == y" to enforce reflexivity, but most such code
does not (e.g. our own collections.abc.Sequence implementation gets
those of these that it implements wrong, and hence Sequence based
containers will handle NaN in a way that differs from the builtin
containers)
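
As a rough sketch of the difference, compare a container that uses plain "=="
with one that uses "x is y or x == y". NaiveBag and ReflexiveBag are made-up
classes for illustration only; they are not how collections.abc.Sequence or
the builtin containers are actually written.

```python
class NaiveBag:
    """Container whose lookups rely on == alone."""

    def __init__(self, items):
        self._items = list(items)

    def __contains__(self, value):
        return any(item == value for item in self._items)

    def count(self, value):
        return sum(1 for item in self._items if item == value)


class ReflexiveBag(NaiveBag):
    """Container that enforces reflexivity with an identity check first."""

    @staticmethod
    def _same(x, y):
        return x is y or x == y

    def __contains__(self, value):
        return any(self._same(item, value) for item in self._items)

    def index(self, value):
        for i, item in enumerate(self._items):
            if self._same(item, value):
                return i
        raise ValueError("value not in container")

    def count(self, value):
        return sum(1 for item in self._items if self._same(item, value))


nan = float('nan')
naive = NaiveBag([nan])
reflexive = ReflexiveBag([nan])
print(nan in naive, nan in reflexive)
```

Only the second class agrees with the builtin list when the stored object is
not equal to itself.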

And none of that is actually documented anywhere (other than a
behavioural note in the 3.x documentation for
PyObject_RichCompareBool), so it's currently just an implementation
detail of CPython that most of the builtin containers behave that way
in practice.

Given the status quo, what would seem to be the path of least resistance is to:
- articulate in the language specification which container special
methods are expected to enforce reflexivity of equality (even for
non-reflexive types)
- articulate in the library specification which ordinary container
methods enforce reflexivity of equality
- fix any standard library containers that don't enforce reflexivity
to do so where appropriate (e.g. collections.abc.Sequence)

Types with a non-reflexive notion of equality still wouldn't play
nicely with containers that didn't enforce reflexivity where
appropriate, but bad interactions between 3rd party types isn't really
something we can prevent.

Backing away from having float and decimal.Decimal respect the IEEE754
notion of NaN inequality at this late stage of the game seems like one
for the "too hard" basket. It also wouldn't achieve much, since we
want the builtin containers to preserve their invariants even for 3rd
party types with a non-reflexive notion of equality.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Wed, Apr 27, 2011 at 10:53 AM, Guido van Rossum  wrote:
..
> Maybe we should just call off the odd NaN comparison behavior?

+1

There was a long thread on this topic last year:

http://mail.python.org/pipermail/python-dev/2010-March/098832.html

I was trying to find a rationale for non-reflexivity of equality in
IEEE and although it is often mentioned that this property simplifies
some numerical algorithms, I am yet to find an important algorithm
that would benefit from it.  I also believe that long history of
suboptimal hardware implementations of nan arithmetics has stifled the
development of practical applications.

High performance applications that rely on non-reflexivity will still
have an option of using ctypes.c_float type or NumPy.

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Nick Coghlan

On Thu, Apr 28, 2011 at 1:43 AM, Alexander Belopolsky
 wrote:
> High performance applications that rely on non-reflexivity will still
> have an option of using ctypes.c_float type or NumPy.

However, that's exactly the reason I don't see any reason to reverse
course on having float() and Decimal() follow IEEE754 semantics,
regardless of how irritating we may find those semantics to be.

Since we allow types to customise __eq__ and __ne__ with non-standard
behaviour, if we want to permit *any* type to have a non-reflexive
notion of equality, then we need to write our container types to
enforce reflexivity when appropriate. Many of the builtin types
already do this, by virtue of it being built in to RichCompareBool.
It's now a matter of documenting that properly and updating the
non-conformant types accordingly.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Wed, Apr 27, 2011 at 11:31 AM, Nick Coghlan  wrote:
..
> Backing away from having float and decimal.Decimal respect the IEEE754
> notion of NaN inequality at this late stage of the game seems like one
> for the "too hard" basket.

Why?  float('nan') has always been in the use-at-your-own-risk
territory despite recent efforts to support it across Python
platforms.   I cannot speak about decimal.Decimal (and decimal is a
different story because it is tied to a particular standard), but the
only use of non-reflexivity for float nans I've seen was use of x != x
instead of math.isnan(x).
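
For reference, the two spellings side by side. isnan_by_comparison is a
hypothetical helper name; it only works as long as float "!=" keeps following
IEEE 754, which is exactly what is being debated here.

```python
import math


def isnan_by_comparison(x):
    """Detect NaN through non-reflexivity: only a NaN is unequal to itself."""
    return x != x


samples = [float('nan'), 0.0, 1.5, float('inf'), float('-inf')]
by_comparison = [isnan_by_comparison(x) for x in samples]
by_math = [math.isnan(x) for x in samples]
print(by_comparison == by_math)
```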

> It also wouldn't achieve much, since we
> want the builtin containers to preserve their invariants even for 3rd
> party types with a non-reflexive notion of equality.

These are orthogonal issues.   A third party type that plays with
__eq__ and other basic operations can easily break stdlib algorithms
no matter what we do.  Therefore it is important to document the
properties of the types that each algorithm relies on.  It is more
important, however that stdlib types do not break 3rd party's
algorithms.   I don't think I've ever seen a third party type that
deliberately defines a non-reflexive __eq__ except as a side effect of
using float attributes or C float members in the underlying structure.
 (Yes, decimal is a counter-example, but this is a very special case.)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Wed, Apr 27, 2011 at 12:05 PM, Isaac Morland  wrote:
..
> Of course, the definition of math.isnan cannot then be by checking its
> argument by comparison with itself - it would have to check the appropriate
> bits of the float representation.

math.isnan() is implemented in C and does not rely on float.__eq__ in any way.

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Isaac Morland


On Wed, 27 Apr 2011, Alexander Belopolsky wrote:

> High performance applications that rely on non-reflexivity will still
> have an option of using ctypes.c_float type or NumPy.


Python could also provide IEEE-754 equality as a function (perhaps in 
"math"), something like:


def ieee_equal(a, b):
    return a == b and not isnan(a) and not isnan(b)

Of course, the definition of math.isnan cannot then be by checking its 
argument by comparison with itself - it would have to check the 
appropriate bits of the float representation.
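
A runnable version of that idea might look like the sketch below. It uses
math.isnan for the NaN test. With today's float "==", which already follows
IEEE 754, the isnan checks are redundant; they would only matter if "==" were
changed to be reflexive for NaN, as suggested earlier in the thread.

```python
import math


def ieee_equal(a, b):
    """IEEE 754 style equality: a NaN never equals anything, itself included."""
    if math.isnan(a) or math.isnan(b):
        return False
    return a == b


nan = float('nan')
print(ieee_equal(nan, nan), ieee_equal(1.0, 1.0))
```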


Isaac Morland   CSCF Web Guru
DC 2554C, x36650   WWW Software Specialist

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Antoine Pitrou

On Wed, 27 Apr 2011 12:05:12 -0400 (EDT)
Isaac Morland  wrote:
> On Wed, 27 Apr 2011, Alexander Belopolsky wrote:
> 
> > High performance applications that rely on non-reflexivity will still
> > have an option of using ctypes.c_float type or NumPy.
> 
> Python could also provide IEEE-754 equality as a function (perhaps in 
> "math"), something like:
> 
> def ieee_equal (a, b):
>   return a == b and not isnan (a) and not isnan (b)

+1 (perhaps call it math.eq()).

Regards

Antoine.



Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Raymond Hettinger


On Apr 27, 2011, at 7:53 AM, Guido van Rossum wrote:

> Maybe we should just call off the odd NaN comparison behavior?

I'm reluctant to suggest changing such enshrined behavior.

ISTM, the current state of affairs is reasonable.  
Exotic objects are allowed to generate exotic behaviors
but consumers of those objects are free to ignore some
of those behaviors by making reasonable assumptions
about how an object should behave.

It's possible to make objects where the __hash__ doesn't
correspond to __eq__; they just won't behave well with
hash tables.  Likewise, it's possible for a sequence to
define a __len__ that is different from its true length; it
just won't behave well with the various pieces of code
that assume collections are unequal if the lengths are unequal.

All of this seems reasonable to me.


Raymond



Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Isaac Morland


On Wed, 27 Apr 2011, Antoine Pitrou wrote:

> Isaac Morland  wrote:
>
>> Python could also provide IEEE-754 equality as a function (perhaps in
>> "math"), something like:
>>
>> def ieee_equal(a, b):
>>     return a == b and not isnan(a) and not isnan(b)
>
> +1 (perhaps call it math.eq()).


Alexander Belopolsky pointed out to me (thanks!) that isnan is implemented 
in C so my caveat about the implementation of isnan is not an issue.  But 
then that made me realize the ieee_equal (or just "eq" if that's 
preferable) probably ought to be implemented in C using a floating point 
comparison - i.e., use the processor implementation of the comparison 
operation.


Isaac Morland   CSCF Web Guru
DC 2554C, x36650   WWW Software Specialist

Re: [Python-Dev] Issue Tracker

2011-04-27 Thread Ethan Furman


Ezio Melotti wrote:
> On 26/04/2011 22.32, Ethan Furman wrote:
>> Okay, I finally found a little time and got roundup installed and
>> operating.
>>
>> Only major complaint at this point is that the issue messages are
>> presented in top-post format (argh).
>>
>> Does anyone know off the top of one's head what to change to put
>> roundup in bottom-post (chronological) format?
>>
>> TIA!
>>
>> ~Ethan~
>
> See line 309 of
> http://svn.python.org/view/tracker/instances/python-dev/html/issue.item.html?view=markup
>
> If you have other questions about Roundup see
> https://lists.sourceforge.net/lists/listinfo/roundup-users


Thanks so much!  That was just what I needed.

~Ethan~

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Wed, Apr 27, 2011 at 12:28 PM, Raymond Hettinger
 wrote:
>
> On Apr 27, 2011, at 7:53 AM, Guido van Rossum wrote:
>
>> Maybe we should just call off the odd NaN comparison behavior?
>
> I'm reluctant to suggest changing such enshrined behavior.
>
> ISTM, the current state of affairs is reasonable.
> Exotic objects are allowed to generate exotic behaviors
> but consumers of those objects are free to ignore some
> of those behaviors by making reasonable assumptions
> about how an object should behave.

Unfortunately NaNs are not that exotic.  They can be silently produced
in calculations and lead to hard to find errors.  For example:

>>> x = 1e300*1e300
>>> x - x
nan

This means that every program dealing with float data has to detect
nans at every step and handle them correctly.  This in turn makes it
impossible to write efficient code that works equally well with floats
and integers.
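
A small sketch of how a NaN can slip in unnoticed and what an explicit check
looks like. require_not_nan is a hypothetical helper, not an existing
function; the point is only that the check has to be written by hand.

```python
import math


def require_not_nan(x):
    """Return x unchanged, or raise if a NaN has crept into the calculation."""
    if math.isnan(x):
        raise ValueError("NaN encountered")
    return x


big = 1e300 * 1e300   # overflows silently to inf, no exception
diff = big - big      # inf - inf is nan, again with no exception

try:
    require_not_nan(diff)
    caught = False
except ValueError:
    caught = True
print(caught)
```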

Note that historically, Python was trying hard to prevent production
of non-finite floats.  AFAICT, none of the math functions would
produce inf or nan.   I am not sure why arithmetic operations are
different.  For example:

>>> 1e300*1e300
inf

but

>>> 1e300**2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: (34, 'Result too large')

and

>>> math.pow(1e300,2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: math range error

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Raymond Hettinger


On Apr 27, 2011, at 10:16 AM, Alexander Belopolsky wrote:
> Unfortunately NaNs are not that exotic.  

They're exotic in the sense that they have the unusual property of not being 
equal to themselves.

Exotic (adj) strikingly strange or unusual


Raymond



Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Terry Reedy

On 4/27/2011 10:53 AM, Guido van Rossum wrote:
> On Wed, Apr 27, 2011 at 7:39 AM, Raymond Hettinger wrote:
>> Identity-implies-equality is necessary so that classes can maintain
>> their invariants and so that programmers can reason about their code.

[snip]

>> See
>> http://bertrandmeyer.com/2010/02/06/reflexivity-and-other-pillars-of-civilization/
>> for a nice blog post on the subject.

I carefully reread this, with the comments, and again came to the 
conclusion that the committee left us no *good* answer, only a choice 
between various more-or-less unsatisfactory answers. The current Python 
compromise may be as good as anything. In any case, I think it should be 
explicitly documented with an indexed paragraph, perhaps as follows:

"The IEEE-754 committee defined the float Not_a_Number (NaN) value as 
being incomparable with all other floats, including itself. This 
violates the math and logic rule that equality is reflexive, that 'a == 
a' is always True. And Python collection classes depend on that rule for 
their proper operation. So Python makes the following compromise. Direct 
equality comparisons involving NaN, such as "NaN=float('NaN'); NaN == 
ob", follow the IEEE-754 rule and return False. Indirect comparisons 
conducted internally as part of a collection operation, such as 'NaN in 
someset' or 'seq.count()' or 'somedict[x]', follow the reflexive rule 
and act as if 'NaN == NaN' were True. Most Python programmers will never 
see a NaN in real programs."

This might best be an entry in the Glossary under "NaN -- Not a Number". 
It should be the first reference for NaN in the General Index and linked 
to from the float() builtin and float type NaN mentions.

> Maybe we should just call off the odd NaN comparison behavior?

Eiffel seems to have survived, though I do not know if it is used for 
numerical work. I wonder how much code would break and what the scipy 
folks would think. 3.0 would have been the time, though.

--
Terry Jan Reedy


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Glenn Linderman


On 4/27/2011 8:31 AM, Nick Coghlan wrote:
> What that means is that "correct" implementations of methods like
> __contains__, __eq__, __ne__, index() and count() on containers should
> be using "x is y or x == y" to enforce reflexivity, but most such code
> does not (e.g. our own collections.abc.Sequence implementation gets
> those of these that it implements wrong, and hence Sequence based
> containers will handle NaN in a way that differs from the builtin
> containers)


+1 to everything Nick said.

One issue that I don't fully understand: I know there is only one 
instance of None in Python, but I'm not sure where to discover whether 
there is only a single, or whether there can be multiple, instances of 
NaN or Inf.  The IEEE 754 spec is clear that there are multiple bit 
sequences that can be used to represent these, so I would hope that 
there can be, in fact, more than one value containing NaN (and Inf).


This would properly imply that a collection should correctly handle the 
case of storing multiple, different items using different NaN (and Inf) 
instances.  A dict, for example, should be able to hold hundreds of 
items with the index value of NaN.


The distinction between "is" and "==" would permit proper operation, and 
I believe that Python's "rebinding" of names to values rather than the 
copying of values to variables makes such a distinction possible to use 
in a correct manner.
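
A quick experiment along these lines, assuming CPython, where each call to
float('nan') creates a new float object. The dict contents are arbitrary
illustration values.

```python
d = {}
for i in range(3):
    # Each float('nan') is a distinct object that is unequal to the others,
    # so each one becomes a separate key.
    d[float('nan')] = i
distinct_keys = len(d)

nan = float('nan')
d[nan] = 'first'
d[nan] = 'second'   # same object: the identity check finds the existing key
after_reuse = len(d)

fresh_lookup = float('nan') in d   # a brand-new NaN matches none of the keys
print(distinct_keys, after_reuse, fresh_lookup)
```

So a dict can hold many NaN keys, but each entry can only be reached again
through the very object that was used to store it.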


Can someone confirm or explain this issue?

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Robert Kern


On 4/27/11 12:44 PM, Terry Reedy wrote:
> On 4/27/2011 10:53 AM, Guido van Rossum wrote:
>
>> Maybe we should just call off the odd NaN comparison behavior?
>
> Eiffel seems to have survived, though I do not know if it is used for numerical
> work. I wonder how much code would break and what the scipy folks would think.


I suspect most of us would oppose changing it on general backwards-compatibility 
grounds rather than actually *liking* the current behavior. If the behavior 
changed with Python floats, we'd have to mull over whether we try to match that 
behavior with our scalar types (one of which subclasses from float) and our 
arrays. We would be either incompatible with Python or C, and we'd probably end 
up choosing Python to diverge from. It would make a mess, honestly. We already 
have to explain why equality is funky for arrays (arr1 == arr2 is a rich 
comparison that gives an array, not a bool, so we can't do containment tests for 
lists of arrays), so NaN is pretty easy to explain afterward.


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Terry Reedy

On 4/27/2011 2:41 PM, Glenn Linderman wrote:

> One issue that I don't fully understand: I know there is only one
> instance of None in Python, but I'm not sure where to discover whether
> there is only a single, or whether there can be multiple, instances of
> NaN or Inf.

I am sure there are multiple instances with just one bit pattern, the 
same as other floats. Otherwise, float('nan') would have to either 
randomly or systematically choose from among the possibilities. Ugh.

There are functions in the math module that pull apart (and put 
together) floats.

> The IEEE 754 spec is clear that there are multiple bit
> sequences that can be used to represent these,

Anyone actually interested in those should use C or possibly the math 
module float assembly function.

> so I would hope that
> there can be, in fact, more than one value containing NaN (and Inf).

If you do not know which pattern is which, what use could such possibly be?
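
For anyone who does want to look at the bits from Python, one possible route
is the struct module. That choice is an assumption of this sketch (the
message above only mentions C and the math module), and float_bits and
bits_to_float are hypothetical helper names.

```python
import math
import struct


def float_bits(x):
    """Return the 64-bit IEEE 754 pattern of a Python float as an int."""
    return struct.unpack('>Q', struct.pack('>d', x))[0]


def bits_to_float(bits):
    """Build a Python float from a 64-bit IEEE 754 pattern."""
    return struct.unpack('>d', struct.pack('>Q', bits))[0]


nan_bits = float_bits(float('nan'))
exponent = (nan_bits >> 52) & 0x7FF        # all ones for inf and NaN
fraction = nan_bits & ((1 << 52) - 1)      # nonzero means NaN, zero means inf

# A different fraction field still decodes to a NaN.
other_nan = bits_to_float(0x7FF8000000000001)
print(hex(nan_bits), math.isnan(other_nan))
```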

--
Terry Jan Reedy


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Terry Reedy


On 4/27/2011 11:31 AM, Nick Coghlan wrote:

> Currently, Python tries to split the difference: "==" and "!=" follow
> IEEE754 for NaN, but most other operations involving builtin types
> rely on the assumption that equality is always reflexive (and IEEE754
> be damned).
>
> What that means is that "correct" implementations of methods like
> __contains__, __eq__, __ne__, index() and count() on containers should
> be using "x is y or x == y" to enforce reflexivity, but most such code
> does not (e.g. our own collections.abc.Sequence implementation gets
> those of these that it implements wrong, and hence Sequence based
> containers will handle NaN in a way that differs from the builtin
> containers)
>
> And none of that is actually documented anywhere (other than a
> behavioural note in the 3.x documentation for
> PyObject_RichCompareBool), so it's currently just an implementation
> detail of CPython that most of the builtin containers behave that way
> in practice.

Which is why I proposed a Glossary entry in another post.

> Given the status quo, what would seem to be the path of least resistance is to:
> - articulate in the language specification which container special
> methods are expected to enforce reflexivity of equality (even for
> non-reflexive types)
> - articulate in the library specification which ordinary container
> methods enforce reflexivity of equality
> - fix any standard library containers that don't enforce reflexivity
> to do so where appropriate (e.g. collections.abc.Sequence)

+1 to making my proposed text consistently true if not now ;-).

> Backing away from having float and decimal.Decimal respect the IEEE754
> notion of NaN inequality at this late stage of the game seems like one
> for the "too hard" basket.

Robert Kern confirmed my suspicion about this relative to numpy.

> It also wouldn't achieve much, since we
> want the builtin containers to preserve their invariants even for 3rd
> party types with a non-reflexive notion of equality.


Good point.

--
Terry Jan Reedy


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Mark Dickinson

On Wed, Apr 27, 2011 at 10:37 AM, Hrvoje Niksic  wrote:
> The other day I was surprised to learn this:
>
> >>> nan = float('nan')
> >>> nan == nan
> False
> >>> [nan] == [nan]
> True                  # also True in tuples, dicts, etc.

That one surprises me a bit too:  I knew we were using
identity-then-equality checks for containment (nan in [nan]), but I
hadn't realised identity-then-equality was also used for the
item-by-item comparisons when comparing two lists.  It's defensible,
though: [nan] == [nan] should presumably produce the same result as
{nan} == {nan}, and the latter is a test that's arguably based on
containment (for sets s and t, s == t if each element of s is in t,
and vice versa).
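
A quick check of that parallel in current CPython. The last two comparisons
use a second, distinct NaN object to show that it is identity, not value,
that makes the first three succeed.

```python
nan = float('nan')
other = float('nan')   # a distinct NaN object

list_same = [nan] == [nan]        # item-by-item, identity checked first
set_same = {nan} == {nan}         # containment-based comparison
tuple_same = (nan,) == (nan,)

list_other = [nan] == [other]     # different objects, and nan != other
set_other = {nan} == {other}
print(list_same, set_same, tuple_same, list_other, set_other)
```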

I don't think any of this should change.  It seems to me that we've
currently got something approaching the best approximation to
consistency and sanity achievable, given the fundamental
incompatibility of (1) nan breaking reflexivity of equality and (2)
containment being based on equality.  That incompatibility is bound to
create inconsistencies somewhere along the line.

Declaring that 'nan == nan' should be True seems attractive in theory,
but I agree that it doesn't really seem like a realistic option in
terms of backwards compatibility and compatibility with other
mainstream languages.

Mark

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Mark Dickinson

On Wed, Apr 27, 2011 at 7:41 PM, Glenn Linderman  wrote:
> One issue that I don't fully understand: I know there is only one instance
> of None in Python, but I'm not sure where to discover whether there is only
> a single, or whether there can be multiple, instances of NaN or Inf.  The
> IEEE 754 spec is clear that there are multiple bit sequences that can be
> used to represent these, so I would hope that there can be, in fact, more
> than one value containing NaN (and Inf).
>
> This would properly imply that a collection should correctly handle the case
> of storing multiple, different items using different NaN (and Inf)
> instances.  A dict, for example, should be able to hold hundreds of items
> with the index value of NaN.
>
> The distinction between "is" and "==" would permit proper operation, and I
> believe that Python's "rebinding" of names to values rather than the copying
> of values to variables makes such a distinction possible to use in a correct
> manner.

For infinities, there's no issue:  there are exactly two distinct
infinities (+inf and -inf), and they don't have any special properties
that affect membership tests.   Your float-keyed dict can contain both
+inf and -inf keys, or just one, or neither, in exactly the same way
that it can contain both +5.0 and -5.0 as keys, or just one, or
neither.

For nans, you *can* put multiple nans into a dictionary as separate
keys, but under the current rules the test for 'sameness' of two nan
keys becomes a test of object identity, not of bitwise equality.
Python takes no notice of the sign bits and 'payload' bits of a float
nan, except in operations like struct.pack and struct.unpack.  For
example:

>>> x, y = float('nan'), float('nan')
>>> d = {x: 1, y:2}
>>> x in d
True
>>> y in d
True
>>> d[x]
1
>>> d[y]
2

But using struct.pack, you can see that x and y are bitwise identical:

>>> struct.pack('
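
The struct.pack comparison mentioned above can be sketched as follows. This is an illustration, not the original session: the `'<d'` format (little-endian double) and the variable names `bits_x` and `bits_y` are assumptions, and the exact bytes produced are platform-dependent, so only their equality is compared.

```python
import struct

x, y = float('nan'), float('nan')

# Two distinct objects, so they act as two distinct dict keys.
d = {x: 1, y: 2}
print(x is y, len(d))        # False 2

# Packing exposes the underlying IEEE 754 bit pattern of each NaN.
bits_x = struct.pack('<d', x)
bits_y = struct.pack('<d', y)

# Both NaNs come from the same float('nan') call path, so on the CPython
# builds I am aware of the two bit patterns are identical.
print(bits_x == bits_y)
```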

[Python-Dev] Socket servers in the test suite

2011-04-27 Thread Vinay Sajip

I've been recently trying to improve the test coverage for the logging package,
and have got to a not unreasonable point:

logging/__init__.py 99% (96%)
logging/config.py 89% (85%)
logging/handlers.py 60% (54%)

where the figures in parentheses include branch coverage measurements.

I'm at the point where to appreciably increase coverage, I'd need to write some
test servers to exercise client code in SocketHandler, DatagramHandler and
HTTPHandler.

I notice there are no utility classes in test.support to help with this kind of
thing - would there be any mileage in adding such things? Of course I could add
test server code just to test_logging (which already contains some socket server
code to exercise the configuration functionality), but rolling a test server
involves boilerplate such as using a custom RequestHandler-derived class for
each application. I had in mind a more streamlined approach where you can just
pass a single callable to a server to handle requests, e.g. as outlined in

https://gist.github.com/945157

I'd be grateful for any comments about adding such functionality to e.g.
test.support.
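
To make the "single callable" idea concrete, here is a minimal sketch. It is not the code from the gist; `CallableTCPServer`, `echo_upper` and `run_once` are hypothetical names, and only the standard `socketserver`, `socket` and `threading` modules are assumed.

```python
import socket
import socketserver
import threading


class CallableTCPServer(socketserver.TCPServer):
    """Hypothetical helper: a TCP server that hands each request to a callable."""

    allow_reuse_address = True

    def __init__(self, address, callback):
        # Build the RequestHandler subclass here, so tests never write one.
        class _Handler(socketserver.StreamRequestHandler):
            def handle(self):
                callback(self)

        super().__init__(address, _Handler)


def echo_upper(request):
    # 'request' is the StreamRequestHandler; rfile/wfile wrap the socket.
    line = request.rfile.readline()
    request.wfile.write(line.upper())


def run_once(payload):
    """Serve on an ephemeral loopback port, send one line, return the reply."""
    server = CallableTCPServer(("127.0.0.1", 0), echo_upper)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        with socket.create_connection(server.server_address) as conn:
            conn.sendall(payload)
            with conn.makefile("rb") as reader:
                reply = reader.readline()
    finally:
        server.shutdown()
        server.server_close()
        thread.join()
    return reply
```

A test would then only supply the callable, for example `run_once(b"hello\n")`, instead of defining a handler class per application.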

Regards,

Vinay Sajip


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Glenn Linderman

On 4/27/2011 2:15 PM, Mark Dickinson wrote:
> On Wed, Apr 27, 2011 at 7:41 PM, Glenn Linderman  wrote:
> > One issue that I don't fully understand: I know there is only one instance
> > of None in Python, but I'm not sure where to discover whether there is only
> > a single, or whether there can be multiple, instances of NaN or Inf.  The
> > IEEE 754 spec is clear that there are multiple bit sequences that can be
> > used to represent these, so I would hope that there can be, in fact, more
> > than one value containing NaN (and Inf).
> >
> > This would properly imply that a collection should correctly handle the case
> > of storing multiple, different items using different NaN (and Inf)
> > instances.  A dict, for example, should be able to hold hundreds of items
> > with the index value of NaN.
> >
> > The distinction between "is" and "==" would permit proper operation, and I
> > believe that Python's "rebinding" of names to values rather than the copying
> > of values to variables makes such a distinction possible to use in a correct
> > manner.
>
> For infinities, there's no issue:  there are exactly two distinct
> infinities (+inf and -inf), and they don't have any special properties
> that affect membership tests.   Your float-keyed dict can contain both
> +inf and -inf keys, or just one, or neither, in exactly the same way
> that it can contain both +5.0 and -5.0 as keys, or just one, or
> neither.
>
> For nans, you *can* put multiple nans into a dictionary as separate
> keys, but under the current rules the test for 'sameness' of two nan
> keys becomes a test of object identity, not of bitwise equality.
> Python takes no notice of the sign bits and 'payload' bits of a float
> nan, except in operations like struct.pack and struct.unpack.  For
> example:

Thanks, Mark, for the succinct description and demonstration.  Yes, only
two Inf values, many possible NaNs.  And this is what I would expect.

I would not, however, expect the original case that was described:
>>> nan = float('nan')
>>> nan == nan
False
>>> [nan] == [nan]
True  # also True in tuples, dicts, etc.


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Greg Ewing


Guido van Rossum wrote:
> Maybe we should just call off the odd NaN comparison behavior?

That's probably as good an idea as anything.

The weirdness of NaNs is supposed to ensure that they
propagate through a computation as a kind of exception
signal. But to make that work properly, comparing two
NaNs should really give you a NaB (Not a Boolean). As
long as we're not doing that, we might as well treat
NaNs sanely as Python objects.

--
Greg

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Glenn Linderman


On 4/27/2011 2:04 PM, Mark Dickinson wrote:
> On Wed, Apr 27, 2011 at 10:37 AM, Hrvoje Niksic  wrote:
> > The other day I was surprised to learn this:
> >
> > >>> nan = float('nan')
> > >>> nan == nan
> > False
> > >>> [nan] == [nan]
> > True  # also True in tuples, dicts, etc.
>
> That one surprises me a bit too:  I knew we were using
> identity-then-equality checks for containment (nan in [nan]), but I
> hadn't realised identity-then-equality was also used for the
> item-by-item comparisons when comparing two lists.  It's defensible,
> though: [nan] == [nan] should presumably produce the same result as
> {nan} == {nan}, and the latter is a test that's arguably based on
> containment (for sets s and t, s == t if each element of s is in t,
> and vice versa).
>
> I don't think any of this should change.  It seems to me that we've
> currently got something approaching the best approximation to
> consistency and sanity achievable, given the fundamental
> incompatibility of (1) nan breaking reflexivity of equality and (2)
> containment being based on equality.  That incompatibility is bound to
> create inconsistencies somewhere along the line.
>
> Declaring that 'nan == nan' should be True seems attractive in theory,
> but I agree that it doesn't really seem like a realistic option in
> terms of backwards compatibility and compatibility with other
> mainstream languages.


I think it should change.  Inserting a NaN, even the same instance of 
NaN into a list shouldn't suddenly make it compare equal to itself, 
especially since the docs (section 5.9. Comparisons) say:


   * Tuples and lists are compared lexicographically using comparison
     of corresponding elements. This means that to compare equal, each
     element must compare equal and the two sequences must be of the
     same type and have the same length.

     If not equal, the sequences are ordered the same as their first
     differing elements. For example, [1,2,x] <= [1,2,y] has the same
     value as x <= y. If the corresponding element does not exist, the
     shorter sequence is ordered first (for example, [1,2] < [1,2,3]).

The principle of least surprise says that if two unequal items are
inserted into otherwise equal lists, the lists should be unequal.  NaN
is unequal to itself.



Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Steven D'Aprano

Guido van Rossum wrote:
> Maybe we should just call off the odd NaN comparison behavior?

This doesn't solve the broader problem that *any* type might 
deliberately define non-reflexive equality, and therefore people will 
still be surprised by

>>> x = SomeObject()
>>> x == x
False
>>> [x] == [x]
True

The "problem" (if it is a problem) here is list, not NANs. Please don't 
break NANs to not-fix a problem with list.

Since we can't (can we?) prohibit non-reflexivity, and even if we can, 
we shouldn't, reasonable solutions are:

(1) live with the fact that lists and other built-in containers will 
short-cut equality with identity for speed, ignoring __eq__;

(2) slow containers down by guaranteeing that they will use __eq__;

(but how much will it actually hurt performance for real-world cases? 
and this will have the side-effect that non-reflexivity will propagate 
to containers)

(3) allow types to register that they are non-reflexive, allowing 
containers to skip the identity shortcut when necessary.

(but it is not clear to me that the extra complexity will be worth the cost)

My vote is the status quo, (1).
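
To illustrate the difference between options (1) and (2), here is a small sketch. `NeverEqual` and `strict_list_eq` are hypothetical names introduced only for this example; `strict_list_eq` models what a list comparison would do if it always called `__eq__`, and is not how CPython implements anything.

```python
class NeverEqual:
    """Hypothetical type that is unequal to everything, itself included."""

    def __eq__(self, other):
        return False

    def __hash__(self):
        return id(self)


def strict_list_eq(a, b):
    """Option (2) in pure Python: compare element by element with == only."""
    return len(a) == len(b) and all(p == q for p, q in zip(a, b))


x = NeverEqual()
print(x == x)                     # False
print([x] == [x])                 # True: option (1), the identity shortcut
print(strict_list_eq([x], [x]))   # False: non-reflexivity propagates
```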

--
Steven


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Steven D'Aprano

Terry Reedy wrote:
> On 4/27/2011 2:41 PM, Glenn Linderman wrote:
> > One issue that I don't fully understand: I know there is only one
> > instance of None in Python, but I'm not sure where to discover whether
> > there is only a single, or whether there can be multiple, instances of
> > NaN or Inf.
>
> I am sure there are multiple instances with just one bit pattern, the
> same as other floats. Otherwise, float('nan') would have to either
> randomly or systematically choose from among the possibilities. Ugh.

I think Glenn is asking whether NANs are singletons. They're not:

>>> x = float('nan')
>>> y = float('nan')
>>> x is y
False
>>> [x] == [y]
False

> There are functions in the math module that pull apart (and put
> together) floats.
>
> > The IEEE 754 spec is clear that there are multiple bit
> > sequences that can be used to represent these,
>
> Anyone actually interested in those should use C or possibly the math
> module float assembly function.

I'd like to point out that way back in the 1980s, Apple's Hypercard 
allowed users to construct, and compare, distinct NANs without needing 
to use C or check bit patterns. I think it is painful and ironic that a 
development system aimed at non-programmers released by a company 
notorious for "dumbing down" interfaces over 20 years ago had better and 
simpler support for NANs than we have now.

--
Steven

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Steven D'Aprano


Greg Ewing wrote:
> Guido van Rossum wrote:
> > Maybe we should just call off the odd NaN comparison behavior?
>
> That's probably as good an idea as anything.
>
> The weirdness of NaNs is supposed to ensure that they
> propagate through a computation as a kind of exception
> signal. But to make that work properly, comparing two
> NaNs should really give you a NaB (Not a Boolean). As
> long as we're not doing that, we might as well treat
> NaNs sanely as Python objects.


That doesn't follow. You can compare NANs, and the results of the
comparisons are perfectly well defined as either True or False. There's
no need for a NAB comparison flag.




--
Steven


Re: [Python-Dev] [Python-checkins] cpython: PyGILState_Ensure(), PyGILState_Release(), PyGILState_GetThisThreadState() are

2011-04-27 Thread Jim Jewett

Would it be a problem to make them available as no-ops?

On 4/26/11, victor.stinner  wrote:
> http://hg.python.org/cpython/rev/75503c26a17f
> changeset:   69584:75503c26a17f
> user:Victor Stinner 
> date:Tue Apr 26 23:34:58 2011 +0200
> summary:
>   PyGILState_Ensure(), PyGILState_Release(), PyGILState_GetThisThreadState()
> are
> not available if Python is compiled without threads.
>
> files:
>   Include/pystate.h |  10 +++---
>   1 files changed, 7 insertions(+), 3 deletions(-)
>
>
> diff --git a/Include/pystate.h b/Include/pystate.h
> --- a/Include/pystate.h
> +++ b/Include/pystate.h
> @@ -73,9 +73,9 @@
>  struct _frame *frame;
>  int recursion_depth;
>  char overflowed; /* The stack has overflowed. Allow 50 more calls
> - to handle the runtime error. */
> -char recursion_critical; /* The current calls must not cause
> - a stack overflow. */
> +to handle the runtime error. */
> +char recursion_critical; /* The current calls must not cause
> +a stack overflow. */
>  /* 'tracing' keeps track of the execution depth when tracing/profiling.
> This is to prevent the actual trace/profile code from being recorded
> in
> the trace/profile. */
> @@ -158,6 +158,8 @@
>  enum {PyGILState_LOCKED, PyGILState_UNLOCKED}
>  PyGILState_STATE;
>
> +#ifdef WITH_THREAD
> +
>  /* Ensure that the current thread is ready to call the Python
> C API, regardless of the current state of Python, or of its
> thread lock.  This may be called as many times as desired
> @@ -199,6 +201,8 @@
>  */
>  PyAPI_FUNC(PyThreadState *) PyGILState_GetThisThreadState(void);
>
> +#endif   /* #ifdef WITH_THREAD */
> +
>  /* The implementation of sys._current_frames()  Returns a dict mapping
> thread id to that thread's current frame.
>  */
>
> --
> Repository URL: http://hg.python.org/cpython
>

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Ethan Furman


Mark Dickinson wrote:
> On Wed, Apr 27, 2011 at 10:37 AM, Hrvoje Niksic  wrote:
> > The other day I was surprised to learn this:
> >
> > >>> nan = float('nan')
> > >>> nan == nan
> > False
> > >>> [nan] == [nan]
> > True  # also True in tuples, dicts, etc.
>
> That one surprises me a bit too:  I knew we were using
> identity-then-equality checks for containment (nan in [nan]), but I
> hadn't realised identity-then-equality was also used for the
> item-by-item comparisons when comparing two lists.  It's defensible,
> though: [nan] == [nan] should presumably produce the same result as
> {nan} == {nan}, and the latter is a test that's arguably based on
> containment (for sets s and t, s == t if each element of s is in t,
> and vice versa).
>
> I don't think any of this should change.  It seems to me that we've
> currently got something approaching the best approximation to
> consistency and sanity achievable, given the fundamental
> incompatibility of (1) nan breaking reflexivity of equality and (2)
> containment being based on equality.  That incompatibility is bound to
> create inconsistencies somewhere along the line.
>
> Declaring that 'nan == nan' should be True seems attractive in theory,
> but I agree that it doesn't really seem like a realistic option in
> terms of backwards compatibility and compatibility with other
> mainstream languages.


Totally out of my depth, but what if a NaN object was allowed to
compare equal to itself, but different NaN objects still compared
unequal?  If NaN was a singleton then the current behavior makes more
sense, but since we get a new NaN with each instance creation is there
really a good reason why the same NaN can't be equal to itself?


~Ethan~

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Glenn Linderman

On 4/27/2011 5:05 PM, Steven D'Aprano wrote:
> (2) slow containers down by guaranteeing that they will use __eq__;
>
> (but how much will it actually hurt performance for real-world cases?
> and this will have the side-effect that non-reflexivity will propagate
> to containers)

I think it is perfectly reasonable that containers containing items with 
non-reflexive equality should sometimes have non-reflexive equality also 
(depends on the placement of the item in the container, and the values 
of other items, whether the non-reflexive equality of an internal item 
will actually affect the equality of the container in practice).

I quoted the docs for tuple and list comparisons in a different part of 
this thread, and for those types, the docs are very clear that the items 
must compare equal for the lists or tuples to compare equal.  For other 
built-in types, the docs are less clear:

   * Mappings (dictionaries) compare equal if and only if they have the
     same (key, value) pairs. Order comparisons ('<', '<=', '>=', '>')
     raise TypeError.

So we can immediately conclude that mappings do not provide an ordering 
for sorts.  But, the language "same (key, value)" pairs implies identity 
comparisons, rather than equality comparisons.  But in practice, 
equality is used sometimes, and identity sometimes:

>>> nan = float('NaN')
>>> d1 = dict( a=1, nan=2 )
>>> d2 = dict( a=1, nan=2.0 )
>>> d1 == d2
True
>>> 2 is 2.0
False

"nan" and "nan" are being compared using identity, 2 and 2.0 by
equality.  While that may be clear to those of you that know the
implementation (and even have described it somewhat in this thread), it
is certainly not clear in the docs.  And I think it should read much
more like lists and tuples... "if all the (key, value) pairs, considered
as tuples, are equal".
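
One caveat about the snippet above: `dict(a=1, nan=2)` creates the string key `'nan'`, not the float bound to the name `nan`. A minimal sketch with actual float NaN keys, assuming current CPython behaviour, shows the same mix of identity (for the key) and equality (for the value); the names `d1`, `d2` and `d3` are only for illustration.

```python
nan = float('nan')

# Keyword arguments become string keys, so this dict holds no float NaN.
print(list(dict(a=1, nan=2)))        # ['a', 'nan']

# Same NaN object as key: found by identity; 2 and 2.0 match by equality.
d1 = {'a': 1, nan: 2}
d2 = {'a': 1, nan: 2.0}
print(d1 == d2)                      # True

# A different NaN object as key is never found, so the dicts differ.
d3 = {'a': 1, float('nan'): 2}
print(d1 == d3)                      # False
```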

   * Sets and frozensets define comparison operators to mean subset and
     superset tests. Those relations do not define total orderings (the
     two sets {1,2} and {2,3} are not equal, nor subsets of one
     another, nor supersets of one another). Accordingly, sets are not
     appropriate arguments for functions which depend on total
     ordering. For example, min(), max(), and sorted() produce
     undefined results given a list of sets as inputs.

This clearly talks about sets and subsets, but it doesn't define those
concepts well in this section.  It should perhaps refer to where that
concept is defined.  The intuitive definition of "subset" to me is: if,
for every item in set A, an equal item is found in set B, then set A
is a subset of set B.  That's what I learned back in math classes.
Since NaN is not equal to NaN, however, I would not expect a set
containing NaN to compare equal to any other set.
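
For comparison with that expectation, a short sketch of what sets actually do on current CPython: the outcome depends on whether the two sets hold the same NaN object or two different ones.

```python
x = float('nan')
y = float('nan')

# Same NaN object on both sides: membership succeeds by identity.
print({x} == {x})     # True
print({x} <= {x})     # True

# Distinct NaN objects: neither identity nor == matches.
print({x} == {y})     # False
print({x} <= {y})     # False
print(len({x, y}))    # 2
```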

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Glenn Linderman

On 4/27/2011 6:11 PM, Ethan Furman wrote:
> Mark Dickinson wrote:
> > On Wed, Apr 27, 2011 at 10:37 AM, Hrvoje Niksic  wrote:
> > > The other day I was surprised to learn this:
> > >
> > > >>> nan = float('nan')
> > > >>> nan == nan
> > > False
> > > >>> [nan] == [nan]
> > > True  # also True in tuples, dicts, etc.
> >
> > That one surprises me a bit too:  I knew we were using
> > identity-then-equality checks for containment (nan in [nan]), but I
> > hadn't realised identity-then-equality was also used for the
> > item-by-item comparisons when comparing two lists.  It's defensible,
> > though: [nan] == [nan] should presumably produce the same result as
> > {nan} == {nan}, and the latter is a test that's arguably based on
> > containment (for sets s and t, s == t if each element of s is in t,
> > and vice versa).
> >
> > I don't think any of this should change.  It seems to me that we've
> > currently got something approaching the best approximation to
> > consistency and sanity achievable, given the fundamental
> > incompatibility of (1) nan breaking reflexivity of equality and (2)
> > containment being based on equality.  That incompatibility is bound to
> > create inconsistencies somewhere along the line.
> >
> > Declaring that 'nan == nan' should be True seems attractive in theory,
> > but I agree that it doesn't really seem like a realistic option in
> > terms of backwards compatibility and compatibility with other
> > mainstream languages.
>
> Totally out of my depth, but what if a NaN object was allowed to
> compare equal to itself, but different NaN objects still compared
> unequal?  If NaN was a singleton then the current behavior makes more
> sense, but since we get a new NaN with each instance creation is there
> really a good reason why the same NaN can't be equal to itself?

>>> n1 = float('NaN')
>>> n2 = float('NaN')
>>> n3 = n1

>>> n1
nan
>>> n2
nan
>>> n3
nan

>>> [n1] == [n2]
False
>>> [n1] == [n3]
True

This is the current situation: some NaNs compare equal sometimes, and 
some don't.  And unless you are particularly aware of the identity of 
the object containing the NaN (not the list, but the particular NaN 
value) it is surprising and confusing, because the mathematical 
definition of NaN is that it should not be equal to itself.

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Glenn Linderman


On 4/27/2011 6:15 PM, Glenn Linderman wrote:
> I think it is perfectly reasonable that containers containing items
> with non-reflexive equality should sometimes have non-reflexive
> equality also (depends on the placement of the item in the container,
> and the values of other items, whether the non-reflexive equality of
> an internal item will actually affect the equality of the container in
> practice).


Pardon me, please ignore the parenthetical statement... it was really 
inspired by inequality comparisons, not equality comparisons.

Re: [Python-Dev] Socket servers in the test suite

2011-04-27 Thread Nick Coghlan

On Thu, Apr 28, 2011 at 7:23 AM, Vinay Sajip  wrote:
> I've been recently trying to improve the test coverage for the logging
> package, and have got to a not unreasonable point:
>
> logging/__init__.py 99% (96%)
> logging/config.py 89% (85%)
> logging/handlers.py 60% (54%)
>
> where the figures in parentheses include branch coverage measurements.
>
> I'm at the point where to appreciably increase coverage, I'd need to
> write some test servers to exercise client code in SocketHandler,
> DatagramHandler and HTTPHandler.
>
> I notice there are no utility classes in test.support to help with this
> kind of thing - would there be any mileage in adding such things? Of
> course I could add test server code just to test_logging (which already
> contains some socket server code to exercise the configuration
> functionality), but rolling a test server involves boilerplate such as
> using a custom RequestHandler-derived class for each application. I had
> in mind a more streamlined approach where you can just pass a single
> callable to a server to handle requests, e.g. as outlined in
>
> https://gist.github.com/945157
>
> I'd be grateful for any comments about adding such functionality to e.g.
> test.support.

If you poke around in the test directory a bit, you may find there is
already some code along these lines in other tests (e.g. I'm pretty
sure the urllib tests already fire up a local server). Starting down
the path of standardisation of that test functionality would be good.

For larger components like this, it's also reasonable to add a
dedicated helper module rather than using test.support directly. I
started (and Antoine improved) something along those lines with the
test.script_helper module for running Python subprocesses and checking
their output, although it lacks documentation and there are lots of
older tests that still use subprocess directly.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Stephen J. Turnbull

Glenn Linderman writes:

 > I would not, however expect the original case that was described:
 >  >>> nan = float('nan')
 >  >>> nan == nan
 > False
 >  >>> [nan] == [nan]
 > True  # also True in tuples, dicts, etc.

Are you saying you would expect that

>>> nan = float('nan')
>>> a = [1, ..., 499, nan, 501, ..., 999]# meta-ellipsis, not Ellipsis
>>> a == a
False

??

I wouldn't even expect

>>> a = [1, ..., 499, float('nan'), 501, ..., 999]
>>> b = [1, ..., 499, float('nan'), 501, ..., 999]
>>> a == b
False

but I guess I have to live with that.  While I wouldn't apply it
to other people, I have to admit Raymond's aphorism applies to me (the
surprising thing is not the behavior of NaNs, but that I'm surprised
by anything that happens in the presence of NaNs!)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Stephen J. Turnbull

Mark Dickinson writes:

 > Declaring that 'nan == nan' should be True seems attractive in
 > theory,

No, it's intuitively attractive, but that's because humans like nice
continuous behavior.  In *theory*, it's true that some singularities
are removable, and the NaN that occurs when evaluating at that point
is actually definable in a broader context, but the point of NaN is
that some singularities are *not* removable.  This is somewhat
Pythonic: "In the presence of ambiguity, refuse to guess."


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Stephen J. Turnbull

Glenn Linderman writes:
 > On 4/27/2011 6:11 PM, Ethan Furman wrote:

 > > Totally out of my depth, but what if a NaN object was allowed to
 > > compare equal to itself, but different NaN objects still compared 
 > > unequal?  If NaN was a singleton then the current behavior makes more 
 > > sense, but since we get a new NaN with each instance creation is there 
 > > really a good reason why the same NaN can't be equal to itself?

Yes.  A NaN is a special object that means "the computation that
produced this object is undefined."  For example, consider the
computation 1/x at x = 0.  If you approach from the left, 1/0
"obviously" means minus infinity, while if you approach from the right
just as obviously it means plus infinity.  So what does the 1/0 that
occurs in [1/x for x in range(-5, 6)] mean?  In what sense is it
"equal to itself"?  How can something which is not a number be
compared for numerical equality?
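
A side note for readers trying this in Python itself: float division by zero raises an exception rather than returning a NaN, so the list comprehension above would not complete. NaNs do arise from other indeterminate forms, as this minimal sketch shows (it assumes only the standard `math` module).

```python
import math

inf = float('inf')

# Python raises instead of returning a special value for division by zero.
try:
    1.0 / 0.0
except ZeroDivisionError as exc:
    print(type(exc).__name__)      # ZeroDivisionError

# Indeterminate forms built from infinities do yield NaN.
print(math.isnan(inf - inf))       # True
print(math.isnan(inf * 0.0))       # True
print(math.isnan(inf / inf))       # True
```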

 >  >>> n1 = float('NaN')
 >  >>> n2 = float('NaN')
 >  >>> n3 = n1
 > 
 >  >>> n1
 > nan
 >  >>> n2
 > nan
 >  >>> n3
 > nan
 > 
 >  >>> [n1] == [n2]
 > False
 >  >>> [n1] == [n3]
 > True
 > 
 > This is the current situation: some NaNs compare equal sometimes, and 
 > some don't.

No, Ethan is asking for "n1 == n3" => True.  As Mark points out, "[n1]
== [n3]" can be interpreted as a containment question, rather than an
equality question, with respect to the NaNs themselves.  In standard
set theory, these are the same question, but that's not necessarily so
in other set-like toposes.  In particular, getting equality and set
membership to behave reasonably with respect to each other is one of
the problems faced in developing a workable theory of fuzzy sets.

I don't think it matters what behavior you choose for NaNs, somebody
is going to be unhappy sometimes.

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Guido van Rossum

On Wed, Apr 27, 2011 at 9:28 AM, Raymond Hettinger
 wrote:
>
> On Apr 27, 2011, at 7:53 AM, Guido van Rossum wrote:
>
>> Maybe we should just call off the odd NaN comparison behavior?
>
> I'm reluctant to suggest changing such enshrined behavior.

No doubt there would be some problems; probably more for decimals than
for floats.

> ISTM, the current state of affairs is reasonable.

Hardly; when I picked the NaN behavior I knew the IEEE std prescribed
it but had never seen any code that used this.

> Exotic objects are allowed to generate exotic behaviors
> but consumers of those objects are free to ignore some
> of those behaviors by making reasonable assumptions
> about how an object should behave.

I'd say that the various issues and inconsistencies brought up (e.g. x
in A even though no a in A equals x) make it clear that one ignores
NaN's exoticness at one's peril.

> It's possible to make objects where the __hash__ doesn't
> correspond to __eq__.; they just won't behave well with
> hash tables.

That's not the same thing at all. Such an object would violate a rule
of the language (although one that Python cannot strictly enforce) and
it would always be considered a bug. Currently NaN is not violating
any language rules -- it is just violating users' intuition, in a much
worse way than Inf does. (All in all, Inf behaves pretty intuitively,
at least for someone who was awake during at least a few high school
math classes. NaN is not discussed there. :-)

> Likewise, it's possible for a sequence to
> define a __len__ that is different from it true length; it
> just won't behave well with the various pieces of code
> that assume collections are equal if the lengths are unequal.

(you probably meant "are never equal")

Again, typically a bug.

> All of this seems reasonable to me.

Given the IEEE std and Python's history, it's defensible and hard to
change, but still, I find reasonable too strong a word for the
situation.

I expect that if 15 years or so ago I had decided to ignore the
IEEE std and declare that object identity always implies equality it
would have seemed quite reasonable as well... The rule could be
something like "the == operator first checks for identity and if left
and right are the same object, the answer is True without calling the
object's __eq__ method; similarly the != would always return False
when an object is compared to itself". We wouldn't change the
inequalities, nor the outcome if a NaN is compared to another NaN (not
the same object). But we would extend the special case for object
identity from containers to all == and != operators. (Currently it
seems that all NaNs have a hash() of 0. That hasn't hurt anyone so
far.)
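
A minimal sketch of the rule described above, modelled in pure Python.
The helper names proposed_eq and proposed_ne are hypothetical and only
illustrate the idea; an actual change would live in the interpreter's
comparison machinery, not in user code.

```python
def proposed_eq(left, right):
    """Model of the proposed ==: identity short-circuits before __eq__."""
    if left is right:
        return True
    return left == right


def proposed_ne(left, right):
    """Model of the proposed !=: an object is never unequal to itself."""
    if left is right:
        return False
    return left != right


nan = float('nan')
other_nan = float('nan')   # a second, distinct NaN object

print(nan == nan)                    # False under the current semantics
print(proposed_eq(nan, nan))         # True: same object on both sides
print(proposed_eq(nan, other_nan))   # False: distinct objects, IEEE rule applies
print(nan < nan, nan > nan)          # False False: inequalities are untouched
```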

Doing this in 3.3 would, alas, be a huge undertaking -- I expect that
there are tons of unittests that depend either on the current NaN
behavior or on x == x calling x.__eq__(x). Plus the decimal unittests
would be affected. Perhaps somebody could try?

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Guido van Rossum

On Wed, Apr 27, 2011 at 11:48 AM, Robert Kern  wrote:
> On 4/27/11 12:44 PM, Terry Reedy wrote:
>>
>> On 4/27/2011 10:53 AM, Guido van Rossum wrote:
>
>>> Maybe we should just call off the odd NaN comparison behavior?
>>
>> Eiffel seems to have survived, though I do not know if it is used for
>> numerical work. I wonder how much code would break and what the scipy
>> folks would think.
>
> I suspect most of us would oppose changing it on general
> backwards-compatibility grounds rather than actually *liking* the current
> behavior. If the behavior changed with Python floats, we'd have to mull over
> whether we try to match that behavior with our scalar types (one of which
> subclasses from float) and our arrays. We would be either incompatible with
> Python or C, and we'd probably end up choosing Python to diverge from. It
> would make a mess, honestly. We already have to explain why equality is
> funky for arrays (arr1 == arr2 is a rich comparison that gives an array, not
> a bool, so we can't do containment tests for lists of arrays), so NaN is
> pretty easy to explain afterward.

So does NumPy also follow Python's behavior about ignoring the NaN
special-casing when doing array ops?

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Robert Kern


On 2011-04-27 22:16 , Guido van Rossum wrote:
> On Wed, Apr 27, 2011 at 11:48 AM, Robert Kern  wrote:
>> On 4/27/11 12:44 PM, Terry Reedy wrote:
>>> On 4/27/2011 10:53 AM, Guido van Rossum wrote:
>>>> Maybe we should just call off the odd NaN comparison behavior?
>>>
>>> Eiffel seems to have survived, though I do not know if it is used for
>>> numerical work. I wonder how much code would break and what the scipy
>>> folks would think.
>>
>> I suspect most of us would oppose changing it on general
>> backwards-compatibility grounds rather than actually *liking* the current
>> behavior. If the behavior changed with Python floats, we'd have to mull over
>> whether we try to match that behavior with our scalar types (one of which
>> subclasses from float) and our arrays. We would be either incompatible with
>> Python or C, and we'd probably end up choosing Python to diverge from. It
>> would make a mess, honestly. We already have to explain why equality is
>> funky for arrays (arr1 == arr2 is a rich comparison that gives an array, not
>> a bool, so we can't do containment tests for lists of arrays), so NaN is
>> pretty easy to explain afterward.
>
> So does NumPy also follow Python's behavior about ignoring the NaN
> special-casing when doing array ops?


By "ignoring the NaN special-casing", do you mean that identity is checked 
first? When we use dtype=object arrays (arrays that contain Python objects as 
their data), yes:


[~]
|1> nan = float('nan')

[~]
|2> import numpy as np

[~]
|3> a = np.array([1, 2, nan], dtype=object)

[~]
|4> nan in a
True

[~]
|5> float('nan') in a
False


Just like lists:

[~]
|6> nan in [1, 2, nan]
True

[~]
|7> float('nan') in [1, 2, nan]
False


Actually, we go a little further by using PyObject_RichCompareBool() rather than 
PyObject_RichCompare() to implement the array-wise comparisons in addition to 
containment:


[~]
|8> a == nan
array([False, False,  True], dtype=bool)

[~]
|9> [x == nan for x in [1, 2, nan]]
[False, False, False]


But for dtype=float arrays (which contain C doubles, not Python objects) we use 
C semantics. Literally, we use whatever C's == operator gives us for the two 
double values. Since there is no concept of identity for this case, there is no 
cognate behavior of Python to match.


[~]
|10> b = np.array([1.0, 2.0, nan], dtype=float)

[~]
|11> b == nan
array([False, False, False], dtype=bool)

[~]
|12> nan in b
False

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Nick Coghlan

On Thu, Apr 28, 2011 at 12:42 PM, Stephen J. Turnbull wrote:
> Mark Dickinson writes:
>
>  > Declaring that 'nan == nan' should be True seems attractive in
>  > theory,
>
> No, it's intuitively attractive, but that's because humans like nice
> continuous behavior.  In *theory*, it's true that some singularities
> are removable, and the NaN that occurs when evaluating at that point
> is actually definable in a broader context, but the point of NaN is
> that some singularities are *not* removable.  This is somewhat
> Pythonic: "In the presence of ambiguity, refuse to guess."

Refusing to guess in this case would be to treat all NaNs as
signalling NaNs, and that wouldn't be good, either :)

I like Terry's suggestion for a glossary entry, and have created an
updated proposal at http://bugs.python.org/issue11945

(I also noted that array.array is like collections.Sequence in failing
to enforce the container invariants in the presence of NaN values)

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Guido van Rossum

On Wed, Apr 27, 2011 at 8:42 PM, Robert Kern  wrote:
> On 2011-04-27 22:16 , Guido van Rossum wrote:
>> So does NumPy also follow Python's behavior about ignoring the NaN
>> special-casing when doing array ops?
>
> By "ignoring the NaN special-casing", do you mean that identity is checked
> first? When we use dtype=object arrays (arrays that contain Python objects
> as their data), yes:
>
> [~]
> |1> nan = float('nan')
>
> [~]
> |2> import numpy as np
>
> [~]
> |3> a = np.array([1, 2, nan], dtype=object)
>
> [~]
> |4> nan in a
> True
>
> [~]
> |5> float('nan') in a
> False
>
>
> Just like lists:
>
> [~]
> |6> nan in [1, 2, nan]
> True
>
> [~]
> |7> float('nan') in [1, 2, nan]
> False
>
>
> Actually, we go a little further by using PyObject_RichCompareBool() rather
> than PyObject_RichCompare() to implement the array-wise comparisons in
> addition to containment:
>
> [~]
> |8> a == nan
> array([False, False,  True], dtype=bool)

Hm, this sounds like NumPy always considers a NaN equal to *itself* as
far as objects are concerned.

> [~]
> |9> [x == nan for x in [1, 2, nan]]
> [False, False, False]
>
>
> But for dtype=float arrays (which contain C doubles, not Python objects) we
> use C semantics. Literally, we use whatever C's == operator gives us for the
> two double values. Since there is no concept of identity for this case,
> there is no cognate behavior of Python to match.
>
> [~]
> |10> b = np.array([1.0, 2.0, nan], dtype=float)
>
> [~]
> |11> b == nan
> array([False, False, False], dtype=bool)
>
> [~]
> |12> nan in b
> False

And I wouldn't want to change that. It sounds like NumPy wouldn't be
much affected if we were to change this (which I'm not saying we
would).

Thanks!

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Guido van Rossum

On Wed, Apr 27, 2011 at 8:43 PM, Nick Coghlan  wrote:
> (I also noted that array.array is like collections.Sequence in failing
> to enforce the container invariants in the presence of NaN values)

Regardless of whether we go any further it would indeed be good to be
explicit about the rules in the language reference and fix the
behavior of collections.Sequence.

I'm not sure about array.array -- it doesn't hold objects so I don't
think there's anything to enforce. It seems to behave the same way as
NumPy arrays when they don't contain objects.
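
A small sketch of that difference using only the standard library. It
assumes CPython's behaviour of building a fresh float object each time an
item is read out of an array.array; the variable names are illustrative.

```python
from array import array

nan = float('nan')
objects = [1.0, 2.0, nan]               # a list stores references to objects
doubles = array('d', [1.0, 2.0, nan])   # an array stores raw C doubles

print(nan in objects)      # True: the list holds the very same object
print(nan in doubles)      # False: each item is unboxed into a new float
print(doubles[2] is nan)   # False: no object identity survives storage
```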

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Wed, Apr 27, 2011 at 2:48 PM, Robert Kern  wrote:
..
> I suspect most of us would oppose changing it on general
> backwards-compatibility grounds rather than actually *liking* the current
> behavior. If the behavior changed with Python floats, we'd have to mull over
> whether we try to match that behavior with our scalar types (one of which
> subclasses from float) and our arrays. We would be either incompatible with
> Python or C, and we'd probably end up choosing Python to diverge from. It
> would make a mess, honestly. We already have to explain why equality is
> funky for arrays (arr1 == arr2 is a rich comparison that gives an array, not
> a bool, so we can't do containment tests for lists of arrays), so NaN is
> pretty easy to explain afterward.

Most NumPy applications are actually not exposed to NaN problems
because it is recommended that NaNs be avoided in computations and
when missing or undefined values are necessary, the recommended
solution is to use ma.array or masked array which is a drop-in
replacement for numpy array type and carries a boolean "mask" value
with every element.  This allows undefined elements in arrays
of any type: float, integer or even boolean.  Masked values propagate
through all computations including comparisons.

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Guido van Rossum

On Wed, Apr 27, 2011 at 9:15 PM, Alexander Belopolsky wrote:
> On Wed, Apr 27, 2011 at 2:48 PM, Robert Kern  wrote:
> ..
>> I suspect most of us would oppose changing it on general
>> backwards-compatibility grounds rather than actually *liking* the current
>> behavior. If the behavior changed with Python floats, we'd have to mull over
>> whether we try to match that behavior with our scalar types (one of which
>> subclasses from float) and our arrays. We would be either incompatible with
>> Python or C, and we'd probably end up choosing Python to diverge from. It
>> would make a mess, honestly. We already have to explain why equality is
>> funky for arrays (arr1 == arr2 is a rich comparison that gives an array, not
>> a bool, so we can't do containment tests for lists of arrays), so NaN is
>> pretty easy to explain afterward.
>
> Most NumPy applications are actually not exposed to NaN problems
> because it is recommended that NaNs be avoided in computations and
> when missing or undefined values are necessary, the recommended
> solution is to use ma.array or masked array which is a drop-in
> replacement for numpy array type and carries a boolean "mask" value
> with every element.  This allows undefined elements in arrays
> of any type: float, integer or even boolean.  Masked values propagate
> through all computations including comparisons.

So do new masks get created when the outcome of an elementwise
operation is a NaN? Because that's the only reason why one should have
NaNs in one's data in the first place -- not to indicate missing
values!

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Robert Kern


On 2011-04-27 23:01 , Guido van Rossum wrote:
> On Wed, Apr 27, 2011 at 8:42 PM, Robert Kern  wrote:
>> But for dtype=float arrays (which contain C doubles, not Python objects) we
>> use C semantics. Literally, we use whatever C's == operator gives us for the
>> two double values. Since there is no concept of identity for this case,
>> there is no cognate behavior of Python to match.
>>
>> [~]
>> |10>  b = np.array([1.0, 2.0, nan], dtype=float)
>>
>> [~]
>> |11>  b == nan
>> array([False, False, False], dtype=bool)
>>
>> [~]
>> |12>  nan in b
>> False
>
> And I wouldn't want to change that. It sounds like NumPy wouldn't be
> much affected if we were to change this (which I'm not saying we
> would).


Well, I didn't say that. If Python changed its behavior for (float('nan') == 
float('nan')), we'd have to seriously consider some changes. We do like to keep 
*some* amount of correspondence with Python semantics. In particular, we like 
our scalar types that match Python types to work as close to the Python type as 
possible. We have the np.float64 type, which represents a C double scalar and 
corresponds to a Python float. It is used when a single item is indexed out of a 
float64 array. We even subclass from the Python float type to help working with 
libraries that may not know about numpy:


[~]
|5> import numpy as np

[~]
|6> nan = np.array([1.0, 2.0, float('nan')])[2]

[~]
|7> nan == nan
False

[~]
|8> type(nan)
numpy.float64

[~]
|9> type(nan).mro()
[numpy.float64,
 numpy.floating,
 numpy.inexact,
 numpy.number,
 numpy.generic,
 float,
 object]


If the Python float type changes behavior, we'd have to consider whether to keep 
that for np.float64 or change it to match the usual C semantics used elsewhere. 
So there *would* be a dilemma. Not necessarily the most nerve-wracking one, but 
a dilemma nonetheless.


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Robert Kern


On 2011-04-27 23:24 , Guido van Rossum wrote:
> On Wed, Apr 27, 2011 at 9:15 PM, Alexander Belopolsky wrote:
>> On Wed, Apr 27, 2011 at 2:48 PM, Robert Kern  wrote:
>> ..
>>> I suspect most of us would oppose changing it on general
>>> backwards-compatibility grounds rather than actually *liking* the current
>>> behavior. If the behavior changed with Python floats, we'd have to mull over
>>> whether we try to match that behavior with our scalar types (one of which
>>> subclasses from float) and our arrays. We would be either incompatible with
>>> Python or C, and we'd probably end up choosing Python to diverge from. It
>>> would make a mess, honestly. We already have to explain why equality is
>>> funky for arrays (arr1 == arr2 is a rich comparison that gives an array, not
>>> a bool, so we can't do containment tests for lists of arrays), so NaN is
>>> pretty easy to explain afterward.
>>
>> Most NumPy applications are actually not exposed to NaN problems
>> because it is recommended that NaNs be avoided in computations and
>> when missing or undefined values are necessary, the recommended
>> solution is to use ma.array or masked array which is a drop-in
>> replacement for numpy array type and carries a boolean "mask" value
>> with every element.  This allows undefined elements in arrays
>> of any type: float, integer or even boolean.  Masked values propagate
>> through all computations including comparisons.
>
> So do new masks get created when the outcome of an elementwise
> operation is a NaN?

No.

> Because that's the only reason why one should have
> NaNs in one's data in the first place -- not to indicate missing
> values!


Yes. I'm not sure that Alexander was being entirely clear. Masked arrays are 
intended to solve just the missing data problem and not the occurrence of NaNs 
from computations. There is still a persistent part of the community that really 
does like to use NaNs for missing data, though. I don't think that's entirely 
relevant to this discussion[1].


I wouldn't say that numpy applications aren't exposed to NaN problems. They are 
just as exposed to computational NaNs as you would expect any application that 
does that many flops to be.


[1] Okay, that's a lie. I'm sure that persistent minority would *love* to have 
NaN == NaN, because that would make their (ab)use of NaNs easier to work with.


--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
 that is made terrible by our own mad attempt to interpret it as though it had
 an underlying truth."
  -- Umberto Eco


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Nick Coghlan

On Thu, Apr 28, 2011 at 2:07 PM, Guido van Rossum  wrote:
> I'm not sure about array.array -- it doesn't hold objects so I don't
> think there's anything to enforce. It seems to behave the same way as
> NumPy arrays when they don't contain objects.

Yep, after reading Robert's post I realised the point about native
arrays in NumPy (and the lack of "object identity" in those cases)
applied equally well to the array module.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Glenn Linderman

On 4/27/2011 7:31 PM, Stephen J. Turnbull wrote:

> Glenn Linderman writes:
>
>  >  I would not, however expect the original case that was described:
>  >   >>>  nan = float('nan')
>  >   >>>  nan == nan
>  >  False
>  >   >>>  [nan] == [nan]
>  >  True  # also True in tuples, dicts, etc.
>
> Are you saying you would expect that
>
> nan = float('nan')
> a = [1, ..., 499, nan, 501, ..., 999]    # meta-ellipsis, not Ellipsis
> a == a
>
> False
>
> ??

Yes, absolutely.  Once you understand the definition of NaN, it 
certainly cannot be True.   a is a, but a is not equal to a.

> I wouldn't even expect
>
> a = [1, ..., 499, float('nan'), 501, ..., 999]
> b = [1, ..., 499, float('nan'), 501, ..., 999]
> a == b
>
> False
>
> but I guess I have to live with that.   While I wouldn't apply it
> to other people, I have to admit Raymond's aphorism applies to me (the
> surprising thing is not the behavior of NaNs, but that I'm surprised
> by anything that happens in the presence of NaNs!)

The only thing that should happen in the presence of NaNs is more NaNs :)


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Guido van Rossum

On Wed, Apr 27, 2011 at 9:25 PM, Robert Kern  wrote:
> On 2011-04-27 23:01 , Guido van Rossum wrote:
>> And I wouldn't want to change that. It sounds like NumPy wouldn't be
>> much affected if we were to change this (which I'm not saying we
>> would).
>
> Well, I didn't say that. If Python changed its behavior for (float('nan') ==
> float('nan')), we'd have to seriously consider some changes.

Ah, but I'm not proposing anything of the sort! float('nan') returns a
new object each time and two NaNs that are not the same *object* will
still follow the IEEE std. It's just when comparing a NaN-valued
*object* to *itself* (i.e. the *same* object) that I would consider
following the lead of Python's collections.

> We do like to
> keep *some* amount of correspondence with Python semantics. In particular,
> we like our scalar types that match Python types to work as close to the
> Python type as possible. We have the np.float64 type, which represents a C
> double scalar and corresponds to a Python float. It is used when a single
> item is indexed out of a float64 array. We even subclass from the Python
> float type to help working with libraries that may not know about numpy:
>
> [~]
> |5> import numpy as np
>
> [~]
> |6> nan = np.array([1.0, 2.0, float('nan')])[2]
>
> [~]
> |7> nan == nan
> False

Yeah, this is where things might change, because it is the same
*object* left and right.

> [~]
> |8> type(nan)
> numpy.float64
>
> [~]
> |9> type(nan).mro()
> [numpy.float64,
>  numpy.floating,
>  numpy.inexact,
>  numpy.number,
>  numpy.generic,
>  float,
>  object]
>
>
> If the Python float type changes behavior, we'd have to consider whether to
> keep that for np.float64 or change it to match the usual C semantics used
> elsewhere. So there *would* be a dilemma. Not necessarily the most
> nerve-wracking one, but a dilemma nonetheless.

Given what I just said, would it still be a dilemma? Maybe a smaller one?

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Guido van Rossum

On Wed, Apr 27, 2011 at 9:33 PM, Robert Kern  wrote:
> [1] Okay, that's a lie. I'm sure that persistent minority would *love* to
> have NaN == NaN, because that would make their (ab)use of NaNs easier to
> work with.

Too bad, because that won't change. :-) I agree that this is abuse of
NaNs and shouldn't be encouraged.

-- 
--Guido van Rossum (python.org/~guido)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Glenn Linderman

On 4/27/2011 8:06 PM, Stephen J. Turnbull wrote:

> Glenn Linderman writes:
>  >  On 4/27/2011 6:11 PM, Ethan Furman wrote:
>
>  >  >  Totally out of my depth, but what if a NaN object was allowed to
>  >  >  compare equal to itself, but different NaN objects still compared
>  >  >  unequal?  If NaN was a singleton then the current behavior makes more
>  >  >  sense, but since we get a new NaN with each instance creation is there
>  >  >  really a good reason why the same NaN can't be equal to itself?
>
> Yes.  A NaN is a special object that means "the computation that
> produced this object is undefined."  For example, consider the
> computation 1/x at x = 0.  If you approach from the left, 1/0
> "obviously" means minus infinity, while if you approach from the right
> just as obviously it means plus infinity.  So what does the 1/0 that
> occurs in [1/x for x in range(-5, 6)] mean?  In what sense is it
> "equal to itself"?  How can something which is not a number be
> compared for numerical equality?
>
>  >   >>>  n1 = float('NaN')
>  >   >>>  n2 = float('NaN')
>  >   >>>  n3 = n1
>  >
>  >   >>>  n1
>  >  nan
>  >   >>>  n2
>  >  nan
>  >   >>>  n3
>  >  nan
>  >
>  >   >>>  [n1] == [n2]
>  >  False
>  >   >>>  [n1] == [n3]
>  >  True
>  >
>  >  This is the current situation: some NaNs compare equal sometimes, and
>  >  some don't.
>
> No, Ethan is asking for "n1 == n3" =>  True.  As Mark points out, "[n1]
> == [n3]" can be interpreted as a containment question, rather than an
> equality question, with respect to the NaNs themselves.

It _can_ be interpreted as a containment question, but doing so is 
contrary to the documentation of Python list comparison, which presently 
doesn't match the implementation.  The intuitive definition of equality 
of lists is that each member is equal.  The presence of NaN destroys 
intuition of people that don't expect them to be as different from 
numbers as they actually are, but for people that understand NaNs and 
expect them to behave according to their definition, then the presence 
of a NaN in a list would be expected to cause the list to not be equal 
to itself, because a NaN is not equal to itself.
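
The two readings can be contrasted with a small pure-Python model. The
function names below are hypothetical, and list_eq_with_identity is only
an approximation of what the C implementation does per element through
PyObject_RichCompareBool; it is a sketch, not the actual list code.

```python
def list_eq_elementwise(a, b):
    """The 'intuitive' definition: every pair of members compares equal."""
    return len(a) == len(b) and all(x == y for x, y in zip(a, b))


def list_eq_with_identity(a, b):
    """Model of the observed behaviour: identity counts as equality."""
    return len(a) == len(b) and all(x is y or x == y for x, y in zip(a, b))


n1 = float('NaN')
n2 = float('NaN')
n3 = n1

print([n1] == [n3], list_eq_with_identity([n1], [n3]))   # True True
print([n1] == [n2], list_eq_with_identity([n1], [n2]))   # False False
print(list_eq_elementwise([n1], [n3]))                   # False
```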

> In standard
> set theory, these are the same question, but that's not necessarily so
> in other set-like toposes.  In particular, getting equality and set
> membership to behave reasonably with respect to each other is one of the
> problems faced in developing a workable theory of fuzzy sets.
>
> I don't think it matters what behavior you choose for NaNs, somebody
> is going to be unhappy sometimes.

Some people will be unhappy just because they exist in the language, so 
I agree :)


Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Nick Coghlan

On Thu, Apr 28, 2011 at 2:54 PM, Guido van Rossum  wrote:
>> Well, I didn't say that. If Python changed its behavior for (float('nan') ==
>> float('nan')), we'd have to seriously consider some changes.
>
> Ah, but I'm not proposing anything of the sort! float('nan') returns a
> new object each time and two NaNs that are not the same *object* will
> still follow the IEEE std. It's just when comparing a NaN-valued
> *object* to *itself* (i.e. the *same* object) that I would consider
> following the lead of Python's collections.

The reason this possibility bothers me is that it doesn't mesh well
with the "implementations are free to cache and reuse immutable
objects" rule. Although, if the updated NaN semantics were explicit
that identity was now considered part of the value of NaN objects
(thus ruling out caching them at the implementation layer), I guess
that objection would go away.

Regards,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Wed, Apr 27, 2011 at 11:14 PM, Guido van Rossum  wrote:
..
>> ISTM, the current state of affairs is reasonable.
>
> Hardly; when I picked the NaN behavior I knew the IEEE std prescribed
> it but had never seen any code that used this.
>

Same here.  The only code I've seen that depended on this NaN behavior
was either buggy (programmer did not consider NaN case) or was using x
== x as a way to detect nans.  The latter idiom is universally frowned
upon regardless of the language.  In Python one should use
math.isnan() for this purpose.
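
A short sketch contrasting the two ways of detecting a NaN. The helper
names is_nan_by_self_comparison and is_nan_explicit are made up for
illustration; only math.isnan() is a real library function.

```python
import math


def is_nan_by_self_comparison(x):
    """The frowned-upon idiom: relies on NaN being unequal to itself."""
    return x != x


def is_nan_explicit(x):
    """The recommended test: does not depend on how == treats NaN."""
    return math.isnan(x)


for value in (1.0, float('inf'), float('nan')):
    print(value, is_nan_by_self_comparison(value), is_nan_explicit(value))
```

The explicit form would keep working even if comparing an object to
itself were changed to always return True.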

I would like to present a challenge to the proponents of the status
quo.  Look through your codebase and find code that will behave
differently if nan == nan were True.   Then come back and report how
many bugs you have found. :-)  Seriously, though, I bet that if you
find anything, it will fall into one of the two cases I mentioned
above.

..
> I expect that that if 15 years or so ago I had decided to ignore the
> IEEE std and declare that object identity always implies equality it
> would have seemed quite reasonable as well... The rule could be
> something like "the == operator first checks for identity and if left
> and right are the same object, the answer is True without calling the
> object's __eq__ method; similarly the != would always return False
> when an object is compared to itself".

Note that ctypes' floats already behave this way:

>>> from ctypes import c_double
>>> x = c_double(float('nan'))
>>> x == x
True

..
> Doing this in 3.3 would, alas, be a huge undertaking -- I expect that
> there are tons of unittests that depend either on the current NaN
> behavior or on x == x calling x.__eq__(x). Plus the decimal unittests
> would be affected. Perhaps somebody could try?

Before we go down this path, I would like to discuss another
peculiarity of NaNs:

>>> float('nan') < 0
False
>>> float('nan') > 0
False

This property in my experience causes much more trouble than nan ==
nan being false.  The problem is that common sorting or binary search
algorithms may degenerate into infinite loops in the presence of nans.
 This may even happen when searching for a finite value in a large
array that contains a single nan.  Errors like this do happen in the
wild and after chasing a bug like this programmers tend to avoid
nans at all costs.  Oftentimes this leads to using "magic"
placeholders such as 1e300 for missing data.

Since py3k has already made None < 0 an error, it may be reasonable
for float('nan') < 0 to raise an error as well (probably ValueError
rather than TypeError).  This will not make lists with nans sortable
or searchable using binary search, but will make associated bugs
easier to find.
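
Until something like that exists in the language, the same effect can be
approximated in user code. This is only a sketch: strict_sorted is a
hypothetical helper, not a proposed API, and it checks for NaNs up front
instead of changing how < behaves.

```python
import math


def strict_sorted(values):
    """Sort like sorted(), but refuse input that contains a float NaN."""
    values = list(values)
    for position, value in enumerate(values):
        if isinstance(value, float) and math.isnan(value):
            raise ValueError(f"NaN at position {position} cannot be ordered")
    return sorted(values)


nan = float('nan')
print(nan < 0, nan > 0)                  # False False
print(strict_sorted([3.0, 1.0, 2.0]))    # [1.0, 2.0, 3.0]

try:
    strict_sorted([3.0, nan, 1.0])
except ValueError as exc:
    print("refused:", exc)
```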

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Thu, Apr 28, 2011 at 12:33 AM, Robert Kern  wrote:
> On 2011-04-27 23:24 , Guido van Rossum wrote:
..
>> So do new masks get created when the outcome of an elementwise
>> operation is a NaN?
>
> No.

Yes.

>>> from MA import array
>>> print array([0])/array([0])
[-- ]

(I don't have numpy on this laptop, so the example is using Numeric,
but I hope you guys did not change that while I was not looking:-)

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Greg Ewing


Steven D'Aprano wrote:
> You can compare NANs, and the result of the 
> comparisons are perfectly well defined by either True or False.


But it's *arbitrarily* defined, and it's far from clear that
the definition chosen is useful in any way.

If you perform a computation and get a NaN as the result,
you know that something went wrong at some point.

But if you subject that NaN to a comparison, your code
takes some arbitrarily-chosen branch and produces a
result that may look plausible but is almost certainly
wrong.

The Pythonic thing to do (in the Python 3 world at least) would
be to regard NaNs as non-comparable and raise an exception.
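
A rough sketch of what that could look like for a single comparison,
written as an ordinary function because the built-in float operators
cannot be changed from Python code. The name strict_compare is
hypothetical.

```python
import math
import operator


def strict_compare(op, left, right):
    """Apply a comparison operator, refusing to compare NaN operands."""
    if math.isnan(left) or math.isnan(right):
        raise ValueError("NaN is not comparable")
    return op(left, right)


nan = float('nan')
print(strict_compare(operator.lt, 1.0, 2.0))   # True

try:
    strict_compare(operator.lt, nan, 0.0)
except ValueError as exc:
    print("refused:", exc)                     # refused: NaN is not comparable
```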

--
Greg

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Thu, Apr 28, 2011 at 12:24 AM, Guido van Rossum  wrote:
> So do new masks get created when the outcome of an elementwise
> operation is a NaN?  Because that's the only reason why one should have
> NaNs in one's data in the first place.

If this is the case, why does Python almost never produce NaNs as the
IEEE standard prescribes?

>>> 0.0/0.0
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ZeroDivisionError: float division

> -- not to indicate missing values!

Sometimes you don't have a choice.  For example, when your data comes
from a database that uses NaNs for missing values.
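
A short sketch of both points, under the assumption of a current CPython. The `row` list stands in for values fetched from such a database and is purely illustrative.

```python
import math

inf = float("inf")

# Float division by zero raises instead of returning NaN or infinity.
try:
    0.0 / 0.0
except ZeroDivisionError as exc:
    print("0.0 / 0.0 raised", type(exc).__name__)

# These operations do follow IEEE 754 and quietly yield NaN.
quiet = [inf - inf, inf * 0.0, float("nan")]
print([math.isnan(v) for v in quiet])   # [True, True, True]

# Hypothetical row from a database that encodes missing values as NaN;
# mapping NaN to None keeps later comparisons from silently misbehaving.
row = [1.5, float("nan"), 2.0]
clean = [None if math.isnan(v) else v for v in row]
print(clean)                            # [1.5, None, 2.0]
```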

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Greg Ewing


Stephen J. Turnbull wrote:
> So what does the 1/0 that
> occurs in [1/x for x in range(-5, 6)] mean?  In what sense is it
> "equal to itself"?  How can something which is not a number be
> compared for numerical equality?


I would say it *can't* be compared for *numerical* equality.
It might make sense to compare it using some other notion of
equality.

One of the problems here, I think, is that Python only lets
you define one notion of equality for each type, and that
notion is the one that gets used when you compare collections
of that type. (Or at least it's supposed to, but the identity-
implies-equality shortcut that gets taken in some places
interferes with that.)

So if you're going to decide that it doesn't make sense to
compare undefined numeric quantities, then it doesn't make
sense to compare lists containing them either.
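
The interference mentioned above can be seen by comparing a list with itself in two ways. This is only a sketch; strict_eq is a hypothetical helper that applies the elements' own == and nothing else.

```python
nan = float("nan")
a = [1.0, nan]
b = [1.0, nan]   # holds the same NaN object as a


def strict_eq(xs, ys):
    """Compare two sequences using only the elements' own ==."""
    return len(xs) == len(ys) and all(x == y for x, y in zip(xs, ys))


print(a == b)            # True: the identity shortcut treats nan as equal
print(strict_eq(a, b))   # False: nan == nan is False
```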

--
Greg

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Greg Ewing


Guido van Rossum wrote:
> Currently NaN is not violating
> any language rules -- it is just violating users' intuition, in a much
> worse way than Inf does.


If it's to be an official language non-rule (by which I mean
that types are officially allowed to compare non-reflexively)
then any code assuming that identity implies equality for
arbitrary objects is broken and should be fixed.

--
Greg

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Thu, Apr 28, 2011 at 1:40 AM, Greg Ewing  wrote:
..
> The Pythonic thing to do (in the Python 3 world at least) would
> be to regard NaNs as non-comparable and raise an exception.

As I mentioned in a previous post, I agree in the case of <, <=, >, or
>= comparisons, but == and != are a harder case because you don't want,
for example:

>>> [1,2,float('nan'),3].index(3)
3

to raise an exception.
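
A sketch of the failure being described. RaisingNaN and NaNComparisonError are hypothetical names used only to simulate a NaN whose == raises; no such type exists in Python.

```python
class NaNComparisonError(Exception):
    """Hypothetical error raised when a NaN-like value is compared."""


class RaisingNaN:
    """Hypothetical NaN-like value whose == and != raise."""

    def __eq__(self, other):
        raise NaNComparisonError("NaN is not comparable")

    def __ne__(self, other):
        raise NaNComparisonError("NaN is not comparable")

    __hash__ = object.__hash__


# With today's float NaN the search simply skips over it.
print([1, 2, float("nan"), 3].index(3))   # 3

# With a raising NaN the search never reaches the element it wants.
try:
    [1, 2, RaisingNaN(), 3].index(3)
except NaNComparisonError as exc:
    print("index failed:", exc)
```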

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Glenn Linderman

On 4/27/2011 8:43 PM, Nick Coghlan wrote:
> On Thu, Apr 28, 2011 at 12:42 PM, Stephen J. Turnbull wrote:
>> Mark Dickinson writes:
>>
>>  > Declaring that 'nan == nan' should be True seems attractive in
>>  > theory,
>>
>> No, it's intuitively attractive, but that's because humans like nice
>> continuous behavior.  In *theory*, it's true that some singularities
>> are removable, and the NaN that occurs when evaluating at that point
>> is actually definable in a broader context, but the point of NaN is
>> that some singularities are *not* removable.  This is somewhat
>> Pythonic: "In the presence of ambiguity, refuse to guess."
>
> Refusing to guess in this case would be to treat all NaNs as
> signalling NaNs, and that wouldn't be good, either :)
>
> I like Terry's suggestion for a glossary entry, and have created an
> updated proposal at http://bugs.python.org/issue11945
>
> (I also noted that array.array is like collections.Sequence in failing
> to enforce the container invariants in the presence of NaN values)

In that bug, Nick, you mention that reflexive equality is something that 
container classes rely on in their implementation.  Such reliance seems 
to me to be a bug, or an inappropriate optimization, rather than a 
necessity.  I realize that classes that do not define equality use 
identity as their default equality operator, and that is acceptable for 
items that do not or cannot have any better equality operator.  It does 
lead to the situation where two objects that are bit-for-bit clones get 
separate entries in a set... exactly the same as how NaNs of different 
identity work... the situation with a NaN of the same identity not being 
added to the set multiple times seems to simply be a bug because of 
conflating identity and equality, and should not be relied on in 
container implementations.
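
The three situations described above can be reproduced in a few lines. This is a sketch only; Clone is a made-up class with no __eq__ of its own.

```python
class Clone:
    """Made-up class that inherits identity-based equality from object."""

    def __init__(self, payload):
        self.payload = payload


nan = float("nan")

same_nan = {nan, nan}                           # one NaN object, added twice
distinct_nans = {float("nan"), float("nan")}    # two separate NaN objects
clones = {Clone(1), Clone(1)}                   # identical content, two objects

print(len(same_nan))        # 1
print(len(distinct_nans))   # 2
print(len(clones))          # 2
```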

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Alexander Belopolsky

On Thu, Apr 28, 2011 at 2:20 AM, Glenn Linderman  wrote:
..
> In that bug, Nick, you mention that reflexive equality is something that
> container classes rely on in their implementation.  Such reliance seems to
> me to be a bug, or an inappropriate optimization, ..

An alternative interpretation would be that it is a bug to use NaN
values in lists.  It is certainly nonsensical to use NaNs as keys in
dictionaries and that reportedly led Java designers to forgo the
nonreflexivity of nans:

"""
A "NaN" value is not equal to itself. However, a "NaN" Java "Float"
object is equal to itself. The semantic is defined this way, because
otherwise "NaN" Java "Float" objects cannot be retrieved from a hash
table.
""" - http://www.concentric.net/~ttwang/tech/javafloat.htm

With the status quo in Python, it may only make sense to store NaNs in
array.array, but not in a list.
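
For comparison, a sketch of how the Java-style behaviour could be emulated in Python with a wrapper key. FloatKey is a hypothetical class, and it models only the NaN part of the Java semantics quoted above.

```python
import math


class FloatKey:
    """Hypothetical wrapper whose equality treats every NaN as equal."""

    __slots__ = ("value",)

    def __init__(self, value):
        self.value = float(value)

    def __eq__(self, other):
        if not isinstance(other, FloatKey):
            return NotImplemented
        if math.isnan(self.value) and math.isnan(other.value):
            return True
        return self.value == other.value

    def __hash__(self):
        # All NaNs compare equal here, so they must share one hash value.
        return 0 if math.isnan(self.value) else hash(self.value)


# A wrapped NaN can be looked up again through a different NaN object.
table = {FloatKey("nan"): "missing"}
print(table[FloatKey(float("nan"))])   # missing

# A plain float NaN key is only reachable through the identical object.
plain = {float("nan"): "missing"}
print(float("nan") in plain)           # False
```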

Re: [Python-Dev] PyObject_RichCompareBool identity shortcut

2011-04-27 Thread Nick Coghlan

On Thu, Apr 28, 2011 at 4:20 PM, Glenn Linderman  wrote:
> In that bug, Nick, you mention that reflexive equality is something that
> container classes rely on in their implementation.  Such reliance seems to
> me to be a bug, or an inappropriate optimization, rather than a necessity.
> I realize that classes that do not define equality use identity as their
> default equality operator, and that is acceptable for items that do not or
> cannot have any better equality operator.  It does lead to the situation
> where two objects that are bit-for-bit clones get separate entries in a
> set... exactly the same as how NaNs of different identity work... the
> situation with a NaN of the same identity not being added to the set
> multiple times seems to simply be a bug because of conflating identity and
> equality, and should not be relied on in container implementations.

No, as Raymond has articulated a number of times over the years, it's
a property of the equivalence relation that is needed in order to
present sane invariants to users of the container. I included in the
bug report the critical invariants I am currently aware of that should
hold, even when the container may hold types with a non-reflexive
definition of equality:

  assert [x] == [x]      # Generalised to all container types
  assert not [x] != [x]  # Generalised to all container types
  for x in c:
      assert x in c
      assert c.count(x) > 0            # If applicable
      assert 0 <= c.index(x) < len(c)  # If applicable

The builtin types all already work this way, and that's a deliberate
choice - my proposal is simply to document the behaviour as
intentional, and fix the one case I know of in the standard library
where we don't implement these semantics correctly (i.e.
collections.Sequence).
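
A runnable sketch of those invariants, checked against a few builtin containers that hold a NaN. The helper name check_invariants is made up, and "if applicable" is approximated with hasattr. The final lines show array.array as a counter-example, assuming CPython's behaviour of creating a fresh float object for each item it reads.

```python
from array import array
from collections import deque


def check_invariants(c):
    """Check the per-element invariants for one container instance."""
    for x in c:
        assert [x] == [x]
        assert not [x] != [x]
        assert x in c
        if hasattr(c, "count"):
            assert c.count(x) > 0
        if hasattr(c, "index"):
            assert 0 <= c.index(x) < len(c)


nan = float("nan")
containers = [
    [nan, 1.0],
    (nan, 1.0),
    {nan, 1.0},
    {nan: "a", 1.0: "b"},
    deque([nan, 1.0]),
]
for c in containers:
    check_invariants(c)
print("builtin containers hold the invariants")

# array.array stores raw doubles, so the identity shortcut never applies.
arr = array("d", [nan])
print(nan in arr)   # False
```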

The question of whether or not float and decimal.Decimal should be
modified to have reflexive definitions of equality (even for NaN
values) is actually orthogonal to the question of clarifying and
documenting the expected semantics of containers in the face of
non-reflexive definitions of equality.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia
