En Wed, 14 May 2008 23:47:56 -0300, George Sakkis <[EMAIL PROTECTED]> escribió:
> I spent several hours debugging some bogus data results that turned > out to be caused by the fact that heapq.nlargest doesn't respect rich > comparisons: > > import heapq > import random > > class X(object): > def __init__(self, x): self.x=x > def __repr__(self): return 'X(%s)' % self.x > if True: > # this succeeds > def __cmp__(self, other): return cmp(self.x , other.x) > else: > # this fails > def __lt__(self, other): return self.x < other.x > > s = [X(i) for i in range(10)] > random.shuffle(s) > > s1 = heapq.nlargest(5, s) > s2 = sorted(s, reverse=True)[:5] > assert s1 == s2, (s,s1,s2) > > s1 = heapq.nsmallest(5, s) > s2 = sorted(s)[:5] > assert s1 == s2, (s,s1,s2) > > > According to the docs, nlargest is equivalent to: "sorted(iterable, > key=key, reverse=True)[:n]" and similarly for nsmallest. So that must > be at least a documentation bug, if not an implementation one. Your class is not properly defined. sorted() only uses < to compare items, but this fact is NOT documented as far as I can tell. So you are abusing this implementation detail by not defining the other comparison operators, and the X class has a rather strange behavior: py> X(1)<X(2) True py> X(1)<=X(2) False # !!! py> X(1)>X(2) False py> X(1)==X(1) False # !!! If you write all the six rich comparison methods (__lt__, __le__, etc) nlargest and nsmallest work fine. If your objects are fully comparable (like numbers; that is, given A and B, one and only one of these holds: A<B, A==B, A>B) the easiest way is to write a helper function (like the old __cmp__) and implement the rich comparison methods using it. Those six methods could be defined in a mixin class. -- Gabriel Genellina -- http://mail.python.org/mailman/listinfo/python-list