Terry J. Reedy <[email protected]> added the comment:
While refactoring the code for 2.7, I discovered that the description of the
heuristic, both for 2.6 and in the code comments, is off by 1. "items that appear more
than 1% of the time" should actually be "items whose duplicates (after the
first) appear more than 1% of the time". The discrepancy arises because in the
following code
    for i, elt in enumerate(b):
        if elt in b2j:
            indices = b2j[elt]
            if n >= 200 and len(indices) * 100 > n:
                populardict[elt] = 1
                del indices[:]
            else:
                indices.append(i)
        else:
            b2j[elt] = [i]
len(indices) is retrieved *before* the index i of the current elt is added.
Whatever one might think the heuristic 'should' have been (and by the nature of
heuristics, there is no right answer), the default behavior must remain as it
is, so we adjusted the code and doc to match that.
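
To make the off-by-one concrete, here is a minimal sketch that replays the loop quoted above on a 200-item sequence. The helper names popular_elements and sequence_with are hypothetical, introduced only for illustration, and a set stands in for populardict; this is not the difflib code itself. With n = 200, an element that occurs 3 times (1.5% of the items) is not flagged, because only 2 earlier occurrences have been recorded when the third is tested. A fourth occurrence is needed to trip the check.

```python
def popular_elements(b):
    """Replay the quoted loop and return the elements it flags as popular.

    Hypothetical helper for illustration; a set replaces populardict.
    """
    n = len(b)
    b2j = {}
    popular = set()
    for i, elt in enumerate(b):
        if elt in b2j:
            indices = b2j[elt]
            # len(indices) counts only occurrences seen *before* this one,
            # because index i has not been appended yet.
            if n >= 200 and len(indices) * 100 > n:
                popular.add(elt)
                del indices[:]
            else:
                indices.append(i)
        else:
            b2j[elt] = [i]
    return popular


def sequence_with(count, n=200):
    """Build n items in which 'x' occurs `count` times and the rest are unique."""
    return ["x"] * count + list(range(n - count))


three = popular_elements(sequence_with(3))  # 'x' is 1.5% of the items
four = popular_elements(sequence_with(4))   # 'x' is 2% of the items
print(three)
print(four)
```

Here three comes out empty and four contains only 'x', which matches the "duplicates after the first" wording rather than "appears more than 1% of the time".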
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue2986>
_______________________________________