On 3/7/2013 3:41 PM, Wolfgang Maier wrote:
Iterators do not generally have __len__ methods.
len(iter(range(10)))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: object of type 'range_iterator' has no len()
But iterators have a length hint method that are used for some
optimizations and preallocations, too.
i = iter(range(10))
i.__length_hint__()
10
See http://www.python.org/dev/peps/pep-0424/
very interesting (hadn't heard of it)! Just checked the PEP,
then tested list()'s behavior, and it is just as described:
class stupid(list):
def __len__(self):
print ('len() called')
return NotImplemented
def __length_hint__(self):
print ('hint requested')
l=iter(self).__length_hint__()
print (l)
return l
a=stupid((1,2,3))
len(d)
======>
len() called
Traceback (most recent call last):
File "<pyshell#79>", line 1, in <module>
len(d)
TypeError: an integer is required
list(d)
======>
len() called
hint requested
3
[1, 2, 3]
so list() first tries to call the iterable's __len__ method. If that raises a
TypeError it falls back to __length_hint__ .
What I still don't know is how the listiterator object's __length_hint__ works.
Why, in this case, does it know that it has a length of 3 ? The PEP does not
provide any hint how a reasonable hint could be calculated.
'How' depends on the iterator, but when it has an underlying concrete
iterable of known length, it should be rather trivial as .__next__ will
explicitly or implicitly use the count remaining in its operation. Part
of the justification of adding __length_hint__ is that in many cases it
is so easy. Iterators based on an iterator with a length_hint can just
pass it along.
The list iterator might work with a C pointer to the next item and a
countdown count initialized to the list length. The __next__ method
might be something like this mixed C and Python pseudocode:
if (countdown--) return *(current--);
else raise StopIteration
(For non-C coders, postfix -- decrements value *after* retrieving it.)
Then __length_hint__ would just return countdown. Tuples would work the
same way. Sets and dicts would need current-- elaborated to skip over
empty hash table slots. Range iterators would add the step to current,
instead of 1.
--
Terry Jan Reedy
--
http://mail.python.org/mailman/listinfo/python-list