On 11 June 2017 at 13:35, David Mertz <me...@gnosis.cx> wrote: > You are right. I made a thinko. > > List construction from an iterator is O(N) just as is `sum(1 for _ in it)`. > Both of them need to march through every element. But as a constant > multiplier, just constructing the list should be faster than needing an > addition (Python append is O(1) because of smart dynamic memory > pre-allocation). > > So the "just read the iterator" is about 2-3 times faster than > read-then-accumulate). Of course, it the run-lengths are LARGE, we can > start worrying about the extra memory allocation needed as a tradeoff. Your > sum uses constant memory.
This would be another argument in favour of providing an itertools.counted_len function, as that would be able to avoid all the overheads currently associated with the "sum(1 for __ in iterable)" counting strategy. Without that, the best you can do in pure Python is to use __length_hint__ to choose your preferred counting strategy at runtime based on the specific input. Something like: from operator import length_hint # 10k 64-bit pointers ~= 640k max temporary list _USE_COUNTED_SUM = 10_001 def counted_len(iterable): # For sized containers, just return their length try: return len(iterable) except TypeError: pass # For probably-large inputs & those with no length hint, count them hint = length_hint(iterable, default=_USE_COUNTED_SUM) if hint >= _USE_COUNTED_SUM: return sum(1 for __ in iter(iterable)) # Otherwise, make a temporary list and report its length # as the specifics of the CPython implementation make this # faster than using the generator expression return len(list(iterable)) Cheers, Nick. P.S. itertools._grouper objects don't currently provide a length hint, and it would be tricky to add one, since it would need to be based on the number of remaining items in the original sequence, which would it turn depend on *that* defining either __len__ or __length_hint__. -- Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia _______________________________________________ Python-ideas mailing list Python-ideas@python.org https://mail.python.org/mailman/listinfo/python-ideas Code of Conduct: http://python.org/psf/codeofconduct/