Raymond Hettinger added the comment:
FYI, here is the disassembly of the inner code objects. It shows where the
difference in speed arises:
>>> from dis import dis
>>> def f():
lc = [i for i in [1, 2]]
ge = (i for i in [1, 2])
return lc,
Raymond Hettinger added the comment:
[Antony Lee]
> so both cases are much more similar -- superficially, at least.
Try disassembling the inner code object as well -- that is where the work gets
done and the list comprehension can take advantage of the LIST_APPEND
Stefan Behnel added the comment:
> as `range` may have been shadowed at that point
No, range() has in fact already been executed at that point. And it returned an
iterable that knows its length (search for "LengthHint" in the CPython sources).
--
Antony Lee added the comment:
> But with a list or other sequence with a known length, the interpreter can
> allocate the right number of items up front, and avoid growing or shrinking
> the new list. I believe that this is the time saving you are seeing.
But certainly
Steven D'Aprano added the comment:
I think the difficulty here is that your perspective is backwards. It isn't
that sorting a generator is *slower*, it is that sorting a list is *faster*,
because there is more information available with a list and so the
Stefan Behnel added the comment:
The constant function call overhead doesn't make a big difference:
$ /opt/python3.7-opt/bin/python3 -m timeit 'list(i for i in range(1000))'
5000 loops, best of 5: 55 usec per loop
$ /opt/python3.7-opt/bin/python3 -m timeit '[i for i in
Antony Lee added the comment:
Feel free to close the issue if that's not the forum for this discussion, but
I'm still baffled by what's happening.
In the example that you give, the first case needs to look up the `list` global
and that callable (which happens to be the
Stefan Behnel added the comment:
sorted() *does* convert its input to a list first, and only then sorts it. It
calls PySequence_List() for that, which in turn uses list_extend(), which then
applies the obvious optimisation of copying input lists (and tuples) directly.
New submission from Antony Lee :
Consider e.g.
In [2]: %timeit sorted([i for i in range(100)])
4.74 µs ± 24.3 ns per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [3]: %timeit sorted(i for i in range(100))
7.05 µs ± 25.7 ns per loop (mean ± std.