[Python-ideas] Re: String comprehension

Steven D'Aprano Mon, 03 May 2021 05:07:15 -0700

On Mon, May 03, 2021 at 09:04:51PM +1000, Chris Angelico wrote:

> > My understanding of the situation is that the list comprehension [ 
> > x*x for x in range(5) ] is a shorthand for list( x*x for x in 
> > range(5) ).
> 
> Sorta-kinda. It's not a shorthand in the sense that you can't simply
> replace one with the other,


Only because the `list` name could be shadowed or rebound to something 
else. Syntactically and functionally, aside from the lazy vs eager 
difference, a comprehension is a comprehension and there is nothing 
generator comprehensions can do that list comprehensions can't.

In Python 2 there were scoping differences between the two, but I 
believe that in Python 3 those have been eliminated.
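A quick way to check the Python 3 scoping claim is the sketch below. The variable names are illustrative only; it just confirms that neither form rebinds a same-named variable in the enclosing scope and that both produce the same items.

```python
# Sketch: in Python 3, neither a list comprehension nor a generator
# expression leaks its loop variable into the enclosing scope.
x = "outer"

squares_comp = [x * x for x in range(5)]      # list comprehension
squares_gen = list(x * x for x in range(5))   # generator expression fed to list()

# The enclosing x is untouched by either form.
print(x)
print(squares_comp == squares_gen)
```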

> but they do have very similar behaviour,
> yes. A genexp is far more flexible than a list comp, 

Aside from the lazy nature of generator comprehensions, what else?


> so the compiled
> bytecode for list(genexp) has to go to a lot of unnecessary work to
> permit that flexibility, whereas the list comp can simplify things
> down.

I don't think so. The bytecode in 3.9 is remarkably similar.


    >>> dis.dis('list(spam for spam in eggs)')
      1           0 LOAD_NAME                0 (list)
                  2 LOAD_CONST               0 (<code object <genexpr> at 0x7fc185ce0870, file "<dis>", line 1>)
                  4 LOAD_CONST               1 ('<genexpr>')
                  6 MAKE_FUNCTION            0
                  8 LOAD_NAME                1 (eggs)
                 10 GET_ITER
                 12 CALL_FUNCTION            1
                 14 CALL_FUNCTION            1
                 16 RETURN_VALUE

    Disassembly of <code object <genexpr> at 0x7fc185ce0870, file "<dis>", line 1>:
      1           0 LOAD_FAST                0 (.0)
            >>    2 FOR_ITER                10 (to 14)
                  4 STORE_FAST               1 (spam)
                  6 LOAD_FAST                1 (spam)
                  8 YIELD_VALUE
                 10 POP_TOP
                 12 JUMP_ABSOLUTE            2
            >>   14 LOAD_CONST               0 (None)
                 16 RETURN_VALUE


The top-level bytecode for the list comp `[spam for spam in eggs]` is 
only two bytecodes shorter, so that doesn't support your comment about 
"a lot of unnecessary work".

Compared with the `list(genexp)` version, the list comp can:

- skip the name lookup for list (LOAD_NAME);

- and the CALL_FUNCTION that ends up calling it;
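The comparison can be reproduced with a small helper such as the sketch below. The helper names are made up for illustration, and the exact opcodes and instruction counts depend on the CPython version (later releases changed how comprehensions are compiled), so it prints what it finds instead of assuming the 3.9 numbers.

```python
import dis

def top_level_ops(source):
    """Return the opcode names in the top-level code object of an expression."""
    code = compile(source, "<example>", "eval")
    return [ins.opname for ins in dis.get_instructions(code)]

def looks_up_list(source):
    """True if the compiled expression refers to the global name 'list'."""
    code = compile(source, "<example>", "eval")
    return "list" in code.co_names

if __name__ == "__main__":
    for src in ("list(spam for spam in eggs)", "[spam for spam in eggs]"):
        ops = top_level_ops(src)
        print(src)
        print("  instructions:", len(ops))
        print("  looks up 'list':", looks_up_list(src))
```

Only the genexp version needs to look up and call `list`; the list comp builds the list directly.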

The disassemblies of the two code objects, "<genexpr>" and 
"<listcomp>", have slightly different implementations but only differ by 
one bytecode overall.

As far as runtime efficiency goes, list comps are a little faster. Iterating 
over a 1000-item sequence is 33% faster for a list comp, but for a 
100000-item sequence that drops to 25% faster. But as soon as you do a 
significant amount of work inside the comprehension, that work is likely 
to dominate the other costs.
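Timings like these can be measured with `timeit`, roughly as in the sketch below. The function name, repeat counts, and sizes are assumptions for illustration, and the percentages you see will vary with the machine and Python version.

```python
import timeit

def time_both(n, repeat=5, number=200):
    """Return (list_comp_seconds, list_of_genexp_seconds) for n items.

    Takes the minimum over several repeats to reduce timing noise.
    """
    data = range(n)
    t_comp = min(timeit.repeat(lambda: [x for x in data],
                               repeat=repeat, number=number))
    t_gen = min(timeit.repeat(lambda: list(x for x in data),
                              repeat=repeat, number=number))
    return t_comp, t_gen

if __name__ == "__main__":
    for n in (1000, 100000):
        # Fewer loops for the larger input to keep the run short.
        t_comp, t_gen = time_both(n, number=200 if n <= 1000 else 5)
        print(f"n={n}: list comp {t_comp:.4f}s, list(genexp) {t_gen:.4f}s")
```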

There's definitely some overhead needed to support starting and stopping 
a generator, but we can argue that is an implementation detail. A 
sufficiently clever interpreter could avoid that overhead.


> That said, I think the only way you'd actually detect a
> behavioural difference is if the name "list" has been rebound.

That and timing.
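The rebinding case is easy to demonstrate. In this sketch (the function name is hypothetical) `list` is deliberately shadowed inside a function, so the two spellings stop being interchangeable:

```python
def build_both(eggs):
    """Build the same items two ways while the name 'list' is shadowed."""
    list = tuple  # deliberately rebind the name inside this function only
    via_call = list(spam for spam in eggs)   # now calls tuple(...)
    via_comp = [spam for spam in eggs]       # syntax, unaffected by the rebinding
    return via_call, via_comp

if __name__ == "__main__":
    via_call, via_comp = build_both([1, 2, 3])
    print(type(via_call).__name__, via_call)
    print(type(via_comp).__name__, via_comp)
```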


-- 
Steve
_______________________________________________
Python-ideas mailing list -- [email protected]
To unsubscribe send an email to [email protected]
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/[email protected]/message/76G6HGPUPTWJHNN62JGKWLRXRIUQGX2C/
Code of Conduct: http://python.org/psf/codeofconduct/
