Raymond Hettinger wrote:
From: "Nick Coghlan" <[EMAIL PROTECTED]>
Georg and I tried doing it that way and had major problems trying to get it to work - the hard part is that the body of the list comp (which may include nested list comps, lambda functions and generator expressions) needs to be able to see the iteration variables exactly as they are, but the surrounding code shouldn't be able to see them at all.

Did you try just saving and restoring the variable so that hiding wouldn't be necessary.

[for x in s]
  -->
  save(x) if it exists
  do the listcomp
  restore(x) if it was saved


As for whether or not this is worth fixing, I think getting rid of the current subtle differences between the scoping rules for list comprehensions and generator expressions is a very worthwhile outcome, even at the possible performance cost.

Difference of opinion here.
All 2-to-3 conversions don't automatically work
and when they don't, the reason is subtle, surprising, and weird.

2to3 should be able to detect and warn about the cases which don't work though (specifically, list comprehensions at module or class scope which refer to local variables somewhere other than in the outermost iterable expression). A patch to add such a warning would be very welcome.

There's nothing it can do about code which deletes the no longer leaked local variable, but that failure isn't subtle (and is also fairly easily fixed by changing the code to "var = None; del var").

With genexps, you expect function scoping rules to apply
(we explain genexps in terms of an equivalent generator).
With set/list/dict comps, that is unnecessary, slow, and unexpected

I should probably add in to this discussion what the Py3k expansions of list, set and dict comprehensions actually *are* (they don't implicitly build genexps because that would be stupid and incredibly slow, but they do each build a single implicit function in order to get the scoping rules right):

>>> import dis
>>> def f():
...     return [x for x in range(10)]
...
>>> dis.dis(f)
2 0 LOAD_CONST 1 (<code object <listcomp> at 0xb7c41260, file "<stdin>", line 2>)
              3 MAKE_FUNCTION            0
              6 LOAD_GLOBAL              0 (range)
              9 LOAD_CONST               2 (10)
             12 CALL_FUNCTION            1
             15 GET_ITER
             16 CALL_FUNCTION            1
             19 RETURN_VALUE
>>> dis.dis(f.__code__.co_consts[1])
  2           0 BUILD_LIST               0
              3 DUP_TOP
              4 STORE_FAST               1 (_[1])
              7 LOAD_FAST                0 (.0)
        >>   10 FOR_ITER                13 (to 26)
             13 STORE_FAST               2 (x)
             16 LOAD_FAST                1 (_[1])
             19 LOAD_FAST                2 (x)
             22 LIST_APPEND
             23 JUMP_ABSOLUTE           10
        >>   26 RETURN_VALUE

So [x for x in range(10)] is shorthand for:

  def _f(itr):
      _[1] = []
      for x in itr:
          _[1].append(x)
      return _[1]
  expr_result = _f(range(10))

{x for x in range(10)} is shorthand for:

  def _f(itr):
      _[1] = set()
      for x in itr:
          _[1].add(x)
      return _[1]
  expr_result = _f(range(10))

And {x:x*x for x in range(10)} is shorthand for:

  def _f(itr):
      _[1] = {}
      for x in itr:
          _[1][x] = x*x
      return _[1]
  expr_result = _f(range(10))

While (x for x in range(10)) continues to be shorthand for:

  def _g(itr):
      for x in itr:
        yield x
  expr_result = _g(range(10))

It all looks simple, consistent and pretty straightforward to me. If pdb doesn't understand it, then it may be a good idea for someone to teach pdb what a disassembled comprehension looks like in Py3k.

Those expansions should also make it clear why this change actually results in a speed increase of module and class level list comprehensions above a certain size - the iteration variables become function locals, so they benefit from function local speedups. The exact point where the speedup from that is greater than the cost of a single function call will vary based on the complexity of the body of the list comprehension (most notable the number of times the iteration variables are referred to).

$ ./python -m timeit "exec('[x*x for x in range(100)]')"
10000 loops, best of 3: 125 usec per loop
$ python -m timeit "exec '[x*x for x in range(100)]'"
10000 loops, best of 3: 121 usec per loop
$ ./python -m timeit "exec('[x*x for x in range(1000)]')"
1000 loops, best of 3: 418 usec per loop
$ python -m timeit "exec '[x*x for x in range(1000)]'"
1000 loops, best of 3: 723 usec per loop

(exec is used in the above examples because the timeit module otherwise runs all supplied code in a function scope, so the speedup in Py3k list comprehensions isn't visible. The version of Python in the local directory is a fairly up to date build from the Py3k branch, the system Python is 2.5.1)

> (we should be able to explain those in terms of an equivalent
> in-lined for-loop with the loop variable restricted to the scope of
> the loop).

That only sounds reasonable until you try to get the following two pieces of code to produce the same answer (example lifted from test_listcomps.py):

def test_func_listcomp():
    items = [(lambda: i) for i in range(5)]
    i = 20
    return [x() for x in items]

def test_func_genexp():
    items = list((lambda: i) for i in range(5))
    i = 20
    return [x() for x in items]

In current Py3k, they do produce the same answer ([4]*5) and in current 2.x they produce different, but sensible answers ([20]*5 for the list comprehension, [4]*5 for the generator expression).

With the "just hide the iteration variable" approach it is highly unclear what the list comprehension version should produce.

Cheers,
Nick.

--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---------------------------------------------------------------
            http://www.boredomandlaziness.org
_______________________________________________
Python-3000 mailing list
Python-3000@python.org
http://mail.python.org/mailman/listinfo/python-3000
Unsubscribe: 
http://mail.python.org/mailman/options/python-3000/archive%40mail-archive.com

Reply via email to