[Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Hi,

   issue991196 was closed, described as intentional.  I've added
a comment in that issue which argues that this is a serious bug (also
asserted by a previous commenter - Armin Rigo), because it creates a
unique, undocumented, oddly behaving scope that doesn't apply closures
correctly. At the very least I think this should be acknowledged as a
plain old bug (rather than a feature), followed by a discussion about
whether it will be fixed or not.  Appreciate your thoughts - cheers,

Colin
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Thanks for the details on why the observed behaviour occurs - very
clear. My only query would be why this is considered correct? Why is
it running as a class namespace, when it is not a class? Is there any
reason why this is not considered a mistake? Slightly concerned that
this is being considered not a bug because 'it is how it is'.

A really good reason why you would want to provide a separate locals
dictionary is to get access to the stuff that was defined in the
exec()'d code block.  Unfortunately this use case is broken by the
current behaviour.  The only way to get the definitions from the
exec()'d code block is to supply a single dictionary, and then try to
weed out the definitions from amongst all the other globals, which is
very difficult if you don't know in advance what was in the code block
you exec()'d.

So put simply - the bug is that a class namespace is used, but it's not a class.
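
A minimal sketch of the behaviour being described, assuming Python 3's exec() function. The names source, shared_ns, global_ns and local_ns are illustrative only and are not taken from the tracker issue.

```python
# Illustrative snippet: a top-level name that is used from inside a function.
source = """
y = 3
def execfunc():
    return y
result = execfunc()
"""

# One dictionary: behaves like module-level code, so execfunc() finds y.
shared_ns = {}
exec(source, shared_ns)
one_dict_result = shared_ns["result"]

# Two dictionaries: y is stored in local_ns, but execfunc() only searches
# global_ns and the builtins, so the lookup fails.
global_ns, local_ns = {}, {}
try:
    exec(source, global_ns, local_ns)
    two_dict_error = None
except NameError as exc:
    two_dict_error = exc
```

With a single dictionary the call succeeds; with two dictionaries the assignment lands in local_ns and the call raises NameError.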

On 26/05/2010 13:51, Nick Coghlan wrote:
> On 26/05/10 19:48, Mark Dickinson wrote:
>> This is a long way from my area of expertise (I'm commenting here
>> because it was me who sent Colin here in the first place), and it's
>> not clear to me whether this is a bug, and if it is a bug, how it
>> could be resolved. What would the impact be of having the compiler
>> produce 'LOAD_NAME' rather than 'LOAD_GLOBAL' here?
>
> exec with a single argument = module namespace
> exec with two arguments = class namespace
>
> Class namespaces are deliberately exempted from lexical scoping so
> that methods can't see class attributes, hence the example in the
> tracker issue works exactly as it would if the code was written as a
> class body.
>
> class C:
>     y = 3
>     def execfunc():
>         print y
>     execfunc()
>
> With this code, y would end up in C.__dict__ rather than the module
> globals (at least, it would if it wasn't for the exception) and the
> call to execfunc fails with a NameError when attempting to find y.
>
> I know I've closed other bug reports that were based on the same
> misunderstanding, and I didn't understand it myself until Guido
> explained it to me a few years back, so suggestions for improving the
> exec documentation in this area would be appreciated.
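
For comparison, a small sketch of the class-body analogy in Python 3 syntax (return instead of print, so the result can be inspected). It assumes no module-level name y exists; class_body_error is an illustrative name.

```python
try:
    class C:
        y = 3                 # stored in the class namespace while the body runs

        def execfunc():
            return y          # compiled as a global lookup, not a class lookup

        execfunc()            # called while the class body is still executing
    class_body_error = None
except NameError as exc:
    class_body_error = exc
```

The lookup of y inside execfunc() skips the class namespace, so the call fails in the same way as the two-dictionary exec.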


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
The changes to the docs will definitely help in understanding why this
behaves as it does. I would like to take one last stab though at
justifying why this behaviour isn't correct - will leave it alone if
these arguments don't stack up :)  Appreciate the input and
discussion.

Terry Jan Reedy wrote

> You are expecting that it run as a function namespace (with post 2.2
> nesting), when it is not a function. Why is that any better?

Because a class namespace (as I see it) was implemented to treat a
specific situation - i.e. that functions in classes cannot see class
variables. exec() is a far more generic instrument that has no such
explicit requirement - i.e. it feels like hijacking an edge case to
meet a requirement that doesn't exist. However 'all locals in an
enclosing scope are made available in the function namespace' is
generally understood as python's generic closure implementation, and
would match more effectively the generic nature of the exec()
statement.  A litmus test for this sort of thing - if you polled 100
knowledgeable python devs who hadn't encountered this problem or this
thread and asked if they would expect exec() to run as a class or
function namespace, I think you'd struggle to get 1 of them to expect
a class namespace. Functions are the more generic construct, and thus
more appropriate for the generic nature of exec() (IMHO).

It would appear that the only actual requirement not to make locals in
an enclosing scope available in a nested function scope is for a
class. The situation we are discussing seems to have created a similar
requirement for exec(), but with no reason.

> In original Python, the snippet would have given an error whether you
> thought of it as being in a class or function context, which is how
> anyone who knew Python then would have expected. Consistency is not a bug.

> When nested function namespaces were introduced, the behavior of exec
> was left unchanged. Backward compatibility is not a bug.

Generally, most other behaviour did change - locals in enclosing
scopes *did* become available in the nested function namespace, which
was not backward compatible.  Why is a special case made to retain
consistency and backward compatibility for code run using exec()? It's
all python code. Inconsistent backward compatibility might be
considered a bug.

Cheers,

Colin


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Mark Dickinson wrote:

> Seems to me the whole idea of being able to specify
> separate global and local scopes for top-level code is
> screwy in the first place. Are there any use cases for
> it? Maybe the second scope argument to exec() should
> be deprecated?

It is the fact that it runs as a class namespace that supports the
argument that there's no use case. I agree - I can't think of any decent
use cases for exec() as a class namespace - defining functions and
classes only works for a subset of function and class definitions.

However, if exec() ran as function namespace instead, then the locals
dictionary would contain all the definitions from the exec()'d code
block, and only those definitions. Very useful. This is a major use
case for exec() - defining code from strings (e.g. enabling you to
store python code in the database), and using it at runtime. It seems
to me this must have been the point of locals in the first place.

If you just use globals, then your definitions exist amongst a whole
bunch of other python stuff, and unless you know in advance what was
defined in your code block, it's very difficult to extract them.
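
A rough sketch of the use case, assuming Python 3. The user_code string and the variable names are made up for illustration. It works here only because add() does not refer to any other name defined by the code block.

```python
# Hypothetical code block, e.g. loaded from a database.
user_code = """
RATE = 0.2
def add(a, b):
    return a + b
"""

global_ns = {}
definitions = {}
exec(user_code, global_ns, definitions)

defined_names = sorted(definitions)      # ['RATE', 'add']
leaked_into_globals = sorted(global_ns)  # ['__builtins__']
```

The separate locals dictionary ends up holding exactly the names the code block defined, while the globals dictionary only gains the __builtins__ entry that exec() adds.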


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Hi Guido,

   Thanks for the possible workaround - unfortunately 'stuff' will
contain a whole stack of things that are not in 'context', and were
not defined in 'user_code' - things that python embeds - a (very
small) selection -

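{..., 'NameError': <class 'NameError'>, 'BytesWarning':
<class 'BytesWarning'>, 'dict': <class 'dict'>, 'input':
<built-in function input>, 'oct': <built-in function oct>,
'bin': <built-in function bin>, ...}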

It makes sense why this happens of course, but upon return, the
globals dict is very large, and finding the stuff you defined in your
user_code amongst it is a very difficult task.  Avoiding this problem
is the 'locals' use-case for me.  Cheers,

Colin

On Thu, May 27, 2010 at 1:38 AM, Guido van Rossum  wrote:
> This is not easy to fix. The best short-term work-around is probably a
> hack like this:
>
> def define_stuff(user_code):
>  context = {...}
>  stuff = {}
>  stuff.update(context)
>  exec(user_code, stuff)
>  for key in context:
>    if key in stuff and stuff[key] == context[key]:
>      del stuff[key]
>  return stuff
>
> --
> --Guido van Rossum (python.org/~guido)
>


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Of course :) - I need to pay more attention. Your workaround should do
the trick. It would make sense if locals could be used for this
purpose, but the workaround doesn't add so much overhead in most
situations.  Thanks for the help, much appreciated,

Colin

On Thu, May 27, 2010 at 2:05 AM, Guido van Rossum  wrote:
> On Wed, May 26, 2010 at 5:53 PM, Colin H  wrote:
>>   Thanks for the possible workaround - unfortunately 'stuff' will
>> contain a whole stack of things that are not in 'context', and were
>> not defined in 'user_code' - things that python embeds - a (very
>> small) selection -
>>
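>> {..., 'NameError': <class 'NameError'>, 'BytesWarning':
>> <class 'BytesWarning'>, 'dict': <class 'dict'>, 'input':
>> <built-in function input>, 'oct': <built-in function oct>,
>> 'bin': <built-in function bin>, ...}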
>>
>> It makes sense why this happens of course, but upon return, the
>> globals dict is very large, and finding the stuff you defined in your
>> user_code amongst it is a very difficult task.  Avoiding this problem
>> is the 'locals' use-case for me.  Cheers,
>
> No, if taken literally that doesn't make sense. Those are builtins. I
> think you are mistaken that each of those (e.g. NameError) is in stuff
> -- they are in stuff['__builtins__'] which represents the built-in
> namespace. You should remove that key from stuff as well.
>
> --Guido
>
>> Colin
>>
>> On Thu, May 27, 2010 at 1:38 AM, Guido van Rossum  wrote:
>>> This is not easy to fix. The best short-term work-around is probably a
>>> hack like this:
>>>
>>> def define_stuff(user_code):
>>>  context = {...}
>>>  stuff = {}
>>>  stuff.update(context)
>>>  exec(user_code, stuff)
>>>  for key in context:
>>>    if key in stuff and stuff[key] == context[key]:
>>>      del stuff[key]
>>>  return stuff
>>>
>>> --
>>> --Guido van Rossum (python.org/~guido)
>>>
>>
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>
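
A short sketch of the point about builtins, assuming Python 3 and CPython's behaviour of adding a '__builtins__' entry to the globals dictionary passed to exec(). The greet function is purely illustrative.

```python
stuff = {}
exec("def greet():\n    return 'hello'\n", stuff)

# Only two top-level keys: the definition and the builtins reference.
top_level_keys = sorted(stuff)           # ['__builtins__', 'greet']
has_nameerror_key = "NameError" in stuff

# The builtins live one level down; depending on context the entry may be
# the builtins module or its dictionary, so handle both.
namespace = stuff["__builtins__"]
builtin_names = namespace if isinstance(namespace, dict) else vars(namespace)
nameerror_is_builtin = "NameError" in builtin_names
```

Removing the single '__builtins__' key is therefore enough to get rid of all the built-in names.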


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
I needed to make a small modification to the workaround - I wasn't
able to delete from 'stuff', because the definitions in the exec()'d
code then won't run - they rely on those entries being present at
runtime. In practice the overhead of copying is quite noticeable if you
run your code like this a lot and build up a decent sized context
(which I do). It will obviously depend on the usage scenario though.

def define_stuff(user_code):
  context = {...}
  stuff = {}
  stuff.update(context)

  exec(user_code, stuff)

  return_stuff = {}
  return_stuff.update(stuff)

  del return_stuff['__builtins__']
  for key in context:
    if key in return_stuff and return_stuff[key] == context[key]:
      del return_stuff[key]

  return return_stuff
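
To illustrate why deleting from 'stuff' itself breaks things, here is a small sketch assuming Python 3. The context contents and the scaled function are hypothetical.

```python
context = {"scale": 10}            # hypothetical pre-supplied context
stuff = dict(context)
exec("def scaled(n):\n    return n * scale\n", stuff)

scaled = stuff["scaled"]
before_delete = scaled(2)          # scale is found in the function's globals

del stuff["scale"]                 # stuff *is* the globals dict of scaled()
try:
    scaled(2)
    after_delete_error = None
except NameError as exc:
    after_delete_error = exc
```

Functions defined by the exec()'d code keep a reference to 'stuff' as their globals, which is why the pruning has to be done on a copy.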



Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
Yep fair call - was primarily modifying Guido's example to make the
point about not being able to delete from the globals returned from
exec - cheers,

Colin

On Thu, May 27, 2010 at 2:09 PM, Scott Dial wrote:
> On 5/27/2010 7:14 AM, Colin H wrote:
>> def define_stuff(user_code):
>>   context = {...}
>>   stuff = {}
>>   stuff.update(context)
>>
>>   exec(user_code, stuff)
>>
>>   return_stuff = {}
>>   return_stuff.update(stuff)
>>
>>   del return_stuff['__builtins__']
>>   for key in context:
>>     if key in return_stuff and return_stuff[key] == context[key]:
>>       del return_stuff[key]
>>
>>   return return_stuff
>
> I'm not sure of your application, but I suspect you would benefit from
> using an identity check instead of an __eq__ check. The equality check
> may be expensive (e.g., a large dictionary), and I don't think it
> actually is checking what you want -- if the user_code generates an
> __eq__-similar dictionary, wouldn't you still want that? The only reason
> I can see to use __eq__ is if you are trying to detect user_code
> modifying an object passed in, which is something that wouldn't be
> addressed by your original complaint about exec (as in, modifying a
> global data structure).
>
> Instead of:
>>     if key in return_stuff and return_stuff[key] == context[key]:
>
> Use:
>>     if key in return_stuff and return_stuff[key] is context[key]:
>
> --
> Scott Dial
> sc...@scottdial.com
> scod...@cs.indiana.edu
>
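
Putting the pieces together, a possible runnable version of the workaround with the identity check, assuming Python 3. Passing context as a parameter is an assumption made for this sketch (the original leaves its contents elided), and the sample context is invented.

```python
def define_stuff(user_code, context):
    """Run user_code with context available; return only what it defined."""
    stuff = dict(context)          # namespace the code runs in; left intact
    exec(user_code, stuff)

    # Shallow copy, so pruning does not disturb the globals of functions
    # defined by user_code.
    return_stuff = dict(stuff)
    return_stuff.pop("__builtins__", None)
    for key, value in context.items():
        # Identity check: drop an entry only if it is still the very same
        # object that was passed in.
        if key in return_stuff and return_stuff[key] is value:
            del return_stuff[key]
    return return_stuff


context = {"limits": {"max": 5}, "scale": 10}   # hypothetical context
result = define_stuff(
    "limits = {'max': 5}\n"
    "def scaled(n):\n"
    "    return n * scale\n",
    context,
)
```

Here 'limits' is kept because the user code rebound it to a new, merely equal dictionary, 'scale' is dropped because it is untouched, and scaled() still works because 'stuff' was not modified.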


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
Just to put a couple of alternatives on the table that don't break
existing code - not necessarily promoting them, or suggesting they
would be easy to do -

1. modify exec() to take an optional third argument - 'scope_type' -
if it is not supplied (but locals is), then it runs as class namespace
- i.e. identical to existing behaviour. If it is supplied then it will
run as whichever is specified, with function namespace being an
option.  The API already operates along these lines, with the second
argument being optional and implying module namespace if it is not
present.

2. a new API exec2() which uses function namespace, and deprecating
the old exec() - assuming there is agreement that function namespace
makes more sense than the class namespace, because there are real use
cases, and developers would generally expect this behaviour when
approaching the API for the first time.
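
Neither option exists today. As a rough approximation of "function namespace" semantics, the code block can be wrapped in a function body before it is executed. The helper name exec_in_function_scope is hypothetical, and the sketch assumes Python 3 and source that survives being indented (no top-level return, no multi-line string literals that depend on column position).

```python
import textwrap


def exec_in_function_scope(source, global_ns=None):
    """Hypothetical helper: run source as a function body, return its locals."""
    scratch = dict(global_ns or {})
    wrapped = (
        "def _exec_body():\n"
        + textwrap.indent(source, "    ")
        + "\n    return locals()\n"
    )
    exec(wrapped, scratch)          # defines _exec_body in scratch
    return scratch["_exec_body"]()  # runs the body with real closures


definitions = exec_in_function_scope(
    "x = 1\n"
    "def g():\n"
    "    return x\n"
    "result = g()\n"
)
```

Because g() is now compiled together with the assignment to x, it closes over x, and the returned dictionary contains only the names defined by the code block.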


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
This option sounds very promising - seems right to do it at the
compile stage - i.e. compile(code_str, name, "closure") as you have
suggested.  If there were any argument against, it would be that the
most obvious behaviour (function namespace) is the hardest to induce,
but the value in knowing you're not breaking anything is pretty high.

Cheers,
Colin

On Thu, May 27, 2010 at 4:42 PM, Nick Coghlan  wrote:
> On 27/05/10 10:38, Guido van Rossum wrote:
>>
>> On Wed, May 26, 2010 at 5:12 PM, Nick Coghlan  wrote:
>>>
>>> Lexical scoping only works for code that is compiled as part of a single
>>> operation - the separation between the compilation of the individual
>>> string
>>> and the code defining that string means that the symbol table analysis
>>> needed for lexical scoping can't cross the boundary.
>>
>> Hi Nick,
>>
>> I don't think Colin was asking for such things.
>
> Yes, I realised some time after sending that message that I'd gone off on a
> tangent unrelated to the original question (as a result of earlier parts of
> the discussion I'd been pondering the scoping differences between exec with
> two namespaces and a class definition and ended up writing about that
> instead of the topic Colin originally brought up).
>
> I suspect Thomas is right that the current two namespace exec behaviour is
> mostly a legacy of the standard scoping before nested scopes were added.
>
> To state the problem as succinctly as I can, the basic issue is that a code
> object which includes a function definition that refers to top level
> variables will execute correctly when the same namespace is used for both
> locals and globals (i.e. like module level code) but will fail when these
> namespaces are different (i.e. like code in class definition).
>
> So long as the code being executed doesn't define any functions that refer
> to top level variables in the executed code the two argument form is
> currently perfectly usable, so deprecating it would be an overreaction.
>
> However, attaining the (sensible) behaviour Colin is requesting when such
> top level variable references exist would actually be somewhat tricky.
> Considering Guido's suggestion to treat two argument exec like a function
> rather than a class and generate a closure with full lexical scoping a
> little further, I don't believe this could be done in exec itself without
> breaking code that expects the current behaviour. However, something along
> these lines could probably be managed as a new compilation mode for
> compile() (e.g. compile(code_str, name, "closure")), which would then allow
> these code objects to be passed to exec to get the desired behaviour.
>
> Compare and contrast:
>
> >>> def f():
> ...   x = 1
> ...   def g():
> ...     print x
> ...   g()
> ...
> >>> exec f.func_code in globals(), {}
> 1
>
> >>> source = """\
> ... x = 1
> ... def g():
> ...   print x
> ... g()
> ... """
> >>> exec source in globals(), {}
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "<string>", line 4, in <module>
>  File "<string>", line 3, in g
> NameError: global name 'x' is not defined
>
> Breaking out dis.dis on these examples is fairly enlightening, as they
> generate *very* different bytecode for the definition of g().
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
> ---
>
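
The difference in how g() is compiled can also be inspected without reading full disassembly. A sketch assuming Python 3 and CPython code objects; the helper opnames and the variable names are illustrative.

```python
import dis


def f():
    x = 1

    def g():
        return x

    return g


source = "x = 1\ndef g():\n    return x\n"
module_code = compile(source, "<string>", "exec")

# g compiled as part of f: x is recorded as a free (closure) variable.
nested_g = f().__code__
# g compiled from the standalone string: x is recorded as a global name.
string_g = next(c for c in module_code.co_consts if hasattr(c, "co_code"))


def opnames(code):
    """Return the set of opcode names used by a code object."""
    return {instruction.opname for instruction in dis.get_instructions(code)}


nested_ops = opnames(nested_g)
string_ops = opnames(string_g)
```

nested_g.co_freevars contains 'x', whereas string_g has no free variables and lists 'x' in co_names, which is why the second form uses a global lookup at run time.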


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
By hardest to induce I mean the default compile exec(code_str, {}, {})
would still be class namespace, but it's pretty insignificant.



Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-29 Thread Colin H
Perhaps the next step is to re-open the issue? If it is seen as a bug,
it would be great to see a fix in 2.6+ - a number of options which
will not break backward compatibility have been put forward - cheers,

Colin

On Thu, May 27, 2010 at 9:05 PM, Reid Kleckner  wrote:
> On Thu, May 27, 2010 at 11:42 AM, Nick Coghlan  wrote:
>> However, attaining the (sensible) behaviour Colin is requesting when such
>> top level variable references exist would actually be somewhat tricky.
>> Considering Guido's suggestion to treat two argument exec like a function
>> rather than a class and generate a closure with full lexical scoping a
>> little further, I don't believe this could be done in exec itself without
>> breaking code that expects the current behaviour.
>
> Just to give a concrete example, here is code that would break if exec
> were to execute code in a function scope instead of a class scope:
>
> exec """
> def len(xs):
>    return -1
> def foo():
>    return len([])
> print foo()
> """ in globals(), {}
>
> Currently, the call to 'len' inside 'foo' skips the outer scope
> (because it's a class scope) and goes straight to globals and
> builtins.  If it were switched to a local scope, a cell would be
> created for the broken definition of 'len', and the call would resolve
> to it.
>
> Honestly, to me, the fact that the above code ever worked (ie prints
> "0", not "-1") seems like a bug, so I wouldn't worry about backwards
> compatibility.
>
> Reid
>
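
The same example translated to Python 3 as a sketch, storing the result rather than printing it. Fresh dictionaries are used instead of globals() to keep it self-contained, and the variable names are illustrative.

```python
source = """
def len(xs):
    return -1
def foo():
    return len([])
result = foo()
"""

# Separate dictionaries (current behaviour): foo() skips local_ns, so the
# call resolves to the builtin len.
local_ns = {}
exec(source, {}, local_ns)
two_dict_result = local_ns["result"]     # 0

# Single dictionary (module-like): foo() sees the redefined len.
shared_ns = {}
exec(source, shared_ns)
one_dict_result = shared_ns["result"]    # -1
```

Code relying on the first result is what a switch to function-scope semantics in exec() itself would change.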