Re: [Python-Dev] variable name resolution in exec is incorrect
Perhaps the next step is to re-open the issue? If it is seen as a bug, it would be great to see a fix in 2.6+. A number of options that will not break backward compatibility have been put forward.

Cheers,
Colin

On Thu, May 27, 2010 at 9:05 PM, Reid Kleckner r...@mit.edu wrote:
> On Thu, May 27, 2010 at 11:42 AM, Nick Coghlan ncogh...@gmail.com wrote:
>> However, attaining the (sensible) behaviour Colin is requesting when such top level variable references exist would actually be somewhat tricky. Considering Guido's suggestion to treat two argument exec like a function rather than a class and generate a closure with full lexical scoping a little further, I don't believe this could be done in exec itself without breaking code that expects the current behaviour.
>
> Just to give a concrete example, here is code that would break if exec were to execute code in a function scope instead of a class scope:
>
>     exec """
>     def len(xs):
>         return -1
>     def foo():
>         return len([])
>     print foo()
>     """ in globals(), {}
>
> Currently, the call to 'len' inside 'foo' skips the outer scope (because it's a class scope) and goes straight to globals and builtins. If it were switched to a local scope, a cell would be created for the broken definition of 'len', and the call would resolve to it.
>
> Honestly, to me, the fact that the above code ever worked (i.e. prints 0, not -1) seems like a bug, so I wouldn't worry about backwards compatibility.
>
> Reid

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
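For readers who want to try Reid's example on a current interpreter, here is a minimal sketch. It assumes Python 3, where exec is a function rather than a statement, and it stores the value in a variable called `result` instead of printing it. The helper name `run` is made up for this illustration.

```python
SOURCE = """
def len(xs):
    return -1

def foo():
    return len([])

result = foo()
"""


def run(separate_locals):
    """Execute SOURCE and return the value it stored in `result`."""
    globals_dict = {}
    # With separate_locals the code runs like a class body; otherwise the
    # same dict serves as globals and locals, like module-level code.
    locals_dict = {} if separate_locals else globals_dict
    exec(SOURCE, globals_dict, locals_dict)
    return locals_dict["result"]


# Two dicts: the shadowing len lands in the locals dict, which foo() never
# consults, so foo() falls through to the builtin len.
two_dict_result = run(separate_locals=True)

# One dict: the shadowing len lands in the globals, so foo() finds it.
one_dict_result = run(separate_locals=False)

print(two_dict_result, one_dict_result)
```

The two calls differ only in whether the locals dict is the same object as the globals dict, which is the distinction the thread is about.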
Re: [Python-Dev] variable name resolution in exec is incorrect
I needed to make a small modification to the workaround. I wasn't able to delete from 'stuff', because the definitions in the exec()'d code won't run afterwards: they rely on those entries being present at runtime. In practice the overhead of doing this is quite noticeable if you run your code like this a lot and build up a decent-sized context (which I do). It will obviously depend on the usage scenario though.

    def define_stuff(user_code):
        context = {...}
        stuff = {}
        stuff.update(context)
        exec(user_code, stuff)
        return_stuff = {}
        return_stuff.update(stuff)
        del return_stuff['__builtins__']
        for key in context:
            if key in return_stuff and return_stuff[key] == context[key]:
                del return_stuff[key]
        return return_stuff

On Thu, May 27, 2010 at 2:13 AM, Colin H hawk...@gmail.com wrote:
> Of course :) - I need to pay more attention. Your workaround should do the trick. It would make sense if locals could be used for this purpose, but the workaround doesn't add so much overhead in most situations.
>
> Thanks for the help, much appreciated,
> Colin
>
> On Thu, May 27, 2010 at 2:05 AM, Guido van Rossum gu...@python.org wrote:
>> On Wed, May 26, 2010 at 5:53 PM, Colin H hawk...@gmail.com wrote:
>>> Thanks for the possible workaround - unfortunately 'stuff' will contain a whole stack of things that are not in 'context', and were not defined in 'user_code' - things that python embeds - a (very small) selection -
>>>
>>>     {..., 'NameError': <type 'exceptions.NameError'>, 'BytesWarning': <type 'exceptions.BytesWarning'>, 'dict': <type 'dict'>, 'input': <function input at 0x10047a9b0>, 'oct': <built-in function oct>, 'bin': <built-in function bin>, ...}
>>>
>>> It makes sense why this happens of course, but upon return, the globals dict is very large, and finding the stuff you defined in your user_code amongst it is a very difficult task. Avoiding this problem is the 'locals' use-case for me. Cheers,
>>
>> No, if taken literally that doesn't make sense. Those are builtins. I think you are mistaken that each of those (e.g. NameError) is in stuff -- they are in stuff['__builtins__'] which represents the built-in namespace. You should remove that key from stuff as well.
>>
>> --Guido
>>
>>> Colin
>>>
>>> On Thu, May 27, 2010 at 1:38 AM, Guido van Rossum gu...@python.org wrote:
>>>> This is not easy to fix. The best short-term work-around is probably a hack like this:
>>>>
>>>>     def define_stuff(user_code):
>>>>         context = {...}
>>>>         stuff = {}
>>>>         stuff.update(context)
>>>>         exec(user_code, stuff)
>>>>         for key in context:
>>>>             if key in stuff and stuff[key] == context[key]:
>>>>                 del stuff[key]
>>>>         return stuff
>>>>
>>>> --
>>>> --Guido van Rossum (python.org/~guido)
>>
>> --
>> --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] variable name resolution in exec is incorrect
Yep, fair call. I was primarily modifying Guido's example to make the point about not being able to delete from the globals returned from exec.

Cheers,
Colin

On Thu, May 27, 2010 at 2:09 PM, Scott Dial scott+python-...@scottdial.com wrote:
> On 5/27/2010 7:14 AM, Colin H wrote:
>>     def define_stuff(user_code):
>>         context = {...}
>>         stuff = {}
>>         stuff.update(context)
>>         exec(user_code, stuff)
>>         return_stuff = {}
>>         return_stuff.update(stuff)
>>         del return_stuff['__builtins__']
>>         for key in context:
>>             if key in return_stuff and return_stuff[key] == context[key]:
>>                 del return_stuff[key]
>>         return return_stuff
>
> I'm not sure of your application, but I suspect you would benefit from using an identity check instead of an __eq__ check. The equality check may be expensive (e.g., a large dictionary), and I don't think it actually is checking what you want -- if the user_code generates an __eq__-similar dictionary, wouldn't you still want that? The only reason I can see to use __eq__ is if you are trying to detect user_code modifying an object passed in, which is something that wouldn't be addressed by your original complaint about exec (as in, modifying a global data structure).
>
> Instead of:
>
>     if key in return_stuff and return_stuff[key] == context[key]:
>
> Use:
>
>     if key in return_stuff and return_stuff[key] is context[key]:
>
> --
> Scott Dial
> sc...@scottdial.com
> scod...@cs.indiana.edu
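Putting the workaround and the identity-check suggestion together, a runnable version might look like the sketch below. It assumes Python 3. The original `context = {...}` is elided in the thread, so here `context` is a parameter, and the sample `context` and `user_code` values are invented purely for illustration.

```python
def define_stuff(user_code, context):
    """Run user_code with a copy of context as its globals.

    Returns only the names that user_code defined or rebound.
    """
    stuff = dict(context)  # copy, so the caller's context is left untouched
    exec(user_code, stuff)

    # Build a separate result dict. Deleting from `stuff` itself would break
    # functions defined by user_code, because `stuff` is still their globals.
    result = {key: value for key, value in stuff.items()
              if key != "__builtins__"}

    # Identity check: drop an entry only if it is the very same object that
    # was passed in, so an equal but newly created value is kept.
    for key, value in context.items():
        if key in result and result[key] is value:
            del result[key]
    return result


# Hypothetical inputs, for illustration only.
context = {"limit": 10, "table": {"a": 1}}
user_code = """
table = {"a": 1}      # equal to the context value, but a new object
def over(n):
    return n > limit  # resolved through the globals dict at call time
"""

defined = define_stuff(user_code, context)
print(sorted(defined))
```

With `==` in place of `is`, the rebound `table` would be discarded because it compares equal to the original; with `is` it is kept, which matches the point about an `__eq__`-similar dictionary.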
Re: [Python-Dev] variable name resolution in exec is incorrect
Just to put a couple of alternatives on the table that don't break existing code (not necessarily promoting them, or suggesting they would be easy to do):

1. Modify exec() to take an optional third argument, 'scope_type'. If it is not supplied (but locals is), then the code runs as class namespace, i.e. identical to existing behaviour. If it is supplied, then the code runs as whichever is specified, with function namespace being an option. The API already operates along these lines, with the second argument being optional and implying module namespace if it is not present.

2. Add a new API, exec2(), which uses function namespace, and deprecate the old exec(). This assumes there is agreement that function namespace makes more sense than class namespace, because there are real use cases, and developers would generally expect this behaviour when approaching the API for the first time.
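Neither 'scope_type' nor exec2() exists, but the effect of a function namespace can be approximated today by wrapping the source in a throwaway function. The sketch below assumes Python 3. The helper `exec_function_scope` and the wrapper name are hypothetical, and this is not how either proposal would have been implemented.

```python
import textwrap


def exec_function_scope(source, globals_dict):
    """Run source as the body of a throwaway function and return its locals.

    Known limits of this sketch: re-indenting changes the contents of
    multi-line string literals in `source`, and a top-level `return` in
    `source` becomes legal.
    """
    wrapped = (
        "def __exec_wrapper__():\n"
        + textwrap.indent(source, "    ")
        + "\n    return locals()\n"
    )
    namespace = {}
    exec(wrapped, globals_dict, namespace)  # only defines the wrapper
    return namespace["__exec_wrapper__"]()  # runs the code, returns its locals


source = """
x = 1
def g():
    return x
"""

defined = exec_function_scope(source, {})
print(sorted(defined), defined["g"]())
```

Because `g` is compiled inside a function, the compiler sees `x` as a variable of the enclosing scope and builds a closure, so the call succeeds and the returned dict holds only the names the code defined.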
Re: [Python-Dev] variable name resolution in exec is incorrect
This option sounds very promising. It seems right to do it at the compile stage, i.e. compile(code_str, name, "closure") as you have suggested. If there were any argument against, it would be that the most obvious behaviour (function namespace) is the hardest to induce, but the value in knowing you're not breaking anything is pretty high.

Cheers,
Colin

On Thu, May 27, 2010 at 4:42 PM, Nick Coghlan ncogh...@gmail.com wrote:
> On 27/05/10 10:38, Guido van Rossum wrote:
>> On Wed, May 26, 2010 at 5:12 PM, Nick Coghlan ncogh...@gmail.com wrote:
>>> Lexical scoping only works for code that is compiled as part of a single operation - the separation between the compilation of the individual string and the code defining that string means that the symbol table analysis needed for lexical scoping can't cross the boundary.
>>
>> Hi Nick, I don't think Colin was asking for such things.
>
> Yes, I realised some time after sending that message that I'd gone off on a tangent unrelated to the original question (as a result of earlier parts of the discussion I'd been pondering the scoping differences between exec with two namespaces and a class definition and ended up writing about that instead of the topic Colin originally brought up).
>
> I suspect Thomas is right that the current two namespace exec behaviour is mostly a legacy of the standard scoping before nested scopes were added.
>
> To state the problem as succinctly as I can, the basic issue is that a code object which includes a function definition that refers to top level variables will execute correctly when the same namespace is used for both locals and globals (i.e. like module level code) but will fail when these namespaces are different (i.e. like code in class definition).
>
> So long as the code being executed doesn't define any functions that refer to top level variables in the executed code the two argument form is currently perfectly usable, so deprecating it would be an overreaction.
>
> However, attaining the (sensible) behaviour Colin is requesting when such top level variable references exist would actually be somewhat tricky. Considering Guido's suggestion to treat two argument exec like a function rather than a class and generate a closure with full lexical scoping a little further, I don't believe this could be done in exec itself without breaking code that expects the current behaviour.
>
> However, something along these lines could probably be managed as a new compilation mode for compile() (e.g. compile(code_str, name, "closure")), which would then allow these code objects to be passed to exec to get the desired behaviour.
>
> Compare and contrast:
>
>     >>> def f():
>     ...   x = 1
>     ...   def g():
>     ...     print x
>     ...   g()
>     ...
>     >>> exec f.func_code in globals(), {}
>     1
>     >>> source = """\
>     ... x = 1
>     ... def g():
>     ...   print x
>     ... g()
>     ... """
>     >>> exec source in globals(), {}
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>       File "<string>", line 4, in <module>
>       File "<string>", line 3, in g
>     NameError: global name 'x' is not defined
>
> Breaking out dis.dis on these examples is fairly enlightening, as they generate *very* different bytecode for the definition of g().
>
> Cheers,
> Nick.
>
> --
> Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
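The bytecode difference Nick mentions can be checked without reading a full dis.dis listing. The sketch below assumes a recent CPython 3 and returns `x` instead of printing it. The opcode names LOAD_DEREF and LOAD_GLOBAL are CPython implementation details and may differ on other implementations or versions, and the helper `opnames` is made up for this illustration.

```python
import dis


def f():
    x = 1

    def g():
        return x  # compiled together with f, so x is a closure variable

    return g


SOURCE = """
x = 1
def g():
    return x
"""


def opnames(code_or_func):
    """Return the set of opcode names used by a function or code object."""
    return {ins.opname for ins in dis.get_instructions(code_or_func)}


# g compiled as a nested function: x is read from a closure cell.
closure_ops = opnames(f())

# g compiled as part of top-level source: x is treated as a global.
top_level = compile(SOURCE, "<string>", "exec")
g_code = next(c for c in top_level.co_consts if hasattr(c, "co_code"))
exec_ops = opnames(g_code)

print("LOAD_DEREF" in closure_ops, "LOAD_GLOBAL" in exec_ops)
```

The decision is made at compile time: in the top-level case the compiler has no enclosing function scope to bind `x` to, which is why supplying a different locals dict at exec time cannot change how `g` looks the name up.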
Re: [Python-Dev] variable name resolution in exec is incorrect
By 'hardest to induce' I mean the default compile exec(code_str, {}, {}) would still be class namespace, but it's pretty insignificant.
[Python-Dev] variable name resolution in exec is incorrect
Hi,

issue991196 was closed being described as intentional. I've added a comment in that issue which argues that this is a serious bug (also asserted by a previous commenter, Armin Rigo), because it creates a unique, undocumented, oddly behaving scope that doesn't apply closures correctly. At the very least I think this should be acknowledged as a plain old bug (rather than a feature), followed by a discussion about whether it will be fixed or not.

Appreciate your thoughts.

Cheers,
Colin
Re: [Python-Dev] variable name resolution in exec is incorrect
Thanks for the details on why the observed behaviour occurs - very clear. My only query would be: why is this considered correct? Why is it running as a class namespace, when it is not a class? Is there any reason why this is not considered a mistake? I'm slightly concerned that this is being considered not a bug because 'it is how it is'.

A really good reason why you would want to provide a separate locals dictionary is to get access to the stuff that was defined in the exec()'d code block. Unfortunately this use case is broken by the current behaviour. The only way to get the definitions from the exec()'d code block is to supply a single dictionary, and then try to weed out the definitions from amongst all the other globals, which is very difficult if you don't know in advance what was in the code block you exec()'d.

So put simply: the bug is that a class namespace is used, but it's not a class.

On 26/05/2010 13:51, Nick Coghlan wrote:
> On 26/05/10 19:48, Mark Dickinson wrote:
>> This is a long way from my area of expertise (I'm commenting here because it was me who sent Colin here in the first place), and it's not clear to me whether this is a bug, and if it is a bug, how it could be resolved. What would the impact be of having the compiler produce 'LOAD_NAME' rather than 'LOAD_GLOBAL' here?
>
> exec with a single argument = module namespace
> exec with two arguments = class namespace
>
> Class namespaces are deliberately exempted from lexical scoping so that methods can't see class attributes, hence the example in the tracker issue works exactly as it would if the code was written as a class body.
>
>     class C:
>         y = 3
>         def execfunc():
>             print y
>         execfunc()
>
> With this code, y would end up in C.__dict__ rather than the module globals (at least, it would if it wasn't for the exception) and the call to execfunc fails with a NameError when attempting to find y.
>
> I know I've closed other bug reports that were based on the same misunderstanding, and I didn't understand it myself until Guido explained it to me a few years back, so suggestions for improving the exec documentation in this area would be appreciated.
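The parallel Nick draws between a class body and exec with two namespaces can be reproduced side by side. This is a minimal sketch assuming Python 3, with `print y` replaced by `return y`. The helper names `lookup_error`, `define_class` and `run_exec` are invented for the illustration.

```python
def lookup_error(action):
    """Call action() and return the NameError message, or None if it succeeds."""
    try:
        action()
    except NameError as exc:
        return str(exc)
    return None


def define_class():
    class C:
        y = 3

        def execfunc():
            return y  # class scope is skipped, so this is a global lookup

        execfunc()    # raises NameError while the class body executes


def run_exec():
    source = "y = 3\ndef execfunc():\n    return y\nexecfunc()\n"
    exec(source, {}, {})  # separate dicts: y is stored in the locals dict


class_error = lookup_error(define_class)
exec_error = lookup_error(run_exec)
print(class_error)
print(exec_error)
```

In both cases `y` is bound in a namespace that the nested function never searches, so the failure is the same one.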
Re: [Python-Dev] variable name resolution in exec is incorrect
The changes to the docs will definitely help in understanding why this behaves as it does. I would like to take one last stab though at justifying why this behaviour isn't correct. I will leave it alone if these arguments don't stack up :) Appreciate the input and discussion.

Terry Jan Reedy wrote:
> You are expecting that it run as a function namespace (with post 2.2 nesting), when it is not a function. Why is that any better?

Because a class namespace (as I see it) was implemented to treat a specific situation, i.e. that functions in classes cannot see class variables. exec() is a far more generic instrument that has no such explicit requirement, i.e. it feels like hijacking an edge case to meet a requirement that doesn't exist. However, 'all locals in an enclosing scope are made available in the function namespace' is generally understood as python's generic closure implementation, and would match more effectively the generic nature of the exec() statement.

A litmus test for this sort of thing: if you polled 100 knowledgeable python devs who hadn't encountered this problem or this thread and asked if they would expect exec() to run as a class or function namespace, I think you'd struggle to get 1 of them to expect a class namespace. Functions are the more generic construct, and thus more appropriate for the generic nature of exec() (IMHO).

It would appear that the only actual requirement not to make locals in an enclosing scope available in a nested function scope is for a class. The situation we are discussing seems to have created a similar requirement for exec(), but with no reason.

> In original Python, the snippet would have given an error whether you thought of it as being in a class or function context, which is how anyone who knew Python then would have expected. Consistency is not a bug. When nested function namespaces were introduced, the behavior of exec was left unchanged. Backward compatibility is not a bug.

Generally, most other behaviour did change: locals in enclosing scopes *did* become available in the nested function namespace, which was not backward compatible. Why is a special case made to retain consistency and backward compatibility for code run using exec()? It's all python code. Inconsistent backward compatibility might be considered a bug.

Cheers,
Colin
Re: [Python-Dev] variable name resolution in exec is incorrect
Mark Dickinson wrote:
> Seems to me the whole idea of being able to specify separate global and local scopes for top-level code is screwy in the first place. Are there any use cases for it? Maybe the second scope argument to exec() should be deprecated?

It is running as class namespace that makes the argument that there's no use case. I agree: I can't think of any decent use cases for exec() as class namespace, since defining functions and classes only works for a subset of function and class definitions.

However, if exec() ran as function namespace instead, then the locals dictionary would contain all the definitions from the exec()'d code block, and only those definitions. Very useful. This is a major use case for exec(): defining code from strings (e.g. enabling you to store python code in the database), and using it at runtime. It seems to me this must have been the point of locals in the first place.

If you just use globals, then your definitions exist amongst a whole bunch of other python stuff, and unless you know in advance what was defined in your code block, it's very difficult to extract them.
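The use case described here, and the point where it breaks, can both be seen in a few lines. This sketch assumes Python 3. The `context` contents and the code strings are invented for illustration.

```python
context = {"base": 10}  # hypothetical names made available to the user code

# Works today: the function only refers to a name from the globals dict.
user_code = """
offset = 5
def shifted(n):
    return n + base
"""
definitions = {}
exec(user_code, dict(context), definitions)

# Breaks today: the function refers to a name defined at the top level of
# the executed code, which lives in the locals dict it never searches.
broken_code = """
offset = 5
def use_offset():
    return offset
"""
broken = {}
exec(broken_code, dict(context), broken)
try:
    broken["use_offset"]()
    failed = False
except NameError:
    failed = True

print(sorted(definitions), definitions["shifted"](1), failed)
```

The separate locals dict does collect exactly the definitions from the code block, which is the appeal. The limitation is that those definitions cannot refer to each other through nested functions.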
Re: [Python-Dev] variable name resolution in exec is incorrect
Hi Guido,

Thanks for the possible workaround - unfortunately 'stuff' will contain a whole stack of things that are not in 'context', and were not defined in 'user_code' - things that python embeds - a (very small) selection -

    {..., 'NameError': <type 'exceptions.NameError'>, 'BytesWarning': <type 'exceptions.BytesWarning'>, 'dict': <type 'dict'>, 'input': <function input at 0x10047a9b0>, 'oct': <built-in function oct>, 'bin': <built-in function bin>, ...}

It makes sense why this happens of course, but upon return, the globals dict is very large, and finding the stuff you defined in your user_code amongst it is a very difficult task. Avoiding this problem is the 'locals' use-case for me.

Cheers,
Colin

On Thu, May 27, 2010 at 1:38 AM, Guido van Rossum gu...@python.org wrote:
> This is not easy to fix. The best short-term work-around is probably a hack like this:
>
>     def define_stuff(user_code):
>         context = {...}
>         stuff = {}
>         stuff.update(context)
>         exec(user_code, stuff)
>         for key in context:
>             if key in stuff and stuff[key] == context[key]:
>                 del stuff[key]
>         return stuff
>
> --
> --Guido van Rossum (python.org/~guido)
Re: [Python-Dev] variable name resolution in exec is incorrect
Of course :) - I need to pay more attention. Your workaround should do the trick. It would make sense if locals could be used for this purpose, but the workaround doesn't add so much overhead in most situations.

Thanks for the help, much appreciated,
Colin

On Thu, May 27, 2010 at 2:05 AM, Guido van Rossum gu...@python.org wrote:
> On Wed, May 26, 2010 at 5:53 PM, Colin H hawk...@gmail.com wrote:
>> Thanks for the possible workaround - unfortunately 'stuff' will contain a whole stack of things that are not in 'context', and were not defined in 'user_code' - things that python embeds - a (very small) selection -
>>
>>     {..., 'NameError': <type 'exceptions.NameError'>, 'BytesWarning': <type 'exceptions.BytesWarning'>, 'dict': <type 'dict'>, 'input': <function input at 0x10047a9b0>, 'oct': <built-in function oct>, 'bin': <built-in function bin>, ...}
>>
>> It makes sense why this happens of course, but upon return, the globals dict is very large, and finding the stuff you defined in your user_code amongst it is a very difficult task. Avoiding this problem is the 'locals' use-case for me. Cheers,
>
> No, if taken literally that doesn't make sense. Those are builtins. I think you are mistaken that each of those (e.g. NameError) is in stuff -- they are in stuff['__builtins__'] which represents the built-in namespace. You should remove that key from stuff as well.
>
> --Guido
>
>> Colin
>>
>> On Thu, May 27, 2010 at 1:38 AM, Guido van Rossum gu...@python.org wrote:
>>> This is not easy to fix. The best short-term work-around is probably a hack like this:
>>>
>>>     def define_stuff(user_code):
>>>         context = {...}
>>>         stuff = {}
>>>         stuff.update(context)
>>>         exec(user_code, stuff)
>>>         for key in context:
>>>             if key in stuff and stuff[key] == context[key]:
>>>                 del stuff[key]
>>>         return stuff
>>>
>>> --
>>> --Guido van Rossum (python.org/~guido)
>
> --
> --Guido van Rossum (python.org/~guido)
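Guido's point that the builtins sit under a single key can be checked directly. A minimal sketch, assuming Python 3; the dict contents and the `double` function are invented for illustration.

```python
stuff = {"answer": 42}
exec("def double(n):\n    return 2 * n\n", stuff)

# exec adds exactly one extra top-level key to the globals dict.
top_level_names = sorted(stuff)

# The builtins live behind that key. Depending on how the interpreter was
# entered, the value can be the builtins module or its dict, so handle both.
builtins_ns = stuff["__builtins__"]
builtin_names = builtins_ns if isinstance(builtins_ns, dict) else vars(builtins_ns)
has_name_error = "NameError" in builtin_names

print(top_level_names, "NameError" in stuff, has_name_error)
```

So the dict returned from exec is not flooded with builtin names. Removing the one `__builtins__` key (from a copy, if the defined functions are still to be called) leaves only the context plus whatever the code defined.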