Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-29 Thread Colin H
Perhaps the next step is to re-open the issue? If it is seen as a bug,
it would be great to see a fix in 2.6+ - a number of options which
will not break backward compatibility have been put forward - cheers,

Colin

On Thu, May 27, 2010 at 9:05 PM, Reid Kleckner r...@mit.edu wrote:
 On Thu, May 27, 2010 at 11:42 AM, Nick Coghlan ncogh...@gmail.com wrote:
 However, attaining the (sensible) behaviour Colin is requesting when such
 top level variable references exist would actually be somewhat tricky.
 Considering Guido's suggestion to treat two argument exec like a function
 rather than a class and generate a closure with full lexical scoping a
 little further, I don't believe this could be done in exec itself without
 breaking code that expects the current behaviour.

 Just to give a concrete example, here is code that would break if exec
 were to execute code in a function scope instead of a class scope:

 exec """
 def len(xs):
    return -1
 def foo():
    return len([])
 print foo()
 """ in globals(), {}

 Currently, the call to 'len' inside 'foo' skips the outer scope
 (because it's a class scope) and goes straight to globals and
 builtins.  If it were switched to a local scope, a cell would be
 created for the broken definition of 'len', and the call would resolve
 to it.

 Honestly, to me, the fact that the above code ever worked (ie prints
 0, not -1) seems like a bug, so I wouldn't worry about backwards
 compatibility.

 Reid
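
For anyone trying this on Python 3, where exec is a function rather than a statement, Reid's example can be sketched roughly as follows. The names `source`, `split_ns` and `shared_ns` are only illustrative, and the result is stored in a variable instead of being printed by the exec'd code so that it can be inspected afterwards.

```python
# Python 3 sketch of the example above; variable names are illustrative.
source = """
def len(xs):
    return -1

def foo():
    return len([])

result = foo()
"""

# Separate globals and locals: 'len' is stored in the locals dict, but the
# lookup inside foo() goes to globals and then builtins.
split_ns = {}
exec(source, {}, split_ns)
print(split_ns["result"])   # 0, the builtin len([]) is used

# One shared namespace (module-like): the redefined 'len' is a global,
# so foo() finds it.
shared_ns = {}
exec(source, shared_ns)
print(shared_ns["result"])  # -1
```

The only difference between the two calls is whether a separate locals dict is passed, which is the distinction the thread is arguing about.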

___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
I needed to make a small modification to the workaround - I wasn't
able to delete from 'stuff', because functions defined by the exec()'d
code won't run afterwards - they rely on those names still being
present in 'stuff' at call time. So I copy 'stuff' and delete from the
copy instead. In practice the overhead of doing this is quite
noticeable if you run code like this a lot and build up a decent-sized
context (which I do). It will obviously depend on the usage scenario
though.

def define_stuff(user_code):
  context = {...}
  stuff = {}
  stuff.update(context)

  exec(user_code, stuff)

  return_stuff = {}
  return_stuff.update(stuff)

  del return_stuff['__builtins__']
  for key in context:
    if key in return_stuff and return_stuff[key] == context[key]:
      del return_stuff[key]

  return return_stuff
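
To illustrate why deleting from 'stuff' itself breaks things, here is a minimal Python 3 sketch. The `scale` entry and the `scaled` function are made up for the example; the point is only that a function created by exec() keeps the globals dict it was given.

```python
# Illustrative sketch (Python 3): functions created by exec() keep the
# dict passed as globals, so removing names from it breaks them later.
context = {"scale": 10}          # hypothetical pre-populated context
stuff = dict(context)

exec("def scaled(x):\n    return x * scale\n", stuff)
scaled = stuff["scaled"]

same_dict = scaled.__globals__ is stuff   # the function's globals ARE 'stuff'
before = scaled(2)                        # 'scale' is still present: 20

del stuff["scale"]                        # what the original workaround did
try:
    scaled(2)
    failed = False
except NameError:
    failed = True

print(same_dict, before, failed)          # True 20 True
```

Copying the dict before pruning, as in the function above, leaves the function's own globals untouched.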

On Thu, May 27, 2010 at 2:13 AM, Colin H hawk...@gmail.com wrote:
 Of course :) - I need to pay more attention. Your workaround should do
 the trick. It would make sense if locals could be used for this
 purpose, but the workaround doesn't add so much overhead in most
 situations.  Thanks for the help, much appreciated,

 Colin

 On Thu, May 27, 2010 at 2:05 AM, Guido van Rossum gu...@python.org wrote:
 On Wed, May 26, 2010 at 5:53 PM, Colin H hawk...@gmail.com wrote:
   Thanks for the possible workaround - unfortunately 'stuff' will
 contain a whole stack of things that are not in 'context', and were
 not defined in 'user_code' - things that python embeds - a (very
 small) selection -

 {..., 'NameError': <type 'exceptions.NameError'>, 'BytesWarning':
 <type 'exceptions.BytesWarning'>, 'dict': <type 'dict'>, 'input':
 <function input at 0x10047a9b0>, 'oct': <built-in function oct>,
 'bin': <built-in function bin>, ...}

 It makes sense why this happens of course, but upon return, the
 globals dict is very large, and finding the stuff you defined in your
 user_code amongst it is a very difficult task.  Avoiding this problem
 is the 'locals' use-case for me.  Cheers,

 No, if taken literally that doesn't make sense. Those are builtins. I
 think you are mistaken that each of those (e.g. NameError) is in stuff
 -- they are in stuff['__builtins__'] which represents the built-in
 namespace. You should remove that key from stuff as well.

 --Guido

 Colin

 On Thu, May 27, 2010 at 1:38 AM, Guido van Rossum gu...@python.org wrote:
 This is not easy to fix. The best short-term work-around is probably a
 hack like this:

 def define_stuff(user_code):
  context = {...}
  stuff = {}
  stuff.update(context)
  exec(user_code, stuff)
  for key in context:
    if key in stuff and stuff[key] == context[key]:
      del stuff[key]
  return stuff

 --
 --Guido van Rossum (python.org/~guido)





 --
 --Guido van Rossum (python.org/~guido)




Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
Yep fair call - was primarily modifying Guido's example to make the
point about not being able to delete from the globals returned from
exec - cheers,

Colin

On Thu, May 27, 2010 at 2:09 PM, Scott Dial
scott+python-...@scottdial.com wrote:
 On 5/27/2010 7:14 AM, Colin H wrote:
 def define_stuff(user_code):
   context = {...}
   stuff = {}
   stuff.update(context)

   exec(user_code, stuff)

   return_stuff = {}
   return_stuff.update(stuff)

   del return_stuff['__builtins__']
   for key in context:
     if key in return_stuff and return_stuff[key] == context[key]:
       del return_stuff[key]

   return return_stuff

 I'm not sure about your application, but I suspect you would benefit from
 using an identity check instead of an __eq__ check. The equality check
 may be expensive (e.g., a large dictionary), and I don't think it
 actually is checking what you want -- if the user_code generates an
 __eq__-similar dictionary, wouldn't you still want that? The only reason
 I can see to use __eq__ is if you are trying to detect user_code
 modifying an object passed in, which is something that wouldn't be
 addressed by your original complaint about exec (as in, modifying a
 global data structure).

 Instead of:
     if key in return_stuff and return_stuff[key] == context[key]:

 Use:
     if key in return_stuff and return_stuff[key] is context[key]:

 --
 Scott Dial
 sc...@scottdial.com
 scod...@cs.indiana.edu
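
A Python 3 sketch of the workaround with Scott's identity check folded in is given below. Passing `context` as a parameter is an assumption made here so the example can run (the original builds it inside the function from an elided literal), and the sample `context` and `user_code` values are invented.

```python
def define_stuff(user_code, context):
    """Return only the names that user_code defined or rebound."""
    stuff = dict(context)
    exec(user_code, stuff)

    # Copy, so functions defined by user_code keep their full globals.
    result = dict(stuff)
    result.pop("__builtins__", None)
    for key, value in context.items():
        # Identity check: drop a name only if it is still the very same object.
        if key in result and result[key] is value:
            del result[key]
    return result


context = {"settings": {"debug": False}, "limit": 3}
user_code = "settings = {'debug': False}\nextra = limit + 1\n"
result = define_stuff(user_code, context)
print(sorted(result))   # ['extra', 'settings']
```

With `==` the rebound `settings` would have been discarded because the new dict compares equal to the old one; with `is` it is kept. One caveat: rebinding a name to an object that happens to be shared (a small integer, for instance) still looks unchanged under an identity check.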



Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
Just to put a couple of alternatives on the table that don't break
existing code - not necessarily promoting them, or suggesting they
would be easy to do -

1. modify exec() to take an optional extra argument after globals and
locals - 'scope_type' - if it is not supplied (but locals is), then it
runs as class namespace - i.e. identical to existing behaviour. If it
is supplied then it will run as whichever is specified, with function
namespace being an option.  The API already operates along these
lines, with the locals argument being optional and implying module
namespace if it is not present.

2. a new API exec2() which uses function namespace, and deprecating
the old exec() - assuming there is agreement that function namespace
makes more sense than the class namespace, because there are real use
cases, and developers would generally expect this behaviour when
approaching the API for the first time.
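
Neither option exists, but the function-namespace behaviour they aim for can be approximated today by wrapping the source in a throwaway function. The sketch below is Python 3; `exec_as_function` and `__exec_body__` are hypothetical names invented for this illustration, not a proposed API.

```python
import textwrap


def exec_as_function(source, global_ns):
    """Run source as the body of a throwaway function; return its locals.

    Hypothetical helper, not a real API. It will mis-handle source that
    contains a top-level 'return' or multi-line string literals, because
    the text is simply re-indented.
    """
    body = textwrap.indent(source, "    ")
    wrapper = "def __exec_body__():\n" + body + "\n    return locals()\n"
    scratch = {}
    exec(wrapper, global_ns, scratch)
    return dict(scratch["__exec_body__"]())


user_code = "x = 1\ndef g():\n    return x + 1\ny = g()\n"
definitions = exec_as_function(user_code, {})
print(sorted(definitions))   # ['g', 'x', 'y']
```

Because `g` is compiled inside a function body, its reference to `x` becomes a closure variable, and the returned dict holds only the names the user code defined.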


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
This option sounds very promising - it seems right to do it at the
compile stage - i.e. compile(code_str, name, "closure") as you have
suggested.  If there were any argument against, it would be that the
most obvious behaviour (function namespace) is the hardest to induce,
but the value in knowing you're not breaking anything is pretty high.

Cheers,
Colin

On Thu, May 27, 2010 at 4:42 PM, Nick Coghlan ncogh...@gmail.com wrote:
 On 27/05/10 10:38, Guido van Rossum wrote:

 On Wed, May 26, 2010 at 5:12 PM, Nick Coghlan ncogh...@gmail.com wrote:

 Lexical scoping only works for code that is compiled as part of a single
 operation - the separation between the compilation of the individual string
 and the code defining that string means that the symbol table analysis
 needed for lexical scoping can't cross the boundary.

 Hi Nick,

 I don't think Colin was asking for such things.

 Yes, I realised some time after sending that message that I'd gone off on a
 tangent unrelated to the original question (as a result of earlier parts of
 the discussion I'd been pondering the scoping differences between exec with
 two namespaces and a class definition and ended up writing about that
 instead of the topic Colin originally brought up).

 I suspect Thomas is right that the current two namespace exec behaviour is
 mostly a legacy of the standard scoping before nested scopes were added.

 To state the problem as succinctly as I can, the basic issue is that a code
 object which includes a function definition that refers to top level
 variables will execute correctly when the same namespace is used for both
 locals and globals (i.e. like module level code) but will fail when these
 namespaces are different (i.e. like code in a class definition).

 So long as the code being executed doesn't define any functions that refer
 to top level variables in the executed code the two argument form is
 currently perfectly usable, so deprecating it would be an overreaction.

 However, attaining the (sensible) behaviour Colin is requesting when such
 top level variable references exist would actually be somewhat tricky.
 Considering Guido's suggestion to treat two argument exec like a function
 rather than a class and generate a closure with full lexical scoping a
 little further, I don't believe this could be done in exec itself without
 breaking code that expects the current behaviour. However, something along
 these lines could probably be managed as a new compilation mode for
 compile() (e.g. compile(code_str, name, "closure")), which would then allow
 these code objects to be passed to exec to get the desired behaviour.

 Compare and contrast:

 >>> def f():
 ...   x = 1
 ...   def g():
 ...     print x
 ...   g()
 ...
 >>> exec f.func_code in globals(), {}
 1

 >>> source = """\
 ... x = 1
 ... def g():
 ...   print x
 ... g()
 ... """
 >>> exec source in globals(), {}
 Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "<string>", line 4, in <module>
   File "<string>", line 3, in g
 NameError: global name 'x' is not defined

 Breaking out dis.dis on these examples is fairly enlightening, as they
 generate *very* different bytecode for the definition of g().

 Cheers,
 Nick.

 --
 Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia
 ---
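
The bytecode difference Nick mentions can be inspected on Python 3 with a sketch along these lines. The helpers `inner_code` and `opnames` are written for this illustration only, and exact opcode listings vary between CPython versions, so the code prints them without assuming more than the presence or absence of LOAD_GLOBAL.

```python
import dis
import types

SOURCE = """\
x = 1
def g():
    return x
result = g()
"""


def inner_code(code, name):
    """Find the code object for a function defined directly inside code."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType) and const.co_name == name:
            return const
    raise LookupError(name)


def opnames(code):
    """List the opcode names used by a code object."""
    return [ins.opname for ins in dis.get_instructions(code)]


def f():
    x = 1
    def g():
        return x
    return g()


# g compiled at the top level of a string: 'x' is looked up as a global.
top_level_g = inner_code(compile(SOURCE, "<string>", "exec"), "g")
# g compiled inside f: 'x' is a closure variable instead.
nested_g = inner_code(f.__code__, "g")

print("top-level g:", opnames(top_level_g))
print("nested g:   ", opnames(nested_g))
```

The top-level `g` fetches `x` with LOAD_GLOBAL, which is why it cannot see a name that only lives in a separate locals dict, while the nested `g` reads `x` through a closure cell.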



Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-27 Thread Colin H
By "hardest to induce" I mean that with the default compile mode,
exec(code_str, {}, {}) would still run as a class namespace, but that's
pretty insignificant.

On Fri, May 28, 2010 at 12:32 AM, Colin H hawk...@gmail.com wrote:
 This option sounds very promising - seems right to do it at the
 compile stage - i.e. compile(code_str, name, "closure") as you have
 suggested.  If there were any argument against, it would be that the
 most obvious behaviour (function namespace) is the hardest to induce,
 but the value in knowing you're not breaking anything is pretty high.

 Cheers,
 Colin





[Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Hi,

   issue991196 was closed, having been described as intentional.  I've added
a comment to that issue arguing that this is a serious bug (as also
asserted by a previous commenter, Armin Rigo), because it creates a
unique, undocumented, oddly behaving scope that doesn't apply closures
correctly. At the very least I think this should be acknowledged as a
plain old bug (rather than a feature), followed by a discussion about
whether it will be fixed or not.  Appreciate your thoughts - cheers,

Colin


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Thanks for the details on why the observed behaviour occurs - very
clear. My only query would be why this is considered correct? Why is
it running as a class namespace, when it is not a class? Is there any
reason why this is not considered a mistake? Slightly concerned that
this is being considered not a bug because 'it is how it is'.

A really good reason why you would want to provide a separate locals
dictionary is to get access to the stuff that was defined in the
exec()'d code block.  Unfortunately this use case is broken by the
current behaviour.  The only way to get the definitions from the
exec()'d code block is to supply a single dictionary, and then try to
weed out the definitions from amongst all the other globals, which is
very difficult if you don't know in advance what was in the code block
you exec()'d.

So put simply - the bug is that a class namespace is used, but it's not a class.

On 26/05/2010 13:51, Nick Coghlan wrote:
 On 26/05/10 19:48, Mark Dickinson wrote:
 This is a long way from my area of expertise (I'm commenting here
 because it was me who sent Colin here in the first place), and it's
 not clear to me whether this is a bug, and if it is a bug, how it
 could be resolved. What would the impact be of having the compiler
 produce 'LOAD_NAME' rather than 'LOAD_GLOBAL' here?

 exec with a single argument = module namespace
 exec with two arguments = class namespace

 Class namespaces are deliberately exempted from lexical scoping so
 that methods can't see class attributes, hence the example in the
 tracker issue works exactly as it would if the code was written as a
 class body.

 class C:
     y = 3
     def execfunc():
         print y
     execfunc()

 With this code, y would end up in C.__dict__ rather than the module
 globals (at least, it would if it wasn't for the exception) and the
 call to execfunc fails with a NameError when attempting to find y.

 I know I've closed other bug reports that were based on the same
 misunderstanding, and I didn't understand it myself until Guido
 explained it to me a few years back, so suggestions for improving the
 exec documentation in this area would be appreciated.
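
The equivalence Nick describes can be checked on Python 3 with a sketch like the one below. It mirrors the class body above (returning the value instead of printing it) and then runs the same statements through exec() with separate globals and locals; the flag names are illustrative.

```python
# Python 3 sketch; the class body mirrors the example above, but returns
# the value instead of printing it.
try:
    class C:
        y = 3
        def execfunc():
            return y
        execfunc()          # called while the class body is still executing
    class_failed = False
except NameError:
    class_failed = True

code = "y = 3\ndef execfunc():\n    return y\nexecfunc()\n"
try:
    exec(code, {}, {})      # separate globals and locals
    exec_failed = False
except NameError:
    exec_failed = True

print(class_failed, exec_failed)   # True True
```

In both cases `y` is stored in the local namespace (the class dict or the locals dict), while `execfunc` looks for it among the globals and builtins.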


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
The changes to the docs will definitely help in understanding why this
behaves as it does. I would like to take one last stab though at
justifying why this behaviour isn't correct - will leave it alone if
these arguments don't stack up :)  Appreciate the input and
discussion.

Terry Jan Reedy wrote

 You are expecting that it run as a function namespace (with post 2.2
 nesting), when it is not a function. Why is that any better?

Because a class namespace (as I see it) was implemented to treat a
specific situation - i.e. that functions in classes cannot see class
variables. exec() is a far more generic instrument that has no such
explicit requirement - i.e. it feels like hijacking an edge case to
meet a requirement that doesn't exist. However 'all locals in an
enclosing scope are made available in the function namespace' is
generally understood as python's generic closure implementation, and
would match more effectively the generic nature of the exec()
statement.  A litmus test for this sort of thing - if you polled 100
knowledgeable python devs who hadn't encountered this problem or this
thread and asked if they would expect exec() to run as a class or
function namespace, I think you'd struggle to get 1 of them to expect
a class namespace. Functions are the more generic construct, and thus
more appropriate for the generic nature of exec() (IMHO).

It would appear that the only actual requirement not to make locals in
an enclosing scope available in a nested function scope is for a
class. The situation we are discussing seems to have created a similar
requirement for exec(), but with no reason.

 In original Python, the snippet would have given an error whether you
 thought of it as being in a class or function context, which is how
 anyone who knew Python then would have expected. Consistency is not a bug.

 When nested function namespaces were introduced, the behavior of exec
 was left unchanged. Backward compatibility is not a bug.

Generally, most other behaviour did change - locals in enclosing
scopes *did* become available in the nested function namespace, which
was not backward compatible.  Why is a special case made to retain
consistency and backward compatibility for code run using exec()? It's
all python code. Inconsistent backward compatibility might be
considered a bug.

Cheers,

Colin


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Mark Dickinson wrote:

 Seems to me the whole idea of being able to specify
 separate global and local scopes for top-level code is
 screwy in the first place. Are there any use cases for
 it? Maybe the second scope argument to exec() should
 be deprecated?

It is the fact that exec() runs as a class namespace that gives weight
to the argument that there's no use case. I agree - I can't think of
any decent use cases for exec() as a class namespace - defining
functions and classes only works for a subset of function and class
definitions.

However, if exec() ran as a function namespace instead, then the locals
dictionary would contain all the definitions from the exec()'d code
block, and only those definitions. Very useful. This is a major use
case for exec() - defining code from strings (e.g. enabling you to
store python code in the database), and using it at runtime. It seems
to me this must have been the point of locals in the first place.

If you just use globals, then your definitions exist amongst a whole
bunch of other python stuff, and unless you know in advance what was
defined in your code block, it's very difficult to extract them.
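
As a rough Python 3 illustration of the use case: a separate locals dict already collects exactly the definitions, provided the exec()'d functions do not refer to other top-level names from the same block. The `context` contents and `user_code` string here are invented for the example.

```python
context = {"limit": 3}          # hypothetical names made available to the code
definitions = {}

user_code = "extra = limit + 1\ndef double(n):\n    return 2 * n\n"
exec(user_code, context, definitions)

print(sorted(definitions))       # ['double', 'extra'], only what user_code defined
print("extra" in context)        # False: the globals dict did not receive them
```

This works only because `double` uses nothing but its own parameter; a function that referred to `extra` would hit the NameError discussed in this thread.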


Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Hi Guido,

   Thanks for the possible workaround - unfortunately 'stuff' will
contain a whole stack of things that are not in 'context', and were
not defined in 'user_code' - things that python embeds - a (very
small) selection -

{..., 'NameError': <type 'exceptions.NameError'>, 'BytesWarning':
<type 'exceptions.BytesWarning'>, 'dict': <type 'dict'>, 'input':
<function input at 0x10047a9b0>, 'oct': <built-in function oct>,
'bin': <built-in function bin>, ...}

It makes sense why this happens of course, but upon return, the
globals dict is very large, and finding the stuff you defined in your
user_code amongst it is a very difficult task.  Avoiding this problem
is the 'locals' use-case for me.  Cheers,

Colin

On Thu, May 27, 2010 at 1:38 AM, Guido van Rossum gu...@python.org wrote:
 This is not easy to fix. The best short-term work-around is probably a
 hack like this:

 def define_stuff(user_code):
  context = {...}
  stuff = {}
  stuff.update(context)
  exec(user_code, stuff)
  for key in context:
    if key in stuff and stuff[key] == context[key]:
      del stuff[key]
  return stuff

 --
 --Guido van Rossum (python.org/~guido)



Re: [Python-Dev] variable name resolution in exec is incorrect

2010-05-26 Thread Colin H
Of course :) - I need to pay more attention. Your workaround should do
the trick. It would make sense if locals could be used for this
purpose, but the workaround doesn't add so much overhead in most
situations.  Thanks for the help, much appreciated,

Colin

On Thu, May 27, 2010 at 2:05 AM, Guido van Rossum gu...@python.org wrote:
 On Wed, May 26, 2010 at 5:53 PM, Colin H hawk...@gmail.com wrote:
   Thanks for the possible workaround - unfortunately 'stuff' will
 contain a whole stack of things that are not in 'context', and were
 not defined in 'user_code' - things that python embeds - a (very
 small) selection -

 {..., 'NameError': <type 'exceptions.NameError'>, 'BytesWarning':
 <type 'exceptions.BytesWarning'>, 'dict': <type 'dict'>, 'input':
 <function input at 0x10047a9b0>, 'oct': <built-in function oct>,
 'bin': <built-in function bin>, ...}

 It makes sense why this happens of course, but upon return, the
 globals dict is very large, and finding the stuff you defined in your
 user_code amongst it is a very difficult task.  Avoiding this problem
 is the 'locals' use-case for me.  Cheers,

 No, if taken literally that doesn't make sense. Those are builtins. I
 think you are mistaken that each of those (e.g. NameError) is in stuff
 -- they are in stuff['__builtins__'] which represents the built-in
 namespace. You should remove that key from stuff as well.

 --Guido

 Colin

 On Thu, May 27, 2010 at 1:38 AM, Guido van Rossum gu...@python.org wrote:
 This is not easy to fix. The best short-term work-around is probably a
 hack like this:

 def define_stuff(user_code):
  context = {...}
  stuff = {}
  stuff.update(context)
  exec(user_code, stuff)
  for key in context:
    if key in stuff and stuff[key] == context[key]:
      del stuff[key]
  return stuff

 --
 --Guido van Rossum (python.org/~guido)





 --
 --Guido van Rossum (python.org/~guido)
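
Guido's point above, that the builtins sit under stuff['__builtins__'] rather than as individual keys, can be checked directly. This is a small Python 3 sketch with an invented one-entry context.

```python
stuff = {"limit": 3}                 # stands in for the 'context' entries
exec("extra = limit + 1", stuff)

keys_after_exec = sorted(stuff)
print(keys_after_exec)               # ['__builtins__', 'extra', 'limit']
print("NameError" in stuff)          # False: builtins are not top-level keys

stuff.pop("__builtins__", None)
print(sorted(stuff))                 # ['extra', 'limit']
```

So a single extra key has to be removed, after which only the context entries and the user's definitions remain.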
