Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
Guido van Rossum wrote:
> On Thu, Mar 8, 2012 at 4:33 PM, Nick Coghlan ncogh...@gmail.com wrote:
>> On Fri, Mar 9, 2012 at 3:31 AM, Guido van Rossum gu...@python.org wrote:
>>> But the __builtins__ that are actually used by any particular piece of code is *not* taken by importing builtins. It is taken from what the globals store under the key __builtins__. This is a feature that was added specifically for sandboxing purposes, but I believe it has found other uses too.
>>
>> Agreed, but swapping out builtins for a different namespace is still the exception rather than the rule. My impression of Mark's proposal was that this approach would become the *preferred* way of doing things, and that's the part I don't like at a conceptual level.
>>
>>>>> The key point is that every piece of code already inherits locals, globals and builtins from somewhere else. We can already control locals (by which parameters are passed in) and globals via exec, eval, __import__, and runpy (any others?) but we can't control builtins.
>>>> Correct - because controlling builtins is the domain of sandboxes.
>>> Incorrect (unless I misunderstand the context) -- when you control the globals you control the __builtins__ set there.
>>
>> And this is where I don't like the idea at a practical level. We already have a way to swap in a different set of builtins for a certain execution context (i.e. set __builtins__ in the global namespace) for a small chunk of code, as well as allowing collections.ChainMap to insert additional namespaces into the name lookup path. This proposal suggests adding an additional mapping argument to every API that currently accepts a locals and/or globals mapping, thus achieving... well, nothing substantial, as far as I can tell (aside from a lot of pointless churn in a bunch of APIs, not all of which are under our direct control).
>>
>>> In any case, the locals / globals / builtins chain is a simplification; there are also any number of intermediate scopes (between locals and globals) from which nonlocal variables may be used. Like optimized function globals, these don't use a dict lookup at all, they are determined by compile-time analysis.
>>
>> Acknowledged, but code executed via the exec API with both locals and globals passed in is actually one of the few places where that lookup chain survives in its original form (module level class definitions being the other).
>>
>> Now, rereading Mark's original message, a simpler proposal of having *function objects* do an early lookup of self.__globals__['__builtins__'] at creation time and caching that somewhere such that the frame objects can get hold of it (rather than having to do the lookup every time the function gets called or a builtin gets referenced) might be a nice micro-optimisation. It's the gratuitous API changes that I'm objecting to, not the underlying idea of binding the reference to the builtins namespace earlier in the function definition process. I'd even be OK with leaving the default builtins reference *out* of the globals namespace in favour of storing a hidden reference on the frame objects.
>
> Agreed on the gratuitous API changes. I'd like to hear Mark's response.

C API or Python API?

The Python API would be changed, but in a backwards compatible way. exec, eval and __import__ would all gain an optional (keyword-only?) builtins parameter.

I see no reason to change any of the C API functions. New functions taking an extra parameter could be added, but it wouldn't be a requirement.

Cheers,
Mark
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
On Fri, Mar 9, 2012 at 6:19 PM, Mark Shannon m...@hotpy.org wrote:
> The Python API would be changed, but in a backwards compatible way. exec, eval and __import__ would all gain an optional (keyword-only?) builtins parameter.

No, some APIs effectively define *protocols*. For such APIs, *adding* parameters is almost of comparable import to taking them away, because they require that other APIs modelled on the prototype also change. In this case, not only exec() has to change, but eval, __import__, probably runpy, function creation, eventually any third party APIs for code execution, etc, etc.

Adding a new parameter to exec is a change with serious implications, and utterly unnecessary, since the API part is already covered by setting __builtins__ in the passed in globals namespace (which is appropriately awkward to advise people that they're doing something strange with potentially unintended consequences or surprising limitations).

That said, binding a reference to the builtin *early* (for example, at function definition time or when a new invocation of the eval loop first fires up) may be a reasonable idea, but you don't have to change the user facing API to explore that option - it works just as well with __builtins__ as an optional value in the existing global namespace.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
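Editorial sketch of the existing mechanism Nick refers to: placing a `__builtins__` entry in the globals dict passed to exec(). The names `restricted_builtins` and `namespace` are illustrative assumptions, and this is a demonstration of name lookup only, not a security boundary.

```python
# Only these two names will be visible as "builtins" to the executed code.
restricted_builtins = {"len": len, "range": range}
namespace = {"__builtins__": restricted_builtins}

# len and range resolve through the restricted builtins mapping.
exec("n = len(range(5))", namespace)

# open is not in the restricted mapping, so the lookup fails with NameError.
blocked = None
try:
    exec("f = open('somefile')", namespace)
except NameError as exc:
    blocked = str(exc)
```

No new parameter is involved: the builtins used by the executed code are chosen entirely through the globals argument.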
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
Nick Coghlan wrote:
> On Fri, Mar 9, 2012 at 6:19 PM, Mark Shannon m...@hotpy.org wrote:
>> The Python API would be changed, but in a backwards compatible way. exec, eval and __import__ would all gain an optional (keyword-only?) builtins parameter.
>
> No, some APIs effectively define *protocols*. For such APIs, *adding* parameters is almost of comparable import to taking them away, because they require that other APIs modelled on the prototype also change. In this case, not only exec() has to change, but eval, __import__, probably runpy, function creation, eventually any third party APIs for code execution, etc, etc.
>
> Adding a new parameter to exec is a change with serious implications, and utterly unnecessary, since the API part is already covered by setting __builtins__ in the passed in globals namespace (which is appropriately awkward to advise people that they're doing something strange with potentially unintended consequences or surprising limitations).

It is the implementation that interests me. Implementing the (locals, globals, builtins) triple as a single object has advantages both in terms of internal consistency and efficiency. I just thought to expose this to the user. I am now persuaded that I don't want to expose anything :)

> That said, binding a reference to the builtin *early* (for example, at function definition time or when a new invocation of the eval loop first fires up) may be a reasonable idea, but you don't have to change the user facing API to explore that option - it works just as well with __builtins__ as an optional value in the existing global namespace.

OK. So, how about this:
(builtins refers to the dict used for variable lookup, not the module)

New eval pseudocode:

    eval(code, globals, locals):
        triple = (locals, globals, globals['__builtins__'])
        return eval_internal(triple)

Similarly for exec, __import__ and runpy.

That way the (IMO clumsy) builtins = globals['__builtins__'] only happens at a few known locations. It should then be clear where all code gets its namespaces from.

Namespaces should be inherited as follows:

frame:
    function scope: globals and builtins from function, locals from parameters.
    module scope: globals and builtins from module, locals == globals.
    in eval, exec, or runpy: all explicit.
function: globals and builtins from module (no locals)
module: globals and builtins from import (no locals)
import: explicitly from __import__() or implicitly from current frame in an import statement.

For frame and function, free and cell (nonlocal) variables would be unchanged.

On entry the namespaces will be {}, {}, sys.modules['builtins'].__dict__

This is pretty much what happens anyway, except that where code gets its builtins from is now well defined.

Cheers,
Mark.
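Editorial sketch of the namespace triple described in the pseudocode above, written in pure Python for illustration only. `NamespaceTriple`, `make_triple` and `lookup` are hypothetical names, not part of CPython, and the real proposal concerned the interpreter's C internals.

```python
import builtins
from typing import Any, Mapping, NamedTuple


class NamespaceTriple(NamedTuple):
    """Hypothetical immutable (locals, globals, builtins) execution context."""
    local_ns: Mapping[str, Any]
    global_ns: dict
    builtin_ns: Mapping[str, Any]


def make_triple(global_ns, local_ns=None):
    # The single place where the builtins are extracted from the globals.
    found = global_ns.get("__builtins__", builtins)
    # __builtins__ may be the builtins module or its dict; normalise to a dict.
    builtin_ns = found if isinstance(found, dict) else vars(found)
    # At module scope, locals and globals are the same namespace.
    chosen_locals = local_ns if local_ns is not None else global_ns
    return NamespaceTriple(chosen_locals, global_ns, builtin_ns)


def lookup(triple, name):
    # Mimic the locals -> globals -> builtins fallback for one name.
    for namespace in triple:
        if name in namespace:
            return namespace[name]
    raise NameError(name)


triple = make_triple({"x": 1}, {"y": 2})
shared = triple  # an immutable triple is "copied" by taking a reference
```

Here `lookup(triple, "y")` finds the name in locals, `lookup(triple, "x")` in globals, and `lookup(triple, "len")` falls through to the builtins.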
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
> The reason I am proposing this here rather than on python-ideas is that treating the triple of [locals, globals, builtins] as a single execution context can be implemented in a really nice way.
>
> Internally, the execution context of [locals, globals, builtins] can be treated as a single immutable object (custom object or tuple). Treating it as immutable means that it can be copied merely by taking a reference. A nice trick in the implementation is to make a NULL locals mean fast locals for function contexts. Frames could then acquire their globals and builtins by a single reference copy from the function object, rather than searching globals for a '__builtins__' to find the builtins.

Creating a new frame looks up __builtins__ in globals only if the globals of the new frame are different from the globals of the previous frame. You would like to optimize this case? If globals is unchanged, Python just increments the reference counter.

When is globals different from the previous frame? When you call a function from a different module maybe?

Do you have an idea of the speedup of your optimization?

Victor
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
Victor Stinner wrote:
>> The reason I am proposing this here rather than on python-ideas is that treating the triple of [locals, globals, builtins] as a single execution context can be implemented in a really nice way.
>>
>> Internally, the execution context of [locals, globals, builtins] can be treated as a single immutable object (custom object or tuple). Treating it as immutable means that it can be copied merely by taking a reference. A nice trick in the implementation is to make a NULL locals mean fast locals for function contexts. Frames could then acquire their globals and builtins by a single reference copy from the function object, rather than searching globals for a '__builtins__' to find the builtins.
>
> Creating a new frame looks up __builtins__ in globals only if the globals of the new frame are different from the globals of the previous frame. You would like to optimize this case? If globals is unchanged, Python just increments the reference counter.

I'm more interested in simplifying the code than performance. With this proposed approach, there is no need to test where the globals come from, or what the builtins are; just incref the namespace triple.

> When is globals different from the previous frame? When you call a function from a different module maybe?
>
> Do you have an idea of the speedup of your optimization?

No. But it won't be slower.

Cheers,
Mark
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
On 09/03/2012 12:57, Mark Shannon wrote:
> No. But it won't be slower.
>
> Cheers,
> Mark

Please prove it, you have to convince a number of core developers including, but not limited to, the BDFL :).

--
Cheers.
Mark Lawrence.
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
Jim J. Jewett wrote:
> http://mail.python.org/pipermail/python-dev/2012-March/117395.html
> Brett Cannon posted:
>
> [in reply to Mark Shannon's suggestion of adding a builtins parameter to match locals and globals]
>
>> It's a mess right now to try to grab the __import__() implementation and this would actually help clarify import semantics by saying that __import__() for any chained imports comes from __import__()'s locals, globals, or builtins arguments (in that order) or from the builtins module itself (i.e. tstate->builtins).
>
> How does that differ from today?

The idea is that you can change, presumably restrict, the builtins separately from the globals for an import.

> If you're saying that the locals and (module-level) globals aren't always checked in order, then that is a semantic change. Probably a good change, but still a change -- and it can be made independently of Mark's suggestion.
>
> Also note that I would assume this was for sandboxing,

Actually, I just think it's a cleaner implementation, but sandboxing is a good excuse :)

> and that missing names should *not* fall back to the real globals, although I would understand if bootstrapping required the import statement to get special treatment.
>
> (Note that I like Mark's proposed change; I just don't see how it cleans up import.)

I don't think it cleans up import, but I'll defer to Brett on that. I've included __import__() along with exec and eval as it is a place where new namespaces can be introduced into an execution. There may be others I haven't thought of.

Cheers,
Mark.
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
On Thu, Mar 8, 2012 at 10:06 PM, Mark Shannon m...@hotpy.org wrote:
> I don't think it cleans up import, but I'll defer to Brett on that. I've included __import__() along with exec and eval as it is a place where new namespaces can be introduced into an execution. There may be others I haven't thought of.

runpy is another one.

However, the problem I see with builtins as a separate argument is that it would be a lie.

The element that's most interesting about locals vs globals vs builtins is the scope of visibility of their contents.

When I call out to another function in the same module, locals are not shared, but globals and builtins are.

When I call out to code in a *different* module, neither locals nor globals are shared, but builtins are still common.

So there are two ways this purported extra builtins parameter could work:

1. Sandboxing - you try to genuinely give the execution context a different set of builtins that's shared by all code executed, even imports from other modules. However, I assume this isn't what you meant, since it is the domain of sandboxing utilities like Victor's pysandbox and is known to be incredibly difficult to get right (hence the demise of both rexec and Bastion and recent comments about known segfault vulnerabilities that are tolerable in the normal case of merely processing untrusted data with trusted code but anathema to a robust CPython native sandboxing scheme that can still cope even when the code itself is untrusted).

2. chained globals - just an extra namespace that's chained behind the globals dictionary for name lookup, not actually shared with code invoked from other modules.

The second approach is potentially useful, but:

1. builtins is *not* the right name for it (because any other code invoked will still be using the original builtins)

2. it's already trivial to achieve such chained lookups in 3.3 by passing a collections.ChainMap instance as the globals parameter:
http://docs.python.org/dev/library/collections#collections.ChainMap

collections.ChainMap also has the virtue of working with any current API that accepts a globals argument and can be extended to an arbitrary level of chaining, whereas this suggestion requires that all such APIs be expanded to accept a third parameter, and could still only chain lookups one additional step in doing so.

So a big -1 from me.

Cheers,
Nick.

P.S. I've referenced this talk before, but Tim Dawborn's effort from PyCon AU last year about the sandboxing setup for http://www.ncss.edu.au/ should be required viewing for anyone wanting to understand the kind of effort it takes to fairly comprehensively protect host servers from attacks when executing arbitrary untrusted Python code on CPython. Implementing such protection is certainly *possible* (since Tim's talk is all about one way to do it), but it's not easy, and Tim's approach uses Linux OS level sandboxing rather than relying on a Python language level sandbox. This was largely due to a university requirement that the sandbox solution be language agnostic, but it also serves to protect the sandbox from the documented attacks against the CPython interpreter. Tim reviews a few interesting attempts to break the sandbox around the 5 minute mark in https://www.youtube.com/watch?v=y-WPPdhTKBU. (I did suggest he grab our test_crashers directory to see what happened when they were run in the sandbox, but I doubt it would be much more interesting than merely calling sys.exit())

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
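Editorial sketch of the chained lookup Nick describes, with one caveat stated as a known CPython behaviour rather than anything from the thread: CPython's exec() requires the *globals* argument to be a real dict, so this sketch passes the collections.ChainMap as the *locals* mapping instead. The names `extra`, `script_locals` and `helper` are illustrative assumptions.

```python
from collections import ChainMap

# An extra namespace chained behind the script's own namespace.
extra = {"helper": lambda value: value * 2}
script_locals = {}
chained = ChainMap(script_locals, extra)

# Reads fall through script_locals -> extra; writes land in script_locals.
exec("result = helper(21)", {}, chained)

# In CPython, passing the ChainMap as globals is rejected with TypeError.
globals_rejected = False
try:
    exec("result = helper(21)", chained)
except TypeError:
    globals_rejected = True
```

Because names stored by the executed code go to the first mapping in the chain, `extra` is left untouched while `script_locals` receives `result`.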
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
Nick Coghlan wrote:
> On Thu, Mar 8, 2012 at 10:06 PM, Mark Shannon m...@hotpy.org wrote:
>> I don't think it cleans up import, but I'll defer to Brett on that. I've included __import__() along with exec and eval as it is a place where new namespaces can be introduced into an execution. There may be others I haven't thought of.
>
> runpy is another one.

Add that to the list.

> However, the problem I see with builtins as a separate argument is that it would be a lie.
>
> The element that's most interesting about locals vs globals vs builtins is the scope of visibility of their contents.
>
> When I call out to another function in the same module, locals are not shared, but globals and builtins are.
>
> When I call out to code in a *different* module, neither locals nor globals are shared, but builtins are still common.

Not necessarily. All functions in a module will inherit their globals *and* builtins from the module, which gets them from __import__().

> So there are two ways this purported extra builtins parameter could work:
>
> 1. Sandboxing - you try to genuinely give the execution context a different set of builtins that's shared by all code executed, even imports from other modules.

Victor's pysandbox seems pretty good to me, I had a go at breaking it and failed, but it is too restrictive. Rather than make pysandbox more secure, I think my proposal could make it more usable, as clearer guarantees about access and visibility can be provided to the sandbox developer. You shouldn't need to cripple introspection in order to limit access to the builtins.

> However, I assume this isn't what you meant, since it is the domain of sandboxing utilities like Victor's pysandbox and is known to be incredibly difficult to get right (hence the demise of both rexec and Bastion and recent comments about known segfault vulnerabilities that are tolerable in the normal case of merely processing untrusted data with trusted code but anathema to a robust CPython native sandboxing scheme that can still cope even when the code itself is untrusted).

Changing the implementation to be based around immutable execution contexts means that the compiler will enforce things for us. Static typing has its advantages, occasionally :)

As I stated elsewhere, the crashers can be fixed. I think Victor has already fixed a couple.

> 2. chained globals - just an extra namespace that's chained behind the globals dictionary for name lookup, not actually shared with code invoked from other modules.

That's exactly what builtins already are. They are a fall back for LOAD_GLOBAL and similar when something isn't found in the globals.

> The second approach is potentially useful, but:
>
> 1. builtins is *not* the right name for it (because any other code invoked will still be using the original builtins)

Other code will use whatever builtins they were given at __import__.

The key point is that every piece of code already inherits locals, globals and builtins from somewhere else. We can already control locals (by which parameters are passed in) and globals via exec, eval, __import__, and runpy (any others?) but we can't control builtins.

One last point is that this is a low-impact change. All code using eval, etc. will continue to work as before. It also may speed things up a little.

Cheers,
Mark.
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
On Thu, Mar 8, 2012 at 11:40 PM, Mark Shannon m...@hotpy.org wrote:
> Other code will use whatever builtins they were given at __import__.

Then they're not builtins - they're module-specific chained globals. The thing that makes the builtins special is *who else* can see them (i.e. all the other code in the process). If you replace builtins.open, you replace it for everyone (that hasn't either shadowed it or cached a reference to the original).

> The key point is that every piece of code already inherits locals, globals and builtins from somewhere else. We can already control locals (by which parameters are passed in) and globals via exec, eval, __import__, and runpy (any others?) but we can't control builtins.

Correct - because controlling builtins is the domain of sandboxes.

> One last point is that this is a low-impact change. All code using eval, etc. will continue to work as before. It also may speed things up a little.

Passing in a ChainMap instance as the globals when you want to include an additional namespace in the lookup chain is even lower impact.

A reference implementation and concrete use cases might change my mind, but for now, I'm just seeing a horrendously complicated approach with huge implications for the runtime data model semantics for something that 3.3 already supports in a much simpler fashion.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
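Editorial sketch of the process-wide visibility Nick describes. Rather than replacing builtins.open, it adds a throwaway attribute (`shared_marker`, a hypothetical name) to the builtins module and shows that code running in two unrelated globals dicts both see it.

```python
import builtins

# Add a name to the real builtins module; every namespace that uses the
# default builtins can now see it.
builtins.shared_marker = "visible everywhere"
try:
    ns_a, ns_b = {}, {}
    # Neither dict defines shared_marker, so the lookup falls through
    # to the shared builtins namespace in both cases.
    exec("seen = shared_marker", ns_a)
    exec("seen = shared_marker", ns_b)
finally:
    # Undo the change so the rest of the process is unaffected.
    del builtins.shared_marker
```

Code that had already copied a reference before the change (or that shadows the name in its own globals) would not observe it, which matches the caveat in the message above.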
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
On Thu, Mar 8, 2012 at 5:57 AM, Nick Coghlan ncogh...@gmail.com wrote:
> On Thu, Mar 8, 2012 at 11:40 PM, Mark Shannon m...@hotpy.org wrote:
>> Other code will use whatever builtins they were given at __import__.
>
> Then they're not builtins - they're module-specific chained globals. The thing that makes the builtins special is *who else* can see them (i.e. all the other code in the process). If you replace builtins.open, you replace it for everyone (that hasn't either shadowed it or cached a reference to the original).

Looks like you two are talking about different things. There is only one 'builtins' *module*. But the __builtins__ that are actually used by any particular piece of code is *not* taken by importing builtins. It is taken from what the globals store under the key __builtins__.

This is a feature that was added specifically for sandboxing purposes, but I believe it has found other uses too.

>> The key point is that every piece of code already inherits locals, globals and builtins from somewhere else. We can already control locals (by which parameters are passed in) and globals via exec, eval, __import__, and runpy (any others?) but we can't control builtins.
>
> Correct - because controlling builtins is the domain of sandboxes.

Incorrect (unless I misunderstand the context) -- when you control the globals you control the __builtins__ set there.

>> One last point is that this is a low-impact change. All code using eval, etc. will continue to work as before. It also may speed things up a little.
>
> Passing in a ChainMap instance as the globals when you want to include an additional namespace in the lookup chain is even lower impact.
>
> A reference implementation and concrete use cases might change my mind, but for now, I'm just seeing a horrendously complicated approach with huge implications for the runtime data model semantics for something that 3.3 already supports in a much simpler fashion.

I can't say I'm completely following the discussion. It's not clear whether what I just explained was already implicit in the conversation or is new information.

In any case, the locals / globals / builtins chain is a simplification; there are also any number of intermediate scopes (between locals and globals) from which nonlocal variables may be used. Like optimized function globals, these don't use a dict lookup at all, they are determined by compile-time analysis.

--
--Guido van Rossum (python.org/~guido)
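Editorial sketch of the intermediate (nonlocal) scopes Guido mentions. The example uses only standard function attributes; `make_counter` and `increment` are illustrative names. It shows that a nonlocal variable is recorded on the code object at compile time and stored in a closure cell rather than in a namespace dict.

```python
def make_counter():
    count = 0  # lives in a cell shared with the inner function

    def increment():
        nonlocal count
        count += 1
        return count

    return increment


counter = make_counter()
counter()
counter()

# The compiler recorded which names are free variables of the inner function.
free_names = counter.__code__.co_freevars
# The current value is held in a cell object attached to the function.
cell_value = counter.__closure__[0].cell_contents
```

After the two calls, `free_names` is `('count',)` and `cell_value` is `2`; no dict keyed by `'count'` is consulted when the function runs.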
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
On 8 March 2012 12:52, Nick Coghlan ncogh...@gmail.com wrote:
> 2. it's already trivial to achieve such chained lookups in 3.3 by passing a collections.ChainMap instance as the globals parameter:
> http://docs.python.org/dev/library/collections#collections.ChainMap

Somewhat OT, but collections.ChainMap is really cool. I hadn't noticed it get added into 3.3, and as far as I can see, it's not in the What's New in 3.3 document. But it's little things like this that *really* make the difference for me in new versions.

So thanks to whoever added it, and could we have a whatsnew entry, please?

Paul.
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
On 07/03/2012 16:33, Mark Shannon wrote:
>>> It should also help with sandboxing, as it would make it easier to analyse and thus control access to builtins, since the execution context of all code would be easier to determine.
>>
>> pysandbox patches __builtins__ in:
>>
>> - the caller frame
>> - the interpreter state
>> - all modules
>>
>> It uses a read-only dict with only a subset of __builtins__.
>>
>> It is important for:
>>
>> - deny replacing a builtin function
>> - deny adding a new superglobal variable
>> - deny accessing a blocked function
>>
>> If a module or something else leaks the real builtins dict, it would be a vulnerability.
>>
>> pysandbox is able to temporarily replace __builtins__ everywhere and then restore the previous state.
>>
>> Can you please explain why/how pysandbox is too restrictive and how your proposition would make it more usable?
>
> Currently, it is impossible to allow one function access to sensitive functions like open(), while denying it to others, as any code can then get the builtins of another function via f.__globals__['__builtins__']. Separating builtins from globals could solve this.

For a sandbox, it's a feature, or maybe a requirement :-) It is a problem if a function with access to the trusted builtins dict is also accessible in the sandbox. I don't remember why it is a problem: pysandbox blocks access to the __globals__ attribute of functions.

Victor
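Editorial sketch of the leak Mark describes, using plain exec() rather than pysandbox (whose API is not shown in this thread). `trusted_helper` and `sandbox` are hypothetical names. Code run with an empty `__builtins__` can still reach the real builtins through the `__globals__` of any trusted function it is handed.

```python
def trusted_helper():
    return "ok"


# The "sandboxed" code gets no builtins at all, only the helper function.
sandbox = {"__builtins__": {}, "helper": trusted_helper}

# Attribute access needs no builtins, so this runs even in the empty sandbox.
exec("leaked = helper.__globals__['__builtins__']", sandbox)

leaked = sandbox["leaked"]
# __builtins__ may be the builtins module or its dict; normalise to a dict.
leaked_dict = leaked if isinstance(leaked, dict) else vars(leaked)
open_reachable = "open" in leaked_dict
```

This is why a language-level sandbox has to block or proxy attributes such as `__globals__`, as Victor notes pysandbox does.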
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
On Fri, Mar 9, 2012 at 3:31 AM, Guido van Rossum gu...@python.org wrote:
> But the __builtins__ that are actually used by any particular piece of code is *not* taken by importing builtins. It is taken from what the globals store under the key __builtins__.
>
> This is a feature that was added specifically for sandboxing purposes, but I believe it has found other uses too.

Agreed, but swapping out builtins for a different namespace is still the exception rather than the rule. My impression of Mark's proposal was that this approach would become the *preferred* way of doing things, and that's the part I don't like at a conceptual level.

>>> The key point is that every piece of code already inherits locals, globals and builtins from somewhere else. We can already control locals (by which parameters are passed in) and globals via exec, eval, __import__, and runpy (any others?) but we can't control builtins.
>>
>> Correct - because controlling builtins is the domain of sandboxes.
>
> Incorrect (unless I misunderstand the context) -- when you control the globals you control the __builtins__ set there.

And this is where I don't like the idea at a practical level. We already have a way to swap in a different set of builtins for a certain execution context (i.e. set __builtins__ in the global namespace) for a small chunk of code, as well as allowing collections.ChainMap to insert additional namespaces into the name lookup path.

This proposal suggests adding an additional mapping argument to every API that currently accepts a locals and/or globals mapping, thus achieving... well, nothing substantial, as far as I can tell (aside from a lot of pointless churn in a bunch of APIs, not all of which are under our direct control).

> In any case, the locals / globals / builtins chain is a simplification; there are also any number of intermediate scopes (between locals and globals) from which nonlocal variables may be used. Like optimized function globals, these don't use a dict lookup at all, they are determined by compile-time analysis.

Acknowledged, but code executed via the exec API with both locals and globals passed in is actually one of the few places where that lookup chain survives in its original form (module level class definitions being the other).

Now, rereading Mark's original message, a simpler proposal of having *function objects* do an early lookup of self.__globals__['__builtins__'] at creation time and caching that somewhere such that the frame objects can get hold of it (rather than having to do the lookup every time the function gets called or a builtin gets referenced) might be a nice micro-optimisation. It's the gratuitous API changes that I'm objecting to, not the underlying idea of binding the reference to the builtins namespace earlier in the function definition process. I'd even be OK with leaving the default builtins reference *out* of the globals namespace in favour of storing a hidden reference on the frame objects.

Cheers,
Nick.

--
Nick Coghlan | ncogh...@gmail.com | Brisbane, Australia
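Editorial sketch related to the early-binding idea above. It relies on a fact from outside this thread: CPython 3.10 and later expose a `__builtins__` attribute on function objects, initialised from the function's globals when the function is created. The sketch uses getattr() with a default so it also runs on older versions, where the attribute does not exist; `sample` is an illustrative name.

```python
def sample():
    return len("abc")


# On CPython 3.10+ this is the builtins dict captured at function creation;
# on older versions the attribute is absent and we get None.
cached = getattr(sample, "__builtins__", None)

# The traditional route: look the builtins up through the function's globals.
via_globals = sample.__globals__["__builtins__"]
# __builtins__ may be the builtins module or its dict; normalise to a dict.
via_globals = via_globals if isinstance(via_globals, dict) else vars(via_globals)

# Where the attribute exists, both routes lead to the same dict object.
same_namespace = cached is None or cached is via_globals
```

The public exec(), eval() and __import__() signatures are not involved in this: the reference is bound when the function object is built.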
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
On Thu, Mar 8, 2012 at 4:33 PM, Nick Coghlan ncogh...@gmail.com wrote:
> On Fri, Mar 9, 2012 at 3:31 AM, Guido van Rossum gu...@python.org wrote:
>> But the __builtins__ that are actually used by any particular piece of code is *not* taken by importing builtins. It is taken from what the globals store under the key __builtins__.
>>
>> This is a feature that was added specifically for sandboxing purposes, but I believe it has found other uses too.
>
> Agreed, but swapping out builtins for a different namespace is still the exception rather than the rule. My impression of Mark's proposal was that this approach would become the *preferred* way of doing things, and that's the part I don't like at a conceptual level.
>
>>>> The key point is that every piece of code already inherits locals, globals and builtins from somewhere else. We can already control locals (by which parameters are passed in) and globals via exec, eval, __import__, and runpy (any others?) but we can't control builtins.
>>>
>>> Correct - because controlling builtins is the domain of sandboxes.
>>
>> Incorrect (unless I misunderstand the context) -- when you control the globals you control the __builtins__ set there.
>
> And this is where I don't like the idea at a practical level. We already have a way to swap in a different set of builtins for a certain execution context (i.e. set __builtins__ in the global namespace) for a small chunk of code, as well as allowing collections.ChainMap to insert additional namespaces into the name lookup path.
>
> This proposal suggests adding an additional mapping argument to every API that currently accepts a locals and/or globals mapping, thus achieving... well, nothing substantial, as far as I can tell (aside from a lot of pointless churn in a bunch of APIs, not all of which are under our direct control).
>
>> In any case, the locals / globals / builtins chain is a simplification; there are also any number of intermediate scopes (between locals and globals) from which nonlocal variables may be used. Like optimized function globals, these don't use a dict lookup at all, they are determined by compile-time analysis.
>
> Acknowledged, but code executed via the exec API with both locals and globals passed in is actually one of the few places where that lookup chain survives in its original form (module level class definitions being the other).
>
> Now, rereading Mark's original message, a simpler proposal of having *function objects* do an early lookup of self.__globals__['__builtins__'] at creation time and caching that somewhere such that the frame objects can get hold of it (rather than having to do the lookup every time the function gets called or a builtin gets referenced) might be a nice micro-optimisation. It's the gratuitous API changes that I'm objecting to, not the underlying idea of binding the reference to the builtins namespace earlier in the function definition process. I'd even be OK with leaving the default builtins reference *out* of the globals namespace in favour of storing a hidden reference on the frame objects.

Agreed on the gratuitous API changes. I'd like to hear Mark's response.

--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().
2012/3/7 Mark Shannon m...@hotpy.org:
> Currently, it is impossible to allow one function access to sensitive functions like open(), while denying it to others, as any code can then get the builtins of another function via f.__globals__['__builtins__']. Separating builtins from globals could solve this.

I like this idea. We could finally kill __builtins__, too, which has often been confusing for people.

--
Regards,
Benjamin