Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-09 Thread Mark Shannon

Guido van Rossum wrote:

On Thu, Mar 8, 2012 at 4:33 PM, Nick Coghlan ncogh...@gmail.com wrote:

On Fri, Mar 9, 2012 at 3:31 AM, Guido van Rossum gu...@python.org wrote:

But the __builtins__ that are actually used by any particular piece of
code is *not* taken by importing builtins. It is taken from what the
globals store under the key __builtins__.

This is a feature that was added specifically for sandboxing purposes,
but I believe it has found other uses too.

Agreed, but swapping out builtins for a different namespace is still
the exception rather than the rule. My impression of Mark's proposal
was that this approach would become the *preferred* way of doing
things, and that's the part I don't like at a conceptual level.


The key point is that every piece of code already inherits locals, globals
and builtins from somewhere else.
We can already control locals (by which parameters are passed in) and
globals via exec, eval, __import__, and runpy (any others?)
but we can't control builtins.

Correct - because controlling builtins is the domain of sandboxes.

Incorrect (unless I misunderstand the context) -- when you control the
globals you control the __builtins__ set there.

And this is where I don't like the idea at a practical level. We
already have a way to swap in a different set of builtins for a
certain execution context (i.e. set __builtins__ in the global
namespace) for a small chunk of code, as well as allowing
collections.ChainMap to insert additional namespaces into the name
lookup path.

This proposal suggests adding an additional mapping argument to every
API that currently accepts a locals and/or globals mapping, thus
achieving... well, nothing substantial, as far as I can tell (aside
from a lot of pointless churn in a bunch of APIs, not all of which are
under our direct control).


In any case, the locals / globals / builtins chain is a
simplification; there are also any number of intermediate scopes
(between locals and globals) from which nonlocal variables may be
used. Like optimized function globals, these don't use a dict lookup
at all, they are determined by compile-time analysis.

Acknowledged, but code executed via the exec API with both locals and
globals passed in is actually one of the few places where that lookup
chain survives in its original form (module level class definitions
being the other).

Now, rereading Mark's original message, a simpler proposal of having
*function objects* do an early lookup of
self.__globals__['__builtins__'] at creation time and caching that
somewhere such that the frame objects can get hold of it (rather than
having to do the lookup every time the function gets called or a
builtin gets referenced) might be a nice micro-optimisation. It's the
gratuitous API changes that I'm objecting to, not the underlying idea
of binding the reference to the builtins namespace earlier in the
function definition process. I'd even be OK with leaving the default
builtins reference *out* of the globals namespace in favour of storing
a hidden reference on the frame objects.


Agreed on the gratuitous API changes. I'd like to hear Mark's response.


C API or Python API?

The Python API would be changed, but in a backwards compatible way.
exec, eval and __import__ would all gain an optional (keyword-only?)
builtins parameter.

I see no reason to change any of the C API functions.
New functions taking an extra parameter could be added,
but it wouldn't be a requirement.

Cheers,
Mark




___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-09 Thread Nick Coghlan
On Fri, Mar 9, 2012 at 6:19 PM, Mark Shannon m...@hotpy.org wrote:
 The Python API would be changed, but in a backwards compatible way.
 exec, eval and __import__ would all gain an optional (keyword-only?)
 builtins parameter.

No, some APIs effectively define *protocols*. For such APIs, *adding*
parameters is almost of comparable import to taking them away, because
they require that other APIs modelled on the prototype also change. In
this case, not only exec() has to change, but eval, __import__,
probably runpy, function creation, eventually any third party APIs for
code execution, etc, etc.

Adding a new parameter to exec is a change with serious implications,
and utterly unnecessary, since the API part is already covered by
setting __builtins__ in the passed in globals namespace (which is
appropriately awkward to advise people that they're doing something
strange with potentially unintended consequences or surprising
limitations).

That said, binding a reference to the builtin *early* (for example, at
function definition time or when a new invocation of the eval loop
first fires up) may be a reasonable idea, but you don't have to change
the user facing API to explore that option - it works just as well
with __builtins__ as an optional value in the existing global
namespace.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-09 Thread Mark Shannon

Nick Coghlan wrote:

On Fri, Mar 9, 2012 at 6:19 PM, Mark Shannon m...@hotpy.org wrote:

The Python API would be changed, but in a backwards compatible way.
exec, eval and __import__ would all gain an optional (keyword-only?)
builtins parameter.


No, some APIs effectively define *protocols*. For such APIs, *adding*
parameters is almost of comparable import to taking them away, because
they require that other APIs modelled on the prototype also change. In
this case, not only exec() has to change, but eval, __import__,
probably runpy, function creation, eventually any third party APIs for
code execution, etc, etc.

Adding a new parameter to exec is a change with serious implications,
and utterly unnecessary, since the API part is already covered by
setting __builtins__ in the passed in globals namespace (which is
appropriately awkward to advise people that they're doing something
strange with potentially unintended consequences or surprising
limitations).


It is the implementation that interests me.
Implementing the (locals, globals, builtins) triple as a single object
has advantages both in terms of internal consistency and efficiency.

I just thought to expose this to the user.
I am now persuaded that I don't want to expose anything :)



That said, binding a reference to the builtin *early* (for example, at
function definition time or when a new invocation of the eval loop
first fires up) may be a reasonable idea, but you don't have to change
the user facing API to explore that option - it works just as well
with __builtins__ as an optional value in the existing global
namespace.


OK. So, how about this:
(builtins refers to the dict used for variable lookup, not the module)

New eval pseudocode:

eval(code, globals, locals):
    triple = (locals, globals, globals['__builtins__'])
    return eval_internal(code, triple)

Similarly for exec, __import__ and runpy.

That way the (IMO clumsy) builtins = globals['__builtins__']
only happens at a few known locations.
It should then be clear where all code gets its namespaces from.
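
As an illustration only, the triple idea above might be sketched in pure Python as follows. The names Namespaces, make_namespaces and lookup are hypothetical and do not exist in CPython; the real change would live in the C implementation, so this is a sketch under those assumptions rather than the proposed code.

```python
import builtins
import types
from typing import Any, Mapping, NamedTuple


class Namespaces(NamedTuple):
    """Hypothetical (locals, globals, builtins) triple; not a CPython type."""
    locals: Mapping[str, Any]
    globals: Mapping[str, Any]
    builtins: Mapping[str, Any]


def make_namespaces(globals_ns, locals_ns=None):
    """Resolve builtins once, at the point where namespaces are introduced."""
    if locals_ns is None:
        locals_ns = globals_ns  # module-level convention: locals == globals
    found = globals_ns.get("__builtins__", builtins.__dict__)
    if isinstance(found, types.ModuleType):
        found = vars(found)  # __builtins__ may be the module or its dict
    return Namespaces(locals_ns, globals_ns, found)


def lookup(name, ns):
    """Search locals, then globals, then builtins, in that order."""
    for scope in ns:
        if name in scope:
            return scope[name]
    raise NameError(name)
```

Because the tuple is immutable, handing it to another frame would only require taking a reference, which is the property the proposal relies on.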

Namespaces should be inherited as follows:

frame:
function scope: globals and builtins from function, locals from parameters.
module scope: globals and builtins from module, locals == globals.
in eval, exec, or runpy: all explicit.

function: globals and builtins from module (no locals)

module:  globals and builtins from import (no locals)

import: explicitly from __import__() or
implicitly from current frame in an import statement.

For frame and function, free and cell (nonlocal) variables would be 
unchanged.


On entry the namespaces will be {}, {}, sys.modules['builtins'].__dict__

This is pretty much what happens anyway,
except that where code gets its builtins from is now well defined.

Cheers,
Mark.


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-09 Thread Victor Stinner
 The reason I am proposing this here rather than on python-ideas is that
 treating the triple of [locals, globals, builtins] as a single
 execution context can be implemented in a really nice way.

 Internally, the execution context of [locals, globals, builtins]
 can be treated as a single immutable object (custom object or tuple).
 Treating it as immutable means that it can be copied merely by taking a
 reference. A nice trick in the implementation is to make a NULL locals
 mean fast locals for function contexts. Frames could then acquire their
 globals and builtins by a single reference copy from the function object,
 rather than searching globals for a '__builtins__'
 to find the builtins.

Creating a new frame looks up __builtins__ in globals only if
globals of the new frame is different from the globals of the previous
frame. You would like to optimize this case? If globals is unchanged,
Python just increments the reference counter.

When globals is different from the previous frame? When you call a
function from a different module maybe?

Do you have an idea of the speedup of your optimization?

Victor


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-09 Thread Mark Shannon

Victor Stinner wrote:

The reason I am proposing this here rather than on python-ideas is that
treating the triple of [locals, globals, builtins] as a single
execution context can be implemented in a really nice way.

Internally, the execution context of [locals, globals, builtins]
can be treated as a single immutable object (custom object or tuple).
Treating it as immutable means that it can be copied merely by taking a
reference. A nice trick in the implementation is to make a NULL locals
mean fast locals for function contexts. Frames could then acquire their
globals and builtins by a single reference copy from the function object,
rather than searching globals for a '__builtins__'
to find the builtins.


Creating a new frame looks up __builtins__ in globals only if
globals of the new frame is different from the globals of the previous
frame. You would like to optimize this case? If globals is unchanged,
Python just increments the reference counter.


I'm more interested in simplifying the code than performance.
With this proposed approach, there is no need to test where the globals 
come from, or what the builtins are; just incref the namespace triple.




When globals is different from the previous frame? When you call a
function from a different module maybe?

Do you have an idea of the speedup of your optimization?


No. But it won't be slower.

Cheers,
Mark


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-09 Thread Mark Lawrence

On 09/03/2012 12:57, Mark Shannon wrote:


No. But it won't be slower.

Cheers,
Mark


Please prove it, you have to convince a number of core developers 
including, but not limited to, the BDFL :).


--
Cheers.

Mark Lawrence.



Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-08 Thread Mark Shannon

Jim J. Jewett wrote:


http://mail.python.org/pipermail/python-dev/2012-March/117395.html
Brett Cannon posted:

[in reply to Mark Shannon's suggestion of adding a builtins parameter
to match locals and globals]


It's a mess right now to try to grab the __import__()
implementation and this would actually help clarify import semantics by
saying that __import__() for any chained imports comes from __import__()'s
locals, globals, or builtins arguments (in that order) or from the builtins
module itself (i.e. tstate->builtins).


How does that differ from today?


The idea is that you can change, presumably restrict, the builtins
separately from the globals for an import.



If you're saying that the locals and (module-level) globals aren't
always checked in order, then that is a semantic change.  Probably
a good change, but still a change -- and it can be made independently
of Mark's suggestion.

Also note that I would assume this was for sandboxing, 


Actually, I just think it's a cleaner implementation,
but sandboxing is a good excuse :)

 and that

missing names should *not* fall back to the real globals, although
I would understand if bootstrapping required the import statement to
get special treatment.


(Note that I like Mark's proposed change; I just don't see how it
cleans up import.)


I don't think it cleans up import, but I'll defer to Brett on that.
I've included __import__() along with exec and eval as it is a place 
where new namespaces can be introduced into an execution.

There may be others I haven't thought of.

Cheers,
Mark.


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-08 Thread Nick Coghlan
On Thu, Mar 8, 2012 at 10:06 PM, Mark Shannon m...@hotpy.org wrote:
 I don't think it cleans up import, but I'll defer to Brett on that.
 I've included __import__() along with exec and eval as it is a place where
 new namespaces can be introduced into an execution.
 There may be others I haven't thought of.

runpy is another one.

However, the problem I see with builtins as a separate argument is
that it would be a lie.

The element that's most interesting about locals vs globals vs
builtins is the scope of visibility of their contents.

When I call out to another function in the same module, locals are not
shared, but globals and builtins are.

When I call out to code in a *different* module, neither locals nor
globals are shared, but builtins are still common.

So there are two ways this purported extra builtins parameter could work:

1. Sandboxing - you try to genuinely give the execution context a
different set of builtins that's shared by all code executed, even
imports from other modules.  However, I assume this isn't what you
meant, since it is the domain of sandboxing utilities like Victor's
pysandbox and is known to be incredibly difficult to get right (hence
the demise of both rexec and Bastion and recent comments about known
segfault vulnerabilities that are tolerable in the normal case of
merely processing untrusted data with trusted code but anathema to a
robust CPython native sandboxing scheme that can still cope even when
the code itself is untrusted).

2. chained globals - just an extra namespace that's chained behind the
globals dictionary for name lookup, not actually shared with code
invoked from other modules.

The second approach is potentially useful, but:

1. builtins is *not* the right name for it (because any other code
invoked will still be using the original builtins)
2. it's already trivial to achieve such chained lookups in 3.3 by
passing a collections.ChainMap instance as the globals parameter:
http://docs.python.org/dev/library/collections#collections.ChainMap

collections.ChainMap also has the virtue of working with any current
API that accepts a globals argument and can be extended to an
arbitrary level of chaining, whereas this suggestion requires that all
such APIs be expanded to accept a third parameter, and could still
only chain lookups one additional step in doing so.
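
For illustration, a minimal sketch of a chained lookup with collections.ChainMap. One caveat, offered as an observation about CPython rather than something stated in this thread: eval() and exec() require the globals argument to be a real dict, so the sketch passes the ChainMap as the locals mapping instead. The variable names are made up for the example.

```python
from collections import ChainMap

# Two ordinary dicts chained into one lookup path; earlier maps win.
overrides = {"rate": 0.5}
defaults = {"rate": 0.1, "amount": 200}
names = ChainMap(overrides, defaults)

# The globals argument stays a plain dict (with an empty builtins
# mapping), and the ChainMap is supplied as the locals mapping.
result = eval("amount * rate", {"__builtins__": {}}, names)

# A further level of chaining needs no API change, only another map.
deeper = names.new_child({"amount": 10})
result_deeper = eval("amount * rate", {"__builtins__": {}}, deeper)
```

Here "amount" is found in defaults and "rate" in overrides, and new_child() shows how the chain can be extended by an arbitrary number of levels.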

So a big -1 from me.

Cheers,
Nick.

P.S. I've referenced this talk before, but Tim Dawborn's effort from
PyCon AU last year about the sandboxing setup for
http://www.ncss.edu.au/ should be required viewing for anyone wanting
to understand the kind of effort it takes to fairly comprehensively
protect host servers from attacks when executing arbitrary untrusted
Python code on CPython. Implementing such protection is certainly
*possible* (since Tim's talk is all about one way to do it), but it's
not easy, and Tim's approach uses Linux OS level sandboxing rather
than relying on a Python language level sandbox. This was
largely due to a university requirement that the sandbox solution be
language agnostic, but it also serves to protect the sandbox from the
documented attacks against the CPython interpreter. Tim reviews a few
interesting attempts to break the sandbox around the 5 minute mark in
https://www.youtube.com/watch?v=y-WPPdhTKBU. (I did suggest he grab
our test_crashers directory to see what happened when they were run in
the sandbox, but I doubt it would be much more interesting than merely
calling sys.exit())

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-08 Thread Mark Shannon

Nick Coghlan wrote:

On Thu, Mar 8, 2012 at 10:06 PM, Mark Shannon m...@hotpy.org wrote:

I don't think it cleans up import, but I'll defer to Brett on that.
I've included __import__() along with exec and eval as it is a place where
new namespaces can be introduced into an execution.
There may be others I haven't thought of.


runpy is another one.


Add that to the list.


However, the problem I see with builtins as a separate argument is
that it would be a lie.

The element that's most interesting about locals vs globals vs
builtins is the scope of visibility of their contents.

When I call out to another function in the same module, locals are not
shared, but globals and builtins are.

When I call out to code in a *different* module, neither locals nor
globals are shared, but builtins are still common.


Not necessarily. All functions in a module will inherit their globals 
*and* builtins from the module, which gets them from __import__().




So there are two ways this purported extra builtins parameter could work:

1. Sandboxing - you try to genuinely give the execution context a
different set of builtins that's shared by all code executed, even
imports from other modules.  


Victor's pysandbox seems pretty good to me, I had a go at breaking it
and failed, but it is too restrictive.

Rather than make pysandbox more secure, I think my proposal could make
it more usable, as clearer guarantees about access and visibility can be
provided to the sandbox developer.
You shouldn't need to cripple introspection in order to limit access to 
the builtins.



However, I assume this isn't what you
meant, since it is the domain of sandboxing utilities like Victor's
pysandbox and is known to be incredibly difficult to get right (hence
the demise of both rexec and Bastion and recent comments about known
segfault vulnerabilities that are tolerable in the normal case of
merely processing untrusted data with trusted code but anathema to a
robust CPython native sandboxing scheme that can still cope even when
the code itself is untrusted).


Changing the implementation to be based around immutable execution 
contexts means that the compiler will enforce things for us.

Static typing has its advantages, occasionally :)

As I stated elsewhere, the crashers can be fixed. I think Victor has 
already fixed a couple.




2. chained globals - just an extra namespace that's chained behind the
globals dictionary for name lookup, not actually shared with code
invoked from other modules.


That's exactly what builtins already are. They are a fall back for 
LOAD_GLOBAL and similar when something isn't found in the globals.




The second approach is potentially useful, but:

1. builtins is *not* the right name for it (because any other code
invoked will still be using the original builtins)


Other code will use whatever builtins they were given at __import__.

The key point is that every piece of code already inherits locals, 
globals and builtins from somewhere else.

We can already control locals (by which parameters are passed in) and
globals via exec, eval, __import__, and runpy (any others?)
but we can't control builtins.


One last point is that this is a low-impact change. All code using eval, 
etc. will continue to work as before.

It also may speed things up a little.

Cheers,
Mark.



Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-08 Thread Nick Coghlan
On Thu, Mar 8, 2012 at 11:40 PM, Mark Shannon m...@hotpy.org wrote:
 Other code will use whatever builtins they were given at __import__.

Then they're not builtins - they're module-specific chained globals.
The thing that makes the builtins special is *who else* can see them
(i.e. all the other code in the process). If you replace
builtins.open, you replace it for everyone (that hasn't either
shadowed it or cached a reference to the original).
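
A small sketch of that process-wide visibility, assuming default builtins everywhere. The module names and the name shared_helper are hypothetical, chosen only for the demonstration.

```python
import builtins
import types


def make_module(name, source):
    """Build a throwaway module object and run source in its namespace."""
    mod = types.ModuleType(name)
    exec(source, mod.__dict__)
    return mod


# Two unrelated modules that both refer to a name they never define.
mod_a = make_module("mod_a", "def f():\n    return shared_helper()")
mod_b = make_module("mod_b", "def g():\n    return shared_helper()")

# Adding the name to the builtins module makes it visible to both.
builtins.shared_helper = lambda: "from builtins"
try:
    results = (mod_a.f(), mod_b.g())
finally:
    del builtins.shared_helper  # undo the process-wide change
```

Neither module shares globals with the other, yet both see the new name, which is the sense in which builtins are common to all code in the process.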

 The key point is that every piece of code already inherits locals, globals
 and builtins from somewhere else.
 We can already control locals (by which parameters are passed in) and
 globals via exec, eval, __import__, and runpy (any others?)
 but we can't control builtins.

Correct - because controlling builtins is the domain of sandboxes.

 One last point is that this is a low-impact change. All code using eval,
 etc. will continue to work as before.
 It also may speed things up a little.

Passing in a ChainMap instance as the globals when you want to include
an additional namespace in the lookup chain is even lower impact.

A reference implementation and concrete use cases might change my
mind, but for now, I'm just seeing a horrendously complicated approach
with huge implications for the runtime data model semantics for
something that 3.3 already supports in a much simpler fashion.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-08 Thread Guido van Rossum
On Thu, Mar 8, 2012 at 5:57 AM, Nick Coghlan ncogh...@gmail.com wrote:
 On Thu, Mar 8, 2012 at 11:40 PM, Mark Shannon m...@hotpy.org wrote:
 Other code will use whatever builtins they were given at __import__.

 Then they're not builtins - they're module-specific chained globals.
 The thing that makes the builtins special is *who else* can see them
 (i.e. all the other code in the process). If you replace
 builtins.open, you replace it for everyone (that hasn't either
 shadowed it or cached a reference to the original).

Looks like you two are talking about different things. There is only
one 'builtins' *module*.

But the __builtins__ that are actually used by any particular piece of
code is *not* taken by importing builtins. It is taken from what the
globals store under the key __builtins__.

This is a feature that was added specifically for sandboxing purposes,
but I believe it has found other uses too.
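
To make this concrete, a minimal sketch of code picking up its builtins from the __builtins__ key of the globals it runs in. The dict contents are arbitrary, and a trimmed-down builtins mapping like this is not a security boundary on its own.

```python
source = "result = len(items)"

# Globals that carry their own, deliberately tiny, builtins mapping.
restricted = {"__builtins__": {"len": len}, "items": [1, 2, 3]}
exec(source, restricted)

# A name missing from that mapping is simply not defined for this code,
# even though the real builtins module still has it.
try:
    exec("open('/dev/null')", {"__builtins__": {"len": len}})
except NameError as exc:
    blocked = str(exc)
```

After this runs, restricted["result"] holds 3, and the second exec() fails with a NameError for open.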

 The key point is that every piece of code already inherits locals, globals
 and builtins from somewhere else.
 We can already control locals (by which parameters are passed in) and
 globals via exec, eval, __import__, and runpy (any others?)
 but we can't control builtins.

 Correct - because controlling builtins is the domain of sandboxes.

Incorrect (unless I misunderstand the context) -- when you control the
globals you control the __builtins__ set there.

 One last point is that this is a low-impact change. All code using eval,
 etc. will continue to work as before.
 It also may speed things up a little.

 Passing in a ChainMap instance as the globals when you want to include
 an additional namespace in the lookup chain is even lower impact.

 A reference implementation and concrete use cases might change my
 mind, but for now, I'm just seeing a horrendously complicated approach
 with huge implications for the runtime data model semantics for
 something that 3.3 already supports in a much simpler fashion.

I can't say I'm completely following the discussion. It's not clear
whether what I just explained was already implicit in the conversation
or is new information.

In any case, the locals / globals / builtins chain is a
simplification; there are also any number of intermediate scopes
(between locals and globals) from which nonlocal variables may be
used. Like optimized function globals, these don't use a dict lookup
at all, they are determined by compile-time analysis.
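
A short sketch of such an intermediate scope, using only standard introspection attributes. The function names are illustrative.

```python
def make_counter():
    count = 0

    def bump():
        nonlocal count  # resolved at compile time to a closure cell
        count += 1
        return count

    return bump


bump = make_counter()
first = bump()

# The compiler recorded "count" as a cell variable of the outer function
# and as a free variable of the inner one; no dict is consulted for it.
cell_names = make_counter.__code__.co_cellvars
free_names = bump.__code__.co_freevars
cell_value = bump.__closure__[0].cell_contents
```

The value lives in a cell object attached to the function, not in any locals, globals or builtins mapping.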

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-08 Thread Paul Moore
On 8 March 2012 12:52, Nick Coghlan ncogh...@gmail.com wrote:
 2. it's already trivial to achieve such chained lookups in 3.3 by
 passing a collections.ChainMap instance as the globals parameter:
 http://docs.python.org/dev/library/collections#collections.ChainMap

Somewhat OT, but collections.ChainMap is really cool. I hadn't noticed
it get added into 3.3, and as far as I can see, it's not in the
What's New in 3.3 document. But it's little things like this that
*really* make the difference for me in new versions.

So thanks to whoever added it, and could we have a whatsnew entry, please?

Paul.


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-08 Thread Victor Stinner

On 07/03/2012 16:33, Mark Shannon wrote:

It should also help with sandboxing, as it would make it easier to
analyse and thus control access to builtins, since the execution context
of all code would be easier to determine.


pysandbox patches __builtins__ in:

 - the caller frame
 - the interpreter state
 - all modules

It uses a read-only dict with only a subset of __builtins__. It is 
important for:


 - deny replacing a builtin function
 - deny adding a new superglobal variable
 - deny accessing a blocked function

If a module or something else leaks the real builtins dict, it would be 
a vulnerability.


pysandbox is able to temporarily replace __builtins__ everywhere and then 
restore the previous state.


Can you please explain why/how pysandbox is too restrictive and how your 
proposition would make it more usable?



Currently, it is impossible to allow one function access to sensitive
functions like open(), while denying it to others, as any code can then
get the builtins of another function via f.__globals__['__builtins__'].
Separating builtins from globals could solve this.


For a sandbox, it's a feature, or maybe a requirement :-)

It is a problem if a function with access to the trusted builtins dict is 
also accessible in the sandbox. I don't remember why it is a problem: 
pysandbox blocks access to the __globals__ attribute of functions.
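
For readers unfamiliar with the leak being discussed, a minimal sketch in plain CPython without pysandbox. The function name trusted is hypothetical.

```python
def trusted():
    """Stands in for a function that runs with the full builtins."""
    return None


# Any code holding a reference to the function can walk to its globals
# and from there to whatever builtins that function uses.
reachable = trusted.__globals__["__builtins__"]

# Depending on context, __builtins__ is the builtins module or its dict.
builtins_ns = reachable if isinstance(reachable, dict) else vars(reachable)
leaked_open = builtins_ns["open"]
```

This is why blocking access to the __globals__ attribute matters for a language-level sandbox.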


Victor


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-08 Thread Nick Coghlan
On Fri, Mar 9, 2012 at 3:31 AM, Guido van Rossum gu...@python.org wrote:
 But the __builtins__ that are actually used by any particular piece of
 code is *not* taken by importing builtins. It is taken from what the
 globals store under the key __builtins__.

 This is a feature that was added specifically for sandboxing purposes,
 but I believe it has found other uses too.

Agreed, but swapping out builtins for a different namespace is still
the exception rather than the rule. My impression of Mark's proposal
was that this approach would become the *preferred* way of doing
things, and that's the part I don't like at a conceptual level.

 The key point is that every piece of code already inherits locals, globals
 and builtins from somewhere else.
 We can already control locals (by which parameters are passed in) and
 globals via exec, eval, __import__, and runpy (any others?)
 but we can't control builtins.

 Correct - because controlling builtins is the domain of sandboxes.

 Incorrect (unless I misunderstand the context) -- when you control the
 globals you control the __builtins__ set there.

And this is where I don't like the idea at a practical level. We
already have a way to swap in a different set of builtins for a
certain execution context (i.e. set __builtins__ in the global
namespace) for a small chunk of code, as well as allowing
collections.ChainMap to insert additional namespaces into the name
lookup path.

This proposal suggests adding an additional mapping argument to every
API that currently accepts a locals and/or globals mapping, thus
achieving... well, nothing substantial, as far as I can tell (aside
from a lot of pointless churn in a bunch of APIs, not all of which are
under our direct control).

 In any case, the locals / globals / builtins chain is a
 simplification; there are also any number of intermediate scopes
 (between locals and globals) from which nonlocal variables may be
 used. Like optimized function globals, these don't use a dict lookup
 at all, they are determined by compile-time analysis.

Acknowledged, but code executed via the exec API with both locals and
globals passed in is actually one of the few places where that lookup
chain survives in its original form (module level class definitions
being the other).

Now, rereading Mark's original message, a simpler proposal of having
*function objects* do an early lookup of
self.__globals__['__builtins__'] at creation time and caching that
somewhere such that the frame objects can get hold of it (rather than
having to do the lookup every time the function gets called or a
builtin gets referenced) might be a nice micro-optimisation. It's the
gratuitous API changes that I'm objecting to, not the underlying idea
of binding the reference to the builtins namespace earlier in the
function definition process. I'd even be OK with leaving the default
builtins reference *out* of the globals namespace in favour of storing
a hidden reference on the frame objects.
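
A sketch of the existing behaviour that this early-binding idea builds on: a function object takes its builtins from the __builtins__ entry of its own globals. It assumes CPython, and the names template and stub_globals are illustrative only.

```python
import types


def template():
    return len("abc")


# Globals whose __builtins__ entry supplies a stand-in for len().
stub_globals = {"__builtins__": {"len": lambda obj: -1}}

# Same code object, different globals, and therefore different builtins.
rebound = types.FunctionType(template.__code__, stub_globals)

normal_result = template()
rebound_result = rebound()
```

Whether that lookup happens when the function is created or each time it is called is an implementation detail, which is why it can be changed without touching the user-facing API.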

Cheers,
Nick.

-- 
Nick Coghlan   |   ncogh...@gmail.com   |   Brisbane, Australia


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-08 Thread Guido van Rossum
On Thu, Mar 8, 2012 at 4:33 PM, Nick Coghlan ncogh...@gmail.com wrote:
 On Fri, Mar 9, 2012 at 3:31 AM, Guido van Rossum gu...@python.org wrote:
 But the __builtins__ that are actually used by any particular piece of
 code is *not* taken by importing builtins. It is taken from what the
 globals store under the key __builtins__.

 This is a feature that was added specifically for sandboxing purposes,
 but I believe it has found other uses too.

 Agreed, but swapping out builtins for a different namespace is still
 the exception rather than the rule. My impression of Mark's proposal
 was that this approach would become the *preferred* way of doing
 things, and that's the part I don't like at a conceptual level.

 The key point is that every piece of code already inherits locals, globals
 and builtins from somewhere else.
 We can already control locals (by which parameters are passed in) and
 globals via exec, eval, __import__, and runpy (any others?)
 but we can't control builtins.

 Correct - because controlling builtins is the domain of sandboxes.

 Incorrect (unless I misunderstand the context) -- when you control the
 globals you control the __builtins__ set there.

 And this is where I don't like the idea at a practical level. We
 already have a way to swap in a different set of builtins for a
 certain execution context (i.e. set __builtins__ in the global
 namespace) for a small chunk of code, as well as allowing
 collections.ChainMap to insert additional namespaces into the name
 lookup path.

 This proposal suggests adding an additional mapping argument to every
 API that currently accepts a locals and/or globals mapping, thus
 achieving... well, nothing substantial, as far as I can tell (aside
 from a lot of pointless churn in a bunch of APIs, not all of which are
 under our direct control).

 In any case, the locals / globals / builtins chain is a
 simplification; there are also any number of intermediate scopes
 (between locals and globals) from which nonlocal variables may be
 used. Like optimized function globals, these don't use a dict lookup
 at all, they are determined by compile-time analysis.

 Acknowledged, but code executed via the exec API with both locals and
 globals passed in is actually one of the few places where that lookup
 chain survives in its original form (module level class definitions
 being the other).

 Now, rereading Mark's original message, a simpler proposal of having
 *function objects* do an early lookup of
 self.__globals__['__builtins__'] at creation time and caching that
 somewhere such that the frame objects can get hold of it (rather than
 having to do the lookup every time the function gets called or a
 builtin gets referenced) might be a nice micro-optimisation. It's the
 gratuitous API changes that I'm objecting to, not the underlying idea
 of binding the reference to the builtins namespace earlier in the
 function definition process. I'd even be OK with leaving the default
 builtins reference *out* of the globals namespace in favour of storing
 a hidden reference on the frame objects.

Agreed on the gratuitous API changes. I'd like to hear Mark's response.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] Adding a builtins parameter to eval(), exec() and __import__().

2012-03-07 Thread Benjamin Peterson
2012/3/7 Mark Shannon m...@hotpy.org:
 Currently, it is impossible to allow one function access to sensitive
 functions like open(), while denying it to others, as any code can then
 get the builtins of another function via f.__globals__['__builtins__'].
 Separating builtins from globals could solve this.
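
A minimal sketch of the leak described above, with helper standing in for any function that other code can reach (the name is illustrative). Depending on the module, the __builtins__ entry is either the builtins module or its dict, so both cases are handled.

```python
import types


def helper():
    """Stands in for any function that other code can get hold of."""
    return None


# Any code holding a reference to `helper` can reach its globals...
builtins_ns = helper.__globals__["__builtins__"]

# ...where __builtins__ is either the builtins module or its dict.
if isinstance(builtins_ns, types.ModuleType):
    builtins_ns = vars(builtins_ns)

# ...and from there recover sensitive functions such as open().
recovered_open = builtins_ns["open"]
```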

I like this idea. We could finally kill __builtins__, too, which has
often been confusing for people.


-- 
Regards,
Benjamin