[This message is cc:d to the e-lang list, but please take any replies to
[EMAIL PROTECTED]]

Brett Cannon wrote:
> On 7/19/06, Ka-Ping Yee <[EMAIL PROTECTED]> wrote:
>
>> OMG!!! Is all i can say at the moment. Very excited.

This is very encouraging. Thanks to ?!ng, Michael Chermside and others for
making the case for capabilities.

> Also realize that I am using object-capabilities to secure the interpreter,
> not objects. That will be enough of a challenge to do for now. Who knows,
> maybe some day Python can support object-capabilities at the object level,
> but for now I am just trying to isolate and protect individual interpreters
> in the same process.

I think that the alternative of providing object-granularity protection
domains straight away is more practical than you suggest, and I'd like to at
least make sure that this possibility has been thoroughly explored.

Below is a first-cut proposal for enforcing namespace restrictions, i.e.
support for non-public attributes and methods, on Python objects and modules.
It is not sufficient by itself to provide capability security, but it could
be the basis for doing that at object granularity.

(Note that this proposal would only affect sandboxed/restricted interpreters,
at least for the time being. The encapsulation it provides is also useful for
reasons other than security, and I think there is nothing about it that would
be unreasonable to apply to an unrestricted interpreter, but for
compatibility, that would have to be enabled by a __future__ option or
similar.)


Internal namespace proposal
===========================

Existing Python code tends to use a convention where the names of attributes
and methods intended only for internal use are prefixed by '_'. This
convention comes from PEP 8 <http://www.python.org/dev/peps/pep-0008/>, which
says:

# In addition, the following special forms using leading or trailing
# underscores are recognized (these can generally be combined with any case
# convention):
#
# - _single_leading_underscore: weak "internal use" indicator. E.g.
"from M # import *" does not import objects whose name starts with an underscore. # # - single_trailing_underscore_: used by convention to avoid conflicts with # Python keyword, e.g. # # Tkinter.Toplevel(master, class_='ClassName') # # - __double_leading_underscore: when naming a class attribute, invokes name # mangling (inside class FooBar, __boo becomes _FooBar__boo; see below). # # - __double_leading_and_trailing_underscore__: "magic" objects or # attributes that live in user-controlled namespaces. E.g. __init__, # __import__ or __file__. Never invent such names; only use them # as documented. I propose that the "internal" status of names beginning with _ (including those beginning with __) should be enforced in restricted interpreters. This is better than introducing a new annotation, because it will do the right thing for existing code that follows this part of PEP 8. More precisely: A restricted interpreter refuses access to any object attribute or method with a name beginning with '_' (by throwing a new exception type 'InternalAccessException'), unless the access is from a method and its static target is that method's first argument variable. Also, a restricted interpreter refuses access to any module-global variable or module-global function with a name beginning with '_' (by throwing 'InternalAccessException'), unless the access is statically from the same module. (A method's first argument is usually called 'self', but that's just a convention. By "static target", I mean that to access an internal attribute _foo in a method with first argument 'self', you must write "self._foo"; attempting to access "x._foo" will fail even if 'x' happens to be the same object as 'self'. This allows such accesses to be reported at compile-time, rather than only at run-time.) I am using the term "internal" rather than "private" or "protected", because these semantics are not the same as either "private" or "protected" in C++ or Java. 
In Python with this change, an object can only access its own internal
methods and attributes. In C++ and Java, an object can access private and
protected members of other objects of the same class. The rationale for this
difference is explained below.

The use of _single vs __double underscores encodes a useful distinction that
would not change. Ignoring the point in the previous paragraph, a _single
underscore is similar to "protected" in languages like C++ and Java, while a
__double underscore is similar to "private". This is purely a consequence of
the name mangling: if a class X and its subclass Y both name an attribute
__foo, then we will end up with two attributes _X__foo and _Y__foo in
instances of Y, which is the desired behaviour for private attributes. In the
case of an attribute called _foo, OTOH, there can be only one such attribute
per object, which is the desired behaviour for protected attributes.

The name mangling also ensures that an object will not *accidentally* access
a private attribute inherited from a superclass. However, in the same
example, an instance of Y can still deliberately access the copy of the
attribute inherited from X by specifying _X__foo. There is no security
problem here, because Y cannot do anything as a result that it could not have
done by copying X's code, rather than inheriting from it.

Notice that this is only true because we restrict an object to only accessing
its own internal attributes and methods; if we followed C++'s semantics where
an object can access protected members of any superclass, this would break
security.

(Java solves this problem by applying a more complicated access rule for
protected members, which I considered to be unintuitive. More details on
request.)

__dict__ is an internal attribute. This means that an object can only
directly reflect on itself. I know that there are other means of reflection
(e.g. using the 'inspect' module); blocking these or making them safe is a
separate issue.
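
The name-mangling behaviour described above for classes X and Y can be
checked in an ordinary, unrestricted Python 3 interpreter. The sketch below
shows only what Python already does today, not the proposed enforcement; the
attribute '_bar' and the method 'read_inherited' are names invented for
illustration.

```python
class X:
    def __init__(self):
        self.__foo = "set by X"   # mangled: stored as _X__foo
        self._bar = "set by X"    # not mangled: stored as _bar


class Y(X):
    def __init__(self):
        super().__init__()
        self.__foo = "set by Y"   # mangled: stored as _Y__foo, a second attribute
        self._bar = "set by Y"    # overwrites the single _bar attribute

    def read_inherited(self):
        # Deliberate access to the copy that X's code created.
        return self._X__foo


y = Y()
print(sorted(vars(y)))      # ['_X__foo', '_Y__foo', '_bar']
print(y.read_inherited())   # set by X
print(y._bar)               # set by Y
```

Note that 'read_inherited' still writes its access as "self._X__foo", so it
would satisfy a rule that only permits internal names on the method's first
argument.
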
If desired, it would be safe to add a 'publicdict' attribute to each object,
or a 'publicdict(object)' built-in. This would return a *read-only* dict,
probably created lazily if needed, giving access only to public
(non-internal) attributes and methods.

__init__ is an internal method. This is as it should be, because it should
not be possible to call __init__ on an existing object; only to have __init__
implicitly called when a new object is constructed.

__repr__ and __str__ are internal under these rules, and probably shouldn't
be. Existing classes may expose private state in the strings returned by
__repr__ or __str__, but in principle, there is nothing unsafe about being
able to convert the public state of an object to a string. OTOH, this
functionality is usually accessed via the built-ins 'repr' and 'str', which
we could perhaps allow to access '__repr__' and '__str__' as a special case.

-- 
David Hopwood <[EMAIL PROTECTED]>

_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com