In response to Ka-Ping's comments on the subject of "Resource Hiding" vs "Resource Crippling", Brett says:
> It seems that your criticisms are aimed at resource crippling
> being a "plug holes as needed but if you foul up you are screwed"
> with resource hiding being more "fix the fundamental issues and
> just don't present access to resources you don't want to give
> access to (or wrap accordingly)". And in general I agree with
> this assessment. But I also realize that Python was not designed
> for security in mind and there seems to be new ways to get access
> to 'file'. If I felt confident that I could find and hide 'file'
> as needed, I would go that route immediately. But I don't think I
> can (and Armin has said this as well).

I agree completely. "Resource Hiding" (specifically, Capabilities) has the cleanest concept, yet there are valid reasons to worry about implementing it in Python. However, I would like to point out one other advantage that capabilities have over "Resource Crippling".

Resource Crippling implements the restrictions as changes in the underlying C-level objects capable of performing dangerous operations. That means that the restrictions are implemented by the interpreter. The advantage is obvious: we can trust the interpreter. But the disadvantages are (1) it's slow to fix (it requires a bugfix release followed by everyone in the world upgrading), and (2) it cannot be extended by the user.

With resource crippling, you will need to decide just what kind of restrictions the file type will implement. I believe you are planning to restrict opening to a list of known filenames and known directories, for reading and for writing. (Actually, you said mode string, but I presume that you won't maintain separate lists for 'r' and 'rb' modes.) Then there was discussion of whether the directories ought to be recursive, whether the total number of files opened ought to be restricted, whether the total size written should be restricted, and even whether that size should be measured in bytes or blocks. Such conversations could go on for a long time, and in the end you must make some compromises.

If you were using capabilities, you would only need to ensure that restricted interpreters could not get any file object other than the ones they were given. But then _all_ of these fancy versions of the restrictions would be immediately supported: it would be up to the users to create secure wrappers implementing the specific restrictions they desire.

I really like this feature of capabilities: that they can be extended (well, restricted) by the user, not just by the language implementer. That's a powerful feature, and I don't want to give it up. But on the other hand, I don't quite see how to maintain it -- here are my best ideas; perhaps they will help.

Python already has one essential ingredient for capabilities: unforgeable references. But it lacks two others: secure data, and the absence of auxiliary means of accessing objects.

Python's powerful introspection and visible implementation (e.g. __dict__) make it impossible to encapsulate data in an object in a way that prevents malicious users from reaching it. But that is actually surprisingly easy to fix. Just create a new type (maybe a new metaclass), implemented in C, which contains private data and a means to control access to it. You would give it a dict that is stored privately, with no access from Python, and you would provide methods and attributes along with a Python function for evaluating access to each. The type would ensure that the access test was evaluated (with access to the private dict) before any method or attribute was accessed.
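To make that concrete, here is a minimal sketch of the access-control shape I have in mind, written in plain Python purely for illustration. The names (Guarded, read_only) are made up, and of course a pure-Python version cannot actually hide its state -- object.__getattribute__(safe, '_target') digs it right back out -- which is exactly why the real thing would have to be a C type with the dict stored where Python cannot see it.

    class Guarded(object):
        """Illustration only: hold a target object and a check function,
        and run the check before every attribute access."""

        def __init__(self, target, check):
            # In the real C type this state would live in a private
            # struct, invisible to Python introspection.
            object.__setattr__(self, '_target', target)
            object.__setattr__(self, '_check', check)

        def __getattribute__(self, name):
            check = object.__getattribute__(self, '_check')
            target = object.__getattribute__(self, '_target')
            if not check(name):
                raise AttributeError("access to %r denied" % name)
            return getattr(target, name)

    # A user-written access test: expose only the reading side of a file.
    def read_only(name):
        return name in ('read', 'readline', 'readlines', 'close', 'name')

    safe = Guarded(open('example.txt'), read_only)   # any file you hand out
    safe.read()          # permitted
    safe.write('spam')   # raises AttributeError

The only piece that needs to be trusted C code is the Guarded type itself; the policy (read_only here) is ordinary Python written by the user.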
Such a construct is simple enough that I believe we could implement it and be reasonably confident that it was reliably secure. (I have played with this in the past and been reasonably pleased with the results.) Adding restrictions would then incur some performance penalty, but that seems unproblematic.

That leaves the other problem: auxiliary means of accessing objects. There are things like gc.get_objects(). In the special case of file, which is a type that is itself dangerous, there are tricks like "object().__class__.__subclasses__()". I would love to believe that we could plug all of these holes, but experience (rexec) proves otherwise. For something like sockets, I am fairly sure that there's a clear bottleneck (the low-level socket module), but there are still numerous existing libraries that use this low-level module without ever being handed a capability. So this is where my alternative plan starts to fall apart.

Your (Brett's) plan to use resource crippling for these kinds of restrictions involves putting C code around all direct access to files, sockets, or whatever resource is being accessed. Perhaps, instead of your C code doing the security checks directly, it could make sure that the objects returned were contained within the correct secure wrappers.

That's OK so far as it goes, but the C checks are per interpreter instance, so how do we get them to apply the correct wrappers? Maybe the interpreter maintains (in C) a stack of wrappers per thread and provides a means for stack frames (functions) to register wrappers? We would wind up wrapping the wrappers, but that's just the nature of capabilities. Perhaps it would look something like this:

    def invoke_user_function():
        PyXXX.set_file_wrapper(ReadOnlyRestriction)
        PyXXX.set_socket_wrapper(
            SingleDomainRestriction('example.com'))
        untrusted_object.do_stuff()
        ...

To sum up: I agree that you cannot rely on preventing all of the possible "Python tricks", but I still think that capabilities are a superior solution. I'd like to find a way to achieve the user-customizability of capabilities without throwing out the entire standard library -- maybe some hybrid of "resource hiding" and "resource crippling". I can almost see a way to achieve it, but there are still a few flaws.
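To make the user-extensibility point concrete, here is a rough sketch of what something like ReadOnlyRestriction might look like from the user's side. Everything in it is hypothetical: PyXXX.set_file_wrapper is just the imaginary hook from the example above, and the name-mangled attribute is standing in for the real privacy that the secure C type described earlier would provide.

    class ReadOnlyRestriction:
        """Illustration only: a user-written wrapper exposing just the
        reading side of a file object. A real version would sit on the
        secure private-data type, so the underlying file could not be
        dug back out of the wrapper."""

        _allowed = ('read', 'readline', 'readlines', 'close',
                    'name', 'mode')

        def __init__(self, fileobj):
            self.__file = fileobj    # name mangling is NOT real privacy

        def __getattr__(self, name):
            if name in self._allowed:
                return getattr(self.__file, name)
            raise AttributeError("restricted file has no attribute %r"
                                 % name)

    # The hypothetical per-thread hook would then apply this (or any
    # stricter policy the user cares to write -- file quotas, size
    # limits, whatever) to every file handed to the untrusted code:
    #     PyXXX.set_file_wrapper(ReadOnlyRestriction)

The interpreter only has to guarantee that the hook is applied; deciding what "restricted" means stays in the hands of the person doing the restricting.

-- Michael Chermside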