Re: [pypy-dev] Use of marshal in the sandbox: is stdlib marshal OK?

Ned Batchelder Wed, 28 Dec 2011 06:34:45 -0800

I guess that is a possibility, but another principle is to use well-used and widely-reviewed code where possible, no? I guess the problem is that built-in marshal isn't trying hard to protect itself against malicious data?

The problem with "bundling pypy's marshal.py" is that it pulls in a lot of infrastructure modules, which bulks up the calling process. Maybe there's some low-hanging fruit there that we can trim.


Any thoughts on the second issue?

--Ned.

On 12/27/2011 10:09 PM, lahwran wrote:

it will become an issue if there is a bug in the marshal code inside
pypy-c-sandbox which is /creating/ the marshalled data, a bug that
would allow a sandboxed program to alter the marshalled data in such a
way that it can exploit the vulnerability of the stdlib marshal.
Doesn't sound too likely, but in the spirit of having as many layers
of security as possible, I propose simply bundling pypy's marshal.py
with the sandbox.

-- lahwran

On Tue, Dec 27, 2011 at 7:30 PM, Ned Batchelder<[email protected]>  wrote:

The sandbox uses pypy's own implementation of marshal.  In
pypy/translator/sandbox/sandlib.py is this comment:

# Note: we use lib_pypy/marshal.py instead of the built-in marshal
# for two reasons.  The built-in module could be made to segfault
# or be attackable in other ways by sending malicious input to
# load().  Also, marshal.load(f) blocks with the GIL held when
# f is a pipe with no data immediately available, preventing the
# _waiting_thread to run.

I'd like to remove as many dependencies as possible from the sandbox code,
so I'd like to explore the possibility of using the standard library marshal
module.

The first reason above is about crashing marshal with malicious input.  To
my thinking, we are in control of what data is marshaled, so we don't have
to worry about malicious input.  The untrusted Python code running in the
sandbox doesn't have a way of sending marshaled data, so we don't have to
worry that it will be used to attack the marshal module.  The stdout of the
untrusted Python code will become a string that is marshaled, but that
doesn't provide a way for the untrusted code to attack the marshal module.
Or have I missed something?
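
To make the first point concrete, here is a minimal Python 3 sketch. It is not
code from sandlib.py; the function name wrap_untrusted_output and the sample
payload are made up for illustration. It assumes the controller is the only
party that calls marshal.dumps, as described above, and shows that bytes
produced by untrusted code come back from marshal.loads as an opaque bytes
object, even when they happen to contain marshal-looking data:

```python
import marshal


def wrap_untrusted_output(raw: bytes) -> bytes:
    """Hypothetical helper: marshal untrusted stdout as a single bytes object.

    CPython writes a bytes object as a type code, a length, and the raw
    payload, so the payload itself is never parsed as marshal data.
    """
    return marshal.dumps(raw)


# Pretend the sandboxed program printed something that looks like marshal
# data followed by junk (an assumed, made-up payload).
hostile = marshal.dumps({"looks": ["like", "marshal"]}) + b"\x00\xff{(("

wire = wrap_untrusted_output(hostile)
decoded = marshal.loads(wire)

# The payload round-trips as plain bytes; nothing inside it was interpreted.
assert isinstance(decoded, bytes)
assert decoded == hostile
```

This only illustrates the framing argument. It says nothing about bugs in the
code that produces the marshaled data, which is the scenario lahwran raises.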

The second reason I can't address. Is this still a problem?  What bad
effects will we see if it is?
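
For the second reason, one possible way to keep marshal.load(f) away from the
pipe entirely is to read raw bytes with os.read, which releases the GIL while
it waits in CPython, and only then call marshal.loads on the in-memory buffer.
The sketch below is an illustration under assumptions, not what sandlib.py
does: the 4-byte length prefix and the helper names write_message, read_exact,
and read_message are hypothetical, and adding a prefix would change the wire
format on both sides of the pipe.

```python
import marshal
import os
import struct

# Assumed framing: 4-byte little-endian payload length, then the marshal data.
_HEADER = struct.Struct("<I")


def write_message(fd: int, obj) -> None:
    """Marshal obj and write it to fd with a length prefix (hypothetical)."""
    payload = marshal.dumps(obj)
    data = _HEADER.pack(len(payload)) + payload
    # os.write may write fewer bytes than requested, so loop until done.
    while data:
        written = os.write(fd, data)
        data = data[written:]


def read_exact(fd: int, n: int) -> bytes:
    """Read exactly n bytes from fd; os.read releases the GIL while blocked."""
    chunks = []
    while n > 0:
        chunk = os.read(fd, n)
        if not chunk:
            raise EOFError("pipe closed before a full message arrived")
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)


def read_message(fd: int):
    """Read one framed message, then unmarshal it from memory."""
    (size,) = _HEADER.unpack(read_exact(fd, _HEADER.size))
    # marshal.loads works on an in-memory buffer, so it never waits on I/O.
    return marshal.loads(read_exact(fd, size))
```

With this shape, a thread blocked in read_message is waiting inside os.read,
so other threads in the process can keep running in the meantime.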

--Ned.
_______________________________________________
pypy-dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/pypy-dev
