[Python-ideas] Re: Pickle security improvements

Steven D'Aprano Wed, 15 Jul 2020 04:40:20 -0700

On Wed, Jul 15, 2020 at 11:24:17AM +1000, Chris Angelico wrote:

> It's correct far more often than you might think. There's a LOT of
> code out there where the Python source code has the exact same
> external access permissions as its config files - often because
> there's no access to either.


Um, yes? Safe use-cases are not the issue here. It's the unsafe use-cases 
that are important. Especially the use-cases that people may think are 
safe but actually aren't.

To stick to the seat belt analogy for a moment... we don't reject seat 
belts in cars because most of the time cars are safely parked in a 
garage. We add them for the times that cars are in motion at speed.

Improving the security of pickle shouldn't be done for the sake of cases 
where the security of pickle is irrelevant. It should be done for the 
sake of cases where it is necessary, especially for those cases where 
the developer thinks that security isn't necessary, but they are 
mistaken.


> So if you're distributing your code, then maybe you don't use pickle.

Sure. What do I use to serialise my complex data structure? I guess I 
could write out the repr and then call eval on it, that should be 
fine... *wink*


[...]
> > Okay, let's say that somebody else did the work. Some awfully clever
> > chappy found a way to add a magical "pickle.safeload()" function that
> > did everything needed, safely. Would you oppose it?
> >
> > (The old unsafe one would presumably have to remain for backwards
> > compatibility, or for the cases which are inherently unsafe.)
> 
> I would ask them which laws of physics they violated, since pickle
> inherently has to be able to execute arbitrary code in order to be
> able to do everything it needs to.

I'm not a pickle expert, but I don't think that's quite right. pickle 
has to be able to execute arbitrary code in order to be able to 
de-serialise arbitrary pickles, but that doesn't mean it has to 
de-serialise arbitrary pickles if you aren't expecting arbitrary 
pickles.

Random beat me to it by suggesting a white-list, but I was thinking the 
same way. The pickle protocol has to be able to deal with arbitrary 
instances, but very few apps using pickle need to, or want to, accept 
arbitrary instances. If my app serialised Widgets and Gadgets, then it 
ought to be an error to attempt to deserialise anything else.

Then all I need do is ensure that the Widget and Gadget classes are 
secure, not the entire Python universe :-)
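
A minimal sketch of what such a white-list could look like, assuming the 
documented pickle.Unpickler.find_class() hook is overridden. Fraction and 
date stand in for Widget and Gadget here, and the names ALLOWED, 
AllowListUnpickler and restricted_loads are made up for illustration:

```python
import io
import pickle

# Assumption: Fraction and date stand in for the "Widget" and "Gadget"
# classes of the example; any importable classes would do.
ALLOWED = {("fractions", "Fraction"), ("datetime", "date")}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses to resolve any global not in ALLOWED."""

    def find_class(self, module, name):
        # Only globals on the allow-list are ever imported.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def restricted_loads(data):
    """Hypothetical helper: like pickle.loads(), but using the allow-list."""
    return AllowListUnpickler(io.BytesIO(data)).load()
```

With this in place, restricted_loads(pickle.dumps(Fraction(1, 2))) should 
round-trip, while a hand-written pickle that names os.system should fail 
with UnpicklingError before anything is called. Plain ints, strings and 
lists never go through find_class(), so they load as usual.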

As I said, I'm not an expert, but five minutes reading this:

https://rushter.com/blog/pickle-serialization-internals/

allows me to confidently pontificate on the subject *wink*

The depickling virtual machine (pickle machine or PM) is not Turing 
complete. It has no loops or conditionals. It's a dumb machine that 
takes a sequence of op-codes, executes them in order, and then halts.
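
To make that concrete, here is a small sketch that lists the op-codes of 
a pickle without running it, using pickletools.genops() from the 
standard library. The helper name opcode_names is made up for 
illustration:

```python
import pickle
import pickletools


def opcode_names(data):
    """Return the op-code names of a pickle, in order, without executing it."""
    return [opcode.name for opcode, arg, pos in pickletools.genops(data)]


# A small list pickled with protocol 2: a flat sequence of op-codes
# that ends with STOP.
names = opcode_names(pickle.dumps([1, 2], protocol=2))
print(names)
```

The result is a flat list of op-code names with STOP as the last entry.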

The GLOBAL op-code (by default) will import any module, and use any 
function from that module. That's dangerous; an option to restrict what 
modules and functions can be called by the PM would go a long way to 
reducing the attack surface of pickle. (I think.)
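
As a rough illustration of what GLOBAL carries, the sketch below scans a 
pickle for GLOBAL op-codes and reports the "module name" pairs they 
would import, again without executing anything. The helper name 
globals_referenced is made up, and the sketch only handles the GLOBAL 
op-code used by protocols 0 to 3; protocol 4 adds STACK_GLOBAL, which 
takes its names from the stack and is not covered here:

```python
import pickletools


def globals_referenced(data):
    """List the 'module name' arguments of every GLOBAL op-code in a pickle."""
    return [
        arg
        for opcode, arg, pos in pickletools.genops(data)
        if opcode.name == "GLOBAL"
    ]


# Hand-written protocol 0 pickle that would call os.system('echo hi')
# if it were ever passed to pickle.loads(). It is only inspected here.
malicious = b"cos\nsystem\n(S'echo hi'\ntR."
print(globals_referenced(malicious))
```

A check like this is no substitute for restricting what the unpickler 
will import, but it shows how little a pickle needs in order to reach an 
arbitrary function.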

Random's idea of white-listing seems like a promising approach to me. 
Even if it doesn't make pickle "safe" in some absolute sense, it will 
make it *less unsafe* and reduce the attack surface for people using 
pickle.

Security is always about tradeoffs, and we shouldn't let the idea of 
some unattainable perfectly secure pickle get in the way of improving 
the safety of pickle.


> If someone claims they've created a way to allow untrusted users to
> insert code into your Python programs and have it execute, but they've
> made it safe, would you oppose its inclusion in the stdlib?

But that's not really what we're asking for. We're asking for a way to 
*avoid* executing arbitrary code, while still allowing *trusted* objects 
to be depickled.


> You want "pickle but magically able to know what's safe and what's
> not"?

Of course not. But maybe I want to be able to tell pickle what I think 
is safe, and have everything else fail.


-- 
Steven
