On Thu, Jul 16, 2020, at 02:36, Stephen J. Turnbull wrote:
> Random832 writes:
> 
>  > I was asking for the current Unpickler class, which currently has a
>  > whitelist hook for loading globals,
> 
> Callables are globals in this sense. 

not all callables are globals, as has been pointed out attributes of objects 
(methods) can also be callable.

this does require calling getattr which is itself global, but you can't 
exercise any fine-grained control over this without substituting your own 
getattr function, which creates problems if you are unpickling an object which 
contains a reference to the getattr function. Ultimately, *this* is the problem 
that made me realize that find_class isn't an adequate hook - that you cannot 
block or substitute a callable [whether that's a class, a global function, or a 
method] without also impacting the ability to unpickle objects that contain 
references to the same callable *as data*.

> So overriding
> Unpickler.find_class will allow you to restrict to specified
> callables.  It's not clear to me why you would want more fine-grained
> control: why not put the argument checking in the object constructor?
> In most cases that's probably where it's most useful, anyway, by DRY.
> 
>  > to be modified to also have a whitelist hook so that an application
>  > can provide a function that looks at a callable and its arguments
>  > that the pickle proposes to call, and can choose to either evaluate
>  > it, raise an error, or return a substitute value.
> 
> I would guess you can already have this by overriding
> Unpickler.load_reduce and patching Unpickler.dispatch[REDUCE[0]] to
> the new load_reduce.  Is there any other way for a pickle to specify
> the code to invoke on data supplied by that pickle?

I think I got all of them, but if you think there may be others feel free to be 
an extra pair of eyes. But these overrides are not available for the C version, 
and may well not be available on other python implementations.

>  > The idea that the pickle format is "inherently unsafe" and cannot
>  > be made safe is magical thinking.
> 
> I think you're quite wrong.  Pickle format itself is inherently unsafe
> because it allows a pickle to specify code to be executed on data that
> pickle specifies. 

Which is why an unpickler that *does not directly execute the specified code* 
is necessary and sufficient to be safe.

> Of course they already exist, and can be overriden.  I guess what
> you're asking for is a promise that the interface won't change in
> future versions of pickle.py.  That's the only difference between
> overriding Unpickler.find_class and overriding Unpickler.load_reduce
> (or any other method).

The other difference is that overriding find_class is supported by _pickle.c

[The dispatch mechanism is also unreasonably annoying to override, and having 
to override four separate methods, one of which is underscore-prefixed, to do 
the same thing, is not ideal]

On a mostly unrelated note I also have to admit I am baffled why the NEWOBJ 
opcodes are defined to call __new__ instead of __newobj__, when the latter is 
expected to exist and be a valid reduce function. Having a dunder name for a 
method that expects to be called from pickle seems like it would have been 
useful for security, since then you could add classes to a whitelist without 
allowing unrestricted calls to their constructor. Is this just a performance 
micro-optimization, or was there some other reason to prefer calling __new__ 
directly?
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/OVOGXXB2EQ7CJWRLWCR4HE2A337XTSNJ/
Code of Conduct: http://python.org/psf/codeofconduct/

Reply via email to