Hi there,

I've noticed that, when a frame's __builtins__ is a subclass of dict with an
overridden __getitem__ method, this overridden method is not used by the
IMPORT_NAME instruction to look up __import__ in the dictionary; it uses the
lookup function of normal dictionaries (via _PyDict_GetItemIdWithError). This
is unexpected, because it is contrary to the behaviour of the similar
LOAD_BUILD_CLASS, as well as the typical name lookup semantics of
LOAD_GLOBAL/LOAD_NAME, which all use PyDict_CheckExact for a "fast path"
before defaulting to PyObject_GetItem.
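
To make the difference concrete, here is a rough pure-Python model of the two
lookup strategies. It is only a sketch: the function names and the LoggingDict
class are invented for illustration, and the real logic is C code in ceval.c.

```python
def lookup_with_fast_path(namespace, name):
    """Model of how LOAD_GLOBAL/LOAD_NAME/LOAD_BUILD_CLASS consult builtins."""
    if type(namespace) is dict:
        # Stands in for the PyDict_CheckExact fast path.
        return dict.__getitem__(namespace, name)
    # Stands in for PyObject_GetItem, which honours an overridden __getitem__.
    return namespace[name]


def lookup_dict_only(namespace, name):
    """Model of IMPORT_NAME: always the plain dict lookup, never the override."""
    return dict.__getitem__(namespace, name)


class LoggingDict(dict):
    """dict subclass that records every key passed to __getitem__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def __getitem__(self, key):
        self.seen.append(key)
        return dict.__getitem__(self, key)


ns = LoggingDict({"int": int, "__import__": __import__})
lookup_with_fast_path(ns, "int")     # goes through LoggingDict.__getitem__
lookup_dict_only(ns, "__import__")   # bypasses LoggingDict.__getitem__
print(ns.seen)                       # ['int']
```

The model does not capture everything: calling dict.__getitem__ on a non-dict
from Python raises TypeError, not the SystemError that the C-level call gives.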

Perhaps more seriously, if __builtins__ is not a dict at all, then it gets 
erroneously passed to some internal dict functions resulting in a mysterious 
SystemError ("Objects/dictobject.c:1440: bad argument to internal function") 
which, to me, indicates fragile behaviour that isn't supposed to happen.
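
For anyone who wants to check a particular interpreter without replacing
__builtins__ in their own session, something like the following sketch might
be useful. It runs each statement through exec() with a substitute
__builtins__ in the globals. The helper names (LoggingDict, run_with_builtins)
are made up for this example, and I'm assuming a mappingproxy is a fair
stand-in for "a mapping that is not a dict". I have deliberately not written
down expected output, since it depends on the Python version.

```python
import builtins
import types


class LoggingDict(dict):
    """dict subclass that records every key passed to __getitem__."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.seen = []

    def __getitem__(self, key):
        self.seen.append(key)
        return dict.__getitem__(self, key)


def run_with_builtins(source, fake_builtins):
    """Run source with fake_builtins as __builtins__; return the error name, if any."""
    try:
        exec(source, {"__builtins__": fake_builtins})
    except Exception as exc:
        return type(exc).__name__
    return None


snippets = ("int", "class X: pass", "import math")

# Case 1: __builtins__ is a dict subclass with an overridden __getitem__.
for snippet in snippets:
    logged = LoggingDict(vars(builtins))
    error = run_with_builtins(snippet, logged)
    print(f"dict subclass, {snippet!r}: lookups={logged.seen}, error={error}")

# Case 2: __builtins__ is a mapping that is not a dict at all.
proxy = types.MappingProxyType(vars(builtins))
for snippet in snippets:
    error = run_with_builtins(snippet, proxy)
    print(f"non-dict mapping, {snippet!r}: error={error}")
```

Because LoggingDict's own methods are defined in the outer module, they keep
using the real builtins, so no references need to be stashed on the class to
avoid recursion.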

I'm not sure if this is intended, so I didn't want to open an issue yet. It
also seems a highly specific use case, and changing it would probably cause a
bit of a slow-down in module importing, so it is perhaps not worth fixing. I
just wanted to ask here in case this issue had been documented anywhere
before, and to check if it might actually be supposed to happen before opening
a bug report.

I cannot find evidence that this behaviour has changed at all in recent history 
and it seems to be the same on the main branch as in 3.9.6.

A short demo of these things is attached.

Links to relevant CPython code in v3.9.6:

IMPORT_NAME: https://github.com/python/cpython/blob/v3.9.6/Python/ceval.c#L5179

LOAD_BUILD_CLASS: https://github.com/python/cpython/blob/v3.9.6/Python/ceval.c#L2316

LOAD_NAME: https://github.com/python/cpython/blob/v3.9.6/Python/ceval.c#L2488

LOAD_GLOBAL: https://github.com/python/cpython/blob/v3.9.6/Python/ceval.c#L2546

Thanks,

Patrick Reader

class MyDict(dict):
    # keep a reference around to avoid infinite recursion
    print = print
    dict = dict
    def __getitem__(self, key):
        self.print("getting:", key)
        # Can't use super here because we'd have to keep a reference around instead of looking it up
        # in __builtins__ (to prevent infinite recursion), but then there's no __class__ cell which
        # breaks the lookup mechanism. Instead, just refer to dict by name
        return self.dict.__getitem__(self, key)

__builtins__ = MyDict(vars(__builtins__))

int            # prints "getting: int"
__import__     # prints "getting: __import__"
class X: pass  # prints "getting: __build_class__"
import math    # does not print "getting: __import__" because it uses dictobject internal lookup

################################################################################

# try these individually in the Python shell, because each one raises an error on its own

__builtins__ = "not a dictionary"

int            # TypeError: string indices must be integers (because it's trying to do effectively `"not a dictionary"["int"]`)
__import__     # same error
class X: pass  # same error (trying to load __build_class__)
import math    # SystemError: Objects/dictobject.c:1440: bad argument to internal function
_______________________________________________
Python-Dev mailing list -- python-dev@python.org
To unsubscribe send an email to python-dev-le...@python.org
https://mail.python.org/mailman3/lists/python-dev.python.org/
Message archived at 
https://mail.python.org/archives/list/python-dev@python.org/message/ZQMF6XC76J4APJPB3X6PGATG6CV5NN44/
Code of Conduct: http://python.org/psf/codeofconduct/
