On Fri, Apr 8, 2022 at 4:38 PM dfremont--- via Python-Dev <python-dev@python.org> wrote:
> Hello,
>
> I came across what seems like either a bug in the import system or a gap
> in its documentation, so I'd like to run it by folks here to see if I
> should submit a bug report. If there's somewhere else more appropriate to
> discuss this, please let me know.
>
> If you import A.B, then remove A from sys.modules and import A.B again,
> the newly-loaded version of A will not contain an attribute referring to B.
> Using "collections.abc" as an example submodule from the standard library:
>
> >>> import sys
> >>> import collections.abc
> >>> del sys.modules['collections']
> >>> import collections.abc
> >>> collections.abc
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> AttributeError: module 'collections' has no attribute 'abc'
>
> This behavior seems quite counter-intuitive to me: why should the fact
> that B is already loaded prevent adding a reference to it to A?

Because `"collections.abc" in sys.modules` is true. The import system
expects that if you already imported a module then everything that needed
to happen, happened. Basically you cheated by not doing a thorough cleaning
of sys.modules by not deleting all the submodules as well.

> It also goes against the general principle that "import FOO" makes the
> expression "FOO" well-defined;

You're dealing with the import system; you never got to have a well-defined
statement to begin with. 😉

> for example PLR 5.7 states that "'import XXX.YYY.ZZZ' should expose
> 'XXX.YYY.ZZZ' as a usable expression".

And it did. But then you went behind the curtain and moved stuff around.

> Finally, it violates the "invariant" stated in PLR 5.4.2 that if 'A' and
> 'A.B' both appear in sys.modules, then A.B must be defined and refer to
> sys.modules['A.B'].
>

That isn't an invariant that holds when you delete things outside of the
import system; that statement is what the import system *does*, not what
the import system guarantees to always be true.
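To make the point concrete, here is a minimal sketch of the same situation using a throwaway package instead of a standard-library one. The package name `demo_pkg`, its submodule `sub`, and the temporary directory are assumptions made up for illustration; nothing here comes from the original report beyond the sequence of steps (import A.B, delete A from sys.modules, import A.B again).

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical throwaway package on disk: demo_pkg/__init__.py, demo_pkg/sub.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "sub.py").write_text("VALUE = 1\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# First import: the submodule is actually loaded, so the import system
# binds it as an attribute on the parent package.
import demo_pkg.sub
bound_first = hasattr(demo_pkg, "sub")

# Remove only the parent from the cache, leaving "demo_pkg.sub" behind.
del sys.modules["demo_pkg"]

# Second import: "demo_pkg.sub" is found in sys.modules, so it is not
# loaded again, and the freshly created parent never gets the binding.
import demo_pkg.sub
bound_after = hasattr(demo_pkg, "sub")
both_cached = "demo_pkg" in sys.modules and "demo_pkg.sub" in sys.modules

print(bound_first, bound_after, both_cached)
```

Both names end up in sys.modules, yet the new parent has no `sub` attribute, which is the case the quoted "invariant" does not cover once sys.modules has been edited by hand.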
>
> On the other hand, PLR 5.4.2 also states that "when a submodule is loaded
> using any mechanism... a binding is placed in the parent module's namespace
> to the submodule object", which is consistent with the behavior above,
> since the second import of A.B does not actually "load" B (only retrieve it
> from the sys.modules cache). So perhaps Python is working as intended here,
> and there is an unwritten assumption that if you unload a module from the
> cache, you must also unload all of its submodules. If so, I think this
> needs to be added to the documentation (which currently places no
> restrictions on how you can modify sys.modules, as far as I can tell).
>
> This may be an obscure corner case that is unlikely to come up in practice
> (I imagine few people need to modify sys.modules), but it did actually
> cause a bug in a project I work on, where it is necessary to uncache
> certain modules so that they can be reloaded. I was able to fix the bug
> some other way, but I think it would still be worthwhile to either make the
> import behavior more consistent (so that 'import A.B' always sets the B
> attribute of A) or add a warning in the documentation about this case. I'd
> appreciate any thoughts on this!
>

Feel free to propose some language to update the docs, but changing this
behaviour very well may have unintended consequences, so I would rather not
try to change it.
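For anyone who does need to uncache a package so it can be reloaded, the "thorough cleaning" described above could look roughly like the following. This is only a sketch under stated assumptions: `uncache` is a hypothetical helper, not an API of the import system, and `demo_pkg2` is a made-up package written to a temporary directory so the example is self-contained.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def uncache(package_name):
    """Hypothetical helper: drop a package and all of its cached
    submodules from sys.modules, returning the names removed."""
    prefix = package_name + "."
    doomed = [
        name for name in list(sys.modules)
        if name == package_name or name.startswith(prefix)
    ]
    for name in doomed:
        del sys.modules[name]
    return sorted(doomed)


# Made-up package for the demonstration: demo_pkg2/__init__.py, demo_pkg2/sub.py
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg2"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "sub.py").write_text("VALUE = 1\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg2.sub
removed = uncache("demo_pkg2")

# Neither name is cached now, so this import really loads the submodule
# again and the binding on the new parent is put back in place.
import demo_pkg2.sub
rebound = demo_pkg2.sub is sys.modules["demo_pkg2.sub"]

print(removed, rebound)
```

Matching on the dotted prefix is a design choice for the sketch; it deliberately leaves unrelated modules whose names merely start with the same letters (for example a sibling named `demo_pkg2_extra`) untouched.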
Message archived at https://mail.python.org/archives/list/python-dev@python.org/message/U5DDT4DUJ7U3VO62VZ333SWIN7QFZPHJ/