The (perhaps minor) problem with simply having the same module object have two entries in sys.modules is that at least one of them will have a different __name__ attribute than the corresponding key in sys.modules. It's not the end of the world (os.path.__name__ is 'posixpath' on Linux, not 'os.path', for example), but it could nonetheless trip some code up.
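To make the mismatch concrete, here is a small sketch. It borrows the 'maths' alias from Steven's message quoted below; that name is made up purely for illustration and is not a real module.

```python
import math
import os.path
import sys

# Register the already-imported math module under a second key.
# 'maths' is a hypothetical alias used only for this illustration.
sys.modules['maths'] = sys.modules['math']
import maths  # found in sys.modules, so no code is executed a second time

same_object = maths is math       # one module object, two keys
alias_name = maths.__name__       # the module still calls itself 'math'
os_path_name = os.path.__name__   # 'posixpath' or 'ntpath', never 'os.path'

print(same_object, alias_name, os_path_name)
```

Nothing breaks here, but any code that assumes sys.modules[module.__name__] is the same key it was looked up under would be surprised.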
Multiple copies of a module can trip things up even when they are stateless. The bug that bit us recently was that a function was checking whether its argument was an instance of a particular class defined in the same module as that function. However, the argument had been instantiated using the other copy of the module, and hence, as far as Python was concerned, the two classes were different classes and isinstance() returned False. It was very confusing.

The problem occurs by accident primarily when you run scripts as __main__ from within a package directory, because the script's own directory (or, with python -m and in interactive sessions, the current working directory) is placed at the front of sys.path. I know, I know, you're not supposed to do this. But this double import problem is exactly *why* you're not supposed to do this, and despite the advice people do it all the time: for example, they might store scripts within a package directory to be run by calling code using subprocess.Popen, or manually by a developer to generate resources, or they might be in the process of turning their pile of scripts into a package and encounter the problem during the transition whilst they are still running from within the directory. Or they might have had their program os.chdir() into the package directory in order to be able to use relative paths for resources the program needs to load.

On Wed, Mar 14, 2018 at 10:18 PM, Steven D'Aprano <st...@pearwood.info> wrote:

> On Wed, Mar 14, 2018 at 05:06:02PM +1100, Chris Angelico wrote:
> > On Wed, Mar 14, 2018 at 4:58 PM, Steven D'Aprano <st...@pearwood.info> wrote:
> > > On Wed, Mar 14, 2018 at 04:20:20PM +1100, Chris Billington wrote:
> > >
> > >> Instead, maybe a user should just get a big fat error if they try to import
> > >> the same file twice under different names.
> > >
> > > Absolutely not.
> > >
> > > Suppose I import a library, Spam, which does "import numpy".
> > >
> > > Now I try to "import numpy as np", and I get an error.
> >
> > That's not the same thing.
> > Both of those statements are importing the
> > same file under the same name, "numpy"; one of them then assigns that
> > to a different local name.
>
> Hence, two different names.
>
> > But in sys.modules, they're the exact same thing.
>
> Of course they are. But it wasn't clear to me that the alternative was
> what Chris was referring to. Either I read carelessly, or he never
> mentioned anything about duplicate entries in sys.modules.
>
> > The double import problem comes when the same file gets imported under
> > two different names *in sys.modules*.
>
> It actually requires more than that to cause an actual problem.
>
> For starters, merely having two keys in sys.modules isn't a problem if
> they both refer to the same module object:
>
>     sys.modules['maths'] = sys.modules['math']
>
> is harmless. Even if you have imported two distinct copies of the same
> logical module as separate module objects -- which is hardly something
> you can do by accident, apart from one special case -- it won't
> necessarily be a problem.
>
> For instance, if the module consists of nothing but pure functions with
> no state, then the worst you have is a redundant copy and some wasted
> memory.
>
> It can even be useful, e.g. I have a module that uses global variables
> (I know, I know, "global variables considered harmful"...) and sometimes
> it is useful to import it twice as two independent copies.
>
> That's better than literally duplicating the .py file, and faster than
> re-writing the module and changing the scripts that rely on it.
>
> On Linux, I can make a hard-link of spam.py as ham.py. But if my file
> system doesn't support hard-linking, or I can't do that for some other
> reason, the next best thing is to intentionally subvert the import
> system and/or sys.modules in order to get two distinct copies.
>
>     import spam
>     sys.modules['ham'] = sys.modules['spam']
>     del sys.modules['spam']
>     import spam, ham
>
> will do it. (I don't know if there are any easier ways.) There's nothing
> wrong with doing this intentionally. Consenting adults and all that.
>
> So the question is, how can you do this *by accident*?
>
> The only way I know of to get a module *accidentally* imported
> twice is when a module imports itself in a script:
>
>     # spam.py
>     if __name__ == '__main__':
>         import spam
>
> does not do what you expect. Now your script is loaded as two
> independent copies, once under 'spam' and once under '__main__'.
>
> The simple fix is:
>
> Unless you know what you are doing, don't have your runnable scripts
> import themselves when running.
>
> Apart from intentionally manipulating sys.modules or the import system,
> or playing file system tricks like hard-linking your files, under what
> circumstances can this occur by accident?
>
> --
> Steve
> _______________________________________________
> Python-ideas mailing list
> Python-ideas@python.org
> https://mail.python.org/mailman/listinfo/python-ideas
> Code of Conduct: http://python.org/psf/codeofconduct/
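
P.S. For anyone who wants to reproduce the isinstance() failure described at the top of this message, here is a minimal sketch. It is not our actual code: it loads one temporary file as two module objects through importlib instead of relying on a real package layout, and the names widgets, mypackage.widgets, Widget, is_widget and load_as are all invented for the illustration.

```python
import importlib.util
import sys
import tempfile
from pathlib import Path

# Hypothetical module source: a class plus a function that checks for it.
SOURCE = '''
class Widget:
    pass

def is_widget(obj):
    return isinstance(obj, Widget)
'''

def load_as(name, path):
    """Execute the file at `path` as a fresh module registered under `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "widgets.py"
    path.write_text(SOURCE)

    # Same file, two names: roughly what happens when a package directory
    # ends up on sys.path and the file is also imported via the package.
    top_level = load_as("widgets", path)
    packaged = load_as("mypackage.widgets", path)

    same_file = top_level.__file__ == packaged.__file__
    distinct_classes = top_level.Widget is not packaged.Widget
    own_instance_ok = top_level.is_widget(top_level.Widget())
    cross_instance_ok = top_level.is_widget(packaged.Widget())

print(same_file, distinct_classes, own_instance_ok, cross_instance_ok)
```

The last check is the one that bit us: the object really is a Widget as far as the source code is concerned, but it was built from the other copy's class, so isinstance() says no.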