On Wed, Mar 14, 2018 at 05:06:02PM +1100, Chris Angelico wrote:
> On Wed, Mar 14, 2018 at 4:58 PM, Steven D'Aprano <st...@pearwood.info> wrote:
> > On Wed, Mar 14, 2018 at 04:20:20PM +1100, Chris Billington wrote:
> >
> >> Instead, maybe a user should just get a big fat error if they try to import
> >> the same file twice under different names.
> >
> > Absolutely not.
> >
> > Suppose I import a library, Spam, which does "import numpy".
> >
> > Now I try to "import numpy as np", and I get an error.
> 
> That's not the same thing. Both of those statements are importing the
> same file under the same name, "numpy"; one of them then assigns that
> to a different local name.

Hence, two different names.

> But in sys.modules, they're the exact same thing.

Of course they are. But it wasn't clear to me that the alternative was 
what Chris was referring to. Either I read carelessly, or he never 
mentioned anything about duplicate entries in sys.modules.

> The double import problem comes when the same file gets imported under
> two different names *in sys.modules*.

It requires more than that to cause an actual problem.

For starters, merely having two keys in sys.modules isn't a problem if 
they both refer to the same module object:

    sys.modules['maths'] = sys.modules['math']

is harmless. Even if you have imported two distinct copies of the same 
logical module as separate module objects -- which is hardly something 
you can do by accident, apart from one special case -- it won't 
necessarily be a problem.
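
To make the aliasing case concrete, here is a minimal sketch. The name 
'maths' is only an illustrative alias, taken from the one-liner above; 
nothing else is assumed beyond the standard library.

```python
import sys
import math

# Register a second key that points at the very same module object.
sys.modules['maths'] = sys.modules['math']

# The import system checks sys.modules first, so this binds the
# existing object instead of searching for a file called maths.py.
import maths

print(maths is math)       # True: one module object, two names
print(maths.sqrt(16.0))    # 4.0
```

Since both keys refer to one object, there is only one copy of any 
state the module holds, so nothing can get out of sync.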

For instance, if the module consists of nothing but pure functions with 
no state, then the worst you have is a redundant copy and some wasted 
memory.

It can even be useful, e.g. I have a module that uses global variables 
(I know, I know, "global variables considered harmful"...) and sometimes 
it is useful to import it twice as two independent copies.

That's better than literally duplicating the .py file, and faster than 
re-writing the module and changing the scripts that rely on it.

On Linux, I can make a hard-link of spam.py as ham.py. But if my file 
system doesn't support hard-linking, or I can't do that for some other 
reason, the next best thing is to intentionally subvert the import 
system and/or sys.modules in order to get two distinct copies.

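    import sys
    import spam                               # first copy
    sys.modules['ham'] = sys.modules['spam']  # keep it alive as 'ham'
    del sys.modules['spam']                   # forget 'spam' was loaded
    import spam, ham                          # 'spam' runs again; 'ham' is the first copy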

will do it. (I don't know if there are any easier ways.) There's nothing 
wrong with doing this intentionally. Consenting adults and all that.
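
One possible alternative, offered as a sketch rather than a 
recommendation, is to load the file twice through importlib.util, 
giving each copy its own name. The helper load_copy() and the contents 
of the temporary spam.py are hypothetical stand-ins for a module that 
keeps state in a global.

```python
import importlib.util
import os
import sys
import tempfile


def load_copy(name, path):
    """Load the file at `path` as a fresh module registered under `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


# Hypothetical module that relies on a global variable.
source = (
    "counter = 0\n"
    "\n"
    "def bump():\n"
    "    global counter\n"
    "    counter += 1\n"
    "    return counter\n"
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "spam.py")
    with open(path, "w") as f:
        f.write(source)
    spam = load_copy("spam", path)   # first independent copy
    ham = load_copy("ham", path)     # second copy of the same file

spam.bump()
spam.bump()
ham.bump()

print(spam is ham)                 # False: two distinct module objects
print(spam.counter, ham.counter)   # 2 1: each copy has its own global
```

Each call executes the source again, so the two modules share nothing 
but the file they came from.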

So the question is, how can you do this *by accident*?

The only way I know of to get a module *accidentally* imported 
twice is when a module imports itself in a script:

    # spam.py
    if __name__ == '__main__':
        import spam

does not do what you expect. Now your script is loaded as two 
independent copies, once under 'spam' and once under '__main__'.
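
A small experiment shows the effect. This is a sketch that assumes the 
interpreter puts the script's directory on sys.path, which is the 
default behaviour. The registry list is a hypothetical piece of 
module-level state, added only to make the two copies distinguishable.

```python
import os
import subprocess
import sys
import tempfile

# A script that imports itself, plus some module-level state.
script = '''\
import sys

registry = []

if __name__ == '__main__':
    import spam                      # executes this file again, as 'spam'
    registry.append('from script')
    main = sys.modules['__main__']
    print(spam is main)
    print(main.registry, spam.registry)
'''

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "spam.py")
    with open(path, "w") as f:
        f.write(script)
    # Run the file as a script so that it is loaded as '__main__'.
    result = subprocess.run(
        [sys.executable, path],
        capture_output=True, text=True, check=True,
    )

output = result.stdout.splitlines()
print(output)   # ['False', "['from script'] []"]
```

The append made by the running script is invisible through the 'spam' 
name, because that name refers to a second module object with its own 
registry.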

The simple fix is:

Unless you know what you are doing, don't have your runnable scripts 
import themselves when running.

Apart from intentionally manipulating sys.modules or the import system, 
or playing file system tricks like hard-linking your files, under what 
circumstances can this occur by accident?



-- 
Steve