On Wed, Aug 29, 2012 at 11:17 AM, Leo Shklovskii wrote:
> Here's a fun question for the python lovers in the crowd. We ran into some
> unexpected behavior in our code and don't have a good explanation for why it
> happens.
I added a bit to the test script to show more. Output:
===
sys.modules["foo"] = missing
Importing foo.
expected NameError on 'foo.cars'
sys.modules["foo"] = <module 'foo' from '/tmp/foo/__init__.pyc'>
/tmp/foo/cars.pyc
Printed stuff from foo.
sys.modules["foo"] = <module 'foo' from '/tmp/foo/__init__.pyc'>
sys.modules["foo.cars"] = <module 'foo.cars' from '/tmp/foo/cars.pyc'>
sys.modules["foo"].cars = <module 'foo.cars' from '/tmp/foo/cars.pyc'>
Everything in foo except '__builtins__':
{'__doc__': None,
'__file__': '/tmp/foo/__init__.pyc',
'__name__': 'foo',
'__package__': None,
'__path__': ['/tmp/foo'],
'cars': <module 'foo.cars' from '/tmp/foo/cars.pyc'>,
'print_stuff': <function print_stuff at 0xb727187c>}
===
Leo Shklovskii wrote:
> The takeaway is don't define anything in __init__.py
Chris Barker wrote or quoted somebody unknown:
> I think the real moral here is don't try to use a module you haven't
> imported yet
I wouldn't quite agree with these. There are some straightforward
consequences of Python's import mechanism, and some surprising ones.
The surprising ones may be flaws, but I'm not an import developer so I
don't know if it could be designed better.
Importing a package executes its __init__.py. Submodules and subpackages
are not visible as attributes of the package at this point unless the
__init__.py imports them.
When you import a submodule explicitly ('import foo.bar', 'from foo
import bar', 'from foo.bar import *'), it imports the package and puts
it in sys.modules, then imports the submodule and puts it both in
sys.modules and in the package module. This may have seemed like a
good idea way back when packages were introduced, but it surprises
callers: the attribute 'foo.bar' doesn't exist on the package, and then,
after some module imports the submodule, it does.
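A minimal sketch of that side effect, assuming Python 3 and a throwaway
package I'm calling demo_pkg (a made-up name, written to a temporary
directory so the snippet is self-contained):

```python
import importlib
import os
import sys
import tempfile

# Build a throwaway package: demo_pkg/__init__.py (empty) and demo_pkg/bar.py.
# "demo_pkg" and "bar" are hypothetical names used only for this illustration.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "demo_pkg")
os.mkdir(pkg_dir)
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("")
with open(os.path.join(pkg_dir, "bar.py"), "w") as f:
    f.write("VALUE = 42\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import demo_pkg
before = hasattr(demo_pkg, "bar")   # the empty __init__.py never imported bar

import demo_pkg.bar
after = hasattr(demo_pkg, "bar")    # the import bound bar onto the package

# The attribute and the sys.modules entry are the same module object.
same_object = sys.modules["demo_pkg.bar"] is demo_pkg.bar

print(before, after, same_object)   # False True True
```

The same thing happens if some unrelated module does the submodule
import, which is why the attribute can seem to appear out of nowhere.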
There are four major ways to write a package:
1. Submodules are public and independent, and users import them
directly. The init module is empty or contains just trivial constants
like '__version__' and a docstring. Paste is like this. Paste also
distributes some subpackages separately using a namespace package.
2. Submodules are public, but as a courtesy the package also imports
them into the package namespace. 'os.path' is like this.
3. Submodules are public, but as a courtesy the package imports the
most common objects from selected submodules into the init module.
SQLAlchemy and webhelpers.html do this. Less common or large
subpackages must be explicitly imported.
4. Submodules are private; the entire public API is imported into the
init module.
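As a rough illustration of style 4, here is a sketch assuming Python 3
and a hypothetical package shapes_pkg with one private submodule, _impl.
Neither name comes from a real library; the package is built in a
temporary directory just so the snippet runs on its own:

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "shapes_pkg")   # hypothetical package name
os.mkdir(pkg_dir)

# Private implementation module; the leading underscore marks it as internal.
with open(os.path.join(pkg_dir, "_impl.py"), "w") as f:
    f.write("def area(width, height):\n    return width * height\n")

# Style 4: the init module re-exports the entire public API.
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("from shapes_pkg._impl import area\n__all__ = ['area']\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import shapes_pkg

# Users only ever touch the package namespace, never shapes_pkg._impl.
result = shapes_pkg.area(2, 3)
print(result)   # 6
```

Style 3 would look much the same, except the submodule would have a
public name and users would still be free to import it directly.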
There are two main gotchas with defining things in the init module:
- If the thing has the same name as a submodule, it will be clobbered
if the user imports the submodule. The package author should avoid
defining objects with the same name as submodules.
- If the package module contains something that submodules depend on
(i.e., that the submodules have to import the init module for), it
creates a circular import. Many people, including me, think this is bad,
and anything required by both the init module and submodules should be
defined in a separate submodule that imports nothing else in the
package. But it's "safe" as long as the init module defines the
depended-on thing *before* it imports the submodule that depends on
it. Otherwise the thing won't exist at the moment of import.
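The first gotcha (the name clash) can be reproduced with a small
sketch. It assumes Python 3 and a hypothetical package clobber_pkg whose
init module defines 'cars' and which also has a submodule cars.py,
loosely mirroring the foo.cars case from the test script:

```python
import importlib
import os
import sys
import tempfile
import types

root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "clobber_pkg")   # hypothetical package name
os.mkdir(pkg_dir)

# The init module defines an object named "cars" ...
with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
    f.write("cars = 'defined in __init__'\n")

# ... and the package also contains a submodule with the same name.
with open(os.path.join(pkg_dir, "cars.py"), "w") as f:
    f.write("WHEELS = 4\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

import clobber_pkg
before = clobber_pkg.cars   # still the string from __init__.py

import clobber_pkg.cars
after = clobber_pkg.cars    # the import rebound the name to the submodule

print(type(before).__name__, type(after).__name__)   # str module
```

Nothing warns about the rebinding; the string is simply no longer
reachable as clobber_pkg.cars.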
This also applies when any two modules import each other. Although, as
I said, it's usually better to put the depended-on things in a third
module that both modules can import, and that itself imports nothing.
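Here is a sketch of the ordering rule and of the third-module layout,
assuming Python 3. The package names (good_pkg, bad_pkg, fixed_pkg), the
helper make_package and the names BASE and DOUBLE are all made up for
the illustration:

```python
import importlib
import os
import sys
import tempfile

root = tempfile.mkdtemp()
sys.path.insert(0, root)


def make_package(name, files):
    """Write a package `name`; `files` maps file name -> source text."""
    pkg_dir = os.path.join(root, name)
    os.mkdir(pkg_dir)
    for filename, source in files.items():
        with open(os.path.join(pkg_dir, filename), "w") as f:
            f.write(source)
    importlib.invalidate_caches()


# "Safe" order: BASE is defined before the submodule that needs it is imported.
make_package("good_pkg", {
    "__init__.py": "BASE = 10\nfrom good_pkg import child\n",
    "child.py": "from good_pkg import BASE\nDOUBLE = BASE * 2\n",
})

# Wrong order: the submodule runs while BASE does not exist yet.
make_package("bad_pkg", {
    "__init__.py": "from bad_pkg import child\nBASE = 10\n",
    "child.py": "from bad_pkg import BASE\nDOUBLE = BASE * 2\n",
})

# Third-module layout: the shared value lives in a module that imports nothing,
# so the order of statements in __init__.py stops mattering.
make_package("fixed_pkg", {
    "__init__.py": "from fixed_pkg import child\nfrom fixed_pkg._base import BASE\n",
    "_base.py": "BASE = 10\n",
    "child.py": "from fixed_pkg._base import BASE\nDOUBLE = BASE * 2\n",
})

import good_pkg
good_value = good_pkg.child.DOUBLE

try:
    import bad_pkg
    bad_failed = False
except ImportError:
    bad_failed = True   # BASE was not defined at the moment child.py ran

import fixed_pkg
fixed_value = fixed_pkg.child.DOUBLE

print(good_value, bad_failed, fixed_value)   # 20 True 20
```

good_pkg and bad_pkg differ only in the order of two lines in
__init__.py, which is what makes this kind of breakage easy to
introduce by accident.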
--
Mike Orr
