None of this is even about relative imports. Absolute imports are also
broken between the two modes of execution, as I tried to demonstrate
with my project structure. The *whole* import system breaks.
On 2020-01-12 3:12 p.m., Brendan Barnwell wrote:
On 2020-01-11 23:34, Steven D'Aprano wrote:
On Sun, Jan 12, 2020 at 11:59:20AM +1100, Chris Angelico wrote:
>The biggest difference is that scripts can't do relative imports.
How is that relevant? People keep mentioning minor differences between
different ways of executing different kinds of entities (scripts,
packages, submodules etc) but not why those differences are important or
why they would justify any change in the way -m works.
I don't think I endorse the details of the OP's proposal, but I do
agree that the process for executing Python files has some irritating
warts. In fact, I would say the problem is precisely that a
difference exists between running a "script" and a "module". So let
me explain why I think this is annoying.
The pain point is relative imports. The docs at
https://docs.python.org/3/reference/import.html#packages say:
"You can think of packages as the directories on a file system and
modules as files within directories, but don’t take this analogy too
literally since packages and modules need not originate from the file
system."
The basic problem is that the overwhelming majority of packages
and modules DO originate from the filesystem, and so people naturally
want to be able to use the filesystem directly to represent package
structure, REGARDLESS OF HOW OR WHETHER THE FILES ARE RUN OR IMPORTED.
I'm sorry to put that in caps but that is really the fundamental
issue. People want to be able to write something like "from . import
stuff" in a file, and know that that will work purely based on the
filesystem location in which that file is situated, regardless of how
the file is "accessed" by Python (i.e., as a module, script, program,
whatever you want to call it).
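For instance, here is a minimal sketch (the package and module names
are made up) of the same file behaving differently depending on how it
is reached:

    pkg/
        __init__.py
        stuff.py
        main.py          # contains: from . import stuff

    $ python -m pkg.main     # works: main.py runs as the module pkg.main
    $ python pkg/main.py     # fails on recent CPython with:
                             # ImportError: attempted relative import
                             # with no known parent package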
In other words, what non-expert users expect is that if there is a
directory called `foo` with a subdirectory `bar` with some more files,
that alone should be sufficient to establish that `foo` is a package
with `bar` as a subpackage and the other files available as modules
like `foo.stuff` and `foo.bar.morestuff`. (Some users perhaps
understand that the folders should have an __init__.py to be
considered part of the package, but I think even this is less well
understood in the era of namespace packages.) It should not matter
exactly how you "get to" these files in the first place --- that is,
it should not matter whether you are importing a file or running one
"as a script" or "as a module", nor should it matter precisely which
file you run. The mere fact that a file "a.py" exists and is in the
same directory with a file called "b.py" should be enough for "a.py"
to use "from . import b" and have it work, always.
Now, I realize that there are various reasons why it doesn't work
this way. Basically these reasons boil down to the fact that although
most packages are transparently represented by their file/directory
structure, there also exist namespace packages, which can have a
more diffuse file/directory structure, and it's also possible to
create "virtual" packages that have no filesystem representation at all.
But the documentation is a long, long way from making this clear.
For instance, it says this:
"For example, the following file system layout defines a top level
parent package with three subpackages:"
But that's not true! The filesystem layout itself does not define
the package! For relative import purposes, it only "counts" as a
package if it's imported, not if a file in it is run directly.
Otherwise it's just some files on disk, and if you run one of them "as
a script", no package exists as far as Python is concerned.
The documentation does go on to describe how __main__ works and
how the file's __name__ is set if it's run, and so on. But it does
all this using the term "package", which is a trap for the unwary,
because they already think package means "a directory with a certain
structure" and not "something you get via the `import` statement".
Ultimately, the problem is that users (especially beginners) want
to be able to put some files in a folder and have it work as a package
as long as they are working locally in that folder --- without messing
with sys.path or "installing" anything. In other words they want to
create a directory and put "my_script.py" in there, and then put
"mylib.py" in there and have the former use relative imports to get
stuff from the latter. But they can't.
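That is, a sketch of the failure, given nothing but:

    project/
        my_script.py     # contains: from . import mylib
        mylib.py

    $ cd project
    $ python my_script.py
    ImportError: attempted relative import with no known parent package

    # The absolute form "import mylib" does work here, because the
    # script's own directory is prepended to sys.path; but that only
    # holds while everything stays in one flat directory.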
Personally, I am in agreement that this behavior is extremely
bothersome. (In particular, the fact that __name__ becomes __main__
when the script is run, but is set to its usual name when it is
imported, was a poor design decision that creates confusing
asymmetries between the run and import cases.) It makes it
unnecessarily difficult to write small, self-contained programs which
make use of relative imports. Yes, it is better to write a setup.py
and specify the dependencies, and blah blah, but for small tasks
people often simply don't want to do that. They want to unzip their
files into a directory and have it work, without notifying Python
about installing anything or putting anything on the path.
As far as solutions, I think an idea worth considering would be a
new command-line option similar to "-m" which effectively says "run
this FILE that I am telling you, but pretend it is in whatever package
it seems to be in based on the directory structure". So suppose
the option is -f for "file as module". It means if I do "python -f
script.py", it would run that file, but correctly set up __package__
and so on so that "script.py" (and other files it imports) would be
able to use relative imports. Maybe that would mean they could
unexpectedly import higher than their level (i.e., use relative-import
dots going above the actual top level of the package), or maybe the
relative imports would be local to the directory where "script.py" is
located, or maybe you could even specify the relative import "root" in
a separate option, like "python -f script.py -r my/package/root".
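To make the idea concrete, here is a rough, purely illustrative sketch
of what such an option might do internally (no such flag exists, and
run_file_as_module is a made-up name): climb the directory tree while
__init__.py files mark package directories, put the resulting root on
sys.path, and then run the file as a submodule so relative imports work:

    import os
    import sys
    import runpy

    def run_file_as_module(path):
        """Run `path` as if invoked with -m under its on-disk package name."""
        path = os.path.abspath(path)
        parts = [os.path.splitext(os.path.basename(path))[0]]
        pkg_dir = os.path.dirname(path)
        # climb while each directory looks like a package on disk
        while os.path.exists(os.path.join(pkg_dir, "__init__.py")):
            parts.insert(0, os.path.basename(pkg_dir))
            pkg_dir = os.path.dirname(pkg_dir)
        sys.path.insert(0, pkg_dir)    # make the package root importable
        runpy.run_module(".".join(parts), run_name="__main__", alter_sys=True)

    if __name__ == "__main__":
        run_file_as_module(sys.argv[1])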
The basic point is that people want to use relative imports
without including boilerplate code to put themselves on sys.path, and
without caring about whether the file is run directly or imported as a
module, and without "installing" anything, and in general without
thinking about anything except the local directory structure in which
the file they are running is situated.
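(Today the usual substitute is boilerplate along these lines at the top
of the entry-point file; a sketch, with mypkg standing in for whatever
the real package is called:)

    import os
    import sys
    # put the package's parent directory on sys.path by hand
    sys.path.insert(0, os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
    from mypkg import mylib    # the absolute import now works; relative
                               # imports still fail, since __package__ is unset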
I realize that in many ways this is sloppy and you could say
"don't do that", but I think if that is the position, the
documentation needs to be seriously tightened up. In particular it
needs to be made clear --- at every single mention! --- that "package"
refers only to something that is imported and not to a file's
"identity" based on its filesystem location.
Just over six years ago I wrote an answer about this on
StackOverflow
(https://stackoverflow.com/questions/14132789/relative-imports-for-the-billionth-time/14132912#14132912)
that continues to get upvotes and comments of the form "wow why isn't
this explained in the documentation" almost daily. I hope it is clear
that, even if we want to leave the behavior exactly as it is, there is
a major problem with how people think they can use relative imports
based on the official documentation.