On 2020-01-11 23:34, Steven D'Aprano wrote:
> On Sun, Jan 12, 2020 at 11:59:20AM +1100, Chris Angelico wrote:
>
> > The biggest difference is that scripts can't do relative imports.
>
> How is that relevant? People keep mentioning minor differences between
> different ways of executing different kinds of entities (scripts,
> packages, submodules etc) but not why those differences are important or
> why they would justify any change in the way -m works.

I don't endorse every detail of the OP's proposal, but I do agree that the process for executing Python files has some irritating warts. In fact, I would say the problem is precisely that a difference exists between running a "script" and a "module". So let me explain why I think this is annoying.

The pain point is relative imports. The docs at https://docs.python.org/3/reference/import.html#packages say:

"You can think of packages as the directories on a file system and modules as files within directories, but don’t take this analogy too literally since packages and modules need not originate from the file system."

The basic problem is that the overwhelming majority of packages and modules DO originate from the filesystem, and so people naturally want to be able to use the filesystem directly to represent package structure, REGARDLESS OF HOW OR WHETHER THE FILES ARE RUN OR IMPORTED. I'm sorry to put that in caps but that is really the fundamental issue. People want to be able to write something like "from . import stuff" in a file, and know that that will work purely based on the filesystem location in which that file is situated, regardless of how the file is "accessed" by Python (i.e., as a module, script, program, whatever you want to call it).

In other words, what non-expert users expect is that if there is a directory called `foo` with a subdirectory `bar` with some more files, that alone should be sufficient to establish that `foo` is a package with `bar` as a subpackage and the other files available as modules like `foo.stuff` and `foo.bar.morestuff`. (Some users perhaps understand that the folders should have an __init__.py to be considered part of the package, but I think even this is less well understood in the era of namespace packages.) It should not matter exactly how you "get to" these files in the first place --- that is, it should not matter whether you are importing a file or running one "as a script" or "as a module", nor should it matter precisely which file you run. The mere fact that a file "a.py" exists and is in the same directory with a file called "b.py" should be enough for "a.py" to use "from . import b" and have it work, always.
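To make that concrete, here is a minimal sketch (the names `foo`, `stuff`, `main` are made up for illustration) showing that when a package IS reached via `import`, the directory structure alone really is enough for relative imports to work:

```python
import os
import sys
import tempfile

tmp = tempfile.mkdtemp()

# Lay out foo/ as a package purely on disk:
# foo/__init__.py, foo/stuff.py, foo/main.py
pkg = os.path.join(tmp, "foo")
os.mkdir(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "stuff.py"), "w") as f:
    f.write("VALUE = 42\n")
with open(os.path.join(pkg, "main.py"), "w") as f:
    f.write("from . import stuff\nRESULT = stuff.VALUE\n")

# Imported, the relative import inside foo/main.py just works:
sys.path.insert(0, tmp)
import foo.main
print(foo.main.RESULT)  # -> 42
```

This is exactly the behavior users then (reasonably) expect to carry over when one of those same files is run directly -- and it doesn't.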

Now, I realize that there are various reasons why it doesn't work this way. Basically these reasons boil down to the fact that although most packages are transparently represented by their file/directory structure, there also exist namespace packages, which can have a more diffuse file/directory structure, and it's also possible to create "virtual" packages that have no filesystem representation at all.

But the documentation is a long, long way from making this clear. For instance, it says this:

"For example, the following file system layout defines a top level parent package with three subpackages:"

But that's not true! The filesystem layout itself does not define the package! For relative import purposes, it only "counts" as a package if it's imported, not if a file in it is run directly. Otherwise it's just some files on disk, and if you run one of them "as a script", no package exists as far as Python is concerned.
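You can watch the same files flip between "package" and "just some files on disk" depending only on how they are invoked. A sketch (file names hypothetical), building the layout in a temp directory and running it both ways:

```python
import os
import subprocess
import sys
import tempfile

tmp = tempfile.mkdtemp()
pkg = os.path.join(tmp, "foo")
os.mkdir(pkg)
open(os.path.join(pkg, "__init__.py"), "w").close()
with open(os.path.join(pkg, "b.py"), "w") as f:
    f.write("VALUE = 1\n")
with open(os.path.join(pkg, "a.py"), "w") as f:
    f.write("from . import b\nprint(b.VALUE)\n")

# Run a.py directly "as a script": Python sees no package at all.
direct = subprocess.run(
    [sys.executable, os.path.join(pkg, "a.py")],
    capture_output=True, text=True,
)
print(direct.returncode != 0)                          # True
print("attempted relative import" in direct.stderr)    # True

# Run the identical file via -m from the parent directory: now the
# very same layout "counts" as a package and the import succeeds.
via_m = subprocess.run(
    [sys.executable, "-m", "foo.a"],
    capture_output=True, text=True, cwd=tmp,
)
print(via_m.stdout.strip())  # -> 1
```

Nothing on disk changed between the two runs; only the invocation did.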

The documentation does go on to describe how __main__ works and how the file's __name__ is set if it's run, and so on. But it does all this using the term "package", which is a trap for the unwary, because they already think package means "a directory with a certain structure" and not "something you get via the `import` statement".

Ultimately, the problem is that users (especially beginners) want to be able to put some files in a folder and have it work as a package as long as they are working locally in that folder --- without messing with sys.path or "installing" anything. In other words they want to create a directory and put "my_script.py" in there, and then put "mylib.py" in there and have the former use relative imports to get stuff from the latter. But they can't.

Personally, I am in agreement that this behavior is extremely bothersome. (In particular, the fact that __name__ becomes __main__ when the script is run, but is set to its usual name when it is imported, was a poor design decision that creates confusing asymmetries between the run and import cases.) It makes it unnecessarily difficult to write small, self-contained programs which make use of relative imports. Yes, it is better to write a setup.py and specify the dependencies, and blah blah, but for small tasks people often simply don't want to do that. They want to unzip their files into a directory and have it work, without notifying Python about installing anything or putting anything on the path.
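The __name__ asymmetry is easy to demonstrate with a one-line file (name hypothetical) that simply reports its own __name__ under each invocation:

```python
import os
import subprocess
import sys
import tempfile

tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "mod.py"), "w") as f:
    f.write("print(__name__)\n")

# Run as a script: the module is told it is "__main__".
as_script = subprocess.run(
    [sys.executable, os.path.join(tmp, "mod.py")],
    capture_output=True, text=True,
)
print(as_script.stdout.strip())  # -> __main__

# Imported: the same file sees its real name.
as_import = subprocess.run(
    [sys.executable, "-c", "import mod"],
    capture_output=True, text=True, cwd=tmp,
)
print(as_import.stdout.strip())  # -> mod
```

The `if __name__ == "__main__":` idiom is built on top of exactly this asymmetry, which is part of why it is so hard to explain to beginners.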

As far as solutions, I think an idea worth considering would be a new command-line option similar to "-m" which effectively says "run this FILE that I am telling you, but pretend it is in whatever package it seems to be in based on the directory structure". Suppose, for example, the option is -f for "file as module". It means if I do "python -f script.py", it would run that file, but correctly set up __package__ and so on so that "script.py" (and other files it imports) would be able to use relative imports. Maybe that would mean they could unexpectedly import higher than their level (i.e., use relative-import dots going above the actual top level of the package), or maybe the relative imports would be local to the directory where "script.py" is located, or maybe you could even specify the relative import "root" in a separate option, like "python -f script.py -r my/package/root".

The basic point is that people want to use relative imports without including boilerplate code to put themselves on sys.path, and without caring about whether the file is run directly or imported as a module, and without "installing" anything, and in general without thinking about anything except the local directory structure in which the file they are running is situated.
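For reference, this is the sort of sys.path boilerplate people paste into their scripts today to fake it (a sketch; the function name and the example path are made up). It puts the directory *above* the script's package on sys.path, so that absolute imports of sibling modules resolve no matter where the script was run from:

```python
import os
import sys

def add_parent_to_path(file_path):
    """Insert the grandparent directory of file_path (i.e. the parent
    of its package directory) at the front of sys.path."""
    parent = os.path.dirname(os.path.dirname(os.path.abspath(file_path)))
    sys.path.insert(0, parent)

# Typically called as add_parent_to_path(__file__) at the top of a script;
# shown here with a literal path for illustration (POSIX paths assumed).
add_parent_to_path("/home/me/proj/mypkg/script.py")
print(sys.path[0])  # -> /home/me/proj
```

Every copy of this snippet in the wild is evidence that the language is making people work around it rather than with it.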

I realize that in many ways this is sloppy and you could say "don't do that", but I think if that is the position, the documentation needs to be seriously tightened up. In particular it needs to be made clear --- at every single mention! --- that "package" refers only to something that is imported and not to a file's "identity" based on its filesystem location.

Just over six years ago I wrote an answer about this on StackOverflow (https://stackoverflow.com/questions/14132789/relative-imports-for-the-billionth-time/14132912#14132912) that continues to get upvotes and comments of the form "wow why isn't this explained in the documentation" almost daily. I hope it is clear that, even if we want to leave the behavior exactly as it is, there is a major problem with how people think they can use relative imports based on the official documentation.

--
Brendan Barnwell
"Do not follow where the path may lead. Go, instead, where there is no path, and leave a trail."
   --author unknown
_______________________________________________
Python-ideas mailing list -- python-ideas@python.org
To unsubscribe send an email to python-ideas-le...@python.org
https://mail.python.org/mailman3/lists/python-ideas.python.org/
Message archived at 
https://mail.python.org/archives/list/python-ideas@python.org/message/VE64KSEMU7IOUXSJ5HVFMDKTMXDUEZTG/
Code of Conduct: http://python.org/psf/codeofconduct/