On Aug 12, 2011, at 2:33 PM, P.J. Eby wrote:

> At 01:09 PM 8/12/2011 -0400, Glyph Lefkowitz wrote:
>> Upon further reflection, PEP 402 _will_ make dealing with namespace packages 
>> from this code considerably easier: we won't need to do AST analysis to look 
>> for a __path__ attribute or anything gross like that to improve correctness; we 
>> can just look in various directories on sys.path and accurately predict what 
>> __path__ will be synthesized to be.
> 
> The flip side of that is that you can't always know whether a directory is a 
> virtual package without deep inspection: one consequence of PEP 402 is that 
> any directory that contains a Python module (of whatever type), however 
> deeply nested, will be a valid package name.  So, you can't rule out that a 
> given directory *might* be a package, without walking its entire reachable 
> subtree.  (Within the subset of directory names that are valid Python 
> identifiers, of course.)

Are there any rules about passing invalid identifiers to __import__ though, or 
is that just less likely? :)

> However, you *can* quickly tell that a directory *might* be a package or is 
> *probably* one: if it contains modules, or is the same name as an 
> already-discovered module, it's a pretty safe bet that you can flag it as 
> such.
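
The two checks described above (the quick "probably a package" test and the full subtree walk) could be sketched roughly as follows. This is only an illustration under assumptions of mine: the function names are hypothetical, the list of module suffixes is a simplification, and `str.isidentifier()` stands in for "valid Python identifier" (it also accepts keywords). It looks only at the filesystem, so it says nothing about other importer backends.

```python
import os

# Suffixes treated as "a Python module" here; a simplifying assumption.
MODULE_SUFFIXES = (".py", ".pyc", ".so", ".pyd")


def _is_module_file(name):
    # The part before the first dot must be usable as a module name.
    stem, _, _ = name.partition(".")
    return name.endswith(MODULE_SUFFIXES) and stem.isidentifier()


def directly_contains_module(dirpath):
    """Quick check: the directory itself holds at least one module file."""
    return any(
        _is_module_file(entry.name)
        for entry in os.scandir(dirpath)
        if entry.is_file()
    )


def might_be_virtual_package(dirpath):
    """Deep check: some module exists anywhere in the reachable subtree.

    Only subdirectories whose names are valid identifiers are followed,
    since nothing below any other directory is reachable by a dotted name.
    """
    for _root, dirs, files in os.walk(dirpath):
        # Prune in place so os.walk does not descend into unreachable dirs.
        dirs[:] = [d for d in dirs if d.isidentifier()]
        if any(_is_module_file(f) for f in files):
            return True
    return False
```

The quick check is a single directory listing; the deep check is the one that can only answer "no" after visiting the whole reachable subtree.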

I still like the idea of a 'marker' file.  It would be great if there were a 
new marker like "__package__.py".  I say this more for the benefit of users 
looking at a directory on their filesystem and trying to understand whether 
this is a package or not than I do for my own programmatic tools though; it's 
already hard enough to understand the package-ness of a part of your filesystem 
and its interactions with PYTHONPATH; making directories mysteriously and 
automatically become packages depending on context will worsen that situation, 
I think.

I also have this not-terribly-well-defined idea that it would be handy for 
different providers of the _contents_ of namespace packages to provide their 
own instrumentation to be made aware that they've been added to the __path__ of 
a particular package.  This may be a solution in search of a problem, but I 
imagine that each __package__.py would be executed in the same module 
namespace.  This would allow namespace packages to do things like set up 
compatibility aliases, lazy imports, plugin registrations, etc, as they 
currently do with __init__.py.  Perhaps it would be better to define its 
relationship to the package-module namespace in a more sensible way than 
"execute all over each other in no particular order".

Also, if I had my druthers, Python would raise an exception if someone added a 
directory marked as a package to sys.path, and refuse to import things from it. 
And when a submodule was run as a script, it would add the nearest directory 
not marked as a package to sys.path, rather than the script's directory 
itself.  The whole 
"__name__ is wrong because your current directory was wrong when you ran that 
command" thing is so confusing to explain that I hope we can eventually consign 
it to the dustbin of history.  But if you can't even reasonably guess whether a 
directory is supposed to be an entry on sys.path or a package, that's going to 
be really hard to do.
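
To make the "nearest directory not marked as a package" idea concrete, here is a minimal sketch. It assumes the marker is a file in the directory; it defaults to today's `__init__.py`, and a new marker such as the "__package__.py" floated above would just be a different `marker` argument. The function names are hypothetical, and this is not anything Python does when running a script.

```python
import os


def nearest_non_package_dir(script_path, marker="__init__.py"):
    """Return the closest ancestor directory that lacks the marker file."""
    directory = os.path.dirname(os.path.abspath(script_path))
    while os.path.isfile(os.path.join(directory, marker)):
        parent = os.path.dirname(directory)
        if parent == directory:
            # Reached the filesystem root; nothing further to climb.
            break
        directory = parent
    return directory


def qualified_name(script_path, marker="__init__.py"):
    """Dotted module name of the script relative to that directory."""
    root = nearest_non_package_dir(script_path, marker)
    relative = os.path.relpath(os.path.abspath(script_path), root)
    return os.path.splitext(relative)[0].replace(os.sep, ".")
```

With a layout like proj/pkg/sub/mod.py, where pkg and sub both carry the marker, the first function returns proj and the second returns "pkg.sub.mod" regardless of the current directory. Without a marker there is nothing for the loop to test, which is the difficulty described above.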

> In any case, you probably should *not* do the building of a virtual path 
> yourself; the protocols and APIs added by PEP 402 should allow you to simply 
> ask for the path to be constructed on your behalf.  Otherwise, you are going 
> to be back in the same business of second-guessing arbitrary importer 
> backends again!

What do you mean by "building of a virtual path"?

> (E.g. note that PEP 402 does not say virtual package subpaths must be 
> filesystem or zipfile subdirectories of their parents - an importer could 
> just as easily allow you to treat subdirectories named 'twisted.python' as 
> part of a virtual package with that name!)
> 
> Anyway, pkgutil defines some extra methods that importers can implement to 
> support module-walking, and part of the PEP 402 implementation should be to 
> make this support virtual packages as well.

The more that this can focus on module-walking without executing code, the 
happier I'll be :).

>> This code still needs to support Python 2.4, but I will make a note of this 
>> for future reference.
> 
> A suggestion: just take the pkgutil code and bundle it for Python 2.4 as 
> something._pkgutil.  There's very little about it that's 2.5+ specific, at 
> least when I wrote the bits that do the module walking.
> 
> Of course, the main disadvantage of pkgutil for your purposes is that it 
> currently requires packages to be imported in order to walk their child 
> modules.  (IIRC, it does *not*, however, require them to be imported in order 
> to discover their existence.)

One of the stipulations of this code is that it might give different results 
depending on whether the modules are loaded or not.  So it's fine to inspect 
that first and then invoke pkgutil only in the 'loaded' case, with the 
knowledge that the not-loaded case may be incorrect in the face of certain 
configurations.
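
Roughly, that split might look like the sketch below. It is not the actual code under discussion: `child_module_names` is a hypothetical name, and the not-loaded branch is a deliberately naive scan of sys.path that only sees .py files and directories with an `__init__.py`, so it can disagree with the loaded case exactly as described. `pkgutil.iter_modules` is the real stdlib call and is handed the `__path__` of the already-imported package.

```python
import os
import pkgutil
import sys


def child_module_names(package_name):
    """Return (sorted child names, "loaded" or "not-loaded")."""
    module = sys.modules.get(package_name)
    if module is not None and hasattr(module, "__path__"):
        # Loaded: trust the real __path__ and let pkgutil ask the importers.
        names = {info[1] for info in pkgutil.iter_modules(list(module.__path__))}
        return sorted(names), "loaded"

    # Not loaded: guess from the filesystem without executing any code.
    relative = package_name.replace(".", os.sep)
    names = set()
    for entry in sys.path:
        candidate = os.path.join(entry or os.curdir, relative)
        if not os.path.isdir(candidate):
            continue
        for name in os.listdir(candidate):
            stem, ext = os.path.splitext(name)
            full = os.path.join(candidate, name)
            if ext == ".py" and stem != "__init__" and stem.isidentifier():
                names.add(stem)
            elif os.path.isdir(full) and os.path.isfile(
                os.path.join(full, "__init__.py")
            ):
                names.add(name)
    return sorted(names), "not-loaded"
```

Returning the "loaded" or "not-loaded" tag alongside the names lets a caller know which answer it got, and therefore how far to trust it.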

>> In that case, I guess it's a good thing; these bugs should be dealt with.  
>> Thanks for pointing them out.  My opinion of PEP 402 has been completely 
>> reversed - although I'd still like to see a section about the module system 
>> from a library/tools author point of view rather than a time-traveling Perl 
>> user's narrative :).
> 
> LOL.
> 
> If you will propose the wording you'd like to see, I'll be happy to check it 
> for any current-and-or-future incorrect assumptions.  ;-)

If I can come up with anything I will definitely send it along.

-glyph