At 03:09 PM 7/20/2011 -0700, Glenn Linderman wrote:
On 7/20/2011 6:05 AM, P.J. Eby wrote:
At 02:24 AM 7/20/2011 -0700, Glenn Linderman wrote:
When I read about creating __path__ from sys.path, I immediately
thought of the issue of programs that extend sys.path, and the
above is the "workaround" for such programs. But it requires
such programs to do work, and there are a lot of such programs (I,
a relative newbie, have had to write some). As it turns out, I
can't think of a situation where I have extended sys.path that
would result in a problem for fancy namespace packages, because so
far I've only written modules, not packages, and only modules are
on the paths that I add to sys.path. But that does not make
for a general solution.
Most programs extend sys.path in order to import things. If those
things aren't yet imported, they don't have a __path__ yet, and so
don't need to be fixed. Only programs that modify sys.path
*after* importing something that has a dynamic __path__ would need
to do anything about that.
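
To make the ordering issue concrete, here is a rough sketch using today's pkgutil.extend_path() as a stand-in for the proposed mechanism (the proposal's extend_virtual_paths() is not used here). The package name demo_ns_pkg, the helper make_portion, and the temporary directory layout are all made up for illustration. A __path__ computed at import time does not notice a later sys.path.append():

```python
import importlib
import os
import pkgutil
import sys
import tempfile


def make_portion(root, pkg, module, body):
    """Create root/pkg/__init__.py (calling extend_path) and root/pkg/module.py."""
    pkg_dir = os.path.join(root, pkg)
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("import pkgutil\n"
                "__path__ = pkgutil.extend_path(__path__, __name__)\n")
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write(body)
    return pkg_dir


# Two separate directories, each holding one portion of the same package.
base = tempfile.mkdtemp()
dir_a = os.path.join(base, "a")
dir_b = os.path.join(base, "b")
make_portion(dir_a, "demo_ns_pkg", "first", "VALUE = 1\n")
make_portion(dir_b, "demo_ns_pkg", "second", "VALUE = 2\n")

sys.path.append(dir_a)
importlib.invalidate_caches()
import demo_ns_pkg  # __path__ is computed here, from sys.path as it is now

sys.path.append(dir_b)  # extended *after* the import: __path__ is not updated
importlib.invalidate_caches()
path_after_late_append = list(demo_ns_pkg.__path__)
try:
    import demo_ns_pkg.second
    late_import_worked = True
except ImportError:
    late_import_worked = False

# The manual fix-up a program would have to remember to perform.
demo_ns_pkg.__path__ = pkgutil.extend_path(demo_ns_pkg.__path__,
                                           demo_ns_pkg.__name__)
import demo_ns_pkg.second
```

Had dir_b been appended before the first import, no fix-up would have been needed, which is the common case described above.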
Sure. But there are a lot of things already imported by Python
itself, and if this mechanism gets used in the stdlib, a program
wouldn't know whether it is safe to skip the
pkgutil.extend_virtual_paths() call.
I'm not sure I see how the mechanism could meaningfully be used in
the stdlib, since IIUC we're not going for Perl-style package
naming. So, all stdlib packages would be self-contained.
Plus, that requires importing pkgutil, which isn't necessarily done
by every program that extends the sys.path ("import sys" is
sufficient at present).
Plus, if some 3rd party packages are imported before sys.path is
extended, the knowledge of how they are implemented is required to
make a choice about whether it is needed to import pkgutil and call
extend_virtual_paths or not.
I'd recommend *always* using it, outside of simple startup code.
So I am still left with my original question:
Is there some way to create a new __path__ that would reflect the
fact that it has been dynamically created, rather than set from
__init__.py, and then when it is referenced, calculate (and
cache?) a new value of __path__ to actually search?
Hm. Yes, there is a way to do something like that, but it would
complicate things a bit. We'd need to:
1. Leave __path__ off of the modules, and always pull them from
sys.virtual_package_paths, and
2. Before using a value in sys.virtual_package_paths, we'd need to
check whether sys.path had changed since we last cached anything, and
if so, clear sys.virtual_package_paths first, to force a refresh.
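
A minimal sketch of those two steps, purely illustrative: the class VirtualPathCache and the function scan_sys_path are hypothetical names, and the cache is an ordinary object rather than an actual sys.virtual_package_paths attribute. It assumes a virtual package's path is simply every sys.path entry that contains a directory of that name:

```python
import os
import sys


class VirtualPathCache:
    """Hypothetical stand-in for the proposed sys.virtual_package_paths."""

    def __init__(self, compute):
        self._compute = compute            # callable: package name -> list of dirs
        self._paths = {}                   # cached results, keyed by package name
        self._snapshot = tuple(sys.path)   # sys.path as of the last refresh

    def get(self, name):
        # Step 2: if sys.path changed (mutated in place or rebound),
        # drop everything that was cached to force a refresh.
        current = tuple(sys.path)
        if current != self._snapshot:
            self._paths.clear()
            self._snapshot = current
        # Step 1: the path always comes from the cache, never from a
        # __path__ attribute stored on the module.
        if name not in self._paths:
            self._paths[name] = self._compute(name)
        return self._paths[name]


computed = []  # records each recomputation, for demonstration only


def scan_sys_path(name):
    """Assumed rule: one portion per sys.path entry containing a 'name' dir."""
    computed.append(name)
    return [os.path.join(entry, name)
            for entry in sys.path
            if isinstance(entry, str)
            and os.path.isdir(os.path.join(entry, name))]


cache = VirtualPathCache(scan_sys_path)
cache.get("demo")                  # computed on first use
cache.get("demo")                  # served from the cache
sys.path.append("hypothetical-extra-dir")
cache.get("demo")                  # sys.path changed, so recomputed
sys.path.remove("hypothetical-extra-dir")
```

Comparing a snapshot of sys.path on each lookup costs a tuple comparison per access, but it catches both in-place mutation and rebinding without hooking the list itself.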
This doesn't sound particularly forbidding, but there are various
unpleasant consequences, like being unable to tell whether a module
is a package or not, and whether it's a virtual package or not. We'd
have to invent new ways to denote these things.
On the bright side, though, it *would* allow transparent live updates
to virtual package paths, so it might be worth considering.
By the way, the reason we have to get rid of __path__ is that if we
kept it, then code could change it, and then we wouldn't know if it
was actually safe to change it automatically... even if no code had
actually changed it.
In principle, we could keep __path__ attributes around, and
automatically update them in the case where sys.path has changed, so
long as user code hasn't directly altered or replaced the
__path__. But it seems to me to be a dangerous corner case; I'd
rather that code which touches __path__ be taking responsibility for
that path's correctness from then on, rather than having it get
updated (possibly incorrectly) behind its back.
So, I'd say that for this approach, we'd have to actually leave
__path__ off of virtual packages' parent modules.
Anyway, it seems worth considering. We just need to sort out what
the downsides are for any current tools thinking that such modules
aren't packages. (But hey, at least it'll be consistent with what
such tools would think of the on-disk representation! That is, a
tool that thinks foo.py alongside a foo/ subdirectory is just a
module with no package, will also think that 'foo', once imported, is
a module with no package.)
And, in the absence of knowing (because I didn't write them) whether
any of the packages I imported before extending sys.path are virtual
packages or not, I would have to do this every time I extend
sys.path. And so it becomes a burden on writing programs.
If the code is so boilerplate as you describe, should sys.path
become an object that acts like a list, instead of a list, and have
its append method automatically do the pkgutil.extend_virtual_paths
for me? Then I wouldn't have to worry about whether any of the
packages I imported were virtual packages or not.
Well, then we'd have to worry about other mutation methods, and
things like 'sys.path = [blah, blah]', as well. So if we're going to
ditch the need for extend_virtual_paths(), we should probably do it
via the absence of __path__ attributes.
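
To illustrate the problem with hooking append(), here is a small sketch. NotifyingPath is a hypothetical list subclass, not a proposed API, and a SimpleNamespace stands in for the sys module so that the real sys.path is left alone. Only append() is hooked, as in the suggestion above:

```python
import types


class NotifyingPath(list):
    """Hypothetical list subclass that reports append() calls to a callback."""

    def __init__(self, iterable=(), on_change=None):
        super().__init__(iterable)
        self._on_change = on_change

    def append(self, item):
        super().append(item)
        if self._on_change is not None:
            self._on_change(item)


seen = []                            # entries the hook was told about
p = NotifyingPath(["a"], on_change=seen.append)

p.append("b")                        # hooked
p.extend(["c"])                      # not hooked
p.insert(0, "d")                     # not hooked
p += ["e"]                           # not hooked
p[0:0] = ["f"]                       # not hooked

# Rebinding replaces the special object with a plain list entirely.
fake_sys = types.SimpleNamespace(path=p)
fake_sys.path = ["blah", "blah"]
```

After this runs, seen contains only "b", even though five entries were added, and fake_sys.path is an ordinary list with no hook at all. Covering every mutation method is possible, but nothing on the list object can intercept the rebinding.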
_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev