On 09/06/2005, at 4:38 PM, Nicolas Lehuen wrote:
Erm, so, no, handlers could also be imported from the document tree.
No problem, we can do that, but the security issues pop up once again.

We can't protect against everything though. ;-)

I've understood your point, but there is a difficulty in judging from a
PythonHandler directive whether the handler should be loaded as a
standard Python module, from the sys.path, or as a dynamic Python
module, from the document tree. Maybe the context of the directive
could be used for that; if the directive is defined at the server or
virtual host level, then it's a top level handler, otherwise if it is
defined in a Location or Directory (or .htaccess file), then it's a
handler that should be loaded from the document tree (with a possible
fallback to sys.path if it is not found?).

If PythonImport is used, the module can only come from sys.path as
there is no connection with a physical directory. Similarly, where
PythonHandler is defined in a Location directive, there is no
connection with a directory and thus the handler can only come from
sys.path. Thus, only where the Directory directive is used, or
PythonHandler is specified in an actual .htaccess file, do we have a
physical directory and can use apache.import_module().
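
A minimal sketch of that rule, purely for illustration. The function
name, the context labels and the return values are hypothetical and are
not part of mod_python:

```python
# Hypothetical names throughout; this only restates the rule described above.
SYS_PATH = "sys.path"
DOCUMENT_TREE = "document tree"


def handler_source(directive, context, directory=None):
    """Say where a module named by a directive would be loaded from.

    directive: "PythonImport" or "PythonHandler"
    context:   "server", "virtualhost", "location", "directory" or "htaccess"
    directory: the physical directory tied to the context, if there is one
    """
    # PythonImport has no connection with a physical directory.
    if directive == "PythonImport":
        return SYS_PATH
    # Only Directory sections and .htaccess files give us a directory.
    if context in ("directory", "htaccess") and directory is not None:
        return DOCUMENT_TREE
    # Server, virtual host and Location contexts fall through to sys.path.
    return SYS_PATH
```

With this, a PythonHandler in a Location section maps to sys.path, while
the same directive in a .htaccess file maps to the document tree.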

Anyway, saying that "import" should be used to import from the
sys.path and apache.import_module should be used to import from the
document tree looks like a clean rule, easy to understand and to
implement.

The suggestion I've made in my former (way too long) mail was simply
that when a module is not found from the document tree, we could fall
back to a careful standard import from the sys.path, but this would
blur the otherwise clean separation between standard and dynamic
modules.

At the moment I am a bit worried about falling back on sys.path in
apache.import_module() itself, partly because it confuses the two
concepts, but also because I am not sure there aren't some strange
problems lurking in there if that was done. What can be considered
though is an alternative fallback search path, one that is distinct
from sys.path but where mod_python style module loading is used if the
module is found in the alternate path. I have implemented this in
Vampire to see how it might work in practice. The main thing that
it allows is for utility code to use apache.import_module() without
having to look up some special configuration mechanism to determine
a special directory from which it should otherwise load modules. Jury
is out on this one at the moment though as to whether it is a good
idea or not. :-)
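
To make the idea concrete, here is a rough sketch of such a lookup. It
is not the Vampire implementation; the function name and the
fallback_path parameter are assumptions made for the example, and
sys.path is deliberately never consulted:

```python
import os


def find_module_file(name, directory, fallback_path=()):
    """Return the path of name.py, or None if it cannot be found.

    The handler's own directory is searched first, then each directory
    in the separate fallback path. sys.path plays no part in the search.
    """
    for candidate_dir in [directory, *fallback_path]:
        candidate = os.path.join(candidate_dir, name + ".py")
        if os.path.isfile(candidate):
            return candidate
    return None
```

Whatever file is found would then be handed to the same mod_python
style loader, regardless of which of the two places it came from.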

Time now to start bringing up some of the other issues that have to be
dealt with. The first is that the current apache.import_module() is
able to support packages to a degree. It's not perfect though, as is
evidenced I think by problems importing mod_python.psp. This is more
to do with it not setting up the import exactly as the standard
Python module importer requires it to look. If apache.import_module()
doesn't use sys.modules, it will not matter how it sets things up as
long as it is able to provide the same behaviour.

The question you might be asking is whether we need to support
packages. Well, it's a question I am not sure I have a good answer for.
I know I have seen people posting on the mailing list with examples of
package use, eg:

  http://www.modpython.org/pipermail/mod_python/2005-May/018182.html

In that case though they were using "from/import" and not actually
using apache.import_module(). They did have the package stored in the
document tree though. So, not sure if in practice people are using
packages with apache.import_module() or not.

If packages were supported there would still be a few things to do.
First is that the module loader, when given a module name which equates
to the name of a directory, would need to see if the directory contains
an __init__.py file and if it does, load that file as the module.
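
The directory check described above could look something like the
following sketch. The function name is hypothetical, and checking for
the package before the plain module file is just the choice made here:

```python
import os


def locate_module_source(directory, name):
    """Return the source file to load for `name`, or None if not found."""
    # A directory of that name holding an __init__.py is a package, in
    # which case the __init__.py file itself is loaded as the module.
    package_init = os.path.join(directory, name, "__init__.py")
    if os.path.isfile(package_init):
        return package_init
    # Otherwise look for an ordinary single file module.
    module_file = os.path.join(directory, name + ".py")
    if os.path.isfile(module_file):
        return module_file
    return None
```

Note that a directory without an __init__.py file is ignored here
rather than being treated as a package.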

The big problem now is that if the __init__.py file uses a standard
import statement with the expectation that it will grab the module
or package from within the same directory, it will not work. This is
because the Python import system will not know that the module is to
be treated as a package and so will not look in that local directory
first.

I got past this problem in Vampire through the use of the import hook.
Vampire would stash a special global object in the module so the import
hook knew that it was a special Vampire managed module and would grab
the module from the local directory and import it using the Vampire
module importing system rather than the standard Python module
importer. At the moment though this only works at global scope and not
when import is used in the handler code when executed, although I can
probably solve that.

Although from/import syntax also works, if it tries to import "a.b" from
a subpackage, it will not work if "b" wasn't explicitly imported by "a"
to begin with.

In summary, I haven't been able to get package imports to work
correctly. If it can't be made to work, then we would have to say that
packages are not supported by apache.import_module(). If people are
using it to import packages now, they will have no choice but to stop
using packages for handlers in the document tree. If a utility package
is in the document tree, it will have to be moved outside of the
document tree and sys.path set to that location, as import isn't going
to work for it if we don't allow document tree directories into
sys.path.

The question thus, if you understand what I am raving about, is
whether it is reasonable that packages will not be supported by
apache.import_module(). There is a slim chance someone's code may
break as a result but the majority would work fine.

Enough for now.

Graham
