On 09/06/2005, at 4:38 PM, Nicolas Lehuen wrote:
Erm, so, no, handlers could also be imported from the document tree. No problem, we can do that, but the security issues pop up once again.
We can't protect against everything though. ;-)
I've understood your point, but there is a difficulty in judging from a PythonHandler directive whether the handler should be loaded as a standard Python module, from the sys.path, or as a dynamic Python module, from the document tree. Maybe the context of the directive could be used for that; if the directive is defined at the server or virtual host level, then it's a top level handler, otherwise if it is defined in a Location or Directory (or .htaccess file), then it's a handler that should be loaded from the document tree (with a possible fallback to sys.path if it is not found?).
If PythonImport is used, the module can only come from sys.path as there is no connection with a physical directory. Similarly, when PythonHandler is defined in a Location directive, there is no connection with a directory and so the handler can only come from sys.path. Thus, only where the Directory directive is used, or PythonHandler is specified in an actual .htaccess file, do we have a physical directory and can use apache.import_module().
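As a rough illustration of the rule just described, here is a minimal sketch. The function name, the context labels and the two constants are hypothetical and are not part of mod_python; the sketch only encodes the decision of where a module would be looked up, not the loading itself.

```python
# Hypothetical sketch only: none of these names exist in mod_python.
SYS_PATH = "sys.path"
DOCUMENT_TREE = "document tree"


def import_source(directive, context):
    """Return where a module would be looked up under the rule above.

    directive: "PythonImport" or "PythonHandler"
    context:   "server", "virtualhost", "location", "directory" or "htaccess"
    """
    # PythonImport has no connection with a physical directory.
    if directive == "PythonImport":
        return SYS_PATH
    # Only Directory sections and .htaccess files map to a physical
    # directory, so only there could apache.import_module() be used.
    if context in ("directory", "htaccess"):
        return DOCUMENT_TREE
    # Server, virtual host and Location contexts have no directory.
    return SYS_PATH
```

For example, import_source("PythonHandler", "location") gives sys.path, while the same directive in a .htaccess file gives the document tree.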
Anyway, saying that "import" should be used to import from the sys.path and apache.import_module should be used to import from the document tree looks like a clean rule, easy to understand and to implement. The suggestion I made in my former (way too long) mail was simply that when a module is not found in the document tree, we could fall back to a careful standard import from the sys.path, but this would appear to blur this clean separation between standard and dynamic modules.
At the moment I am a bit worried about falling back on sys.path in apache.import_module() itself, partly because it confuses the two concepts, and partly because I am not sure there aren't some strange problems lurking in there as well if that was done. What can be considered though is an alternative fallback search path, one that is distinct from sys.path, but where mod_python style module loading is still used if the module is found in the alternate path. I have implemented this in Vampire to see how it might work in practice. The main thing it allows is for utility code to use apache.import_module() without having to look up some special configuration mechanism to determine the directory it should load from. The jury is still out on whether it is a good idea or not. :-)

Time now to start bringing up some of the other issues that have to be dealt with. The first is that the current apache.import_module() is able to support packages to a degree. It's not perfect though, as is evidenced I think by problems importing mod_python.psp. This has more to do with it not setting up the import exactly as the standard Python module importer requires it to look. If apache.import_module() doesn't use sys.modules, it will not matter how it sets things up, as long as it is able to provide the same behaviour.

The question you might be asking is whether we need to support packages. Well, it's a question I am not sure I have a good answer for. I know I have seen people posting on the mailing list with examples of package use, eg:

  http://www.modpython.org/pipermail/mod_python/2005-May/018182.html

In that case though they were using "from/import" and not actually using apache.import_module(). They did have the package stored in the document tree though. So, I'm not sure whether in practice people are using packages with apache.import_module() or not.

If packages were supported there would still be a few things to do.
First is that the module loader, when given a module name which equates to the name of a directory, would need to see if the directory contains a __init__.py file and, if it does, load that file as the module.

The big problem now is that if the __init__.py file uses a standard import statement with the expectation that it will grab the module or package from within the same directory, it will not work. This is because the Python import system will not know that the module is to be treated as a package, and so will not look in that local directory first.

I got past this problem in Vampire through the use of the import hook. Vampire would stash a special global object in the module so the import hook knew that it was a special Vampire managed module, and would grab the module from the local directory and import it using the Vampire module importing system rather than the standard Python module importer. At the moment though this only works at global scope and not when import is used in the handler code when executed, although I can probably solve that. Although from/import syntax also works, if it tries to import "a.b" from a subpackage, it will not work if "b" wasn't explicitly imported by "a" to begin with.

In summary, I haven't been able to get package imports to work correctly. If it can't be made to work, then we would have to say that packages are not supported by apache.import_module(). If people are using it to import packages now, they will have no choice but to stop using packages for handlers in the document tree. If a utility package is in the document tree, it will have to be moved outside of the document tree and sys.path set to that location, as import isn't going to work for it if we don't allow document tree directories into sys.path.

The question thus, if you understand what I am raving about, is whether it is reasonable that packages will not be supported by apache.import_module().
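To make the first step concrete, here is a minimal sketch of a loader that treats a directory containing __init__.py as the module and loads it by file path without touching sys.modules. It is written against the modern standard library importlib, which is an assumption of this sketch and not how mod_python or Vampire implement loading; both function names are hypothetical.

```python
import importlib.util
import os
import sys


def find_module_file(directory, name):
    """Return the file to load for 'name' inside 'directory', or None."""
    # A directory with __init__.py is treated as the module itself.
    package_init = os.path.join(directory, name, "__init__.py")
    if os.path.isfile(package_init):
        return package_init
    # Otherwise look for an ordinary single-file module.
    module_file = os.path.join(directory, name + ".py")
    if os.path.isfile(module_file):
        return module_file
    return None


def load_without_sys_modules(directory, name):
    """Load 'name' from 'directory' by path; sys.modules is not updated."""
    path = find_module_file(directory, name)
    if path is None:
        raise ImportError("no module %r in %s" % (name, directory))
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    # Executes the module code; the caller keeps the only reference.
    spec.loader.exec_module(module)
    return module
```

A plain "import sibling" inside a __init__.py loaded this way would still be resolved through sys.path rather than the package directory, which is the problem described above.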
There is a slim chance someone's code may break as a result of dropping package support, but the majority would work fine.

Enough for now.

Graham
