[ http://issues.apache.org/jira/browse/MODPYTHON-115?page=all ]
Graham Dumpleton updated MODPYTHON-115:
---------------------------------------

    Assign To: Graham Dumpleton

> import_module() and multiple modules of same name.
> --------------------------------------------------
>
>          Key: MODPYTHON-115
>          URL: http://issues.apache.org/jira/browse/MODPYTHON-115
>      Project: mod_python
>         Type: Bug
>   Components: core
>     Versions: 3.1.4, 3.2.7
>     Reporter: Graham Dumpleton
>     Assignee: Graham Dumpleton
>
> The "apache.import_module()" function is a thin wrapper over the standard
> Python module importing system. This means that modules are still stored in
> "sys.modules". As modules in "sys.modules" are keyed by their module name,
> this in turn means that there can only be one active instance of a module for
> a specific name.
>
> The "import_module()" function tries to work around this by comparing the
> path name of the location of a loaded module against the one being requested
> and, if they differ, reloading the correct module. This path check, though,
> only occurs when the "path" argument is actually supplied to the
> "import_module()" function. The "path" is only supplied in this way when
> mod_python.publisher makes use of the "import_module()" function; it is not
> supplied when the "Python*Handler" directives are used, because in that
> circumstance a module may actually be a system module and supplying "path"
> would prevent it from being found.
>
> Even though mod_python.publisher supplies the "path" argument to the
> "import_module()" function, the check of the path has bugs, with modules
> possibly becoming inaccessible, as documented in JIRA as MODPYTHON-9.
>
> The check by mod_python of the path name of the actual code file for a module
> to determine whether it should be reloaded can also cause a continual cycle
> of module reloading even though the modules on disk may not have changed.
> This will occur when successive requests alternate between URLs related to
> the distinct modules having the same name. This cyclic reloading is
> documented in JIRA as MODPYTHON-10.
>
> Because a module is reloaded into the same object space as the existing
> module when two modules of the same name are in different locations,
> namespace pollution and security issues can also arise if one location for
> the module is public and the other private. This cross contamination of
> modules is documented in JIRA as MODPYTHON-11.
>
> With respect to the "Python*Handler" directives, where the "path" argument
> was never supplied to the "import_module()" function, the result would be
> that the first module loaded under the specified name would be used. Thus,
> any subsequent module of the same name referred to by a "Python*Handler"
> directive found in a different directory but within the same interpreter
> would in effect be ignored.
>
> A caveat to this, though, is that such a "Python*Handler" directive would
> result in that handler's directory being inserted at the head of "sys.path".
> If the first instance of the module loaded under that name were at some point
> modified, the module would be automatically reloaded, but it would load the
> version from the different directory.
>
> Now, although these problems as they relate to mod_python.publisher are
> addressed in mod_python 3.2.6, the underlying problems in "import_module()"
> are not. As the bug reports relating to mod_python.publisher have been
> closed off as resolved, I am creating this bug report to track the
> underlying problem as it applies to the "Python*Handler" directives and to
> explicit use of "import_module()".
>
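The cross contamination described above (MODPYTHON-11) can be approximated outside Apache with only the standard library. What follows is a minimal sketch, not mod_python's code: the "load_into" helper, the file names and the "ADMIN_TOKEN" variable are all hypothetical, and executing source into an existing module's namespace is assumed to be a fair stand-in for what a reload into the same object space does.

```python
import types


def load_into(module, source, filename):
    """Execute source in the module's existing namespace (hypothetical helper).

    Nothing is cleared first, so names defined by earlier code survive.
    """
    module.__file__ = filename
    exec(compile(source, filename, "exec"), module.__dict__)
    return module


# Two hypothetical files that share the module name "index".
private_src = (
    "ADMIN_TOKEN = 'not-for-public-use'\n"
    "def handler():\n"
    "    return 'private'\n"
)
public_src = (
    "def handler():\n"
    "    return 'public'\n"
)

mod = types.ModuleType("index")
load_into(mod, private_src, "/private/index.py")
load_into(mod, public_src, "/public/index.py")

# The public code has replaced handler(), but the private name is still there.
active_handler = mod.handler()
leaked = hasattr(mod, "ADMIN_TOKEN")
```

After the second load the module reports the public file as its origin, yet code running in it can still see the name defined by the private file.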
> To illustrate the issue as it applies to the "Python*Handler" directives,
> create two separate directories, each with a .htaccess file containing:
>
>   AddHandler mod_python .py
>   PythonHandler index
>   PythonDebug On
>
> In the "index.py" file in each separate directory put:
>
>   import os
>   from mod_python import apache
>
>   def handler(req):
>       req.content_type = 'text/plain'
>       # Report the process ID and the file this handler was loaded from.
>       print >> req, os.getpid(), __file__
>       return apache.OK
>
> Assuming these are accessed as:
>
>   /~grahamd/mod_python_9/subdir-1/index.py
>   /~grahamd/mod_python_9/subdir-2/index.py
>
> access the first URL, and the result will be:
>
>   10665 /Users/grahamd/Sites/mod_python_9/subdir-1/index.py
>
> Now access the second URL and we get:
>
>   10665 /Users/grahamd/Sites/mod_python_9/subdir-1/index.py
>
> Note that this assumes the same child process handled both requests, so
> configuring Apache to run a single child process is required for this test.
>
> As one can see, it doesn't actually use the "subdir-2/index.py" module at all
> and still uses the "subdir-1/index.py" module.
>
> If one modifies "subdir-1/index.py" so that its timestamp is updated and
> loads the second URL again, we get:
>
>   10665 /Users/grahamd/Sites/mod_python_9/subdir-2/index.py
>
> This occurs because mod_python detects the change in the first module loaded
> but, because the second handler directory is now at the head of sys.path,
> the reload picks up the latter module.
>
> These issues with same-named modules in multiple locations are listed as
> ISSUE 14 in my list of module importer problems. See:
>
>   http://www.dscpl.com.au/articles/modpython-003.html
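The behaviour seen in this test follows from how Python itself caches modules by name, so it can be approximated without Apache or mod_python. The sketch below assumes a modern Python 3 with "importlib", not the Python 2 interpreter mod_python ran on. The module name "demo_index" (standing in for "index"), the "WHERE" variable and the "make_handler_dir" helper are hypothetical, and the "sys.path.insert(0, ...)" calls are an assumed stand-in for what a "Python*Handler" directive does with the handler's directory.

```python
import importlib
import os
import shutil
import sys
import tempfile

MODULE_NAME = "demo_index"  # hypothetical name, stands in for "index"


def make_handler_dir(root, dirname):
    """Create root/dirname/demo_index.py and return the directory path."""
    path = os.path.join(root, dirname)
    os.makedirs(path)
    with open(os.path.join(path, MODULE_NAME + ".py"), "w") as f:
        f.write("WHERE = %r\n" % dirname)
    return path


root = tempfile.mkdtemp()
dir1 = make_handler_dir(root, "subdir-1")
dir2 = make_handler_dir(root, "subdir-2")

# First request: the handler directory goes to the head of sys.path.
sys.path.insert(0, dir1)
importlib.invalidate_caches()
first = importlib.import_module(MODULE_NAME)

# Second request: a different directory, but the same module name.
sys.path.insert(0, dir2)
importlib.invalidate_caches()
second = importlib.import_module(MODULE_NAME)
seen_before_reload = second.WHERE  # served from sys.modules, not from dir2

# A reload searches sys.path again, so it now finds the copy in dir2.
reloaded = importlib.reload(first)
seen_after_reload = reloaded.WHERE

# Undo the global changes made by this sketch.
sys.path.remove(dir1)
sys.path.remove(dir2)
sys.modules.pop(MODULE_NAME, None)
shutil.rmtree(root)
```

The second import returns the very same module object as the first, and only the reload switches it over to the code from the second directory, which mirrors the sequence of results shown in the test.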