[ http://issues.apache.org/jira/browse/MODPYTHON-115?page=all ]

Graham Dumpleton updated MODPYTHON-115:
---------------------------------------

    Assign To: Graham Dumpleton

> import_module() and multiple modules of same name.
> --------------------------------------------------
>
>          Key: MODPYTHON-115
>          URL: http://issues.apache.org/jira/browse/MODPYTHON-115
>      Project: mod_python
>         Type: Bug
>   Components: core
>     Versions: 3.1.4, 3.2.7
>     Reporter: Graham Dumpleton
>     Assignee: Graham Dumpleton

>
> The "apache.import_module()" function is a thin wrapper over the standard 
> Python module importing system. This means that modules are still stored in 
> "sys.modules". As modules in "sys.modules" are keyed by their module name, 
> this in turn means that there can only be one active instance of a module for 
> a specific name.
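The one-module-per-name behaviour can be reproduced outside Apache. The following is a minimal sketch that assumes Python 3's standard importlib rather than mod_python's own importer; the module name "samename_demo_a" and the helper functions are hypothetical and exist only for this illustration.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "samename_demo_a"  # hypothetical name, chosen to avoid clashes


def write_module(directory, label):
    """Create MODULE_NAME.py in `directory`, defining a LABEL constant."""
    path = os.path.join(directory, MODULE_NAME + ".py")
    with open(path, "w") as handle:
        handle.write("LABEL = %r\n" % label)
    return path


def import_from(directory):
    """Put `directory` at the head of sys.path, then import by name."""
    sys.path.insert(0, directory)
    importlib.invalidate_caches()
    return importlib.import_module(MODULE_NAME)


# Two directories, each holding a different module with the same name.
base = tempfile.mkdtemp()
dir_one = os.path.join(base, "subdir-1")
dir_two = os.path.join(base, "subdir-2")
os.mkdir(dir_one)
os.mkdir(dir_two)
write_module(dir_one, "one")
write_module(dir_two, "two")

first = import_from(dir_one)
second = import_from(dir_two)  # served from sys.modules, not from dir_two

print(first.LABEL, second.LABEL)  # one one
print(first is second)            # True
```

The second import never looks at "subdir-2" at all, because the name is already present in "sys.modules".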
> The "import_module()" function tries to work around this by checking the path 
> name of the location of a module against that being requested and, if it is 
> different, reloading the correct module. This check of the path, though, only 
> occurs when the "path" argument is actually supplied to the "import_module()" 
> function. The "path" is only supplied in this way when mod_python.publisher 
> makes use of the "import_module()" function. It is not supplied when the 
> "Python*Handler" directives are used, because in that circumstance a module 
> may actually be a system module and supplying "path" would prevent it from 
> being found.
> Even though mod_python.publisher supplies the "path" argument to the 
> "import_module()" function, the check of the path has bugs, with modules 
> possibly becoming inaccessible as documented in JIRA as MODPYTHON-9. 
> The check mod_python makes of the path name of the actual code file for a 
> module, to determine whether it should be reloaded, can also cause a continual 
> cycle of module reloading even though the modules on disk have not changed. 
> This occurs when successive requests alternate between URLs related to 
> distinct modules having the same name. This cyclic reloading is documented in 
> JIRA as MODPYTHON-10.
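The cycle can be pictured with a toy model. This is not mod_python's code: the class "PathCheckingCache" and its method are hypothetical, and the sketch assumes only what is described above, namely one cache slot per module name plus a reload whenever the cached file differs from the requested one.

```python
class PathCheckingCache:
    """Toy cache holding one entry per module name, as sys.modules does."""

    def __init__(self):
        self.loaded = {}       # module name -> file path currently loaded
        self.reload_count = 0  # reloads forced by a path mismatch

    def request(self, name, path):
        """Return True if serving this request forced a load or reload."""
        if self.loaded.get(name) == path:
            return False       # the right file is already loaded
        if name in self.loaded:
            self.reload_count += 1  # same name, different file: reload
        self.loaded[name] = path
        return True


cache = PathCheckingCache()
# Six requests alternating between two files that share the name "index".
requests = ["subdir-1/index.py", "subdir-2/index.py"] * 3
for path in requests:
    cache.request("index", path)

print(cache.reload_count)  # 5
```

Every request after the first forces a reload, although neither file ever changed on disk.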
> Because a module is reloaded into the same object space as the existing 
> module when two modules of the same name are in different locations, 
> namespace pollution and security issues can also result if one location for 
> the module is public and the other private. This cross contamination of 
> modules is documented in JIRA as MODPYTHON-11.
> In respect of the "Python*Handler" directives where the "path" argument was 
> never supplied to the "import_module()" function, the result would be that 
> the first module loaded under the specified name would be used. Thus, any 
> subsequent module of the same name referred to by a "Python*Handler" 
> directive found in a different directory but within the same interpreter 
> would in effect be ignored.
> A caveat to this, though, is that such a "Python*Handler" directive would 
> result in that handler's directory being inserted at the head of "sys.path". 
> If the first instance of the module loaded under that name were at some point 
> modified, the module would be automatically reloaded, but it would load the 
> version from the different directory.
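Both this caveat and the cross contamination described earlier can be seen with a plain reload. The sketch below again assumes Python 3's importlib, with "importlib.reload()" standing in for mod_python's timestamp-triggered reload; the module name "samename_demo_b", the directory names and the "SECRET" attribute are hypothetical.

```python
import importlib
import os
import sys
import tempfile

MODULE_NAME = "samename_demo_b"  # hypothetical name, chosen to avoid clashes

base = tempfile.mkdtemp()
private_dir = os.path.join(base, "private")
public_dir = os.path.join(base, "public")
for directory, source in (
    (private_dir, "WHERE = 'private'\nSECRET = 'only defined in private'\n"),
    (public_dir, "WHERE = 'public'\n"),
):
    os.mkdir(directory)
    with open(os.path.join(directory, MODULE_NAME + ".py"), "w") as handle:
        handle.write(source)

# First handler: its directory goes to the head of sys.path and is imported.
sys.path.insert(0, private_dir)
importlib.invalidate_caches()
module = importlib.import_module(MODULE_NAME)
loaded_first = module.WHERE

# Second handler: only sys.path changes; the cached module is still in use.
sys.path.insert(0, public_dir)
importlib.invalidate_caches()

# A reload searches sys.path again, so it finds the other directory's file
# and executes it inside the existing module object.
module = importlib.reload(module)

print(loaded_first, module.WHERE)                           # private public
print(os.path.basename(os.path.dirname(module.__file__)))   # public
print(hasattr(module, "SECRET"))                            # True
```

After the reload the module reports the public file as its source, yet the name defined only by the private file is still reachable through it.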
> Now, although these problems as they relate to mod_python.publisher are 
> addressed in mod_python 3.2.6, the underlying problems in "import_module()" 
> are not. As the bug reports relating to mod_python.publisher have been 
> closed off as resolved, I am creating this bug report to carry on tracking 
> the underlying problem as it applies to the "Python*Handler" directives 
> and to explicit use of "import_module()".
> To illustrate the issue as it applies to the "Python*Handler" directives, 
> create two separate directories, each with a .htaccess file containing:
>   AddHandler mod_python .py
>   PythonHandler index
>   PythonDebug On
> In the "index.py" file in each separate directory put:
>   import os
>   from mod_python import apache
>   def handler(req):
>     req.content_type = 'text/plain'
>     print >> req, os.getpid(), __file__
>     return apache.OK
> Assuming these are accessed as:
>   /~grahamd/mod_python_9/subdir-1/index.py
>   /~grahamd/mod_python_9/subdir-2/index.py
> access the first URL, and the result will be:
>   10665 /Users/grahamd/Sites/mod_python_9/subdir-1/index.py
> now access the second URL and we get:
>   10665 /Users/grahamd/Sites/mod_python_9/subdir-1/index.py
> Note this assumes the same child process handled both requests, so Apache 
> must be configured to run a single child process for this test.
> As one can see, it doesn't actually use the "subdir-2/index.py" module at all 
> and still uses the "subdir-1/index.py" module.
> If one modifies "subdir-1/index.py" so its timestamp is updated and loads the 
> second URL again, we get:
>   10665 /Users/grahamd/Sites/mod_python_9/subdir-2/index.py
> This occurs because the change in the first module loaded is detected, but 
> as the second handler's directory is now at the head of sys.path, the 
> reload picks up the module from that directory instead.
> These issues with modules of the same name in multiple locations are listed 
> as ISSUE 14 in my list of module importer problems. See:
>   http://www.dscpl.com.au/articles/modpython-003.html

-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
