Jorey Bump wrote ..
> Graham Dumpleton wrote:
> 
> > The only area I guess one may have to be careful with is if you have used
> > PythonPath directive to extend module search path, especially if you
> > reference directories in the document tree. This may result in mod_python
> > complaining in the Apache error log at you and in worst case, if you have
> > Python packages in document tree, it will not find them.
> 
> Can you clarify this a bit? Does it follow that the use of PythonPath to
> extend the module search path to directories *outside* of the document
> tree will still be safe? I use this extensively to isolate code to 
> virtual hosts. I *never* include packages or modules that exist in the
> document tree in PythonPath, if that's a consideration.

You can still use PythonPath for this purpose. However, nothing found on
PythonPath (sys.path) will be a candidate for automatic module reloading.

If you want any of those separate code modules to be candidates for automatic
module reloading, then rather than using PythonPath, you should set the
PythonOption mod_python.importer.path to the list of directories that the new
importer should additionally search for reloadable modules. The path must be
the full list of directories: you can't extend a list inherited from a higher
level of the configuration, and you shouldn't put sys.path in there either.
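
As a rough sketch of what that could look like, reusing the placeholder
paths /some/internal/path and /some/external/path and the handler name mptest
purely for illustration. The value is written here as a quoted Python list,
which is my understanding of the expected form, so check it against the
mod_python documentation for your version:

  <Directory /some/internal/path>
    PythonOption mod_python.importer.path "['/some/external/path']"
    PythonHandler mptest
    SetHandler mod_python
  </Directory>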

In other words, there is a clear line between normal Python modules, which
would be found in PythonPath (sys.path) and those which are managed by the new
importer in mod_python. Standard Python modules in PythonPath (sys.path) are
still stored in sys.modules and therefore must have unique names. Those managed
by the new importer are NOT in sys.modules and are distinguished by a full
pathname such that it is possible to have modules of the same name located in
two different directories. The new importer tries to detect overlaps in the
paths and will complain when it does.
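
To illustrate the idea of modules being identified by full pathname rather
than by an entry in sys.modules, here is a minimal sketch in plain modern
Python. It is not how mod_python implements its importer; the helper
load_by_path, the file name mp_demo_config.py and the site_a/site_b
directories are all made up for the example.

```python
import importlib.util
import os
import sys
import tempfile


def load_by_path(path):
    """Load a module from an explicit file path without using sys.modules."""
    # The module name comes from the file name and need not be unique.
    name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


def demo():
    # Two hypothetical directories, each with its own mp_demo_config.py.
    with tempfile.TemporaryDirectory() as root:
        cache = {}  # keyed by full pathname, not by module name
        for site in ("site_a", "site_b"):
            os.mkdir(os.path.join(root, site))
            path = os.path.join(root, site, "mp_demo_config.py")
            with open(path, "w") as f:
                f.write("SITE = %r\n" % site)
            cache[path] = load_by_path(path)
        sites = sorted(m.SITE for m in cache.values())
        # Neither copy was registered under its bare name.
        return sites, "mp_demo_config" in sys.modules
```

Calling demo() loads both same-named files side by side, which would not be
possible if they had to share one slot in sys.modules.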

Because the new importer doesn't use sys.modules, the new importer can't look
after Python packages, as sub imports within packages only work properly if the
package is in sys.modules. As such, automatic reloading is not supported for
Python packages and thus they have to be located on PythonPath (sys.path).
With the new module importer there are alternative ways of managing
package-like groupings of modules where they are part of the web application.

Where PythonPath wasn't used, note that the directory associated with a handler
directive is no longer added to sys.path. The new module importer determines
through other means when it is necessary to search the handler directory for a
module.

The only consequence of the handler directory not being added to sys.path is
that some separate module outside of the document tree found on PythonPath
(sys.path) will no longer be able to perform a standard Python import to get
hold of a module (such as a config module) from the handler directory.

This technique was always a bit unreliable anyway, because of randomness in
sys.path and the possibility that the module name may have been used in
different places that were not separated by using their own interpreter space.

Overall, the changes were made to support what would be considered best-practice
ways of using the old importer. The new importer will complain though, by way
of messages in the Apache error log, when questionable things are being done
which could have resulted in an old application being potentially unstable. In
these cases, the application may have to be restructured to avoid the
questionable practices.

The one specific case which I am sure we are most likely to see is where someone
has done:

  <Directory /some/internal/path>
    PythonPath "sys.path+['/some/internal/path','/some/external/path']"
    PythonHandler mptest
    SetHandler mod_python
  </Directory>

Strictly speaking, this was probably a limitation of the old importer rather
than something which was outright questionable in itself.

Anyway, they have set PythonPath because they wanted some external path to be
added to sys.path. Setting PythonPath, though, prevents the handler directory
from being added to sys.path, so they also added that manually. The new
importer will complain in the Apache error log about /some/internal/path
appearing in sys.path, but will otherwise still find mptest in
/some/internal/path. It does so because the search path for reloadable modules
will include that directory for the purposes of that request. The mptest
module will be reloadable and not stored in sys.modules. The problem now comes
if some module in sys.path imports mptest: because the internal path is in
sys.path, it will find it, but that will become a separate copy of the module
in memory, stored in sys.modules.
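
The "separate copy" effect can be reproduced outside mod_python. The sketch
below is only an analogy in plain modern Python, with a made-up module name
mptest_demo: the same file is loaded once by full path (standing in for the
reloadable copy) and once by an ordinary import via sys.path.

```python
import importlib
import importlib.util
import os
import sys
import tempfile


def two_copies():
    with tempfile.TemporaryDirectory() as internal:
        path = os.path.join(internal, "mptest_demo.py")
        with open(path, "w") as f:
            f.write("state = []\n")

        # Copy 1: loaded by full path and kept out of sys.modules.
        spec = importlib.util.spec_from_file_location("mptest_demo", path)
        by_path = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(by_path)

        # Copy 2: a normal import, which only works because the
        # directory has been put on sys.path.
        sys.path.insert(0, internal)
        importlib.invalidate_caches()
        try:
            by_import = importlib.import_module("mptest_demo")
        finally:
            # Undo the changes made for the demonstration.
            sys.path.remove(internal)
            sys.modules.pop("mptest_demo", None)

        # State changed in one copy is invisible to the other.
        by_path.state.append("set in handler copy")
        return by_path is by_import, by_import.state
```

The two module objects are distinct, so anything one copy stores at module
level is not seen by the other.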

The solution here in the new importer is just to say:

  <Directory /some/internal/path>
    PythonPath "sys.path+['/some/external/path']"
    PythonHandler mptest
    SetHandler mod_python
  </Directory>

I.e., the internal path should not be added to sys.path. The new importer will
still find mptest in the internal path when needed by the PythonHandler
directive. An external module will no longer find mptest from the handler
directory. Where modules were being obtained by a search of sys.path in this
way, more appropriate approaches, such as passing module data down to the
external module, will have to be used instead of relying on a trick like the
module appearing somewhere on sys.path automagically.
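
One possible shape for "passing module data down", shown as a plain Python
sketch with hypothetical names (render_greeting, SETTINGS, handle) and
without any mod_python request handling: the external code takes what it
needs as an argument instead of importing a config module from the handler
directory.

```python
# Hypothetical code for an external module found on PythonPath.
def render_greeting(settings):
    """Use settings handed in by the caller; no import of a config module."""
    return "Hello from %s" % settings["site_name"]


# Hypothetical code living in the handler directory.
SETTINGS = {"site_name": "example.org"}


def handle():
    # The handler side owns the configuration and passes it down.
    return render_greeting(SETTINGS)
```

With this arrangement the external module no longer cares where the handler
directory is or whether it appears on sys.path.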

Graham
