Add a readable/writable req.script_name member.
-----------------------------------------------

         Key: MODPYTHON-68
         URL: http://issues.apache.org/jira/browse/MODPYTHON-68
     Project: mod_python
        Type: New Feature
  Components: core  
    Versions: 3.2.0    
    Reporter: Graham Dumpleton
 Attachments: apache.py.diff

In web servers, the term SCRIPT_NAME refers to that part of a URI which
identifies the script handling the request. Within the URI, the
SCRIPT_NAME component is followed by the PATH_INFO component, the
latter potentially being an empty string.

In mod_python, the value of SCRIPT_NAME can be obtained in a few
different ways. These are:

1. Obtain it as req.subprocess_env["SCRIPT_NAME"] after having first
called req.add_common_vars().

2. Obtain it as apache.build_cgi_env(req)["SCRIPT_NAME"]. This
internally calls req.add_common_vars() but then ignores the "SCRIPT_NAME"
value from req.subprocess_env and instead tries to calculate it as per
(3) below, yielding a different result to (1) in some cases.

3. Attempt to derive it from req.uri using code based upon something
like 'req.uri[:-len(req.path_info)]'. If req.path_info is empty, then the
result should be the same as req.uri.
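
As a rough illustration of method (3), the slicing can be sketched in plain
Python as follows. The function name script_name_from_uri is hypothetical and
not part of mod_python; it takes the values of req.uri and req.path_info as
ordinary strings. Note that a bare 'uri[:-0]' evaluates to an empty string,
so the empty path_info case has to be handled explicitly.

```python
def script_name_from_uri(uri, path_info):
    """Hypothetical helper mirroring method (3); not a mod_python API."""
    if not path_info:
        # uri[:-0] would be '', so return the URI unchanged instead.
        return uri
    # Drop as many trailing characters as path_info is long.
    return uri[:-len(path_info)]


# Values taken from the examples in this report.
print(script_name_from_uri("/~grahamd/handler/mptest.py", ""))
print(script_name_from_uri("/~grahamd/handler/mptest.py/a", "/a"))
print(script_name_from_uri("/~grahamd/handler/mptest.py/a//b", "/a/b"))
```

Because the slice only counts characters, it goes wrong as soon as req.uri
contains repeated '/' characters that are not present in req.path_info.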

All three methods actually yield incorrect results in certain
circumstances, with the fact that it occurs in (1) suggesting an
underlying Apache bug.

The problem area is where there are multiple successive occurrences of
'/' appearing in the part of the URI which is used to determine the
PATH_INFO value.

Looking at some examples for each case we get:

req.uri = /~grahamd/handler/mptest.py
req.path_info = 
PATH_INFO = None
SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

req.uri = /~grahamd/handler/mptest.py/
req.path_info = /
PATH_INFO = /
SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

req.uri = /~grahamd/handler/mptest.py//
req.path_info = /
PATH_INFO = /
SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/
SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/

req.uri = /~grahamd/handler/mptest.py/a
req.path_info = /a
PATH_INFO = /a
SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

req.uri = /~grahamd/handler/mptest.py/a/b
req.path_info = /a/b
PATH_INFO = /a/b
SCRIPT_NAME (1) = /~grahamd/handler/mptest.py
SCRIPT_NAME (2) = /~grahamd/handler/mptest.py
SCRIPT_NAME (3) = /~grahamd/handler/mptest.py

req.uri = /~grahamd/handler/mptest.py/a//b
req.path_info = /a/b
PATH_INFO = /a/b
SCRIPT_NAME (1) = /~grahamd/handler/mptest.py/a
SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/
SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/

req.uri = /~grahamd/handler/mptest.py/a///b
req.path_info = /a/b
PATH_INFO = /a/b
SCRIPT_NAME (1) = /~grahamd/handler/mptest.py/a/
SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/a
SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/a

req.uri = /~grahamd/handler/mptest.py/a///b//c
req.path_info = /a/b/c
PATH_INFO = /a/b/c
SCRIPT_NAME (1) = /~grahamd/handler/mptest.py/a///b
SCRIPT_NAME (2) = /~grahamd/handler/mptest.py/a/
SCRIPT_NAME (3) = /~grahamd/handler/mptest.py/a/

All very strange and not what one would expect.

Ignoring the strange results, the first point of creating the tracker
item is to propose that a new member be added to the request object
referred to as "req.script_name". This new member should be both
readable and writable.

The argument for adding "script_name" is similar to that for making
"path_info" writable as described in MODPYTHON-67. That is, it would
make the task of writing a middleware stack specifically for mod_python
but in a similar style to WSGI a slightly simpler task.
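
To give an idea of what such middleware might do, here is a minimal sketch
in the spirit of wsgiref.util.shift_path_info. It assumes the proposed
writable req.script_name and req.path_info members exist; FakeRequest and
shift_path_info are hypothetical stand-ins used only so the example runs
outside of Apache, and path_info is assumed to be either empty or to start
with '/'.

```python
class FakeRequest:
    """Hypothetical stand-in for a request object with writable members."""

    def __init__(self, script_name, path_info):
        self.script_name = script_name
        self.path_info = path_info


def shift_path_info(req):
    """Move the first segment of path_info onto script_name.

    Returns the shifted segment, or None if nothing is left to shift.
    """
    if not req.path_info or req.path_info == "/":
        return None
    # Split '/a/b' into 'a', '/', 'b'.
    segment, sep, rest = req.path_info[1:].partition("/")
    req.script_name += "/" + segment
    req.path_info = sep + rest
    return segment


req = FakeRequest("/~grahamd/handler/mptest.py", "/a/b")
shift_path_info(req)
print(req.script_name, req.path_info)
```

Each layer of a middleware stack could consume one segment this way before
handing the request to the next layer.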

When adding "script_name", I would suggest that its initial value be
somewhat saner than what is shown in the results above. More along the
lines of:

req.uri = /~grahamd/handler/mptest.py//
req.path_info = /
script_name=/~grahamd/handler/mptest.py

req.uri = /~grahamd/handler/mptest.py/a//b
req.path_info = /a/b
script_name=/~grahamd/handler/mptest.py

req.uri = /~grahamd/handler/mptest.py/a///b
req.path_info = /a/b
script_name=/~grahamd/handler/mptest.py

req.uri = /~grahamd/handler/mptest.py/a//b//c
req.path_info = /a/b/c
script_name=/~grahamd/handler/mptest.py

It should perhaps also normalise the path to eliminate duplicate
instances of '/' in the URI appearing before the PATH_INFO component.

req.uri = /~grahamd/handler////mptest.py/a/b/c
req.path_info = /a/b/c
PATH_INFO = /a/b/c
SCRIPT_NAME (1) = /~grahamd/handler////mptest.py
SCRIPT_NAME (2) = /~grahamd/handler////mptest.py
SCRIPT_NAME (3) = /~grahamd/handler////mptest.py
script_name=/~grahamd/handler/mptest.py
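
One possible way of arriving at the values suggested above is sketched
below. This is only an illustration and not the attached patch: the function
name normalised_script_name is hypothetical, and it assumes that
req.path_info has already had repeated '/' characters collapsed, as it has
in all of the examples in this report.

```python
import re


def normalised_script_name(uri, path_info):
    """Hypothetical sketch: collapse repeated '/' and strip path_info."""
    # Collapse every run of '/' in the URI down to a single '/'.
    collapsed = re.sub(r"/+", "/", uri)
    # Strip the trailing path_info, if there is one and it matches.
    if path_info and collapsed.endswith(path_info):
        return collapsed[:-len(path_info)]
    return collapsed


print(normalised_script_name("/~grahamd/handler////mptest.py/a/b/c", "/a/b/c"))
print(normalised_script_name("/~grahamd/handler/mptest.py/a//b//c", "/a/b/c"))
```

Collapsing the whole URI first means the length of path_info lines up with
the tail of the URI again, which is what the plain slice in (3) gets wrong.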

In respect of the problems with (1) and (2), one probably should not do
anything about (1) as that is generated by Apache. As to (2), since it
is meant to parallel what Apache provides, maybe it should just pass
through the "SCRIPT_NAME" from "req.subprocess_env". I am not sure why
(2) ignores the value supplied by Apache and determines it itself, thus
yielding a different value in the cases shown.

And yes I do have an agenda by pushing these req.path_info and
req.script_name changes. My work should benefit mod_python though,
so don't be scared. ;-)



-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira
