Some context first. I have been looking at a way of solving:

  https://issues.apache.org/jira/browse/MODPYTHON-146

The basic problem described in this issue is that when the DirectoryIndex
directive is used, data added to the Python request object by a fixup
handler for the actual file doesn't get propagated back into the main
request object for the request against the directory. This is because
mod_dir uses ap_internal_fast_redirect() to do some of the nasty work,
and the Python data isn't part of the request_rec, so it does not get
copied from the subrequest request_rec to the main request_rec.

Now I know the above problem is specific to mod_python and has nothing
to do with the core of Apache, but in investigating it, I am starting to
question whether what ap_internal_fast_redirect() does is even sensible
in some parts. Specifically, it does:

    r->headers_out = apr_table_overlay(r->pool, rr->headers_out,
                                       r->headers_out);
    r->err_headers_out = apr_table_overlay(r->pool, rr->err_headers_out,
                                           r->err_headers_out);
    r->subprocess_env = apr_table_overlay(r->pool, rr->subprocess_env,
                                          r->subprocess_env);

In this code "r" is the main request_rec and "rr" is that of the
subrequest which matched an actual file in the DirectoryIndex file list.

The problem here is that apr_table_overlay() merges the contents of two
Apache tables without replacing anything. If the same key exists in both
tables, the entry from "rr" doesn't replace the one in "r"; instead the
result contains both key/value pairs, even if they have the same value
for that key.
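
To make the merge behaviour concrete, here is a minimal sketch in plain
Python. It is only a model, not the APR implementation: a table is
represented as a list of (key, value) pairs, table_overlay() is a
hypothetical stand-in for apr_table_overlay(), and the sample entries are
made up for illustration. The model assumes the overlay entries come
first and that keys compare case-insensitively.

```python
def table_overlay(overlay, base):
    """Model of an overlay merge: concatenate the entries, overlay first.

    Nothing is replaced or de-duplicated, so a key present in both
    inputs appears twice in the result.
    """
    return list(overlay) + list(base)


def get_all(entries, key):
    """Return every value stored under key (case-insensitive match)."""
    return [v for k, v in entries if k.lower() == key.lower()]


# Hypothetical contents of the subrequest ("rr") and main ("r") tables.
rr_notes = [("mod_userdir_user", "grahamd"), ("no-etag", "")]
r_notes = [("mod_userdir_user", "grahamd"), ("python_init_ran", "1")]

merged = table_overlay(rr_notes, r_notes)
user_entries = get_all(merged, "mod_userdir_user")
print(user_entries)  # two entries, even though the values are identical
```

In this model a replace-on-conflict merge would leave a single
"mod_userdir_user" entry, which is the behaviour a handler author would
probably expect.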

Now I tend to work out of my home directory, thus mod_userdir comes into
play. One of the things it does is add an entry to req.notes recording
what my username is. This is just one example; other entries are added
by other modules as well. For example, if I access /~grahamd/index.html
the req.notes table contains:

{'no-etag': '', 'ap-mime-exceptions-list': 'Ϣ', 'mod_userdir_user': 'grahamd', 'python_init_ran': '1'}

I could therefore access the username using:

  req.notes["mod_userdir_user"]

This would yield a string containing "grahamd".

If I now access the directory itself as /~grahamd/ and rely on the
DirectoryIndex directive to map it to the index.html file, I get:

{'no-etag': '', 'mod_userdir_user': 'grahamd', 'python_init_ran': '1', 'ap-mime-exceptions-list': 'œ\', 'mod_userdir_user': 'grahamd'}

Note how there are two entries for "mod_userdir_user". When I now try to
get the username in a mod_python handler, I actually get a list:

  ['grahamd', 'grahamd']

If my handler had assumed that the value was always going to be a string,
it would promptly raise some sort of exception when the value was used in
a way that doesn't work for a list.
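
As a rough illustration of the failure and of one way a handler could
defend against it, the sketch below models the lookup behaviour described
above (a single match yields a string, several matches yield a list).
Both lookup() and single_value() are hypothetical helpers written for
this example, not part of mod_python, and taking the first value is just
one possible policy.

```python
def lookup(entries, key):
    """Model of the table lookup: str for one match, list for several."""
    values = [v for k, v in entries if k.lower() == key.lower()]
    if not values:
        raise KeyError(key)
    return values[0] if len(values) == 1 else values


def single_value(value):
    """Collapse a str-or-list lookup result to one string (the first)."""
    if isinstance(value, (list, tuple)):
        return value[0]
    return value


# Hypothetical tables: direct file access versus access via the directory.
direct = [("mod_userdir_user", "grahamd")]
via_directory = direct + direct  # duplicate entry after the merge

# A handler that assumes a string breaks on the duplicated table.
try:
    lookup(via_directory, "mod_userdir_user").upper()
    naive_ok = True
except AttributeError:
    naive_ok = False

# The defensive version works for both shapes of result.
user_direct = single_value(lookup(direct, "mod_userdir_user"))
user_dir = single_value(lookup(via_directory, "mod_userdir_user"))
```

With this kind of wrapper the handler sees "grahamd" in both cases, but
every handler would have to know to use it.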

The question, I guess, is whether ap_internal_fast_redirect() is wrong to
merge the tables this way, or whether handlers (including those not
written with mod_python) are supposed to cope with this strangeness?

I have only talked about req.notes here, but the same issue can come up
with the req.headers_out, req.err_headers_out and req.subprocess_env
table objects as well.

Note, I have already posted about this issue on the mod_python developers
list, but as I say, it is more related to Apache core than mod_python
in terms of whether the behaviour is correct. If it is deemed correct,
then mod_python has to find a way to deal with it, or users have to know
about the strangeness and deal with it.

Graham
