Grisha wrote ..
>
> On Sun, 26 Mar 2006, Graham Dumpleton wrote:
>
> > One use for it that I already have is to get around the DirectoryIndex
> > problems in mod_python caused by Apache's use of the
> > ap_internal_fast_redirect() function to implement that feature. The
> > specifics of this particular issue are documented under:
> >
> >   http://issues.apache.org/jira/browse/MODPYTHON-146
>
> Could we zoom in this a little bit. I've read the description, but not
> quite sure I understand it quite yet. Is "the problem" that if I set
> req.notes['foo'] = 'bar' in a phase prior to fixup, by the time we get to
> the content handler, it will be gone because notes would be overwritten by
> mod_dir?

Fixup phase or earlier actually. In the case of req.notes though, it isn't
that the value in req.notes vanishes, it is that it gets duplicated.

Consider a .htaccess file containing:

  AddHandler mod_python .py
  PythonHandler mod_python.publisher
  PythonDebug On

  DirectoryIndex index.py

  PythonFixupHandler _fixup

In _fixup.py in the same directory, have:

  from mod_python import apache
  import time

  def fixuphandler(req):
    time.sleep(0.1)
    req.notes['time'] = str(time.time())
    return apache.OK

In index.py have:

  def index(req):
    return req.notes['time']

When I use the URL:

  http://localhost:8080/~grahamd/fast_redirect/index.py

the result I get is:

  1143667522.23

I.e., a single float value holding the time the request was made.

If I now instead access the directory using the URL:

  http://localhost:8080/~grahamd/fast_redirect/

I instead get:

  ['1143667680.57', '1143667680.47']

In other words, instead of getting the single value I now get two values
contained in a list. It wouldn't matter if the two values were the same,
they would both still be included. Where a content handler was expecting
a single string value, it would die when it gets a list.

What is happening is that when the request is made against the directory
it runs through the phases up to and including the fixup handler phase.
As a consequence it runs _fixup::fixuphandler() with req.notes['time']
being set to be the time at that point.

At the end of the fixup phase a mod_dir handler kicks in and it sees that
the file type of request_rec->filename as indicated by
request_rec->finfo->filetype is APR_DIR. As a consequence it will apply
the DirectoryIndex directive, looping through the listed files to find a
candidate it can redirect the request to.

In finding a candidate it reapplies the phases up to and including the
fixup handler phase on the new candidate filename. This is done so that
access and authorisation checks etc are still performed on the candidate
file.

Because it has run the fixup handlers on the candidate file, the
_fixup::fixuphandler() will be run again. This results in req.notes being
set a second time. At that stage the req.notes is separate, as the
handler is in effect run as a sub request to the main request against the
directory.

If after checking through the candidates it finds one that matches, to
avoid having to run the phases up to and including the fixup handler phase
on the candidate again, mod_dir tries to fake a redirect. This is what
ap_internal_fast_redirect() is being used for.

What the function does is to copy details from the request_rec structure
of the sub request for the candidate into the request_rec of the main
request. When the mod_dir fixup handler returns, the main request then
continues on to execute the content handler phase, with the details of
the sub request.

The problem with this is that rather than simply using req.notes from the
sub request, or overlapping the contents from the sub request onto that
of the main request, it merges them together. You therefore end up with
multiple entries for the 'time' value which was added.

To emphasise the problem, change the fixup handler to be:

  from mod_python import apache

  def fixuphandler(req):
    req.notes['filename'] = req.filename
    return apache.OK

and index.py to:

  def index(req):
    return req.notes['filename']

The result when the URL against the directory is used is:

  ['/Users/grahamd/Sites/fast_redirect/index.py',
   '/Users/grahamd/Sites/fast_redirect/']

Now it isn't just req.notes that is going to see this merging, as the
code in ap_internal_fast_redirect() is:

  r->notes = apr_table_overlay(r->pool, rr->notes, r->notes);
  r->headers_out = apr_table_overlay(r->pool, rr->headers_out,
                                     r->headers_out);
  r->err_headers_out = apr_table_overlay(r->pool, rr->err_headers_out,
                                         r->err_headers_out);
  r->subprocess_env = apr_table_overlay(r->pool, rr->subprocess_env,
                                        r->subprocess_env);

Thus, it also merges output headers and subprocess environment variables.

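As a rough illustration of why the overlay yields a list, here is a toy
model in plain Python. It is only a sketch: the list-of-pairs table and
the helper names table_overlay() and table_get() are made up for this
message and are not the APR or mod_python API. It assumes that an overlay
simply places the sub request entries ahead of the main request entries,
and that a lookup returns a string for one match and a list for several,
which is the behaviour seen with req.notes.

```python
# Toy model of an APR table: an ordered list of (key, value) pairs in
# which the same key may appear more than once. Hypothetical helpers,
# not the real APR or mod_python API.

def table_overlay(overlay, base):
    """Mimic apr_table_overlay(): overlay entries first, then base."""
    return list(overlay) + list(base)

def table_get(table, key):
    """Return a string for a single match and a list for several."""
    values = [v for k, v in table if k.lower() == key.lower()]
    if not values:
        raise KeyError(key)
    return values[0] if len(values) == 1 else values

main_notes = [("time", "1143667680.47")]  # fixup run on the directory
sub_notes = [("time", "1143667680.57")]   # fixup run on index.py

merged = table_overlay(sub_notes, main_notes)

print(table_get(sub_notes, "time"))  # what index.py sees when hit directly
print(table_get(merged, "time"))     # what it sees after the fast redirect
```

The same duplication would apply to headers_out, err_headers_out and
subprocess_env, since they go through the same call.
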
The merging of these could in itself also cause problems.

This isn't the end of the problems though, as ap_internal_fast_redirect()
doesn't do anything with:

  /** Notes on *this* request */
  struct ap_conf_vector_t *request_config;

This has two implications for mod_python.

The first is that it is the request_config that the Python request object
instance is cached in. Because the request_config is still that of the
main request, when the content handler phase is executed, it will pick up
the Python request object of the main request. Thus, any attributes added
directly to the Python request object by the sub request will be missing.

To illustrate this, change the fixup handler to:

  from mod_python import apache

  def fixuphandler(req):
    if req.finfo[apache.FINFO_FILETYPE] != apache.APR_DIR:
      req.attribute = req.filename
    return apache.OK

and index.py to:

  def index(req):
    return req.attribute

Then access the index file directly using:

  http://localhost:8080/~grahamd/fast_redirect/index.py

The result is:

  /Users/grahamd/Sites/fast_redirect/index.py

Now access:

  http://localhost:8080/~grahamd/fast_redirect/

The result is an exception:

  AttributeError: 'mp_request' object has no attribute 'attribute'

Now, in the fixup handler I specifically checked for the file type not
being a directory, so that the attribute was only set when the fixup
handler for index.py was run. So that you don't think I am doing
something wrong, you could instead use a .htaccess file containing:

  AddHandler mod_python .py
  PythonHandler mod_python.publisher
  PythonDebug On

  DirectoryIndex index.py

  PythonFixupHandler _fixup | .py

and a fixup handler of:

  from mod_python import apache

  def fixuphandler(req):
    req.attribute = req.filename
    return apache.OK

The important thing is that whether index.py is called directly or by
application of DirectoryIndex, the two should behave the same, and they
don't.

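The caching effect can be mimicked without Apache at all. The following
is a minimal sketch under stated assumptions: RequestRec, PyRequest,
get_py_request() and fast_redirect() are hypothetical stand-ins invented
for this message, and the only behaviour modelled is that the Python
request object is cached in request_config and that the fast redirect
copies fields such as the filename but leaves request_config alone.

```python
# Toy stand-ins for Apache's request_rec and mod_python's request
# object. All names here are hypothetical, for illustration only.

class RequestRec:
    def __init__(self, filename):
        self.filename = filename
        self.request_config = {}  # per-request module data

class PyRequest:
    """Stands in for the Python request object passed to handlers."""
    def __init__(self, rec):
        self._rec = rec

    @property
    def filename(self):
        return self._rec.filename

def get_py_request(rec):
    # The wrapper is created once and cached in request_config.
    if "py_request" not in rec.request_config:
        rec.request_config["py_request"] = PyRequest(rec)
    return rec.request_config["py_request"]

def fast_redirect(sub, main):
    # Copies selected fields only; request_config is not touched.
    main.filename = sub.filename

def fixuphandler(req):
    # Only set the attribute when not run against the directory.
    if not req.filename.endswith("/"):
        req.attribute = req.filename

main = RequestRec("/Users/grahamd/Sites/fast_redirect/")
sub = RequestRec("/Users/grahamd/Sites/fast_redirect/index.py")

fixuphandler(get_py_request(main))  # directory: nothing is set
fixuphandler(get_py_request(sub))   # index.py: set on the sub's wrapper
fast_redirect(sub, main)

req = get_py_request(main)          # content handler gets main's wrapper
print(req.filename)                 # filename now that of index.py
print(hasattr(req, "attribute"))    # False: attribute lived on the sub
```
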
The second problem with request_config not being copied is that details
of any Python based output filters registered from within the fixup
handler are also lost.

Keeping the .htaccess file such that the fixup handler only runs for a
.py extension, change the fixup handler to be:

  from mod_python import apache

  def outputfilter(filter):
    apache.log_error("outputfilter")
    filter.pass_on()
    return apache.OK

  def fixuphandler(req):
    req.register_output_filter("PASS", "_fixup::outputfilter")
    req.add_output_filter("PASS")
    return apache.OK

and index.py to:

  def index(req):
    return "HELLO"

Access index.py directly and you get:

  HELLO

Check the Apache error log and you will see:

  outputfilter

logged from the filter.

Access the directory and the browser gives an error saying it couldn't
load any data from that location. Look at the Apache error log and you
will get the error:

  python_filter: Could not find registered filter.

This is because Apache had a callback in place to call mod_python for the
filter, but then mod_python could not find it, as the registration
details were still in the request_config of the sub request request_rec
and weren't copied into the main request.

Thus there are a series of problems because of how
ap_internal_fast_redirect() is implemented and used by mod_dir. On the
main Apache httpd mailing list it was acknowledged that the way the data
is merged is wrong and that ap_internal_fast_redirect() was in general
causing problems for other Apache modules as well, such as mod_rewrite.
Some suggested that the fast redirect should be avoided in favour of a
full internal redirect, but that such a change couldn't be made until
Apache 2.4.

As this is of no help now, a workaround is required, which is what my
example was.

Note though that this whole issue of problems with the fast redirect is
totally distinct from whether req.finfo should be able to be updated. It
just so happened that I was wanting that ability to implement the
workaround.

I still contend that there are other legitimate reasons for wanting to
have req.finfo updated.

BTW, some examples above only work with the 3.3 working version.
Specifically the output filter example.

Graham