On 25/05/07, Ian Bicking <[EMAIL PROTECTED]> wrote:
> Graham Dumpleton wrote:
> > On 24/05/07, Ian Bicking <[EMAIL PROTECTED]> wrote:
> >> Graham Dumpleton wrote:
> >> > Does anyone think this would be nice extension for a WSGI adapter
> >> > written against current specification to implement even if not
> >> > necessarily portable?
> >>
> >> Eh. In the context of mod_wsgi, I think it would be more interesting to
> >> provide a WSGI application that called back into Apache (basically
> >> wrapping Apache's normal subrequest machinery in a WSGI exterior).
> >
> > I was trying to avoid as much as possible having mod_wsgi provide any
> > sort of hooks which would allow one to perform actions against
> > internals of Apache. I had two reasons for this.
>
> This is a much more constrained hook into Apache than what mod_python
> provides. For instance, you could provide much the same thing, but
> where subrequests actually go out over HTTP. There's quite a bit of
> data you couldn't share over HTTP, so it's not entirely equivalent, but
> it's still pretty close (especially if there was something on the Apache
> side to fix up the slightly-richer-than-HTTP environment based on
> special headers).

If I am going to be providing any sort of way of interacting with Apache internals though, I don't want to be in the business of having to write custom wrappers for performing specific tasks. This would turn mod_wsgi into just another framework rather than the absolute minimal WSGI adapter it is.

When I did look at allowing it to be more than just an adapter, the approach I looked at for providing what you want to do was to simply pass through the SWIG wrapping for the Apache request object (request_rec) in the WSGI environment. This wouldn't be something that I saw happening by default though, as doing so would make mod_wsgi depend on all the SWIG bindings for Apache and would also be a way of circumventing the locking down of what the user is allowed to do with mod_wsgi. I.e., I am trying to make mod_wsgi as safe as possible so that web hosting companies might consider looking at it for use in shared hosting environments. If it were to have just as many problems and unknowns as mod_python, the whole exercise in writing it would be a waste of time.

All this means is that to enable the feature you would first need to specify a configuration directive in the main Apache configuration, something like:

  WSGIExtensions RequestRec

This would just indicate that passing a request object would be allowed. You would still then need to enable it for a specific application (part of the URL namespace):

  WSGIPassRequestRec On

That done, the request object could be accessed as 'apache.request_rec' in the WSGI environment.

Although only the request object is being passed, that is enough, as the separate SWIG bindings for the Apache API, which would not even be a part of mod_wsgi but a separate package, would then provide everything else. The SWIG bindings would though just be a direct mapping to the C API, with no real wrapping to give it a Pythonic feel.

Thus for example your internal redirect would be written something like:

  from apache.http_request import *

  def application(environ, start_response):
      r = environ['apache.request_rec']
      ap_internal_redirect('/some/other/path', r)

      # Dummy WSGI response as redirect already sent response.
      start_response('200 OK', [])
      return []

If desired, people could then write WSGI component objects which wrap up such low level Apache API calls to do things. One example is obviously an internal redirect, but another may be to use apache.mod_ssl.ssl_var_lookup() to look up specific properties of a client side SSL certificate which wouldn't otherwise be available to a WSGI application.

For cases like accessing SSL certificate information using the API there wouldn't be a big problem. One problem with something like internal redirects, though, is that the way WSGI applications return a response isn't a direct mapping to the lower level Apache handler response but is more complicated than that. Thus you end up having to use some sort of dummy response which wouldn't add to what a sub request may have already returned. Alternatively, you have to provide other stuff in the WSGI environment which the application could use in some way to raise an exception that would then be caught by mod_wsgi and taken to mean that the normal WSGI application response processing doesn't need to be done, but that a normal Apache API status value of OK should still be returned (different to HTTP_OK).

In other words, the mismatch in the APIs, and the fact that the WSGI interface is not as rich as the Apache handler API in how a handler response and the HTTP status can be indicated, can make it all just a bit messy. Also, some of the things one can do through the Apache API step outside of the flow of operations with WSGI applications.
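
As a rough illustration of the kind of WSGI component object mentioned above, here is a minimal sketch that wraps the internal redirect. The factory name make_internal_redirect_app and its redirect parameter are hypothetical and only for illustration. In practice redirect would be ap_internal_redirect from the separate SWIG bindings, which are themselves only a proposal at this point, so the callable is passed in rather than imported.

```python
def make_internal_redirect_app(target, redirect):
    """Build a WSGI application that performs an internal redirect.

    Hypothetical helper: `redirect` is assumed to behave like
    ap_internal_redirect(uri, r) from the proposed SWIG bindings.
    """
    def application(environ, start_response):
        # Assumes the adapter was configured to pass the request object.
        r = environ['apache.request_rec']
        redirect(target, r)

        # Dummy WSGI response, as the redirect already sent the response.
        start_response('200 OK', [])
        return []

    return application
```

This only hides the low level call behind a WSGI exterior. The dummy response is still there, so the mismatch described above is not solved, just moved into one place.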

Just as a comparison, if using just the Apache API directly, it would have been written as:

  from apache.httpd import *
  from apache.http_request import *

  def handler(r):
      ap_internal_redirect('/some/other/path', r)
      return DONE

Important here is that the value DONE is being returned, which indicates to Apache that a complete response has been provided, by virtue of the sub request, and that nothing more should be done for the parent handler processing if there happened to be further handlers registered for the response handler phase.

This is in contrast to a standard Apache type handler, which might have been:

  from apache.httpd import *
  from apache.http_protocol import *

  def handler(r):
      content = 'hello world!\n'
      ap_set_content_type(r, 'text/plain')
      ap_set_content_length(r, len(content))
      ap_rwrite(content, r)
      return OK

So in practice they are quite different worlds, and my feeling is that allowing WSGI applications to call back into the Apache internals may just cause more problems than it's worth, especially since you can't properly represent the low level Apache handler response in the response to the WSGI application.

Another mismatch, and one I am already having to contend with in mod_wsgi, is that an HTTP error status can be indicated in two ways. The first is:

  from apache.httpd import *
  from apache.http_protocol import *

  def handler(r):
      content = 'NOT FOUND. GO AWAY.\n'
      r.status = HTTP_NOT_FOUND
      ap_set_content_type(r, 'text/plain')
      ap_set_content_length(r, len(content))
      ap_rwrite(content, r)
      return OK

By returning OK here and using r.status to indicate the HTTP status code, it tells Apache that I have already provided a response body and thus it shouldn't try to provide one through processing ErrorDocument directives or by adding its own default.

The other option is:

  from apache.httpd import *

  def handler(r):
      return HTTP_NOT_FOUND

In this case, because the HTTP status code was returned as the actual handler result, it indicates that I haven't provided a response body and thus Apache should instead try to provide one.

In WSGI, when something like:

  start_response('404 Not Found', [])

is used, it is still up to the WSGI application to provide a response body with the content of the page to be displayed in the browser. If it doesn't, then the browser will present a blank page.

What I don't know is whether mod_wsgi should be trying to be smart and detect the case where an HTTP error response is returned but there is no response body. Instead of doing the equivalent of setting r.status = HTTP_NOT_FOUND and returning OK, it would just return HTTP_NOT_FOUND as the result, as in the second example, so that Apache gets the chance to provide an error page since the WSGI application didn't actually provide one.

I know I am rambling, but if you have got this far and followed what I am going on about in the last example, I might ask what you do about error pages. How do you ensure a consistent error page layout across a whole WSGI application containing many disparate components?

Do you try to pass down through the WSGI environment some special hook an application can call to generate error pages with the same style, but then cause a dependency on this hook existing?

Do you instead allow a WSGI application to return an empty error page and have a higher up WSGI middleware component catch that and substitute its own based on the error type, i.e., in similar style to Apache and the ErrorDocument directive?

Or do you do something else?

The problem then, as above, is what does one do at the boundary between a WSGI application and the web server hosting it? Do you just always assume a WSGI application provides an error page, or allow some way for a WSGI application to defer to the web server the task of generating an error page instead?
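
To make the middleware option above concrete, here is a minimal sketch of a component that catches an error status with an empty body and substitutes its own page, in the style of ErrorDocument. The names error_page_middleware and render_error_page are hypothetical, the sketch buffers the whole response for simplicity, it ignores exc_info, and it assumes Python 3 style byte strings for the body.

```python
def error_page_middleware(app, render_error_page):
    """Wrap `app`; if it answers with a 4xx/5xx status and an empty body,
    substitute the bytes returned by render_error_page(status).

    Both names are hypothetical and only for illustration.
    """
    def wrapped(environ, start_response):
        captured = {}
        written = []

        def capture(status, headers, exc_info=None):
            # Remember what the application asked for; send nothing yet.
            captured['status'] = status
            captured['headers'] = list(headers)
            # Legacy write() callable: just buffer whatever is written.
            return written.append

        result = app(environ, capture)
        try:
            # Consume the iterable first so a generator-based application
            # has had the chance to call start_response.
            chunks = list(result)
        finally:
            if hasattr(result, 'close'):
                result.close()
        body = b''.join(written) + b''.join(chunks)

        status = captured['status']
        headers = captured['headers']
        if status[:1] in ('4', '5') and not body:
            # Empty error response: replace it with the shared error page.
            body = render_error_page(status)
            headers = [(name, value) for name, value in headers
                       if name.lower() not in ('content-type', 'content-length')]
            headers.append(('Content-Type', 'text/html; charset=utf-8'))
            headers.append(('Content-Length', str(len(body))))

        start_response(status, headers)
        return [body]

    return wrapped
```

This keeps the components free of any dependency on a special hook, at the cost of buffering. It does not answer the boundary question, since the outermost layer still has to decide whether to generate the page itself or hand the status back to the web server.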

Graham
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com