Re: [Web-SIG] Python Web Modules - Version 0.5.0

Ian Bicking Mon, 31 Jan 2005 15:50:16 -0800

James Gardner wrote:

Ian Bicking wrote:
web.wsgi.error: one standard I'd like for middleware would be some key you could set that would indicate that some error handler exists, and applications further down the stack shouldn't catch unexpected exceptions (of course expected exceptions are a different matter). Then the best error handler available would eventually get the error, and process it somehow (e.g., mailing a report, displaying an error, starting a debugger, etc). Anyway, something to think about for this.
That could be useful. Presumably the middleware component nearest the server is likely to have the best error handling (as you would put the best error handler in a position to catch the most errors). So this could be as simple as agreeing a variable name like wsgi.error for the environ dictionary which the highest middleware component up the chain would set to True and ones lower down wouldn't provide error handling if it was already set.

Right. Except when you don't want that ;) Other times you may want to override the error handler locally; e.g., maybe you have a section of the site where you want to use a different error handler that shows exceptions to the browser (e.g., a development section). But presumably you could add an option to the middleware to force it to catch exceptions even when the environment advised not to.
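A minimal sketch of that convention, assuming a made-up environ key (`x.error_handler_present`; the thread only floats `wsgi.error` as one possible name) and a hypothetical `force` option for the local override. It is an illustration of the idea, not an agreed standard:

```python
import sys
import traceback

# Hypothetical key name, chosen for this sketch only.
ERROR_HANDLER_KEY = "x.error_handler_present"


class ErrorMiddleware:
    """Catch unexpected exceptions unless an outer handler announced itself."""

    def __init__(self, app, force=False):
        self.app = app
        self.force = force  # catch locally even if an outer handler exists

    def __call__(self, environ, start_response):
        if environ.get(ERROR_HANDLER_KEY) and not self.force:
            # An outer handler exists: stay out of the way and let the
            # exception propagate up to it.
            return self.app(environ, start_response)

        # Tell everything further down the stack that a handler exists.
        environ[ERROR_HANDLER_KEY] = True
        try:
            app_iter = self.app(environ, start_response)
            try:
                # Consume the body here so errors raised during iteration
                # are caught too (this buffers the whole response).
                return list(app_iter)
            finally:
                if hasattr(app_iter, "close"):
                    app_iter.close()
        except Exception:
            # Development-style handling: show the traceback to the browser.
            body = traceback.format_exc().encode("utf-8")
            start_response(
                "500 Internal Server Error",
                [("Content-Type", "text/plain; charset=utf-8"),
                 ("Content-Length", str(len(body)))],
                sys.exc_info(),
            )
            return [body]
```

Wrapping a development section in `ErrorMiddleware(app, force=True)` would give it its own handler even when a site-wide one sits nearer the server.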

Another thing I noticed when writing the error handler is that if an application or middleware component doesn't form a header or set the status correctly it can be tricky to track down where the error occurred. If the application used a special object for headers and status in the start_response callable which raised an error when it was set with an invalid value that would make life easier.

(Alternatively, if you wanted to change the way things were programmed a bit you could write your application as middleware and specify a terminator which set the headers and status using these special objects. Probably not necessary though!)

I'm not sure I understand you here. What's the exact situation where you encounter this?
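One possible reading of the suggestion, sketched as a wrapper around start_response rather than as special header and status objects. The class name and the `name` label are hypothetical, and the checks are only a small subset of what the WSGI spec requires:

```python
import re

# A status line should look like "200 OK": three digits, a space, a reason.
_STATUS_RE = re.compile(r"^\d{3} .+$")


class CheckedStartResponse:
    """Wrap one component so bad status/header values fail where they are set."""

    def __init__(self, app, name="app"):
        self.app = app
        self.name = name  # label used in error messages to locate the culprit

    def __call__(self, environ, start_response):
        def checked(status, headers, exc_info=None):
            if not isinstance(status, str) or not _STATUS_RE.match(status):
                raise ValueError("%s: invalid status %r" % (self.name, status))
            if not isinstance(headers, list):
                raise TypeError("%s: headers must be a list" % self.name)
            for item in headers:
                ok = (isinstance(item, tuple) and len(item) == 2
                      and all(isinstance(part, str) for part in item))
                if not ok:
                    raise TypeError("%s: invalid header %r" % (self.name, item))
            return start_response(status, headers, exc_info)

        return self.app(environ, checked)
```

Placing one of these around each component means the exception is raised inside the component that produced the bad value, so the traceback points at it directly.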

So I was thinking that status codes should be sufficient to communicate authorization: 401 for login required, 403 for forbidden. If you are doing cookie logins (which I generally prefer from a UI perspective) the middleware can translate the 401 into a redirect to the login page. And the 403 can turn into a nicer error page --
So in a new version the authentication middleware would display a sign in box if no user was signed in. The authorization middleware would provide objects for the application to test authorisation, and would also look at the response to determine whether the application thought the user was authorised, displaying a sign in if not.

Basically. If REMOTE_USER wasn't set (or was empty) and the application required login (based on whatever criteria it has) then it should return a 401 code. The authentication middleware doesn't know if login is required, but it would be nice if it can tell if you are logged in anyway (not possible with HTTP Basic auth, but ignoring that case).
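As a rough sketch of that division of labour: the application decides whether login is required and answers 401, and a cookie-login middleware turns the 401 into a redirect. The class name, the `/login` URL and the example application are assumptions made for illustration, and the response is buffered for simplicity:

```python
class LoginRedirectMiddleware:
    """Translate a 401 from the application into a redirect to a login page."""

    def __init__(self, app, login_url="/login"):
        self.app = app
        self.login_url = login_url  # hypothetical location of the login form

    def __call__(self, environ, start_response):
        captured = {}
        written = []

        def capture(status, headers, exc_info=None):
            # Hold the response back until we know whether to replace it.
            captured["status"] = status
            captured["headers"] = headers
            captured["exc_info"] = exc_info
            return written.append  # legacy write() callable, buffered

        app_iter = self.app(environ, capture)
        try:
            body = written + list(app_iter)
        finally:
            if hasattr(app_iter, "close"):
                app_iter.close()

        if captured["status"].startswith("401"):
            start_response("302 Found", [("Location", self.login_url),
                                         ("Content-Length", "0")])
            return [b""]

        start_response(captured["status"], captured["headers"],
                       captured["exc_info"])
        return body


def protected_app(environ, start_response):
    """Example application: requires a non-empty REMOTE_USER."""
    user = environ.get("REMOTE_USER")
    if not user:
        start_response("401 Unauthorized", [("Content-Type", "text/plain")])
        return [b"login required"]
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [("hello " + user).encode("utf-8")]
```

A 403 could be handled the same way, replacing the body with a nicer error page instead of redirecting.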

web.wsgi.session: I'd like to have some sort of standard for these objects, at least some aspects. Not the details of storage, but mostly access; along the lines of web.session.manager and/or .store. I'm not sure how I feel about the manager with multiple applications, each of which has a store -- I feel like this should be part of the configuration somehow, which isn't necessarily part of the standard user-visible API.
I've been thinking about the way series of applications can work together, which is what the web.wsgi.environment code is about. Perhaps it would be better to specify the application name in web.wsgi.environment (which is more to do with configuration) so that the web.wsgi.session and web.wsgi.auth objects all use the same application name and then the manager becomes more redundant because a store for the particular application is already created.

OK, I was trying to figure out what wsgi.environment was about. Is it basically a way of indicating local configuration (like a configuration realm or something)? I still lack a good intuition for how configuration should work.

web.wsgi.cgi: is this safe when a piece of middleware changes QUERY_STRING or otherwise rewrites the request? You can test for this by saving the QUERY_STRING that you originally parsed alongside the resulting FieldStorage, and then reparsing if they don't match. You can even test for matching with "is", since you're really checking for modifications instead of equality. The same should be possible for wsgi.input and POST requests.
The web.wsgi.cgi module actually builds the FieldStorage from the environ dictionary, not QUERY_STRING, so middleware can do what it likes and the underlying middleware and application will respond to the changes. Is this not a good way of doing it?

Well, FieldStorage looks at particular keys, and I guess the result is derivative of all of those. But the keys are fairly limited -- I think it's just QUERY_STRING, REQUEST_METHOD, CONTENT_TYPE, and CONTENT_LENGTH, though this could be confirmed by reading the cgi module. So even though you pass a complete environment, every time you retrieve the value from the environment you want to check that these values haven't changed (along with wsgi.input).

If I did it, I'd lazily parse the query string, and then reparse if those keys had changed. I guess wsgikit.wsgilib.get_cookies is an example of this: http://svn.colorstudy.com/trunk/WSGIKit/wsgikit/wsgilib.py
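A small sketch of the lazy-parse-and-recheck idea for the query string only. It uses `urllib.parse.parse_qs` in place of FieldStorage to keep the example short, and the cache key name is made up; this is not the wsgikit code:

```python
from urllib.parse import parse_qs

# Hypothetical environ key under which the parsed result is cached.
CACHE_KEY = "x.parsed_query"


def get_query(environ):
    """Parse QUERY_STRING on first use; reparse if middleware replaced it."""
    source = environ.get("QUERY_STRING", "")
    cached = environ.get(CACHE_KEY)
    # "is" is enough here: strings are immutable, so the same object means
    # the same content. A different object at worst costs one extra parse.
    if cached is not None and cached[0] is source:
        return cached[1]
    parsed = parse_qs(source, keep_blank_values=True)
    # Store the source string alongside the result so changes are detectable.
    environ[CACHE_KEY] = (source, parsed)
    return parsed
```

The same pattern would extend to POST bodies by also remembering the wsgi.input object and the CONTENT_TYPE and CONTENT_LENGTH values that were in effect when the body was parsed.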

One other thing I've been meaning to ask. The WSGI specification currently allows no way for an application or middleware component to pass custom information back up the middleware chain, so an application cannot ask a middleware component not to perform a certain task. Communication up the chain can only be provided through status, headers, exc_info and content. A response dictionary could very easily be added as another parameter to start_response, similar to environ, to send information up the chain. Was this deliberately avoided so that the system wouldn't get complicated?

I was thinking about this too. It certainly makes it simpler to make the response fairly plain and HTTP-like, but I can imagine lots of useful information that doesn't fit well into headers or response codes. E.g., if you are sending a 403 error message, maybe you want to pass some extra information along about why it happened. You could write that out as the HTML response, but then it becomes somewhat opaque if that gets rewritten. Something like the extension information that gets put in the request environment; it's always purely optional, but there to allow cooperation between components. There's no escape mechanism like that for the response.

Well... there is a way, actually -- you can add callbacks to the request. For instance, in my session handler I add a callable to the request that returns the session object. If you don't call that at all then the session isn't even created, and no session ID is assigned (assuming you didn't already have a session). If you do call it, then the middleware modifies the response to add a session ID. So there's really some communication from the application that affects the response, but it isn't being expressed as part of the response stream (the status, headers, and body).
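Roughly, the callback approach could look like the sketch below. It is not the actual session handler; the environ key, the cookie name and the in-memory store are assumptions, and it only sets the cookie if the session is requested before start_response is called:

```python
import uuid
from http.cookies import SimpleCookie

# Hypothetical environ key under which the callback is published.
SESSION_KEY = "x.session.factory"


class LazySessionMiddleware:
    """Create a session only if the application asks for one."""

    def __init__(self, app, cookie_name="sid"):
        self.app = app
        self.cookie_name = cookie_name
        self.store = {}  # in-memory store, for illustration only

    def _sid_from_cookie(self, environ):
        morsel = SimpleCookie(environ.get("HTTP_COOKIE", "")).get(self.cookie_name)
        return morsel.value if morsel is not None else None

    def __call__(self, environ, start_response):
        state = {"sid": self._sid_from_cookie(environ), "new": False}

        def get_session():
            # Called by the application; this is the "message" sent upward.
            if state["sid"] not in self.store:
                state["sid"] = uuid.uuid4().hex
                state["new"] = True
                self.store[state["sid"]] = {}
            return self.store[state["sid"]]

        environ[SESSION_KEY] = get_session

        def session_start_response(status, headers, exc_info=None):
            if state["new"]:
                # The application asked for a session, so assign an ID.
                cookie = "%s=%s; Path=/" % (self.cookie_name, state["sid"])
                headers = list(headers) + [("Set-Cookie", cookie)]
            return start_response(status, headers, exc_info)

        return self.app(environ, session_start_response)
```

An application that never calls `environ[SESSION_KEY]()` leaves the store empty and the response headers untouched.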

--
Ian Bicking  /  [EMAIL PROTECTED]  /  http://blog.ianbicking.org
