I've added another spec to wsgi.org: http://wsgi.org/wsgi/Specifications/handling_post_forms
This one is a little more intrusive than wsgi.url_vars, but it
addresses an outstanding source of problems: contention over
wsgi.input.

Text copied:

:Title: Handling POST forms in WSGI
:Author: Ian Bicking <[EMAIL PROTECTED]>
:Discussions-To: Python Web-SIG <web-sig@python.org>
:Status: Draft
:Created: 21-Oct-2006

.. contents::

Abstract
--------

This suggests a way that WSGI middleware, applications, and frameworks
can access POST form bodies so that there is less contention for the
``wsgi.input`` stream.

Rationale
---------

Currently ``environ['wsgi.input']`` points to a stream that represents
the body of the HTTP request. Once this stream has been read, it
cannot necessarily be read again. It may not have a ``seek`` method
(none is required by the WSGI specification, and frequently none is
provided by WSGI servers). As a result, any piece of a system that
looks at the request body essentially takes ownership of that body,
and no one else is able to access it. This is particularly problematic
for POST form requests, as many framework pieces expect to have access
to the form data.

Specification
-------------

This applies when certain requirements of the WSGI environment are
met::

    def is_post_request(environ):
        if environ['REQUEST_METHOD'].upper() != 'POST':
            return False
        # A missing CONTENT_TYPE is treated as a urlencoded form.
        content_type = environ.get('CONTENT_TYPE',
                                   'application/x-www-form-urlencoded')
        return (
            content_type.startswith('application/x-www-form-urlencoded')
            or content_type.startswith('multipart/form-data'))

That is, it must be a POST request, and it must be a form request
(generally ``application/x-www-form-urlencoded`` or, when there are
file uploads, ``multipart/form-data``). When this happens, the form can
be parsed by ``cgi.FieldStorage``.
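
As a quick illustration of the check above, here is a small sketch that runs it against a few hand-built environments. The sample ``environ`` dicts are hypothetical and reduced to the two keys the check reads, and the function is reproduced on the assumption that the intended expression is an ``or`` of two ``startswith`` calls.

```python
def is_post_request(environ):
    """Return True for POST requests whose body is a form submission."""
    if environ['REQUEST_METHOD'].upper() != 'POST':
        return False
    # A missing CONTENT_TYPE is treated as a urlencoded form.
    content_type = environ.get('CONTENT_TYPE',
                               'application/x-www-form-urlencoded')
    return (content_type.startswith('application/x-www-form-urlencoded')
            or content_type.startswith('multipart/form-data'))


# Hypothetical environments, reduced to the keys the check looks at.
samples = {
    'urlencoded with charset': {
        'REQUEST_METHOD': 'POST',
        'CONTENT_TYPE': 'application/x-www-form-urlencoded; charset=UTF-8',
    },
    'file upload': {
        'REQUEST_METHOD': 'POST',
        'CONTENT_TYPE': 'multipart/form-data; boundary=xyz',
    },
    'json body': {
        'REQUEST_METHOD': 'POST',
        'CONTENT_TYPE': 'application/json',
    },
    'get request': {'REQUEST_METHOD': 'GET'},
    'post without content type': {'REQUEST_METHOD': 'POST'},
}

results = {name: is_post_request(env) for name, env in samples.items()}
for name, applies in results.items():
    print(name, applies)
```

Because the check uses ``startswith``, content types that carry parameters such as ``charset`` or ``boundary`` still match, while a JSON POST or any GET request falls outside the scope of this specification.
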
The results of this parsing should be put in
``environ['wsgi.post_form']`` in a particular fashion::

    import cgi

    def get_post_form(environ):
        assert is_post_request(environ)
        input = environ['wsgi.input']
        post_form = environ.get('wsgi.post_form')
        # Reuse the cached parse only if wsgi.input is still the
        # replacement stream that was installed when it was made.
        if (post_form is not None
            and post_form[0] is input):
            return post_form[2]
        fs = cgi.FieldStorage(fp=input,
                              environ=environ,
                              keep_blank_values=1)
        new_input = InputProcessed()
        post_form = (new_input, input, fs)
        environ['wsgi.post_form'] = post_form
        environ['wsgi.input'] = new_input
        return fs

    class InputProcessed(object):
        def read(self, *args):
            raise EOFError(
                'The wsgi.input stream has already been consumed')
        readline = readlines = __iter__ = read

This way multiple consumers can parse a POST form, accessing the form
data in any order (later consumers will get the already-parsed data).
The replacement ``wsgi.input`` guards against non-conforming access to
the data, while the value in ``wsgi.post_form`` allows for access to
the original ``wsgi.input`` in case it may be useful.

By checking for the replacement ``wsgi.input`` when checking if
``wsgi.post_form`` applies, this does not get in the way of WSGI
middleware that may replace that key. If the key is replaced, then the
parsed data is implicitly invalidated.

Query String data
-----------------

Note that nothing in this specification touches or applies to the
query string (in ``environ['QUERY_STRING']``). This is not parsed as
part of the process, and nothing in this specification applies to GET
requests, or to the query string which may be present in a POST
request.

Open Issues
-----------

1. Is ``cgi.FieldStorage`` the best way to store the parsed data? It's
   the most common way, at least.

2. This doesn't address non-form-submission POST requests. Most of the
   same issues apply to such requests, except that frameworks tend not
   to touch the request body in that case. The body may be large, so
   the actual contents of the request body shouldn't go in the
   environment.
   Perhaps they could go in a temporary file, but this too might be an
   unnecessary indirection in many cases.

   Also other kinds of request (like PUT) that have a request body are
   not covered, for largely the same reason. In both these cases, it is
   much easier to construct a new ``wsgi.input`` that accesses whatever
   your internal representation of the request body is.

3. Is the tuple of information necessary in ``wsgi.post_form``, or
   could it just be the ``FieldStorage`` instance?

4. Should ``wsgi.input`` be replaced by ``InputProcessed``, or just
   left as is?
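
To make the sharing behaviour of ``wsgi.post_form`` concrete, here is a minimal Python 3 sketch in which two consumers ask for the same form. It is not the specification's implementation: ``urllib.parse.parse_qs`` is swapped in for ``cgi.FieldStorage`` (so only urlencoded bodies are handled), the body is assumed to be UTF-8, and the ``environ`` dict is a hypothetical hand-built one rather than something a real server produced.

```python
import io
from urllib.parse import parse_qs


class InputProcessed:
    """Replacement stream that refuses any further reads."""

    def read(self, *args):
        raise EOFError('The wsgi.input stream has already been consumed')

    readline = readlines = __iter__ = read


def get_post_form(environ):
    """Parse the urlencoded body once and cache the result in environ."""
    stream = environ['wsgi.input']
    cached = environ.get('wsgi.post_form')
    if cached is not None and cached[0] is stream:
        return cached[2]  # an earlier consumer already parsed the body
    length = int(environ.get('CONTENT_LENGTH') or 0)
    raw = stream.read(length).decode('utf-8')  # assumption: UTF-8 body
    # Assumption: parse_qs stands in for cgi.FieldStorage here.
    form = parse_qs(raw, keep_blank_values=True)
    replacement = InputProcessed()
    environ['wsgi.post_form'] = (replacement, stream, form)
    environ['wsgi.input'] = replacement
    return form


# Hypothetical request environment with a small urlencoded body.
body = b'name=Ian&tag=wsgi&tag=spec'
environ = {
    'REQUEST_METHOD': 'POST',
    'CONTENT_TYPE': 'application/x-www-form-urlencoded',
    'CONTENT_LENGTH': str(len(body)),
    'wsgi.input': io.BytesIO(body),
}

first = get_post_form(environ)   # e.g. a piece of middleware
second = get_post_form(environ)  # e.g. the application itself
print(first is second)
```

The second call returns the cached object because ``environ['wsgi.input']`` is still the replacement stream recorded in the tuple. If middleware installs a different ``wsgi.input`` in between, the identity check fails and the body is parsed again from the new stream, which is the implicit invalidation described in the specification. The original stream stays reachable as ``environ['wsgi.post_form'][1]``.
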