Re: [Web-SIG] Proposal for asynchronous WSGI variant
Ionel Maries Cristian wrote:
> This is a very interesting initiative. However, there are a few problems:
>
> - there is no support for chunked input - that would require having support for readline in the first place; also, it should be the gateway's business to decode the chunked input.

Unfortunately Nginx does not yet support chunked input, so I can't help here.

> - the original wsgi spec already has some support for streaming and asynchronicity [*1]

Right, and in fact I have used this to implement some extensions in the WSGI module for Nginx.

> - I don't see how removing the write callable will help (I don't see an issue with the server providing a stringio.write as the write callable for synchronous apps)

To summarize: the main problem with the write callable is that after you call it, control is not returned to the WSGI gateway. With an asynchronous server this is a problem, since if you write a lot of data the server may not be able to send it all to the client.

This is not a problem if the application returns a generator, since the gateway can suspend execution until the socket is ready to send data. With the write callable this is not possible.

In my implementation of WSGI for Nginx I provide two separate implementations of the write callable:

- put the socket temporarily in synchronous mode (this is WSGI compliant, but it is very bad for Nginx)
- buffer all the written data until control is returned to the gateway (this is *not* WSGI compliant)

However, if you use greenlets, then implementing the write callable is not a problem.

> - passing non-string values through middleware will make using/porting existing wsgi middleware hairy (suppose you have a middleware that applies some filter to the appiter - you'll have your code full of isinstance nastiness)

Yes, this should be avoided.

> Also, have you looked at the existing gateway implementations with asynchronous support? There are a bunch of them:
>
>   http://trac.wiretooth.com/public/wiki/asycwsgi
>   http://chiral.j4cbo.com/trac
>   http://wiki.secondlife.com/wiki/Eventlet
>
> my own shot at the problem: http://code.google.com/p/cogen/
> and manlio's mod_wsgi for nginx (I may be missing some)
>
> However, there is absolutely no unity in handling the wsgi.input (or equivalent)

The wsgi.input can be handled with ngx.poll:

    c = ngx.connection_wrapper(wsgi.input)
    ...
    ngx.poll_register(c, WSGI_POLLIN)
    ...
    ngx.poll(1000)

Unfortunately I cannot test whether this is implementable. I have some doubts.

[...]

Manlio Perillo
___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
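A minimal sketch of the second write() strategy described in the message above (buffer everything until control returns to the gateway), written for Python 3. The names run_buffered and send are hypothetical and are not taken from the Nginx mod_wsgi code. A real asynchronous gateway would wait for the socket to become writable inside send; here send is just a callback.

```python
def run_buffered(app, environ, send):
    """Drive a WSGI app, holding write() output until the app yields or ends."""
    pending = []   # data passed to write() that has not been sent yet
    state = {}

    def write(data):
        # Not WSGI compliant: the data is only queued here, and is sent
        # later, once control is back in the gateway.
        pending.append(data)

    def start_response(status, headers, exc_info=None):
        state["status"] = status
        state["headers"] = headers
        return write

    def flush():
        while pending:
            send(pending.pop(0))

    result = app(environ, start_response)
    try:
        for chunk in result:
            flush()            # the app yielded, so the gateway is in control
            if chunk:
                send(chunk)
        flush()                # data written after the last yield
    finally:
        close = getattr(result, "close", None)
        if close is not None:
            close()
    return state.get("status")
```

With this approach an application that calls write() many times without yielding keeps all of that data in memory, which is the cost of not blocking the event loop.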
Re: [Web-SIG] Proposal for asynchronous WSGI variant
Christopher Stawarz wrote:
> On May 7, 2008, at 4:20 AM, Graham Dumpleton wrote:
>> 2008/5/7 Manlio Perillo:
>>> With your solution it seems that writing middleware will not become any easier.
>>
>> Part of what I was trying to say was that this needn't be exposed to middleware, unless it has to be. It was effectively a lower level of interaction which a middleware immediately on top of the WSGI adapter would use to hook into the async type model, but then present it to higher levels as a more traditional WSGI interface.
>
> That would be a really elegant solution, except, as you say:
>
>> That layer would though obviously use something like greenlets to bridge the two.
>
> The problem being that greenlets aren't part of the Python language. They're an extension that works by doing clever stuff with the C stack. And as much as we might wish that Python supported them natively (which I do, since they're a really nice alternative to OS threads), it doesn't, so I don't think they can play any role in a WSGI-ASYNC spec.

This is not entirely true: after all, WSGI explicitly exposes the concepts of processes and threads (via the corresponding variables in the WSGI environ and some hints in the specification), and these are not really part of the Python language either.

> Chris

Manlio Perillo
Re: [Web-SIG] Proposal for asynchronous WSGI variant
On May 6, 2008, at 8:51 PM, Ionel Maries Cristian wrote:
> - there is no support for chunked input - that would require having support for readline in the first place,

Why is readline a requirement for chunked input? Each chunk specifies its size, and the application receiving a chunk just keeps calling recv() until it's read the specified number of bytes.

> also, it should be the gateway's business to decode the chunked input.

OK, but if it's the gateway's responsibility, then this isn't an issue at all, as decoding of chunked data takes place before the application ever sees the request body.

To be clear, I didn't mean to imply that awsgi.input must be the actual socket object connected to the client. It just has to provide a recv() method with the semantics of a socket. The server is free to pre-read the entire request, or it can receive data on demand, decoding any chunked input before it passes it to the application.

> - I don't see how removing the write callable will help (I don't see an issue with the server providing a stringio.write as the write callable for synchronous apps)

Manlio explained this well, so I'll refer you to his response.

> - passing non-string values through middleware will make using/porting existing wsgi middleware hairy (suppose you have a middleware that applies some filter to the appiter - you'll have your code full of isinstance nastiness)

Yes, my proposal would require existing middleware to be modified to support AWSGI, which is unfortunate.

> Also, have you looked at the existing gateway implementations with asynchronous support? There are a bunch of them:
>
>   http://trac.wiretooth.com/public/wiki/asycwsgi
>   http://chiral.j4cbo.com/trac
>   http://wiki.secondlife.com/wiki/Eventlet
>
> my own shot at the problem: http://code.google.com/p/cogen/
> and manlio's mod_wsgi for nginx (I may be missing some)

I've seen some of these, but I'll be sure to take a look at the others.

> [*1] In my implementation I do a bunch of tricks to make use of regular wsgi middleware with async apps possible - I have a bunch of working examples using pylons:
> - the extensions in the environ (like your environ['awsgi.readable']) return an empty string that penetrates most[*2] middleware and set the actual message (like your (token, fd, timeout) tuple) on some internal object
>
> From this point of view, an async middleware stack is just a set of middleware that supports streaming.

This is an interesting idea that I'd like to explore some more. I really like the fact that it works with existing middleware (or at least fully WSGI-compliant middleware, as you point out).

Apart from the write() callable, the biggest issue I see with the WSGI spec for asynchronous servers is wsgi.input. The problem is that this is explicitly a file-like object. This means that input.read(n) reads until it finds n bytes or EOF, input.readline() reads until it finds a newline or EOF, and input.readlines() and input.__iter__() always read to EOF. Every one of these functions implies multiple I/O operations (calls to fread() for a file or recv() for a socket).

This means that if an application calls input.read(8), and only 4 bytes are available, the first call to recv() returns 4 bytes, and the second one blocks. And now your entire server is blocked until data is available on this one socket.

(Of course, the server is free to pre-read the entire request at its leisure and feed it to the application from a buffer, but this may not always be practical or desirable, and I don't think asynchronous servers should be forced to do so.)

This is why I propose replacing wsgi.input with awsgi.input, which exposes a recv() method with socket-like (rather than file-like) semantics. The meaning of input.recv(n) is therefore "read at most n bytes (possibly less), calling the underlying socket recv() at most one time".

So, although your suggestion may eliminate the need to yield non-string output from the application iterable, I still think there needs to be a separate specification for asynchronous gateways, since the semantics of wsgi.input just aren't compatible with an asynchronous model.

Chris
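The difference between file-like read(n) and socket-like recv(n) described in the message above can be sketched with a local socket pair (Python 3). The helper names are hypothetical. The receiving socket is put in non-blocking mode only so that the call which would otherwise block shows up as a BlockingIOError instead of hanging the demo.

```python
import select
import socket


def file_like_read(sock, n):
    """File-like read(n): keep calling recv() until n bytes arrive or EOF."""
    chunks = []
    remaining = n
    while remaining:
        data = sock.recv(remaining)   # may run several times per read()
        if not data:                  # EOF
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


def socket_like_recv(sock, n):
    """Socket-like recv(n): one underlying recv(), possibly fewer than n bytes."""
    return sock.recv(n)


def demo():
    client, server = socket.socketpair()
    try:
        server.setblocking(False)
        # Only 4 of the 8 requested bytes are available.
        client.sendall(b"abcd")
        select.select([server], [], [], 1.0)   # wait until readable
        first = socket_like_recv(server, 8)    # returns what is there

        client.sendall(b"efgh")
        select.select([server], [], [], 1.0)
        try:
            file_like_read(server, 8)          # second recv() finds no data
            would_block = False
        except BlockingIOError:
            would_block = True
        return first, would_block
    finally:
        client.close()
        server.close()
```

On a blocking socket the second recv() inside file_like_read would stall the whole single-threaded server, which is the case the message warns about.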
Re: [Web-SIG] Proposal for asynchronous WSGI variant
On Wed, 2008-05-07 at 14:00 -0400, Christopher Stawarz wrote:
> On May 7, 2008, at 4:20 AM, Graham Dumpleton wrote:
>> 2008/5/7 Manlio Perillo:
>>> With your solution it seems that writing middlewares will not became more easy.
>>
>> Part of what I was trying to say was that this needn't be exposed to middlewares, unless it has to be. It was effectively a lower level of interaction which a middleware immediately on top of the WSGI adapter would use to hook into the async type model, but then present it to higher levels as more traditional WSGI interface.
>
> That would be a really elegant solution, except, as you say:
>
>> That layer would though obviously use something like greenlets to bridge the two.
>
> The problem being that greenlets aren't part of the Python language. They're an extension that works by doing clever stuff with the C stack. And as much as we might wish that Python supported them natively (which I do, since they're a really nice alternative to OS threads), it doesn't, so I don't think they can play any role in a WSGI-ASYNC spec.

It's my understanding that greenlets are python, not C. Are you thinking of tasklets in stackless?

d
Re: [Web-SIG] Proposal for asynchronous WSGI variant
On May 7, 2008, at 3:35 PM, Duncan McGreggor wrote:
> It's my understanding that greenlets are python, not C. Are you thinking of tasklets in stackless?

The version for CPython is a C extension module. Have a look at the comments in

http://svn.red-bean.com/bob/greenlet/trunk/greenlet.c

The switching is accomplished by saving and restoring chunks of the C stack, which I find both extremely clever and kind of scary :)

Chris
Re: [Web-SIG] Proposal for asynchronous WSGI variant
On Wed, May 7, 2008 at 10:00 PM, Christopher Stawarz wrote:
> On May 6, 2008, at 8:51 PM, Ionel Maries Cristian wrote:
>> - there is no support for chunked input - that would require having support for readline in the first place,
>
> Why is readline a requirement for chunked input? Each chunk specifies its size, and the application receiving a chunk just keeps calling recv() until it's read the specified number of bytes.

Well, not really a requirement. I was implying that there is some sort of readline, since that is what one would generally use to get the size of a chunk, but not necessarily.

> [...]
>
> So, although your suggestion may eliminate the need to yield non-string output from the application iterable, I still think there needs to be a separate specification for asynchronous gateways, since the semantics of wsgi.input just aren't compatible with an asynchronous model.
>
> Chris

The way I see it, asynchronous wsgi is just a matter of deciding how to handle the input asynchronously - an asynchronous input wsgi extension specification.

So I suggest completely dropping the idea of an incompatibility between async_wsgi and wsgi (since it doesn't really help anyone in the long run - it just fragments the gateway providers and overcomplicates things) and concentrating more on the async input extension.

So the idea is that the gateways would provide async input by default and a piece of middleware or config option to make it synchronous (well, actually, buffer it).

[...]
Re: [Web-SIG] Proposal for asynchronous WSGI variant
On May 7, 2008, at 5:36 PM, Ionel Maries Cristian wrote:
> The way I see it, asynchronous wsgi is just a matter of deciding how to handle the input asynchronously - an asynchronous input wsgi extension specification.

Another crucial element is the ability to perform non-blocking I/O on other file descriptors (TCP connections to other servers, pipes to other OS processes). This is why the readable/writable functions (or something like them) are necessary.

> So I suggest completely dropping the idea of an incompatibility between async_wsgi and wsgi (since it doesn't really help anyone in the long run - it just fragments the gateway providers and overcomplicates things) and concentrating more on the async input extension.

This is a compelling argument. As long as the application iterable yields only strings (which, the more I think about it, seems like the right thing to do), then the remaining functionality I propose can be implemented as extensions to WSGI, perhaps in an x-wsgiorg.async namespace.

However, the problem remains that, even though an asynchronous server can implement the write() callable and wsgi.input as required by the WSGI spec, they effectively can't be used by applications, since they involve potentially blocking I/O operations. So either WSGI has to be revised to take the needs of asynchronous servers into account, or we have to accept that async servers can never be fully WSGI compliant.

> So the idea is that the gateways would provide async input by default and a piece of middleware or config option to make it synchronous (well, actually, buffer it).

You mean the middleware would be used to make the input synchronous so that an app that uses wsgi.input would function normally (reading from the buffer)? That would fix the problem for wsgi.input, but the issue with write() remains.

Another point to keep in mind is that in order to function correctly on an async server, an application really has to be written with that execution environment in mind. For example, an app couldn't use httplib, since it does blocking I/O (which, again, would freeze up the entire server).

> Also, since there already are a bunch of async gateways out there, I would like to hear if the other providers would/could implement the proposed form of common async input - that would ultimately decide the success of this proposed spec.

I would like to hear their opinions as well. In particular, do any Twisted folks have comments on what we've discussed?

Chris
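A rough sketch of the buffering middleware discussed in the message above, assuming the awsgi.* environ keys from the AWSGI proposal and Python 3 byte strings. The name buffer_input is hypothetical. The middleware reads the whole request body through awsgi.readable and awsgi.input.recv(), then hands an ordinary WSGI application a wsgi.input backed by an in-memory buffer. It does nothing about the write() callable, so that issue remains.

```python
import io


def buffer_input(wsgi_app, timeout=None):
    """Wrap a plain WSGI app so it can run on a gateway offering awsgi.input."""

    def wrapper(environ, start_response):
        source = environ["awsgi.input"]
        readable = environ["awsgi.readable"]
        remaining = int(environ.get("CONTENT_LENGTH") or 0)
        chunks = []

        while remaining:
            # Hand a wait token to the gateway; it resumes us when the
            # input is readable or the wait has timed out.
            yield readable(source, timeout)
            if environ["awsgi.timeout"]:
                msg = b"The request timed out."
                start_response("408 Request Timeout",
                               [("Content-Type", "text/plain"),
                                ("Content-Length", str(len(msg)))])
                yield msg
                return
            data = source.recv(remaining)
            if not data:          # client closed the connection early
                break
            chunks.append(data)
            remaining -= len(data)

        # From here on the wrapped app sees a normal, fully buffered input.
        environ["wsgi.input"] = io.BytesIO(b"".join(chunks))
        result = wsgi_app(environ, start_response)
        try:
            for chunk in result:
                yield chunk
        finally:
            close = getattr(result, "close", None)
            if close is not None:
                close()

    return wrapper
```

The wrapped application can then call environ['wsgi.input'].read() without blocking the server, at the price of holding the whole request body in memory.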
Re: [Web-SIG] Proposal for asynchronous WSGI variant
Christopher Stawarz wrote:
> (I'm new to the list, so please forgive me for making my first post a specification proposal :)
>
> Browsing through the list archives, I see there's been some inconclusive discussions on adding better support for asynchronous web servers to the WSGI spec. Since such support would be very useful for some upcoming projects of mine, I decided to take a shot at specing out and implementing it.
>
> I'd be grateful for any feedback you have. If this seems like something worth pursuing, I would also welcome collaborators to help develop the spec further.

I'm glad to know that there are other people interested in asynchronous applications. Have you seen my extensions to WSGI in my module for Nginx?

The extension is documented here:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/README
(see the Extensions chapter).

For some examples:
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/examples/nginx-postgres-async.py
http://hg.mperillo.ath.cx/nginx/mod_wsgi/file/tip/examples/nginx-poll-sleep.py

Note that in Nginx the request body is pre-read before the application is called (in fact wsgi.input is either a cStringIO or File object).

Unfortunately there is a *big* usability problem: the extension is based on a well-specified feature of WSGI, namely that the gateway can suspend the execution of the WSGI application when it yields.

However, if the asynchronous code is in a child function, we have something like this:

    def application(environ, start_response):
        def nested():
            while True:
                poll(xxx)
                yield ''

            yield result

        for r in nested():
            if not r:
                yield ''

            yield r

That is, all the functions in the chain have to yield, which is not very good.

The solution is to use coroutines, and I'm planning to integrate greenlets (from the pylib project) into the WSGI module for Nginx.

[...]

Regards
Manlio Perillo
Re: [Web-SIG] Proposal for asynchronous WSGI variant
On May 5, 2008, at 10:08 PM, Graham Dumpleton wrote:
> If write() isn't to be returned by start_response(), then do away with start_response() if possible as per discussions for WSGI 2.0.

I think start_response() is necessary, because the application may need to yield for I/O readiness (e.g. to read the request body, as in my example app) before it decides what response status and headers to send.

> Also take note of:
>
>   http://www.wsgi.org/wsgi/Amendments_1.0
>
> and think about how Python 3.0 would affect things.

OK, will do.

> I'd also rather it not be called AWSGI, as it is not sufficiently distinct from WSGI. If you want to pursue this asynchronous style, then be more explicit and call it ASYNC-WSGI and use an 'asyncwsgi' tag in environ.

Good point. It'd be easy to type wsgi when you meant awsgi, or vice versa. But I think I'd prefer wsgi_async to asyncwsgi.

Thanks,
Chris
Re: [Web-SIG] Proposal for asynchronous WSGI variant
On May 6, 2008, at 6:17 AM, Manlio Perillo wrote:
> I'm glad to know that there are other people interested in asynchronous applications. Have you seen my extensions to WSGI in my module for Nginx?

Yes, I have, and I had your module in mind as a potential provider of the AWSGI interface.

> Note that in Nginx the request body is pre-read before the application is called (in fact wsgi.input is either a cStringIO or File object).

Although I didn't state it explicitly in my spec, my intention is for the server to be able to implement awsgi.input in any way it likes, as long as it provides a recv() method. It's totally acceptable for the request body to be pre-read.

> Unfortunately there is a *big* usability problem: the extension is based on a well-specified feature of WSGI, namely that the gateway can suspend the execution of the WSGI application when it yields.
>
> However, if the asynchronous code is in a child function, we have something like this:
>
> ...
>
> That is, all the functions in the chain have to yield, which is not very good.

Yes, you're right. However, if you're willing/able to use Python 2.5, you can use the new features of generators to implement a call stack that lets you call child functions and receive return values and exceptions from them. I've implemented this in awsgiref.callstack. Have a look at

http://pseudogreen.org/bzr/awsgiref/examples/echo_request_with_callstack.py

for an example of how it works.

> The solution is to use coroutines, and I'm planning to integrate greenlets (from the pylib project) into the WSGI module for Nginx.

Interesting, but it's not clear to me how/if this would work. Can you explain more or point me to some code?

Thanks,
Chris
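The generator-based call stack mentioned in the message above can be approximated with a small trampoline. This is a sketch of the general technique and not the awsgiref.callstack API, which is not shown in this thread; run_with_callstack is a hypothetical name. It is written for Python 3, where a generator can hand back a value with a plain return; under Python 2.5 a child would need another convention, such as yielding a wrapper object.

```python
import types


def run_with_callstack(root):
    """Flatten nested generators into one stream of items for the gateway."""
    stack = [root]
    to_send = None     # value delivered to the generator on top of the stack
    to_throw = None    # exception to raise inside it instead
    while stack:
        gen = stack[-1]
        try:
            if to_throw is not None:
                exc, to_throw = to_throw, None
                item = gen.throw(exc)
            else:
                item = gen.send(to_send)
            to_send = None
        except StopIteration as stop:
            stack.pop()
            to_send = stop.value       # child's return value goes to its caller
            continue
        except Exception as exc:
            stack.pop()
            if not stack:
                raise                  # nobody left to handle it
            to_throw = exc             # re-raise inside the caller
            continue
        if isinstance(item, types.GeneratorType):
            stack.append(item)         # "call" the child generator
        else:
            yield item                 # strings and wait tokens go to the gateway


def read_line(source):
    """Hypothetical child: waits once, then returns a value to its caller."""
    yield ("wait", source)             # stands in for a readable() token
    return source.upper()


def app():
    line = yield read_line("abc")      # looks like an ordinary function call
    yield line.encode()
```

Only the outermost iterable has to be handed to the gateway, so intermediate functions no longer need to re-yield every item from their children by hand.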
Re: [Web-SIG] Proposal for asynchronous WSGI variant
This is a very interesting initiative. However, there are a few problems:

- there is no support for chunked input - that would require having support for readline in the first place; also, it should be the gateway's business to decode the chunked input.
- the original wsgi spec already has some support for streaming and asynchronicity [*1]
- I don't see how removing the write callable will help (I don't see an issue with the server providing a stringio.write as the write callable for synchronous apps)
- passing non-string values through middleware will make using/porting existing wsgi middleware hairy (suppose you have a middleware that applies some filter to the appiter - you'll have your code full of isinstance nastiness)

Also, have you looked at the existing gateway implementations with asynchronous support? There are a bunch of them:

  http://trac.wiretooth.com/public/wiki/asycwsgi
  http://chiral.j4cbo.com/trac
  http://wiki.secondlife.com/wiki/Eventlet

my own shot at the problem: http://code.google.com/p/cogen/
and manlio's mod_wsgi for nginx (I may be missing some)

However, there is absolutely no unity in handling the wsgi.input (or equivalent)

[*1] In my implementation I do a bunch of tricks to make use of regular wsgi middleware with async apps possible - I have a bunch of working examples using pylons:
- the extensions in the environ (like your environ['awsgi.readable']) return an empty string that penetrates most[*2] middleware and set the actual message (like your (token, fd, timeout) tuple) on some internal object

From this point of view, an async middleware stack is just a set of middleware that supports streaming.

Please see:
http://cogen.googlecode.com/svn/trunk/docs/cogen.web.async.html
http://cogen.googlecode.com/svn/trunk/docs/cogen.web.wsgi.html

[*2] Middleware that consumes the app iter ruins that pattern, but regardless, such middleware is not compliant with the wsgi spec (see http://www.python.org/dev/peps/pep-0333/#middleware-handling-of-block-boundaries ) - notable examples are most of the exception handling middleware (they can't work otherwise anyway)

On Tue, May 6, 2008 at 4:30 AM, Christopher Stawarz wrote:
> (I'm new to the list, so please forgive me for making my first post a specification proposal :)
>
> [...]
[Web-SIG] Proposal for asynchronous WSGI variant
(I'm new to the list, so please forgive me for making my first post a specification proposal :)

Browsing through the list archives, I see there's been some inconclusive discussions on adding better support for asynchronous web servers to the WSGI spec. Since such support would be very useful for some upcoming projects of mine, I decided to take a shot at specing out and implementing it.

I'd be grateful for any feedback you have. If this seems like something worth pursuing, I would also welcome collaborators to help develop the spec further.

The name for this proposed specification is the Asynchronous Web Server Gateway Interface (AWSGI). As the name suggests, the spec is closely related to WSGI and is most easily described in terms of how it differs from WSGI.

AWSGI eliminates the following parts of WSGI:

- the environment variables wsgi.version and wsgi.input
- the write() callable returned by start_response()

AWSGI adds the following environment variables:

- awsgi.version
- awsgi.input
- awsgi.readable
- awsgi.writable
- awsgi.timeout

In addition, AWSGI allows the application iterable to yield two types of data:

- byte strings, handled as in WSGI
- the result of calling awsgi.readable or awsgi.writable, which indicates that the application should be paused and restarted when a specified file descriptor is ready for reading or writing

Because of AWSGI's similarity to WSGI, a simple wrapper can be used to run AWSGI applications on WSGI servers without alteration.

The following example application demonstrates typical usage of AWSGI. This application simply reads the request body and sends it back to the client. Each time it wants to receive data from the client, it first tests awsgi.input for readability and then calls its recv() method. If awsgi.input is not readable after one second, the application sends a 408 Request Timeout response to the client and terminates:

    def echo_request_body(environ, start_response):
        input = environ['awsgi.input']
        readable = environ['awsgi.readable']

        nbytes = int(environ.get('CONTENT_LENGTH') or 0)
        output = ''
        while nbytes:
            yield readable(input, 1.0)  # Time out after 1 second

            if environ['awsgi.timeout']:
                msg = 'The request timed out.'
                start_response('408 Request Timeout',
                               [('Content-Type', 'text/plain'),
                                ('Content-Length', str(len(msg)))])
                yield msg
                return

            data = input.recv(nbytes)
            if not data:
                break
            output += data
            nbytes -= len(data)

        start_response('200 OK', [('Content-Type', 'text/plain'),
                                  ('Content-Length', str(len(output)))])
        yield output

I have rough but functional implementations of a number of AWSGI components available in a Bazaar branch at http://pseudogreen.org/bzr/awsgiref/. The package includes an asyncore-based AWSGI server and an AWSGI-to-WSGI application wrapper. In addition, the file spec.txt contains a more detailed description of the specification (which is also appended below).

Again, I'd very much appreciate comments and criticism.

Thanks,
Chris

Detailed AWSGI Specification

- Required AWSGI environ variables:

  * All variables required by WSGI, except for wsgi.version and wsgi.input, which must *not* be present

  * awsgi.version = the tuple (1, 0)

  * awsgi.input

    This is an object with one method, recv(bufsize), which behaves like the socket method of the same name (although it doesn't support the optional flags parameter). Before each call to recv(), the application must test awsgi.input for readability via awsgi.readable. The result of calling recv() without doing so is undefined.

    (XXX: Should recv() handle EINTR for the application?)

  * awsgi.readable
  * awsgi.writable

    These are callables with the signature f(fd, timeout=None). fd is either a file descriptor (i.e. int or long) or an object with a fileno() method that returns a file descriptor. timeout has the same semantics as the timeout parameter to select.select(). If the operation times out, awsgi.timeout will be true when the application resumes.

    In addition to checking readiness for reading or writing, servers should also monitor file descriptors for exceptional conditions (e.g. out-of-band data) and restart the application if they occur.

  * awsgi.timeout = boolean indicating whether the most recent read or write wait timed out (false if there have been no waits)

- start_response() must *not* return a write() callable, as this method of providing application output to the server is incompatible with asynchronous execution.

- The server must accept awsgi.input as input to awsgi.readable, either by providing an actual socket object or by special-case
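The AWSGI-to-WSGI wrapper mentioned in the proposal could look roughly like the following. This is a sketch under stated assumptions, not the code in the awsgiref package: the names awsgi_to_wsgi, _Wait and _InputAdapter are hypothetical, the wait token is an arbitrary object, and awsgi.input is special-cased as always readable because a synchronous server may block in read(). It targets Python 3.

```python
import select


class _Wait:
    """Token returned by awsgi.readable / awsgi.writable (assumed format)."""

    def __init__(self, fd, timeout, for_write):
        self.fd = fd
        self.timeout = timeout
        self.for_write = for_write


class _InputAdapter:
    """Expose a file-like wsgi.input through a recv()-only interface."""

    def __init__(self, stream):
        self._stream = stream

    def recv(self, bufsize):
        # May block, which is acceptable on a synchronous WSGI server.
        return self._stream.read(bufsize)


def awsgi_to_wsgi(awsgi_app):
    def wsgi_app(environ, start_response):
        environ = dict(environ)
        adapter = _InputAdapter(environ.pop("wsgi.input"))
        environ.pop("wsgi.version", None)
        environ.update({
            "awsgi.version": (1, 0),
            "awsgi.input": adapter,
            "awsgi.readable": lambda fd, timeout=None: _Wait(fd, timeout, False),
            "awsgi.writable": lambda fd, timeout=None: _Wait(fd, timeout, True),
            "awsgi.timeout": False,
        })

        def start(status, headers, exc_info=None):
            # AWSGI's start_response must not return a write() callable.
            start_response(status, headers, exc_info)

        for item in awsgi_app(environ, start):
            if not isinstance(item, _Wait):
                yield item                       # ordinary byte string
            elif item.fd is adapter:
                environ["awsgi.timeout"] = False  # special case: always ready
            else:
                rlist = [] if item.for_write else [item.fd]
                wlist = [item.fd] if item.for_write else []
                # Block here in place of returning to an event loop.
                r, w, x = select.select(rlist, wlist, [item.fd], item.timeout)
                environ["awsgi.timeout"] = not (r or w or x)

    return wsgi_app
```

A wrapper like this simply blocks in select.select() wherever an asynchronous server would suspend the application, so the application runs unchanged but without any concurrency benefit.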
Re: [Web-SIG] Proposal for asynchronous WSGI variant
2008/5/6 Christopher Stawarz [EMAIL PROTECTED]:

(I'm new to the list, so please forgive me for making my first post a
specification proposal :)

Browsing through the list archives, I see there's been some
inconclusive discussions on adding better support for asynchronous web
servers to the WSGI spec. Since such support would be very useful for
some upcoming projects of mine, I decided to take a shot at specing out
and implementing it. I'd be grateful for any feedback you have. If this
seems like something worth pursuing, I would also welcome collaborators
to help develop the spec further.

The name for this proposed specification is the Asynchronous Web Server
Gateway Interface (AWSGI). As the name suggests, the spec is closely
related to WSGI and is most easily described in terms of how it differs
from WSGI.

AWSGI eliminates the following parts of WSGI:

- the environment variables wsgi.version and wsgi.input
- the write() callable returned by start_response()

AWSGI adds the following environment variables:

- awsgi.version
- awsgi.input
- awsgi.readable
- awsgi.writable
- awsgi.timeout

In addition, AWSGI allows the application iterable to yield two types
of data:

- byte strings, handled as in WSGI
- the result of calling awsgi.readable or awsgi.writable, which
  indicates that the application should be paused and restarted when a
  specified file descriptor is ready for reading or writing

Because of AWSGI's similarity to WSGI, a simple wrapper can be used to
run AWSGI applications on WSGI servers without alteration.

The following example application demonstrates typical usage of AWSGI.
This application simply reads the request body and sends it back to the
client. Each time it wants to receive data from the client, it first
tests awsgi.input for readability and then calls its recv() method.
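
To make the "simple wrapper" idea above concrete, here is a rough sketch of how an AWSGI application might be run on an ordinary blocking WSGI server. It is not the wrapper from the awsgiref branch. The names awsgi_to_wsgi, _Wait and _Input are hypothetical, the code targets Python 3 (so bodies are byte strings), and it assumes that wsgi.input blocks on read and can therefore be treated as always readable. Whenever the application yields the result of awsgi.readable or awsgi.writable, the wrapper simply blocks in select.select() and records whether the wait timed out.

```python
import select


class _Wait:
    """Hypothetical marker returned by awsgi.readable / awsgi.writable."""

    def __init__(self, fd, timeout, for_write):
        self.fd = fd
        self.timeout = timeout
        self.for_write = for_write


class _Input:
    """Exposes recv(bufsize) on top of a blocking WSGI input stream."""

    def __init__(self, stream):
        self._stream = stream

    def recv(self, bufsize):
        return self._stream.read(bufsize)


def awsgi_to_wsgi(awsgi_app):
    """Return a WSGI application that runs awsgi_app synchronously."""

    def wsgi_app(environ, start_response):
        env = dict(environ)
        env.pop('wsgi.version', None)
        awsgi_input = _Input(env.pop('wsgi.input'))
        env['awsgi.version'] = (1, 0)
        env['awsgi.input'] = awsgi_input
        env['awsgi.readable'] = lambda fd, timeout=None: _Wait(fd, timeout, False)
        env['awsgi.writable'] = lambda fd, timeout=None: _Wait(fd, timeout, True)
        env['awsgi.timeout'] = False

        def awsgi_start_response(status, headers, exc_info=None):
            # AWSGI forbids the write() callable, so drop the return value.
            start_response(status, headers, exc_info)

        for item in awsgi_app(env, awsgi_start_response):
            if not isinstance(item, _Wait):
                yield item  # byte string: pass through as in WSGI
            elif item.fd is awsgi_input:
                # Assumption: wsgi.input blocks on read, so it is "ready".
                env['awsgi.timeout'] = False
            else:
                # Block in select() instead of returning to an event loop.
                rlist, wlist = ([], [item.fd]) if item.for_write else ([item.fd], [])
                ready_r, ready_w, _ = select.select(rlist, wlist, [], item.timeout)
                env['awsgi.timeout'] = not (ready_r or ready_w)

    return wsgi_app
```

The wrapped application can then be handed to any WSGI server, for example wsgiref.simple_server.make_server(host, port, awsgi_to_wsgi(app)). Blocking in select() gives up the benefit of asynchronous execution, which seems an acceptable trade-off for a compatibility shim but would not be for a real AWSGI server.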
[...]

- start_response() must *not* return a write() callable, as this method of providing application output to the server