[Web-SIG] PEP 444 and asynchronous support, continued
After a weekend of experimentation with several asynchronous frameworks including gevent, tornado and twisted (and writing one myself too), these are my findings so far:

- asynchronous socket implementations vary wildly across different frameworks
- gevent is the fastest, tornado comes second, while twisted is pretty slow
- twisted provides the most comprehensive support for implementing protocols, while the other two mostly just provide low-level support for asynchronous sockets
- futures seem to have a significant overhead (from the thread synchronization)
- gevent provides the easiest programming interface with greenlets, since it pretty much lets you write asynchronous code as you would write it synchronously
- gevent could make use of the regular, synchronous PEP 444 API by monkey patching the socket library (through its "import monkey; monkey.patch_socket()" call)

The significance of this for the Python web standards effort is that providing an asynchronous API that works for the existing asynchronous frameworks does not seem feasible. I'd love to see a solution for this in the standard library, but gevent's monkey patching approach, while convenient for the developer, obviously cannot work there. Before an asynchronous WSGI API can be provided, this lower level problem needs to be solved first.

The crucial question is: is it possible to provide gevent's level of convenience through the standard library, and if not, what is the next best solution? I'd like to hear everyone's thoughts on this (especially Guido's).

___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
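[Editorially, the monkey-patching trick Alex describes can be illustrated with the standard library alone. gevent itself swaps in greenlet-aware replacements for the socket module; this hypothetical sketch patches time.sleep instead, purely to show how unmodified blocking-style code transparently picks up new behavior:]

```python
# Stdlib-only illustration of the idea behind gevent's
# monkey.patch_socket(): replace a blocking primitive at module level
# so unmodified calling code picks up the new behavior. (gevent really
# swaps the socket module for greenlet-aware versions; time.sleep is
# used here only as a stand-in.)
import time

calls = []
_real_sleep = time.sleep

def cooperative_sleep(seconds):
    # A real event loop would switch greenlets here instead of blocking.
    calls.append(seconds)

time.sleep = cooperative_sleep  # the "patch"

def legacy_handler():
    time.sleep(5)   # unchanged, synchronous-looking code...
    return 'done'   # ...now runs without actually blocking

result = legacy_handler()
time.sleep = _real_sleep  # undo the patch
```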
Re: [Web-SIG] PEP 444 and asynchronous support, continued
On Jan 16, 2011, at 10:49 PM, Alex Grönholm wrote:

> After a weekend of experimentation with several asynchronous frameworks including gevent, tornado and twisted (and writing one myself too), these are my findings so far:
> - asynchronous socket implementations vary wildly across different frameworks

That's certainly true.

> - gevent is the fastest, tornado comes second while twisted is pretty slow

Fastest at... what? If you have a WSGI benchmark for Twisted, could you contribute it in a form that we could use at http://speed.twistedmatrix.com/ so that we can improve the situation? Thanks.

> - futures seem to have a significant overhead (from the thread synchronization)

If there were some way to have tighter control over where the callbacks in add_done_callback were executed, thread synchronization might not be necessary. The module as currently specified does need to have a bit of overhead to deal with that, but the general concept doesn't.

> The significance of this for the Python web standards effort is that providing an asynchronous API that works for the existing asynchronous frameworks does not seem feasible.

I don't see how that follows from anything you've said above.

> I'd love to see a solution for this in the standard library, but gevent's monkey patching approach, while convenient for the developer, cannot obviously work there.

gevent and eventlet don't need any special support from WSGI, though. It's basically its own special kind of multithreading, with explicit context switches, but from the application developer's perspective it's almost exactly the same as working with threads. The API can be the existing WSGI API. Twisted and Tornado and Marrow (and Diesel, if that were a thing that still existed) do need explicit APIs, though, and it seems to me that there might be some value in that. For that matter, Eventlet can use Twisted as a networking engine, so you can already use Twisted asynchronously with WSGI that way.
The whole point of having an asynchronous WSGI standard is to allow applications to be written such that they can have explicitly-controlled event-driven concurrency, not abstracted-over context switches in a convenience wrapper.

> Before an asynchronous WSGI API can be provided, this lower level problem needs to be solved first.

I'm not even clear on what lower level problem you're talking about. If you're talking about interoperability between event-driven frameworks, I see it the other way around: asynchronous WSGI is a good place to start working on interoperability, not a problem to solve later when the rest of the harder low-level things have somehow been unified. (I'm pretty sure that's never going to happen.)

> The crucial question is: is it possible to provide gevent's level of convenience through the standard library, and if not, what is the next best solution? I'd like to hear everyone's thoughts on this (especially Guido's).

gevent and eventlet already have things that will monkey patch the socket module that the standard library uses (for example: http://eventlet.net/doc/patching.html), so ... yes? And if this level of convenience is what you're aiming for (blocking calls with an efficient, non-threaded scheduler), again, you don't need async WSGI for that.
Re: [Web-SIG] PEP 444 and asynchronous support, continued
On 17.01.2011 06:47, Glyph Lefkowitz wrote:

>> - gevent is the fastest, tornado comes second while twisted is pretty slow
>
> Fastest at... what? If you have a WSGI benchmark for Twisted, could you contribute it in a form that we could use at http://speed.twistedmatrix.com/ so that we can improve the situation? Thanks.

I'm already regretting saying anything about performance. Our tests were run with the Apache Benchmark (ab) against a Hello World type WSGI app. Certainly nothing special.

>> - futures seem to have a significant overhead (from the thread synchronization)
>
> If there were some way to have tighter control over where the callbacks in add_done_callback were executed, thread synchronization might not be necessary. The module as currently specified does need to have a bit of overhead to deal with that, but the general concept doesn't.

Unfortunately you are wrong. Thread synchronization is not necessary for callbacks, but it is necessary for supporting the result() method, since other threads may be blocking on that call.

>> The significance of this for the Python web standards effort is that providing an asynchronous API that works for the existing asynchronous frameworks does not seem feasible.
>
> I don't see how that follows from anything you've said above.

Asynchronous apps (save for gevent and the like) can't use the standard wsgi.input, since reading would block the event loop. Therefore an alternative input has to be provided, right? How would that work, then? If something, say, wsgi.async_input, were to be provided, what would it return from .read()? Futures? Deferreds?
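[Alex's point about result() can be seen directly with the concurrent.futures module: the callback path needs no waiting of its own, but result() must block the calling thread until the worker thread produces a value, which is exactly where the synchronization cost lives. A minimal, illustrative demonstration:]

```python
# Demonstration of why Future.result() needs thread synchronization:
# one thread may block on result() while the executor's worker thread
# produces the value.
import time
from concurrent.futures import ThreadPoolExecutor

def slow_add(a, b):
    time.sleep(0.05)  # simulate I/O or CPU work in the worker thread
    return a + b

results = []
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_add, 2, 3)
    # The callback fires once the value is set, with no extra waiting...
    future.add_done_callback(lambda f: results.append(f.result()))
    # ...while result() blocks the *calling* thread until then; that
    # cross-thread handoff is what the internal condition variable pays for.
    value = future.result()
```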
>> I'd love to see a solution for this in the standard library, but gevent's monkey patching approach, while convenient for the developer, cannot obviously work there.
>
> gevent and eventlet don't need any special support from WSGI though. It's basically its own special kind of multithreading, with explicit context-switches, but from the application developer's perspective it's almost exactly the same as working with threads. The API can be the existing WSGI API. Twisted and Tornado and Marrow (and Diesel, if that were a thing that still existed) do need explicit APIs though, and it seems to me that there might be some value in that.

Which leads to the problem I described above.

> For that matter, Eventlet can use Twisted as a networking engine, so actually you can already use Twisted asynchronously with WSGI that way. The whole point of having an asynchronous WSGI standard is to allow applications to be written such that they can have explicitly-controlled event-driven concurrency, not abstracted-over context switches in a convenience wrapper.

It is my understanding that eventlet only runs on CPython. Am I mistaken?

>> Before an asynchronous WSGI API can be provided, this lower level problem needs to be solved first.
>
> I'm not even clear on what lower level problem you're talking about. If you're talking about interoperability between event-driven frameworks, I see it the other way around: asynchronous WSGI is a good place to start working on interoperability, not a problem to solve later when the rest of the harder low-level things have somehow been unified. (I'm pretty sure that's never going to happen.)

>> The crucial question is: is it possible to provide gevent's level of convenience through the standard library, and if not, what is the next best solution? I'd like to hear everyone's thoughts on this (especially Guido's).
> gevent and eventlet already have things that will monkey patch the socket module that the standard library uses (for example: http://eventlet.net/doc/patching.html), so ... yes? And if this level of convenience is what you're aiming for (blocking calls with an efficient, non-threaded scheduler), again, you don't need async WSGI for that.

That's what I've been saying. But that only holds true for gevent/eventlet. Twisted, for one, needs explicit support unless, as you said, it is used through eventlet.
Re: [Web-SIG] PEP 444 feature request - Futures executor
Quartz is certainly powerful, but I think it's outside the scope of something we want in a WSGI spec. Is there a specific feature you're referring to?

- Original Message -
From: Nam Nguyen bits...@gmail.com
To: Timothy Farrell tfarr...@owassobible.org
Cc: P.J. Eby p...@telecommunity.com, web-sig@python.org
Sent: Tuesday, January 11, 2011 2:28:55 AM
Subject: Re: [Web-SIG] PEP 444 feature request - Futures executor

On Tue, Jan 11, 2011 at 10:40 AM, Timothy Farrell tfarr...@owassobible.org wrote:

> On Friday, January 7, 2011, P.J. Eby wrote:
>> There are some other issues that might need to be addressed, like maybe adding an attribute or two for the level of reliability guaranteed by the executor, or allowing the app to request a given reliability level. Specifically, it might be important to distinguish between:
>> * this will be run exactly once as long as the server doesn't crash
>> * this will eventually be run once, even if the server suffers a fatal error between now and then
>> IOW, to indicate whether the thing being done is transactional, so to speak.
>
> I understand why this would be good (credit card transactions particularly), but how would this play out in the real world? All servers will do their best to run the jobs given them. Are you suggesting that there would be a property of the executor that would change based on the load of the server or some other metric? Say the server has 100 queued jobs and only 2 worker threads, would it then have a way of saying, "I'll get to this eventually, but I'm pretty swamped"? Is that what you're getting at, or something more like database transactions: "I guarantee that I won't stop halfway through this process"?

Maybe in the same vein as Quartz (http://www.quartz-scheduler.org/) in the Java world.
Nam
Re: [Web-SIG] PEP 444 feature request - Futures executor
At 09:11 PM 1/10/2011 -0600, Timothy Farrell wrote:

> PJ, you seem to be old-hat at this, so I'm looking for a little advice as I draft this spec. It seems a bad idea to me to just say environ['wsgi.executor'] will be a wrapped futures executor, because the API handled by the executor can and likely will change over time. Am I right in thinking that a spec should be more specific, saying that the executor object will have these specific methods, so that as the futures API changes, the spec is not in danger of invalidation?

I'd actually just suggest something like:

    future = environ['wsgiorg.future'](func, *args, **kw)

(You need to use the wsgiorg.* namespace for extension proposals like this, btw.)
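[A sketch of how a server might implement the suggested extension, assuming concurrent.futures underneath. The 'wsgiorg.future' key and call signature follow the post; the ExecutorServerMixin class and app_work function are hypothetical illustrations:]

```python
# Hedged sketch: the server exposes a single callable under
# environ['wsgiorg.future'] that submits work to its executor and
# returns a concurrent.futures.Future. Because the app only ever sees
# a callable, the server can swap executor implementations later
# without invalidating the spec.
from concurrent.futures import ThreadPoolExecutor

class ExecutorServerMixin:
    """Hypothetical server mixin wiring an executor into the environ."""

    def __init__(self, max_workers=4):
        self._pool = ThreadPoolExecutor(max_workers=max_workers)

    def make_environ(self, base_environ):
        environ = dict(base_environ)
        environ['wsgiorg.future'] = (
            lambda func, *args, **kw: self._pool.submit(func, *args, **kw))
        return environ

# Application-side usage of the extension:
def app_work(environ):
    future = environ['wsgiorg.future'](pow, 2, 10)
    return future.result()
```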
Re: [Web-SIG] PEP 444 feature request - Futures executor
- Original Message -
From: P.J. Eby p...@telecommunity.com
To: Timothy Farrell tfarr...@owassobible.org, web-sig@python.org
Sent: Friday, January 7, 2011 2:14:20 PM
Subject: Re: [Web-SIG] PEP 444 feature request - Futures executor

> There are some other issues that might need to be addressed, like maybe adding an attribute or two for the level of reliability guaranteed by the executor, or allowing the app to request a given reliability level. Specifically, it might be important to distinguish between:
> * this will be run exactly once as long as the server doesn't crash
> * this will eventually be run once, even if the server suffers a fatal error between now and then
> IOW, to indicate whether the thing being done is transactional, so to speak.

I understand why this would be good (credit card transactions particularly), but how would this play out in the real world? All servers will do their best to run the jobs given them. Are you suggesting that there would be a property of the executor that would change based on the load of the server or some other metric? Say the server has 100 queued jobs and only 2 worker threads, would it then have a way of saying, "I'll get to this eventually, but I'm pretty swamped"? Is that what you're getting at, or something more like database transactions: "I guarantee that I won't stop halfway through this process"?

Thanks,
-t
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
Here's what I've mutated Alex Grönholm's minimal middleware example into (see the change history for the evolution of this):

https://gist.github.com/771398

A complete functional (as in function, not working ;) async-capable middleware layer (that does nothing) is 12 lines. That, I think, is a reasonable amount of boilerplate. Also, no decorators needed. It's quite readable, even the way I've compressed it.

The class-based version is basically identical, but with added comments explaining the assumptions this example makes and demonstrating where the actual middleware code can be implemented for simple middleware.

- Alice.
[Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
Warning: this assumes we're running on bizarro-world PEP 444 that mandates applications are generators. Please do not dismiss this idea out of hand, but give it a good look and maybe some feedback. ;)

--

Howdy!

I've finished touching up the p-code illustrating my idea of using generators to implement async functionality within a WSGI application and middleware, including the idea of a wsgi2ref-supplied decorator to simplify middleware.

https://gist.github.com/770743

There may be a few typos in there; I switched from the idea of passing back the returned value of the future to passing back the future itself, in order to better handle exception handling (i.e. not requiring utter insanity in the middleware to determine the true source of an exception and the need to pass it along).

The second middleware demonstration (using a decorator) makes middleware look a lot more like an application: yielding futures, or a response, with the addition of yielding an application callable not explored in the first (long, but trivial) example. I believe this should cover 99% of middleware use cases, including interactive debugging, request routing, etc., and the syntax isn't too bad, if you don't mind standardized decorators.

This should be implementable within the context of Marrow HTTPd (http://bit.ly/fLfamO) without too much difficulty. As a side note, I'll be adding threading support to the server (actually, marrow.server, the underlying server/protocol abstraction m.s.http utilizes) using futures some time over the weekend by wrapping the async callback that calls the application with a call to an executor, making it immune to blocking, but I suspect the overhead will outweigh the benefit for speedy applications.

Testing multi-process vs. multi-threaded using 2 workers each and the prime calculation example, threading is 1.5x slower for CPU-intensive tasks under Python 2.7. That's terrible. It should be 2x; I have 2 cores. :/

- Alice.
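[For a rough feel of the protocol being proposed (this is an illustrative reading of the idea, not the gist's actual code): an application generator yields a Future when it must wait, and finally yields the (status, headers, body) triple, with a toy synchronous driver standing in for the server.]

```python
# Toy driver for a generator-based async WSGI protocol: the app yields
# concurrent.futures.Future objects to wait on, and the server sends
# each result back in; anything that isn't a future is taken as the
# final response triple. Protocol and driver are both illustrative.
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=1)

def application(environ):
    # "Async" work: the server, not the app, waits on this future.
    future = pool.submit(lambda: environ['name'].upper())
    value = yield future          # resumed with the future's result
    yield '200 OK', [], [value]   # terminal response triple

def run_app(app, environ):
    gen = app(environ)
    result = next(gen)
    while hasattr(result, 'result'):   # a future: wait, send result back
        result = gen.send(result.result())
    return result                      # the (status, headers, body) triple

status, headers, body = run_app(application, {'name': 'world'})
pool.shutdown()
```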
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
As a quick note, this proposal would significantly benefit from the simplified syntax offered by PEP 380 (Syntax for Delegating to a Subgenerator) [1] and possibly PEP 3152 (Cofunctions) [2]. The former simplifies delegation and exception passing, and the latter simplifies the async side of this. Unfortunately, AFAIK, both are affected by PEP 3003 (Python Language Moratorium) [3], which kinda sucks.

- Alice.

[1] http://www.python.org/dev/peps/pep-0380/
[2] http://www.python.org/dev/peps/pep-3152/
[3] http://www.python.org/dev/peps/pep-3003/
Re: [Web-SIG] PEP 444 feature request - Futures executor
On 2011-01-07 09:47:12 -0800, Timothy Farrell said:

> However, I'm requesting that servers _optionally_ provide environ['wsgi.executor'] as a futures executor that applications can use for the purpose of doing something after the response is fully sent to the client. This feature request is designed to be concurrency methodology agnostic.

Done. (In terms of implementation, not updating PEP 444.) :3

The Marrow server now implements a thread pool executor using the concurrent.futures module (or the equivalent futures PyPI package). The following are the commits; the changes will look bigger than they are due to cutting and pasting of several previously nested blocks of code into separate functions for use as callbacks. 100% unit test coverage is maintained (without errors), an example application is added, and the benchmark suite updated to support the definition of thread count.

http://bit.ly/gUL33v
http://bit.ly/gyVlgQ

Testing this yourself requires Git checkouts of the marrow.server/threading branch and marrow.server.http/threading branch, and likely the latest marrow.io from Git as well:

https://github.com/pulp/marrow.io
https://github.com/pulp/marrow.server/tree/threaded
https://github.com/pulp/marrow.server.http/tree/threaded

This update has not been tested under Python 3.x yet; I'll do that shortly and push any fixes; I doubt there will be any.

On 2011-01-08 03:26:28 -0800, Alice Bevan–McGregor said in the [PEP 444] Future- and Generator-Based Async Idea thread:

> As a side note, I'll be adding threading support to the server... but I suspect the overhead will outweigh the benefit for speedy applications.

I was surprisingly quite wrong in this prediction. The following is the output of a C25 pair of benchmarks, the first not threaded, the other with 30 threads (enough so there would be no waiting):

https://gist.github.com/770893

The difference is the loss of 60 RSecs out of 3280.
Note that the implementation I've devised can pass the concurrent.futures executor to the WSGI application (and, in fact, does), fulfilling the requirements of this discussion. :D The use of callbacks internally to the HTTP protocol makes a huge difference in overhead, I guess.

- Alice.
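[The application-side usage being enabled here can be sketched as follows, assuming only that the server (or middleware) publishes an executor under the 'wsgi.executor' key discussed in this thread; the application and helper below are hypothetical:]

```python
# Sketch of the requested optional feature: the server publishes a
# futures executor as environ['wsgi.executor'] so an application can
# queue work to run after the response has been handed back, instead
# of blocking the request's worker. Everything but the key name is
# illustrative.
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=2)
log = []

def send_receipt_email(order_id):
    log.append(('emailed', order_id))  # stand-in for slow I/O

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    # Hand the slow follow-up work to the server's pool.
    environ['wsgi.executor'].submit(send_receipt_email, 42)
    return [b'order accepted']

environ = {'wsgi.executor': executor}
body = application(environ, lambda status, headers: None)
executor.shutdown(wait=True)  # a real server's pool outlives requests
```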
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
I made a few errors in that massive post...

At 12:00 PM 1/8/2011 -0500, P.J. Eby wrote:

> My major concern about the approach is still that it requires a fair amount of overhead on the part of both app developers and middleware developers, even if that overhead mostly consists of importing and decorating. (More below.)

The above turned out to be happily wrong by the end of the post, since no decorators or imports are actually required for app and middleware developers.

> You can then implement response-processing middleware like this:
>
>     def latinize_body(body_iter):
>         while True:
>             chunk = yield body_iter
>             if chunk is None:
>                 break
>             else:
>                 yield piglatin(yield body_iter)

The last line above is incorrect; it should've been yield piglatin(chunk), i.e.:

    def latinize_body(body_iter):
        while True:
            chunk = yield body_iter
            if chunk is None:
                break
            else:
                yield piglatin(chunk)

It's still rather unintuitive, though. There are also plenty of topics left to discuss, both of the substantial and bikeshedding varieties. One big open question still in my mind is: are these middleware idioms any easier to get right than the WSGI 1 ones? For things that don't process response bodies, the answer seems to be yes: you just stick in a yield and you're done. For things that DO process response bodies, however, you have to have ugly loops like the one above.

I suppose it could be argued that, as unintuitive as that body-processing loop is, it's still orders of magnitude more intuitive than a piece of WSGI 1 middleware that has to handle both application yields and write()s! I suppose my hesitance is due to the fact that it's not as simple as:

    return (piglatin(chunk) for chunk in body_iter)

Which is really the level of simplicity that I was looking for. (IOW, all response-processing middleware pays in this slightly-added complexity to support the subset of apps and response-processing middleware that need to wait for events during body output.)
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
At 05:39 AM 1/8/2011 -0800, Alice Bevan–McGregor wrote:

> As a quick note, this proposal would significantly benefit from the simplified syntax offered by PEP 380 (Syntax for Delegating to a Subgenerator) [1] and possibly PEP 3152 (Cofunctions) [2]. The former simplifies delegation and exception passing, and the latter simplifies the async side of this. Unfortunately, AFAIK, both are affected by PEP 3003 (Python Language Moratorium) [3], which kinda sucks.

Luckily, neither PEP is necessary, since we do not need to support arbitrary protocols for the subgenerators being called. This makes it possible to simply yield instead of yield from, and the trampoline functions take care of distinguishing a terminal (return) result from an intermediate one.

The Coroutine class I suggested, however, *does* accept explicit returns via raise StopIteration(value), so it is actually fully equivalent to supporting yield from, as long as it's used with an appropriate trampoline function. (In fact, the structure of the Coroutine class I proposed was stolen from an earlier Python-Dev post I did in an attempt to show why PEP 380 was unnecessary for doing coroutines. ;-) )

In effect, the only thing that PEP 380 would add here is the syntax sugar for 'raise StopIteration(value)', but you can do that with:

    def return_(value):
        raise StopIteration(value)

In any case, my suggestion doesn't need this for either apps or response bodies, since the type of data yielded suffices to indicate whether the value is a return or not. You only need a subgenerator to raise StopIteration if you want to return something to your caller that *isn't* a response or body chunk.
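[The "return via StopIteration" idiom can be exercised with a tiny trampoline. A historical caveat: in modern Python, `return value` inside a generator is the supported sugar for raising StopIteration(value), and since PEP 479 (Python 3.7) raising StopIteration directly inside a generator body is an error, so the sketch below uses the `return` form:]

```python
# Minimal sketch of recovering a subgenerator's terminal result without
# PEP 380's `yield from`: intermediate values are yielded normally, and
# the return value rides out on the StopIteration the driver catches.

def subtask():
    yield 'step-1'   # intermediate values go to the trampoline
    yield 'step-2'
    return 'final'   # the py2-era idiom was raise StopIteration('final')

def run(gen):
    """Drive a generator to completion, collecting intermediate yields
    and returning the terminal value carried by StopIteration."""
    steps = []
    while True:
        try:
            steps.append(next(gen))
        except StopIteration as stop:
            return steps, stop.value

steps, result = run(subtask())
```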
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
On Sat, Jan 8, 2011 at 6:26 AM, Alice Bevan–McGregor al...@gothcandy.com wrote:

> Warning: this assumes we're running on bizarro-world PEP 444 that mandates applications are generators. Please do not dismiss this idea out of hand but give it a good look and maybe some feedback. ;)
>
> [...]
>
> Testing multi-process vs. multi-threaded using 2 workers each and the prime calculation example, threading is 1.5x slower for CPU-intensive tasks under Python 2.7. That's terrible. It should be 2x; I have 2 cores. :/
>
> - Alice.

For contrast, I thought it might be beneficial to have a comparison with what an implementation that didn't use async might look like:

http://friendpaste.com/4lFbZsTpPGA9N9niyOt9PF

If your implementation requires that people change source code (yield vs return) when they move code between sync and async servers, doesn't that pretty much violate the main WSGI goal of portability?

IMO, the async middleware is quite a bit more complex than the current state of things with start_response. There's the ability to subtly miss invoking the generator, or to invoke it too many times and drop part of a response. Forcing every middleware to unwrap iterators and handle their own StopIteration exceptions is worrisome as well. I can't decide if casting the complexity of the async middleware as a side benefit of discouraging authors was a joke or not.

Either way, this proposal reminds me quite a bit of Duff's device [1]. On its own, Duff's device is quite amusing and could even be employed in some situations to great effect. On the other hand, any WSGI spec has to be understandable and implementable by people from all skill ranges. If it's a spec that only a handful of people comprehend, then I fear its adoption would be significantly slowed in practice.

[1] http://en.wikipedia.org/wiki/Duff's_device
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
At 01:24 PM 1/8/2011 -0500, Paul Davis wrote:

> For contrast, I thought it might be beneficial to have a comparison with what an implementation that didn't use async might look like: http://friendpaste.com/4lFbZsTpPGA9N9niyOt9PF

Compare your version with this one, which uses my revision of Alice's proposal:

    def my_awesome_application(environ):
        # do stuff
        yield b'200 OK', [], [b'Hello, World!']

    def my_middleware(app):
        def wrapper(environ):
            # maybe edit environ
            try:
                status, headers, body = yield app(environ)
                # maybe edit response:
                # body = latinize(body)
                yield status, headers, body
            except:
                pass  # maybe handle error
            finally:
                pass  # maybe release resources
        return wrapper

    def my_server(app, httpreq):
        environ = wsgi.make_environ(httpreq)

        def process_response(result):
            status, headers, body = result
            write_headers(httpreq, status, headers)
            Coroutine(body, body_trampoline, finish_response)

        def finish_response(result):
            pass  # cleanup, if any

        Coroutine(app(environ), app_trampoline, process_response)

The primary differences are that the server needs to split some of its processing into separate routines, and response-processing done by middleware has to happen in a while loop rather than a for loop.

> If your implementation requires that people change source code (yield vs return) when they move code between sync and async servers, doesn't that pretty much violate the main WSGI goal of portability?

The idea here would be to have WSGI 2 use this protocol exclusively, not to have two different protocols.

> IMO, the async middleware is quite more complex than the current state of things with start_response.

Under the above proposal, it isn't, since in WSGI 1 you can't (only) do a for loop over the response body; you have to write a loop and a push-based handler as well. In this case, it is reduced to just writing one loop. I'm still not entirely convinced of the viability of the approach, but I'm no longer in the "that's just crazy talk" category regarding an async WSGI.
The cost is no longer crazy, but there's still some cost involved, and the use case rationale hasn't improved much. OTOH, I can now conceive of actually *using* such an async API for something, and that's no small feat. Before now, the idea held virtually zero interest for me.

> Either way this proposal reminds me quite a bit of Duff's device [1]. On its own Duff's device is quite amusing and could even be employed in some situations to great effect. On the other hand, any WSGI spec has to be understandable and implementable by people from all skill ranges. If it's a spec that only a handful of people comprehend, then I fear its adoption would be significantly slowed in practice.

Under my modification of Alice's proposal, nearly all of the complexity involved migrates to the server, mostly in the (shareable) Coroutine implementation. For an async server, the "arrange for coroutine(result) to be called" operations are generally native to async APIs, so I'd expect them to find that simple to implement. Synchronous servers just need to invoke the waited-on operation synchronously, then pass the value back into the coroutine (e.g. by returning "pause" from the trampoline, then calling coroutine(value, exc_info) to resume processing after the result is obtained).
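[A minimal, hedged sketch of what the shareable Coroutine plus a synchronous trampoline might look like; the names mirror the posts, but every implementation detail here is a guess, not PJE's actual code:]

```python
# Illustrative Coroutine/trampoline pattern: a driver advances a
# generator, a trampoline function decides what each yielded value
# means ('continue' with a ready value, or 'pause' for an external
# resume), and a callback receives the terminal value.
class Coroutine:
    def __init__(self, gen, trampoline, callback):
        self.gen = gen
        self.trampoline = trampoline
        self.callback = callback
        self._step(None)

    def __call__(self, value):
        # An async server arranges for this to be called when a waited-on
        # result is ready; a sync server calls it directly.
        self._step(value)

    def _step(self, value):
        while True:
            try:
                yielded = self.gen.send(value)
            except StopIteration:
                self.callback(value)   # generator finished
                return
            action, value = self.trampoline(yielded)
            if action == 'pause':      # wait for an external resume
                return

# A synchronous trampoline: every yielded value is immediately
# available, so just pass it straight back in.
def sync_trampoline(yielded):
    return 'continue', yielded

def app():
    x = yield 21
    yield x * 2

done = []
Coroutine(app(), sync_trampoline, done.append)
```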
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
On 9 January 2011 12:16, Alice Bevan–McGregor al...@gothcandy.com wrote: On 2011-01-08 09:00:18 -0800, P.J. Eby said: (The next interesting challenge would be to integrate this with Graham's proposal for adding cleanup handlers...)

    class MyApplication(object):
        def __init__(self):
            pass  # process startup code

        def __call__(self, environ):
            yield None  # must be a generator
            pass  # request code

        def __enter__(self):
            pass  # request startup code

        def __exit__(self, exc_type, exc_val, exc_tb):
            pass  # request shutdown code -- regardless of exceptions

We could mandate context managers! :D (Which means you can still wrap a simple function in @contextmanager.) Context managers don't solve the problem I am trying to address. The 'with' statement doesn't apply context managers to WSGI application objects in a way that is desirable, and use of a decorator to achieve the same means having to replace close(), which is what I am trying to avoid because of the extra complexity that causes for WSGI middleware just to make sure wsgi.file_wrapper works. We want a world where it should never be necessary for WSGI middleware, or proxy decorators, to have to fudge up a generator and override the close() chain to add cleanups. Graham - Alice. 
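[Editor's note: to illustrate the close()-chaining complexity Graham describes, here is a hedged sketch of the kind of wrapper middleware is currently forced to write. The class and names are illustrative only; note that wrapping like this also defeats wsgi.file_wrapper detection, which is part of the complaint.]

```python
# Middleware that transforms a response body must manually propagate
# close() to the wrapped iterable, or the application's cleanup never
# runs.  This is the "fudge up a generator and override the close()
# chain" burden the thread wants to eliminate.

class ClosingWrapper:
    def __init__(self, body, transform):
        self.body = body
        self.transform = transform

    def __iter__(self):
        for chunk in self.body:
            yield self.transform(chunk)

    def close(self):
        # Chain close() to the wrapped body, if it has one.
        close = getattr(self.body, 'close', None)
        if close is not None:
            close()

# Usage sketch: a body that records whether it was closed.
closed = []
class Body(list):
    def close(self):
        closed.append(True)

wrapped = ClosingWrapper(Body([b'hello']), bytes.upper)
print(list(wrapped))   # [b'HELLO']
wrapped.close()
print(closed)          # [True]
```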
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-06 20:49:57 -0800, P.J. Eby said: It would be helpful if you addressed the issue of scope, i.e., what features are you proposing to offer to the application developer. Conformity, predictability, and portability. That's a lot of y's. (Pardon the pun!) Alex Grönholm's post describes the goal quite clearly. So far, I believe you're the second major proponent (i.e. ones with concrete proposals and/or implementations to discuss) of an async protocol... and what you have in common with the other proponent is that you happen to have written an async server that would benefit from having apps operating asynchronously. ;-) Well, the Marrow HTTPd does operate in multi-process mode, and, one day, multi-threaded or a combination. Integration of a futures executor to the WSGI environment would alleviate the major need for a multi-threaded implementation in the server core; intensive tasks can be deferred to a thread pool vs. everything being deferred to a thread pool. (E.g. template generation, PDF/other text extraction for indexing of file uploads, image scaling, etc., all of which are real use cases I have which would benefit from futures.) I find it hard to imagine an app developer wanting to do something asynchronously for which they would not want to use one of the big-dog asynchronous frameworks. (Especially if their app involves database access, or other communications protocols.) Admittedly, a truly async server needs some way to allow file descriptors to be registered with the reactor core, with the WSGI application being resumed upon some event (e.g. socket is readable or writeable for DB access, or even pipe operations for use cases I can't think of at the moment). Futures integration is a Good Idea, IMHO, and being optional and easily added to the environ by middleware for servers that don't implement it natively is even better. 
As for how to provide a generic interface to an async core, I have two ideas, but one is magical and the other is more so; I'll describe these in a discrete post. This doesn't mean I think having a futures API is a bad thing, but ISTM that a futures extension to WSGI 1 could be defined right now using an x-wsgi-org extension in that case... and you could then find out how many people are actually interested in using it. I'll add writing up a WSGI middleware layer that configures and adds a future.executor to the environ to my already overweight to-do list. It actually is something I have a use for right now on at least one commercial project. :) Mainly, though, what I see is people using the futures thing to shuffle off compute-intensive tasks... That's what it's for. ;) ...but if they do that, then they're basically trying to make the server's life easier... but under the existing spec, any truly async server implementing WSGI is going to run the *app* in a future of some sort already... Running the application in a future is actually not a half-bad way for me to add threading to marrow.server... thanks! Which means that the net result is that putting in async is like saying to the app developer: hey, you know this thing that you just could do in WSGI 1 and the server would take care of it for you? Well, now you can manage that complexity by yourself! Isn't that wonderful? ;-) That's a bit extreme; PEP 444 servers may still implement threading, multi-processing, etc. at the reactor level (a la CherryPy or Paste). Giving WSGI applications access to a futures executor (possibly the one powering the main processing threads) simply gives applications the ability to utilize it, not the requirement to do so. I could be wrong of course, but I'd like to see what concrete use cases people have for async. Earlier in this post I illustrated a few that directly apply to a commercial application I am currently writing. 
I'll elaborate:

:: Image scaling would benefit from multi-processing (spreading the load across cores). Also, only one scale is immediately required before returning the post-upload page: the thumbnail. The other scales can be executed without halting the WSGI application's return.

:: Asset content extraction and indexing would benefit from threading, and would also not require pausing the WSGI application.

:: Since most templating engines aren't streaming (see my unanswered thread in the general mailing list re: this), pausing the application pending a particularly difficult render is a boon to single-threaded async servers, though true streaming templating (with flush semantics) would be the holy grail. ;)

:: Long-duration calls to non-async-aware libraries such as DB access. The WSGI application could queue up a number of long DB queries, pass the futures instances to the template, and the template could then .result() (block) across them or yield them to be suspended and resumed when the result is available.

:: True async is useful for WebSockets, which seem a far superior solution to JSON/AJAX polling, in addition to allowing real web-based socket access, of course.
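[Editor's note: the executor-in-environ middleware Alice volunteers to write earlier in this message might look roughly like the following sketch. The environ key 'x-wsgi-org.executor' and the 3-tuple-return calling convention are assumptions taken from this thread's proposals, not anything ratified.]

```python
# Hypothetical middleware exposing a shared thread pool to applications
# via the environ, so compute-intensive tasks (image scaling, indexing,
# etc.) can be shuffled off without the server's involvement.
from concurrent.futures import ThreadPoolExecutor

def executor_middleware(app, max_workers=4):
    executor = ThreadPoolExecutor(max_workers=max_workers)
    def wrapper(environ):
        environ['x-wsgi-org.executor'] = executor
        return app(environ)
    return wrapper

# An application deferring work to the pool (pow() stands in for a real
# task like thumbnail generation) under the proposed 3-tuple protocol:
def application(environ):
    future = environ['x-wsgi-org.executor'].submit(pow, 2, 16)
    body = str(future.result()).encode('ascii')
    return b'200 OK', [(b'Content-Type', b'text/plain')], [body]

status, headers, body = executor_middleware(application)({})
print(body)   # [b'65536']
```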
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-06 10:15:19 -0800, Antoine Pitrou said: Alice Bevan–McGregor al...@... writes: Er, for the record, in Python 3 non-blocking file objects return None when read() would block. -1 I'm aware; however, that's not practically useful. How would you detect from within the WSGI 2 application that the file object has become readable? Implement your own async reactor / select / epoll loop? That's crazy talk! ;) I was just pointing out that if you need to choose a convention for signaling blocking reads on a non-blocking object, it's already there. I don't. I need a way to suspend execution of a WSGI application pending some operation, often waiting for socket or file read or write availability. (Just as often something entirely unrelated to file descriptors; see my previous post from a few moments ago.) By the way, an event loop is the canonical implementation of asynchronous programming, so I'm not sure what you're complaining about. Or perhaps you're using async in a different meaning? (which one?) If you use non-blocking sockets, and the WSGI server provides a way to directly access the client socket (ack!), utilizing the None response on reads would require a tight loop within your application to wait for actual data. That's really, really bad, and in a single-threaded server, deadly. I don't understand why you want a yield at this level. IMHO, WSGI needn't involve generators. A higher-level wrapper (framework, middleware, whatever) can wrap fd-waiting in fancy generator stuff if so desired. Or, in some other environments, delegate it to a reactor with callbacks and deferreds. Or whatever else, such as futures. WSGI already involves generators: the response body. In fact, the templating engine I wrote (and extended to support flush semantics) utilizes a generator to return the response body. Works like a hot damn, too. Yield is the Python language's native way to suspend execution of a callable in a re-entrant way. 
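[Editor's note: the suspend-on-yield idea argued for here can be shown concretely. This is a hedged sketch under the thread's assumptions: the application yields a future, and the server re-enters the generator when the future completes. The serve()/resume() names are illustrative.]

```python
# "Yield a future, resume on completion": the application generator
# yields a Future; the server attaches a done-callback (via partial)
# that re-enters the generator with the result.
from concurrent.futures import ThreadPoolExecutor
from functools import partial

def resume(gen, future):
    try:
        gen.send(future.result())   # re-enter the paused application
    except StopIteration:
        pass                        # application finished

def serve(gen):
    future = next(gen)              # run until the app yields a future
    future.add_done_callback(partial(resume, gen))

executor = ThreadPoolExecutor(max_workers=1)
results = []

def application():
    value = yield executor.submit(sum, [1, 2, 3])   # suspend here
    results.append(value)

serve(application())
executor.shutdown(wait=True)        # for the demo: wait for completion
print(results)   # [6]
```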
A trivial example of this is an async ping-pong reactor. I wrote one (you aren't a real Python programmer unless...) as an experiment and utilize it for server monitoring, with tasks being generally scheduled against time vs. edge-triggered or level-triggered fd operation availability. Everyone has their own idea of what a deferred is, and there is only one definition of a future, which (in a broad sense) is the same as the general idea of a deferred. Deferreds just happen to be implementation-specific and often require rewriting large portions of external libraries to make them compatible with that specific deferred implementation. That's not a good thing. Hell; an extension to the futures spec to handle file descriptor events might not be a half-bad idea. :/ By the way, the concurrent.futures module is new. Though it will be there in 3.2, it's not guaranteed that its API and semantics will be 100% stable while people start to really flesh it out. Ratification of PEP 444 is a long way off itself. Also, Alex Grönholm maintains a PyPI backport of the futures module compatible with 2.x+ (not sure of the specific minimum version) and 3.2. I'm fairly certain deprecation warnings wouldn't kill the usefulness of that implementation. Worrying about instability, at this point, may be premature. +1 for pure futures, which (in theory) eliminate the need for dedicated async versions of absolutely everything at the possible cost of slightly higher overhead. I don't understand why futures would solve the need for a low-level async facility. You misinterpreted; I didn't mean to imply that futures would replace an async core reactor, just that long-running external library calls could be trivially deferred using futures. You still need to define a way for the server and the app to wake each other (and for the server to wake multiple apps). 
Futures are a pretty convenient way to have a server wake an app; using a future completion callback wrapped (using partial) with the paused application generator would do it. (The reactor Marrow uses, a modified Tornado IOLoop, would require calling reactor.add_callback(partial(worker, app_gen)) followed by reactor._wake() in the future callback.) Waking up the server would be accomplished by yielding a futures instance (or fd, magical value, etc.). This isn't done naturally in Python (except perhaps with stackless or greenlets). Using fds gives you well-known flexible possibilities. Yield is the natural way for one side of that; re-entering the generator on future completion covers the other side. Stackless and greenlets are alternate ideas, but yield is built-in (and soon, so will futures be). If you want to put the futures API in WSGI, think of the poor authors of a WSGI server written in C who will have to write their own executor and future implementation. I'm sure they have better things to do. If they embed a Python interpreter via C,
Re: [Web-SIG] PEP 444 Goals
On Thu, 6 Jan 2011, Alice Bevan–McGregor wrote:

:: Clear separation of narrative from rules to be followed. This allows developers of both servers and applications to easily run through a conformance check list. +1

:: Isolation of examples and rationale to improve readability of the core rulesets. +1

:: Clarification of often mis-interpreted rules from PEP 333 (and those carried over in ). +1

:: Elimination of unintentional non-conformance, esp. re: cgi.FieldStorage. +1

:: Massive simplification of call flow. Replacing start_response with a returned 3-tuple immensely simplifies the task of middleware that needs to capture HTTP status or manipulate (or even examine) response headers. [1] +1 I was initially resistant to this one in a "we fear change" kind of way, but I've since recognized that a) I was thinking about it mostly in terms of existing code I have that would need to be changed, and b) it _is_ more pythonic.

:: Reduction of re-implementation / NIH syndrome by incorporating the most common (1%) of features most often relegated to middleware or functional helpers. Unicode decoding of a small handful of values (CGI values that pull from the request URI) is the biggest example. [2, 3] 0 (as in unsure, need to be convinced, etc.) The zero here is in large part because this particular goal could cover a large number of things, from standardized query string processing (maybe a good idea) to filters (which I've already expressed reservations about). So this goal seems like it ought to be several separate goals.

:: Cross-compatibility considerations. The definition and use of native strings vs. byte strings is the biggest example of this in the rewrite. +1

:: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers. Thus pipelining, chunked encoding, et al. as per the HTTP 1.1 RFC. 0 The other option (than non-optional) for optional things is to remove them. 
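[Editor's note: the "massive simplification of call flow" point above is easy to demonstrate. This is a hedged sketch under the thread's proposed, unratified 3-tuple convention: middleware that examines the status and adds a header becomes a few lines, with no start_response capture dance.]

```python
# Middleware under the proposed 3-tuple protocol: unpack, adjust, return.

def add_header_middleware(app, header):
    def wrapper(environ):
        status, headers, body = app(environ)   # status/headers in hand
        return status, headers + [header], body
    return wrapper

def application(environ):
    return b'200 OK', [(b'Content-Type', b'text/plain')], [b'hi']

wrapped = add_header_middleware(application, (b'X-Debug', b'1'))
status, headers, body = wrapped({})
print(headers)   # [(b'Content-Type', b'text/plain'), (b'X-Debug', b'1')]
```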
I think working from a list of goals is an excellent way to make some headway. -- Chris Dent http://burningchrome.com/ [...]
Re: [Web-SIG] PEP 444 Goals
On 2011-01-06 20:18:12 -0800, P.J. Eby said: :: Reduction of re-implementation / NIH syndrome by incorporating the most common (1%) of features most often relegated to middleware or functional helpers. Note that nearly every application-friendly feature you add will increase the burden on both server developers and middleware developers, which ironically means that application developers actually end up with fewer options. Some things shouldn't have multiple options in the first place. ;) I definitely consider implementation overhead on server, middleware, and application authors to be important. As an example, if yield syntax is allowable for application objects (as it is for response bodies), middleware will need to iterate over the application, yielding up-stream anything that isn't a 3-tuple. When it encounters a 3-tuple, the middleware can do its thing. If the app yield semantics are required (which may be a good idea for consistency and simplicity's sake if we head down this path) then async-aware middleware can be implemented as a generator regardless of the downstream (wrapped) application's implementation. That's not too much overhead, IMHO. Unicode decoding of a small handful of values (CGI values that pull from the request URI) is the biggest example. [2, 3] Does that mean you plan to make the other values bytes, then? Or will they be unicode-y-bytes as well? Specific CGI values are bytes (one, I believe), specific ones are true unicode (URI-related values) and decoded using a configurable encoding with a fallback to bytes-in-unicode (iso-8859-1/latin1), are kept internally consistent (if any one fails, treat as if they all failed), have the encoding used recorded in the environ, and all others are native strings (bytes-in-unicode where native strings are unicode). What happens for additional server-provided variables? That is the domain of the server to document, though native strings would be nice. (The PEP only covers CGI variables.) 
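[Editor's note: the decoding policy Alice describes above can be sketched concretely. This is a hedged reconstruction of the described behavior, not code from the PEP; the helper name and the 'wsgi.uri_encoding' environ key are illustrative.]

```python
# Decode URI-derived CGI values with a configurable encoding; if *any*
# value fails, fall back to latin-1 ("bytes in unicode") for all of
# them (internal consistency), and record the encoding actually used.

def decode_uri_values(raw, encoding='utf-8'):
    """raw: dict mapping URI-related CGI variable names to bytes."""
    try:
        decoded = {k: v.decode(encoding) for k, v in raw.items()}
        used = encoding
    except UnicodeDecodeError:
        # One failure means all values fall back together.
        decoded = {k: v.decode('latin-1') for k, v in raw.items()}
        used = 'latin-1'
    decoded['wsgi.uri_encoding'] = used
    return decoded

env = decode_uri_values({'PATH_INFO': b'/caf\xc3\xa9'})
print(env)   # {'PATH_INFO': '/café', 'wsgi.uri_encoding': 'utf-8'}
```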
The PEP choice was for uniformity. At one point, I advocated simply using surrogateescape coding, but this couldn't be made uniform across Python versions and maintain compatibility. As an open question to anyone: is surrogateescape available in Python 2.6? Mandating that as a minimum version for PEP 444 has yielded benefits in terms of back-ported features and syntax, like b''. :: Cross-compatibility considerations. The definition and use of native strings vs. byte strings is the biggest example of this in the rewrite. I'm not sure what you mean here. Do you mean portability of WSGI 2 code samples across Python versions (esp. 2.x vs. 3.x)? It should be possible (and currently is, as demonstrated by marrow.server.http) to create a polyglot server, polyglot middleware/filters (demonstrated by marrow.wsgi.egress.compression), and polyglot applications, though obviously polyglot code demands the lowest common denominator in terms of feature use. Application / framework authors would likely create Python 3 specific WSGI applications to make use of the full Python 3 feature set, with cross-compatibility relegated to server and middleware authors. - Alice. 
Re: [Web-SIG] PEP 444 Goals
On 2011-01-07 01:08:42 -0800, chris.dent said: ... this particular goal [reduction of reimplementation / NIH] could cover a large number of things from standardized query string processing (maybe a good idea) to filters (which I've already expressed reservations about). So this goal seems like it ought to be several separate goals. +1 This definitely needs to be broken out to be explicit over the things that can be abstracted away from middleware and applications. Input from framework authors would be valuable here to see what they disliked re-implementing the most. ;) Query string processing is a difficult task at the best of times, and is one area that is reimplemented absolutely everywhere. (At some point I should add up the amount of code + unit testing code that covers this topic alone from the top 10 frameworks.) The other option (than non-optional) for optional things is to remove them. True; though optional things already exist as if they were not there. Implementors rarely, it seems, expend the effort to implement optional components; thus every HTTP server I came across had comments in the code saying it was "up to the application" to implement chunked responses, indicating -some- thought, despite chunked /request/ support being mandated by HTTP/1.1. (And other ignored requirements.) - Alice. 
Re: [Web-SIG] PEP 444 / WSGI 2 Async
At 12:39 AM 1/7/2011 -0800, Alice Bevan–McGregor wrote: Earlier in this post I illustrated a few that directly apply to a commercial application I am currently writing. I'll elaborate: :: Image scaling would benefit from multi-processing (spreading the load across cores). Also, only one scale is immediately required before returning the post-upload page: the thumbnail. The other scales can be executed without halting the WSGI application's return. :: Asset content extraction and indexing would benefit from threading, and would also not require pausing the WSGI application. :: Since most templating engines aren't streaming (see my unanswered thread in the general mailing list re: this), pausing the application pending a particularly difficult render is a boon to single-threaded async servers, though true streaming templating (with flush semantics) would be the holy grail. ;) In all these cases, ISTM the benefit is the same if you future the WSGI apps themselves (which is essentially what most current async WSGI servers do, AFAIK). :: Long-duration calls to non-async-aware libraries such as DB access. The WSGI application could queue up a number of long DB queries, pass the futures instances to the template, and the template could then .result() (block) across them or yield them to be suspended and resumed when the result is available. :: True async is useful for WebSockets, which seem a far superior solution to JSON/AJAX polling in addition to allowing real web-based socket access, of course. The point as it relates to WSGI, though, is that there are plenty of mature async APIs that offer these benefits, and some of them (e.g. Eventlet and Gevent) do so while allowing blocking-style code to be written. That is, you just make what looks like a blocking call, but the underlying framework silently suspends your code, without tying up the thread. Or, if you can't use a greenlet-based framework, you can use a yield-based framework. 
Or, if for some reason you really wanted to write continuation-passing style code, you could just use the raw Twisted API. But in all of these cases you would be better off than if you used a half-implementation of the same thing using futures under WSGI, because all of those frameworks already have mature and sophisticated APIs for doing async communications and DB access. If you try to do it with WSGI under the guise of portability, all this means is that you are stuck rolling your own replacements for those existing APIs. Even if you've already written a bunch of code using raw sockets and want to make it asynchronous, Eventlet and Gevent actually let you load a compatibility module that makes it all work, by replacing the socket API with an exact duplicate that secretly suspends your code whenever a socket operation would block. IOW, if you are writing a truly async application, you'd almost have to be crazy to want to try to do it *portably*, vs. picking a full-featured async API and server suite to code against. And if you're migrating an existing, previously-synchronous WSGI app to being asynchronous, the obvious thing to do would just be to grab a copy of Eventlet or Gevent and import the appropriate compatibility modules, not rewrite the whole thing to use futures. 
Re: [Web-SIG] PEP 444 Goals
At 01:17 AM 1/7/2011 -0800, Alice Bevan–McGregor wrote: On 2011-01-06 20:18:12 -0800, P.J. Eby said: :: Reduction of re-implementation / NIH syndrome by incorporating the most common (1%) of features most often relegated to middleware or functional helpers. Note that nearly every application-friendly feature you add will increase the burden on both server developers and middleware developers, which ironically means that application developers actually end up with fewer options. Some things shouldn't have multiple options in the first place. ;) I meant that if a server doesn't implement the spec because of a required feature, then the app developer doesn't have the option of using that feature anyway -- meaning that adding the feature to the spec didn't really help. I definitely consider implementation overhead on server, middleware, and application authors to be important. As an example, if yield syntax is allowable for application objects (as it is for response bodies) middleware will need to iterate over the application, yielding up-stream anything that isn't a 3-tuple. When it encounters a 3-tuple, the middleware can do its thing. If the app yield semantics are required (which may be a good idea for consistency and simplicity sake if we head down this path) then async-aware middleware can be implemented as a generator regardless of the downstream (wrapped) application's implementation. That's not too much overhead, IMHO. The reason I proposed the 3-tuple return in the first place (see http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html ) was that I wanted to make middleware *easy* to write. Easy enough to write quick, say, 10-line utility functions that are correct middleware -- so that you could actually build your application out of WSGI functions calling other WSGI-based functions. The yielding thing wouldn't work for that at all. Unicode decoding of a small handful of values (CGI values that pull from the request URI) is the biggest example. 
[2, 3] Does that mean you plan to make the other values bytes, then? Or will they be unicode-y-bytes as well? Specific CGI values are bytes (one, I believe), specific ones are true unicode (URI-related values) and decoded using a configurable encoding with a fallback to bytes-in-unicode (iso-8859-1/latin1), are kept internally consistent (if any one fails, treat as if they all failed), have the encoding used recorded in the environ, and all others are native strings (bytes-in-unicode where native strings are unicode). So, in order to know what type each CGI variable is, you'll need a reference? What happens for additional server-provided variables? That is the domain of the server to document, though native strings would be nice. (The PEP only covers CGI variables.) I mean the ones required by the spec, not server-specific extensions. The PEP choice was for uniformity. At one point, I advocated simply using surrogateescape coding, but this couldn't be made uniform across Python versions and maintain compatibility. As an open question to anyone: is surrogateescape available in Python 2.6? Mandating that as a minimum version for PEP 444 has yielded benefits in terms of back-ported features and syntax, like b''. No, otherwise I'd totally go for the surrogateescape approach. Heck, I'd still go for it if it were possible to write a surrogateescape handler for 2.6, and require that a PEP 444 server register one with Python's codec system. I don't know if it's *possible*, though, hopefully someone with more knowledge can weigh in on that. :: Cross-compatibility considerations. The definition and use of native strings vs. byte strings is the biggest example of this in the rewrite. I'm not sure what you mean here. Do you mean portability of WSGI 2 code samples across Python versions (esp. 2.x vs. 3.x)? 
It should be possible (and currently is, as demonstrated by marrow.server.http) to create a polyglot server, polyglot middleware/filters (demonstrated by marrow.wsgi.egress.compression), and polyglot applications, though obviously polyglot code demands the lowest common denominator in terms of feature use. Application / framework authors would likely create Python 3 specific WSGI applications to make use of the full Python 3 feature set, with cross-compatibility relegated to server and middleware authors. I'm just asking whether, in your statement of goals and rationale, you would expand "cross compatibility" as meaning cross-Python-version portability, or whether you meant something else. 
Re: [Web-SIG] PEP 444 Goals
No, otherwise I'd totally go for the surrogateescape approach. Heck, I'd still go for it if it were possible to write a surrogateescape handler for 2.6, and require that a PEP 444 server register one with Python's codec system. I don't know if it's *possible*, though, hopefully someone with more knowledge can weigh in on that. This error handler is written in C; I don’t know whether it would be possible to reimplement it in Python. See PEP 383 for a description, Python/codecs.c for the source. Regards 
Re: [Web-SIG] PEP 444 / WSGI 2 Async
Alice Bevan–McGregor al...@... writes: I don't understand why you want a yield at this level. IMHO, WSGI needn't involve generators. A higher-level wrapper (framework, middleware, whatever) can wrap fd-waiting in fancy generator stuff if so desired. Or, in some other environments, delegate it to a reactor with callbacks and deferreds. Or whatever else, such as futures. WSGI already involves generators: the response body. Wrong. The response body is an arbitrary iterable, which means it can be a sequence, a generator, or something else. WSGI doesn't mandate any specific feature of generators, such as coroutine-like semantics, and the server doesn't have to know about them. Everyone has their own idea of what a deferred is, and there is only one definition of a future, which (in a broad sense) is the same as the general idea of a deferred. A Twisted deferred is as well defined as a Python stdlib future; actually, deferreds have been in use by the Python community for much, much longer than futures. But that's beside the point, since I'm proposing that your spec doesn't rely on a high-level abstraction at all. Ratification of PEP 444 is a long way off itself. Right, that's why I was suggesting you drop your concern for Python 2 compatibility. Antoine. 
Re: [Web-SIG] PEP 444 / WSGI 2 Async
When I originally requested a futures executor option (the email that started this thread), this is more like what I had in mind. I'm not against async... rather indifferent. But I wanted the ability for the server to run something after the response had been fully served to the client, and thus not block the response. The example I gave was sending an email, but there are plenty of other use cases. Futures seemed like the right way to do this. I'm also not sure futures are the right way to build an async specification, and for that matter, there will be a lot to work out with regard to PEP 444. Rather than responding to this, I'll start a new thread since this takes the environ['wsgi.executor'] discussion in a different direction. Please send your comments there. -t - Original Message - From: Guido van Rossum gu...@python.org To: P.J. Eby p...@telecommunity.com Cc: al...@gothcandy.com, web-sig@python.org Sent: Thursday, January 6, 2011 11:30:11 PM Subject: Re: [Web-SIG] PEP 444 / WSGI 2 Async On Thu, Jan 6, 2011 at 8:49 PM, P.J. Eby p...@telecommunity.com wrote: At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote: Tossing the idea around all day long will then, of course, be happening regardless. Unfortunately for that particular discussion, PEP 3148 / Futures seems to have won out in the broader scope. Do any established async frameworks or servers (e.g. Twisted, Eventlet, Gevent, Tornado, etc.) make use of futures? PEP 3148 Futures are meant for a rather different purpose than those async frameworks. Those frameworks are all trying to minimize the number of threads using some kind of callback-based non-blocking I/O system. PEP 3148 OTOH doesn't care about that -- it uses threads or processes proudly. This is useful for a different type of application, where there are fewer, larger tasks, and the overhead of threads doesn't matter. 
The Monocle framework, which builds on top of Tornado or Twisted, uses something not entirely unlike Futures, though they call it Callback. I don't think the acceptance of PEP 3148 should be taken as forcing the direction that async frameworks should take. Having a ratified and incorporated language PEP (core in 3.2 w/ compatibility package for 2.5 or 2.6+ support) reduces the scope of async discussion down to: how do we integrate futures into WSGI 2 instead of how do we define an async API at all. It would be helpful if you addressed the issue of scope, i.e., what features are you proposing to offer to the application developer. While the idea of using futures presents some intriguing possibilities, it seems to me at first glance that all it will do is move the point where the work gets done. That is, instead of simply running the app in a worker, the app will be farming out work to futures. But if this is so, then why doesn't the server just farm the apps themselves out to workers? I guess what I'm saying is, I haven't heard use cases for this from the application developer POV -- why should an app developer care about having their app run asynchronously? So far, I believe you're the second major proponent (i.e. ones with concrete proposals and/or implementations to discuss) of an async protocol... and what you have in common with the other proponent is that you happen to have written an async server that would benefit from having apps operating asynchronously. ;-) I find it hard to imagine an app developer wanting to do something asynchronously for which they would not want to use one of the big-dog asynchronous frameworks. (Especially if their app involves database access, or other communications protocols.) This doesn't mean I think having a futures API is a bad thing, but ISTM that a futures extension to WSGI 1 could be defined right now using an x-wsgi-org extension in that case... 
and you could then find out how many people are actually interested in using it. Mainly, though, what I see is people using the futures thing to shuffle off compute-intensive tasks... but if they do that, then they're basically trying to make the server's life easier... but under the existing spec, any truly async server implementing WSGI is going to run the *app* in a future of some sort already... Which means that the net result is that putting in async is like saying to the app developer: hey, you know this thing that you just could do in WSGI 1 and the server would take care of it for you? Well, now you can manage that complexity by yourself! Isn't that wonderful? ;-) I could be wrong of course, but I'd like to see what concrete use cases people have for async. We dropped the first discussion of async six years ago because someone (I think it might've been James) pointed out that, well, it isn't actually that useful. And every subsequent call for use cases since has been answered with, well, the use case is that you want it to be async. Only, that's a *server* developer's use case
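Guido's description of PEP 3148 above — plain thread/process pools with a completion-callback hook — can be sketched in a few lines. The `work` function is illustrative only; nothing here is specified by WSGI:

```python
# Minimal PEP 3148 (concurrent.futures) usage: submit work to a thread
# pool and attach a completion callback. The callback runs in an
# unspecified thread, which is where the synchronization overhead
# discussed in this thread comes from.
from concurrent.futures import ThreadPoolExecutor

def work(x):
    return x * 2  # stand-in for some expensive task

with ThreadPoolExecutor(max_workers=2) as pool:
    future = pool.submit(work, 21)
    future.add_done_callback(lambda f: print("done:", f.result()))
    result = future.result()  # blocks until the worker finishes
```

Note that `add_done_callback` makes no promise about which thread invokes the callback, which is exactly the control Glyph says would be needed to avoid the locking overhead.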
[Web-SIG] PEP 444 feature request - Futures executor
There has been much discussion about how to handle async in PEP 444 and that discussion centers around the use of futures. However, I'm requesting that servers _optionally_ provide environ['wsgi.executor'] as a futures executor that applications can use for the purpose of doing something after the response is fully sent to the client. This feature request is designed to be concurrency methodology agnostic. Some example use cases are: - send an email that might block on a slow email server (Alice, I read what you said about Turbomail, but one product is not the solution to all situations) - initiate a database vacuum - clean a cache - build a cache - compile statistics When serving pages of an application, these are all things that could be done after the response has been sent. Ideally these things don't need to be done in a request thread and aren't incredibly time-sensitive. It seems to me that futures would be an ideal way of handling this. Thoughts? ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
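From the application side, the proposal above might look like the following sketch. The 'wsgi.executor' key is the proposed (never-ratified) extension, `send_email` is a hypothetical blocking task, and a plain ThreadPoolExecutor stands in for whatever the server would actually provide:

```python
# Sketch of the proposed, optional 'wsgi.executor' environ key.
from concurrent.futures import ThreadPoolExecutor

def send_email(to, body):  # hypothetical slow, blocking task
    return "sent to %s" % to

def app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    # Fire-and-forget: the work continues after the response is returned,
    # so the client is never kept waiting on the email server.
    environ['wsgi.executor'].submit(send_email, 'user@example.com', 'hi')
    return [b'Hello, world!\n']

# Simulate what a supporting server might do for one request:
pool = ThreadPoolExecutor(max_workers=2)
body = app({'wsgi.executor': pool}, lambda status, headers: None)
pool.shutdown(wait=True)  # the server, not the app, owns the pool
```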
Re: [Web-SIG] PEP 444 feature request - Futures executor
If it's optional, what's the benefit for the app of getting it through WSGI instead of through importing some other standard module? The API of the executor will require a lot of thought. I worry that this weighs down the WSGI standard with the responsibility of coming up with the perfect executor API, and if it's not quite perfect after all, servers are additionally required to support the standard but suboptimal API effectively forever. Or they can choose not to provide it, in which case it was a waste of time putting it in WSGI. -- --Guido van Rossum (python.org/~guido)
Re: [Web-SIG] PEP 444 feature request - Futures executor
If it's optional, what's the benefit for the app of getting it through WSGI instead of through importing some other standard module? Performance primarily. If you instantiate an executor at every page request, wouldn't that slow things down unnecessarily? Aside from that, servers currently specify if they are multi-threaded and/or multi-process. Having the server provide the executor allows it to provide an executor that most matches its own concurrency model...again for performance reasons. Optional and not mandatory because not every application wants or needs such functionality. Maybe this should be a server option instead of a spec option. But since we already have the module available, it shouldn't be too much of a burden on server/gateway authors to add support for it. I worry that this weighs down the WSGI standard with the responsibility of coming up with the perfect executor API, and if it's not quite perfect after all, servers are additionally required to support the standard but suboptimal API effectively forever. I'm not following you here. What's wrong with executor.submit() that might need changing? Granted, it would not be ideal if an application called executor.shutdown(). This doesn't seem difficult to my tiny brain. 
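Timothy's performance argument can be made concrete: the server creates one executor at startup, sized to its own concurrency model, and shares it through the environ on every request rather than paying per-request construction cost. The `Server` class and pool size below are illustrative assumptions, not any real server's API:

```python
# Sketch: a server owning a single shared executor across requests.
from concurrent.futures import ThreadPoolExecutor

class Server:
    def __init__(self, app):
        self.app = app
        self.executor = ThreadPoolExecutor(max_workers=4)  # created once

    def handle(self, environ, start_response):
        environ['wsgi.executor'] = self.executor  # same pool, every request
        return self.app(environ, start_response)

server = Server(lambda env, sr: [b'ok'])
e1, e2 = {}, {}
server.handle(e1, lambda s, h: None)
server.handle(e2, lambda s, h: None)
# Both requests see the identical executor object: no per-request cost.
```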
Re: [Web-SIG] PEP 444 feature request - Futures executor
On Fri, Jan 7, 2011 at 9:47 AM, Timothy Farrell wrote: There has been much discussion about how to handle async in PEP 444 and that discussion centers around the use of futures. However, I'm requesting that servers _optionally_ provide environ['wsgi.executor'] as a futures executor that applications can use for the purpose of doing something after the response is fully sent to the client. This feature request is designed to be concurrency methodology agnostic. +1 On 2011-01-07 11:07:36 -0800, Timothy Farrell said: On 2011-01-07 09:59:10 -0800, Guido van Rossum said: If it's optional, what's the benefit for the app of getting it through WSGI instead of through importing some other standard module? Aside from that, servers currently specify if they are multi-threaded and/or multi-process. Having the server provide the executor allows it to provide an executor that most matches its own concurrency model... I think that's the bigger point; WSGI servers do implement their own concurrency model for request processing, and utilizing a server-provided executor which interfaces with whatever the internal representation of concurrency is would be highly beneficial. (Vs. an application utilizing a more generic executor implementation that adds a second thread pool...) Taking futures to be separate and distinct from the rest of the async discussion, I still think it's an extremely useful feature. I outlined my own personal use cases in my slew of e-mails last night, and many of them are also not time sensitive. (E.g. image scaling, full text indexing, etc.) Maybe this should be a server option instead of a spec option. It would definitely fall under the Server API spec, not the application one. Being optional, and with simple (wsgi.executor) access via the environ would also allow middleware developers to create executor implementations (or just reference the concurrent.futures implementation). 
I worry that this weighs down the WSGI standard with the responsibility of coming up with the perfect executor API, and if it's not quite perfect after all, servers are additionally required to support the standard but suboptimal API effectively forever. I'm not following you here. What's wrong with executor.submit() that might need changing? Granted, it would not be ideal if an application called executor.shutdown(). This doesn't seem difficult to my tiny brain. The perfect executor API is already well defined in PEP 3148, AFAIK. Specific methods with specific semantics implemented in a duck-typed way. The underlying implementation is up to the server, or the server can utilize an external (or built-in in 3.2) futures implementation. If WSGI 2 were to incorporate futures as a feature there would have to be some mandate as to which methods applications and middleware are allowed to call; similar to how we do not allow .close() across wsgi.input or wsgi.errors. - Alice.
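The restriction Alice describes — exposing only the methods applications are allowed to call, analogous to disallowing .close() on wsgi.input — could be sketched as a thin wrapper. The class name and the choice of NotImplementedError are illustrative assumptions, not anything the proposal specified:

```python
# Sketch: a wrapper exposing only submit(), so an application cannot
# shut down the server's shared pool.
from concurrent.futures import ThreadPoolExecutor

class RestrictedExecutor:
    def __init__(self, executor):
        self._executor = executor

    def submit(self, fn, *args, **kwargs):
        # Delegate the one permitted method to the real executor.
        return self._executor.submit(fn, *args, **kwargs)

    def shutdown(self, *args, **kwargs):
        raise NotImplementedError(
            "applications may not shut down the server's executor")

pool = RestrictedExecutor(ThreadPoolExecutor(max_workers=2))
value = pool.submit(pow, 2, 10).result()  # submit() works normally
```

P.J. Eby's suggestion further down the thread (a "simple sample executor wrapper that servers could use to block all but the methods allowed") is essentially this shape.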
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-07 09:04:07 -0800, Antoine Pitrou said: Alice Bevan–McGregor al...@... writes: I don't understand why you want a yield at this level. IMHO, WSGI needn't involve generators. A higher-level wrapper (framework, middleware, whatever) can wrap fd-waiting in fancy generator stuff if so desired. Or, in some other environments, delegate it to a reactor with callbacks and deferreds. Or whatever else, such as futures. WSGI already involves generators: the response body. Wrong. I'm aware that it can be any form of iterable, from a list-wrapped string all the way up to generators or other nifty things. I mistakenly omitted these assuming that the other iterables were universally understood and implied. However, using a generator is a known, valid use case that I do see in the wild. (And also rely upon in some of my own applications.) Right, that's why I was suggesting you drop your concern for Python 2 compatibility. -1 There is practically no reason for doing so; esp. considering that I've managed to write a 2k/3k polyglot server that is more performant out of the box than any other WSGI HTTP server I've come across and is far simpler in implementation than most of the ones I've come across with roughly equivalent feature sets. Cross compatibility really isn't that hard, and arguing that 2.x support should be dropped for the sole reason that it might be dead by the time this is ratified is a bit off. Python 2.x will be around for a long time. - Alice.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-07 08:10:43 -0800, P.J. Eby said: At 12:39 AM 1/7/2011 -0800, Alice Bevan-McGregor wrote: :: Image scaling would benefit from multi-processing (spreading the load across cores). Also, only one scale is immediately required before returning the post-upload page: the thumbnail. The other scales can be executed without halting the WSGI application's return. :: Asset content extraction and indexing would benefit from threading, and would also not require pausing the WSGI application. In all these cases, ISTM the benefit is the same if you future the WSGI apps themselves (which is essentially what most current async WSGI servers do, AFAIK). Image scaling and asset content extraction should not block the response to a HTTP request; these need to be 'forked' from the main request. Only template generation (where the app needs to effectively block pending completion) is solved easily by threading the whole application call. :: Long-duration calls to non-async-aware libraries such as DB access. The WSGI application could queue up a number of long DB queries, pass the futures instances to the template, and the template could then .result() (block) across them or yield them to be suspended and resumed when the result is available. :: True async is useful for WebSockets, which seem a far superior solution to JSON/AJAX polling in addition to allowing real web-based socket access, of course. The point as it relates to WSGI, though, is that there are plenty of mature async APIs that offer these benefits, and some of them (e.g. Eventlet and Gevent) do so while allowing blocking-style code to be written. That is, you just make what looks like a blocking call, but the underlying framework silently suspends your code, without tying up the thread. Or, if you can't use a greenlet-based framework, you can use a yield-based framework. Or, if for some reason you really wanted to write continuation-passing style code, you could just use the raw Twisted API. 
But is there really any problem with providing a unified method for indicating a suspend point? What the server does when it gets the yielded value is entirely up to the implementation of the server; if it (the server) wants to use greenlets, it can. If it has other methodologies, it can go nuts. Even if you've already written a bunch of code using raw sockets and want to make it asynchronous, Eventlet and Gevent actually let you load a compatibility module that makes it all work, by replacing the socket API with an exact duplicate that secretly suspends your code whenever a socket operation would block. I generally frown upon magic, and each of these implementations is completely specific. :/ - Alice.
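The "unified suspend point" idea being argued over can be illustrated with a toy driver: the application body is a generator that yields a future whenever it would block, and the server resolves it however it likes (greenlets, select loop, threads) before resuming the generator with the result. This is a sketch of the concept under discussion, not anything PEP 444 actually specified:

```python
# Toy suspend-point protocol: yield a future -> server waits and sends
# the result back in; yield bytes -> ordinary response body chunk.
from concurrent.futures import ThreadPoolExecutor

pool = ThreadPoolExecutor(max_workers=2)

def app_body():
    future = pool.submit(lambda: b"slow chunk")  # would-block operation
    data = yield future                          # suspend until ready
    yield data                                   # ordinary body chunk

def drive(gen):
    """Minimal server loop: resolve yielded futures, collect body bytes."""
    chunks, send = [], None
    try:
        while True:
            item = gen.send(send)
            if hasattr(item, 'result'):  # a future: wait, resume with value
                send = item.result()
            else:                        # plain bytes: part of the body
                chunks.append(item)
                send = None
    except StopIteration:
        return chunks

body = drive(app_body())
```

A greenlet-based server would implement `drive` very differently (no visible yields at all), which is precisely why the yielded-value contract, not the waiting strategy, is what a spec would need to pin down.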
Re: [Web-SIG] PEP 444 / WSGI 2 Async
There is practically no reason for doing so; esp. considering that I've managed to write a 2k/3k polyglot server that is more performant out of the box than any other WSGI HTTP server I've come across and is far simpler in implementation than most of the ones I've come across with roughly equivalent feature sets. Is the code for this server online? I'd be interested in reading through it.
Re: [Web-SIG] PEP 444 feature request - Futures executor
- Original Message - From: P.J. Eby p...@telecommunity.com To: Timothy Farrell tfarr...@owassobible.org, web-sig@python.org Sent: Friday, January 7, 2011 2:14:20 PM Subject: Re: [Web-SIG] PEP 444 feature request - Futures executor This seems like a potentially good way to do it; I suggest making it a wsgi.org extension; see (and update) http://www.wsgi.org/wsgi/Specifications with your proposal. I would suggest including a simple sample executor wrapper that servers could use to block all but the methods allowed by your proposal. (i.e., presumably not shutdown(), for example.) OK, will do.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
Alice Bevan–McGregor al...@... writes: On 2011-01-07 09:04:07 -0800, Antoine Pitrou said: Alice Bevan–McGregor al...@... writes: I don't understand why you want a yield at this level. IMHO, WSGI needn't involve generators. A higher-level wrapper (framework, middleware, whatever) can wrap fd-waiting in fancy generator stuff if so desired. Or, in some other environments, delegate it to a reactor with callbacks and deferreds. Or whatever else, such as futures. WSGI already involves generators: the response body. Wrong. I'm aware that it can be any form of iterable, [snip] Ok, so, WSGI doesn't already involve generators. QED. Right, that's why I was suggesting you drop your concern for Python 2 compatibility. -1 There is practically no reason for doing so; Of course, there is one: a less complex PEP without any superfluous compatibility language sprinkled all over. And a second one: a simpler PEP is probably easier to get constructive comments about, and (perhaps some day) consensus on. esp. considering that I've managed to write a 2k/3k polyglot server that is more performant out of the box than any other WSGI HTTP server I've come across and is far simpler in implementation than most of the ones I've come across with roughly equivalent feature sets. Just because you managed to write some piece of code for a *particular* use case doesn't mean that cross-compatibility is a solved problem. If you think it's easy, then I'm sure the authors of various 3rd-party libs would welcome your help achieving it. Python 2.x will be around for a long time. And so will PEP 333 and even PEP 3333. People who value legacy compatibility will favour these old PEPs over your new one anyway. People who don't will progressively jump to 3.x. Antoine.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-07 12:42:24 -0800, Paul Davis said: Is the code for this server online? I'd be interested in reading through it. https://github.com/pulp/marrow.server.http There are two branches: master will always refer to the version published on Python.org, and draft refers to my rewrite. (When published, draft will be merged.) - Alice.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-07 13:21:36 -0800, Antoine Pitrou said: Ok, so, WSGI doesn't already involve generators. QED. This can go around in circles; by allowing all forms of iterable, it involves generators. Generators are a type of iterable. QED right back. ;) Right, that's why I was suggesting you drop your concern for Python 2 compatibility. -1 There is practically no reason for doing so; Of course, there is one: a less complex PEP without any superfluous compatibility language sprinkled all over. There isn't any compatibility language sprinkled within the PEP. In fact, the only mention of it is in the introduction (stating that 2.6 support may be possible but is undefined) and the title of a section Python Cross-Version Compatibility. Using native strings where possible encourages compatibility, though for the environ variables previously mentioned (URI, etc.) explicit exceptional behaviour is clearly defined. (Byte strings and true unicode.) Just because you managed to write some piece of code for a *particular* use case doesn't mean that cross-compatibility is a solved problem. The particular use case happens to be PEP 444 as implemented using an async and multi-process (some day multi-threaded) HTTP server, so I'm not quite sure what you're getting at, here. I think that use case is sufficiently broad to be able to make claims about the ease of implementing PEP 444 in a compatible way. If you think it's easy, then I'm sure the authors of various 3rd-party libs would welcome your help achieving it. I helped proof a book about Python 3 compatibility and am giving a presentation in March that contains information on Python 3 compatibility from the viewpoint of implementing the Marrow suite. Python 2.x will be around for a long time. And so will PEP 333 and even PEP 3333. People who value legacy compatibility will favour these old PEPs over your new one anyway. People who don't will progressively jump to 3.x. Yup. Not sure how this is really an issue. 
PEP 444 is the /future/, 333[3] is /now/ [-ish]. - Alice.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-07 09:04:07 -0800, Antoine Pitrou said: WSGI doesn't mandate any specific feature of generators, such as coroutine-like semantics, and the server doesn't have to know about them. The joy of writing a new specification is that we are not (potentially) shackled by old ways of doing things. Case in point: dropping start_response and changing the return value. PEP 444 isn't WSGI 1, and can change things, including additional changes to the allowable return value. - Alice.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
07.01.2011 07:24, Alex Grönholm kirjoitti: 07.01.2011 06:49, P.J. Eby kirjoitti: At 05:47 PM 1/6/2011 -0800, Alice BevanMcGregor wrote: Tossing the idea around all day long will then, of course, be happening regardless. Unfortunately for that particular discussion, PEP 3148 / Futures seems to have won out in the broader scope. Do any established async frameworks or server (e.g. Twisted, Eventlet, Gevent, Tornado, etc.) make use of futures? I understand that Twisted has incorporated futures support to their deferreds. Others, I believe, don't support them yet. You have to consider that Python 3.2 (the first Python with futures support in stdlib) hasn't even been released yet, and it's only been two weeks since I released the drop-in backport (http://pypi.python.org/pypi/futures/2.1). Exarkun corrected me on this -- there is currently no futures support in Twisted. Sorry about the false information. Having a ratified and incorporated language PEP (core in 3.2 w/ compatibility package for 2.5 or 2.6+ support) reduces the scope of async discussion down to: how do we integrate futures into WSGI 2 instead of how do we define an async API at all. It would be helpful if you addressed the issue of scope, i.e., what features are you proposing to offer to the application developer. While the idea of using futures presents some intriguing possibilities, it seems to me at first glance that all it will do is move the point where the work gets done. That is, instead of simply running the app in a worker, the app will be farming out work to futures. But if this is so, then why doesn't the server just farm the apps themselves out to workers? I guess what I'm saying is, I haven't heard use cases for this from the application developer POV -- why should an app developer care about having their app run asynchronously? Applications need to be asynchronous to work on a single threaded server. 
There is no other benefit than speed and concurrency, and having to program a web app to operate asynchronously can be a pain. AFAIK there is no other way if you want to avoid the context switching overhead and support a huge number of concurrent connections. Thread/process pools are only necessary in an asynchronous application where the app needs to use blocking network APIs or do heavy computation, and such uses can unfortunately present a bottleneck. It follows that it's pretty pointless to have an asynchronous application that uses a thread/process pool on every request. The goal here is to define a common API for these mutually incompatible asynchronous servers to implement so that you could one day run an asynchronous app on Twisted, Tornado, or whatever without modifications. So far, I believe you're the second major proponent (i.e. ones with concrete proposals and/or implementations to discuss) of an async protocol... and what you have in common with the other proponent is that you happen to have written an async server that would benefit from having apps operating asynchronously. ;-) I find it hard to imagine an app developer wanting to do something asynchronously for which they would not want to use one of the big-dog asynchronous frameworks. (Especially if their app involves database access, or other communications protocols.) This doesn't mean I think having a futures API is a bad thing, but ISTM that a futures extension to WSGI 1 could be defined right now using an x-wsgi-org extension in that case... and you could then find out how many people are actually interested in using it. Mainly, though, what I see is people using the futures thing to shuffle off compute-intensive tasks... but if they do that, then they're basically trying to make the server's life easier... but under the existing spec, any truly async server implementing WSGI is going to run the *app* in a future of some sort already... 
Which means that the net result is that putting in async is like saying to the app developer: hey, you know this thing that you just could do in WSGI 1 and the server would take care of it for you? Well, now you can manage that complexity by yourself! Isn't that wonderful? ;-) I could be wrong of course, but I'd like to see what concrete use cases people have for async. We dropped the first discussion of async six years ago because someone (I think it might've been James) pointed out that, well, it isn't actually that useful. And every subsequent call for use cases since has been answered with, well, the use case is that you want it to be async. Only, that's a *server* developer's use case, not an app developer's use case... and only for a minority of server developers, at that.
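Alex's earlier point — that a single-threaded server only needs a thread/process pool when the app must call blocking APIs or do heavy computation — can be sketched without any particular framework. The event loop below is a toy stand-in (not Tornado's, Twisted's, or anyone's real API); only the pool usage is the point:

```python
# Sketch: a single-threaded loop ships blocking work to a pool and
# collects results as they complete, instead of blocking the loop itself.
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def blocking_db_query(n):
    # Stands in for a blocking network API or heavy computation.
    time.sleep(0.01)
    return n * n

pool = ThreadPoolExecutor(max_workers=4)
pending = [pool.submit(blocking_db_query, n) for n in range(4)]

# The "loop" stays free until results arrive; here we just gather them.
results = sorted(f.result() for f in as_completed(pending))
pool.shutdown(wait=True)
```

As Alex notes, doing this on *every* request defeats the purpose of an async server; the pool is an escape hatch for the blocking minority of the work, not the main execution model.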
Re: [Web-SIG] PEP 444 feature request - Futures executor
07.01.2011 19:59, Guido van Rossum kirjoitti: If it's optional, what's the benefit for the app of getting it through WSGI instead of through importing some other standard module? The API of the executor will require a lot of thought. I worry that this weighs down the WSGI standard with the responsibility of coming up with the perfect executor API, and if it's not quite perfect after all, servers are additionally required to support the standard but suboptimal API effectively forever. Or they can choose not to provide it, in which case it was a waste of time putting it in WSGI. The only plausible reason for having a wsgi.executor object is to make writing (asynchronous) middleware easier. Otherwise the app could just create its own executor (as it is done now). If there is no wsgi.executor, how will the middleware get ahold of a thread/process pool? Having individual middleware maintain their own pools is pretty pointless, as I'm sure everyone will agree. On the other hand, I believe allowing wsgi.executor to be either a process pool or a thread pool is a recipe for disaster. I'm really not sure where to go from here. 
Re: [Web-SIG] PEP 444 / WSGI 2 Async
Alice Bevan–McGregor al...@... writes: On 2011-01-07 13:21:36 -0800, Antoine Pitrou said: Ok, so, WSGI doesn't already involve generators. QED. This can go around in circles; by allowing all forms of iterable, it involves generators. Generators are a type of iterable. QED right back. ;) Please read back in context. There isn't any compatibility language sprinkled within the PEP.[...] Using native strings where possible encourages compatibility, [snip] The whole native strings thing *is* compatibility cruft. A Python 3 PEP would only need two string types: bytes and unicode (str). Just because you managed to write some piece of code for a *particular* use case doesn't mean that cross-compatibility is a solved problem. The particular use case happens to be PEP 444 as implemented using an async and multi-process (some day multi-threaded) HTTP server, so I'm not quite sure what you're getting at, here. It's becoming too difficult to parse. You aren't sure yet what the async part of PEP 444 should look like but you have already implemented it? If you think it's easy, then I'm sure the authors of various 3rd-party libs would welcome your help achieving it. I helped proof a book about Python 3 compatibility and am giving a presentation in March that contains information on Python 3 compatibility from the viewpoint of implementing the Marrow suite. Well, I hope not too many people will waste time trying to write cross-compatible code rather than solely target Python 3. The whole point of Python 3 is to make developers' life better, not worse. Python 2.x will be around for a long time. And so will PEP 333 and even PEP 3333. People who value legacy compatibility will favour these old PEPs over your new one anyway. People who don't will progressively jump to 3.x. Yup. Not sure how this is really an issue. PEP 444 is the /future/, 333[3] is /now/ [-ish]. Please read back in context (instead of stripping it), *again*. 
Re: [Web-SIG] PEP 444 / WSGI 2 Async
08.01.2011 05:36, Antoine Pitrou kirjoitti: Alice Bevan–McGregor al...@... writes: On 2011-01-07 13:21:36 -0800, Antoine Pitrou said: Ok, so, WSGI doesn't already involve generators. QED. This can go around in circles; by allowing all forms of iterable, it involves generators. Generators are a type of iterable. QED right back. ;) Please read back in context. There isn't any compatibility language sprinkled within the PEP.[...] Using native strings where possible encourages compatibility, [snip] The whole native strings thing *is* compatibility cruft. A Python 3 PEP would only need two string types: bytes and unicode (str). Just because you managed to write some piece of code for a *particular* use case doesn't mean that cross-compatibility is a solved problem. The particular use case happens to be PEP 444 as implemented using an async and multi-process (some day multi-threaded) HTTP server, so I'm not quite sure what you're getting at, here. It's becoming too difficult to parse. You aren't sure yet what the async part of PEP 444 should look like but you have already implemented it? We are still discussing the possible mechanics of PEP 444 with async support. There is nothing definite yet, and certainly no workable implementation yet either. Async support may or may not materialize in PEP 444, in another PEP or not at all based on the discussions on this list and on IRC. If you think it's easy, then I'm sure the authors of various 3rd-party libs would welcome your help achieving it. I helped proof a book about Python 3 compatibility and am giving a presentation in March that contains information on Python 3 compatibility from the viewpoint of implementing the Marrow suite. Well, I hope not too many people will waste time trying to write cross-compatible code rather than solely target Python 3. The whole point of Python 3 is to make developers' lives better, not worse. Python 2.x will be around for a long time. And so will PEP and even PEP 333. 
People who value legacy compatibility will favour these old PEPs over your new one anyway. People who don't will progressively jump to 3.x. Yup. Not sure how this is really an issue. PEP 444 is the /future/, 333[3] is /now/ [-ish]. Please read back in context (instead of stripping it), *again*.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
At 12:37 PM 1/7/2011 -0800, Alice Bevan–McGregor wrote: But is there really any problem with providing a unified method for indicating a suspend point? Yes: a complexity burden that is paid by the many to serve the few -- or possibly non-existent. I still haven't seen anything that suggests there is a large enough group of people who want a portable async API to justify inconveniencing everyone else in order to suit their needs, vs. simply having a different calling interface for that need. If I could go back and change only ONE thing about WSGI 1, it would be the calling convention. It was messed up from the start, specifically because I wasn't adamant enough about weighing the needs of the many enough against the needs of the few. Only a few needed a push protocol (write()), and only a few even remotely cared about our minor nod to asynchrony (yielding empty strings to pause output). If I'd been smart (or more to the point, prescient), I'd have just done a 3-tuple return value from the get-go, and said to hell with those other use cases, because everybody else is paying to carry a few people who aren't even going to use these features for real. (As it happens, I thought write() would be needed in order to drive adoption, and it may well have been at one time.) Anyway, with a new spec we have the benefit of hindsight: we know that, historically, nobody has actually cared enough to propose a full-blown async API who wasn't also trying to make their async server implementation work without needing threads. Never in the history of the web-sig, AFAIK, has anyone come in and said, hey, I want to have an async app that can run on any async framework. Nobody blogs or twitters about how terrible it is that the async frameworks all have different APIs and that this makes their apps non-portable. We see lots of complaints about not having a Python 3 WSGI spec, but virtually none about WSGI being essentially synchronous. 
I'm not saying there's zero audience for such a thing... but then, at some point there was a non-zero audience for write() and for yielding empty strings. ;-) The big problem is this: if, as an app developer, you want this hypothetical portable async API, you either already have an app that is async or you don't. If you do, then you already got married to some particular API and are happy with your choice -- or else you'd have bit the bullet and ported. What you would not do, is come to the Web-SIG and ask for a spec to help you port, because you'd then *still have to port* to the new API... unless of course you wanted it to look like the API you're already using... in which case, why are you porting again, exactly? Oh, you don't have an app... okay, so *hypothetically*, if you had this API -- which, because you're not actually *using* an async API right now, you probably don't even know quite what you need -- hypothetically if you had this API you would write an app and then run it on multiple async frameworks... See? It just gets all the way to silly. The only way you can actually get this far in the process seems to be if you are on the server side, thinking it would be really cool to make this thing because then surely you'll get users. In practice, I can't imagine how you could write an app with substantial async functionality that was sanely portable across the major async frameworks, with the possible exception of the two that at least share some common code, paradigms, and API. And even if you could, I can't imagine someone wanting to. So far, you have yet to give a concrete example of an application that you personally (or anyone you know of) want to be able to run on two different servers. You've spoken of hypothetical apps and hypothetical portability... but not one concrete, I want to run this under both Twisted and Eventlet (or some other two frameworks/servers), because of [actual, non-hypothetical rationale here]. 
I don't deny that [actual non-hypothetical rationale] may exist somewhere, but until somebody shows up with a concrete case, I don't see a proposal getting much traction. (The alternative would be if you pull a rabbit out of your hat and propose something that doesn't cost anybody anything to implement... but the fact that you're tossing the 3-tuple out in favor of yielding indicates you've got no such proposal ready at the present time.) On the plus side, the "run this in a future after the request" concept has some legs, and I hope Timothy (or anybody) takes it and runs with it. That has plenty of concrete use cases for portability -- every sufficiently-powerful web framework will want to either provide that feature, build other features on top of it, or both. It's the "make the request itself async" part that's the hard sell here, and in need of some truly spectacular rationale in order to justify the ubiquitous costs it imposes.
Re: [Web-SIG] PEP 444 Goals
At 01:22 PM 1/7/2011 -0800, Alice Bevan–McGregor wrote: On 2011-01-07 08:28:15 -0800, P.J. Eby said: At 01:17 AM 1/7/2011 -0800, Alice Bevan–McGregor wrote: On 2011-01-06 20:18:12 -0800, P.J. Eby said: :: Reduction of re-implementation / NIH syndrome by incorporating the most common (1%) of features most often relegated to middleware or functional helpers. Note that nearly every application-friendly feature you add will increase the burden on both server developers and middleware developers, which ironically means that application developers actually end up with fewer options. Some things shouldn't have multiple options in the first place. ;) I meant that if a server doesn't implement the spec because of a required feature, then the app developer doesn't have the option of using that feature anyway -- meaning that adding the feature to the spec didn't really help. I truly can not worry about non-conformant applications, middleware, or servers and still keep my hair. I said if a server doesn't implement the *spec*, meaning, they choose not to support PEP 444 *at all*, not that they skip providing the feature. Easy enough to write quick, say, 10-line utility functions that are correct middleware -- so that you could actually build your application out of WSGI functions calling other WSGI-based functions. The yielding thing wouldn't work for that at all. Handling a possible generator isn't that difficult. That it's difficult at all removes degree-of-difficulty as a strong motivation to switch. So, in order to know what type each CGI variable is, you'll need a reference? Reference? Re-read what I wrote. Only URI-specific values utilize an encoding reference variable in the environment; that's four values out of the entire environ. There is one, clearly defined bytes value. The rest are native strings, decoded using latin1/iso-8859-1/str-in-unicode where native strings are unicode. 
IOW, there are six specific facts someone needs to remember in order to know the type of a given CGI variable, over and above the mere fact that it's a CGI variable. Hence, reference.
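The latin1/iso-8859-1 native-string rule being debated can be sketched as follows; the helper names (to_native, to_bytes, raw_environ) are illustrative only, not from the draft spec. Because latin-1 maps every byte 0-255 to the same code point, the decode is lossless and the raw bytes stay recoverable for later transcoding:

```python
def to_native(value: bytes) -> str:
    # latin-1 is a 1:1 byte <-> code point mapping, so this never fails
    return value.decode("latin-1")

def to_bytes(value: str) -> bytes:
    # exact inverse: recovers the raw bytes unchanged
    return value.encode("latin-1")

# hypothetical raw header bytes as read from the socket
raw_environ = {b"PATH_INFO": b"/caf\xc3\xa9", b"HTTP_HOST": b"example.com"}
native_environ = {to_native(k): to_native(v) for k, v in raw_environ.items()}

# the round trip preserves raw bytes, so a UTF-8 path can still be
# recovered and re-decoded properly by the application
assert to_bytes(native_environ["PATH_INFO"]) == b"/caf\xc3\xa9"
assert to_bytes(native_environ["PATH_INFO"]).decode("utf-8") == "/caf\u00e9"
```

This is the sense in which the values are "bytes-in-unicode": str everywhere, with the encoding decision deferred to whoever holds the encoding reference.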
Re: [Web-SIG] PEP 444 / WSGI 2 Async
08.01.2011 07:09, P.J. Eby kirjoitti: At 12:37 PM 1/7/2011 -0800, Alice Bevan–McGregor wrote: But is there really any problem with providing a unified method for indicating a suspend point? Yes: a complexity burden that is paid by the many to serve the few -- or possibly non-existent. I still haven't seen anything that suggests there is a large enough group of people who want a portable async API to justify inconveniencing everyone else in order to suit their needs, vs. simply having a different calling interface for that need. If I could go back and change only ONE thing about WSGI 1, it would be the calling convention. It was messed up from the start, specifically because I wasn't adamant enough about weighing the needs of the many enough against the needs of the few. Only a few needed a push protocol (write()), and only a few even remotely cared about our minor nod to asynchrony (yielding empty strings to pause output). If I'd been smart (or more to the point, prescient), I'd have just done a 3-tuple return value from the get-go, and said to hell with those other use cases, because everybody else is paying to carry a few people who aren't even going to use these features for real. (As it happens, I thought write() would be needed in order to drive adoption, and it may well have been at one time.) Anyway, with a new spec we have the benefit of hindsight: we know that, historically, nobody has actually cared enough to propose a full-blown async API who wasn't also trying to make their async server implementation work without needing threads. Never in the history of the web-sig, AFAIK, has anyone come in and said, hey, I want to have an async app that can run on any async framework. Nobody blogs or twitters about how terrible it is that the async frameworks all have different APIs and that this makes their apps non-portable. We see lots of complaints about not having a Python 3 WSGI spec, but virtually none about WSGI being essentially synchronous. 
I'm not saying there's zero audience for such a thing... but then, at some point there was a non-zero audience for write() and for yielding empty strings. ;-) The big problem is this: if, as an app developer, you want this hypothetical portable async API, you either already have an app that is async or you don't. If you do, then you already got married to some particular API and are happy with your choice -- or else you'd have bit the bullet and ported. What you would not do, is come to the Web-SIG and ask for a spec to help you port, because you'd then *still have to port* to the new API... unless of course you wanted it to look like the API you're already using... in which case, why are you porting again, exactly? Oh, you don't have an app... okay, so *hypothetically*, if you had this API -- which, because you're not actually *using* an async API right now, you probably don't even know quite what you need -- hypothetically if you had this API you would write an app and then run it on multiple async frameworks... See? It just gets all the way to silly. The only way you can actually get this far in the process seems to be if you are on the server side, thinking it would be really cool to make this thing because then surely you'll get users. In practice, I can't imagine how you could write an app with substantial async functionality that was sanely portable across the major async frameworks, with the possible exception of the two that at least share some common code, paradigms, and API. And even if you could, I can't imagine someone wanting to. So far, you have yet to give a concrete example of an application that you personally (or anyone you know of) want to be able to run on two different servers. You've spoken of hypothetical apps and hypothetical portability... but not one concrete, I want to run this under both Twisted and Eventlet (or some other two frameworks/servers), because of [actual, non-hypothetical rationale here]. 
How do you suppose common async middleware could be implemented without a common async API? Today we have plenty of WSGI middleware, which would not be possible without a common API. You would have to make separate interfaces for every major framework and separately test against each of them instead of having a reasonable expectation that it will work uniformly across compliant frameworks. I would really love to see common middleware components that are usable on twisted, tornado etc. without modifications. You seem to be under the impression that asynchronous applications only have some specialized uses. Asynchronous applications are no more limited in scope than synchronous ones are. It's just an alternative programming paradigm that has the potential of squeezing more performance out of a server. Note that I am in no way insisting that PEP 444 require async support; I'm only exploring that possibility. If we cannot figure out
Re: [Web-SIG] PEP 444 Goals
On 2011-01-07 20:34:09 -0800, P.J. Eby said: That it [handling generators] is difficult at all removes degree-of-difficulty as a strong motivation to switch. Agreed. I will be following up with a more concrete idea (including p-code) to better describe what is currently in my brain. (One half of which will be just as objectionable, the other half, with Alex Grönholm's input, far more reasonable.) IOW, there are six specific facts someone needs to remember in order to know the type of a given CGI variable, over and above the mere fact that it's a CGI variable. Hence, reference. No, practically there is one. If you are implementing a Python 3 solution, a single value (original URI) is an instance of bytes, the rest are str. If you are implementing a Python 2 solution, there's a single rule you need to remember: values derived from the URI (QUERY_STRING, PATH_INFO, etc.) are unicode, the rest are str. Polyglot implementors are already accepting that they will need to include more in their headspace before writing a single line of code; knowing that native string differs between the two languages is a fundamental concept necessary for the act of writing polyglot code. - Alice.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-07 22:13:17 -0800, Alex Grönholm said: 08.01.2011 07:09, P.J. Eby wrote: On the plus side, the "run this in a future after the request" concept has some legs... [snip] What exactly does "run this in a future after the request" mean? There seems to be some terminology confusion here. I suspect he's referring to some of the notes on the PEP 444 feature request - Futures executor thread and several of my illustrated use cases, notably: :: Image scaling (e.g. to multiple sizes) after uploading of an image to be scaled where the response (Congratulations, image uploaded!) does not require the result of the scaling. :: Content indexing which can also be performed after returning the success page. The former would executor.submit() a number of scaling jobs, attach completion callbacks to perform some cleanup / database updating / etc., and return a response immediately. The latter is a single executor submission that is entirely non-time-critical. And likely other use cases as well. This (inclusion of an executor tuned to the underlying server in the environment) is one thing I think we can (almost) all agree is a good idea. :D Discussion on that particular idea should be relegated to the feature request thread, though. - Alice.
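The image-scaling use case above can be sketched with the stdlib concurrent.futures module; a server-provided, tuned executor in the environ is the hypothetical part under discussion, so a plain ThreadPoolExecutor and the scale_image stand-in are illustrative only:

```python
from concurrent.futures import ThreadPoolExecutor

def scale_image(data: bytes, size: int) -> bytes:
    # stand-in for real scaling work
    return data[:size]

def on_done(future):
    # cleanup / database updating would happen here, in the callback
    print("scaled to", len(future.result()), "bytes")

executor = ThreadPoolExecutor(max_workers=2)
futures = []
for size in (1, 2, 3):  # e.g. thumbnail, medium, large
    f = executor.submit(scale_image, b"fake-image-data", size)
    f.add_done_callback(on_done)
    futures.append(f)

# the response ("Congratulations, image uploaded!") would be returned
# right here, while scaling finishes in the background
results = [f.result() for f in futures]
executor.shutdown()
```

The point is that the application never blocks the response on the scaling jobs; only the callbacks see the results.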
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-07 13:21:36 -0800, Antoine Pitrou said: Ok, so, WSGI doesn't already involve generators. QED. Let me try this again. With the understanding that: :: PEP 333[3] and 444 define a response body as an iterable. :: Thus WSGI involves iterables through definition. :: A generator is a type of iterable. :: Thus WSGI involves generators through the use of iterables. The hypothetical redefinition of an application as a generator is not too far out to lunch, considering that WSGI _already involves generators_. (And that the simple case, an application that does not utilize async, will require a single word be changed: s/return/yield) Is that clearer? The idea referred to below (and posted separately) involves this redefinition, which I understand fully will have a number of strong opponents. Considering PEP 444 is a new spec (already breaking direct compatibility via the /already/ redefined return value) I hope people do not reject this out of hand but instead help explore the idea further. On 2011-01-07 19:36:52 -0800, Antoine Pitrou said: Alice Bevan–McGregor al...@... writes: The particular use case happens to be PEP 444 as implemented using an async and multi-process (some day multi-threaded) HTTP server, so I'm not quite sure what you're getting at, here. It's becoming too difficult to parse. You aren't sure yet what the async part of PEP 444 should look like but you have already implemented it? Marrow HTTPd (marrow.server.http) [1] is, internally, an asynchronous server. It does not currently expose the reactor to the WSGI application via any interface whatsoever. I am, however, working on some p-code examples (that I will post for discussion as mentioned above) which I can base a fork of m.s.http off of to experiment. This means that, yes, I'm not sure how async will work in PEP 444 /in the end/, but I am at least attempting to explore the practical implications of the ideas thus far in a real codebase. 
I'm getting it done, even if it has to change or be scrapped. I helped proof a book about Python 3 compatibility and am giving a presentation in March that contains information on Python 3 compatibility from the viewpoint of implementing the Marrow suite. Well, I hope not too many people will waste time trying to write cross-compatible code rather than solely target Python 3. The whole point of Python 3 is to make developers' lives better, not worse. I agree, with one correction to your first point. Application and framework developers should whole-heartedly embrace Python 3 and make full use of its many features, simplifications and clarifications. However, it is demonstrably not Insanely Difficult™ to have compatible server and middleware implementations with the draft's definition of native string. If server and middleware developers are willing to create polyglot code, I'm not going to stop them. Note that this type of compatibility is not mandated, and the use of native strings (with one well defined byte string exception) means that pure Python 3 programmers can be blissfully ignorant of the compatibility implications -- everything else is unicode (str), even if it's just bytes-in-unicode (latin1/iso-8859-1). Pure Python 2 programmers have only a small difference (for them) of the URI values being unicode; the remaining values are byte strings (str). I would like to hear a technical reason why this (native strings) is a bad idea instead of vague this will make things harder -- it won't, at least, not measurably, and I have the proof as a working, 100% unit tested, performant, cross-compatible polyglot HTTP/1.1-compliant server. Written in several days worth of full-time work spread across weeks because this is a spare-time project; i.e. not a lot of literal work, nor hard. 
Hell, it has transformed from a crappy hack to experiment with HTTP into a complete (or very nearly so) implementation of PEP 444 in both of its current forms (published and draft) that is almost usable, ignoring the fact that PEP 444 is mutable, of course. - Alice. [1] http://bit.ly/fLfamO
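The s/return/yield redefinition discussed above is hypothetical (not part of any ratified spec), but it can be sketched in two lines. A synchronous PEP 444-style application returning a (status, headers, body) triple becomes a generator by changing one word, and a server would then iterate it, treating yielded futures (not shown) as suspend points:

```python
# PEP 444-style synchronous application: returns a 3-tuple
def app_sync(environ):
    return '200 OK', [], [b'Hello world!']

# the same application under the hypothetical generator redefinition:
# only return -> yield changes
def app_async(environ):
    yield '200 OK', [], [b'Hello world!']

# a server would pull the response triple out of the generator
status, headers, body = next(app_async({}))
assert (status, headers, body) == app_sync({})
```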
Re: [Web-SIG] PEP 444 / WSGI 2 Async
Alice Bevan–McGregor al...@... writes: agronholm: what new features does pep 444 propose to add to pep ? \ async, filters, no buffering? GothAlice: Async, filters, no server-level buffering, native string usage, the definition of byte string as the format returned by socket read (which, on Java, is unicode!), and the allowance for returned data to be Latin1 Unicode. Regardless of the rest, I think the latter would be a large step backwards. Clear distinction between bytes and unicode is a *feature* of Python 3. Unicode-ignorant programmers should use frameworks which do the encoding work for them. (by the way, why are you targeting both Python 2 and 3?) agronholm: I'm not very comfortable with the idea of wsgi.input in async apps \ I'm just thinking what would happen when you do environ['wsgi.input'].read() GothAlice: One of two things: in a sync environment, it blocks until it can read, in an async environment [combined with yield] it pauses/shelves your application until the data is available. Er, for the record, in Python 3 non-blocking file objects return None when read() would block. For example:

>>> r, w = os.pipe()
>>> flags = fcntl.fcntl(r, fcntl.F_GETFL, 0)
>>> fcntl.fcntl(r, fcntl.F_SETFL, flags | os.O_NONBLOCK)
0
>>> os.read(r, 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OSError: [Errno 11] Resource temporarily unavailable
>>> f = open(r, "rb")
>>> f.read(1) is None
True

agronholm: the requirements of async apps are a big problem agronholm: returning magic values from the app sounds like a bad idea agronholm: the best solution I can come up with is to have wsgi.async_input or something, which returns an async token for any given read operation The idiomatic abstraction for non-blockingness under POSIX is file descriptors. So, at the low level (the WSGI level), exchanging fds between server and app could be enough to allow both to wake up each other (perhaps two fds: one the server can wait on, one the app can wait on). Similarly to what signalfd() does. 
Then higher-level tools can wrap inside Futures or whatever else. However, this also means Windows compatibility becomes more complicated, unless the fds are sockets. Regards Antoine.
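The fd-exchange idea above can be sketched with an ordinary pipe on POSIX; the names and the single-byte wakeup token here are illustrative, not a proposed API. One side writes a byte to signal the other, which waits on the fd with select():

```python
import os
import select

# one pipe per direction would be used; here, app -> server only
app_wakes_server_r, app_wakes_server_w = os.pipe()

# app side: signal the server that, e.g., a chunk of output is ready
os.write(app_wakes_server_w, b"\x00")

# server side: wait (with a timeout) until the app signals, then drain
readable, _, _ = select.select([app_wakes_server_r], [], [], 1.0)
if app_wakes_server_r in readable:
    os.read(app_wakes_server_r, 1)  # consume the wakeup token

os.close(app_wakes_server_r)
os.close(app_wakes_server_w)
```

As the message notes, select() on Windows only accepts sockets, which is exactly why the fds would have to be sockets there.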
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On Wed, 5 Jan 2011, Alice Bevan–McGregor wrote: This should give a fairly comprehensive explanation of the rationale behind some decisions in the rewrite; a version of these conversations (in narrative style vs. discussion) will be added to the rewrite Real Soon Now™ under the Rationale section. Thanks for this. I've been trying to follow along with this conversation as an interested WSGI app developer and admit that much of the thrust of things is getting lost in the details and people's tendency to overquote. One thing that would be useful is if, when you post, Alice, you could give the URL of whatever and wherever your current draft is. That out of the way some comments: For me WSGI is a programmers' aid used to encourage encapsulation and separation of concerns in web applications I develop. After that there's a bit about reusability and portability, but the structure of the apps/middleware themselves are the most important concerns for me. I don't use frameworks, or webob or any of that stuff. I just cook up callables that take environ and start_response. I don't want my awareness of the basics of HTTP abstracted away, because I want to make sure that my apps behave well. Plain WSGI is a good thing, for me, because it means that my applications are a) very webby (in the stateless HTTP sense) and b) very testable. This all works because WSGI is very simple, so my tendency is to be resistant to ideas which appear to add complexity. --- 444 vs. GothAlice: Async, filters, no server-level buffering, native string usage, the definition of byte string as the format returned by socket read (which, on Java, is unicode!), and the allowance for returned data to be Latin1 Unicode. 
\ All of this together will allow a '''def hello(environ): return '200 OK', [], ['Hello world!']''' example application to work across Python versions without modification (or use of b prefix) On async: I agree with some others who have suggested that maybe async should be its own thing, rather than integrated into a WSGI2. A server could choose to be WSGI2 compliant or AWSGI compliant, or both. Having spent some time messing about with node.js recently, I can say that the coding style for happy little async apps is great fun, but actually not what I really want to be doing in my run-of-the-mill as-RESTful-as-possible web apps. This might make me a bit of a dinosaur. Or a grape. That said I can understand why an app author might like to be able to read or write in an async way, and being able to shelve an app to wait around for the next cycle would be a good thing. I just don't want efforts to make that possible to make writing a boring wsgi thing more annoying. On filters: I can't get my head around filters yet. They sound like a different way to do middleware, with a justification of something along the lines of I don't like middleware for filtering. I'd like to be (directly) pointed at a more robust justification. I suspect you have already pointed at such a thing, but it is lost in the sands of time... Filters seem like something that could be added via a standardized piece of middleware, rather than being part of the spec. I like minimal specs. GothAlice: Latin1 = \u0000 → \u00FF — it's one of the only formats that can be decoded while preserving raw bytes, and if another encoding is needed, transcode safely. \ Effectively requiring Latin1 for unicode output ensures single byte conformance on the data. \ If an application needs to return UTF-8, for example, it can return an encoded UTF-8 bytestream, which will be passed right through, There's a rule of thumb about constraints. If you must constrain, do none, one or all, never some. 
Ah, here it is: http://en.wikipedia.org/wiki/Zero_One_Infinity Does that apply here? It seems you either allow unicode strings or you don't, not a certain subsection. My own personal method is: textual apps _always_ return unicode producing iterators and a piece of (required, thus not official by some people's biases) middleware turns it into UTF-8 on the way out. I've naively never understood why you'd want to do anything else. My general rule is unicode inside, UTF-8 at the boundaries. That's all I got so far. I applaud you for taking on this challenge. It's work that needs to be done. I hope to be able to comment more and make a close reading of the various documents, but time is tough sometimes. I'll do what I can as I can. Thanks. -- Chris Dent http://burningchrome.com/
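The "unicode inside, UTF-8 at the boundaries" rule described above can be sketched as plain WSGI 1 middleware; utf8_boundary and textual_app are hypothetical names, not part of any spec:

```python
def utf8_boundary(app):
    """Let the wrapped app yield str; encode to UTF-8 bytes on the way out."""
    def wrapped(environ, start_response):
        for chunk in app(environ, start_response):
            # pass bytes through untouched, encode unicode at the boundary
            yield chunk.encode("utf-8") if isinstance(chunk, str) else chunk
    return wrapped

def textual_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain; charset=utf-8')])
    return ['héllo']  # unicode inside

app = utf8_boundary(textual_app)
body = list(app({}, lambda status, headers: None))
assert body == ['héllo'.encode('utf-8')]  # bytes at the boundary
```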
Re: [Web-SIG] PEP 444 / WSGI 2 Async
At 01:03 PM 1/6/2011 +, chris.d...@gmail.com wrote: Does that apply here? It seems you either allow unicode strings or you don't, not a certain subsection. That's why PEP requires bytes instead - only the application knows what it's sending, and the server and middleware shouldn't have to guess. My general rule is unicode inside, UTF-8 at the boundaries. Which would be easy to enforce if you can only yield bytes, as is the case with PEP . I worry a bit that right now, there may be Python 3.2 servers (other than the ones built on wsgiref.handlers) that may not be enforcing this rule yet.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-06 03:53:14 -0800, Antoine Pitrou said: Alice Bevan-McGregor al...@... writes: GothAlice: ... native string usage, the definition of byte string as the format returned by socket read (which, on Java, is unicode!) ... Just so no-one feels the need to correct me; agronholm made sure I didn't drink the kool-aid of one article I was reading and basing some ideas on. Java socket objects use byte-based buffers, not unicode. My bad! Regardless of the rest, I think the latter would be a large step backwards. Clear distinction between bytes and unicode is a *feature* of Python 3. Unicode-ignorant programmers should use frameworks which do the encoding work for them. +0.5 I'm beginning to agree; with the advent of b'' syntax in 2.6, the only compelling reason to include this feature (examples that work without modification across major versions of Python) goes up in smoke. The examples should use the b'' syntax and have done with it. (by the way, why are you targeting both Python 2 and 3?) For the same reason that Python 3 features are introduced to 2.x; migration. Users are more likely to adopt something that doesn't require them to change production environments, and 3.x is far away from being deployed in production anywhere but on Gentoo, it seems. ;) Broad development and deployment options are a Good Thing™, and with b'', there is no reason -not- to target 2.6+. (There is no requirement that a PEP 444 / WSGI 2 server even try to be a cross-compatible polyglot; there is room for 2.x-specific and 3.x-specific solutions, and, in theory, it should be possible to support Python 2.6, I just don't feel it's worthwhile to lock your application into Very Old™ interpreters.) 
agronholm: I'm not very comfortable with the idea of wsgi.input in async apps \ I'm just thinking what would happen when you do environ['wsgi.input'].read() GothAlice: One of two things: in a sync environment, it blocks until it can read, in an async environment [combined with yield] it pauses/shelves your application until the data is available. Er, for the record, in Python 3 non-blocking file objects return None when read() would block. -1 I'm aware, however that's not practically useful. How would you detect from within the WSGI 2 application that the file object has become readable? Implement your own async reactor / select / epoll loop? That's crazy talk! ;) agronholm: the requirements of async apps are a big problem agronholm: returning magic values from the app sounds like a bad idea agronholm: the best solution I can come up with is to have wsgi.async_input or something, which returns an async token for any given read operation The idiomatic abstraction for non-blockingness under POSIX is file descriptors. So, at the low level (the WSGI level), exchanging fds between server and app could be enough to allow both to wake up each other (perhaps two fds: one the server can wait on, one the app can wait on). Similarly to what signalfd() does. Then higher-level tools can wrap inside Futures or whatever else. -0 Hmm; I'll have to mull that over. Initial thoughts: having a magic yield value that combines a fd and operation (read/write) is too magical. However, this also means Windows compatibility becomes more complicated, unless the fds are sockets. +1 for pure futures which (in theory) eliminate the need for dedicated async versions of absolutely everything at the possible cost of slightly higher overhead. - Alice.
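The "pure futures" direction mentioned above can be sketched as a generator application that yields a Future whenever it must wait, with the server resuming it once the result is ready. Everything here is hypothetical (the drive loop, the yielded-future convention, slow_read), an illustration of the idea rather than an API from PEP 444:

```python
from concurrent.futures import ThreadPoolExecutor

executor = ThreadPoolExecutor(max_workers=1)

def slow_read():
    # stand-in for a blocking wsgi.input read
    return b"request body"

def app(environ):
    # suspend point: yield a future, get resumed with its result
    data = yield executor.submit(slow_read)
    yield '200 OK', [], [b'got ' + data]

def drive(app, environ):
    """A toy server loop: wait on yielded futures, resume the app."""
    gen = app(environ)
    value = next(gen)
    while hasattr(value, 'result'):        # it's a future: block, then resume
        value = gen.send(value.result())
    return value                           # the (status, headers, body) triple

status, headers, body = drive(app, {})
executor.shutdown()
```

A real server would of course not block on future.result() but register a done-callback with its reactor; the control flow seen by the application is the same either way.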
Re: [Web-SIG] PEP 444 / WSGI 2 Async
Chris, On 2011-01-06 05:03:15 -0800, Chris Dent said: On Wed, 5 Jan 2011, Alice Bevan–McGregor wrote: This should give a fairly comprehensive explanation of the rationale behind some decisions in the rewrite; a version of these conversations (in narrative style vs. discussion) will be added to the rewrite Real Soon Now™ under the Rationale section. Thanks for this. I've been trying to follow along with this conversation as an interested WSGI app developer and admit that much of the thrust of things is getting lost in the details and people's tendency to overquote. Yeah; I knew the IRC log dump was only so useful. It's a lot of material to go through, and much of it was discussed at strange hours with little sleep. ;) One thing that would be useful is if, when you post, Alice, you could give the URL of whatever and wherever your current draft is. Tomorrow (ack, today!) I'll finish converting over the PEP from Textile to ReStructuredText and get it re-submitted to the Python website. https://github.com/GothAlice/wsgi2/blob/master/pep444.textile http://www.python.org/dev/peps/pep-0444/ I don't use frameworks, or webob or any of that stuff. I just cook up callables that take environ and start_response. I don't want my awareness of the basics of HTTP abstracted away, because I want to make sure that my apps behave well. Kudos! That approach is heavily frowned upon in the #python IRC channel, but I fully agree that working solutions can be reasonably made using that methodology. There are some details that are made easier by frameworks, though. Testing benefits from MVC: you can test the dict return value of the controller, the templates, and the model all separately. Plain WSGI is a good thing, for me, because it means that my applications are a) very webby (in the stateless HTTP sense) and b) very testable. c) And very portable. You need not depend on some pre-arranged stack (including web server).
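For readers following along, the framework-free style Chris describes is just a standard PEP 333-style callable taking environ and start_response; the names below are arbitrary:

```python
def app(environ, start_response):
    # Read what we need straight from the environ dict.
    method = environ.get("REQUEST_METHOD", "GET")
    body = ("Hello via %s\n" % method).encode("utf-8")
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    start_response("200 OK", headers)
    # The return value is an iterable of byte strings.
    return [body]
```

Such a callable is trivially testable by calling it directly with a dict and a fake start_response, which is much of the "very testable" point being made.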
I agree with some others who have suggested that maybe async should be its own thing, rather than integrated into a WSGI2. A server could choose to be WSGI2 compliant or AWSGI compliant, or both. -1 That is already the case with filters, and will be when I ratify the async idea (after further discussion here). My current thought process is that async will be optional for server implementors and will be easily detectable by applications and middleware and have zero impact on middleware/applications if disabled (by configuration) or missing. That said I can understand why an app author might like to be able to read or write in an async way, and being able to shelf an app to wait around for the next cycle would be a good thing. Using futures, async covers any callable at all; you can queue up a dozen DB calls at the top of your application, then (within a body generator) yield those futures to be paused pending the data. That would, as an example, allow complex pages to be generated and streamed to the end-user in an efficient way -- the user would see a page begin to appear, and the browser downloading static resources, while intensive tasks complete. I just don't want efforts to make that possible to make writing a boring wsgi thing more annoying. +9001 See above. I can't get my head around filters yet. They sound like a different way to do middleware, with a justification of something along the lines of I don't like middleware for filtering. I'd like to be (directly) pointed at a more robust justification. I suspect you have already pointed at such a thing, but it is lost in the sands of time... Filters offer several benefits, some of which are mild: :: Simplified application / middleware debugging via smaller stack. :: Clearly defined tasks; ingress = altering the environ / input, egress = altering the output. :: Egress filters are not executed if an unhandled exception is raised.
The latter point is important; you do not want badly written middleware to absorb exceptions that should bubble, etc. (I'll need to elaborate on this and add a few more points when I get some sleep.) Filters seem like something that could be added via a standardized piece of middleware, rather than being part of the spec. I like minimal specs. Filters are optional, and an example is/will be provided for utilizing ingress/egress filter stacks as middleware. The problem with /not/ including the filtering API (which, by itself is stupidly simple and would barely warrant its own PEP, IMHO) is that a separate standard would not be seen and taken into consideration when developers are writing what they will think /must/ be middleware. Seeing as a middleware version of a filter is trivial to create (just execute the filter in a thin middleware wrapper), it should be a consideration up front. Latin1 = \u0000 → \u00FF [snip] There's a rule of thumb about constraints. If you must constrain, do none, one or all, never
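The "thin middleware wrapper" mentioned above might look like the following sketch. The filter signatures assumed here (an ingress filter that mutates the environ in place, an egress filter that transforms status/headers/body) are illustrative guesses at the draft's filter API, not a ratified interface:

```python
def filter_as_middleware(app, ingress=None, egress=None):
    """Hypothetical adapter running PEP 444-style filters as middleware."""
    def middleware(environ, start_response):
        if ingress is not None:
            ingress(environ)            # ingress: alter environ / input
        captured = {}
        def capture(status, headers):   # intercept the app's response
            captured["status"] = status
            captured["headers"] = headers
        body = app(environ, capture)
        status, headers = captured["status"], captured["headers"]
        if egress is not None:          # egress: alter the output
            status, headers, body = egress(status, headers, body)
        start_response(status, headers)
        return body
    return middleware
```

Note one difference from true filters survives even in this wrapper: as middleware, the egress step sits on the call stack during the application call, which is exactly the stack-depth and exception-swallowing concern raised above.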
Re: [Web-SIG] PEP 444 / WSGI 2 Async
At Thu, 6 Jan 2011 13:03:15 +0000 (GMT), chris dent wrote: snip On async: I agree with some others who have suggested that maybe async should be its own thing, rather than integrated into a WSGI2. A server could choose to be WSGI2 compliant or AWSGI compliant, or both. /snip +1 After seeing some of the ideas regarding how to add async into a new version of WSGI, it isn't clear what specific problem the async feature addresses in terms of WSGI servers. Is the goal to support long running connections? Are we trying to support WebSockets and other long running connection interfaces? If that is the case, async is a *technique* for handling this paradigm, but it doesn't address the real problem. There are techniques that have sounded reasonable like making available the socket such that a server can give it to the application to do something useful with it (here is an example doing something similar with CherryPy - http://trac.defuze.org/browser/oss/ws4cp/ws4cp.py). Just to summarize, I'm for making async something else, while finding a way to support long running connections in WSGI outside of adopting a particular technique is a potentially viable goal. Just my $.02 on the issue. Eric Larson
[Web-SIG] PEP 444 Goals
Hello web-sig. My name is Timothy Farrell. I'm the developer of the Rocket web server. I understand that most of you are more experienced and passionate than myself. But I've come here because I want to see certain things standardized. I'm pretty new to this forum but I've read through all the recent discussions on PEP 444. That being said, I'll try to take a humble approach. It seems to me that the spec that Alice is working on could be something great but the problems are not well defined (in the PEP). This causes confusion about what the goals are. There's some disagreement about whether or not certain features should be in PEP 444. I think those people have a different idea for what PEP 444 ought to be. The first thing that should be done is clearly defining the shortcomings with PEP 333 that PEP 444 seeks to address, and limiting our PEP 444 discussions to solving those problems. Since Alice is rewriting the PEP, perhaps we should all sit back for a second until we have a PEP to work off of. That will help the discussion be a little more focused. Sorry if I've stepped on anyone's toes. -tim
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On 2011-01-06 09:06:10 -0800, chris.d...@gmail.com said: I wasn't actually talking about the log dump. That was useful. What I was talking about were earlier messages in the thread where people were making responses, quoting vast swaths of text for no clear reason. Ah. :) I do make an effort to trim quoted text to only the relevant parts. On Thu, 6 Jan 2011, Alice Bevan–McGregor wrote: https://github.com/GothAlice/wsgi2/blob/master/pep444.textile Thanks, watching that now. The textile document will no longer be updated; the pep-444.rst document is where it'll be at. I should have been more explicit here as I now feel I must defend myself from frowns. I'm not talking about single methods that do the entire app. I nest a series of middleware that bottom out at Selector which then does url based dispatch to applications, which themselves are defined as handlers (simple wsgi functions) and access StorageInterfaces and Serializations. The middleware, handlers, stores and serializers are all independently testable (and usable). *nods* My framework (WebCore) is basically a packaged up version of a custom middleware stack so I can easily re-use it from project to project. I assumed (in my head) you were rolling your own framework/stack. That is already the case with filters, and will be when I ratify the async idea (after further discussion here). My current thought process is that async will be optional for server implementors and will be easily detectable by applications and middleware and have zero impact on middleware/applications if disabled (by configuration) or missing. This notion of being detectable seems weird to me. Are we actually expecting an application to query the server, find out it is not async capable, and choose a different code path as a result? Seems much more likely that the installer will choose a server or app that meets their needs. That is: you don't need to detect, you need to know (presumably at install/config time). 
Or maybe I am imagining the use cases incorrectly here. I think of app being async as an explicit choice made by the builder to achieve some goal. More to the point it needs to be detectable by middleware without explicitly configuring every layer of middleware, potentially with differing configuration mechanics and semantics. (I.e. arguments like enable_async, async_enable, iLoveAsync, ...) I can't get my head around filters yet.[snip] Filters offer several benefits, some of which are mild: :: Simplified application / middleware debugging via smaller stack. :: Clearly defined tasks; ingress = altering the environ / input, egress = altering the output. :: Egress filters are not executed if an unhandled exception is raised. Taken individually none of these seem super critical to me. Or to put it another way: Yeah, so? (This is the aforementioned resistance showing through. The above sounds perfectly nice, reasonable and desirable, but not _necessary_.) It isn't necessary; it is, however, an often re-implemented feature of a framework on top of WSGI. CherryPy, Paste, Django, etc. all implement some form of non-WSGI (or, hell, Paste uses WSGI middleware) thing they call a 'filter'. Filters are optional, and an example is/will be provided for utilizing ingress/egress filter stacks as middleware. In a conversation with some people about the Atom Publishing Protocol I tried to convince them that the terms SHOULD and MAY had no place in a spec. WSGI* is not really the same kind of spec, but optionality still grates in the same way. I fully agree; that's why a lot of the PEP 333 optionally or may features have become must. Optionally and may simply never get implemented. Filters are optional because a number of people have raised valid arguments that it might not be entirely needed. Thus, it's not required.
But I strongly feel that some defined API should be present in (or /at least/ referred to by) the PEP, otherwise the future will hold the same server-specific incompatible implementations. - Alice.
Re: [Web-SIG] PEP 444 / WSGI 2 Async
06.01.2011 20:02, Eric Larson kirjoitti: At Thu, 6 Jan 2011 13:03:15 + (GMT), chris dent wrote: snip On async: I agree with some others who have suggested that maybe async should be its own thing, rather than integrated into a WSGI2. A server could choose to be WSGI2 compliant or AWSGI compliant, or both. /snip +1 After seeing some of the ideas regarding how to add async into a new version of WSGI, it isn't the specific problem the async feature addresses in terms of WSGI servers. Is the goal is to support long running connections? Are we trying to support WebSockets and other long running connection interfaces? If that is the case, async is a *technique* for handling this paradigm, but it doesn't address the real problem. There are techniques that have sounded reasonable like making available the socket such that a server can give it to the application to do something use with it (here is an example doing something similar with CherryPy - http://trac.defuze.org/browser/oss/ws4cp/ws4cp.py). The primary idea behind asynchronous servers/applications is the ability to efficiently serve a huge number of concurrent connections with a small number of threads. Asynchronous applications tend to be faster because there is less thread context switching happening in the CPU. Any application that runs on top of a web server that allocates less threads to the application than the number of connections has to be quick to respond so as not to starve the thread pool or block the event loop. This is true regardless of whether nonblocking I/O or some other technique is used. I'm a bit unclear as to how else you would do this. Care to elaborate on that? I looked at the Cherrypy code, but I couldn't yet figure that out. Just to summarize, I'm for making async something else while finding a way to support long running connections in WSGI outside of adopting a particular technique a potentially viable goal. Just my $.02 on the issue. 
Eric Larson
Re: [Web-SIG] PEP 444 Goals
On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote: :: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers. Thus pipelining, chunked encoding, et. al. as per the HTTP 1.1 RFC. Requirements on the HTTP compliance of the server don't really have any place in the WSGI spec. You should be able to be WSGI compliant even if you don't use the HTTP transport at all (e.g. maybe you just send around requests via SCGI). The original spec got this right: chunking etc are something which is not relevant to the wsgi application code -- it is up to the server to implement the HTTP transport according to the HTTP spec, if it's purporting to be an HTTP server. James
Re: [Web-SIG] PEP 444 / WSGI 2 Async
Alice Bevan–McGregor wrote: chris.d...@gmail.com said: I can't get my head around filters yet... It isn't necessary; it is, however, an often re-implemented feature of a framework on top of WSGI. CherryPy, Paste, Django, etc. all implement some form of non-WSGI (or, hell, Paste uses WSGI middleware) thing they call a 'filter'. Or, if you had actually read what I wrote weeks ago, you'd say CherryPy used to have a thing they call a 'filter', but then replaced it with a much better mechanism (hooks and tools) once the naïve categories of ingress/egress were shown in practice to be inadequate. Not to mention that, even when CherryPy had something called a 'filter', that it not only predated WSGI but ran at the innermost WSGI layer, not the outermost. It's apples and oranges at best, or reinventing the square wheel at worst. We don't need Yet Another Way of hooking in processing components; if anything, we need a standard mechanism to compose existing middleware graphs so that invariant orderings are explicit and guaranteed. For example, encode, then gzip, then cache. By introducing egress filters as described in PEP 444 (which mentions gzip as a candidate for an egress filter), you're then stuck in a tug-of-war as to whether to build a new caching component as middleware, as an egress filter, or (most likely, in order to compete) both. Robert Brewer fuman...@aminus.org
Re: [Web-SIG] PEP 444 / WSGI 2 Async
2011/1/6 Alex Grönholm alex.gronh...@nextday.fi 06.01.2011 20:02, Eric Larson kirjoitti: At Thu, 6 Jan 2011 13:03:15 + (GMT), chris dent wrote: snip On async: I agree with some others who have suggested that maybe async should be its own thing, rather than integrated into a WSGI2. A server could choose to be WSGI2 compliant or AWSGI compliant, or both. /snip +1 After seeing some of the ideas regarding how to add async into a new version of WSGI, it isn't the specific problem the async feature addresses in terms of WSGI servers. Is the goal is to support long running connections? Are we trying to support WebSockets and other long running connection interfaces? If that is the case, async is a *technique* for handling this paradigm, but it doesn't address the real problem. There are techniques that have sounded reasonable like making available the socket such that a server can give it to the application to do something use with it (here is an example doing something similar with CherryPy - http://trac.defuze.org/browser/oss/ws4cp/ws4cp.py). The primary idea behind asynchronous servers/applications is the ability to efficiently serve a huge number of concurrent connections with a small number of threads. Asynchronous applications tend to be faster because there is less thread context switching happening in the CPU. Any application that runs on top of a web server that allocates less threads to the application than the number of connections has to be quick to respond so as not to starve the thread pool or block the event loop. This is true regardless of whether nonblocking I/O or some other technique is used. I'm a bit unclear as to how else you would do this. Care to elaborate on that? I looked at the Cherrypy code, but I couldn't yet figure that out. Since I wrote that piece of code, I guess I ought to chime in. First of all, the code isn't part of CherryPy, simply it's one idea to provide WebSocket to CherryPy. 
Considering WebSocket bootstraps on HTTP but once that's done, it's just a raw socket with bits and pieces on top, I wanted to find a way not to block CherryPy from serving other requests once a WebSocket handshake had been performed. The idea was simply to detach the socket from the worker thread once the handshake had been performed. Then the application had a socket at hand, and in this particular instance I simply decided to use asyncore to loop through those sockets aside from the CherryPy HTTP server. In effect, you end up with asyncore for WS sockets and CherryPy for any HTTP serving but from within one single process, using CherryPy's main loop. By and large this is not a generic solution for implementing async in WSGI but a specific example on how one can have both threads and an async loop playing together. It's merely a proof of concept :) Hope that clarifies that piece of code. -- - Sylvain http://www.defuze.org http://twitter.com/lawouach
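The detach-and-loop idea Sylvain describes can be illustrated without asyncore (which is deprecated in modern Python). The sketch below is an assumption-laden miniature: a worker thread hands its socket off via `detach`, and a tiny `select()`-based loop pass then services all detached sockets, with a trivial echo standing in for real WebSocket framing:

```python
import select
import socket

detached = []  # sockets handed over by worker threads after the handshake

def detach(sock):
    # Worker thread hands its socket off instead of blocking on it.
    sock.setblocking(False)
    detached.append(sock)

def pump_once(timeout=1.0):
    # One pass of a minimal event loop over the detached sockets,
    # standing in for the asyncore loop described above.
    readable, _, _ = select.select(detached, [], [], timeout)
    serviced = 0
    for sock in readable:
        data = sock.recv(4096)
        if data:
            sock.sendall(data)  # trivial echo in place of WS framing
            serviced += 1
        else:                   # peer closed: drop the socket
            detached.remove(sock)
            sock.close()
    return serviced
```

In the real setup, `pump_once` would run forever in its own thread alongside the HTTP server's thread pool, which is the threads-plus-async-loop coexistence being described.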
Re: [Web-SIG] PEP 444 Goals
On 2011-01-06 13:06:36 -0800, James Y Knight said: On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote: :: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers. Thus pipelining, chunked encoding, et. al. as per the HTTP 1.1 RFC. Requirements on the HTTP compliance of the server don't really have any place in the WSGI spec. You should be able to be WSGI compliant even if you don't use the HTTP transport at all (e.g. maybe you just send around requests via SCGI). The original spec got this right: chunking etc are something which is not relevant to the wsgi application code -- it is up to the server to implement the HTTP transport according to the HTTP spec, if it's purporting to be an HTTP server. Chunking is actually quite relevant to the specification, as WSGI and PEP 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for chunked bodies regardless of higher-level support for chunking. The body iterator. Previously you /had/ to define a length; with chunked encoding at the server level, you don't. I agree, however, that not all gateways will be able to implement the relevant HTTP/1.1 features. FastCGI does; SCGI, after a quick Google search, seems to support it as well. I should re-word it as: For those servers capable of HTTP/1.1 features the implementation of such features is required. +1 - Alice.
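For concreteness, the chunked transfer encoding under discussion frames each piece of the body iterator with its length in hexadecimal, terminated by a zero-length chunk, which is why no up-front Content-Length is needed. A minimal framing generator:

```python
def chunked(body_iter):
    # Frame an iterable of byte chunks as an HTTP/1.1 chunked-encoded
    # stream: hex length, CRLF, data, CRLF, ending with a zero chunk.
    for chunk in body_iter:
        if chunk:  # a zero-length chunk would terminate the stream early
            yield ("%x\r\n" % len(chunk)).encode("ascii") + chunk + b"\r\n"
    yield b"0\r\n\r\n"
```

This is the server-level transform Alice refers to: the application's body iterator stays length-agnostic, and the server applies (or skips) this framing depending on what the client and transport support.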
Re: [Web-SIG] PEP 444 Goals
On Jan 6, 2011, at 4:56 PM, Alice Bevan–McGregor wrote: On 2011-01-06 13:06:36 -0800, James Y Knight said: On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote: :: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers. Thus pipelining, chunked encoding, et. al. as per the HTTP 1.1 RFC. Requirements on the HTTP compliance of the server don't really have any place in the WSGI spec. You should be able to be WSGI compliant even if you don't use the HTTP transport at all (e.g. maybe you just send around requests via SCGI). The original spec got this right: chunking etc are something which is not relevant to the wsgi application code -- it is up to the server to implement the HTTP transport according to the HTTP spec, if it's purporting to be an HTTP server. Chunking is actually quite relevant to the specification, as WSGI and PEP 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for chunked bodies regardless of higher-level support for chunking. The body iterator. Previously you /had/ to define a length, with chunked encoding at the server level, you don't. No you don't -- HTTP 1.0 allows indeterminate-length output. The server simply must close the connection to indicate the end of the response if either the client version is HTTP/1.0, or the server doesn't implement HTTP/1.1. James
Re: [Web-SIG] PEP 444 Goals
One other comment about HTTP/1.1 features. You will always be battling to have some HTTP/1.1 features work in a controllable way. This is because WSGI gateways/adapters aren't often directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI, AJP, CGI etc. In this sort of situation you are at the mercy of what the modules implementing those protocols do, or even are hamstrung by how those protocols work. The classic example is 100-continue processing. This simply cannot work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting mechanisms where proxying is performed as the protocol being used doesn't implement a notion of end to end signalling in respect of 100-continue. The current WSGI specification acknowledges that by saying: Servers and gateways that implement HTTP 1.1 must provide transparent support for HTTP 1.1's expect/continue mechanism. This may be done in any of several ways: * Respond to requests containing an Expect: 100-continue request with an immediate 100 Continue response, and proceed normally. * Proceed with the request normally, but provide the application with a wsgi.input stream that will send the 100 Continue response if/when the application first attempts to read from the input stream. The read request must then remain blocked until the client responds. * Wait until the client decides that the server does not support expect/continue, and sends the request body on its own. (This is suboptimal, and is not recommended.) If you are going to try and push for full visibility of HTTP/1.1 and an ability to control it at the application level then you will fail with 100-continue to start with. 
So, although option 2 above would be the most ideal and is giving the application control, specifically the ability to send an error response based on request headers alone, and with reading the request triggering the 100-continue, it isn't practical to require it, as the majority of hosting mechanisms for WSGI wouldn't even be able to implement it that way. The same goes for any other feature; there is no point mandating a feature that can only be realistically implemented on a minority of implementations. This would be even worse where dependence on such a feature would mean that the WSGI application would no longer be portable to another WSGI server, and destroys the notion that WSGI provides a portable interface. This isn't just restricted to HTTP/1.1 features either, but also applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers that are directly hooked into the URL parsing of the base HTTP server can provide that information, which basically means that only pure Python HTTP/WSGI servers are likely able to provide it without guessing, and in that case such servers are usually used where the WSGI application is mounted at root anyway. Graham On 7 January 2011 09:29, Graham Dumpleton graham.dumple...@gmail.com wrote: On 7 January 2011 08:56, Alice Bevan–McGregor al...@gothcandy.com wrote: On 2011-01-06 13:06:36 -0800, James Y Knight said: On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote: :: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers. Thus pipelining, chunked encoding, et. al. as per the HTTP 1.1 RFC. Requirements on the HTTP compliance of the server don't really have any place in the WSGI spec. You should be able to be WSGI compliant even if you don't use the HTTP transport at all (e.g. maybe you just send around requests via SCGI).
The original spec got this right: chunking etc are something which is not relevant to the wsgi application code -- it is up to the server to implement the HTTP transport according to the HTTP spec, if it's purporting to be an HTTP server. Chunking is actually quite relevant to the specification, as WSGI and PEP 444 / WSGI 2 (damn, that's getting tedious to keep dual-typing ;) allow for chunked bodies regardless of higher-level support for chunking. The body iterator. Previously you /had/ to define a length, with chunked encoding at the server level, you don't. I agree, however, that not all gateways will be able to implement the relevant HTTP/1.1 features. FastCGI does, SCGI after a quick Google search, seems to support it as well. I should re-word it as: For those servers capable of HTTP/1.1 features the implementation of such features is required. I would question whether FASTCGI, SCGI or AJP support the concept of chunking of responses to the extent that the application can prepare the final content including chunks as required by the HTTP specification. Further, in Apache at least, the output from a web application served via those protocols is still pushed through the Apache output filter chain so as to allow the filters to modify the response, eg., apply compression using mod_deflate. As a consequence, the standard HTTP 'CHUNK' output filter
Re: [Web-SIG] PEP 444 / WSGI 2 Async
06.01.2011 23:11, Sylvain Hellegouarch kirjoitti: 2011/1/6 Alex Grönholm alex.gronh...@nextday.fi mailto:alex.gronh...@nextday.fi 06.01.2011 20:02, Eric Larson kirjoitti: At Thu, 6 Jan 2011 13:03:15 + (GMT), chris dent wrote: snip On async: I agree with some others who have suggested that maybe async should be its own thing, rather than integrated into a WSGI2. A server could choose to be WSGI2 compliant or AWSGI compliant, or both. /snip +1 After seeing some of the ideas regarding how to add async into a new version of WSGI, it isn't the specific problem the async feature addresses in terms of WSGI servers. Is the goal is to support long running connections? Are we trying to support WebSockets and other long running connection interfaces? If that is the case, async is a *technique* for handling this paradigm, but it doesn't address the real problem. There are techniques that have sounded reasonable like making available the socket such that a server can give it to the application to do something use with it (here is an example doing something similar with CherryPy - http://trac.defuze.org/browser/oss/ws4cp/ws4cp.py). The primary idea behind asynchronous servers/applications is the ability to efficiently serve a huge number of concurrent connections with a small number of threads. Asynchronous applications tend to be faster because there is less thread context switching happening in the CPU. Any application that runs on top of a web server that allocates less threads to the application than the number of connections has to be quick to respond so as not to starve the thread pool or block the event loop. This is true regardless of whether nonblocking I/O or some other technique is used. I'm a bit unclear as to how else you would do this. Care to elaborate on that? I looked at the Cherrypy code, but I couldn't yet figure that out. Since I wrote that piece of code, I guess I ought to chime in. 
First of all, the code isn't part of CherryPy, simply it's one idea to provide WebSocket to CherryPy. Considering WebSocket bootstraps on HTTP but once that's done, it's just a raw socket with bits and pieces on top, I wanted to find a way not to block CherryPy from serving other requests once a WebSocket handshake had been performed. The idea was simply to detach the socket from the worker thread once the handshake had been performed. Then the application had a socket at hand, and in this particular instance I simply decided to use asyncore to loop through those sockets aside from the CherryPy HTTP server. In effect, you end up with asyncore for WS sockets and CherryPy for any HTTP serving but from within one single process, using CherryPy's main loop. By and large this is not a generic solution for implementing async in WSGI but a specific example on how one can have both threads and an async loop playing together. It's merely a proof of concept :) Yes, this is how I figured it too. In the end, what really matters is that code that doesn't get a dedicated thread has to be designed a little differently. The purpose of this discussion is to come up with a standard interface for such applications. I'd also like to explore the possibility of incorporating such a mechanism in PEP 444, provided that it does not complicate the implementation too much. Otherwise, a separate specification may be necessary. Hope that clarifies that piece of code. -- - Sylvain http://www.defuze.org http://twitter.com/lawouach
Re: [Web-SIG] PEP 444 Goals
07.01.2011 01:14, Graham Dumpleton kirjoitti: One other comment about HTTP/1.1 features. You will always be battling to have some HTTP/1.1 features work in a controllable way. This is because WSGI gateways/adapters aren't often directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI, AJP, CGI etc. In this sort of situation you are at the mercy of what the modules implementing those protocols do, or even are hamstrung by how those protocols work. The classic example is 100-continue processing. This simply cannot work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting mechanisms where proxying is performed as the protocol being used doesn't implement a notion of end to end signalling in respect of 100-continue. I think we need some concrete examples to figure out what is and isn't possible with WSGI 1.0.1. My motivation for participating in this discussion can be summed up in that I want the following two applications to work properly: - PlasmaDS (Flex Messaging implementation) - WebDAV The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS. Interoperability with the existing implementation requires that both the request and response use chunked transfer encoding, to achieve bidirectional streaming. I don't really care how this happens, I just want to make sure that there is nothing preventing it. The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102): The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code /SHOULD/ only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server /SHOULD/ return a 102 (Processing) response. 
The server /MUST/ send a final response after the request has been completed. Again, I don't care how this is done as long as it's possible. The current WSGI specification acknowledges that by saying: Servers and gateways that implement HTTP 1.1 must provide transparent support for HTTP 1.1's expect/continue mechanism. This may be done in any of several ways:
* Respond to requests containing an Expect: 100-continue request with an immediate 100 Continue response, and proceed normally.
* Proceed with the request normally, but provide the application with a wsgi.input stream that will send the 100 Continue response if/when the application first attempts to read from the input stream. The read request must then remain blocked until the client responds.
* Wait until the client decides that the server does not support expect/continue, and sends the request body on its own. (This is suboptimal, and is not recommended.)

If you are going to try and push for full visibility of HTTP/1.1 and an ability to control it at the application level then you will fail with 100-continue to start with. So, although option 2 above would be the most ideal and is giving the application control, specifically the ability to send an error response based on request headers alone, with reading of the request triggering the 100-continue, it isn't practical to require it, as the majority of hosting mechanisms for WSGI wouldn't even be able to implement it that way. The same goes for any other feature: there is no point mandating a feature that can only be realistically implemented on a minority of implementations. This would be even worse where dependence on such a feature would mean that the WSGI application would no longer be portable to another WSGI server, which destroys the notion that WSGI provides a portable interface. This isn't just restricted to HTTP/1.1 features either, but also applies to raw SCRIPT_NAME and PATH_INFO as well.
Only WSGI servers that are directly hooked into the URL parsing of the base HTTP server can provide that information, which basically means that only pure Python HTTP/WSGI servers are likely able to provide it without guessing, and in that case such servers are usually used with the WSGI application mounted at the root anyway. Graham On 7 January 2011 09:29, Graham Dumpleton graham.dumple...@gmail.com wrote: On 7 January 2011 08:56, Alice Bevan–McGregor al...@gothcandy.com wrote: On 2011-01-06 13:06:36 -0800, James Y Knight said: On Jan 6, 2011, at 3:52 PM, Alice Bevan–McGregor wrote: :: Making optional (and thus rarely-implemented) features non-optional. E.g. server support for HTTP/1.1 with clarifications for interfacing applications to 1.1 servers. Thus pipelining, chunked encoding, et al. as per the HTTP 1.1 RFC. Requirements on the HTTP compliance of the server don't really have any place in the WSGI spec. You should be able to be WSGI compliant even if you don't use the HTTP transport at all (e.g. maybe you just send around requests via SCGI). The original spec
Re: [Web-SIG] PEP 444 Goals
2011/1/7 Alex Grönholm alex.gronh...@nextday.fi: On 07.01.2011 01:14, Graham Dumpleton wrote: One other comment about HTTP/1.1 features. You will always be battling to have some HTTP/1.1 features work in a controllable way. This is because WSGI gateways/adapters aren't often directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI, AJP, CGI etc. In this sort of situation you are at the mercy of what the modules implementing those protocols do, or even are hamstrung by how those protocols work. The classic example is 100-continue processing. This simply cannot work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting mechanisms where proxying is performed as the protocol being used doesn't implement a notion of end to end signalling in respect of 100-continue. I think we need some concrete examples to figure out what is and isn't possible with WSGI 1.0.1. My motivation for participating in this discussion can be summed up in that I want the following two applications to work properly: - PlasmaDS (Flex Messaging implementation) - WebDAV The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS. Interoperability with the existing implementation requires that both the request and response use chunked transfer encoding, to achieve bidirectional streaming. I don't really care how this happens, I just want to make sure that there is nothing preventing it. That can only be done by changing the rules around how wsgi.input is used. I'll try and find a reference to where I have posted information about this before, otherwise I'll write something up again about it. The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102): The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it.
This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed. That I don't offhand see a way of being able to do as protocols like SCGI and CGI definitely don't allow interim status. I am suspecting that FASTCGI and AJP don't allow it either. I'll have to even do some digging as to how you would even handle that in Apache with a normal Apache handler. Graham Again, I don't care how this is done as long as it's possible. The current WSGI specification acknowledges that by saying: Servers and gateways that implement HTTP 1.1 must provide transparent support for HTTP 1.1's expect/continue mechanism. This may be done in any of several ways: * Respond to requests containing an Expect: 100-continue request with an immediate 100 Continue response, and proceed normally. * Proceed with the request normally, but provide the application with a wsgi.input stream that will send the 100 Continue response if/when the application first attempts to read from the input stream. The read request must then remain blocked until the client responds. * Wait until the client decides that the server does not support expect/continue, and sends the request body on its own. (This is suboptimal, and is not recommended.) If you are going to try and push for full visibility of HTTP/1.1 and an ability to control it at the application level then you will fail with 100-continue to start with. 
So, although option 2 above would be the most ideal and is giving the application control, specifically the ability to send an error response based on request headers alone, and with reading the response and triggering the 100-continue, it isn't practical to require it, as the majority of hosting mechanisms for WSGI wouldn't even be able to implement it that way. The same goes for any other feature, there is no point mandating a feature that can only be realistically implementing on a minority of implementations. This would be even worse where dependence on such a feature would mean that the WSGI application would no longer be portable to another WSGI server and destroys the notion that WSGI provides a portable interface. This isn't just restricted to HTTP/1.1 features either, but also applies to raw SCRIPT_NAME and PATH_INFO as well. Only WSGI servers that are directly hooked into the URL parsing of the base HTTP server can provide that information, which basically means that only pure Python HTTP/WSGI servers are likely able to provide it without guessing, and in that case such servers usually are always used where WSGI application mounted at root anyway. Graham On 7 January 2011 09:29, Graham Dumpleton graham.dumple...@gmail.com wrote: On 7 January 2011 08:56, Alice Bevan–McGregor al...@gothcandy.com
Re: [Web-SIG] PEP 444 Goals
2011/1/7 Graham Dumpleton graham.dumple...@gmail.com: 2011/1/7 Alex Grönholm alex.gronh...@nextday.fi: 07.01.2011 01:14, Graham Dumpleton kirjoitti: One other comment about HTTP/1.1 features. You will always be battling to have some HTTP/1.1 features work in a controllable way. This is because WSGI gateways/adapters aren't often directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI, AJP, CGI etc. In this sort of situation you are at the mercy of what the modules implementing those protocols do, or even are hamstrung by how those protocols work. The classic example is 100-continue processing. This simply cannot work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting mechanisms where proxying is performed as the protocol being used doesn't implement a notion of end to end signalling in respect of 100-continue. I think we need some concrete examples to figure out what is and isn't possible with WSGI 1.0.1. My motivation for participating in this discussion can be summed up in that I want the following two applications to work properly: - PlasmaDS (Flex Messaging implementation) - WebDAV The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS. Interoperability with the existing implementation requires that both the request and response use chunked transfer encoding, to achieve bidirectional streaming. I don't really care how this happens, I just want to make sure that there is nothing preventing it. That can only be done by changing the rules around wsgi.input is used. I'll try and find a reference to where I have posted information about this before, otherwise I'll write something up again about it. BTW, even if WSGI specification were changed to allow handling of chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or mod_wsgi daemon mode. Also not likely to work on uWSGI either. 
This is because all of these work on the expectation that the complete request body can be written across to the separate application process before actually reading the response from the application. In other words, both-way streaming is not possible. The only solution which would allow this with Apache is mod_wsgi embedded mode, which in mod_wsgi 3.X already has an optional feature which can be enabled so as to allow you to step out of the current bounds of the WSGI specification and use wsgi.input, as I will explain, to do this both-way streaming. Pure Python HTTP/WSGI servers which are front facing could also be modified to handle this if the WSGI specification were changed, but whether those same servers will work if put behind a web proxy will depend on how the front end web proxy works. Graham The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102): The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed. That I don't offhand see a way of being able to do as protocols like SCGI and CGI definitely don't allow interim status. I am suspecting that FASTCGI and AJP don't allow it either. I'll have to even do some digging as to how you would even handle that in Apache with a normal Apache handler. Graham Again, I don't care how this is done as long as it's possible. The current WSGI specification acknowledges that by saying: Servers and gateways that implement HTTP 1.1 must provide transparent support for HTTP 1.1's expect/continue mechanism.
This may be done in any of several ways: * Respond to requests containing an Expect: 100-continue request with an immediate 100 Continue response, and proceed normally. * Proceed with the request normally, but provide the application with a wsgi.input stream that will send the 100 Continue response if/when the application first attempts to read from the input stream. The read request must then remain blocked until the client responds. * Wait until the client decides that the server does not support expect/continue, and sends the request body on its own. (This is suboptimal, and is not recommended.) If you are going to try and push for full visibility of HTTP/1.1 and an ability to control it at the application level then you will fail with 100-continue to start with. So, although option 2 above would be the most ideal and is giving the application control, specifically the ability to send an error response based on request headers alone, and with reading the response and triggering the 100-continue, it isn't practical to require it, as the
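The kind of both-way streaming Graham describes can be illustrated with a WSGI application that reads the request body incrementally while yielding response chunks. This is a sketch of the usage pattern the spec change would have to permit, not an endorsement that any particular host supports it; as the discussion notes, under FASTCGI/SCGI/AJP/CGI the whole body is proxied across first, so the echo loop would only begin after the request has fully arrived.

```python
def echo_stream(environ, start_response):
    """Echoes the request body back chunk by chunk, i.e. both-way
    streaming. Works as a true stream only on hosts that deliver
    wsgi.input data before the full body has arrived (e.g. the
    optional mod_wsgi embedded-mode feature mentioned above)."""
    start_response('200 OK',
                   [('Content-Type', 'application/octet-stream')])

    def body():
        inp = environ['wsgi.input']
        while True:
            chunk = inp.read(8192)  # may block until the client sends more
            if not chunk:
                break
            yield chunk
    return body()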
Re: [Web-SIG] PEP 444 Goals
07.01.2011 04:09, Graham Dumpleton kirjoitti: 2011/1/7 Graham Dumpletongraham.dumple...@gmail.com: 2011/1/7 Alex Grönholmalex.gronh...@nextday.fi: 07.01.2011 01:14, Graham Dumpleton kirjoitti: One other comment about HTTP/1.1 features. You will always be battling to have some HTTP/1.1 features work in a controllable way. This is because WSGI gateways/adapters aren't often directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI, AJP, CGI etc. In this sort of situation you are at the mercy of what the modules implementing those protocols do, or even are hamstrung by how those protocols work. The classic example is 100-continue processing. This simply cannot work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting mechanisms where proxying is performed as the protocol being used doesn't implement a notion of end to end signalling in respect of 100-continue. I think we need some concrete examples to figure out what is and isn't possible with WSGI 1.0.1. My motivation for participating in this discussion can be summed up in that I want the following two applications to work properly: - PlasmaDS (Flex Messaging implementation) - WebDAV The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS. Interoperability with the existing implementation requires that both the request and response use chunked transfer encoding, to achieve bidirectional streaming. I don't really care how this happens, I just want to make sure that there is nothing preventing it. That can only be done by changing the rules around wsgi.input is used. I'll try and find a reference to where I have posted information about this before, otherwise I'll write something up again about it. BTW, even if WSGI specification were changed to allow handling of chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or mod_wsgi daemon mode. Also not likely to work on uWSGI either. 
This is because all of these work on the expectation that the complete request body can be written across to the separate application process before actually reading the response from the application. In other words, both way streaming is not possible. The only solution which would allow this with Apache is mod_wsgi embedded mode, which in mod_wsgi 3.X already has an optional feature which can be enabled so as to allow you to step out of current bounds of the WSGI specification and use wsgi.input as I will explain, to do this both way streaming. Pure Python HTTP/WSGI servers which are a front facing server could also be modified to handle this is WSGI specification were changed, but whether those same will work if put behind a web proxy will depend on how the front end web proxy works. Then I suppose this needs to be standardized in PEP 444, wouldn't you agree? Graham The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102): The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed. That I don't offhand see a way of being able to do as protocols like SCGI and CGI definitely don't allow interim status. I am suspecting that FASTCGI and AJP don't allow it either. I'll have to even do some digging as to how you would even handle that in Apache with a normal Apache handler. Graham Again, I don't care how this is done as long as it's possible. 
The current WSGI specification acknowledges that by saying: Servers and gateways that implement HTTP 1.1 must provide transparent support for HTTP 1.1's expect/continue mechanism. This may be done in any of several ways: * Respond to requests containing an Expect: 100-continue request with an immediate 100 Continue response, and proceed normally. * Proceed with the request normally, but provide the application with a wsgi.input stream that will send the 100 Continue response if/when the application first attempts to read from the input stream. The read request must then remain blocked until the client responds. * Wait until the client decides that the server does not support expect/continue, and sends the request body on its own. (This is suboptimal, and is not recommended.) If you are going to try and push for full visibility of HTTP/1.1 and an ability to control it at the application level then you will fail with 100-continue to start with. So, although option 2 above would be the most ideal and is giving the application control, specifically the ability to send an error response based on request headers alone, and with reading the response and triggering
Re: [Web-SIG] PEP 444 Goals
2011/1/7 Alex Grönholm alex.gronh...@nextday.fi: 07.01.2011 04:09, Graham Dumpleton kirjoitti: 2011/1/7 Graham Dumpletongraham.dumple...@gmail.com: 2011/1/7 Alex Grönholmalex.gronh...@nextday.fi: 07.01.2011 01:14, Graham Dumpleton kirjoitti: One other comment about HTTP/1.1 features. You will always be battling to have some HTTP/1.1 features work in a controllable way. This is because WSGI gateways/adapters aren't often directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI, AJP, CGI etc. In this sort of situation you are at the mercy of what the modules implementing those protocols do, or even are hamstrung by how those protocols work. The classic example is 100-continue processing. This simply cannot work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting mechanisms where proxying is performed as the protocol being used doesn't implement a notion of end to end signalling in respect of 100-continue. I think we need some concrete examples to figure out what is and isn't possible with WSGI 1.0.1. My motivation for participating in this discussion can be summed up in that I want the following two applications to work properly: - PlasmaDS (Flex Messaging implementation) - WebDAV The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS. Interoperability with the existing implementation requires that both the request and response use chunked transfer encoding, to achieve bidirectional streaming. I don't really care how this happens, I just want to make sure that there is nothing preventing it. That can only be done by changing the rules around wsgi.input is used. I'll try and find a reference to where I have posted information about this before, otherwise I'll write something up again about it. BTW, even if WSGI specification were changed to allow handling of chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or mod_wsgi daemon mode. Also not likely to work on uWSGI either. 
This is because all of these work on the expectation that the complete request body can be written across to the separate application process before actually reading the response from the application. In other words, both-way streaming is not possible. The only solution which would allow this with Apache is mod_wsgi embedded mode, which in mod_wsgi 3.X already has an optional feature which can be enabled so as to allow you to step out of the current bounds of the WSGI specification and use wsgi.input, as I will explain, to do this both-way streaming. Pure Python HTTP/WSGI servers which are front facing could also be modified to handle this if the WSGI specification were changed, but whether those same servers will work if put behind a web proxy will depend on how the front end web proxy works. Then I suppose this needs to be standardized in PEP 444, wouldn't you agree? Huh! Not sure you understand what I am saying. Even if you changed the WSGI specification to allow for it, the bulk of implementations wouldn't be able to support it. The WSGI specification has no influence over distinct protocols such as FASTCGI, SCGI, AJP or CGI or proxy implementations and so can't be used to force them to be changed. So, as much as I would like to see the WSGI specification changed to allow it, others may not on the basis that there is no point if few implementations could support it. Graham Graham The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102): The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response.
The server MUST send a final response after the request has been completed. That I don't offhand see a way of being able to do as protocols like SCGI and CGI definitely don't allow interim status. I am suspecting that FASTCGI and AJP don't allow it either. I'll have to even do some digging as to how you would even handle that in Apache with a normal Apache handler. Graham Again, I don't care how this is done as long as it's possible. The current WSGI specification acknowledges that by saying: Servers and gateways that implement HTTP 1.1 must provide transparent support for HTTP 1.1's expect/continue mechanism. This may be done in any of several ways: * Respond to requests containing an Expect: 100-continue request with an immediate 100 Continue response, and proceed normally. * Proceed with the request normally, but provide the application with a wsgi.input stream that will send the 100 Continue response if/when the application first attempts to read from the
Re: [Web-SIG] PEP 444 Goals
On Jan 6, 2011, at 7:46 PM, Alex Grönholm wrote: The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102): The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed. Again, I don't care how this is done as long as it's possible.

This pretty much has to be generated by the server implementation. One thing that could be done in WSGI is a callback function inserted into the environ to suggest to the server that it generate a certain 1xx response. That is, something like:

    if 'wsgi.intermediate_response' in environ:
        environ['wsgi.intermediate_response'](102, {'Random-Header': 'Whatever'})

If a server implements this, it should probably ignore any requests from the app to send a 100 or 101 response. The server should be free to ignore the request, or not implement it. Given that the only actual use case (WebDAV) is rather rare and marks it as a SHOULD, I don't see any real practical issues with it being optional. The other thing that could be done is simply have a server-side configuration to allow sending 102 after *any* request takes 20 seconds to process. That wouldn't require any changes to WSGI.
I'd note that HTTP/1.1 clients are *required* to be able to handle any number of 1xx responses followed by a final response, so it's supposed to be perfectly safe for a server to always send a 102 as a response to any request, no matter what the app is, or what client user-agent is (so long as it advertised HTTP/1.1), or even whether the resource has anything to do with WebDAV. Of course, I'm willing to bet that's patently false back here in the Real World -- no doubt plenty of HTTP/1.1 clients incorrectly barf on 1xx responses. James
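The server side of James's proposed `wsgi.intermediate_response` callback could look something like the sketch below. Everything here is hypothetical: `write_raw` stands for whatever hook a given server uses to write raw bytes to the client connection, and no existing WSGI server defines this key. Per the suggestion above, requests to send 100 or 101 are ignored, since the server owns those for its own protocol handling.

```python
# Minimal reason phrases for the interim statuses this sketch cares about.
REASONS = {102: 'Processing'}

def make_intermediate_response(write_raw):
    """Builds the environ callback proposed above. `write_raw` is a
    hypothetical server hook (e.g. conn.sendall) that writes bytes
    straight to the client; it is not part of any standard WSGI API."""
    def intermediate_response(status, headers):
        if status in (100, 101):
            return  # reserved for the server's own expect/upgrade handling
        lines = ['HTTP/1.1 %d %s' % (status, REASONS.get(status, ''))]
        lines.extend('%s: %s' % item for item in headers.items())
        # An interim response is a status line plus headers and a blank
        # line; the final response still follows later on the same
        # connection.
        write_raw(('\r\n'.join(lines) + '\r\n\r\n').encode('latin-1'))
    return intermediate_response

# A server would expose it per request, e.g.:
#   environ['wsgi.intermediate_response'] = make_intermediate_response(conn.sendall)
```

Keeping it behind an `in environ` check, as in James's snippet, preserves the optionality he argues for: servers that cannot emit interim responses (CGI, SCGI, etc.) simply omit the key.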
Re: [Web-SIG] PEP 444 Goals
2011/1/7 James Y Knight f...@fuhm.net: On Jan 6, 2011, at 7:46 PM, Alex Grönholm wrote: The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102): The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the serverSHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed. Again, I don't care how this is done as long as it's possible. This pretty much has to be generated by the server implementation. One thing that could be done in WSGI is a callback function inserted into the environ to suggest to the server that it generate a certain 1xx response. That is, something like: if 'wsgi.intermediate_response' in environ: environ['wsgi.intermediate_response'](102, {'Random-Header': 'Whatever'}) If a server implements this, it should probably ignore any requests from the app to send a 100 or 101 response. The server should be free to ignore the request, or not implement it. Given that the only actual use case (WebDAV) is rather rare and marks it as a SHOULD, I don't see any real practical issues with it being optional. The other thing that could be done is simply have a server-side configuration to allow sending 102 after *any* request takes 20 seconds to process. That wouldn't require any changes to WSGI. 
I'd note that HTTP/1.1 clients are *required* to be able to handle any number of 1xx responses followed by a final response, so it's supposed to be perfectly safe for a server to always send a 102 as a response to any request, no matter what the app is, or what client user-agent is (so long as it advertised HTTP/1.1), or even whether the resource has anything to do with WebDAV. Of course, I'm willing to bet that's patently false back here in the Real World -- no doubt plenty of HTTP/1.1 clients incorrectly barf on 1xx responses. FWIW, Apache provides ap_send_interim_response() to allow interim status. This is used by mod_proxy, but nowhere else in Apache core code. So, you would be fine if proxying to a pure Python HTTP/WSGI server which could generate interim responses, but would be out of luck with FASTCGI, SCGI, AJP, CGI and any modules which do custom proxying using their own protocol, such as uWSGI or mod_wsgi daemon mode. In all the latter, the wire protocols for the proxy connection would themselves need to be modified, as well as the module implementation, which isn't going to happen for any of those which are generic protocols. Graham
Re: [Web-SIG] PEP 444 Goals
07.01.2011 04:55, Graham Dumpleton kirjoitti: 2011/1/7 Alex Grönholmalex.gronh...@nextday.fi: 07.01.2011 04:09, Graham Dumpleton kirjoitti: 2011/1/7 Graham Dumpletongraham.dumple...@gmail.com: 2011/1/7 Alex Grönholmalex.gronh...@nextday.fi: 07.01.2011 01:14, Graham Dumpleton kirjoitti: One other comment about HTTP/1.1 features. You will always be battling to have some HTTP/1.1 features work in a controllable way. This is because WSGI gateways/adapters aren't often directly interfacing with the raw HTTP layer, but with FASTCGI, SCGI, AJP, CGI etc. In this sort of situation you are at the mercy of what the modules implementing those protocols do, or even are hamstrung by how those protocols work. The classic example is 100-continue processing. This simply cannot work end to end across FASTCGI, SCGI, AJP, CGI and other WSGI hosting mechanisms where proxying is performed as the protocol being used doesn't implement a notion of end to end signalling in respect of 100-continue. I think we need some concrete examples to figure out what is and isn't possible with WSGI 1.0.1. My motivation for participating in this discussion can be summed up in that I want the following two applications to work properly: - PlasmaDS (Flex Messaging implementation) - WebDAV The PlasmaDS project is the planned Python counterpart to Adobe's BlazeDS. Interoperability with the existing implementation requires that both the request and response use chunked transfer encoding, to achieve bidirectional streaming. I don't really care how this happens, I just want to make sure that there is nothing preventing it. That can only be done by changing the rules around wsgi.input is used. I'll try and find a reference to where I have posted information about this before, otherwise I'll write something up again about it. BTW, even if WSGI specification were changed to allow handling of chunked requests, it would not work for FASTCGI, SCGI, AJP, CGI or mod_wsgi daemon mode. 
Also not likely to work on uWSGI either. This is because all of these work on the expectation that the complete request body can be written across to the separate application process before actually reading the response from the application. In other words, both-way streaming is not possible. The only solution which would allow this with Apache is mod_wsgi embedded mode, which in mod_wsgi 3.X already has an optional feature which can be enabled so as to allow you to step out of the current bounds of the WSGI specification and use wsgi.input, as I will explain, to do this both-way streaming. Pure Python HTTP/WSGI servers which are front facing could also be modified to handle this if the WSGI specification were changed, but whether those same servers will work if put behind a web proxy will depend on how the front end web proxy works. Then I suppose this needs to be standardized in PEP 444, wouldn't you agree? Huh! Not sure you understand what I am saying. Even if you changed the WSGI specification to allow for it, the bulk of implementations wouldn't be able to support it. The WSGI specification has no influence over distinct protocols such as FASTCGI, SCGI, AJP or CGI or proxy implementations and so can't be used to force them to be changed. I believe I understand what you are saying, but I don't want to restrict the freedom of the developer just because of some implementations that can't support some particular feature. If you need to do streaming, use a server that supports it, obviously! If Java can do it, why can't we? I would hate having to rely on a non-standard implementation if we have the possibility to standardize this in a specification. So, as much as I would like to see the WSGI specification changed to allow it, others may not on the basis that there is no point if few implementations could support it.
Graham The WebDAV spec, on the other hand, says (http://www.webdav.org/specs/rfc2518.html#STATUS_102): The 102 (Processing) status code is an interim response used to inform the client that the server has accepted the complete request, but has not yet completed it. This status code SHOULD only be sent when the server has a reasonable expectation that the request will take significant time to complete. As guidance, if a method is taking longer than 20 seconds (a reasonable, but arbitrary value) to process the server SHOULD return a 102 (Processing) response. The server MUST send a final response after the request has been completed. That one I don't offhand see a way of supporting, as protocols like SCGI and CGI definitely don't allow interim status. I am suspecting that FASTCGI and AJP don't allow it either. I'll even have to do some digging as to how you would handle that in Apache with a normal Apache handler. Graham Again, I don't care how this is done as long as it's possible. The current WSGI specification acknowledges that by saying: Servers and gateways that implement HTTP 1.1 must provide transparent support for HTTP
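[Editor's aside] The chunked transfer coding that the PlasmaDS use case above depends on is simple framing: each chunk is its hex length, CRLF, the data, CRLF, with a zero-length chunk terminating the body. A minimal sketch of the encoder side (illustrative only, not part of any WSGI proposal):

```python
# Minimal sketch of HTTP/1.1 chunked transfer encoding framing.
# Each chunk: hex length, CRLF, payload, CRLF; a zero chunk ends it.
def encode_chunked(parts):
    out = b''
    for part in parts:
        out += b'%x\r\n%s\r\n' % (len(part), part)
    return out + b'0\r\n\r\n'

print(encode_chunked([b'Hello ', b'world!']))
# b'6\r\nHello \r\n6\r\nworld!\r\n0\r\n\r\n'
```

Because chunks are self-delimiting, a producer can stream indefinitely without knowing the total length up front, which is what makes bidirectional streaming over a single request/response pair possible in HTTP/1.1.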
Re: [Web-SIG] PEP 444 Goals
At 12:52 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote: Ignoring async for the moment, the goals of the PEP 444 rewrite are: :: Clear separation of narrative from rules to be followed. This allows developers of both servers and applications to easily run through a conformance check list. :: Isolation of examples and rationale to improve readability of the core rulesets. :: Clarification of often mis-interpreted rules from PEP 333 (and those carried over in ). :: Elimination of unintentional non-conformance, esp. re: cgi.FieldStorage. :: Massive simplification of call flow. Replacing start_response with a returned 3-tuple immensely simplifies the task of middleware that needs to capture HTTP status or manipulate (or even examine) response headers. [1] A big +1 to all the above as goals. :: Reduction of re-implementation / NIH syndrome by incorporating the most common (1%) of features most often relegated to middleware or functional helpers. Note that nearly every application-friendly feature you add will increase the burden on both server developers and middleware developers, which ironically means that application developers actually end up with fewer options. Unicode decoding of a small handful of values (CGI values that pull from the request URI) is the biggest example. [2, 3] Does that mean you plan to make the other values bytes, then? Or will they be unicode-y-bytes as well? What happens for additional server-provided variables? The PEP choice was for uniformity. At one point, I advocated simply using surrogateescape coding, but this couldn't be made uniform across Python versions and maintain compatibility. Unfortunately, even with the move to 2.6+, this problem remains, unless server providers are required to register a surrogateescape error handler -- which I'm not even sure can be done in Python 2.x. :: Cross-compatibility considerations. The definition and use of native strings vs. byte strings is the biggest example of this in the rewrite. 
I'm not sure what you mean here. Do you mean portability of WSGI 2 code samples across Python versions (esp. 2.x vs. 3.x)? ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] PEP 444 / WSGI 2 Async
At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote: Tossing the idea around all day long will then, of course, be happening regardless. Unfortunately for that particular discussion, PEP 3148 / Futures seems to have won out in the broader scope. Do any established async frameworks or servers (e.g. Twisted, Eventlet, Gevent, Tornado, etc.) make use of futures? Having a ratified and incorporated language PEP (core in 3.2 w/ compatibility package for 2.5 or 2.6+ support) reduces the scope of async discussion down to: how do we integrate futures into WSGI 2 instead of how do we define an async API at all. It would be helpful if you addressed the issue of scope, i.e., what features are you proposing to offer to the application developer. While the idea of using futures presents some intriguing possibilities, it seems to me at first glance that all it will do is move the point where the work gets done. That is, instead of simply running the app in a worker, the app will be farming out work to futures. But if this is so, then why doesn't the server just farm the apps themselves out to workers? I guess what I'm saying is, I haven't heard use cases for this from the application developer POV -- why should an app developer care about having their app run asynchronously? So far, I believe you're the second major proponent (i.e. ones with concrete proposals and/or implementations to discuss) of an async protocol... and what you have in common with the other proponent is that you happen to have written an async server that would benefit from having apps operating asynchronously. ;-) I find it hard to imagine an app developer wanting to do something asynchronously for which they would not want to use one of the big-dog asynchronous frameworks. (Especially if their app involves database access, or other communications protocols.) 
This doesn't mean I think having a futures API is a bad thing, but ISTM that a futures extension to WSGI 1 could be defined right now using an x-wsgi-org extension in that case... and you could then find out how many people are actually interested in using it. Mainly, though, what I see is people using the futures thing to shuffle off compute-intensive tasks... but if they do that, then they're basically trying to make the server's life easier... but under the existing spec, any truly async server implementing WSGI is going to run the *app* in a future of some sort already... Which means that the net result is that putting in async is like saying to the app developer: hey, you know this thing that you just could do in WSGI 1 and the server would take care of it for you? Well, now you can manage that complexity by yourself! Isn't that wonderful? ;-) I could be wrong of course, but I'd like to see what concrete use cases people have for async. We dropped the first discussion of async six years ago because someone (I think it might've been James) pointed out that, well, it isn't actually that useful. And every subsequent call for use cases since has been answered with, well, the use case is that you want it to be async. Only, that's a *server* developer's use case, not an app developer's use case... and only for a minority of server developers, at that. 
Re: [Web-SIG] PEP 444 Goals
On 2011-01-06 14:14:32 -0800, Alice Bevan–McGregor said: There was something, somewhere I was reading related to WSGI about requiring content-length... but no matter. Right, I remember now: the HTTP 1.0 specification. (Honestly not trying to sound sarcastic!) See: http://www.w3.org/Protocols/HTTP/1.0/draft-ietf-http-spec.html#Entity-Body However, after testing every browser on my system (from Links and ELinks, through Firefox, Chrome, Safari, Konqueror, and Dillo) across the following test code, I find that they all handle a missing content-length in the same way: reading the socket until it closes. http://pastie.textmate.org/1435415 - Alice. 
Re: [Web-SIG] PEP 444 / WSGI 2 Async
07.01.2011 06:49, P.J. Eby wrote: At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote: Tossing the idea around all day long will then, of course, be happening regardless. Unfortunately for that particular discussion, PEP 3148 / Futures seems to have won out in the broader scope. Do any established async frameworks or servers (e.g. Twisted, Eventlet, Gevent, Tornado, etc.) make use of futures? I understand that Twisted has incorporated futures support into their deferreds. Others, I believe, don't support them yet. You have to consider that Python 3.2 (the first Python with futures support in the stdlib) hasn't even been released yet, and it's only been two weeks since I released the drop-in backport (http://pypi.python.org/pypi/futures/2.1). Having a ratified and incorporated language PEP (core in 3.2 w/ compatibility package for 2.5 or 2.6+ support) reduces the scope of async discussion down to: how do we integrate futures into WSGI 2 instead of how do we define an async API at all. It would be helpful if you addressed the issue of scope, i.e., what features are you proposing to offer to the application developer. While the idea of using futures presents some intriguing possibilities, it seems to me at first glance that all it will do is move the point where the work gets done. That is, instead of simply running the app in a worker, the app will be farming out work to futures. But if this is so, then why doesn't the server just farm the apps themselves out to workers? I guess what I'm saying is, I haven't heard use cases for this from the application developer POV -- why should an app developer care about having their app run asynchronously? Applications need to be asynchronous to work on a single-threaded server. There is no other benefit than speed and concurrency, and having to program a web app to operate asynchronously can be a pain. 
AFAIK there is no other way if you want to avoid the context switching overhead and support a huge number of concurrent connections. Thread/process pools are only necessary in an asynchronous application where the app needs to use blocking network APIs or do heavy computation, and such uses can unfortunately present a bottleneck. It follows that it's pretty pointless to have an asynchronous application that uses a thread/process pool on every request. The goal here is to define a common API for these mutually incompatible asynchronous servers to implement so that you could one day run an asynchronous app on Twisted, Tornado, or whatever without modifications. So far, I believe you're the second major proponent (i.e. ones with concrete proposals and/or implementations to discuss) of an async protocol... and what you have in common with the other proponent is that you happen to have written an async server that would benefit from having apps operating asynchronously. ;-) I find it hard to imagine an app developer wanting to do something asynchronously for which they would not want to use one of the big-dog asynchronous frameworks. (Especially if their app involves database access, or other communications protocols.) This doesn't mean I think having a futures API is a bad thing, but ISTM that a futures extension to WSGI 1 could be defined right now using an x-wsgi-org extension in that case... and you could then find out how many people are actually interested in using it. Mainly, though, what I see is people using the futures thing to shuffle off compute-intensive tasks... but if they do that, then they're basically trying to make the server's life easier... but under the existing spec, any truly async server implementing WSGI is going to run the *app* in a future of some sort already... 
Which means that the net result is that putting in async is like saying to the app developer: hey, you know this thing that you just could do in WSGI 1 and the server would take care of it for you? Well, now you can manage that complexity by yourself! Isn't that wonderful? ;-) I could be wrong of course, but I'd like to see what concrete use cases people have for async. We dropped the first discussion of async six years ago because someone (I think it might've been James) pointed out that, well, it isn't actually that useful. And every subsequent call for use cases since has been answered with, well, the use case is that you want it to be async. Only, that's a *server* developer's use case, not an app developer's use case... and only for a minority of server developers, at that. 
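[Editor's aside] Alex's point that an async app only needs a thread pool for blocking or compute-heavy work can be sketched with the stdlib futures API he mentions. A minimal illustration (all names here are illustrative, not part of any WSGI proposal):

```python
# Sketch: an app offloads a blocking call to a thread pool so a
# (hypothetical) single-threaded server loop is not blocked.
from concurrent.futures import ThreadPoolExecutor
import time

executor = ThreadPoolExecutor(max_workers=4)

def blocking_work(n):
    time.sleep(0.01)  # stand-in for a blocking network call
    return n * 2

def handle_request(n):
    # submit() returns immediately with a Future; an event loop
    # could keep servicing other connections in the meantime.
    return executor.submit(blocking_work, n)

f = handle_request(21)
print(f.result())  # prints 42; blocks only this caller, not the loop
```

The cost Alex calls out is visible here too: result() and add_done_callback() involve cross-thread synchronization, which is the futures overhead measured in the benchmarks.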
Re: [Web-SIG] PEP 444 Goals
On Jan 7, 2011, at 12:13 AM, Alice Bevan–McGregor wrote: On 2011-01-06 14:14:32 -0800, Alice Bevan–McGregor said: There was something, somewhere I was reading related to WSGI about requiring content-length... but no matter. Right, I remember now: the HTTP 1.0 specification. (Honestly not trying to sound sarcastic!) See: http://www.w3.org/Protocols/HTTP/1.0/draft-ietf-http-spec.html#Entity-Body You've misread that section. In HTTP/1.0, *requests* were required to have a Content-Length if they had a body (HTTP 1.1 fixed that with chunked request support). Responses have never had that restriction: they have always (even since before HTTP 1.0) been allowed to omit Content-Length and terminate by closing the socket. HTTP 1.1 didn't really add any new functionality to *responses* by adding chunking, simply a bit of efficiency and error detection ability. James 
Re: [Web-SIG] PEP 444 / WSGI 2 Async
On Thu, Jan 6, 2011 at 8:49 PM, P.J. Eby p...@telecommunity.com wrote: At 05:47 PM 1/6/2011 -0800, Alice Bevan–McGregor wrote: Tossing the idea around all day long will then, of course, be happening regardless. Unfortunately for that particular discussion, PEP 3148 / Futures seems to have won out in the broader scope. Do any established async frameworks or servers (e.g. Twisted, Eventlet, Gevent, Tornado, etc.) make use of futures? PEP 3148 Futures are meant for a rather different purpose than those async frameworks. Those frameworks all are trying to minimize the number of threads using some kind of callback-based non-blocking I/O system. PEP 3148 OTOH doesn't care about that -- it uses threads or processes proudly. This is useful for a different type of application, where there are fewer, larger tasks, and the overhead of threads doesn't matter. The Monocle framework, which builds on top of Tornado or Twisted, uses something not entirely unlike Futures, though they call it Callback. I don't think the acceptance of PEP 3148 should be taken as forcing the direction that async frameworks should take. Having a ratified and incorporated language PEP (core in 3.2 w/ compatibility package for 2.5 or 2.6+ support) reduces the scope of async discussion down to: how do we integrate futures into WSGI 2 instead of how do we define an async API at all. It would be helpful if you addressed the issue of scope, i.e., what features are you proposing to offer to the application developer. While the idea of using futures presents some intriguing possibilities, it seems to me at first glance that all it will do is move the point where the work gets done. That is, instead of simply running the app in a worker, the app will be farming out work to futures. But if this is so, then why doesn't the server just farm the apps themselves out to workers? 
I guess what I'm saying is, I haven't heard use cases for this from the application developer POV -- why should an app developer care about having their app run asynchronously? So far, I believe you're the second major proponent (i.e. ones with concrete proposals and/or implementations to discuss) of an async protocol... and what you have in common with the other proponent is that you happen to have written an async server that would benefit from having apps operating asynchronously. ;-) I find it hard to imagine an app developer wanting to do something asynchronously for which they would not want to use one of the big-dog asynchronous frameworks. (Especially if their app involves database access, or other communications protocols.) This doesn't mean I think having a futures API is a bad thing, but ISTM that a futures extension to WSGI 1 could be defined right now using an x-wsgi-org extension in that case... and you could then find out how many people are actually interested in using it. Mainly, though, what I see is people using the futures thing to shuffle off compute-intensive tasks... but if they do that, then they're basically trying to make the server's life easier... but under the existing spec, any truly async server implementing WSGI is going to run the *app* in a future of some sort already... Which means that the net result is that putting in async is like saying to the app developer: hey, you know this thing that you just could do in WSGI 1 and the server would take care of it for you? Well, now you can manage that complexity by yourself! Isn't that wonderful? ;-) I could be wrong of course, but I'd like to see what concrete use cases people have for async. We dropped the first discussion of async six years ago because someone (I think it might've been James) pointed out that, well, it isn't actually that useful. And every subsequent call for use cases since has been answered with, well, the use case is that you want it to be async. 
Only, that's a *server* developer's use case, not an app developer's use case... and only for a minority of server developers, at that. -- --Guido van Rossum (python.org/~guido) 
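[Editor's aside] Earlier in the thread P.J. Eby noted that PEP 3148's overhead stems partly from where add_done_callback callbacks execute. A small demonstration of the mechanism under discussion (stdlib only):

```python
# concurrent.futures add_done_callback: the callback fires when the
# future completes -- in the worker thread if the future is still
# pending, or synchronously in the caller if it is already done.
# That cross-thread handoff is the synchronization cost discussed.
from concurrent.futures import ThreadPoolExecutor

results = []

def on_done(future):
    results.append(future.result())

with ThreadPoolExecutor(max_workers=1) as pool:
    fut = pool.submit(pow, 2, 10)
    fut.add_done_callback(on_done)
# The with-block waits for the worker to finish, so the callback
# has run by this point.
print(results)  # [1024]
```

This is the behavior Eby wanted tighter control over: the spec gives the caller no say in which thread the callback runs, so servers must synchronize defensively.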
Re: [Web-SIG] PEP 444 Goals
On 2011-01-06 21:26:32 -0800, James Y Knight said: You've misread that section. In HTTP/1.0, *requests* were required to have a Content-Length if they had a body (HTTP 1.1 fixed that with chunked request support). Responses have never had that restriction: they have always (even since before HTTP 1.0) been allowed to omit Content-Length and terminate by closing the socket. Ah ha, that explains my confusion, then! Thank you. - Alice. 
Re: [Web-SIG] PEP 444 / WSGI 2 Async
[Apologies if this is a double- or triple-post; I seem to be having a stupid number of connectivity problems today.] Howdy! Apologies for the delay in responding, it’s been a hectic start to the new year. :) On 2011-01-03, at 6:22 AM, Timothy Farrell wrote: You don't know me but I'm the author of the Rocket Web Server (http://pypi.python.org/pypi/rocket) and have, in the past, been involved in the web2py community. Like you, I'm interested in seeing web development come to Python3. I'm glad you're taking up WSGI2. I have a feature-request for it that perhaps we could work in. Of course; in fact, I hope you don’t mind that I’ve re-posted this response to the web-sig mailing list. Async needs significantly broader discussion. I would appreciate it if you could reply to the mailing list thread. I would like to see futures added as a server option. This way, controllers could dispatch emails (or run some other blocking or long-running task) that would not block the web-response. WSGI2 Servers could provide a futures executor as environ['wsgi.executor'] that the app could use to offload processes that need not complete before the web-request is served to the client. E-mail dispatch is one of the things I solved a long time ago with TurboMail; it uses a dedicated thread pool and can deliver 100 unique messages per second (more if you use BCC) in the default configuration, so I don’t really see that one use case as one that can benefit from the futures module. Updating TurboMail to use futures would be an interesting exercise. ;) I was thinking of exposing the executor as environ[‘wsgi.async.executor’], with ‘wsgi.async’ being a boolean value indicating support. What should the server do with the future instances? The executor returns future instances when running executor.submit/map; the application never generates its own Future instances. 
The application may, however, use whatever executor it sees fit; it can, for example, have one thread pool executor and one process pool, used for different tasks. The server itself can utilize any combination of single-threaded IO-based async (see further on in this message), and multi-threaded or multi-process management of WSGI requests. Resuming suspended applications (ones pending future results) is an implementation detail of the server. Should future.add_done_callback() be allowed? I'm not sure how practical/reliable this would be. (By the time the callback is called, the calling environment could be gone. Is this undefined behavior?) If you wrap your callback in a partial(my_callback, environ) the environ will survive the end of the request/response cycle (due to the incremented reference count), and should be allowed to enable intelligent behaviour in the callbacks. (Obviously the callbacks will not be able to deliver a response to the client at the time they are called; the body iterator can, however, wait for the future instance to complete and/or timeout.) A little bit later in this message I describe a better solution than the application registering its own callbacks. Do we need to also specify what type of executor is provided (threaded vs. separate process)? I think that’s an application-specific configuration issue, not really the concern of the PEP. Do you have any thoughts about this? I believe that intelligent servers need some way to ‘pause’ a WSGI worker rather than relying on the worker executing in a thread and blocking while waiting for the return value of a future. Using generator syntax (yield) with the following rules is my initial idea: * The application may yield None. This is a polite way to have the async reactor (in the WSGI server/gateway) reschedule the worker for the next reactor cycle. Useful as a hint that “I’m about to do something that may take a moment”, allowing other workers to get a chance to perform work. 
(Cooperative multi-tasking on single-threaded async servers.) * The application must yield one 3-tuple WSGI response, and must not yield additional data afterwards. This is usually the last thing the WSGI application would do, with possible cleanup code afterwards (before falling off the bottom / raising StopIteration / returning None). * The application may yield Future instances returned by environ['wsgi.executor'].submit/map; the worker will then be paused pending execution of the future; the return value of the future will be returned from the yield statement. Exceptions raised by the future will be re-raised from the yield statement and can thus be captured in a natural way. E.g.:

try:
    complex_value = yield environ['wsgi.executor'].submit(long_running)
except:
    pass  # handle exceptions generated from within long_running

Similar rules apply to the response body iterator: it yields bytestrings, may yield unicode strings where native strings are unicode strings, and
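[Editor's aside] The three yield rules proposed above can be combined into a small end-to-end sketch. The 'wsgi.executor' key, the yield protocol, and the toy driver below are all part of the proposal or invented for illustration, not ratified spec:

```python
# Sketch of an application following the proposed yield protocol:
# yield None to cooperatively reschedule, yield a Future to pause
# until it resolves, then yield the 3-tuple response.
from concurrent.futures import ThreadPoolExecutor

def slow_lookup():
    return b"Hello world!"  # stand-in for blocking work

def application(environ):
    yield None  # polite hint: about to do something expensive
    body = yield environ['wsgi.executor'].submit(slow_lookup)
    yield '200 OK', [('Content-Type', 'text/plain')], [body]

def run(app, environ):
    # Toy driver standing in for the server's reactor loop.
    gen = app(environ)
    value = None
    while True:
        item = gen.send(value)
        value = None
        if item is None:
            continue               # would reschedule next cycle
        if hasattr(item, 'result'):
            value = item.result()  # a real reactor would not block here
            continue
        return item                # the 3-tuple response

environ = {'wsgi.executor': ThreadPoolExecutor(max_workers=1)}
status, headers, body = run(application, environ)
print(status, body[0].decode())  # 200 OK Hello world!
```

Note how the Future's return value comes back as the result of the yield expression, exactly as the rules above describe; a raised exception would propagate out of the same yield.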
Re: [Web-SIG] PEP 444 / WSGI 2 Async
Alex Grönholm and I have been discussing async implementation details (and other areas of PEP 444) for some time on IRC. Below is the cleaned up log transcriptions with additional notes where needed. Note: The logs are in mixed chronological order — discussion of one topic is chronological, potentially spread across days, but separate topics may jump around a bit in time. Because of this I have eliminated the timestamps as they add nothing to the discussion. Dialogue in square brackets indicates text added after-the-fact for clarity. Topics are separated by three hyphens. Backslashes indicate joined lines. This should give a fairly comprehensive explanation of the rationale behind some decisions in the rewrite; a version of these conversations (in narrative style vs. discussion) will be added to the rewrite Real Soon Now™ under the Rationale section. — Alice. --- General agronholm: my greatest fear is that a standard is adopted that does not solve existing problems GothAlice: [Are] there any guarantees as to which thread / process a callback [from the future instance] will be executed in? --- 444 vs. agronholm: what new features does pep 444 propose to add to pep ? \ async, filters, no buffering? GothAlice: Async, filters, no server-level buffering, native string usage, the definition of byte string as the format returned by socket read (which, on Java, is unicode!), and the allowance for returned data to be Latin1 Unicode. \ All of this together will allow a '''def hello(environ): return '200 OK', [], ['Hello world!']''' example application to work across Python versions without modification (or use of a b prefix) agronholm: why the special casing for latin1 btw? is that an http thing? GothAlice: Latin1 = \u0000 → \u00FF — it's one of the only formats that can be decoded while preserving raw bytes, and, if another encoding is needed, transcoded safely. \ Effectively requiring Latin1 for unicode output ensures single byte conformance on the data. 
\ If an application needs to return UTF-8, for example, it can return an encoded UTF-8 bytestream, which will be passed right through. --- Filters agronholm: regarding middleware, you did have a point there -- exception handling would be pretty difficult with ingress/egress filters GothAlice: Yup. It's pretty much a do-or-die scenario in filter-land. agronholm: but if we're not ditching middleware, I wonder about the overall benefits of filtering \ it surely complicates the scenario so it'd better be worth it \ I don't so much agree with your reasoning that [middleware] complicates debugging \ I don't see any obvious performance improvements either (over middleware) GothAlice: Simplified debugging of your application w/ reduced stack to sort through, reduced nested stack overhead (memory allocation improvement), clearer separation of tasks (egress compression is a good example). This follows several of the Zen of Python guidelines: \ Simple is better than complex. \ Flat is better than nested. \ There should be one-- and preferably only one --obvious way to do it. \ If the implementation is hard to explain, it's a bad idea. \ If the implementation is easy to explain, it may be a good idea. agronholm: I would think that whatever memory the stack elements consume is peanuts compared to the rest of the application \ ingress/egress isn't exactly simpler than middleware GothAlice: The implementation for ingress/egress filters is two lines each: a for loop and a call to the elements iterated over. Can't get much simpler or easier to explain. ;) \ Middleware is pretty complex… \ The majority of ingress filters won't have to examine wsgi.input, and supporting async on egress would be relatively easy for the filters (pass-through non-bytes data in body_iter). \ If you look at a system that offers input filtering, output filtering, and decorators (middleware), modifying input should obviously be an input filter, and vice-versa. 
agronholm: how does a server invoke the ingress filters \ in my opinion, both ingress and egress filters should essentially be pipes \ compression filters are a good example of this \ once a block of request data (body) comes through from the client, it should be sent through the filter chain agronholm: consider an application that receives a huge gzip encoded upload \ the decompression filter decompresses as much as it can using the incoming data \ the application only gets the next block once the decompression filter has enough raw data to decompress GothAlice: Ingress decompression, for example, would accept the environ argument, detect gzip content-encoding, then decompress the wsgi.input into its own buffer, and finally replace wsgi.input in the environ with its decompressed version. \ Alternatively, it could decompress chunks and have a more intelligent replacement for wsgi.input (to delay decompression until it is needed).
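[Editor's aside] The decompression filter GothAlice describes, together with the "two lines: a for loop and a call" invocation, might look like the following sketch. The filter signature and names are illustrative; only gzip and io are stdlib:

```python
# Sketch of an ingress filter per the discussion: detect gzip
# content-encoding, decompress, and replace wsgi.input with a
# plain stream. The calling convention is the proposal's, not spec.
import gzip
import io

def decompress_ingress(environ):
    if environ.get('HTTP_CONTENT_ENCODING') == 'gzip':
        body = gzip.decompress(environ['wsgi.input'].read())
        environ['wsgi.input'] = io.BytesIO(body)
        environ['CONTENT_LENGTH'] = str(len(body))
        del environ['HTTP_CONTENT_ENCODING']
    return environ

def apply_ingress(filters, environ):
    # The server-side invocation really is just a short loop.
    for f in filters:
        environ = f(environ)
    return environ

environ = {
    'HTTP_CONTENT_ENCODING': 'gzip',
    'wsgi.input': io.BytesIO(gzip.compress(b'hello')),
}
environ = apply_ingress([decompress_ingress], environ)
print(environ['wsgi.input'].read())  # b'hello'
```

This is the eager variant; the chunk-at-a-time variant Alex argues for would instead wrap wsgi.input in an object that decompresses lazily on read.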
Re: [Web-SIG] PEP 444 != WSGI 2.0
At 05:04 PM 1/2/2011 +1100, Graham Dumpleton wrote: That PEP was rejected in the end and was replaced with PEP 342 which worked quite differently, yet I can't see that the WSGI specification was revisited in light of how it ended up being implemented and the implications of that. Part of my contribution to PEP 342 was ensuring that it was sufficiently PEP 325-compatible to ensure that PEP 333 wouldn't *need* revisiting. At least, not with respect to generator close() methods anyway. ;-) 
Re: [Web-SIG] PEP 444 != WSGI 2.0
Alice and Graham, I worry a lot that there's a fight brewing here that will lead to disappointment all around. I see accusations, demands, and passion. I also see a lot of apathy in the web-sig. This is not a recipe for a successful standard. Since it appears we have or are about to face a breakdown of communication, I'd like to start with some remarks on the mechanics of communication. - Alice hasn't posted a link to her rewrite of PEP 444 in a while. AFAICT it's this: https://github.com/GothAlice/wsgi2/blob/master/pep444.textile. I find it a bit disturbing that the official copy of PEP 444 ( http://www.python.org/dev/peps/pep-0444/ ) hasn't been updated. This is confusing for occasional observers (like myself), since the python.org copy looks quite dead. It also is not in line with the PEP workflow as written down in PEP 1 ( http://www.python.org/dev/peps/pep-0001/#pep-work-flow ). - It is not reasonable to demand a discussion on IRC. In fact I think it is one of the worst media for arriving at agreement over a standard. IRC doesn't have public logs for those who didn't participate in real time (apparently intentionally so); it is pretty hostile to people who don't use it regularly (I am one of those); it doesn't work well for people in different timezones. Blog comments are not much better (they are archived, but as a medium they get too much spam and are too scattered to be worth tracking down for other participants); the web-sig mailing list is the preferred forum. - If you are going to quote stuff from earlier in the thread and respond to it using "you", please don't strip the attributions (or add them if your mailer doesn't). Also it's best to keep the person you address in the To: line of the message (or add them back if your mailer doesn't automatically do this). With that out of the way, let me try to analyze the crux of the matter. The WSGI 1.0 standard (PEP 333) has been widely successful, but, like any standard, it has some flaws. 
People thought that Python 3 would be a good opportunity to fix the flaws. A proposal was drafted, and posted as PEP 444, but no agreement was reached. In order to fix some obvious flaws due to Python 3's different treatment of bytes and text, a much less ambitious update was produced as PEP , and labeled WSGI 1.0.1. Although this is still marked as draft, I personally think of it as accepted; it is really just a very small set of clarifications and disambiguations of PEP 333, specifically for the benefit of interoperability between WSGI 1.0 apps and servers across the Python 2 / Python 3 boundary. But there still was no WSGI 2.0, and the original PEP 444 authors stopped pushing for their more ambitious draft (I recall this from the web-sig list; the PEP itself was not updated to reflect this). Then Alice came along, with much enthusiasm (though perhaps not enough prior understanding of Python's PEP process) and offered to rewrite PEP 444 to make it better, and to aim to make the updated version the agreed-upon WSGI 2.0 standard. I can't quite tell what happened from there; IIRC Alice's proposal did not receive much objection but neither did she get much support from the folks on web-sig -- perhaps people were tired of the discussion (which had already gone around once and not reached any agreement), perhaps people were too busy to read the list. It seems Graham, at least, falls in the latter category and is now regretting that he didn't speak up earlier. Ian, OTOH, seems to implicitly endorse Alice's actions, and seems to be hoping that a widely accepted WSGI 2.0 standard will come out of her work. In the mean time, Alice (understandably) has looked for other forums where she got more feedback -- I may not like IRC, but I can see how the general apathy on the web-sig is not exactly encouraging. 
(This is a general problem with Python -- we always complain that there aren't enough people to do the work, but when someone shows up and offers to do some work, they don't get much support. On python-dev we've acknowledged this and are trying to get better about it.) In order to get WSGI 2.0 back on the standards track, I think a number of things have to happen. First, it would be great if Alice could prepare a version of her draft in the format required for PEPs, and submit it to the PEP editors ( p...@python.org). Note that the PEP editors do *not* judge a PEP by its technical merits or have a say in its approval (though they may reject outright nonsense) -- they merely facilitate the discussion by promptly checking in the submitted draft, so that PEP authors don't need SVN (or, soon, I hope Hg) access. The PEP editors also ensure that the PEP is formatted right, so that it can be automatically processed into HTML for publication on python.org, and they may make simple fixes for typos/spelling/grammar. In my experience (being one of them), the PEP editors usually respond within 24 hours, and don't have
Re: [Web-SIG] PEP 444 != WSGI 2.0
On Sun, 2011-01-02 at 09:21 -0800, Guido van Rossum wrote: Graham, I hope that you can stop being grumpy about the process that is being followed and start using your passion to write up a critique of the technical merits of Alice's draft. You don't have to attack the whole draft at once -- you can start by picking one or two important issues and try to guide a discussion here on web-sig to tease out the best solutions. Please understand that given the many different ways people use and implement WSGI there may be no perfect solution within reach -- writing a successful standard is the art of the compromise. (If you still think the process going forward should be different, please write me off-list with your concerns.) Everyone else on this list, please make a new year's resolution to help the WSGI 2.0 standard become a reality in 2011.

I think Graham mostly has an issue with this thing being called WSGI 2. FTR, avoiding naming arguments is why I titled the original PEP "Web3". I knew that if I didn't (even though personally I couldn't care less if it was called Buick or McNugget), people would expend effort arguing about the name rather than concentrate on the process of creating a new standard. They did anyway of course; many people argued publicly wishing to rename Web3 to WSGI2. On balance, though, I think giving the standard a neutral name before it's widely accepted as a WSGI successor was (and still is) a good idea, if only as a conflict avoidance strategy. ;-)

That said, I have no opinion on the technical merits of the new PEP 444 draft; I've resigned myself to using derivatives of PEP 3333 forever. It's good enough. Most of the really interesting stuff seems to happen at higher levels anyway, and the benefit of a new standard doesn't outweigh the angst caused by trying to reach another compromise. I'd suggest we just embrace it, adding minor tweaks as necessary, until we reach some sort of technical impasse it doesn't address.
- C ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] PEP 444 != WSGI 2.0
On Sun, Jan 2, 2011 at 12:14 PM, Alice Bevan–McGregor al...@gothcandy.com wrote: On 2011-01-02 09:21:29 -0800, Guido van Rossum said: Alice hasn't posted a link to her rewrite of PEP 444 in a while. AFAICT it's this: https://github.com/GothAlice/wsgi2/blob/master/pep444.textile. I find it a bit disturbing that the official copy of PEP 444 ( http://www.python.org/dev/peps/pep-0444/ ) hasn't been updated. This is confusing for occasional observers (like myself), since the python.org copy looks quite dead. It also is not in line with the PEP workflow as written down in PEP 1 ( http://www.python.org/dev/peps/pep-0001/#pep-work-flow ).

I am unsure of the policy behind updating a PEP on the website from a partial (with glaring holes) source. In my rewrite there are several sections that need to be fleshed out before I would consider pushing it up-stream. I'm tentative that way. ;)

Please drop the perfectionism. It is unpythonic. :) We believe in "release early, release often" here. Even if *you* believe there are no glaring holes, others will find them anyway. The point of having incomplete drafts up on the website (and in SVN) is that everybody has the same chance of seeing the draft. (Also note that the website version is the first hit if you type PEP 444 into a search engine.) If it's incomplete in places, just put XXX markers in it. People are used to this. Publication to the website does not imply any kind of approval -- it just is a mechanism for people to comment on a draft in a uniform way. (Also, a detailed SVN history might help people see the relevant changes -- a big bulk update doesn't shed much light on what changed since it looks like *everything* changed.)

PEP 3333 is an excellent solution that should be quick to adopt. My PEP 444 rewrite takes a fundamentally different approach in an attempt to simplify and solve broader problems than pure compatibility.

It might be good if you summarized the differences, either here (on the list) or in the PEP itself.
Unlike the documents issued by some standards bodies, Python PEPs are meant to be readable documents, and can contain sections about motivation, history, examples, whatever else the author deems likely to sway its acceptance. The formal specification itself can be in a section labeled as such. (PS: I saw you just posted a quick summary, for which I am grateful. Maybe you can add a more thorough one to the PEP itself.) First, it would be great if Alice could prepare a version of her draft in the format required for PEPs, and submit it to the PEP editors. I will make this a priority. Great! -- --Guido van Rossum (python.org/~guido)
Re: [Web-SIG] PEP 444 != WSGI 2.0
On Sun, Jan 2, 2011 at 12:55 PM, Masklinn maskl...@masklinn.net wrote: On 2011-01-02, at 21:38 , Alice Bevan–McGregor wrote: On 2011-01-02 11:14:00 -0800, Chris McDonough said: I'd suggest we just embrace it, adding minor tweaks as necessary, until we reach some sort of technical impasse it doesn't address. Async is one area that it does not cover, and by not having a standard which incorporates async, competing, incompatible solutions have been created. If I remember the previous Web3 discussion correctly, the result was basically that async has no business being shoehorned into WSGI, that WSGI's model is fundamentally unfit for async and you can't correctly support sync and async with the same spec, and therefore an asynchronous equivalent to WSGI should be developed separately, in order to correctly match the needs of asynchronous servers and interfaces, without the cruft and inadequacies of being forked from a synchronous gateway model. Masklinn, those are pretty strong words (bordering on offensive). I'm sure Alice has a different opinion. Alice, hopefully you can write down your ideas for all to see? Perhaps you have an implementation too? Maybe seeing a concrete proposal will help us all see how big or small of a shoehorn will be needed. (Just trying to keep this thread from degenerating into a shouting match.) -- --Guido van Rossum (python.org/~guido)
Re: [Web-SIG] PEP 444 != WSGI 2.0
On 2011-01-02, at 23:16 , Alice Bevan–McGregor wrote: (Just trying to keep this thread from degenerating into a shouting match.) I missed how his statements could be construed as offensive. :/ I missed it as well (though my report might have been brusque), and definitely didn't intend it to be offensive. I interpreted the multiple "you can't" references to be careless shorthand. Yes, it was intended as a short report from the previous discussion's gist/result (or more precisely my potentially incorrect recollection of it), and aimed at suggesting that you check out that previous thread, not as a statement of fact (let alone personal criticism or attacks).
Re: [Web-SIG] PEP 444 != WSGI 2.0
At 12:38 PM 1/2/2011 -0800, Alice Bevan–McGregor wrote: On 2011-01-02 11:14:00 -0800, Chris McDonough said: I'd suggest we just embrace it, adding minor tweaks as necessary, until we reach some sort of technical impasse it doesn't address. Async is one area that it does not cover, and by not having a standard which incorporates async, competing, incompatible solutions have been created.

Actually, it's the other way 'round. I wrote off async for PEP 333 because the competing, incompatible solutions that already existed lacked sufficient ground to build a spec on. In effect, any direction I took would effectively have either blessed one async paradigm as the correct one, or else been a mess that nobody would've used anyway. And this condition largely still exists today: about the only common ground between at least *some* async Python frameworks today is the use of greenlets... but if you have greenlets, then you don't need a fancy async API in the first place, because you can just block during I/O from the POV of the app.

The existence of a futures API in the stdlib doesn't alter this much, either, because the async frameworks by and large already had their own API paradigms for doing such things (e.g. Twisted deferreds and thread/process pools, or generator/greenlet-based APIs in other frameworks).

The real bottleneck isn't even that, so much as that if you're going to write an async WSGI application, WSGI itself can't define enough of an async API to let you do anything useful. For example, you may still need database access that's compatible with the async environment you're using... so you'd only be able to write portable async applications if they didn't do ANY I/O outside of HTTP itself! That being the case, I don't see how a meaningfully cross-platform async WSGI can ever really exist, and be attractive both to application developers (who want to run on many platforms) and framework developers (who want many developers to choose their platform).
On 2011-01-02 12:00:39 -0800, Guido van Rossum said: Actually that does sound like an opinion on the technical merits. I can't tell though, because I'm not familiar enough with PEP 444 to know what the critical differences are compared to PEP 3333. Could someone summarize?

Async, distinction between byte strings (type returned by socket.read), native strings, and unicode strings,

What distinction do you mean? I see a difference in *how* you're distinguishing byte, native, and unicode strings, but not *that* they're distinguished from one another. (i.e., PEP 3333 distinguishes them too.)

thorough unicode decoding (moving some of the work from middleware to the server),

What do you mean by "thorough decoding" and "moving from middleware to server"? Are these references to the redundant environ variables, to the use of decoded headers (rather than bytes-in-unicode ones), or something else?

The async part is an idea in my head that I really do need to write down, clarified with help from agronholm on IRC. The futures PEP is available as a PyPI-installable module, is core in 3.2, and seems to provide a simple enough abstract (duck-typed) interface that it should be usable as a basis for async in PEP 444.

I suggest reviewing the Web-SIG history of previous async discussions; there's a lot more to having a meaningful API spec than having a plausible approach. It's not that there haven't been past proposals, they just couldn't get as far as making it possible to write a non-trivial async application that would actually be portable among Python-supporting asynchronous web servers. (Now, if you're proposing that web servers run otherwise-synchronous applications using futures, that's a different story, and I'd be curious to see what you've come up with. But that's not the same thing as an actually-asynchronous WSGI.)
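For concreteness, the "web servers run otherwise-synchronous applications using futures" idea can be sketched as follows. This is purely my illustration, not anything PEP 444 or PEP 3148 specifies for WSGI: the toy app, the (status, headers, body) tuple convention, and the names are assumptions. It shows the duck-typed futures interface under discussion, including the add_done_callback hook whose thread-affinity cost is raised above.

```python
# Hedged sketch only: a server submitting a synchronous app callable to a
# PEP 3148 executor and being notified on completion.
from concurrent.futures import ThreadPoolExecutor

def app(environ):
    # Toy application returning a (status, headers, body) triple.
    return "200 OK", [("Content-Type", "text/plain")], [b"hello"]

def on_done(future):
    # Runs in the worker thread that completed the future; a real server
    # would have to marshal this back to its I/O loop's thread -- the
    # synchronization overhead discussed above.
    status, headers, body = future.result()
    print(status)

executor = ThreadPoolExecutor(max_workers=4)
future = executor.submit(app, {"PATH_INFO": "/"})
future.add_done_callback(on_done)
executor.shutdown(wait=True)
```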
Re: [Web-SIG] PEP 444 != WSGI 2.0
At 02:21 PM 1/2/2011 -0800, Alice Bevan–McGregor wrote: On 2011-01-02 11:57:19 -0800, P.J. Eby said: * -1 on the key-specific encoding schemes for the various CGI variables' values -- for continuity of coding (not to mention simplicity) PEP 3333's approach to environ encodings should be used. (That is, the environ consists of bytes-in-unicode-form, rather than true unicode strings.)

Does ISO-8859-1 not accommodate this for all but a small number of the environment variables in PEP 444?

PEP 3333 requires that environment variables contain the bytes of the HTTP headers, decoded using ISO-8859-1. The unicode strings, in other words, are restricted to code points in the 0-255 range, and are really just a representation of bytes, rather than being a unicode decoding of the contents of the bytes. What I saw in your draft of PEP 444 (which I admittedly may be confused about) is language that simply loosely refers to unicode environment variables, which could easily be misconstrued as meaning that the values could actually contain other code points. To be precise, in PEP 3333, the true unicode value of an environment variable is:

    environ[key].encode('iso-8859-1').decode(appropriate_encoding_for_key)

Whereas, my reading of your current draft implies that this has to already be done by the server. As I understand it, the problem with this is that the server developer can't always provide such a decoding correctly, and would require that the server guess, in the absence of any information that it could use to do the guessing. An application developer is in a better position to deal with this ambiguity than the server developer, and adding configuration to the server just makes deployment more complicated, and breaks application composability if two sub-applications within a larger application need different decodings. That's the rationale for the PEP 3333 approach -- it essentially acknowledges that HTTP is bytes, and we're only using strings for the API conveniences they afford.
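To make the bytes-in-unicode rule concrete, here is a small sketch of the round-trip described above (the helper name and sample value are mine, not from either PEP's text): a UTF-8 path that arrived as ISO-8859-1-decoded environ text is re-encoded and then decoded with the encoding the application knows applies.

```python
# Sketch of PEP 3333's "bytes-in-unicode" environ convention.

def true_value(value, encoding="utf-8"):
    # Recover the original header bytes, then decode them properly.
    return value.encode("iso-8859-1").decode(encoding)

# The bytes b"/caf\xc3\xa9" (UTF-8 for "/café") seen through an
# ISO-8859-1 lens become the odd-looking string below:
environ = {"PATH_INFO": "/caf\u00c3\u00a9"}
print(true_value(environ["PATH_INFO"]))  # -> /café
```

The application, not the server, picks `encoding`, which is exactly the composability point made above.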
* Where is the PARAMETERS variable defined in the CGI spec? What servers actually support it?

It's defined in the HTTP spec by way of http://www.ietf.org/rfc/rfc2396.txt URI Syntax. These values are there, they should be available. (Specifically semi-colon separated parameters and hash-separated fragment.)

I mean, what web servers currently provide PARAMETERS as a CGI variable? If it's not a CGI variable, it doesn't go in all caps. What's more, the spec you reference points out that parameters can be placed in *each* path-segment, which means that they would: 1) already be in PATH_INFO, and 2) have multiple values. So, -1 on the notion of PARAMETERS, since AFAICT it is redundant, not CGI, and would only hold one parameter segment.

* The language about empty vs. missing environment variables appears to have disappeared; without it, the spec is ambiguous.

I will re-examine the currently published PEP 444.

I don't know if it's in there or not; I've read your spec more thoroughly than that one. I'm referring to the language from PEP 333 and its successor, with which I'm much more intimately familiar.

Indeed. I do try to understand the issues covered in a broader scope before writing; for example, I do consider the ability for new developers to get up and running without worrying about whether the example applications they are trying to use work in their version of Python; thus /allowing/ native strings to be used as response values on Python 3.

I don't understand. If all the examples in your PEP use b'' strings (per the 2.6+ requirement), where is the problem? They can't use WSGI 1(.0.1) code examples at all (as your draft isn't backward-compatible), so I don't see any connection there, either.
Byte strings are still preferred, and may be more performant.

Performance was not the primary consideration; the considerations were:

* One Obvious Way
* Refuse The Temptation To Guess
* Errors Should Not Pass Silently

The first two would've been fine with unicode; the third was the effective tie-breaker. (Since if you use Unicode, at some point you will send garbled data and end up with an error message far away from the point where the error occurred.)

I certainly will; I just need to see concrete points against the technical merits of the rewritten PEP.

Well, I've certainly given you some, but it's hard to comment other than abstractly on an async spec you haven't proposed yet. ;-) Nonetheless, it's really important to understand that the PEP process (especially for Informational-track standards) is not so much about technical merits in an absolute sense, as it is about *community consensus*. And that means it's actually a political and marketing process at least as much as it is a technical one. If you miss that, you may well end up with a technically-perfect spec (in the sense that nobody gives you any additional concrete points against the technical
Re: [Web-SIG] PEP 444 != WSGI 2.0
Until the PEP is approved, it's just a suggestion. So for it to really be WSGI 2 it will have to go through at least some approval process; which is kind of ad hoc, but not so ad hoc as to just implicitly happen. For WSGI 2 to happen, someone has to write something up and propose it. Alice has agreed to do that, working from PEP 444 which several other people have participated in. Calling it WSGI 2 instead of Web 3 was brought up on this list, and the general consensus seemed to be that it made sense -- some people felt a little funny about it, but ultimately it seemed to be something everyone was okay with (with some people like myself feeling strongly it should be WSGI 2). I'm not sure why you are so stressed out about this? If you think it's really an issue, perhaps "2" could be replaced with "2alpha" until such time as it is approved?

On Sat, Jan 1, 2011 at 8:02 PM, Graham Dumpleton graham.dumple...@gmail.com wrote: Can we please clear up a matter. GothAlice (don't know offhand their real name) keeps going around and claiming: "After some discussion on the Web-SIG mailing list, PEP 444 is now officially WSGI 2, and PEP 3333 is WSGI 1.1" -- in this instance on the web.py forum on Google Groups. I have pointed out a couple of times to them that there is no way that PEP 444 has been blessed as the official WSGI 2.0, but they are not listening and are still repeating this claim. They also can't get right that PEP 3333 clearly says it is still WSGI 1.0 and not WSGI 1.1. If the people here whose opinion matters are quite happy for GothAlice to hijack the WSGI 2.0 moniker for PEP 444 I will shut up. But if that happens, I will voice my objections by simply not having anything to do with WSGI 2.0 any more.
Graham

-- Ian Bicking | http://blog.ianbicking.org
[Web-SIG] PEP 444 Draft Rewrite
Howdy! I've mostly finished a draft rewrite of PEP 444 (WSGI 2), incorporating some additional ideas covering things like py2k/py3k interoperability and switching from a more narrative style to a substantially RFC-inspired language. http://bit.ly/e7rtI6

I'm using Textile as my intermediary format, and will obviously need to convert this to ReStructuredText when I'm done. Missing are:

* The majority of the examples.
* Narrative rationale, which I'll be writing shortly.
* Narrative Python compatibility documentation.
* Asynchronous documentation. This will likely rely on the abstract API defined in PEP 3148 (futures) as implemented in Python 3.2 and the futures package available on PyPI.
* Additional and complete references. The Rationale chapter will add many references to community discussion.

I would appreciate it greatly if this rewrite could be read through and questions, corrections, or even references to possible ambiguity mentioned in discussion. Have a happy holidays and a merry new year, everybody! :)

- Alice.

P.S. I'll be updating my PEP 444 reference implementation HTTP 1.1 server (marrow.server.http) over the holidays to incorporate the changes in this rewrite; most notably the separation of byte strings, unicode strings, and native strings.
Re: [Web-SIG] PEP 444 / WSGI2 Proposal: Filters to supplement middleware.
On Sun, Dec 12, 2010 at 9:59 PM, Alice Bevan–McGregor al...@gothcandy.com wrote: Howdy! There's one issue I've seen repeated a lot in working with WSGI1 and that is the use of middleware to process incoming data, but not outgoing, and vice-versa; middleware which filters the output in some way, but cares not about the input. Wrapping middleware around an application is simple and effective, but costly in terms of stack allocation overhead; it also makes debugging a bit more of a nightmare as the stack trace can be quite deep. My updated draft PEP 444[1] includes a section describing Filters, both ingress (input filtering) and egress (output filtering). The API is trivially simple, optional (as filters can be easily adapted as middleware if the host server doesn't support filters) and easy to implement in a server. (The Marrow HTTP/1.1 server implements them as two for loops.)

It's not clear to me how this can be composed or abstracted. @webob.dec.wsgify does kind of handle this with its request/response pattern; in a simplified form it's like:

    def wsgify(func):
        def replacement(environ):
            req = Request(environ)
            resp = func(req)
            return resp(environ)
        return replacement

This allows you to do an output filter like:

    @wsgify
    def output_filter(req):
        resp = some_app(req.environ)
        fiddle_with_resp(resp)
        return resp

(Most output filters also need the request.) And an input filter like:

    @wsgify
    def input_filter(req):
        fiddle_with_req(req)
        return some_app

But while it handles the input filter case, it doesn't try to generalize this or move application composition into the server. An application is an application and servers are imagined but not actually concrete. If you handle filters at the server level you have to have some way of registering these filters, and it's unclear what order they should be applied. At import? Does the server have to poke around in the app it is running? How can it traverse down if you have dispatching apps (like paste.urlmap or Routes)?
You can still implement this locally of course, as a class that takes an app and input and output filters. -- Ian Bicking | http://blog.ianbicking.org
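Ian's closing suggestion, implementing this locally as a class that takes an app plus input and output filters, might look something like the following sketch. The class name and the (status, headers, body) tuple convention are my assumptions, following the draft's example loops, not code from the draft itself.

```python
# Hedged sketch: the draft's filter semantics implemented as ordinary
# middleware rather than in the server.

class FilterMiddleware:
    def __init__(self, app, ingress=(), egress=()):
        self.app = app
        self.ingress = list(ingress)
        self.egress = list(egress)

    def __call__(self, environ):
        for f in self.ingress:          # ingress filters mutate environ
            f(environ)
        response = self.app(environ)    # (status, headers, body) tuple
        for f in self.egress:           # each egress filter replaces it
            response = f(*response)
        return response

# Illustrative filters and a toy app:
def lowercase_path(environ):
    environ["PATH_INFO"] = environ["PATH_INFO"].lower()

def add_server_header(status, headers, body):
    return status, headers + [("Server", "example")], body

app = FilterMiddleware(lambda environ: ("200 OK", [], [b"ok"]),
                       ingress=[lowercase_path],
                       egress=[add_server_header])
```

Because the result is itself a single callable, it composes and hands around like any other middleware-wrapped application, which addresses the "single object" concern below.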
Re: [Web-SIG] PEP 444 / WSGI2 Proposal: Filters to supplement middleware.
On Tue, Dec 14, 2010 at 12:54 PM, Alice Bevan–McGregor al...@gothcandy.com wrote: An application is an application and servers are imagined but not actually concrete. Could you elaborate? (Define "concrete" in this context.)

WSGI applications never directly touch the server. They are called by the server, but have no reference to the server. Servers in turn take an app and parameters specific to their serveryness (which may or may not even involve HTTP), but it's good we've gotten them out of the realm of application composition (early on, WSGI servers frequently handled mounting apps at locations in the path, but that's been replaced with dispatching middleware). An application wrapped with middleware is also a single object you can hand around; we don't have an object that represents all of application, list of pre-filters, list of post-filters.

If you handle filters at the server level you have to have some way of registering these filters, and it's unclear what order they should be applied. At import? Does the server have to poke around in the app it is running? How can it traverse down if you have dispatching apps (like paste.urlmap or Routes)?

Filters are unaffected by, and unaware of, dispatch. They are defined at the same time your application middleware stack is constructed, and passed (in the current implementation) to the HTTPServer protocol as a list at the same time as your wrapped application stack.

You can still implement this locally of course, as a class that takes an app and input and output filters.

If you -do- need region-specific filtering, you can ostensibly wrap multiple final applications in filter management middleware, as you say. That's a fairly advanced use-case regardless of filtering. I would love to see examples of what people might implement as filters (i.e. middleware that does ONE of ingress or egress processing, not both).
From CherryPy I see things like:

* BaseURLFilter (ingress Apache base path adjustments)
* DecodingFilter (ingress request parameter decoding)
* EncodingFilter (egress response header and body encoding)
* GzipFilter (already mentioned)
* LogDebugInfoFilter (egress insertion of page generation time into HTML stream)
* TidyFilter (egress piping of response body to Tidy)
* VirtualHostFilter (similar to BaseURLFilter)

None of these (with the possible exception of LogDebugInfoFilter) could I imagine needing to be path-specific. GzipFilter is wonky at best (it interacts oddly with range requests and etags). Prefix handling is useful (e.g., paste.deploy.config.PrefixMiddleware), and usually global and unconfigured. Debugging and logging stuff often needs per-path configuration, which can mean multiple instances applied after dispatch. Encoding and Decoding don't apply to WSGI. Tidy is intrusive and I think questionable on a global level. I don't think the use cases are there. Tightly bound pre-filters and post-filters are particularly problematic. This all seems like a lot of work to avoid a few stack frames in a traceback.

-- Ian Bicking | http://blog.ianbicking.org
Re: [Web-SIG] PEP 444 / WSGI2 Proposal: Filters to supplement middleware.
Alice Bevan–McGregor wrote: There's one issue I've seen repeated a lot in working with WSGI1 and that is the use of middleware to process incoming data, but not outgoing, and vice-versa; middleware which filters the output in some way, but cares not about the input. Wrapping middleware around an application is simple and effective, but costly in terms of stack allocation overhead; it also makes debugging a bit more of a nightmare as the stack trace can be quite deep. My updated draft PEP 444[1] includes a section describing Filters, both ingress (input filtering) and egress (output filtering). The API is trivially simple, optional (as filters can be easily adapted as middleware if the host server doesn't support filters) and easy to implement in a server. (The Marrow HTTP/1.1 server implements them as two for loops.) Basically an input filter accepts the environment dictionary and can mutate it. Ingress filters take a single positional argument that is the environ. The return value is ignored. (This is questionable; it may sometimes be good to have ingress filters return responses. Not sure about that, though.) An egress filter accepts the status, headers, body tuple from the application and returns a status, headers, and body tuple of its own which then replaces the response. An example implementation is:

    for filter_ in ingress_filters:
        filter_(environ)

    response = application(environ)

    for filter_ in egress_filters:
        response = filter_(*response)

That looks amazingly like the code for CherryPy Filters circa 2005. In version 2 of CherryPy, Filters were the canonical extension method (for the framework, not WSGI, but the same lessons apply). It was still expensive in terms of stack allocation overhead, because you had to call each filter to see if it was on.
It would be much better to find a way to write something like:

    for f in ingress_filters:
        if f.on:
            f(environ)

It was also fiendishly difficult to get executed in the right order: if you had a filter that was both ingress and egress, the natural tendency for core developers and users alike was to append each to each list, but this is almost never the correct order. But even if you solve the issue of static composition, there's still a demand for programmatic composition (if X then add Y after it), and even decomposition (find the caching filter my framework added automatically and turn it off), and list.insert()/remove() isn't stellar at that. Calling the filter to ask it whether it is on also leads filter developers down the wrong path; you really don't want to have Filter A trying to figure out if some other, conflicting Filter B has already run (or will run soon) that demands Filter A return without executing anything. You really, really want the set of filters to be both statically defined and statically analyzable. Finally, you want the execution of filters to be configurable per URI and also configurable per controller. So the above should be rewritten again to something like:

    for f in ingress_filters(controller):
        if f.on(environ['path_info']):
            f(environ)

It was for these reasons that CherryPy 3 ditched its version 2 filters and replaced them with hooks and tools in version 3. You might find more insight by studying the latest cherrypy/_cptools.py

Robert Brewer fuman...@aminus.org
Re: [Web-SIG] PEP 444 / WSGI2 Proposal: Filters to supplement middleware.
That looks amazingly like the code for CherryPy Filters circa 2005. In version 2 of CherryPy, Filters were the canonical extension method (for the framework, not WSGI, but the same lessons apply). It was still expensive in terms of stack allocation overhead, because you had to call each filter to see if it was on. It would be much better to find a way to write something like:

    for f in ingress_filters:
        if f.on:
            f(environ)

.on will need to be an @property in most cases, still not avoiding stack allocation and, in fact, doubling the overhead per filter. Statically disabled filters should not be added to the filter list.

It was also fiendishly difficult to get executed in the right order: if you had a filter that was both ingress and egress, the natural tendency for core developers and users alike was to append each to each list, but this is almost never the correct order.

If something is both an ingress and egress filter, it should be implemented as middleware instead. Nothing can prevent developers from doing bad things if they really try. Appending to ingress and prepending to egress would be the right way to simulate middleware behaviour with filters, but again, don't do that. ;)

But even if you solve the issue of static composition, there's still a demand for programmatic composition (if X then add Y after it), and even decomposition (find the caching filter my framework added automatically and turn it off), and list.insert()/remove() isn't stellar at that.

I have plans (and a partial implementation) of an init.d-style needs/uses/provides declaration and automatic dependency graphing. WebCore, for example, adds the declarations to existing middleware layers to sort the middleware.

Calling the filter to ask it whether it is on also leads filter developers down the wrong path; you really don't want to have Filter A trying to figure out if some other, conflicting Filter B has already run (or will run soon) that demands Filter A return without executing anything.
You really, really want the set of filters to be both statically defined and statically analyzable.

Unfortunately, most, if not all, filters need to examine request and response headers to determine whether they can run. E.g. compression checks environ.get('HTTP_ACCEPT_ENCODING', '').lower() for 'gzip' on the way in, and checks the response to determine whether a 'Content-Encoding' header has already been set.

Finally, you want the execution of filters to be configurable per URI and also configurable per controller. So the above should be rewritten again to something like:

    for f in ingress_filters(controller):
        if f.on(environ['path_info']):
            f(environ)

It was for these reasons that CherryPy 3 ditched its version 2 filters and replaced them with hooks and tools.

This is possible by wrapping multiple applications in, say, the filter middleware adapter with differing filter setups, then using the separately wrapped applications with some form of dispatch. You could also utilize filters as decorators. This is an implementation detail left up to the framework utilizing WSGI2, however; WSGI2 itself has no concept of controllers.

None of this prevents the simplified stack from being useful during exception handling, though. ;)

What I was really trying to do is reduce the level of nesting on each request and make what used to be middleware more explicit in its purpose.

You might find more insight by studying the latest cherrypy/_cptools.py

I'll give it a gander, though I firmly believe filter management (as middleware stack management) is the domain of a framework on top of WSGI2, not the domain of the protocol.

— Alice.

___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
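The wrapping-plus-dispatch idea mentioned above can be sketched roughly like this. All names are hypothetical, and applications follow the draft PEP 444 convention of returning a (status, headers, body) tuple; the point is that per-controller filter sets fall out of wrapping separate application instances rather than per-request .on() checks:

```python
def with_filters(app, filters):
    """Wrap one application with its own ingress filter set."""
    def wrapped(environ):
        for f in filters:
            f(environ)  # ingress filters mutate environ in place
        return app(environ)
    return wrapped


# Two hypothetical PEP 444-style applications.
def api_app(environ):
    return '200 OK', [('Content-Type', 'text/plain')], [b'api']


def site_app(environ):
    return '200 OK', [('Content-Type', 'text/plain')], [b'site']


# A hypothetical ingress filter applied only to the API application.
def auth_filter(environ):
    environ['example.auth_checked'] = True


# Each route gets its own statically composed filter set.
routes = {
    '/api': with_filters(api_app, [auth_filter]),
    '/':    with_filters(site_app, []),
}


def dispatch(environ):
    # Trivial path-prefix dispatch between the wrapped applications.
    path = environ.get('PATH_INFO', '/')
    app = routes['/api'] if path.startswith('/api') else routes['/']
    return app(environ)
```

Here the filter configuration stays static per wrapped application, and only the dispatch step varies per request.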
[Web-SIG] PEP 444 / WSGI2 Proposal: Filters to supplement middleware.
Howdy! There's one issue I've seen repeated a lot in working with WSGI1: the use of middleware to process incoming data but not outgoing, and vice versa; middleware which filters the output in some way but cares not about the input.

Wrapping middleware around an application is simple and effective, but costly in terms of stack allocation overhead; it also makes debugging a bit more of a nightmare, as the stack trace can be quite deep.

My updated draft PEP 444[1] includes a section describing Filters, both ingress (input filtering) and egress (output filtering). The API is trivially simple, optional (as filters can be easily adapted as middleware if the host server doesn't support filters), and easy to implement in a server. (The Marrow HTTP/1.1 server implements them as two for loops.)

Basically, an ingress filter accepts the environment dictionary and can mutate it. Ingress filters take a single positional argument, the environ; the return value is ignored. (This is questionable; it may sometimes be useful for ingress filters to return responses. Not sure about that, though.)

An egress filter accepts the status, headers, body tuple from the application and returns a status, headers, body tuple of its own, which then replaces the response. An example implementation is:

    for filter_ in ingress_filters:
        filter_(environ)

    response = application(environ)

    for filter_ in egress_filters:
        response = filter_(*response)

I'd love to get some input on this. Questions, comments, criticisms, or better ideas are welcome!

— Alice.

[1] https://github.com/GothAlice/wsgi2/blob/master/pep-0444.rst
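The "easily adapted as middleware" fallback mentioned above might look roughly like this. This is a sketch, not the PEP's text: filter_middleware, the sample application, and the sample egress filter are all hypothetical names, assuming the draft's (status, headers, body) tuple convention:

```python
def filter_middleware(application, ingress_filters=(), egress_filters=()):
    """Adapt ingress/egress filters into a single middleware layer,
    for servers that do not support filters natively."""
    def wrapped(environ):
        for filter_ in ingress_filters:
            filter_(environ)  # ingress: mutate environ, return value ignored
        response = application(environ)
        for filter_ in egress_filters:
            # egress: each filter receives and returns (status, headers, body)
            response = filter_(*response)
        return response
    return wrapped


# Hypothetical application and egress filter for demonstration.
def hello_app(environ):
    return '200 OK', [('Content-Type', 'text/plain')], [b'Hello']


def add_server_header(status, headers, body):
    return status, headers + [('Server', 'example')], body


app = filter_middleware(hello_app, egress_filters=[add_server_header])
```

One nesting level replaces N middleware wrappers, which is the stack-depth saving the proposal is after.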
Re: [Web-SIG] PEP 444
I’ve updated my copy of the PEP, renaming non-commentary and non-revision text to reference WSGI2, wsgi2, or wsgi (environment variables) as appropriate. I’ve also added the first draft of the text describing filters and some sample code, including a middleware adapter for filters.

Here are some additional notes:

https://gist.github.com/719763 — filter vs. middleware
http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html

It might be worth another PEP to describe interfaces to common data to encourage interoperability between filters/middleware, such as GET/POST data, cookies, session data (likely using Beaker’s API as a base), etc.

Also, something I’ve been exploring is automatic resolution of middleware/filter dependencies by utilizing “uses”, “needs”, and “provides” properties on the callables, plus a middleware stack factory which can graph the dependency tree.

On a side note, I do not appear to be receiving posts to this mailing list, only the out-of-list CC/BCCs. :/ And here I’ve been getting used to reading and posting to comp.lang.python[.announce] on Usenet. ;)

— Alice.
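One minimal way to sketch the “needs”/“provides” dependency-graphing idea is a small topological sort. The tuple shape and function name here are assumptions for illustration, not the PEP's or WebCore's actual API:

```python
def order_layers(layers):
    """layers: list of (name, needs, provides) tuples.
    Returns the names ordered so every need is provided earlier."""
    provided, ordered, pending = set(), [], list(layers)
    while pending:
        for layer in pending:
            name, needs, provides = layer
            if all(n in provided for n in needs):
                ordered.append(name)        # all prerequisites satisfied
                provided.update(provides)
                pending.remove(layer)
                break
        else:
            # No layer could be placed: a cycle or a missing provider.
            raise ValueError('unsatisfiable dependencies: %r' % pending)
    return ordered


order_layers([
    ('session', {'cookies'}, {'session'}),
    ('cookies', set(),       {'cookies'}),
    ('auth',    {'session'}, {'auth'}),
])
# → ['cookies', 'session', 'auth']
```

A real stack factory would presumably read these declarations off the callables themselves, but the ordering problem reduces to exactly this kind of graph traversal.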
Re: [Web-SIG] PEP 444
Ok, retracted. I have been admiring the purity and simplicity of the JSON spec, which intentionally has no version number; new changes are therefore only allowed under a new name. That removes a lot of complexity around figuring out which versions of the spec a given server implements, etc. But the WSGI spec is far more complicated than JSON, and I expect that complexity is probably unavoidable. --Mark Ramm On Monday, November 22, 2010, Ian Bicking i...@colorstudy.com wrote: On Mon, Nov 22, 2010 at 5:05 PM, Mark Ramm mark.mchristen...@gmail.com wrote: I would very much prefer it if we could keep the current name or choose a new, unrelated name, not wsgi2, as I think the API changes warrant a new name to prevent confusion. Why? Obviously 2 implies some breaking changes, and the changes are reasonable enough that it's not a complete change. Most of the changes have been discussed as WSGI 2 for a long time preceding this spec anyway. -- Ian Bicking | http://blog.ianbicking.org -- Mark Ramm-Christensen email: mark at compoundthinking dot com blog: www.compoundthinking.com/blog
Re: [Web-SIG] PEP 444
Would you prefer to give me collaboration permissions on your repo, or should I fork it? On 2010-11-21, at 11:40 PM, Chris McDonough chr...@plope.com wrote: Georg Brandl has thus far been updating the canonical PEP on python.org. I don't know how you get access to that. My working copy is at https://github.com/mcdonc/web3 .
Re: [Web-SIG] PEP 444
I’ve forked it, now available at: https://github.com/GothAlice/wsgi2 Re-naming it to wsgi2 will be my first order of business during the week, altering your association the second. I’ll post change descriptions for discussion as I go. — Alice. On 2010-11-22, at 12:12 AM, Chris McDonough wrote: Would you prefer to give me collaboration permissions on your repo, or should I fork it? Please fork it or create another repository entirely. I have no plans to do more work on it personally, so I don't think it should really be associated with me. To that end, I think I'd prefer my name to either be off the PEP entirely or just listed as a helper or typist or something. ;-)
Re: [Web-SIG] PEP 444
I would very much prefer it if we could keep the current name or choose a new, unrelated name, not wsgi2, as I think the API changes warrant a new name to prevent confusion. --Mark On Mon, Nov 22, 2010 at 3:18 AM, Alice Bevan-McGregor al...@gothcandy.com wrote: I’ve forked it, now available at: https://github.com/GothAlice/wsgi2 Re-naming it to wsgi2 will be my first order of business during the week, altering your association the second. I’ll post change descriptions for discussion as I go. — Alice. On 2010-11-22, at 12:12 AM, Chris McDonough wrote: Would you prefer to give me collaboration permissions on your repo, or should I fork it? Please fork it or create another repository entirely. I have no plans to do more work on it personally, so I don't think it should really be associated with me. To that end, I think I'd prefer my name to either be off the PEP entirely or just listed as a helper or typist or something. ;-) -- Mark Ramm-Christensen email: mark at compoundthinking dot com blog: www.compoundthinking.com/blog