Re: [Web-SIG] OT: dotted names (Was: Re: A Python Web Application Package and Format)
At 04:11 PM 4/15/2011 -0400, Fred Drake wrote:
> These end users don't really care if the object identified is a class or function in module, a nested attribute on a class, or anything else, so long as it does what it's advertised to do. By not pushing implementation details into the identifier, the package maintainer is free to change the implementation in more ways, without creating backward incompatibility.

That would be one advantage of using entry points instead. ;-) (i.e., the user doesn't specify the object location, the package author does.)

Note, however, that one must perform considerably more work to resolve a name, when you don't know whether each part of the name is a module or an attribute. Either you have to get an AttributeError first, and then fall back to importing, or get an ImportError first, and fall back to getattr. If the syntax is explicit, OTOH, then you don't have to guess, thereby saving lots of work and wasteful exceptions.

___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
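The "get an AttributeError first, then fall back to importing" strategy described above can be sketched as follows. This is only an illustration of the guessing cost: the function name `resolve_dotted` is hypothetical and this is not the resolver used by distutils2 or any other package.

```python
import importlib


def resolve_dotted(name):
    """Resolve 'a.b.c' without knowing where the module path ends.

    Hypothetical helper: each part is tried as an attribute first, and
    only on AttributeError is it tried as a submodule import.
    """
    parts = name.split(".")
    path = parts[0]
    obj = importlib.import_module(path)
    for part in parts[1:]:
        path = path + "." + part
        try:
            obj = getattr(obj, part)
        except AttributeError:
            # Not an attribute yet: it may be a submodule that has not
            # been imported. This raises ImportError if that guess is
            # wrong too.
            importlib.import_module(path)
            obj = getattr(obj, part)
    return obj


basename = resolve_dotted("os.path.basename")
mime_text = resolve_dotted("email.mime.text.MIMEText")
```

Every part of the name costs at least one lookup, and a wrong guess costs an exception on top, which is the overhead the message refers to.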
Re: [Web-SIG] OT: dotted names (Was: Re: A Python Web Application Package and Format)
At 02:02 PM 4/15/2011 -0400, Jim Fulton wrote:
> On Fri, Apr 15, 2011 at 1:32 PM, Éric Araujo wrote:
>> As an aside, I wonder why people use dot+colon notation instead of just
>> dots to reference callables. In distutils2 for example we resolve
>> dotted names to find command classes, command hooks and compilers. So
>> what's the benefit, marginally easier parsing?
>
> An opportunity of using a colon is that it allows::
>
>     dotted.module.name:expression
>
> where expression may be more than just a name::
>
>     foo.bar:Bar()

The reason setuptools uses ':' is that it allows you to unambiguously reference object attributes, e.g.:

    some.module:SomeClass.some_method_or_attribute

(It doesn't allow expressions, just dotted "paths".)
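A minimal sketch of why the explicit `module:attr.path` form is cheap to resolve: the part before the colon is imported exactly once, and everything after it is plain attribute access. The helper name `resolve_entry` is hypothetical; this is not setuptools' own code, only an illustration of the syntax.

```python
import importlib
from functools import reduce


def resolve_entry(spec):
    """Resolve 'some.module:Attr.path' (hypothetical helper)."""
    module_name, _, attr_path = spec.partition(":")
    # Exactly one import: the module part is never guessed.
    obj = importlib.import_module(module_name)
    if attr_path:
        # The rest is a dotted attribute path, not an expression.
        obj = reduce(getattr, attr_path.split("."), obj)
    return obj


fromkeys = resolve_entry("collections:OrderedDict.fromkeys")
basename = resolve_entry("os.path:basename")
```

No exception is raised on the success path, in contrast to a resolver that has to guess where the module name ends.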
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 09:45 AM 1/20/2011 -0500, James Y Knight wrote:
> But I agree, a clarification could be added to the statement '''all objects referred to in this specification as "strings" must be of type str or StringType''' and '''For values referred to in this specification as "bytestrings" [...] the value must be of type bytes under Python 3, and str in earlier versions of Python'''. It's not 100% obvious that it really does mean "type(obj) is str/bytes".

Feel free to write said clarification, check it in, and add glowing praise for your efforts to the acknowledgements section. I will indeed appreciate it, so you won't even be lying. ;-)

Or, you can just send me a patch, and receive slightly less praise. Either way works for me, though. ;-)
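To make the distinction under discussion concrete, here is a small sketch contrasting an exact-type check with `isinstance` under Python 3. The names `TaggedStr` and `is_native_string` are made up for the illustration, and the sketch assumes the "type(obj) is str" reading quoted above.

```python
class TaggedStr(str):
    """A str subclass, used only to show where the two checks differ."""


def is_native_string(obj):
    # Exact-type reading: subclasses of str do not qualify.
    return type(obj) is str


plain = "200 OK"
tagged = TaggedStr("200 OK")

exact_results = (is_native_string(plain), is_native_string(tagged))
loose_results = (isinstance(plain, str), isinstance(tagged, str))
```

Under the exact-type reading the subclass instance is rejected, while `isinstance` would accept it.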
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 02:52 PM 1/12/2011 -0800, Guido van Rossum wrote:
> On Wed, Jan 12, 2011 at 2:34 PM, Alice BevanMcGregor wrote:
>> On 2011-01-10 13:12:57 -0800, Guido van Rossum said:
>>> Ok, now that we've had a week of back and forth about this, let me repeat
>>> my "threat". Unless more concerns are brought up in the next 24 hours, can
>>> PEP be accepted? It seems a lot of people are waiting for a decision
>>> that enables implementers to go ahead and claim PEP 333[3] compatibility.
>>> PEP 444 can take longer.
>>
>> With the lack of responses, can I assume this has been or will be shortly
>> marked as "accepted"?
>
> Yep. Phillip, can you do the honors?

Apparently not -- I went to check it in and found Raymond had already marked it "Final". ;-)

(I'm not clear on whether there's a difference between "Final" and "Accepted" here, but I assume that if we find some sort of actual error we can still fix it.)
Re: [Web-SIG] PEP 444 feature request - Futures executor
At 09:11 PM 1/10/2011 -0600, Timothy Farrell wrote:
> PJ,
>
> You seem to be old-hat at this so I'm looking for a little advice as I draft this spec. It seems a bad idea to me to just say environ['wsgi.executor'] will be a wrapped futures executor because the api handled by the executor can and likely will change over time. Am I right in thinking that a spec should be more specific in saying that the executor object will have "these specific methods" and so as future changes, the spec is not in danger of invalidation due to the changes?

I'd actually just suggest something like:

    future = environ['wsgiorg.future'](func, *args, **kw)

(You need to use the wsgiorg.* namespace for extension proposals like this, btw.)
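One way a server might provide such a callable is sketched below, under the assumption that it is backed by a standard `concurrent.futures` executor. Only the `wsgiorg.future` key comes from the message; `make_future_factory` and the bare `environ` dict are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def make_future_factory(executor):
    """Hide the executor behind a single callable (hypothetical helper)."""
    def future_factory(func, *args, **kw):
        # The application never sees the executor object itself, so the
        # spec would only need to describe this one call signature.
        return executor.submit(func, *args, **kw)
    return future_factory


executor = ThreadPoolExecutor(max_workers=2)
environ = {"wsgiorg.future": make_future_factory(executor)}

future = environ["wsgiorg.future"](pow, 2, 10)
result = future.result()
executor.shutdown()
```

Exposing one callable rather than the executor keeps the extension independent of later changes to the executor API, which is the concern raised in the question.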
Re: [Web-SIG] Server-side async API implementation sketches
At 05:06 PM 1/9/2011 -0800, Alice BevanMcGregor wrote:
> On 2011-01-09 09:03:38 -0800, P.J. Eby said:
>> Hm. I'm not sure if I like that. The typical app developer really shouldn't be yielding multiple body strings in the first place.
>
> Wait; what? So you want the app developer to load a 40MB talkcast MP3 into memory before sending it?

Statistically speaking, the "typical app" is producing a web page, made of HTML and severely limited in size by the short attention span of the human user reading it. ;-) Obviously, the spec should allow and support streaming.

> You want to completely eliminate the ability to stream an HTML page to the client in chunks (e.g. block, headers + search box, search results, advertisements, footer -- the exact thing Google does with every search result)? That sounds like artificially restricting application developers, to me.

First, I don't want to eliminate it. Second, Google is hardly the "typical app developer". If you need the capability, it'll still be there.

>> In your approach, the above samples have to be rewritten as:
>>
>>     return app(environ)
>>
>> [snip]
>
> My code does not use return. At all. Only yield.

If you return the calling of a generator, then you pass the original generator through to the caller, and it is the equivalent of writing a loop in place that iterates over the subgenerator, only without the additional complexity of needing to send/throw.

>> The above middleware pattern works with the sketches I gave on the PEAK wiki, and I've now updated the wiki to include an example app and middleware for clarity.
>
> I'll need to re-read the code on your wiki; I find it incredibly difficult to grok, however, you can help me out a bit by answering a few questions about it: How does middleware trap exceptions raised by the application.

With try/except around the "yield app(environ)" call (main app run), or with try/except around the "yield body_iter" call (body iterator run).

> (Specifically how does the server pass the buck with exceptions? And how does the exception get to the application to bubble out towards the server, through middleware, as it does now?)

All that is in the Coroutine class, which is a generator-based "green thread" implementation.

Remember how you were saying that your sketch would benefit from PEP 380? The Coroutine class is a pure-Python implementation of PEP 380, minus the syntactic sugar. It turns "yield" into "yield from" whenever the value you yield is itself a geniter. So, if you pretend that "yield app(environ)" and "yield body_iter" are actually "yield from"s instead, then the mechanics should become clearer.

Coroutine runs a generator by sending or throwing into it. It then takes the result (either a value or an exception) and decides where to send that. If it's an object with send/throw methods, it pushes it on the stack, and passes None into it to start it running, thereby "calling" the subgenerator. If it's an exception or a return value (e.g. StopIteration(value=None)), it pops the stack and propagates the exception or return value to the calling generator. If it's a future or some other object the server cares about, then the server can pause the coroutine (by returning 'routine.PAUSE' when the coroutine asks it what to do).

Coroutine accepts a trampoline function and a completion callback as parameters: the trampoline function inspects a value yielded by a generator and then tells the coroutine whether it should PAUSE, CALL, RETURN, RESUME, or RAISE in response to that particular yield. RESUME is used for synchronous replies, where the yield returns immediately. RETURN means pop the current generator off the stack and return a value to the calling generator. RAISE raises an error immediately in the top-of-stack generator. CALL pushes a geniter on the stack.

IOW, the Coroutine class lets you write servers with just a little glue code to tell it how you want the control to flow. It's actually entirely independent of WSGI or any particular WSGI protocol...
I'm thinking that I should probably wrap it up into a PyPI package with some docs and tests, though I'm not sure when I'd get around to it. (Heck, it's the sort of thing that probably ought to be in the stdlib -- certainly PEP 380 can be implemented in terms of it.) Anyway, both the sync and async server examples have trampolines that detect futures and process them accordingly. If you yield to a future, you get back its result -- either a value or an exception at the point where you yielded it. You don't have to explicitly call .result() (in fact, you *can't*); it's already been called before control gets back to the place that yielded it. IOW, in my sketch, yielding to a future
Re: [Web-SIG] Server-side async API implementation sketches
At 04:39 PM 1/9/2011 -0800, Alice BevanMcGregor wrote:
> On 2011-01-09 09:26:19 -0800, P.J. Eby said:
>> If wsgi.input offers any synchronous methods...
>
> Regardless of whether or not wsgi.input is implemented in an async way, wrap it in a future and eventually get around to yielding it. Problem /solved/.

Not the API problem. If I'm accustomed to writing synchronous code, the async version looks ridiculous. Also, an existing WSGI web framework isn't going to be able to be ported to this API without putting it in a future.

My hope was for an API that would be a simple enough translation that *everybody* could be persuaded to use it, but having to use futures just to write a "normal" application simply isn't going to work for the core WSGI API. As a separate "WSGI-A" profile, sure, it works fine.

>> If it offers only asynchronous methods, OTOH, then you can't pass wsgi.input to any existing libraries (e.g. the cgi module).
>
> Describe to me how a function can be suspended (other than magical greenthreads) if it does not yield; if I knew this, maybe I wouldn't be so confused.

I'm not sure what you're confused about. I'm the one who forgot you have to read from wsgi.input in a blocking way to write a normal app. ;-) (Mainly, because I was so excited about the potential in your sketched API, and I got sucked into the process of implementing/improving it.)

> I've deviated from your sketch, obviously, and any semblance of yielding a 3-tuple. Stop thinking of my example code as conforming to your ideas; it's a new idea, or, worst case, a narrowing of an idea into its simplest form.

What I'm trying to point out is that you've missed two important API enhancements in my sketch, that make it so that app and middleware authors don't have to explicitly manage any generator methods or even future methods.
> The mechanics of yielding futures instances allows you to (in your server) implement the necessary async code however you wish while providing a uniform interface to both sync and async applications running on sync and async servers. In fact, you would be able to safely run a sync application on an async server and vice-versa. You can, on an async server:
>
> :: Add a callback to the yielded future to re-schedule the application generator.
> :: If using greenthreads, just block on future.result() then immediately wake up the application generator.
> :: Do other things I can't think of because I'm still waking up.

I am not sure why you're reiterating these things. The sample code I posted shows precisely where you'd *do* them in a sync or async server. That's not where the problem lies.

> That is not optimum, because now you have an optional API that applications who want to be compatible will need to detect and choose between.

It wasn't supposed to be optional, but it's beside the point since the presence of a blocking API means the application can block.

The issue might be addressable by having an environment key like, 'wsgi.canblock' (indicating whether the application is already in a separate thread/process), and a piece of middleware that simply spawns its child app to a future if wsgi.canblock isn't set. Then people who write blocking applications could use the decorator.

>> Mostly, though, it seems to me that the need to be able to write blocking code does away with most of the benefit of trying to have a single API in the first place.
>
> You have artificially created this need, ignoring the semantics of using the server-specific executor to detect async-capable requests and the yield mechanics I suggested; which happens to be a single, coherent API across sync and async servers and applications.

I haven't ignored them. I'm simply representing the POV of existing WSGI apps and frameworks, which currently block, and are unlikely to be rewritten so as not to block.
I thought, briefly, that it was possible to make an API with a low-enough conceptual overhead to allow that porting to occur, and let my enthusiasm carry me away. I was wrong, though: even the extremely minimalist version isn't going to be usable for ported code, which relegates the async version to a niche role.

I would note, though, that this is *still* better than my previous position, which was that there was no point making an async API *at all*. ;-)
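The 'wsgi.canblock' middleware idea from this message could look roughly like the sketch below. Only the 'wsgi.canblock' key comes from the message; the decorator name `offload_if_needed`, the shape of the blocking app (a plain function returning a response tuple), and the hand-driven generator at the end are assumptions made for illustration.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def offload_if_needed(blocking_app, executor):
    """Hypothetical decorator: run a blocking app inline only if allowed."""
    def wrapped(environ):
        if environ.get("wsgi.canblock"):
            # Already in a thread/process of our own: blocking is fine.
            yield blocking_app(environ)
        else:
            # Hand the whole app to a future and yield that instead; the
            # server is assumed to send the future's result back in.
            inner = dict(environ)
            inner["wsgi.canblock"] = True
            response = yield executor.submit(blocking_app, inner)
            yield response
    return wrapped


def blocking_app(environ):
    return "200 OK", [], [b"hello"]


executor = ThreadPoolExecutor(max_workers=1)
app = offload_if_needed(blocking_app, executor)

# Server that allows blocking: the response comes straight out.
inline = next(app({"wsgi.canblock": True}))

# Server that does not: the first yield is a future, driven here by hand.
gen = app({})
pending = next(gen)
offloaded = gen.send(pending.result())
executor.shutdown()
```

This keeps blocking applications unchanged, at the price the message points out: on a non-blocking server the entire application ends up inside a future.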
Re: [Web-SIG] Server-side async API implementation sketches
At 08:09 PM 1/9/2011 +0200, Alex Grönholm wrote:
> Asynchronous applications may not be ready to send the status line as the first thing coming out of the generator.

So? In the sketches that are the subject of this thread, it doesn't have to be the first thing. If the application yields a future first, it will be paused... and so will the middleware. When this line is executed in the middleware:

    status, headers, body = yield app(environ)

...the middleware is paused until the application actually yields its response tuple.

Specifically, this yield causes the app iterator to be pushed on the Coroutine object's .stack attribute, then iterated. If the application yields a future, the server suspends the whole thing until it gets called back, at which point it .send()s the result back into the app iterator.

The app iterator then yields its response, which is tagged as a return value, so the app is popped off the .stack, and the response is sent via .send() into the middleware, which then proceeds as if nothing happened in the meantime. It then yields *its* response, and whatever body iterator is given gets put into a second coroutine that proceeds similarly.

When the process_response() part of the middleware does a "yield body_iter", the body iterator is pushed, and the middleware is paused until the body iterator yields a chunk. If the body yields a future, the whole process is suspended and resumed. The middleware won't be resumed until the body yields another chunk, at which point it is resumed. If it yields a chunk of its own, then that's passed up to any response-processing middleware further up the stack.

In contrast, middleware based on the 2+body protocol cannot process a body without embedding coroutine management into the middleware itself. For example, you can't write a standalone body processor function, and reuse it inside of two pieces of middleware, without doing a bunch of send()/throw() logic to make it work.
> Outside of the application/middleware you mean? I hope there isn't any more confusion left about what a future is. The fact is that you cannot use synchronous API calls directly from an async app no matter what. Some workaround is always necessary.

Which pretty much kills the whole idea as being a single, universal WSGI protocol, since most people don't care about async.
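The push/pop mechanics described in this message can be illustrated with a much-reduced driver. This is not the Coroutine class from the PEAK wiki: it has no trampoline function or completion callback, it waits on futures synchronously, and it treats any other yielded value as a "return". The names `run`, `middleware`, and the `"submit"` environ key are hypothetical.

```python
from concurrent.futures import Future, ThreadPoolExecutor


def run(gen):
    """Drive nested generators with an explicit stack (simplified sketch)."""
    stack = [gen]
    value = error = None
    while stack:
        top = stack[-1]
        try:
            if error is not None:
                pending, error = error, None
                yielded = top.throw(pending)
            else:
                yielded = top.send(value)
        except StopIteration as stop:
            stack.pop()                 # generator finished: return to caller
            value = stop.value
            continue
        except Exception as exc:
            stack.pop()                 # propagate the error to the caller
            if not stack:
                raise
            value, error = None, exc
            continue
        if hasattr(yielded, "send") and hasattr(yielded, "throw"):
            stack.append(yielded)       # "call" the yielded subgenerator
            value = None
        elif isinstance(yielded, Future):
            try:
                value = yielded.result()    # a sync server simply waits
            except Exception as exc:
                value, error = None, exc
        else:
            stack.pop()                 # anything else counts as a return
            top.close()
            value = yielded
    return value


def app(environ):
    text = yield environ["submit"](str.upper, "hello")
    yield "200 OK", [], [text]


def failing_app(environ):
    raise ValueError("boom")
    yield  # unreachable; only makes this function a generator


def middleware(inner):
    def wrapped(environ):
        try:
            status, headers, body = yield inner(environ)
        except ValueError:
            status, headers, body = "500 Error", [], ["trapped"]
        yield status, headers + [("X-Seen", "1")], body
    return wrapped


with ThreadPoolExecutor(max_workers=1) as pool:
    environ = {"submit": pool.submit}
    ok = run(middleware(app)(environ))
    failed = run(middleware(failing_app)(environ))
```

The middleware is paused while the app waits on its future, receives the response tuple as the value of its `yield`, and can trap the app's exception with an ordinary try/except around that `yield`.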
Re: [Web-SIG] Server-side async API implementation sketches
At 04:25 AM 1/9/2011 -0800, Alice BevanMcGregor wrote:
> On 2011-01-08 13:16:52 -0800, P.J. Eby said:
>> In the limit case, it appears that any WSGI 1 server could provide an (emulated) async WSGI2 implementation, simply by wrapping WSGI2 apps with a finished version of the decorator in my sketch. Or, since users could do it themselves, this would mean that WSGI2 deployment wouldn't be dependent on all server implementers immediately turning out their own WSGI2 implementations.
>
> This, if you'll pardon my language, is bloody awesome. :D That would strongly drive adoption of WSGI2. Note that adapting a WSGI1 application to WSGI2 server would likewise be very handy, and I suspect, even easier to implement.

I very much doubt that. You'd need greenlets or a thread with a communication channel in order to support WSGI 1 apps that use write() calls.

By the way, I don't really see the point of the new sketches you're doing, as they aren't nearly as general as the one I've already done, but still have the same fundamental limitation: wsgi.input.

If wsgi.input offers any synchronous methods, then they must be used from a future and must somehow raise an error when called from within the application -- otherwise it would block, nullifying the point of having a generator-based API.

If it offers only asynchronous methods, OTOH, then you can't pass wsgi.input to any existing libraries (e.g. the cgi module).

The latter problem is the worse one, because it means that the translation of an app between my original WSGI2 API and the current sketch is no longer just "replace 'return' with 'yield'".

The only way this would work is if WSGI applications are still allowed to be written in a blocking style. Greenlet-based frameworks would have no problem with this, of course, but servers like Twisted would still have to run WSGI apps in a worker thread pool, just because they *might* block.
If we're okay with this as a limitation, then adding _async method variants that return futures might work, and we can proceed from there.

Mostly, though, it seems to me that the need to be able to write blocking code does away with most of the benefit of trying to have a single API in the first place. Either everyone ends up putting their whole app into a future, or else the server has to accept that the app could block... and put it into a future for them. ;-)

So, the former case will be unacceptable to app developers who don't feel a need for async code, and the latter doesn't seem to offer anything to the developers of non-blocking servers. (The exception to these conditions, of course, are greenlet-based servers, but they can run WSGI *1* apps in a non-blocking way, and so have no need for a new protocol.)
Re: [Web-SIG] Server-side async API implementation sketches
At 06:06 AM 1/9/2011 +0200, Alex Grönholm wrote:
> A new feature here is that the application itself yields a (status, headers) tuple and then chunks of the body (or futures).

Hm. I'm not sure if I like that. The typical app developer really shouldn't be yielding multiple body strings in the first place. I much prefer that the canonical example of a WSGI app just return a list with a single bytestring -- preferably in a single statement for the entire return operation, whether it's a yield or a return.

IOW, I want it to look like the normal way to do things is to just return the whole response at once, and use the additional difficulty of creating a second iterator to discourage people writing iterated bodies when they should just write everything to a BytesIO and be done with it.

Also, it makes middleware simpler: the last line can just yield the result of calling the app, or a modified version, i.e.:

    yield app(environ)

or:

    s, h, b = yield app(environ)
    # ... modify or replace s, h, b
    yield s, h, b

In your approach, the above samples have to be rewritten as:

    return app(environ)

or:

    result = app(environ)
    s, h = yield result
    # ... modify or replace s, h
    yield s, h
    for data in result:
        # modify b as we go
        yield data

Only that last bit doesn't actually work, because you have to be able to send future results back *into* the result. Try actually making some code that runs on this protocol and yields to futures during the body iteration.

Really, this modified protocol can't work with a full async API the way my coroutine-based version does, AND the middleware is much more complicated.
In my version, your do-nothing middleware looks like this:

    class NullMiddleware(object):
        def __init__(self, app):
            self.app = app

        def __call__(self, environ):
            # ACTION: pre-application environ mangling
            s, h, body = yield self.app(environ)
            # modify or replace s, h, body here
            yield s, h, body

If you want to actually process the body in some way, it looks like:

    class NullMiddleware(object):
        def __init__(self, app):
            self.app = app

        def __call__(self, environ):
            # ACTION: pre-application environ mangling
            s, h, body = yield self.app(environ)
            # modify or replace s, h, body here
            yield s, h, self.process(body)

        def process(self, body_iter):
            while True:
                chunk = yield body_iter
                if chunk is None:
                    break
                # process/modify chunk here
                yield chunk

And that's still a lot simpler than your sketch. Personally, I would write both of the above as:

    def null_middleware(app):
        def wrapped(environ):
            # ACTION: pre-application environ mangling
            s, h, body = yield app(environ)
            # modify or replace s, h, body here
            yield s, h, process(body)

        def process(body_iter):
            while True:
                chunk = yield body_iter
                if chunk is None:
                    break
                # process/modify chunk here
                yield chunk

        return wrapped

But that's just personal taste. Even as a class, it's much easier to write.

The above middleware pattern works with the sketches I gave on the PEAK wiki, and I've now updated the wiki to include an example app and middleware for clarity.

Really, the only hole in this approach is dealing with applications that block. The elephant in the room here is that while it's easy to write these example applications so they don't block, in practice people read files and do database queries and whatnot in their requests, and those APIs are generally synchronous. So, unless they somehow fold their entire application into a future, it doesn't work.

> I liked the idea of having a separate async_read() method in wsgi.input, which would set the underlying socket in nonblocking mode and return a future.
> The event loop would watch the socket and read data into a buffer and trigger the callback when the given amount of data has been read. Conversely, .read() would set the socket in blocking mode. What kinds of problems would this cause?

That you could never *call* the .read() method outside of a future, or else you would block the server, thereby obliterating the point of having the async API in the first place.
Re: [Web-SIG] Server-side async API implementation sketches
At 06:15 PM 1/8/2011 -0800, Alice BevanMcGregor wrote:
> On 2011-01-08 17:22:44 -0800, Alex Grönholm said:
>>> On 2011-01-08 13:16:52 -0800, P.J. Eby said:
>>> I've written the sketches dealing only with PEP 3148 futures, but sockets were also proposed, and IMO there should be simple support for obtaining data from wsgi.input.
>>
>> I'm a bit unclear as to how this will work with async. How do you propose that an asynchronous application receives the request body?
>
> In my example https://gist.github.com/770743 (which has been simplified greatly by P.J. Eby in the "Future- and Generator-Based Async Idea" thread) for dealing with wsgi.input, I have:
>
>     future = environ['wsgi.executor'].submit(environ['wsgi.input'].read, 4096)
>     yield future
>
> While ugly, if you were doing this, you'd likely:
>
>     submit = environ['wsgi.executor'].submit
>     input_ = environ['wsgi.input']
>     future = yield submit(input_.read, 4096)
>     data = future.

I don't quite understand the above -- in my sketch, the above would be:

    data = yield submit(input_.read, 4096)

It looks like your original sketch wants to call .result() on the future, whereas in my version, the return value of yielding a future is the result (or an error is thrown if the result was an error).

Is there some reason I'm missing, for why you'd want to explicitly fetch the result in a separate step?

Meanwhile, thinking about Alex's question, ISTM that if WSGI 2 is asynchronous, then the wsgi.input object should probably just have read(), readline() etc. methods that simply return (possibly-mock) futures. That's *much* better than having to do all that submit() crud just to read data from wsgi.input().

OTOH, if you want to use the cgi module to parse a form POST from the input, you're going to need to write an async version of it in that case, or else feed the entire operation to an executor... but then the methods would need to be synchronous... *argh*.

I'm starting to not like this idea at all.
Alex has actually pinpointed a very weak spot in the scheme, which is that if wsgi.input is synchronous, you destroy the asynchrony, but if it's asynchronous, you can't use it with any normal code that operates on a stream.

I don't see any immediate fixes for this problem, so I'll let it marinate in the back of my mind for a while. This might be the Achilles heel for the whole idea of a low-rent async WSGI.
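The difference between the two styles (calling .result() explicitly versus receiving the result as the value of the yield) comes down to what the server sends back into the generator. A minimal synchronous driver for the second style might look like this; `drive` is a hypothetical name, and the `environ` contents are stand-ins built from the keys used in the message.

```python
import io
from concurrent.futures import Future, ThreadPoolExecutor


def drive(gen):
    """Run gen up to its first non-future yield, resolving futures on the way."""
    value = error = None
    while True:
        if error is not None:
            pending, error = error, None
            yielded = gen.throw(pending)    # failure surfaces at the yield
        else:
            yielded = gen.send(value)       # result becomes the yield's value
        if not isinstance(yielded, Future):
            return yielded
        try:
            value = yielded.result()        # a sync server just waits here
        except Exception as exc:
            value, error = None, exc


def app(environ):
    submit = environ["wsgi.executor"].submit
    input_ = environ["wsgi.input"]
    try:
        data = yield submit(input_.read, 4096)   # no .result() call needed
    except ValueError:
        yield "500 Error", [], [b"read failed"]
    else:
        yield "200 OK", [], [data]


closed_input = io.BytesIO(b"")
closed_input.close()                        # reading now raises ValueError

with ThreadPoolExecutor(max_workers=1) as pool:
    good = drive(app({"wsgi.executor": pool,
                      "wsgi.input": io.BytesIO(b"hello")}))
    bad = drive(app({"wsgi.executor": pool,
                     "wsgi.input": closed_input}))
```

An asynchronous server would replace the blocking `yielded.result()` with a callback that resumes the generator later; the application code would not change.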
Re: [Web-SIG] Server-side async API implementation sketches
At 04:40 AM 1/9/2011 +0200, Alex Grönholm wrote:
> 09.01.2011 04:15, Alice BevanMcGregor wrote:
>> I hope that clearly identifies my idea on the subject. Since async servers will /already/ be implementing their own executors, I don't see this as too crazy.
>
> -1 on this. Those executors are meant for executing code in a thread pool. Mandating a magical socket operation filter here would considerably complicate server implementation.

Actually, the *reverse* is true. If you do it the way Alice proposes, my sketches don't get any more complex, because the filtering goes in the executor facade or submit function.

Truthfully, I don't really see the point of exposing the map() method (which is the only other executor method we'd expose), so it probably makes more sense to just offer a 'wsgi.submit' key... which can be a function as follows:

    def submit(callable, *args, **kw):
        ob = getattr(callable, '__self__', None)
        if isinstance(ob, ServerProvidedSocket):  # could be an ABC
            future = MockFuture()
            if callable==ob.read:
                # set up read callback to fire future
            elif callable==ob.write:
                # set up write callback to fire future
            return future
        else:
            return real_executor.submit(callable, *args, **kw)

Granted, this might be a rather long function. However, since it's essentially an optimization, a given server can decide how many functions can be shortcut in this way. The spec may wish to offer a guarantee or recommendation for specific methods of certain stdlib-provided types (sockets in particular) and wsgi.input.

Personally, I do think it might be *better* to offer extended operations on wsgi.input that could be used via yield, e.g. "yield input.nb_read()". But of course then the trampoline code has to recognize those values instead of futures.

Either way works, but somewhere there is going to be some type-testing (explicit or implicit) taking place to determine how to suspend and resume the app.
Note, too, that this complexity also only affects servers that want to offer a truly async API. A synchronous server has no reason to pay particular attention to what's in a future, since it can't offer any performance improvement.

I do think that this sort of API discussion, though, is the most dangerous part of trying to do an async spec. That is, I don't expect that everyone will spontaneously agree on the exact same API. Alice's proposal (simply submitting object methods) has the advantage of severely limiting the scope of API discussions. ;-)
[Web-SIG] Server-side async API implementation sketches
As a semi-proof-of-concept, I whipped these up:

    http://peak.telecommunity.com/DevCenter/AsyncWSGISketch

It's an expanded version of my Coroutine concept, updated with sample server code for both a synchronous server and an asynchronous one. The synchronous "server" is really just a decorator that wraps a WSGI2 async app with futures support, and handles pauses by simply waiting for the future to finish.

The asynchronous server is a bit more hand-wavy, in that there are some bits (clearly marked) that will be server/framework dependent. However, they should be straightforward for a specialist in any given async framework to implement.

What is *most* handwavy at the moment, however, is in the details of precisely what one is allowed to "yield to". I've written the sketches dealing only with PEP 3148 futures, but sockets were also proposed, and IMO there should be simple support for obtaining data from wsgi.input.

However, even this part is pretty easy to extrapolate: both server examples just add more type-testing branches in their "base_trampoline()" function, copying and modifying the existing branches that deal with futures.

The entire result is surprisingly compact -- each server weighed in at about 40 lines, and the common Coroutine class used by both adds another 60-something lines.

In the limit case, it appears that any WSGI 1 server could provide an (emulated) async WSGI2 implementation, simply by wrapping WSGI2 apps with a finished version of the decorator in my sketch.

Or, since users could do it themselves, this would mean that WSGI2 deployment wouldn't be dependent on all server implementers immediately turning out their own WSGI2 implementations.

True async API implementations would be more involved, of course -- using a WSGI2 decorator on say, Twisted's WSGI1 implementation, would give you no performance advantages vs. using Twisted's APIs directly.
But, as soon as someone wrote a Twisted-specific translation of my async-server sketch, such an app would be portable.

More discussion is still needed, but at this point I'm convinced the concept is *technically* feasible. (Whether there's enough need in the "market" to make it worthwhile, is a separate question.)
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
At 01:24 PM 1/8/2011 -0500, Paul Davis wrote:
> For contrast, I thought it might be beneficial to have a comparison with an implementation that didn't use async might look like:
>
> http://friendpaste.com/4lFbZsTpPGA9N9niyOt9PF

Compare your version with this one, that uses my revision of Alice's proposal:

    def my_awesome_application(environ):
        # do stuff
        yield b'200 OK', [], ["Hello, World!"]

    def my_middleware(app):
        def wrapper(environ):
            # maybe edit environ
            try:
                status, headers, body = yield app(environ)
                # maybe edit response:
                # body = latinize(body)
                yield status, headers, body
            except:
                # maybe handle error
                raise
            finally:
                # maybe release resources
                pass
        return wrapper

    def my_server(app, httpreq):
        environ = wsgi.make_environ(httpreq)

        def process_response(result):
            status, headers, body = result
            write_headers(httpreq, status, headers)
            Coroutine(body, body_trampoline, finish_response)

        def finish_response(result):
            # cleanup, if any
            pass

        Coroutine(app(environ), app_trampoline, process_response)

The primary differences are that the server needs to split some of its processing into separate routines, and response-processing done by middleware has to happen in a while loop rather than a for loop.

> If your implementation requires that people change source code (yield vs return) when they move code between sync and async servers, doesn't that pretty much violate the main WSGI goal of portability?

The idea here would be to have WSGI 2 use this protocol exclusively, not to have two different protocols.

> IMO, the async middleware is quite more complex than the current state of things with start_response.

Under the above proposal, it isn't, since you can't (only) do a for loop over the response body; you have to write a loop and a push-based handler as well. In this case, it is reduced to just writing one loop.

I'm still not entirely convinced of the viability of the approach, but I'm no longer in the "that's just crazy talk" category regarding an async WSGI.
The cost is no longer crazy, but there's still some cost involved, and the use case rationale hasn't improved much. OTOH, I can now conceive of actually *using* such an async API for something, and that's no small feat. Before now, the idea held virtually zero interest for me. Either way this proposal reminds me quite a bit of Duff's device [1]. On its own Duff's device is quite amusing and could even be employed in some situations to great effect. On the other hand, any WSGI spec has to be understandable and implementable by people of all skill levels. If it's a spec that only a handful of people comprehend, then I fear its adoption would be significantly slowed in practice. Under my modification of Alice's proposal, nearly all of the complexity involved migrates to the server, mostly in the (shareable) Coroutine implementation. For an async server, the "arrange for coroutine(result) to be called" operations are generally native to async APIs, so I'd expect them to find that simple to implement. Synchronous servers just need to invoke the waited-on operation synchronously, then pass the value back into the coroutine. (e.g. by returning "pause" from the trampoline, then calling coroutine(value, exc_info) to resume processing after the result is obtained.)
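(A minimal sketch, not taken from the thread, of how small the server-side machinery for this could be. The Coroutine code posted earlier in the thread is cut off in the archive, so the handling of "return" below, the stored callback, and the use of a single exception object in place of an exc_info tuple are all assumptions. Futures and the "pause" path are only stubbed.)

```python
def app_trampoline(coroutine, yielded):
    """Classify a yielded value; futures are left out of this sketch."""
    if isinstance(yielded, tuple) and len(yielded) == 3:
        return "return"   # a (status, headers, body) response
    if hasattr(yielded, "send") and hasattr(yielded, "throw"):
        return "call"     # a sub-generator for the coroutine to run
    raise TypeError("unexpected value yielded: %r" % (yielded,))


class Coroutine:
    def __init__(self, iterator, trampoline, callback):
        self.stack = [iterator]
        self.trampoline = trampoline
        self.callback = callback   # assumption: called with the final value
        self()

    def __call__(self, value=None, error=None):
        stack = self.stack
        while stack:
            it = stack[-1]
            try:
                if error is not None:
                    pending, error = error, None
                    rv = it.throw(pending)
                else:
                    rv = it.send(value)
            except StopIteration as stop:
                stack.pop()
                value = stop.value            # explicit generator return value
            except BaseException as exc:
                stack.pop()
                value, error = None, exc      # rethrown into the caller's frame
            else:
                switch = self.trampoline(self, rv)
                if switch == "pause":
                    return                    # resumed later via self(value)
                if switch == "call":
                    stack.append(rv)
                    value = None
                else:                         # "return" (assumed semantics)
                    stack.pop()
                    value = rv
        if error is not None:
            raise error
        self.callback(value)


def hello_app(environ):
    yield "200 OK", [], [b"Hello, World!"]

def failing_app(environ):
    raise ValueError("boom")
    yield  # unreachable; only makes this function a generator

def shouting_middleware(app):
    def wrapper(environ):
        try:
            status, headers, body = yield app(environ)
        except ValueError:
            yield "500 Internal Server Error", [], [b"error"]
        else:
            yield status, headers, [chunk.upper() for chunk in body]
    return wrapper

results = []
Coroutine(shouting_middleware(hello_app)({}), app_trampoline, results.append)
Coroutine(shouting_middleware(failing_app)({}), app_trampoline, results.append)
```

The application and middleware need no imports or decorators here; everything protocol-specific sits in the trampoline and the Coroutine class.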
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
At 05:39 AM 1/8/2011 -0800, Alice BevanMcGregor wrote: As a quick note, this proposal would significantly benefit from the simplified syntax offered by PEP 380 (Syntax for Delegating to a Subgenerator) [1] and possibly PEP 3152 (Cofunctions) [2]. The former simplifies delegation and exception passing, and the latter simplifies the async side of this. Unfortunately, AFAIK, both are affected by PEP 3003 (Python Language Moratorium) [3], which kinda sucks.

Luckily, neither PEP is necessary, since we do not need to support arbitrary protocols for the "subgenerators" being called. This makes it possible to simply "yield" instead of "yield from", and the trampoline functions take care of distinguishing a terminal ("return") result from an intermediate one. The Coroutine class I suggested, however, *does* accept explicit returns via "raise StopIteration(value)", so it is actually fully equivalent to supporting "yield from", as long as it's used with an appropriate trampoline function. (In fact, the structure of the Coroutine class I proposed was stolen from an earlier Python-Dev post I did in an attempt to show why PEP 380 was unnecessary for doing coroutines. ;-) ) In effect, the only thing that PEP 380 would add here is the syntax sugar for 'raise StopIteration(value)', but you can do that with:

    def return_(value):
        raise StopIteration(value)

In any case, my suggestion doesn't need this for either apps or response bodies, since the type of data yielded suffices to indicate whether the value is a "return" or not. You only need a subgenerator to raise StopIteration if you want to return something to your caller that *isn't* a response or body chunk.
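(A note for readers on current Python versions, not part of the original message: PEP 380 landed in Python 3.3, so a generator can use a plain "return value", and since Python 3.7 (PEP 479) a StopIteration raised inside a generator body, including through a helper like return_, is converted to RuntimeError. The sketch below shows both behaviours; run(), subtask() and old_style() are names made up for illustration.)

```python
def run(gen):
    """Drive a generator to completion; return (yielded values, return value)."""
    yielded = []
    try:
        while True:
            yielded.append(next(gen))
    except StopIteration as stop:
        return yielded, stop.value


def subtask():
    yield "working"
    return 42          # PEP 380 style: travels as StopIteration.value


def return_(value):
    raise StopIteration(value)


def old_style():
    yield "working"
    return_(42)        # under PEP 479 this surfaces as RuntimeError


steps, result = run(subtask())

try:
    run(old_style())
    old_style_error = None
except RuntimeError as exc:
    old_style_error = exc
```

A trampoline written today would therefore read the value from StopIteration.value rather than from the exception's args.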
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
I made a few errors in that massive post... At 12:00 PM 1/8/2011 -0500, P.J. Eby wrote: My major concern about the approach is still that it requires a fair amount of overhead on the part of both app developers and middleware developers, even if that overhead mostly consists of importing and decorating. (More below.) The above turned out to be happily wrong by the end of the post, since no decorators or imports are actually required for app and middleware developers. You can then implement response-processing middleware like this: def latinize_body(body_iter): while True: chunk = yield body_iter if chunk is None: break else: yield piglatin(yield body_iter) The last line above is incorrect; it should've been "yield piglatin(chunk)", i.e.: def latinize_body(body_iter): while True: chunk = yield body_iter if chunk is None: break else: yield piglatin(chunk) It's still rather unintuitive, though. There are also plenty of topics left to discuss, both of the substantial and bikeshedding varieties. One big open question still in my mind is, are these middleware idioms any easier to get right than the WSGI 1 ones? For things that don't process response bodies, the answer seems to be yes: you just stick in a "yield" and you're done. For things that DO process response bodies, however, you have to have ugly loops like the one above. I suppose it could be argued that, as unintuitive as that body-processing loop is, it's still orders of magnitude more intuitive than a piece of WSGI 1 middleware that has to handle both application yields and write()s! I suppose my hesitance is due to the fact that it's not as simple as: return (piglatin(chunk) for chunk in body_iter) Which is really the level of simplicity that I was looking for. (IOW, all response-processing middleware pays in this slightly-added complexity to support the subset of apps and response-processing middleware that need to wait for events during body output.) 
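(A minimal runnable sketch of the corrected loop above, for comparison with the one-line generator expression. The drive() function is only a guess at what a server-side body trampoline would do: it answers each "yield body_iter" with the next chunk, or None at the end, and collects everything else as output. piglatin() here is a toy stand-in.)

```python
def piglatin(word):
    # Toy transformation: move the first letter to the end and add "ay".
    return word[1:] + word[:1] + "ay"


def latinize_body(body_iter):
    while True:
        chunk = yield body_iter       # ask the driver for the next chunk
        if chunk is None:
            break
        else:
            yield piglatin(chunk)     # hand a processed chunk to the driver


def drive(gen_func, chunks):
    """Hypothetical stand-in for the server's body trampoline."""
    body_iter = iter(chunks)
    gen = gen_func(body_iter)
    out = []
    request = next(gen)               # first yield is a request for input
    while True:
        assert request is body_iter
        chunk = next(body_iter, None)  # None signals end of body
        try:
            produced = gen.send(chunk)
        except StopIteration:
            break
        out.append(produced)
        request = next(gen)           # advance to the next input request
    return out


result = drive(latinize_body, ["hello", "world"])
simple = [piglatin(chunk) for chunk in ["hello", "world"]]
```

Both forms produce the same chunks; the difference is that the first one gives the server a point at which it can suspend between chunks.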
Re: [Web-SIG] [PEP 444] Future- and Generator-Based Async Idea
At 03:26 AM 1/8/2011 -0800, Alice BevanMcGregor wrote: Warning: this assumes we're running on bizarro-world PEP 444 that mandates applications are generators. Please do not dismiss this idea out of hand but give it a good look and maybe some feedback. ;)

First-glance feedback: I'm impressed. You may have something going here after all. I just wish you'd sent this sooner. ;-) I can easily see why I didn't think of this myself: I hadn't shifted my thinking to accommodate two important changes in the Python environment since the first WSGI spec, circa 2003-04:

1. Coroutines and decorators are ubiquitous and non-intrusive
2. WSGI has stdlib support, and in any event it is much easier to rely on non-stdlib packages

My major concern about the approach is still that it requires a fair amount of overhead on the part of both app developers and middleware developers, even if that overhead mostly consists of importing and decorating. (More below.)

The second middleware demonstration (using a decorator) makes middleware look a lot more like an application: yielding futures, or a response, with the addition of yielding an application callable not explored in the first (long, but trivial) example. I believe this should cover 99% of middleware use cases, including interactive debugging, request routing, etc. and the syntax isn't too bad, if you don't mind standardized decorators.

If we assume that the implementation would be in a wsgi2ref for Python 3.3 and distributed standalone for 2.x, I think we can make something work. (In the sense of practical to implement, not necessarily *desirable*.) One of my goals is that it should be possible to write "async-naive" applications and middleware, so that people who don't care about async can ignore it. On the application side, this is easy: a trivial decorator suffices to translate a return into a yield.
For middleware, it's not quite as simple, unless you have a pure ingress or egress filter, since you can't simply "call" the application. However, a "context manager"-like pattern applies, wherein you could simply yield to calling a wrapped version of the application. Hm. This seems to pretty much generalize to a standard coroutine/trampoline pattern, where the server provides the trampoline, and can provide APIs in the environ to create waitable objects that can be yielded upward. Actually, this is kind of like what I really wanted the futures PEP to be about. And it also preserves composability nicely. In fact, it doesn't actually need any middleware decorators, if the server provides the trampoline. We would leave your "my_awesome_application" example intact (possibly apart from having a friendlier API for reading from wsgi.input), but change my_middleware as follows: def my_middleware(app): def wrapper(environ): # pre-response code here response = yield app(environ) # post-response code here yield altered_response return wrapper That's it. No decorators, no nothing. 
The server-level trampoline is then just a function that looks something like this:

    def app_trampoline(coroutine, yielded):
        if [yielded is a future of some sort]:
            [arrange to invoke 'coroutine(result)' upon completion]
            [arrange to invoke 'coroutine(None, exc_info)' upon error]
            return "pause"
        elif [yielded is a response]:
            return "return"
        elif [yielded has send/throw methods]:
            return "call"   # tell the coroutine to call it
        else:
            raise TypeError

The trampoline function is used with a coroutine class like this:

    class Coroutine:
        def __init__(self, iterator, trampoline, callback):
            self.stack = [iterator]
            self.trampoline = trampoline
            self()

        def __call__(self, value=None, exc_info=()):
            stack = self.stack
            while stack:
                try:
                    it = stack[-1]
                    if exc_info:
                        try:
                            rv = it.throw(*exc_info)
                        finally:
                            exc_info = ()
                    else:
                        rv = it.send(value)
                except BaseException:
                    value = None
                    exc_info = sys.exc_info()
                    if exc_info[0] is StopIteration:
                        # pass return value up the stack
                        value, = exc_info[1].args or (None,)
                        exc_info = ()   # but not the error
                    stack.pop()
                else:
                    switch = self.trampoline(self, rv)
                    if switch=="pause":
                        return
                    elif switch=="call":
                        stack.append(rv)  # Call subgenerator
                        value, exc_info = None
Re: [Web-SIG] PEP 444 Goals
At 01:22 PM 1/7/2011 -0800, Alice BevanMcGregor wrote: On 2011-01-07 08:28:15 -0800, P.J. Eby said: At 01:17 AM 1/7/2011 -0800, Alice BevanMcGregor wrote: On 2011-01-06 20:18:12 -0800, P.J. Eby said: :: Reduction of re-implementation / NIH syndrome by incorporating the most common (1%) of features most often relegated to middleware or functional helpers. Note that nearly every application-friendly feature you add will increase the burden on both server developers and middleware developers, which ironically means that application developers actually end up with fewer options. Some things shouldn't have multiple options in the first place. ;) I meant that if a server doesn't implement the spec because of a required feature, then the app developer doesn't have the option of using that feature anyway -- meaning that adding the feature to the spec didn't really help. I truly can not worry about non-conformant applications, middleware, or servers and still keep my hair. I said "if a server doesn't implement the *spec*", meaning, they choose not to support PEP 444 *at all*, not that they skip providing the feature. Easy enough to write quick, say, 10-line utility functions that are correct middleware -- so that you could actually build your application out of WSGI functions calling other WSGI-based functions. The yielding thing wouldn't work for that at all. Handling a possible generator isn't that difficult. That it's difficult at all removes degree-of-difficulty as a strong motivation to switch. So, in order to know what type each CGI variable is, you'll need a reference? Reference? Re-read what I wrote. Only URI-specific values utilize an encoding reference variable in the environment; that's four values out of the entire environ. There is one, clearly defined bytes value. The rest are native strings, decoded using latin1/iso-8859-1/"str-in-unicode" where native strings are unicode.
IOW, there are six specific facts someone needs to remember in order to know the type of a given CGI variable, over and above the mere fact that it's a CGI variable. Hence, "reference". ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] PEP 444 / WSGI 2 Async
At 12:37 PM 1/7/2011 -0800, Alice BevanMcGregor wrote: But is there really any problem with providing a unified method for indicating a suspend point? Yes: a complexity burden that is paid by the many to serve the few -- or possibly non-existent. I still haven't seen anything that suggests there is a large enough group of people who want a "portable" async API to justify inconveniencing everyone else in order to suit their needs, vs. simply having a different calling interface for that need. If I could go back and change only ONE thing about WSGI 1, it would be the calling convention. It was messed up from the start, specifically because I wasn't adamant enough about weighing the needs of the many against the needs of the few. Only a few needed a push protocol (write()), and only a few even remotely cared about our minor nod to asynchrony (yielding empty strings to pause output). If I'd been smart (or more to the point, prescient), I'd have just done a 3-tuple return value from the get-go, and said to hell with those other use cases, because everybody else is paying to carry a few people who aren't even going to use these features for real. (As it happens, I thought write() would be needed in order to drive adoption, and it may well have been at one time.) Anyway, with a new spec we have the benefit of hindsight: we know that, historically, nobody has actually cared enough to propose a full-blown async API who wasn't also trying to make their async server implementation work without needing threads. Never in the history of the web-sig, AFAIK, has anyone come in and said, "hey, I want to have an async app that can run on any async framework." Nobody blogs or twitters about how terrible it is that the async frameworks all have different APIs and that this makes their apps non-portable. We see lots of complaints about not having a Python 3 WSGI spec, but virtually none about WSGI being essentially synchronous.
I'm not saying there's zero audience for such a thing... but then, at some point there was a non-zero audience for write() and for yielding empty strings. ;-) The big problem is this: if, as an app developer, you want this hypothetical portable async API, you either already have an app that is async or you don't. If you do, then you already got married to some particular API and are happy with your choice -- or else you'd have bit the bullet and ported. What you would not do, is come to the Web-SIG and ask for a spec to help you port, because you'd then *still have to port* to the new API... unless of course you wanted it to look like the API you're already using... in which case, why are you porting again, exactly? Oh, you don't have an app... okay, so *hypothetically*, if you had this API -- which, because you're not actually *using* an async API right now, you probably don't even know quite what you need -- hypothetically if you had this API you would write an app and then run it on multiple async frameworks... See? It just gets all the way to silly. The only way you can actually get this far in the process seems to be if you are on the server side, thinking it would be really cool to make this thing because then surely you'll get users. In practice, I can't imagine how you could write an app with substantial async functionality that was sanely portable across the major async frameworks, with the possible exception of the two that at least share some common code, paradigms, and API. And even if you could, I can't imagine someone wanting to. So far, you have yet to give a concrete example of an application that you personally (or anyone you know of) want to be able to run on two different servers. You've spoken of hypothetical apps and hypothetical portability... but not one concrete, "I want to run this under both Twisted and Eventlet" (or some other two frameworks/servers), "because of [actual, non-hypothetical rationale here]". 
I don't deny that [actual non-hypothetical rationale] may exist somewhere, but until somebody shows up with a concrete case, I don't see a proposal getting much traction. (The alternative would be if you pull a rabbit out of your hat and propose something that doesn't cost anybody anything to implement... but the fact that you're tossing the 3-tuple out in favor of yielding indicates you've got no such proposal ready at the present time.) On the plus side, the "run this in a future after the request" concept has some legs, and I hope Timothy (or anybody) takes it and runs with it. That has plenty of concrete use cases for portability -- every sufficiently-powerful web framework will want to either provide that feature, build other features on top of it, or both. It's the "make the request itself async" part that's the hard sell here, and in need of some truly spectacular rationale in order to justify the ubiquitous costs it imposes. ___
Re: [Web-SIG] PEP 444 feature request - Futures executor
At 11:47 AM 1/7/2011 -0600, Timothy Farrell wrote: There has been much discussion about how to handle async in PEP 444 and that discussion centers around the use of futures. However, I'm requesting that servers _optionally_ provide environ['wsgi.executor'] as a futures executor that applications can use for the purpose of doing something after the response is fully sent to the client. This feature request is designed to be concurrency-methodology agnostic. Some example use cases are:

- send an email that might block on a slow email server (Alice, I read what you said about Turbomail, but one product is not the solution to all situations)
- initiate a database vacuum
- clean a cache
- build a cache
- compile statistics

When serving pages of an application, these are all things that could be done after the response has been sent. Ideally these things don't need to be done in a request thread and aren't incredibly time-sensitive. It seems to me that futures would be an ideal way of handling this. Thoughts?

This seems like a potentially good way to do it; I suggest making it a wsgi.org extension; see (and update) http://www.wsgi.org/wsgi/Specifications with your proposal. I would suggest including a simple sample executor wrapper that servers could use to block all but the methods allowed by your proposal. (i.e., presumably not shutdown(), for example.) There are some other issues that might need to be addressed, like maybe adding an attribute or two for the level of reliability guaranteed by the executor, or allowing the app to request a given reliability level. Specifically, it might be important to distinguish between:

* this will be run exactly once as long as the server doesn't crash
* this will eventually be run once, even if the server suffers a fatal error between now and then

IOW, to indicate whether the thing being done is "transactional", so to speak.
I mean, I can imagine building a transactional service on top of the basic service, by queuing task information externally, then just using executor calls to pump the queue. But IMO it seems pretty intrinsic to want that kind of persistence guarantee for at least the email case, or, say, sending off a charge to a credit card or something like that. One other relevant use case: sometimes you want a long-running process step that the user checks back in on periodically, so having a way to get a "handle" for a future that can be kept in a session or something might be important. Like, say, you're preparing a report that will be viewed in the browser, and using meta-refresh or some such to poll. The app needs to check on a previously queued future and get its results. I don't know how easy any of the above are to implement with the futures API or your proposal, but they seem like worthwhile things to have available, and actually would provide for some rich application use cases. But if they're implementable over the futures API at all, it should be possible to implement them as WSGI 1.x middleware or as a server extension. A spec like that definitely needs some thrashing out, but I don't think it need derail any PEPs in progress: the API of such an extension doesn't affect the basic WSGI protocol at all. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
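(A minimal sketch of the kind of executor wrapper suggested above, using the standard concurrent.futures module. The environ key 'wsgi.executor' comes from Timothy's proposal; the class name RestrictedExecutor and the choice to expose only submit() are assumptions, since the proposal had not yet said which methods would be allowed.)

```python
from concurrent.futures import ThreadPoolExecutor


class RestrictedExecutor:
    """Expose only submit(); shutdown() and map() are deliberately absent."""

    def __init__(self, executor):
        self._executor = executor   # the server keeps the real executor

    def submit(self, fn, *args, **kwargs):
        return self._executor.submit(fn, *args, **kwargs)


# Server side: create the real pool and hand applications the wrapper.
pool = ThreadPoolExecutor(max_workers=2)
environ = {"wsgi.executor": RestrictedExecutor(pool)}

# Application side: queue work to run outside the request.
future = environ["wsgi.executor"].submit(sum, [1, 2, 3])
result = future.result(timeout=5)

has_shutdown = hasattr(environ["wsgi.executor"], "shutdown")
pool.shutdown()   # only the server can do this
```

This covers only the "run once unless the server crashes" level of reliability; a transactional guarantee would need an external queue, as described above.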
Re: [Web-SIG] PEP 444 Goals
At 01:17 AM 1/7/2011 -0800, Alice BevanMcGregor wrote: On 2011-01-06 20:18:12 -0800, P.J. Eby said: :: Reduction of re-implementation / NIH syndrome by incorporating the most common (1%) of features most often relegated to middleware or functional helpers. Note that nearly every application-friendly feature you add will increase the burden on both server developers and middleware developers, which ironically means that application developers actually end up with fewer options. Some things shouldn't have multiple options in the first place. ;) I meant that if a server doesn't implement the spec because of a required feature, then the app developer doesn't have the option of using that feature anyway -- meaning that adding the feature to the spec didn't really help. I definitely consider implementation overhead on server, middleware, and application authors to be important. As an example, if yield syntax is allowable for application objects (as it is for response bodies) middleware will need to iterate over the application, yielding up-stream anything that isn't a 3-tuple. When it encounters a 3-tuple, the middleware can do its thing. If the app yield semantics are required (which may be a good idea for consistency and simplicity sake if we head down this path) then async-aware middleware can be implemented as a generator regardless of the downstream (wrapped) application's implementation. That's not too much overhead, IMHO. The reason I proposed the 3-tuple return in the first place (see http://dirtsimple.org/2007/02/wsgi-middleware-considered-harmful.html ) was that I wanted to make middleware *easy* to write. Easy enough to write quick, say, 10-line utility functions that are correct middleware -- so that you could actually build your application out of WSGI functions calling other WSGI-based functions. The yielding thing wouldn't work for that at all.
Unicode decoding of a small handful of values (CGI values that pull from the request URI) is the biggest example. [2, 3] Does that mean you plan to make the other values bytes, then? Or will they be unicode-y-bytes as well? Specific CGI values are bytes (one, I believe), specific ones are true unicode (URI-related values) and decoded using a configurable encoding with a fallback to "bytes in unicode" (iso-8859-1/latin1), are kept internally consistent (if any one fails, treat as if they all failed), have the encoding used recorded in the environ, and all others are native strings ("bytes in unicode" where native strings are unicode). So, in order to know what type each CGI variable is, you'll need a reference? What happens for additional server-provided variables? That is the domain of the server to document, though native strings would be nice. (The PEP only covers CGI variables.) I mean the ones required by the spec, not server-specific extensions. The PEP choice was for uniformity. At one point, I advocated simply using surrogateescape coding, but this couldn't be made uniform across Python versions and maintain compatibility. As an open question to anyone: is surrogateescape available in Python 2.6? Mandating that as a minimum version for PEP 444 has yielded benefits in terms of back-ported features and syntax, like b''. No, otherwise I'd totally go for the surrogateescape approach. Heck, I'd still go for it if it were possible to write a surrogateescape handler for 2.6, and require that a PEP 444 server register one with Python's codec system. I don't know if it's *possible*, though, hopefully someone with more knowledge can weigh in on that. :: Cross-compatibility considerations. The definition and use of native strings vs. byte strings is the biggest example of this in the rewrite. I'm not sure what you mean here. Do you mean "portability of WSGI 2 code samples across Python versions (esp. 2.x vs. 3.x)?"
It should be possible (and currently is, as demonstrated by marrow.server.http) to create a polyglot server, polyglot middleware/filters (demonstrated by marrow.wsgi.egress.compression), and polyglot applications, though obviously polyglot code demands the "lowest common denominator" in terms of feature use. Application / framework authors would likely create Python 3 specific WSGI applications to make use of the full Python 3 feature set, with cross-compatibility relegated to server and middleware authors. I'm just asking whether, in your statement of goals and rationale, you would expand "cross compatibility" as meaning cross-python version portability, or whether you meant something else.
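(A minimal sketch of the "quick, 10-line utility function" style of middleware that the 3-tuple return is meant to allow. The exact PEP 444 calling convention was still under discussion in this thread, so the (status, headers, body) shape and the names hello_app and add_header are illustrative assumptions only.)

```python
def hello_app(environ):
    # An application simply returns (status, headers, body).
    body = b"Hello, World!"
    headers = [("Content-Type", "text/plain"),
               ("Content-Length", str(len(body)))]
    return "200 OK", headers, [body]


def add_header(app, name, value):
    """Middleware as a plain function: call the app, adjust the result."""
    def middleware(environ):
        status, headers, body = app(environ)
        return status, headers + [(name, value)], body
    return middleware


wrapped = add_header(hello_app, "X-Demo", "1")
status, headers, body = wrapped({"REQUEST_METHOD": "GET"})
```

With a plain return value, the middleware is an ordinary function call; once the application may instead be a generator that yields, the same wrapper has to iterate and forward anything that is not a 3-tuple.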
Re: [Web-SIG] PEP 444 / WSGI 2 Async
At 12:39 AM 1/7/2011 -0800, Alice BevanMcGregor wrote: Earlier in this post I illustrated a few that directly apply to a commercial application I am currently writing. I'll elaborate: :: Image scaling would benefit from multi-processing (spreading the load across cores). Also, only one scale is immediately required before returning the post-upload page: the thumbnail. The other scales can be executed without halting the WSGI application's return. :: Asset content extraction and indexing would benefit from threading, and would also not require pausing the WSGI application. :: Since most templating engines aren't streaming (see my unanswered thread in the general mailing list re: this), pausing the application pending a particularly difficult render is a boon to single-threaded async servers, though true streaming templating (with flush semantics) would be the holy grail. ;) In all these cases, ISTM the benefit is the same if you future the WSGI apps themselves (which is essentially what most current async WSGI servers do, AFAIK). :: Long-duration calls to non-async-aware libraries such as DB access. The WSGI application could queue up a number of long DB queries, pass the futures instances to the template, and the template could then .result() (block) across them or yield them to be suspended and resumed when the result is available. :: True async is useful for WebSockets, which seem a far superior solution to JSON/AJAX polling in addition to allowing real web-based socket access, of course. The point as it relates to WSGI, though, is that there are plenty of mature async APIs that offer these benefits, and some of them (e.g. Eventlet and Gevent) do so while allowing blocking-style code to be written. That is, you just make what looks like a blocking call, but the underlying framework silently suspends your code, without tying up the thread. Or, if you can't use a greenlet-based framework, you can use a yield-based framework.
Or, if for some reason you really wanted to write continuation-passing style code, you could just use the raw Twisted API. But in all of these cases you would be better off than if you used a half-implementation of the same thing using futures under WSGI, because all of those frameworks already have mature and sophisticated APIs for doing async communications and DB access. If you try to do it with WSGI under the guise of "portability", all this means is that you are stuck rolling your own replacements for those existing APIs. Even if you've already written a bunch of code using raw sockets and want to make it asynchronous, Eventlet and Gevent actually let you load a compatibility module that makes it all work, by replacing the socket API with an exact duplicate that secretly suspends your code whenever a socket operation would block. IOW, if you are writing a truly async application, you'd almost have to be crazy to want to try to do it *portably*, vs. picking a full-featured async API and server suite to code against. And if you're migrating an existing, previously-synchronous WSGI app to being asynchronous, the obvious thing to do would just be to grab a copy of Eventlet or Gevent and import the appropriate compatibility modules, not rewrite the whole thing to use futures. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 05:27 PM 1/7/2011 +1100, Graham Dumpleton wrote: Another thing though. For output changed to sys.stdout.buffer. For input should we be using sys.stdin.buffer as well if want bytes? %&$*()&%!!! Sorry, still getting used to this whole Python 3 thing. (Honestly, I don't even use Python 2.6 for anything real yet.) Good thing I tried running this. Did we all assume that someone else was actually running it to check it? :-) Well, I only recently started changing the examples to actual Python 3, vs being the old Python 2 examples. Though, I'm not sure anybody ever ran the Python 2 ones. ;-) ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 05:13 PM 1/7/2011 +1100, Graham Dumpleton wrote: The version at: http://svn.python.org/projects/peps/trunk/pep-.txt still shows:

    elif not headers_sent:
        # Before the first output, send the stored headers
        status, response_headers = headers_sent[:] = headers_set
        sys.stdout.write('Status: %s\r\n' % status)
        for header in response_headers:
            sys.stdout.write('%s: %s\r\n' % header)
        sys.stdout.write('\r\n')

so not using buffer there and also not converting strings written for headers to bytes.

Fixed in SVN now. The main issue now is that we need to fix the re-raises and error handling for Python 3, in the text and examples. I also found some references for with_traceback() and I think I've got that sorted now, but if someone can check my work that'd be a good idea.
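(A minimal sketch of the point being made: on Python 3 the CGI gateway has to write bytes to the binary stream, so header strings need encoding first. The helper names wsgi_to_bytes and write_headers are illustrative, io.BytesIO stands in for sys.stdout.buffer so the output can be inspected, and ISO-8859-1 is used because WSGI header strings are limited to that range.)

```python
import io


def wsgi_to_bytes(s):
    # Header strings are native str restricted to ISO-8859-1 code points.
    return s.encode("iso-8859-1")


def write_headers(out, status, response_headers):
    """Write CGI-style status and header lines to a binary stream."""
    out.write(wsgi_to_bytes("Status: %s\r\n" % status))
    for header in response_headers:
        out.write(wsgi_to_bytes("%s: %s\r\n" % header))
    out.write(wsgi_to_bytes("\r\n"))


out = io.BytesIO()   # in a real gateway this would be sys.stdout.buffer
write_headers(out, "200 OK", [("Content-Type", "text/plain")])
written = out.getvalue()
```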
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 05:00 PM 1/7/2011 +1100, Graham Dumpleton wrote: Stupid question first. When running 2to3 on the example CGI code,

Don't do that. It's supposed to already be Python 3 code. ;-) It did, however, reveal a bug where I have not in fact done the correct Python 3 thing:

             if headers_sent:
                 # Re-raise original exception if headers sent
    -            raise exc_info[0], exc_info[1], exc_info[2]
    +            raise exc_info[0](exc_info[1]).with_traceback(exc_info[2])
         finally:
             exc_info = None     # avoid dangling circular ref

Can somebody weigh in on what the correct translation here is? The only real Python 3 coding I've done to date has been experiments to test changes to other aspects of WSGI. ;-)
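(A minimal sketch of one commonly used Python 3 translation, offered as a suggestion and not as the answer given in the thread: re-raise the saved exception instance itself with its saved traceback. Calling exc_info[0](exc_info[1]) instead constructs a new exception whose only argument is the old one, so the original object and its arguments are not what gets raised. The reraise() helper is a name made up for illustration.)

```python
import sys


def reraise(exc_info):
    """Re-raise the exception described by a sys.exc_info() triple."""
    try:
        # exc_info[1] is already the exception instance; attach the
        # saved traceback and raise that same object again.
        raise exc_info[1].with_traceback(exc_info[2])
    finally:
        exc_info = None   # avoid dangling circular ref


try:
    try:
        raise ValueError("original failure")
    except ValueError:
        saved = sys.exc_info()
    reraise(saved)
except ValueError as exc:
    caught = exc

same_object = caught is saved[1]
wrapped = saved[0](saved[1])   # what the "+" line in the diff would build
```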
Re: [Web-SIG] PEP 444 / WSGI 2 Async
At 05:47 PM 1/6/2011 -0800, Alice BevanMcGregor wrote: Tossing the idea around all day long will then, of course, be happening regardless. Unfortunately for that particular discussion, PEP 3148 / Futures seems to have won out in the broader scope. Do any established async frameworks or server (e.g. Twisted, Eventlet, Gevent, Tornado, etc.) make use of futures? Having a ratified and incorporated language PEP (core in 3.2 w/ compatibility package for 2.5 or 2.6+ support) reduces the scope of async discussion down to: "how do we integrate futures into WSGI 2" instead of "how do we define an async API at all". It would be helpful if you addressed the issue of scope, i.e., what features are you proposing to offer to the application developer. While the idea of using futures presents some intriguing possibilities, it seems to me at first glance that all it will do is move the point where the work gets done. That is, instead of simply running the app in a worker, the app will be farming out work to futures. But if this is so, then why doesn't the server just farm the apps themselves out to workers? I guess what I'm saying is, I haven't heard use cases for this from the application developer POV -- why should an app developer care about having their app run asynchronously? So far, I believe you're the second major proponent (i.e. ones with concrete proposals and/or implementations to discuss) of an async protocol... and what you have in common with the other proponent is that you happen to have written an async server that would benefit from having apps operating asynchronously. ;-) I find it hard to imagine an app developer wanting to do something asynchronously for which they would not want to use one of the big-dog asynchronous frameworks. (Especially if their app involves database access, or other communications protocols.) 
This doesn't mean I think having a futures API is a bad thing, but ISTM that a futures extension to WSGI 1 could be defined right now using an x-wsgi-org extension in that case... and you could then find out how many people are actually interested in using it. Mainly, though, what I see is people using the futures thing to shuffle off compute-intensive tasks... but if they do that, then they're basically trying to make the server's life easier... but under the existing spec, any truly async server implementing WSGI is going to run the *app* in a "future" of some sort already... Which means that the net result is that putting in async is like saying to the app developer: "hey, you know this thing that you just could do in WSGI 1 and the server would take care of it for you? Well, now you can manage that complexity by yourself! Isn't that wonderful?" ;-) I could be wrong of course, but I'd like to see what concrete use cases people have for async. We dropped the first discussion of async six years ago because someone (I think it might've been James) pointed out that, well, it isn't actually that useful. And every subsequent call for use cases since has been answered with, "well, the use case is that you want it to be async." Only, that's a *server* developer's use case, not an app developer's use case... and only for a minority of server developers, at that. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
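To make the "the server already runs the app in a future" point concrete, here is a minimal sketch of a server-side dispatcher that hands an ordinary synchronous WSGI 1 application to a thread pool from concurrent.futures. The names run_app and hello_app are hypothetical, the write() callable is omitted, and a real server would stream the body instead of collecting it.

```python
from concurrent.futures import ThreadPoolExecutor

def hello_app(environ, start_response):
    # An ordinary synchronous WSGI 1 application.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello, world!\n']

def run_app(app, environ):
    """Call a WSGI app and collect (status, headers, body); illustrative only."""
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Sketch: the legacy write() callable is not provided here.
        captured['status'] = status
        captured['headers'] = headers

    result = app(environ, start_response)
    try:
        body = b''.join(result)
    finally:
        # WSGI requires calling close() on the iterable if it has one.
        if hasattr(result, 'close'):
            result.close()
    return captured['status'], captured['headers'], body

# The server, not the application, decides to farm the work out.
with ThreadPoolExecutor(max_workers=4) as pool:
    future = pool.submit(run_app, hello_app, {'REQUEST_METHOD': 'GET'})
    status, headers, body = future.result()
```

Nothing in hello_app had to change for this to work, which is the crux of the objection: the application author gains nothing by managing the future personally.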
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 09:51 AM 1/7/2011 +1100, Graham Dumpleton wrote:
> Is that the last thing or do I need to go spend some time and write my own CGI/WSGI bridge for Python 3 based on my own Python 2 one I have lying around and just do some final validation checks with a parallel implementation as a sanity check to make sure we got everything? This might be a good idea anyway.

It would. In the meantime, though, I've checked in the two-line change to add .buffer in. ;-)
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 10:33 AM 1/4/2011 -0800, Guido van Rossum wrote:
> But the flush() I was referring to is actually *before* either of these, suggesting
>
>     sys.stdout.flush()
>     sys.stdout.buffer.write(data)
>     sys.stdout.buffer.flush()
>
> However the first flush() is only necessary if there's a possibility that previously something had been written to sys.stdout (not to sys.stdout.buffer).

Yeah, that sort of error checking seems out of scope for a PEP example. If something was written to sys.stdout at that point, your CGI was already broken. ;-)

> > For the CGI example in the PEP, I don't want to bother trying to make it
> > fully production-usable; that's what we have wsgiref in the stdlib for. So
> > I won't worry about mixing strings and regular output in the example, even
> > if perhaps wsgiref should add the StringIO's proposed by Graham.
>
> Not sure what you mean. Surely copying and pasting the examples should work? What are the details you'd like to leave out?

I meant that a production CGI gateway should handle various boundary/error conditions. Graham was saying that a CGI gateway should replace sys.stdout to avoid stray print()s causing problems, and I consider that similar to saying, "what if somebody wrote text to sys.stdout?" -- i.e., an error handling case that would be a good idea in a production gateway, but which would obscure the common case that the example is meant to illustrate.
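For readers who want to try the three-call sequence discussed here, the following sketch wraps it in a function and exercises it against a stand-in for sys.stdout. The helper name write_bytes and the fake stream are assumptions made for the example; only flush(), buffer.write() and buffer.flush() come from the thread.

```python
import io
import sys

def write_bytes(data, stdout=None):
    """Write bytes through a text stream's underlying binary buffer (sketch)."""
    stdout = sys.stdout if stdout is None else stdout
    stdout.flush()             # only matters if text was written earlier
    stdout.buffer.write(data)
    stdout.buffer.flush()

# Stand-in for sys.stdout so the sketch can be checked without a terminal.
raw = io.BytesIO()
fake_stdout = io.TextIOWrapper(raw, encoding='utf-8')
write_bytes(b'Status: 200 OK\r\n\r\n', fake_stdout)
```

A production gateway would also have to cope with a replaced sys.stdout that has no buffer attribute; that case is deliberately left out, in the same spirit as the PEP example.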
Re: [Web-SIG] PEP 444 Goals
At 12:52 PM 1/6/2011 -0800, Alice BevanMcGregor wrote:
> Ignoring async for the moment, the goals of the PEP 444 rewrite are:
>
> :: Clear separation of "narrative" from "rules to be followed". This allows developers of both servers and applications to easily run through a conformance "check list".
>
> :: Isolation of examples and rationale to improve readability of the core rulesets.
>
> :: Clarification of often mis-interpreted rules from PEP 333 (and those carried over in ).
>
> :: Elimination of unintentional non-conformance, esp. re: cgi.FieldStorage.
>
> :: Massive simplification of call flow. Replacing start_response with a returned 3-tuple immensely simplifies the task of middleware that needs to capture HTTP status or manipulate (or even examine) response headers. [1]

A big +1 to all the above as goals.

> :: Reduction of re-implementation / NIH syndrome by incorporating the most common (1%) of features most often relegated to middleware or functional helpers.

Note that nearly every application-friendly feature you add will increase the burden on both server developers and middleware developers, which ironically means that application developers actually end up with fewer options.

> Unicode decoding of a small handful of values (CGI values that pull from the request URI) is the biggest example. [2, 3]

Does that mean you plan to make the other values bytes, then? Or will they be unicode-y-bytes as well? What happens for additional server-provided variables?

The PEP choice was for uniformity. At one point, I advocated simply using surrogateescape coding, but this couldn't be made uniform across Python versions and maintain compatibility. Unfortunately, even with the move to 2.6+, this problem remains, unless server providers are required to register a surrogateescape error handler -- which I'm not even sure can be done in Python 2.x.

> :: Cross-compatibility considerations. The definition and use of native strings vs. byte strings is the biggest example of this in the rewrite.

I'm not sure what you mean here. Do you mean "portability of WSGI 2 code samples across Python versions (esp. 2.x vs. 3.x)?"
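The call-flow goal is easiest to appreciate from the middleware side. The sketch below assumes the draft's convention that an application takes environ and returns a (status, headers, body) 3-tuple; the names app and add_header are hypothetical, and the exact string types the draft requires are not reproduced here.

```python
def app(environ):
    # Hypothetical draft-style application: no start_response callable.
    return b'200 OK', [(b'Content-Type', b'text/plain')], [b'hi\n']

def add_header(wrapped, name, value):
    """Middleware that appends one response header (illustrative only)."""
    def middleware(environ):
        status, headers, body = wrapped(environ)
        # Status and headers are plain return values, so there is nothing
        # to intercept or replay as with a start_response() callable.
        return status, list(headers) + [(name, value)], body
    return middleware

status, headers, body = add_header(app, b'X-Example', b'1')({})
```

Under WSGI 1 the same middleware has to wrap start_response and deal with the possibility that it is called late or more than once, which is where most of the boilerplate comes from.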
Re: [Web-SIG] PEP 444 / WSGI 2 Async
At 01:03 PM 1/6/2011 +, chris.d...@gmail.com wrote: Does that apply here? It seems you either allow unicode strings or you don't, not a certain subsection. That's why PEP requires bytes instead - only the application knows what it's sending, and the server and middleware shouldn't have to guess. My general rule is unicode inside, UTF-8 at the boundaries. Which would be easy to enforce if you can only yield bytes, as is the case with PEP . I worry a bit that right now, there may be Python 3.2 servers (other than the ones built on wsgiref.handlers) that may not be enforcing this rule yet. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
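A small sketch of "unicode inside, UTF-8 at the boundaries" together with the kind of check an enforcing server might apply. The checked() helper is hypothetical and is not taken from wsgiref; it only illustrates rejecting non-bytes chunks at the point where they are produced.

```python
def app(environ, start_response):
    text = 'na\u00efve caf\u00e9'          # real unicode inside the application
    body = text.encode('utf-8')           # bytes at the boundary
    start_response('200 OK', [
        ('Content-Type', 'text/plain; charset=utf-8'),
        ('Content-Length', str(len(body))),
    ])
    return [body]

def checked(iterable):
    """Reject non-bytes body chunks, as an enforcing server might (sketch)."""
    for chunk in iterable:
        if not isinstance(chunk, bytes):
            raise TypeError('response body chunks must be bytes')
        yield chunk

chunks = list(checked(app({}, lambda status, headers, exc_info=None: None)))
```

Failing at the yield point keeps the error next to the code that caused it, instead of surfacing later as garbled output.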
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 08:53 PM 1/4/2011 +1100, Graham Dumpleton wrote: BTW, to what extent are the examples in the PEP meant to be able to work on both Python 2.X and Python 3.X as is. Does it need to be clarified where examples will only work on Python 3.X, in particular the CGI gateway. The intention is that PEP will have examples specific to Python 3 in future. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 09:51 PM 1/4/2011 +1100, Graham Dumpleton wrote:
> Add another point.
>
> FWIW, these are coming up because of questions being asked on python-dev IRC channel about PEP .
>
> The issue, as it came down to, was that the PEP may not be clear enough in explaining that where str() is unicode, something like PATH_INFO, although unicode, is actually bytes decoded as ISO-8859-1, and needs to be re-encoded/decoded to get it back to Unicode in the required charset before use.
>
> They were thinking that because it was unicode already they could use it as is and not need to do anything. I.e., they didn't realise that they need to do:
>
>     path_info = environ.get('PATH_INFO', '')
>     path_info = path_info.encode('ISO-8859-1').decode('UTF-8')
>
> for example to get it interpreted as UTF-8 first.
>
> They were simply looking at concatenating new URL bits to the ISO-8859-1 variant from other unicode strings that weren't bytes represented as ISO-8859-1.
>
> In Python 2.X it was obvious that since it wasn't unicode that you had to decode it, but confusion may arise for Python 3.X if this requirement is not explicitly spelled out with a code example like above.
>
> We all may see it as obvious and yes perhaps it could be covered in separate articles or commentaries by people, but given this person was new to it, maybe it is deserving of more explanation in the PEP itself if they were confused.

It would be really awesome if somebody would write separate Application Authors' Guide and Middleware Authors' Guides to WSGI. They don't need to know absolutely everything in the PEP, unlike server authors.

> It could also be that the PEP covers it adequately already. I am too tired to read through it again right now.

It's pretty prominently stated early on that NO strings in the spec are really unicode, they're just bytes packed into unicode objects.

Obviously, no matter how prominently this is stated, some people will still make this mistake, but if desired, we could always put some additional info near the environ part of the spec for clarification.

(It occurs to me in retrospect that I should probably have updated wsgiref in the stdlib to check the bytesy-ness of strings used to create Header objects. Too late for 3.2, though.)
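The mistake described in this message can be reproduced in a few lines. The sketch below simulates what a server does to PATH_INFO and compares naive concatenation with the encode/decode round trip; native_to_text is a hypothetical helper name, and UTF-8 is assumed to be the charset the application wants.

```python
def native_to_text(value, charset='utf-8'):
    """Recover real text from a WSGI bytes-in-unicode string (hypothetical helper)."""
    return value.encode('iso-8859-1').decode(charset)

# Simulate the server side: raw request bytes decoded as ISO-8859-1.
raw_path = '/caf\u00e9'.encode('utf-8')
environ = {'PATH_INFO': raw_path.decode('iso-8859-1')}

# Concatenating real text onto the bytes-in-unicode value gives mojibake.
wrong = environ['PATH_INFO'] + '/menu'

# Round-tripping first gives the intended text.
right = native_to_text(environ.get('PATH_INFO', '')) + '/menu'
```

Both values are ordinary str objects, which is exactly why the problem is easy to miss on Python 3.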
Re: [Web-SIG] CGI in PEP 444
At 12:43 PM 1/4/2011 +, Antoine Pitrou wrote:
> Alice BevanMcGregor writes:
> > [1] http://bit.ly/e7rtI6
>
> So, while we are at it, could we get rid of the "CGI server example" in this new WSGI spec? This is 2011, and we should promote modern idioms, not encourage people to do 1995 Web programming. 10 years ago, CGI was already frowned upon. (and even the idea that WSGI should provide some kind of CGI compatibility sounds a bit ridiculous to me)
>
> Regards
>
> Antoine.

I still use CGI for the odd one-off, testing, prototyping, etc., and it's by far the easiest thing to deploy on a lot of web hosts. Hell, even Google App Engine *emulates* CGI in its default deployment configuration, IIRC. So it's not exactly obsolete.

Also, the main purpose of the example is to show what a web server developer needs to do to hook up their own piping to provide WSGI services... and most web server developers have something like CGI code already lying around, or at least know what CGI looks like.
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 06:30 PM 1/3/2011 -0800, Guido van Rossum wrote:
> Would sys.stdout.buffer.write(b'abc') do?
>
> (If you mix this with writing strings to sys.stdout directly, you may have to call sys.stdout.flush() first.)

The current code is:

    sys.stdout.write(data)   # TODO: this needs to be binary on Py3
    sys.stdout.flush()

Should I be using sys.stdout.buffer for both, or just the write?

For the CGI example in the PEP, I don't want to bother trying to make it fully production-usable; that's what we have wsgiref in the stdlib for. So I won't worry about mixing strings and regular output in the example, even if perhaps wsgiref should add the StringIO's proposed by Graham.
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 08:04 PM 1/3/2011 -0500, Randy Syring wrote: In the server/gateway example, there is a comment in the code that says: # TODO: this needs to be binary on Py3 The "TODO" part confuses me. In other areas of the PEP, there are comments like: # call must be byte-safe on Py3 which make sense. But is the TODO meant to be a TODO for the PEP or is it meant to be a note to the person running the example on Py3. If the latter, maybe "TODO" isn't the best prefix. FWIW, don't consider this an objection, it is just a question I had as I read through the PEP. Those are my TODO's for the PEP itself, and I've fixed a couple of them in SVN (probably around the time you were writing the above). If somebody can point me to the proper Py3 incantation for writing bytes to stdout, I'll fix the one remaining TODO marker as well. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] Declaring PEP 3333 accepted (was: PEP 444 != WSGI 2.0)
At 04:43 PM 1/3/2011 -0800, Guido van Rossum wrote: On Mon, Jan 3, 2011 at 3:13 PM, Jacob Kaplan-Moss wrote: > On Sun, Jan 2, 2011 at 9:21 AM, Guido van Rossum wrote: >> Although [PEP ] is still marked as draft, I personally think of it >> as accepted; [...] > > What does it take to get PEP formally marked as accepted? Is > there anything I can do to push that process forward? > > The lack of a WSGI answer on Py3 is the main thing that's keeping me, > personally, from feeling excited about the platform. Once that's done > I can feel comfortable coding to it -- and browbeating those who don't > support it. > > I understand that PEP 444/Web3/WSGI 2/whatever might be a better > answer, but it's clearly got some way to go. In the meantime, what's > next to get PEP officially endorsed and accepted? I haven't heard anyone speak up against it, ever, since it was submitted. If no-one speaks up in the next 24 hours consider it accepted (and after that delay, anyone with SVN privileges can mark it thus). There are a few minor changes to the code samples needed to make them proper Python 3; I just checked in the hairiest of them, though. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] PEP 444 != WSGI 2.0
At 02:21 PM 1/2/2011 -0800, Alice BevanMcGregor wrote:
> On 2011-01-02 11:57:19 -0800, P.J. Eby said:
> > * -1 on the key-specific encoding schemes for the various CGI variables' values -- for continuity of coding (not to mention simplicity) PEP 's approach to environ encodings should be used. (That is, the environ consists of bytes-in-unicode-form, rather than true unicode strings.)
>
> Does ISO-8859-1 not accommodate this for all but a small number of the environment variables in PEP 444?

PEP requires that environment variables contain the bytes of the HTTP headers, decoded using ISO-8859-1. The unicode strings, in other words, are restricted to code points in the 0-255 range, and are really just a representation of bytes, rather than being a unicode decoding of the contents of the bytes.

What I saw in your draft of PEP 444 (which I admittedly may be confused about) is language that simply loosely refers to unicode environment variables, which could easily be misconstrued as meaning that the values could actually contain other code points.

To be precise, in PEP 333, the "true" unicode value of an environment variable is:

    environ[key].encode('iso-8859-1').decode(appropriate_encoding_for_key)

Whereas, my reading of your current draft implies that this has to already be done by the server.

As I understand it, the problem with this is that the server developer can't always provide such a decoding correctly, and would require that the server guess, in the absence of any information that it could use to do the guessing. An application developer is in a better position to deal with this ambiguity than the server developer, and adding configuration to the server just makes deployment more complicated, and breaks application composability if two sub-applications within a larger application need different decodings.

That's the rationale for the PEP approach -- it essentially acknowledges that HTTP is bytes, and we're only using strings for the API conveniences they afford.
> > * Where is the PARAMETERS variable defined in the CGI spec? What servers actually support it?
>
> It's defined in the HTTP spec by way of http://www.ietf.org/rfc/rfc2396.txt URI Syntax. These values are there, they should be available. (Specifically semi-colon separated parameters and hash-separated fragment.)

I mean, what web servers currently provide PARAMETERS as a CGI variable? If it's not a CGI variable, it doesn't go in all caps.

What's more, the spec you reference points out that parameters can be placed in *each* path-segment, which means that they would:

1) already be in PATH_INFO, and
2) have multiple values

So, -1 on the notion of PARAMETERS, since AFAICT it is redundant, not CGI, and would only hold one parameter segment.

> > * The language about empty vs. missing environment variables appears to have disappeared; without it, the spec is ambiguous.
>
> I will re-examine the currently published PEP 444.

I don't know if it's in there or not; I've read your spec more thoroughly than that one. I'm referring to the language from PEP 333 and its successor, with which I'm much more intimately familiar.

> Indeed. I do try to understand the issues covered in a broader scope before writing; for example, I do consider the ability for new developers to get up and running without worrying about the example applications they are trying to use work in their version of Python; thus /allowing/ native strings to be used as response values on Python 3.

I don't understand. If all the examples in your PEP use b'' strings (per the 2.6+ requirement), where is the problem? They can't use WSGI 1(.0.1) code examples at all (as your draft isn't backward-compatible), so I don't see any connection there, either.
> Byte strings are still preferred, and may be more performant,

Performance was not the primary consideration; the primary ones were:

* One Obvious Way
* Refuse The Temptation To Guess
* Errors Should Not Pass Silently

The first two would've been fine with unicode; the third was the effective tie-breaker. (Since if you use Unicode, at some point you will send garbled data and end up with an error message far away from the point where the error occurred.)

> I certainly will; I just need to see concrete points against the technical merits of the rewritten PEP

Well, I've certainly given you some, but it's hard to comment other than abstractly on an async spec you haven't proposed yet. ;-)

Nonetheless, it's really important to understand that the PEP process (especially for Informational-track standards) is not so much about technical merits in an absolute sense, as it is about *community consensus*. And that means it's actually a political and marketing process at least as much as it is a technical one. If you miss
Re: [Web-SIG] PEP 444 != WSGI 2.0
At 12:38 PM 1/2/2011 -0800, Alice BevanMcGregor wrote: On 2011-01-02 11:14:00 -0800, Chris McDonough said: I'd suggest we just embrace it, adding minor tweaks as necessary, until we reach some sort of technical impasse it doesn't address. Async is one area that does not cover, and that by not having a standard which incorporates async means competing, incompatible solutions have been created. Actually, it's the other way 'round. I wrote off async for PEP 333 because the competing, incompatible solutions that already existed lacked sufficient ground to build a spec on. In effect, any direction I took would effectively have either blessed one async paradigm as the correct one, or else been a mess that nobody would've used anyway. And this condition largely still exists today: about the only common ground between at least *some* async Python frameworks today is the use of greenlets... but if you have greenlets, then you don't need a fancy async API in the first place, because you can just "block" during I/O from the POV of the app. The existence of a futures API in the stdlib doesn't alter this much, either, because the async frameworks by and large already had their own API paradigms for doing such things (e.g. Twisted deferreds and thread/process pools, or generator/greenlet-based APIs in other frameworks). The real bottleneck isn't even that, so much as that if you're going to write an async WSGI application, WSGI itself can't define enough of an async API to let you do anything useful. For example, you may still need database access that's compatible with the async environment you're using... so you'd only be able to write portable async applications if they didn't do ANY I/O outside of HTTP itself! That being the case, I don't see how a meaningfully cross-platform async WSGI can ever really exist, and be attractive both to application developers (who want to run on many platforms) and framework developers (who want many developers to choose their platform). 
On 2011-01-02 12:00:39 -0800, Guido van Rossum said: Actually that does sound like an opinion on the technical merits. I can't tell though, because I'm not familiar enough with PEP 444 to know what the critical differences are compared to PEP . Could someone summarize? Async, distinction between byte strings (type returned by socket.read), native strings, and unicode strings, What distinction do you mean? I see a difference in *how* you're distinguishing byte, native, and unicode strings, but not *that* they're distinguished from one another. (i.e., PEP distinguishes them too.) thorough unicode decoding (moving some of the work from middleware to the server), What do you mean by "thorough decoding" and "moving from middleware to server"? Are these references to the redundant environ variables, to the use of decoded headers (rather than bytes-in-unicode ones), or something else? The async part is an idea in my head that I really do need to write down, clarified with help from agronholm on IRC. The futures PEP is available as a pypi installable module, is core in 3.2, and seems to provide a simple enough abstract (duck-typed) interface that it should be usable as a basis for async in PEP 444. I suggest reviewing the Web-SIG history of previous async discussions; there's a lot more to having a meaningful API spec than having a plausible approach. It's not that there haven't been past proposals, they just couldn't get as far as making it possible to write a non-trivial async application that would actually be portable among Python-supporting asynchronous web servers. (Now, if you're proposing that web servers run otherwise-synchronous applications using futures, that's a different story, and I'd be curious to see what you've come up with. But that's not the same thing as an actually-asynchronous WSGI.) 
Re: [Web-SIG] PEP 444 != WSGI 2.0
At 01:47 AM 1/2/2011 -0800, Alice BevanMcGregor wrote:
> The only things that depress me in the slightest are the lack of current discussion on the Web-SIG list (I did post a thread announcing my rewrite and asking for input, but there were no takers)

FWIW, my lack of interest has been due to two factors. First, the original version of PEP 444 before you worked on it was questionable in my book, due to the addition of new optional features (e.g. async), and second, when I saw your "filters" proposal, it struck me as more of the same, and put me off from investigating further. (The whole idea of having a WSGI 2 (IMO at least), was to get rid of cruft and fix bugs, not to add new features.)

Reading the draft now (I had not done so previously), I would suggest making your filters proposal available as a standalone module (or an addition to a future version of wsgiref.util), for anybody who wants that feature, e.g. via:

    def filter_app(app, ingress=(), egress=()):
        def wrapped(environ):
            for f in ingress:
                f(environ)
            s, h, b = app(environ)
            for f in egress:
                s, h, b = f(environ, s, h, b)
            return s, h, b
        return wrapped

(Writing this implementation highlights one problem with the notion of filters, though, and that's that egress filters can't trap an error in the application.)

As far as the rest of the draft is concerned, in order to give *thorough* technical feedback, I'd need to first make sure that all of the important rules from are still present, which is hard to do without basically printing out both PEPs and doing a line-by-line analysis.

I notice that file_wrapper is missing, for example, but am not clear on the rationale for its removal. Is it your intention that servers wishing to support file iteration should check for a fileno() directly?

There are a number of minor errors in the draft, such as saying that __iter__ must return a bytes instance (it should return an iterator yielding bytes instances) and __iter__ itself has broken markup.
On other matters:

* I remain a strong -1 on the .script_name and .path_info variables (one of which is named incorrectly in the draft), for reasons outlined here previously. (Specifically, that environ redundancy is a train wreck for middleware, which must be written to support both ways of providing the same variable, or to delete the extended version when modifying the environment.)

* -1 on the key-specific encoding schemes for the various CGI variables' values -- for continuity of coding (not to mention simplicity) PEP 's approach to environ encodings should be used. (That is, the environ consists of bytes-in-unicode-form, rather than true unicode strings.)

* +1 to requiring Python 2.6 -- standardizing on b'' notation makes good sense and improves forward-portability to Python 3.

* Where is the PARAMETERS variable defined in the CGI spec? What servers actually support it?

* The language about empty vs. missing environment variables appears to have disappeared; without it, the spec is ambiguous.

Those are the issues I have identified on a first reading, without doing a full analysis. However, in lieu of such an analysis, I would take Graham's word on whether or not your spec addresses the HTTP compliance/implementation issues found in previous WSGI specs.

If I may offer a recommendation from previous experience steering this PEP, I would say that just because other people know (say) HTTP better than you, doesn't mean that you can't still make a better spec than they can. You don't have to make *your* proposal into *Graham's* spec, or even the spec that Graham would have wanted you to make. But you *do* need to take Graham's *goals* into consideration.

During the original drafting of PEP 333, Ian proposed quite a lot of features that I shot down... but I modified what I was doing so that Ian could still achieve his *goals* within the spec, without compromising my core vision for the spec (which was that it should be as close to trivial for server implementers as possible, and not expose any application API that might be perceived by framework developers as competition).

So, I urge you to pay attention when Graham says something about what the spec is lacking or how it's broken. You don't have to fix it *his* way, but you do need to fix it such that he can still get there. WSGI really does need some fixes for chunking and streaming, and Graham absolutely has the expertise in that area, and I would 100% defer to his judgment on whether some proposal of mine got it right in that area.

That doesn't mean I would just adopt whatever he proposed, though, if it didn't meet *my* goals. That's the hard part of making a spec, but it's also the part that makes one great.
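The parenthetical remark about egress filters can be checked directly. The sketch below repeats filter_app from the message and drives it with made-up filters and applications (mark_seen, upper_body, app and broken_app are all hypothetical); the 3-tuple calling convention is the draft's, assumed here only for illustration.

```python
def filter_app(app, ingress=(), egress=()):
    # Same shape as the filter_app suggested in the message.
    def wrapped(environ):
        for f in ingress:
            f(environ)
        s, h, b = app(environ)
        for f in egress:
            s, h, b = f(environ, s, h, b)
        return s, h, b
    return wrapped

def mark_seen(environ):                   # hypothetical ingress filter
    environ['example.seen'] = True

def upper_body(environ, s, h, b):         # hypothetical egress filter
    return s, h, [chunk.upper() for chunk in b]

def app(environ):
    return b'200 OK', [], [b'hello']

def broken_app(environ):
    raise RuntimeError('app failed')

env = {}
result = filter_app(app, [mark_seen], [upper_body])(env)

try:
    filter_app(broken_app, egress=[upper_body])({})
    egress_ran = True
except RuntimeError:
    egress_ran = False   # the error bypassed every egress filter
```

An egress filter that wanted to turn the failure into an error page would have to be written as ordinary wrapping middleware instead, which is the limitation the message points out.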
Re: [Web-SIG] PEP 444 != WSGI 2.0
At 05:04 PM 1/2/2011 +1100, Graham Dumpleton wrote:
> That PEP was rejected in the end and was replaced with PEP 342 which worked quite differently, yet I can't see that the WSGI specification was revisited in light of how it ended up being implemented and the implications of that.

Part of my contribution to PEP 342 was ensuring that it was sufficiently PEP 325-compatible to ensure that PEP 333 wouldn't *need* revisiting. At least, not with respect to generator close() methods anyway. ;-)
[Web-SIG] Should PEP 3333 be Python 3-only? What about transcoding?
As I've been tidying up wsgiref in the stdlib for PEP , I've been noticing that there's a bit of an issue with the PEP as far as CGI variables.

Currently, the CGI example is the same as it is in PEP , which means that it's correct code for Python 2.x, but wrong for 3.x due to the environment transcoding issue. (See http://bugs.python.org/issue10155 for details.)

There are other code sample differences, too. In effect, PEP is still using Python 2 code samples, because it's trying to cover every version of Python from 2.1 through 3.2. Should we ditch that, and say, "hey, if you want Python 2.x code samples, go see PEP 333?"

That will simplify a couple of things, but still won't address the transcoding issue. Specifically, the problem is that on Python 3, os.environ contains *unicode*, not bytes masquerading as unicode. Unfortunately, this means that it very possibly contains garbage for CGI variables, as the web server puts bytes in the environment, then Python converts those bytes to unicode using the system encoding + surrogateescape.

To get back to bytes, then, we have to encode using the same combination, then decode with latin-1 to get back to a WSGI-compatible string. The hitch is this: not everything in os.environ comes from an HTTP request, and therefore may not be decodable in such a fashion. For example, if you transcode TMP or HOME or even DOCUMENT_ROOT that way, you're going to get rubbish.

In wsgiref for the stdlib, I've used a variation of And Clover's patch in issue #10155 to implement something that *only* transcodes CGI variables that come from the web client request, but it's dreadfully complex.

This isn't really a problem in wsgiref, because as far as I know, nobody else has bothered to make another CGI WSGI runner besides the one in wsgiref, and the sample in the PEP.

But it is a problem for the PEP, because the complexity involved is high -- so high it would completely obscure the essential simplicity of the CGI example, if it was written in-line.

There are many possible ways to address this, but my current leaning is to:

1. Change the PEP code samples to Python 3 only, and backreference PEP 333 for Python 2 code samples

2. Make the CGI sample in do an indiscriminate transcode (which only takes a few lines) and add a note to indicate that a robust CGI implementation should only do it to CGI variables, suggesting the wsgiref.handlers.read_environ() code as an example.

Any thoughts?
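For reference, the "indiscriminate transcode" really is only a few lines. The sketch below assumes the environment was decoded with the filesystem encoding plus surrogateescape, and it uses a fake environment so the behaviour can be checked anywhere; transcode and indiscriminate_environ are hypothetical names, not wsgiref APIs.

```python
import sys

def transcode(value, fs_encoding=None):
    """Turn an os.environ str back into a WSGI bytes-in-unicode str (sketch)."""
    fs_encoding = fs_encoding or sys.getfilesystemencoding()
    # Undo Python's decoding to recover the original bytes, then
    # represent those bytes as latin-1 code points.
    return value.encode(fs_encoding, 'surrogateescape').decode('iso-8859-1')

def indiscriminate_environ(os_environ, fs_encoding=None):
    # Transcodes everything, including TMP or HOME, which a robust
    # gateway should leave alone.
    return {k: transcode(v, fs_encoding) for k, v in os_environ.items()}

# Simulate a POSIX server that put UTF-8 bytes into the environment.
raw = '/caf\u00e9'.encode('utf-8')
fake_os_environ = {'PATH_INFO': raw.decode('utf-8', 'surrogateescape')}
environ = indiscriminate_environ(fake_os_environ, 'utf-8')
```

The selective version, which only touches variables that originate from the client request, is what wsgiref.handlers.read_environ() is for.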
Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?
At 02:26 PM 10/23/2010 +0300, Armin Ronacher wrote: Hi, On 10/22/10 2:35 AM, Graham Dumpleton wrote: has said: """Hopefully not. WSGI could do better and there is a proposal for that (444).""" Just to give this some more context: I think WSGI (even in Python 2) is faulty and we have the possibility now to fix it. Nobody in the web community is really eager to use Python 3 currently as far as I can see, so we have some extra time where we can actually introduce some value in to web development on Python 3. An improved WSGI specification could be a key to that. If PEP is what we end up with, that is fine with me as well. I don't think it's an either-or case. PEP just means that there's a clear path to port WSGI 1 apps. If somebody wants to champion a WSGI 1.1, a 2.0, and whatever else, that's great! I'm really trying to step *down* from involvement in this; the only reason I stepped up to do this now is because of the pending 3.2 release and the open question(s) over stdlib APIs that have to stabilize in this release. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] Is PEP 3333 the final solution for WSGI on Python 3?
At 10:35 AM 10/22/2010 +1100, Graham Dumpleton wrote:
> Anyone care to comment on my blog post?
>
> http://blog.dscpl.com.au/2010/10/is-pep--final-solution-for-wsgi-on.html
>
> As far as web framework developers commenting, Armin at:
>
> http://www.reddit.com/r/Python/comments/du7bf/is_pep__the_final_solution_for_wsgi_on_python/
>
> has said: """Hopefully not. WSGI could do better and there is a proposal for that (444)."""
>
> So, it looks like he is very cool on the idea. No other developers of actual web frameworks have commented at all on PEP 3333 from what I can see.
>
> Graham

Just out of curiosity, Graham, isn't PEP 3333 basically only a slight modification to what you yourself proposed and implemented in mod_wsgi for Python 3?

My guess is that there's been no comment because there's really not much to say about it. The most controversial thing about it was Python-Dev's objection to modifying PEP 333 in place -- and that's the *only* reason why it's a new PEP at all. (Indeed, I originally just made the discussed amendments to PEP 333, and specifically wanted to avoid having a new PEP number, so as not to create unnecessary additional discussion or questions.)
Re: [Web-SIG] wsgiref 0.2 dev in svn w/PEP 3333 support
At 09:37 PM 10/9/2010 +0200, And Clover wrote:
> On 10/06/2010 07:21 PM, P.J. Eby wrote:
>> How would these relate to the Python 3.2 release? Can you make 3.x and 2.x versions?
>
> Yes, I have separate fixup code paths for 2.x and 3.x. 3.x faces the reverse situation to that previously described, in that os.environ is accurate on Windows but needs reverse-decoding on POSIX. Currently I use utf-8 and surrogateescape, but for Python 3.2 presumably os.environb will be the safer bet.

Ok; if you can submit patches against svn://svn.eby-sarna.com/svnroot/wsgiref (for 2.x) and http://svn.python.org/projects/python/branches/py3k/Lib/wsgiref (for 3.x), adding an IISCGIHandler and whatever else, I'll review and apply them.

Note, by the way, that just because the environment is unicode on 3.x, doesn't mean it's WSGI-correct: WSGI requires that unicode environment strings be just bytestrings in disguise. It's actually an error if those environment strings contain any character with a code point greater than 255!
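To make the "bytestrings in disguise" point concrete, here is a minimal sketch of the POSIX-side transcoding described above. It assumes the environment was decoded as UTF-8 with surrogateescape, as And mentions; the helper names (`to_wsgi_str`, `is_wsgi_str`, `from_wsgi_str`) are made up for illustration and are not part of wsgiref.

```python
def to_wsgi_str(value, fs_encoding="utf-8"):
    """Re-express a decoded environment value as a 'bytes in disguise' str.

    Assumption: `value` was decoded from the OS environment with
    `fs_encoding` and the surrogateescape error handler.
    """
    raw = value.encode(fs_encoding, "surrogateescape")  # recover the original bytes
    return raw.decode("iso-8859-1")  # one character per byte, all <= U+00FF


def is_wsgi_str(value):
    """True if every character fits in one byte, as WSGI expects."""
    return all(ord(ch) <= 255 for ch in value)


def from_wsgi_str(value, encoding="utf-8"):
    """What an application would do to get real text back out."""
    return value.encode("iso-8859-1").decode(encoding)


path = "/caf\u00e9"             # as seen in a UTF-8-decoded os.environ
wsgi_path = to_wsgi_str(path)   # what would go into the WSGI environ
```

The round trip through ISO-8859-1 is lossless because every byte value maps to exactly one code point, so the application is free to pick whichever codec it believes the client actually used.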
Re: [Web-SIG] wsgiref 0.2 dev in svn w/PEP 3333 support
At 01:28 PM 10/6/2010 +0200, And Clover wrote:
> On 10/05/2010 04:23 AM, P.J. Eby wrote:
>> A preliminary update of the standalone (Python <3.x) version of wsgiref is now available
>
> Is there any interest in putting fixup code into wsgiref's CGIHandler? I appreciate this is really ugly, but the CGI-to-WSGI gateway is the most logical place for this, as otherwise the WSGI environment created by CGIHandler often doesn't meet the requirements of the spec. Trying to fix these problems at an application, framework or middleware level is impractical because they don't know that the WSGI environ originally came from CGI. (And they can't re-read the environ at that point without breaking environ-altering middleware.)
>
> In particular: for Python 2.x running on Win32, read the environment using ctypes where available, allowing non-ASCII characters to be read directly instead of irretrievably mangled by the ANSI-code-page-encoded os.environ interface. Then encode the extracted Unicode environ to byte strings using ISO-8859-1, except if the server software is Microsoft/IIS, where the encoding will probably be UTF-8.
>
> IIS also needs a fix to remove the duplicated SCRIPT_NAME from the front of PATH_INFO. This is a bit more risky as existing apps/libraries may already be doing this and might get confused if someone's already done the fix. Maybe a subclass like IISCGIHandler?

How would these relate to the Python 3.2 release? Can you make 3.x and 2.x versions? (I currently consider getting 3.2 out a higher priority, and want equity between the standalone 0.2 and the bundled version in 3.2.)
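For readers unfamiliar with the IIS quirk, a rough sketch of the PATH_INFO fix-up being proposed might look like the following. The function name is hypothetical, and this is only one plausible way to do it, not the code that was eventually submitted.

```python
def strip_duplicated_script_name(environ):
    """Remove a leading copy of SCRIPT_NAME from PATH_INFO, if present.

    Hypothetical helper: assumes the gateway (IIS, per the message above)
    has prepended SCRIPT_NAME to PATH_INFO.
    """
    script = environ.get("SCRIPT_NAME", "")
    path = environ.get("PATH_INFO", "")
    # Compare with a trailing "/" so "/app" does not match "/application".
    if script and (path + "/").startswith(script + "/"):
        environ["PATH_INFO"] = path[len(script):]
    return environ


iis_environ = strip_duplicated_script_name(
    {"SCRIPT_NAME": "/app.py", "PATH_INFO": "/app.py/users/1"}
)
other_environ = strip_duplicated_script_name(
    {"SCRIPT_NAME": "/app", "PATH_INFO": "/application/x"}
)
```

The risk And mentions still applies: an application mounted at `/app.py` that legitimately serves a path beginning with `/app.py/` cannot be told apart from the duplicated case, which is one argument for putting the fix in an opt-in subclass.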
[Web-SIG] wsgiref 0.2 dev in svn w/PEP 3333 support
A preliminary update of the standalone (Python <3.x) version of wsgiref is now available using "easy_install wsgiref==dev". Relevant diffs are here: http://svn.eby-sarna.com/?rev=2689&view=rev This is preliminary work that'll need to be ported to the Python 3 version of wsgiref (note that the standalone version is *not* 2to3 friendly as yet), but I wanted to get the basic implementation done before porting.
Re: [Web-SIG] PEP 3333 (WSGI 1.0.1) - Now updated with wsgi.org amendments
At 12:04 PM 10/4/2010 -0400, P.J. Eby wrote:
> PEP 3333 has now been updated with the amendments and clarifications that I announced two weeks ago; see this link for the (nicely formatted) differences between PEP 333 and PEP 3333:
>
> http://svn.python.org/view/peps/trunk/pep-.txt?r1=85014&r2=HEAD

Clarification: the above will only show you the amendments *other* than the Python 3 changes. For the FULL diff between 333 and 3333, see:

http://svn.python.org/view/peps/trunk/pep-.txt?r1=84854&r2=HEAD

Sorry for any confusion.
[Web-SIG] PEP 3333 (WSGI 1.0.1) - Now updated with wsgi.org amendments
PEP 3333 has now been updated with the amendments and clarifications that I announced two weeks ago; see this link for the (nicely formatted) differences between PEP 333 and PEP 3333:

http://svn.python.org/view/peps/trunk/pep-.txt?r1=85014&r2=HEAD

Or this link for the full PEP:

http://www.python.org/dev/peps/pep-/

Now is the time for any error corrections, objections, nitpicking, volunteering to help update wsgiref, etc. ;-)
Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly
At 01:22 PM 9/27/2010 -0400, Terry Reedy wrote:
> On 9/26/2010 9:38 PM, P.J. Eby wrote:
>> At 11:15 AM 9/27/2010 +1000, Ben Finney wrote:
>>
>> You misunderstand me; I wasn't asking how to *add* a link, but how to turn OFF the automatic conversion of the phrase "PEP 333" that happens without any special markup. Currently, the PEP preface is littered with unnecessary links, because the PEP pre-processor turns *every* mere textual mention of a PEP into a link to it.
>
> Ouch. This is about as annoying as Thunderbird's message editor popping up a window asking me what file I want to at.tach every time I write the word "at-tach" or a derivative without the extra punctuation. It would definitely not be the vehicle for writing about at=mentment syndromes.
>
> Suggestion pending something better from rst/PEP experts:
>
> "This PEP extends PEP 333 (abbreviated P333 hereafter)."
>
> perhaps with "to avoid auto-link creation" added before ')' to pre-answer pesky questions and to avoid some editor re-expanding the abbreviations.

It turns out that using a backslash before the number (e.g. PEP \333) turns off the automatic conversion. The PEP still hasn't shown up on Python.org, though, so I'm wondering if maybe I broke something else somewhere.
Re: [Web-SIG] WSGI is now Python 3-friendly
At 11:15 AM 9/27/2010 +1000, Ben Finney wrote:
> "P.J. Eby" <pje at telecommunity.com> writes:
>> (For that matter, if anybody knows how to make it not turn *every* PEP reference into a link, that'd be good too! It doesn't really need to turn 5 or 6 occurrences of "PEP 333" in the same paragraph into separate links. ;-) )
>
> reST, being designed explicitly for Python documentation, has support for PEP references built in:

You misunderstand me; I wasn't asking how to *add* a link, but how to turn OFF the automatic conversion of the phrase "PEP 333" that happens without any special markup. Currently, the PEP preface is littered with unnecessary links, because the PEP pre-processor turns *every* mere textual mention of a PEP into a link to it.
Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly
At 02:59 PM 9/26/2010 -0400, Terry Reedy wrote:
> You could mark added material in a way that does not conflict with rst or html. Or use .rst to make new text stand out in the .html web version (bold, underlined, red, or whatever). People familiar with 333 can focus on the marked sections. New readers can ignore the marking.

If you (or anybody else) have any idea how to do that (highlight stuff in PEP-dialect .rst), let me know.

(For that matter, if anybody knows how to make it not turn *every* PEP reference into a link, that'd be good too! It doesn't really need to turn 5 or 6 occurrences of "PEP 333" in the same paragraph into separate links. ;-) )
Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly
Done. The other amendments were never actually made, so I just reverted the Python 3 bit after moving it to the new PEP. I'll make the changes to PEP 3333 instead as soon as I have another time slot free.

At 01:56 PM 9/26/2010 -0700, Guido van Rossum wrote:
> Since you have commit privileges, just do it. The PEP editor position mostly exists to assure non-committers are not prevented from authoring PEPs.
>
> Please do add a prominent note at the top of PEP 333 pointing to PEP 3333 for further information on Python 3 compliance or some such words. Add a similar note at the top of PEP 3333 -- maybe mark up the differences in PEP 3333 so people can easily tell what was added. And move PEP 333 to Final status.
>
> --Guido
>
> On Sun, Sep 26, 2010 at 1:50 PM, P.J. Eby wrote:
>> At 01:44 PM 9/26/2010 -0700, Guido van Rossum wrote:
>>> On Sun, Sep 26, 2010 at 12:47 PM, Barry Warsaw wrote:
>>>> On Sep 26, 2010, at 1:33 PM, P.J. Eby wrote:
>>>>> At 08:20 AM 9/26/2010 -0700, Guido van Rossum wrote:
>>>>>> I'm happy approving Final status for the *original* PEP 333 and I'm happy to approve a new PEP which includes PJE's corrections.
>>>>>
>>>>> Can we make it PEP 3333, then? ;-)
>>>>
>>>> That works for me.
>>>
>>> Go for it.
>>
>> Shall I just "svn cp" it, then (to preserve edit history), or wait for the PEP editor to do it?
>
> --
> --Guido van Rossum (python.org/~guido)
> ___
> Python-Dev mailing list python-...@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/pje%40telecommunity.com
Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly
At 01:44 PM 9/26/2010 -0700, Guido van Rossum wrote:
> On Sun, Sep 26, 2010 at 12:47 PM, Barry Warsaw wrote:
>> On Sep 26, 2010, at 1:33 PM, P.J. Eby wrote:
>>> At 08:20 AM 9/26/2010 -0700, Guido van Rossum wrote:
>>>> I'm happy approving Final status for the *original* PEP 333 and I'm happy to approve a new PEP which includes PJE's corrections.
>>>
>>> Can we make it PEP 3333, then? ;-)
>>
>> That works for me.
>
> Go for it.

Shall I just "svn cp" it, then (to preserve edit history), or wait for the PEP editor to do it?
Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly
At 08:20 AM 9/26/2010 -0700, Guido van Rossum wrote:
> I'm happy approving Final status for the *original* PEP 333 and I'm happy to approve a new PEP which includes PJE's corrections.

Can we make it PEP 3333, then? ;-)

That number would at least communicate that it's the same thing, but for Python 3.

Really, my reason for trying to do the (non Py3-specific) amendments in a way that didn't require a new PEP number was because of the many ancillary questions that it raises for the community, such as:

* Is this some sort of competition/replacement to PEP 444?
* What happened to the old one, why can't we just use that?
* Why isn't there a different protocol version?
* How is this different from the old one?

To be fair, I *also* wanted to avoid all the work associated with *answering* them. ;-) (Heck, I really wanted to avoid the work of having to even *think* about which questions *might* arise and how they'd need to be addressed.)

OTOH, I can certainly see that my attempt to avoid this has *already* failed: it simply brought up a different set of questions, just on Python-Dev instead of Web-SIG or Python-list.

Oh well. Perhaps making the numbering appear to be a continuation will help a bit.

Another option would be to make a PEP that consists solely of the amendments and errata themselves, as this would answer most of the above questions directly.

Still another would be to abandon the effort to amend the PEP, and simply leave things as they are now: AFAICT, the fact that these amendments aren't in the PEP hasn't stopped anybody from *treating* most of them as if they were. (Because everyone understands that failure to follow them constitutes a bug in your program, even if it technically complies with the spec.)
Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly
At 07:15 PM 9/25/2010 -0700, Guido van Rossum wrote:
> Don't see this as a new spec. See it as a procedural issue.

As a procedural issue, PEP 333 is an Informational PEP, in "Draft" status, which I'd like to make Final after these amendments. See http://www.wsgi.org/wsgi/Amendments_1.0, which Graham created in 2007, stating:

"""This page is intended to collect any ideas related to amendments to the original WSGI 1.0 so that it can be marked as 'Final'."""

IOW, there is no intention to treat the PEP as "mutable" going forward; this is just cleanup so we can mark it Final. After that, it's an ex-parrot.

> Clarifications of ambiguous/unspecified behavior can possibly rule as non-conforming implementations that used to get the benefit of the doubt. Best-practice recommendations also have the effect of changing (perceived) compliance.

I understand the general principle, but with respect to these *specific* changes, any perceived-compliance arguments that were going to happen, already happened years ago. The changes are merely to officially document the way those arguments already turned out, so the PEP can become Final.

Specifically, the changes all fall into one of three categories:

1. Textual clarification (SERVER_PORT is not an int, iteration can stop before all output is consumed)
2. Practical issues with wsgi.input arising from the fact that real-world programs needed its behavior to be more "file-like" than the specification required... and which essentially forced servers that were not using socket.makefile() to make their emulations work like that, anyway (or else be rejected by users).
3. Clarification of behavior that would break HTTP compliance (apps or servers sending more than Content-Length bytes) and is therefore *already a bug* in any implementation that does it.

Since in all three categories any implementation that did not end up following the recommendations on its own is going to have been considered buggy by its users (regardless of its formal "compliance"), and because the changes do not actually declare the buggy behaviors in categories 2 and 3 to be non-compliant, I do not see how any of these changes can produce the type of problems you're worried about here.

Certainly, if I thought such problems were possible, I wouldn't have accepted these amendments. Likewise, if I thought that changes would continue to be made to the PEP past this point, the goal wouldn't be getting it to Final status.
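As an illustration of category 3, here is one way a server might avoid sending more than Content-Length bytes of response body. This is only a sketch: `limit_body` is a hypothetical helper, not something taken from wsgiref or from the amendments themselves.

```python
def limit_body(chunks, content_length):
    """Yield at most `content_length` bytes from an iterable of bytes chunks.

    Hypothetical server-side guard; a real server would also call the
    iterable's close() method, if it has one, once it stops iterating.
    """
    remaining = content_length
    for chunk in chunks:
        if remaining <= 0:
            break  # enough sent already; stop consuming the app's output
        piece = chunk[:remaining]  # trim a chunk that would overshoot
        remaining -= len(piece)
        yield piece


sent = list(limit_body([b"hello ", b"world", b"!!!"], 8))
```

Stopping early like this is also an instance of the category 1 clarification that iteration can stop before all output is consumed.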
Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly
At 02:07 PM 9/25/2010 -0700, Guido van Rossum wrote:
> This is a very laudable initiative and I approve of the changes -- but I really think it ought to be a separate PEP rather than pretending it is just a set of textual corrections on the existing PEP 333.

With the exception of the bytes change, I ruled out accepting any proposed amendments that would actually alter the protocol. The amendments are all either textual clarifications, clarifications of ambiguous/unspecified areas, or best-practice recommendations by implementors (i.e., ones that are generally already implemented in major servers).

The full list of things Graham and others have asked for or recommended would indeed require a 1.1 version at minimum, and thus a new PEP. But I really don't want to start down that road right now, and therefore hope that I can talk Graham or some other poor soul into shepherding a 1.1 PEP instead. ;-)

(Seriously: through an ironic twist of fate, I have done nearly *zero* Python web programming since around the time I drafted the first spec in 2004, so even if it makes sense for me to finish PEP 333, it makes little sense for me to be starting a *new* one on the topic now!)
Re: [Web-SIG] [Python-Dev] WSGI is now Python 3-friendly
At 09:22 PM 9/25/2010 -0400, Jesse Noller wrote:
> It seems like it will end up different enough to be a different specification, closely related to the original, but different enough to trip up all the people maintaining current WSGI servers and apps.

The only actual *change* to the spec is mandating the use of the 'bytes' type or equivalent for HTTP bodies when using Python 3. Seriously, that's *it*. Everything else that's (planned to be) added is either 100% truly just clarifications (e.g. nothing in the spec *ever* said SERVER_PORT could be an int, but apparently some people somehow interpreted it so), or else best-practice recommendations from people who actually implemented WSGI servers.

For example, the readline() size hint is "not supported" in the original spec (meaning clients can't call it and be compliant). The planned modification is "servers should implement it" (best practice), but you can't call an implementation that *doesn't* implement it noncompliant. (This just addresses the fact that most practical implementations *did* in fact support it, and code out there relies on this.)

So, no (previously-)compliant implementations were harmed in the making of the updated spec. If they were compliant before, they're compliant now.

I'm actually a bit surprised people are bringing this up now, since when I announced the plan to make these changes, I said that nothing would be changed that would break anything... even for what I believe are the only Python 3 WSGI implementations right now (by Graham Dumpleton and Robert Brewer).

Indeed, all of the changes (except the bytes thing) are stuff previously discussed endlessly on the Web-SIG (years ago in most cases) and widely agreed on as, "this should have been made clear in the original PEP". And, I also explicitly deferred and/or rejected items that *can't* be done in a 100% backward-compatible way, and would have to be WSGI 1.1 or higher -- indeed, I have a long list of changes from Graham that I've pronounced "can't be done without a 1.1".

Indeed, the entire point of my scope choices was to allow all this to happen *without* a whole new spec. ;-)
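To show what supporting the readline() size hint can look like in practice, here is a small sketch of an input wrapper that a server not built on socket.makefile() might use. The `LimitedInput` class is hypothetical and only illustrates the idea; it also stops at a declared body length so the application cannot read past its own request.

```python
import io


class LimitedInput:
    """Hypothetical wsgi.input wrapper exposing at most `length` bytes."""

    def __init__(self, stream, length):
        self._stream = stream
        self._remaining = length

    def _clamp(self, size):
        # Treat None/negative as "everything left"; never exceed the body.
        if size is None or size < 0 or size > self._remaining:
            return self._remaining
        return size

    def read(self, size=-1):
        data = self._stream.read(self._clamp(size))
        self._remaining -= len(data)
        return data

    def readline(self, size=-1):
        # The size hint caps how much of the line is returned in one call.
        data = self._stream.readline(self._clamp(size))
        self._remaining -= len(data)
        return data


raw = io.BytesIO(b"a=1\nb=2\nNEXT REQUEST")
body = LimitedInput(raw, 8)  # pretend CONTENT_LENGTH was 8
first = body.readline(2)
second = body.readline()
rest = body.read()
after_end = body.read()
```

Once the declared length is exhausted, further reads return empty bytes, which is the "file-like" end-of-input behavior applications tend to rely on.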
[Web-SIG] WSGI is now Python 3-friendly
I have only done the Python 3-specific changes at this point; the diff is here if anybody wants to review, nitpick or otherwise comment:

http://svn.python.org/view/peps/trunk/pep-0333.txt?r1=85014&r2=85013&pathrev=85014

For that matter, if anybody wants to take a crack at updating Python 3's wsgiref based on the above, feel free. ;-) I'll be happy to answer any questions I can that come up in the process.

(Please note: I went with Ian Bicking's "headers are strings, bodies are bytes" proposal, rather than my original "bodies and outputs are bytes" one, not only because there were some good arguments in its favor, but also because it resulted in fewer changes to the PEP, especially in the code samples.)

I will continue to work on adding the other addenda/errata mentioned here:

http://mail.python.org/pipermail/web-sig/2010-September/004655.html

But because these are "shoulds" rather than musts, and apply to both Python 2 and 3, they are not as high priority for immediate implementation in wsgiref and do not necessarily need to hold up the 3.2 release.

(Nonetheless, if anybody is willing to implement them in the Python 3 version, I will happily review the changes for backport into the Python 2 standalone version of wsgiref, and issue an updated release to include them.)

Thanks!
Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications
At 01:22 PM 9/24/2010 +0200, René Dudfield wrote:
> Hi,
>
> Have all the changes been tested with real world implementations?

mod_wsgi under Python 3 is compliant with the changes, and I believe it has all the general addenda/clarifications implemented under Python 2 as well (and for some years now, in fact).
Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications
At 09:52 AM 9/24/2010 -0600, Jeff Hardy wrote:
> On Thu, Sep 23, 2010 at 10:32 AM, P.J. Eby wrote:
>> Just a reminder: I'm planning to actually update PEP 333 over the weekend and start working on wsgiref updates, so if you have any last-minute comments on the proposal, now's the time to post them, however unpolished they may be!
>
> Will you bump the version number to 1.1, or will it stay at 1.0? Does anyone actually check the version number?

Since these are just clarifications to the existing spec, and no previously-compliant implementations are invalidated by the changes, there will be no changes to the version number.

> - Jeff
Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications
At 01:45 PM 9/24/2010 +0200, Manlio Perillo wrote:
> On 23/09/2010 18:32, P.J. Eby wrote:
>> Just a reminder: I'm planning to actually update PEP 333 over the weekend and start working on wsgiref updates, so if you have any last-minute comments on the proposal, now's the time to post them, however unpolished they may be!
>
> Where can I find a draft of the update?

See http://mail.python.org/pipermail/web-sig/2010-September/004655.html for the notes; I have not updated the PEP yet, but am about to.

One change since that post: Ian has convinced me to make headers text and bodies bytes, where before I proposed to only have input headers be text, and output headers be bytes.
Re: [Web-SIG] Output header encodings? (was Re: Backup plan: WSGI 1 Addenda and wsgiref update for Py3)
At 03:48 PM 9/23/2010 -0500, Ian Bicking wrote:
> It only fixes the one case of non-Latin1 characters, there are still many other values you can put into a header (a newline or control character for instance), and innumerable header-specific issues. It seems to be adding complexity for one of the least problematic cases.

Ok, you found one that convinces me. ;-) "Headers are text, bodies are bytes" shall be the rule. I'll rewrite the "note about string types" and change the way I'm updating the spec accordingly.
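For anyone wondering what "headers are text, bodies are bytes" means for application code on Python 3, a minimal sketch follows. The `call_app` harness is a made-up stand-in for a server, used only so the example can run on its own; `setup_testing_defaults` is the real helper from wsgiref.util.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    # The body is encoded to bytes by the application itself.
    body = "caf\u00e9\n".encode("utf-8")
    # Status and headers are native str (and must be latin-1 representable).
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(application):
    """Hypothetical mini-harness standing in for a real WSGI server."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal valid environ
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers
        return lambda data: None  # legacy write() callable, unused here

    chunks = list(application(environ, start_response))
    return captured["status"], captured["headers"], b"".join(chunks)


status, headers, body = call_app(app)
```

Note that Content-Length counts encoded bytes, not characters, which is one practical reason the body has to be bytes before the headers are finalized.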
Re: [Web-SIG] Output header encodings? (was Re: Backup plan: WSGI 1 Addenda and wsgiref update for Py3)
At 11:17 AM 9/23/2010 -0500, Ian Bicking wrote:
> I don't see any reason why Location shouldn't be ASCII. Any header could have any character put in it, of course, there's just no valid case where Location shouldn't be a URL, and URLs are ASCII. Cookie can contain weirdness, yes. I would expect any library that abstracts cookies to handle this (it's certainly doable)... otherwise, this seems like one among many ways a person can do the wrong thing. This can also be detected with the validator, which doesn't avoid runtime errors, but bytes allow runtime errors too -- they will just happen somewhere else (e.g., when a value is converted to bytes in an application or library).

Right: somewhere much closer to the *actual* error, where the developer can know the problem is, "I have garbage data or have not selected an appropriate codec", rather than "this WSGI stuff is giving me errors some place".

> If servers print the invalid value on error (instead of just some generic error) I don't think it would be that hard to track down problems. This requires some explicit effort on the part of the server (most servers handle app_iter==None ungracefully, which is a similar problem).

The difference is that if a server rejects non-bytes, you'll know *right away* that your app isn't compliant, instead of having to wait until some non-latin1 data shows up.

AFAICT, there are only two advantages to using text for output headers:

1. Text is easier to work with, and
2. It's symmetric with using text for input headers.

Both of which can still be had, by using the @encode_headers decorator.

I'm a little bit on the fence on this one, because 1) it does seem a little pointless (if harmless) to shuffle headers around in bytes form, and 2) Location and Set-Cookie are very likely the only headers where any kind of damage could ever happen. But, since it *can* happen, and because it is also really easy to fix the API issue with a decorator, I'm still leaning in favor of "output is bytes" over "headers are text, bodies are bytes", unless somebody can come up with either some actually-bad consequence of using bytes, or some extra-good consequence of using text (that isn't addressed by just using the decorator).

(Note, by the way, that WSGI design has always leaned in the direction of "any convenience that can be handled by a library should be", if it keeps the spec simpler and more verifiable. So, this seems like a good use of that principle.)
Re: [Web-SIG] Last call for WSGI 1.0 errata/clarifications
At 02:51 PM 9/23/2010 -0400, Tres Seaver wrote:
> P.J. Eby wrote:
>> Just a reminder: I'm planning to actually update PEP 333 over the weekend and start working on wsgiref updates, so if you have any last-minute comments on the proposal, now's the time to post them, however unpolished they may be!
>
> I'm fine with the substance of the changes you proposed, but puzzled about the process: in what case does it work to update an already-approved-and-implemented PEP, instead of replacing it with a newer PEP (e.g., PEPs 241 -> 314 -> 345)?

In the case where one is clarifying ambiguities/questions in the original spec. ;-)

(None of the changes invalidate existing implementations, but simply provide additional guidance/best practice suggestions. Even the Python 3 changes won't invalidate at least mod_wsgi's Python 3 implementation.)
Re: [Web-SIG] Output header encodings? (was Re: Backup plan: WSGI 1 Addenda and wsgiref update for Py3)
At 11:11 AM 9/23/2010 -0600, Jeff Hardy wrote:
> On Thu, Sep 23, 2010 at 10:06 AM, P.J. Eby wrote:
>> So, unless somebody has some additional arguments on this one, I think I'm going to stick with bytes output.
>
> I don't have a strong opinion on whether it should be bytes or strings -- I'll leave that discussion for people who know more about the details than I do. I do think input and output should be symmetric, though. If response headers are going to be bytes, then the request headers should be as well, or vice versa. The same arguments apply to both, after all.

Actually, they don't. There are more apps than servers, so more code to get right, by more people. Servers also don't generally *create* any of the bytes or text involved, they're just ferrying it from one place to the next. So the API conditions are not symmetrical.
[Web-SIG] Last call for WSGI 1.0 errata/clarifications
Just a reminder: I'm planning to actually update PEP 333 over the weekend and start working on wsgiref updates, so if you have any last-minute comments on the proposal, now's the time to post them, however unpolished they may be!
[Web-SIG] Output header encodings? (was Re: Backup plan: WSGI 1 Addenda and wsgiref update for Py3)
At 12:57 PM 9/21/2010 -0400, Ian Bicking wrote:
> On Tue, Sep 21, 2010 at 12:09 PM, P.J. Eby <p...@telecommunity.com> wrote:
>> The Python 3 specific changes are to use:
>>
>> * ``bytes`` for I/O streams in both directions
>> * ``str`` for environ keys and values
>> * ``bytes`` for arguments to start_response() and write()
>
> This is the only thing that seems odd to me -- it seems like the response should be symmetric with the request, and the request in this case uses str for headers (status being header-like), and bytes for the body.

So, I've given some thought to your suggestion, and, while it's true that most of the output headers are far less prone to ending up with unintended unicode content, there are at least two output headers that can include some sort of application content (and can therefore have random failures): Location and Set-Cookie.

If these headers accidentally contain non-Latin1 characters, the error isn't detectable until the header reaches the origin server doing the transmission encoding, and it'll likely be a dynamic (and therefore hard-to-debug) error.

However, if the output is always bytes (and this can be relatively-statically verified), then any error can't occur except *inside* the application, where the app's developer can find it more easily.

So I guess the question boils down to: would we rather make sure that coding errors happen *inside* applications, or would we rather make porting WSGI apps trivial (or nearly so)?
But I think that it's possible here to have one's cake and eat it too: if we require bytes for all outputs, but provide a pair of decorators in wsgiref.util like the following:

    def encode_body(codec='utf8'):
        """Allow a WSGI app to output its response body as strings w/specified encoding"""
        def decorate(app):
            def encode(response):
                try:
                    for data in response:
                        yield data.encode(codec)
                finally:
                    if hasattr(response, 'close'):
                        response.close()
            def decorated_app(environ, start_response):
                def start(status, response_headers, exc_info=None):
                    _write = start_response(status, response_headers, exc_info)
                    def write(data):
                        return _write(data.encode(codec))
                    return write
                return encode(app(environ, start))
            return decorated_app
        return decorate

    def encode_headers(codec='latin1'):
        """Allow a WSGI app to output its headers as strings, w/specified encoding"""
        def decorate(app):
            def decorated_app(environ, start_response):
                def start(status, response_headers, exc_info=None):
                    status = status.encode(codec)
                    response_headers = [
                        (k.encode(codec), v.encode(codec)) for k, v in response_headers
                    ]
                    return start_response(status, response_headers, exc_info)
                return app(environ, start)
            return decorated_app
        return decorate

So, this seems like a win-win to me: relatively-static verification, errors stay in the app (or at least in the decorator), and the API is clean-and-easy. Indeed, it seems likely that at least some apps that don't read wsgi.input themselves could be ported *just* by adding the appropriate decorator(s). And, if your app is using unicode on 2.x, you can even use the same decorators there, for the benefit of 2to3. (Assuming I release an updated standalone wsgiref version with the decorators, of course.)

So, unless somebody has some additional arguments on this one, I think I'm going to stick with bytes output.
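As a rough illustration of the "errors stay in the app (or at least in the decorator)" point, the sketch below wraps a str-emitting redirect app with the header decorator and hands it to a stand-in for a bytes-only server. The decorator is repeated in condensed form only so the example runs on its own; `make_redirect_app` and `fake_server_call` are hypothetical helpers invented for this sketch, not part of wsgiref.

```python
def encode_headers(codec='latin1'):
    # Same shape as the decorator in the message above, repeated so the
    # sketch is self-contained.
    def decorate(app):
        def decorated_app(environ, start_response):
            def start(status, response_headers, exc_info=None):
                status = status.encode(codec)
                response_headers = [(k.encode(codec), v.encode(codec))
                                    for k, v in response_headers]
                return start_response(status, response_headers, exc_info)
            return app(environ, start)
        return decorated_app
    return decorate


def make_redirect_app(location):
    """Hypothetical app factory: the app itself only deals in str headers."""
    @encode_headers()
    def app(environ, start_response):
        start_response('302 Found', [('Location', location)])
        return [b'']
    return app


def fake_server_call(app):
    """Hypothetical stand-in for a server that insists on bytes-only output."""
    seen = {}

    def start_response(status, headers, exc_info=None):
        assert isinstance(status, bytes)
        assert all(isinstance(k, bytes) and isinstance(v, bytes)
                   for k, v in headers)
        seen['status'], seen['headers'] = status, headers

    body = b''.join(app({}, start_response))
    return seen, body


# A Latin-1 representable Location passes straight through as bytes.
ok, _ = fake_server_call(make_redirect_app('/caf\xe9'))

# A non-Latin-1 character fails in the decorator, i.e. before the server
# ever sees the header.
try:
    fake_server_call(make_redirect_app('/\u20ac'))
    error_raised = False
except UnicodeEncodeError:
    error_raised = True
```

The failure for the second app surfaces while the app's own call to `start_response` is still on the stack, which is the debugging advantage the message argues for.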
Re: [Web-SIG] Most WSGI servers close connections too early.
At 08:34 AM 9/22/2010 -0700, Robert Brewer wrote:
> Marcel Hellkamp wrote:
> > I would like to add a warning to the WSGI/web3 specification to address
> > this issue:
> >
> > "An application should read all available data from
> > `environ['wsgi.input']` on POST or PUT requests, even if it does not
> > process that data. Otherwise, the client might fail to complete the
> > request and not display the response."
>
> Indeed. CherryPy has protected against this for some time. But it shouldn't be the burden of *applications* to do this; the WSGI "origin" server can do so quite easily.
>
> However, the caveat requires a caveat: servers must still be able to protect themselves from malicious clients. In practice, that means allowing servers to close the connection without reading the entire request body if a certain number of bytes is exceeded.

We can certainly add warnings, although these are both more of a "best practices" advisory rather than a part of the spec per se.
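A minimal sketch of the server-side behaviour described above, assuming the server knows how many request-body bytes the application left unread. The function name, the `max_drain` limit and the chunk size are all made up for illustration; real servers such as CherryPy have their own mechanisms.

```python
import io


def drain_request_body(stream, unread, max_drain=64 * 1024, chunk_size=8192):
    """Discard the unread part of a request body, within a size limit.

    Returns True if the body was fully consumed, so the response can be
    sent on the same connection. Returns False if the server should close
    the connection instead, either because the unread amount exceeds
    max_drain or because the client stopped sending early.
    """
    if unread > max_drain:
        return False  # refuse to read arbitrarily large bodies
    remaining = unread
    while remaining > 0:
        chunk = stream.read(min(chunk_size, remaining))
        if not chunk:
            break  # client closed or stalled before sending everything
        remaining -= len(chunk)
    return remaining == 0


# An app ignored a 100-byte POST body: the server drains it after the app returns.
small = io.BytesIO(b'x' * 100)
drained_small = drain_request_body(small, 100)

# A body above the (deliberately tiny) limit is not read at all.
drained_large = drain_request_body(io.BytesIO(b'x' * 10), 10, max_drain=5)
```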
Re: [Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
[trimming reply headers to just web-sig]

At 12:57 PM 9/21/2010 -0400, Ian Bicking wrote:
> On Tue, Sep 21, 2010 at 12:09 PM, P.J. Eby <p...@telecommunity.com> wrote:
> > The Python 3 specific changes are to use:
> >
> > * ``bytes`` for I/O streams in both directions
> > * ``str`` for environ keys and values
> > * ``bytes`` for arguments to start_response() and write()
>
> This is the only thing that seems odd to me -- it seems like the response should be symmetric with the request, and the request in this case uses str for headers (status being header-like), and bytes for the body.

Are you suggesting a "``str`` for headers, ``bytes`` for bodies" approach instead? I suppose that could work; I was going for "str in, bytes out". My assumption, though, was that headers are relatively easy to address at a choke point from a framework's output. But I guess that iterator output is equally chokable.

I'm open to discussion on this point, so long as every value produced or consumed by a WSGI application is of a specified single type().

> Otherwise this seems good to me, the only other major errata I can think of are all listed in the links you included.

Um, if by "links" you mean, "included textually in the proposal", then sure. If it's not in the proposal, it's not going in the PEP, even if it's on the WSGI Amendments page or Graham's blog.
Re: [Web-SIG] [Python-Dev] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
At 06:52 PM 9/21/2010 +0200, Antoine Pitrou wrote:
> On Tue, 21 Sep 2010 12:09:44 -0400 "P.J. Eby" wrote:
> > While the Web-SIG is trying to hash out PEP 444, I thought it would
> > be a good idea to have a backup plan that would allow the Python 3
> > stdlib to move forward, without needing a major new spec to settle
> > out implementation questions.
>
> If this allows the Web situation in Python 3 to be improved faster and with less hassle then all the better.
>
> There's something strange in your proposal: it mentions WSGI 2 at several places while there's no guarantee about what WSGI 2 will be (is there?).

Sorry - "WSGI 2" should be read as shorthand for, "whatever new spec succeeds PEP 333", whether that's PEP 444 or something else. It just means that any new spec that doesn't have to be backward-compatible can (and should) more thoroughly address the issue in question.
Re: [Web-SIG] [Python-Dev] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
At 12:55 PM 9/21/2010 -0400, Ian Bicking wrote:
> On Tue, Sep 21, 2010 at 12:47 PM, Chris McDonough <chr...@plope.com> wrote:
> > On Tue, 2010-09-21 at 12:09 -0400, P.J. Eby wrote:
> > > While the Web-SIG is trying to hash out PEP 444, I thought it would
> > > be a good idea to have a backup plan that would allow the Python 3
> > > stdlib to move forward, without needing a major new spec to settle
> > > out implementation questions.
> >
> > If a WSGI-1-compatible protocol seems more sensible to folks, I'm personally happy to defer discussion on PEP 444 or any other backwards-incompatible proposal.
>
> I think both make sense, making WSGI 1 sensible for Python 3 (as well as other small errata like the size hint) doesn't detract from PEP 444 at all, IMHO.

Yep. I agree. I do, however, want to get these amendments settled and make sure they get carried over to whatever spec is the successor to PEP 333. I've had a lot of trouble following exactly what was changed in 444, and I'm a tad worried that several new ambiguities may be being introduced. So, solidifying 333 a bit might be helpful if it gives a good baseline against which to diff 444 (or whatever).
[Web-SIG] Backup plan: WSGI 1 Addenda and wsgiref update for Py3
While the Web-SIG is trying to hash out PEP 444, I thought it would be a good idea to have a backup plan that would allow the Python 3 stdlib to move forward, without needing a major new spec to settle out implementation questions.

After all, even if PEP 333 is ultimately replaced by PEP 444, it's probably a good idea to have *some* sort of WSGI 1-ish thing available on Python 3, with bytes/unicode and other matters settled.

In the past, I was waiting for some consensuses (consensi?) on Web-SIG about different approaches to Python 3, looking for some sort of definite, "yes, we all like this" response. However, I can see now that this just means it's my fault we don't have a spec yet. :-(

So, unless any last-minute showstopper rebuttals show up this week, I've decided to go ahead and officially bless nearly all of what Graham Dumpleton (who's not only the mod_wsgi author, but has put huge amounts of work into shepherding WSGI-on-Python3 proposals, WSGI amendments, etc.) has proposed, with a few minor exceptions.

In other words: almost none of the following is my own original work; it's like 90% Graham's. Any praise for this belongs to him; the only thing that belongs to me is the blame for not doing this sooner! (Sorry Graham. You asked me to do this ages ago, and you were right.)

Anyway, I'm posting this for comment to both Python-Dev and the Web-SIG. If you are commenting on the technical details of the amendments, please reply to the Web-SIG only. If you are commenting on the development agenda for wsgiref or other Python 3 library issues, please reply to Python-Dev only. That way, neither list will see off-topic discussions. Thanks!

The Plan

I plan to update the proposal below per comments and feedback during this week, then update PEP 333 itself over the weekend or early next week, followed by a code review of Python 3's wsgiref, and implementation of needed changes (such as recoding os.environ to latin1-captured bytes in the CGI handler).
To complete the changes, it is possible that I may need assistance from one or more developers who have more Python 3 experience. If after reading the proposed changes to the spec, you would like to volunteer to help with updating wsgiref to match, please let me know!

The Proposal

Overview

1. The primary purpose of this update is to provide a uniform porting pattern for moving Python 2 WSGI code to Python 3, meaning a pattern of changes that can be mechanically applied to as little code as practical, while still keeping the WSGI spec easy to programmatically validate (e.g. via ``wsgiref.validate``).

   The Python 3 specific changes are to use:

   * ``bytes`` for I/O streams in both directions
   * ``str`` for environ keys and values
   * ``bytes`` for arguments to start_response() and write()
   * text stream for wsgi.errors

   In other words, "strings in, bytes out" for headers, bytes for bodies.

   In general, only changes that don't break Python 2 WSGI implementations are allowed. The changes should also not break mod_wsgi on Python 3, but may make some Python 3 wsgi applications non-compliant, despite continuing to function on mod_wsgi.

   This is because mod_wsgi allows applications to output string headers and bodies, but I am ruling that option out because it forces every piece of middleware to have to be tested with arbitrary combinations of strings and bytes in order to test compliance. If you want your application to output strings rather than bytes, you can always use a decorator to do that. (And a sample one could be provided in wsgiref.)

2. The secondary purpose of the update is to address some long-standing open issues documented here:

   http://www.wsgi.org/wsgi/Amendments_1.0

   As with the Python 3 changes, only changes that don't retroactively invalidate existing implementations are allowed.

3. There is no tertiary purpose. ;-) (By which I mean, all other kinds of changes are out-of-scope for this update.)

4. The section below labeled "A Note On String Types" is proposed for verbatim addition to the "Specification Overview" section in the PEP; the other sections below describe changes to be made inline at the appropriate part of the spec, and changes that were proposed but are rejected for inclusion in this amendment.

A Note On String Types

In general, HTTP deals with bytes, which means that this specification is mostly about handling bytes.

However, the content of those bytes often has some kind of textual interpretation, and in Python, strings are the most convenient way to handle text.

But in many Python versions and implementations, strings are Unicode, rather than bytes. This requires a careful balance between a usable API and correct translations between bytes and text in the context of HTTP... especially to support porting code between Python implementations with different ``str`` types.

WSGI theref
Re: [Web-SIG] PEP 444 (aka Web3)
At 09:01 AM 9/18/2010 -0700, Robert Brewer wrote:
> Marcel Hellkamp wrote:
> > Removing any support for this type of asynchronism would render web3
> > useless for all but completely synchronous and trivial applications.
> > Even frameworks would have no way to work around this anymore.
>
> I've run a few businesses now on WSGI without doing what you describe, so I don't see why blocking makes an application 'trivial'.

I believe he means: all_but(synchronous_apps + trivial_apps), not all_but(apps(synchronous & trivial)). ;-)

(That being said, for WSGI 2 I still want to get rid of start_response. IMO, async WSGI needs to be a different protocol.)
Re: [Web-SIG] PEP 444 (aka Web3)
At 03:43 PM 9/17/2010 +0200, And Clover wrote:
> On 09/17/2010 02:03 PM, Armin Ronacher wrote:
> > In case we change the spec as Ian mentioned above, I am all for a "wsgi.guessed_encoding" = True flag or something like that.
>
> Yes, I'd like to see that.
>
> I believe going with *only* a raw-or-reconstructed path_info, rather than having both path_info and PATH_INFO, is probably best, for the middleware-duplication reasons PJE mentioned.
>
> A more in-depth possibility might be:
>
> wsgi.path_accuracy =
>
> 0: script_name/path_info have been crudely reconstructed from SCRIPT_NAME/PATH_INFO from an unknown source. Beware! If there is to be backwards compatibility with WSGI1, this would be seen as the 'default value' given a missing path_accuracy.
>
> 1: script_name/path_info have been reconstructed, but it is known that path_info is accurate, other than %2F and non-ASCII issues. That is, it's known that the path doesn't come from IIS's broken PATH_INFO, or the IIS error has been detected and compensated for.
>
> 2: script_name/path_info have been reconstructed using known-good encodings for the env. The only way in which they may differ from the original request path is that a slash might originally have been a %2F. (This is good enough for the vast majority of applications.)
>
> 3: script_name/path_info come directly from the request path without any intervening mangling.

So, do you have an example of what some real-world code is going to *do* with this information? i.e., what's the use case for knowing the precise degree of messed-uppedness of the path? ;-)
Re: [Web-SIG] PEP 444 (aka Web3)
At 02:54 PM 9/16/2010 -0400, Chris McDonough wrote:
> On Thu, 2010-09-16 at 14:04 -0400, P.J. Eby wrote:
> > At 10:35 AM 9/16/2010 -0700, Guido van Rossum wrote:
> > >No comments on the rest except to note that at this point it looks
> > >unlikely that we can make everyone happy (or even get an agreement to
> > >adopt what would be the long-term technically optimal solution --
> > >AFAICT there is no agreement on what that solution would be, if one
> > >weren't to take porting Python 2 code into account). IOW
> > >something/somebody has gotta give.
> >
> > Indeed. This entire discussion has pushed me strongly in favor of
> > doing a super-minimalist update to PEP 333 with the following points:
>
> Right on, write it all down! ;-)

I thought I just did. ;-)

Okay, I will carve out some cycles. (Btw, it appears that somebody has recently hacked on the code in PEP 333 and inadvertently broken the specification, so I'll be fixing that first.)
Re: [Web-SIG] PEP 444 (aka Web3)
At 02:17 PM 9/16/2010 -0500, Ian Bicking wrote:
> On Thu, Sep 16, 2010 at 1:04 PM, P.J. Eby <p...@telecommunity.com> wrote:
> > * Clarifying the encoding of environ values (locale+surrogateescape vs. latin1, TBD)
>
> locale+surrogateescape would be insanity! CGI will just require some configuration with respect to the environment. Anyway, I suspect CGI only really works because: (a) people using CGI are sticking to ASCII, (b) they've fixed stuff up in their apps, (c) they just produce garbage and no one cares.

Ok.

> There are some simple errata, most of which I believe web3 covers (in addition to other things it covers). I think everyone is on board with:
>
>     status, headers, app_iter = app(environ)
>
> Web3 proposed a different order, but it seems clear from the thread that people prefer the more natural order, and web3 authors don't particularly object.

My comments were about releasing a WSGI 1.0 update for Python 3, not making changes to web3. The current free-for-all (and the 3.2 stdlib need) have convinced me to stop arguing for throwing out WSGI 1 on Python 3.

Or, to put it another way: splitting the spec into two 100% incompatible versions is a bad idea for Python 3 adoption. With a WSGI 1 addendum, we should be able to make it possible to put the same apps and middleware on 2 and 3 with just a decorator wrapping them. (i.e., people should be able to write libraries that run on both 2 and 3, which is probably critical to adoption).

I just wish I'd come to these conclusions much sooner... like a year or two ago. :-(
Re: [Web-SIG] PEP 444 (aka Web3)
At 10:35 AM 9/16/2010 -0700, Guido van Rossum wrote:
> No comments on the rest except to note that at this point it looks unlikely that we can make everyone happy (or even get an agreement to adopt what would be the long-term technically optimal solution -- AFAICT there is no agreement on what that solution would be, if one weren't to take porting Python 2 code into account). IOW something/somebody has gotta give.

Indeed. This entire discussion has pushed me strongly in favor of doing a super-minimalist update to PEP 333 with the following points:

* Clarifying the encoding of environ values (locale+surrogateescape vs. latin1, TBD)

* Making the streams and all output values byte strings ('str' on 2.x, 'bytes' on 3.x), leaving everything else "native" strings ('str' on both 2.x and 3.x)

* Any other minor errata/clarifications that the folks with the requisite experience (e.g. Robert, Ian, Graham -- not an exclusive list, but at least they all have both heavy WSGI implementations under their belts and 3.x experience) think are absolutely necessary to resolve open questions for Python 3.2 WSGI implementations.

Something like that has a halfway decent chance of being able to settle and get implemented in the short timeline, and it also doesn't put Graham (mod_wsgi) in the position of coming back from vacation to a huge new spec to unravel. ;-)

(To be clear, what I'm suggesting is almost exactly what mod_wsgi does; it's just stricter on outputs than what mod_wsgi accepts, and there may be some minor issues regarding the environ encoding: mod_wsgi is probably using the latin1 approach rather than locale+surrogateescape, and I think we need to talk that one out a bit.)

Anyway, web3 is nice, but it doesn't look like it'll really fit the bill for porting applications. i.e., it's like a bike shed full of red herrings for what Python-Dev needs right now. ;-)
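To make the two candidate environ encodings concrete, here is a small sketch of how each one carries arbitrary header bytes inside a Python 3 ``str`` and gives them back unchanged. It uses the ``ascii`` codec with ``surrogateescape`` purely so the result does not depend on the machine's locale; the proposal itself talks about the locale encoding, so treat that substitution as an assumption of this example.

```python
# Bytes as a client might send them: UTF-8 for "café".
raw = 'caf\u00e9'.encode('utf-8')            # b'caf\xc3\xa9'

# latin1 approach: each byte becomes the code point with the same number,
# so the str looks like mojibake but loses nothing.
as_latin1 = raw.decode('latin1')
recovered_latin1 = as_latin1.encode('latin1')

# surrogateescape approach: undecodable bytes become lone surrogates
# (U+DC80..U+DCFF) and are turned back into bytes on encoding.
as_escaped = raw.decode('ascii', 'surrogateescape')
recovered_escaped = as_escaped.encode('ascii', 'surrogateescape')

# Either way, the application can re-decode with the encoding it expects.
text = recovered_latin1.decode('utf-8')
```

Both round trips are lossless; the practical difference is what the intermediate ``str`` looks like and which codec the application has to name to get the bytes back.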
Re: [Web-SIG] PEP 444 (aka Web3)
At 07:03 PM 9/15/2010 -0400, Chris McDonough wrote:
> A PEP was submitted and accepted today for a WSGI successor protocol named Web3: http://python.org/dev/peps/pep-0444/
>
> I'd encourage other folks to suggest improvements to that spec or to submit a competing spec, so we can get WSGI-on-Python3 settled soon.

The first thing I notice is that web3.async appears to force all existing middleware to delete it from the environment if it wishes to remain compatible, unless it adapts to support receiving callables itself.

On further reading I see you have something about middleware disabling itself if it doesn't support async execution, but this doesn't make any sense to me: if it can't support async execution, why wouldn't it just delete web3.async from the environ, forcing its wrapped app to be synchronous instead?

I'm also not a fan of the bytes environ, or the new path_info/script_name variables; note that the spec's sample CGI implementation does not itself provide the new variables, and that middleware must be explicitly written to handle the case where there is duplication.

My main fear with this spec is that people will assume they can just make a few superficial changes to run WSGI code on it, when in fact it is deeply incompatible where middleware is concerned. In fact, AFAICT, it seems like it will be *harder* to write correct web3 middleware than it is to write correct WSGI middleware now.

This seems like a step backward, since the whole idea behind dropping start_response() was to make correct middleware *easier* to write.

Any time a spec makes something optional or allows More Than One Way To Do It, it immediately doubles the minimum code required to implement that portion of the spec in compliant middleware.

This spec has two optionalities: web3.async, and the optional path_info/script_name, so the return handling of every piece of middleware is doubled (or else "environ['web3.async'] = False" must be added at the top), and any code that modifies paths must similarly ditch the special variables or do double work to update them.
Re: [Web-SIG] WSGI for Python 3
At 02:37 PM 8/30/2010 +1000, Graham Dumpleton wrote:
> Anyway, rather than keep arguing the point and move forward, let us perhaps start now with the following definitions and new names to identify them. We can even go a bit stupid and give each its own code name so they are in part more memorable.
>
> Any next option based on your suggestions about changing the WHEAT option can be called MAIZE.
>
> And if you think I am going stark raving mad and should be put in a white jacket and locked up, you could well be right. I am not a happy camper right now, but that is because of many things besides this WSGI stuff. :-)
>
> And yes I know about the page that has been just recently put up at: http://www.wsgi.org/wsgi/Python_3
>
> From memory when I first read it I wasn't sure that it was completely accurate, but at least it doesn't now mention mod_python instead of mod_wsgi which was mighty confusing.
>
> We can perhaps merge the following into that page, i.e., expand the table, and talk more about the abstract definitions rather than linking it to specific implementations at this point. We can perhaps then start capturing the pros and cons against each option in the page rather than losing them in the email chain.

I've added a column to the page called "flat" that captures my current proposal (native keys, surrogateescape values, byte stream in, strict bytes-only for all outputs). This seems to me an optimum balance between:

* Verifiability (especially *composable* verifiability)
* Low cognitive overhead (i.e., fewest things to remember)
* Low amount of finger-typing and fewer conversions

But I certainly could be convinced otherwise by example or argument.

(One other thing I consider a plus for this approach, btw: os.environ is still largely usable as a WSGI environ in the CGI case. This isn't so much a valuable thing in itself, as that it's an indicator of low complexity and cognitive overhead.)
Re: [Web-SIG] WSGI for Python 3
At 11:16 AM 8/30/2010 +1000, Graham Dumpleton wrote:
> Although I almost begged that if we are going to discuss bytes, compared to text/unicode, that agreement at least first be made about the definition of the bytes leaning option, that request has pretty well fallen on deaf ears.

Did you not see my reply? I (thought I) answered your question, and I actually also suggested that a variation of your unicode proposal might work, too. See:

http://mail.python.org/pipermail/web-sig/2010-August/004545.html
Re: [Web-SIG] WSGI for Python 3
At 02:17 PM 8/27/2010 +1000, Graham Dumpleton wrote:
> Since the major stumbling block, irrespective of other changes, to any sort of agreement is still bytes vs unicode, and where we have a reasonably clear definition of what the unicode suggestion is, can we please as a first step get a definition of what bytes actually implies so everyone knows what we are talking about. I specifically ask this, as it isn't clear because people don't explain in detail what they mean when they are saying 'bytes'.
>
> Going back to my definition #2 in my blog post from a year ago, I had:
>
> 1. The application is passed an instance of a Python dictionary containing what is referred to as the WSGI environment. All keys in this dictionary are native strings. For CGI variables, all names are going to be ISO-8859-1 and so where native strings are unicode strings, that encoding is used for the names of CGI variables

FYI, one thing that's changed here is the existence of os.environb in Python 3.2, at least on non-Windows OSes.

> 2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI environment, the value of the variable should be a native string.

Since any meaningful use of this value is going to end up needing to be bytes again (e.g. Location headers), and for consistency's sake, I lean towards saying this is bytes too.

> 3. For the CGI variables contained in the WSGI environment, the values of the variables are byte strings.
>
> 4. The WSGI input stream 'wsgi.input' contained in the WSGI environment and from which request content is read, should yield byte strings.
>
> 5. The status line specified by the WSGI application must be a byte string.
>
> 6. The list of response headers specified by the WSGI application must contain tuples consisting of two values, where each value is a byte string.
>
> 7. The iterable returned by the application and from which response content is derived, must yield byte strings.
>
> The points of disagreement I have seen about this are as follows.
>
> For (1), the keys should also be bytes, including names of 'wsgi.' special keys.
>
> For (2), the value of 'wsgi.url_scheme' should be bytes.
>
> So, do you really want bytes absolutely everywhere, or are keys still going to be unicode taken as ISO-8859-1.

If we follow the example of os.environb, then the keys have to be bytes also. However, I can already see that the big problem with all of this is that WSGI code is going to be littered with a plague of "b"s hanging off the front of every string literal, and that 2to3 is probably not going to handle it correctly. Making the keys bytes as well just multiplies the problem.

> Note that we are not agreeing to the final solution here, just what bytes means in contrast to the unicode option, so we know that we are comparing only two options and not many options because people have different interpretations of what bytes means.
>
> As contrast, what we generally mean by the unicode option is definition #3 from my blog post. That being:
>
> 1. The application is passed an instance of a Python dictionary containing what is referred to as the WSGI environment. All keys in this dictionary are native strings. For CGI variables, all names are going to be ISO-8859-1 and so where native strings are unicode strings, that encoding is used for the names of CGI variables
>
> 2. For the WSGI variable 'wsgi.url_scheme' contained in the WSGI environment, the value of the variable should be a native string.
>
> 3. For the CGI variables contained in the WSGI environment, the values of the variables are native strings. Where native strings are unicode strings, ISO-8859-1 encoding would be used such that the original character data is preserved and as necessary the unicode string can be converted back to bytes and thence decoded to unicode again using a different encoding.
>
> 4. The WSGI input stream 'wsgi.input' contained in the WSGI environment and from which request content is read, should yield byte strings.
>
> 5. The status line specified by the WSGI application should be a byte string. Where native strings are unicode strings, the native string type can also be returned in which case it would be encoded as ISO-8859-1.
>
> 6. The list of response headers specified by the WSGI application should contain tuples consisting of two values, where each value is a byte string. Where native strings are unicode strings, the native string type can also be returned in which case it would be encoded as ISO-8859-1.
>
> 7. The iterable returned by the application and from which response content is derived, should yield byte strings. Where native strings are unicode strings, the native string type can also be returned in which case it would be encoded as ISO-8859-1.
>
> Even though we call it unicode, it actually has bytes in places as well. The key issues over bytes vs unicode have been in values in the dictionary, but as pointed out above, it is not clear whether for the bytes option, we are talking about bytes for keys as well and for the value of 'wsgi.url_scheme'. The
Re: [Web-SIG] WSGI for Python 3
At 06:05 PM 8/27/2010 +0200, Christoph Zwerschke wrote:
> For instance,
>
>     user = 'özkan'.encode('latin1')
>     if user in request.META.get('REMOTE_USER', b'').lower():
>
> will not work if the user has logged in as 'Özkan'.

Isn't that a problem with code that does this now?
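The behaviour under discussion can be reproduced in Python 3 without any framework. The sketch below drops the `request.META` lookup from the quoted snippet and uses plain variables instead; it relies only on the fact that `bytes.lower()` changes ASCII letters and leaves every other byte alone.

```python
# Escapes are used so the example does not depend on source-file encoding.
user = '\u00f6zkan'.encode('latin1')          # 'özkan' -> b'\xf6zkan'
remote_user = '\u00d6zkan'.encode('latin1')   # 'Özkan' -> b'\xd6zkan'

# bytes.lower() leaves the non-ASCII byte 0xD6 untouched, so no match.
bytes_match = user in remote_user.lower()

# Decoding first and lowercasing as text maps 'Ö' to 'ö', so this matches.
text_match = user.decode('latin1') in remote_user.decode('latin1').lower()
```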
Re: [Web-SIG] WSGI for Python 3
At 01:37 AM 8/27/2010 +0200, Armin Ronacher wrote:
> Hi,
>
> Is there a status update on that which I missed? Did someone decide on bytes for the environment values or are we still unsure about that?

To the extent we're "unsure", I think the holdup is simply that nobody has tried doing an all-bytes WSGI implementation -- unless of course you count all our Python 2.x experience as experience with an all-bytes implementation. ;-)

(Of course, that experience won't help us with Python 3 stdlib issues.)

> At that point I don't care at all about what is decided on as long as something is decided. Can someone please stand up and just do that? :)

Essentially the problem right now is that unless such a choice is made, there's little hope of getting the stdlib issues to be resolved, because we can't exactly file bug reports against the stdlib if we don't know what we want it to do. ;-)

My personal inclination is to define WSGI 2 as a bytes-oriented protocol, and then encourage people to port to WSGI 2 before moving to Python 3. In theory, if we did it correctly it could actually minimize the porting pain for Python 3.

In practice, I'm not sure how to do this, as I lack experience with 2to3 at the moment, or any production experience with Python 3 whatsoever.
Re: [Web-SIG] WSGI for Python 3
At 01:01 PM 7/18/2010 +1000, Graham Dumpleton wrote:
> This is on the basis that if people are going to have to rewrite their code a fair bit to handle bytes everywhere,

What do you mean by "rewrite their code a fair bit", and who is it that you think will have to do this? Or, more precisely, how is that any different from the text or text-and-bytes proposals?

AFAICT, the main difference is that under a bytes-only regime, the changes should be more consistent/mechanical, i.e., able to be performed by relatively superficial code inspection.

> My personal opinion is that if you are going to go bytes everywhere, then you may as well throw out the complete WSGI specification as it stands now and fix all the other problems with the specification.

That may not be a bad idea; I'm certainly in favor of going ahead and ditching start_response/write while we're at it. The requirement to change both the entry and exit points to match the calling convention also seems to provide an ideal opportunity to insert any necessary encoding or decoding operations.
Re: [Web-SIG] WSGI for Python 3
At 07:20 PM 7/16/2010 -0400, Chris McDonough wrote:
> I'd much rather be able to say:
>
> """
> The PATH_INFO environment variable is a ``bytes-with-benefits`` type.
> To decode it:
>
> - First, split it on slashes::
>
>     segments = PATH_INFO.split('/')
>
> - Then, de-encode each segment's urlencoded portions::
>
>     urldecoded_segments = [ urllib.unquote(x) for x in segments ]
>
> - Then re-encode each urldecoded segment into the encoding expected
>   by your application::
>
>     app_segments = [ str(x, encoding='utf-8') for x in urldecoded_segments ]
> """

+1. I do wish we actually *had* a bytes-with-benefits type (as I proposed on Python-Dev), but I don't think we can really get one until the language moratorium is over. Plain old bytes are the next best thing.
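For readers following along on Python 3, the three quoted steps can be approximated with plain bytes. This is only a sketch: it assumes PATH_INFO arrives as a bytes object that is still percent-encoded, and `decode_path_info` is a hypothetical helper name, not part of WSGI or the standard library. `urllib.parse.unquote_to_bytes` is the Python 3 standard-library call that stands in for the `urllib.unquote` of the quoted text.

```python
from urllib.parse import unquote_to_bytes


def decode_path_info(path_info, encoding="utf-8"):
    """Split a bytes PATH_INFO on slashes, then unquote and decode each segment.

    Hypothetical helper for illustration; the name and signature are assumptions.
    """
    # Step 1: split on slashes while the value is still bytes.
    segments = path_info.split(b"/")
    # Step 2: undo percent-encoding per segment, so an encoded slash (%2F)
    # stays inside its segment instead of creating a new one.
    unquoted = [unquote_to_bytes(segment) for segment in segments]
    # Step 3: decode to text using the encoding the application expects.
    return [segment.decode(encoding) for segment in unquoted]


print(decode_path_info(b"/caf%C3%A9/a%2Fb"))
```

Splitting before unquoting is the point of the ordering in the quoted text: the application, not the server, decides both where segment boundaries are and which encoding applies.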
Re: [Web-SIG] WSGI for Python 3
At 05:42 PM 7/16/2010 -0400, Tres Seaver wrote: P.J. Eby wrote: > (Hm. Although actually, I suppose we *could* just borrow the time > machine and pretend that WSGI called for "byte-strings everywhere" > all along...) I like the idea of pushing responsibility for decoding stuff into the framework / app writer's hands. OTOH, doesn't that hose authors of existing middleware, due to the borkedness of working with bytes in Python3? It only creates a "new" problem if they are currently not using *any* unicode in 2.x, and are passing through bytes from the input to the output without any encoding or decoding. AFAICT, if any part of their app is currently unicode, they would have the same problems in 2.x. (Minus, of course, any problems introduced by missing bytes methods in 3.x, or the fact that single-subscripted bytes are ints rather than bytestrings.) Anyway, the problems introduced will be problems that can be solved by waving a fairly standard set of dead chickens at the problem, i.e. picking where you're going to encode/decode, and deciding what encoding(s) are meaningful to your app. And frameworks that already have a unicode API are ahead of the game here. So, AFAICT, the only people who'd be punished by a change to bytes are the people who have non-ASCII inputs or outputs, but haven't been using unicode (because 2to3 will convert them to using strings instead of bytes). From what I can tell, though, this is also the group it's most politically correct to hate on in Python-Dev, so we should be relatively safe in shifting the burden to them. ;-) ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
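The parenthetical about single-subscripted bytes is easy to check in Python 3. The snippet below is a small illustration, not code from the thread; `starts_with_slash` is a hypothetical helper showing the kind of mechanical change a port to bytes would need.

```python
data = b"GET /"

first = data[0]          # indexing yields an int in Python 3 (71 is ord("G"))
first_slice = data[0:1]  # slicing yields a length-1 bytes object


def starts_with_slash(path):
    # Compare a slice rather than path[0]: path[0] would be the int 47,
    # which never equals b"/", and would raise IndexError on empty input.
    return path[:1] == b"/"


print(first, first_slice, starts_with_slash(b"/index"))
```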
Re: [Web-SIG] WSGI for Python 3
At 02:28 PM 7/16/2010 -0500, Ian Bicking wrote:
> On Fri, Jul 16, 2010 at 1:40 PM, P.J. Eby <p...@telecommunity.com> wrote:
>> At 11:07 AM 7/16/2010 -0500, Ian Bicking wrote:
>>> And this doesn't help with Python 3: either we have byte values of SCRIPT_NAME and PATH_INFO in Python 3, or we have text values. I think bytes will be more awkward to port to than text, and inconsistent with other WSGI values.
>>
>> OTOH, it has the tremendous advantage of pushing the encoding question onto the app (or framework) developer... who's really the only one who can make the right decision for their particular application. And personally, I'd rather have clear boundaries between text and bytes, such that porting (even if tedious or awkward) is *consistent*, and clear as to when you're finished, not, "oh, did I check to make sure I converted SCRIPT_NAME and PATH_INFO... not just in my app code, but in all the library code I call *from* my app?"
>>
>> IOW, the bytes/string discussion on Python-dev has kind of led me to realize that we might just as well make the *entire* stack bytes (incoming and outgoing headers *and* streams), and rewrite that bit in PEP 333 about using str on "Python 3000" to say we go with bytes on Python 3+ for everything that's a str in today's WSGI.
>
> This was my first intuition too, until I started thinking in more detail about the particular values involved. Some obviously are textish, like environ['SERVER_NAME']. Not a very useful value, but definitely text. Basically all the internal strings are textish, so we're left with:
>
>     wsgi.url_scheme
>     SCRIPT_NAME/PATH_INFO
>     QUERY_STRING
>     HTTP_*, CONTENT_TYPE, CONTENT_LENGTH (headers)
>     response status
>     response headers (name and value)

What I'm getting at, though, is it's precisely this sort of "hm, which ones are bytes again?" stuff that makes you have to stop and *think*, i.e., it doesn't Fit My Brain any more. ;-) There should be one, and preferably *only* one, obvious way to do it.
And given that HTTP is inherently a bunch of bytes, bytes is the one obvious way. I previously was under the impression that bytes wouldn't interoperate with strings in 3.x, but they *do*, in much the same way as they did in 2.x. That means you'll be (mostly) bug-compatible in 3.x, only you'll likely encounter encoding issues *sooner*, rather than later. (i.e., the minute you combine non-ASCII inputs with your regular string constants). Yes, you will also be forced to convert your return values to bytes, but if you've used string constants *anywhere*, then you know you'll be outputting text, which you should already have been encoding for output. (So you'll just be forced to deal with errors on that side sooner as well.) All in all, I'd say this also fits with what people on Python-Dev keep hammering on as the One Obvious Way to deal with bytes and strings in a program: i.e., bytes for I/O, text for text processing. WSGI is HTTP, and HTTP is I/O, ergo, WSGI is I/O, and we should therefore "byte" the bullet here. ;-) ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
Re: [Web-SIG] WSGI for Python 3
At 11:07 AM 7/16/2010 -0500, Ian Bicking wrote: And this doesn't help with Python 3: either we have byte values of SCRIPT_NAME and PATH_INFO in Python 3, or we have text values. I think bytes will be more awkward to port to than text, and inconsistent with other WSGI values. OTOH, it has the tremendous advantage of pushing the encoding question onto the app (or framework) developer... who's really the only one who can make the right decision for their particular application. And personally, I'd rather have clear boundaries between text and bytes, such that porting (even if tedious or awkward) is *consistent*, and clear as to when you're finished, not, "oh, did I check to make sure I converted SCRIPT_NAME and PATH_INFO... not just in my app code, but in all the library code I call *from* my app?" IOW, the bytes/string discussion on Python-dev has kind of led me to realize that we might just as well make the *entire* stack bytes (incoming and outgoing headers *and* streams), and rewrite that bit in PEP 333 about using str on "Python 3000" to say we go with bytes on Python 3+ for everything that's a str in today's WSGI. Or, to put it another way, if I knew then what I know *now*, I think I'd have written the PEP the other way around, such that the use of 'str' in WSGI would be a substitute for the future 'bytes' type, rather than viewing some byte strings as a forward-compatible substitute for Py3K unicode strings. Of course, this would be a WSGI 2 change, but IMO we're better off making a clean break with backward compatibility here anyway, rather than having conditionals. Also, going with bytes everywhere means we don't have to rename SCRIPT_NAME and PATH_INFO, which in turn avoids deeper rewrites being required in today's apps. (Hm. Although actually, I suppose we *could* just borrow the time machine and pretend that WSGI called for "byte-strings everywhere" all along...) 
Re: [Web-SIG] Emulating req.write() in WSGI
At 12:33 PM 6/29/2010 -0600, Aaron Fransen wrote: I was sending text/html (I probably should have used multipart before) ... should I try multipart now, even with having everything in a single stream? Heck if I know. I just assumed that what you're doing would be unlikely to work, whereas multipart has at least been previously documented as working with Apache (at least for nph scripts). Dunno if mod_wsgi'll do that or not. Actually, what I'd do in your place is try a "nph-" CGI in Python (using a wsgiref CGIHandler with its 'origin_server' attribute set to True), have it send multipart, and see if that works. If it doesn't work, then it's probably a problem with your app. If it *does* work, but the same app doesn't work under mod_wsgi, then it's a mod_wsgi issue; possibly related to configuration. From what Graham's said, mod_wsgi shouldn't be buffering anything, which means it has to either be Apache or your app that's buffering. If it's Apache, doing a proper nph+multipart ought to fix it, unless there's something else going on in the Apache configuration. ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
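A minimal sketch of the suggested experiment, written for Python 3. It assumes the script would be installed under an "nph-" name so that Apache passes the output through unparsed; the boundary string, the application `push_app`, and the helper names are made up for illustration. `CGIHandler`, `BaseCGIHandler`, and the `origin_server` attribute are real parts of `wsgiref.handlers`; `render_for_inspection` exists only so the output can be examined without a web server.

```python
import io
from wsgiref.handlers import BaseCGIHandler, CGIHandler
from wsgiref.util import setup_testing_defaults

BOUNDARY = b"example-boundary"  # arbitrary value chosen for this sketch


def push_app(environ, start_response):
    """Hypothetical WSGI app that sends two parts as a server-push stream."""
    content_type = "multipart/x-mixed-replace; boundary=" + BOUNDARY.decode("ascii")
    start_response("200 OK", [("Content-Type", content_type)])
    for message in (b"<p>working...</p>", b"<p>done</p>"):
        # Each yielded chunk is one complete part, preceded by the boundary.
        yield b"--" + BOUNDARY + b"\r\nContent-Type: text/html\r\n\r\n" + message + b"\r\n"
    yield b"--" + BOUNDARY + b"--\r\n"


def run_as_nph_cgi():
    """What the nph- CGI script itself would call."""
    handler = CGIHandler()
    handler.origin_server = True  # emit the HTTP status line ourselves
    handler.run(push_app)


def render_for_inspection():
    """Run the same app against in-memory streams and return the raw bytes."""
    environ = {}
    setup_testing_defaults(environ)
    out = io.BytesIO()
    handler = BaseCGIHandler(io.BytesIO(), out, io.StringIO(), environ)
    handler.origin_server = True
    handler.run(push_app)
    return out.getvalue()
```

If the parts arrive one at a time under this script but all at once under mod_wsgi, that points at configuration rather than at the application, which is the distinction the experiment is meant to draw.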
Re: [Web-SIG] Emulating req.write() in WSGI
At 10:14 AM 6/29/2010 -0600, Aaron Fransen wrote:
> Couple more things I've been able to discern. The first happened after I "fixed" the html code. Originally under mod_python, I guess I was cheating more than a little bit by sending code blocks twice, once for the incremental notices, once for the final content. Once I changed the code to send a single properly parsed block, the entire document showed up as expected, however it still did not send any part of the html incrementally. Watching the line with Wireshark, all of the data was transmitted at the same time, so nothing was sent to the browser incrementally.

So, you're not sending a multipart/x-mixed-replace ("server push") transmission?
Re: [Web-SIG] Emulating req.write() in WSGI
At 03:43 PM 6/28/2010 -0600, Aaron Fransen wrote:
> Using mod_wsgi on Apache doesn't seem to exhibit that behavior.

You may need "WSGIOutputBuffering Off" in your config; see: http://code.google.com/p/modwsgi/wiki/ConfigurationDirectives#WSGIOutputBuffering

Another possibility is that you've got some middleware or something else buffering between your app and mod_wsgi, I suppose.
Re: [Web-SIG] Emulating req.write() in WSGI
At 01:01 PM 6/28/2010 -0600, Aaron Fransen wrote:
> One of the nice things about mod_python is the req.write() function. Although I realize it's somewhat of an abuse of the HTTP protocol, it's handy being able to periodically update the client browser with a status message for a long-running job. So handy in fact that I have a number of applications that rely fairly heavily on it as a means of keeping the client (person) happy instead of just showing them the default "browser busy" notification.
>
> There are a couple of workarounds, neither of which is ideal:
>
> 1. Take them immediately to a secondary page, then submit the actual job automatically on that second page.
> 2. Instead of using HTTP POST, use an HTTP Request Object (i.e. Ajax).
>
> Both of them involve significantly more development effort than an equivalent req.write(). Is there a way to emulate the periodic-write functionality in WSGI?

Each string yielded (or passed to the write() callable returned by start_response) is supposed to be sent straight through to the client. As long as your WSGI stack is actually conformant to the protocol, that's all you need to do.
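As a rough illustration of that answer, a WSGI 1 application can emulate periodic req.write() calls by yielding one status chunk per unit of work. This is a sketch for Python 3 (so chunks are bytes); `long_job_app` and the sleep that stands in for real work are assumptions, not code from the thread.

```python
import time


def long_job_app(environ, start_response):
    """Hypothetical app: yields a status line after each slice of a long job."""
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    for step in range(1, 4):
        time.sleep(0.01)  # stand-in for a slice of the real long-running work
        # A conformant server is supposed to pass each chunk straight through.
        yield ("step %d of 3 done\n" % step).encode("utf-8")
    yield b"finished\n"
```

Whether the browser actually renders the chunks as they arrive still depends on everything between the app and the client not buffering, which is what the rest of this thread goes on to investigate.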
Re: [Web-SIG] [RFC] x-wsgiorg.suspend extension
At 01:25 PM 4/12/2010 +0200, Manlio Perillo wrote:
> The purpose of the extension is just to have a standard interface that WSGI applications can use to take advantage of the possibility, offered by asynchronous servers, to suspend execution and resume it later.

WSGI has this ability now - it's yielding an empty string. Yielding an empty string is a hint to the server that the application is not ready to send any output, and the server is free to schedule other applications next. And WSGI does not require the application to be rescheduled any time soon.

In other words, if saying "don't call me for a while" is the purpose of the extension, it is not needed. As Graham says, the thing that would actually be needed is a way to tell the server when to poll the app again.
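The empty-string convention can be sketched as follows. This is an illustration for Python 3 only: `waiting_app` and the `example.poll` environ key are hypothetical, standing in for whatever mechanism a real application would use to check whether its result is ready.

```python
def waiting_app(environ, start_response):
    """Hypothetical app that yields b"" until its result becomes available."""
    poll = environ["example.poll"]  # assumed callable: returns bytes, or None if not ready
    start_response("200 OK", [("Content-Type", "text/plain")])
    result = poll()
    while result is None:
        # Nothing to send yet: an empty chunk hands control back to the server,
        # which may service other requests before iterating again.
        yield b""
        result = poll()
    yield result
```

Note what the sketch cannot express: the application has no way to say when it should next be polled, so a server that resumes it immediately simply busy-waits. That missing piece is the one identified in the reply above.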
Re: [Web-SIG] wsgi and generators (was Re: WSGI and start_response)
At 02:04 PM 4/10/2010 +0100, Chris Dent wrote:
> I realize I'm able to build up a complete string or yield via a generator, or a whole bunch of various ways to accomplish things (which is part of why I like WSGI: that content is just an iterator, that's a good thing) so I'm not looking for a statement of what is or isn't possible, but rather opinions. Why is yielding lots of moderately sized strings *very bad*? Why is it _not_ very bad (as presumably others think)?

How bad it is depends a lot on the specific middleware, server architecture, OS, and what else is running on the machine. The more layers of architecture you have, the worse the overhead is going to be.

The main reason, though, is that alternating control between your app and the server means increased request lifetime and worsened average request completion latency.

Imagine that I have five tasks to work on right now. Let us say each takes five units of time to complete. If I have five units of time right now, I can either finish one task now, or partially finish five. If I work on them in an interleaved way, *none* of the tasks will be done until nearly all twenty-five units have elapsed, and so every task will have a completion latency of roughly 25 units. If I work on them one at a time, however, then one task will be done in 5 units, the next in 10, and so on -- for an average latency of only 15 units. And that is *not* counting any task switching overhead.

But it's *worse* than that, because by multitasking, my task queue has five things in it the whole time... so I am using more memory and have more management overhead, as well as task switching overhead.

If you translate this to the architecture of a web application, where the "work" is the server serving up bytes produced by the application, then you will see that if the application serves up small chunks, the web server is effectively forced to multitask, and keep more application instances simultaneously running, with higher latency, increased memory usage, etc.
However, if the application hands its entire output to the server, then the "task" is already *done* -- the server doesn't need the thread or child process for that app anymore, and can have it do something else while the I/O is happening. The OS is in a better position to interleave its own I/O with the app's computation, and the overall request latency is reduced.

Is this a big emergency if your server's mostly idle? Nope. Is it a problem if you're writing a CGI program or some other direct API that doesn't automatically flush I/O? Not at all. I/O buffering works just fine for making sure that the tasks are handed off in bigger chunks.

But if you're coding up a WSGI framework, you don't really want to have it sending tiny chunks of data up a stack of middleware, because WSGI doesn't *have* any buffering, and each chunk is supposed to be sent *immediately*.

Well-written web frameworks usually do some degree of buffering already, for API and performance reasons, so for simplicity's sake, WSGI was spec'd assuming that applications would send data in already-buffered chunks. (Specifically, the simplicity of not needing to have an explicit flushing API, which would otherwise have been necessary if middleware and servers were allowed to buffer the data, too.)
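The five-task arithmetic can be checked with a few lines of Python. This is a toy model, not a measurement: it assumes strict round-robin scheduling in one-unit slices and zero switching cost, and the function names are made up for the illustration. Under those assumptions the interleaved tasks finish between 21 and 25 units, which is what the round figure of 25 in the post approximates.

```python
def sequential_finish_times(n_tasks, units_each):
    """Finish time of each task when tasks run to completion one after another."""
    return [units_each * (i + 1) for i in range(n_tasks)]


def round_robin_finish_times(n_tasks, units_each):
    """Finish time of each task when one unit of each task is done in turn."""
    remaining = [units_each] * n_tasks
    finish = [0] * n_tasks
    clock = 0
    while any(remaining):
        for i in range(n_tasks):
            if remaining[i]:
                clock += 1            # spend one unit of time on task i
                remaining[i] -= 1
                if remaining[i] == 0:
                    finish[i] = clock
    return finish


def mean(values):
    return sum(values) / len(values)


seq = sequential_finish_times(5, 5)
rr = round_robin_finish_times(5, 5)
print(seq, mean(seq))   # [5, 10, 15, 20, 25] 15.0
print(rr, mean(rr))     # [21, 22, 23, 24, 25] 23.0
```

The total work is 25 units either way; only the order changes, and that alone moves the average completion latency from 15 to 23 units before any switching or memory overhead is counted.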
Re: [Web-SIG] WSGI and start_response
At 10:18 PM 4/8/2010 +0200, Manlio Perillo wrote:
> Suppose I have an HTML template file, and I want to use a sub request.
>
>     ... ${subrequest('/header/')} ...
>
> The problem with this code is that, since Mako will buffer all generated content, the result response body will contain incorrect data. It will first contain the response body generated by the sub request, then the content generated from the Mako template (XXX I have not checked this, but I think it is how it works).

Okay, I'm confused even more now. It seems to me like what you've just described is something that's fundamentally broken, even if you're not using WSGI at all.

> So, when executing a sub request, it is necessary to flush (that is, send to Nginx, in my case) the content generated from the template before the sub request is done.

This seems to only make sense if you're saying that the subrequest *has to* send its output directly to the client, rather than to the parent request. If the subrequest sends its output to the parent request (as a sane implementation would), then there is no problem. Likewise, if the subrequest is sent to a buffer that's then inserted into the parent invocation. Anything else seems utterly insane to me, unless you're basically taking a bunch of legacy CGI code using 'print' statements and hacking it into something else. (Which is still insane, just differently. ;-) )

> Ah, you are right sorry. But this is not required for the Mako example (I was focusing on that example).

As far as I can tell, that example is horribly wrong. ;-)

> But when using the greenlet middleware, and when using the function for flushing Mako buffer, some data will be yielded *before* the application returns and status and headers are passed to Nginx.

And that's probably because sharing a single output channel between the parent and child requests is a bad idea. ;-) (Specifically, it's an increase in "temporal coupling", I believe. I know it's some kind of coupling between functions that's considered bad, I just don't remember if that's the correct name for it.)

>> This is also a good time for people to learn that generators are usually a *very bad* way to write WSGI apps
>
> It's the only way to be able to suspend execution, when the WSGI implementation is embedded in an async web server not written in Python.

It's true that dropping start_response() means you can't yield empty strings prior to determining your headers, yes.

>> - yielding is for server push or sending blocks of large files, not tiny strings.
>
> Again, consider the use of sub requests. Yielding a "not large" block is the only choice you have.

No, it isn't. You can buffer your output and yield empty strings until you're ready to flush.

> Unless, of course, you implement sub request support in pure Python (or using SSI - Server Side Include).

I don't see why it has to be "pure", actually. It's just that the subrequest needs to send data to the invoker rather than sending it straight to the client. That's the bit that's crazy in your example -- it's not a scenario that WSGI 2 should support, and I'd consider the fact that WSGI 1 lets you do it to be a bug, not a feature. ;-)

That being said, I can see that removing start_response() closes a loophole that allows async apps to *potentially* exist under WSGI 1 (as long as you were able to tolerate the resulting crappy API). However, to fix that crappy API requires greenlets or threads, at which point you might as well just use WSGI 2.

In the Nginx case, you can either do WSGI 1 in C and then use an adapter to provide WSGI 2, or you can expose your C API to Python and write a small greenlets-using Python wrapper to support suspending.
It would look something like:

    def gateway(request_info, app):
        # set up environ
        run(greenlet(lambda: Finished(app(environ))))

    def run(child):
        while not child.dead:
            data = child.switch()
            if isinstance(data, Finished):
                send_status(data.status)
                send_headers(data.headers)
                send_response(data.response)
            else:
                perform_appropriate_action_on(data)
                if data.suspend:
                    # arrange for run(child) to be re-called later, then...
                    return

Suspension now works by switching back to the parent greenlet with command objects (like Finished()) to tell the run() loop what to do. The run() loop is not stateful, so when the task is unsuspended, you simply call run(child) again. A similar structure would exist for send_response() - i.e., it's a loop over the response, can break out of the loop if it needs to suspend, and arranges for itself to be re-called at the appropriate time.

Voila - you now have asynchronous WSGI 2 support. Now, whether you actually *want* to do that is a separate question, but as (I hope) you can see, you definitely *can* do
Re: [Web-SIG] WSGI and start_response
At 08:06 PM 4/8/2010 +0200, Manlio Perillo wrote:
> What I'm trying to do is:
>
> * as in the example I posted, turn the Mako render function into a generator. The reason is that I would like to implement support for Nginx subrequests.

By subrequest, do you mean that one request is invoking another, like one WSGI application calling multiple other WSGI applications to render one page containing contents from more than one?

> During a subrequest, the generated response body is sent directly to the client, so it is necessary to be able to flush the Mako buffer

I don't quite understand this, since I don't know what Mako is, or, if it's a template engine, what flushing its buffer would have to do with WSGI buffering.

>> Under WSGI 1, you can do this by yielding empty strings before calling start_response.
>
> No, in this case this is not what I need to do.

Well, if that's not when you're needing to suspend the application, then I don't see what you're losing in WSGI 2.

> I need to call start_response, since the greenlet middleware will yield data to the caller before the application returns.

I still don't understand you. In WSGI 1, the only way to suspend execution (without using greenlets) prior to determining the headers is to yield empty strings.

I'm beginning to wonder if maybe what you're saying is that you want to be able to write an application function in the form of a generator? If so, be aware that any WSGI 1 app written as:

    def app(environ, start_response):
        start_response(status, headers)
        yield "foo"
        yield "bar"

can be written as a WSGI 2 app thus:

    def app(environ):
        def respond():
            yield "foo"
            yield "bar"
        return status, headers, respond()

This is also a good time for people to learn that generators are usually a *very bad* way to write WSGI apps - yielding is for server push or sending blocks of large files, not tiny strings. In general, if you're yielding more than one block, you're almost certainly doing WSGI wrong. The typical HTML, XML, or JSON output that's 99% of a webapp's requests should be transmitted as a single string, rather than as a series of snippets.

IOW, the absence of generator support in WSGI 2 is a feature, not a bug.

> In my new attempt I plan to:
>
> 1) Implement the simple suspend/resume extension
> 2) Implement a Python extension module that wraps the Nginx events system.
> 3) Implement a pure Python WSGI middleware that, using greenlets, will enable normal applications to take advantage of Nginx async features.

I think maybe I'm understanding a little better now -- you want to implement the WSGI gateway entirely in C, without using any Python, and without using the greenlet API directly. I think I've been unable to understand because I'm thinking in terms of a server implemented in Python, or at least that has the WSGI part implemented in Python.

> Do you think it will be possible to implement all the requirements of WSGI 2 (including Python 3.x support) in a simple adapter on top of WSGI 1.0?

My practical experience with Python 3 is essentially nonexistent, but being able to implement WSGI 2 in terms of WSGI 1 is a *design requirement* for WSGI 2; it's likely that much early use and development of WSGI 2 will be done through such an adapter.

> And what about applications that need to use the WSGI 1.0 API but require to run with Python 3.x?

That's a tougher nut to crack; again, my practical experience with Python 3 is essentially nonexistent.
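The adapter direction mentioned here, running a WSGI 2 style application on a WSGI 1 server, is small enough to sketch. The sketch assumes the draft calling convention used in this thread, where the application takes only `environ` and returns a `(status, headers, body)` triple; `wsgi2_to_wsgi1` and `hello2` are hypothetical names, and the chunks are bytes because the example targets Python 3.

```python
def wsgi2_to_wsgi1(app2):
    """Wrap a draft 'WSGI 2' callable so a WSGI 1 server can call it."""
    def app1(environ, start_response):
        # The WSGI 2 app decides status and headers before any body is produced,
        # so start_response can be called immediately on its behalf.
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body
    return app1


def hello2(environ):
    """The 'foo'/'bar' example from the message above, in the draft WSGI 2 style."""
    def respond():
        yield b"foo"
        yield b"bar"
    return "200 OK", [("Content-Type", "text/plain")], respond()
```

Usage would be along the lines of passing `wsgi2_to_wsgi1(hello2)` to any WSGI 1 server in place of an ordinary application object. The reverse direction is the harder one, because of write() and of generator apps that call start_response late.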
Re: [Web-SIG] WSGI and start_response
At 05:40 PM 4/8/2010 +0200, Manlio Perillo wrote:
> With WSGI 2.0 we will end up with:
>
> - WSGI 1.0, a full featured protocol, but with hard to implement middlewares
> - WSGI 2.0, a simple protocol, with easier to implement middlewares but without support for some "advanced" applications

Let me see if I understand what you're saying. You want to support suspending an application, without using greenlets or threads. Under WSGI 1, you can do this by yielding empty strings before calling start_response. Under WSGI 2, you can only do this by directly suspending execution, e.g. via greenlet or eventlets or some similar API provided by the server. Is this your objection?

As far as I know, nobody has actually implemented an async app facility for WSGI 1, although it sounds like perhaps you're trying to design or implement such a thing now. If so, then there's nothing stopping you from implementing a WSGI 1 server and providing a WSGI 2 adapter, since as you point out, WSGI 2 is easier to implement on top of WSGI 1 than the other way around.

(Note, however, that if you simply use a greenlet or eventlet-based API for your async server, then the problem is neatly solved whether you are using WSGI 1 or 2, and the effective API is a lot cleaner than yielding empty strings.)
Re: [Web-SIG] WSGI and start_response
At 04:59 PM 4/8/2010 +0200, Manlio Perillo wrote: Aaron Watters ha scritto: > someone remind me: where is the canonical WSGI 2 spec? http://wsgi.org/wsgi/WSGI_2.0 > I assume there is a way to "wrap" WSGI 1 applications > without breaking them? Or is this the regex-->re fiasco > all over again? > start_response can be implemented by a function that will store the status code and response headers. There should be a sample WSGI 2.0 implementation for CGI, and a sample WSGI 1.0 -> 2.0 adapter. This adapter should be able to support the coroutine example, > http://paste.pocoo.org/show/199202/ but I would like to test. write callable, as far as I know, can not be implemented. Implementing it requires greenlets or threads, but it's implementable. See: http://mail.python.org/pipermail/web-sig/2009-September/003986.html (Btw, I've noticed that this early sketch of mine doesn't support the case where an application is a generator, because start_response won't have been called when the application returns. This can be fixed, but it requires the addition of a wrapper class and a few other annoying details. It also doesn't support exc_info properly, so it's still a ways from being a correct WSGI 1 server implementation. Getting rid of all these little variations, though, is the goal of having a WSGI 2 - it's difficult to write *any* middleware to be completely WSGI 1 compliant.) ___ Web-SIG mailing list Web-SIG@python.org Web SIG: http://www.python.org/sigs/web-sig Unsubscribe: http://mail.python.org/mailman/options/web-sig/archive%40mail-archive.com
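A rough sketch of the idea quoted above, a start_response that merely stores the status and headers, extended to cope with the generator case. It is not the converter from the linked 2009 post; the names are hypothetical and the draft WSGI 2 convention of returning `(status, headers, body)` is assumed. It deliberately leaves out the hard parts: write() raises instead of working (a real version needs greenlets or threads, as noted above), exc_info is ignored, and the wrapped iterable's close() method is not forwarded.

```python
import itertools


def wsgi1_to_wsgi2(app1):
    """Wrap a WSGI 1 application as a draft 'WSGI 2' callable (partial sketch)."""
    def app2(environ):
        captured = {}

        def start_response(status, headers, exc_info=None):
            # Store the values instead of sending them; exc_info is ignored here.
            captured["status"] = status
            captured["headers"] = list(headers)

            def write(data):
                raise NotImplementedError("write() is not supported by this sketch")
            return write

        body = iter(app1(environ, start_response))
        pending = []
        # A generator app only calls start_response once iteration begins,
        # so pull chunks until it has been called (or the body runs out).
        while "status" not in captured:
            try:
                pending.append(next(body))
            except StopIteration:
                break
        if "status" not in captured:
            raise RuntimeError("application never called start_response")
        return captured["status"], captured["headers"], itertools.chain(pending, body)
    return app2


def generator_app(environ, start_response):
    """Hypothetical WSGI 1 app written as a generator."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"foo"
    yield b"bar"
```

The loop that pre-fetches chunks is the "wrapper" detail the message alludes to: without it, a generator application would return before start_response had ever run.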
Re: [Web-SIG] WSGI and start_response
At 04:08 PM 4/8/2010 +0200, Manlio Perillo wrote:
> Hi. Some time ago I objected to the decision to remove the start_response function from the next version of WSGI, using as rationale the fact that without start_response, asynchronous extensions are impossible to support. Now I have found that removing start_response will also make it impossible to support coroutines (or, at least, some coroutine usage). Here is an example (this is the same example I posted a few days ago): http://paste.pocoo.org/show/199202/
>
> Forgetting about the write callable, the problem is that the application starts to yield data when the tmpl.render_unicode function is called. Please note that this has *nothing* to do with asynchronous applications. The code should work with *all* WSGI implementations. In the pasted example, the Mako render_unicode function is "turned" into a generator, with a simple function that allows flushing the current buffer. Can someone else confirm that this code is impossible to support in WSGI 2.0?

I don't understand why it's a problem. See my previous post here: http://mail.python.org/pipermail/web-sig/2009-September/003986.html for a sketch of a WSGI 1-to-2 converter. It takes a WSGI 1 application callable as the input, and returns a WSGI 2 function.
Re: [Web-SIG] wsgi.errors and close method
At 08:10 PM 3/27/2010 +0100, Manlio Perillo wrote:
> Some time ago, someone reported to me that an application embedded in Nginx with my WSGI module failed to execute, since in my implementation the wsgi.errors object does not implement the .close method.

We should probably note in the spec that WSGI applications have no business closing the errors object; ISTM it is a completely meaningless action.