Re: [Web-SIG] WSGI2: write callable?
On Fri, Sep 26, 2014 at 9:58 PM, PJ Eby p...@telecommunity.com wrote:
> On Thu, Sep 25, 2014 at 11:32 PM, Robert Collins robe...@robertcollins.net wrote:
>> So I propose we drop the write callable, and include a queue based implementation in the adapter for PEP- code.
>
> If you're dropping write(), then you might as well drop start_response() altogether, and replace it with returning a (status, headers, body-iterator) tuple, as in wsgi_lite ( https://github.com/pjeby/wsgi_lite ) or as found in other languages' versions of WSGI. (start_response+write was only ever needed in order to support legacy apps, so other languages never bothered.)
>
> wsgi_lite has a couple of other protocol extensions, namely the 'wsgi_lite.closing' environment key, flagging callables' supported WSGI version (for transparent interop), and the argument binding protocol, but for the most part these are orthogonal to the calling schema. I would suggest, however, that the calling protocol be flagged in some way to allow easier interop.

I quite like the idea of always returning an iterator for the body; it would simplify the code a lot.

As for returning the status and the other things, I quite agree, but IMO we also need to return an extra parameter in which the application or the middleware could maintain state or something like it.

Thoughts?

- benoit

___
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig
Unsubscribe: https://mail.python.org/mailman/options/web-sig/bchesneau%40gmail.com
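For readers following along, the tuple-returning calling convention discussed above might look roughly like the sketch below. This is only an illustration of the idea, not wsgi_lite's actual API: the names `app` and `serve_once` are made up here, and the toy "server" collects the response in memory instead of writing to a socket.

```python
def app(environ):
    """Hypothetical tuple-returning app: no start_response(), no write()."""
    body = [b"hello world"]
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(sum(len(chunk) for chunk in body))),
    ]
    return "200 OK", headers, body


def serve_once(app, environ):
    """Toy stand-in for a server: call the app and collect its response."""
    status, headers, body = app(environ)
    try:
        payload = b"".join(body)
    finally:
        # Honour the usual WSGI convention of closing the body if it can be closed.
        close = getattr(body, "close", None)
        if close is not None:
            close()
    return status, headers, payload


status, headers, payload = serve_once(app, {"PATH_INFO": "/"})
```

The point of the shape is that status and headers are available as plain return values, so there is no sequencing between a callback and the body iterator to get wrong.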
Re: [Web-SIG] WSGI2: write callable?
On 27 September 2014 08:21, Benoit Chesneau bchesn...@gmail.com wrote:
> On Fri, Sep 26, 2014 at 5:32 AM, Robert Collins robe...@robertcollins.net wrote:
>> ...
>> So I propose we drop the write callable, and include a queue based implementation in the adapter for PEP- code.
>>
>> -Rob
>
> What would be the advantage of using a queue compared to simply writing to the server? Internally the server can use a queue, but why should the client know about it? What is the reasoning behind it?

The point is to remove the complexity of having both an iterator over content *and* a write method. That's really complex for server [and middleware] writers. So the interface to send bytes to the container would just be 'yield them'. (Or return a fully populated list.)

So the point about the Queue is that to support PEP- we either need to retain the write() callable, or we need an adapter that can expose on its upper side the iterator we want, and on the lower side accept *either* an iterator *or* use of the write() method - I think you'll find that's quite hard to write without a Queue or similar construct.

-Rob

--
Robert Collins rbtcoll...@hp.com
Distinguished Technologist
HP Converged Cloud
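To make the Queue argument concrete, here is a minimal sketch of what such an adapter could look like. It is an assumption-laden illustration, not a proposed implementation: `adapt_legacy` is a hypothetical name, the legacy app runs in a helper thread, the queue is unbounded (so there is no backpressure), and error handling is kept to the bare minimum.

```python
import queue
import threading

_DONE = object()  # sentinel: the legacy app has produced all of its output


def adapt_legacy(legacy_app):
    """Wrap a start_response/write() app so it returns (status, headers, body)."""

    def new_style_app(environ):
        chunks = queue.Queue()
        response = {}

        def enqueue(data):
            if data:  # empty chunks carry no body bytes, so skip them
                chunks.put(data)

        def start_response(status, headers, exc_info=None):
            response["status"] = status
            response["headers"] = list(headers)
            return enqueue  # write() just feeds the same queue as the iterator

        def run():
            try:
                result = legacy_app(environ, start_response)
                try:
                    for chunk in result:
                        enqueue(chunk)
                finally:
                    close = getattr(result, "close", None)
                    if close is not None:
                        close()
            except Exception as exc:
                response["error"] = exc
            finally:
                chunks.put(_DONE)

        threading.Thread(target=run, daemon=True).start()

        # Block until the first body chunk (or the end of output) so that
        # start_response() has been called before status is returned.
        first = chunks.get()
        if first is _DONE and "error" in response:
            raise response["error"]

        def body():
            item = first
            while item is not _DONE:
                yield item
                item = chunks.get()
            if "error" in response:
                raise response["error"]

        return response["status"], response["headers"], body()

    return new_style_app


def legacy(environ, start_response):
    write = start_response("200 OK", [("Content-Type", "text/plain")])
    write(b"hello ")   # legacy push-style output
    return [b"world"]  # plus ordinary iterator output


status, headers, body = adapt_legacy(legacy)({})
payload = b"".join(body)
```

Both output paths end up in one queue, which is what lets the upper side be a plain iterator regardless of how the legacy app chose to emit bytes.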
Re: [Web-SIG] WSGI2: write callable?
On Fri, Sep 26, 2014 at 5:02 PM, Robert Collins robe...@robertcollins.net wrote:
> But perhaps it would be nicer to say: iterator of headers_dict_or_body_bytes
>
> With the first item yielded having to be headers (or error thrown), and the last item yielded may be a dict to emit trailers. So:
>
>     def app(environ):
>         yield {':status': '200'}
>         yield b'hello world'
>         yield {'Foo': 'Bar'}
>
> is an entirely valid, if trivial, app. What do you think?

I think this would make it harder to write middleware, actually, and for the same reason that I dislike folding status into the headers. It's a case of "flat is better than nested", I think, in both cases. That is, if the status is always required, it's easier to validate its presence in a 3-tuple than nested inside another data structure.

As far as trailers go, I'm not sure what those are used for or how they'd be used in practice, but my initial thought is that they should be attached to the response body, analogous to how FileWrapper works.

The other alternative is to use a dict as the response object (analogous to environ as the request object), with named keys for status, headers, trailers, body, etc. It would then be extensible to handle things like the "Associated content" concept. In this way, middleware that is simply passing things through unchanged can do so, while middleware that is creating a new response can discard the old object.

>> wsgi_lite has a couple of other protocol extensions, namely the 'wsgi_lite.closing' environment key, flagging callables' supported WSGI version (for transparent interop), and the argument binding protocol, but for the most part these are orthogonal to the calling schema. I would suggest, however, that the calling protocol be flagged in some way to allow easier interop.
>
> We're bumping the WSGI version, will that serve as a sufficient flag?

I mean, flagged on the app end. For example, wsgi_lite marks apps that support wsgi_lite with a true-valued `__wsgi_lite__` attribute. In this way, a container invoking the app knows it can be called with just an environ (and no start_response).

So, I'm saying that an app callable would opt in to this new WSGI version, so that servers and middleware don't need to grow new APIs for registering apps -- they can auto-detect. Also, having auto-detection means you can write a decorator (e.g. in wsgiref), to wrap and convert WSGI 1 apps to WSGI 2, without needing to know if you're passing something already wrapped. It means that a WSGI 2 server or middleware can just wrap whatever apps it sees, and get back a WSGI 2 app, whether the thing it got was WSGI 1 or WSGI 2.

> The closing thing is nice - it's basically unittest.TestCase.addCleanup for WSGI, allowing apps to not have to write a deep nested finally. Let's start a new thread about the design for that specifically.
>
> You note that exception management isn't defined yet - perhaps we can tackle that as a group?

Sure.
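The auto-detection idea above could be sketched along these lines. Everything named here is hypothetical: the `__wsgi2__` flag, the `wsgi2` marker decorator and `ensure_wsgi2` are stand-ins modelled on wsgi_lite's `__wsgi_lite__` attribute, and the wrapper deliberately does not support write() at all.

```python
import functools


def wsgi2(app):
    """Mark a callable as a native tuple-returning app (hypothetical flag)."""
    app.__wsgi2__ = True
    return app


class _Body:
    """Body iterable that forwards close() to the wrapped WSGI 1 result."""

    def __init__(self, first, rest, result):
        self._first, self._rest, self._result = first, rest, result

    def __iter__(self):
        if self._first is not None:
            yield self._first
        yield from self._rest

    def close(self):
        close = getattr(self._result, "close", None)
        if close is not None:
            close()


def ensure_wsgi2(app):
    """Return a tuple-returning app; wrap only if `app` is not flagged."""
    if getattr(app, "__wsgi2__", False):
        return app  # already native or already wrapped: nothing to do

    @functools.wraps(app)
    def wrapper(environ):
        captured = {}

        def start_response(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = list(headers)

            def write(data):
                raise NotImplementedError("write() is not supported in this sketch")

            return write

        result = app(environ, start_response)
        rest = iter(result)
        # Pull one chunk so apps that call start_response() lazily have done so.
        first = next(rest, None)
        return captured["status"], captured["headers"], _Body(first, rest, result)

    wrapper.__wsgi2__ = True
    return wrapper


def old_app(environ, start_response):
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"a", b"b"]


@wsgi2
def native_app(environ):
    return "204 No Content", [], []


converted = ensure_wsgi2(old_app)
status, headers, body = converted({})
payload = b"".join(body)
```

Because the wrapper carries the flag itself, wrapping is idempotent: a server or middleware can call `ensure_wsgi2` on anything it is handed without tracking what has already been converted.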
Re: [Web-SIG] WSGI2: write callable?
On 27 September 2014 10:31, PJ Eby p...@telecommunity.com wrote:
> On Fri, Sep 26, 2014 at 5:02 PM, Robert Collins robe...@robertcollins.net wrote:
>> But perhaps it would be nicer to say: iterator of headers_dict_or_body_bytes
>>
>> With the first item yielded having to be headers (or error thrown), and the last item yielded may be a dict to emit trailers. So:
>>
>>     def app(environ):
>>         yield {':status': '200'}
>>         yield b'hello world'
>>         yield {'Foo': 'Bar'}
>>
>> is an entirely valid, if trivial, app. What do you think?
>
> I think this would make it harder to write middleware, actually, and for the same reason that I dislike folding status into the headers. It's a case of "flat is better than nested", I think, in both cases. That is, if the status is always required, it's easier to validate its presence in a 3-tuple than nested inside another data structure.

I'm intrigued here - validation of the status code is tied into the details of the headers. For instance, 301/302 need a Location header to be valid. So I don't understand how it's any easier with status split out. I'd be delighted to whip up a few contrasting middleware samples to let us compare and contrast.

Note too that folk can still return bad status codes with a different layout (status, headers, body, trailers):

    return None, {}, [], {}

One thing we could do with the status code in the headers dict is to default to 200 - the vastly common case (in the same way that throwing an error generates a 500). Then status wouldn't be required at all for trivial uses. That would make things easier, no?

> As far as trailers go, I'm not sure what those are used for or how they'd be used in practice, but my initial thought is that they should be attached to the response body, analogous to how FileWrapper works.

So a classic example for Trailers is digitally signing streamed content.
Using the same strawman API as above:

    import hashlib

    # block_reader() and sign_bytes() are helpers assumed to exist elsewhere.
    def app(environ):
        yield {':status': '200'}
        md5sum = hashlib.md5()
        for block in block_reader(open('foo', 'rb'), 65536):
            md5sum.update(block)
            yield block
        digest = md5sum.hexdigest()
        signature = sign_bytes(digest.encode('utf8'))
        yield {'Content-MD5Sum': digest, 'X-Signature': signature}

Note that this doesn't need to buffer or use a closure.

Writing that with a callback for trailers (which is the only alternative - it's either a callback or a generator - because until the body is fully handled the content of the trailers cannot be determined):

    def app(environ):
        md5sum = hashlib.md5()
        def body():
            for block in block_reader(open('foo', 'rb'), 65536):
                md5sum.update(block)
                yield block
        def trailers():
            digest = md5sum.hexdigest()
            signature = sign_bytes(digest.encode('utf8'))
            yield {'Content-MD5Sum': digest, 'X-Signature': signature}
        return '200', {}, body, trailers

> The other alternative is to use a dict as the response object (analogous to environ as the request object), with named keys for status, headers, trailers, body, etc. It would then be extensible to handle things like the "Associated content" concept.

That might work, though it will force more closures. One of the things I like about the generator style is the clarity in code that we can achieve.

> In this way, middleware that is simply passing things through unchanged can do so, while middleware that is creating a new response can discard the old object.

That seems to apply either way, right? Here's a body-size logging middleware:

    def logger(app):
        def middleware(environ):
            wrapped = app(environ)
            yield next(wrapped)  # pass the headers dict through
            body_bytes = 0
            for maybe_body in wrapped:
                if type(maybe_body) is bytes:
                    body_bytes += len(maybe_body)
                yield maybe_body
            logging.info("Saw %d bytes for %s" % (body_bytes, environ['PATH_INFO']))
        return middleware

..

>> We're bumping the WSGI version, will that serve as a sufficient flag?
>
> I mean, flagged on the app end.
> For example, wsgi_lite marks apps that support wsgi_lite with a true-valued `__wsgi_lite__` attribute. In this way, a container invoking the app knows it can be called with just an environ (and no start_response).

OK, so we'd use the absence of such a mark to trigger the WSGI1 adapter automagically? I'm curious if that will work well enough when we are given wsgi_lite or other extensions to WSGI. Perhaps we should refuse to guess and just supply the adapters and instructions?

> So, I'm saying that an app callable would opt in to this new WSGI version, so that servers and middleware don't need to grow new APIs for registering apps -- they can auto-detect. Also, having auto-detection means you can write a decorator (e.g. in wsgiref), to wrap and convert WSGI 1 apps to WSGI 2, without needing to know if you're passing something already wrapped. It means that a WSGI 2 server or middleware can just wrap whatever apps it sees, and get back a WSGI 2 app, whether the
Re: [Web-SIG] WSGI2: write callable?
On Fri, Sep 26, 2014 at 7:41 PM, Robert Collins robe...@robertcollins.net wrote:
> One thing we could do with the status code in the headers dict is to default to 200 - the vastly common case (in the same way that throwing an error generates a 500). Then status wouldn't be required at all for trivial uses. That would make things easier, no?

At the cost of variation. A core design principle of WSGI is that variations make things *harder*, not easier, because it means more alternatives that apps, servers, and middleware have to support, with more code paths and fewer of them properly tested. Every variation that is part of the spec (as opposed to an extension) creates a LOT of complexity in the field. (Which is one reason it'll be nice to get rid of start_response(), and all its convoluted sequencing logic.)

> So a classic example for Trailers is digitally signing streamed content. Using the same strawman API as above:
>
>     import hashlib
>
>     # block_reader() and sign_bytes() are helpers assumed to exist elsewhere.
>     def app(environ):
>         yield {':status': '200'}
>         md5sum = hashlib.md5()
>         for block in block_reader(open('foo', 'rb'), 65536):
>             md5sum.update(block)
>             yield block
>         digest = md5sum.hexdigest()
>         signature = sign_bytes(digest.encode('utf8'))
>         yield {'Content-MD5Sum': digest, 'X-Signature': signature}
>
> Note that this doesn't need to buffer or use a closure.

Please bear in mind that another core WSGI design principle is that we don't make apps easier to write by making servers and middleware harder to write. That kills adoption and growth, because the audience that *needs* to adopt WSGI (or any successor standard) is the audience of people who write servers and middleware.

If a feature is sinfully ugly for the app writer, but a thing of beauty for a middleware author, we *want* that feature. Conversely, if a feature means that *every* piece of middleware now has to add an extra "if" statement to support the feature in order to make it pretty for the app writer, then we do NOT want that feature, and it should be taken out and shot *at once*.
It's not a fair tradeoff, because only server authors and middleware authors *have to* deal with WSGI directly. App authors can use libraries to pretty it up, so we don't need to pretty it for them in advance -- especially since we don't know what their *personal* idea of pretty is going to be. ;-)

The above API is cute and clean for the app writer, but for a middleware writer it's a barrel of misery. *Every* piece of middleware that even wants to *read* anything from the response (let alone modify it) now needs to check types of yielded values, accumulate headers, and maybe buffer content. And there are many ways to write that middleware that will be wrong, but *appear* right because the author didn't think of all the ways that an app could violate the middleware author's assumptions.

On the other hand, if somebody wants to make a library implementing a similar API to your proposal *on top* of WSGI, then sure, why not? That's fine: it only adds overhead at a *single point*: the library that implements the pretty API on top of WSGI.

> Writing that with a callback for trailers (which is the only alternative - it's either a callback or a generator - because until the body is fully handled the content of the trailers cannot be determined):

Doesn't look bad to me. It'd also be fine as a method on the response body, and that would let us stick to (status, headers, body) as a return value.

>> The other alternative is to use a dict as the response object (analogous to environ as the request object), with named keys for status, headers, trailers, body, etc. It would then be extensible to handle things like the "Associated content" concept.
>
> That might work, though it will force more closures. One of the things I like about the generator style is the clarity in code that we can achieve.

Please try to think instead of how you could implement those things in a "make it nice" API for app authors.
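As an illustration of the "library on top" approach, the sketch below adapts the generator-style strawman to a flat (status, headers, body) return value, with trailers exposed as a method on the body as suggested above. The names `pretty`, `StreamBody` and `trailers()` are invented for this example, and it assumes that any dict yielded after the first item is a trailer dict.

```python
class StreamBody:
    """Body iterable that collects trailer dicts yielded after the headers."""

    def __init__(self, events):
        self._events = events
        self._trailers = {}

    def __iter__(self):
        for item in self._events:
            if isinstance(item, dict):
                self._trailers.update(item)  # trailers: keep, do not emit as body
            else:
                yield item

    def trailers(self):
        """Trailers seen so far; complete only once iteration has finished."""
        return dict(self._trailers)

    def close(self):
        close = getattr(self._events, "close", None)
        if close is not None:
            close()


def pretty(generator_app):
    """Adapt a generator-style app to a flat (status, headers, body) app."""

    def app(environ):
        events = generator_app(environ)
        head = next(events)
        if not isinstance(head, dict):
            raise TypeError("first yielded item must be the headers dict")
        head = dict(head)
        status = head.pop(":status")  # status is required: KeyError if absent
        return status, list(head.items()), StreamBody(events)

    return app


@pretty
def hello(environ):
    yield {":status": "200", "Content-Type": "text/plain"}
    yield b"hello world"
    yield {"Foo": "Bar"}


status, headers, body = hello({})
payload = b"".join(body)
trailers = body.trailers()
```

The type checks on yielded values live in this one adapter, so middleware downstream only ever sees the flat tuple.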
WSGI wasn't made ugly on a whim; it's the direct result of some very important design principles. While the need for start_response() is gone, many of the other reasons for its ugliness remain. (In any case, you can still implement a generator-based API for writing WSGI apps, without needing to make WSGI *itself* be implemented that way.)

> Here's a body-size logging middleware:
>
>     def logger(app):
>         def middleware(environ):
>             wrapped = app(environ)
>             yield next(wrapped)  # pass the headers dict through
>             body_bytes = 0
>             for maybe_body in wrapped:
>                 if type(maybe_body) is bytes:
>                     body_bytes += len(maybe_body)
>                 yield maybe_body
>             logging.info("Saw %d bytes for %s" % (body_bytes, environ['PATH_INFO']))
>         return middleware

Perhaps you meant this as a sketch, but note that you're not calling close() on the underlying iterator. At minimum, you need a try/finally to do that, or else you need to use the wsgi_lite closing extension -- and you need to assume that your parent middleware