I think we're uncovering important assumptions / facts here. For clarity: I'm not interested in a nice API for HTTP/2. I want HTTP/2 and its full featureset to be *possible*, *efficient* and *clear* in a protocol that can replace WSGI - and do so with a fair chance of adoption. Ditto websockets. Neither is possible within WSGI today: the base protocol is insufficient, and every implementation of either HTTP/2 or Websockets for app writers only works by depending on extensions that don't meet the basic design principles - for instance exposing the actual server socket as an extension, which mod_wsgi cannot do.
So, basic axioms I've been working from:

* HTTP/2 cannot be tunnelled through HTTP/1: it can be downgraded, but not tunnelled. An HTTP/2 -> HTTP/1.1 -> HTTP/2 chain is not capable of the same results as a straight HTTP/2 connection (or chain).
* This almost certainly applies to WSGI as well: WSGI2 -> WSGI1 -> WSGI2 will have to downgrade to WSGI1. Some things may be tunnelable [and we can try to do that], but the full set of features almost certainly cannot.

From this I drew the proposal to do interop by providing an API [not protocol] that provides WSGI1 on the top and WSGI2 on the bottom, and another that does the reverse: allowing folk to upgrade individual middleware piecemeal, and get the full benefits whenever they have a fully upgraded stack. E.g. leave upgrading debug middleware to the end. Perhaps this is misguided and implementors will reject such assistance?

On 28 September 2014 07:55, PJ Eby <p...@telecommunity.com> wrote:
> On Sat, Sep 27, 2014 at 12:20 AM, Robert Collins
> <robe...@robertcollins.net> wrote:
>> We should capture these design principles somewhere FAQ-like, since
>> many of the folk participating in this rework weren't part of the
>> original design.
>
> A lot of it is in the PEP itself, albeit in ways that seem a lot more
> obscure now, 10 years later, than they did at the time of writing.
> It's also spread out among different parts, including the FAQ at the
> end. :)

I am familiar with the PEP, so yeah, does feel a bit obscure :). Thank you for chiming in to reinforce them.

> Any feature which is added solely to entice an end-consumer of WSGI
> (vs. a framework or library implementer) is 100% wasted.

I understand that argument, but...

...

> If WSGI 2 adds features that users want, and library/framework
> developers can reasonably add those features to *their* APIs, then
> there is a chance that they will do so.
> But if they have to throw out
> their whole existing paradigm to do that, or users have to abandon
> their framework to adopt the WSGI 2 paradigm, then nothing was really
> gained by the effort.

Libraries and frameworks exist for the same users. WSGI's ability to say 'and this is up to library/framework developers' is contingent on the protocol being *sufficient* for folk to do that. I suspect a bunch of our discussions are going to end up being around whether specific changes are necessary or are things libraries can do.

> Basically, going after end users puts you in a "boil the ocean"
> position. That is, a situation where you must more or less convince
> everybody to change at the same time in order for the standard to
> reach critical mass.

I had hoped not, due to proposing that we provide an API [not protocol] for adapting between the protocols. That would exist solely to give implementors an easier time bringing support in incrementally. So - I think you're misinterpreting my thrust as being 'after end users' - I'm not: I'm squarely focused on the implementation problems of server and middleware authors.

> However, if you are *not* trying to boil the ocean by attracting end
> users, then anything that you do to benefit them (at the expense of
> framework, middleware, or server authors) is pure waste, since the
> incremental strategy (that WSGI was based on in the first place)
> doesn't depend on end-users using the raw WSGI protocol. As the PEP
> itself explains:
>...
> If you replace "WSGI" with "WSGI 2" in the above, the rationale
> remains unchanged.

Sure.

>>> The above API is cute and clean for the app writer, but for a
>>> middleware writer it's a barrel of misery. *Every* piece of
>>> middleware that even wants to *read* anything from the response (let
>>> alone modify it), now needs to check types of yielded values,
>>> accumulate headers, and maybe buffer content.
>>> And there are many ways
>>> to write that middleware that will be wrong, but *appear* right
>>> because the author didn't think of all the ways that an app could
>>> violate the middleware author's assumptions.
>>
>> Hang on, why would they buffer content? Buffering response content is
>> currently verboten, and I haven't seen any proposal to change that. I
>> don't understand how phrasing the API as I suggested would lead to
>> buffering being permitted or required.
>
> By "content" I was actually talking about the headers or other
> metadata. Sorry for the confusion.

No worries. Right now buffering of headers is required - the whole 'until the iterator returns a non-empty bytestring' bit - sure, I'd like to get rid of that. I still don't see a case where the generator based protocol would force buffering of headers [outside of the context of middleware that actually wants to buffer headers].

>> If it's a method on the response body, then returning a list or
>> generator no longer works, unless you start poking random attributes
>> onto things. It would also be inconsistent - why would trailers be a
>> method on the response, but headers be a dict in the return value?
>
> (FWIW, I never proposed making headers a dict. That's a bad idea, IMO.)

Could you enlarge on that? There have been lots of [often security related] bugs in implementations of HTTP/1.x which were due to protocol handlers *not* treating the headers as dicts. Things like appending a header that cannot be repeated, where in an N-tier deployed system the first layer consults the last header and the second layer consults the first. HTTP's header model could be modelled as {header: [value, ...]} or even more strictly as {header: value_or_list_value}. I'm going to guess and say 'a list is necessary, a dict isn't, and someone can write middleware to sanitise response headers'?

> As for returning a list or generator, I don't see why you can't do e.g.
>
>     return status, headers, trailing_signature(body, ...)
>
> Where trailing_signature is a function that returns an iterator with
> appropriate annotation, wrapping the original iterable. That works
> whether body is a list or a generator or some other custom iterable.
>
> ("Poking random attributes onto things" isn't a requirement, IOW.)

yield from in recent Pythons could make that fairly efficient, OK. That still leaves the inconsistency between an immediate value for headers and a late bound value for trailers, but perhaps that's OK.

..

> Sure -- the existence of bytes is an obvious win, as is the dropping
> of start_response. But if you want WSGI 2 to be *interoperable* with
> WSGI 1, or more precisely, if we want to support *tunneling* WSGI 2
> through a WSGI 1 stack, then the design has to be at least somewhat
> constrained by WSGI 1.

Ok, so I don't think we *can* do that, and in fact I think we shouldn't. I think we *can* do the following:
- make WSGI2 degrade to WSGI1 via an adapter
- tunnel WSGI1 through WSGI2

I may be wrong, and if we're clever enough - great. OTOH some of the changes we're discussing - like getting rid of start_response and making bidirectional channels possible - are pretty fundamentally different to WSGI1, and I'd be worried about a protocol that requires middleware authors to write to *both* WSGI1 and WSGI2 at the same time. I think that's an unnecessary burden and one that will hinder adoption.

> So, I don't see a problem with creating a response object per se. I
> was just thinking that with middleware, you really want to be able to
> mix and match what features are being returned with the response, so
> unless you use `__getattr__` proxying, or it's required that response
> objects allow arbitrary attributes to be added, then the paradigm "bag
> of related features in a dictionary" better fits the requirement than
> "return an object".

Ok.
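
To make the adapter idea concrete, here is a minimal sketch of the two directions. It assumes a purely hypothetical WSGI 2 calling convention, app2(environ) -> (status, headers, body_iterable); nothing of the sort has been agreed, and the names wsgi2_to_wsgi1, wsgi1_to_wsgi2, hello2 and lazy1 are made up for illustration. The legacy write() callable is deliberately left unsupported.

```python
# Hypothetical WSGI 2 shape, assumed for illustration only:
#     app2(environ) -> (status, headers, body_iterable)


def wsgi2_to_wsgi1(app2):
    """Expose a hypothetical WSGI 2 app to a WSGI 1 server or middleware."""
    def app1(environ, start_response):
        status, headers, body = app2(environ)
        start_response(status, headers)
        return body
    return app1


def wsgi1_to_wsgi2(app1):
    """Expose a WSGI 1 app through the hypothetical WSGI 2 convention."""
    def app2(environ):
        captured = {}

        def write(data):
            # The legacy write() callable is left out of this sketch.
            raise NotImplementedError("write() is not supported here")

        def start_response(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = list(headers)
            return write

        result = app1(environ, start_response)
        iterator = iter(result)
        # WSGI 1 allows start_response to be delayed until the first body
        # chunk is produced, so pull one chunk before reading the status.
        first = next(iterator, None)
        if "status" not in captured:
            raise RuntimeError("WSGI 1 app never called start_response")

        def body():
            try:
                if first is not None:
                    yield first
                yield from iterator
            finally:
                # Honour the WSGI 1 close() contract once iteration ends.
                close = getattr(result, "close", None)
                if close is not None:
                    close()

        return captured["status"], captured["headers"], body()
    return app2


def hello2(environ):
    # Example app written against the hypothetical WSGI 2 convention.
    return "200 OK", [("Content-Type", "text/plain")], [b"hello"]


def lazy1(environ, start_response):
    # Example WSGI 1 generator app: start_response only runs on the
    # first next(), which is why the adapter pulls one chunk eagerly.
    start_response("200 OK", [("Content-Type", "text/plain")])
    yield b"a"
    yield b"b"
```

The round trip wsgi1_to_wsgi2(wsgi2_to_wsgi1(hello2)) only loses nothing because the assumed convention carries no features beyond WSGI 1. Anything genuinely new (trailers, bidirectional channels) would have nowhere to go in the WSGI 1 leg, which is the "degrade, not tunnel" point above. The eager pull of the first chunk is also exactly the header buffering I'd like a new protocol to get rid of.
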
>>> So, let's trim the sharp edges for the poor middleware and server
>>> developers, rather than polishing the bits that app writers aren't
>>> going to be using, anyway. (Since most of them are going to be using
>>> Django, Pyramid, Flask, or whatever the latest hotness is, anyway.)
>>
>> Do you have a hitlist of such sharp edges you'd like to see catered
>> for in this new spec?
>
> The ones described in the wsgi_lite docs:
>
> 1. People forgetting that the environ is volatile
> 2. People forgetting to close()
> 3. The horror that is the stateful nature of the current protocol (all
>    the rules on what can be called when)
>
> In wsgi_lite I addressed #1 by providing the binding protocol to map
> desired request data to keyword arguments. #2, by the "closing"
> extension, and #3 by switching to a functional paradigm rather than an
> imperative one. (Thus eliminating any rules on what can be called
> when, because the response is a return value, not an invocation of
> something.)

Has wsgi_lite been picked up by server and middleware authors? Do we have any feedback on how well it's working?

> All in all, it kind of sounds to me like what you *really* want is to
> make a user-level API for HTTP/2 applications. And maybe it would be
> a good idea to do that *first*, without reference to tweaking WSGI.

So, my personal driver is that I have multiple use cases, most but not all of which are end user use cases, that depend on HTTP/2 or will benefit from HTTP/2. A user level API is certainly a thing that will need to exist, but all the servers around so far are just degrading HTTP/2 to WSGI - the lingua franca. One perhaps unintended consequence of WSGI is that it's become that lingua franca, and many things are internally structured around middleware stacks :). So the first thing that needs to be done is a WSGI-like thing and internal code shuffling.
You're right though that more implementor experience would be good - I'm hoping to be doing that on the basis of drafts and discussion.

...

> And finally, we could look at that protocol and say, "okay, can we
> encapsulate this protocol in such a way that it can be safely tunneled
> through WSGI 1?"

If it can :).

> Each of these stages has benefit. If you only get through the first,
> at least it's possible to do HTTP/2 in Python! If you get through the
> second, well, maybe it's not WSGI, but at least it's a protocol (SSGI?
> H2GI?). And so on.
> I guess what I'm saying is, based on what you seem to be trying to do,
> I think trying to update WSGI is *way* premature. Even WSGI wasn't
> proposed in a vacuum: it was based on looking at the APIs provided by
> existing Python-supporting web servers and required by existing Python
> web frameworks. So, in the absence of even *one* HTTP/2 framework API
> to drive the requirements, it's probably premature to propose paradigm
> shifts in WSGI itself.

So, there are multiple examples of websockets today, which share much in common with HTTP/2. All of them require server support, and tunnel through WSGI in ways that are liable to break (e.g. a middleware that remotes objects will almost certainly fail to handle the raw socket).

> Does an HTTP/2 server or API for Python even *exist* yet?

Yes. http://nghttp2.org/documentation/package_README.html#python-bindings

The model is of a handler class, and four events - headers, data, request fully received, stream closed. It supports push, but in a way that prevents implementing a notification server such as the one that https://tools.ietf.org/html/draft-thomson-webpush-http2-00 specifies.

-Rob

--
Robert Collins <rbtcoll...@hp.com>
Distinguished Technologist
HP Converged Cloud