Per the previous discussion about HTTP/2, websockets, et al, here's my attempt at providing something we can start using and implementing today, as a bridge to future specifications. If you'd prefer to read it nicely formatted, you can find an HTML version in progress at:
https://gist.github.com/pjeby/62e3892cd75257518eb0

I'm very interested in feedback from server and framework developers with relevant experience to help close the "open issues and questions" section. Questions about the content or feedback on its presentation would also be very helpful. (For now, the text is in markdown, but of course I will switch it to ReST once it begins stabilizing.)

# The WSGI Middleware Escape for Native Server APIs

# Overview

This document specifies a proposed standard WSGI extension that allows WSGI applications to "escape" the standard WSGI API and access native web server APIs, such as websockets, HTTP/2 features, or Twisted/tulip-style asynchronous APIs.

The proposed extension, the Middleware Escape for Native Server APIs or "MENSA", allows WSGI to continue to be used for the 98% of typical web application use cases that fall within the basic HTTP/1.0 "request/response" paradigm, while allowing the 2% of use cases with more sophisticated requirements to still benefit from "inbound" WSGI middleware for sessions, authentication, authorization, routing, and so forth, as well as keeping the other advantages of sharing the same process with other WSGI code.

Specifically, the MENSA protocol allows a WSGI application to *dynamically* switch at runtime from using a standard WSGI response, to using a web server's "native" API to handle the current request (and possibly subsequent ones), subject to certain conditions.

This approach provides present-day WSGI applications and frameworks with a smooth upward migration path in the event that they require access to websockets, HTTP/2-specific features, etc.
With it:

* Web servers can expose their native API to any WSGI application or framework
* Application developers can use existing middleware, libraries, or frameworks to handle front-end tasks like routing and authentication
* Frameworks can offer a simple `response.use_native_api(...)` (or similar) API to allow app developers to easily "jump out" of the framework and request the use of a specific native server API for the current request, and
* Even developers using frameworks that *don't* offer this escape API can still use it, by invoking a short utility function given in this specification, and adding a little framework-specific glue code

# Motivation

Recent discussion on the Python Web-SIG about incorporating HTTP/2 features into present-day WSGI has highlighted the extreme difficulties of doing so without breaking certain types of middleware. In addition, it highlighted the strong existing need for Websockets in present-day web apps, and the ways in which existing Websocket extensions for WSGI have the same problems.

Both HTTP/2 and Websockets are a fairly extreme break from the request/response paradigm of HTTP/1.0 that WSGI was designed around, making them difficult to represent within WSGI, and therefore a poor fit for a direct extension of the existing WSGI protocol. Such a direct extension would not only be premature for HTTP/2 (due to a lack of existing HTTP/2 APIs for Python), but would also be unnecessarily confining in terms of what features could be supported, and unnecessarily complex in how those features would need to be implemented.

Therefore, this proposal seeks to defer or *table* ("mensa" is the Latin word for "table") the issue of creating an HTTP/2 WSGI extension API, by making it possible for existing WSGI applications to access *any* such API that existing web servers or server frameworks may wish to provide. (i.e. giving all of them "a seat at the table".)
Thus, it would not be necessary to standardize on the One True Websocket API or One True HTTP/2 API at this time, because server authors can simply expose their native APIs for the use of those web applications that have need of such APIs.

This neatly resolves two issues in the community at present:

1. Often, the only way to mix websockets (or HTTP/2) and WSGI is through separate processes, often with the need to reinvent the wheel for routing and other functions commonly handled by WSGI front-end middleware
2. The "chicken and egg" problem of developing an HTTP/2 API spec when there are few such APIs existing in the field, but nobody wants to *implement* such APIs because nobody can use them from WSGI, and nobody wants to abandon WSGI to write their entire applications or frameworks based on a new and largely-untested API that's not yet blessed as a specification.

In contrast, adoption of the WSGI MENSA spec allows both server developers and application developers to experiment with advanced server APIs, without throwing away their WSGI investments (or native server API investments!), and only making new investments in that portion of the application space that requires access to more advanced APIs.

That is, if the bulk of one's code is still in WSGI, it is still migratable to other server platforms, with only the advanced portions needing to be ported. Thus, the risk of tying one's application too tightly to one particular native API is considerably reduced.

And as community experience with advanced server APIs increases, the practicality of actually defining a *standard* server API for these types of applications also increases. Eventually, such a standard API could then perhaps even replace WSGI, while still being accessible from within legacy WSGI frameworks (via the MENSA escape).

# Scope

Goals of this specification include:

1. Defining a way for WSGI applications, at runtime (i.e., during the execution of a request), to detect the existence of, and access, "native server APIs" which can be used in place of WSGI for either effecting a response to the current request, or initiating a more advanced communications protocol (such as websocket connections, associated content pushing, etc.)
2. Defining ways for WSGI middleware to:
    1. Continue to be used for request routing and other pre-response activities for all requests, as well as post-response activities for requests that do not require native API access
    2. Intercept and assume control of any native APIs to be used by wrapped applications or subrequests (assuming the middleware knows how to do this for a specific native API, and desires to do so)
    3. Disable any or even *all* native API access by its wrapped apps -- even without prior knowledge of *which* APIs might be used -- in the event that the middleware can only perform its intended function by denying such access
3. Defining a way for WSGI servers to negotiate a smooth transition of response handling between standard WSGI and their native API, while safely detecting whether intervening middleware has taken over or altered the response in a way that conflicts with elevating the current request to native API processing

Non-goals include:

* Actually defining any specification for the native APIs themselves ;-)

# Specification

The basic idea of MENSA is to add a dictionary to the WSGI environment, under the key `wsgi.native_api_hooks`. Within this dictionary, a single key is reserved for each non-WSGI API offered by the server (or implemented via middleware).

So, for example, if Twisted were to offer a MENSA escape for WSGI apps, it might register a `twisted` key within the `wsgi.native_api_hooks` dictionary.

## Accessing a Native API

WSGI applications query the `wsgi.native_api_hooks` dictionary in order to access the native API of their choice, and then delegate to it.
So, for example, a pure WSGI app that switches to the `foobar` native API mid-request might look like this:

    def my_wsgi_app(environ, start_response):
        native_apis = environ.get('wsgi.native_api_hooks', {})
        foobar_api = native_apis.get('foobar')

        if foobar_api is None:
            # appropriate error action here
            # i.e. raise something, or return an error response
            raise RuntimeError("foobar API unavailable")

        def my_foobar_app(foobar_specific_arg, another_foobar_arg, *etc):
            # code here that uses the foobar API to do something cool,
            # like maybe websockets or signed streaming trailers or
            # other buzzword-laden stuff ;-)
            pass

        # Delegate the WSGI response to the native API
        return foobar_api(environ, start_response, my_foobar_app)

On the application side, this is all that's necessary for a pure-WSGI application to switch to using a native server API and whatever its advanced features permit. (For applications using frameworks that don't directly expose the WSGI start_response() or allow returning a WSGI response body directly, a little extra glue code is required; those details are covered in a later section of the spec.)

In the above example, `my_foobar_app` is a function, but depending on the specific API involved, it could be a class or an instance of some kind, or perhaps just a data structure of some sort. The nature of the "app" or other parameters passed to the API hook is completely dependent on the design of the API being wrapped: only the first two arguments to the hook are dictated by this specification.

So, for example, a Twisted native API might expect a `Protocol` instance, rather than a function. A gevent-based native API might expect a generator, generator function, or perhaps a greenlet. A websocket API might take *two* parameters, for a writer and reader. Defining and documenting the exact nature of the additional parameters passed to the API hook is entirely up to the hook's provider.
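To make the "error action" branch and the two-parameter case a bit more concrete, here is a minimal sketch of an app that escapes to a websocket-style hook when one is present and otherwise falls back to an ordinary WSGI error response. The API name `example_ws`, the `writer`/`reader` callback shapes, and the choice of a 501 status are all assumptions made for illustration; none of them is defined by this spec or by any real server.

```python
def chat_endpoint(environ, start_response):
    """WSGI app that escapes to a hypothetical 'example_ws' native API."""
    hooks = environ.get('wsgi.native_api_hooks') or {}
    hook = hooks.get('example_ws')  # hypothetical API name, not from the spec

    if hook is None:
        # No such native API on this server: send a plain WSGI response.
        body = b'This endpoint needs a native API that is unavailable.\n'
        start_response('501 Not Implemented', [
            ('Content-Type', 'text/plain'),
            ('Content-Length', str(len(body))),
        ])
        return [body]

    def writer(send):
        # Hypothetical callback: receives a function that sends one message.
        send('hello')

    def reader(message):
        # Hypothetical callback: invoked for each incoming message.
        return message.upper()

    # Only environ and start_response are fixed by the spec; the remaining
    # arguments are whatever this (imaginary) hook's provider documents.
    return hook(environ, start_response, writer, reader)
```

Returning an error response rather than raising keeps the failure visible to outer middleware as a normal WSGI response, which may or may not be what a given application wants.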
## Providing an API

The implementation of a native API hook consists of a callable object, looking something like this pseudocode:

    def some_server_api_hook(environ, start_response, native_app):
        # new_unique_header_compatible_string() and native_request are
        # placeholders for whatever the server actually provides
        response_key = new_unique_header_compatible_string()
        native_request.response_registry[response_key] = native_app
        start_response('399 WSGI-Escape: '+response_key, [
            ('Content-Type', 'application/x-wsgi-escape; id='+response_key),
            ('Content-Length', str(len(response_key)))
        ])
        # PEP 3333 body chunks are bytes; the key is pure ASCII
        return [response_key.encode('ascii')]

As you can see, this is a little bit like a WSGI application -- and in fact it *is* a valid WSGI application, except for the addition of the `native_app` parameter.

The API hook's job is to generate a unique ASCII "native string" key for this response, and register the provided native app (or other arguments) under that key for *future use*.

The server MUST NOT actually invoke or begin using the native application until *after* the standard WSGI response process has been completed, and it has verified that its markers are still present in the WSGI response.

Those markers -- found in the status, headers, *and* response body -- are used to verify three things:

1. That the registered application is indeed a response to the original incoming request, and not merely to a subrequest created by middleware
2. That intervening middleware hasn't replaced the native API response with a response of its own (for example, an error response created because of an error occurring after the native app was registered, but before it was used)
3. *Which* native application should be invoked, if more than one was registered

So, a server providing a native API must wait until it receives a WSGI response whose status, content-type, content-length, and body all unequivocally identify which of the native applications registered for the current request should actually be used.
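As a rough illustration of that check, here is a sketch of what a server might run once the WSGI response is complete, covering the three outcomes spelled out in the paragraphs that follow. The function name `select_native_app`, the `EscapeConflict` exception, and the plain-dict `registry` are assumptions made for this example; one further assumption is that a key which was never registered for the request is treated as not designating any application.

```python
STATUS_PREFIX = '399 WSGI-Escape: '
TYPE_PREFIX = 'application/x-wsgi-escape; id='


class EscapeConflict(Exception):
    """The status, headers, and body disagree about the escape."""


def select_native_app(status, headers, body_chunks, registry):
    """Return the registered native app to activate, or None for plain WSGI.

    `registry` maps the response keys issued during this request to apps.
    """
    status_key = None
    if status.startswith(STATUS_PREFIX):
        status_key = status[len(STATUS_PREFIX):]

    header_key = content_length = None
    for name, value in headers:
        lowered = name.lower()
        if lowered == 'content-type' and value.startswith(TYPE_PREFIX):
            header_key = value[len(TYPE_PREFIX):]
        elif lowered == 'content-length':
            content_length = value

    # Assumption: unknown keys do not designate a registered application.
    if status_key not in registry:
        status_key = None
    if header_key not in registry:
        header_key = None

    if status_key is None and header_key is None:
        return None  # ordinary WSGI response; body is left unconsumed

    expected = (status_key or '').encode('ascii')
    body = b''.join(body_chunks)
    if (status_key != header_key or body != expected
            or content_length != str(len(expected))):
        raise EscapeConflict('ambiguous native API escape markers')
    return registry[status_key]
```

In either the `None` or the `EscapeConflict` case, the caller would still be responsible for discarding everything registered for the request, and in the conflict case for generating the error response itself.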
In the event that the status, type, and body all match, the server MUST then activate the registered native application, allowing the current request (and possibly subsequent requests, depending on the API involved) to be handled via the associated native API. (And discard any other registered applications for the current request.)

In the event that neither the status nor headers designate a registered native application, the server MUST treat the response as a standard WSGI response, and discard all registered applications for the current request.

In the event that the status and headers disagree on *which* native application is to be used (or *whether* one is to be used at all), or in the event that they *do* agree, but the body disagrees with them, the server MUST generate an error response, and discard both the WSGI response and any registered native applications. (In the face of ambiguity, refuse the temptation to guess; errors should not pass silently.)

### Response Key Details

The key used to distinguish responses MUST be an ASCII "native string" (as defined by PEP 3333). It SHOULD also be relatively short, and MUST contain only those characters that are valid in a MIME "token". (That is, it may contain any non-space, non-control ASCII character, except the special characters `(`, `)`, `<`, `>`, `@`, `,`, `;`, `:`, `\`, `"`, `/`, `[`, `]`, `?`, and `=`.)

Response keys generated for a given API MUST be unique for the duration of a given request, and MUST be generated in such a way so as not to collide with keys issued for any *other* API during the same request. (e.g., by including the API's name in them.)

Response keys SHOULD also be unique within the lifetime of the process that generates them, e.g. by simply including a global counter value.

(So, the simplest valid way of generating a response key is to just append a global counter to a string identifying the native API. However, there is nothing stopping a server from adding information like a request ID, channel designator, or other information, as an aid to debugging. Just make sure there's no whitespace or special characters involved, as mentioned above.)

## Intercepting or Disabling APIs

Because all server API hooks are contained in a single WSGI environment key, it is easy for WSGI middleware to disable access to them when creating subrequests, by simply deleting that key before invoking an application.

Likewise, in the event that WSGI middleware wishes to disable one *specific* API, or intercept it, it can do so by removing or replacing the appropriate hook within the hooks dictionary.

(Note: The `wsgi.native_api_hooks` dictionary is to be considered volatile in the same way as the WSGI environment is. That is, apps or middleware are allowed to modify or delete its contents freely, so a copy MUST be saved by middleware if it wishes to access the original values after it has been passed to another application or middleware.)

## Accessing Native APIs Inside Application Frameworks

Since relatively few applications are written in "pure WSGI", it's necessary to show how one would go about accessing a native API from inside an application framework that doesn't provide direct access to the WSGI `start_response`, or allow directly returning a response body.
Here is a simple, but fully-generic utility function that works around this problem, provided there is at least access to the WSGI environment:

    def use_native_api(environ, api_key, *args, **kw):
        native_api = environ.get('wsgi.native_api_hooks', {}).get(api_key)
        if native_api is None:
            raise RuntimeError("API unavailable")

        status = headers = None

        def start_response(s, h, exc_info=None):
            nonlocal status, headers
            status, headers = s, h

        # Call the hook first, so that status and headers have been set
        # by the time the result tuple is built
        body = native_api(environ, start_response, *args, **kw)
        return status, headers, body

The returned status, headers, and body can then be sent using framework-specific APIs, so that they propagate back out through the WSGI stack.

(Individual web frameworks, of course, can and *should* offer their own, similar utilities to perform this function, e.g. by adding a `use_native_api()` method on their response objects. In that way, developers can be spared the details of setting the status, headers, etc.)

# Notes on Current Design Rationale

* A dictionary is used for all native APIs, so they can be easily disabled for subrequests
* Multiple registrations are allowed, so that middleware invoking multiple subrequests is unaffected, so long as exactly one subrequest's response is returned
* A `Content-Type` header is part of the spec, because most response-altering middleware should avoid altering content types it does not understand, thereby increasing the likelihood that the response will be passed through unchanged

# Open Questions and Issues

* What if middleware adds headers but leaves the status and content-type unchanged? Should that be an error? What happens if middleware requests setting cookies?
* Do the chosen status/headers/body signatures actually make sense? Do they need to be more specified, or less specified?
* Are there any major obstacles to sending a special status from major web frameworks?
* Should a different status be used?
* We need better examples! (They should more closely resemble some actual use cases, rather than being vague abstracts)
* Are there any other ways to corrupt, confuse, or break this?
* What else am I missing, overlooking, or getting wrong?

# Acknowledgements

(TBD, but should definitely include Robert Collins for research, inspiration, and use cases)

# References

TBD

# Copyright

This document has been placed in the public domain.