> On 4 Jan 2016, at 11:27 PM, Cory Benfield <c...@lukasa.co.uk> wrote:
>
> All,
>
> **TL;DR: What do you believe WSGI 2.0 should and should not do? Should we do
> it at all?**
>
> It’s a new year, and that means it’s time for another attempt to get WSGI 2.0
> off the ground. Many of you may remember that we attempted to do this last
> year with Rob Collins leading the charge, but unfortunately personal
> commitments made it impossible for Rob to keep pushing that attempt forward.
Although you call this round 2, it isn’t really. Robert’s effort was not the
first time someone has pushed a WSGI 2.0 variant, so this is more like round 5
or 6.
In part because of those repeated attempts by people to propose something and
label it as WSGI 2.0, I am very cool on reusing the WSGI 2.0 moniker. You will
find little or no mention of ‘WSGI 2.0’ as a label in:
https://github.com/python-web-sig/wsgi-ng
That is probably somewhat due to my grumbling about the use of ‘WSGI 2.0’ back
then.
Time has moved on and so the bad feelings and memories associated with the
‘WSGI 2.0’ label due to early failed efforts have faded, but I would still
suggest avoiding the label ‘WSGI 2.0’ if at all possible.
My general feeling is that if any proposed change to the existing WSGI (PEP
3333) specification cannot technically be implemented on all existing WSGI
server/adapter implementations, then the new specification should no longer be
called WSGI.
In other words, even if many of these implementations may not be used much any
more, the specification must be able to work, without needing to mark things as
optional, on CGI, FASTCGI, SCGI, mod_wsgi, gunicorn, uWSGI, Waitress, etc.
This is purely to avoid the confusion whereby implementations cannot or choose
not to implement any new specification. The last thing any WSGI server author
wants is having to deal with a constant stream of questions and bug reports
about not supporting an updated specification where technically it was never
going to be possible. We have some obligation not to inflict this on what are,
in nearly all cases, volunteers in the Open Source world who work on these
things in their spare time and who are not doing it as part of their paid
employment.
> Since then, the need for a revision of WSGI has become even more apparent.
> Casual discussion on the web has indicated that application developers are
> uncomfortable with the limitations of WSGI. These limitations are providing
> an incentive for both application developers and server developers to take an
> end-run around WSGI in an attempt to get a framework that is more suitable
> for the modern web. A great example of the result of WSGI’s deficiencies is
> Andrew Godwin’s channels work[0] for Django, which represents a paradigm
> shift in application development that takes it far away from what WSGI is
> today.
>
> For this reason, I think we need to try again to get WSGI 2.0 off the ground.
> But I don’t believe we can do this without getting broad consensus from the
> developer community that a revision to WSGI is needed, and without
> understanding what developers need from a new revision of WSGI. This should
> take into account the prior discussions we’d had on this thread: however, I’m
> also going to actively solicit feedback from some of the more notable WSGI
> implementers, to ensure that whatever comes out of this SIG is something that
> they would actually use.
>
> This WG already had a list of requirements, which are as follows:
>
> - Support servers speaking HTTP/1.x, HTTP/2 and Websockets (potentially all
> on a single port).
Support for WebSockets should, though, be seen as a separate requirement from
support for HTTP/2.
A specific WSGI server implementation may be able to support HTTP/2, but not
support WebSockets, or it could support WebSockets via HTTP/1.x already. In
fact basic request/response functionality of HTTP/2 maps into the existing WSGI
API specification and doesn’t really require any changes be made to the WSGI
specification.
For example, mod_wsgi already supports HTTP/2 by virtue of the fact that the
mod_h2 module in Apache exists. The existing internal APIs of Apache and how
mod_wsgi uses those means that HTTP/2 bridges into the WSGI world with no code
changes to mod_wsgi.
Supporting WebSockets is a much bigger problem and is not achievable with CGI,
FASTCGI or SCGI.
It may be possible to support it within the Apache/mod_wsgi implementation, but
that would require major re-architecting of the mod_wsgi code. It also couldn’t
be done by simply exposing a socket; a new high level abstract API would need
to be developed which doesn’t expose the actual socket object. That means you
are really talking about a whole new API.
To me the WebSocket requirement and the need for a completely new API rules out
ever doing this as part of an updated WSGI specification. It should really be
treated as a completely separate thing.
The possibility of bootstrapping into a WebSocket session (or any other new
protocol or its corresponding API) via a connection upgrade process has been
discussed previously. In other words, the request actually makes it to the WSGI
application, which then decides to push back some response that causes the
underlying server to resubmit the request to the Python web application as a
whole, but via a different API.
Having the WSGI application make the decision is a somewhat clumsy mechanism,
though, and could easily be messed up once people start wrapping WSGI
middleware around applications so that the decision point is nested. It would
likely be impractical for implementations such as mod_wsgi, and maybe uWSGI as
well, where at the point of calling into the Python code you may already be
nested within layers of C level abstractions that exist between the WSGI
application and the underlying server. You are really well past the point where
the decision to use a particular protocol can sensibly be made. It is just too
hard to try and unwind any server level layers and switch protocols.
So if something were to support WebSockets, the decision should be made down
in the underlying server, and the call into any Python web application should
go through an API entry point for WebSocket that is defined separately from the
existing WSGI application one.
They are therefore two different APIs, which is why WebSocket should be dealt
with in a separate specification and not carry the WSGI label at all. A
specific WSGI server could still support the new WebSocket API, but purely
because it decides to support both in the same process, not because the
WebSocket API makes use of the WSGI specification.
The only thing you might allow, to make it easier to have both coexist in the
same code, is a convention that a WSGI application callable might provide a
new function such as ‘__endpoint__(protocol)’, which allows the underlying
server to request from the same application object an API entry point for any
new protocol such as WebSocket. This may well be better done, though, as some
new higher level abstraction encapsulating the whole concept of a web
application, which supports the idea of startup/shutdown hooks, passing of
configuration from the server, etc. Right now there is no consistency for this
between WSGI servers. If such a higher level abstraction for an entry point
were created, even getting the ‘WSGI’ API endpoint may require an initial call
to request it.
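To make the ‘__endpoint__(protocol)’ idea a little more concrete, here is a
minimal sketch. Nothing in it is specified anywhere: the protocol names
'wsgi' and 'websocket', the handler signature and the select_endpoint()
helper are all assumptions made purely for illustration.

```python
class Application:
    """A WSGI callable that can also hand out entry points for other protocols."""

    def __call__(self, environ, start_response):
        # Plain PEP 3333 entry point.
        body = b"hello over WSGI\n"
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]

    def websocket(self, connection):
        # Hypothetical handler; 'connection' stands for whatever object a
        # future WebSocket specification would define.
        raise NotImplementedError

    def __endpoint__(self, protocol):
        # Hypothetical convention: return the entry point for the named
        # protocol, or None if the application does not support it.
        endpoints = {"wsgi": self.__call__, "websocket": self.websocket}
        return endpoints.get(protocol)


def select_endpoint(application, protocol):
    """Server-side lookup: the server, not the application, picks the API."""
    lookup = getattr(application, "__endpoint__", None)
    if lookup is None:
        # Legacy application: only the WSGI callable itself is available.
        return application if protocol == "wsgi" else None
    return lookup(protocol)
```

The point of the sketch is that the server asks for an entry point before any
request is dispatched, so the choice of protocol never has to be made from
inside a running WSGI call.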
Whatever way a server learns about a web application supporting additional
protocols, be it through server configuration or a discoverable higher level
application object abstraction, the key thing is that the server should be the
one left to decide which Python API object to call into. That way the server
can set up the matching protocol stack on its side before it is too late to
undo what it may have already set up.
> - Support graceful degradation for applications that can use HTTP/2 but still
> support HTTP/1.x requests.
The issue here is really how much of the new functionality of HTTP/2 you expose
to a Python web application.
As far as mapping basic request/response handling into the existing WSGI
interface goes, there is no need for graceful degradation to be considered, at
least not at the WSGI level, as that is an issue for the underlying server.
Whether the server handles it as HTTP/1.x or HTTP/2, it still maps to the same
WSGI application API and the application wouldn’t care.
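As a small illustration of that point, the sketch below calls one ordinary
PEP 3333 application with two different SERVER_PROTOCOL values. The value
"HTTP/2.0" is an assumption about what a server might report, and call() is
a hypothetical test harness, not part of any server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    # An ordinary WSGI application: it never looks at the wire protocol.
    body = b"same response\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def call(protocol):
    """Invoke the application as a server might, for the given protocol."""
    environ = {"SERVER_PROTOCOL": protocol}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

Both call("HTTP/1.1") and call("HTTP/2.0") go through exactly the same
application code path.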
For new functionality of HTTP/2, much like WebSockets, I believe a completely
new API should be developed. It isn’t necessarily going to be realistic to try
and shoehorn it into the existing WSGI API somehow.
> - Graceful incremental adoption path - no upgrade-all-components requirement
> baked into the design.
It is hard to see what your expectations are here.
Prior attempts to force ASYNC into WSGI, and in some respects WebSockets
through forcing raw fd access, have not been practical. WSGI simply is not a
good vehicle for it. Long term it is going to be much better to have new APIs
for new WebSocket and HTTP/2 support.
The only even partly graceful path is perhaps to first ignore WebSockets and
HTTP/2 and come up with a richer higher level abstraction for the complete
Python web application entry point itself. That is the idea above of a higher
level object which defines hooks for startup/shutdown, passing configuration
and also perhaps the querying of what protocols are supported by the
application, and even optionally what specific URL endpoints those protocols
are active on. You could even have an application say where static file assets
live so the server could host them itself via methods more optimal than those
the application itself could use.
Once this is in place, existing WSGI servers could be changed to accommodate
this new higher level abstraction for the entry point. They may not support new
protocols initially, or maybe not at all, but it at least provides a framework
for the server and application to coordinate better and so allow a server to
direct certain protocol types to different endpoints in the application, or
even for the server to notify the application that certain protocols aren’t
supported and so allow an application to use alternative mechanisms.
Down the track with HTTP/2 support, with the ability of the application to say,
this is where my static assets are, you could even perhaps have a way of
flagging that certain assets should be pushed back by the server knowing that
they will be required. This way the server becomes responsible for that at the
place where lower level access to HTTP/2 primitives is available, which might
not be passed through a higher level API.
> - Support Python 2.7 and 3.x (where x is not yet discussed)
3.3 would need to be the absolute minimum. Supporting anything older in 3.x is
too much of a pain.
> - Support the existing ecosystem of containers (such as mod_wsgi) with the
> new API. We want a clean, fast and approachable API, and we want to ensure
> that its no less friendly to work with than WSGI, for all that it will expose
> much more functionality.
> - Apps need to be able to tell what protocol is in use, and what optional
> features are available. For instance, HTTP/2 PUSH PROMISE is an optional
> feature that can be disabled by clients. Websockets needs to expose a socket
> like object, and so on.
I will stress my opposition to exposing any raw socket. Some existing servers
will simply not be able to do that in a sensible way, because they already use
an internal proxying arrangement in which a messaging layer sits between
processes and the raw socket is only available in a completely different
process to the web application.
> - Support websockets
> - Support HTTP/2
> - Support HTTP/1.x (which may be just 'point at PEP-3333’.)
> - Continue to support lightweight shims being built on top such as
> https://github.com/Pylons/webob/blob/master/webob/request.py
>
> I believe that all of these requirements are up for grabs, and subject to
> change and consensus discussion. In this thread, then, I’d like to hear from
> people about these requirements and others. What do you believe WSGI 2.0
> should do? Just as importantly, what do you believe it should not do? What
> prior art should we take into account? Should we bother revising WSGI at all,
> or should we let the wider application ecosystem pursue its own solutions à
> la Django's channels? Should we simply adopt Andrew Godwin’s ASGI draft[1] on
> which channels is based and call *that* WSGI 2.0?
My current thinking is that what needs to be done is:
1. An optionally updated WSGI specification labelled as WSGI 1.1. This has got
nothing really to do with the initiative to have a way to handle WebSockets and
HTTP/2. It would simply integrate changes which were raised the last time
the WSGI specification was updated, but which were passed over because a PEP
was in the end rushed through just to deal with Python 3, ignoring other
concerns. There are only a few changes which this would cover.
The first relates to the guarantee that you are able to read past
CONTENT_LENGTH because a WSGI server will return an empty string on end of
input. This is to support chunked request encoding and compressed request
content where decompression is handled by the server. Basically, CONTENT_LENGTH
becomes advisory only. WSGI applications are allowed to ignore it, except maybe
for raising a 413 response, and read to end of input. The change of
wsgi.version to 1.1 is needed to allow frameworks to know the guarantee exists
that this will work.
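A framework could use the guarantee roughly as sketched below. The (1, 1)
value for wsgi.version is the proposal being discussed here, not something
PEP 3333 defines, and read_request_body(), RequestTooLarge and the size limit
are hypothetical names chosen for the example.

```python
class RequestTooLarge(Exception):
    """Raised so that the caller can turn it into a 413 response."""


def read_request_body(environ, max_size=1024 * 1024, block_size=8192):
    """Read the whole request body, treating CONTENT_LENGTH as advisory."""
    stream = environ["wsgi.input"]
    if environ.get("wsgi.version", (1, 0)) < (1, 1):
        # No guarantee of an empty string at end of input, so CONTENT_LENGTH
        # is the only safe limit on how much may be read.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        if length > max_size:
            raise RequestTooLarge()
        return stream.read(length)
    # Proposed WSGI 1.1 behaviour: read until the server signals end of input.
    chunks, total = [], 0
    while True:
        chunk = stream.read(block_size)
        if not chunk:  # empty string marks end of input
            break
        total += len(chunk)
        if total > max_size:
            raise RequestTooLarge()
        chunks.append(chunk)
    return b"".join(chunks)
```

With chunked or server-decompressed request content the second branch returns
the full body even when CONTENT_LENGTH is missing or smaller than the data
actually delivered.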
This ability has existed in Apache/mod_wsgi for a long time and the Flask
builtin server also supports it with Werkzeug/Flask currently relying on
looking for special non standard markers in WSGI environ to know the guarantee
exists. I have blogged about this issue before.
The second relates to the wsgi.file_wrapper object being required to be a class
type. I will not go into how as I have also blogged about this before, but this
allows middleware to wrap a response iterable to add an action on close() but
not break any optimisations for more performant sending of files.
A third change is to fix the example for wsgi.file_wrapper fallback, which
doesn’t close the file descriptor properly and so results in leakage of file
descriptors, with them only being cleaned up by the garbage collector.
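One way the fallback could be corrected is sketched below. It assumes the
example in question is the `iter(lambda: filelike.read(block_size), '')`
idiom from PEP 3333, whose result has no close() method. The names
FileWrapperFallback and send_file are hypothetical.

```python
class FileWrapperFallback:
    """Iterate over a file-like object in blocks and allow it to be closed."""

    def __init__(self, filelike, block_size=8192):
        self.filelike = filelike
        self.block_size = block_size

    def __iter__(self):
        return self

    def __next__(self):
        data = self.filelike.read(self.block_size)
        if data:
            return data
        raise StopIteration

    next = __next__  # Python 2 spelling of the iterator protocol

    def close(self):
        # A WSGI server calls close() on the response iterable when it is
        # done with it, which is what releases the file descriptor promptly.
        if hasattr(self.filelike, "close"):
            self.filelike.close()


def send_file(environ, filelike, block_size=8192):
    # Prefer the server's optimised wrapper; otherwise use the fallback.
    wrapper = environ.get("wsgi.file_wrapper", FileWrapperFallback)
    return wrapper(filelike, block_size)
```

Because the fallback object carries its own close(), the file is released
when the server finishes with the response instead of whenever the garbage
collector gets to it.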
I vaguely recollect there may have been another issue for wsgi.file_wrapper
around response Content-Length. I can’t remember if that definitely required a
change. I will need to go back to my blog posts about that one. There could also
be other things I have found as still being wrong.
As I note above, this is optional. But if we are going to close out WSGI and
not develop it further, it would be nice to fix up some of the last problems
with it.
2. Develop a higher level abstraction for what is a Python web application.
That means hooks for startup/shutdown, passing configuration from the server,
and querying back configuration from the application pertaining to supported
protocols, what sub URLs those protocols are supported on, and where any
static file assets are that the application may want the server to handle if
that would be more performant.
I believe that such a new high level abstraction will provide a better
framework to hang things off when we introduce new protocols.
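To give a feel for what such an abstraction might look like, here is a rough
sketch. Every name in it (WebApplication, startup, shutdown, protocols,
static_assets, negotiate, the URL prefixes and the directory path) is made up
for illustration; no existing server or specification defines them.

```python
class WebApplication:
    """Hypothetical higher level application object."""

    def __init__(self):
        self.config = {}
        self.running = False

    def startup(self, server_config):
        # Called once by the server, which passes its configuration down.
        self.config = dict(server_config)
        self.running = True

    def shutdown(self):
        # Called once by the server before the process exits.
        self.running = False

    def protocols(self):
        # Protocol name -> URL prefixes on which that protocol is active.
        return {"wsgi": ["/"], "websocket": ["/events"]}

    def static_assets(self):
        # URL prefix -> directory the server may choose to serve itself.
        return {"/static": "/srv/app/static"}


def negotiate(application, server_protocols):
    """Server side: split the application's protocols into usable/unsupported."""
    wanted = application.protocols()
    usable = {name: prefixes for name, prefixes in wanted.items()
              if name in server_protocols}
    unsupported = sorted(set(wanted) - set(server_protocols))
    return usable, unsupported
```

A server that only speaks WSGI could call negotiate() at startup and tell the
application which protocols it cannot provide, so the application can fall
back to alternative mechanisms.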
3. Separate WebSocket API.
Basically ignore existing WSGI specification completely. Come up with the best
API one can for WebSocket interaction at the server level. This should not just
be exposing a socket, but be a higher level abstraction involving passing of
actual WebSocket messages.
Using a higher level abstraction allows a server to implement the details
using whatever mechanisms best fit that server implementation.
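As a purely illustrative sketch of a message level API, the handler below
only ever sees complete messages and never a socket. The connection interface
(receive() returning None once the peer has closed, and send()) is an
assumption, and InMemoryConnection is a stand-in for whatever object a real
server would supply.

```python
from collections import deque


class InMemoryConnection:
    """Stand-in for a server-provided WebSocket connection object."""

    def __init__(self, incoming):
        self._incoming = deque(incoming)
        self.sent = []

    def receive(self):
        # Return the next complete message, or None once the peer has closed.
        return self._incoming.popleft() if self._incoming else None

    def send(self, message):
        # A real server would frame the message and write it to the client.
        self.sent.append(message)


def echo_handler(connection):
    """Application entry point: echo every message back to the client."""
    while True:
        message = connection.receive()
        if message is None:
            break
        connection.send(message)
```

A server that proxies requests between processes could implement receive()
and send() over its internal messaging layer, which is not possible if the
API hands the application a raw socket.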
4. Separate HTTP/2 API.
Again, ignore existing WSGI specification completely. Come up with the best API
one can for dealing with HTTP/2.
For (3) and (4), let’s treat these as our holy grail. Rather than compromise
by trying to work with WSGI, let’s first come up with what would be our ideal.
Then let’s see how that can fit within existing servers, possibly integrated
via the richer application abstraction of (2).
> Right now I want this to be very open. I’d like people to come up with a
> broad statement listing what they believe should and should not be present in
> WSGI. This first stage of the work is very general: I just want to get a
> feeling for what the community believes is important. Once we’re done with
> that, if the consensus is that this work is worth pursuing, I’ll come up with
> an initial draft that we can start making concrete changes to.
>
> In the short term, I’m going to keep this consultation open for **at least
> two weeks**: that is, I will not start working on an initial draft PEP until
> at least the **18th of January**. If you believe there are application or
> server developers that should be involved in this discussion, please reach
> out to them and point them to this list. I personally have CC’d some people
> that I believe need to be involved in the discussion, but please reach out to
> others as well.
It isn’t clear what you expect this PEP to include, but trying to push for a
PEP so quickly is unrealistic. There is likely going to need to be a fair bit
of discussion, and given that people have real jobs or other obligations,
history has shown that rushing to a PEP just disenfranchises people; they will
not contribute because they cannot do so in too short a time frame.
> I’d really love to come to the end of 2016 with a solid direction for the
> future of web programming in Python. I’m looking forward to working with you
> all on achieving that.
Graham
_______________________________________________
Web-SIG mailing list
Web-SIG@python.org
Web SIG: http://www.python.org/sigs/web-sig