[Note: I have not read all the channels docs, sorry if some of these points
are covered there.]

On a packaging note, is there a way to use django[channels] type syntax
like flask does? I'm not familiar with the restrictions of this but it may
remove the need for try/except imports.
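
To make the packaging question concrete, here is a rough sketch of what I have in mind. It assumes setuptools' extras_require would be the mechanism; the extra name "channels" and the helper channels_available() are made up for illustration and are not existing Django code.

```python
import importlib.util

# Hypothetical fragment of a setup.py: setuptools' `extras_require` maps an
# extra name to additional requirements, so `pip install django[channels]`
# would also pull in the channels package. It would be passed along as
# setup(..., extras_require=EXTRAS_REQUIRE).
EXTRAS_REQUIRE = {
    "channels": ["channels"],
}


def channels_available():
    """Return True if the optional `channels` package can be imported."""
    # find_spec returns None when a top-level module is not installed, so one
    # check can stand in for scattered try/except ImportError blocks.
    return importlib.util.find_spec("channels") is not None
```

The extra only makes installation easier; code that must run with or without channels would still need a check like the one above somewhere.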

I'm also curious about what a "small scale" deployment of channels would
look like. It has features which are useful to small sites which deploy on
a single server or Heroku dyno. It would be great to be able to run a
complete channels stack, with in-memory communication, from a single process
start - something as easy as gunicorn wsgi.py. I know you are intending
this to be possible for users not using channels, but when it has features
so useful for modern web apps, like tasks and websockets, getting a no-setup
deploy of this working would be a huge win over Celery/other systems.
Obviously you would lose the no-downtime deploys here.

I've said this in person to you, but I think a REDIS_SERVERS setting like
DATABASES would be a hugely useful feature for Django independently of
channels, especially if it supported tests well. I have yet to find a
third-party app which does this well.
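
As a rough sketch of the shape I mean, modelled loosely on how DATABASES is keyed by alias: REDIS_SERVERS, its keys, the TEST sub-dict and the redis_config() helper are all hypothetical and do not exist in Django today.

```python
# Hypothetical settings block, keyed by alias in the same way as DATABASES.
REDIS_SERVERS = {
    "default": {
        "HOST": "localhost",
        "PORT": 6379,
        "DB": 0,
        # Overrides applied only while running tests, so that test data
        # never lands in the development or production keyspace.
        "TEST": {"DB": 15},
    },
}


def redis_config(alias="default", testing=False):
    """Return connection options for `alias`, applying TEST overrides if asked."""
    config = dict(REDIS_SERVERS[alias])  # copy so the settings stay untouched
    test_overrides = config.pop("TEST", {})
    if testing:
        config.update(test_overrides)
    return config
```

A test runner could then call redis_config(testing=True) and get a separate Redis database number without any per-app configuration.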

Testing in general is an interesting question - how do you envisage a test
environment would run? Would self.client go straight to the consumer? How
would you test the channel setup? Will there be test utilities to test,
say, that a given message has been sent to a given channel without
consuming it?
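
To illustrate the kind of test utility I'm asking about, here is a toy sketch. InMemoryChannelLayer, peek() and assert_message_sent() are names I invented for this email; they are not part of the channels API as far as I know.

```python
from collections import defaultdict, deque


class InMemoryChannelLayer:
    """Toy in-process channel layer for tests (hypothetical, not the channels API)."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def send(self, channel, message):
        """Append a message to the end of the named channel."""
        self._queues[channel].append(message)

    def receive(self, channel):
        """Pop and return the oldest message, or None if the channel is empty."""
        queue = self._queues[channel]
        return queue.popleft() if queue else None

    def peek(self, channel):
        """Return a copy of the pending messages without consuming them."""
        return list(self._queues[channel])


def assert_message_sent(layer, channel, message):
    """Test helper: fail unless `message` is still waiting on `channel`."""
    assert message in layer.peek(channel), f"{message!r} not found on {channel!r}"
```

A test could call a view, then use assert_message_sent(layer, "thumbnails", {...}) and still let a consumer receive the same message afterwards.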

You talk about being able to load balance between backend tasks and
requests. Is there an easy way to not do this in a multi-server setup,
where, say, image processing is on a box with far more RAM than is needed for
a request-processing box?
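
What I'm picturing is each worker being told which channels it may pull from. A minimal sketch, assuming work is queued per channel name; next_message() and the channel names below are illustrative only, not how channels actually routes work.

```python
from collections import deque


def next_message(queues, allowed_channels):
    """Return (channel, message) from the first allowed channel with work, else None."""
    for channel in allowed_channels:
        queue = queues.get(channel)
        if queue:
            return channel, queue.popleft()
    return None


# Pending work keyed by channel name (names are illustrative).
queues = {
    "http.request": deque([{"path": "/"}]),
    "image.process": deque([{"image_id": 7}]),
}

# The high-RAM box only listens for image jobs; web boxes only take requests.
IMAGE_WORKER_CHANNELS = ["image.process"]
WEB_WORKER_CHANNELS = ["http.request"]
```

With this split, a worker started with IMAGE_WORKER_CHANNELS never picks up an HTTP request, and the reverse holds for the web boxes.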

Have you considered a means of asking the channels system how much load it
is under so that systems could do intelligent autoscaling?
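
For example, if the channel backend could report how many messages are pending per channel, an autoscaler could do something like the sketch below. The function, its parameters and the idea that queue depth is the load signal are all my assumptions, not anything channels provides.

```python
import math


def desired_workers(queue_depths, per_worker_capacity=10, min_workers=1, max_workers=20):
    """Suggest a worker count from pending message counts per channel.

    `queue_depths` maps channel name to number of waiting messages.
    """
    total_pending = sum(queue_depths.values())
    # Round up so a partially filled "slot" still gets a worker.
    wanted = math.ceil(total_pending / per_worker_capacity)
    # Clamp to the configured bounds.
    return max(min_workers, min(max_workers, wanted))
```

Calling desired_workers({"http.request": 35, "image.process": 5}) would suggest 4 workers with the default capacity of 10 messages per worker.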

>> > You seem to be assuming I'm here to foist a brand new middle layer on
>> everyone; I'm not. I'm here to make one that fits neatly into Django, that
>> I think most people will want to turn on, and that provides a lot of value
>> in exchange for a slight round-trip performance hit - my goal is sub-5ms,
>> and preferably sub-3. If it starts being 10/20/30 milliseconds of cost,
>> then we'll have to change our approach until it's acceptable.
>> Yes that's how I read this plan and that's why I think it needs some
>> clarity. I didn't mean for this to turn into a long discussion about
>> performance. This was meant to be a discussion about the transition plan.
>> To go back to my original message, I see no gain for existing WSGI
>> applications to have this on by default, even using the in-memory
>> channel, when they upgrade to 1.10 (or whenever this lands). The current
>> plan reads as though it will.
> I agree - that was my original intention and how the current version of
> channels works, but it's no longer my plan, and I should update the
> integration plan to be more specific and discuss things like introducing
> different HttpRequest subclasses other than WSGIRequest.
> To be clear, 1.10 might have a different request handling stack than 1.9
> that isn't so WSGI-native, but there's still going to be a way to get a
> request in and a response out directly without hitting Channels. It's
> almost enforced by the channels design, actually, as the entire Django URL
> and view system would have to run as a single consumer there anyway.
>> From what you are saying above this sounds more like a
>> django.contrib.channels than a django.core.channels. Either way I feel the
>> plan should provide more clarity in that regard.
> Also true - it's not immediately apparent why it shouldn't be contrib or
> even entirely separate, but there are good reasons, chief among them being
> that channels needs some low-level changes to the HTTP subclasses and
> session framework; it's somewhat similar to migrations in this regard. The
> 1.8/1.9 app versions are going to be full of monkeypatches and not perform
> as well, as I won't be able to eliminate all the duplicate code being run.
> I also have a philosophical belief that we should highlight the channels
> model as the new "base layer" of Django - teaching people how everything
> else builds on it to do view handling, socket handling, background tasks,
> etc. - but that isn't at odds with direct-WSGI handling; an in-memory
> backend and direct handling are basically indistinguishable from the
> outside.
> I'll work on another draft that more clearly highlights WSGI and direct
> request handling in the "keeping Django the same" part - hopefully that
> will help make things clearer.
>> Andrew, I thank you for your patience and civility in these discussions.
>> I know this is something you've been working hard on and I'm not trying
>> to be needlessly critical of your work.
> I honestly appreciate your feedback here too - it's very hard to see these
> things from the outside, so it's important to have these discussions.
> There's a reason I wanted to get a draft up and get comments!
> Andrew
>>>> Anssi's criticisms are fair and I feel that some of these responses are
>>>> glossing over the details.
>>> I'm sorry if it comes across like that - a lot of these are things I've
>>> been considering for a while and so I can forget to provide context.
>>>> You've claimed this is the same or equivalent to a forked worker model
>>>> but it isn't because there is no process management/link between the
>>>> interface and worker and because you've chosen to make this network
>>>> transparent. As much as you'd like to claim this isn't like Celery,
>>>> the same issues that exist when trying to (ab)use Celery for blocking RPC
>>>> calls are what you will have here.
>>> I never claimed this isn't like Celery; it is quite a bit like it, but
>>> with specific changes (no guaranteed delivery, no single-response
>>> mechanism) that make it better at throughput.
>>>> > Then the response is dropped, the same way a WSGI worker drops a
>>>> connection if it dies.
>>>> It isn't entirely the same because in this channel case the worker
>>>> doesn't know the client/interface dropped the connection. It's still
>>>> working hard to generate a response which will sit in the response channel
>>>> (Redis, memory, etc) until it expires (assuming that all backends expire
>>>> channel messages). That doesn't happen in the current WSGI interface.
>>> True, though from what I understand of the WSGI spec you also don't know
>>> the client has disconnected until you try to write out content to it. Most
>>> views would still run the entire thing and make a response, only to drop it.
>>> There is also a http.disconnect message type planned to be implemented
>>> for when a client disconnects before the response is entirely read, but
>>> that's more for long-poll usage where you want to keep track of who has an
>>> open connection.
>>>> > All Django requests already involve multiple round trips to database
>>>> servers, and the Redis backend at least is much quicker on the processing
>>>> side.
>>>> This isn't true. Database round trips are not a requirement for
>>>> Django's current architecture. I'll concede that many views touch the DB
>>>> but that's a choice of the developer, not Django's. When using channels the
>>>> views will still need to make the same DB calls, though that processing
>>>> happens at the worker. Putting Redis in between doesn't make it faster.
>>> Right, and channels will not be a requirement for Django's future
>>> architecture. They'll just be a thing that I expect most people to turn on
>>> as they provide a feature set a lot of types of sites need. And I never
>>> said it would be faster - you'll see me repeatedly say channels provides no
>>> performance gain - at best, it might help smooth response times with the
>>> way the workers load balance jobs based on when they're free.
>>>> > You could make the same argument about not having separate load
>>>> balancers and nginx serving static files - after all, you're just adding
>>>> another network roundtrip to send traffic onward from the other server to
>>>> Django.
>>>> There are notable differences here. These are persistent HTTP
>>>> connections and don't suffer from the same problem of client/server drops
>>>> previously noted. They are also cacheable in a known way allowing
>>>> round-trips or bandwidth to be avoided. Many Django applications will
>>>> continue to use load balancers with channels. It can't be denied that
>>>> channels introduce two more network round trips that didn't exist before.
>>>> Trying to paint this as "we already talk over the network, so what's a
>>>> couple more" is not a compelling argument to me.
>>> Sorry, I think my tone came across wrong there. I'm more just saying
>>> that it's equivalent to adding an extra layer of that kind of
>>> infrastructure, in that it also uses persistent connections and should only
>>> be a few milliseconds of delay, and for most Django sites a few
>>> milliseconds is perhaps a percentage point of their response time. Again,
>>> if someone wants higher performance, they don't have to use channels and
>>> can just connect directly.
>>> You seem to be assuming I'm here to foist a brand new middle layer on
>>> everyone; I'm not. I'm here to make one that fits neatly into Django, that
>>> I think most people will want to turn on, and that provides a lot of value
>>> in exchange for a slight round-trip performance hit - my goal is sub-5ms,
>>> and preferably sub-3. If it starts being 10/20/30 milliseconds of cost,
>>> then we'll have to change our approach until it's acceptable.
>>> If you don't want the new features and the resulting change in stack,
>>> Django as it is now will be there for you, but if you're that sensitive to
>>> performance then Django maybe isn't for you already unless you're heavily
>>> modifying it.
>>> Andrew
>>>>>> I have a gut feeling this isn't going to work that well. The reasons
>>>>>> include:
>>>>>>   - Backwards compatibility: how is a large site going to upgrade
>>>>>> from 1.9 to 1.10?
>>>>> None of the core view API will change. A 1.9 codebase will boot and
>>>>> work on 1.10/channels with no code changes.
>>>>>>   - Complexity of setup.
>>>>> Could you elaborate? If you don't want it, you don't need to configure
>>>>> anything, and if you do, for most people it's just getting a Redis server
>>>>> running and pointing a setting at it.
>>>>>>   - Error conditions: for example, what happens when an interface
>>>>>> server sends a request to worker, and then dies (that is, the response
>>>>>> channel has zero listeners). Similarly for chunked messages.
>>>>> Then the response is dropped, the same way a WSGI worker drops a
>>>>> connection if it dies. Chunked responses will time out after a certain
>>>>> period and be dropped too, in the same way that a deadlocked normal server
>>>>> would.
>>>>>>   - Does the architecture really scale enough? The channel backend is
>>>>>> going to be a bottleneck, it needs the ability to handle a huge amount of
>>>>>> data and a huge amount of individual messages. In particular, the request
>>>>>> channel is going to be contested. We know classic http scales, but is the
>>>>>> same true for interface server architecture?
>>>>> I believe it will - have you read through the sharding and scaling
>>>>> plan in the docs? Channels is carefully designed to have no state in
>>>>> anything but interface servers, all workers handling all message types and
>>>>> queuing of messages exactly so you can scale horizontally; you can divide a
>>>>> very large site into several clusters of interfaces and workers fronted by
>>>>> load balancers, and each cluster would have multiple Redis (e.g.) backends
>>>>> with requests equally sharded across them using consistent hashing.
>>>>>>   - Performance. Each request and response needs two additional
>>>>>> network roundtrips. One to save to the channel server, one to fetch from
>>>>>> the channel server. If the messages are large, this adds a lot of latency.
>>>>> All Django requests already involve multiple round trips to database
>>>>> servers, and the Redis backend at least is much quicker on the processing
>>>>> side. You could make the same argument about not having separate load
>>>>> balancers and nginx serving static files - after all, you're just adding
>>>>> another network roundtrip to send traffic onward from the other server to
>>>>> Django.
>>>>>>   - Untested architecture: does any big site use this kind of
>>>>>> architecture for all http handling?
>>>>> I agree with you here - see below for my justification. I've seen it
>>>>> used for other things (data update networks, service calls), but not
>>>>> direct HTTP.
>>>>>> A realistic test for this is to push a scalable amount of scalable
>>>>>> sized requests through the stack. The stack should recover even if you shut
>>>>>> down parts of the network, any single interface server, channel backend
>>>>>> server or worker server. Of course, when a server or a part of the network
>>>>>> recovers, the stack would need to recover from that. Compare the
>>>>>> performance, simplicity of setup and ability to recover from error
>>>>>> conditions to a setup with only classic Django http servers.
>>>>>> I'm sorry if this feels negative. But, you are planning to change the
>>>>>> very core of what a Django server is, and I feel we need to know with
>>>>>> certainty that the new architecture really works. And not only that, it
>>>>>> needs to be at least as good as classic http handling for existing users.
>>>>> That's why a decent part of my proposal to Mozilla for funding was to
>>>>> help us fund hardware and time for extensive performance and scale testing.
>>>>> I've seen this architecture work at scale before for non-HTTP traffic, and
>>>>> I believe that it will work as well for HTTP and WebSockets.
>>>>> Don't get me wrong - I don't believe this is a magical panacea to
>>>>> solve all problems, and we're going to have to do plenty of testing and
>>>>> development work to get the solution to the level of existing HTTP
>>>>> handling, but remember, it also brings positive results:
>>>>>  - Downtime-less code deploys (if you stop workers, requests will just
>>>>> wait for new ones to appear until they hit timeout)
>>>>>  - Ability to add and remove processing capacity live without
>>>>> loadbalancer reconfiguration
>>>>>  - Lets you run different parts of the site on different Python
>>>>> runtimes, if you want (e.g. one part on PyPy, one part on CPython 2, one
>>>>> part on CPython 3)
>>>>>  - Background task processing
>>>>>  - And, of course, WebSockets/HTTP2/long-poll HTTP/other
>>>>> non-request-response protocol support
>>>>> It's never going to be a solution that works for everyone, but the
>>>>> whole nice part about channels is that, like all the best parts of Django,
>>>>> you can just ignore it and not use it if you don't want it; we're not going
>>>>> to get rid of WSGI support, and the default shipping version in 1.10 isn't
>>>>> going to make you find a Redis server before you can even boot it up; it'll
>>>>> just work like it does now, with extra flexibility there if you want to go
>>>>> turn it on and read through the next part of the tutorial/docs.
>>>>> I don't expect to get this past the community, core and technical
>>>>> board and approved into a release until it's proven itself to run and work
>>>>> at scale, and I already have several offers of testbeds to help prove this
>>>>> out with realistic web loads, which we can combine with synthetic load
>>>>> tests.
>>>>> Put it this way - I do not see any other way to handle WebSockets that
>>>>> is as feasible as this. Most solutions either require us to run the whole
>>>>> of Django in an async Python environment, which comes with its own set of
>>>>> issues, or they're more stateful proxy servers or run-alongside-servers
>>>>> that don't seem to have a story for scaling them to hundreds of thousands
>>>>> of connections without blowing up every packet received into HTTP requests.
>>>>> I think Django absolutely has to adapt to the modern web environment
>>>>> and move away from just rendering templates when browsers request them, and
>>>>> this to me is part of that. If there are other solutions to the same
>>>>> problems I think we should consider them as well; I've just not run across
>>>>> any that work as well in the two years I've been planning this out before I
>>>>> brought it out to be talked about.
>>>>> Andrew
>>>>>>> Yes, that is the idea. While it obviously adds overhead (a
>>>>>>> millisecond or two in my first tests), it also adds natural load balancing
>>>>>>> between workers and then lets us have the same architecture for websockets
>>>>>>> and normal HTTP.
>>>>>>> (The interface server does do all the HTTP parsing, so what gets
>>>>>>> sent over is slightly less verbose than normal HTTP and needs less work to
>>>>>>> use, but it's not a big saving)
>>>>>>> Andrew
>>>>>>>> Is the idea that a large site using classic request-response
>>>>>>>> architecture would get the requests at interface servers, these would then
>>>>>>>> push the HTTP requests through channels to worker processes, which process
>>>>>>>> the message and push the response through the channel backend back to the
>>>>>>>> interface server and from there back to the client?
>>>>>>>>  - Anssi
>>>>>>>>> To address the points so far:
>>>>>>>>>  - I'm not yet sure whether "traditional" WSGI mode would actually
>>>>>>>>> run through the in-memory backend or just be plugged in directly to the
>>>>>>>>> existing code path; it really depends on how much code would need to be
>>>>>>>>> moved around in either case. I'm pretty keen on keeping a raw-WSGI path
>>>>>>>>> around for performance/compatibility reasons, and so we can hard fail if
>>>>>>>>> you try *any* channels use (right now the failure mode for trying to use
>>>>>>>>> channels with the wsgi emulation is silent failure)
>>>>>>>>> - Streaming HTTP responses are already in the channels spec as
>>>>>>>>> chunked messages; you just keep sending response-style messages with a flag
>>>>>>>>> saying "there's more".
>>>>>>>>> - File uploads are more difficult, due to the nature of the worker
>>>>>>>>> model (you can't guarantee all the messages will go to the same worker). My
>>>>>>>>> current plan here is to revise the message spec to allow infinite size
>>>>>>>>> messages and make the channel backend handle chunking in the best way
>>>>>>>>> (write to shared disk, use lots of keys, etc), but if there are other
>>>>>>>>> suggestions I'm open. This would also let people return large http
>>>>>>>>> responses without having to worry about size limits.
>>>>>>>>> - Alternative serialisation formats will be looked into; it's up
>>>>>>>>> to the channel backend what to use, I just chose JSON as our previous
>>>>>>>>> research into this at work showed that it was actually the fastest overall
>>>>>>>>> due to the fact it has a pure C implementation, but that's a year or two
>>>>>>>>> old. Whatever is chosen needs large support and forwards compatibility,
>>>>>>>>> however. The message format is deliberately specified as JSON-capable
>>>>>>>>> structures (dicts, lists, strings) as it's assumed any serialisation format
>>>>>>>>> can handle this, and so it can be portable across backends.
>>>>>>>>> - I thought SCRIPT_NAME was basically unused by anyone these days,
>>>>>>>>> but hey, happy to be proved wrong. Do we have any usage numbers on it to
>>>>>>>>> know if we'd need it for a new standalone server to implement? It's really
>>>>>>>>> not hard to add it into the request format, just thought it was one of
>>>>>>>>> those CGI remnants we might finally be able to kill.
>>>>>>>>> Andrew
>>>>>>>>>>> Hi Andrew,
>>>>>>>>>>> - I share Mark's concern about the performance (latency, specifically)
>>>>>>>>>>> implications for projects that want to keep deploying old-style, given
>>>>>>>>>>> all the new serialization that would now be in the request path. I think
>>>>>>>>>>> some further discussion of this, with real benchmark numbers to refer
>>>>>>>>>>> to, is a prerequisite to considering Channels as a candidate for Django
>>>>>>>>>>> 1.10. To take a parallel from Python, Guido has always said that he
>>>>>>>>>>> won't consider removing the GIL unless it can be done without penalizing
>>>>>>>>>>> single-threaded code. If you think a different approach makes sense here
>>>>>>>>>>> (that is, that it's OK to penalize the simple cases in order to
>>>>>>>>>>> facilitate the less-simple ones), can you explain your reasons for that
>>>>>>>>>>> position?
>>>>>>>>>> We would also need some form of streamed messages for streamed
>>>>>>>>>> http responses.
>>>>>>>>>> Is it possible to handle old-style http the way it has always
>>>>>>>>>> been handled?
>>>>>>>>>>  - Anssi
