On Fri, Dec 18, 2015 at 7:34 AM, Anssi Kääriäinen <[email protected]> wrote:
> I have a gut feeling this isn't going to work that well. The reasons
> include:
> - Backwards compatibility: how is a large site going to upgrade from 1.9
> to 1.10?

None of the core view API will change. A 1.9 codebase will boot and work on
1.10/channels with no code changes.

> - Complexity of setup.

Could you elaborate? If you don't want it, you don't need to configure
anything, and if you do, for most people it's just getting a Redis server
running and pointing a setting at it.

> - Error conditions: for example, what happens when an interface server
> sends a request to a worker and then dies (that is, the response channel
> has zero listeners)? Similarly for chunked messages.

Then the response is dropped, the same way a WSGI worker drops a connection
if it dies. Chunked responses will time out after a certain period and be
dropped too, in the same way that a deadlocked normal server would.

> - Does the architecture really scale enough? The channel backend is going
> to be a bottleneck; it needs the ability to handle a huge amount of data
> and a huge number of individual messages. In particular, the request
> channel is going to be contested. We know classic HTTP scales, but is the
> same true for the interface server architecture?

I believe it will - have you read through the sharding and scaling plan in
the docs? Channels is carefully designed to have no state in anything but
interface servers, with all workers handling all message types and queueing
of messages, exactly so you can scale horizontally; you can divide a very
large site into several clusters of interfaces and workers fronted by load
balancers, and each cluster would have multiple Redis (e.g.) backends with
requests equally sharded across them using consistent hashing.

> - Performance. Each request and response needs two additional network
> roundtrips: one to save to the channel server, one to fetch from the
> channel server. If the messages are large, this adds a lot of latency.
All Django requests already involve multiple round trips to database
servers, and the Redis backend at least is much quicker on the processing
side. You could make the same argument against having separate load
balancers and nginx serving static files - after all, you're just adding
another network roundtrip to send traffic onward from the other server to
Django.

> - Untested architecture: does any big site use this kind of architecture
> for all HTTP handling?

I agree with you here - see below for my justification. I've seen it used
for other things (data update networks, service calls), but not direct
HTTP.

> A realistic test for this is to push a scalable number of scalably sized
> requests through the stack. The stack should recover even if you shut
> down parts of the network, any single interface server, channel backend
> server or worker server. Of course, when a server or a part of the
> network recovers, the stack would need to recover from that. Compare the
> performance, simplicity of setup and ability to recover from error
> conditions to a setup with only classic Django HTTP servers.
>
> I'm sorry if this feels negative. But you are planning to change the very
> core of what a Django server is, and I feel we need to know with
> certainty that the new architecture really works. And not only that, it
> needs to be at least as good as classic HTTP handling for existing users.

That's why a decent part of my proposal to Mozilla for funding was to help
us fund hardware and time for extensive performance and scale testing. I've
seen this architecture work at scale before for non-HTTP traffic, and I
believe that it will work as well for HTTP and WebSockets.
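To make the consistent-hashing sharding mentioned above concrete, here is a
minimal sketch of how channel names could be mapped onto a fixed set of
backend shards. This is purely illustrative - the class name, shard URLs and
replica count are assumptions, not the actual channel backend implementation.

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Hypothetical sketch: map channel names onto backend shards."""

    def __init__(self, shards, replicas=64):
        # Place several virtual points per shard on a hash ring, so
        # channels spread evenly and adding/removing a shard only
        # remaps a small fraction of channel names.
        self.ring = []
        for shard in shards:
            for i in range(replicas):
                point = self._hash("%s:%d" % (shard, i))
                self.ring.append((point, shard))
        self.ring.sort()
        self.points = [p for p, _ in self.ring]

    @staticmethod
    def _hash(key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def shard_for(self, channel_name):
        # Walk clockwise around the ring to the first shard point at or
        # after the channel's hash, wrapping around at the end.
        idx = bisect.bisect(self.points, self._hash(channel_name))
        return self.ring[idx % len(self.ring)][1]

ring = ConsistentHashRing(["redis://shard-a", "redis://shard-b", "redis://shard-c"])
backend = ring.shard_for("http.request")  # every process computes the same shard
```

The point is that the mapping is deterministic with no coordination: any
interface server or worker given the same shard list picks the same Redis
backend for a given channel, so no central router is needed.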
Don't get me wrong - I don't believe this is a magical panacea that solves
all problems, and we're going to have to do plenty of testing and
development work to get the solution to the level of existing HTTP
handling, but remember, it also brings positive results:

- Downtime-less code deploys (if you stop workers, requests will just wait
  for new ones to appear until they hit the timeout)
- The ability to add and remove processing capacity live, without load
  balancer reconfiguration
- Running different parts of the site on different Python runtimes, if you
  want (e.g. one part on PyPy, one part on CPython 2, one part on CPython 3)
- Background task processing
- And, of course, WebSockets/HTTP2/long-poll HTTP/other
  non-request-response protocol support

It's never going to be a solution that works for everyone, but the nice
part about channels is that, like all the best parts of Django, you can
just ignore it and not use it if you don't want it; we're not going to get
rid of WSGI support, and the default shipping version in 1.10 isn't going
to make you find a Redis server before you can even boot it up - it'll just
work like it does now, with the extra flexibility there if you want to turn
it on and read through the next part of the tutorial/docs.

I don't expect to get this past the community, core team and technical
board and approved into a release until it's proven itself to run and work
at scale, and I already have several offers of testbeds to help prove this
out with realistic web loads, which we can combine with synthetic load
tests.

Put it this way - I do not see any other way to handle WebSockets that is
as feasible as this. Most solutions either require us to run the whole of
Django in an async Python environment, which comes with its own set of
issues, or they're more stateful proxy servers or run-alongside servers
that don't seem to have a story for scaling to hundreds of thousands of
connections without blowing up every packet received into HTTP requests.
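The request/response flow the architecture above describes - interface
server enqueues the parsed request, any worker pops it, runs a normal
synchronous view, and replies on a per-request channel - can be sketched
with an in-memory stand-in for the channel backend. Only the "http.request"
channel name follows the thread; every other name here is illustrative.

```python
from collections import defaultdict, deque
from itertools import count

# Toy in-memory channel backend: a named FIFO queue per channel.
channels = defaultdict(deque)
_counter = count()

def send(channel, message):
    channels[channel].append(message)

def receive(channel):
    return channels[channel].popleft() if channels[channel] else None

def interface_handle(raw_path):
    # 1. The interface server parses HTTP, enqueues the request, and
    #    names a unique reply channel it will listen on.
    reply_channel = "http.response!%d" % next(_counter)
    send("http.request", {"path": raw_path, "reply_channel": reply_channel})
    return reply_channel

def worker_step(view):
    # 2. Any worker pops the next request, runs the (ordinary,
    #    synchronous) view, and replies on the named channel.
    message = receive("http.request")
    if message:
        send(message["reply_channel"], view(message))

reply_channel = interface_handle("/articles/1/")
worker_step(lambda msg: {"status": 200, "content": "Hello from %s" % msg["path"]})
response = receive(reply_channel)  # 3. interface server writes this to the socket
```

Because workers only ever pull from the shared request channel and push to a
named reply channel, none of them hold per-connection state - which is what
makes the horizontal scaling and downtime-less deploys above possible.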
I think Django absolutely has to adapt to the modern web environment and
move away from just rendering templates when browsers request them, and
this to me is part of that. If there are other solutions to the same
problems, I think we should consider them as well; I've just not run across
any that work as well in the two years I've been planning this out before I
brought it out to be talked about.

Andrew

> On Thursday, December 17, 2015, Andrew Godwin <[email protected]> wrote:
>
>> Yes, that is the idea. While it obviously adds overhead (a millisecond
>> or two in my first tests), it also adds natural load balancing between
>> workers and then lets us have the same architecture for WebSockets and
>> normal HTTP.
>>
>> (The interface server does do all the HTTP parsing, so what gets sent
>> over is slightly less verbose than normal HTTP and needs less work to
>> use, but it's not a big saving.)
>>
>> Andrew
>>
>> On Thu, Dec 17, 2015 at 9:01 PM, Anssi Kääriäinen <[email protected]>
>> wrote:
>>
>>> Is the idea that a large site using classic request-response
>>> architecture would get the requests at interface servers, which would
>>> then push the HTTP requests through channels to worker processes,
>>> which process the message and push the response through the channel
>>> backend back to the interface server, and from there back to the
>>> client?
>>>
>>> - Anssi
>>>
>>> On Thursday, December 17, 2015, Andrew Godwin <[email protected]>
>>> wrote:
>>>
>>>> To address the points so far:
>>>>
>>>> - I'm not yet sure whether "traditional" WSGI mode would actually run
>>>> through the in-memory backend or just be plugged in directly to the
>>>> existing code path; it really depends on how much code would need to
>>>> be moved around in either case.
>>>> I'm pretty keen on keeping a raw-WSGI path around for
>>>> performance/compatibility reasons, and so we can hard-fail if you try
>>>> *any* channels use (right now the failure mode for trying to use
>>>> channels with the WSGI emulation is silent failure).
>>>>
>>>> - Streaming HTTP responses are already in the channels spec as
>>>> chunked messages; you just keep sending response-style messages with
>>>> a flag saying "there's more".
>>>>
>>>> - File uploads are more difficult, due to the nature of the worker
>>>> model (you can't guarantee all the messages will go to the same
>>>> worker). My current plan here is to revise the message spec to allow
>>>> infinite-size messages and make the channel backend handle chunking
>>>> in the best way (write to shared disk, use lots of keys, etc.), but
>>>> if there are other suggestions I'm open. This would also let people
>>>> return large HTTP responses without having to worry about size
>>>> limits.
>>>>
>>>> - Alternative serialisation formats will be looked into; it's up to
>>>> the channel backend what to use. I just chose JSON as our previous
>>>> research into this at work showed that it was actually the fastest
>>>> overall, due to the fact it has a pure-C implementation, but that's a
>>>> year or two old. Whatever is chosen needs broad support and forwards
>>>> compatibility, however. The message format is deliberately specified
>>>> as JSON-capable structures (dicts, lists, strings) as it's assumed
>>>> any serialisation format can handle this, and so it can be portable
>>>> across backends.
>>>>
>>>> - I thought SCRIPT_NAME was basically unused by anyone these days,
>>>> but hey, happy to be proved wrong. Do we have any usage numbers on it
>>>> to know whether a new standalone server would need to implement it?
>>>> It's really not hard to add it into the request format; I just
>>>> thought it was one of those CGI remnants we might finally be able to
>>>> kill.
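The chunked-message scheme described in the quoted points above - streaming
responses as a series of JSON-capable messages with a "there's more" flag -
could be sketched like this. The key names ("content", "more_content",
"status") are illustrative guesses, not the exact spec wording.

```python
import json

def encode_streaming_response(status, body_chunks):
    # Each chunk becomes one message; every message is plain dict/str
    # data, so any serialiser (JSON here) can carry it.
    chunks = list(body_chunks)
    for i, chunk in enumerate(chunks):
        message = {"content": chunk, "more_content": i < len(chunks) - 1}
        if i == 0:
            message["status"] = status  # status only on the first message
        yield json.dumps(message)

def decode_streaming_response(messages):
    # The receiving side keeps reading until the flag says "no more".
    body = []
    for raw in messages:
        message = json.loads(raw)
        body.append(message["content"])
        if not message["more_content"]:
            break
    return "".join(body)

wire = list(encode_streaming_response(200, ["Hello, ", "chunked ", "world"]))
assert decode_streaming_response(wire) == "Hello, chunked world"
```

This also shows why a dropped chunked response needs a timeout, as discussed
earlier in the thread: a receiver that never sees a message with the flag
unset would otherwise wait forever.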
>>>>
>>>> Andrew
>>>>
>>>> On Thu, Dec 17, 2015 at 6:32 PM, Anssi Kääriäinen <[email protected]>
>>>> wrote:
>>>>
>>>>> On Thursday, December 17, 2015, Carl Meyer <[email protected]> wrote:
>>>>>
>>>>>> Hi Andrew,
>>>>>>
>>>>>> - I share Mark's concern about the performance (latency,
>>>>>> specifically) implications for projects that want to keep deploying
>>>>>> old-style, given all the new serialization that would now be in the
>>>>>> request path. I think some further discussion of this, with real
>>>>>> benchmark numbers to refer to, is a prerequisite to considering
>>>>>> Channels as a candidate for Django 1.10. To take a parallel from
>>>>>> Python, Guido has always said that he won't consider removing the
>>>>>> GIL unless it can be done without penalizing single-threaded code.
>>>>>> If you think a different approach makes sense here (that is, that
>>>>>> it's OK to penalize the simple cases in order to facilitate the
>>>>>> less-simple ones), can you explain your reasons for that position?
>>>>>
>>>>> We would also need some form of streamed messages for streamed HTTP
>>>>> responses.
>>>>>
>>>>> Is it possible to handle old-style HTTP the way it has always been
>>>>> handled?
>>>>>
>>>>> - Anssi
>>>>>
>>>>> --
>>>>> You received this message because you are subscribed to the Google
>>>>> Groups "Django developers (Contributions to Django itself)" group.
>>>>> To unsubscribe from this group and stop receiving emails from it,
>>>>> send an email to [email protected].
>>>>> To post to this group, send email to
>>>>> [email protected].
>>>>> Visit this group at https://groups.google.com/group/django-developers.
>>>>> To view this discussion on the web visit
>>>>> https://groups.google.com/d/msgid/django-developers/CALMtK1Gz%3DaYMLyFW2da2C6Wo_-c_V2T_4p6K9eh0vwrKB91dKw%40mail.gmail.com.
>>>>> For more options, visit https://groups.google.com/d/optout.
