Do your clients all start up at the same time?  If not, it's not clear to
me why your setup wouldn't work.  If the clients' start times are randomly
distributed, then if the server closes each connection after 30m, the
connection close times should be just as randomly distributed as the client
start times, which means that as soon as the new server comes up, clients
should start trickling into it.  It may take 30m for the load to fully
balance, but the new server should start getting new load immediately, and
the load should increase slowly over that 30m period.

I don't know anything about the Kubernetes autoscaling side of things, but
maybe there are parameters you can tune there to give the new server more
time to accumulate load before Kubernetes kills it?
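For example (untested, and assuming you're using a HorizontalPodAutoscaler on the autoscaling/v2 API; the names below are placeholders), you might set the scale-down stabilization window longer than your 30m connection max-age, so a new server has time to accumulate its share of streams before Kubernetes considers removing it:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-grpc-server   # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-grpc-server
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
  behavior:
    scaleDown:
      # Wait longer than the 30m max connection age before scaling down,
      # so a freshly added server isn't killed while it's still idle.
      stabilizationWindowSeconds: 2100   # 35 minutes
```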

In general, it's not clear to me that there's a better approach than the
one you're already taking.  There's always a bit of tension between load
balancing and streaming RPCs, because the whole point of a streaming RPC is
that it doesn't go through load balancing for each individual message,
which means that all of the messages go to the same backend.

I hope this information is helpful.

On Mon, Oct 7, 2019 at 3:54 PM howardjohn via grpc.io <
grpc-io@googlegroups.com> wrote:

> We have a case where we have many clients and few servers, typically
> 1000:1 ratio. The traffic is a single bidirectional stream per client.
>
> The problem we are seeing is that when a new server comes up, it will have
> no clients connected, as they maintain their connection to the other
> servers.
>
> This is made worse by Kubernetes autoscaling: since this new server has
> 0 load, it will scale down, and we flip-flop between n and n+1 replicas. This
> graph shows this behavior pretty well:
> https://snapshot.raintank.io/dashboard/snapshot/SceOCrNpdOr4qmTUk1UHF20xMiNqGk6K?panelId=4&fullscreen&orgId=2
>
> As a mitigation against this, we have the server close the connections
> every 30m. This is not great, because it takes at least 30 min to balance,
> and due to the above issue this generally doesn't ever work.
>
>
> I am wondering if there are any best practices for handling this type of
> problem?
>
> One possible idea we have is the servers sharing load information and
> shedding load if they have more than their "fair share" of connections, but
> this is pretty complex.
>
> --
> You received this message because you are subscribed to the Google Groups "
> grpc.io" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to grpc-io+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/grpc-io/3423f295-a2fe-4096-b003-fc09c605987e%40googlegroups.com
> <https://groups.google.com/d/msgid/grpc-io/3423f295-a2fe-4096-b003-fc09c605987e%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>


-- 
Mark D. Roth <r...@google.com>
Software Engineer
Google, Inc.
