Even better (for our use case in particular, but probably in general too) would be for the resolver to actually respect the TTL of the DNS record.
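
A rough sketch of what respecting the TTL could look like, in Python for brevity. This is not gRPC code: the class name, the `lookup` callable, and its `(addresses, ttl_seconds)` return shape are all hypothetical. Standard-library lookups such as `socket.getaddrinfo` do not return the TTL, so `lookup` is assumed to be backed by a DNS library that does.

```python
import time


class TtlRefreshingResolver:
    """Caches a resolved address list and re-resolves once its TTL has expired."""

    def __init__(self, lookup, clock=time.monotonic):
        self._lookup = lookup      # hypothetical: returns (addresses, ttl_seconds)
        self._clock = clock        # injectable so the behavior can be tested
        self._addresses = []
        self._expires_at = None    # None means nothing has been resolved yet

    def addresses(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            addresses, ttl_seconds = self._lookup()
            self._addresses = list(addresses)
            self._expires_at = now + ttl_seconds
        return list(self._addresses)


# Usage with a fake DNS record and a fake clock.
records = {"addresses": ["10.0.0.1", "10.0.0.2"], "ttl": 30}
now = [0.0]
resolver = TtlRefreshingResolver(
    lookup=lambda: (records["addresses"], records["ttl"]),
    clock=lambda: now[0],
)
first = resolver.addresses()
records["addresses"] = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # a server is added
now[0] = 10.0
cached = resolver.addresses()      # TTL not yet expired: still the old list
now[0] = 30.0
refreshed = resolver.addresses()   # TTL expired: the new server is picked up
```

With something like this, a newly registered server would be seen within one TTL even if no existing server ever goes down.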

Luke

> On Dec 13, 2016, at 7:28 PM, Kun Zhang <[email protected]> wrote:
>
> The generalized version of the issue is that when you add a server, the
> client will never pick it up until an existing server goes down.
> We could add periodic refresh functionality to our DNS NameResolver. I
> have filed https://github.com/grpc/grpc-java/issues/2514
>
>> On Tuesday, December 13, 2016 at 4:08:31 PM UTC-8, Carl Mastrangelo wrote:
>> A couple of things:
>>
>> * Is there any way to add a new server to the pool before turning down the
>>   previous replica? That would make you slightly over capacity during the
>>   rollout.
>> * The load balancer in Java works differently based on the strategy you use.
>>   In RR, the LB maintains a connection to each of the replicas, so that when
>>   one goes down the next one will be used. This means that as long as you
>>   have some in the pool, it will always go to the next connection. That said,
>>   if you are doing a rolling restart, this pool will get smaller and smaller
>>   until every connection has been killed and marked unusable. At that point
>>   I believe it will refresh the list.
>>
>>> On Friday, December 9, 2016 at 2:45:57 PM UTC-8, [email protected] wrote:
>>> I'm a bit unclear as to how name resolvers in GRPC work with load balancing
>>> in cases of rolling deploys. As far as I can tell, it seems like the most
>>> recently deployed server will end up with no traffic if we follow our
>>> current deploy strategy.
>>>
>>> Our rolling deploy strategy that we plan to adapt for GRPC works as follows:
>>>
>>> 1. Build artifacts
>>> 2. De-register a server replica from its DNS name
>>> 3. Update the server and restart it
>>> 4. Re-register the server replica with its DNS name
>>>
>>> For the Java GRPC implementation, it looks like the GRPC name resolver does
>>> not refresh the list of IPs unless (1) there is an error in a previous
>>> resolve or (2) a server goes down. I believe the core implementation does
>>> the same thing, though I'm not familiar enough with C to really tell.
>>>
>>> What I believe will happen during a rolling deploy is:
>>>
>>> 1. Before deploy: Client is talking to N nodes
>>> 2. A server is removed from DNS, nothing happens on the client
>>> 3. The server issues a GOAWAY frame to clients. The client removes the
>>>    server from its list of connections, and resolves a new list of servers,
>>>    finding any newly added servers
>>> 4. The server is restarted and added to the DNS
>>> 5. Repeat for all other servers in the server set
>>> 6. After deploy: Client is talking to N-1 nodes and will never attempt to
>>>    look for the last server to be restarted
>>>
>>> Is my analysis correct? And if so, what is the recommended way to make sure
>>> the client ends up talking to all N servers after a rolling deploy?
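
The rolling-deploy sequence in the original question can be checked with a small simulation. This is a sketch in Python, not gRPC code; the function name and addresses are made up for illustration, and it models only the behavior described in the thread: the client re-resolves when a server sends GOAWAY and at no other time.

```python
def simulate_rolling_deploy(servers):
    """Return (client connections, DNS entries) after restarting each server once."""
    dns = set(servers)       # addresses currently registered in DNS
    client = set(servers)    # addresses the client holds connections to

    for server in servers:
        dns.discard(server)      # server is de-registered; the client sees nothing
        client.discard(server)   # GOAWAY: the client drops this connection...
        client = set(dns)        # ...and re-resolves, picking up earlier restarts
        dns.add(server)          # restarted server re-registers; no refresh follows
    return client, dns


client, dns = simulate_rolling_deploy(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(sorted(dns - client))  # prints ['10.0.0.3']
```

Under these assumptions the client ends up on N-1 servers, and the one it misses is always the last server to be restarted, which matches the analysis in the question.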
