The generalized version of the issue is that when you add a server, the client will never pick it up until an existing server goes down. We could add periodic refresh functionality to our DNS NameResolver. I have filed https://github.com/grpc/grpc-java/issues/2514.
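To make the idea of periodic refreshing concrete, here is a minimal sketch in plain Java. It is not the grpc-java NameResolver API and not the eventual fix for the issue above: `PeriodicAddressRefresher`, `jdkLookup`, `refreshOnce`, and `start` are hypothetical names invented for illustration. The sketch assumes that polling a lookup function on a fixed period and notifying a listener only when the address set changes is an acceptable approximation of "refreshing".

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical helper (not part of grpc-java): polls a lookup and reports address changes. */
class PeriodicAddressRefresher {
    private final String host;
    private final Function<String, Set<String>> lookup;  // injectable so it can be faked in tests
    private final Consumer<Set<String>> onChange;
    private Set<String> lastSeen = new TreeSet<>();

    PeriodicAddressRefresher(String host,
                             Function<String, Set<String>> lookup,
                             Consumer<Set<String>> onChange) {
        this.host = host;
        this.lookup = lookup;
        this.onChange = onChange;
    }

    /** Lookup backed by the JDK resolver; returns an empty set if the name cannot be resolved. */
    static Set<String> jdkLookup(String host) {
        Set<String> result = new TreeSet<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                result.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Leave the result empty; refreshOnce() ignores empty results.
        }
        return result;
    }

    /**
     * Resolves once and notifies the listener only if the address set changed.
     * An empty result is treated as a failed lookup and ignored, so a DNS
     * hiccup does not make the listener drop every server.
     */
    synchronized boolean refreshOnce() {
        Set<String> current = new TreeSet<>(lookup.apply(host));
        if (current.isEmpty() || current.equals(lastSeen)) {
            return false;
        }
        lastSeen = current;
        onChange.accept(Collections.unmodifiableSet(current));
        return true;
    }

    /** Schedules refreshOnce() at a fixed rate; the caller must shut the scheduler down. */
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::refreshOnce, 0, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A caller would construct it with something like `new PeriodicAddressRefresher("my-service.example.com", PeriodicAddressRefresher::jdkLookup, addrs -> ...)` and call `start(30)`, where the host name and the 30-second period are placeholders. Note that the JVM caches DNS results (see the `networkaddress.cache.ttl` security property), so a newly registered server may not become visible until that cache expires.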
On Tuesday, December 13, 2016 at 4:08:31 PM UTC-8, Carl Mastrangelo wrote:
>
> A couple of things:
>
> * Is there any way to add a new server to the pool before turning down the
>   previous replica? That would make you slightly over capacity during the
>   rollout.
> * The load balancer in Java works differently depending on the strategy you
>   use. In RR, the LB maintains a connection to each of the replicas, so that
>   when one goes down the next one will be used. This means that as long as
>   you have some in the pool, it will always go to the next connection. That
>   said, if you are doing a rolling restart, this pool will get smaller and
>   smaller until every connection has been killed and marked unusable. At
>   that point I believe it will refresh the list.
>
> On Friday, December 9, 2016 at 2:45:57 PM UTC-8, [email protected] wrote:
>>
>> I'm a bit unclear as to how name resolvers in gRPC work with load
>> balancing in cases of rolling deploys. As far as I can tell, it seems like
>> the most recently deployed server will end up with no traffic if we follow
>> our current deploy strategy.
>>
>> The rolling deploy strategy that we plan to adapt for gRPC works as
>> follows:
>>
>> 1. Build artifacts
>> 2. De-register a server replica from its DNS name
>> 3. Update the server and restart it
>> 4. Re-register the server replica with its DNS name
>>
>> For the Java gRPC implementation, it looks like the gRPC name resolver does
>> not refresh the list of IPs unless
>> <https://github.com/grpc/grpc-java/blob/e9779d7c00cde14b33ba3239c859f997e70c2b2e/core/src/main/java/io/grpc/internal/ManagedChannelImpl.java#L696>
>> (1) there is an error in a previous resolve or (2) a server goes down. I
>> believe the core implementation does the same thing, though I'm not
>> familiar enough with C to really tell.
>>
>> What I believe will happen during a rolling deploy is:
>>
>> 1. Before deploy: Client is talking to N nodes
>> 2. A server is removed from DNS; nothing happens on the client
>> 3. The server issues a GOAWAY frame to clients. The client removes the
>>    server from its list of connections and resolves a new list of servers,
>>    finding any newly added servers
>> 4. The server is restarted and added to the DNS
>> 5. Repeat for all other servers in the server set
>> 6. After deploy: Client is talking to N-1 nodes and will never attempt to
>>    look for the last server to be restarted
>>
>> Is my analysis correct? And if so, what is the recommended way to make
>> sure the client ends up talking to all N servers after a rolling deploy?
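The rolling-deploy sequence described in the original question can be checked with a small simulation. This is a toy model, not gRPC code: it assumes, as the question does, that the client re-resolves DNS exactly once per GOAWAY and at no other time, and it represents servers as plain strings. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the rolling deploy from the question; no networking involved. */
class RollingDeploySimulation {

    /** Returns the set of servers the client is connected to once the deploy finishes. */
    static Set<String> clientViewAfterDeploy(List<String> servers) {
        Set<String> dns = new TreeSet<>(servers);     // names currently registered in DNS
        Set<String> client = new TreeSet<>(servers);  // servers the client is connected to

        for (String server : servers) {
            dns.remove(server);   // the server is de-registered; the client notices nothing

            // The server sends GOAWAY: the client drops it and re-resolves,
            // replacing its view with whatever DNS lists right now.
            client.clear();
            client.addAll(dns);

            dns.add(server);      // the server restarts and re-registers; no client refresh follows
        }
        return client;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("a", "b", "c");
        // prints: Client ends up connected to: [a, b]
        System.out.println("Client ends up connected to: " + clientViewAfterDeploy(servers));
    }
}
```

Under these assumptions the last server to be restarted ("c" here) is the one the client never rediscovers, which matches the N-1 outcome in the question. A load balancer that only refreshes once every connection is dead, as described for RR in the reply, would need a different model.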
