response inline

On Tuesday, December 13, 2016 at 8:49:22 PM UTC-8, Yunchi Luo wrote:
>
> I'm very happy to get so much feedback!
>
> To Carl,
>
> We don't have a container cluster management system like kubernetes or 
> mesos running, so to spin up an additional instance would mean starting up 
> a new machine and that will add significant latency to our deploys.
>
> We use RR. In RR, does the resolver server list refresh only happen when 
> all server connections are down, or each time a single server goes down?
>

I believe it happens when a single connection goes down.  FTR, a goaway is 
not considered going down.  Assuming your server stops accepting 
connections and THEN sends a goaway, this should be okay.  Right now, when a 
single connection gets a goaway, we try to reconnect.  If your server sends 
a goaway after it stops calling accept, the client will:

1. Get a goaway
2. Try to reconnect
3. Get an unreachable error

But if the server sends a goaway first, and only stops calling accept once 
the connections are gone, the client may be stuck in a loop:

1. Get a goaway
2. Try to reconnect
3. Get another goaway
4. Go to 1

Neither of these is very good.  Thus, I think there is a bug in our 
reconnect code.  The GitHub issue would be the right place to talk about 
solutions.
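
To make the two orderings concrete, here is a toy model in Python.  It is 
not gRPC code: Server, connect, reconnect_after_goaway and the max_attempts 
cap are all made-up names for illustration, and the cap only exists so the 
loop in the second case terminates.

```python
from dataclasses import dataclass


@dataclass
class Server:
    # Hypothetical stand-in for a server that is shutting down.
    accepting: bool = True   # still calling accept() on its listen socket
    draining: bool = False   # answers every connection with a GOAWAY


def connect(server):
    """Return the outcome of a single connection attempt."""
    if not server.accepting:
        return "unreachable"
    if server.draining:
        return "goaway"
    return "ok"


def reconnect_after_goaway(server, max_attempts=5):
    """Reconnect right away after each GOAWAY and record every outcome."""
    outcomes = []
    for _ in range(max_attempts):
        outcome = connect(server)
        outcomes.append(outcome)
        if outcome != "goaway":
            break  # either connected or got a hard error
    return outcomes


# Ordering 1: the server stops accepting, then sends GOAWAY.
stop_first = reconnect_after_goaway(Server(accepting=False, draining=True))

# Ordering 2: the server sends GOAWAY while it is still accepting.
goaway_first = reconnect_after_goaway(Server(accepting=True, draining=True))

print(stop_first)
print(goaway_first)
```

In the first ordering the reconnect fails once with an unreachable error.  
In the second, every attempt gets another goaway until the artificial cap 
is hit, which is the loop described above.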
 

>
> To Kun,
>
> Thanks for filing the issue. We are planning to use grpc in Python, Go, 
> and Java. We implemented our own Go dns resolver that does support periodic 
> updates since the Go library seems to be missing dns resolvers entirely.
>
> Do you know how the core implementation, which Python uses, behaves? Does 
> the periodic refresh or TTL need to be implemented in core as well?
>
> Supporting rolling deploys and/or adding servers to a name and having 
> clients pick them up periodically seems like a pretty critical feature for 
> a production system. Is there a design for these cases documented somewhere?
>
> To Luke and Kun,
>
> I agree TTL seems like the most natural option since it's part of DNS. I 
> think I saw an issue in core somewhere that mentioned DNS TTL but can't 
> seem to dig it up.
>
> Thanks!
> Yunchi
> On Tue, Dec 13, 2016 at 8:30 PM Luke Tyler Downey <[email protected]> wrote:
>
>> Even better (for our use case in particular, but in general too probably) 
>> would be to actually respect the TTL in a DNS record.
>>
>> Luke
>>
>>
>> On Dec 13, 2016, at 7:28 PM, Kun Zhang <[email protected]> wrote:
>>
>> The generalized version of the issue would be that you add a server but 
>> the client will never pick it up until one existing server goes down.
>> We could add periodic refresh functionality to our DNS NameResolver. 
>> I have filed https://github.com/grpc/grpc-java/issues/2514
>>
>>
>> On Tuesday, December 13, 2016 at 4:08:31 PM UTC-8, Carl Mastrangelo wrote:
>>>
>>> A couple things:
>>>
>>>
>>> * Is there any way to add a new server to the pool before turning down 
>>> the previous replica?  That would make you slightly over capacity during 
>>> the roll out.
>>> * The load balancer in Java works differently based on the strategy you 
>>> use.  In RR, the LB maintains a connection to each of the replicas, so that 
>>> when one goes down the next one will be used.  This means that as long as 
>>> you have some in the pool, it will always go to the next connection.  That 
>>> said, if you are doing a rolling restart, this pool will get smaller and 
>>> smaller until every connection has been killed and marked unusable.  At 
>>> that point I believe it will refresh the list.  
>>>
>>> On Friday, December 9, 2016 at 2:45:57 PM UTC-8, [email protected] 
>>> wrote:
>>>>
>>>> I'm a bit unclear as to how name resolvers in GRPC work with load 
>>>> balancing in cases of rolling deploys. As far as I can tell, it seems like 
>>>> the most recently deployed server will end up with no traffic if we follow 
>>>> our current deploy strategy.
>>>>
>>>> Our rolling deploy strategy that we plan to adapt for GRPC works as 
>>>> follows:
>>>>
>>>> 1. Build artifacts
>>>> 2. De-register a server replica from its DNS name
>>>> 3. Update the server and restart it
>>>> 4. Re-register the server replica with its DNS name.
>>>>
>>>> For the Java GRPC implementation, it looks like the GRPC name resolver 
>>>> does not refresh the list of IPs unless (1) there is an error in a 
>>>> previous resolve or (2) a server goes down: 
>>>> <https://github.com/grpc/grpc-java/blob/e9779d7c00cde14b33ba3239c859f997e70c2b2e/core/src/main/java/io/grpc/internal/ManagedChannelImpl.java#L696>
>>>> I believe the core implementation does the same thing, though I'm not 
>>>> familiar enough with C to really tell.
>>>>
>>>> What I believe will happen during a rolling deploy is:
>>>>
>>>> 1. Before deploy: Client is talking to N nodes
>>>> 2. A server is removed from DNS, nothing happens on the client
>>>> 3. The server issues a GOAWAY frame to clients. The client removes the 
>>>> server from its list of connections, and resolves a new list of servers, 
>>>> finding any newly added servers
>>>> 4. The server is restarted and added to the DNS
>>>> 5. Repeat for all other servers in the server set
>>>> 6. After deploy: Client is talking to N-1 nodes and will never attempt 
>>>> to look for the last server to be restarted
>>>>
>>>> Is my analysis correct? And if so, what is the recommended way to make 
>>>> sure the client ends up talking to all N servers after a rolling deploy?
>>>>
>
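
For reference, the rolling deploy from the original question can be 
sketched as a toy Python model.  This is only an illustration of the 
analysis in the thread, not how any gRPC implementation is written: 
rolling_deploy and periodic_refresh are hypothetical names, and the model 
assumes the client re-resolves only when one of its servers goes away.

```python
def rolling_deploy(servers, periodic_refresh=False):
    """Simulate the deploy steps; return the servers the client ends up using."""
    dns = set(servers)     # addresses currently registered under the DNS name
    client = set(servers)  # addresses the client holds connections to

    for server in servers:
        # The replica is de-registered; nothing happens on the client.
        dns.discard(server)
        # The replica goes away: the client drops it and re-resolves,
        # picking up whatever is registered at that moment.
        client = set(dns)
        # The replica restarts and re-registers; no client-side event fires.
        dns.add(server)

    if periodic_refresh:
        # Assumed behavior of a periodic or TTL-driven re-resolution that
        # runs some time after the deploy has finished.
        client = set(dns)
    return client


without_refresh = rolling_deploy(["a", "b", "c"])
with_refresh = rolling_deploy(["a", "b", "c"], periodic_refresh=True)

print(sorted(without_refresh))
print(sorted(with_refresh))
```

Under those assumptions the client never learns about the last replica to 
be restarted unless something triggers another resolve afterwards, which 
is what a periodic or TTL-based refresh would provide.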

-- 
You received this message because you are subscribed to the Google Groups 
"grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/grpc-io/1cf891be-1c10-40b9-b285-b3808a467953%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
