Hi David,

That's great news.  Makes perfect sense why it wasn't working.  Will keep 
an eye out for the PR to be merged, will test again and let you know the 
outcome.

On Wednesday, November 8, 2017 at 1:43:38 PM UTC+13, David Garcia Quintas 
wrote:
>
> Hey Daniel,
>
> I see what's going on. 10.0.0.{4,5} go down and re-resolution is triggered, 
> but before the DNS has been updated, so round robin gets 10.0.0.{4,5} 
> again. These addresses will never have a backend behind them again, so RR 
> gets stuck in a retry loop, unaware of the DNS update pointing to the 
> available 10.0.0.{7,8}. The fix for this issue is almost ready to be merged 
> (see here <https://github.com/grpc/grpc/pull/12829>), and consists of 
> actively re-requesting resolutions in a different way, not only when we go 
> from healthy to unhealthy LB (right now we don't re-resolve if we stay 
> unhealthy without having ever been healthy).
>
> tl;dr: please check back once https://github.com/grpc/grpc/pull/12829 
> has been merged, which should happen within 1-2 weeks. I'm adding Juanli 
> (the author of that change) to this thread.
>
> On Thursday, 2 November 2017 00:59:26 UTC-7, [email protected] wrote:
>>
>> Hi David,
>>
>> The ports don't change, they remain the same (port 3000) for all server 
>> instances.  DNS is updated with the new IP addresses to show this, the logs 
>> show the results of dns lookups (by nodejs) when each request is sent to 
>> the server.  Log from client with debug logging enabled is attached.
>>
>> Hope you are able to spot something.
>>
>> Regards,
>> Daniel
>>
>> On Thursday, November 2, 2017 at 1:42:41 PM UTC+13, David Garcia Quintas 
>> wrote:
>>>
>>> Another consideration is: has DNS been updated to reflect the new 
>>> servers' IPs? The way things work is:
>>>
>>>    - (DNS) names such as foo.com resolve to a set of ip addresses
>>>    - LB policies are created over the set of ip addresses (internally 
>>>    there's one subchannel per ip)
>>>    - If all subchannels go into shutdown (eg when all servers die), the 
>>>    LB policy will also die. This will result in a request for re-resolution 
>>>    of the name under which the channel was created (in this case, the DNS 
>>>    name). A new LB policy will be created from the results of this 
>>>    re-resolution.
>>>
>>> Which is why, if 1) port numbers change (DNS doesn't provide port 
>>> information) and/or 2) DNS doesn't resolve to the servers' new addresses, 
>>> the new LB policy won't contain valid subchannels. 
>>>
>>> On Wednesday, 1 November 2017 16:48:15 UTC-7, David Garcia Quintas wrote:
>>>>
>>>> Hi Daniel,
>>>>
>>>> Do the port numbers of the server also change? If not, it'd be helpful 
>>>> if you could provide me with the logs produced when run with the following 
>>>> environment variables set: GRPC_VERBOSITY=debug 
>>>> GRPC_TRACE=client_channel,round_robin
>>>>
>>>> On Thursday, 26 October 2017 13:01:17 UTC-7, [email protected] wrote:
>>>>>
>>>>> Hi Michael,
>>>>>
>>>>> On Friday, October 27, 2017 at 5:03:33 AM UTC+13, Michael Lumish wrote:
>>>>>>
>>>>>> To clarify, are you saying that after your client loses its 
>>>>>> connection to every server, it never reestablishes a connection with any 
>>>>>> of them?
>>>>>>
>>>>> Yes, exactly.  
>>>>>
>>>>> I have a small project and a bash script that demonstrate this.  
>>>>> If you've got a Linux/Mac machine with nodejs and docker running in swarm 
>>>>> mode, I can share it with you.  Essentially it starts 1 client and 2 
>>>>> server instances (the client just sends a 'ping' request every 2 seconds; 
>>>>> the server just sends a response indicating which server responded). All 
>>>>> runs well and shows load balancing between them.  Then the script shuts 
>>>>> down both instances of the server and starts them up again in a way that 
>>>>> forces them to have different IP addresses.  The client is never able to 
>>>>> reconnect with the 2 new instances on the different IP addresses.
>>>>>
>>>>> Regards,
>>>>> Daniel
>>>>>
>>>>
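
For anyone following the quoted explanation above, here is a toy model of 
the behaviour David describes: one re-resolution happens when the servers 
go down, DNS still returns the old addresses, and the client then retries 
those dead addresses forever. This is only a sketch under assumptions, not 
gRPC's code and not the implementation in the PR. FakeDns, reconnect and 
the re_resolve_while_unhealthy flag are hypothetical names made up for 
illustration; only the IP addresses come from the thread.

```python
from typing import List

OLD = ["10.0.0.4", "10.0.0.5"]  # addresses from the thread; backends are gone
NEW = ["10.0.0.7", "10.0.0.8"]  # addresses from the thread; backends are up


class FakeDns:
    """Hypothetical resolver that returns stale records for the first lookups."""

    def __init__(self, stale_lookups: int) -> None:
        self.stale_lookups = stale_lookups
        self.calls = 0

    def resolve(self) -> List[str]:
        self.calls += 1
        return list(OLD if self.calls <= self.stale_lookups else NEW)


def reconnect(dns: FakeDns, re_resolve_while_unhealthy: bool,
              max_attempts: int = 5) -> List[str]:
    """Return the reachable addresses found, or [] if every attempt fails."""
    # The single re-resolution triggered when the policy goes from
    # healthy to unhealthy; DNS may not have been updated yet.
    addresses = dns.resolve()
    for _ in range(max_attempts):
        # In this toy model an address is reachable only if it is in NEW.
        reachable = [a for a in addresses if a in NEW]
        if reachable:
            return reachable
        if re_resolve_while_unhealthy:
            # Assumed shape of the fix: ask again while still unhealthy.
            addresses = dns.resolve()
    return []
```

With a resolver that is stale for exactly one lookup, 
reconnect(FakeDns(1), False) never leaves the old addresses, while 
reconnect(FakeDns(1), True) picks up 10.0.0.{7,8} on the second lookup.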

-- 
You received this message because you are subscribed to the Google Groups 
"grpc.io" group.
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/grpc-io/0d8e5bd9-4254-4f72-81dd-384f2bdad397%40googlegroups.com.