[jira] [Commented] (CURATOR-229) No retry on DNS lookup failure

Andy Sloane (JIRA) Mon, 10 Apr 2017 16:05:58 -0700

    [ 
https://issues.apache.org/jira/browse/CURATOR-229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15963625#comment-15963625
 ]


Andy Sloane commented on CURATOR-229:
-------------------------------------

Right, there are two cases (permanent and temporary), but in both cases I would 
argue the behavior is undesirable.

If the host is truly not resolvable, the background thread logs the above 
exception and... nothing else is obviously wrong. 
{{CuratorFrameworkImpl.start()}} returns without issue while the background 
thread hangs, and unless you've registered an {{UnhandledErrorListener}} there 
is no API-level indication that anything is wrong, at least if you're using 
simple things like {{LeaderLatch}}.

If it's a temporary DNS failure and retrying would work, then retrying in the 
background and not complaining in {{start()}} is fine. But if it's permanent, 
you're stuck: the configuration error never bubbles up to the surface.

Even just throwing the error from {{CuratorFrameworkImpl.start()}} to the 
caller, instead of treating it as a background exception, would improve the 
situation. If the client was previously connected and is reconnecting outside 
of {{start()}}, then retrying the connection to ZooKeeper in the background 
makes sense.
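
Until something like that exists, a caller can approximate the fail-fast 
behavior by checking DNS before calling {{start()}}. The following is only a 
minimal sketch using the JDK alone: {{DnsPreflight}}, {{hosts}} and 
{{requireResolvable}} are hypothetical names, not part of Curator, and the 
parsing assumes a plain "host:port,host:port/chroot" connect string (bare IPv6 
literals are not handled).

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, not part of Curator: surfaces DNS misconfiguration early. */
final class DnsPreflight {

    /** Extracts host names from a connect string such as "zk1:2181,zk2:2181/app". */
    static List<String> hosts(String connectString) {
        // Drop the optional chroot suffix, which starts at the first '/'.
        int slash = connectString.indexOf('/');
        String servers = slash >= 0 ? connectString.substring(0, slash) : connectString;

        List<String> result = new ArrayList<>();
        for (String server : servers.split(",")) {
            String s = server.trim();
            if (s.isEmpty()) {
                continue;
            }
            // Strip the port if present (assumes no bare IPv6 literal).
            int colon = s.lastIndexOf(':');
            result.add(colon >= 0 ? s.substring(0, colon) : s);
        }
        return result;
    }

    /** Returns normally if at least one host resolves; otherwise throws. */
    static void requireResolvable(String connectString) throws UnknownHostException {
        List<String> unresolved = new ArrayList<>();
        for (String host : hosts(connectString)) {
            try {
                InetAddress.getByName(host);
                return; // one resolvable server is enough to attempt a connection
            } catch (UnknownHostException e) {
                unresolved.add(host);
            }
        }
        throw new UnknownHostException("No ZooKeeper host resolves: " + unresolved);
    }
}
```

Calling {{DnsPreflight.requireResolvable(connectString)}} right before 
{{start()}} would turn a bad host name into an exception the caller sees. The 
trade-off is that a temporary DNS outage at startup also fails immediately, so 
this only suits applications that prefer to fail fast at that point.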


> No retry on DNS lookup failure
> ------------------------------
>
>                 Key: CURATOR-229
>                 URL: https://issues.apache.org/jira/browse/CURATOR-229
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Framework
>    Affects Versions: 2.7.0
>            Reporter: Michael Putters
>
> Our environment is set up so that host names (rather than IP addresses) are 
> used when registering services.
> When disconnecting a node from the network, it will attempt to reconnect and 
> - in order to do this - attempts to resolve a host name, which fails (since 
> we have no network connectivity and a DNS server is used).
> It appears this type of exception is not retryable, and the node simply gives 
> up and never reconnects, even when the network connectivity is back.
> Is this the expected behavior? Is there any way to configure Curator so that 
> this type of exception is retryable? I had a look at 
> {{CuratorFrameworkImpl.java}} around line 768 but there doesn't seem to be 
> anything configurable.
> If this is not the expected behavior (or if it is but you don't mind making 
> it configurable), I should be able to provide a patch via a pull request.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
