[
https://issues.apache.org/jira/browse/GEODE-9880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17456560#comment-17456560
]
Anthony Baker commented on GEODE-9880:
--------------------------------------
First, the locator list returned to the client contained {ip1, host1, ip2, host2}.
The discovered list provided to the client should have contained only 2 entries,
one for each of the 2 locators.
Second, the advertised address of the locator should follow these semantics:
1) If hostname-for-clients is set, use that.
2) If bind-address is set, use that interface.
3) Otherwise, select an available network interface, but there are no guarantees
about ordering or DNS resolution.
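
The precedence above can be sketched as follows. This is a minimal
illustration, not Geode's implementation: the class AdvertisedAddress, the
method select(), and the fallbackInterfaceAddress parameter are hypothetical
names, and the fallback is passed in rather than discovered, since rule 3
gives no guarantee about which interface is chosen.

```java
final class AdvertisedAddress {

    /**
     * Hypothetical helper: returns the address a locator would advertise,
     * following the three-step precedence described in the comment.
     */
    static String select(String hostnameForClients,
                         String bindAddress,
                         String fallbackInterfaceAddress) {
        // 1) hostname-for-clients wins when it is set.
        if (hostnameForClients != null && !hostnameForClients.isEmpty()) {
            return hostnameForClients;
        }
        // 2) Otherwise use the interface named by bind-address.
        if (bindAddress != null && !bindAddress.isEmpty()) {
            return bindAddress;
        }
        // 3) Otherwise fall back to some available interface (assumed to be
        //    chosen by the caller; no ordering or DNS guarantee).
        return fallbackInterfaceAddress;
    }

    public static void main(String[] args) {
        // Illustrative values only.
        System.out.println(select(null, "10.0.0.5", "192.168.1.2"));
    }
}
```

Under this ordering, each locator advertises exactly one address, which is
consistent with the expectation that the client sees one entry per locator.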
> Cluster with multiple locators in an environment with no host name
> resolution, leads to null pointer exception
> --------------------------------------------------------------------------------------------------------------
>
> Key: GEODE-9880
> URL: https://issues.apache.org/jira/browse/GEODE-9880
> Project: Geode
> Issue Type: Bug
> Components: locator
> Affects Versions: 1.12.5
> Reporter: Tigran Ghahramanyan
> Priority: Major
>
> In our use case we have two locators that are initially configured with IP
> addresses, but the _AutoConnectionSourceImpl.UpdateLocatorList()_ flow keeps
> adding their corresponding host names to the locator list, even though these
> host names are not resolvable.
> Later, in _AutoConnectionSourceImpl.queryLocators()_, whenever a client
> tries to use such a non-resolvable host name to connect to a locator, it tries
> to establish a connection to _sockaddr=0.0.0.0_, as written in
> _SocketCreator.connect()_, which seems strange.
> Then, if there is no locator running on the same host, the next locator in
> the list is contacted, until the client reaches a locator contact configured
> with an IP address, which eventually succeeds.
> But when there happens to be a locator listening on the same host, we get
> a null pointer exception in the second line below, because _inetadd=null_:
>   socket.connect(sockaddr, Math.max(timeout, 0)); // sockaddr=0.0.0.0, connects to a locator listening on the same host
>   configureClientSSLSocket(socket, inetadd.getHostName(), timeout); // inetadd = null
>
> As a result, the cluster enters a failed state and is unable to recover.
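
The null dereference described in the report can be illustrated with the JDK
alone. The sketch below is not Geode's code: the class UnresolvedAddressCheck
and the method hostNameFor() are hypothetical, the host name and port are
illustrative, and it assumes that inetadd is derived from an InetSocketAddress
whose host name could not be resolved. For such an address,
InetSocketAddress.getAddress() returns null, so a guard before getHostName()
would fail fast instead of throwing a NullPointerException.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;

final class UnresolvedAddressCheck {

    /**
     * Hypothetical guard: returns the host name to pass on to the SSL setup,
     * or fails with a descriptive error when the address is unresolved.
     */
    static String hostNameFor(InetSocketAddress sockaddr) {
        // getAddress() returns null when the socket address is unresolved.
        InetAddress inetadd = sockaddr.getAddress();
        if (inetadd == null) {
            throw new IllegalArgumentException(
                "Unresolved locator address: " + sockaddr.getHostString());
        }
        return inetadd.getHostName();
    }

    public static void main(String[] args) {
        // createUnresolved() never attempts a DNS lookup.
        InetSocketAddress unresolved =
            InetSocketAddress.createUnresolved("locator-host.invalid", 10334);
        System.out.println("unresolved: " + unresolved.isUnresolved());
        try {
            hostNameFor(unresolved);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check of this kind, a client could skip the unresolvable entry and move
on to the next locator in the list rather than failing on inetadd.getHostName().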