[jira] [Commented] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-13 Thread ASF subversion and git services (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17428370#comment-17428370
 ] 

ASF subversion and git services commented on GEODE-9666:


Commit b39958fafa8690bb978710018a6ecf2bc56244f3 in geode's branch 
refs/heads/develop from Aaron Lindsey
[ https://gitbox.apache.org/repos/asf?p=geode.git;h=b39958f ]

GEODE-9666: Avoid caching InetSocketAddress (#6938)

The changes for GEODE-9139 changed the behavior of
org.apache.geode.distributed.internal.tcpserver.HostAndPort to
permanently cache the internal InetSocketAddress once it has tried one
time to resolve the address. This undoes part of the fix introduced by
GEODE-7808, in which HostAndPort was created as a way to hold an
unresolved hostname.

The issue is that the cached InetSocketAddress may contain a stale or
unresolved address which will be returned by getSocketInetAddress for
the lifetime of the HostAndPort/InetSocketWrapper object. This prevents
the address from being resolved correctly after changes in DNS records.
(Such changes are common in cloud environments.)

This commit removes the cached internal InetSocketAddress from
InetSocketWrapper so that getSocketInetAddress will try to resolve the
address each time it is called with an unresolved address.

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>  Labels: needsTriage, pull-request-available
>
> We have a test for Geode on Kubernetes which:
>  * Deploys a Geode cluster consisting of 2 locator Pods, 3 server Pods
>  * Deploys 5 Spring boot client Pods which continually do PUTs and GETs
>  * Triggers a rolling restart of the locator Pods
>  ** The rolling restart operation restarts one locator at a time, waiting for 
> each restarted locator to become fully online before restarting the next 
> locator
>  * Stops the client operations and validates there were no exceptions thrown 
> in the clients.
> Occasionally, we see {{NoAvailableLocatorsException}} thrown on one of the 
> clients:
> {code:none}
> org.apache.geode.cache.client.NoAvailableLocatorsException: Unable to connect 
> to any locators in the list 
> [system-test-gemfire-locator-0.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334,
>  
> system-test-gemfire-locator-1.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334]
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.findServer(AutoConnectionSourceImpl.java:174)
>   at 
> org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createClientToServerConnection(ConnectionFactoryImpl.java:198)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:196)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:190)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:276)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:136)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:119)
>   at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:801)
>   at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:92)
>   at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:114)
>   at 
> org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2802)
>   at 
> org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1469)
>   at 
> org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1442)
>   at 
> org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:197)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1379)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1318)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1303)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:439)
>   at 
> org.apache.geode.kubernetes.client.service.AsyncOperationService.evaluate(AsyncOperationService.java:282)
>   at 
> 

[jira] [Commented] (GEODE-9666) Client throws NoAvailableLocatorsException after locators change IP addresses

2021-10-05 Thread Aaron Lindsey (Jira)


[ 
https://issues.apache.org/jira/browse/GEODE-9666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424702#comment-17424702
 ] 

Aaron Lindsey commented on GEODE-9666:
--

I no longer see the {{NoAvailableLocatorsException}} when running our test with 
this change: 
[https://github.com/apache/geode/compare/develop...aaronlindsey:GEODE-9666-NoAvailableLocatorsException?expand=1]

> Client throws NoAvailableLocatorsException after locators change IP addresses
> -
>
> Key: GEODE-9666
> URL: https://issues.apache.org/jira/browse/GEODE-9666
> Project: Geode
>  Issue Type: Bug
>  Components: membership
>Affects Versions: 1.15.0
>Reporter: Aaron Lindsey
>Assignee: Aaron Lindsey
>Priority: Major
>  Labels: needsTriage
>
> We have a test for Geode on Kubernetes which:
>  * Deploys a Geode cluster consisting of 2 locator Pods, 3 server Pods
>  * Deploys 5 Spring boot client Pods which continually do PUTs and GETs
>  * Triggers a rolling restart of the locator Pods
>  ** The rolling restart operation restarts one locator at a time, waiting for 
> each restarted locator to become fully online before restarting the next 
> locator
>  * Stops the client operations and validates there were no exceptions thrown 
> in the clients.
> Occasionally, we see {{NoAvailableLocatorsException}} thrown on one of the 
> clients:
> {code:none}
> org.apache.geode.cache.client.NoAvailableLocatorsException: Unable to connect 
> to any locators in the list 
> [system-test-gemfire-locator-0.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334,
>  
> system-test-gemfire-locator-1.system-test-gemfire-locator.gemfire-system-test-3f1ecc74-b1ea-4288-b4d1-594bbb8364ab.svc.cluster.local:10334]
>   at 
> org.apache.geode.cache.client.internal.AutoConnectionSourceImpl.findServer(AutoConnectionSourceImpl.java:174)
>   at 
> org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createClientToServerConnection(ConnectionFactoryImpl.java:198)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:196)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.createPooledConnection(ConnectionManagerImpl.java:190)
>   at 
> org.apache.geode.cache.client.internal.pooling.ConnectionManagerImpl.borrowConnection(ConnectionManagerImpl.java:276)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:136)
>   at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.execute(OpExecutorImpl.java:119)
>   at 
> org.apache.geode.cache.client.internal.PoolImpl.execute(PoolImpl.java:801)
>   at org.apache.geode.cache.client.internal.GetOp.execute(GetOp.java:92)
>   at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.get(ServerRegionProxy.java:114)
>   at 
> org.apache.geode.internal.cache.LocalRegion.findObjectInSystem(LocalRegion.java:2802)
>   at 
> org.apache.geode.internal.cache.LocalRegion.getObject(LocalRegion.java:1469)
>   at 
> org.apache.geode.internal.cache.LocalRegion.nonTxnFindObject(LocalRegion.java:1442)
>   at 
> org.apache.geode.internal.cache.LocalRegionDataView.findObject(LocalRegionDataView.java:197)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1379)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1318)
>   at 
> org.apache.geode.internal.cache.LocalRegion.get(LocalRegion.java:1303)
>   at 
> org.apache.geode.internal.cache.AbstractRegion.get(AbstractRegion.java:439)
>   at 
> org.apache.geode.kubernetes.client.service.AsyncOperationService.evaluate(AsyncOperationService.java:282)
>   at 
> org.apache.geode.kubernetes.client.api.Controller.evaluateRegion(Controller.java:88)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.base/java.lang.reflect.Method.invoke(Method.java:566)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:197)
>   at 
> org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:141)
>   at 
> org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:106)
>   at 
>