[ 
https://issues.apache.org/jira/browse/GEODE-4251?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16319481#comment-16319481
 ] 

Barry Oglesby commented on GEODE-4251:
--------------------------------------

I see whats happening in this test.

The registerInterest call times out:
{noformat}
[vm3] [warn 2018/01/09 15:49:13.919 PST <RMI TCP Connection(1)-127.0.0.1> 
tid=0x13] Pool unexpected socket timed out on client 
connection=SubscriptionConnectionImpl[localhost:27374:closed])
{noformat}
This causes causes the QueueConnectionImpl to be destroyed, which causes the 
CacheClientUpdater to be closed:
{noformat}
[vm3] [info 2018/01/09 15:49:13.919 PST <Cache Client Updater Thread  on 
192.168.2.6(20159)<v1>:32770 port 27374> tid=0x3e] Cache client updater for 
Queue on endpoint localhost:27374 exiting. Scheduling recovery.
{noformat}
Here is a stack trace of that area of the code. OpExecutorImpl.handleException 
handles the SocketTimeoutException by destroying the QueueConnectionImpl. This 
calls QueueManagerImpl.endpointCrashed which closes the CacheClientUpdater.
{noformat}
java.lang.Exception: Stack trace
        at java.lang.Thread.dumpStack(Thread.java:1333)
        at 
org.apache.geode.internal.cache.tier.sockets.CacheClientUpdater.close(CacheClientUpdater.java:548)
        at 
org.apache.geode.cache.client.internal.QueueConnectionImpl.internalDestroy(QueueConnectionImpl.java:107)
        at 
org.apache.geode.cache.client.internal.QueueManagerImpl.endpointCrashed(QueueManagerImpl.java:392)
        at 
org.apache.geode.cache.client.internal.QueueManagerImpl.connectionCrashed(QueueManagerImpl.java:370)
        at 
org.apache.geode.cache.client.internal.QueueConnectionImpl.destroy(QueueConnectionImpl.java:83)
        at 
org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:774)
        at 
org.apache.geode.cache.client.internal.OpExecutorImpl.handleException(OpExecutorImpl.java:615)
        at 
org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnQueuesAndReturnPrimaryResult(OpExecutorImpl.java:561)
        at 
org.apache.geode.cache.client.internal.PoolImpl.executeOnQueuesAndReturnPrimaryResult(PoolImpl.java:842)
        at 
org.apache.geode.cache.client.internal.RegisterInterestListOp.execute(RegisterInterestListOp.java:47)
        at 
org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterestList(ServerRegionProxy.java:533)
        at 
org.apache.geode.internal.cache.LocalRegion.processSingleInterest(LocalRegion.java:3763)
        at 
org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3848)
        at 
org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3842)
        at 
org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3837)
        at TestClient.registerInterest(TestClient.java:94)
        at TestClient.main(TestClient.java:31)
{noformat}
Its unclear why the registerInterest call times out, but the test sets the read 
timeout = 2000.
{noformat}
p = PoolManager.createFactory().addServer(host, PORT1).addServer(host, PORT2)
                
.setSubscriptionEnabled(true).setSubscriptionRedundancy(-1).setReadTimeout(2000)
{noformat}
In the meantime, the CacheClientProxy has been established on each server, so 
when the CacheClientUpdater is attempted to be recreated by the 
RedundancySatisfierTask, it fails with:
{noformat}
[vm0] [warn 2018/01/09 15:49:13.932 PST <Client Queue Initialization  Thread 0> 
tid=0x5c] A previous connection attempt from this client is still being 
processed: identity(192.168.2.6(20162:loner):58955:43e553dd,connection=1
[vm3] [warn 2018/01/09 15:49:13.932 PST 
<queueTimer-RegisterInterestKeysDUnitTestPool> tid=0x3f] Cache Client Updater 
Thread  on 192.168.2.6(20159)<v1>:32770 port 27374 (localhost:27374): Caught 
following exception while attempting to create a server-to-client communication 
socket and will exit: 
org.apache.geode.cache.client.ServerRefusedConnectionException: <null inet_addr 
hostname><ec>:27374 refused connection: A previous connection attempt from this 
client is still being processed: 
identity(192.168.2.6(20162:loner):58955:43e553dd,connection=1
{noformat}
Here is a stack trace of the CacheClientUpdater recovery from the client:
{noformat}
java.lang.Exception: Stack trace
        at java.lang.Thread.dumpStack(Thread.java:1333)
        at 
org.apache.geode.internal.cache.tier.sockets.CacheClientUpdater.<init>(CacheClientUpdater.java:273)
        at 
org.apache.geode.cache.client.internal.ConnectionFactoryImpl.createServerToClientConnection(ConnectionFactoryImpl.java:302)
        at 
org.apache.geode.cache.client.internal.QueueManagerImpl.initializeQueueConnection(QueueManagerImpl.java:977)
        at 
org.apache.geode.cache.client.internal.QueueManagerImpl.createNewPrimary(QueueManagerImpl.java:816)
        at 
org.apache.geode.cache.client.internal.QueueManagerImpl.recoverPrimary(QueueManagerImpl.java:928)
        at 
org.apache.geode.cache.client.internal.QueueManagerImpl$RedundancySatisfierTask.run2(QueueManagerImpl.java:1469)
        at 
org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1326)
        at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at 
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)
{noformat}
Here is a stack trace of the CacheClientNotifier.registerClient method that 
logs the 'previous connection attempt' warning:
{noformat}
java.lang.Exception: Stack trace
        at java.lang.Thread.dumpStack(Thread.java:1333)
        at 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.registerClient(CacheClientNotifier.java:524)
        at 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.registerGFEClient(CacheClientNotifier.java:332)
        at 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.registerClient(CacheClientNotifier.java:279)
        at 
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$ClientQueueInitializerTask.run(AcceptorImpl.java:1844)
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at 
org.apache.geode.internal.cache.tier.sockets.AcceptorImpl$3$1.run(AcceptorImpl.java:611)
        at java.lang.Thread.run(Thread.java:745)
{noformat}

The current checks are:
- staleClientProxy.isConnected() (boolean on the CacheClientProxy that won't be 
false unless CacheClientProxy close has been invoked)
- staleClientProxy.getSocket().isConnected() (will be true if the socket has 
ever been connected even if it is closed now)

I tried staleClientProxy.getSocket().isClosed(), but that returned false even 
though the socket had been closed on the remote side.

Since there is no activity on the CacheClientProxy's MessageDispatcher, it 
doesn't realize its socket has been terminated from the remote side. The way 
that happens is during the ClientHealthMonitor's periodic checks, it realizes 
it hasn't gotten a ping from the client and it closes all the connections for 
that client including the CacheClientProxy's MessageDispatcher.

Here is a stack trace of the ClientHealthMonitor terminating the 
ServerConnection and CacheClientProxy:
{noformat}
java.lang.Exception: Stack trace
        at java.lang.Thread.dumpStack(Thread.java:1333)
        at 
org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.terminateDispatching(CacheClientProxy.java:891)
        at 
org.apache.geode.internal.cache.tier.sockets.CacheClientProxy.close(CacheClientProxy.java:793)
        at 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.closeDeadProxies(CacheClientNotifier.java:1750)
        at 
org.apache.geode.internal.cache.tier.sockets.CacheClientNotifier.unregisterClient(CacheClientNotifier.java:725)
        at 
org.apache.geode.internal.cache.tier.sockets.ClientHealthMonitor.unregisterClient(ClientHealthMonitor.java:281)
        at 
org.apache.geode.internal.cache.tier.sockets.ServerConnection.handleTermination(ServerConnection.java:914)
        at 
org.apache.geode.internal.cache.tier.sockets.ClientHealthMonitor.validateThreads(ClientHealthMonitor.java:624)
        at 
org.apache.geode.internal.cache.tier.sockets.ClientHealthMonitor$ClientHealthMonitorThread.run(ClientHealthMonitor.java:848)
{noformat}
Questions:

1. Why is the CacheClientUpdater closed when the registerInterest 
QueueConnectionImpl times out? Do they need to be maintained in pairs? Would it 
be possible to recreate a connection on that same server?
2. Is it possible to use the ClientHealthMonitor to check the stat of the 
CacheClientProxy? Is it worth doing this?

Normal operations (like puts) do not cause this same behavior. The 
CacheClientUpdater is not closed in the case of a normal operation timing out.


> [CI Failure]Class 
> org.apache.geode.internal.cache.tier.sockets.RegisterInterestKeysPRDUnitTest  
>  --- testRegisterCreatesInvalidEntry
> ------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-4251
>                 URL: https://issues.apache.org/jira/browse/GEODE-4251
>             Project: Geode
>          Issue Type: Bug
>          Components: cq
>            Reporter: nabarun
>
> {noformat}
> org.apache.geode.test.dunit.RMIException: While invoking 
> org.apache.geode.internal.cache.tier.sockets.RegisterInterestKeysDUnitTest$$Lambda$28/1575286050.run
>  in VM 3 running on Host ea663fe20602 with 4 VMs
>       at org.apache.geode.test.dunit.VM.invoke(VM.java:393)
>       at org.apache.geode.test.dunit.VM.invoke(VM.java:363)
>       at org.apache.geode.test.dunit.VM.invoke(VM.java:308)
>       at 
> org.apache.geode.internal.cache.tier.sockets.RegisterInterestKeysDUnitTest.testRegisterCreatesInvalidEntry(RegisterInterestKeysDUnitTest.java:135)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
>       at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>       at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
>       at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>       at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:55)
>       at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>       at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:325)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:78)
>       at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:57)
>       at org.junit.runners.ParentRunner$3.run(ParentRunner.java:290)
>       at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:71)
>       at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:288)
>       at org.junit.runners.ParentRunner.access$000(ParentRunner.java:58)
>       at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:268)
>       at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>       at org.junit.runners.ParentRunner.run(ParentRunner.java:363)
>       at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.runTestClass(JUnitTestClassExecuter.java:114)
>       at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecuter.execute(JUnitTestClassExecuter.java:57)
>       at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassProcessor.processTestClass(JUnitTestClassProcessor.java:66)
>       at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>       at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>       at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:32)
>       at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:93)
>       at com.sun.proxy.$Proxy2.processTestClass(Unknown Source)
>       at 
> org.gradle.api.internal.tasks.testing.worker.TestWorker.processTestClass(TestWorker.java:109)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:35)
>       at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>       at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:147)
>       at 
> org.gradle.internal.remote.internal.hub.MessageHubBackedObjectConnection$DispatchWrapper.dispatch(MessageHubBackedObjectConnection.java:129)
>       at 
> org.gradle.internal.remote.internal.hub.MessageHub$Handler.run(MessageHub.java:404)
>       at 
> org.gradle.internal.concurrent.ExecutorPolicy$CatchAndRecordFailures.onExecute(ExecutorPolicy.java:63)
>       at 
> org.gradle.internal.concurrent.StoppableExecutorImpl$1.run(StoppableExecutorImpl.java:46)
>       at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>       at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>       at java.lang.Thread.run(Thread.java:748)
> Caused by: java.lang.AssertionError: failed while registering interest
>       at org.apache.geode.test.dunit.Assert.fail(Assert.java:66)
>       at 
> org.apache.geode.internal.cache.tier.sockets.RegisterInterestKeysDUnitTest.registerKeysK1(RegisterInterestKeysDUnitTest.java:224)
>       at 
> org.apache.geode.internal.cache.tier.sockets.RegisterInterestKeysDUnitTest.lambda$testRegisterCreatesInvalidEntry$bb17a952$2(RegisterInterestKeysDUnitTest.java:135)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at hydra.MethExecutor.executeObject(MethExecutor.java:244)
>       at 
> org.apache.geode.test.dunit.standalone.RemoteDUnitVM.executeMethodOnObject(RemoteDUnitVM.java:70)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:498)
>       at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:357)
>       at sun.rmi.transport.Transport$1.run(Transport.java:200)
>       at sun.rmi.transport.Transport$1.run(Transport.java:197)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at sun.rmi.transport.Transport.serviceCall(Transport.java:196)
>       at 
> sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:568)
>       at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:826)
>       at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.lambda$run$0(TCPTransport.java:683)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at 
> sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:682)
>       ... 3 more
> Caused by: org.apache.geode.cache.NoSubscriptionServersAvailableException: 
> org.apache.geode.cache.NoSubscriptionServersAvailableException: Primary 
> discovery failed.
>       at 
> org.apache.geode.cache.client.internal.QueueManagerImpl.getAllConnections(QueueManagerImpl.java:190)
>       at 
> org.apache.geode.cache.client.internal.OpExecutorImpl.executeOnQueuesAndReturnPrimaryResult(OpExecutorImpl.java:558)
>       at 
> org.apache.geode.cache.client.internal.PoolImpl.executeOnQueuesAndReturnPrimaryResult(PoolImpl.java:842)
>       at 
> org.apache.geode.cache.client.internal.RegisterInterestListOp.execute(RegisterInterestListOp.java:47)
>       at 
> org.apache.geode.cache.client.internal.ServerRegionProxy.registerInterestList(ServerRegionProxy.java:533)
>       at 
> org.apache.geode.internal.cache.LocalRegion.processSingleInterest(LocalRegion.java:3761)
>       at 
> org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3846)
>       at 
> org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3840)
>       at 
> org.apache.geode.internal.cache.LocalRegion.registerInterest(LocalRegion.java:3835)
>       at 
> org.apache.geode.internal.cache.tier.sockets.RegisterInterestKeysDUnitTest.registerKeysK1(RegisterInterestKeysDUnitTest.java:213)
>       ... 24 more
> Caused by: org.apache.geode.cache.NoSubscriptionServersAvailableException: 
> Primary discovery failed.
>       at 
> org.apache.geode.cache.client.internal.QueueManagerImpl$ConnectionList.setPrimaryDiscoveryFailed(QueueManagerImpl.java:1351)
>       at 
> org.apache.geode.cache.client.internal.QueueManagerImpl.recoverPrimary(QueueManagerImpl.java:965)
>       at 
> org.apache.geode.cache.client.internal.QueueManagerImpl$RedundancySatisfierTask.run2(QueueManagerImpl.java:1467)
>       at 
> org.apache.geode.cache.client.internal.PoolImpl$PoolTask.run(PoolImpl.java:1326)
>       at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>       at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>       at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>       ... 3 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Reply via email to