[ 
https://issues.apache.org/jira/browse/TINKERPOP3-814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14717138#comment-14717138
 ] 

stephen mallette commented on TINKERPOP3-814:
---------------------------------------------

Thanks for reporting this.  Is there an "easy" way to get the driver into this 
state?  You mentioned the "gryo" serialization failure...anything I can 
easily/readily reproduce?

> ConnectionPool can fill with dead Connections
> ---------------------------------------------
>
>                 Key: TINKERPOP3-814
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-814
>             Project: TinkerPop 3
>          Issue Type: Bug
>          Components: driver
>    Affects Versions: 3.0.0-incubating
>         Environment: "Cluster" of a single gremlin-server/Titan node.  Single 
> client.  minConnectionPoolSize = 2; maxConnectionPoolSize = 10.
>            Reporter: Kieran Sherlock
>            Assignee: stephen mallette
>
> Client gets a GryoMessageSerializerV1d0 decoding error on a couple of 
> requests (different unfiled issue).   ConnectionPool gets into the situation 
> where there are 2 dead connections being maintained in the pool, but no new 
> connections are being issued.
> A sample of the log output...
> {noformat}
> 2015-08-27 09:57:18,122 [Grizzly-worker(15)] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - Borrowing connection 
> from pool on Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin} - timeout in 3000 MILLISECONDS
> 2015-08-27 09:57:18,122 [Grizzly-worker(15)] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - Return least used 
> Connection{host=Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin}, isDead=false, borrowed=1, pending=0} 
> on Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin}
> 2015-08-27 09:57:18,122 [Grizzly-worker(15)] DEBUG 
> org.apache.tinkerpop.gremlin.driver.Client  - Submitted RequestMessage{, 
> requestId=0349c1ce-19dc-4d54-9ca9-30965c9314ae, op='eval', processor='', 
> args={<redacted>}} to - 
> Connection{host=Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin}}
> 2015-08-27 09:57:18,125 [gremlin-driver-worker-6] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - Attempting to return 
> Connection{host=Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin}} on 
> Host{address=localhost/127.0.0.1:18182, hostUri=ws://localhost:18182/gremlin}
> 2015-08-27 09:57:18,126 [gremlin-driver-worker-6] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - On 
> Host{address=localhost/127.0.0.1:18182, hostUri=ws://localhost:18182/gremlin} 
> pool size of 3 > minPoolSize 2 and borrowed of 0 <= 
> minSimultaneousUsagePerConnection 8 so destroy 
> Connection{host=Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin}, isDead=false, borrowed=0, pending=0}
> 2015-08-27 09:57:18,126 [gremlin-driver-worker-6] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - 
> Connection{host=Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin}, isDead=false, borrowed=0, pending=0} 
> destroyed
> 2015-08-27 09:57:18,126 [Grizzly-worker(15)] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - Borrowing connection 
> from pool on Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin} - timeout in 3000 MILLISECONDS
> 2015-08-27 09:57:18,126 [Grizzly-worker(15)] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - Pool was initialized 
> but a connection could not be selected earlier - waiting for connection on 
> Host{address=localhost/127.0.0.1:18182, hostUri=ws://localhost:18182/gremlin}
> 2015-08-27 09:57:18,126 [Grizzly-worker(15)] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - Wait 3000 MILLISECONDS 
> for an available connection on Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin} with Thread[Grizzly-worker(15),5,main]
> 2015-08-27 09:57:21,127 [Grizzly-worker(15)] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - Continue to wait for 
> connection on -1 if {} > 0
> 2015-08-27 09:57:21,127 [Grizzly-worker(15)] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - Timed-out waiting for 
> connection on Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin} - possibly unavailable
> 2015-08-27 09:57:21,127 [gremlin-driver-worker-3] DEBUG 
> org.apache.tinkerpop.gremlin.driver.Host  - Trying to reconnect to dead host 
> at Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin}
> 2015-08-27 09:57:21,128 [Grizzly-worker(15)] DEBUG 
> org.apache.tinkerpop.gremlin.driver.Client  - Submitted RequestMessage{, 
> requestId=67761973-7ae7-4a6a-8e62-443fdb55142d, op='eval', processor='', 
> args=<redacted>} to - connection not initialized
> 2015-08-27 09:57:21,128 [gremlin-driver-worker-3] DEBUG 
> org.apache.tinkerpop.gremlin.driver.ConnectionPool  - Trying to re-establish 
> connection on Host{address=localhost/127.0.0.1:18182, 
> hostUri=ws://localhost:18182/gremlin}
> 2015-08-27 09:57:21,128 [Grizzly-worker(15)] ERROR 
> EdnaMetaDataServiceGremlinImpl  - Failed man lookupMerchantAccountName
> java.lang.RuntimeException: java.lang.RuntimeException: 
> java.util.concurrent.TimeoutException
> {noformat}
> This behavior continues with each subsequent request.  The gremlin-server is 
> available at all times.
> As a workaround I have applied the following patch, but I am pretty sure it 
> is not the correct solution
> {noformat}
> diff --git 
> a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
>  
> b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
> index 30cd71f..cb8ebfa 100644
> --- 
> a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
> +++ 
> b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
> @@ -352,7 +352,7 @@ final class ConnectionPool {
>              }
>              remaining = to - TimeUtil.timeSince(start, unit);
> -            logger.debug("Continue to wait for connection on {} if {} > 0", 
> remaining);
> +            logger.debug("Continue to wait for connection on {} if {} > 0", 
> host, remaining);
>          } while (remaining > 0);
>          logger.debug("Timed-out waiting for connection on {} - possibly 
> unavailable", host);
> @@ -371,13 +371,16 @@ final class ConnectionPool {
>          // let the load-balancer know that the host is acting poorly
>          this.cluster.loadBalancingStrategy().onUnavailable(host);
> -
>      }
>      private boolean tryReconnect(final Host h) {
>          logger.debug("Trying to re-establish connection on {}", host);
>          try {
> +            // remove dead connections
> +            for (Connection c : connections)
> +                if (c.isDead())
> +                    definitelyDestroyConnection(c);
>              connections.add(new Connection(host.getHostUri(), this, cluster, 
> settings().maxInProcessPerConnection));
>              this.open.set(connections.size());
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to