[ https://issues.apache.org/jira/browse/TINKERPOP-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15133908#comment-15133908 ]
Ramzi Oueslati commented on TINKERPOP-1127:
-------------------------------------------

Hi,

I definitely agree that the _open_ variable is not decremented when connections are destroyed.

Another point: in addConnectionIfUnderMaximum, if opened < maxPoolSize then _open_ is incremented and a new Connection is added to the pool. But what if "new Connection(...)" fails? _open_ gets incremented anyway. That's why I would also add this:

{code}
@@ -300,6 +303,7 @@ final class ConnectionPool {
             try {
                 connections.add(new Connection(host.getHostUri(), this, settings().maxInProcessPerConnection));
             } catch (ConnectionException ce) {
+                open.decrementAndGet();
                 logger.debug("Connections were under max, but there was an error creating the connection.", ce);
                 considerUnavailable();
                 return false;
{code}
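To make the invariant concrete, here is a minimal, self-contained sketch of the bookkeeping that both fixes converge on: _open_ should always equal connections.size(), so the slot reserved by the increment must be rolled back whenever creation fails, and the counter must be decremented on every destroy path. This is a hypothetical simplification for illustration, not the actual driver code; only the names open, connections, maxPoolSize and addConnectionIfUnderMaximum mirror ConnectionPool.

{code:java}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical simplification of ConnectionPool's counter bookkeeping.
// Invariant to preserve: open == connections.size() at all times.
final class PoolCounterSketch {
    private final AtomicInteger open = new AtomicInteger(0);
    private final Set<Object> connections = ConcurrentHashMap.newKeySet();
    private final int maxPoolSize = 10;

    boolean addConnectionIfUnderMaximum() {
        // optimistically reserve a slot under the cap
        while (true) {
            final int opened = open.get();
            if (opened >= maxPoolSize)
                return false;
            if (open.compareAndSet(opened, opened + 1))
                break;
        }
        try {
            connections.add(createConnection()); // may throw
            return true;
        } catch (Exception ce) {
            open.decrementAndGet(); // roll back the reservation on failure
            return false;
        }
    }

    void definitelyDestroyConnection(final Object connection) {
        // decrement only if this call actually removed the connection,
        // so a double destroy cannot drive the counter negative
        if (connections.remove(connection))
            open.decrementAndGet();
    }

    private Object createConnection() throws Exception {
        return new Object(); // stands in for new Connection(...)
    }
}
{code}

Note that guarding the decrement on the result of remove() would also protect against destroying the same connection twice, which the unconditional decrementAndGet in the diff below would not.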
> client fails to reconnect to restarted server
> ---------------------------------------------
>
>                 Key: TINKERPOP-1127
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1127
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: driver
>    Affects Versions: 3.1.0-incubating
>            Reporter: Kieran Sherlock
>            Assignee: stephen mallette
>             Fix For: 3.1.2-incubating
>
>
> If a gremlin-server is restarted, the client will never reconnect to it.
> Start server1.
> Start server2.
> Start a client such as:
> {code}
> GryoMapper kryo = GryoMapper.build().addRegistry(TitanIoRegistry.INSTANCE).create();
> MessageSerializer serializer = new GryoMessageSerializerV1d0(kryo);
> Cluster titanCluster = Cluster.build()
>         .addContactPoints("54.X.X.X,54.Y.Y.Y".split(","))
>         .port(8182)
>         .minConnectionPoolSize(5)
>         .maxConnectionPoolSize(10)
>         .reconnectIntialDelay(1000)
>         .reconnectInterval(30000)
>         .serializer(serializer)
>         .create();
> Client client = titanCluster.connect();
> client.init();
> System.out.println("initialized");
> for (int i = 0; i < 200; i++) {
>     try {
>         long id = System.currentTimeMillis();
>         ResultSet results = client.submit("graph.addVertex('a','" + id + "')");
>         results.one();
>         results = client.submit("g.V().has('a','" + id + "')");
>         System.out.println(results.one());
>     } catch (Exception e) {
>         e.printStackTrace();
>     }
>     try {
>         TimeUnit.SECONDS.sleep(3);
>     } catch (InterruptedException e) {
>         e.printStackTrace();
>     }
> }
> System.out.println("done");
> client.close();
> System.exit(0);
> {code}
> After the client has performed a couple of query cycles:
> Restart server1.
> Wait 60 seconds so the reconnect should occur.
> Stop server2.
> Notice that there are no more successful queries; the client never reconnected to server1.
> Start server2.
> Notice that there are still no successful queries.
> The method ConnectionPool.addConnectionIfUnderMaximum is always returning false because opened >= maxPoolSize. In this particular case opened = 10. I believe that open is trying to track the size of the List of connections but is getting out of sync.
> The following diff addresses this problem for this particular case:
> {code:diff}
> diff --git a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
> index 96c151c..81ce81d 100644
> --- a/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
> +++ b/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java
> @@ -326,6 +326,7 @@ final class ConnectionPool {
>      private void definitelyDestroyConnection(final Connection connection) {
>          bin.add(connection);
>          connections.remove(connection);
> +        open.decrementAndGet();
>          if (connection.borrowed.get() == 0 && bin.remove(connection))
>              connection.closeAsync();
> @@ -388,6 +389,8 @@ final class ConnectionPool {
>          // if the host is unavailable then we should release the connections
>          connections.forEach(this::definitelyDestroyConnection);
> +        // there are no connections open
> +        open.set(0);
>          // let the load-balancer know that the host is acting poorly
>          this.cluster.loadBalancingStrategy().onUnavailable(host);
> @@ -413,6 +416,7 @@ final class ConnectionPool {
>              this.cluster.loadBalancingStrategy().onAvailable(host);
>              return true;
>          } catch (Exception ex) {
> +            logger.debug("Failed reconnect attempt on {}", host);
>              if (connection != null) definitelyDestroyConnection(connection);
>              return false;
>          }
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)