[
https://issues.apache.org/jira/browse/TINKERPOP-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17213265#comment-17213265
]
Borys Pierov commented on TINKERPOP-2445:
-----------------------------------------
This is very important for the project I work on. The service has to
communicate with a large number of graph database instances (AWS Neptune, FWIW),
so many that caching "cluster" objects is not feasible.
We tried creating cluster objects on the fly, but the current overhead is
between 600ms and 1s with the following code:
{code:java}
Cluster connectCluster(String endpoint, int port) {
    Cluster cluster = Cluster.build()
            .addContactPoint(endpoint)
            .port(port)
            .enableSsl(true)
            // a single connection per cluster keeps start-up work to a minimum
            .minConnectionPoolSize(1)
            .maxConnectionPoolSize(1)
            .channelizer(SigV4WebSocketChannelizer.class)
            .serializer(Serializers.GRAPHBINARY_V1D0)
            .maxWaitForConnection((int) TimeUnit.SECONDS.toMillis(2))
            .reconnectInterval(500)
            .maxContentLength(1024000)
            .create();
    // force it to actually establish a connection
    cluster.connect();
    return cluster;
}
{code}
Besides that, we found that even after the above, the very first request
takes much longer than any of the subsequent ones. So we started "priming" the
connection by executing a bogus traversal like the one below:
{code}
GraphTraversalSource createCluster(String endpoint, int port) {
    Cluster cluster = connectCluster(endpoint, port);
    GraphTraversalSource g = AnonymousTraversalSource.traversal()
            .withRemote(DriverRemoteConnection.using(cluster));
    primeCluster(g);
    return g;
}

void primeCluster(GraphTraversalSource g) {
    // bogus traversal; the result is discarded, only the round trip matters
    // note: next() throws NoSuchElementException if the graph has no vertices
    g.V().limit(1).next();
}
{code}
Here is a typical tracing sample of the above code execution. There are
outliers that take longer than that, but on average that's what we see for
"connecting" and "priming".
!screenshot-1.png!
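For anyone who wants to reproduce the split between "connecting" and "priming" without a tracing setup, here is a minimal sketch of a timing helper. It uses only the JDK; the names PhaseTimer and Timed are hypothetical and are not part of the driver or of our service code.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Hypothetical helper (not part of gremlin-driver): times a single phase
// such as "connect" or "prime" and hands back both the result and the duration.
final class PhaseTimer {

    /** Value produced by the timed phase plus the elapsed wall-clock time. */
    static final class Timed<T> {
        final T value;
        final long elapsedMillis;

        Timed(T value, long elapsedMillis) {
            this.value = value;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Runs the phase once and measures it with the monotonic clock. */
    static <T> Timed<T> time(Supplier<T> phase) {
        long start = System.nanoTime();
        T value = phase.get();
        long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        return new Timed<>(value, elapsed);
    }

    private PhaseTimer() {
    }
}
```

With the methods above, usage would look roughly like PhaseTimer.time(() -> connectCluster(endpoint, port)) for the connect phase, and PhaseTimer.time(() -> { primeCluster(g); return null; }) for the priming phase.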
From our perspective, we just need a way to communicate with an arbitrary
number of clusters without that much overhead, whether through optimizing the
WSS client or by being able to execute traversal bytecode via HTTPS.
> Speed up client initialization
> ------------------------------
>
> Key: TINKERPOP-2445
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2445
> Project: TinkerPop
> Issue Type: Improvement
> Components: driver
> Affects Versions: 3.5.0, 3.4.8
> Reporter: Divij Vaidya
> Priority: Minor
> Attachments: screenshot-1.png
>
>
> The current Java client has a lot of initialization overhead. Some of the
> things we could do to trim the fat are:
> 1. Parallelize the connection creation inside a connection pool, i.e. make
> [this for
> loop|https://github.com/apache/tinkerpop/blob/3.4-dev/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/ConnectionPool.java]
> parallel.
> 2. Do not create a bootstrap [for every
> connection|https://github.com/apache/tinkerpop/blob/3.4-dev/gremlin-driver/src/main/java/org/apache/tinkerpop/gremlin/driver/Connection.java#L111].
> A single bootstrap could be reused.
> 3. Remove SASL Handler from the pipeline after negotiation is complete for a
> connection.
> 4. Do not initialize SASL Handler if not required.
> As part of this task, we should profile the start-up time and identify other
> places where we could optimize the start-up time.
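As an illustration of item 1 in the description above (parallelizing connection creation inside a pool), here is a minimal, JDK-only sketch. It is not the driver's ConnectionPool code: the class ParallelPoolInit, the method openAll, and the connectionFactory parameter are all hypothetical stand-ins, and the factory represents whatever blocking call opens one connection.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Supplier;

// Hypothetical sketch (not gremlin-driver code): opens `size` connections
// concurrently instead of one after another in a for loop.
final class ParallelPoolInit {

    /**
     * Calls connectionFactory `size` times in parallel and waits for all results.
     * `size` must be at least 1. If any call fails, join() rethrows the failure
     * wrapped in a CompletionException.
     */
    static <C> List<C> openAll(int size, Supplier<C> connectionFactory) {
        ExecutorService executor = Executors.newFixedThreadPool(size);
        try {
            // Submit every connection attempt before waiting on any of them.
            List<CompletableFuture<C>> pending = new ArrayList<>(size);
            for (int i = 0; i < size; i++) {
                pending.add(CompletableFuture.supplyAsync(connectionFactory, executor));
            }
            // Collect the results in submission order.
            List<C> connections = new ArrayList<>(size);
            for (CompletableFuture<C> future : pending) {
                connections.add(future.join());
            }
            return connections;
        } finally {
            executor.shutdown();
        }
    }

    private ParallelPoolInit() {
    }
}
```

With this shape, the total start-up time is bounded by the slowest single connection attempt rather than by the sum of all of them. A real implementation would also have to close the connections that did succeed when one of the attempts fails, which this sketch leaves out.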