[ https://issues.apache.org/jira/browse/TINKERPOP-2369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105380#comment-17105380 ]
Stephen Mallette commented on TINKERPOP-2369:
---------------------------------------------
Thanks for creating this issue. Of the two solutions you described, I seem to
recall looking into this one before:
> by adding a listener for the close frame being sent to the underlying
> channel to replace the connection.
and not quite getting it to work for some reason. Perhaps there's a JIRA issue
somewhere that would remind me of what happened.
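
For reference, the listener approach quoted above might be sketched roughly as
follows. This is a minimal, self-contained illustration and not the driver's
code: SketchConnection, SketchConnectionPool and CloseListenerDemo are
hypothetical names, and a CompletableFuture stands in for the channel's close
notification (in Netty, which the driver builds on, the analogous hook is
Channel.closeFuture().addListener(...)).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a pooled connection; NOT the driver's Connection class.
final class SketchConnection {
    final int id;
    // Completes when the underlying channel closes (assumed analog of a channel close future).
    final CompletableFuture<Void> closeFuture = new CompletableFuture<>();

    SketchConnection(int id) { this.id = id; }

    // Simulates the server closing the websocket, e.g. when credentials expire.
    void closedByServer() { closeFuture.complete(null); }

    boolean isOpen() { return !closeFuture.isDone(); }
}

// Hypothetical pool that replaces a connection as soon as its channel closes.
final class SketchConnectionPool {
    private final List<SketchConnection> connections = new CopyOnWriteArrayList<>();
    private final AtomicInteger nextId = new AtomicInteger();

    SketchConnectionPool(int size) {
        for (int i = 0; i < size; i++) addNewConnection();
    }

    private void addNewConnection() {
        SketchConnection c = new SketchConnection(nextId.getAndIncrement());
        // Register the close listener up front, so replacement does not wait
        // for a client request to fail on the dead connection.
        c.closeFuture.thenRun(() -> replace(c));
        connections.add(c);
    }

    private void replace(SketchConnection dead) {
        // remove() returns false if the connection was already replaced,
        // which keeps the pool size stable if the callback fires twice.
        // A real pool would likely open the new connection off the I/O thread.
        if (connections.remove(dead)) addNewConnection();
    }

    List<SketchConnection> snapshot() { return new ArrayList<>(connections); }
}

final class CloseListenerDemo {
    public static void main(String[] args) {
        SketchConnectionPool pool = new SketchConnectionPool(2);
        pool.snapshot().get(0).closedByServer();
        for (SketchConnection c : pool.snapshot()) {
            System.out.println("connection " + c.id + " open=" + c.isOpen());
        }
    }
}
```

The point of the sketch is only that the replacement is triggered by the close
event itself rather than by a later failed write; whether that is workable in
the actual driver is the open question here.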
> Connections in ConnectionPool are not replaced in background when underlying
> channel is closed
> ----------------------------------------------------------------------------------------------
>
> Key: TINKERPOP-2369
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2369
> Project: TinkerPop
> Issue Type: Bug
> Components: driver
> Affects Versions: 3.4.1
> Reporter: Johannes Carlsen
> Priority: Major
>
> Hi TinkerPop team!
>
> We are using the Gremlin Java Driver to connect to an Amazon Neptune cluster.
> We are using the IAM authentication feature provided by Neptune, which means
> that individual websocket connections are closed by the server every 36
> hours, when their credentials expire. The current implementation of the
> driver does not handle this situation well, as the Connection whose channel
> has been closed by the server remains in the ConnectionPool. The connection
> is only reported as dead and replaced when it is later chosen by the
> LoadBalancingStrategy to serve a client request, which inevitably fails when
> the connection attempts to write to the closed channel.
> A fix for this bug would cause the connection pool to be automatically
> refreshed in the background by either the keep-alive mechanism, which should
> replace a connection if a keep-alive request fails, or by adding a listener
> for the close frame being sent to the underlying channel to replace the
> connection. Without a fix, the only way to recover from a stale connection is
> to retry the request at the cluster level, which will allow the request to be
> directed to a different connection.
> I noticed a PR out for the .NET client to fix this behavior:
> [https://github.com/apache/tinkerpop/pull/1279]. We are hoping for something
> similar in the Gremlin Java Driver.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)