[
https://issues.apache.org/jira/browse/TINKERPOP-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17300204#comment-17300204
]
ASF GitHub Bot commented on TINKERPOP-2531:
-------------------------------------------
FlorianHockmann commented on pull request #1404:
URL: https://github.com/apache/tinkerpop/pull/1404#issuecomment-797394617
Hey @radu-iviniciu and thanks for your contribution to TinkerPop!
The functionality to replace closed connections automatically in the
background was only added recently (see
[TINKERPOP-2288](https://issues.apache.org/jira/browse/TINKERPOP-2288)). So,
there are definitely cases that it doesn't cover yet. The problem you describe
in TINKERPOP-2531 is one of them: the current retry logic isn't enough when the
server is offline for longer than the retry period, because we don't start
another repair attempt afterwards. We originally thought about adding a
scheduled repair job that checks the pool every minute or so and repairs it if
necessary, but we didn't implement it because it would add a lot of complexity
to the driver that I would like to avoid if we don't really need it.
I really like your solution to this problem: it's much simpler than a
scheduled repair but still solves the problem. It ensures that we try again to
repair the pool when a new request is about to be sent and the previous attempt
failed. It also shouldn't add unnecessary cost, since we already have a
mechanism that allows only one repair operation to run at a time.
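To make the idea concrete, here is a minimal, language-neutral sketch in
Python. It is not the Gremlin.Net code: the class, method, and exception names
are made up for illustration, and the repair runs synchronously to keep the
example short. It only shows the two properties discussed above: a request
that finds the pool below its target size triggers a new repair attempt, and a
flag ensures that at most one repair runs at a time.

```python
import threading


class ServerUnavailableError(Exception):
    """No usable connection is available (illustrative name)."""


class ConnectionPool:
    """Toy pool. `connect` is any callable that returns a connection
    or raises ConnectionError (an assumption made for this sketch)."""

    def __init__(self, connect, size):
        self._connect = connect
        self._size = size
        self._connections = []
        self._lock = threading.Lock()
        self._repair_running = False

    def _repair(self):
        # Allow only one repair at a time; concurrent callers skip it.
        with self._lock:
            if self._repair_running:
                return
            self._repair_running = True
        try:
            while len(self._connections) < self._size:
                conn = self._connect()
                with self._lock:
                    self._connections.append(conn)
        except ConnectionError:
            # Server still unreachable: give up for now. The next request
            # starts a fresh attempt instead of leaving the pool empty.
            pass
        finally:
            with self._lock:
                self._repair_running = False

    def get_connection(self):
        with self._lock:
            needs_repair = len(self._connections) < self._size
        if needs_repair:
            self._repair()
        with self._lock:
            if not self._connections:
                raise ServerUnavailableError("no connection available")
            return self._connections[0]
```

With this shape, a request made while the server is down fails with
`ServerUnavailableError`, and the first request after the server is back
refills the pool, without any scheduled background job.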
I see only one thing missing in this PR: Could you please add a test case
for the problem this is trying to solve? We already have a few
[`ConnectionPoolTests`](https://github.com/apache/tinkerpop/blob/3.4-dev/gremlin-dotnet/test/Gremlin.Net.UnitTest/Driver/ConnectionPoolTests.cs)
that use mocks to create the scenarios they want to test. If you have problems
creating a test case for this, I can also give it a try.
> Gremlin .NET driver ConnectionPool can remain without connections if server
> is down for 1-2 minutes
> ---------------------------------------------------------------------------------------------------
>
> Key: TINKERPOP-2531
> URL: https://issues.apache.org/jira/browse/TINKERPOP-2531
> Project: TinkerPop
> Issue Type: Bug
> Affects Versions: 3.4.10
> Reporter: Radu Iviniciu
> Priority: Major
>
> We are using Gremlin .NET client to connect to AWS Neptune.
> If the server is down for even just 1-2 minutes (e.g. an instance reboot)
> the connection pool can remain without any healthy connections permanently
> even after the server comes back up again.
> Tracked this down to here:
> https://github.com/apache/tinkerpop/blob/60bfc90ce43567d609b1165989c1c74ce825109b/gremlin-dotnet/src/Gremlin.Net/Driver/ConnectionPool.cs#L154
> If the server was not ready yet and the connections were not restored as part
> of the previous TryGetAvailableConnection call, the connection pool remains
> empty indefinitely: no healthy connections exist even after the server is
> back up.
> Is this longer-ish downtime a use case the client should handle? Or is it
> expected behavior, and we should let the caller handle the
> ServerUnavailableException and potentially recreate the GremlinClient
> altogether? What do you guys think? If this is something the client should
> handle, I can take a stab at it and put up a PR.
> Thank you.