[ https://issues.apache.org/jira/browse/TINKERPOP-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17647689#comment-17647689 ]
ASF GitHub Bot commented on TINKERPOP-2813:
-------------------------------------------

kenhuuu commented on PR #1882:
URL: https://github.com/apache/tinkerpop/pull/1882#issuecomment-1352051659

   LGTM. I've tested this and it greatly improves the handling of temporary errors.

> Improve driver usability for cases where NoHostAvailableException is
> currently thrown
> -------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2813
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2813
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: driver
>    Affects Versions: 3.5.4
>            Reporter: Stephen Mallette
>            Assignee: Stephen Mallette
>            Priority: Blocker
>
> A {{NoHostAvailableException}} occurs in two cases:
> 1. where the {{Client}} is initialized and a failure occurs on all {{Host}}
> instances configured
> 2. when the {{Client}} attempts to {{chooseConnection()}} to send a request
> and all {{Host}} instances configured are marked unavailable.
> In the first case, you can get a cause for the failure which is helpful, but
> the inadequacy is that you only get the failure of the first {{Host}} to
> cause a problem. The second case is a bit worse because there you get no
> cause in the exception and it's a "fast fail" in that as soon as the request
> is sent there is no pause to see if the {{Host}} comes back online. Moreover,
> a {{Host}} can be marked for failure for the infraction of just a single
> {{Connection}} that may have just encountered an intermittent network issue,
> thus quite quickly killing the entire {{ConnectionPool}} and turning 100s of
> requests per second into 100s of {{NoHostAvailableException}} per second.
> Note that you can also get an infraction for the pool just being overloaded
> with requests, which may signal that either the pool or the server is not sized
> right for the current workload - in either case, the
> {{NoHostAvailableException}} is a bit of a harsh way to deal with that and in
> any event doesn't quite give the user clues as to how to deal with it.
> All in all, this situation makes {{NoHostAvailableException}} hard to debug.
> This ticket is meant to help smooth some of these problems. Initial thoughts
> for improvements include better logging, ensuring that
> {{NoHostAvailableException}} is not thrown without a cause, preferring more
> specific exceptions in the first place to {{NoHostAvailableException}},
> getting rid of "fast fails" in favor of longer pauses to see if a host can
> recover and taking a softer stance on when a {{Host}} is actually considered
> "unavailable".
> Expecting to implement this without breaking API changes, though exceptions
> may shift around a bit, but will try to keep those to a minimum.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
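
Two of the ideas in the description above, replacing the "fast fail" with a bounded pause and never throwing without a cause, can be illustrated with a minimal sketch. This is not the code from PR #1882 and does not use the TinkerPop driver API: HostWaitSketch, chooseWithWait, NoHostAvailableSketchException and the maxWaitMillis/pollMillis parameters are all hypothetical names chosen for illustration. The sketch retries a caller-supplied attempt until a deadline passes, then throws with the last failure as the cause and the earlier failures attached as suppressed exceptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch only; not the TinkerPop driver's implementation. */
final class HostWaitSketch {

    /** Stand-in for the driver's exception; it can only be built with a cause. */
    public static class NoHostAvailableSketchException extends RuntimeException {
        public NoHostAvailableSketchException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /**
     * Calls attempt until it succeeds or maxWaitMillis has elapsed, sleeping
     * pollMillis between tries instead of failing on the first error.
     */
    public static <T> T chooseWithWait(Supplier<T> attempt, long maxWaitMillis, long pollMillis) {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        final List<RuntimeException> failures = new ArrayList<>();
        while (true) {
            try {
                return attempt.get();
            } catch (RuntimeException e) {
                failures.add(e); // remember every failure, not just the first
            }
            if (System.nanoTime() - deadline >= 0) {
                break; // waited long enough; give up
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve interrupt status
                break;
            }
        }
        // At least one attempt always runs, so failures is never empty here.
        final RuntimeException last = failures.get(failures.size() - 1);
        final NoHostAvailableSketchException ex = new NoHostAvailableSketchException(
                "No host became available within " + maxWaitMillis + " ms", last);
        for (int i = 0; i < failures.size() - 1; i++) {
            ex.addSuppressed(failures.get(i)); // earlier failures stay visible
        }
        throw ex;
    }
}
```

A caller would pass whatever "pick a connection or throw" logic it has as the attempt. A real implementation would likely cap the number of recorded failures and use the driver's own scheduling rather than Thread.sleep.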