Stephen Mallette created TINKERPOP-2813:
-------------------------------------------
Summary: Improve driver usability for cases where
NoHostAvailableException is currently thrown
Key: TINKERPOP-2813
URL: https://issues.apache.org/jira/browse/TINKERPOP-2813
Project: TinkerPop
Issue Type: Improvement
Components: driver
Affects Versions: 3.5.4
Reporter: Stephen Mallette
Assignee: Stephen Mallette
A {{NoHostAvailableException}} occurs in two cases:
1. where the {{Client}} is initialized and a failure occurs on all {{Host}}
instances configured
2. when the {{Client}} attempts to {{chooseConnection()}} to send a request and
all {{Host}} instances configured are marked unavailable.
In the first case, you can get a cause for the failure which is helpful, but
the inadequacy is that you only get the failure of the first {{Host}} to cause
a problem. The second case is a bit worse because there you get no cause in the
exception and it's a "fast fail" in that as soon as the request is sent there
is no pause to see if the {{Host}} comes back online. Moreover, a {{Host}} can
be marked for failure for the infraction of just a single {{Connection}} that
may have just encountered an intermittent network issue, thus quite quickly
killing the entire {{ConnectionPool}} and turning 100s of requests per second
into 100s of {{NoHostAvailableException}} per second. Note that you can also
get an infraction for the pool just being overloaded with requests which may
signal that either the pool or the server is not sized right for the current
workload - in either case, the {{NoHostAvailableException}} is a bit of a harsh
way to deal with that and in any event doesn't quite give the user clues as to
how to deal with it.
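From the caller's side, the difference between the two cases shows up in what {{getCause()}} returns. The following is only a minimal sketch of walking a cause chain when logging such a failure. {{NoHostAvailableSketch}}, its message text, and the {{describe}} helper are hypothetical stand-ins so that the example compiles without the driver on the classpath; they are not the driver's own classes.

```java
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.List;

class CauseChainSketch {

    // Hypothetical stand-in for the driver's exception (assumption, not the real class).
    static final class NoHostAvailableSketch extends RuntimeException {
        // Mirrors case 2: no cause attached.
        NoHostAvailableSketch() {
            super("no host available");
        }

        // Mirrors case 1: the failure of one host is attached as the cause.
        NoHostAvailableSketch(Throwable cause) {
            super("no host available", cause);
        }
    }

    // Collects one line per throwable in the chain, outermost first.
    static List<String> describe(Throwable t) {
        List<String> lines = new ArrayList<>();
        for (Throwable c = t; c != null; c = c.getCause()) {
            lines.add(c.getClass().getSimpleName() + ": " + c.getMessage());
        }
        return lines;
    }

    public static void main(String[] args) {
        // Case 1: a cause is present, so there is something to debug with.
        describe(new NoHostAvailableSketch(new ConnectException("refused")))
                .forEach(System.out::println);

        // Case 2: no cause, so the chain contains only the exception itself.
        describe(new NoHostAvailableSketch()).forEach(System.out::println);
    }
}
```

In the second case the chain has a single entry, which is the debugging gap described above.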
All in all, this situation makes {{NoHostAvailableException}} hard to debug.
This ticket is meant to help smooth some of these problems. Initial thoughts
for improvements include better logging, ensuring that
{{NoHostAvailableException}} is not thrown without a cause, preferring more
specific exceptions in the first place to {{NoHostAvailableException}}, getting
rid of "fast fails" in favor of longer pauses to see if a host can recover and
taking a softer stance on when a {{Host}} is actually considered "unavailable".
I expect to implement this without breaking API changes. Exceptions may shift
around a bit, but I will try to keep that to a minimum.
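To make three of these ideas concrete (a softer availability rule, a bounded pause instead of a fast fail, and an exception that always carries a cause), here is a rough sketch. It is not the driver's implementation: {{HostHealth}}, {{HostUnavailableSketch}}, {{awaitAvailable}}, the consecutive-failure threshold and the polling wait are all assumptions made for illustration.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

class HostRecoverySketch {

    // Hypothetical exception that always carries the last recorded failure as its cause.
    static final class HostUnavailableSketch extends RuntimeException {
        HostUnavailableSketch(String message, Throwable cause) {
            super(message, cause);
        }
    }

    // Hypothetical tracker: a host counts as unavailable only after
    // `threshold` consecutive failures, not after a single infraction.
    static final class HostHealth {
        private final int threshold;
        private int consecutiveFailures;
        private Throwable lastFailure;

        HostHealth(int threshold) {
            this.threshold = threshold;
        }

        synchronized void recordFailure(Throwable t) {
            consecutiveFailures++;
            lastFailure = t;
        }

        synchronized void recordSuccess() {
            consecutiveFailures = 0;
            lastFailure = null;
        }

        synchronized boolean isAvailable() {
            return consecutiveFailures < threshold;
        }

        synchronized Throwable lastFailure() {
            return lastFailure;
        }
    }

    // Polls for up to maxWaitMillis before giving up, instead of failing immediately.
    static void awaitAvailable(HostHealth host, long maxWaitMillis, long pollMillis) {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (!host.isAvailable()) {
            if (System.nanoTime() - deadline >= 0) {
                throw new HostUnavailableSketch(
                        "host still unavailable after " + maxWaitMillis + " ms", host.lastFailure());
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                throw new HostUnavailableSketch("interrupted while waiting for host", host.lastFailure());
            }
        }
    }

    public static void main(String[] args) {
        HostHealth host = new HostHealth(3);
        host.recordFailure(new IOException("connection reset"));
        // A single failure stays below the threshold, so this returns at once.
        awaitAvailable(host, 100, 10);
        System.out.println("available after one failure: " + host.isAvailable());
    }
}
```

The threshold and wait values here are arbitrary. A real change would also need something that flips the host back to available, such as a reconnect attempt calling {{recordSuccess()}}.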