Stephen Mallette created TINKERPOP-2813:
-------------------------------------------

             Summary: Improve driver usability for cases where 
NoHostAvailableException is currently thrown
                 Key: TINKERPOP-2813
                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2813
             Project: TinkerPop
          Issue Type: Improvement
          Components: driver
    Affects Versions: 3.5.4
            Reporter: Stephen Mallette
            Assignee: Stephen Mallette


A {{NoHostAvailableException}} occurs in two cases:

1. where the {{Client}} is initialized and a failure occurs on all {{Host}} 
instances configured
2. when the {{Client}} attempts to {{chooseConnection()}} to send a request and 
all {{Host}} instances configured are marked unavailable.

In the first case, you can get a cause for the failure which is helpful, but 
the inadequacy is that you only get the failure of the first {{Host}} to cause 
a problem. The second case is a bit worse because there you get no cause in the 
exception and it's a "fast fail" in that as soon as the request is sent there 
is no pause to see if the {{Host}} comes back online. Moreover, a {{Host}} can 
be marked for failure for the infraction of just a single {{Connection}} that 
may have just encountered a intermittent network issue, thus quite quickly 
killing the entire {{ConnectionPool}} and turning 100s or requests per second 
into 100s of {{NoHostAvailableException}} per second. Note that you can also 
get an infraction for the pool just being overloaded with requests which may 
signal that either the pool or server not being sized right for the current 
workload - in either case, the {{NoHostAvailableException}} is a bit of a harsh 
way to deal with that and in any event doesn't quite give the user clues as to 
how to deal with it.

All in all, this situation makes {{NoHostAvailableException}} hard to debug. 
This ticket is meant to help smooth some of these problems. Initial thoughts 
for improvements include better logging, ensuring that 
{{NoHostAvailableException}} is not thrown without a cause, preferring more 
specific exceptions in the fist place to {{NoHostAvailableException}}, getting 
rid of "fast fails" in favor of longer pauses to see if a host can recover and 
taking a softer stance on when a {{Host}} is actually considered "unavailable".

Expecting to implement this without breaking API changes, though exceptions may 
shift around a bit, but will try to keep those to a minimum.

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to