Josh Elser created ACCUMULO-3268:
------------------------------------
Summary: HoldTimeoutException is poorly propagated to clients
Key: ACCUMULO-3268
URL: https://issues.apache.org/jira/browse/ACCUMULO-3268
Project: Accumulo
Issue Type: Improvement
Components: client
Affects Versions: 1.6.1
Reporter: Josh Elser
Priority: Critical
Fix For: 1.6.2, 1.7.0
A 6-node cluster was running randomwalk when the MultiTable module failed. A
BatchWriter was trying to add a new Mutation to a table in
{{o.a.a.test.randomwalk.multitable.Write}}. The call to {{addMutations}} failed
with a MutationsRejectedException whose only information was that an exception
had occurred on the server.
In actuality, adding this mutation triggered a flush, which tried to ship the
mutation over to a tabletserver. The tabletserver hosting the tablet for that
mutation was under load but still responsive. The hold time was exceeded for
this tserver, but all the client sees is that there was *some* exception on
this server.
If the client actually *knew* that commits were being held, it could correctly
back off (sleep) and retry the mutations written since the last flush. Right
now, the client can't really do anything. Additionally, being unable to get the
mutations that were buffered since the last flush is sub-par, but that can be
worked around.
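
To make the desired client behavior concrete, here is a minimal, self-contained
sketch of the back-off-and-retry loop described above. It does not use the
Accumulo API: {{CommitsHeldException}}, {{MutationSink}}, and
{{writeWithBackoff}} are hypothetical names invented for illustration, standing
in for whatever distinguishable signal the client would receive if the hold
timeout were propagated. It also assumes the caller has kept its own copy of the
mutations written since the last flush, which is the workaround mentioned above.

```java
import java.util.List;

public class HoldAwareRetry {

    /** Hypothetical: a client-visible signal that the tserver is holding commits. */
    static class CommitsHeldException extends RuntimeException {
        CommitsHeldException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in for "send this batch of mutations to the tserver". */
    interface MutationSink<M> {
        void write(List<M> batch);
    }

    /**
     * Writes the batch, sleeping and retrying while commits are reported as held.
     * The sleep doubles after each failed attempt. Returns the number of attempts
     * used; rethrows the last CommitsHeldException once maxAttempts is reached.
     */
    static <M> int writeWithBackoff(MutationSink<M> sink, List<M> sinceLastFlush,
                                    int maxAttempts, long initialSleepMs) {
        long sleepMs = initialSleepMs;
        for (int attempt = 1; ; attempt++) {
            try {
                sink.write(sinceLastFlush);
                return attempt;
            } catch (CommitsHeldException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and let the caller decide what to do
                }
                try {
                    Thread.sleep(sleepMs);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    throw new RuntimeException("interrupted while backing off", ie);
                }
                sleepMs *= 2; // exponential back-off
            }
        }
    }
}
```

The key point is that the retry is only safe because the failure is identifiable
as "commits held": with the generic server-side exception the client sees today,
it cannot tell this case apart from errors where retrying would be wrong.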
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)