Bryan Beaudreault created HBASE-27487:
-----------------------------------------
Summary: Slow meta can create pathological feedback loop with
multigets
Key: HBASE-27487
URL: https://issues.apache.org/jira/browse/HBASE-27487
Project: HBase
Issue Type: Improvement
Affects Versions: 2.4.15, 2.5.1
Reporter: Bryan Beaudreault
This only affects the Table implementation in 2.x releases.
h4. Call stack
When Table.batch is called, an AsyncProcessTask is created with
SubmittedRows.ALL, which is sent to AsyncProcess.submit(). For the ALL case,
this goes to submitAll which creates an AsyncRequestFutureImpl and then calls
groupAndSendMultiAction on that.
When a AsyncRequestFutureImpl is created, a RetryingTimeTracker is created and
started as the last step of the constructor.
In groupAndSendMultiAction, the first thing that has to be done is resolve the
HRegionLocation for every action in the batch. This is currently done
sequentially, with no timeout on the overall batch completion.
Once all actions have been resolved, they are passed into sendMultiAction which
creates a SingleServerRequestRunnable. Once that runnable is executed, the
first thing it does is create a new MultiServerCallable using the same
RetryingTimeTracker that was originally created way back.
That callable extends CancellableRegionServerCallable, and the call method
first checks the tracker.getRemainingTime() before actually doing any work. If
exceeded, it throws an exception.
h4. Problem
If meta is overloaded, or you send any sufficiently large batch of actions, the
resolving of HRegionLocations (which happens sequentially) may take a while.
Depending on the operation timeout configured for the client, that duration may
already exceed that timeout before even reaching the
CancellableRegionServerCallable.call().
When the timeout is exceeded there, a DoNotRetryIOException is thrown. This is
considered a cache clearing exception, so any locations that may have been
slowly resolved earlier up the chain will be thrown away.
If done with enough concurrency, this can create a feedback loop that is
impossible to recover from.
h4. Potential Solutions
# Change the thrown exception type from DoNotRetryIOException to something
more appropriate for the actual error (some sort of timeout exception). We'd
have to make that exception a "special" exception in ClientExceptionUtil so
that it doesn't clear the cache.
# Make DoNotRetryIOException itself a "special" exception. The point of
clearing cache is to make retries more likely to succeed if the failure was
related to a wrong location. But DoNotRetryIOException explicitly is not
supposed to be retried, so you might think it shouldn't clear the cache as
well. There are many usages of this exception, so it's hard to say for sure
that this would be universally safe.
# Reset the RetryingTimeTracker after resolving region locations.
I think I'd lean towards option 1, because it seems odd to say "don't retry in
that case". In fact, retrying should be more likely to succeed because
locations will have been resolved.
Whichever we choose, I think we should additionally check the timeout in
groupAndSendMultiAction after resolving each region location. We should not
allow that process to exceed timeouts and currently it can way more than exceed
them before finally being checked incidentally at the end.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)