[
https://issues.apache.org/jira/browse/HBASE-27487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bryan Beaudreault updated HBASE-27487:
--------------------------------------
Fix Version/s: 2.6.0
> Slow meta can create pathological feedback loop with multigets
> --------------------------------------------------------------
>
> Key: HBASE-27487
> URL: https://issues.apache.org/jira/browse/HBASE-27487
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.5.1, 2.4.15
> Reporter: Bryan Beaudreault
> Assignee: Briana Augenreich
> Priority: Major
> Fix For: 2.6.0, 2.4.16, 2.5.3
>
>
> This only affects the Table implementation in 2.x releases.
> h4. Call stack
> When Table.batch is called, an AsyncProcessTask is created with
> SubmittedRows.ALL and sent to AsyncProcess.submit(). For the ALL case, this
> goes to submitAll, which creates an AsyncRequestFutureImpl and then calls
> groupAndSendMultiAction on it.
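> For reference, the client-side entry point for this path is an ordinary
> multiget through the Table API, for example something like the following
> (table name and row keys are placeholders):
> {code:java}
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.hbase.HBaseConfiguration;
> import org.apache.hadoop.hbase.TableName;
> import org.apache.hadoop.hbase.client.Connection;
> import org.apache.hadoop.hbase.client.ConnectionFactory;
> import org.apache.hadoop.hbase.client.Get;
> import org.apache.hadoop.hbase.client.Table;
> import org.apache.hadoop.hbase.util.Bytes;
>
> public class MultiGetExample {
>   public static void main(String[] args) throws Exception {
>     Configuration conf = HBaseConfiguration.create();
>     try (Connection conn = ConnectionFactory.createConnection(conf);
>          Table table = conn.getTable(TableName.valueOf("example_table"))) {
>       List<Get> gets = new ArrayList<>();
>       for (int i = 0; i < 1000; i++) {
>         gets.add(new Get(Bytes.toBytes("row-" + i)));
>       }
>       Object[] results = new Object[gets.size()];
>       // Internally this builds an AsyncProcessTask with SubmittedRows.ALL and
>       // hands it to AsyncProcess.submit(), per the call stack described above.
>       table.batch(gets, results);
>     }
>   }
> }
> {code}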
> When an AsyncRequestFutureImpl is created, a RetryingTimeTracker is created
> and started as the last step of the constructor.
> In groupAndSendMultiAction, the first thing that has to happen is resolving
> the HRegionLocation for every action in the batch. This is currently done
> sequentially, with no timeout applied to the overall batch completion.
> Once all actions have been resolved, they are passed into sendMultiAction,
> which creates a SingleServerRequestRunnable. When that runnable is executed,
> the first thing it does is create a new MultiServerCallable using the same
> RetryingTimeTracker that was started back in the constructor.
> That callable extends CancellableRegionServerCallable, and its call method
> first checks tracker.getRemainingTime() before doing any actual work. If the
> remaining time has been exhausted, it throws an exception.
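> To make the shape of that sequence concrete, here is a toy model of the flow
> (not HBase code; the class and method names below are simplified stand-ins
> for RetryingTimeTracker, the location-resolution loop, and the check in
> CancellableRegionServerCallable.call()):
> {code:java}
> import java.util.List;
> import java.util.concurrent.TimeUnit;
> import java.util.stream.Collectors;
> import java.util.stream.IntStream;
>
> // Toy model only -- stand-in names, not the real HBase internals.
> public class TrackerModel {
>
>   /** Stand-in for RetryingTimeTracker: started once, consulted much later. */
>   static class TimeTracker {
>     private final long startNs = System.nanoTime();
>     private final long operationTimeoutMs;
>
>     TimeTracker(long operationTimeoutMs) {
>       this.operationTimeoutMs = operationTimeoutMs;
>     }
>
>     long getRemainingTimeMs() {
>       long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - startNs);
>       return operationTimeoutMs - elapsedMs;
>     }
>   }
>
>   static void submitBatch(List<String> rows, long operationTimeoutMs)
>       throws Exception {
>     // The tracker starts ticking at construction time, like the tracker
>     // created in the AsyncRequestFutureImpl constructor.
>     TimeTracker tracker = new TimeTracker(operationTimeoutMs);
>
>     // Sequential "location resolution": nothing in this loop consults the
>     // tracker, so a slow meta (simulated by sleep) can eat the whole budget.
>     for (String row : rows) {
>       Thread.sleep(5); // pretend each uncached lookup takes 5ms
>     }
>
>     // Only now, in the stand-in for CancellableRegionServerCallable.call(),
>     // is the remaining time checked -- and it may already be negative.
>     if (tracker.getRemainingTimeMs() <= 0) {
>       throw new RuntimeException(
>           "operation timeout exceeded before any multi RPC was sent");
>     }
>     // ... the actual multi RPC would happen here ...
>   }
>
>   public static void main(String[] args) throws Exception {
>     List<String> rows = IntStream.range(0, 100)
>         .mapToObj(i -> "row-" + i).collect(Collectors.toList());
>     // 100 rows * ~5ms of resolution blows through a 200ms operation timeout.
>     submitBatch(rows, 200);
>   }
> }
> {code}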
> h4. Problem
> If meta is overloaded, or you send a sufficiently large batch of actions,
> resolving the HRegionLocations (which happens sequentially) may take a while.
> Depending on the operation timeout configured for the client, that delay
> alone may exceed the timeout before execution ever reaches
> CancellableRegionServerCallable.call().
> When the timeout is exceeded there, a DoNotRetryIOException is thrown. This
> is considered a cache-clearing exception, so any locations that were slowly
> resolved earlier in the chain are thrown away. With enough concurrent
> requests, this can create a feedback loop that is effectively impossible to
> recover from: the cleared locations force every subsequent batch back to the
> already overloaded meta.
> h4. Potential Solutions
> # Change the thrown exception type from DoNotRetryIOException to something
> more appropriate for the actual error (some sort of timeout exception). We'd
> have to make that exception a "special" exception in ClientExceptionsUtil so
> that it doesn't clear the cache.
> # Make DoNotRetryIOException itself a "special" exception. The point of
> clearing the cache is to make retries more likely to succeed when the
> failure was related to a wrong location. DoNotRetryIOException explicitly is
> not supposed to be retried, so arguably it shouldn't clear the cache either.
> There are many usages of this exception, though, so it's hard to say for
> sure that this would be universally safe.
> # Reset the RetryingTimeTracker after resolving region locations.
> I think I'd lean towards option 1, because it seems wrong to say "don't
> retry" in this case. If anything, a retry should be more likely to succeed,
> because the locations will already have been resolved by then.
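> As a sketch only (the class name is hypothetical, and the exact wiring into
> ClientExceptionsUtil's "special" exception handling still needs to be worked
> out), option 1 could look something like this:
> {code:java}
> import org.apache.hadoop.hbase.DoNotRetryIOException;
>
> // Hypothetical exception type for option 1. Subclassing DoNotRetryIOException
> // keeps the existing "don't retry this attempt" semantics, while the distinct
> // type would let ClientExceptionsUtil treat it as "special" so that hitting
> // the operation timeout no longer clears cached region locations.
> public class OperationTimeoutExceededException extends DoNotRetryIOException {
>   public OperationTimeoutExceededException(String message) {
>     super(message);
>   }
> }
> {code}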
> Whichever option we choose, I think we should additionally check the timeout
> in groupAndSendMultiAction after resolving each region location. That
> process should not be allowed to exceed the operation timeout, but today it
> can blow far past it before the timeout is finally checked, almost
> incidentally, at the end.
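> In terms of the toy model above (stand-in names again, not the actual
> AsyncProcess code), that extra check would amount to something like:
> {code:java}
> // Consult the tracker while resolving locations, instead of only after the
> // whole batch has been grouped and handed to the per-server callables.
> for (String row : rows) {
>   if (tracker.getRemainingTimeMs() <= 0) {
>     throw new RuntimeException(
>         "operation timeout exceeded while resolving region locations");
>   }
>   Thread.sleep(5); // stand-in for an uncached location lookup against meta
> }
> {code}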
--
This message was sent by Atlassian Jira
(v8.20.10#820010)