briaugenreich opened a new pull request, #4900:
URL: https://github.com/apache/hbase/pull/4900

   This only affects the Table implementation in 2.x releases. 
   
   This changes the exception thrown and the failure handling when a multiget 
exceeds its operation timeout, so that the timeout no longer clears the meta 
cache, which could create a feedback loop that is impossible to recover from. 
We skip the cache clear and simply mark each get as failed. 
   
   
   If meta is overloaded, or you send a sufficiently large batch of actions, 
resolving the HRegionLocations (which happens sequentially) may take a while. 
Depending on the operation timeout configured for the client, resolution alone 
may exceed that timeout before the request even reaches 
CancellableRegionServerCallable.call(). When the timeout is exceeded there, a 
DoNotRetryIOException is thrown. This is considered a cache-clearing exception, 
so any locations that were slowly resolved earlier in the call chain are 
thrown away and must be resolved again on the next attempt. With enough 
concurrency, this can create a feedback loop that is impossible to recover from.
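
   To make the loop described above concrete, here is a toy model in plain 
Java. It is not HBase code: the class, its methods, and the 10 ms lookup cost 
are all hypothetical, and time is simulated rather than measured. It only 
illustrates why clearing the location cache on an operation timeout can 
prevent a retry from ever succeeding, while keeping the cache lets the next 
attempt finish quickly.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the feedback loop. Not HBase code; every name is hypothetical. */
class MetaCacheFeedbackLoopSketch {
    /** Assumed cost (simulated ms) of one meta lookup for an uncached region. */
    static final long LOOKUP_COST_MS = 10;

    private final Set<Integer> locationCache = new HashSet<>();
    private final boolean clearCacheOnTimeout;

    MetaCacheFeedbackLoopSketch(boolean clearCacheOnTimeout) {
        this.clearCacheOnTimeout = clearCacheOnTimeout;
    }

    /**
     * Simulates one multiget touching regions 0..regionCount-1.
     * Returns true if location resolution fit within the operation timeout.
     */
    boolean multiGet(int regionCount, long operationTimeoutMs) {
        long elapsedMs = 0;
        for (int region = 0; region < regionCount; region++) {
            // Locations are resolved sequentially; only cache misses cost time.
            if (locationCache.add(region)) {
                elapsedMs += LOOKUP_COST_MS;
            }
        }
        if (elapsedMs > operationTimeoutMs) {
            if (clearCacheOnTimeout) {
                // Models the old behavior: the timeout is treated as a
                // cache-clearing failure, so the resolved locations are lost.
                locationCache.clear();
            }
            // Either way, every get in this batch is reported as failed.
            return false;
        }
        return true;
    }

    /** Returns the 1-based attempt that first succeeds, or -1 if none does. */
    static int attemptsUntilSuccess(boolean clearCacheOnTimeout, int regionCount,
                                    long operationTimeoutMs, int maxAttempts) {
        MetaCacheFeedbackLoopSketch client =
            new MetaCacheFeedbackLoopSketch(clearCacheOnTimeout);
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (client.multiGet(regionCount, operationTimeoutMs)) {
                return attempt;
            }
        }
        return -1;
    }
}
```

   In this model, with 20 regions and a 100 ms operation timeout, 
attemptsUntilSuccess(true, 20, 100, 5) never succeeds, because each attempt 
starts from an empty cache. attemptsUntilSuccess(false, 20, 100, 5) fails once 
and then succeeds on the second attempt, because the locations resolved during 
the first attempt are still cached.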
   
   cc: @bbeaudreault 


