Hari Krishna Dara created HBASE-13959:
-----------------------------------------
Summary: Region splitting takes too long because it uses a single
thread in most common cases
Key: HBASE-13959
URL: https://issues.apache.org/jira/browse/HBASE-13959
Project: HBase
Issue Type: Bug
Components: regionserver
Affects Versions: 0.98.12
Reporter: Hari Krishna Dara
Assignee: Hari Krishna Dara
When storefiles need to be split as part of a region split, the current logic
uses a threadpool whose size equals the number of stores. Since the most
common table setup involves only a single column family, this translates to
a single store, and so the threadpool runs with a single thread.
However, in a write-heavy workload, there could be several tens of storefiles
in a store at the time of splitting, and with a threadpool size of one, these
files end up getting split sequentially.
With a bit of tracing, I noticed that it takes an average of 350ms to create
a single reference file, and splitting each storefile involves creating two of
these, so with a storefile count of 20, it takes about 14s just to get through
this phase alone, pushing the total time the region is offline to 18s or more.
For environments that are set up to fail fast, this makes the client exhaust
all retries and fail with NotServingRegionException.
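
To make the arithmetic above concrete, here is a rough back-of-the-envelope
sketch. It is not HBase code: the class and method names are made up for
illustration, and it assumes every reference file costs the same 350ms and
that the work parallelizes perfectly across threads, which real filesystem
calls will not quite do.

```java
final class SplitTimeEstimate {
    /**
     * Estimated wall-clock time in ms for the reference-file phase.
     * Assumption: one task per storefile, each task creates refsPerFile
     * reference files back to back, and tasks spread evenly over the threads.
     */
    static long estimateMillis(int storefiles, int refsPerFile, long millisPerRef, int threads) {
        long rounds = (storefiles + threads - 1) / threads; // ceiling division
        return rounds * refsPerFile * millisPerRef;
    }

    public static void main(String[] args) {
        // Numbers from the description: 20 storefiles, 2 references each, 350ms per reference.
        System.out.println(estimateMillis(20, 2, 350, 1));  // 14000 (single thread, as today)
        System.out.println(estimateMillis(20, 2, 350, 10)); // 1400 (hypothetical 10 threads)
    }
}
```

With one thread the estimate matches the roughly 14s observed; the 10-thread
figure is only an idealized lower bound under the assumptions above.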
The fix should increase the concurrency of this operation.
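
One possible shape for such a fix, sketched as standalone Java rather than
as a patch: size the pool by the number of storefiles instead of the number
of stores, with an upper bound. The names SplitPoolSketch, poolSize, splitAll
and the maxThreads cap are all hypothetical and do not correspond to existing
HBase classes or configuration keys; the task body is a stand-in for the real
reference-file creation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class SplitPoolSketch {
    /** One thread per storefile, capped at maxThreads (hypothetical knob), never below 1. */
    static int poolSize(int storefileCount, int maxThreads) {
        return Math.max(1, Math.min(storefileCount, maxThreads));
    }

    /** Submits one task per storefile and returns the total number of reference files created. */
    static int splitAll(List<String> storefiles, int maxThreads) {
        ExecutorService pool =
            Executors.newFixedThreadPool(poolSize(storefiles.size(), maxThreads));
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String storefile : storefiles) {
                // Stand-in for the real work: each storefile yields two reference files.
                Callable<Integer> task = () -> 2;
                futures.add(pool.submit(task));
            }
            int references = 0;
            for (Future<Integer> future : futures) {
                references += future.get(); // wait for every task before returning
            }
            return references;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while splitting storefiles", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("storefile split task failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The cap is there so that a store with a very large number of storefiles does
not spawn an unbounded number of threads on the regionserver; what a sensible
default would be is left open here.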