[
https://issues.apache.org/jira/browse/HBASE-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14601035#comment-14601035
]
Hari Krishna Dara commented on HBASE-13959:
-------------------------------------------
I just attached region-split-durations-compared.png.
I did a basic comparison of split times with 1 thread vs. 8 threads on a
table. The table had no presplits and a single column family. Starting from
an empty table, I loaded 400M rows (about 570 bytes/row). The run with 1 thread
hit NotServingRegionException (NSRE) a few times, each time coinciding with a
long-running split. The run with 8 threads had no NSREs. Here are some numbers:
Thread pool size = 1
  Number of splits: 27
  Average split duration: 8.44s
  Min split duration: 3s
  Max split duration: 16s
  p99 split duration: 16s

Thread pool size = 8
  Number of splits: 25
  Average split duration: 3.4s
  Min split duration: 2s
  Max split duration: 6s
  p99 split duration: 5.76s
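
A fractional p99 of 5.76 alongside whole-second min and max values is consistent with a percentile computed by linear interpolation between the two largest of the 25 samples, although nothing above states which method was used. The following is a minimal sketch of that calculation in Java. The class name, the method name, and the sample array are all hypothetical; only the two largest values (assumed here to be 5 and 6) influence the result, and the rest are placeholders, not measured data.

```java
import java.util.Arrays;

class SplitPercentile {

    // Percentile by linear interpolation between the two closest ranks.
    // This is one common definition; other definitions give other values.
    static double percentile(double[] values, double p) {
        double[] sorted = values.clone();
        Arrays.sort(sorted);
        double rank = p / 100.0 * (sorted.length - 1); // fractional index into sorted[]
        int lo = (int) Math.floor(rank);
        int hi = (int) Math.ceil(rank);
        return sorted[lo] + (rank - lo) * (sorted[hi] - sorted[lo]);
    }

    public static void main(String[] args) {
        // Hypothetical durations in seconds: 25 samples, NOT the measured data.
        double[] durations = new double[25];
        Arrays.fill(durations, 3.0);
        durations[0] = 2.0;   // assumed minimum
        durations[23] = 5.0;  // assumed second-largest value
        durations[24] = 6.0;  // assumed maximum
        // Prints a value of approximately 5.76 (subject to floating-point rounding).
        System.out.println(percentile(durations, 99.0));
    }
}
```

With 25 samples the fractional rank is 0.99 x 24 = 23.76, so the result lies 76% of the way from the second-largest to the largest value.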
I will attach a histogram showing the durations side by side.
> Region splitting takes too long because it uses a single thread in most
> common cases
> ------------------------------------------------------------------------------------
>
> Key: HBASE-13959
> URL: https://issues.apache.org/jira/browse/HBASE-13959
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.98.12
> Reporter: Hari Krishna Dara
> Assignee: Hari Krishna Dara
> Fix For: 0.98.14
>
> Attachments: HBASE-13959-2.patch, HBASE-13959-3.patch,
> HBASE-13959-4.patch, HBASE-13959.patch, region-split-durations-compared.png
>
>
> When storefiles need to be split as part of a region split, the current logic
> uses a threadpool whose size is set to the number of stores.
> Since the most common table setup involves only a single column family, this
> translates to having a single store, and so the threadpool runs with a
> single thread. However, in a write-heavy workload, there could be several
> tens of storefiles in a store at the time of splitting, and with a threadpool
> size of one, these files end up getting split sequentially.
> With a bit of tracing, I noticed that it takes an average of 350ms to
> create a single reference file, and splitting each storefile involves
> creating two of these, so with a storefile count of 20, it takes about 14s
> (20 storefiles x 2 reference files x 350ms) just to get through this phase
> alone, pushing the total time the region is offline to 18s or more. For
> environments that are set up to fail fast, this makes the client exhaust all
> retries and fail with NotServingRegionException.
> The fix should increase the concurrency of this operation.
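
The arithmetic in the description above, and the effect of sizing the pool by storefiles rather than by stores, can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from the attached patches: the class, the method names, and the `maxThreads` cap are hypothetical, the cost model assumes every reference file takes exactly 350ms, and the tasks are placeholders that do no real I/O.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SplitConcurrencySketch {

    // Simplified cost model (an assumption): one task per storefile, each task
    // creates two reference files, and tasks complete in waves of `threads`.
    static long estimateMillis(int storefiles, int threads, long millisPerReference) {
        int waves = (storefiles + threads - 1) / threads; // ceil(storefiles / threads)
        return waves * 2L * millisPerReference;
    }

    // Runs one placeholder task per storefile on a fixed-size pool and returns
    // the total number of reference files the tasks report having created.
    static int splitAll(int storefiles, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int i = 0; i < storefiles; i++) {
                Callable<Integer> task = () -> 2; // stand-in for creating two reference files
                futures.add(pool.submit(task));
            }
            int references = 0;
            for (Future<Integer> f : futures) {
                references += f.get(); // wait for each storefile to finish
            }
            return references;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while splitting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("split task failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int storefiles = 20;
        int maxThreads = 8; // hypothetical cap, not a real HBase setting
        int poolSize = Math.min(storefiles, maxThreads);
        System.out.println(estimateMillis(storefiles, 1, 350));        // 14000
        System.out.println(estimateMillis(storefiles, poolSize, 350)); // 2100
        System.out.println(splitAll(storefiles, poolSize));            // 40
    }
}
```

The model ignores contention on the underlying filesystem, so a real speedup would likely be smaller than the ratio it suggests.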