[ 
https://issues.apache.org/jira/browse/HBASE-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606429#comment-14606429
 ] 

Lars Hofhansl commented on HBASE-13959:
---------------------------------------

bq. However, I came across blogs in which folks used unusually large blocking 
store file count (e.g., 200)

True. And that is why we need the new config option. Folks who increase the 
number of blocking store files that dramatically had better know what they're 
doing. Either they'd presplit everything, or they'd want a large number of 
threads doing the splitting.
With this, we mostly keep the spirit of the original code: one thread per file. 
I think we can do that here, and then fix it in a better way.

This is just to come up with a good proxy for the pool size that will work in 
most situations. Another good proxy would be the number of handler threads 
configured for the NameNode (and then allow a split to use maybe 1/4 of that 
by default).
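
To make the two proxies concrete, here is a minimal sketch of how the pool 
size could be derived. It is not the patch itself: the class, the method names 
and the maxSplitThreads parameter are hypothetical stand-ins (maxSplitThreads 
represents whatever value the new config option supplies).

```java
final class SplitPoolSizing {
    /**
     * One thread per store file, capped by a configured maximum.
     * maxSplitThreads is a hypothetical stand-in for the new config option.
     * Always returns at least 1 so the pool can be created.
     */
    static int poolSize(int storeFileCount, int maxSplitThreads) {
        int cap = Math.max(1, maxSplitThreads);
        return Math.max(1, Math.min(storeFileCount, cap));
    }

    /**
     * Alternative proxy: let a split use roughly a quarter of the
     * NameNode handler threads (assumed to be passed in by the caller).
     */
    static int capFromNameNodeHandlers(int nameNodeHandlerCount) {
        return Math.max(1, nameNodeHandlerCount / 4);
    }
}
```

With a cap of 200 and 20 store files this yields 20 threads, i.e. one per 
file; with a cap of 7 it yields 7.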

As [~apurtell] and I pointed out elsewhere, on-demand thread pools are smelly 
beasts and should be avoided. Once we have fixed this issue, we should revisit 
that. The tricky part with a central pool, presumably, would be making sure 
that all tasks terminate; right now this is enforced by shutting down the pool.
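
For reference, the on-demand pattern being discussed looks roughly like the 
sketch below. It uses only java.util.concurrent; the class and method names 
are made up for illustration and are not taken from the HBase code. The point 
is that shutting down the pool is what guarantees no task outlives the call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

final class OnDemandSplitPool {
    /** Runs all tasks on a short-lived pool and returns how many completed. */
    static int runAll(List<Callable<Void>> tasks, int threads, long timeoutSeconds) {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, threads));
        List<Future<Void>> futures = new ArrayList<>();
        try {
            for (Callable<Void> task : tasks) {
                futures.add(pool.submit(task));
            }
            pool.shutdown(); // accept no new work; queued tasks still run
            if (!pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS)) {
                throw new IllegalStateException("split tasks timed out");
            }
            int done = 0;
            for (Future<Void> f : futures) {
                f.get(); // surfaces any exception thrown by a task
                done++;
            }
            return done;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while splitting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("split task failed", e.getCause());
        } finally {
            pool.shutdownNow(); // interrupts anything still running on the error paths
        }
    }
}
```

A shared central pool could not rely on shutdownNow() like this, so it would 
have to track and cancel the individual futures instead.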


> Region splitting takes too long because it uses a single thread in most 
> common cases
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-13959
>                 URL: https://issues.apache.org/jira/browse/HBASE-13959
>             Project: HBase
>          Issue Type: Bug
>          Components: regionserver
>    Affects Versions: 0.98.12
>            Reporter: Hari Krishna Dara
>            Assignee: Lars Hofhansl
>            Priority: Critical
>             Fix For: 2.0.0, 0.98.14, 1.2.0, 1.3.0
>
>         Attachments: 13959-suggest.txt, HBASE-13959-2.patch, 
> HBASE-13959-3.patch, HBASE-13959-4.patch, HBASE-13959-5.patch, 
> HBASE-13959.patch, region-split-durations-compared.png
>
>
> When storefiles need to be split as part of a region split, the current logic 
> uses a threadpool with its size set to the number of stores. Since the most 
> common table setup involves only a single column family, this translates to 
> having a single store, and so the threadpool is run with a single thread. 
> However, in a write-heavy workload, there could be several tens of storefiles 
> in a store at the time of splitting, and with a threadpool size of one, these 
> files end up getting split sequentially.
> With a bit of tracing, I noticed that it takes an average of 350 ms to 
> create a single reference file, and splitting each storefile involves 
> creating two of these, so with a storefile count of 20, it takes about 14 s 
> just to get through this phase alone, pushing the total time the region is 
> offline to 18 s or more. For environments that are set up to fail fast, this 
> makes the client exhaust all retries and fail with 
> NotServingRegionException.
> The fix should increase the concurrency of this operation.
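
The arithmetic in the description can be reproduced with a rough, idealized 
estimate. The sketch below assumes each storefile is one task that creates its 
two reference files back to back, that tasks run in full waves of `threads` at 
a time, and that the 350 ms figure does not degrade under concurrency (in 
practice NameNode contention may well change it). The class and method names 
are illustrative only.

```java
final class SplitTimeEstimate {
    /**
     * Estimated duration of the reference-file phase in milliseconds.
     * storeFiles: number of storefiles to split
     * threads: size of the splitting thread pool (must be >= 1)
     * refFileMillis: time to create one reference file
     */
    static long estimateMillis(int storeFiles, int threads, long refFileMillis) {
        long perStoreFile = 2 * refFileMillis;           // two reference files each
        int waves = (storeFiles + threads - 1) / threads; // ceiling division
        return waves * perStoreFile;
    }
}
```

Under these assumptions, 20 storefiles at 350 ms per reference file take 
14000 ms with one thread, 2100 ms with 8 threads, and 700 ms with 20 threads.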



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
