[
https://issues.apache.org/jira/browse/HBASE-13959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14606429#comment-14606429
]
Lars Hofhansl commented on HBASE-13959:
---------------------------------------
bq. However, I came across blogs in which folks used unusually large blocking
store file count (e.g., 200)
True, and that is why we need the new config option. Folks who raise the
blocking store file count that dramatically had better know what they're
doing: either they presplit everything, or they want a large number of
threads doing the splitting.
With this, we mostly keep the spirit of the original code: one thread per file.
I think we can do that here, and then fix it in a better way.
This is just to come up with a good proxy that will work in most situations.
Another good proxy would be the number of handler threads configured for the
NameNode (and then allow a split to use maybe a quarter of that by default).
As [~apurtell] and I pointed out elsewhere, on-demand thread pools are smelly
beasts and should be avoided. Once we have fixed this issue, we should revisit
that.
The tricky part with a central pool - presumably - would be making sure that
all tasks terminate; right now this is enforced by shutting down the pool.
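
To make the "one thread per file, capped by a config option" idea concrete, here is a minimal sketch in plain Java. It is not the patch: the class name, the method names, and the maxThreads parameter are hypothetical, and the task body is a stand-in for creating the reference files. It also shows the termination guarantee mentioned above, which comes from shutting the pool down after the tasks complete.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class SplitPoolSketch {
    // Hypothetical sizing rule: one thread per file, capped by a configured
    // maximum, and never fewer than one thread.
    static int poolSize(int storeFileCount, int maxThreads) {
        return Math.max(1, Math.min(storeFileCount, maxThreads));
    }

    // Submits one task per store file, waits for all of them, and returns
    // the number of tasks that completed.
    static int splitAll(List<String> storeFiles, int maxThreads) {
        ExecutorService pool =
            Executors.newFixedThreadPool(poolSize(storeFiles.size(), maxThreads));
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String file : storeFiles) {
                // Stand-in for the real work of creating reference files.
                futures.add(pool.submit(() -> file + ".ref"));
            }
            int completed = 0;
            for (Future<String> future : futures) {
                future.get();
                completed++;
            }
            pool.shutdown();
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("split tasks did not terminate in time");
            }
            return completed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            // Enforces termination even on failure; harmless if already terminated.
            pool.shutdownNow();
        }
    }
}
```

With a shared central pool the shutdown step would no longer be available, so each split would have to track and await its own futures instead.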
> Region splitting takes too long because it uses a single thread in most
> common cases
> ------------------------------------------------------------------------------------
>
> Key: HBASE-13959
> URL: https://issues.apache.org/jira/browse/HBASE-13959
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Affects Versions: 0.98.12
> Reporter: Hari Krishna Dara
> Assignee: Lars Hofhansl
> Priority: Critical
> Fix For: 2.0.0, 0.98.14, 1.2.0, 1.3.0
>
> Attachments: 13959-suggest.txt, HBASE-13959-2.patch,
> HBASE-13959-3.patch, HBASE-13959-4.patch, HBASE-13959-5.patch,
> HBASE-13959.patch, region-split-durations-compared.png
>
>
> When storefiles need to be split as part of a region split, the current logic
> uses a threadpool whose size is set to the number of stores.
> Since the most common table setup involves only a single column family, this
> translates to having a single store, and so the threadpool is run with a
> single thread. However, in a write heavy workload, there could be several
> tens of storefiles in a store at the time of splitting, and with a threadpool
> size of one, these files end up getting split sequentially.
> With a bit of tracing, I noticed that it takes an average of 350ms to
> create a single reference file, and splitting each storefile involves
> creating two of these, so with a storefile count of 20, it takes about 14s
> just to get through this phase alone (2 reference files for each storefile),
> pushing the total time the region is offline to 18s or more. For environments
> that are set up to fail fast, this makes the client exhaust all retries and
> fail with NotServingRegionException.
> The fix should increase the concurrency of this operation.
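
The arithmetic in the description (20 storefiles x 2 reference files x 350ms = 14s) can be turned into a rough estimate of what extra threads might buy. The sketch below is an idealized model: it assumes each task handles one storefile (two reference files) and that threads do not slow each other down, which ignores NameNode contention. The class and method names are hypothetical.

```java
public class SplitTimeEstimate {
    // Estimated time for the reference-file phase, in milliseconds.
    // Each storefile needs two reference files; tasks run in "rounds"
    // of at most `threads` storefiles at a time.
    static long estimateMillis(int storeFiles, int threads, long millisPerRef) {
        long perFile = 2 * millisPerRef;
        long rounds = (storeFiles + threads - 1) / threads; // ceiling division
        return rounds * perFile;
    }

    public static void main(String[] args) {
        System.out.println(estimateMillis(20, 1, 350));  // single thread, as reported
        System.out.println(estimateMillis(20, 8, 350));  // capped pool
        System.out.println(estimateMillis(20, 20, 350)); // one thread per file
    }
}
```

Under these assumptions the phase drops from 14000ms with one thread to 2100ms with eight threads and 700ms with one thread per file.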