[jira] [Resolved] (HBASE-3721) Speedup LoadIncrementalHFiles

stack (JIRA) Thu, 05 May 2011 21:29:45 -0700

     [ 
https://issues.apache.org/jira/browse/HBASE-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


stack resolved HBASE-3721.
--------------------------

       Resolution: Fixed
    Fix Version/s: 0.92.0
     Hadoop Flags: [Reviewed]

Committed to TRUNK.  Thanks for the patch, Ted (and thanks, Adam, for testing).

> Speedup LoadIncrementalHFiles
> -----------------------------
>
>                 Key: HBASE-3721
>                 URL: https://issues.apache.org/jira/browse/HBASE-3721
>             Project: HBase
>          Issue Type: Improvement
>          Components: util
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>             Fix For: 0.92.0
>
>         Attachments: 3721-v2.txt, 3721-v3.txt, 3721-v4.txt, 3721-v6.patch, 
> 3721.txt, LoadIncrementalHFiles.java
>
>
> From Adam Phelps:
> From the logs, it looks like <1% of the HFiles we're loading have to be split.
> Looking at the code for LoadIncrementalHFiles (HBase v0.90.1), I think our
> problem is that this code loads the HFiles sequentially.  Our largest table
> has over 2500 regions, and the data being loaded is fairly well distributed
> across them, so there end up being around 2500 HFiles for each load period.
> At 1-2 seconds per HFile, the loading process is very time-consuming.
> Currently, server.bulkLoadHFile() is a blocking call.
> We can use an ExecutorService to achieve better parallelism on multi-core
> machines.
> A new configuration parameter, "hbase.loadincremental.threads.max", is
> introduced; it sets the maximum number of threads for parallel bulk load.
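
For illustration only, here is a minimal sketch of the idea described above: submit each blocking per-file load to a bounded thread pool instead of looping over the files sequentially. This is not the committed patch. The HFileLoader interface, the class and method names, and the fallback to the number of available processors are assumptions made for this example. Only the parameter name "hbase.loadincremental.threads.max" comes from the issue, and it is read from a plain java.util.Properties object here rather than from an HBase Configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelBulkLoadSketch {

    /** Hypothetical stand-in for the blocking per-file load call. */
    public interface HFileLoader {
        void load(String hfilePath) throws Exception;
    }

    /**
     * Reads the thread limit named in the issue from a Properties object.
     * The default (number of available processors) is an assumption.
     */
    public static int maxThreads(Properties conf) {
        String fallback = String.valueOf(Runtime.getRuntime().availableProcessors());
        return Integer.parseInt(
            conf.getProperty("hbase.loadincremental.threads.max", fallback));
    }

    /**
     * Submits one task per file to a fixed-size pool, waits for all of them,
     * and returns the number of files loaded. A failed load surfaces as an
     * ExecutionException thrown by Future.get().
     */
    public static int loadAll(List<String> hfilePaths, HFileLoader loader, int maxThreads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, maxThreads));
        try {
            List<Future<Void>> pending = new ArrayList<>();
            for (String path : hfilePaths) {
                Callable<Void> task = () -> {
                    loader.load(path); // the blocking call runs on a worker thread
                    return null;
                };
                pending.add(pool.submit(task));
            }
            int loaded = 0;
            for (Future<Void> f : pending) {
                f.get(); // wait for this file's load to finish
                loaded++;
            }
            return loaded;
        } finally {
            pool.shutdown(); // stop accepting work; already-submitted tasks still run
        }
    }
}
```

With a pool of N threads and loads that do not contend with each other, the wall-clock time for roughly 2500 files at 1-2 seconds each would drop from somewhere around 40 to 85 minutes toward that figure divided by N. Actual gains depend on how much load the region servers can absorb.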

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
