[ https://issues.apache.org/jira/browse/HBASE-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-3721:
--------------------------
Attachment: (was: 3721-v2.txt)
> Speedup LoadIncrementalHFiles
> -----------------------------
>
> Key: HBASE-3721
> URL: https://issues.apache.org/jira/browse/HBASE-3721
> Project: HBase
> Issue Type: Improvement
> Components: util
> Reporter: Ted Yu
> Attachments: 3721.txt
>
>
> From Adam Phelps:
> From the logs it looks like <1% of the hfiles we're loading have to be split.
> Looking at the code for LoadIncrementalHFiles (HBase v0.90.1), I'm actually
> thinking our problem is that this code loads the hfiles sequentially. Our
> largest table has over 2500 regions and the data being loaded is fairly well
> distributed across them, so there end up being around 2500 HFiles for each
> load period. At 1-2 seconds per HFile that means the loading process is very
> time consuming.
> Currently server.bulkLoadHFile() is a blocking call, so the HFiles are loaded one at a time.
> We can use an ExecutorService to issue these calls in parallel.
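
A minimal sketch of the ExecutorService idea from the description, in Java. The class name ParallelBulkLoadSketch, the method loadAll, and the loadOne callback are hypothetical names used only for illustration. loadOne stands in for the blocking server.bulkLoadHFile() call, whose real signature is not reproduced here. This is not the attached patch, only one way the approach might look.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class ParallelBulkLoadSketch {

    /**
     * Hypothetical helper: submits one task per HFile path to a fixed-size
     * thread pool, waits for all of them, and returns how many completed.
     *
     * @param hfilePaths paths of the HFiles to load
     * @param loadOne    assumption: a wrapper around the blocking per-file load call
     * @param threads    number of worker threads in the pool
     */
    public static int loadAll(List<String> hfilePaths,
                              Consumer<String> loadOne,
                              int threads)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit every load up front so the blocking calls overlap.
            List<Future<?>> pending = new ArrayList<>();
            for (String path : hfilePaths) {
                pending.add(pool.submit(() -> loadOne.accept(path)));
            }
            // Wait for each task; get() rethrows a task failure wrapped
            // in ExecutionException, so errors are not silently dropped.
            int done = 0;
            for (Future<?> f : pending) {
                f.get();
                done++;
            }
            return done;
        } finally {
            // Stop accepting new tasks; already submitted tasks still run.
            pool.shutdown();
        }
    }
}
```

With roughly 2500 HFiles at 1-2 seconds each, the sequential loop takes on the order of an hour, while a pool of N threads could bring that down by up to a factor of N, assuming the region servers can absorb the concurrent requests. The pool size is a tuning knob and is not specified in this issue.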