Speedup LoadIncrementalHFiles
-----------------------------
Key: HBASE-3721
URL: https://issues.apache.org/jira/browse/HBASE-3721
Project: HBase
Issue Type: Improvement
Components: util
Reporter: Ted Yu
From Adam Phelps:
From the logs it looks like <1% of the HFiles we're loading have to be split.
Looking at the code for LoadIncrementalHFiles (HBase v0.90.1), I'm actually
thinking our problem is that this code loads the hfiles sequentially. Our
largest table has over 2500 regions and the data being loaded is fairly well
distributed across them, so there end up being around 2500 HFiles for each load
period. At 1-2 seconds per HFile that means the loading process is very time
consuming.
Currently server.bulkLoadHFile() is a blocking call.
We can utilize ExecutorService to achieve better parallelism.
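
A minimal sketch of that idea, not the actual patch: the blocking per-file
call is submitted to a fixed-size thread pool so that several HFiles load
concurrently. The names ParallelBulkLoadSketch, HFileLoader, and loadAll are
hypothetical and exist only for illustration; HFileLoader stands in for
whatever wraps server.bulkLoadHFile(), whose real signature is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelBulkLoadSketch {

    /** Hypothetical stand-in for the blocking per-HFile load call. */
    interface HFileLoader {
        void load(String hfilePath) throws Exception;
    }

    /**
     * Loads every HFile through a fixed-size thread pool instead of a
     * sequential loop. Returns the number of files loaded; rethrows the
     * first failure as a RuntimeException.
     */
    static int loadAll(List<String> hfilePaths, HFileLoader loader, int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        try {
            // One task per HFile; each task makes a single blocking load call.
            List<Callable<Void>> tasks = new ArrayList<>();
            for (String path : hfilePaths) {
                tasks.add(() -> {
                    loader.load(path);
                    return null;
                });
            }
            int loaded = 0;
            // invokeAll blocks until every task has finished or failed.
            for (Future<Void> result : pool.invokeAll(tasks)) {
                result.get(); // surfaces a task failure as ExecutionException
                loaded++;
            }
            return loaded;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            throw new IllegalStateException("interrupted while loading HFiles", e);
        } catch (ExecutionException e) {
            throw new RuntimeException("an HFile failed to load", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With a pool of N threads, up to N load calls are in flight at once, so the
wall-clock time is bounded by the slowest batch rather than by the sum of all
per-file times. The pool size is an assumption to tune; the sketch also does
not cover the small fraction of HFiles that must be split before loading.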