Does anyone have any suggestions for speeding up LoadIncrementalHFiles?

We have M/R jobs that generate HFiles directly, which are then loaded into HBase via LoadIncrementalHFiles. We're trying to maintain a backup of our production HBase on a backup Hadoop cluster by copying the HFiles there and loading them on that cluster as well.

The problem we're running into is that we want the backup cluster to use considerably fewer nodes than the primary cluster. However, despite a fairly low load (CPU, disk I/O, etc.), the backup cluster isn't keeping up well. We'd rather not dedicate more nodes from the overall pool to this purpose if at all possible. Are there any settings that can be adjusted to improve the performance of the bulk load?

Alternate suggestions for maintaining an HBase backup would also be of interest.

- Adam
