On Mon, Feb 20, 2012 at 1:20 PM, Paul Mackles <[email protected]> wrote:
> We are on hbase 0.90.4 (cd3u2). We are using the standard hbase export/import
> for backups. In a test run, our imports ran extremely slow. While a full
> export of our dataset took about an hour, the corresponding import took 20+
> hours (for 216 regions across 15 servers). While it finished, I am a little
> uncomfortable with that sort of recovery time should disaster strike. Are
> there any recommendations for speeding up imports in a recovery scenario? One
> thing I noticed while watching the region-server logs was that there were a
> lot of compactions happening during the import (both major and minor). Should
> we disable compactions while the import is running and then do it all at the
> end? We have our region-size set to 100GB right now so we can manage
> splitting. Thanks in advance for any recommendations.
>
Can you tell where it was spending the time, Paul? Raising the config so there is less flushing sounds like it might be a good way to go. You might want to use larger flush sizes when importing so that each flush is larger.

How did you import? As an MR job? Was it running full on? Was HBase what was keeping it slow?

Has anyone played with going from an export to a bulk load? I wonder if this would run faster.

St.Ack
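
To make the flush-size suggestion concrete, here is a rough back-of-the-envelope sketch. The per-region write volume (10 GB) and the two flush sizes (64 MB and 256 MB) are made-up illustrative numbers, not figures from Paul's cluster, and the helper name `estimated_flushes` is hypothetical. It also ignores flushes forced by global memstore pressure or log rolling, so treat it as an order-of-magnitude estimate only.

```python
import math

MB = 1024 * 1024
GB = 1024 * MB


def estimated_flushes(bytes_written_per_region: int, flush_size_bytes: int) -> int:
    """Rough number of memstore flushes needed to absorb the given write volume.

    Simplifying assumption: a flush happens only when the memstore reaches
    flush_size_bytes. Real region servers can also flush earlier.
    """
    if flush_size_bytes <= 0:
        raise ValueError("flush_size_bytes must be positive")
    return math.ceil(bytes_written_per_region / flush_size_bytes)


if __name__ == "__main__":
    per_region_bytes = 10 * GB  # hypothetical import volume per region
    for flush_mb in (64, 256):  # hypothetical flush sizes to compare
        flushes = estimated_flushes(per_region_bytes, flush_mb * MB)
        print(f"flush size {flush_mb} MB: about {flushes} flushes per region")
```

Each flush writes a new store file, and accumulating store files is what triggers minor compactions, so fewer and larger flushes should mean less compaction work during the import.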
