You need to create the table with pre-splits; see http://hbase.apache.org/book.html#perf.writing
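A minimal sketch of what pre-splitting looks like: the helper below computes evenly spaced single-byte split points (this assumes row keys are roughly uniform over the full byte range — adjust the key generation to your actual key layout; the table/family names in the comment are placeholders, not from this thread):

```java
import java.util.Arrays;

public class PreSplit {
    // Generate numRegions-1 evenly spaced single-byte split keys,
    // dividing the 0x00..0xFF keyspace into numRegions regions.
    // Assumes row keys are uniformly distributed over the byte range.
    static byte[][] splitKeys(int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
        }
        return splits;
    }

    public static void main(String[] args) {
        // 5 region servers -> at least 5 regions, i.e. 4 split points.
        for (byte[] k : splitKeys(5)) {
            System.out.println(Arrays.toString(k));
        }
        // With the HBase client API of that era the table would then be
        // created roughly as (sketch, assuming table "mytable", family "f"):
        //   HBaseAdmin admin = new HBaseAdmin(conf);
        //   HTableDescriptor desc = new HTableDescriptor("mytable");
        //   desc.addFamily(new HColumnDescriptor("f"));
        //   admin.createTable(desc, splitKeys(5));
    }
}
```

Since HFileOutputFormat.configureIncrementalLoad() launches one reducer per region, creating the table with 4 split points gives you 5 regions and hence 5 reducers, one per region server.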
J-D

On Thu, Sep 19, 2013 at 9:52 AM, Dolan Antenucci <antenucc...@gmail.com> wrote:
> I have about 1 billion values I am trying to load into a new HBase table
> (with just one column and column family), but am running into some issues.
> Currently I am trying to use MapReduce to import these by first converting
> them to HFiles and then using LoadIncrementalHFiles.doBulkLoad(). I also
> use HFileOutputFormat.configureIncrementalLoad() as part of my MR job. My
> code is essentially the same as this example:
>
> https://github.com/Paschalis/HBase-Bulk-Load-Example/blob/master/src/cy/ac/ucy/paschalis/hbase/bulkimport/Driver.java
>
> The problem I'm running into is that only 1 reducer is created
> by configureIncrementalLoad(), and there is not enough space on this node
> to handle all this data. configureIncrementalLoad() should start one
> reducer for every region the table has, so apparently the table only has 1
> region -- maybe because it is empty and brand new (my understanding of how
> regions work is not crystal clear)? The cluster has 5 region servers, so
> I'd at least like that many reducers to handle this loading.
>
> On a side note, I also tried the command line tool, completebulkload, but
> am running into other issues with this (timeouts, possible heap issues) --
> probably due to only one server being assigned the task of inserting all
> the records (i.e. I look at the region servers' logs, and only one of the
> servers has log entries; the rest are idle).
>
> Any help is appreciated
>
> -Dolan Antenucci