You need to create the table with pre-splits; see http://hbase.apache.org/book.html#perf.writing
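A minimal sketch of what pre-splitting looks like: the helper below computes evenly spaced single-byte split points (this assumes row keys are roughly uniform over the full byte range — adjust the key generation to your actual key layout; the table/family names in the comment are placeholders, not from this thread):

```java
import java.util.Arrays;

public class PreSplit {
    // Generate numRegions-1 evenly spaced single-byte split keys,
    // dividing the 0x00..0xFF keyspace into numRegions regions.
    // Assumes row keys are uniformly distributed over the byte range.
    static byte[][] splitKeys(int numRegions) {
        byte[][] splits = new byte[numRegions - 1][];
        for (int i = 1; i < numRegions; i++) {
            splits[i - 1] = new byte[] { (byte) (i * 256 / numRegions) };
        }
        return splits;
    }

    public static void main(String[] args) {
        // 5 region servers -> at least 5 regions, i.e. 4 split points.
        for (byte[] k : splitKeys(5)) {
            System.out.println(Arrays.toString(k));
        }
        // With the HBase client API of that era the table would then be
        // created roughly as (sketch, assuming table "mytable", family "f"):
        //   HBaseAdmin admin = new HBaseAdmin(conf);
        //   HTableDescriptor desc = new HTableDescriptor("mytable");
        //   desc.addFamily(new HColumnDescriptor("f"));
        //   admin.createTable(desc, splitKeys(5));
    }
}
```

Since HFileOutputFormat.configureIncrementalLoad() launches one reducer per region, creating the table with 4 split points gives you 5 regions and hence 5 reducers, one per region server.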
J-D

On Thu, Sep 19, 2013 at 9:52 AM, Dolan Antenucci <antenucc...@gmail.com> wrote:
> I have about 1 billion values I am trying to load into a new HBase table
> (with just one column and column family), but am running into some issues.
> Currently I am trying to use MapReduce to import these by first converting
> them to HFiles and then using LoadIncrementalHFiles.doBulkLoad(). I also
> use HFileOutputFormat.configureIncrementalLoad() as part of my MR job. My
> code is essentially the same as this example:
>
> https://github.com/Paschalis/HBase-Bulk-Load-Example/blob/master/src/cy/ac/ucy/paschalis/hbase/bulkimport/Driver.java
>
> The problem I'm running into is that only 1 reducer is created
> by configureIncrementalLoad(), and there is not enough space on this node
> to handle all this data. configureIncrementalLoad() should start one
> reducer for every region the table has, so apparently the table only has 1
> region -- maybe because it is empty and brand new (my understanding of how
> regions work is not crystal clear)? The cluster has 5 region servers, so
> I'd at least like that many reducers to handle this loading.
>
> On a side note, I also tried the command line tool, completebulkload, but
> am running into other issues with this (timeouts, possible heap issues) --
> probably due to only one server being assigned the task of inserting all
> the records (i.e. I look at the region servers' logs, and only one of the
> servers has log entries; the rest are idle).
>
> Any help is appreciated
>
> -Dolan Antenucci