To use existing bulk load tools, you'll need to write a valid HFile to
HDFS (have a look at HFileWriterV{2,3}) and load it into the region
server(s) using the utilities provided in LoadIncrementalHFiles. There's
no way to do this "in memory" at the moment. Closest would be to batch up
your data into a single large RPC, but that's going through the online
machinery, memstore flush, &c.
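Roughly, a minimal sketch of that path (write an HFile, then hand it to the
region servers). Assumptions here: HBase ~0.98/1.0 APIs, a table named
"mytable" with a column family "cf", and a staging dir /tmp/bulkload -- all
hypothetical names, adjust to your setup. Note LoadIncrementalHFiles expects
the HFiles under a per-family subdirectory, and cells must be appended in
sorted key order:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileContext;
import org.apache.hadoop.hbase.io.hfile.HFileContextBuilder;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;

public class InMemoryBulkLoad {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);

    // LoadIncrementalHFiles expects <dir>/<columnFamily>/<hfile>
    Path bulkDir = new Path("/tmp/bulkload");       // hypothetical staging dir
    Path familyDir = new Path(bulkDir, "cf");       // "cf" = your column family
    Path hfilePath = new Path(familyDir, "hfile-0001");

    HFileContext context = new HFileContextBuilder()
        .withBlockSize(64 * 1024)
        .build();

    // The factory gives you a v2/v3 writer depending on your HBase version
    HFile.Writer writer = HFile.getWriterFactory(conf, new CacheConfig(conf))
        .withPath(fs, hfilePath)
        .withFileContext(context)
        .create();
    try {
      // Cells must be appended in increasing key order
      for (int i = 0; i < 1000; i++) {
        byte[] row = Bytes.toBytes(String.format("row-%05d", i));
        writer.append(new KeyValue(row, Bytes.toBytes("cf"),
            Bytes.toBytes("q"), System.currentTimeMillis(),
            Bytes.toBytes("value-" + i)));
      }
    } finally {
      writer.close();
    }

    // Hand the finished HFile(s) to the region servers
    HTable table = new HTable(conf, "mytable");     // hypothetical table name
    try {
      new LoadIncrementalHFiles(conf).doBulkLoad(bulkDir, table);
    } finally {
      table.close();
    }
  }
}

Nothing ties the writer step to MapReduce, so a Storm bolt could in principle
build the HFiles the same way, as long as it can produce them sorted and
aligned with region boundaries.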
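And the "single large RPC" alternative is just batched Puts through the
normal write path (WAL + memstore), which skips HFile handling entirely but
isn't a true bulk load. Again a sketch with the same hypothetical table and
family names:

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedPuts {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");   // hypothetical table name
    try {
      List<Put> batch = new ArrayList<Put>();
      for (int i = 0; i < 10000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"),
            Bytes.toBytes("value-" + i));
        batch.add(put);
      }
      // The client groups these into large multi-put RPCs per region server,
      // but they still go through the WAL and memstore on the server side.
      table.put(batch);
      table.flushCommits();
    } finally {
      table.close();
    }
  }
}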
On Wed, Feb 4, 2015 at 10:49 AM, Jaime Solano <[email protected]> wrote:
> For a proof of concept we'll be working on, we want to bulk-load data into
> HBase, following a similar approach to the one explained here
> <http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/>,
> but with the difference that for the HFile creation (step 2 in the
> mentioned article), we want to use Storm instead of MapReduce. That is, we
> want to bulk load data not sitting in HDFS, but probably in memory.
>
> 1. What are your thoughts about this? Is it feasible?
> 2. What challenges do you foresee?
> 3. What other approaches would you suggest?
>
> Thanks in advance,
> -Jaime
