To use existing bulk load tools, you'll need to write a valid HFile to
HDFS (have a look at HFileWriterV{2,3}) and load it into the region
server(s) using the utilities provided in LoadIncrementalHFiles. There's
no way to do this "in memory" at the moment. Closest would be to batch up
your data into a single large RPC, but that's going through the online
machinery, memstore flush, &c.
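Roughly, a minimal sketch of that path (write an HFile, then hand it to the
region servers). Assumptions here: HBase ~0.98/1.0 APIs, a table named
"mytable" with a column family "cf", and a staging dir /tmp/bulkload -- all
hypothetical names, adjust to your setup. Note LoadIncrementalHFiles expects
the HFiles under a per-family subdirectory, and cells must be appended in
sorted key order:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.hfile.CacheConfig;
import org.apache.hadoop.hbase.io.hfile.HFile;
import org.apache.hadoop.hbase.io.hfile.HFileContext;
import org.apache.hadoop.hbase.io.hfile.HFileContextBuilder;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;

public class InMemoryBulkLoad {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    FileSystem fs = FileSystem.get(conf);

    // LoadIncrementalHFiles expects <dir>/<columnFamily>/<hfile>
    Path bulkDir = new Path("/tmp/bulkload");       // hypothetical staging dir
    Path familyDir = new Path(bulkDir, "cf");       // "cf" = your column family
    Path hfilePath = new Path(familyDir, "hfile-0001");

    HFileContext context = new HFileContextBuilder()
        .withBlockSize(64 * 1024)
        .build();

    // The factory gives you a v2/v3 writer depending on your HBase version
    HFile.Writer writer = HFile.getWriterFactory(conf, new CacheConfig(conf))
        .withPath(fs, hfilePath)
        .withFileContext(context)
        .create();
    try {
      // Cells must be appended in increasing key order
      for (int i = 0; i < 1000; i++) {
        byte[] row = Bytes.toBytes(String.format("row-%05d", i));
        writer.append(new KeyValue(row, Bytes.toBytes("cf"),
            Bytes.toBytes("q"), System.currentTimeMillis(),
            Bytes.toBytes("value-" + i)));
      }
    } finally {
      writer.close();
    }

    // Hand the finished HFile(s) to the region servers
    HTable table = new HTable(conf, "mytable");     // hypothetical table name
    try {
      new LoadIncrementalHFiles(conf).doBulkLoad(bulkDir, table);
    } finally {
      table.close();
    }
  }
}

Nothing ties the writer step to MapReduce, so a Storm bolt could in principle
build the HFiles the same way, as long as it can produce them sorted and
aligned with region boundaries.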
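And the "single large RPC" alternative is just batched Puts through the
normal write path (WAL + memstore), which skips HFile handling entirely but
isn't a true bulk load. Again a sketch with the same hypothetical table and
family names:

import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedPuts {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "mytable");   // hypothetical table name
    try {
      List<Put> batch = new ArrayList<Put>();
      for (int i = 0; i < 10000; i++) {
        Put put = new Put(Bytes.toBytes("row-" + i));
        put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"),
            Bytes.toBytes("value-" + i));
        batch.add(put);
      }
      // The client groups these into large multi-put RPCs per region server,
      // but they still go through the WAL and memstore on the server side.
      table.put(batch);
      table.flushCommits();
    } finally {
      table.close();
    }
  }
}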
On Wed, Feb 4, 2015 at 10:49 AM, Jaime Solano <[email protected]> wrote:
> For a proof of concept we'll be working on, we want to bulk-load data into
> HBase, following a similar approach to the one explained here
> <http://blog.cloudera.com/blog/2013/09/how-to-use-hbase-bulk-loading-and-why/>,
> but with the difference that for the HFile creation (step 2 in the
> mentioned article), we want to use Storm instead of MapReduce. That is, we
> want to bulk load data not sitting in HDFS, but probably in memory.
>
> 1. What are your thoughts about this? Is it feasible?
> 2. What challenges do you foresee?
> 3. What other approaches would you suggest?
>
> Thanks in advance,
> -Jaime
