A few questions about HFiles and MapReduce:

1. Is there any case where it's a bad idea to use HFileOutputFormat instead
of TableOutputFormat when writing to HBase from MapReduce?

2. What are the failure modes for LoadIncrementalHFiles.doBulkLoad?  Is it
possible for some HFiles to be loaded into their regions while others fail,
leaving the table partially loaded?

3. I'd like to create HFiles as the output of one MapReduce job, use those
HFiles as the input to a second job that calculates some new aggregates, and
then call doBulkLoad on the results.  Is there an easy way to use a directory
of HFiles as the input to a MapReduce job?  Is this inadvisable?  It seems
like a more sensible approach than scanning for columns with timestamps in a
given interval to find the freshly written columns.

Thanks for any feedback.

Leif Wickland
