A few questions about HFiles and MapReduce:

1. Is there any case where it's a bad idea to use HFileOutputFormat instead of TableOutputFormat when writing to HBase from MapReduce?

2. What are the failure modes for LoadIncrementalHFiles.doBulkLoad? Is it possible that some regions will adopt their HFiles while others fail?

3. I'd like to create HFiles as the output of one MapReduce job, use those HFiles as the input to a second MapReduce job that calculates some new aggregates, and then doBulkLoad the HFiles. Is there an easy way to use a directory of HFiles as the input to a MapReduce job? Is this inadvisable? It seems like a more sensible approach than scanning for columns with timestamps in an interval to find the freshly written columns.

Thanks for any feedback.

Leif Wickland
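P.S. To make the third question concrete, here's roughly the pipeline I have in mind. This is only a sketch, not working code: the table name, paths, column family, and the mapper are placeholders I made up, and the middle step is exactly the part I don't know how to do.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadSketch {

  // Placeholder mapper: parses "rowkey,value" lines into Puts.
  static class MyMapper
      extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      String[] parts = value.toString().split(",", 2);
      byte[] row = Bytes.toBytes(parts[0]);
      Put put = new Put(row);
      put.add(Bytes.toBytes("f"), Bytes.toBytes("q"),
          Bytes.toBytes(parts[1]));
      ctx.write(new ImmutableBytesWritable(row), put);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    Path hfileDir = new Path("/tmp/hfiles");          // placeholder path
    HTable table = new HTable(conf, "my_table");      // placeholder table

    // Step 1: a job that writes HFiles rather than Puts to the live table.
    Job job = new Job(conf, "write-hfiles");
    job.setJarByClass(BulkLoadSketch.class);
    job.setMapperClass(MyMapper.class);
    FileInputFormat.addInputPath(job, new Path("/tmp/input")); // placeholder
    FileOutputFormat.setOutputPath(job, hfileDir);
    // configureIncrementalLoad sets the reducer, partitioner, and output
    // key/value classes to match the table's current region boundaries.
    HFileOutputFormat.configureIncrementalLoad(job, table);
    job.waitForCompletion(true);

    // Step 2 (the open question): run a second MapReduce job over hfileDir
    // to compute aggregates -- this is where I'd want some way to use a
    // directory of HFiles as job input, if such a thing exists.

    // Step 3: adopt the HFiles into the table's regions.
    new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, table);
  }
}
```

If some regions can adopt their HFiles in step 3 while others fail (my second question), I'd also want to know whether re-running doBulkLoad on the same directory is safe.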
