I filed HBASE-5440, although I am framing it more as an import-to-bulk-load option. I.e. we run export as we do now, but on import one can choose to create HFiles for bulk load, instead of updating the live cluster through the API.
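A minimal sketch of what such a job could look like -- this is not the HBASE-5440 patch; the ImportToHFiles class name, job name and argument order are invented for illustration. It assumes the SequenceFile of (ImmutableBytesWritable, Result) pairs that Export writes today:

  import java.io.IOException;

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.hbase.HBaseConfiguration;
  import org.apache.hadoop.hbase.KeyValue;
  import org.apache.hadoop.hbase.client.HTable;
  import org.apache.hadoop.hbase.client.Put;
  import org.apache.hadoop.hbase.client.Result;
  import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
  import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
  import org.apache.hadoop.mapreduce.Job;
  import org.apache.hadoop.mapreduce.Mapper;
  import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
  import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

  public class ImportToHFiles {

    // Same input as Import: the SequenceFile of (row, Result) pairs that
    // Export wrote. Instead of pushing Puts through TableOutputFormat,
    // emit them as map output so the reducer can sort them into HFiles.
    static class HFileImportMapper extends
        Mapper<ImmutableBytesWritable, Result, ImmutableBytesWritable, Put> {
      @Override
      public void map(ImmutableBytesWritable row, Result value, Context context)
          throws IOException, InterruptedException {
        Put put = new Put(row.get());
        for (KeyValue kv : value.raw()) {
          put.add(kv);
        }
        context.write(row, put);
      }
    }

    // Usage: ImportToHFiles <tablename> <exportdir> <hfiledir>
    public static void main(String[] args) throws Exception {
      Configuration conf = HBaseConfiguration.create();
      Job job = new Job(conf, "import-to-hfiles");
      job.setJarByClass(ImportToHFiles.class);
      job.setInputFormatClass(SequenceFileInputFormat.class);
      job.setMapperClass(HFileImportMapper.class);
      job.setMapOutputKeyClass(ImmutableBytesWritable.class);
      job.setMapOutputValueClass(Put.class);
      SequenceFileInputFormat.addInputPath(job, new Path(args[1]));
      FileOutputFormat.setOutputPath(job, new Path(args[2]));
      // This one call wires up HFileOutputFormat, the TotalOrderPartitioner
      // and the PutSortReducer -- the pieces Import itself doesn't have.
      HFileOutputFormat.configureIncrementalLoad(job, new HTable(conf, args[0]));
      System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
  }

The resulting HFiles would then be moved into the table with the existing bulk load tool, e.g.:

  hbase org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles <hfiledir> <tablename>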
-- Lars

________________________________
From: lars hofhansl <[email protected]>
To: "[email protected]" <[email protected]>
Sent: Tuesday, February 21, 2012 9:27 AM
Subject: Re: export/import for backup

It seems we could converge the import and importtsv tools. importtsv can write directly to a (live) table or use HFileOutputFormat.

-- Lars

________________________________
From: Stack <[email protected]>
To: [email protected]
Sent: Monday, February 20, 2012 9:19 PM
Subject: Re: export/import for backup

On Mon, Feb 20, 2012 at 1:58 PM, Paul Mackles <[email protected]> wrote:
> Actually, an hbase export to "bulk load" facility sounds like a great idea.
> We have been using bulk loads to migrate data from an older data store and
> they have worked awesome for us. It also doesn't seem like it would be that
> hard to implement. So what am I missing?

Little? Check out Import.java in the mapreduce package. See how it pulls from SequenceFiles and writes to a TableOutputFormat inside the map. Make a new MR job that has the same input but outputs to HFileOutputFormat instead (you'll need the total order partitioner and a reducer in the mix, which Import doesn't have).

St.Ack
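For reference, the two importtsv modes Lars mentions above are selected by a single property; the table, column and path names below are placeholders:

  # write directly to the live table through the normal client API
  hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.columns=HBASE_ROW_KEY,cf:col mytable /input/tsv

  # write HFiles for bulk load instead
  hbase org.apache.hadoop.hbase.mapreduce.ImportTsv \
      -Dimporttsv.columns=HBASE_ROW_KEY,cf:col \
      -Dimporttsv.bulk.output=/output/hfiles mytable /input/tsv

Setting importtsv.bulk.output is what flips the job from TableOutputFormat to HFileOutputFormat, which is the same switch the proposed Import option would need.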
