However, it's worth noting that load performance will be much slower in this case. The splitting of bulk load outputs to fit the new regions is done on the client, in the "completebulkload" tool, so it will be very slow if, for example, all of the regions have split since you ran the MR job.
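To make the cost concrete, here is a rough sketch of the decision the client has to make for each output file: compare the file's first and last row keys against the table's current region start keys, and split the file at every boundary that falls inside its range. This is only an illustration, not the actual LoadIncrementalHFiles code. The class and method names are hypothetical, and row keys are modeled as Strings for readability (real HBase keys are byte arrays).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; these names do not exist in HBase.
final class BulkLoadSplitSketch {

    /**
     * Returns the index of the region that holds the given key.
     * Assumes regionStartKeys is sorted and that its first entry is the
     * empty string (the start key of the first region in a table).
     */
    static int regionIndexFor(List<String> regionStartKeys, String key) {
        int idx = 0;
        for (int i = 0; i < regionStartKeys.size(); i++) {
            if (regionStartKeys.get(i).compareTo(key) <= 0) {
                idx = i;   // this region starts at or before the key
            } else {
                break;     // later regions start after the key
            }
        }
        return idx;
    }

    /**
     * Returns the region start keys at which a file covering
     * [firstKey, lastKey] (both inclusive) would have to be split.
     * An empty result means the file fits in one region as-is.
     */
    static List<String> splitPoints(List<String> regionStartKeys,
                                    String firstKey, String lastKey) {
        List<String> points = new ArrayList<>();
        for (String start : regionStartKeys) {
            // A boundary cuts the file if at least one key lies before it
            // and at least one key lies at or after it.
            if (start.compareTo(firstKey) > 0 && start.compareTo(lastKey) <= 0) {
                points.add(start);
            }
        }
        return points;
    }
}
```

With region start keys "", "g", "n", "t", a file covering "a" through "f" needs no split, while a file covering "e" through "p" has to be cut at "g" and "n". The more regions have split since the job ran, the more files land in the second category, and all of that rewriting happens on the client.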
In the usual case, the completebulkload tool is run soon after the job completes so we expect little to no region churn.

-Todd

On Tue, Oct 26, 2010 at 2:49 PM, Stack <[email protected]> wrote:

> On Tue, Oct 26, 2010 at 2:34 PM, Jack Levin <[email protected]> wrote:
> > Hi, suppose we run bulk loader yesterday, and today, the regions names
> > on the same table no longer exist because of region splits, etc.?
> > What happens to the data when its 'loaded' into the hbase region
> > directories? Will it make 'older' regions per, from 24 hours ago? Or
> > cause some sort of an issue and an exception?
> >
> >
>
> It does the right thing. It adjusts to the new lay of the land
> splitting the bulk written files to match new region layout as needed.
> See
> http://hbase.apache.org/docs/r0.89.20100924/xref/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.html#181
>
> St.Ack
>

--
Todd Lipcon
Software Engineer, Cloudera
