Adam,

They are probably not deleted, but moved to the appropriate region subdirectory under /hbase.
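
To illustrate why a move looks like a delete from the staging directory's point of view, here is a minimal sketch on a local temporary filesystem. It is not HBase code: `simulate_bulk_load`, the table/region/family names, and the directory layout are all hypothetical stand-ins, and the real load works against HDFS rather than local disk.

```python
import os
import shutil
import tempfile


def simulate_bulk_load(hfile_path, region_family_dir):
    """Hypothetical stand-in for the load step: move the file, do not copy it."""
    os.makedirs(region_family_dir, exist_ok=True)
    # shutil.move removes the source path and returns the new location.
    return shutil.move(hfile_path, region_family_dir)


# Build a throwaway staging directory with one fake HFile in it.
root = tempfile.mkdtemp()
staging_dir = os.path.join(root, "staging")
os.makedirs(staging_dir)
hfile = os.path.join(staging_dir, "hfile-0001")
with open(hfile, "wb") as f:
    f.write(b"fake hfile bytes")

# Assumed layout for illustration only: <root>/hbase/<table>/<region>/<family>
family_dir = os.path.join(root, "hbase", "mytable", "region-abc", "cf")
loaded_path = simulate_bulk_load(hfile, family_dir)

# The staging copy is gone, but the data still exists under the "hbase" tree.
print("still in staging:", os.path.exists(hfile))
print("now located at:", loaded_path)
```

If that matches what you are seeing, the files should still be reachable after the load, just under the table's region directories rather than where the map-reduce job wrote them.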

On Fri, Apr 29, 2011 at 1:15 PM, Adam Phelps <[email protected]> wrote:

> I just verified this, and the hfiles seem to be deleted one at a time as
> the bulk load runs.
>
> - Adam
>
>
> On 4/28/11 4:28 PM, Stack wrote:
>
>> I took a look through the code and don't see any explicit removes and
>> looking through history of changes to the file, I don't see any change
>> of substance.
>>
>> Can you figure what is doing the delete? At what stage? Is it as
>> completebulkload runs?
>>
>> St.Ack
>>
>> On Thu, Apr 28, 2011 at 10:59 AM, Adam Phelps <[email protected]> wrote:
>>
>>> We were using a backup scheme for our system where we have map-reduce
>>> jobs generating HFiles, which we then loaded using LoadIncrementalHFiles
>>> before making a remote copy of them using distcp.
>>>
>>> However we just upgraded hbase (we're using cloudera's package, so we
>>> went from CDH3B4 to CDH3U0, both of which are versions of 0.90.1), and
>>> discovered that the HFiles now get deleted by the load operation. Is
>>> this a recent change? Is there a configuration variable to revert this
>>> behavior?
>>>
>>> We can work around it by doing the copy before the load, but that is less
>>> than optimal in our scenario as we'd prefer to have quicker access to the
>>> data in HBase.
>>>
>>> - Adam
