I could believe that, although I was under the impression that these files are actually incorporated into the existing region files. Still, it's definitely different behavior from what we were seeing before our recent upgrade.

- Adam

On 4/29/11 10:41 AM, Patrick Angeles wrote:
Adam,

They are probably not deleted, but moved to the appropriate region
subdirectory under /hbase.

On Fri, Apr 29, 2011 at 1:15 PM, Adam Phelps <[email protected]> wrote:

I just verified this, and the hfiles seem to be deleted one at a time as
the bulk load runs.

- Adam


On 4/28/11 4:28 PM, Stack wrote:

I took a look through the code and don't see any explicit removes and
looking through history of changes to the file, I don't see any change
of substance.

Can you figure out what is doing the delete? At what stage? Is it as
completebulkload runs?

St.Ack

On Thu, Apr 28, 2011 at 10:59 AM, Adam Phelps <[email protected]> wrote:

We were using a backup scheme for our system where we have map-reduce jobs
generating HFiles, which we then loaded using LoadIncrementalHFiles before
making a remote copy of them using distcp.

However, we just upgraded HBase (we're using Cloudera's package, so we went
from CDH3B4 to CDH3U0, both of which are versions of 0.90.1), and discovered
that the HFiles now get deleted by the load operation.  Is this a recent
change?  Is there a configuration variable to revert this behavior?

We can work around it by doing the copy before the load, but that is less
than optimal in our scenario as we'd prefer to have quicker access to the
data in HBase.

- Adam





