On Tue, Jun 1, 2010 at 11:53 PM, Zheng Lv <[email protected]> wrote:
> Hello All,
> We run export job to backup our hbase tables, but it takes so long time.
> Can we first stop the hbase and copy the /hbase directory as backup? If it
> works, what if we dont stop it first? Can we just stop all the jobs writing
> data into hbase? Thanks a lot.

If you stop the cluster and copy hbase.rootdir, that's a complete and truthful copy of the data in the cluster at the time of shutdown.

If you copy while it's running, it'll be fuzzy at the edges since some edits will still be up in memory. Though the in-memory edits have been added to the WAL, you'd need a replay to get these edits back in the mix on restore. That facility does not exist as yet.

You could turn off the writes and do a forced flush on the table. Currently this facility in the shell is unfortunately not synchronous; i.e. it sends out the flush signal across the cluster and then returns immediately. You'd have to do something like watch the logs on all regionservers to see when the flush has completed, or just give it some time and then do the copy.

Related, progress is being made on HBASE-50, a snapshotting facility. Check it out if interested. A design was just posted, so any comments are most welcome.

Thanks,
St.Ack
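
P.S. A rough sketch of the "stop writes, flush, wait, copy" sequence described above, in case it helps. It assumes writes have already been stopped by hand, that the `hbase` and `hadoop` commands are on the PATH, and that hbase.rootdir is /hbase as in the question. The table name, the destination path, and the 60-second wait are made up for illustration; because the shell flush returns before the flush has finished, the wait is only a guess and checking the regionserver logs is the safer route. By default it only reports the commands it would run.

```python
import subprocess
import time


def flush_command(table):
    """Return (argv, stdin_text) that asks the HBase shell to flush one table."""
    # The HBase shell reads commands from stdin; `flush 'table'` is the
    # asynchronous flush mentioned in the reply.
    return ["hbase", "shell"], "flush '%s'\n" % table


def copy_command(src, dst):
    """Return argv for copying the HBase root directory within HDFS."""
    return ["hadoop", "fs", "-cp", src, dst]


def backup(tables, rootdir, dest, wait_seconds=60, dry_run=True):
    """Flush each table, wait, then copy rootdir to dest.

    Assumes all writers have already been stopped. With dry_run=True nothing
    is executed; the planned steps are only returned.
    """
    steps = []
    for table in tables:
        argv, stdin_text = flush_command(table)
        steps.append((argv, stdin_text))
        if not dry_run:
            subprocess.run(argv, input=stdin_text, text=True, check=True)

    if not dry_run:
        # The flush is not synchronous, so this pause is a guess (assumption),
        # not a guarantee that every regionserver has finished flushing.
        time.sleep(wait_seconds)

    argv = copy_command(rootdir, dest)
    steps.append((argv, None))
    if not dry_run:
        subprocess.run(argv, check=True)
    return steps


if __name__ == "__main__":
    # 'mytable' and '/hbase-backup' are hypothetical names.
    for argv, stdin_text in backup(["mytable"], "/hbase", "/hbase-backup"):
        print(argv, repr(stdin_text))
```

Run it as-is to see the planned commands, and pass dry_run=False only once you are happy with them. A copy taken this way is still only as good as the wait: if a flush has not completed, those edits will be missing from the copy.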
