No, I am not aware of any solution for incremental backup other than HBASE-7912. You can try the patch from HBASE-11085. It compiles and runs on master (2.0) and should probably compile with minimal changes on 1.0-1.1, but not on 0.98. This should give you the basic features of full and incremental backup and restore.
-Vlad

On Thu, Jul 2, 2015 at 2:34 PM, Nicola Ferraro <[email protected]> wrote:

> Thank you Vlad,
> I looked at the design document and found the approach really interesting.
> I have also checked some of the linked JIRAs and found that there is still
> much work ahead in developing and testing the solution, though you are
> doing a great job.
> I was wondering whether you knew of any solution/workaround for hot,
> consistent and incremental backup that is already applicable to versions
> up to 1.1. I know that snapshots can be taken online and are consistent,
> but, in the application I am currently working on, some tables are
> expected to become huge after some time, so there is a strong need for an
> incremental solution.
>
> Thank you for your support,
> Nicola
>
> On Thu, Jul 2, 2015 at 22:00, Vladimir Rodionov <[email protected]> wrote:
>
> > Hi, Nicola
> >
> > I recommend that you read the HBASE-7912 design doc (it was updated today).
> > https://issues.apache.org/jira/browse/HBASE-7912
> >
> > -Vlad
> >
> > On Thu, Jul 2, 2015 at 11:46 AM, Nicola Ferraro <[email protected]> wrote:
> >
> > > HBase has many options for backing up the data stored in a table.
> > > The "export" tool is described by O'Reilly (HBase, the definitive guide),
> > > but also here [
> > > http://blog.cloudera.com/blog/2013/11/approaches-to-backup-and-disaster-recovery-in-hbase/comment-page-1/#comment-63294
> > > ]
> > > as a way to perform hot and incremental backups on a table.
> > >
> > > Essentially, the procedure consists of:
> > > - performing the backup from time 0 to time t1
> > > - performing the backup from time t1 to time t2
> > > - ... and so on
> > >
> > > Suppose we want to perform an incremental backup from t1 to t2.
> > > Obviously the backup will start at a time t3 greater than or equal to t2
> > > and finish at time t4.
> > > An export-backup is a MapReduce job that essentially queries HBase in
> > > order to retrieve data updated from time t1 to t2.
> > >
> > > Now, suppose that a client starts writing a particular cell right before
> > > t2 and updates it continuously with a different value every second.
> > >
> > > Fresh data is written to the WAL (not checked by the export tool) and
> > > the memstore only, so, every time the client writes a different cell
> > > value, the old data is lost (assuming we are not using data versioning).
> > >
> > > This means that, if the client overwrites the cell after t2 but before
> > > t3, the backup process will not export a consistent snapshot made at
> > > time t2; instead, the backup will contain the fresh data written after
> > > t2. This could also happen with data written by the client after t3 and
> > > before t4 (i.e. while the backup is in progress).
> > >
> > > In order to make the incremental (consistent) backup work, I see two
> > > options:
> > > - Enable (infinite) version history on all data written to HBase (to
> > > avoid overwriting in the memstore)
> > > - Disable compaction temporarily, force a memstore flush (e.g. with a
> > > "snapshot" command), perform the backup with t2 being the snapshot time,
> > > then re-enable compaction.
> > >
> > > I don't know if the second option is feasible, as I did not find a way
> > > to disable compaction temporarily.
> > >
> > > Is there any other reliable, feasible option to execute hot +
> > > consistent + incremental backups with HBase?
> > >
> > > Nicola
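
For reference, the windowed export procedure described in the original question (0 to t1, then t1 to t2, and so on) can be sketched as below. This is only an illustration, not something posted in the thread: the helper name export_commands, the table name, and the output layout are hypothetical. It assumes the Export tool's positional arguments are <tablename> <outputdir> [<versions> [<starttime> [<endtime>]]] with timestamps in epoch milliseconds, and that the time range is half-open (start inclusive, end exclusive). It only builds the command lines; it does not address the consistency concern raised above.

```python
from typing import List, Sequence

# Fully qualified class name of the MapReduce Export tool.
EXPORT_CLASS = "org.apache.hadoop.hbase.mapreduce.Export"


def export_commands(
    table: str,
    out_root: str,
    boundaries: Sequence[int],
    versions: int = 1,
) -> List[List[str]]:
    """Build one Export command per consecutive pair of time boundaries.

    boundaries are epoch-millisecond timestamps, e.g. [0, t1, t2].
    The output directory layout (out_root/table/start-end) is a
    hypothetical convention chosen for this sketch.
    """
    commands = []
    for start, end in zip(boundaries, boundaries[1:]):
        if start >= end:
            raise ValueError(f"window [{start}, {end}) is empty or reversed")
        out_dir = f"{out_root}/{table}/{start}-{end}"
        # Positional arguments: table, output dir, versions, start, end.
        commands.append(
            ["hbase", EXPORT_CLASS, table, out_dir,
             str(versions), str(start), str(end)]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical table and boundaries: a full window, then one increment.
    for cmd in export_commands("mytable", "/backup", [0, 1000, 2000]):
        print(" ".join(cmd))
```

Because each window ends exactly where the next one starts, a cell timestamp falls into at most one window under the half-open assumption. Raising the versions argument would export more than one version per cell in a window, which relates to the first option in the question only if the column family actually retains those versions.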
