Re: Hot (consistent) incremental backup

Vladimir Rodionov Thu, 02 Jul 2015 15:25:52 -0700

No, I am not aware of any solution for incremental backup other than
HBASE-7912.
You can try the patch from HBASE-11085. It compiles and runs on master (2.0)
and should compile, probably with minimal changes, on 1.0-1.1, but not on 0.98.
This should give you the basic features of full/incremental backup and restore.


-Vlad

On Thu, Jul 2, 2015 at 2:34 PM, Nicola Ferraro <[email protected]>
wrote:

> Thank you Vlad,
> I looked at the design document and found the approach really interesting.
> I have also checked some of the linked jiras and found out that there is
> still much work ahead for developing and testing the solution, though you
> are doing a great job.
> I was wondering whether you know of any solution/workaround for hot,
> consistent and incremental backup that is already applicable to versions
> up to 1.1. I know that snapshots can be taken online and are consistent,
> but, in the application on which I am currently working, some tables are
> expected to become huge after some time, so there is a strong need for an
> incremental solution.
>
> Thank you for your support,
> Nicola
>
>
>
> On Thu, Jul 2, 2015 at 10:00 PM, Vladimir Rodionov <
> [email protected]> wrote:
>
> > Hi, Nicola
> >
> > I recommend reading the HBASE-7912 design doc (it was updated today).
> > https://issues.apache.org/jira/browse/HBASE-7912
> >
> > -Vlad
> >
> > On Thu, Jul 2, 2015 at 11:46 AM, Nicola Ferraro <[email protected]>
> > wrote:
> >
> > > HBase has many options for backing up the data stored in a table.
> > > The "export" tool is described by O'Reilly (HBase: The Definitive
> > > Guide), but also here [
> > > http://blog.cloudera.com/blog/2013/11/approaches-to-backup-and-disaster-recovery-in-hbase/comment-page-1/#comment-63294
> > > ]
> > >
> > > Essentially, the procedure consists of:
> > > - performing the backup from time 0 to time t1
> > > - performing the backup from time t1 to time t2
> > > - ... and so on
> > >
> > > Suppose we want to perform an incremental backup from t1 to t2.
> > > Obviously the backup will start at a time t3 greater than or equal to
> > > t2 and finish at time t4.
> > > An export backup is a MapReduce job that essentially queries HBase in
> > > order to retrieve data updated from time t1 to t2.
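
The windowed procedure above can be sketched in a few lines of Python. This is only a toy model, not the export tool itself: `export_windows`, `export`, the cell keys and the integer timestamps are hypothetical names and values chosen for illustration, and treating each window as half-open, [start, end), is an assumption of the sketch.

```python
from typing import Dict, List, Tuple

Cell = Tuple[int, str]  # (timestamp, value); hypothetical toy representation


def export_windows(checkpoints: List[int]) -> List[Tuple[int, int]]:
    """Turn checkpoints [0, t1, t2, ...] into consecutive (start, end) windows."""
    return list(zip(checkpoints, checkpoints[1:]))


def export(cells: Dict[str, Cell], start: int, end: int) -> Dict[str, Cell]:
    """Return the cells whose timestamp lies in [start, end) (assumed half-open)."""
    return {key: (ts, value) for key, (ts, value) in cells.items()
            if start <= ts < end}


# Toy table with one version per cell.
cells = {
    "row1/cf:a": (5, "v1"),
    "row2/cf:a": (12, "v2"),
    "row3/cf:a": (25, "v3"),
}

windows = export_windows([0, 10, 20, 30])       # 0..t1, t1..t2, t2..t3
backups = [export(cells, s, e) for s, e in windows]
```

Each entry of `backups` holds only the cells whose timestamp falls in its window, which is what makes every run after the first one incremental.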
> > >
> > > Now, suppose that a client starts writing a particular cell right
> > > before t2 and updates it continuously with a different value every
> > > second.
> > >
> > > Fresh data is written only to the WAL (not checked by the export tool)
> > > and the memstore, so every time the client writes a different cell
> > > value, the old data is lost (assuming we are not using data versioning).
> > >
> > > This means that, if the client overwrites the cell after t2 but before
> > > t3, the backup process will not export a consistent snapshot made at
> > > time t2; instead, the backup will contain the fresh data written after
> > > t2. This could also happen with data written by the client after t3 and
> > > before t4 (i.e. while the backup is in progress).
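
The scenario above can be replayed with a toy store that keeps only the newest version of each cell, which is the simplifying assumption made in this message and not a statement about HBase internals. `LatestOnlyStore` and all timestamps are hypothetical.

```python
class LatestOnlyStore:
    """Toy store: a new write replaces the previous version (assumption)."""

    def __init__(self):
        self.cells = {}  # key -> (timestamp, value)

    def put(self, key, ts, value):
        self.cells[key] = (ts, value)

    def export(self, start, end=None):
        """Export cells with start <= ts < end; end=None means no upper bound."""
        return {key: value for key, (ts, value) in self.cells.items()
                if ts >= start and (end is None or ts < end)}


t1, t2, t3 = 10, 20, 30
store = LatestOnlyStore()

store.put("row1/cf:a", t2 - 1, "A")            # written right before t2
state_at_t2 = {k: v for k, (_, v) in store.cells.items()}

store.put("row1/cf:a", t2 + 5, "B")            # overwritten after t2, before t3

# The export job only starts at t3, after the overwrite has happened.
bounded = store.export(t1, t2)                 # window t1..t2
unbounded = store.export(t1)                   # no upper bound on the timestamp
```

In this model neither export reproduces the table as it was at t2: the bounded export no longer finds "A", and the unbounded one picks up the later value "B".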
> > >
> > > In order to make the incremental (consistent) backup work, I see two
> > > options:
> > > - Enable (infinite) version history on all data written to HBase (to
> > > avoid overwriting in the memstore)
> > > - Disable compaction temporarily, force a memstore flush (e.g. with a
> > > "snapshot" command), perform the backup with t2 being the snapshot
> > > time, then re-enable compaction.
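
As a rough illustration of the first option, the toy model below keeps every version of a cell, so a window export can still return the value that was current inside that window even if the cell was overwritten later. `VersionedStore` and the timestamps are hypothetical, and the model ignores how HBase actually stores and prunes versions.

```python
from collections import defaultdict


class VersionedStore:
    """Toy store that never discards old versions (unbounded history)."""

    def __init__(self):
        self.versions = defaultdict(list)  # key -> [(timestamp, value), ...]

    def put(self, key, ts, value):
        self.versions[key].append((ts, value))

    def export(self, start, end):
        """For each cell, return the newest value with start <= ts < end."""
        result = {}
        for key, history in self.versions.items():
            in_window = [(ts, v) for ts, v in history if start <= ts < end]
            if in_window:
                result[key] = max(in_window, key=lambda pair: pair[0])[1]
        return result


store = VersionedStore()
store.put("row1/cf:a", 19, "A")   # right before t2 = 20
store.put("row1/cf:a", 25, "B")   # overwrite after t2
store.put("row1/cf:a", 31, "C")   # overwrite while a later export is running

first_increment = store.export(10, 20)
second_increment = store.export(20, 30)
```

The price of this option is that the history grows with every write, since no version is ever dropped.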
> > >
> > > I don't know if the second option is feasible, as I did not find a way
> > > to disable compaction temporarily.
> > >
> > > Is there any other, reliable, feasible option to execute hot +
> > > consistent + incremental backups with HBase?
> > >
> > > Nicola
> > >
> >
>
