Re: hbase hdfs snapshots

rahul gidwani Fri, 10 Jul 2015 15:47:23 -0700

Hi Matteo,

We already use SKIP_FLUSH.  We have 1200+ regionservers and a single table
with 60k regions and 4 column families.  Snapshotting this table with
manifests takes around 30 minutes, compared to just seconds with HDFS.
Cloning this table takes considerably longer.

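For reference, here is a minimal sketch of the two approaches being compared.
It only builds the commands and does not run them. The table name, snapshot
name, and the /hbase/data/default/<table> path are assumptions for
illustration (that path presumes hbase.rootdir is /hbase and the table is in
the default namespace). Like SKIP_FLUSH, an HDFS snapshot only captures what
is already on disk, not the memstore.

```python
import shlex


def hbase_snapshot_command(table, snapshot, skip_flush=True):
    """Return the HBase shell statement for a manifest-based snapshot."""
    # SKIP_FLUSH => true tells HBase not to flush memstores before snapshotting.
    options = ", {SKIP_FLUSH => true}" if skip_flush else ""
    return f"snapshot '{table}', '{snapshot}'{options}"


def hdfs_snapshot_commands(table_dir, snapshot):
    """Return the HDFS CLI invocations (as argument lists) for an HDFS snapshot."""
    return [
        # An administrator must mark the directory as snapshottable first.
        ["hdfs", "dfsadmin", "-allowSnapshot", table_dir],
        # Then the snapshot itself is a metadata-only operation on the NameNode.
        ["hdfs", "dfs", "-createSnapshot", table_dir, snapshot],
    ]


if __name__ == "__main__":
    # Hypothetical names, used only for illustration.
    print(hbase_snapshot_command("big_table", "big_table_snap"))
    for argv in hdfs_snapshot_commands("/hbase/data/default/big_table",
                                       "big_table_snap"):
        print(shlex.join(argv))
```

The first line printed is meant to be pasted into the HBase shell; the other
two are ordinary HDFS command lines.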

For cases where someone wants to run Map/Reduce over snapshots, this could
be much faster, since we could take an HDFS snapshot and bypass the clone.

rahul


On Thu, Jul 9, 2015 at 12:20 PM, Matteo Bertozzi <[email protected]>
wrote:

> On Thu, Jul 9, 2015 at 12:12 PM, rahul gidwani <[email protected]>
>  wrote:
>
> > Even with manifests (Snapshot V2) for our larger tables it can take hours
> > to Snapshot and Clone a table.
> >
>
> At snapshot time, the only thing that can take hours is the "flush".
> If you don't need that (which is what you get with HDFS snapshots), you can
> specify SKIP_FLUSH => true
>
>
> Matteo
>
>
> On Thu, Jul 9, 2015 at 12:12 PM, rahul gidwani <[email protected]>
> wrote:
>
> > HBase snapshots are a very useful feature, but they were implemented
> > before there was the ability to snapshot via HDFS.
> >
> > Newer versions of Hadoop support HDFS snapshots.  I was wondering if the
> > community would be interested in something like a Snapshot V3 where we use
> > HDFS to take these snapshots.
> >
> > Even with manifests (Snapshot V2) for our larger tables it can take hours
> > to Snapshot and Clone a table.
> >
> > Would this feature be of use to anyone?
> >
> > thanks
> > rahul
> >
>
