On Wed, Oct 13, 2010 at 9:25 AM, Jeff Zhang <[email protected]> wrote:
> Hi all,
>
> Since HBase has bulk import, could HBase also delete a whole region?
> Currently I have to do a scan, get each row id, and invoke a delete
> operation for each row id, which is inefficient. Internally, a
> region is just some HDFS files, so deleting those files directly
> would be more efficient.
>
> My initial idea is that before a region is deleted, it should
> first be frozen (flush the MemStore to disk, and inhibit any put
> operation on this region). The delete itself would then need only
> some HDFS file operations. I am not sure whether this is possible
> and whether it is on the HBase roadmap?
>

This could be a useful feature.

I think you could script it easily enough in TRUNK.  (I think you need
TRUNK because there you can ask whether a region has closed, and there
is currently no synchronous close of regions.)

1. Close the region (See shell for how to send a close message or look
at HBaseAdmin API doc).
2. While it's closing, you may have to disable the region in .META.
(See bin/*.rb scripts for how to mangle .META.).  IIRC this may not be
necessary in TRUNK (in 0.20.x, it is necessary to prevent the
region being opened elsewhere when the regionserver reports a
successful close).
3. Check the regionserver periodically for the closing region.  When
it's no longer listed among the online regions, you know it's closed.
4. Close the region that falls just after the one you just closed
(same offlining trick as above).  Then fix up .META.: extend the key
range of this following region so that it covers the region just
closed, by setting its startkey to that of the region just closed.
5. Reenable (in 0.20.x this would mean flipping the region to be
enabled again -- in TRUNK, you might have to explicitly open it on a
regionserver -- would have to check).
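
To make steps 3 and 4 concrete, here is a rough sketch of the polling
and of the key-range fixup.  It does not talk to HBase at all: .META. is
modelled as a plain dict, and the names (Meta, wait_until_closed,
drop_region, the online_regions callback) are made up for illustration.
A real script would read and write .META. and ask the regionserver for
its online regions instead.

```python
import time
from typing import Callable, Dict, Set, Tuple

# Hypothetical in-memory stand-in for .META.: region name -> (start_key, end_key).
# An empty end key means "up to the end of the table".
Meta = Dict[str, Tuple[bytes, bytes]]


def wait_until_closed(region: str,
                      online_regions: Callable[[], Set[str]],
                      interval_s: float = 1.0,
                      timeout_s: float = 60.0) -> bool:
    """Step 3: poll until `region` is no longer listed as online.

    `online_regions` is a hypothetical callback standing in for whatever
    call reports the regions currently online on the regionserver.
    Returns True if the region closed before the timeout, else False.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if region not in online_regions():
            return True
        time.sleep(interval_s)
    # One last check after the deadline has passed.
    return region not in online_regions()


def drop_region(meta: Meta, doomed: str) -> Meta:
    """Step 4: remove `doomed` and let the following region cover its keys.

    Returns a new dict; the input is left untouched.
    """
    start, end = meta[doomed]
    remaining = {name: keys for name, keys in meta.items() if name != doomed}

    # The following region is the one whose start key is the doomed end key.
    followers = [name for name, (s, _) in remaining.items() if s == end]
    if end == b"" or len(followers) != 1:
        # The steps above only describe extending the region that follows,
        # so the last region of a table is deliberately not handled here.
        raise ValueError("no unique following region for %r" % doomed)

    follower = followers[0]
    # Extend the follower backwards: its start key becomes the doomed start key.
    remaining[follower] = (start, remaining[follower][1])
    return remaining
```

For example, with regions r1 = ['', 'g'), r2 = ['g', 'p') and
r3 = ['p', ''), calling drop_region(meta, "r2") leaves r1 unchanged and
turns r3 into ['g', ''), so the key space has no hole.  The files of the
dropped region would still have to be removed from HDFS separately.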

If you want to work on the above, open a JIRA and I'll help you out.

St.Ack
