Re: Bulk load using HFileOutputFormat.RecordWriter

Stack Fri, 07 Jan 2011 10:17:13 -0800

On Fri, Jan 7, 2011 at 9:31 AM, Nanheng Wu <[email protected]> wrote:
> Also do you see a problem with dropping tables while serving queries
> for other tables?


You mean doing disable, then drop, in the shell?  That functionality is
kinda flaky in 0.20; it does not work reliably.  The disable action runs
through all regions and asks the regionservers to close out the
regions of the table.  You can run the disable while serving queries,
but closing the old regions will put load on the system, perhaps
dragging down read latency, since regions are usually flushed and
compacted before the close can complete.  (We need a facility for
saying just close, no flush, no compact, for the case of a table we
know we don't want to keep.)  The drop then deletes the table's entries
from .META. and deletes its content in HDFS.

You might be better off writing a few scripts of your own that do a
slow-motion remove of the old table.  They'd pick a region off .META.,
disable the individual region, check that the disable had happened,
then remove the region from .META. and HDFS.  You'd run the delete
slowly to ensure serving is not affected.
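
The loop such a script would run can be sketched roughly as below.  This
is only an outline under stated assumptions, not a working HBase tool:
slow_drop and every callable passed to it (close_region, is_closed,
remove_from_meta, remove_from_hdfs) are hypothetical placeholders you
would have to implement against your own cluster, and the pause and
polling values are arbitrary.

```python
import time
from typing import Callable, Iterable


def slow_drop(
    regions: Iterable[str],
    close_region: Callable[[str], None],      # hypothetical: ask for the region to be closed
    is_closed: Callable[[str], bool],         # hypothetical: confirm the close happened
    remove_from_meta: Callable[[str], None],  # hypothetical: delete the region's .META. entry
    remove_from_hdfs: Callable[[str], None],  # hypothetical: delete the region's files in HDFS
    pause_seconds: float = 30.0,              # arbitrary throttle between regions
    poll_seconds: float = 1.0,
    max_polls: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> list:
    """Remove regions one at a time, pausing between them to limit load."""
    removed = []
    for region in regions:
        close_region(region)
        # Do not delete anything until the close has been confirmed.
        for _ in range(max_polls):
            if is_closed(region):
                break
            sleep(poll_seconds)
        else:
            raise RuntimeError(f"region {region} did not close; stopping")
        remove_from_meta(region)
        remove_from_hdfs(region)
        removed.append(region)
        # Throttle so that serving other tables is not disturbed.
        sleep(pause_seconds)
    return removed
```

Stopping at the first region that fails to close keeps the script from
deleting data for a region that may still be open somewhere.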

St.Ack


> We are using hbase 20 right now and using this bulk
> load method we are getting great performance but it does require using
> a new table for each load. We want to clean up older data by dropping
> their tables either when a new table is loaded or by a cron job. What
> kind of impact would this approach have on reads? Thanks!
>
