Re: hbase doesn't delete data older than TTL in old regions

Jinsong Hu Thu, 16 Sep 2010 10:31:35 -0700

I updated the ticket with our discussion, and added the following comments:

What I suggest is to make the sweep part of major_compact. Basically, it needs to merge consecutive empty regions into the neighboring region that is not empty. It needs to merge the records in the .META. table and delete the empty directories in HDFS for the empty regions. It should then instruct the region servers to unload the original regions and reload the merged regions.
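
To make the grouping step concrete, here is a rough sketch of how the sweep might decide which regions to merge. It is only the planning logic, not HBase code: the Region class, plan_merges, and merged_range are hypothetical names made up for illustration, and the sketch assumes the regions are already listed in start-key order with a known store file count. Updating .META., removing the HDFS directories, and reloading regions are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    """Hypothetical stand-in for a region's metadata (not an HBase class)."""
    name: str
    start_key: str
    end_key: str
    store_files: int  # 0 means the region holds no store files


def plan_merges(regions: List[Region]) -> List[List[Region]]:
    """Group each run of consecutive empty regions with a non-empty neighbour.

    Assumes `regions` is sorted by start key and covers a contiguous key range.
    Empty regions are merged forward into the next non-empty region; trailing
    empty regions are merged back into the last non-empty one.
    Only groups that actually need a merge (more than one region) are returned.
    """
    groups: List[List[Region]] = []
    pending: List[Region] = []  # empty regions waiting for a non-empty neighbour

    for region in regions:
        if region.store_files == 0:
            pending.append(region)
        else:
            groups.append(pending + [region])
            pending = []

    if pending:
        if groups:
            # Trailing empties join the last non-empty group.
            groups[-1].extend(pending)
        else:
            # Every region is empty: collapse them all into one.
            groups.append(pending)

    return [group for group in groups if len(group) > 1]


def merged_range(group: List[Region]) -> Tuple[str, str]:
    """Key range the merged region would cover."""
    return group[0].start_key, group[-1].end_key


if __name__ == "__main__":
    table = [
        Region("a", "", "k1", 0),
        Region("b", "k1", "k2", 0),
        Region("c", "k2", "k3", 3),
        Region("d", "k3", "k4", 0),
        Region("e", "k4", "", 2),
    ]
    for group in plan_merges(table):
        print([r.name for r in group], merged_range(group))
```

With chronological keys the empty regions are mostly the oldest ones, which is why the sketch merges forward into the next non-empty region by default.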


Jimmy.

--------------------------------------------------
From: "Stack" <[email protected]>
Sent: Thursday, September 16, 2010 9:49 AM
To: <[email protected]>
Subject: Re: hbase doesn't delete data older than TTL in old regions

On Thu, Sep 16, 2010 at 9:32 AM, Jinsong Hu <[email protected]> wrote:

That means, if we run this in a production system and the keys are in
chronological order, we will end up having thousands of regions as time
goes on, and the number of regions never decreases, even though old data
are compacted away. We don't really mind having several empty regions, but
the fact that the number of regions continues to grow without limit as time
goes on is really troublesome. It wastes Hadoop namenode resources, and
wastes memory on the regionserver, as each region takes some memory to
store region info.


Agreed.

It'd be easy enough to write a script to do this run out of cron but
yeah, we should have a facility to sweep hbase and in particular if
regions are empty of store files, merge to neighbour.

Would you mind updating hbase-2999 to make it clear what is needed to
satisfy the issue?  The clearer the stipulation, the easier it is on
the implementor (Patches also accepted if you'd like to have a go at
this yourself).

St.Ack
