This sounds reasonable. We are tracking min/max timestamps in storefiles too, so it's possible that we could expire some files of a region as well, even if the region was not completely expired.
Jinsong, mind filing a jira?

JG

> -----Original Message-----
> From: Jinsong Hu [mailto:[email protected]]
> Sent: Wednesday, September 15, 2010 10:39 AM
> To: [email protected]
> Subject: Re: hbase doesn't delete data older than TTL in old regions
>
> Yes, the current compaction-based TTL works as advertised only if the
> key randomly distributes incoming data among all regions. If the key is
> designed in chronological order, the TTL doesn't really work, because no
> compaction will happen for data that has already been written. So we
> can't say that the current TTL really works as advertised; it is
> key-structure dependent.
>
> This is a pity, because a major use case for hbase is for people to
> store history or log data. Normally people only want to retain the data
> for a fixed period; for example, the US government's default data
> retention policy is 7 years. Such data are saved in chronological order,
> and the current TTL implementation doesn't work at all for that kind of
> use case.
>
> For that use case to really work, hbase needs an active thread that
> periodically runs, checks whether there are data older than the TTL,
> deletes them, and compacts small regions older than a certain time
> period into larger ones to save system resources. It can optimize the
> deletion by deleting a whole region if it detects that the last
> timestamp for the region is older than the TTL. There should be a few
> parameters to configure in hbase:
>
> 1. Whether to enable/disable the TTL thread.
> 2. The interval at which the TTL thread runs. Maybe we can use a
> special value like 0 to indicate that we don't run the TTL thread, thus
> saving one configuration parameter. The default interval should
> probably be 1 day.
> 3. How small a region must be before it is merged. It should be a
> percentage of the store size: for example, if 2 consecutive regions
> together are only 10% of the store size (default is 256M), we can
> initiate a region merge.
> We probably also need a parameter to throttle the merge: for example,
> we only merge regions whose largest timestamp is older than half of the
> TTL.
>
> Jimmy
>
> --------------------------------------------------
> From: "Stack" <[email protected]>
> Sent: Wednesday, September 15, 2010 10:08 AM
> To: <[email protected]>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
>
> On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <[email protected]> wrote:
>> I have tested the TTL for hbase and found that it relies on compaction
>> to remove old data. However, if a region has data that is older than
>> TTL, and there is no trigger to compact it, then the data will remain
>> there forever, wasting disk space and memory.
>
> So it's working as advertised, then?
>
> There's currently an issue where we can skip major compactions if your
> write loading has a particular character: HBASE-2990.
>
>> It appears that at this stage, to really remove data older than TTL we
>> need to start a client-side deletion request.
>
> Or run a manual major compaction:
>
> $ echo "major_compact TABLENAME" | ./bin/hbase shell
>
>> This is really a pity because it is a more expensive way to get the
>> job done. Another side effect of this is that as time goes on, we will
>> end up with some small regions if the data are saved in chronological
>> order. It appears that hbase doesn't have a mechanism to merge 2
>> consecutive small regions into a bigger one at this time.
>
> $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
> Usage: bin/hbase merge <table-name> <region-1> <region-2>
>
> Currently this only works on an offlined table, but there's a patch
> available to make it run against onlined regions.
>> So if data is saved in chronological order, sooner or later we will
>> run out of capacity, even if the amount of data in hbase is small,
>> because we have lots of regions with small storage space.
>>
>> A much cheaper way to remove data older than TTL would be to remember
>> the latest timestamp for the region in the .META. table, and if that
>> time is older than TTL, just adjust the row in .META. and delete the
>> store, without doing any compaction.
>
> Say more on the above. It sounds promising. Are you suggesting that in
> addition to compactions we also have a provision where we keep account
> of a storefile's latest timestamp (we already do this, I believe), and
> that when now - storefile-timestamp > ttl, we just remove the storefile
> wholesale? That sounds like it could work, if that is what you are
> suggesting. Mind filing an issue with a detailed description?
>
> Thanks,
> St.Ack
>
>> Can this be added to the hbase requirement for a future release?
>>
>> Jimmy
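The wholesale-expiry check discussed in this thread could be sketched roughly as below. All class and method names here are hypothetical, not actual HBase APIs; the only assumption taken from the thread is that each storefile tracks its newest cell timestamp:

```java
// Sketch of expiring a whole storefile by its newest timestamp.
// Names are illustrative only, not real HBase classes.
public class StoreFileExpiry {

    /** A storefile's newest cell timestamp, recorded at flush/compaction time. */
    public static class StoreFileInfo {
        final long maxTimestampMs;
        public StoreFileInfo(long maxTimestampMs) {
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    /**
     * True when every cell in the file is older than the TTL, so the
     * file can be deleted wholesale, without rewriting it in a compaction.
     */
    public static boolean canDropWholesale(StoreFileInfo file, long ttlMs, long nowMs) {
        return nowMs - file.maxTimestampMs > ttlMs;
    }
}
```

Because the check only reads per-file metadata, it avoids the I/O cost of a major compaction for files that are entirely expired; files that straddle the TTL boundary would still need a compaction.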

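To make the merge heuristic proposed earlier in the thread concrete (merge two consecutive regions when together they are under ~10% of the maximum region size and neither has data newer than half the TTL), here is a minimal sketch. All names and thresholds are illustrative, not HBase code:

```java
// Sketch of the proposed merge criterion: merge two adjacent regions
// only when they are both small and cold. Names are illustrative only.
public class RegionMergeCheck {

    public static class RegionInfo {
        final long sizeBytes;
        final long maxTimestampMs;
        public RegionInfo(long sizeBytes, long maxTimestampMs) {
            this.sizeBytes = sizeBytes;
            this.maxTimestampMs = maxTimestampMs;
        }
    }

    /**
     * True when the combined size of two consecutive regions is below
     * mergeFraction of the maximum region size (e.g. 10% of the 256 MB
     * default) and neither region has data newer than half the TTL.
     */
    public static boolean shouldMerge(RegionInfo a, RegionInfo b,
                                      long maxRegionBytes, double mergeFraction,
                                      long ttlMs, long nowMs) {
        boolean small = a.sizeBytes + b.sizeBytes < mergeFraction * maxRegionBytes;
        long newest = Math.max(a.maxTimestampMs, b.maxTimestampMs);
        boolean cold = nowMs - newest > ttlMs / 2;
        return small && cold;
    }
}
```

The half-TTL age guard is the "parameter to throttle the merge" Jinsong mentions: it keeps the TTL thread from merging regions that are still receiving writes and would soon split again.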