On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <[email protected]> wrote:
> I have tested the TTL for hbase and found that it relies on compaction to
> remove old data. However, if a region has data that is older
> than TTL, and there is no trigger to compact it, then the data will remain
> there forever, wasting disk space and memory.

So it's working as advertised then?

There's currently an issue where we can skip major compactions if your
write load has a particular character: hbase-2990.

> It appears at this state, to really remove data older than TTL we need to
> start a client side deletion request.

Or run a manual major compaction:

  $ echo "major_compact TABLENAME" | ./bin/hbase shell

> This is really a pity because
> it is a more expensive way to get the job done. Another side effect of
> this is that as time goes on, we will end up with some small
> regions if the data are saved in chronological order in regions. It appears
> that hbase doesn't have a mechanism to merge 2 consecutive
> small regions into a bigger one at this time.

  $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
  Usage: bin/hbase merge <table-name> <region-1> <region-2>

Currently it only works on an offlined table, but there's a patch available
to make it run against onlined regions.

> So if data is saved in
> chronological order, sooner or later we will run out of capacity, even if
> the amount of data in hbase is small, because we have lots of regions with
> small storage space.
>
> A much cheaper way to remove data older than TTL would be to remember the
> latest timestamp for the region in the .META. table
> and if the time is older than TTL, we just adjust the row in .META. and
> delete the store, without doing any compaction.

Say more on the above. It sounds promising. Are you suggesting that, in
addition to compactions, we also have a provision where we keep account of
a storefile's latest timestamp (we already do this, I believe) and that when
now - storefile-timestamp > ttl, we just remove the storefile wholesale?
That sounds like it could work, if that is what you are suggesting.

Mind filing an issue w/ a detailed description?

Thanks,
St.Ack

> Can this be added to the hbase requirement for future release?
>
> Jimmy
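
For illustration only, here is a minimal sketch of the wholesale-removal rule discussed in the reply above (drop a storefile when now - storefile-timestamp > ttl). This is not HBase code: `StoreFileInfo`, `max_timestamp_ms`, and `expired_store_files` are hypothetical names made up for the example, and it assumes each storefile records the timestamp of its newest cell.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StoreFileInfo:
    """Hypothetical stand-in for a storefile's metadata (not an HBase class)."""
    path: str
    max_timestamp_ms: int  # timestamp of the newest cell in the file


def expired_store_files(files: List[StoreFileInfo],
                        ttl_ms: int,
                        now_ms: int) -> List[StoreFileInfo]:
    """Return the files whose newest cell is already older than the TTL.

    If even the newest cell in a file has expired, then every cell in that
    file has expired, so the whole file could be removed without rewriting
    it in a compaction.
    """
    return [f for f in files if now_ms - f.max_timestamp_ms > ttl_ms]
```

A file holding even one cell younger than the TTL is left alone under this rule; such files would still depend on a regular compaction to shed their expired cells.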
