> Unfortunately it confirmed my suspicion that the current TTL is
> implemented purely based on active compaction. And for a log table /
> history data table, the current implementation is not sufficient.

You continue to make that statement, but it is not an accurate statement.

HBase respects TTL when returning answers. At no time will you see a value that 
has expired.

So it is not "purely based on active compaction". 
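
If you want to check this for yourself, a quick test against the Java
client looks roughly like the sketch below (untested as written; the
table and family names are made up, and it assumes 'ttl_test' already
exists with a family 'f' declared with TTL => 600):

    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.client.Get;
    import org.apache.hadoop.hbase.client.HTable;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.util.Bytes;

    public class TtlReadCheck {
      public static void main(String[] args) throws Exception {
        HTable table = new HTable(new HBaseConfiguration(), "ttl_test");
        Put put = new Put(Bytes.toBytes("row1"));
        put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("v"));
        table.put(put);

        Thread.sleep(601 * 1000L);   // wait just past the 600 second family TTL

        // No compaction need have run yet, but the expired cell is still
        // filtered out of the answer.
        Result r = table.get(new Get(Bytes.toBytes("row1")));
        System.out.println("visible after TTL? " + !r.isEmpty());   // prints false
      }
    }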

Let us not be overly general in our language here. You are claiming a feature 
is broken when in fact it is not broken; it functions as advertised.

Best regards,

    - Andy

Why is this email five sentences or less?
http://five.sentenc.es/


--- On Wed, 9/15/10, Jinsong Hu <[email protected]> wrote:

> From: Jinsong Hu <[email protected]>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
> To: [email protected], [email protected]
> Date: Wednesday, September 15, 2010, 5:50 PM
> I artificially set TTL to 10 minutes so that I can get the results
> quicker and don't have to wait a day for results. The TTL was set to
> 600 seconds (equal to 10 minutes) when I did the testing.
> 
> In a real application, the TTL will be set to several months to years.
> 
> One thing I am not clear about with major compaction is: for regions
> with a single map file, will hbase actually load it and remove the
> records older than TTL? I read the documentation and it doesn't seem
> to be the case. From an engineering point of view, it also doesn't
> make sense to run compaction on a region that has only a single map
> file. The consequence is that for regions with a single map file and
> old data, the old data will never be dropped even though it is well
> past its TTL.
> 
> I designed the TTL test case to see whether it works under
> different scenarios and figure out how it is actually done.
> Unfortunately it confirmed my suspicion that the current TTL is
> implemented purely based on active compaction. And for a log table /
> history data table, the current implementation is not sufficient.
> 
> Jimmy
> 
> --------------------------------------------------
> From: "Andrew Purtell" <[email protected]>
> Sent: Wednesday, September 15, 2010 5:33 PM
> To: <[email protected]>
> Subject: Re: hbase doesn't delete data older than TTL in old regions
> 
> >> I did a test with 2 key structures: 1. time:random, and 2. random:time.
> >> The TTL is set to 10 minutes. The time is the current system time. The
> >> random is a random string 2-10 characters long.
> > 
> > This use case doesn't make much sense the way HBase currently works.
> > You can set the TTL to 10 minutes, but by default major compaction
> > runs every 24 hours. This can be tuned down; I've run with it every 4
> > or 8 hours during various experiments with different operational
> > conditions. However, TTL is specified in seconds rather than
> > milliseconds, given the notion that the typical TTL is greater than
> > the major compaction interval.
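> >
> > For reference, the knob for that is hbase.hregion.majorcompaction in
> > hbase-site.xml, in milliseconds; something like the following brings
> > the interval down to 4 hours:
> >
> >   <property>
> >     <name>hbase.hregion.majorcompaction</name>
> >     <value>14400000</value>
> >   </property>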
> > 
> > If TTL is so short, maybe it should not be flushed from memstore at
> > all? Is that what you want?
> > 
> >    - Andy
> > 
> > 
> >> From: Jinsong Hu <[email protected]>
> >> Subject: Re: hbase doesn't delete data older than TTL in old regions
> >> To: [email protected]
> >> Date: Wednesday, September 15, 2010, 11:56 AM
> >> Hi, ryan:
> >> I did a test with 2 key structures: 1. time:random, and 2. random:time.
> >> The TTL is set to 10 minutes. The time is the current system time. The
> >> random is a random string 2-10 characters long.
> >> 
> >> I wrote a test program to continuously pump data into the hbase
> >> table, with the timestamps advancing as time goes on. For the second
> >> key structure, the number of rows remains approximately constant
> >> after it reaches a certain limit. I also checked a specific row,
> >> waited 20 minutes, checked it again, and found it was indeed gone.
> >> 
> >> In the first key case, the number of rows continues to grow and the
> >> number of regions continues to grow, to a number much higher than in
> >> the other case, and it doesn't stop. I checked some stores with data
> >> several hours old and they still remain there without getting deleted.
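> >>
> >> For reference, the pump loop in my test program was roughly the
> >> following (a simplified sketch; the real table and family names are
> >> different):
> >>
> >>   import java.util.Random;
> >>   import org.apache.hadoop.hbase.HBaseConfiguration;
> >>   import org.apache.hadoop.hbase.client.HTable;
> >>   import org.apache.hadoop.hbase.client.Put;
> >>   import org.apache.hadoop.hbase.util.Bytes;
> >>
> >>   public class TtlPump {
> >>     public static void main(String[] args) throws Exception {
> >>       HTable table = new HTable(new HBaseConfiguration(), "ttl_test");
> >>       Random rnd = new Random();
> >>       while (true) {
> >>         // random string, 2-10 characters long
> >>         int len = 2 + rnd.nextInt(9);
> >>         StringBuilder random = new StringBuilder();
> >>         for (int i = 0; i < len; i++) {
> >>           random.append((char) ('a' + rnd.nextInt(26)));
> >>         }
> >>         long now = System.currentTimeMillis();
> >>         String key1 = now + ":" + random;    // case 1: time:random
> >>         String key2 = random + ":" + now;    // case 2: random:time
> >>         Put put = new Put(Bytes.toBytes(key1));  // use key2 for case 2
> >>         put.add(Bytes.toBytes("f"), Bytes.toBytes("q"), Bytes.toBytes("payload"));
> >>         table.put(put);
> >>       }
> >>     }
> >>   }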
> >> 
> >> Jimmy.
> >> 
> >> --------------------------------------------------
> >> From: "Ryan Rawson" <[email protected]>
> >> Sent: Wednesday, September 15, 2010 11:43 AM
> >> To: <[email protected]>
> >> Subject: Re: hbase doesn't delete data older than TTL in old regions
> >> 
> >> > I feel the need to pipe in here, since people are accusing hbase of
> >> > having a broken feature, 'TTL', when the description in this email
> >> > thread, and my own knowledge, doesn't really describe a broken
> >> > feature. Non-optimal maybe, but not broken.
> >> >
> >> > First off, the TTL feature works on the timestamp, so the rowkey
> >> > structure is not related; the timestamp is stored in a separate
> >> > field. If you are also storing the data in row-key chronological
> >> > order, then you may end up with sparse or 'small' regions. But that
> >> > doesn't mean the feature is broken, i.e. that it does not remove
> >> > data older than the TTL. It needs tuning, yes, but it is not broken.
> >> >
> >> > Also note that "client side deletes" work the same way that TTL
> >> > does: you insert a tombstone marker, and then a compaction actually
> >> > purges the data itself.
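> >> >
> >> > To be concrete, something like the sketch below (table and row names
> >> > made up) only writes a tombstone at the time of the call; the old
> >> > cell bytes stay in the storefiles until a compaction rewrites them:
> >> >
> >> >   import org.apache.hadoop.hbase.HBaseConfiguration;
> >> >   import org.apache.hadoop.hbase.client.Delete;
> >> >   import org.apache.hadoop.hbase.client.HTable;
> >> >   import org.apache.hadoop.hbase.util.Bytes;
> >> >
> >> >   public class TombstoneExample {
> >> >     public static void main(String[] args) throws Exception {
> >> >       HTable table = new HTable(new HBaseConfiguration(), "some_table");
> >> >       // Records a delete marker (tombstone) for the whole row.
> >> >       table.delete(new Delete(Bytes.toBytes("some_row")));
> >> >       // Reads skip the row from now on, but the underlying KeyValues
> >> >       // are only physically removed when the store is compacted.
> >> >     }
> >> >   }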
> >> >
> >> > -ryan
> >> >
> >> > On Wed, Sep 15, 2010 at 11:26 AM, Jinsong Hu <[email protected]> wrote:
> >> >> I opened a ticket, https://issues.apache.org/jira/browse/HBASE-2999, to
> >> >> track the issue. Dropping the old store, and updating the adjacent
> >> >> region's key range once all stores for a region are gone, is
> >> >> probably the cheapest solution, both in terms of coding and in
> >> >> terms of resource usage in the cluster. Do we know when this can
> >> >> be done?
> >> >>
> >> >>
> >> >> Jimmy.
> >> >>
> >> >>
> >> >>
> >> >> --------------------------------------------------
> >> >> From: "Jonathan Gray" <[email protected]>
> >> >> Sent: Wednesday, September 15, 2010 11:06 AM
> >> >> To: <[email protected]>
> >> >> Subject: RE: hbase doesn't delete data older than TTL in old regions
> >> >>
> >> >>> This sounds reasonable.
> >> >>>
> >> >>> We are tracking min/max timestamps in storefiles too, so it's
> >> >>> possible that we could expire some files of a region as well,
> >> >>> even if the region was not completely expired.
> >> >>>
> >> >>> Jinsong, mind filing a jira?
> >> >>>
> >> >>> JG
> >> >>>
> >> >>>> -----Original Message-----
> >> >>>> From: Jinsong Hu [mailto:[email protected]]
> >> >>>> Sent: Wednesday, September 15, 2010 10:39 AM
> >> >>>> To: [email protected]
> >> >>>> Subject: Re: hbase doesn't delete data older than TTL in old regions
> >> >>>>
> >> >>>> Yes, the current compaction-based TTL is working as advertised if
> >> >>>> the key randomly distributes the incoming data among all regions.
> >> >>>> However, if the key is designed in chronological order, the TTL
> >> >>>> doesn't really work, as no compaction will happen for data already
> >> >>>> written. So we can't say that the current TTL really works as
> >> >>>> advertised, as it is key-structure dependent.
> >> >>>>
> >> >>>> This is a pity, because a major use case for hbase is for people
> >> >>>> to store history or log data, and normally people only want to
> >> >>>> retain the data for a fixed period. For example, the US government
> >> >>>> default data retention policy is 7 years. That data is saved in
> >> >>>> chronological order, and the current TTL implementation doesn't
> >> >>>> work at all for that kind of use case.
> >> >>>>
> >> >>>> In order for that use case to really work, hbase needs to have an
> >> >>>> active thread that periodically runs, checks whether there is data
> >> >>>> older than TTL, deletes the data older than TTL if necessary, and
> >> >>>> compacts small regions older than a certain time period into
> >> >>>> larger ones to save system resources (a rough sketch of such a
> >> >>>> chore follows the list below). It can optimize the deletion by
> >> >>>> deleting the whole region if it detects that the last timestamp
> >> >>>> for the region is older than TTL. There should be a few parameters
> >> >>>> to configure for hbase:
> >> >>>>
> >> >>>> 1. whether to disable/enable the TTL thread.
> >> >>>> 2. the interval at which the TTL thread will run. Maybe we can use
> >> >>>> a special value like 0 to indicate that we don't run the TTL
> >> >>>> thread, thus saving one configuration parameter. The default
> >> >>>> interval should probably be set to 1 day.
> >> >>>> 3. how small a region has to be before it is merged. It should be
> >> >>>> a percentage of the store size. For example, if 2 consecutive
> >> >>>> regions are only 10% of the store size (default is 256M), we can
> >> >>>> initiate a region merge. We probably need a parameter to limit the
> >> >>>> merging too; for example, we only merge regions whose largest
> >> >>>> timestamp is older than half of the TTL.
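> >> >>>>
> >> >>>> Very roughly, I am imagining something along these lines (a pure
> >> >>>> sketch in Java form; the RegionView interface and every method
> >> >>>> name in it are invented for illustration, nothing like this
> >> >>>> exists in hbase today):
> >> >>>>
> >> >>>>   /** Hypothetical periodic chore; parameter numbers refer to the list above. */
> >> >>>>   public class TtlChoreSketch {
> >> >>>>
> >> >>>>     /** Invented view of what the region server knows per region. */
> >> >>>>     public interface RegionView {
> >> >>>>       long familyTtlMillis();
> >> >>>>       long maxCellTimestamp();
> >> >>>>       boolean hasCellsOlderThan(long ageMillis);
> >> >>>>       void dropAndFixMetaKeyRange();   // delete whole region, adjust .META.
> >> >>>>       void requestMajorCompaction();   // compact away only the expired cells
> >> >>>>     }
> >> >>>>
> >> >>>>     private final long intervalMillis;          // parameter 2; 0 = disabled (parameter 1)
> >> >>>>     private final Iterable<RegionView> regions; // supplied by the region server
> >> >>>>
> >> >>>>     public TtlChoreSketch(long intervalMillis, Iterable<RegionView> regions) {
> >> >>>>       this.intervalMillis = intervalMillis;
> >> >>>>       this.regions = regions;
> >> >>>>     }
> >> >>>>
> >> >>>>     public void runOnce() {
> >> >>>>       long now = System.currentTimeMillis();
> >> >>>>       for (RegionView r : regions) {
> >> >>>>         long ttl = r.familyTtlMillis();
> >> >>>>         if (now - r.maxCellTimestamp() > ttl) {
> >> >>>>           r.dropAndFixMetaKeyRange();     // everything expired: cheap wholesale drop
> >> >>>>         } else if (r.hasCellsOlderThan(ttl)) {
> >> >>>>           r.requestMajorCompaction();     // only some cells expired
> >> >>>>         }
> >> >>>>       }
> >> >>>>       // region merging (parameter 3) would run here as well
> >> >>>>     }
> >> >>>>   }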
> >> >>>>
> >> >>>>
> >> >>>> Jimmy
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> --------------------------------------------------
> >> >>>> From: "Stack" <[email protected]>
> >> >>>> Sent: Wednesday, September 15, 2010 10:08 AM
> >> >>>> To: <[email protected]>
> >> >>>> Subject: Re: hbase doesn't delete data older than TTL in old regions
> >> >>>>
> >> >>>> > On Wed, Sep 15, 2010 at 9:54 AM, Jinsong Hu <[email protected]> wrote:
> >> >>>> >> I have tested the TTL for hbase and found that it relies on
> >> >>>> >> compaction to remove old data. However, if a region has data
> >> >>>> >> that is older than TTL, and there is no trigger to compact it,
> >> >>>> >> then the data will remain there forever, wasting disk space and
> >> >>>> >> memory.
> >> >>>> >>
> >> >>>> >
> >> >>>> > So it's working as advertised then?
> >> >>>> >
> >> >>>> > There's currently an issue where we can skip major compactions
> >> >>>> > if your write loading has a particular character: hbase-2990.
> >> >>>> >
> >> >>>> >
> >> >>>> >> It appears at this stage, to really remove data older than TTL
> >> >>>> >> we need to start a client-side deletion request.
> >> >>>> >
> >> >>>> > Or run a manual major compaction:
> >> >>>> >
> >> >>>> > $ echo "major_compact TABLENAME" | ./bin/hbase shell
> >> >>>> >
> >> >>>> >
> >> >>>> >
> >> >>>> >> This is really a pity because it is a more expensive way to
> >> >>>> >> get the job done. Another side effect of this is that as time
> >> >>>> >> goes on, we will end up with some small regions if the data is
> >> >>>> >> saved in chronological order in regions. It appears that hbase
> >> >>>> >> doesn't have a mechanism to merge 2 consecutive small regions
> >> >>>> >> into a bigger one at this time.
> >> >>>> >
> >> >>>> > $ ./bin/hbase org.apache.hadoop.hbase.util.Merge
> >> >>>> > Usage: bin/hbase merge <table-name> <region-1> <region-2>
> >> >>>> >
> >> >>>> > Currently this only works on an offlined table, but there's a
> >> >>>> > patch available to make it run against onlined regions.
> >> >>>> >
> >> >>>> >
> >> >>>> >> So if data is saved in chronological order, sooner or later we
> >> >>>> >> will run out of capacity, even if the amount of data in hbase
> >> >>>> >> is small, because we have lots of regions with small storage
> >> >>>> >> space.
> >> >>>> >>
> >> >>>> >> A much cheaper way to remove data older than TTL would be to
> >> >>>> >> remember the latest timestamp for the region in the .META.
> >> >>>> >> table, and if that time is older than TTL, we just adjust the
> >> >>>> >> row in .META. and delete the store, without doing any
> >> >>>> >> compaction.
> >> >>>> >>
> >> >>>> >
> >> >>>> > Say more on the above. It sounds promising. Are you suggesting
> >> >>>> > that in addition to compactions we also have a provision where
> >> >>>> > we keep account of a storefile's latest timestamp (we already do
> >> >>>> > this I believe), and that when now - storefile-timestamp > ttl,
> >> >>>> > we just remove the storefile wholesale? That sounds like it
> >> >>>> > could work, if that is what you are suggesting. Mind filing an
> >> >>>> > issue w/ a detailed description?
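> >> >>>> >
> >> >>>> > To make the check concrete (an illustration only; the class and
> >> >>>> > method names below are invented, not anything in the tree):
> >> >>>> >
> >> >>>> >   // Invented stand-in for the per-storefile check being proposed.
> >> >>>> >   public class ExpiredStoreFileCheck {
> >> >>>> >     /** True if every cell in the file is already past the family
> >> >>>> >      *  TTL, so the whole file could be dropped without a rewrite. */
> >> >>>> >     public static boolean wholeFileExpired(long storefileMaxTimestamp,
> >> >>>> >         long ttlMillis, long now) {
> >> >>>> >       return now - storefileMaxTimestamp > ttlMillis;
> >> >>>> >     }
> >> >>>> >   }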
> >> >>>> >
> >> >>>> > Thanks,
> >> >>>> > St.Ack
> >> >>>> >
> >> >>>> >
> >> >>>> >
> >> >>>> >> Can this be added to the hbase requirements for a future release?
> >> >>>> >>
> >> >>>> >> Jimmy
> >> >>>> >>
> >> >>>> >>
> >> >>>> >>
> >> >>>> >
> >> >>>
> >> >>
> >> >
> >> 
> > 
> > 
> > 
> > 
> > 
> 



