[ 
https://issues.apache.org/jira/browse/HBASE-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12593451#action_12593451
 ] 

Andrew Purtell commented on HBASE-47:
-------------------------------------

Please find attached another patch. It includes a simple unit test (TestTTL) 
that completes successfully for me:

% ant test -Dtestcase=TestTTL
...
    [junit] Tests run:1, Failures: 0, Errors:0, Time elapsed: 71.449 sec.

I have also performed some limited testing with HQL in a single master / 
single regionserver environment, e.g. using disable/enable table to force 
flushes and the like, then select to confirm the expected behavior, in 
conjunction with DEBUG logging. I'm going to try for some more substantial 
testing with a multimillion row data set, but it may take me some time to 
complete that.

> option to set TTL for columns in hbase
> --------------------------------------
>
>                 Key: HBASE-47
>                 URL: https://issues.apache.org/jira/browse/HBASE-47
>             Project: Hadoop HBase
>          Issue Type: New Feature
>          Components: hql, regionserver
>            Reporter: Billy Pearson
>            Priority: Minor
>         Attachments: hbase-ttl-0.1.patch
>
>
> I would like to see an option to set a TTL on columns in HBase. This 
> feature could be helpful for removing stale data from large datasets 
> without having to do a full scan of the dataset and then issue deletes.
> Examples:
> Say I am crawling pages and only refreshing pages based on a set score. If 
> some pages do not get updated within X days, the old version of the page 
> gets removed from the dataset.
> Say I am stripping links out of HTML and storing them. If a link is removed 
> from a page, I would need to issue a delete statement to remove that link 
> from the dataset. With a TTL, the link data would remove itself if not 
> updated within X seconds. These are just examples based on crawling, as in 
> Nutch, but I can foresee many apps using this option.
> This is a feature in Bigtable that is handled when Bigtable does 
> garbage collection.
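
For illustration only, the expiry rule described in the issue could be 
sketched roughly as below. This is not code from the attached patch: the 
class name TtlSketch, the FOREVER sentinel, and both method names are 
hypothetical, and it assumes cell timestamps in milliseconds and a TTL given 
in seconds.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a TTL check; none of these names come from the patch.
class TtlSketch {
    // Assumed sentinel meaning "never expire".
    static final int FOREVER = -1;

    // A cell is expired once its age is strictly greater than the TTL.
    static boolean isExpired(long cellTimestampMs, int ttlSeconds, long nowMs) {
        if (ttlSeconds == FOREVER) {
            return false;
        }
        return nowMs - cellTimestampMs > ttlSeconds * 1000L;
    }

    // Returns only the versions (keyed by timestamp) that are still live.
    static <V> Map<Long, V> liveVersions(Map<Long, V> versions,
                                         int ttlSeconds, long nowMs) {
        Map<Long, V> live = new TreeMap<>();
        for (Map.Entry<Long, V> e : versions.entrySet()) {
            if (!isExpired(e.getKey(), ttlSeconds, nowMs)) {
                live.put(e.getKey(), e.getValue());
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Map<Long, String> versions = new TreeMap<>();
        versions.put(1_000L, "old link");
        versions.put(9_000L, "fresh link");
        // With a 5 second TTL at time 10000 ms, only the newer entry survives.
        System.out.println(liveVersions(versions, 5, 10_000L));
    }
}
```

A store could apply a check like this while reading or compacting, so that 
stale cells drop out without a client-side scan and explicit deletes.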

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
