[ 
https://issues.apache.org/jira/browse/HBASE-4071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13089940#comment-13089940
 ] 

jirapos...@reviews.apache.org commented on HBASE-4071:
------------------------------------------------------



bq.  On 2011-08-24 00:03:43, Michael Stack wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java,
 line 127
bq.  > <https://reviews.apache.org/r/1582/diff/7/?file=34607#file34607line127>
bq.  >
bq.  >     I defer to Jon's review here (he knows this stuff...)

This basically just counts versions up instead of down (see next comment below 
as to why).


bq.  On 2011-08-24 00:03:43, Michael Stack wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java,
 line 205
bq.  > <https://reviews.apache.org/r/1582/diff/7/?file=34607#file34607line205>
bq.  >
bq.  >     Why this change?

ExplicitColumnTracker used to count versions down and ScanWildcardColumnTracker 
is counting version up.
This change goes to together with the previous change (increment vs decrement).

I did that for two reasons:

1. When both column trackers follow similar logic it will be easier to change 
them and to simplify them further in the future (such as extracting a GC 
policy).

2. Now the comparison for versions can be:
 if(count >= maxVersions || (count >= minVersions && isExpired(timestamp))
    // Done with versions for this column

whereas otherwise it would need to be:
 if(count == 0) || ((maxVersion-count) < minVersions && isExpired(timestamp))
    // Done with versions for this column

Which is harder to read and less intuitive, I think.


bq.  On 2011-08-24 00:03:43, Michael Stack wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java,
 line 47
bq.  > <https://reviews.apache.org/r/1582/diff/7/?file=34606#file34606line47>
bq.  >
bq.  >     javadoc doesn't match param name -- I can fix on commit.

Argh... One more that I missed. Thanks Stack!


- Lars


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1582/#review1602
-----------------------------------------------------------


On 2011-08-23 05:13:38, Lars Hofhansl wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1582/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-23 05:13:38)
bq.  
bq.  
bq.  Review request for hbase, Todd Lipcon, Michael Stack, and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Allow enforcing a minimum number of versions when TTL is enable for a 
store.
bq.  The GC logic for both versions and TTL is unified inside the 
ColumnTrackers.
bq.  
bq.  
bq.  This addresses bug HBASE-4071.
bq.      https://issues.apache.org/jira/browse/HBASE-4071
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
 1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
 1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
 1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
 1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
 1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
 1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
 1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
 1160440 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 
1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
 1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
 1160440 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java
 PRE-CREATION 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java
 1160440 
bq.  
bq.  Diff: https://reviews.apache.org/r/1582/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  Ran all tests. I get error (not failures) from two: 
TestDistributedLogSplitting and TestHTablePool. Both fail with or without my 
changes.
bq.  New tests: TestMinVersions.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Lars
bq.  
bq.



> Data GC: Remove all versions > TTL EXCEPT the last written version
> ------------------------------------------------------------------
>
>                 Key: HBASE-4071
>                 URL: https://issues.apache.org/jira/browse/HBASE-4071
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: stack
>            Assignee: Lars Hofhansl
>         Attachments: MinVersions.diff
>
>
> We were chatting today about our backup cluster.  What we want is to be able 
> to restore the dataset from any point of time but only within a limited 
> timeframe -- say one week.  Thereafter, if the versions are older than one 
> week, rather than as we do with TTL where we let go of all versions older 
> than TTL, instead, let go of all versions EXCEPT the last one written.  So, 
> its like versions==1 when TTL > one week.  We want to allow that if an error 
> is caught within a week of its happening -- user mistakenly removes a 
> critical table -- then we'll be able to restore up the the moment just before 
> catastrophe hit otherwise, we keep one version only.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to