[ 
https://issues.apache.org/jira/browse/HBASE-4536?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13131082#comment-13131082
 ] 

[email protected] commented on HBASE-4536:
------------------------------------------------------



bq.  On 2011-10-19 21:16:42, Prakash Khemani wrote:
bq.  >

Thanks for the review Prakash.


bq.  On 2011-10-19 21:16:42, Prakash Khemani wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java,
 line 160
bq.  > <https://reviews.apache.org/r/2178/diff/12/?file=50948#file50948line160>
bq.  >
bq.  >     If KEEP_DELETED_CELLS is true then what will be the behavior of 
GETs? Will gets be able to reach the deleted cells as well?
bq.  >     
bq.  >     (If both gets and deletes are able to fetch the deleted cells then 
why keep delete markers?)
bq.  >     
bq.  >     Can the value of KEEP_DELETD_CELLS for a column family dynamically 
altered in hbase-shell?

Gets and Scans only see deleted rows when they use a time range that ends 
before the resp. delete marker.
The idea is that by using timerange [0,T+1), you can query the system for the 
state it was in at time T.
If that range includes the delete marker you won't see the deleted rows, if it 
does not you will.


bq.  On 2011-10-19 21:16:42, Prakash Khemani wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java,
 line 114
bq.  > <https://reviews.apache.org/r/2178/diff/12/?file=50953#file50953line114>
bq.  >
bq.  >     Why only ExplicitColumnTracker? A "delete family marker" that 
doesn't have any column qualifier shouldn't be passed to any kind of column 
tracker, right?

During compactions the ScanWildcardColumnTracker is used, always.
For the new "raw" scans I simply do not want to support ExplicitColumnTracker. 
There's a check that prevents a "raw" scan from adding any columns.

Note that a "raw" scan is not the same as a time range scan/get. Jon G 
suggested that'd be a good thing to have, and it was easy to add. A "raw" scan 
sees everything. Including the delete markers, and could be used (for example) 
to do custom replication which includes delete markers and deleted rows.

ScanWildcardColumnTracker has all the new logic to deal with delete markers and 
deleted rows.


bq.  On 2011-10-19 21:16:42, Prakash Khemani wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java,
 line 274
bq.  > <https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line274>
bq.  >
bq.  >     Jsut as in checkColumns() should there be an assert for this method 
also that a delete marker should never call it? A delete family marker doesn't 
have the column qualifier.

Delete marker should still be subject to this, so this is ok.


bq.  On 2011-10-19 21:16:42, Prakash Khemani wrote:
bq.  > 
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java,
 line 297
bq.  > <https://reviews.apache.org/r/2178/diff/12/?file=50954#file50954line297>
bq.  >
bq.  >     If keepDeletedCells is true and retainDeletesInOutput is false then 
a delete-marker-kv can reach here and fail the assert in checkColumn()?

no, because retainDeletesInOutput is only ever true for: a minor compaction, a 
memstore flush, and a raw scan. In all cases no columns can be specified.


- Lars


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2178/#review2677
-----------------------------------------------------------


On 2011-10-18 21:43:38, Lars Hofhansl wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2178/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-10-18 21:43:38)
bq.  
bq.  
bq.  Review request for hbase, Ted Yu and Jonathan Gray.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  HBase timerange Gets and Scans allow to do "timetravel" in HBase. I.e. 
look at the state of the data at any point in the past, provided the data is 
still around.
bq.  This did not work for deletes, however. Deletes would always mask all puts 
in the past.
bq.  This change adds a flag that can be on HColumnDescriptor to enable 
retention of deleted rows.
bq.  These rows are still subject to TTL and/or VERSIONS.
bq.  
bq.  This changes the following:
bq.  1. There is a new flag on HColumnDescriptor enabling that behavior.
bq.  2. Allow gets/scans with a timerange to retrieve rows hidden by a delete 
marker, if the timerange does not include the delete marker.
bq.  3. Do not unconditionally collect all deleted rows during a compaction.
bq.  4. Allow a "raw" Scan, which retrieves all delete markers and deleted rows.
bq.  
bq.  The change is small'ish, but the logic is intricate, so please review 
carefully.
bq.  
bq.  
bq.  This addresses bug HBASE-4536.
bq.      https://issues.apache.org/jira/browse/HBASE-4536
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/HColumnDescriptor.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/KeyValue.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Attributes.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/Scan.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ColumnTracker.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ExplicitColumnTracker.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/ScanWildcardColumnTracker.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java
 1185362 
bq.    http://svn.apache.org/repos/asf/hbase/trunk/src/main/ruby/hbase/admin.rb 
1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestCompaction.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestExplicitColumnTracker.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestKeepDeletes.java
 PRE-CREATION 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMemStore.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestMinVersions.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestQueryMatcher.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestScanWildcardColumnTracker.java
 1185362 
bq.    
http://svn.apache.org/repos/asf/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/TestStoreScanner.java
 1185362 
bq.  
bq.  Diff: https://reviews.apache.org/r/2178/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  All tests pass now.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Lars
bq.  
bq.


                
> Allow CF to retain deleted rows
> -------------------------------
>
>                 Key: HBASE-4536
>                 URL: https://issues.apache.org/jira/browse/HBASE-4536
>             Project: HBase
>          Issue Type: New Feature
>          Components: regionserver
>    Affects Versions: 0.92.0
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>             Fix For: 0.94.0
>
>         Attachments: 4536-v15.txt
>
>
> Parent allows for a cluster to retain rows for a TTL or keep a minimum number 
> of versions.
> However, if a client deletes a row all version older than the delete tomb 
> stone will be remove at the next major compaction (and even at memstore flush 
> - see HBASE-4241).
> There should be a way to retain those version to guard against software error.
> I see two options here:
> 1. Add a new flag HColumnDescriptor. Something like "RETAIN_DELETED".
> 2. Folds this into the parent change. I.e. keep minimum-number-of-versions of 
> versions even past the delete marker.
> #1 would allow for more flexibility. #2 comes somewhat naturally with parent 
> (from a user viewpoint)
> Comments? Any other options?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to