[ 
https://issues.apache.org/jira/browse/HBASE-5523?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13223017#comment-13223017
 ] 

Lars Hofhansl commented on HBASE-5523:
--------------------------------------

I remember the initial motivation for the +1 shift now.
If somebody accidentally places a Delete at T, it would not be possible to get 
at any Puts at T with the normal Scan API. The +1 allows setting an interval 
that includes the Puts but not the Delete. Without the shift, setting the range 
to [0,T+1) would include both the Puts and the Delete (note that the lower 
bound is inclusive and the upper bound is exclusive, hence the [x,y) notation).

With the +1 shift, [0,T+1) would not contain the Delete, but [0,T+2) would.
This is very confusing, since [0,T+1) then no longer means what it says when it 
comes to deletes.
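
To make the interval arithmetic concrete, here is a minimal, self-contained 
sketch. It is not the HBase code: the Range class and both helper methods are 
hypothetical stand-ins that only mimic a half-open [min,max) check, and T = 100 
is an arbitrary example value.

```java
public class DeleteTimeRangeSketch {
    /** Hypothetical half-open interval [min, max), matching the [x,y) notation. */
    static final class Range {
        final long min;
        final long max;

        Range(long min, long max) {
            this.min = min;
            this.max = max;
        }

        /** True if min <= ts < max. */
        boolean within(long ts) {
            return min <= ts && ts < max;
        }
    }

    /** Old behavior: the delete marker is checked at timestamp + 1. */
    static boolean markerIncludedWithShift(Range tr, long ts) {
        return tr.within(ts + 1);
    }

    /** Proposed behavior: the delete marker is checked at its own timestamp. */
    static boolean markerIncludedWithoutShift(Range tr, long ts) {
        return tr.within(ts);
    }

    public static void main(String[] args) {
        long t = 100L; // assumed timestamp shared by a Put and a Delete
        Range upToT1 = new Range(0, t + 1); // [0,T+1)
        Range upToT2 = new Range(0, t + 2); // [0,T+2)

        System.out.println("Put in [0,T+1): " + upToT1.within(t));
        System.out.println("Delete in [0,T+1), shifted: "
            + markerIncludedWithShift(upToT1, t));
        System.out.println("Delete in [0,T+2), shifted: "
            + markerIncludedWithShift(upToT2, t));
        System.out.println("Delete in [0,T+1), unshifted: "
            + markerIncludedWithoutShift(upToT1, t));
    }
}
```

With the shift, the marker behaves as if it lived at T+1, which is why 
[0,T+1) misses it. Without the shift, any range that contains the Put at T 
also contains the Delete at T.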

To recover the above-mentioned Puts, one could use a raw scan instead.

I'm going to commit the attached patch; it makes these scenarios much clearer.

                
> Fix Delete Timerange logic for KEEP_DELETED_CELLS
> -------------------------------------------------
>
>                 Key: HBASE-5523
>                 URL: https://issues.apache.org/jira/browse/HBASE-5523
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Minor
>             Fix For: 0.94.0, 0.96.0
>
>         Attachments: 5523.txt
>
>
> A Delete at time T marks a Put at time T as deleted.
> In the parent issue I invented special logic that inserts a virtual 
> millisecond into the time range (tr) if the encountered KV is a delete marker.
> This was so that there is a way to specify a timerange that would allow one 
> to see the put but not the delete:
> {code}
> if (kv.isDelete()) {
>   if (!keepDeletedCells) {
>     // first ignore delete markers if the scanner can do so, and the
>     // range does not include the marker
>     boolean includeDeleteMarker = seePastDeleteMarkers ?
>     // +1, to allow a range between a delete and put of same TS
>     tr.withinTimeRange(timestamp+1) :
>     tr.withinOrAfterTimeRange(timestamp);
> {code}
> Discussed this today with a coworker and he convinced me that this is very 
> confusing and also not needed.
> When we have a Delete and Put at the same time T, there *is* no timerange 
> that can include the Put but not the Delete.
> So I will change the code to this (and fix the tests):
> {code}
> if (kv.isDelete()) {
>   if (!keepDeletedCells) {
>     // first ignore delete markers if the scanner can do so, and the
>     // range does not include the marker
>     boolean includeDeleteMarker = seePastDeleteMarkers ?
>     tr.withinTimeRange(timestamp) :
>     tr.withinOrAfterTimeRange(timestamp);
> {code}
> It's easier to understand, and does not lead to strange scenarios when the TS 
> is used as a controlled counter.
> Needs to be done before 0.94 goes out.


        
