[ 
https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689935#comment-13689935
 ] 

Feng Honghua commented on HBASE-8721:
-------------------------------------


[~lhofhansl] Now we're extending this to Puts as well, and are saying that a 
Put that hits the RegionServer later should be considered newer even if its TS 
is old, this opens another can of worms

  ===> Maybe you misunderstood me here: I never proposed that 'a Put that hits 
the RegionServer later should be considered newer even if its TS is old'. The 
sequence 'put T3, put T2, put T1' (where T3>T2>T1) to a CF with max-version = 2 
will result in (T3,T2), with T3 as the first version, even though T1 is the 
last one to hit the RS. This is what I mean by 'timestamp is the only dimension 
that determines version ordering/survival, by the rule "the bigger wins"'.
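
To make the example concrete, here is a minimal sketch of the "bigger 
timestamp wins" rule. It is a toy model, not HBase code: the function name 
`surviving_versions` and the integer timestamps are assumptions made only for 
illustration.

```python
def surviving_versions(put_timestamps, max_versions):
    """Return the timestamps that stay visible, newest first.

    Toy model: arrival order is ignored entirely; only the timestamp
    decides ordering and survival.
    """
    # A second put with the same timestamp replaces the first, so dedupe,
    # then sort by timestamp descending and keep the newest max_versions.
    return sorted(set(put_timestamps), reverse=True)[:max_versions]


T1, T2, T3 = 1, 2, 3
# Puts arrive in the order T3, T2, T1, so T1 reaches the RS last.
print(surviving_versions([T3, T2, T1], max_versions=2))  # [3, 2]
```

Reversing the arrival order gives the same result, which is the point: in 
this model only the timestamp matters.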

  ===> What I proposed is this (exposed via a config option, so customers can 
choose whether they want this behavior): the delete still masks existing puts 
with timestamps less than or equal to its own (unchanged); and customers can 
choose whether the delete also masks puts not yet written to HBase (future 
puts), according to their individual real-world application logic / 
requirements.
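
A rough sketch of the two behaviors follows. It is a toy model under stated 
assumptions: the `seq` field (arrival order at the server) and the flag name 
`mask_future_puts` are hypothetical and are not taken from the attached patch 
or from any HBase API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Put:
    ts: int   # cell timestamp
    seq: int  # arrival order at the server (hypothetical field)


@dataclass(frozen=True)
class Delete:
    ts: int   # masks puts with timestamp <= ts
    seq: int  # arrival order at the server (hypothetical field)


def visible_puts(puts, deletes, mask_future_puts):
    """Return the puts that no delete masks.

    mask_future_puts=True  : a delete masks every put with ts <= its ts,
                             including puts that arrive after the delete.
    mask_future_puts=False : a delete only masks puts that had already
                             arrived when the delete was written.
    """
    def masked(p):
        return any(
            p.ts <= d.ts and (mask_future_puts or p.seq < d.seq)
            for d in deletes
        )
    return [p for p in puts if not masked(p)]


puts = [Put(ts=5, seq=1), Put(ts=7, seq=3)]
deletes = [Delete(ts=10, seq=2)]
print(visible_puts(puts, deletes, mask_future_puts=True))
print(visible_puts(puts, deletes, mask_future_puts=False))
```

In both modes the put that existed before the delete stays masked; the flag 
only changes what happens to the put that arrives afterwards.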


  KEEP_DELETED_CELLS would still work fine, but their main goal is to allow 
correct point-in-time-queries, which among others is important for consistent 
backups

  ===> KEEP_DELETED_CELLS can indeed prevent the inconsistency in the example 
scenario 'put - delete - (major-compact) - put - get', and it provides a 
consistent result of 'get nothing'. But this result is also unacceptable for 
our customers, since they expect the later 'put' not to be masked by the 
earlier delete.
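
The scenario can be sketched with a small in-memory model. This is only an 
illustration of the behavior described in this thread: `ToyStore` and 
`scenario` are made-up names, and the compaction logic is a simplification 
that assumes KEEP_DELETED_CELLS retains tombstones across a major compaction.

```python
class ToyStore:
    """Single-cell toy model of puts, tombstones and major compaction."""

    def __init__(self, keep_deleted_cells=False):
        self.keep_deleted_cells = keep_deleted_cells
        self.puts = {}        # timestamp -> value
        self.tombstones = []  # each entry T masks every put with ts <= T

    def put(self, ts, value):
        self.puts[ts] = value

    def delete(self, ts):
        self.tombstones.append(ts)

    def _masked(self, ts):
        return any(ts <= t for t in self.tombstones)

    def major_compact(self):
        if self.keep_deleted_cells:
            return  # assumption: tombstones and masked cells are retained
        # Otherwise drop masked puts, then drop the tombstones themselves.
        self.puts = {ts: v for ts, v in self.puts.items()
                     if not self._masked(ts)}
        self.tombstones = []

    def get(self):
        live = [ts for ts in self.puts if not self._masked(ts)]
        return self.puts[max(live)] if live else None


def scenario(keep_deleted_cells, compact):
    """Run 'put - delete - (major-compact) - put - get'."""
    store = ToyStore(keep_deleted_cells)
    store.put(5, "old")
    store.delete(10)          # delete everything with ts <= 10
    if compact:
        store.major_compact()
    store.put(7, "new")       # later put with an older timestamp
    return store.get()


for keep in (False, True):
    for compact in (False, True):
        print(keep, compact, scenario(keep, compact))
```

Without KEEP_DELETED_CELLS the result of the final get depends on whether the 
major compaction ran. With it, the get returns nothing either way, which is 
consistent but still masks the later put.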
                
> Deletes can mask puts that happen after the delete
> --------------------------------------------------
>
>                 Key: HBASE-8721
>                 URL: https://issues.apache.org/jira/browse/HBASE-8721
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>            Reporter: Feng Honghua
>         Attachments: HBASE-8721-0.94-V0.patch
>
>
> This fix targets the bug described in http://hbase.apache.org/book.html 5.8.2.1:
> "Deletes mask puts, even puts that happened after the delete was entered. 
> Remember that a delete writes a tombstone, which only disappears after then 
> next major compaction has run. Suppose you do a delete of everything <= T. 
> After this you do a new put with a timestamp <= T. This put, even if it 
> happened after the delete, will be masked by the delete tombstone. Performing 
> the put will not fail, but when you do a get you will notice the put did have 
> no effect. It will start working again after the major compaction has run. 
> These issues should not be a problem if you use always-increasing versions 
> for new puts to a row. But they can occur even if you do not care about time: 
> just do delete and put immediately after each other, and there is some chance 
> they happen within the same millisecond."

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
