[
https://issues.apache.org/jira/browse/HBASE-8721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13689935#comment-13689935
]
Feng Honghua commented on HBASE-8721:
-------------------------------------
[~lhofhansl] Now we're extending this to Puts as well, and are saying that a
Put that hits the RegionServer later should be considered newer even if its TS
is old, this opens another can of worms
===> Maybe you misunderstood me here. I never proposed that 'a Put that hits
the RegionServer later should be considered newer even if its TS is old'. The
sequence 'put T3, put T2, put T1' (where T3>T2>T1) to a CF with max-versions = 2
will result in (T3,T2), with T3 as the first version, even though T1 is the
last one to hit the RS. This is what I mean by 'timestamp is the only dimension
that determines version ordering/survival, by the rule "the bigger wins"'.
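
To make the T3/T2/T1 example concrete, here is a toy sketch in Python. It is not HBase code: `retained_versions` is a made-up helper, and the integer timestamps stand in for T1 < T2 < T3. It only illustrates that survival depends on timestamp, not on arrival order.

```python
# Toy model, not HBase code: version survival is decided by timestamp alone.
def retained_versions(put_timestamps, max_versions):
    """Return the surviving timestamps, newest first, ignoring arrival order."""
    return sorted(set(put_timestamps), reverse=True)[:max_versions]


T1, T2, T3 = 1, 2, 3  # placeholder timestamps with T3 > T2 > T1

# Puts arrive in the order T3, T2, T1, so T1 is the last one to reach the RS.
print(retained_versions([T3, T2, T1], max_versions=2))  # -> [3, 2]
```

Reversing the arrival order gives the same result, which is the point of 'the bigger wins'.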
===> What I proposed is this (possibly behind a config option, so customers can
choose whether they want this behavior): the delete still masks existing puts
with timestamps less than or equal to its own (unchanged), and customers can
choose whether the delete also masks puts not yet written to HBase (future
puts), according to their individual real-world application logic /
requirements.
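
A rough sketch of the proposed rule, in Python for brevity. Everything here is an assumption made for illustration: the `seq` field (arrival order at the region server), the `mask_future_puts` flag, and the `is_masked` helper are hypothetical names, not taken from the attached patch or from HBase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Put:
    ts: int   # cell timestamp
    seq: int  # arrival order at the region server (hypothetical field)


@dataclass(frozen=True)
class Delete:
    ts: int   # delete covers timestamps <= ts
    seq: int  # arrival order at the region server (hypothetical field)


def is_masked(put, delete, mask_future_puts):
    """Decide whether `delete` hides `put`; `mask_future_puts` is a hypothetical flag."""
    if put.ts > delete.ts:
        return False         # unchanged: a delete never covers newer timestamps
    if put.seq < delete.seq:
        return True          # unchanged: existing puts with ts <= delete.ts are masked
    return mask_future_puts  # put arrived after the delete: behavior is configurable


delete = Delete(ts=10, seq=2)
print(is_masked(Put(ts=5, seq=1), delete, mask_future_puts=False))  # existing put
print(is_masked(Put(ts=7, seq=3), delete, mask_future_puts=False))  # future put
print(is_masked(Put(ts=7, seq=3), delete, mask_future_puts=True))   # current behavior
```

With the flag off, only puts that arrived before the delete are hidden; with it on, the sketch reproduces today's behavior where a later put with an old timestamp is also hidden.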
KEEP_DELETED_CELLS would still work fine, but their main goal is to allow
correct point-in-time-queries, which among others is important for consistent
backups
===> KEEP_DELETED_CELLS can indeed prevent the inconsistency in the example
scenario 'put - delete - (major-compact) - put - get', and it provides a
consistent result of 'get nothing'. But this result is also unacceptable for
our customers, since they expect the later 'put' not to be masked by the
earlier delete.
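
The 'put - delete - (major-compact) - put - get' scenario can be simulated with a small toy model. This is a simplified sketch, not HBase code: `ToyStore` and `scenario` are hypothetical names, there is a single cell, and KEEP_DELETED_CELLS is modeled simply as "major compaction keeps tombstones and deleted cells", ignoring TTL and version limits.

```python
class ToyStore:
    """Single-cell toy store; timestamps are plain integers."""

    def __init__(self, keep_deleted_cells=False):
        self.keep_deleted_cells = keep_deleted_cells
        self.puts = {}        # timestamp -> value
        self.tombstones = []  # each tombstone covers timestamps <= its value

    def put(self, ts, value):
        self.puts[ts] = value

    def delete(self, ts):
        self.tombstones.append(ts)

    def _visible(self):
        cutoff = max(self.tombstones, default=float("-inf"))
        return {ts: v for ts, v in self.puts.items() if ts > cutoff}

    def major_compact(self):
        if self.keep_deleted_cells:
            return  # simplified: tombstones and deleted cells are retained
        self.puts = self._visible()  # drop masked puts
        self.tombstones = []         # drop the tombstones themselves

    def get(self):
        visible = self._visible()
        return visible[max(visible)] if visible else None


def scenario(keep_deleted_cells, compact):
    store = ToyStore(keep_deleted_cells)
    store.put(5, "old")
    store.delete(10)
    if compact:
        store.major_compact()
    store.put(7, "new")  # written after the delete, but with ts <= 10
    return store.get()


for keep in (False, True):
    for compact in (False, True):
        print(keep, compact, scenario(keep, compact))
```

In this model the default setting returns the new value only when a major compaction happened in between, while the keep-deleted-cells setting returns nothing in both cases: consistent, but the later put is still masked.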
> Deletes can mask puts that happen after the delete
> --------------------------------------------------
>
> Key: HBASE-8721
> URL: https://issues.apache.org/jira/browse/HBASE-8721
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Reporter: Feng Honghua
> Attachments: HBASE-8721-0.94-V0.patch
>
>
> This fix addresses the bug mentioned in http://hbase.apache.org/book.html 5.8.2.1:
> "Deletes mask puts, even puts that happened after the delete was entered.
> Remember that a delete writes a tombstone, which only disappears after the
> next major compaction has run. Suppose you do a delete of everything <= T.
> After this you do a new put with a timestamp <= T. This put, even if it
> happened after the delete, will be masked by the delete tombstone. Performing
> the put will not fail, but when you do a get you will notice the put did have
> no effect. It will start working again after the major compaction has run.
> These issues should not be a problem if you use always-increasing versions
> for new puts to a row. But they can occur even if you do not care about time:
> just do delete and put immediately after each other, and there is some chance
> they happen within the same millisecond."