[
https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13210444#comment-13210444
]
Phabricator commented on HBASE-5241:
------------------------------------
aaiyer has commented on the revision "HBASE-5241 [jira] Deletes should not mask
Puts that come after it.".
INLINE COMMENTS
src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java:1748 This
will only happen for Deletes (Column and Family). The idea is that the Delete
should apply to all Puts with a lower memstoreTS, regardless of their
timestamp, even if that timestamp is in the "future".
Subsequent Puts etc. will not be masked by the Delete, because they should
have a larger memstoreTS.
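To make the rule above concrete, here is a minimal sketch of the comparison. It
is not the code in the patch: the class, the Cell type and the isMasked method
are hypothetical names invented for illustration, and the only assumption
taken from the comment is that masking is decided by memstoreTS rather than by
the cell timestamp.

```java
// Hypothetical sketch, not HBase code: all names here are illustrative only.
final class MemstoreTsMaskingSketch {

    /** Minimal stand-in for a KeyValue, holding only the two fields the rule needs. */
    static final class Cell {
        final long timestamp;   // user-visible cell timestamp
        final long memstoreTS;  // write-order number assigned by the region server

        Cell(long timestamp, long memstoreTS) {
            this.timestamp = timestamp;
            this.memstoreTS = memstoreTS;
        }
    }

    /**
     * Assumed rule: a Column/Family Delete masks a Put only if the Put was
     * written earlier (lower memstoreTS). The cell timestamps are ignored.
     */
    static boolean isMasked(Cell put, Cell delete) {
        return put.memstoreTS < delete.memstoreTS;
    }

    public static void main(String[] args) {
        Cell delete = new Cell(100L, 5L);
        Cell earlierPutWithFutureTimestamp = new Cell(500L, 3L);
        Cell laterPutWithOlderTimestamp = new Cell(50L, 7L);

        // Written before the Delete, so it is masked despite its "future" timestamp.
        System.out.println(isMasked(earlierPutWithFutureTimestamp, delete));
        // Written after the Delete, so it stays visible despite its older timestamp.
        System.out.println(isMasked(laterPutWithOlderTimestamp, delete));
    }
}
```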
src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:155
This is not yet in production. But if we decide to go down this route, we will
definitely test it for performance.
I haven't optimised much here, since I don't expect there to be too many
Family Deletes.
Will revisit if that assumption turns out to be false.
src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:155
I'm not sure we want to put this under ENFORCE_STRICTER_SEMANTICS.
My understanding was that it would be better to have Puts not be masked by
previous Deletes, regardless.
Whether we are willing to pay the extra performance cost for it was the
trade-off controlled by ENFORCE_STRICTER_SEMANTICS.
If there is a good reason for clients to expect that a Put will be masked
by previous Deletes, we can definitely guard this with the flag.
src/main/java/org/apache/hadoop/hbase/regionserver/ScanDeleteTracker.java:173
Perhaps I will rename this class to something different, and we can add a
flag in ScanQueryMatcher to instantiate the appropriate DeleteTracker.
src/main/java/org/apache/hadoop/hbase/regionserver/ScanQueryMatcher.java:223
Agreed that this is going to be a performance issue.
But this is just a V-1 to get the general idea out. I'm hopeful we can
optimise the codepath so that we incur the performance penalty only when there
really is a later KV with a higher memstoreTS.
We currently do not have a way to tell that. But it can be done, say by
writing a flag into the HFile when there is a memstoreTS inversion, or
something along those lines.
Will try to optimise this along those lines, if needed.
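One way to picture the inversion check is sketched below. This is a guess at
the idea, not anything in the patch: the class and method names are
hypothetical, and it assumes the writer sees the memstoreTS values of one
column's KVs in the order they are written to the file (newest timestamp
first), so that an inversion is any KV with a higher memstoreTS than the KV
before it. How such a flag would actually be stored in the HFile is left open.

```java
// Hypothetical sketch, not HBase code: names and input layout are assumptions.
final class MemstoreTsInversionSketch {

    /**
     * Returns true if some KV has a higher memstoreTS than the KV written
     * just before it. If no adjacent pair increases, the whole sequence is
     * non-increasing, so no later KV can have a higher memstoreTS at all.
     */
    static boolean hasInversion(long[] memstoreTsInWriteOrder) {
        for (int i = 1; i < memstoreTsInWriteOrder.length; i++) {
            if (memstoreTsInWriteOrder[i] > memstoreTsInWriteOrder[i - 1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Write order agrees with memstoreTS order: no flag needed.
        System.out.println(hasInversion(new long[] {9L, 7L, 3L}));
        // The last KV was written to the memstore later than the one before it.
        System.out.println(hasInversion(new long[] {9L, 3L, 7L}));
    }
}
```

A scanner could then take the slower path only for files where the flag is
set, which is the trade-off the comment above is hoping for.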
REVISION DETAIL
https://reviews.facebook.net/D1731
> Deletes should not mask Puts that come after it.
> ------------------------------------------------
>
> Key: HBASE-5241
> URL: https://issues.apache.org/jira/browse/HBASE-5241
> Project: HBase
> Issue Type: Improvement
> Reporter: Amitanand Aiyer
> Assignee: Amitanand Aiyer
> Attachments: HBASE-5241.D1731.1.patch, HBASE-5241.D1731.2.patch,
> HBASE-5241.D1731.3.patch
>
>
> Suppose that we have a delete row, followed by a put. The delete row
> can mask the put, unless there was a major compaction in between.
> Now that we are flushing the memstoreTS to disk along with the KVs, we
> should be able to differentiate whether or not the Put happened after the
> Delete and offer better delete semantics.
> Couldn't find a pre-existing JIRA that already discusses this, so creating
> one.
> Seems related to https://issues.apache.org/jira/browse/HBASE-2406, but is not
> quite the same.