[ https://issues.apache.org/jira/browse/HBASE-5241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13208769#comment-13208769 ]
Amitanand Aiyer commented on HBASE-5241:
----------------------------------------
@stack: The potential performance slowdown on seek comes from the following.
In ScanQueryMatcher, we used to return getNextRowOrNextColumn(bytes, offset,
qualLength) for FAMILY_DELETED and COLUMN_DELETED, because once we saw a KV
that was deleted by a family or column delete, all the remaining KVs
(with lower timestamps) were guaranteed to be deleted as well.
Now we return SKIP instead. This change is required because there might be a
KV later in the file that has a lower timestamp but a higher memstoreTS
(so that the deleteFamily does not apply to it). In this case, we end up
moving one KV at a time instead of potentially skipping the entire column or
row.
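
To make the trade-off concrete, here is a minimal sketch of the two behaviors
over the KVs of a single column. It is a simplified model, not the actual
ScanQueryMatcher code: the class DeleteMaskSketch, the KV and Result types, and
the methods scanWithSeek and scanWithSkip are all hypothetical names made up
for illustration, and it assumes a KV is masked only when its timestamp is
covered by the delete and its memstoreTS is lower than the delete's.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model, NOT HBase's ScanQueryMatcher. All names are hypothetical.
public class DeleteMaskSketch {

    /** A cell version: its timestamp plus the memstoreTS assigned at write time. */
    static final class KV {
        final long timestamp;
        final long memstoreTS;
        KV(long timestamp, long memstoreTS) {
            this.timestamp = timestamp;
            this.memstoreTS = memstoreTS;
        }
    }

    /** Outcome of scanning one column: visible KVs and the number of KVs examined. */
    static final class Result {
        final List<KV> visible = new ArrayList<>();
        int examined = 0;
    }

    // Old behavior: the first KV covered by the delete ends the scan of the
    // column, since every later KV has a lower timestamp and is assumed deleted.
    static Result scanWithSeek(List<KV> column, KV delete) {
        Result r = new Result();
        for (KV kv : column) {
            r.examined++;
            if (kv.timestamp <= delete.timestamp) {
                break; // stands in for seeking to the next column or row
            }
            r.visible.add(kv);
        }
        return r;
    }

    // New behavior: a covered KV is only skipped, because a later KV may carry
    // a higher memstoreTS than the delete and therefore must not be masked.
    static Result scanWithSkip(List<KV> column, KV delete) {
        Result r = new Result();
        for (KV kv : column) {
            r.examined++;
            boolean covered = kv.timestamp <= delete.timestamp;
            boolean writtenBeforeDelete = kv.memstoreTS < delete.memstoreTS;
            if (covered && writtenBeforeDelete) {
                continue; // SKIP: advance by a single KV
            }
            r.visible.add(kv);
        }
        return r;
    }

    public static void main(String[] args) {
        KV delete = new KV(100, 5); // delete marker: timestamp 100, memstoreTS 5
        // KVs sorted by descending timestamp; the third was written after the delete.
        List<KV> column = Arrays.asList(
            new KV(90, 3), new KV(80, 2), new KV(70, 7), new KV(60, 1));
        Result seek = scanWithSeek(column, delete);
        Result skip = scanWithSkip(column, delete);
        // prints "seek: examined=1 visible=0"
        System.out.println("seek: examined=" + seek.examined + " visible=" + seek.visible.size());
        // prints "skip: examined=4 visible=1"
        System.out.println("skip: examined=" + skip.examined + " visible=" + skip.visible.size());
    }
}
```

In this toy column the seek variant stops after one KV but hides the Put with
memstoreTS 7, while the skip variant surfaces that Put at the cost of examining
every KV.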
> Deletes should not mask Puts that come after it.
> ------------------------------------------------
>
> Key: HBASE-5241
> URL: https://issues.apache.org/jira/browse/HBASE-5241
> Project: HBase
> Issue Type: Improvement
> Reporter: Amitanand Aiyer
> Attachments: HBASE-5241.D1731.1.patch
>
>
> Suppose that we have a delete row followed by a put. The delete row can mask
> the put unless there was a major compaction in between.
> Now that we are flushing the memstoreTS to disk along with the KVs, we should
> be able to differentiate whether or not the Put happened after the Delete and
> offer better delete semantics.
> Couldn't find a pre-existing JIRA that already discusses this, so creating
> one.
> Seems related to https://issues.apache.org/jira/browse/HBASE-2406, but is not
> quite the same.