Optimize time range scans using a delete Bloom filter
-----------------------------------------------------
Key: HBASE-4962
URL: https://issues.apache.org/jira/browse/HBASE-4962
Project: HBase
Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Mikhail Bautin
Priority: Minor
To speed up time range scans, we need to seek directly to the maximum timestamp
of the requested range instead of going to the first KV of the (row, column)
pair and iterating from there. If we don't know the (row, column), e.g. because
it is not specified in the query, we need to go to the end of the current
row/column pair first, get a KV from there, and then do another seek to
(row', column', timerange_max). We can only skip ahead to the timerange_max
timestamp when we know that there are no DeleteColumn records at the top of
that row/column with a higher timestamp. Another Bloom filter keyed on
(row, column) would let us find that out quickly.
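
The decision described above can be sketched roughly as follows. This is only
an illustration under stated assumptions, not HBase code: the class
DeleteBloomSketch, the nested DeleteBloom filter, the SeekTarget enum and the
chooseSeek method are all hypothetical names, and the filter is a toy built on
java.util.BitSet. The sketch is also more conservative than the description:
it falls back to the slow path whenever the filter reports that any
DeleteColumn marker may exist for the (row, column), without comparing
timestamps.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

public class DeleteBloomSketch {

    /** Toy Bloom filter over (row, column) keys; hypothetical, not HBase's implementation. */
    static final class DeleteBloom {
        private final BitSet bits;
        private final int numBits;
        private final int numHashes;

        DeleteBloom(int numBits, int numHashes) {
            this.bits = new BitSet(numBits);
            this.numBits = numBits;
            this.numHashes = numHashes;
        }

        /** Record that a DeleteColumn marker was written for this (row, column). */
        void add(String row, String column) {
            for (int i = 0; i < numHashes; i++) {
                bits.set(index(row, column, i));
            }
        }

        /** False means "definitely no marker"; true means "there may be one". */
        boolean mightContain(String row, String column) {
            for (int i = 0; i < numHashes; i++) {
                if (!bits.get(index(row, column, i))) {
                    return false;
                }
            }
            return true;
        }

        /** Double hashing: derive the i-th bit position from two base hashes. */
        private int index(String row, String column, int i) {
            byte[] key = (row + '\0' + column).getBytes(StandardCharsets.UTF_8);
            int h1 = Arrays.hashCode(key);
            int h2 = (h1 >>> 16) | 1; // forced odd so successive probes differ
            return Math.floorMod(h1 + i * h2, numBits);
        }
    }

    /** Where a time range scan should seek within a (row, column). */
    enum SeekTarget { TIMERANGE_MAX, FIRST_KV_OF_COLUMN }

    /**
     * If the filter rules out DeleteColumn markers, seek straight to
     * timerange_max; otherwise start from the first KV so that any
     * delete marker at the top of the column is seen.
     */
    static SeekTarget chooseSeek(DeleteBloom bloom, String row, String column) {
        return bloom.mightContain(row, column)
                ? SeekTarget.FIRST_KV_OF_COLUMN
                : SeekTarget.TIMERANGE_MAX;
    }

    public static void main(String[] args) {
        DeleteBloom bloom = new DeleteBloom(1 << 16, 3);
        bloom.add("row1", "cf:a"); // a DeleteColumn exists for (row1, cf:a)
        System.out.println(chooseSeek(bloom, "row1", "cf:a"));
        System.out.println(chooseSeek(bloom, "row2", "cf:b"));
    }
}
```

Because a Bloom filter has no false negatives, a (row, column) that does carry
a DeleteColumn marker always takes the slow path, so correctness is preserved.
A false positive only costs an unnecessary iteration from the first KV.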