[
https://issues.apache.org/jira/browse/HBASE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mikhail Bautin updated HBASE-5032:
----------------------------------
Assignee: Adela Maznikar (was: Liyin Tang)
> Add other DELETE type information into the delete bloom filter to optimize
> the time range query
> -----------------------------------------------------------------------------------------------
>
> Key: HBASE-5032
> URL: https://issues.apache.org/jira/browse/HBASE-5032
> Project: HBase
> Issue Type: Improvement
> Reporter: Liyin Tang
> Assignee: Adela Maznikar
>
> To speed up time range scans we need to seek to the maximum timestamp of the
> requested range, instead of going to the first KV of the (row, column) pair
> and iterating from there. If we don't know the (row, column), e.g. if it is
> not specified in the query, we need to go to the end of the current row/column
> pair first, get a KV from there, and do another seek to (row', column',
> timerange_max) from there. We can only skip over to the timerange_max
> timestamp when we know that there are no DeleteColumn records at the top of
> that row/column with a higher timestamp. We can utilize another Bloom filter
> keyed on (row, column) to quickly find that out. (From HBASE-4962)
> So the motivation is to save seek ops for scanning time-range queries if we
> know there is no delete for this row/column.
> From the implementation perspective, we already have a delete family
> bloom filter which contains all the delete family key values. So we can reuse
> the same bloom filter for all other kinds of delete information, such as
> DeleteColumn or Delete markers.