Lars Hofhansl created HBASE-12311:
-------------------------------------
Summary: Version stats in HFiles?
Key: HBASE-12311
URL: https://issues.apache.org/jira/browse/HBASE-12311
Project: HBase
Issue Type: Brainstorming
Reporter: Lars Hofhansl
In HBASE-9778 I basically punted to the user the decision on whether to do
repeated scanner.next() calls instead of issuing (re)seeks.
I think we can do better.
One way to do that is to maintain simple stats on the maximum number of versions
we've seen for any row/col combination and store these in the HFile's metadata
(just like the timerange, oldest Put, etc).
Then we could estimate fairly accurately whether we have to expect lots of
versions (in which case seeking between columns is better) or not (in which case
we'd issue repeated next()'s).
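
A rough sketch of the idea, not actual HBase code: the tracker below counts
consecutive cells with the same row/col as they are written in sorted order and
remembers the largest run, which is the number one would store in the file
metadata. The class and method names (VersionStatsSketch, MaxVersionsTracker,
preferSeek) and the SEEK_THRESHOLD value are all made up for illustration; the
real cutoff would have to come from measurement.

```java
import java.util.Objects;

public class VersionStatsSketch {

    /** Tracks the largest number of versions seen for any single row/col.
     *  Assumes cells arrive in sorted order, so versions of one column are adjacent. */
    static final class MaxVersionsTracker {
        private String lastRow;
        private String lastColumn;
        private int currentRun;   // versions seen so far for the current row/col
        private int maxVersions;  // largest run seen in this file

        void append(String row, String column) {
            if (Objects.equals(row, lastRow) && Objects.equals(column, lastColumn)) {
                currentRun++;
            } else {
                lastRow = row;
                lastColumn = column;
                currentRun = 1;
            }
            maxVersions = Math.max(maxVersions, currentRun);
        }

        int getMaxVersions() {
            return maxVersions;
        }
    }

    // Hypothetical cutoff: how many cells we tolerate skipping via next()
    // before a seek is assumed to be cheaper.
    static final int SEEK_THRESHOLD = 10;

    /** Returns true if a seek to the next column looks better than repeated next() calls. */
    static boolean preferSeek(int maxVersionsInFile, int versionsRequested) {
        int worstCaseCellsToSkip = maxVersionsInFile - versionsRequested;
        return worstCaseCellsToSkip >= SEEK_THRESHOLD;
    }

    public static void main(String[] args) {
        MaxVersionsTracker tracker = new MaxVersionsTracker();
        tracker.append("r1", "c1");
        tracker.append("r1", "c1");
        tracker.append("r1", "c1");
        tracker.append("r1", "c2");
        tracker.append("r2", "c1");
        System.out.println("max versions: " + tracker.getMaxVersions());
        System.out.println("prefer seek: " + preferSeek(tracker.getMaxVersions(), 1));
    }
}
```

Since the stat is a per-file maximum, it only bounds the worst case; a file with
one hot column and many single-version columns would still be classified as
"lots of versions".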