[
https://issues.apache.org/jira/browse/HBASE-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180475#comment-14180475
]
Lars Hofhansl commented on HBASE-12311:
---------------------------------------
In the end, I'm mostly looking for alternatives to speed up (re)seek. The core
of the problem is that we always seek even if the seek will land us on the same
block and hence the seek is actually unnecessary. There is a check for that
(see HFileReaderV2.reseekTo), but that is too far down and we still need to
recompare at the Store/SQM level.
Having some simple stats about the number of versions per column and/or columns
per row would be a reasonable proxy. Ideally we'd need a good guess on whether
the seek would take us out of the current block with high likelihood. If so, we
seek; if not, we do next() a few times.
So maybe a better stat would be the number of bytes per column and per row. I.e.
we'd sum up the sizes of the KVs of all versions of a column and of all the KVs
for a row, and then keep max/avg stats about that. Then, knowing the HFile block
size, we can guess whether a seek to a column or row would propel us out of the
block or not.
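A minimal sketch of how such stats could be accumulated while cells are written in sorted order. All names here (SizeStats, ColRowSizeTracker, append, finish) are hypothetical and not existing HBase classes; row and column are plain strings and the KV size is passed in as a byte count, purely for illustration.

```java
// Hypothetical sketch, not HBase code: running count/sum/max of byte sizes.
class SizeStats {
    long count, sum, max;

    void add(long size) {
        count++;
        sum += size;
        max = Math.max(max, size);
    }

    double avg() {
        return count == 0 ? 0.0 : (double) sum / count;
    }
}

// Hypothetical sketch: tracks bytes per column (all versions) and per row.
class ColRowSizeTracker {
    final SizeStats col = new SizeStats();
    final SizeStats row = new SizeStats();
    private String curRow, curCol;
    private long curRowBytes, curColBytes;

    /** Feed cells in sorted order: row, then column, then version. */
    void append(String rowKey, String column, long kvBytes) {
        boolean newRow = !rowKey.equals(curRow);
        boolean newCol = newRow || !column.equals(curCol);
        // Close out the previous column/row before starting a new one.
        if (newCol && curCol != null) { col.add(curColBytes); curColBytes = 0; }
        if (newRow && curRow != null) { row.add(curRowBytes); curRowBytes = 0; }
        curRow = rowKey;
        curCol = column;
        curColBytes += kvBytes;
        curRowBytes += kvBytes;
    }

    /** Call once after the last cell to flush the open column and row. */
    void finish() {
        if (curCol != null) col.add(curColBytes);
        if (curRow != null) row.add(curRowBytes);
        curRow = null;
        curCol = null;
        curRowBytes = 0;
        curColBytes = 0;
    }

    public static void main(String[] args) {
        ColRowSizeTracker t = new ColRowSizeTracker();
        t.append("r1", "a", 10);   // two versions of r1/a
        t.append("r1", "a", 20);
        t.append("r1", "b", 5);
        t.append("r2", "a", 100);
        t.finish();
        System.out.println("COL_SIZE avg/max: " + t.col.avg() + "/" + t.col.max);
        System.out.println("ROW_SIZE avg/max: " + t.row.avg() + "/" + t.row.max);
    }
}
```

The resulting avg/max values are what would be stored in the HFile's metadata when the file is closed.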
With that in mind, the metrics to keep would be something like avg/max COL_SIZE
and avg/max ROW_SIZE. The former would inform whether we should SEEK_NEXT_COL,
the latter whether we should SEEK_NEXT_ROW - both instead of performing a
series of SKIPs.
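To make the decision concrete, here is a rough sketch of the comparison against the block size. The class SeekHint, the method shouldSeek, and the threshold parameter are assumptions made up for illustration, not part of HBase, and the 64 KB block size in main is just an assumed value.

```java
// Hypothetical sketch, not HBase code.
class SeekHint {
    /**
     * Guess whether jumping to the next column (or row) is likely to leave
     * the current HFile block.
     *
     * @param avgSizeBytes   avg COL_SIZE (for SEEK_NEXT_COL) or avg ROW_SIZE
     *                       (for SEEK_NEXT_ROW), in bytes
     * @param blockSizeBytes HFile block size, in bytes
     * @param threshold      assumed tuning knob: fraction of a block the
     *                       column/row must span before a seek is preferred
     * @return true to seek, false to fall back to a series of SKIPs
     */
    static boolean shouldSeek(double avgSizeBytes, long blockSizeBytes, double threshold) {
        // A column/row much smaller than a block most likely ends inside the
        // current block, so a seek would land in the same block anyway.
        return avgSizeBytes >= threshold * blockSizeBytes;
    }

    public static void main(String[] args) {
        long blockSize = 64 * 1024; // assumed block size for the example
        System.out.println("small column: seek=" + shouldSeek(200, blockSize, 1.0));
        System.out.println("large row:    seek=" + shouldSeek(1_000_000, blockSize, 1.0));
    }
}
```

Using max instead of avg here would give a more conservative answer, i.e. it would favor seeking more often.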
Presumably for most use cases the distribution of the col and row size would be
pretty similar across the entire table (I know we have a bunch of such use
cases).
> Version stats in HFiles?
> ------------------------
>
> Key: HBASE-12311
> URL: https://issues.apache.org/jira/browse/HBASE-12311
> Project: HBase
> Issue Type: Brainstorming
> Reporter: Lars Hofhansl
>
> In HBASE-9778 I basically punted the decision on whether to do repeated
> scanner.next() calls instead of issuing (re)seeks to the user.
> I think we can do better.
> One way to do that is to maintain simple stats of the maximum number of
> versions we've seen for any row/col combination and store these in the
> HFile's metadata (just like the timerange, oldest Put, etc).
> Then we can estimate fairly accurately whether we have to expect lots of versions
> (i.e. seek between columns is better) or not (in which case we'd issue
> repeated next()'s).