[ 
https://issues.apache.org/jira/browse/HBASE-12311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180475#comment-14180475
 ] 

Lars Hofhansl commented on HBASE-12311:
---------------------------------------

In the end, I'm mostly looking for alternatives to speed up (re)seek. The core 
of the problem is that we always seek even if the seek will land us on the same 
block and hence the seek is actually unnecessary. There is a check for that 
(see HFileReaderV2.reseekTo), but that is too far down and we still need to 
recompare at the Store/SQM level.

Having some simple stats about the number of versions per column and/or columns
per row would be a reasonable proxy. Ideally we'd need a good guess on whether
the seek would take us out of the current block with high likelihood. If so, we
seek; if not, we do next() a few times.

So maybe a better stat would be the number of bytes per column and per row. I.e.
we'd sum up the sizes of the KVs of all versions of a column and of all the KVs
of a row, and then keep max/avg stats about that. Then, knowing the HFile block
size, we can guess whether a seek to a column or row would propel us out of the
block or not.
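
As a rough illustration of the bookkeeping described above, here is a minimal
sketch in Java. The class and method names (SizeStats, KvSizeTracker, append,
finish) are hypothetical and not part of HBase. Rows and columns are plain
strings to keep the example self-contained, and cells are assumed to arrive in
sorted order, as they would while an HFile is being written.

```java
// Hypothetical sketch: none of these names exist in HBase.
final class SizeStats {
    private long count;
    private long sum;
    private long max;

    void add(long size) {
        count++;
        sum += size;
        max = Math.max(max, size);
    }

    long max() { return max; }

    // Integer average; 0 when nothing has been recorded.
    long avg() { return count == 0 ? 0 : sum / count; }
}

final class KvSizeTracker {
    final SizeStats colStats = new SizeStats();
    final SizeStats rowStats = new SizeStats();

    private String curRow;
    private String curCol;
    private long curRowBytes;
    private long curColBytes;

    // Assumes cells are appended in sorted (row, column, version) order.
    void append(String row, String col, long kvBytes) {
        boolean newRow = curRow != null && !row.equals(curRow);
        boolean newCol = newRow || (curCol != null && !col.equals(curCol));
        if (newCol) {
            colStats.add(curColBytes);   // close the previous column
            curColBytes = 0;
        }
        if (newRow) {
            rowStats.add(curRowBytes);   // close the previous row
            curRowBytes = 0;
        }
        curRow = row;
        curCol = col;
        curColBytes += kvBytes;
        curRowBytes += kvBytes;
    }

    // Records the last open column and row; call once after the final cell.
    void finish() {
        if (curRow != null) {
            colStats.add(curColBytes);
            rowStats.add(curRowBytes);
        }
        curRow = null;
        curCol = null;
        curColBytes = 0;
        curRowBytes = 0;
    }
}
```

The resulting avg/max values could then be stored alongside the other per-file
metadata; how and where exactly is left open here.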

With that in mind the metrics to keep would be something like avg/max COL_SIZE
and avg/max ROW_SIZE. The former would inform whether we should SEEK_NEXT_COL,
the latter whether we should SEEK_NEXT_ROW, both instead of performing a
series of SKIPs.
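
One way such a decision could look, purely as a sketch: the SeekHeuristic class,
its methods, and the threshold parameter are all hypothetical. It assumes the
scanner's current position is uniformly distributed within the block, so the
chance that skipping N bytes crosses the block boundary is roughly N divided by
the block size, capped at 1. The size passed in would be the avg (or max)
COL_SIZE when deciding about SEEK_NEXT_COL, and the ROW_SIZE equivalent when
deciding about SEEK_NEXT_ROW.

```java
// Hypothetical sketch: not an HBase API.
final class SeekHeuristic {
    enum Hint { SKIP, SEEK }

    // Estimated probability that moving forward by bytesToSkip leaves the
    // current block, assuming a uniformly distributed start offset.
    static double leaveProbability(long bytesToSkip, long blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        if (bytesToSkip <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) bytesToSkip / blockSize);
    }

    // Suggests SEEK when the estimate reaches the threshold, else SKIP.
    // The threshold is an arbitrary tuning knob, not a value from the issue.
    static Hint choose(long typicalSize, long blockSize, double threshold) {
        return leaveProbability(typicalSize, blockSize) >= threshold
            ? Hint.SEEK
            : Hint.SKIP;
    }
}
```

With a 64k block and an average column size of a few hundred bytes this would
suggest SKIP; with columns around the block size or larger it would suggest
SEEK.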

Presumably for most use cases the distribution of the col and row size would be 
pretty similar across the entire table (I know we have a bunch of such use 
cases).


> Version stats in HFiles?
> ------------------------
>
>                 Key: HBASE-12311
>                 URL: https://issues.apache.org/jira/browse/HBASE-12311
>             Project: HBase
>          Issue Type: Brainstorming
>            Reporter: Lars Hofhansl
>
> In HBASE-9778 I basically punted to the user the decision on whether to do
> repeated scanner.next() calls instead of issuing (re)seeks.
> I think we can do better.
> One way to do that is to maintain simple stats of the maximum number of 
> versions we've seen for any row/col combination and store these in the 
> HFile's metadata (just like the timerange, oldest Put, etc).
> Then we could estimate fairly accurately whether we have to expect lots of versions 
> (i.e. seek between columns is better) or not (in which case we'd issue 
> repeated next()'s).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
