[ https://issues.apache.org/jira/browse/HBASE-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672771#comment-13672771 ]
Raymond Liu commented on HBASE-8001:
------------------------------------
[~lhofhansl], [~ted_yu]: sorry for the late reply on this issue; I have been
busy with other stuff. I ran this test again with a single RS. I still use an
M/R job, but the table has only one region. It holds 2M rows, 1 CF, 18 columns,
without any compression or encoding, and takes about 3 GB on disk. I don't use
the block cache, so every read goes to disk with a real seek. As we discussed
before, using the block cache would only lead to more gain with this patch.
With this patch, an 18-column full table scan takes 99-101s, while without the
patch it takes 108-109s. That is still a noticeable difference. I ran each case
several times, and the results are pretty stable.
Would you mind running an end-to-end test? I am not sure whether something else
might still be affecting your test case. Could it be that the data size is too
small?
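
For reference, the timings quoted above can be turned into a relative reduction
in scan time. This is only a back-of-the-envelope sketch; the helper name
relative_reduction is made up for illustration, and the inputs are just the
ranges reported in this comment (99-101s with the patch, 108-109s without).

```python
def relative_reduction(before_s, after_s):
    """Percentage by which the scan time drops from before_s to after_s."""
    return (before_s - after_s) / before_s * 100.0


# Timings from the comment, in seconds.
without_patch = (108.0, 109.0)
with_patch = (99.0, 101.0)

# Most pessimistic pairing: fastest unpatched run vs. slowest patched run.
low = relative_reduction(without_patch[0], with_patch[1])
# Most optimistic pairing: slowest unpatched run vs. fastest patched run.
high = relative_reduction(without_patch[1], with_patch[0])

print(f"{low:.1f}% to {high:.1f}%")  # prints "6.5% to 9.2%"
```

So the reported gain is roughly 6.5% to 9.2% of the full scan time for this
single-region, no-block-cache setup.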
> Avoid unnecessary lazy seek
> ---------------------------
>
> Key: HBASE-8001
> URL: https://issues.apache.org/jira/browse/HBASE-8001
> Project: HBase
> Issue Type: Improvement
> Components: regionserver
> Affects Versions: 0.94.5
> Reporter: Raymond Liu
> Assignee: Raymond Liu
> Fix For: 0.98.0
>
> Attachments: HBASE-8001_onescanner.patch,
> HBASE-8001_onescanner_v2.patch
>
>
> Lazy seek helps to reduce the number of real seeks needed across multiple
> HFiles, when the KVs from newer HFiles are enough to satisfy the query.
> In many cases, however, it just pushes the real seek later and does not reduce
> the number of real seeks, e.g. when there is only one HFile, or the other
> StoreFileScanners are closed and only one is left, or the scan needs to go
> through all the versions, or there is only one version of each row and a
> sequential scan is performed. In these cases, lazy seek just brings extra
> overhead.
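
The trade-off described above can be illustrated with a small toy model. This
is a simplified sketch, not HBase code: ToyFileScanner, get_latest and
count_seeks are hypothetical names, and the model assumes a lazy scanner is
queued under a fake key made of the requested row plus the file's maximum
timestamp, with the real seek done only when that scanner reaches the top of
the heap. It only counts real seeks for a single-version lookup.

```python
import heapq


class ToyFileScanner:
    """Toy stand-in for a store file scanner (hypothetical, not HBase's class)."""

    def __init__(self, cells):
        # cells: (row, timestamp) pairs; sorted by row, newest timestamp first.
        self.cells = sorted(cells, key=lambda c: (c[0], -c[1]))
        self.max_ts = max(ts for _, ts in cells)
        self.real_seeks = 0

    def real_seek(self, row):
        """Return the first cell at or after `row`; counts as one real seek."""
        self.real_seeks += 1
        for cell in self.cells:
            if cell[0] >= row:
                return cell
        return None


def get_latest(scanners, row, lazy):
    """Return the newest (row, timestamp) cell for `row`, or None."""
    heap = []
    for i, scanner in enumerate(scanners):
        if lazy:
            # Lazy seek: queue a fake key instead of touching the file.
            heap.append(((row, -scanner.max_ts), i, False))
        else:
            cell = scanner.real_seek(row)
            if cell is not None:
                heap.append(((cell[0], -cell[1]), i, True))
    heapq.heapify(heap)
    while heap:
        key, i, is_real = heapq.heappop(heap)
        if is_real:
            # A real cell on top beats every remaining (fake or real) key.
            return (key[0], -key[1]) if key[0] == row else None
        # A fake key reached the top: the real seek can no longer be avoided.
        cell = scanners[i].real_seek(row)
        if cell is not None:
            heapq.heappush(heap, ((cell[0], -cell[1]), i, True))
    return None


def count_seeks(files, row, lazy):
    """Run one lookup and return (result, total number of real seeks)."""
    scanners = [ToyFileScanner(cells) for cells in files]
    result = get_latest(scanners, row, lazy)
    return result, sum(s.real_seeks for s in scanners)


newer = [("r1", 20), ("r2", 21)]
older = [("r1", 10), ("r2", 11)]

print(count_seeks([newer, older], "r1", lazy=True))   # (('r1', 20), 1)
print(count_seeks([newer, older], "r1", lazy=False))  # (('r1', 20), 2)
print(count_seeks([newer], "r1", lazy=True))          # (('r1', 20), 1)
print(count_seeks([newer], "r1", lazy=False))         # (('r1', 20), 1)
```

With two files, the lazy variant skips the seek on the older file because the
newer file already answers the query. With a single file, the number of real
seeks is the same either way, so the fake-key bookkeeping is pure overhead in
this model.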
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira