[ 
https://issues.apache.org/jira/browse/HBASE-8001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13672771#comment-13672771
 ] 

Raymond Liu commented on HBASE-8001:
------------------------------------

[~lhofhansl], [~ted_yu]: sorry for the late reply on this issue; I have been 
busy with other work. I ran this test again with a single RS. I still use an 
M/R job, but the table has only one region. It holds 2M rows, 1 CF, 18 columns, 
without any compression or encoding, and is about 3G on disk. I don't use the 
block cache, so the data is read from disk by a real seek every time. As we 
discussed before, using the block cache will only lead to more gain with this 
patch.

With this patch, an 18-column full table scan takes 99-101s, while without this 
patch it takes 108-109s, which is still a noticeable difference. I ran each 
case several times, and the results are pretty stable.
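
For reference, the relative gain implied by these timings can be worked out 
with a small sketch like the one below. The class and method names are 
hypothetical and only illustrate the arithmetic; the inputs are the measured 
ranges quoted above.

```java
public class ScanTimingGain {
    /** Percentage by which the patched time is lower than the baseline time. */
    public static double percentReduction(double baselineSeconds, double patchedSeconds) {
        return (baselineSeconds - patchedSeconds) / baselineSeconds * 100.0;
    }

    public static void main(String[] args) {
        // Most conservative pairing: fastest unpatched run vs. slowest patched run.
        double low = percentReduction(108.0, 101.0);
        // Most optimistic pairing: slowest unpatched run vs. fastest patched run.
        double high = percentReduction(109.0, 99.0);
        System.out.printf("scan time reduced by %.1f%% to %.1f%%%n", low, high);
    }
}
```

Depending on which runs are paired, this puts the reduction in scan time at 
roughly 6.5% to 9.2%.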

Would you mind running an end-to-end test? I am not sure whether something else 
might still be affecting your test case. Could it be that the data size is too 
small?
                
> Avoid unnecessary lazy seek
> ---------------------------
>
>                 Key: HBASE-8001
>                 URL: https://issues.apache.org/jira/browse/HBASE-8001
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.94.5
>            Reporter: Raymond Liu
>            Assignee: Raymond Liu
>             Fix For: 0.98.0
>
>         Attachments: HBASE-8001_onescanner.patch, 
> HBASE-8001_onescanner_v2.patch
>
>
> Lazy seek helps reduce the number of real seeks needed across multiple HFiles 
> when the KVs from a newer HFile are enough to satisfy the query.
> In many cases, however, it just defers the real seek and does not reduce the 
> number of real seeks, e.g. when there is only one HFile, or the other 
> StoreFileScanners are closed and only one is left, or the scan needs to go 
> through all the versions, or there is only one version of each row and a 
> sequential scan is performed. In these cases, lazy seek just adds extra 
> overhead.
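
The trade-off described above can be illustrated with a toy model. This is 
only a sketch under simplifying assumptions, not HBase's implementation: each 
file is reduced to the set of row keys it contains, files are ordered newest 
first, only the newest version of each key is wanted, and a lazy seek is 
modelled as skipping the older files once a newer file has the key. The class 
and method names are hypothetical.

```java
import java.util.List;
import java.util.Set;

public class LazySeekModel {
    /**
     * Counts the real seeks needed to find the newest version of each key.
     * Without lazy seek every file is seeked for every key; with lazy seek the
     * older files are skipped as soon as a newer file contains the key.
     */
    public static int countRealSeeks(List<Set<String>> filesNewestFirst,
                                     List<String> keys, boolean lazy) {
        int seeks = 0;
        for (String key : keys) {
            for (Set<String> file : filesNewestFirst) {
                seeks++; // a real seek into this file
                if (lazy && file.contains(key)) {
                    break; // newer file satisfies the query, older seeks avoided
                }
            }
        }
        return seeks;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("row1", "row2");
        List<Set<String>> threeFiles = List.of(
                Set.of("row1", "row2"), Set.of("row1"), Set.of("row2"));
        List<Set<String>> oneFile = List.of(Set.of("row1", "row2"));

        System.out.println("3 files, eager: " + countRealSeeks(threeFiles, keys, false));
        System.out.println("3 files, lazy:  " + countRealSeeks(threeFiles, keys, true));
        System.out.println("1 file,  eager: " + countRealSeeks(oneFile, keys, false));
        System.out.println("1 file,  lazy:  " + countRealSeeks(oneFile, keys, true));
    }
}
```

In this model the three-file case drops from 6 real seeks to 2 with lazy seek, 
while the single-file case needs 2 real seeks either way, so the lazy 
bookkeeping buys nothing there.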

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
