[ 
https://issues.apache.org/jira/browse/HBASE-9102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725515#comment-13725515
 ] 

Liyin Tang commented on HBASE-9102:
-----------------------------------

It is true that the OS caches the compressed/encoded blocks, and that the 
DFSClient non-pread operation is also able to pre-load all the bytes up to 
that DFS block. The purpose of this feature is to pre-load these data blocks 
in decompressed/decoded form, in addition to the OS cache/disk read-ahead.

Also, the scan prefetch is currently implemented at the RegionScanner level. I 
think it is a good idea to implement some prefetch logic in the HBase client as 
well.
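
To make the pre-loading idea concrete, here is a minimal sketch of a 
read-ahead window for a sequential scan. It is only an illustration under 
assumptions: SequentialBlockPrefetcher, the loader function, and the window 
parameter are hypothetical names and are not HBase or DFSClient APIs. The 
loader stands in for "read one block from HDFS and decompress/decode it", and 
a plain map stands in for the block cache.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.IntFunction;

/** Hypothetical sketch; none of these names are HBase APIs. */
class SequentialBlockPrefetcher implements AutoCloseable {
    // Assumed to read block i and return it decompressed/decoded.
    private final IntFunction<byte[]> loader;
    private final int blockCount;
    // How many blocks to keep loading ahead of the read point.
    private final int window;
    // Stand-in for the block cache: block index -> pending or loaded block.
    private final Map<Integer, CompletableFuture<byte[]>> cache =
            new ConcurrentHashMap<>();
    private final ExecutorService pool;

    SequentialBlockPrefetcher(IntFunction<byte[]> loader, int blockCount,
                              int window, int threads) {
        this.loader = loader;
        this.blockCount = blockCount;
        this.window = window;
        this.pool = Executors.newFixedThreadPool(threads);
    }

    /** Returns block i and schedules blocks i+1 .. i+window in the background. */
    byte[] read(int i) {
        if (i < 0 || i >= blockCount) {
            throw new IndexOutOfBoundsException("block " + i);
        }
        // Schedule the current block first so it is not queued behind read-ahead.
        CompletableFuture<byte[]> current = schedule(i);
        int last = Math.min(i + window, blockCount - 1);
        for (int j = i + 1; j <= last; j++) {
            schedule(j);
        }
        byte[] block = current.join();
        // Simplistic eviction: drop the block just behind the read point.
        cache.remove(i - 1);
        return block;
    }

    /** Starts loading block i unless it is already cached or in flight. */
    private CompletableFuture<byte[]> schedule(int i) {
        return cache.computeIfAbsent(i,
                k -> CompletableFuture.supplyAsync(() -> loader.apply(k), pool));
    }

    boolean isScheduled(int i) {
        return cache.containsKey(i);
    }

    @Override
    public void close() {
        pool.shutdown();
    }
}
```

In this sketch a scan that calls read(0), read(1), ... finds each block already 
loaded or in flight, so the loader runs on the background threads while the 
scanner consumes the previous block. A real implementation would also need to 
bound memory use and cancel outstanding loads when the scan stops early.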
                
> HFile block pre-loading for large sequential scan
> -------------------------------------------------
>
>                 Key: HBASE-9102
>                 URL: https://issues.apache.org/jira/browse/HBASE-9102
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.89-fb
>            Reporter: Liyin Tang
>            Assignee: Liyin Tang
>
> The current HBase scan model cannot take full advantage of the aggregate 
> disk throughput, especially for large sequential scans. For a large 
> sequential scan, it is easy to predict in advance which block will be read 
> next, so these data blocks can be pre-loaded from HDFS and 
> decompressed/decoded into the block cache right before the current read point. 
> Therefore, this jira is to optimize large sequential scan performance by 
> pre-loading the HFile blocks into the block cache in a streaming fashion so 
> that the scan query can read from the cache directly.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
