[ 
https://issues.apache.org/jira/browse/HBASE-17910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970486#comment-15970486
 ] 

stack commented on HBASE-17910:
-------------------------------

A few notes:

 * I like the idea of hiding inside StoreFile the trickery we need to get 
around the idiosyncrasies of the underlying FS.
 * This is an old problem manifesting in a new way (short-circuit read) w/ an 
update on an old idea (separate readers). Excellent.
 * On the question "But for streaming read I think the buffer is somehow still 
useful?", I don't know. A long time ago, single-threaded streaming read was 
~15% faster than pread. But streaming read blocked out other concurrent 
reads, so in the multithreaded case pread had more throughput. Those 
measurements were taken long ago and I am not sure what the current state 
is. We should probably remeasure.
 * It would be excellent if we could remove the lock.
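
To make the pread vs. streaming distinction above concrete, here is a minimal sketch using plain java.nio rather than the HDFS or HBase APIs. The class name, method names, and the streamLock field are made up for illustration and are not the actual HBase code. It only shows why a positional read can share one handle without a lock while a seek-then-read on a shared handle cannot.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.locks.ReentrantLock;

public class PreadVsStream {
  // Hypothetical stand-in for the stream lock discussed above.
  private static final ReentrantLock streamLock = new ReentrantLock();

  // Positional read: takes an explicit offset and leaves the channel's own
  // position untouched, so concurrent callers can share the channel.
  static String pread(FileChannel ch, long offset, int len) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(len);
    long pos = offset;
    while (buf.hasRemaining()) {
      int n = ch.read(buf, pos);
      if (n < 0) break; // end of file
      pos += n;
    }
    return new String(buf.array(), 0, buf.position(), StandardCharsets.US_ASCII);
  }

  // Seek + read: the channel position is shared mutable state, so callers
  // sharing one channel must be serialized by the lock.
  static String seekAndRead(FileChannel ch, long offset, int len) throws IOException {
    streamLock.lock();
    try {
      ch.position(offset);
      ByteBuffer buf = ByteBuffer.allocate(len);
      while (buf.hasRemaining()) {
        if (ch.read(buf) < 0) break; // end of file
      }
      return new String(buf.array(), 0, buf.position(), StandardCharsets.US_ASCII);
    } finally {
      streamLock.unlock();
    }
  }

  public static void main(String[] args) throws IOException {
    Path p = Files.createTempFile("blocks", ".bin");
    try {
      Files.write(p, "AAAABBBBCCCCDDDD".getBytes(StandardCharsets.US_ASCII));
      try (FileChannel ch = FileChannel.open(p, StandardOpenOption.READ)) {
        System.out.println(pread(ch, 4, 4));       // BBBB
        System.out.println(ch.position());         // 0, pread did not move it
        System.out.println(seekAndRead(ch, 8, 4)); // CCCC
        System.out.println(ch.position());         // 12, the shared position moved
      }
    } finally {
      Files.deleteIfExists(p);
    }
  }
}
```

This says nothing about the relative throughput of the two paths; that still needs to be measured.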

> Use separated StoreFileReader for streaming read
> ------------------------------------------------
>
>                 Key: HBASE-17910
>                 URL: https://issues.apache.org/jira/browse/HBASE-17910
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Duo Zhang
>
> For now we have already supported using private readers for compaction, by 
> creating a new StoreFile copy. I think a better way is to allow creating 
> multiple readers from a single StoreFile instance, thus we can avoid the ugly 
> cloning, and the reader can also be used for streaming scan, not only for 
> compaction.
> The reason we want to do this is that we found a read amplification when 
> using short-circuit read. {{BlockReaderLocal}} will use an internal buffer to 
> read data first, the buffer size is based on the configured buffer size and 
> the readahead option in CachingStrategy. For normal pread request, we should 
> just bypass the buffer, this can be achieved by setting readahead to 0. But 
> for streaming read I think the buffer is somehow still useful? So we need to 
> use different FSDataInputStream for pread and streaming read.
> And one more thing is that we can also remove the streamLock if streaming 
> read always uses its own reader.
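
The shape proposed in the quoted description (one file object, several readers) might look roughly like the sketch below. It is written against plain java.io/java.nio, not the real StoreFile, StoreFileReader, or FSDataInputStream classes. FileWithReaders, pread, and openStreamReader are hypothetical names, and the buffer size passed to BufferedInputStream merely stands in for the readahead setting. The assumption illustrated is that random reads share one unbuffered handle, while each scan or compaction gets a private buffered stream and therefore needs no lock.

```java
import java.io.BufferedInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class FileWithReaders implements AutoCloseable {
  private final Path path;
  private final FileChannel shared; // shared handle, positional reads only

  FileWithReaders(Path path) throws IOException {
    this.path = path;
    this.shared = FileChannel.open(path, StandardOpenOption.READ);
  }

  // Random read: fetch exactly the requested range through the shared
  // handle, with no intermediate buffer and no shared position.
  byte[] pread(long offset, int len) throws IOException {
    ByteBuffer buf = ByteBuffer.allocate(len);
    long pos = offset;
    while (buf.hasRemaining()) {
      int n = shared.read(buf, pos);
      if (n < 0) break; // end of file
      pos += n;
    }
    return Arrays.copyOf(buf.array(), buf.position());
  }

  // Streaming read: every caller gets its own buffered stream with its own
  // position, so no lock is needed between scans.
  DataInputStream openStreamReader(int bufferSize) throws IOException {
    return new DataInputStream(
        new BufferedInputStream(Files.newInputStream(path), bufferSize));
  }

  @Override
  public void close() throws IOException {
    shared.close();
  }

  static String ascii(byte[] b) {
    return new String(b, StandardCharsets.US_ASCII);
  }

  public static void main(String[] args) throws IOException {
    Path p = Files.createTempFile("storefile", ".bin");
    try {
      Files.write(p, "AAAABBBBCCCC".getBytes(StandardCharsets.US_ASCII));
      try (FileWithReaders f = new FileWithReaders(p);
           DataInputStream scan1 = f.openStreamReader(8192);
           DataInputStream scan2 = f.openStreamReader(8192)) {
        byte[] a = new byte[4];
        byte[] b = new byte[4];
        scan1.readFully(a);                           // scan1 is now at offset 4
        scan2.readFully(b);                           // scan2 starts from 0 on its own
        System.out.println(ascii(a) + " " + ascii(b)); // AAAA AAAA
        System.out.println(ascii(f.pread(4, 4)));      // BBBB
        scan1.readFully(a);                           // unaffected by the pread
        System.out.println(ascii(a));                  // BBBB
      }
    } finally {
      Files.deleteIfExists(p);
    }
  }
}
```

Whether the streaming side should keep a buffer at all is the open question from the description; the sketch just makes the buffer size a per-reader choice.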



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
