[
https://issues.apache.org/jira/browse/HBASE-27896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17727580#comment-17727580
]
Duo Zhang commented on HBASE-27896:
-----------------------------------
{quote}
When that flag is false, we pass a readahead value of 0 to
FSDataInputStream.setReadahead. When the flag is true, we pass -1 which
triggers hdfs default behavior. The default behavior is to use a readahead of
4MB.
{quote}
I think it is the other way around: we pass 0 when the flag is true and -1 when the flag is false.
https://github.com/apache/hbase/blob/79c985f2468429d717b82563e81116c4c5f75f5d/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HStoreFile.java#LL382C58-L382C58
{code}
long readahead = fileInfo.isNoReadahead() ? 0L : -1L;
ReaderContext context = fileInfo.createReaderContext(false, readahead,
  ReaderType.PREAD);
{code}
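To spell out the mapping as I read it from that line, here is a minimal standalone sketch. The class and method names ({{ReadaheadSketch}}, {{readaheadFor}}) are hypothetical and not part of HBase; only the ternary mirrors the linked code. The reading of -1 as "defer to the hdfs default" is taken from the issue description, not verified here.

```java
// Standalone illustration only; this is not HBase code.
final class ReadaheadSketch {

    // Hypothetical helper that mirrors the ternary in HStoreFile:
    // no-readahead = true  -> 0  (readahead disabled)
    // no-readahead = false -> -1 (per the issue description, hdfs default applies)
    public static long readaheadFor(boolean noReadahead) {
        return noReadahead ? 0L : -1L;
    }

    public static void main(String[] args) {
        System.out.println("no-readahead=true  -> " + readaheadFor(true));
        System.out.println("no-readahead=false -> " + readaheadFor(false));
    }
}
```

So with the default of false, the PREAD reader would get -1, which matches the observation that readahead is enabled by default.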
> Disable hdfs readahead for pread reads
> --------------------------------------
>
> Key: HBASE-27896
> URL: https://issues.apache.org/jira/browse/HBASE-27896
> Project: HBase
> Issue Type: Improvement
> Reporter: Bryan Beaudreault
> Priority: Major
>
> In https://issues.apache.org/jira/browse/HBASE-17914, a flag was introduced
> {{hbase.store.reader.no-readahead}}. The default is false, so readahead
> is enabled. This flag is used for creating the default store reader (i.e. the
> one used by PREAD reads). Stream readers don't use this flag, instead they
> always pass -1.
> When that flag is false, we pass a readahead value of 0 to
> FSDataInputStream.setReadahead. When the flag is true, we pass -1 which
> triggers hdfs default behavior. The default behavior is to use a readahead of
> 4MB.
> It seems to me that we don't want readahead for PREAD reads, and especially
> not such a large readahead. Our default block size is 64KB, which is much
> smaller than that. A PREAD read is not likely to do sequential IO, so not
> likely to utilize the cached readahead buffer.
> I set no-readahead to true in a few of our clusters and in each case saw a
> massive reduction in disk IO and thus increase in throughput. I load tested
> this in a test cluster which does fully random reads of ~300 byte rows on a
> dataset which is 20x larger than memory. The load test was able to achieve
> nearly double the throughput.
> As a follow-on, we might consider tuning the readahead for STREAM reads. 4MB
> seems way too big for many common workloads.