[ https://issues.apache.org/jira/browse/HDFS-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633288#comment-13633288 ]
Colin Patrick McCabe commented on HDFS-4697:
--------------------------------------------
It is certainly true that the OS does some readahead of its own. However, we
found that doing manual readahead provided a performance boost in many
scenarios, especially ones involving long sequential reads. That's why the
Datanode currently does readahead by default. These settings should be honored
when using short-circuit local reads, so that the behavior is consistent and
configurable.
Most of HBase's reads are random, and readahead does not help random reads.
The current readahead code in the DN skips readahead when small, random reads
are being done, and we should follow suit in {{BlockReaderLocal}}. I do think
HBase will see some benefit during long scans and compactions.
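To make the "skip readahead on small, random reads" idea concrete, here is a
minimal sketch of one possible heuristic. It is not the actual DN or
{{BlockReaderLocal}} code: the class name, the {{minSequentialBytes}}
threshold, and the {{onRead}} method are all hypothetical, and
{{readaheadBytes}} only stands in for whatever {{dfs.datanode.readahead.bytes}}
is set to. The sketch tracks how many bytes have been read back-to-back and
only suggests readahead once the sequential run is long enough.

```java
/**
 * Hypothetical sketch of a sequential-read detector. This is NOT the real
 * DataNode or BlockReaderLocal logic; all names here are illustrative.
 */
final class SequentialReadaheadSketch {
    // Stand-in for the configured dfs.datanode.readahead.bytes value.
    private final long readaheadBytes;
    // Hypothetical threshold: how long a sequential run must be before
    // readahead is considered worthwhile.
    private final long minSequentialBytes;

    // Offset at which the next read must start to count as sequential.
    private long expectedNextOffset = -1;
    // Bytes read back-to-back in the current sequential run.
    private long sequentialRunBytes = 0;

    SequentialReadaheadSketch(long readaheadBytes, long minSequentialBytes) {
        this.readaheadBytes = readaheadBytes;
        this.minSequentialBytes = minSequentialBytes;
    }

    /**
     * Records a read of {@code length} bytes at {@code offset} and returns
     * how many bytes past the end of this read should be read ahead.
     * Returns 0 for seeks and for runs that are still too short.
     */
    long onRead(long offset, int length) {
        if (offset == expectedNextOffset) {
            sequentialRunBytes += length;   // continues the current run
        } else {
            sequentialRunBytes = length;    // a seek starts a new run
        }
        expectedNextOffset = offset + length;

        if (readaheadBytes <= 0 || sequentialRunBytes < minSequentialBytes) {
            return 0;
        }
        return readaheadBytes;
    }

    public static void main(String[] args) {
        SequentialReadaheadSketch tracker =
            new SequentialReadaheadSketch(4L * 1024 * 1024, 256L * 1024);
        long offset = 0;
        for (int i = 0; i < 6; i++) {
            long hint = tracker.onRead(offset, 64 * 1024);
            System.out.println("read at " + offset + " -> readahead " + hint);
            offset += 64 * 1024;
        }
        // A seek to an unrelated offset resets the run.
        System.out.println("seek -> readahead "
            + tracker.onRead(50_000_000L, 4096));
    }
}
```

A caller would pass a non-zero result on to the OS as a hint, for example
via {{posix_fadvise}} with {{POSIX_FADV_WILLNEED}}. With a scheme like this,
HBase-style point reads never accumulate a long enough run to trigger
readahead, while scans and compactions cross the threshold quickly.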
As you mentioned, it's true that readahead is not always a win when memory
pressure is extremely high. However, when memory pressure is so high that
data that was read ahead has to be evicted before it is used, the system
usually has other problems that make it unstable and essentially unusable,
like the OOM killer triggering.
> short-circuit reads do not honor readahead settings
> ---------------------------------------------------
>
> Key: HDFS-4697
> URL: https://issues.apache.org/jira/browse/HDFS-4697
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Affects Versions: 2.0.3-alpha
> Reporter: Colin Patrick McCabe
> Assignee: Colin Patrick McCabe
> Priority: Minor
>
> Neither the new nor the legacy short-circuit read implementations honor
> {{dfs.datanode.readahead.bytes}}. This can result in scenarios where
> non-short-circuit reads are faster for long reads, simply because they are
> doing more readahead, and SCR is not. We should do readahead in both cases
> if it is configured.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira