[
https://issues.apache.org/jira/browse/HADOOP-16317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857907#comment-16857907
]
Steve Loughran commented on HADOOP-16317:
-----------------------------------------
Update
The HADOOP-15229 API lets callers who switch to the openFile() API pass in
options. If you want to define a standard seek policy option, one with a standard
set of values (issue: what are those standard values?), then it could be shared
across stores.
Probable set of values:
* Default: whatever the default is
* Adaptive: adapting
* sequential
* random: warn of arbitrary random IO
* columnar: columnar formats. Could map to random, but give the implementations
the chance to do something even more specific for those read plans.
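As a rough sketch of how such a shared set of values might be modelled, assuming
nothing beyond the list above: the SeekPolicy enum, fromOption() and resolve() are
hypothetical names for illustration, not part of any Hadoop API. The fallback from
columnar to random reflects the "could map to random" note; falling back to the
default for everything else is an assumption.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical shared seek policies, named after the values listed above. */
enum SeekPolicy {
    DEFAULT, ADAPTIVE, SEQUENTIAL, RANDOM, COLUMNAR;

    /** Parse an option value case-insensitively; null or unknown values become DEFAULT. */
    static SeekPolicy fromOption(String value) {
        if (value == null) {
            return DEFAULT;
        }
        try {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return DEFAULT; // assumption: an unrecognised policy is not an error
        }
    }

    /** Pick the policy a store would actually apply, given the policies it supports. */
    SeekPolicy resolve(Set<SeekPolicy> supported) {
        if (supported.contains(this)) {
            return this;
        }
        // A store with no columnar-specific read plan can treat columnar as random IO.
        if (this == COLUMNAR && supported.contains(RANDOM)) {
            return RANDOM;
        }
        return DEFAULT;
    }
}

class SeekPolicyDemo {
    public static void main(String[] args) {
        // Hypothetical store that has no columnar-specific implementation.
        Set<SeekPolicy> store =
            EnumSet.of(SeekPolicy.DEFAULT, SeekPolicy.SEQUENTIAL, SeekPolicy.RANDOM);
        System.out.println(SeekPolicy.fromOption("columnar").resolve(store));
    }
}
```

A caller would pass the chosen value as a string option when opening the file; the
option key is left out here because no standard key has been defined yet.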
There is a seek option for openFile() and S3A. You could do one for abfs, but
it'd be a lot better to have a unified one for abfs+s3a+wasb, maybe even HDFS.
> ABFS: improve random read performance
> -------------------------------------
>
> Key: HADOOP-16317
> URL: https://issues.apache.org/jira/browse/HADOOP-16317
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.2.0
> Reporter: Da Zhou
> Priority: Major
>
> Improving random read performance is an interesting topic. ABFS doesn't
> perform well when reading columnar format files, as the process involves
> many seek operations which make readAhead useless, and if readAhead is
> used unwisely it leads to unnecessary data requests.
> Hence creating this Jira as a reminder to track the investigation and
> progress of the work.