[ 
https://issues.apache.org/jira/browse/HADOOP-16317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16857907#comment-16857907
 ] 

Steve Loughran commented on HADOOP-16317:
-----------------------------------------

Update

The HADOOP-15229 API lets callers who switch to the openFile() API pass in 
options. If you want to define a standard seek policy option, one with a 
standard set of values (issue: what are those standard values?), then it could 
be shared across stores.

Probable set of values:
* default: whatever the store's default is
* adaptive: adapt to the observed read pattern
* sequential
* random: tell the store to expect arbitrary random IO
* columnar: columnar formats. Could map to random, but gives the implementations 
the chance to do something even more specific for those read plans.
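
As a rough illustration of the list above, here is a minimal, self-contained 
sketch of how a shared policy value might be parsed and then resolved by a 
store that only implements some of the policies. The names SeekPolicy, parse() 
and resolve() are hypothetical and are not part of any Hadoop API; the fallback 
of columnar to random follows the "could map to random" idea, and falling back 
to default for anything else unsupported is an assumption.

```java
import java.util.EnumSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical cross-store seek policies, taken from the list above. */
enum SeekPolicy {
    DEFAULT, ADAPTIVE, SEQUENTIAL, RANDOM, COLUMNAR;

    /** Parse an option value case-insensitively; null or unknown values become DEFAULT. */
    static SeekPolicy parse(String value) {
        if (value == null) {
            return DEFAULT;
        }
        try {
            return valueOf(value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            return DEFAULT;
        }
    }

    /** Choose the policy a store would actually use, given the ones it implements. */
    SeekPolicy resolve(Set<SeekPolicy> supported) {
        if (supported.contains(this)) {
            return this;
        }
        // Assumption: a store with no columnar-specific handling treats it as random IO.
        if (this == COLUMNAR && supported.contains(RANDOM)) {
            return RANDOM;
        }
        return DEFAULT;
    }
}

final class SeekPolicyDemo {
    public static void main(String[] args) {
        // A store that knows nothing about columnar or adaptive reads.
        Set<SeekPolicy> basicStore =
            EnumSet.of(SeekPolicy.DEFAULT, SeekPolicy.SEQUENTIAL, SeekPolicy.RANDOM);
        System.out.println(SeekPolicy.parse("columnar").resolve(basicStore)); // RANDOM
        System.out.println(SeekPolicy.parse("Adaptive").resolve(basicStore)); // DEFAULT
    }
}
```

With something like this, a caller could ask for "columnar" everywhere and 
each store would degrade to the closest policy it supports rather than fail.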

There is a seek option for openFile() and S3A. You could do one for abfs, but 
it'd be a lot better to have a unified one for abfs+s3a+wasb, maybe even HDFS.



> ABFS: improve random read performance
> -------------------------------------
>
>                 Key: HADOOP-16317
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16317
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.2.0
>            Reporter: Da Zhou
>            Priority: Major
>
> Improving random read performance is an interesting topic. ABFS doesn't 
> perform well when reading columnar format files, as the process involves 
> many seek operations which make readAhead useless, and if readAhead is 
> used unwisely it leads to unnecessary data requests.
> Hence creating this Jira as a reminder to track the investigation and 
> progress of the work.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
