[ https://issues.apache.org/jira/browse/HADOOP-16202?focusedWorklogId=536449&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-536449 ]
ASF GitHub Bot logged work on HADOOP-16202:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 15/Jan/21 12:22
Start Date: 15/Jan/21 12:22
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2584:
URL: https://github.com/apache/hadoop/pull/2584#issuecomment-760911137
@ThomasMarquardt could you take a look at this?
* I've updated the docs as suggested.
* I've proposed making the policy broader than just a seek policy, so that
stores can turn on whatever other tuning options they have, especially for file
types they've profiled.
The goal there is that, rather than setting cluster-wide options which work well
for some datatypes but are suboptimal for others, the app passes more information
down to the store.
The weakness there is that with multiple libraries working with Parquet data
(spark, parquet.jar, iceberg, impala) it's not enough to declare the format.
You'd really need to declare your app and version.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 536449)
Time Spent: 8h (was: 7h 50m)
> Stabilize openFile() and adopt internally
> -----------------------------------------
>
> Key: HADOOP-16202
> URL: https://issues.apache.org/jira/browse/HADOOP-16202
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs, fs/s3, tools/distcp
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8h
> Remaining Estimate: 0h
>
> The {{openFile()}} builder API lets us add new options when reading a file.
> Add an option {{"fs.s3a.open.option.length"}} which takes a long and allows
> the length of the file to be declared. If set, *no check for the existence of
> the file is issued when opening the file*.
> Also: allow withFileStatus() to take any FileStatus implementation, rather than
> only S3AFileStatus, and do not check that the path matches the path being
> opened. This is needed to support viewFS-style wrapping and mounting.
> Finally, adopt the API where appropriate to stop clusters with S3A reads
> switched to random IO from killing download/localization:
> * fs shell copyToLocal
> * distcp
> * IOUtils.copy
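
The effect of the length option described above can be illustrated with a toy
model. This is only a sketch, not S3A code: ToyStore, OpenBuilder and ToyStream
are hypothetical classes invented for the illustration, and the only name taken
from the issue is the option key. It assumes the store would otherwise issue one
metadata probe per open, and shows that a declared length lets the open skip it.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;

// Toy model only: none of the classes below exist in Hadoop.
class OpenFileSketch {

    /** Option key quoted in the issue description. */
    static final String OPT_LENGTH = "fs.s3a.open.option.length";

    /** Hypothetical in-memory "object store" that counts metadata probes. */
    static class ToyStore {
        private final Map<String, Long> objects = new HashMap<>();
        private int probes = 0;

        void put(String path, long length) { objects.put(path, length); }

        int probeCount() { return probes; }

        /** Stands in for a remote existence/length check. */
        long probeLength(String path) throws FileNotFoundException {
            probes++;
            Long len = objects.get(path);
            if (len == null) {
                throw new FileNotFoundException(path);
            }
            return len;
        }

        OpenBuilder openFile(String path) { return new OpenBuilder(this, path); }
    }

    /** Hypothetical builder that collects string options before opening. */
    static class OpenBuilder {
        private final ToyStore store;
        private final String path;
        private final Map<String, String> options = new HashMap<>();

        OpenBuilder(ToyStore store, String path) {
            this.store = store;
            this.path = path;
        }

        OpenBuilder opt(String key, String value) {
            options.put(key, value);
            return this;
        }

        CompletableFuture<ToyStream> build() {
            CompletableFuture<ToyStream> result = new CompletableFuture<>();
            try {
                String declared = options.get(OPT_LENGTH);
                // A declared length is trusted; otherwise the store is probed.
                long length = declared != null
                        ? Long.parseLong(declared)
                        : store.probeLength(path);
                result.complete(new ToyStream(path, length));
            } catch (FileNotFoundException | NumberFormatException e) {
                result.completeExceptionally(e);
            }
            return result;
        }
    }

    /** Hypothetical stream handle; it only records what the open decided. */
    static class ToyStream {
        final String path;
        final long length;

        ToyStream(String path, long length) {
            this.path = path;
            this.length = length;
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.put("data/part-0000.parquet", 1024L);

        ToyStream probed = store.openFile("data/part-0000.parquet").build().join();
        System.out.println(probed.length + " bytes, probes=" + store.probeCount());

        ToyStream declared = store.openFile("data/part-0000.parquet")
                .opt(OPT_LENGTH, "1024").build().join();
        System.out.println(declared.length + " bytes, probes=" + store.probeCount());
    }
}
```

In this model the probe count stays at 1 after the second open. One consequence
of skipping the check is that opening a missing file with a declared length does
not fail at open time; in the model, any failure would only surface later, when
the data is actually read.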
--
This message was sent by Atlassian Jira
(v8.3.4#803005)