[ https://issues.apache.org/jira/browse/HADOOP-17347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-17347:
------------------------------------
Summary: ABFS: read/cache footer with fs.azure.footer.read.request.size
(was: ABFS: Optimise read for small files/tails of files)
> ABFS: read/cache footer with fs.azure.footer.read.request.size
> --------------------------------------------------------------
>
> Key: HADOOP-17347
> URL: https://issues.apache.org/jira/browse/HADOOP-17347
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.4.0
> Reporter: Bilahari T H
> Assignee: Bilahari T H
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.1
>
> Time Spent: 12h 50m
> Remaining Estimate: 0h
>
> Optimize read performance for the following scenarios:
> # Read small files completely
> A file smaller than the read buffer size is treated as a small file. For
> such files it is better to read the whole file into the AbfsInputStream
> buffer.
> # Read the last block if the read is for the footer
> If the read is for the last 8 bytes, read the full file.
> This will optimize reads for Parquet files. [Parquet file
> format|https://www.ellicium.com/parquet-file-format-structure/]
> Both optimizations are controlled by the following configs:
> # fs.azure.read.smallfilescompletely
> # fs.azure.read.optimizefooterread
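
The two scenarios above can be sketched as a small range-planning function.
This is only an illustration under stated assumptions, not the actual
AbfsInputStream code: the class ReadPlanSketch, the method plan and its
parameters are hypothetical names. The description says "last block" in the
heading and "full file" in the body; the sketch assumes the last buffer-sized
block is fetched, which is the full file whenever the file fits in the buffer.
The 8-byte check corresponds to the Parquet tail (a 4-byte footer length
followed by the 4-byte magic "PAR1").

```java
// Hypothetical sketch of the read planning described in HADOOP-17347.
// None of these names come from the Hadoop code base.
final class ReadPlanSketch {
    // Size of the Parquet tail: 4-byte footer length + 4-byte magic.
    static final int FOOTER_SIZE = 8;

    private ReadPlanSketch() {
    }

    /**
     * Returns {start, end} (end exclusive) of the byte range to fetch.
     * The two booleans stand in for fs.azure.read.smallfilescompletely
     * and fs.azure.read.optimizefooterread.
     */
    static long[] plan(long fileLength, int bufferSize, long position, int length,
                       boolean readSmallFilesCompletely, boolean optimizeFooterRead) {
        // Scenario 1: the file is smaller than the read buffer, so fetch all of it.
        if (readSmallFilesCompletely && fileLength < bufferSize) {
            return new long[] {0L, fileLength};
        }
        // Scenario 2: the caller asks for exactly the last 8 bytes (the footer).
        // Assumption: fetch the last buffer-sized block so that the footer
        // metadata read next is likely to be served from the buffer.
        boolean isFooterRead = length == FOOTER_SIZE
                && position == fileLength - FOOTER_SIZE;
        if (optimizeFooterRead && isFooterRead) {
            long start = Math.max(0L, fileLength - bufferSize);
            return new long[] {start, fileLength};
        }
        // Otherwise: only the requested range, clamped to the file length
        // (a simplification; a real stream would usually fill its buffer).
        long end = Math.min(fileLength, position + length);
        return new long[] {position, end};
    }
}
```

For example, under these assumptions plan(10000, 4096, 9992, 8, true, true)
selects the last 4096 bytes of the file, while the same call with both flags
set to false selects only the 8 requested bytes.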