[
https://issues.apache.org/jira/browse/HADOOP-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17839930#comment-17839930
]
ASF GitHub Bot commented on HADOOP-19102:
-----------------------------------------
saxenapranav opened a new pull request, #6763:
URL: https://github.com/apache/hadoop/pull/6763
JIRA: https://issues.apache.org/jira/browse/HADOOP-19102
PR on trunk: https://github.com/apache/hadoop/pull/6617
Merged commit on trunk:
https://github.com/apache/hadoop/commit/6404692c0973a7b018ca77f4aaad4248b62782e2
The method `optimisedRead` creates a buffer array of size `readBufferSize`.
If footerReadBufferSize is greater than readBufferSize, abfs will attempt to
read more data than the buffer array can hold, which causes an exception.
Change: To avoid this, we will keep footerBufferSize =
min(readBufferSizeConfig, footerBufferSizeConfig)
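
A minimal sketch of the clamping rule described above, for illustration only.
The class and method names here are hypothetical and are not taken from the
ABFS code; only the min() rule itself comes from the description.

```java
// Illustrative only: FooterBufferClamp and effectiveFooterBufferSize are
// hypothetical names, not part of the hadoop-azure code base.
final class FooterBufferClamp {
    private FooterBufferClamp() {}

    /**
     * Returns the footer read size to use, never larger than the read
     * buffer, so a footer read always fits in a buffer of readBufferSize.
     */
    static int effectiveFooterBufferSize(int readBufferSizeConfig,
                                         int footerBufferSizeConfig) {
        return Math.min(readBufferSizeConfig, footerBufferSizeConfig);
    }

    public static void main(String[] args) {
        int readBufferSize = 4 * 1024;     // assumed example value
        int footerBufferConfig = 8 * 1024; // larger than the read buffer
        int footer = effectiveFooterBufferSize(readBufferSize, footerBufferConfig);

        // The buffer is sized by readBufferSize, as in the description.
        byte[] buffer = new byte[readBufferSize];
        System.out.println("footer read size " + footer
            + " fits in buffer of " + buffer.length + " bytes: "
            + (footer <= buffer.length));
    }
}
```

With the clamp in place, a configured footer size larger than the read buffer
is reduced to the read buffer size, and smaller values pass through unchanged.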
Test change: `ITestAbfsInputStreamReadFooter` tests different scenarios with
different combinations of fileSize and footerBufferReadSize. This PR adds
readBufferSize as a further dimension, so each test case is now a combination
of fileSize, readBufferSize, and footerBufferReadSize.
Also, as part of this PR, the tests within `ITestAbfsInputStreamReadFooter`
have been improved. Some tests cover multiple combinations, and a file was
being created for every combination, although a new file is only needed for
each different fileSize.
The change: one thread is started per fileSize, and each thread runs all the
combinations for that particular fileSize. The file is therefore created only
once per fileSize, and the assertions for different fileSizes run in parallel,
making better use of the available hardware.
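
The threading scheme above can be sketched roughly as follows. This is not the
test code from the PR: `PerFileSizeRunner`, `runAll` and the in-memory "file"
are assumptions made to keep the example self-contained, and the real test
creates files on the ABFS store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: one task per fileSize, each task creates its file once
// and then runs every (readBufferSize, footerBufferSize) combination on it.
final class PerFileSizeRunner {
    private PerFileSizeRunner() {}

    /** Returns {filesCreated, combinationsChecked}; fileSizes must be non-empty. */
    static int[] runAll(int[] fileSizes, int[] readBufferSizes,
                        int[] footerBufferSizes) {
        AtomicInteger filesCreated = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(fileSizes.length);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int fileSize : fileSizes) {
                futures.add(pool.submit(() -> {
                    // Stand-in for creating the test file: done once per fileSize.
                    byte[] file = new byte[fileSize];
                    filesCreated.incrementAndGet();
                    int checked = 0;
                    for (int readBuffer : readBufferSizes) {
                        for (int footerBuffer : footerBufferSizes) {
                            int effective = Math.min(readBuffer, footerBuffer);
                            // Stand-in assertion: the footer read fits the buffer.
                            if (effective > readBuffer || file.length != fileSize) {
                                throw new AssertionError("combination failed");
                            }
                            checked++;
                        }
                    }
                    return checked;
                }));
            }
            int total = 0;
            for (Future<Integer> future : futures) {
                total += future.get(); // rethrows any failure from a task
            }
            return new int[] {filesCreated.get(), total};
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        int[] result = runAll(new int[] {256, 512},
            new int[] {64, 128, 256}, new int[] {32, 512});
        System.out.println("files created: " + result[0]
            + ", combinations checked: " + result[1]);
    }
}
```

The point of the design is that file creation scales with the number of
fileSizes rather than with the number of combinations, while the work for
different fileSizes overlaps in time.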
Improvement: on a 6-processor VM [outside the Azure network], running all
tests of `ITestAbfsInputStreamReadFooter` took 8 min 47 sec on trunk and 7 min
on the PR branch. (This was an IDE run, in which the test methods run one
after another, unlike the maven-surefire command (used in the runTest script),
which can run tests in parallel.)
> [ABFS]: FooterReadBufferSize should not be greater than readBufferSize
> ----------------------------------------------------------------------
>
> Key: HADOOP-19102
> URL: https://issues.apache.org/jira/browse/HADOOP-19102
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.4.0
> Reporter: Pranav Saxena
> Assignee: Pranav Saxena
> Priority: Major
> Labels: pull-request-available
>
> The method `optimisedRead` creates a buffer array of size `readBufferSize`.
> If footerReadBufferSize is greater than readBufferSize, abfs will attempt to
> read more data than the buffer array can hold, which causes an exception.
> Change: To avoid this, we will keep footerBufferSize =
> min(readBufferSizeConfig, footerBufferSizeConfig)
>
>