[
https://issues.apache.org/jira/browse/MAPREDUCE-7520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18051397#comment-18051397
]
ASF GitHub Bot commented on MAPREDUCE-7520:
-------------------------------------------
github-actions[bot] closed pull request #8010: MAPREDUCE-7520
testCombineFileInputFormat is overly constrained and can sometimes fail
URL: https://github.com/apache/hadoop/pull/8010
> testCombineFileInputFormat is overly constrained and can sometimes fail
> -----------------------------------------------------------------------
>
> Key: MAPREDUCE-7520
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-7520
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Paco Chan
> Priority: Trivial
> Labels: pull-request-available
>
> The Hadoop documentation states that the number of paths per split in
> {{CombineFileInputFormat}} is not fixed and can vary.
>
> {quote}"If a maxSplitSize is specified, then blocks on the same node are
> combined to form a single split. Blocks that are left over are then combined
> with other blocks in the same rack."
> [hadoop.apache.org|https://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapred/lib/CombineFileInputFormat.html?utm_source=chatgpt.com]
> {quote}
> This means that the number of paths in a split depends on block placement and
> the configuration settings, so it can vary.
>
> Because the test asserts an exact number of paths in each split, it sometimes
> fails depending on how the blocks are grouped. The test could be reworked to
> avoid strictly checking the number of paths in each split.
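One way to read the suggested rework, as a minimal sketch only: instead of asserting how many paths land in each split, check that the splits taken together cover the expected paths. The class `SplitCoverageCheck` and the method `coversAllPaths` are hypothetical names, not part of Hadoop or of the pull request. Splits are modelled as plain lists of path strings so the example runs without Hadoop on the classpath, and block offsets and lengths are ignored.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not Hadoop code: models each split as a list of path strings.
public class SplitCoverageCheck {

    /**
     * Returns true when the union of paths across all splits equals the
     * expected set, regardless of how many paths each split contains.
     */
    static boolean coversAllPaths(List<List<String>> splits, Set<String> expected) {
        Set<String> seen = new HashSet<>();
        for (List<String> split : splits) {
            seen.addAll(split); // grouping is ignored; only membership matters
        }
        return seen.equals(expected);
    }

    public static void main(String[] args) {
        Set<String> expected = new HashSet<>(Arrays.asList("/a", "/b", "/c"));

        // Two different groupings of the same paths, plus one that drops a path.
        List<List<String>> oneSplit = Arrays.asList(Arrays.asList("/a", "/b", "/c"));
        List<List<String>> twoSplits = Arrays.asList(Arrays.asList("/a"), Arrays.asList("/c", "/b"));
        List<List<String>> missing = Arrays.asList(Arrays.asList("/a"), Arrays.asList("/b"));

        System.out.println(coversAllPaths(oneSplit, expected));  // true
        System.out.println(coversAllPaths(twoSplits, expected)); // true
        System.out.println(coversAllPaths(missing, expected));   // false
    }
}
```

An assertion of this shape passes for any grouping the input format produces and still fails when a path is lost or an unexpected one appears. Whether the real test should also check offsets, lengths, or locations is a separate decision that this sketch does not address.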
--
This message was sent by Atlassian Jira
(v8.20.10#820010)