[ https://issues.apache.org/jira/browse/HADOOP-14943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17337593#comment-17337593 ]
Hadoop QA commented on HADOOP-14943:
------------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s{color} | {color:blue}{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 10s{color} | {color:red}{color} | {color:red} HADOOP-14943 does not apply to trunk. Rebase required? Wrong branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HADOOP-14943 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12910618/HADOOP-14943-004.patch |
| Console output | https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/188/console |
| versions | git=2.17.1 |
| Powered by | Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org |
This message was automatically generated.
> Add common getFileBlockLocations() emulation for object stores, including S3A
> -----------------------------------------------------------------------------
>
> Key: HADOOP-14943
> URL: https://issues.apache.org/jira/browse/HADOOP-14943
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Affects Versions: 2.8.1
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Minor
> Attachments: HADOOP-14943-001.patch, HADOOP-14943-002.patch,
> HADOOP-14943-002.patch, HADOOP-14943-003.patch, HADOOP-14943-004.patch
>
>
> It looks suspiciously like S3A isn't providing the partitioning data in
> {{listLocatedStatus}} and {{getFileBlockLocations()}} needed to break up a
> file by the block size. This will stop tools using the MRv1 APIs from doing
> the partitioning properly if the input format isn't doing its own split
> logic. FileInputFormat in MRv2 is a bit more configurable about input split
> calculation and will split up large files, but otherwise the partitioning is
> driven more by the default values of the executing engine than by any config
> data from the filesystem about what its "block size" is.
> NativeAzureFS does a better job; maybe that could be factored out to
> hadoop-common and reused?
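As context for what such an emulation might look like, here is a minimal sketch in Java. It assumes a hypothetical {{FakeBlockLocations}} helper (this is not the attached patch), which carves the file length into block-size ranges and reports each as a synthetic {{BlockLocation}} on localhost, roughly the trick NativeAzureFS uses, so that split calculation in the MR input formats has something to work with.

{code:java}
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;

/**
 * Hypothetical sketch (not the attached patch): emulate block locations
 * for an object store, which has no real blocks, by carving the file
 * length into block-size ranges that all claim to live on localhost.
 */
public final class FakeBlockLocations {

  private FakeBlockLocations() {
  }

  public static BlockLocation[] emulate(FileStatus status, long start, long len) {
    if (status == null || !status.isFile()
        || len <= 0 || start >= status.getLen()) {
      return new BlockLocation[0];
    }
    // No data locality in an object store: use placeholder name/host values.
    String[] names = {"localhost:9866"};
    String[] hosts = {"localhost"};
    long blockSize = Math.max(status.getBlockSize(), 1L);
    long end = Math.min(status.getLen(), start + len);

    // Begin at the block boundary containing 'start' so splits line up.
    long offset = (start / blockSize) * blockSize;
    List<BlockLocation> locations = new ArrayList<>();
    while (offset < end) {
      // The final block may be shorter than the nominal block size.
      long length = Math.min(blockSize, status.getLen() - offset);
      locations.add(new BlockLocation(names, hosts, offset, length));
      offset += blockSize;
    }
    return locations.toArray(new BlockLocation[0]);
  }
}
{code}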