[
https://issues.apache.org/jira/browse/PIG-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15832355#comment-15832355
]
Artem Ervits edited comment on PIG-5106 at 1/20/17 8:29 PM:
------------------------------------------------------------
Thank you for reviewing, [~rohini]. I fixed the formatting, but I'm struggling a
bit with the test. Do you want the test to check for an expected
{code}List<FileStatus>{code}? Also, my unit tests do not get executed no matter
what I pass to assert. Might I be changing the config incorrectly? I also
noticed there's PigSequenceFileInputFormat.java, but it has a different
implementation than the other InputFormat classes. What are your thoughts there?
was (Author: dbist13):
Thank you for reviewing, [~rohini]. I fixed the formatting, but I'm struggling a
bit with the test. Do you want the test to check for an expected
{code}List<FileStatus>{code}? Also, my unit tests do not get executed no matter
what I pass to assert. Might I be changing the config incorrectly?
> Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
> -----------------------------------------------------------------------------
>
> Key: PIG-5106
> URL: https://issues.apache.org/jira/browse/PIG-5106
> Project: Pig
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
> Assignee: Artem Ervits
> Labels: newbie
> Fix For: 0.17.0
>
> Attachments: PIG-5106-0.patch, PIG-5106-1.patch
>
>
> Many of our classes extending InputFormat have
> {code}
>     /*
>      * This is to support multi-level/recursive directory listing until
>      * MAPREDUCE-1577 is fixed.
>      */
>     @Override
>     protected List<FileStatus> listStatus(JobContext job) throws IOException {
>         return MapRedUtil.getAllFileRecursively(super.listStatus(job),
>                 job.getConfiguration());
>     }
> {code}
> Now that we have dropped Hadoop 1.x, it can be optimized to
> {code}
> if (getInputDirRecursive(job)) {
>     return super.listStatus(job);
> } else {
>     /*
>      * mapreduce.input.fileinputformat.input.dir.recursive is not true
>      * by default for backward compatibility reasons.
>      */
>     return MapRedUtil.getAllFileRecursively(super.listStatus(job),
>             job.getConfiguration());
> }
> {code}
> That would avoid one extra iteration when
> mapreduce.input.fileinputformat.input.dir.recursive is set to true by users.
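
To illustrate the branch described in the issue, and the kind of result a test could compare, here is a minimal sketch that models the two code paths with plain java.nio.file instead of the Hadoop FileSystem API. Everything in it is a hypothetical stand-in rather than Pig or Hadoop code: RecursiveListingSketch, listStatus, expandRecursively, listInputFiles and buildSampleTree are names made up for this example, and Path takes the place of FileStatus. It only assumes that a non-recursive listing leaves directories in the result and that the second pass expands them.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Simplified model of the listStatus() branch; uses java.nio.file, not Hadoop. */
final class RecursiveListingSketch {

    /** Stand-in for super.listStatus(job): descends into directories only if recursive is true. */
    static List<Path> listStatus(Path dir, boolean recursive) {
        List<Path> children;
        try (Stream<Path> stream = Files.list(dir)) {
            children = stream.sorted().collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        List<Path> result = new ArrayList<>();
        for (Path child : children) {
            if (recursive && Files.isDirectory(child)) {
                result.addAll(listStatus(child, true));
            } else {
                result.add(child); // directories stay in the result when not recursive
            }
        }
        return result;
    }

    /** Stand-in for MapRedUtil.getAllFileRecursively: a second pass that expands directories. */
    static List<Path> expandRecursively(List<Path> listed) {
        List<Path> result = new ArrayList<>();
        for (Path p : listed) {
            if (Files.isDirectory(p)) {
                result.addAll(listStatus(p, true));
            } else {
                result.add(p);
            }
        }
        return result;
    }

    /** The proposed branch: skip the second pass when the first listing was already recursive. */
    static List<Path> listInputFiles(Path inputDir, boolean recursiveFlag) {
        if (recursiveFlag) {
            return listStatus(inputDir, true);
        }
        return expandRecursively(listStatus(inputDir, false));
    }

    /** Builds a small temporary tree: root/a.txt and root/sub/b.txt. */
    static Path buildSampleTree() {
        try {
            Path root = Files.createTempDirectory("pig5106-sketch");
            Files.createFile(root.resolve("a.txt"));
            Path sub = Files.createDirectory(root.resolve("sub"));
            Files.createFile(sub.resolve("b.txt"));
            return root;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling listInputFiles on the tree from buildSampleTree with the flag set to true and then to false should return the same two files in both cases. A unit test for the real InputFormat classes could make a comparable assertion on the returned List<FileStatus>, though the actual test setup in Pig may need to look quite different.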