[ https://issues.apache.org/jira/browse/PIG-5106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rohini Palaniswamy updated PIG-5106:
------------------------------------
Fix Version/s: 0.18.0 (was: 0.17.0)
> Optimize when mapreduce.input.fileinputformat.input.dir.recursive set to true
> -----------------------------------------------------------------------------
>
> Key: PIG-5106
> URL: https://issues.apache.org/jira/browse/PIG-5106
> Project: Pig
> Issue Type: Bug
> Reporter: Rohini Palaniswamy
> Assignee: Artem Ervits
> Labels: newbie
> Fix For: 0.18.0
>
> Attachments: PIG-5106-0.patch, PIG-5106-1.patch
>
>
> Many of our classes extending InputFormat have
> {code}
> /*
>  * This is to support multi-level/recursive directory listing until
>  * MAPREDUCE-1577 is fixed.
>  */
> @Override
> protected List<FileStatus> listStatus(JobContext job) throws IOException {
>     return MapRedUtil.getAllFileRecursively(super.listStatus(job),
>             job.getConfiguration());
> }
> {code}
> Now that we have dropped Hadoop 1.x, it can be optimized to
> {code}
> if (getInputDirRecursive(job)) {
>     return super.listStatus(job);
> } else {
>     /*
>      * mapreduce.input.fileinputformat.input.dir.recursive is not true
>      * by default for backward compatibility reasons.
>      */
>     return MapRedUtil.getAllFileRecursively(super.listStatus(job),
>             job.getConfiguration());
> }
> {code}
> That would avoid one extra iteration over the already-expanded file list when
> users set mapreduce.input.fileinputformat.input.dir.recursive to true.
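
To illustrate the effect of the proposed branch, here is a minimal sketch that runs without Hadoop on the classpath. It is a toy model under stated assumptions, not Pig's implementation: Node, baseListStatus, expandRecursively, names and sampleTree are hypothetical stand-ins invented for this example. baseListStatus plays the role of super.listStatus(job) and expandRecursively plays the role of MapRedUtil.getAllFileRecursively(...); a counter records how often the extra expansion pass runs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model only: every type and helper here is a hypothetical stand-in, not a Hadoop or Pig API. */
class RecursiveListingSketch {

    /** In-memory file system entry; children == null marks a plain file. */
    static final class Node {
        final String name;
        final List<Node> children;

        private Node(String name, List<Node> children) {
            this.name = name;
            this.children = children;
        }

        static Node file(String name) {
            return new Node(name, null);
        }

        static Node dir(String name, Node... children) {
            return new Node(name, Arrays.asList(children));
        }

        boolean isDirectory() {
            return children != null;
        }
    }

    /** Counts how many times the extra expansion pass runs. */
    static int expansionPasses = 0;

    /** Stand-in for super.listStatus(job): descends into subdirectories only when recursive is true. */
    static List<Node> baseListStatus(Node inputDir, boolean recursive) {
        List<Node> result = new ArrayList<>();
        for (Node child : inputDir.children) {
            if (recursive && child.isDirectory()) {
                result.addAll(baseListStatus(child, true));
            } else {
                result.add(child);
            }
        }
        return result;
    }

    /** Stand-in for MapRedUtil.getAllFileRecursively(...): walks every entry and expands directories. */
    static List<Node> expandRecursively(List<Node> entries) {
        expansionPasses++;
        List<Node> files = new ArrayList<>();
        collectFiles(entries, files);
        return files;
    }

    private static void collectFiles(List<Node> entries, List<Node> out) {
        for (Node entry : entries) {
            if (entry.isDirectory()) {
                collectFiles(entry.children, out);
            } else {
                out.add(entry);
            }
        }
    }

    /** Mirrors the proposed branch: skip the extra pass when the base listing is already recursive. */
    static List<Node> listStatus(Node inputDir, boolean recursive) {
        List<Node> base = baseListStatus(inputDir, recursive);
        if (recursive) {
            return base;
        }
        return expandRecursively(base);
    }

    static List<String> names(List<Node> nodes) {
        List<String> out = new ArrayList<>();
        for (Node node : nodes) {
            out.add(node.name);
        }
        return out;
    }

    static Node sampleTree() {
        return Node.dir("in",
                Node.file("a"),
                Node.dir("sub", Node.file("b"), Node.dir("deep", Node.file("c"))));
    }

    public static void main(String[] args) {
        Node root = sampleTree();
        System.out.println(names(listStatus(root, true)) + " extra passes: " + expansionPasses);
        System.out.println(names(listStatus(root, false)) + " extra passes: " + expansionPasses);
    }
}
```

In this model both settings return the same set of files; only the non-recursive setting pays for the second walk, which is the iteration the proposed change is meant to skip when the property is true.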