[
https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588891#action_12588891
]
Runping Qi commented on HADOOP-3162:
------------------------------------
I assume the two new methods you refer to are meant to be:
{code}
public static void setInputPaths(JobConf job, String commaSeparatedFilePaths);
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths);
{code}
They don't break backward compatibility.
The patch implemented them incorrectly.
The correct implementation should look like:
{code}
public static void addInputPaths(JobConf job, String commaSeparatedFilePaths) {
// treat the commas in commaSeparatedFilePaths that are not enclosed by '{'
// and '}' as separators
// split commaSeparatedFilePaths into a string array using those separators
// let Path[] paths be the array of paths created from those strings
setInputPaths(job, paths);
}
{code}
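To make the splitting rule concrete, here is a minimal sketch of just the string-splitting step, with no Hadoop dependencies. The class name PathSplitSketch and the method splitPaths are hypothetical names used for illustration only; they are not part of the patch or of the Hadoop API. The sketch assumes braces are balanced and does not handle escaped braces or commas.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Hadoop or of the patch.
final class PathSplitSketch {

    // Splits on commas that are not enclosed by '{' and '}', so that glob
    // groups such as part-{00000,00001} stay inside a single path string.
    static String[] splitPaths(String commaSeparatedFilePaths) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0; // > 0 while we are inside a {...} group

        for (int i = 0; i < commaSeparatedFilePaths.length(); i++) {
            char c = commaSeparatedFilePaths.charAt(i);
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                // Top-level comma: finish the current path and start a new one.
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString()); // the last (or only) path
        return paths.toArray(new String[0]);
    }
}
```

With this sketch, "/data/part-{00000,00001},/data/part-00002" yields two strings, the first of which still contains its glob group. Each resulting string would then be wrapped in a Path before being handed to the Path-based method.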
When you replace code that uses the existing API with code that uses the new
API, like:
{code}
- grepJob.setInputPath(new Path(args[0]));
+ FileInputFormat.setInputPaths(grepJob, new Path(args[0]));
{code}
That is incorrect. The correct one should be:
{code}
- grepJob.setInputPath(new Path(args[0]));
+ FileInputFormat.setInputPaths(grepJob, args[0]);
{code}
> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
> Key: HADOOP-3162
> URL: https://issues.apache.org/jira/browse/HADOOP-3162
> Project: Hadoop Core
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.17.0
> Reporter: Runping Qi
> Assignee: Amareshwari Sriramadasu
> Priority: Blocker
> Fix For: 0.17.0
>
> Attachments: patch-3162.txt, patch-3162.txt, patch-3162.txt,
> patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class
> throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/MonsterQueryBlockCompressed/part-00002
> at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:213)
> at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
> at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGenerator.java:189)
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.