[jira] Commented: (HADOOP-3162) Map/reduce stops working with comma separated input paths

Hairong Kuang (JIRA) Mon, 14 Apr 2008 12:10:10 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-3162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12588719#action_12588719 ]


Hairong Kuang commented on HADOOP-3162:
---------------------------------------

I do not think that we need to add the following two methods to FileInputFormat:
1. setInputPaths(JobConf conf, String commaSeparatedPaths)
2. addInputPaths(JobConf conf, String commaSeparatedPaths)

We have not discussed what a comma means in user-facing interfaces such as 
streaming when a user provides a comma-separated path name. In streaming, should 
we support commas within a path name? What if a user wants to use a glob that 
contains a comma? These questions need to be discussed and documented 
before we make any code change to support it. We could do that in release 18.
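
To make the glob question concrete, here is a minimal sketch in plain Java 
(no Hadoop classes). The class CommaPathSplit and both of its methods are 
hypothetical names invented for this illustration; they are not taken from the 
patch or from FileInputFormat. The first method splits on every comma; the 
second is one possible rule (split only on commas outside curly braces), shown 
only to illustrate the kind of decision that would need to be documented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Hadoop code.
class CommaPathSplit {

    // Naive rule: every comma separates two paths.
    public static List<String> naiveSplit(String commaSeparatedPaths) {
        return Arrays.asList(commaSeparatedPaths.split(","));
    }

    // Assumed alternative rule: a comma separates paths only when it is
    // not inside a {...} glob alternation.
    public static List<String> globAwareSplit(String commaSeparatedPaths) {
        List<String> paths = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int braceDepth = 0;
        for (char c : commaSeparatedPaths.toCharArray()) {
            if (c == '{') {
                braceDepth++;
            } else if (c == '}' && braceDepth > 0) {
                braceDepth--;
            }
            if (c == ',' && braceDepth == 0) {
                // Top-level comma: finish the current path.
                paths.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        paths.add(current.toString());
        return paths;
    }

    public static void main(String[] args) {
        String input = "/data/part-0000{1,2},/data/part-00003";
        // Naive rule breaks the glob into "/data/part-0000{1" and "2}".
        System.out.println(naiveSplit(input));
        // Brace-aware rule keeps "/data/part-0000{1,2}" as one path.
        System.out.println(globAwareSplit(input));
    }
}
```

Even the brace-aware rule leaves open what to do with a literal comma in a 
file name, which is why the semantics should be agreed on first.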

In release 17, the patch only needs to restore the old behavior of addInputPath 
and setInputPath in JobConf. Applications like streaming should continue to use 
JobConf.setInputPath, while a user who needs a glob or a path containing 
commas can use the new APIs in FileInputFormat.

> Map/reduce stops working with comma separated input paths
> ---------------------------------------------------------
>
>                 Key: HADOOP-3162
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3162
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.17.0
>            Reporter: Runping Qi
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: patch-3162.txt, patch-3162.txt, patch-3162.txt, 
> patch-3162.txt, patch-3162.txt
>
>
> When a job is given a comma separated input file list, FileInputFormat class 
> throws an exception, complaining the input is invalid:
> org.apache.hadoop.mapred.InvalidInputException: Input path doesnt exist : hdfs:/namenode:port/gridmix/data/MonsterQueryBlockCompressed/part-00000,/gridmix/data/MonsterQueryBlockCompressed/part-00001,/gridmix/data/MonsterQueryBlockCompressed/part-00002
>         at org.apache.hadoop.mapred.FileInputFormat.validateInput(FileInputFormat.java:213)
>         at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:705)
>         at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:973)
>         at org.apache.hadoop.mapred.GenericMRLoadGenerator.run(GenericMRLoadGenerator.java:189)

