Zac Hopkinson created MAPREDUCE-6596:
----------------------------------------

             Summary: MultipleInputs does not escape Path characters
                 Key: MAPREDUCE-6596
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6596
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: mrv2
    Affects Versions: 2.6.2
            Reporter: Zac Hopkinson
            Assignee: Zac Hopkinson


File names containing commas or semicolons break MultipleInputs, because it 
uses those characters as delimiters when joining and storing the path names.

MultipleInputs stores mapreduce.input.multipleinputs.dir.formats as:

```
path;inputFormatClass,path2;inputFormatClass2[, ...]
```

If a file name contains either of these delimiter characters, 
getInputFormatMap and getMapperTypeMap will fail.
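
A minimal standalone sketch of the failure, assuming the stored value is parsed 
by plain splitting on "," and then ";" (the class and method names below are 
hypothetical and are not the MultipleInputs code itself):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class NaiveSplitDemo {
    // Joins path/format pairs in the layout shown above:
    // ';' inside a pair, ',' between pairs. No escaping is applied.
    static String join(String[][] pairs) {
        StringBuilder sb = new StringBuilder();
        for (String[] pair : pairs) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(pair[0]).append(';').append(pair[1]);
        }
        return sb.toString();
    }

    // Parses the stored value by plain splitting (assumed behaviour).
    static Map<String, String> parse(String stored) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String mapping : stored.split(",")) {
            String[] parts = mapping.split(";");
            // Throws ArrayIndexOutOfBoundsException when the fragment has no ';'.
            result.put(parts[0], parts[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        String ok = join(new String[][] {{"/data/a.txt", "TextInputFormat"}});
        System.out.println(parse(ok));

        // The comma in the file name is mistaken for a pair separator.
        String bad = join(new String[][] {{"/data/a,b.txt", "TextInputFormat"}});
        try {
            parse(bad);
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("could not parse: " + bad);
        }
    }
}
```

Here the comma in the file name cuts the first mapping down to "/data/a", 
which has no ";" part, so the lookup of the format class name fails.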

FileInputFormat.addInputPath() faces the same problem and uses escapeString 
and unescapeString from StringUtils. I took the same approach for escaping in 
MultipleInputs.
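
The idea can be illustrated with a self-contained sketch. This is not the 
patch and does not call Hadoop's StringUtils; the class, the helper names, and 
the choice of backslash as the escape character are assumptions made for the 
example only:

```java
import java.util.ArrayList;
import java.util.List;

public class EscapeSketch {
    static final char ESC = '\\';           // assumed escape character
    static final String SPECIAL = ",;\\";   // delimiters plus the escape itself

    // Prefixes every special character with the escape character.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append(ESC);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Removes one level of escaping.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        boolean escaped = false;
        for (char c : s.toCharArray()) {
            if (!escaped && c == ESC) {
                escaped = true;
                continue;
            }
            sb.append(c);
            escaped = false;
        }
        return sb.toString();
    }

    // Splits on unescaped separators only; escape sequences are kept as-is.
    static List<String> split(String s, char sep) {
        List<String> parts = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean escaped = false;
        for (char c : s.toCharArray()) {
            if (escaped) {
                cur.append(c);
                escaped = false;
            } else if (c == ESC) {
                cur.append(c);
                escaped = true;
            } else if (c == sep) {
                parts.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        parts.add(cur.toString());
        return parts;
    }

    public static void main(String[] args) {
        String path = "/data/a,b;c.txt";
        String stored = escape(path) + ";" + "TextInputFormat";
        for (String mapping : split(stored, ',')) {
            List<String> pair = split(mapping, ';');
            System.out.println(unescape(pair.get(0)) + " -> " + pair.get(1));
        }
    }
}
```

Escaping the path before it is joined, and unescaping it only after the 
stored value has been split, lets the comma and semicolon in the file name 
survive the round trip.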




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
