Zac Hopkinson created MAPREDUCE-6596:
----------------------------------------
Summary: MultipleInputs does not escape Path characters
Key: MAPREDUCE-6596
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6596
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: mrv2
Affects Versions: 2.6.2
Reporter: Zac Hopkinson
Assignee: Zac Hopkinson
Filenames containing commas or semicolons break MultipleInputs, because it
uses these characters as delimiters when joining and storing the path names.
MultipleInputs stores mapreduce.input.multipleinputs.dir.formats as:
```
path;inputFormatClass,path2;inputFormatClass2[, ...]
```
If a filename contains one of the characters used for joining the data then
getInputFormatMap and getMapperTypeMap will fail.
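The failure can be reproduced outside Hadoop with a simplified model of the stored format. The class and method names below (NaiveDirFormats, join, parse) are hypothetical and are not the MultipleInputs code; the sketch only assumes that the value is split on "," and then on ";" with no escaping, as described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the unescaped storage format; not Hadoop code.
class NaiveDirFormats {

    // Joins entries as "path;format,path2;format2" without any escaping.
    static String join(Map<String, String> pathToFormat) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : pathToFormat.entrySet()) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append(e.getKey()).append(';').append(e.getValue());
        }
        return sb.toString();
    }

    // Splits on ',' and then on ';'. A comma inside a path yields a piece
    // with no ';', so reading parts[1] throws.
    static Map<String, String> parse(String stored) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String mapping : stored.split(",")) {
            String[] parts = mapping.split(";");
            result.put(parts[0], parts[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> inputs = new LinkedHashMap<>();
        inputs.put("/data/a,b.txt", "TextInputFormat");
        String stored = join(inputs);
        System.out.println("stored: " + stored);
        try {
            parse(stored);
        } catch (ArrayIndexOutOfBoundsException ex) {
            System.out.println("parse failed: " + ex);
        }
    }
}
```

In this model the comma in "/data/a,b.txt" splits the entry in two, and the first piece has no format class attached to it.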
FileInputFormat.addInputPath() handles this with escapeString and
unescapeString from StringUtils, so I took the same approach for escaping in
MultipleInputs.
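A minimal sketch of the escaping idea, written without Hadoop dependencies so it runs on its own. It is not the attached patch: the class EscapedDirFormats and its escape, unescape, and split helpers are hypothetical stand-ins for the StringUtils methods, and the backslash escape character is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of escaping delimiters before joining.
class EscapedDirFormats {
    static final char ESCAPE = '\\';

    // Prefixes ',', ';' and the escape character itself with a backslash.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            if (c == ',' || c == ';' || c == ESCAPE) {
                sb.append(ESCAPE);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Drops each escape character and keeps the character it protects.
    static String unescape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ESCAPE && i + 1 < s.length()) {
                c = s.charAt(++i);
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Splits on unescaped separators only; the returned pieces stay escaped.
    static List<String> split(String s, char sep) {
        List<String> parts = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == ESCAPE && i + 1 < s.length()) {
                cur.append(c).append(s.charAt(++i));
            } else if (c == sep) {
                parts.add(cur.toString());
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        parts.add(cur.toString());
        return parts;
    }

    public static void main(String[] args) {
        String stored = escape("/data/a,b;c.txt") + ";TextInputFormat,"
                + escape("/data/plain.txt") + ";KeyValueTextInputFormat";
        for (String mapping : split(stored, ',')) {
            List<String> pair = split(mapping, ';');
            System.out.println(unescape(pair.get(0)) + " -> " + pair.get(1));
        }
    }
}
```

The path is escaped before it is joined, and the reader splits only on unescaped delimiters before unescaping each path, so commas and semicolons in a filename survive the round trip.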