[
https://issues.apache.org/jira/browse/MAPREDUCE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13507861#comment-13507861
]
Harsh J commented on MAPREDUCE-3772:
------------------------------------
bq. I agree that adding some javadocs doesn't make the bug into a feature. Any
problem with silent data loss is a very serious issue.
It's not so much a data loss as it is a lack of understanding of how
speculative execution may affect a task's write attempts when the concept of
attempt IDs (i.e. output committing) isn't utilized. This does not apply to MO
alone; it also applies to general writes done from any job. We've hit it with
MO because MO provided an API (quite accidentally, if you ask me) without
enforcement, which has led to all this.
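
To illustrate how a write can end up outside the attempt directory, here is a minimal sketch using only java.net.URI. It assumes the 0.20-era layout in which each task attempt works under <output dir>/_temporary/_<attempt id>/ and the base output path is resolved against that directory. The class name, the attempt ID and the "-r-00000" suffix (a reducer, partition 0) are illustrative assumptions, not taken from the issue.

```java
import java.net.URI;

public class AttemptPathSketch {
    // Hypothetical attempt ID, used for illustration only.
    static final String ATTEMPT = "attempt_200707121733_0001_r_000000_0";

    // Assumed layout: <outputDir>/_temporary/_<attemptId>/
    static URI workDir(String outputDir) {
        return URI.create(outputDir + "/_temporary/_" + ATTEMPT + "/");
    }

    // Resolve baseOutputPath plus an assumed "-r-00000" suffix against
    // the attempt's work directory and collapse any ".." segments.
    static String resolve(String outputDir, String baseOutputPath) {
        return workDir(outputDir)
                .resolve(baseOutputPath + "-r-00000")
                .normalize()
                .getPath();
    }

    public static void main(String[] args) {
        String out = "/tmp/multi1/out";
        // Relative path with "..": climbs out of the attempt directory
        // but stays inside _temporary.
        System.out.println(resolve(out, "../extra/"));
        // Absolute path: ignores the attempt directory entirely.
        System.out.println(resolve(out, "/tmp/multi1/extra/"));
        // Plain name: stays inside the attempt directory.
        System.out.println(resolve(out, "list1"));
    }
}
```

Under these assumptions, "../extra/" resolves to /tmp/multi1/out/_temporary/extra/-r-00000, i.e. outside the attempt directory but still under _temporary, so it would never be promoted at commit and would disappear when _temporary is cleaned up. The absolute path resolves to /tmp/multi1/extra/-r-00000, which survives but bypasses output committing, so two speculative attempts of the same task could write the same file.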
> MultipleOutputs output lost if baseOutputPath starts with ../
> -------------------------------------------------------------
>
> Key: MAPREDUCE-3772
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3772
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 0.20.2
> Reporter: Radim Kolar
> Assignee: Harsh J
> Attachments: MAPREDUCE-3772.patch
>
>
> Let's say you have the output directory set:
> FileOutputFormat.setOutputPath(job, new Path("/tmp/multi1/out"));
> and want to place output from MultipleOutputs into /tmp/multi1/extra.
> I expect the following code to work:
> mos = new MultipleOutputs<Text, IntWritable>(context);
> mos.write(new Text("zrr"), value, "../extra/");
> but no exception is thrown and the expected output directory /tmp/multi1/extra
> does not even exist. All data written to this output vanishes without a trace.
> To make it work, the full path must be used:
> mos.write(new Text("zrr"), value, "/tmp/multi1/extra/");
> Output is listed in statistics from MultipleOutputs correctly:
> org.apache.hadoop.mapreduce.lib.output.MultipleOutputs
> ../gaja1/=13333 (* everything is lost *)
> /tmp/multi1/out/../ksd34/=13333 (* this, using the full path, works *)
> list1=6667