[
https://issues.apache.org/jira/browse/MAPREDUCE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506713#comment-13506713
]
Harsh J commented on MAPREDUCE-3772:
------------------------------------
Radim,
Could you propose your idea more formally?
At present, the baseOutputPath is directly evaluated to a path (either
relative or absolute), and a record writer is created on it.
If we start handling this (absolute or external-to-output paths) for the
user (which, IMO, we should NOT do, as MultipleOutputs was never really
intended to expose these APIs originally - see the stable API, there's no such
thing), then there are quite a few things to take care of:
# Detect all forms of out-of-dir paths (/foo, ../foo, ./../foo, etc.) and
create temporary name mappings for the real write, to rename them out later.
# End-rename logic needs to be done in the OutputCommitter (OC) stage, which MO
does not control by its design (the stable API had a direct OutputFormat for
this, which could perhaps do it).
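
The detection step in the first item can be sketched roughly as follows. This is only an illustration, not the attached patch: it uses java.nio.file.Path as a stand-in for Hadoop's Path so that it runs without a cluster, and the class and method names (OutOfDirCheck, escapesOutputDir) are hypothetical. It assumes POSIX-style paths.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper; java.nio.file.Path stands in for org.apache.hadoop.fs.Path.
class OutOfDirCheck {

    // Returns true when baseOutputPath, resolved against the job output
    // directory, ends up outside that directory.
    static boolean escapesOutputDir(String outputDir, String baseOutputPath) {
        Path out = Paths.get(outputDir).normalize();
        // resolve() returns baseOutputPath unchanged when it is absolute;
        // normalize() then collapses "." and ".." components.
        Path resolved = out.resolve(baseOutputPath).normalize();
        // startsWith() compares whole path components, not string prefixes.
        return !resolved.startsWith(out);
    }

    public static void main(String[] args) {
        String out = "/tmp/multi1/out";
        for (String p : new String[] {"/foo", "../foo", "./../foo", "list1"}) {
            System.out.println(p + " escapes: " + escapesOutputDir(out, p));
        }
    }
}
```

Paths flagged this way would still need the temporary name mapping and the later rename described in the list above; the sketch covers detection only.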
Any other things we'll need addressed?
I also wonder if it's worth doing all this, when user logic can take care of
sub-dir movement in the post-job stage with a few moves - which costs even
less.
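
A minimal sketch of such a post-job move, assuming the job wrote to a sub-directory inside the output directory (for example "extra/" under /tmp/multi1/out) and the user wants it next to the output directory afterwards. It uses java.nio.file on a local filesystem purely as a stand-in; on HDFS the analogous call would be FileSystem.rename(src, dst). The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper; local-filesystem stand-in for an HDFS rename.
class PostJobMove {

    // Moves <outputDir>/<subDir> to <outputDir>/../<subDir> once the job has
    // finished, and returns the new location.
    static Path moveToSibling(Path outputDir, String subDir) throws IOException {
        Path src = outputDir.resolve(subDir);
        Path dst = outputDir.resolveSibling(subDir);
        // Within a single filesystem this is a plain rename, so no data is copied.
        return Files.move(src, dst);
    }
}
```

Called after the job completes, moveToSibling(Paths.get("/tmp/multi1/out"), "extra") would leave the data under /tmp/multi1/extra.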
> MultipleOutputs output lost if baseOutputPath starts with ../
> -------------------------------------------------------------
>
> Key: MAPREDUCE-3772
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-3772
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: client
> Affects Versions: 0.20.2
> Reporter: Radim Kolar
> Assignee: Harsh J
> Attachments: MAPREDUCE-3772.patch
>
>
> Let's say you have the output directory set:
> FileOutputFormat.setOutputPath(job, new Path("/tmp/multi1/out"));
> and want to place output from MultipleOutputs into /tmp/multi1/extra.
> I expect the following code to work:
> mos = new MultipleOutputs<Text, IntWritable>(context);
> mos.write(new Text("zrr"), value, "../extra/");
> but no exception is thrown and the expected output directory /tmp/multi1/extra
> does not even exist. All data written to this output vanishes without a trace.
> To make it work, the full path must be used:
> mos.write(new Text("zrr"), value, "/tmp/multi1/extra/");
> The output is listed correctly in the statistics from MultipleOutputs:
> org.apache.hadoop.mapreduce.lib.output.MultipleOutputs
> ../gaja1/=13333 (* everything is lost *)
> /tmp/multi1/out/../ksd34/=13333 (* this, using the full path, works *)
> list1=6667