[ 
https://issues.apache.org/jira/browse/MAPREDUCE-3772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13506713#comment-13506713
 ] 

Harsh J commented on MAPREDUCE-3772:
------------------------------------

Radim,

Could you propose your idea more formally?

At present, the baseOutputPath is evaluated directly to a path (either 
relative or absolute), and a record writer is created on it.

If we start handling this (absolute or external-to-output paths) 
for the user (which, IMO, we should NOT do, as MultipleOutputs never really 
intended to expose these APIs originally - see the stable API, there's no such 
thing), then there are quite a few things to take care of:

# Detect all forms of out-of-dir paths (/foo, ../foo, ./../foo, etc.) and 
create temporary name mappings for the real write, so they can be renamed 
out later.
# End-rename logic needs to be done in the OutputCommitter (OC) stage, which 
MultipleOutputs (MO) does not control by design (the stable API had a direct 
OutputFormat for this, which could perhaps do it).
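
To make the first point concrete, here is a rough sketch of what the 
detection could look like. It is only an illustration: the class and method 
names are hypothetical, nothing like this exists in MultipleOutputs today, 
and it uses java.nio.file paths as a stand-in for Hadoop's Path, assuming 
Unix-style paths.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hadoop: decides whether a baseOutputPath
// would land outside the job's output directory.
final class OutOfDirPathCheck {

    // Resolve baseOutputPath against outputDir and collapse "." and ".." segments.
    static Path resolve(String outputDir, String baseOutputPath) {
        Path out = Paths.get(outputDir).normalize();
        // resolve() returns the argument itself when it is already absolute.
        return out.resolve(baseOutputPath).normalize();
    }

    // True when the resolved path is not located under outputDir.
    // Path.startsWith compares whole name elements, so "/a/out2" is not under "/a/out".
    static boolean escapesOutputDir(String outputDir, String baseOutputPath) {
        Path out = Paths.get(outputDir).normalize();
        return !resolve(outputDir, baseOutputPath).startsWith(out);
    }

    public static void main(String[] args) {
        String outputDir = "/tmp/multi1/out";
        String[] candidates = {"list1", "../extra/", "./../foo", "/foo", "sub/part"};
        for (String base : candidates) {
            System.out.println(base + " -> escapes=" + escapesOutputDir(outputDir, base));
        }
    }
}
```

Any path flagged this way would still need the temporary name mapping and 
the later rename, which this sketch does not attempt.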

Any other things we'll need addressed?

I also wonder if it's worth doing all this, when user logic can take care of 
sub-dir movement in the post-job stage with a few moves - which costs even 
less.
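
As a rough illustration of that post-job alternative: the job writes to a 
sub-directory inside its output directory, and the driver moves it to a 
sibling location once the job has finished. The sketch below is an 
assumption-laden stand-in that works on the local filesystem through 
java.nio.file; the class and method names are hypothetical. Against HDFS 
the equivalent step would go through Hadoop's FileSystem.rename(Path, Path).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical post-job step; the local filesystem stands in for HDFS here.
final class PostJobMove {

    // Moves outputDir/subDir so that it becomes a sibling of outputDir,
    // and returns the new location. subDir is assumed to be a single name.
    static Path moveToSibling(Path outputDir, String subDir) {
        Path source = outputDir.resolve(subDir);
        Path parent = outputDir.toAbsolutePath().getParent();
        if (parent == null) {
            throw new IllegalArgumentException("output dir has no parent: " + outputDir);
        }
        Path target = parent.resolve(subDir);
        try {
            // Without REPLACE_EXISTING this fails if the target already exists.
            return Files.move(source, target);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the paths from the report, a driver would call something like 
moveToSibling(Paths.get("/tmp/multi1/out"), "extra") after the job 
completes, so the data ends up in /tmp/multi1/extra.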
                
> MultipleOutputs output lost if baseOutputPath starts with ../
> -------------------------------------------------------------
>
>                 Key: MAPREDUCE-3772
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3772
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 0.20.2
>            Reporter: Radim Kolar
>            Assignee: Harsh J
>         Attachments: MAPREDUCE-3772.patch
>
>
> Let's say you have the output directory set:
> FileOutputFormat.setOutputPath(job, new Path("/tmp/multi1/out"));
> and want to place output from MultipleOutputs into /tmp/multi1/extra.
> I expect the following code to work:
> mos = new MultipleOutputs<Text, IntWritable>(context);
> mos.write(new Text("zrr"), value, "../extra/");
> but no Exception is thrown and the expected output directory 
> /tmp/multi1/extra does not even exist. All data written to this output 
> vanishes without a trace.
> To make it work, the full path must be used:
> mos.write(new Text("zrr"), value, "/tmp/multi1/extra/");
> Output is listed in the statistics from MultipleOutputs correctly:
>         org.apache.hadoop.mapreduce.lib.output.MultipleOutputs
>                 ../gaja1/=13333 (* everything is lost *)
>                 /tmp/multi1/out/../ksd34/=13333 (* this using full path works *)
>                 list1=6667
