[jira] [Commented] (MAPREDUCE-2715) submitAndMonitorJob() doesn't play nice with MultipleOutputFile

Geoffrey Young (JIRA) Wed, 20 Jul 2011 12:33:26 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13068577#comment-13068577
 ]


Geoffrey Young commented on MAPREDUCE-2715:
-------------------------------------------

hi :)

sorry for my fat fingers.  to your first point, the class I was talking about 
is this:

  
http://search-hadoop.com/c/Map/Reduce:/src/java/org/apache/hadoop/mapred/lib/MultipleOutputFormat.java

which I'm using via several layers of abstraction.

I'm sure you're right about the second point - I just googled the error and 
found some stale code, which I'm sure has been cleaned up by now.

as for the third point, the finer points of the implementation aren't something 
I'm familiar with.  but it does seem to me that the consistency check should 
happen later on when using multiple output formats (or at least let me run 
things at my own risk) :)

HTH


> submitAndMonitorJob() doesn't play nice with MultipleOutputFile
> ---------------------------------------------------------------
>
>                 Key: MAPREDUCE-2715
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2715
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Geoffrey Young
>
> part of submitAndMonitorJob() balks if the output directory currently exists 
> but is non-empty:
>   "Error launching job , Output path already exists : "
> this logic actually conflicts with the ideas behind MultipleOutputFile, where 
> the output file path is calculated later on.
> it would be really nice to remove the restriction for non-empty output 
> directories in submitAndMonitorJob() so that MultipleOutputFile becomes more 
> useful - as it stands now, I can't, for example, specify a base output path 
> then use MultipleOutputFile to partition by date on a daily basis.
> thanks.
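
To make the conflict in the description concrete, here is a minimal sketch in plain Java. It does not use any Hadoop classes: the local filesystem (java.nio.file) stands in for the job's output filesystem, and every name in it (OutputPathSketch, checkBaseOutput, datedOutput, checkLeafOutput) is hypothetical and chosen for illustration only. It models an up-front check that rejects a non-empty base directory, as the report describes, next to a deferred check that looks only at the per-date file computed later.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.LocalDate;

// Hypothetical illustration only; not Hadoop code.
final class OutputPathSketch {

    // Stand-in for the up-front check described in the report:
    // reject the job when the base output directory exists and is non-empty.
    static void checkBaseOutput(Path base) throws IOException {
        if (Files.isDirectory(base)) {
            try (DirectoryStream<Path> entries = Files.newDirectoryStream(base)) {
                if (entries.iterator().hasNext()) {
                    throw new IOException("Output path already exists : " + base);
                }
            }
        }
    }

    // The real output location is only known later, per record:
    // here it is base/<yyyy-MM-dd>/<leafName>.
    static Path datedOutput(Path base, LocalDate day, String leafName) {
        return base.resolve(day.toString()).resolve(leafName);
    }

    // Assumed alternative: defer the check and test only the file
    // that is actually about to be written.
    static void checkLeafOutput(Path leaf) throws IOException {
        if (Files.exists(leaf)) {
            throw new IOException("Output file already exists : " + leaf);
        }
    }
}
```

With this model, a second daily run against the same base directory fails checkBaseOutput because the previous day's subdirectory is already there, while checkLeafOutput on the new day's path passes. That is the behaviour the report asks for; whether the real check can safely be deferred this way is an assumption of the sketch, not something the issue confirms.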

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
