[
https://issues.apache.org/jira/browse/MAPREDUCE-370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746720#action_12746720
]
Amareshwari Sriramadasu commented on MAPREDUCE-370:
---------------------------------------------------
bq. I think that there should be the ability to have complete control over the
output filename, much as MultipleOutputFormat does. To achieve this we could
change the baseOutputPath parameter in the write methods to be a full output
path. The user application would be responsible for making sure there are no
name clashes - this is like the functionality available in MultipleOutputFormat
today. The overloaded version is available if the user doesn't care so much
about the output filenames, which will then have a {m,r}-nnnnn suffix. Does
this make sense?
Tom, I did not do this because MultipleOutputs maintains counters that count
the number of records written to each output name. If we take the full output
name from the user, aggregating these counters at the job level is not
straightforward. Also, if the user doesn't give a unique name for each output
file, the output may be garbled. So I thought that taking baseOutputName (which
is also the counter name) from the user and having the framework construct the
full output filename would be the right solution. Don't you think this is
right?
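To illustrate the counter argument, here is a minimal sketch, assuming each task reports a map of record counts keyed by baseOutputName. The class and method names are hypothetical and are not taken from the patch; it only shows why a shared key makes the job-level sum trivial.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration, not code from the patch.
class CounterAggregationSketch {
    // Sums per-task record counts by counter name (here, the baseOutputName).
    static Map<String, Long> aggregate(List<Map<String, Long>> perTaskCounts) {
        Map<String, Long> total = new TreeMap<>();
        for (Map<String, Long> task : perTaskCounts) {
            for (Map.Entry<String, Long> e : task.entrySet()) {
                // Same key across tasks, so the counts simply add up.
                total.merge(e.getKey(), e.getValue(), Long::sum);
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> task0 = Map.of("errors", 4L, "stats", 10L);
        Map<String, Long> task1 = Map.of("errors", 1L, "stats", 7L);
        System.out.println(aggregate(List.of(task0, task1)));
    }
}
```

If each task instead supplied its own full file name as the key, the keys would differ per task and the same loop would not combine them.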
bq. I think that there should be the ability to have complete control over the
output filename, much as MultipleOutputFormat does.
With the current patch, the user has complete control over the path. It is just
that, whatever path is chosen, the file name is <baseOutputPath>-m/r-<part-number>.
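A minimal sketch of how such a name could be assembled, assuming a five-digit zero-padded part number (matching the {m,r}-nnnnn suffix quoted above). The class and method names are hypothetical and are not the ones used in the patch.

```java
import java.util.Locale;

// Hypothetical illustration, not code from the patch.
class OutputNameSketch {
    // Builds <baseOutputPath>-m-<part-number> for map tasks and
    // <baseOutputPath>-r-<part-number> for reduce tasks.
    static String uniqueName(String baseOutputPath, boolean isMapTask, int partition) {
        String taskType = isMapTask ? "m" : "r";
        // Assumption: part numbers are zero-padded to five digits.
        return String.format(Locale.ROOT, "%s-%s-%05d", baseOutputPath, taskType, partition);
    }

    public static void main(String[] args) {
        // The base path may contain directories; only the suffix is fixed.
        System.out.println(uniqueName("stats/errors", false, 3));
        System.out.println(uniqueName("summary", true, 0));
    }
}
```

Because the task type and part number differ for every task, two tasks that use the same baseOutputPath still write to distinct files.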
> Change org.apache.hadoop.mapred.lib.MultipleOutputs to use new api.
> -------------------------------------------------------------------
>
> Key: MAPREDUCE-370
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-370
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Reporter: Amareshwari Sriramadasu
> Assignee: Amareshwari Sriramadasu
> Fix For: 0.21.0
>
> Attachments: patch-370-1.txt, patch-370.txt
>
>