[jira] [Commented] (MAPREDUCE-5153) Support for running combiners without reducers

Tsuyoshi OZAWA (JIRA) Thu, 25 Jul 2013 22:45:10 -0700

    [ 
https://issues.apache.org/jira/browse/MAPREDUCE-5153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13720426#comment-13720426
 ]


Tsuyoshi OZAWA commented on MAPREDUCE-5153:
-------------------------------------------

This discussion is essentially "in-mapper combining vs. disk-based combining".
If a user program, including one built on Scalding or Cascading, does in-mapper
combining and emits its buffered values based on memory usage, a similar effect
can be achieved, although only partially. In most cases, this partial approach
is enough to improve performance. What do you think?
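
To make the in-mapper combining idea concrete, here is a minimal sketch in plain Python rather than the Hadoop API. The names `InMapperCombiner`, `run_mapper`, and `max_keys` are hypothetical, and the number of buffered keys is used as a rough stand-in for memory usage. It only illustrates the technique for a word-count style sum and is not how Scalding or Cascading implement it.

```python
from collections import defaultdict


class InMapperCombiner:
    """Hypothetical helper: buffers partial sums per key inside the mapper."""

    def __init__(self, emit, max_keys=1000):
        self.emit = emit              # callback that writes one (key, value) pair
        self.max_keys = max_keys      # assumed proxy for a memory limit
        self.buffer = defaultdict(int)

    def add(self, key, value):
        # Combine in memory instead of emitting every record.
        self.buffer[key] += value
        # Flush once the buffer holds more keys than allowed.
        if len(self.buffer) > self.max_keys:
            self.flush()

    def flush(self):
        # Emit the partial aggregates and release the memory.
        for key, total in self.buffer.items():
            self.emit(key, total)
        self.buffer.clear()


def run_mapper(records, max_keys):
    """Counts words with in-mapper combining and returns the emitted pairs."""
    out = []
    combiner = InMapperCombiner(lambda k, v: out.append((k, v)), max_keys)
    for word in records:
        combiner.add(word, 1)
    combiner.flush()  # emit whatever is still buffered at the end of the task
    return out
```

Because the buffer is flushed whenever it grows past the limit, the same key can be emitted more than once, which is why the combining is only partial compared with a sort-based combiner over the full mapper output.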
                
> Support for running combiners without reducers
> ----------------------------------------------
>
>                 Key: MAPREDUCE-5153
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-5153
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Radim Kolar
>
> Scenario: workflow mapper -> sort -> combiner -> HDFS
> No API change is needed: if the user sets a combiner class and reducers = 0,
> then run the combiner and send its output to HDFS.
> Popular libraries such as Scalding and Cascading offer this
> functionality, but they cache the entire mapper output in memory.

