[
https://issues.apache.org/jira/browse/MAPREDUCE-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13527520#comment-13527520
]
Mariappan Asokan commented on MAPREDUCE-4808:
---------------------------------------------
Hi Arun,
Thanks for your feedback. Perhaps I should mention some use cases of a
MergeManager plugin in addition to the technical details of the design
mentioned here as well as in MAPREDUCE-4812.
A MergeManager plugin would allow us, and any other implementer of the plugin,
to do a variety of additional transformations such as copy, limit-N query
(MAPREDUCE-1928), full join, and hashed aggregation more efficiently. Since the
shuffle code is already available in the framework, we want to make use of it.
In my opinion, the framework shuffle code seems to be stable in MRv2.
Making Merger pluggable will not add much value. If I understand correctly, it
allows plugin implementers to implement only a single pass of the merge. The
overall merge is still driven by MergeManager. Also, merge is the only
operation possible. Any additional transformation has to be done in the
Reducer, which is often not very efficient.
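To illustrate the limit-N case, here is a minimal sketch in Java. It assumes a
plugin that drives the whole merge and can therefore stop as soon as N records
have been emitted, instead of merging everything and discarding the rest in
the Reducer. The names MergeManagerPlugin, LimitNMergeManager, and Head are
hypothetical; they are not Hadoop APIs and are not taken from the attached
patches. Plain integers stand in for keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch only: none of these types exist in Hadoop.
class LimitNMergeSketch {

    // Assumed plugin contract: the implementation owns the overall merge.
    interface MergeManagerPlugin {
        List<Integer> merge(List<List<Integer>> sortedSegments);
    }

    // Current smallest unread value of one sorted segment, plus the rest of it.
    static final class Head {
        final int value;
        final Iterator<Integer> rest;

        Head(int value, Iterator<Integer> rest) {
            this.value = value;
            this.rest = rest;
        }
    }

    // K-way merge that stops after emitting `limit` records.
    static final class LimitNMergeManager implements MergeManagerPlugin {
        private final int limit;

        LimitNMergeManager(int limit) {
            this.limit = limit;
        }

        @Override
        public List<Integer> merge(List<List<Integer>> sortedSegments) {
            PriorityQueue<Head> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a.value, b.value));
            // Seed the heap with the first record of every non-empty segment.
            for (List<Integer> segment : sortedSegments) {
                Iterator<Integer> it = segment.iterator();
                if (it.hasNext()) {
                    heap.add(new Head(it.next(), it));
                }
            }
            List<Integer> out = new ArrayList<>();
            // Emit in sorted order; quit early once the limit is reached.
            while (!heap.isEmpty() && out.size() < limit) {
                Head head = heap.poll();
                out.add(head.value);
                if (head.rest.hasNext()) {
                    heap.add(new Head(head.rest.next(), head.rest));
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        List<List<Integer>> segments = Arrays.asList(
            Arrays.asList(1, 4, 9),
            Arrays.asList(2, 3, 10),
            Arrays.asList(5, 6));
        System.out.println(new LimitNMergeManager(4).merge(segments)); // prints [1, 2, 3, 4]
    }
}
```

With only a single merge pass pluggable, the early exit above would not be
possible, because the framework would still drive the remaining passes to
completion.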
I hope this clarifies the usefulness of allowing MergeManager to be pluggable.
Please feel free to ask if you have any questions.
Thanks.
-- Asokan
> Allow reduce-side merge to be pluggable
> ---------------------------------------
>
> Key: MAPREDUCE-4808
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-4808
> Project: Hadoop Map/Reduce
> Issue Type: Sub-task
> Affects Versions: 2.0.2-alpha
> Reporter: Arun C Murthy
> Assignee: Mariappan Asokan
> Fix For: 2.0.3-alpha
>
> Attachments: COMBO-mapreduce-4809-4812-4808.patch,
> mapreduce-4808.patch
>
>
> Allow reduce-side merge to be pluggable for MAPREDUCE-2454