[jira] [Updated] (MAPREDUCE-4807) Allow MapOutputBuffer to be pluggable

Mariappan Asokan (JIRA) Thu, 22 Nov 2012 19:15:04 -0800

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Mariappan Asokan updated MAPREDUCE-4807:
----------------------------------------

    Attachment: mapreduce-4807-4809.patch

Hi Alejandro,
  I did not want to create a simple mock plugin test, for two reasons:
* An end-to-end test covering the combination of MAPREDUCE-4807 and 
MAPREDUCE-4809 will exercise the full MR data flow.
* It is an interesting test that demonstrates how a merge operation can be 
supported in Hadoop.
Currently, you can only sort, even if you have multiple input files that are 
already sorted and you just want to merge them.  The MapOutputCollector plugin 
in the test will route the <key, value> pairs to the proper partition such that 
the sort order is preserved within each partition.  This will speed up the map 
tasks, since the O(NlogN) cost of a sort is reduced to O(N) for a merge.
The reduce tasks will still incur O(NlogN) time in their merge, though.
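
To make the routing idea concrete, here is a minimal sketch in plain Java.  It 
is not the code in the attached patch and does not use the Hadoop 
MapOutputCollector interface; the class name SortedRouter, the integer keys, 
and the range-based partitioning are all assumptions made for illustration.  
It only shows that when the input is already sorted and partitions cover 
ascending key ranges, appending each record to its partition keeps every 
partition sorted without any sort step.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not the plugin from the patch. */
public class SortedRouter {
    // Exclusive upper bound of every partition except the last (assumed ascending).
    private final int[] upperBounds;
    private final List<List<Integer>> partitions = new ArrayList<>();

    public SortedRouter(int[] upperBounds) {
        this.upperBounds = upperBounds.clone();
        for (int i = 0; i <= upperBounds.length; i++) {
            partitions.add(new ArrayList<>());
        }
    }

    /** Returns the index of the first key range that contains the key. */
    int partitionFor(int key) {
        int p = 0;
        while (p < upperBounds.length && key >= upperBounds[p]) {
            p++;
        }
        return p;
    }

    /** Appends the key to its partition; no comparison against earlier records. */
    public void collect(int key) {
        partitions.get(partitionFor(key)).add(key);
    }

    public List<Integer> partition(int p) {
        return partitions.get(p);
    }

    /** Checks that a list is in non-decreasing order. */
    static boolean isSorted(List<Integer> xs) {
        for (int i = 1; i < xs.size(); i++) {
            if (xs.get(i - 1) > xs.get(i)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        SortedRouter router = new SortedRouter(new int[] {10, 20});
        // Input is assumed to arrive already sorted, as in the merge use case.
        for (int key : new int[] {1, 5, 12, 12, 19, 25, 40}) {
            router.collect(key);
        }
        for (int p = 0; p < 3; p++) {
            System.out.println(p + ": " + router.partition(p)
                    + " sorted=" + isSorted(router.partition(p)));
        }
    }
}
```

With a fixed number of partitions, each collect call does a bounded amount of 
work, so the map side is linear in the number of records.  The sketch relies on 
the input being sorted; it does not detect or repair out-of-order input.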

There is one caveat: the test may make the patch somewhat larger.  I think it 
is worth it.

Please review and give your feedback.

Thanks.
-- Asokan
                
> Allow MapOutputBuffer to be pluggable
> -------------------------------------
>
>                 Key: MAPREDUCE-4807
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4807
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>    Affects Versions: 2.0.2-alpha
>            Reporter: Arun C Murthy
>            Assignee: Mariappan Asokan
>             Fix For: 2.0.3-alpha
>
>         Attachments: mapreduce-4807-4809.patch, mapreduce-4807.patch, 
> mapreduce-4807.patch, mapreduce-4807.patch
>
>
> Allow MapOutputBuffer to be pluggable

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
