[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13475872#comment-13475872
 ] 

Arun C Murthy commented on MAPREDUCE-2454:
------------------------------------------

Mariappan, I've started taking a look - there is a *lot* to digest here. I 
apologize this has taken so long, but I've personally started on this many, 
many times and then got distracted since there is so much here for me to 
review. I'm sure it's the same for several other committers - it's not an 
excuse, but unfortunately you are asking all reviewers here for a significant 
investment of their time... 

This is *much* more so because you are fiddling with some of the most core 
pieces of the MR framework (i.e. sorting and shuffling).

Also, a meta point - it's easier to work in an open-source community by 
starting small and building up some credibility quickly by fixing bugs, 
tests, etc. This way people are more comfortable when you make bigger changes.

I'm not alone in holding this view; see a recent interview with Todd:
{quote}
What is your advice for someone who is interested in participating in any open 
source project for the first time?

Walk before you run. One mistake I’ve seen new contributors make is that they 
try to start off with a huge chunk of work at the core of the system. Instead, 
learn your way around the source code by doing small improvements, bug fixes, 
etc. Then, when you want to propose a larger change, the rest of the community 
will feel more comfortable accepting it. One great way to build karma in the 
community is to look at recently failing unit tests, file bugs, and fix them up.
{quote}

In general, it would be better if you could break this into a series of smaller 
patches and do this work in a development branch. This will make it easier to 
review.

I understand this is frustrating, and I apologize - but this is unfortunately 
a *lot* of work and a highly risky change to boot.

----

Having said this - how do we proceed?

Let's start a discussion on mr-dev@ covering a design review and a dev branch 
where we make a series of small changes, and then proceed from there. Thoughts?

Again, apologies.
                
> Allow external sorter plugin for MR
> -----------------------------------
>
>                 Key: MAPREDUCE-2454
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2454
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 2.0.0-alpha, 3.0.0, 2.0.2-alpha
>            Reporter: Mariappan Asokan
>            Assignee: Mariappan Asokan
>            Priority: Minor
>              Labels: features, performance, plugin, sort
>         Attachments: HadoopSortPlugin.pdf, HadoopSortPlugin.pdf, 
> KeyValueIterator.java, MapOutputSorterAbstract.java, MapOutputSorter.java, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mapreduce-2454.patch, mapreduce-2454.patch, mapreduce-2454.patch, 
> mr-2454-on-mr-279-build82.patch.gz, MR-2454-trunkPatchPreview.gz, 
> ReduceInputSorter.java
>
>
> Define interfaces and some abstract classes in the Hadoop framework to 
> facilitate external sorter plugins both on the Map and Reduce sides.
