On Wed, Oct 19, 2011 at 2:42 PM, Zheng Shao <[email protected]> wrote:
> Google's Tenzing paper mentioned that they modified MR to make sorting in
> reducer optional:
> http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en/us/pubs/archive/37200.pdf
>
> Is there any plan to support that in MR 2.0?

Hey Zheng,

I don't know of anyone working on it, but IMO the main advantage
of MR2 here is that it will be much easier for users to experiment
with new ideas on top of a shared cluster. Since the MR framework code
becomes user-level submitted code, it's easy to recompile and resubmit
jobs with a hacked MR without restarting the cluster or impacting
other users.

It would be interesting to see the Hive project experiment with this
optimization. I remember discussing it on a JIRA with Joydeep a
couple of years ago.

-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera
