> A critical part of Hadoop's usability comes from its framework combined
> with library code that allows users to get the desired functionality without
> writing it themselves.
>
> The goal is to make Hadoop useful out of the box.

To the best of my knowledge, Owen, your organization requires users to petition a committee before writing MapReduce jobs. At Facebook, the vast majority of jobs are submitted via Hive. Our customers at Cloudera primarily consume MapReduce through Pig, Hive, and other high-level tools. Users of Hadoop have moved beyond MapReduce.

The community would be far better served by a compact, reliable, and efficient kernel. That's the project direction Doug has suggested for MapReduce, and it's one that Eric and Tom have supported. I also support this direction for the project.

We're clearly having a hard time, as a community, agreeing on standards for library code. We've also shipped updates to the framework without updating the library code, seriously damaging the usability of the project. In this discussion, we're prioritizing the rapidly shrinking proportion of users of MapReduce library code over the far larger community of consumers of the framework.

Arun recently asked on Quora about issues that users face with Hadoop MapReduce: http://qr.ae/pPNK. There are currently five issues brought up there, with 19 votes for those issues; none of them are addressed directly by this extended debate.

I'd be ecstatic to see this discussion result in moving the file formats, input and output formats, and other library code out to a separate Apache project or GitHub, where they can evolve rapidly based on user needs, so that the MapReduce project can begin to address some of the outstanding issues with the framework itself.

HDFS, HBase, Hive, Pig, Oozie, and other Hadoop-related projects continue to make forward progress at a remarkable rate; I'd like to see MapReduce return to health as well. Clearing away these major sources of conflict seems like one promising path forward.

So, I'm not on the PMC, but I'm -1 on the proposed vote.
