On Tue, Aug 28, 2012 at 4:12 PM, Arun C Murthy <a...@hortonworks.com> wrote:
> On Aug 23, 2012, at 9:20 PM, Eli Collins wrote:
>
>> Per this thread [1] should we have a single set of committers for the
>> entire Hadoop project, ie all subprojects?
>
> I feel like we need to have a wider discussion here.
>
> This discussion started when a diverse set of folks working on YARN for a 
> year and a half wanted their own identity and an acknowledgement of the fact 
> that they are a distinct community. In retrospect, I went about convincing 
> the wider Hadoop community about this in the wrong way. My apologies.
>
> Upon reflection, I think Chris Mattmann has convinced me that we have an even 
> wider issue at hand and that the right way to a better place, not just for 
> YARN, but for all of Hadoop, is to expedite the process of graduating Hadoop 
> sub-projects into TLPs. This is a mere reflection of the fact that Hadoop is 
> not a single community.
>
> Historically there have been at least 2 communities (HDFS, MapReduce) under 
> the Hadoop umbrella; and there are now 3 (HDFS, MapReduce, YARN).
> At least for the last 3 years, if not more, the overwhelming majority of 
> contributors to Hadoop have focussed exclusively on one of the sub-projects. 
> That is a clear indicator.
> This is exactly the thinking behind the graduation of former sub-projects like 
> HBase, Hive & Pig, upon the nudge received by the Hadoop PMC from 
> the Board.
>
> The good news is that, in principle, most seem to agree on the need for 
> Hadoop sub-projects to stand alone and the path to get there. It could lead 
> to several great outcomes such as ensuring HDFS pays as much attention to HBase 
> as to MapReduce, YARN pays attention to projects beyond MapReduce etc. by not 
> tying them together.
>
> Rather than sweep this under the carpet, I feel we are better off 
> acknowledging this.
>
> This is very much in keeping with the way the ASF and the Board want to see 
> communities - small and focussed on a single project.
>
> A meta or umbrella community like Hadoop leads to issues which are well 
> documented and understood in the ASF, something experienced Apache Members 
> like Chris Mattmann have repeatedly pointed out.
>
> It is also fair, per Chris Douglas, to set a reasonable time frame. After due 
> consideration, I think doing this before hadoop-2 is declared stable (GA) is 
> the most reasonable option. It gives us the necessary headroom and will 
> ensure we don't confuse users further by doing it after hadoop-2 is GA. Let's 
> discuss the mechanics, timelines etc. further.
>
> Yes, this is hard work and there are several technical challenges. But, the 
> ASF is all about communities and I'm sure we can solve these technical issues 
> for a better long-term health of these distinct communities.
>
> Thoughts?

I'd start a separate discussion thread or vote about moving some or
all of the sub-projects to TLPs. IMO we should resolve this issue
independently - there's no reason to block this decision on a possible
future direction for the project. For example if YARN spins out as a
TLP this issue still remains for the rest of the sub-projects, so I
don't want to stall progress on this pending the larger, more complex
discussion of whether all projects become TLPs. And if a sub-project
spins out as a TLP that's a great opportunity to figure out the right
set of committers. Ie the decision here doesn't prevent YARN from
establishing a new committer list if/when it spins out.

Thanks,
Eli
