I agree with Alan and Dmitriy: Pig is tightly coupled with Hadoop and heavily influenced by its roadmap. I think it makes sense to continue as a subproject of Hadoop.
-Thejas

On 3/31/10 4:04 PM, "Dmitriy Ryaboy" <dvrya...@gmail.com> wrote:

> Over time, Pig is increasing its coupling to Hadoop (for good reasons),
> rather than decreasing it. If and when Pig becomes a viable entity without
> Hadoop around, it might make sense as a TLP. As is, I think becoming a TLP
> will only introduce unnecessary administrative and bureaucratic headaches.
> So my vote is also -1.
>
> -Dmitriy
>
> On Wed, Mar 31, 2010 at 2:38 PM, Alan Gates <ga...@yahoo-inc.com> wrote:
>
>> So far I haven't seen any feedback on this. Apache has asked the Hadoop
>> PMC to submit input in April on whether some subprojects should be promoted
>> to TLPs. We, the Pig community, need to give feedback to the Hadoop PMC on
>> how we feel about this. Please make your voice heard.
>>
>> So now I'll heed my own call and give my thoughts on it.
>>
>> The biggest advantage I see to being a TLP is a direct connection to
>> Apache. Right now all of the Pig team's interaction with Apache is through
>> the Hadoop PMC. Being directly connected to Apache would benefit Pig team
>> members, who would have a better view into Apache. It would also raise our
>> profile in Apache and thus make other projects more aware of us.
>>
>> However, I am concerned about losing Pig's explicit connection to Hadoop.
>> This concern has a couple of dimensions. One, Hadoop and MapReduce are the
>> current flavor of the month in computing. Given that Pig shares a name with
>> the common farm animal, it's hard to be sure based on search statistics.
>> But Google Trends shows that "hadoop" is searched on much more frequently
>> than "hadoop pig" or "apache pig" (see
>> http://www.google.com/trends?q=hadoop%2Chadoop+pig). I am guessing that
>> most Pig users come from Hadoop users who discover Pig via Hadoop's website.
>> Losing that subproject tab on Hadoop's front page may radically lower the
>> number of users coming to Pig to check out our project. I would argue that
>> this benefits Hadoop as well, since high-level languages like Pig Latin have
>> the potential to greatly extend the user base and usability of Hadoop.
>>
>> Two, being explicitly connected to Hadoop keeps our two communities aware
>> of each other's needs. There are features proposed for MR that would greatly
>> help Pig. By staying in the Hadoop community Pig is better positioned to
>> advocate for and help implement and test those features. The response to
>> this will be that Pig developers can still subscribe to Hadoop mailing
>> lists, submit patches, etc. That is, they can still be part of the Hadoop
>> community. Which reinforces my point that it makes more sense to leave Pig
>> in the Hadoop community since Pig developers will need to be part of that
>> community anyway.
>>
>> Finally, philosophically it makes sense to me that projects that are
>> tightly connected belong together. It strikes me as strange to have Pig as
>> a TLP completely dependent on another TLP. Hadoop was originally a
>> subproject of Lucene. It moved out to be a TLP when it became obvious that
>> Hadoop had become independent of and useful apart from Lucene. Pig is not
>> in that position relative to Hadoop.
>>
>> So, I'm -1 on Pig moving out. But this is a soft -1. I'm open to being
>> persuaded that I'm wrong or my concerns can be addressed while still having
>> Pig as a TLP.
>>
>> Alan.
>>
>> On Mar 19, 2010, at 10:59 AM, Alan Gates wrote:
>>
>>> You have probably heard by now that there is a discussion going on in the
>>> Hadoop PMC as to whether a number of the subprojects (HBase, Avro,
>>> ZooKeeper, Hive, and Pig) should move out from under the Hadoop umbrella and
>>> become top level Apache projects (TLP). This discussion has picked up
>>> recently since the Apache board has clearly communicated to the Hadoop PMC
>>> that it is concerned that Hadoop is acting as an umbrella project with many
>>> disjoint subprojects underneath it. They are concerned that this gives
>>> Apache little insight into the health and happenings of the subproject
>>> communities which in turn means Apache cannot properly mentor those
>>> communities.
>>>
>>> The purpose of this email is to start a discussion within the Pig
>>> community about this topic. Let me cover first what becoming a TLP would
>>> mean for Pig, and then I'll go into what options I think we as a community
>>> have.
>>>
>>> Becoming a TLP would mean that Pig would itself have a PMC that would
>>> report directly to the Apache board. Who would be on the PMC would be
>>> something we as a community would need to decide. Common options would be
>>> to say all active committers are on the PMC, or all active committers who
>>> have been a committer for at least a year. We would also need to elect a
>>> chair of the PMC. This lucky person would have no additional power, but
>>> would have the additional responsibility of writing quarterly reports on
>>> Pig's status for Apache board meetings, as well as coordinating with Apache
>>> to get accounts for new committers, etc. For more information see
>>> http://www.apache.org/foundation/how-it-works.html#roles
>>>
>>> Becoming a TLP would not mean that we are ostracized from the Hadoop
>>> community. We would continue to be invited to Hadoop Summits, HUGs, etc.
>>> Since all Pig developers and users are by definition Hadoop users, we would
>>> continue to be a strong presence in the Hadoop community.
>>>
>>> I see three ways that we as a community can respond to this:
>>>
>>> 1) Say yes, we want to be a TLP now.
>>> 2) Say yes, we want to be a TLP, but not yet. We feel we need more time
>>> to mature. If we choose this option we need to be able to clearly
>>> articulate how much time we need and what we hope to see change in that
>>> time.
>>> 3) Say no, we feel the benefits for us staying with Hadoop outweigh the
>>> drawbacks of being a disjoint subproject. If we choose this, we need to be
>>> able to say exactly what those benefits are and why we feel they will be
>>> compromised by leaving the Hadoop project.
>>>
>>> There may be other options that I haven't thought of. Please feel free to
>>> suggest any you think of.
>>>
>>> Questions? Thoughts? Let the discussion begin.
>>>
>>> Alan.