Namit, I was not proposing that promotion to full committership would be automatic. I assume it would still be done via a vote by the PMC. I agree that we cannot _guarantee_ committership for HCat committers in 6-9 months. But I am trying to lay out a clear path they can follow. If they don't follow the path then they won't be committers. I am also trying to make it non-preferential in that I am setting the criteria to be what I believe the Hive PMC would expect any prospective Hive committer to do. The only intended preferential part of the proposal is the Hive shepherds, which we have all agreed is a good idea.
Alan.

On Dec 19, 2012, at 8:23 PM, Namit Jain wrote:

> I don’t agree with the proposal. It is impractical to have a Hcat committer
> with commit access to Hcat only portions of Hive. We cannot guarantee that
> a Hcat committer will become a Hive committer in 6-9 months, that depends
> on what they do in the next 6-9 months.
>
> The current Hcat committers should spend more time in reviewing patches,
> work on non-Hcat areas in Hive, and then gradually become a hive
> committer. They should not be given any preferential treatment, and the
> process should be same as it would be for any other hive contributor
> currently. Given that the expertise of the Hcat committers, they should
> be inline for becoming a hive committer if they continue to work in hive,
> but that cannot be guaranteed. I agree that some Hive committers should try
> and help the existing Hcat patches, and again that is voluntary and
> different committers cannot be assigned to different parts of the code.
>
> Thanks,
> -namit
>
> On 12/20/12 1:03 AM, "Carl Steinbach" <cwsteinb...@gmail.com> wrote:
>
>> Alan's proposal sounds like a good idea to me.
>>
>> +1
>>
>> On Dec 18, 2012 5:36 PM, "Travis Crawford" <traviscrawf...@gmail.com>
>> wrote:
>>
>>> Alan, I think your proposal sounds great.
>>>
>>> --travis
>>>
>>> On Tue, Dec 18, 2012 at 1:13 PM, Alan Gates <ga...@hortonworks.com>
>>> wrote:
>>>> Carl, speaking just for myself and not as a representative of the HCat
>>>> PPMC at this point, I am coming to agree with you that HCat integrating
>>>> with Hive fully makes more sense.
>>>>
>>>> However, this makes the committer question even thornier. Travis and
>>>> Namit, I think the shepherd proposal needs to lay out a clear and time
>>>> bounded path to committership for HCat committers. Having HCat committers
>>>> as second class Hive citizens for the long run will not be healthy. I
>>>> propose the following as a starting point for discussion:
>>>>
>>>> All active HCat committers (those who have contributed or committed a
>>>> patch in the last 6 months) will be made committers in the HCat portion
>>>> only of Hive. In addition those committers will be assigned a particular
>>>> shepherd who is a current Hive committer and who will be responsible for
>>>> mentoring them towards full Hive committership. As a part of this
>>>> mentorship the HCat committer will review patches of other contributors,
>>>> contribute patches to Hive (both inside and outside of HCatalog), respond
>>>> to user issues on the mailing lists, etc. It is intended that as a result
>>>> of this mentorship program HCat committers can become full Hive committers
>>>> in 6-9 months. No new HCat only committers will be elected in Hive after
>>>> this. All Hive committers will automatically also have commit rights on
>>>> HCatalog.
>>>>
>>>> Alan.
>>>>
>>>> On Dec 14, 2012, at 10:05 AM, Carl Steinbach wrote:
>>>>
>>>>> On a functional level I don't think there is going to be much of a
>>>>> difference between the subproject option proposed by Travis and the other
>>>>> option where HCatalog becomes a TLP. In both cases HCatalog and Hive will
>>>>> have separate committers, separate code repositories, separate release
>>>>> cycles, and separate project roadmaps. Aside from ASF bureaucracy, I think
>>>>> the only major difference between the two options is that the subproject
>>>>> route will give the rest of the community the false impression that the
>>>>> two projects have coordinated roadmaps and a process to prevent
>>>>> overlapping functionality from appearing in both projects. Consequently,
>>>>> If these are the only two options then I would prefer that HCatalog
>>>>> become a TLP.
>>>>>
>>>>> On the other hand, I also agree with many of the sentiments that have
>>>>> already been expressed in this thread, namely that the two projects are
>>>>> closely related and that it would benefit the community at large if the
>>>>> two projects could be brought closer together. Up to this point the major
>>>>> source of pain for the HCatalog team has been the frequent necessity of
>>>>> making changes on both the Hive and HCatalog sides when implementing new
>>>>> features in HCatalog. This situation is compounded by the ASF requirement
>>>>> that release artifacts may not depend on snapshot artifacts from other
>>>>> ASF projects. Furthermore, if Hive adds a dependency on HCatalog then it
>>>>> will be subject to these same problems (in addition to the gross circular
>>>>> dependency!).
>>>>>
>>>>> I think the best way to avoid these problems is for HCatalog to become a
>>>>> Hive submodule. In this scenario HCatalog would exist as a subdirectory
>>>>> in the Hive repository and would be distributed as a Hive artifact in
>>>>> future Hive releases. In addition to solving the problems I mentioned
>>>>> earlier, I think this would also help to assuage the concerns of many
>>>>> Hive committers who don't want to see the MetaStore split out into a
>>>>> separate project.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Carl
>>>>>
>>>>> On Thu, Dec 13, 2012 at 7:59 PM, Namit Jain <nj...@fb.com> wrote:
>>>>>
>>>>>> I am fine with this. Any hive committers who wants to volunteer to be
>>>>>> a hcat shepherd is welcome.
>>>>>>
>>>>>> On 12/14/12 7:01 AM, "Travis Crawford" <traviscrawf...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks for reviving this thread. Reviewing the comments everyone seems
>>>>>>> to agree HCatalog makes sense as a Hive subproject. I think that's
>>>>>>> great news for the Hadoop community.
>>>>>>> >>>>>>> The discussion seems to have turned to one of committer >>> permissions. I >>>>>>> agree with the Hive folks sentiment that its something that must be >>>>>>> earned. That said, I've found it challenging at times getting >>> patches >>>>>>> into Hive that would help earn taking on a hive committer >>>>>>> responsibility. >>>>>>> >>>>>>> Proposal: if a couple hive committers can volunteer to be hcat >>>>>>> shepherds, we can work with the shepherds when making hive changes >>> in >>>>>>> a timely manor. Conversely, we can help shepherd any hive >>> committers >>>>>>> who are interested in working more with hcat. There are certainly >>>>>>> benefits to cross-committership, and this approach could help each >>>>>>> other build a history of meaningful contributions and earn the >>>>>>> privilege & responsibility of being committers. >>>>>>> >>>>>>> Thoughts? >>>>>>> >>>>>>> --travis >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Thu, Dec 13, 2012 at 11:59 AM, Edward Capriolo < >>> edlinuxg...@gmail.com> >>>>>>> wrote: >>>>>>>> I initially was a hesitant of hcatalog mostly because I imagined >>> we >>>>>>>> would >>>>>>>> end up in a spot very similar to this. >>>>>>>> >>>>>>>> Namely the hcatlog folks are interested in making a metastore to >>> support >>>>>>>> pig, hive, and map reduce. However I get the impression that many >>> in >>>>>>>> hive >>>>>>>> do not care much to have a metastore that caters to everyone. >>> Their >>>>>>>> needs >>>>>>>> are only based on what hive needs. Which I believe is the wrong >>> way >>> to >>>>>>>> look >>>>>>>> at this situation. >>>>>>>> >>>>>>>> I though to reply to this thread because I have been following >>> this >>>>>>>> Jira: >>>>>>>> https://issues.apache.org/jira/browse/HIVE-3752 >>>>>>>> >>>>>>>> On a high level I do not like this duplication of effort and >>> code. If >>>>>>>> hive >>>>>>>> is compatible with hcatalog I do not see why we put off merging >>> the >>> two >>>>>>>> at >>>>>>>> all. 
Hive users would get an immediate benefit if Hive used >>> hcatalog >>>>>>>> with >>>>>>>> no apparent downside. Meanwhile we are putting this off and >>> staying >>> in >>>>>>>> this >>>>>>>> awkward transition phase. >>>>>>>> >>>>>>>> Personally, I do not have a problem being a hive committer and not >>>>>>>> having >>>>>>>> hcatalog commit. None of the hive work I have done has ever >>> touched >>> the >>>>>>>> metastore. Also of the thousands of jiras and features we have >>> added >>>>>>>> only a >>>>>>>> small portion require metastore changes. >>>>>>>> >>>>>>>> As long as a couple active users have commit on hive and the >>> suggested >>>>>>>> hcatalog subproject I do not think not having commit will be a >>>>>>>> roadblock in >>>>>>>> moving hive forward. >>>>>>>> >>>>>>>> >>>>>>>> On Mon, Dec 3, 2012 at 6:22 PM, Alan Gates <ga...@hortonworks.com> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> I am not sure where we are on this discussion. So far those who >>> have >>>>>>>>> chimed in seemed generally positive (Namit, Edward, Clark, >>> Alexander). >>>>>>>>> Namit and I have different visions for what the committership >>> might >>>>>>>>> look >>>>>>>>> like, so I'd like to hear from other Hive PMC members what their >>> view >>>>>>>>> is on >>>>>>>>> this. I have to say from an HCatalog perspective the >>> proposition is >>>>>>>>> much >>>>>>>>> less attractive without some commit rights. 
>>>>>>>>> >>>>>>>>> On a related note, people should be aware of these threads in the >>>>>>>>> Incubator list: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>> >>> http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/% >>>>>>>>> 3CCAGU5spdWHNtJxgQ8f%3DnPEXx9xNLjyjOYaFfnSw4EyAjgm1c46w% >>>>>> 40mail.gmail.com >>>>>>>>> %3E >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>> >>> http://mail-archives.apache.org/mod_mbox/incubator-general/201211.mbox/% >>>>>>>>> 3CCAKQbXgDZj_zMj4qSodXjMHV7xQZxpcY1-35cvq959YKLNd6tJQ% >>> 40mail.gmail.com >>>>>> %3 >>>>>>>>> E >>>>>>>>> >>>>>>>>> For those not inclined to read all the mails in the threads I >>> will >>>>>>>>> summarize (though I urge all PMC members of Hive and PPMC >>> members of >>>>>>>>> HCat >>>>>>>>> to read both mail threads because this is highly relevant to >>> what we >>>>>>>>> are >>>>>>>>> discussing). There are two salient points in these threads: >>>>>>>>> >>>>>>>>> 1) It is not wise to build a subproject that is distinct from the >>> main >>>>>>>>> project in the sense that it has separate community members >>> interested >>>>>>>>> in >>>>>>>>> it. Bertrand, Arun, Chris Mattman, and Greg Stein all spoke >>> against >>>>>>>>> this, >>>>>>>>> and all are long time Apache contributors with a lot of >>> experience. >>>>>>>>> They >>>>>>>>> were all of the opinion that it was reasonable for one project to >>>>>>>>> release >>>>>>>>> separate products. >>>>>>>>> >>>>>>>>> 2) It is not wise to have committers that have access to parts >>> of a >>>>>>>>> project but not others. Greg and Bertrand argued (and Arun >>> seemed >>> to >>>>>>>>> imply) that splitting up committer lists by sections of the code >>> did >>>>>>>>> not >>>>>>>>> work out well. >>>>>>>>> >>>>>>>>> These insights cause me to question what we mean by subproject. >>> I >>> had >>>>>>>>> originally envisioned something that looked like Pig and Hive did >>> when >>>>>>>>> they >>>>>>>>> were subprojects of Hadoop. 
But this violates both 1 and 2 >>> above. >>>>>>>>> Given >>>>>>>>> this input from many of the "wise old timers" of Apache I think >>> we >>>>>>>>> should >>>>>>>>> consider what we mean when we say subproject and how tightly we >>> are >>>>>>>>> willing >>>>>>>>> to integrate these projects. Personally I think it makes sense >>> to >>>>>>>>> continue >>>>>>>>> to pursue integration, as I think HCat is really a set of >>> interfaces >>>>>>>>> on top >>>>>>>>> of Hive and it makes sense to coalesce those into one project. I >>> guess >>>>>>>>> this would mean HCat becomes just another set of jars that Hive >>>>>>>>> releases >>>>>>>>> when it releases, rather than a stand alone entity. But I'm >>> curious to >>>>>>>>> hear what others think. >>>>>>>>> >>>>>>>>> Alan. >>>>>>>>> >>>>>>>>> On Nov 14, 2012, at 10:22 PM, Namit Jain wrote: >>>>>>>>> >>>>>>>>>> The same criteria should be applied to all Hive committers. >>> Only a >>>>>>>>>> committer should be able to commit code. >>>>>>>>>> I don¹t think we should bend this rule. Metastore is not a >>> separate >>>>>>>>>> project, but a integral part of hive. >>>>>>>>>> >>>>>>>>>> -namit >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 11/12/12 10:32 PM, "Alan Gates" <ga...@hortonworks.com> >>> wrote: >>>>>>>>>> >>>>>>>>>>> I would suggest looking over the patch history of HCat >>> committers. >>>>>>>>> I >>>>>>>>>>> think most of them have already contributed a number of >>> patches to >>>>>>>>> the >>>>>>>>>>> metastore. All are certainly aware of how to run Hive unit >>> tests >>>>>>>>> and >>>>>>>>>>> have an understanding of how Hive works. So I don't think it's >>>>>>>>> fair to >>>>>>>>>>> say they would be unsafe with access to the metastore. And the >>>>>>>>> Hive PMC >>>>>>>>>>> is there to assure this does not happen. If there are issues >>> I am >>>>>>>>> sure >>>>>>>>>>> they can deal with them. >>>>>>>>>>> >>>>>>>>>>> Alan. 
>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Nov 6, 2012, at 8:06 PM, Namit Jain wrote: >>>>>>>>>>> >>>>>>>>>>>> Alan, that would not be a good idea. Metastore code is part of >>> hive >>>>>>>>>>>> code, >>>>>>>>>>>> and it >>>>>>>>>>>> would be safer if only Hive committers had commit access to >>> that. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 11/6/12 11:25 PM, "Alan Gates" <ga...@hortonworks.com> >>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> On Nov 4, 2012, at 8:35 PM, Namit Jain wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> I like the idea of Hcatalog becoming a Hive sub-project. The >>>>>>>>>>>>>> enhancements/bugs in the serde/metastore areas can >>> indirectly >>>>>>>>>>>>>> benefit the hive community, and it will be easier for the >>> fix >>> to >>>>>>>>> be >>>>>>>>> in >>>>>>>>>>>>>> one >>>>>>>>>>>>>> place. Having said that, I don't see serde/metastore >>>>>>>>>>>>>> moving out of hive into a separate component. Things are >>> tied >>> too >>>>>>>>>>>>>> closely >>>>>>>>>>>>>> together. I am assuming that no new committers would >>>>>>>>>>>>>> be automatically added to Hive as part of this, and both >>> Hive >>> and >>>>>>>>>>>>>> HCatalog >>>>>>>>>>>>>> will continue to have its own committers. >>>>>>>>>>>>> >>>>>>>>>>>>> One thing in this we'd like to discuss is the HCatalog >>> committers >>>>>>>>>>>>> having >>>>>>>>>>>>> commit access to the metastore sections of Hive code. That >>>>>>>>> doesn't >>>>>>>>>>>>> mean >>>>>>>>>>>>> it has to move into HCatalog's code base. But more and more >>> the >>>>>>>>> fixes >>>>>>>>>>>>> and changes we're doing in HCatalog are really in Hive's >>>>>>>>> metastore. >>>>>>>>> So >>>>>>>>>>>>> we believe it would make sense to give HCat committers >>> access to >>>>>>>>> that >>>>>>>>>>>>> component as well as HCat. >>>>>>>>>>>>> >>>>>>>>>>>>> Alan. 
>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks, >>>>>>>>>>>>>> -namit >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On 11/3/12 2:22 AM, "Alan Gates" <ga...@hortonworks.com> >>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Hello Hive community. It is time for HCatalog to graduate >>> from >>>>>>>>> the >>>>>>>>>>>>>>> Apache Incubator. Given the heavy dependence of HCatalog >>> on >>>>>>>>> Hive >>>>>>>>> the >>>>>>>>>>>>>>> HCatalog community agreed it made sense to explore >>> graduating >>>>>>>>> from >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>> Incubator to become a subproject of Hive (see >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>> >>> http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/20120 >>>>>>>>>>>>>>> 9. >>>>>>>>>>>>>>> mb >>>>>>>>>>>>>>> >>> ox/%3C08C40723-8D4D-48EB-942B-8EE4327DD84A%40hortonworks.com >>> %3E >>>>>>>>> and >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>> >>> http://mail-archives.apache.org/mod_mbox/incubator-hcatalog-user/20121 >>>>>>>>>>>>>>> 0. >>>>>>>>>>>>>>> mb >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>> >>> ox/%3CCABN7xTCRM5wXGgJKEko0PmqDXhuAYpK%2BD-H57T29zcSGhkwGQw%40mail.gma >>>>>>>>>>>>>>> il >>>>>>>>>>>>>>> .c >>>>>>>>>>>>>>> om%3E ). To help both communities understand what >>> HCatalog is >>>>>>>>> and >>>>>>>>>>>>>>> hopes >>>>>>>>>>>>>>> to become we also developed a roadmap that summarizes >>> HCatalog's >>>>>>>>>>>>>>> current >>>>>>>>>>>>>>> features, planned features, and other possible features >>> under >>>>>>>>>>>>>>> discussion: >>>>>>>>>>>>>>> >>>>>>>>> >>> https://cwiki.apache.org/confluence/display/HCATALOG/HCatalog+Roadmap >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> So we are now approaching you to see if there is agreement >>> in >>>>>>>>> the >>>>>>>>>>>>>>> Hive >>>>>>>>>>>>>>> community that HCatalog graduating into Hive would make >>> sense. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Alan. 