Based on the typical drawing of a Hadoop stack, where HCatalog sits just above HDFS and HBase, and below Pig, Hive, and MapReduce, my understanding was that SerDes and StorageHandlers *should* belong to HCatalog, whereas Hive's CLI should make use of the HCatalog API.
Is that understanding correct? If yes, are there any discussions happening on this refactoring?

- Milind

On 11/14/11 1:39 PM, "Carl Steinbach" <[email protected]> wrote:

>HCatalog also depends on Hive's CLI, its parser/query compiler, and
>its collection of SerDes and StorageHandlers, so HCatalog will still
>have Hive dependencies even if the metastore is moved over to HCat.
>
>On Mon, Nov 14, 2011 at 4:20 PM, <[email protected]> wrote:
>
>> Any roadmaps/timelines/discussions on moving the Hive meta store to
>> Hcatalog, so that the dependencies are reversed, as they should be?
>>
>> - milind
>>
>> ---
>> Milind Bhandarkar
>> Greenplum Labs, EMC
>> (Disclaimer: Opinions expressed in this email are those of the author,
>> and do not necessarily represent the views of any organization, past or
>> present, the author might be affiliated with.)
>>
>> On 11/10/11 5:24 PM, "Olga Natkovich" <[email protected]> wrote:
>>
>> >Hi Alan,
>> >
>> >Thanks for your feedback.
>> >
>> >Yes, I agree that we should prefer released code but for that Hive
>> >needs to have a pretty frequent release schedule and we have not
>> >seen that so far.
>> >
>> >Hopefully, it would be latest Hive code by default but if that is
>> >problematic then we could use whatever code meets the requirements.
>> >
>> >In step 3, we don't need to wait till we branch - we could do it as the
>> >project goes. I am just saying we need to make sure that when we branch
>> >we make a call of which version/revision of the code to use with the
>> >release.
>> >
>> >Olga
>> >
>> >-----Original Message-----
>> >From: Alan Gates [mailto:[email protected]]
>> >Sent: Thursday, November 10, 2011 8:51 AM
>> >To: [email protected]
>> >Subject: Re: [DISCUSSION]: Synchronizing HCatalog and Hive trees
>> >
>> >Mostly agree, a few comments inline.
>> >
>> >Alan.
>> >
>> >On Nov 9, 2011, at 2:54 PM, Olga Natkovich wrote:
>> >
>> >> Hi,
>> >>
>> >> Since HCatalog has dependencies on Hive source tree we need to figure
>> >> out how to stay in synch with Hive source while not having to deal
>> >> with random build/test failures on a regular basis. Here is the
>> >> proposal:
>> >>
>> >> (1) During normal development cycle, HCatalog trunk would use a
>> >> particular revision of Hive to build against
>> >>
>> >> (2) Any time a change from Hive is needed by Hcatalog, the revision
>> >> number will move forward. The developer who is bringing this change
>> >> into Hcatalog is responsible for making sure that the build is stable
>> >> before moving the extern tag
>> >>
>> >> (3) As part of the HCatalog release process, prior to branching for
>> >> the release, HCatalog will be integrated with the latest Hive code.
>> >>
>> >> a. This could be the latest Hive release if it contains all the
>> >> changes required for Hcatalog or the latest Hive trunk otherwise
>> >
>> >s/could/should/. We should strongly prefer using released versions
>> >where possible.
>> >
>> >> b. Developer responsible for branching for the release is
>> >> responsible for stabilizing the build with the latest Hive code prior
>> >> to branching. Once the stabilization is done, a tag is created in
>> >> Hive and the release branch uses that tag for all builds
>> >
>> >s/latest Hive code/chosen Hive code/. I assume that's what you meant.
>> >
>> >> c. If later on a problem is found with this tag, Hive code would
>> >> be branched on the tag and necessary bug fixes applied.
>> >>
>> >> Comments?
>> >
>> >It's not clear to me why step 3 needs to wait for a release cycle.
>> >That seems like a bound, but not something that we have to wait for.
>> >
>> >> Olga
