Based on the typical drawing of a Hadoop stack, where HCatalog sits just
above HDFS and HBase, and below Pig, Hive, and MapReduce, my understanding
was that SerDes and Storage Handlers *should* belong to HCatalog, whereas
Hive's CLI should make use of the HCatalog API.

Is that understanding correct?

If yes, are there any discussions happening on this refactoring?

- Milind


On 11/14/11 1:39 PM, "Carl Steinbach" <[email protected]> wrote:

>HCatalog also depends on Hive's CLI, its parser/query compiler, and
>its collection of SerDes and StorageHandlers, so HCatalog will still
>have Hive dependencies even if the metastore is moved over to HCat.
>
>On Mon, Nov 14, 2011 at 4:20 PM, <[email protected]> wrote:
>
>> Any roadmaps/timelines/discussions on moving the Hive metastore to
>> HCatalog, so that the dependencies are reversed, as they should be?
>>
>> - milind
>>
>> ---
>> Milind Bhandarkar
>> Greenplum Labs, EMC
>> (Disclaimer: Opinions expressed in this email are those of the author, and
>> do not necessarily represent the views of any organization, past or
>> present, the author might be affiliated with.)
>>
>>
>>
>> On 11/10/11 5:24 PM, "Olga Natkovich" <[email protected]> wrote:
>>
>> >Hi Alan,
>> >
>> >Thanks for your feedback.
>> >
>> >Yes, I agree that we should prefer released code, but for that Hive needs
>> >to have a pretty frequent release schedule, and we have not seen that so
>> >far.
>> >
>> >Hopefully, it would be latest Hive code by default but if that is
>> >problematic then we could use whatever code meets the requirements.
>> >
>> >In step 3, we don't need to wait till we branch - we could do it as the
>> >project goes. I am just saying we need to make sure that when we branch
>> >we make a call of which version/revision of the code to use with the
>> >release.
>> >
>> >Olga
>> >
>> >-----Original Message-----
>> >From: Alan Gates [mailto:[email protected]]
>> >Sent: Thursday, November 10, 2011 8:51 AM
>> >To: [email protected]
>> >Subject: Re: [DISCUSSION]: Synchronizing HCatalog and Hive trees
>> >
>> >Mostly agree, a few comments inline.
>> >
>> >Alan.
>> >
>> >On Nov 9, 2011, at 2:54 PM, Olga Natkovich wrote:
>> >
>> >> Hi,
>> >>
>> >> Since HCatalog has dependencies on the Hive source tree, we need to figure
>> >>out how to stay in sync with the Hive source while not having to deal with
>> >>random build/test failures on a regular basis. Here is the proposal:
>> >>
>> >>
>> >> (1)    During the normal development cycle, HCatalog trunk would use a
>> >>particular revision of Hive to build against.
>> >>
>> >> (2)    Any time a change from Hive is needed by HCatalog, the revision
>> >>number will move forward. The developer who is bringing this change into
>> >>HCatalog is responsible for making sure that the build is stable before
>> >>moving the extern tag.
>> >>
>> >> (3)    As part of the HCatalog release process, prior to branching for
>> >>the release, HCatalog will be integrated with the latest Hive code.
>> >>
>> >> a.       This could be the latest Hive release if it contains all the
>> >>changes required for HCatalog, or the latest Hive trunk otherwise.
>> >
>> >s/could/should  We should strongly prefer using released versions where
>> >possible.
>> >>
>> >> b.      The developer responsible for branching for the release is
>> >>responsible for stabilizing the build with the latest Hive code prior to
>> >>branching. Once the stabilization is done, a tag is created in Hive and
>> >>the release branch uses that tag for all builds.
>> >
>> >s/latest Hive code/chosen Hive code  I assume that's what you meant
>> >>
>> >> c.       If later on a problem is found with this tag, Hive code would
>> >>be branched on the tag and necessary bug fixes applied.
>> >>
>> >> Comments?
>> >
>> >It's not clear to me why step 3 needs to wait for a release cycle. That
>> >seems like a bound, but not something that we have to wait for.
>> >
>> >>
>> >> Olga
>> >
>> >
>>
>>
