Re: [VOTE] Sponsoring Howl as an Apache Incubator project

Alan Gates Wed, 02 Feb 2011 21:17:34 -0800

Edward,

I understand your concern with having a copy of the metastore code in Howl. However, let's separate code from governance. The reason Howl has a copy of Hive's metastore is not because we're proposing it for the Incubator, it is because in the course of developing it over the last six months we've found that Howl development needs to move much faster than Hive development can. This is appropriate, since Hive is a mature product and has at least one large customer that runs code in production very soon after it is checked in. Thus the Hive community is rightly cautious about checking in changes to the metastore. Howl, on the other hand, is new and innovating quickly, so it likes to get things checked in quickly. Over the last six months every patch Howl has made to the Hive metastore code has made it back into Hive code. But it generally takes a few weeks or more to get in.

Whether Howl is a Hive subproject or an Incubator project it faces the same dilemma. The only other alternative that was suggested was to have Howl extern the metastore code from Hive and keep its patches in its build and apply them at build time. But this is very fragile, since any changes in the Hive metastore code could invalidate all those patches. We know that this is not sustainable in the long run, which is why the proposal calls out the need to resolve this one way or another as the project matures.

As far as reaching an end state where Hive and Howl are not compatible, we would view that as a failure for Howl. The goal for Howl is to be a metastore for Pig, MapReduce, and Hive, not just 2 out of 3. So we have a strong motivation to maintain that compatibility.

In terms of governance, given that we have significant contributions coming from members of the Pig team, the Hive team, and the core Hadoop team, it seemed that giving Howl its own space in the Incubator made more sense than adding it as a subproject of any one of those teams.


Alan.

On Feb 2, 2011, at 3:11 PM, Edward Capriolo wrote:

On Wed, Feb 2, 2011 at 5:08 PM, Jeff Hammerbacher <ham...@cloudera.com> wrote:
Awesome! Huge +1.
On Wed, Feb 2, 2011 at 1:18 PM, Alan Gates <ga...@yahoo-inc.com> wrote:
Howl is a table management system built to provide metadata and storage management across data processing tools in Hadoop (Pig, Hive, MapReduce, ...). You can learn more details at http://wiki.apache.org/pig/Howl. For the last six months the code has been hosted at github. The Howl team would like to move the project into the Apache Incubator. You can see the proposal for the project at http://wiki.apache.org/incubator/HowlProposal.

In order to be accepted as an Incubator project Howl needs a Sponsoring project. I propose that we, the Pig project, sponsor Howl. By sponsoring Howl we are saying that we believe it is a good fit for the ASF and that we will assist the Howl project to succeed. You can read full details of sponsoring a project at http://incubator.apache.org/incubation/Roles_and_Responsibilities.html#Sponsor.

Our bylaws don't explicitly cover such a vote, but I think lazy majority should be reasonable. All votes are welcome, PMC member votes will be binding.

Clearly I'm +1.

Alan.
I do think it is a great idea that Hive, Pig, and MapReduce share a
metastore. However, I am not sure I agree with the approach. IMHO Howl
should be a Hive subproject.

"The initial release of Howl will allow interoperability of data
between Pig, Map Reduce, and Hive"
I believe it should instead read "The initial release of Howl should support Hive";
at this point Hive should remove the /metastore code from inside Hive
and depend on Howl.

I say this because Hive is very actively reworking the metastore right
now for security, a new type of views, and indexes. I feel that if the
metastore branches from Hive as Howl, getting the two entities back
together will be difficult. Having 99% of the same code base shared
between Hive and Howl but not having compatibility between the two is
my fear.
