[ 
https://issues.apache.org/jira/browse/HADOOP-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12616283#action_12616283
 ] 

Doug Cutting commented on HADOOP-3601:
--------------------------------------

> the sub-project requirements (in terms of PMC involvement) are fairly rigorous

Not really.  I'd be happy to kibitz on the mailing lists while things get 
established.  Once you've made a release or two, we can perhaps nominate 
some Hive folks to the PMC.  It is best for each sub-project to be represented 
on the PMC by active committers.

> we can put the 'component' field in the email header

If the component is specified then it is included in every message body, and 
folks can filter for it there.

> there have already been suggestions on this thread with not having contrib 
> test failures stop acceptance of patches

My preference would be not to treat Hive differently from any other contrib 
module.  If it doesn't fit contrib, then it should be a sub-project.  If you 
think there will be a lot of JIRA traffic that's not of interest to the rest of 
Hadoop Core, then that's a sign that it doesn't belong in Hadoop Core releases 
and should be a sub-project.

> Hive as a contrib project
> -------------------------
>
>                 Key: HADOOP-3601
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3601
>             Project: Hadoop Core
>          Issue Type: New Feature
>    Affects Versions: 0.17.0
>            Reporter: Joydeep Sen Sarma
>            Priority: Minor
>         Attachments: HiveTutorial.pdf
>
>   Original Estimate: 1080h
>  Remaining Estimate: 1080h
>
> Hive is a data warehouse built on top of flat files (stored primarily in 
> HDFS). It includes:
> - Data Organization into Tables with logical and hash partitioning
> - A Metastore to store metadata about Tables/Partitions etc.
> - A SQL-like query language over object data stored in Tables
> - DDL commands to define and load external data into tables
> Hive's query language is executed using Hadoop map-reduce as the execution 
> engine. Queries can use either single stage or multi-stage map-reduce. Hive 
> has a native format for tables - but can handle any data set (for example 
> json/thrift/xml) using an IO library framework.
> Hive uses Antlr for query parsing, Apache JEXL for expression evaluation and 
> may use Apache Derby as an embedded database for MetaStore. Antlr has a BSD 
> license and should be compatible with Apache license.
> We are currently thinking of contributing to the 0.17 branch as a contrib 
> project (since that is the version under which it will get tested internally) 
> - but looking for advice on the best release path.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
