[jira] Created: (HADOOP-3601) Hive as a contrib project

Joydeep Sen Sarma (JIRA) Thu, 19 Jun 2008 14:20:07 -0700

Hive as a contrib project
-------------------------

                 Key: HADOOP-3601
                 URL: https://issues.apache.org/jira/browse/HADOOP-3601
             Project: Hadoop Core
          Issue Type: New Feature
    Affects Versions: 0.17.0
            Reporter: Joydeep Sen Sarma
            Priority: Minor



Hive is a data warehouse built on top of flat files (stored primarily in HDFS). 
It includes:
- Data organization into Tables with logical and hash partitioning
- A Metastore to store metadata about Tables, Partitions, etc.
- A SQL-like query language over object data stored in Tables
- DDL commands to define and load external data into Tables
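
As an illustration of the first item, here is a minimal sketch of how logical
and hash partitioning could combine to place a row. It is not Hive's actual
layout or code: the /warehouse root, the ds partition column, the bucket file
naming, and both function names are hypothetical, and CRC32 is used only
because it gives the same result on every run.

```python
import zlib


def bucket_for(key: str, num_buckets: int) -> int:
    """Hash partitioning: map a row key to one of num_buckets buckets."""
    # CRC32 is an assumption for this sketch; any stable hash would do.
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def partition_path(table: str, ds: str, key: str, num_buckets: int) -> str:
    """Combine a logical partition (ds) with a hash bucket for the key."""
    # Logical partition: one directory per value of the partition column.
    # Hash partition: one file per bucket inside that directory.
    bucket = bucket_for(key, num_buckets)
    return f"/warehouse/{table}/ds={ds}/bucket_{bucket:03d}"
```

For example, partition_path("clicks", "2008-06-19", "user42", 8) returns a
path under /warehouse/clicks/ds=2008-06-19/, and rows with the same key always
land in the same bucket file.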

Hive queries are executed using Hadoop map-reduce as the execution engine. A
query can run as either a single map-reduce stage or multiple stages. Hive has
a native format for tables, but can handle any data set (for example
JSON/Thrift/XML) using an IO library framework.

Hive uses Antlr for query parsing, Apache JEXL for expression evaluation, and
may use Apache Derby as an embedded database for the Metastore. Antlr has a BSD
license and should be compatible with the Apache license.

We are currently thinking of contributing to the 0.17 branch as a contrib
project (since that is the version under which it will get tested internally),
but we are looking for advice on the best release path.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
