Re: FW: [jira] Updated: (HADOOP-3601) Hive as a contrib project

tim robertson Thu, 10 Jul 2008 00:53:51 -0700

Thanks Ashish. I am happy to build and run from svn/cvs and try loading
data, querying, etc. whenever you have something.


Cheers

Tim

On Wed, Jul 9, 2008 at 8:46 PM, Ashish Thusoo <[EMAIL PROTECTED]> wrote:

> Hi Tim,
>
> Point well taken. We are trying to get this out as soon as possible.
> Thanks for offering to help us test things out. We will get
> something out to you (an early version) as soon as we have a logical
> feature checkpoint.
>
> Cheers,
> Ashish
>
> -----Original Message-----
> From: tim robertson [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 09, 2008 1:25 AM
> To: [email protected]
> Subject: Re: FW: [jira] Updated: (HADOOP-3601) Hive as a contrib project
>
> Hi Ashish
>
> I am very excited to try this, having been evaluating Hadoop, HBase,
> Cascading etc. recently to process 100 million biodiversity records
> (expecting billions soon), with a view to data mining (e.g. species
> that are critically endangered and observed outside of protected areas
> within the last 2 years).  All open access to Biodiversity information.
> It is difficult to comment on the paper, as it looks to offer pretty
> much most of what I am looking for, but without running it, it's
> difficult...
>
> If you would like a tester, I would happily fill this role and offer
> sample code and input files which could go into "getting started" guides
> on wiki etc.
>
> Cheers,
>
> Tim
>
>
>
>
>
> On Wed, Jul 9, 2008 at 9:47 AM, Ashish Thusoo <[EMAIL PROTECTED]>
> wrote:
>
> > Hi Folks,
> >
> > We recently opened up a JIRA in order to bring Hive into the open
> > source fold with the aim of contributing back to hadoop - which has
> > really made large scale data processing so much easier for us at
> > Facebook. We have also uploaded a small tutorial as part of that JIRA
> > that gives a flavor of what kind of capabilities the system has. We
> > would love to get feedback on this, so please check out the described
> > functionality and post any comments, criticisms, wish lists etc. on
> > the JIRA at
> >
> > https://issues.apache.org/jira/browse/HADOOP-3601
> >
> > We are planning on an initial release of hive as a contrib project in
> > 0.19 version of hadoop and are really excited about the open source
> > possibilities that it can enable, especially in the data
> > warehousing/ETL space. So please stay tuned to the JIRA for future
> > updates on Hive.
> >
> > Thanks,
> > Ashish for [EMAIL PROTECTED]
> >
> > -----Original Message-----
> > From: Ashish Thusoo (JIRA) [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, July 08, 2008 4:15 PM
> > To: Ashish Thusoo
> > Subject: [jira] Updated: (HADOOP-3601) Hive as a contrib project
> >
> >
> >     [ https://issues.apache.org/jira/browse/HADOOP-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> >
> > Ashish Thusoo updated HADOOP-3601:
> > ----------------------------------
> >
> >    Attachment: HiveTutorial.pdf
> >
> > Tutorial on the capabilities of Hive. This is a pdf of internal
> > documentation and contains query, dml and ddl examples as well as the
> > overview of the system. A formal language spec, architecture documents
> > and roadmaps will follow. This document gives the initial preview of
> > the system and hopefully will seed a lot of interesting
> > discussion/questions etc. around this system.
> >
> > > Hive as a contrib project
> > > -------------------------
> > >
> > >                 Key: HADOOP-3601
> > >                 URL: https://issues.apache.org/jira/browse/HADOOP-3601
> > >             Project: Hadoop Core
> > >          Issue Type: New Feature
> > >    Affects Versions: 0.17.0
> > >            Reporter: Joydeep Sen Sarma
> > >            Priority: Minor
> > >         Attachments: HiveTutorial.pdf
> > >
> > >   Original Estimate: 1080h
> > >  Remaining Estimate: 1080h
> > >
> > > Hive is a data warehouse built on top of flat files (stored
> > > primarily in HDFS). It includes:
> > > - Data Organization into Tables with logical and hash partitioning
> > > - A Metastore to store metadata about Tables/Partitions etc
> > > - A SQL like query language over object data stored in Tables
> > > - DDL commands to define and load external data into tables
> > > Hive's query language is executed using Hadoop map-reduce as the
> > > execution engine. Queries can use either single stage or
> > > multi-stage map-reduce. Hive has a native format for tables - but
> > > can handle any data set (for example json/thrift/xml) using an IO
> > > library framework.
> > > Hive uses Antlr for query parsing, Apache JEXL for expression
> > > evaluation and may use Apache Derby as an embedded database for
> > > MetaStore. Antlr has a BSD license and should be compatible with
> > > Apache license.
> > > We are currently thinking of contributing to the 0.17 branch as a
> > > contrib project (since that is the version under which it will get
> > > tested internally) - but looking for advice on the best release
> > > path.
> >
> > --
> > This message is automatically generated by JIRA.
> > -
> > You can reply to this email to add a comment to the issue online.
> >
> >
>
