Thanks Ashish, I am happy to build and try and run from svn/cvs and just try loading in data, querying etc whenever you have something.

Cheers
Tim

On Wed, Jul 9, 2008 at 8:46 PM, Ashish Thusoo <[EMAIL PROTECTED]> wrote:

> Hi Tim,
>
> Point well taken. We are trying to get this out as soon as possible.
> Thanks for the offer for helping us test these things out. We will get
> something out to you (an early version) as soon as we have a logical
> feature checkpoint.
>
> Cheers,
> Ashish
>
> -----Original Message-----
> From: tim robertson [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 09, 2008 1:25 AM
> To: [email protected]
> Subject: Re: FW: [jira] Updated: (HADOOP-3601) Hive as a contrib project
>
> Hi Ashish
>
> I am very excited to try this, having been evaluating Hadoop, HBase,
> Cascading etc recently to process 100 millions of Biodiversity records
> (expecting billions soon), with a view for data mining purposes (species
> that are critically endangered and observed outside of protected areas
> within the last 2 years). All open access to Biodiversity information.
> It is difficult to comment on the paper, as it looks to offer pretty
> much most of what I am looking for, but without running it, it's
> difficult...
>
> If you would like a tester, I would happily fill this role and offer
> sample code and input files which could go into "getting started" guides
> on wiki etc.
>
> Cheers,
>
> Tim
>
> On Wed, Jul 9, 2008 at 9:47 AM, Ashish Thusoo <[EMAIL PROTECTED]> wrote:
>
> > Hi Folks,
> >
> > We recently opened up a JIRA in order to bring Hive into the open
> > source fold with the aim of contributing back to hadoop - which has
> > really made large scale data processing so much easier for us at
> > Facebook. We have also uploaded a small tutorial as part of that JIRA
> > that gives a flavor of what kind of capabilities the system has. We
> > would love to get feedback on this, so please check out the described
> > functionality and post any comments, criticisms, wish lists etc. on
> > the JIRA at
> >
> > https://issues.apache.org/jira/browse/HADOOP-3601
> >
> > We are planning on an initial release of hive as a contrib project in
> > 0.19 version of hadoop and are really excited about the open source
> > possibilities that it can enable, especially in the data
> > warehousing/ETL space. So please stay tuned to the JIRA for future
> > updates on Hive.
> >
> > Thanks,
> > Ashish for [EMAIL PROTECTED]
> >
> > -----Original Message-----
> > From: Ashish Thusoo (JIRA) [mailto:[EMAIL PROTECTED]
> > Sent: Tuesday, July 08, 2008 4:15 PM
> > To: Ashish Thusoo
> > Subject: [jira] Updated: (HADOOP-3601) Hive as a contrib project
> >
> >
> > [ https://issues.apache.org/jira/browse/HADOOP-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
> >
> > Ashish Thusoo updated HADOOP-3601:
> > ----------------------------------
> >
> > Attachment: HiveTutorial.pdf
> >
> > Tutorial on the capabilities of Hive. This is a pdf of internal
> > documentation and contains query, dml and ddl examples as well as the
> > overview of the system. A formal language spec, architecture documents
> > and roadmaps will follow. This document gives the initial preview of
> > the system and hopefully will seed a lot of interesting
> > discussion/questions etc. around this system.
> >
> > > Hive as a contrib project
> > > -------------------------
> > >
> > >                 Key: HADOOP-3601
> > >                 URL: https://issues.apache.org/jira/browse/HADOOP-3601
> > >             Project: Hadoop Core
> > >          Issue Type: New Feature
> > >    Affects Versions: 0.17.0
> > >            Reporter: Joydeep Sen Sarma
> > >            Priority: Minor
> > >         Attachments: HiveTutorial.pdf
> > >
> > >   Original Estimate: 1080h
> > >  Remaining Estimate: 1080h
> > >
> > > Hive is a data warehouse built on top of flat files (stored
> > > primarily in HDFS). It includes:
> > > - Data Organization into Tables with logical and hash partitioning
> > > - A Metastore to store metadata about Tables/Partitions etc
> > > - A SQL like query language over object data stored in Tables
> > > - DDL commands to define and load external data into tables
> > > Hive's query language is executed using Hadoop map-reduce as the
> > > execution engine. Queries can use either single stage or multi-stage
> > > map-reduce. Hive has a native format for tables - but can handle any
> > > data set (for example json/thrift/xml) using an IO library framework.
> > > Hive uses Antlr for query parsing, Apache JEXL for expression
> > > evaluation and may use Apache Derby as an embedded database for
> > > MetaStore. Antlr has a BSD license and should be compatible with
> > > Apache license.
> > > We are currently thinking of contributing to the 0.17 branch as a
> > > contrib project (since that is the version under which it will get
> > > tested internally) - but looking for advice on the best release path.
> >
> > --
> > This message is automatically generated by JIRA.
> > -
> > You can reply to this email to add a comment to the issue online.
