Amazon Elastic MapReduce Supports Apache Hive

Andrew Hitchcock Mon, 05 Oct 2009 13:59:53 -0700

Greetings,

We are excited to announce that Amazon Elastic MapReduce now supports Apache
Hive – making the service even more compelling for large data set processing
and analytics.  Hive is an open source data warehouse and analytics package
that runs on top of Hadoop. Hive is operated through a SQL-based language called
Hive QL that allows users to structure, summarize, and query data sources
stored in Amazon S3. Hive QL goes beyond standard SQL, adding first-class
support for map/reduce functions and complex, extensible user-defined data
types like JSON and Thrift. This capability allows processing of complex and
unstructured data sources, such as text documents and log files, in
applications such as data mining or click stream analysis.  Hive also allows
user extensions via user-defined functions written in Java and deployed via
storage in Amazon S3.


Here are some resources to help you get started:

   - Tutorial: Running Hive on Amazon Elastic MapReduce
     <http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2857>
   - Video: a video tutorial
     <http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2862>
   - Sample application: Operating a Data Warehouse with Hive, Amazon Elastic
     MapReduce and Amazon SimpleDB
     <http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2854>

Sincerely,

The Amazon Elastic MapReduce Team
