Hi,
The passage is from Hadoop 2 Quick-Start Guide: Learn the Essentials of Big
Data Computing in the Apache Hadoop 2 Ecosystem (Addison-Wesley Data &
Analytics Series) (Kindle Locations 735-738):
Apache Hive is a data warehouse infrastructure built on top of Hadoop for
providing data summary, ad hoc queries, and the analysis of large data sets
using an SQL-like language called HiveQL. Hive transparently translates queries
into MapReduce jobs that are executed in HBase. Hive is considered the de facto
standard for interactive SQL queries over petabytes of data.
On Tuesday, 10 November 2015, 13:03, Binglin Chang wrote:
Hive transparently translates queries into MapReduce jobs that are executed in
HBase
I think this is not correct. Are you sure it is from a book?
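For context, here is a minimal sketch of what "translating a query into a MapReduce job" means for something like SELECT dept, COUNT(*) FROM employees GROUP BY dept. It is plain Python, not Hive's actual planner, and the table name, column names, and rows are hypothetical. The map and reduce steps run over the input records themselves (for Hive, typically files in HDFS); nothing in the pattern requires HBase.

```python
from collections import defaultdict

# Hypothetical rows of an "employees" table. In a real Hive setup these
# would typically be records read from files in HDFS.
rows = [
    {"name": "alice", "dept": "eng"},
    {"name": "bob", "dept": "eng"},
    {"name": "carol", "dept": "sales"},
]


def map_phase(records):
    """Mapper: emit (dept, 1) for every input row."""
    for record in records:
        yield record["dept"], 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reducer: sum the values per key, giving COUNT(*) per dept."""
    return {key: sum(values) for key, values in grouped.items()}


counts = reduce_phase(shuffle(map_phase(rows)))
print(counts)
```

The three functions mirror the map, shuffle, and reduce stages of a single job; a real query plan may chain several such jobs.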
On Tue, Nov 10, 2015 at 6:56 PM, Ashok Kumar wrote:
Hi,
I have read a book about Hadoop that says:
Apache Hive is a data warehouse infrastructure built on top of Hadoop for
providing data summary, ad hoc queries, and the analysis of large data sets
using an SQL-like language called HiveQL.
Hive transparently translates queries into MapReduce jobs that are executed in
HBase. Hive is considered the de facto standard for interactive SQL queries
over petabytes of data.
What is the relation between Hive and HBase? I always thought that HBase is an
independent database.
Is it correct that Hive itself uses the MapReduce engine, which in turn uses
HBase as the database? I always thought that Hive is a data warehouse
database. Or am I missing something?
Thanking you