hive git commit: HIVE-10676 : Update Hive's README to mention spark, and to remove jdk1.6 (Sushanth Sowmyan, reviewed by Alan Gates)

khorgath Mon, 11 May 2015 16:57:52 -0700

Repository: hive
Updated Branches:
  refs/heads/branch-1.2 ec78f43b2 -> 63f0f9452



HIVE-10676 : Update Hive's README to mention spark, and to remove jdk1.6 
(Sushanth Sowmyan, reviewed by Alan Gates)


Project: http://git-wip-us.apache.org/repos/asf/hive/repo
Commit: http://git-wip-us.apache.org/repos/asf/hive/commit/63f0f945
Tree: http://git-wip-us.apache.org/repos/asf/hive/tree/63f0f945
Diff: http://git-wip-us.apache.org/repos/asf/hive/diff/63f0f945

Branch: refs/heads/branch-1.2
Commit: 63f0f945204e6296e0b14c32d95492f3457d3400
Parents: ec78f43
Author: Sushanth Sowmyan <[email protected]>
Authored: Mon May 11 16:56:49 2015 -0700
Committer: Sushanth Sowmyan <[email protected]>
Committed: Mon May 11 16:56:49 2015 -0700

----------------------------------------------------------------------
 README.txt | 32 +++++++++++++++++++-------------
 1 file changed, 19 insertions(+), 13 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/hive/blob/63f0f945/README.txt
----------------------------------------------------------------------
diff --git a/README.txt b/README.txt
index 194746d..80dbbec 100644
--- a/README.txt
+++ b/README.txt
@@ -27,18 +27,24 @@ capabilities of the language. QL can also be extended with custom
 scalar functions (UDF's), aggregations (UDAF's), and table
 functions (UDTF's).
 
-Hive users have a choice of 2 runtimes when executing SQL queries.
-Users can choose to use the Apache Hadoop MapReduce framework,
-which is mature and proven at large scales. MapReduce is a purely
-batch framework, and queries run using the MapReduce framework
-may experience higher latencies (tens of seconds), even
-over small datasets. Alternatively, users can choose to use the
-newer Apache Tez framework to process SQL queries. Tez is
-designed for interactive query and has substantially reduced
-overheads versus MapReduce. Users are free to switch back and
-forth between these frameworks at any time. In either case,
-Hive is best suited for use cases where the amount of data
-processed is large enough to require a distributed system.
+Hive users have a choice of 3 runtimes when executing SQL queries.
+Users can choose between Apache Hadoop MapReduce, Apache Tez or
+Apache Spark frameworks as their execution backend. MapReduce is a
+mature framework that is proven at large scales. However, MapReduce
+is a purely batch framework, and queries using it may experience
+higher latencies (tens of seconds), even over small datasets. Apache
+Tez is designed for interactive query, and has substantially reduced
+overheads versus MapReduce. Apache Spark is a cluster computing
+framework that's built outside of MapReduce, but on top of HDFS,
+with a notion of composable and transformable distributed collection
+of items called Resilient Distributed Dataset (RDD) which allows
+processing and analysis without traditional intermediate stages that
+MapReduce introduces.
+
+Users are free to switch back and forth between these frameworks
+at any time. In each case, Hive is best suited for use cases
+where the amount of data processed is large enough to require a
+distributed system.
 
 Hive is not designed for online transaction processing and does
 not support row level insert/updates. It is best used for batch
@@ -73,7 +79,7 @@ Getting Started
 Requirements
 ============
 
-- Java 1.6, 1.7
+- Java 1.7
 
 - Hadoop 1.x, 2.x
