Himanshu,

Spark doesn't use Hive's execution engine (MapReduce) to execute queries. Spark
only reads the metadata from the Hive metastore database and executes the query
within Spark's own execution engine. This metadata is used by Spark's SQL
execution engine (which includes components such as Catalyst and Tungsten for
query optimization) to execute the query and generate results faster than Hive
(MapReduce).

Using HiveContext means connecting to the Hive metastore database. Thus,
HiveContext can access Hive metadata, which includes the location of the data,
the serializers and de-serializers (SerDes), compression codecs, columns, data
types, etc. This gives Spark enough information about a Hive table and its data
to understand the target data and execute the query on its own execution
engine.
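
As a minimal sketch (assuming Spark 1.x with a Hive metastore configured via
hive-site.xml on the classpath; the table name "sales" is hypothetical), the
flow above looks like:

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

val sc = new SparkContext(new SparkConf().setAppName("HiveContextExample"))
val hiveContext = new HiveContext(sc)

// Table metadata (location, SerDe, schema) comes from the Hive metastore,
// but the query itself is planned and executed by Spark (Catalyst/Tungsten),
// not by Hive's MapReduce engine.
val df = hiveContext.sql(
  "SELECT category, COUNT(*) AS cnt FROM sales GROUP BY category")

df.explain() // physical plan shows Spark operators, no MapReduce stages
df.show()
```

Calling explain() is a quick way to confirm this: the plan lists Spark
operators (e.g. aggregates and exchanges), with no Hive/MapReduce stages.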

Overall, Spark replaces the MapReduce execution model entirely with its own
in-memory (RDD-based) computation engine.

- Vikash Pareek



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/When-queried-through-hiveContext-does-hive-executes-these-queries-using-its-execution-engine-default-tp27114p27117.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
